Metadata-Version: 2.4
Name: itertoolkit
Version: 1.5.5
Summary: An itertools-inspired toolkit for cached iterator and data-structure processing
Requires-Python: >=3.11
Requires-Dist: gsppy>=5.3.0
Requires-Dist: matplotlib>=3.10.8
Requires-Dist: networkx>=3.6.1
Requires-Dist: numpy>=2.4.4
Requires-Dist: pandas>=3.0.2
Requires-Dist: plotly>=6.6.0
Requires-Dist: scikit-learn>=1.8.0
Requires-Dist: scipy>=1.17.1
Requires-Dist: seaborn>=0.13.2
Description-Content-Type: text/markdown

# itertoolkit

Functions creating iterators and cached data pipelines for efficient looping.

`itertoolkit` is an `itertools`-inspired wrapper focused on practical data processing. It keeps the lazy, composable style of iterator algebra, then adds cache-aware helpers so repeated list and data-structure transformations run faster.

The goal is simple:

- Keep memory usage low with lazy iterators.
- Speed up repeated workloads with caching.
- Make iterator pipelines readable and reusable.

## Installation

```bash
pip install itertoolkit
```

## Quick Start

```python
from itertoolkit import count, islice

# Example: base itertoolkit stream
stream = (x * x for x in count(1))
print(list(islice(stream, 5)))  # [1, 4, 9, 16, 25]

# Example: cached computation workflow (concept)
# result = itertoolkit.cached_map(expensive_fn, dataset, cache_key="v1")
```

## Why It Is Faster

`itertoolkit`'s performance comes from combining:

- Lazy iteration, so intermediate materialization is avoided.
- Cache-first wrappers, so repeated transformations are reused.
- Composable pipelines, so complex loops stay compact and optimized.

In repeated analytics or feature-building jobs, the first pass computes and stores results, and later passes can fetch from cache instead of recomputing every step.
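The compute-once, fetch-later behavior can be sketched with the standard library's `functools.lru_cache`; the function name and data below are illustrative, not part of the `itertoolkit` API:

```python
from functools import lru_cache

call_count = 0

@lru_cache(maxsize=None)
def expensive_square(x: int) -> int:
    """Stand-in for an expensive per-item transformation."""
    global call_count
    call_count += 1
    return x * x

data = [1, 2, 3, 2, 1, 3]

# First pass computes and stores results for each distinct input.
first = [expensive_square(x) for x in data]
# A later pass over the same data is served entirely from cache.
second = [expensive_square(x) for x in data]

print(first)       # [1, 4, 9, 4, 1, 9]
print(call_count)  # 3 -- only the three distinct inputs were computed
```

The second list comprehension performs no computation at all, which is the effect the cache-first wrappers aim for on repeated workloads.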

## Core Iterator Families

### General iterators

| Iterator concept | Input | Output shape | Typical use |
| --- | --- | --- | --- |
| Running reduction | iterable, func | incremental totals | rolling stats |
| Batching | iterable, n | tuples of size n | chunk processing |
| Chaining | multiple iterables | one continuous stream | merging sources |
| Selection | data + selectors | filtered stream | mask-based filtering |
| Windowing | iterable | adjacent pairs/windows | transition analysis |
| Truncation | predicate/slice | bounded output | safe handling of infinite streams |

### Combinatoric iterators

| Iterator concept | Output |
| --- | --- |
| Cartesian products | all pairings across inputs |
| Permutations | order-sensitive tuples |
| Combinations | order-insensitive unique tuples |
| Combinations with replacement | tuples allowing repeated values |
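The four combinatoric families correspond directly to the standard-library `itertools` generators of the same names:

```python
from itertools import combinations, combinations_with_replacement, permutations, product

# Cartesian product: all pairings across inputs.
print(list(product("AB", [1, 2])))
# [('A', 1), ('A', 2), ('B', 1), ('B', 2)]

# Permutations: order-sensitive tuples.
print(list(permutations("AB")))      # [('A', 'B'), ('B', 'A')]

# Combinations: order-insensitive unique tuples.
print(list(combinations("ABC", 2)))  # [('A', 'B'), ('A', 'C'), ('B', 'C')]

# Combinations with replacement: repeated values allowed.
print(list(combinations_with_replacement("AB", 2)))
# [('A', 'A'), ('A', 'B'), ('B', 'B')]
```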

## Pipeline Pattern

Use this pattern when processing large lists, tables, graphs, or text records:

1. Start from one or more iterables.
2. Chain filtering, mapping, grouping, and batching.
3. Add cache boundaries around expensive stages.
4. Materialize only where needed (`list`, `tuple`, `DataFrame`, model input).

```python
from itertoolkit import chain

sources = [[1, 2, 3], [4, 5], [6]]
pipeline = (x * 10 for x in chain.from_iterable(sources) if x % 2 == 0)
print(list(pipeline))  # [20, 40, 60]
```
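Step 3 of the pattern, a cache boundary around an expensive stage, can be sketched with `functools.lru_cache`; the `featurize` function is hypothetical, standing in for any expensive per-item transformation:

```python
from functools import lru_cache
from itertools import chain

@lru_cache(maxsize=None)
def featurize(x: int) -> int:
    """Hypothetical expensive stage; results are memoized at this boundary."""
    return x ** 3

sources = [[1, 2, 3], [4, 5], [6]]
# Lazy stages before and after the cache boundary; materialize only at the end.
pipeline = (featurize(x) for x in chain.from_iterable(sources) if x % 2 == 0)
print(list(pipeline))  # [8, 64, 216]
```

Re-running the pipeline over overlapping data reuses the memoized `featurize` results while the surrounding filter and merge stages stay lazy.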

## Caching Strategy

Recommended caching behavior for data-heavy workloads:

- Key by transformation signature and input fingerprint.
- Keep deterministic steps cacheable.
- Invalidate cache on function/version changes.
- Persist long-running results between sessions.

This makes repeated preprocessing and feature extraction significantly cheaper.
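One way to realize the first two points, keying by transformation signature plus an input fingerprint, is sketched below; every name here (`cache_key`, `cached`, the version strings) is illustrative rather than part of the `itertoolkit` API:

```python
import hashlib
import json

def cache_key(func_name: str, version: str, data) -> str:
    """Build a key from the transformation signature and an input fingerprint."""
    fingerprint = hashlib.sha256(
        json.dumps(data, sort_keys=True, default=str).encode()
    ).hexdigest()[:16]
    return f"{func_name}:{version}:{fingerprint}"

cache: dict = {}

def cached(func_name, version, func, data):
    key = cache_key(func_name, version, data)
    if key not in cache:       # first pass: compute and store
        cache[key] = func(data)
    return cache[key]          # later passes: fetch from cache

result = cached("normalize", "v1", lambda xs: [x / max(xs) for x in xs], [2, 4, 8])
print(result)  # [0.25, 0.5, 1.0]
```

Bumping the version string (e.g. `"v2"`) changes the key and so invalidates stale entries when the function changes, which is the invalidation rule listed above.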


## License

MIT