Metadata-Version: 2.3
Name: checkpointer
Version: 2.0.1
Summary: A Python library for memoizing function results with support for multiple storage backends, async runtimes, and automatic cache invalidation
Project-URL: Repository, https://github.com/Reddan/checkpointer.git
Author: Hampus Hallman
License: Copyright 2024 Hampus Hallman
        
        Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Requires-Python: >=3.12
Requires-Dist: relib
Description-Content-Type: text/markdown

# checkpointer &middot; [![License](https://img.shields.io/badge/license-MIT-blue)](https://github.com/Reddan/checkpointer/blob/master/LICENSE) [![pypi](https://img.shields.io/pypi/v/checkpointer)](https://pypi.org/project/checkpointer/) [![Python 3.12](https://img.shields.io/badge/python-3.12-blue)](https://pypi.org/project/checkpointer/)

`checkpointer` is a Python library for memoizing function results. It simplifies caching by providing a decorator-based API and supports various storage backends. It's designed for computationally expensive operations where caching can save time, or during development to avoid waiting for redundant computations. ⚡️

Adding or removing `@checkpoint` doesn't change how your code works, and it can be applied to any function, including ones you've already written, without altering their behavior or introducing side effects. The original function remains unchanged and can still be called directly when needed.

### Key Features:
- 🗂️ **Multiple Storage Backends**: Supports in-memory, pickle, or your own custom storage.
- 🎯 **Simple Decorator API**: Apply `@checkpoint` to functions.
- 🔄 **Async and Sync Compatibility**: Works with synchronous functions and any Python async runtime (e.g., `asyncio`, `Trio`, `Curio`).
- ⏲️ **Custom Expiration Logic**: Automatically invalidate old checkpoints.
- 📂 **Flexible Path Configuration**: Control where checkpoints are stored.

---

## Installation

```bash
pip install checkpointer
```

---

## Quick Start 🚀

```python
from checkpointer import checkpoint

@checkpoint
def expensive_function(x: int) -> int:
    print("Computing...")
    return x ** 2

result = expensive_function(4)  # Computes and stores result
result = expensive_function(4)  # Loads from checkpoint
```

---

## How It Works

When you use `@checkpoint`, the function's **arguments** (`args`, `kwargs`) are hashed to create a unique identifier for each call. This identifier is used to store and retrieve cached results. If the same arguments are passed again, `checkpointer` will return the cached result instead of recomputing.

Additionally, `checkpointer` ensures that caches are invalidated when a function’s implementation or any of its dependencies change. Each function is assigned a hash based on:
1. **Its source code**: Changes to the function’s code update its hash.
2. **Dependent functions**: If a function calls others, changes to those will also update the hash.

### Example: Cache Invalidation by Function Dependencies

```python
def multiply(a, b):
    return a * b

@checkpoint
def helper(x):
    return multiply(x + 1, 2)

@checkpoint
def compute(a, b):
    return helper(a) + helper(b)
```

If you change `multiply`, the checkpoints for both `helper` and `compute` will be invalidated and recomputed.

---

## Parameterization

### Global Configuration

You can configure a custom `Checkpointer`:

```python
from checkpointer import checkpoint

checkpoint = checkpoint(format="memory", root_path="/tmp/checkpoints")
```

Extend this configuration by calling itself again:

```python
extended_checkpoint = checkpoint(format="pickle", verbosity=0)
```

### Per-Function Customization

```python
@checkpoint(format="pickle", verbosity=0)
def my_function(x, y):
    return x + y
```

### Combining Configurations

```python
checkpoint = checkpoint(format="memory", verbosity=1)
quiet_checkpoint = checkpoint(verbosity=0)
pickle_checkpoint = checkpoint(format="pickle", root_path="/tmp/pickle_checkpoints")

@checkpoint
def compute_square(n: int) -> int:
    return n ** 2

@quiet_checkpoint
def compute_quietly(n: int) -> int:
    return n ** 3

@pickle_checkpoint
def compute_sum(a: int, b: int) -> int:
    return a + b
```

### Layered Caching

```python
IS_DEVELOPMENT = True  # Toggle based on environment

dev_checkpoint = checkpoint(when=IS_DEVELOPMENT)

@checkpoint(format="memory")
@dev_checkpoint
def some_expensive_function():
    print("Performing a time-consuming operation...")
    return sum(i * i for i in range(10**6))
```

- In development: Both `dev_checkpoint` and `memory` caches are active.
- In production: Only the `memory` cache is active.

---

## Usage

### Force Recalculation
Use `rerun` to force a recalculation and overwrite the stored checkpoint:

```python
result = expensive_function.rerun(4)
```

### Bypass Checkpointer
Use `fn` to directly call the original, undecorated function:

```python
result = expensive_function.fn(4)
```

This is especially useful **inside recursive functions**. By using `.fn` within the function itself, you avoid redundant caching of intermediate recursive calls while still caching the final result at the top level.

### Retrieve Stored Checkpoints
Access stored results without recalculating:

```python
stored_result = expensive_function.get(4)
```

---

## Storage Backends

`checkpointer` supports flexible storage backends, including built-in options and custom implementations.

### Built-In Backends

1. **PickleStorage**: Saves checkpoints to disk using Python's `pickle` module.
2. **MemoryStorage**: Caches checkpoints in memory for fast, non-persistent use.

To use these backends, pass either `"pickle"` or `PickleStorage` (and similarly for `"memory"` or `MemoryStorage`) to the `format` parameter:
```python
from checkpointer import checkpoint, PickleStorage, MemoryStorage

@checkpoint(format="pickle")  # Equivalent to format=PickleStorage
def disk_cached(x: int) -> int:
    return x ** 2

@checkpoint(format="memory")  # Equivalent to format=MemoryStorage
def memory_cached(x: int) -> int:
    return x * 10
```

### Custom Storage Backends

Create custom storage backends by implementing methods for storing, loading, and managing checkpoints. For example, a custom storage backend might use a database, cloud storage, or a specialized format.

Example usage:
```python
from checkpointer import checkpoint, Storage
from typing import Any
from pathlib import Path
from datetime import datetime

class CustomStorage(Storage):  # Optional for type hinting
    @staticmethod
    def exists(path: Path) -> bool: ...
    @staticmethod
    def checkpoint_date(path: Path) -> datetime: ...
    @staticmethod
    def store(path: Path, data: Any) -> None: ...
    @staticmethod
    def load(path: Path) -> Any: ...
    @staticmethod
    def delete(path: Path) -> None: ...

@checkpoint(format=CustomStorage)
def custom_cached(x: int):
    return x ** 2
```

This flexibility allows you to adapt `checkpointer` to meet any storage requirement, whether persistent or in-memory.

---

## Configuration Options ⚙️

| Option         | Type                                | Default     | Description                                 |
|----------------|-------------------------------------|-------------|---------------------------------------------|
| `format`       | `"pickle"`, `"memory"`, `Storage`   | `"pickle"`  | Storage backend format.                     |
| `root_path`    | `Path`, `str`, or `None`            | User Cache  | Root directory for storing checkpoints.     |
| `when`         | `bool`                              | `True`      | Enable or disable checkpointing.            |
| `verbosity`    | `0` or `1`                          | `1`         | Logging verbosity.                          |
| `path`         | `Callable[..., str]`                | `None`      | Custom path for checkpoint storage.         |
| `should_expire`| `Callable[[datetime], bool]`        | `None`      | Custom expiration logic.                    |

---

## Full Example 🛠️

```python
import asyncio
from checkpointer import checkpoint

@checkpoint
def compute_square(n: int) -> int:
    print(f"Computing {n}^2...")
    return n ** 2

@checkpoint(format="memory")
async def async_compute_sum(a: int, b: int) -> int:
    await asyncio.sleep(1)
    return a + b

async def main():
    result1 = compute_square(5)
    print(result1)

    result2 = await async_compute_sum(3, 7)
    print(result2)

    result3 = async_compute_sum.get(3, 7)
    print(result3)

asyncio.run(main())
```
