Metadata-Version: 2.1
Name: anystore
Version: 0.1.12
Summary: Store and cache things anywhere
Home-page: https://github.com/investigativedata/anystore
License: GPL-3.0
Author: Simon Wörpel
Author-email: simon.woerpel@pm.me
Requires-Python: >=3.11,<4
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Requires-Dist: adlfs (>2023.10,<2025)
Requires-Dist: banal (>=1.0.6,<2.0.0)
Requires-Dist: cloudpickle (>=3.0.0,<4.0.0)
Requires-Dist: cryptography (>=42.0.4)
Requires-Dist: fsspec (>2023.10,<2025)
Requires-Dist: gcsfs (>2023.10,<2025)
Requires-Dist: orjson (>=3.9.15,<4.0.0)
Requires-Dist: pyaml (>=23.12,<25.0)
Requires-Dist: pydantic (>=2.6.3,<3.0.0)
Requires-Dist: pydantic-settings (>=2.2.1,<3.0.0)
Requires-Dist: pytest (>=8.0.2,<9.0.0)
Requires-Dist: rich (>=13.7.0,<14.0.0)
Requires-Dist: s3fs (>2023.10,<2025)
Requires-Dist: shortuuid (>=1.0.13,<2.0.0)
Requires-Dist: sshfs (>=2024.9.0,<2025.0.0)
Requires-Dist: structlog (>=24.4.0,<25.0.0)
Requires-Dist: typer (>=0.9,<0.13)
Project-URL: Bug Tracker, https://github.com/investigativedata/anystore/issues
Project-URL: Documentation, https://github.com/investigativedata/anystore
Project-URL: Repository, https://github.com/investigativedata/anystore
Description-Content-Type: text/markdown

[![anystore on pypi](https://img.shields.io/pypi/v/anystore)](https://pypi.org/project/anystore/)
[![Python test and package](https://github.com/investigativedata/anystore/actions/workflows/python.yml/badge.svg)](https://github.com/investigativedata/anystore/actions/workflows/python.yml)
[![pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit)](https://github.com/pre-commit/pre-commit)
[![Coverage Status](https://coveralls.io/repos/github/investigativedata/anystore/badge.svg?branch=main)](https://coveralls.io/github/investigativedata/anystore?branch=main)
[![GPL-3.0 License](https://img.shields.io/pypi/l/anystore)](./LICENSE)

# anystore

Store anything anywhere. `anystore` provides a high-level storage and retrieval interface for various supported _store_ backends, such as `redis`, `sql`, `file`, `http`, cloud storage, and anything else supported by [`fsspec`](https://filesystem-spec.readthedocs.io/en/latest/index.html).

Think of it as a `key -> value` store that can act as a cache backend. When _keys_ become filenames and _values_ become byte blobs, `anystore` becomes a file-like storage backend, always with the same interchangeable interface.

### Why?

In [our data engineering projects](https://investigativedata.io/projects) we repeatedly wrote boilerplate code that covers the feature set of `anystore`, but never in a reusable way. This library aims to be a stable foundation for Python projects that involve data wrangling.

### Examples

#### Base CLI interface:

```shell
anystore -i ./local/foo.txt -o s3://mybucket/other.txt

echo "hello" | anystore -o sftp://user:password@host:/tmp/world.txt

anystore -i https://investigativedata.io > index.html

anystore --store sqlite:///db keys <prefix> 

anystore --store redis://localhost put foo "bar"

anystore --store redis://localhost get foo  # -> "bar"
```

#### Use in your applications:

```python
from anystore import smart_read, smart_write

data = smart_read("s3://mybucket/data.txt")
smart_write(".local/data", data)
```

#### Simple cache example via decorator:

[`@anycache` is used for api view cache in `ftmq-api`](https://github.com/investigativedata/ftmstore-fastapi/blob/main/ftmstore_fastapi/views.py)

```python
from anystore import get_store, anycache

cache = get_store("redis://localhost")

@anycache(store=cache, key_func=lambda q: f"api/list/{q.make_key()}", ttl=60)
def get_list_view(q: Query) -> Response:
    result = ... # complex computing will be cached
    return result
```
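
To illustrate the idea behind the decorator, here is a minimal sketch of a `key_func`/`ttl` cache built on a plain dictionary. It is not how `anystore` implements `@anycache`; `simple_cache` and the dictionary store are hypothetical stand-ins chosen only to show the control flow (build a key, return a fresh hit, otherwise compute and store).

```python
import time
from functools import wraps
from typing import Any, Callable, Optional


def simple_cache(store: dict, key_func: Callable[..., str], ttl: Optional[float] = None):
    """Hypothetical decorator for illustration only, not anystore's code."""

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            key = key_func(*args, **kwargs)
            hit = store.get(key)
            if hit is not None:
                value, stored_at = hit
                # Return the cached value while it is younger than `ttl` seconds
                if ttl is None or time.monotonic() - stored_at < ttl:
                    return value
            # Cache miss or expired entry: compute and remember the result
            value = func(*args, **kwargs)
            store[key] = (value, time.monotonic())
            return value

        return wrapper

    return decorator


calls = []  # records how often the function body actually runs
cache: dict = {}


@simple_cache(store=cache, key_func=lambda q: f"api/list/{q}", ttl=60)
def get_list_view(q: str) -> str:
    calls.append(q)
    return q.upper()


first = get_list_view("foo")
second = get_list_view("foo")  # served from the cache, body not run again
```

With a real store, the dictionary would be replaced by whatever backend `get_store` returns, so the cached values can outlive the process.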

#### Mirror file collections:

```python
from anystore import get_store

source = get_store("https://example.org/documents/archive1")  # directory listing
target = get_store("s3://mybucket/files", backend_config={"client_kwargs": {
    "aws_access_key_id": "my-key",
    "aws_secret_access_key": "***",
    "endpoint_url": "https://s3.local"
}})  # can be configured via ENV as well

for path in source.iterate_keys():
    # streaming copy:
    with source.open(path) as i:
        with target.open(path, "wb") as o:
            o.write(i.read())
```
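
In the loop above, a single `read()` call loads each file fully into memory. For large files, a chunked copy may be preferable. The following is a minimal sketch using only the standard library; it assumes the handles returned by `open` behave like ordinary binary file objects, and it uses `io.BytesIO` buffers as stand-ins for them. `copy_stream` is a hypothetical helper, not part of `anystore`.

```python
import io
import shutil


def copy_stream(src, dst, chunk_size: int = 1024 * 1024) -> None:
    """Copy from one binary file-like object to another in fixed-size chunks."""
    shutil.copyfileobj(src, dst, length=chunk_size)


# Stand-ins for the source and target handles (assumption: both are
# plain binary file-like objects with read() and write()).
src = io.BytesIO(b"x" * 3_000_000)
dst = io.BytesIO()

copy_stream(src, dst)
copied = dst.getvalue()
```

Only one chunk is held in memory at a time, so the memory footprint stays bounded by `chunk_size` regardless of file size.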

## Documentation

Find the docs at [investigativedata.io/docs/anystore](https://investigativedata.io/docs/anystore)

## Used by

- [ftmq](https://github.com/investigativedata/ftmq)
- [investigraph](https://github.com/investigativedata/investigraph)
- [ftmq-api](https://github.com/investigativedata/ftmq-api)
- [leakrfc](https://github.com/investigativedata/leakrfc)


## Development

This package uses [poetry](https://python-poetry.org/) for packaging and dependency management, so first [install it](https://python-poetry.org/docs/#installation).

Clone this repository to a local destination.

Within the repo directory, run

    poetry install --with dev

This installs a few development dependencies, including [pre-commit](https://pre-commit.com/) which needs to be registered:

    poetry run pre-commit install

Before each commit, this checks code formatting (isort, black) and runs the other hooks defined in `.pre-commit-config.yaml`.

### Testing

`anystore` uses [pytest](https://docs.pytest.org/en/stable/) as the testing framework.

    make test

