Metadata-Version: 2.4
Name: distributed-sqlite
Version: 0.2.0
Summary: Distributed SQLite-compatible storage engine backed by S3
Requires-Python: >=3.11
Requires-Dist: boto3>=1.34
Requires-Dist: msgpack>=1.0
Requires-Dist: pydantic>=2.0
Requires-Dist: sqlalchemy>=2.0
Provides-Extra: dev
Requires-Dist: alembic>=1.13; extra == 'dev'
Requires-Dist: moto[s3]>=5.0; extra == 'dev'
Requires-Dist: pytest-timeout>=2.3; extra == 'dev'
Requires-Dist: pytest-xdist>=3.5; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Description-Content-Type: text/markdown

# distributed-sqlite

A distributed SQLite-compatible storage engine backed solely by AWS S3.

## Overview

`distributed-sqlite` provides a standard SQLAlchemy/DBAPI2 interface over an
append-only, segment-based storage model on S3. It supports:

- **Snapshot isolation** — each transaction reads from a consistent snapshot
- **Optimistic concurrency** — CAS-based manifest commits with automatic retry
- **Conflict detection** — write-set intersection check; raises `ConflictError` on true conflicts
- **Exponential backoff with jitter** — full jitter retry up to 10 attempts
- **WAL-like semantics** — immutable segments + versioned manifests, never mutates committed data
- **Crash recovery** — orphaned segments (written but not committed) are detected and safely ignored
- **Alembic migrations** — Alembic sees a standard SQLite interface; all DDL and migration ops work unchanged
- **Local caching** — LRU disk cache for segments, in-memory snapshot cache
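
The retry policy in the list above can be sketched as follows. This is a minimal illustration of full-jitter exponential backoff, not the package's actual implementation: `full_jitter_delay` and `backoff_schedule` are hypothetical names, and the default base, cap, and attempt count simply mirror the `DISTRIBUTED_SQLITE_RETRY_*` and `DISTRIBUTED_SQLITE_MAX_RETRIES` defaults documented under Environment Variables.

```python
import random
from typing import List, Optional


def full_jitter_delay(
    attempt: int,
    base: float = 0.05,
    cap: float = 30.0,
    rng: Optional[random.Random] = None,
) -> float:
    """Return a sleep time in seconds for a 0-based retry attempt.

    Full jitter: pick uniformly between 0 and the capped exponential
    ceiling, so concurrent writers do not retry in lockstep.
    """
    rng = rng if rng is not None else random.Random()
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(0.0, ceiling)


def backoff_schedule(
    max_retries: int = 10,
    base: float = 0.05,
    cap: float = 30.0,
    seed: Optional[int] = None,
) -> List[float]:
    """Return one sampled delay per retry attempt (hypothetical helper)."""
    rng = random.Random(seed)
    return [full_jitter_delay(a, base, cap, rng) for a in range(max_retries)]


if __name__ == "__main__":
    for attempt, delay in enumerate(backoff_schedule(seed=42)):
        print(f"attempt {attempt}: sleep up to {delay:.3f}s")
```

With these defaults the ceiling doubles from 0.05 s on the first retry and stays below the 30 s cap for all 10 attempts; the cap only matters if the retry count or base delay is raised.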

## Storage Layout

```
{bucket}/{prefix}/
  manifests/v{N:020d}.json   # Immutable manifest per version
  segments/{uuid}.seg        # Immutable append-only segments (msgpack)
  root.json                  # Eventually-consistent version hint
```
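
As a rough sketch of how keys in this layout could be built and parsed, the helpers below follow the patterns shown above. The function names are hypothetical and not part of the package's API. The 20-digit zero padding presumably exists so that lexicographic key order (the order in which S3 lists keys) matches numeric version order.

```python
import uuid
from typing import Optional


def manifest_key(prefix: str, version: int) -> str:
    """Build the key for a manifest version, zero-padded to 20 digits."""
    return f"{prefix}/manifests/v{version:020d}.json"


def segment_key(prefix: str, segment_id: Optional[uuid.UUID] = None) -> str:
    """Build the key for a segment; a random UUID is assumed when none is given."""
    segment_id = segment_id if segment_id is not None else uuid.uuid4()
    return f"{prefix}/segments/{segment_id}.seg"


def parse_manifest_version(key: str) -> int:
    """Extract the integer version from a manifest key."""
    name = key.rsplit("/", 1)[-1]          # e.g. "v000...042.json"
    return int(name[1:-len(".json")])      # drop the leading "v" and the suffix


if __name__ == "__main__":
    keys = [manifest_key("mydb", v) for v in (10, 9, 100, 1)]
    # Sorting the strings also sorts the versions numerically.
    print([parse_manifest_version(k) for k in sorted(keys)])
```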

## Connection URL

```
distributed_sqlite+distributed_sqlite:///<bucket>/<prefix>
```
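
To illustrate how such a URL maps onto a bucket and prefix, here is a minimal sketch. `split_store_url` is a hypothetical helper, and treating everything after the first path segment as the prefix is an assumption; the package's own dialect may parse the URL differently. Plain string splitting is used because `urllib.parse.urlsplit` does not accept `_` as a scheme character.

```python
from typing import Tuple


def split_store_url(url: str) -> Tuple[str, str]:
    """Split a distributed_sqlite URL into (bucket, prefix).

    Assumption: the first path segment is the bucket and the remainder
    (which may itself contain slashes) is the key prefix.
    """
    scheme, sep, rest = url.partition(":///")
    if not sep or not scheme.startswith("distributed_sqlite"):
        raise ValueError(f"not a distributed_sqlite URL: {url!r}")
    bucket, _, prefix = rest.partition("/")
    if not bucket:
        raise ValueError(f"missing bucket in URL: {url!r}")
    return bucket, prefix.strip("/")


if __name__ == "__main__":
    print(split_store_url("distributed_sqlite+distributed_sqlite:///my-bucket/mydb"))
```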

## Quick Start

```python
from distributed_sqlite.engine import bootstrap, open_connection, create_engine

# Initialize the store (idempotent)
bootstrap("my-bucket", "mydb")

# Raw DBAPI2 connection
with open_connection("my-bucket", "mydb") as conn:
    cur = conn.cursor()
    cur.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    cur.execute("INSERT INTO users VALUES (1, 'Alice')")
    conn.commit()

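# SQLAlchemy engine (assuming create_engine returns a standard SQLAlchemy Engine)
import sqlalchemy as sa
engine = create_engine("distributed_sqlite+distributed_sqlite:///my-bucket/mydb")
with engine.connect() as sa_conn:
    rows = sa_conn.execute(sa.text("SELECT id, name FROM users")).fetchall()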
```

## Environment Variables

| Variable | Default | Description |
|---|---|---|
| `AWS_ACCESS_KEY_ID` | — | AWS credentials |
| `AWS_SECRET_ACCESS_KEY` | — | AWS credentials |
| `AWS_DEFAULT_REGION` | `us-east-1` | AWS region |
| `AWS_ENDPOINT_URL` | — | Custom endpoint (LocalStack, MinIO) |
| `DISTRIBUTED_SQLITE_CACHE_DIR` | `~/.distributed_sqlite/cache` | Local cache directory |
| `DISTRIBUTED_SQLITE_CHECKPOINT_INTERVAL` | `50` | Number of delta segments between checkpoints |
| `DISTRIBUTED_SQLITE_MAX_RETRIES` | `10` | Maximum number of commit retry attempts |
| `DISTRIBUTED_SQLITE_RETRY_BASE_SECONDS` | `0.05` | Base backoff delay, in seconds |
| `DISTRIBUTED_SQLITE_RETRY_MAX_SECONDS` | `30.0` | Maximum backoff delay, in seconds |
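
For illustration, the `DISTRIBUTED_SQLITE_*` variables in the table could be read into a typed settings object along these lines. `StoreSettings` and `load_settings` are hypothetical names used only for this sketch; the package may load its configuration differently. The defaults are the ones listed in the table.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


@dataclass(frozen=True)
class StoreSettings:
    cache_dir: Path
    checkpoint_interval: int
    max_retries: int
    retry_base_seconds: float
    retry_max_seconds: float


def load_settings(env: Optional[Mapping[str, str]] = None) -> StoreSettings:
    """Read settings from a mapping (os.environ by default), applying the table's defaults."""
    env = os.environ if env is None else env
    return StoreSettings(
        cache_dir=Path(
            env.get("DISTRIBUTED_SQLITE_CACHE_DIR", "~/.distributed_sqlite/cache")
        ).expanduser(),
        checkpoint_interval=int(env.get("DISTRIBUTED_SQLITE_CHECKPOINT_INTERVAL", "50")),
        max_retries=int(env.get("DISTRIBUTED_SQLITE_MAX_RETRIES", "10")),
        retry_base_seconds=float(env.get("DISTRIBUTED_SQLITE_RETRY_BASE_SECONDS", "0.05")),
        retry_max_seconds=float(env.get("DISTRIBUTED_SQLITE_RETRY_MAX_SECONDS", "30.0")),
    )


if __name__ == "__main__":
    print(load_settings())
```

Passing an explicit mapping instead of relying on `os.environ` keeps the loader easy to exercise in tests without mutating the process environment.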

## Architecture

See [docs/architecture.md](docs/architecture.md) for the full design narrative.

## Development

```bash
cp .env.example .env   # fill in your AWS credentials
uv sync
uv run pytest tests/ -v
```
