Metadata-Version: 2.4
Name: sqlalchemy-paradedb
Version: 0.5.0
Summary: Typed SQLAlchemy helpers for ParadeDB
Project-URL: Homepage, https://github.com/paradedb/sqlalchemy-paradedb
Project-URL: Repository, https://github.com/paradedb/sqlalchemy-paradedb
Project-URL: Issues, https://github.com/paradedb/sqlalchemy-paradedb/issues
Project-URL: Changelog, https://github.com/paradedb/sqlalchemy-paradedb/blob/main/CHANGELOG.md
Author: ParadeDB
License-File: LICENSE
Keywords: bm25,faceted search,facets,full text search,hybrid search,paradedb,postgres,postgresql,search,sqlalchemy,text search
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Topic :: Database
Requires-Python: >=3.10
Requires-Dist: sqlalchemy>=2.0.32
Provides-Extra: dev
Requires-Dist: mypy>=1.19.0; extra == 'dev'
Requires-Dist: pre-commit>=4.0.0; extra == 'dev'
Requires-Dist: ruff>=0.15.0; extra == 'dev'
Provides-Extra: test
Requires-Dist: alembic>=1.13.0; extra == 'test'
Requires-Dist: psycopg[binary]>=3.1.18; extra == 'test'
Requires-Dist: pytest-cov>=6.0; extra == 'test'
Requires-Dist: pytest>=8.0.0; extra == 'test'
Description-Content-Type: text/markdown

<!-- ParadeDB: Postgres for Search and Analytics -->
<h1 align="center">
  <a href="https://paradedb.com"><img src="https://github.com/paradedb/paradedb/raw/main/docs/logo/readme.svg" alt="ParadeDB"></a>
<br>
</h1>

<p align="center">
  <b>Simple, Elastic-quality search for Postgres</b><br/>
</p>

<h3 align="center">
  <a href="https://paradedb.com">Website</a> &bull;
  <a href="https://docs.paradedb.com">Docs</a> &bull;
  <a href="https://paradedb.com/slack/">Community</a> &bull;
  <a href="https://paradedb.com/blog/">Blog</a> &bull;
  <a href="https://docs.paradedb.com/changelog/">Changelog</a>
</h3>

---

# sqlalchemy-paradedb

[![PyPI](https://img.shields.io/pypi/v/sqlalchemy-paradedb)](https://pypi.org/project/sqlalchemy-paradedb/)
[![Python Versions](https://img.shields.io/pypi/pyversions/sqlalchemy-paradedb)](https://pypi.org/project/sqlalchemy-paradedb/)
[![Downloads](https://img.shields.io/pypi/dm/sqlalchemy-paradedb)](https://pypi.org/project/sqlalchemy-paradedb/)
[![Codecov](https://codecov.io/gh/paradedb/sqlalchemy-paradedb/graph/badge.svg)](https://codecov.io/gh/paradedb/sqlalchemy-paradedb)
[![License](https://img.shields.io/github/license/paradedb/sqlalchemy-paradedb?color=blue)](https://github.com/paradedb/sqlalchemy-paradedb?tab=MIT-1-ov-file#readme)
[![Slack URL](https://img.shields.io/badge/Join%20Slack-purple?logo=slack&link=https%3A%2F%2Fparadedb.com%2Fslack)](https://paradedb.com/slack)
[![X URL](https://img.shields.io/twitter/url?url=https%3A%2F%2Ftwitter.com%2Fparadedb&label=Follow%20%40paradedb)](https://x.com/paradedb)

[ParadeDB](https://paradedb.com) integration for SQLAlchemy: typed helpers for BM25 indexes, search predicates, scoring, snippets, facets, and migration ergonomics.

## Requirements & Compatibility

| Component  | Supported                     |
| ---------- | ----------------------------- |
| Python     | 3.10+                         |
| SQLAlchemy | 2.0.32+                       |
| ParadeDB   | 0.22.0+                       |
| PostgreSQL | 15+ (with ParadeDB extension) |

## Installation

```bash
uv add sqlalchemy-paradedb
```

For local development:

```bash
uv sync --extra test --extra dev
```

## Quick Start

### Prerequisites

Install `pg_search` in your Postgres database and connect SQLAlchemy to that database.

### Create a BM25 Index

```python
from sqlalchemy import Index
from paradedb.sqlalchemy import indexing

products_bm25_idx = Index(
    "products_bm25_idx",
    indexing.BM25Field(Product.id),
    indexing.BM25Field(
        Product.description,
        tokenizer=indexing.tokenize.unicode(lowercase=True),
    ),
    indexing.BM25Field(
        Product.category,
        tokenizer=indexing.tokenize.literal(),
    ),
    postgresql_using="bm25",
    postgresql_with={"key_field": "id"},
)
```

For JSON columns named `metadata`, use `metadata_` as the ORM attribute name.

### Query with ParadeDB Predicates

```python
from sqlalchemy import select
from sqlalchemy.orm import Session
from paradedb.sqlalchemy import pdb, search

stmt = (
    select(Product.id, Product.description)
    .where(search.match_any(Product.description, "running", "shoes"))
    .order_by(pdb.score(Product.id).desc())
    .limit(10)
)

with Session(engine) as session:
    rows = session.execute(stmt).all()
```

### Rows + Facets in a Single Query

```python
from sqlalchemy import select
from sqlalchemy.orm import Session
from paradedb.sqlalchemy import facets, search

base = (
    select(Product.id, Product.description)
    .where(search.match_all(Product.description, "running"))
    .order_by(Product.id)
    .limit(10)
)

stmt = facets.with_rows(
    base,
    agg=facets.multi(
        facets.value_count(field="id"),
        facets.terms(field="category", size=10),
    ),
    key_field=Product.id,
)

with Session(engine) as session:
    rows = session.execute(stmt).all()
    facet_payload = facets.extract(rows)
```

## Search Patterns

### Fuzzy Matching

```python
from paradedb.sqlalchemy import search

search.term(Product.description, "shose", distance=1)
search.match_any(Product.description, "wirless", distance=1, prefix=True)
search.term(Product.description, "rnnuing", distance=1, transpose_cost_one=True)
```

Use fuzzy options on `term`, `match_any`, or `match_all`; there is no separate `search.fuzzy(...)` helper.

### Phrase Prefix and More-Like-This

```python
from paradedb.sqlalchemy import search

search.phrase_prefix(Product.description, ["running", "sh"])
search.more_like_this(Product.id, document_id=1, fields=["description"])
```

### Proximity Composition

```python
from sqlalchemy import select
from paradedb.sqlalchemy import search

prox = search.prox_array("running").within(1, search.prox_regex("sho.*"), ordered=True)
stmt = select(Product.id).where(search.proximity(Product.description, prox))
```

## Indexing and Tokenizers

Tokenizer config can be expressed as a structured mapping:

```python
from sqlalchemy import Index
from paradedb.sqlalchemy import indexing

products_bm25_idx = Index(
    "products_bm25_idx",
    indexing.BM25Field(Product.id),
    indexing.BM25Field(
        Product.description,
        tokenizer=indexing.tokenize.from_config(
            {
                "tokenizer": "simple",
                "filters": ["lowercase", "stemmer"],
                "stemmer": "english",
                "alias": "description_simple",
            }
        ),
    ),
    indexing.BM25Field(
        Product.description,
        tokenizer=indexing.tokenize.from_config(
            {
                "tokenizer": "ngram",
                "args": [3, 8],
                "named_args": {"prefix_only": True},
                "alias": "description_ngram",
            }
        ),
    ),
    postgresql_using="bm25",
    postgresql_with={"key_field": "id"},
)
```

Validate that a field is indexed with the expected tokenizer:

```python
from paradedb.sqlalchemy import indexing

indexing.assert_indexed(engine, Product.category, tokenizer="literal")
```

Inspect BM25 metadata for a mapped table:

```python
from paradedb.sqlalchemy import indexing

meta = indexing.describe(engine, Product.__table__)
```

## Alembic Operations

Import once in migration environment startup so Alembic registers ParadeDB operations:

```python
import paradedb.sqlalchemy.alembic  # noqa: F401
```

Use custom operations in migrations:

```python
op.create_bm25_index(
    "products_bm25_idx",
    "products",
    ["id", "description"],
    key_field="id",
    table_schema="public",
)
op.reindex_bm25("products_bm25_idx", concurrently=True, schema="public")
op.drop_bm25_index("products_bm25_idx", if_exists=True, schema="public")
```

`op.reindex_bm25(..., concurrently=True)` must run outside a transaction (autocommit block).

## Diagnostics Helpers

`paradedb.sqlalchemy.diagnostics` exposes wrapper functions for ParadeDB diagnostics:

```python
from paradedb.sqlalchemy import diagnostics

indexes = diagnostics.paradedb_indexes(engine)
segments = diagnostics.paradedb_index_segments(engine, "products_bm25_idx")
check = diagnostics.paradedb_verify_index(engine, "products_bm25_idx", sample_rate=0.1)
all_checks = diagnostics.paradedb_verify_all_indexes(engine, schema_pattern="public")
```

## Common Errors

### `with_rows requires ORDER BY`

```python
from sqlalchemy import select
from paradedb.sqlalchemy import facets

# Missing order_by(...)
base = select(Product.id).limit(10)
facets.with_rows(base, agg=facets.value_count(field="id"), key_field=Product.id)
```

### `with_rows requires LIMIT`

```python
from sqlalchemy import select
from paradedb.sqlalchemy import facets

# Missing limit(...)
base = select(Product.id).order_by(Product.id)
facets.with_rows(base, agg=facets.value_count(field="id"), key_field=Product.id)
```

### `with_rows requires a ParadeDB predicate`

```python
from sqlalchemy import select
from paradedb.sqlalchemy import facets

# ensure_predicate=False disables automatic search.all(...) injection
facets.with_rows(
    select(Product.id).order_by(Product.id).limit(10),
    agg=facets.value_count(field="id"),
    key_field=Product.id,
    ensure_predicate=False,
)
```

### `tokenizer config requires 'tokenizer'`

```python
from paradedb.sqlalchemy import indexing

indexing.tokenize.from_config({"filters": ["lowercase"]})
```

## Examples

- [Quick Start](examples/quickstart/quickstart.py)
- [Faceted Search](examples/faceted_search/faceted_search.py)
- [Autocomplete](examples/autocomplete/autocomplete.py)
- [More Like This](examples/more_like_this/more_like_this.py)
- [Hybrid Search (RRF)](examples/hybrid_rrf/hybrid_rrf.py)
- [RAG](examples/rag/rag.py)

## Documentation

- **ParadeDB Official Docs**: <https://docs.paradedb.com>
- **ParadeDB Website**: <https://paradedb.com>

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md) for development setup, running tests, linting, and the PR workflow.

## Support

If you're missing a feature or have found a bug, please open a
[GitHub Issue](https://github.com/paradedb/django-paradedb/issues/new/choose).

To get community support, you can:

- Post a question in the [ParadeDB Slack Community](https://paradedb.com/slack)
- Ask for help on our [GitHub Discussions](https://github.com/paradedb/paradedb/discussions)

If you need commercial support, please [contact the ParadeDB team](mailto:sales@paradedb.com).

## License

sqlalchemy-paradedb is licensed under the [MIT License](https://github.com/paradedb/sqlalchemy-paradedb?tab=MIT-1-ov-file#readme).
