Metadata-Version: 2.4
Name: asyncpg_vector
Version: 0.1.0
Summary: PostgreSQL vector support for asyncpg
Author-email: Levente Hunyadi <hunyadi@gmail.com>
Maintainer-email: Levente Hunyadi <hunyadi@gmail.com>
License-Expression: MIT
Project-URL: Homepage, https://github.com/hunyadi/asyncpg_vector
Project-URL: Source, https://github.com/hunyadi/asyncpg_vector
Keywords: asyncpg,postgresql-extension,vector,vector-search
Classifier: Development Status :: 5 - Production/Stable
Classifier: Framework :: AsyncIO
Classifier: Intended Audience :: Developers
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Topic :: Database :: Front-Ends
Classifier: Topic :: Software Development :: Libraries
Classifier: Typing :: Typed
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: asyncpg>=0.30
Requires-Dist: typing-extensions>=4.15; python_version < "3.11"
Provides-Extra: dev
Requires-Dist: asyncpg-stubs>=0.30; extra == "dev"
Requires-Dist: build>=1.3; extra == "dev"
Requires-Dist: mypy>=1.18; extra == "dev"
Requires-Dist: ruff>=0.14; extra == "dev"
Dynamic: license-file

# PostgreSQL vector support for asyncpg

Adds PostgreSQL [vector](https://github.com/pgvector/pgvector) support for Python.

Registers the data types `vector` and `halfvec` from the PostgreSQL extension `vector` with the asynchronous PostgreSQL client `asyncpg`, and marshals vector data to and from PostgreSQL database tables.

Internally, the data is packed into a Python `bytes` object: single-precision float vectors take 4 bytes per item (class `Vector`) and half-precision float vectors take 2 bytes per item (class `HalfVector`). Data is packed and unpacked with `struct` from the standard library as needed.
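As a rough illustration of these storage sizes, the sketch below packs a few floats with `struct` using the `f` (single-precision) and `e` (half-precision) format codes. The little-endian byte order and the variable names are assumptions made for the example; it does not claim to reproduce the layout the library uses internally.

```python
import struct

# Sample values chosen to be exactly representable in half precision.
values = [0.5, -1.25, 2.0]

# Single precision: format code "f", 4 bytes per item.
single = struct.pack(f"<{len(values)}f", *values)

# Half precision: format code "e", 2 bytes per item.
half = struct.pack(f"<{len(values)}e", *values)

# Unpacking yields a tuple of Python floats again.
restored = list(struct.unpack(f"<{len(values)}e", half))

print(len(single), len(half), restored)
```

Half precision halves the storage cost at the price of reduced range and precision, which is often an acceptable trade-off for embedding vectors.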

This module provides functionality similar to [pgvector-python](https://github.com/pgvector/pgvector-python) but keeps dependencies to a minimum (e.g. no dependency on `numpy`).

## Setup

#### Install the package

```sh
pip install asyncpg_vector
```

#### Initialize

Register vector types with your database connection or connection pool:

**Connection**:

```python
from asyncpg_vector import register_vector

async def main() -> None:
    ...

    await conn.execute("CREATE EXTENSION IF NOT EXISTS vector")
    await register_vector(conn)
```

**Pool**:

```python
import asyncpg
from asyncpg_vector import register_vector

async def init_connection(conn: asyncpg.Connection) -> None:
    await conn.execute("CREATE EXTENSION IF NOT EXISTS vector")
    await register_vector(conn)

async def main() -> None:
    ...

    pool = await asyncpg.create_pool(..., init=init_connection)
```

#### Perform similarity search

First, create a table and an index:

```python
import asyncpg

async def create(conn: asyncpg.Connection) -> None:
    await conn.execute("""
        CREATE TABLE IF NOT EXISTS items
        (
            id bigint NOT NULL GENERATED ALWAYS AS IDENTITY,
            content text NOT NULL,
            embedding halfvec(1536) NOT NULL,
            CONSTRAINT pk_items PRIMARY KEY (id)
        );

        CREATE INDEX IF NOT EXISTS embedding_index ON items
        USING hnsw (embedding halfvec_cosine_ops);
    """)
```

Next, find documents in a knowledge base that match a search phrase using vector similarity with approximate nearest neighbor semantics:

```python
import asyncpg
from asyncpg_vector import HalfVector

async def search(conn: asyncpg.Connection, phrase: str) -> list[str]:
    ...

    embedding_response = await ai_client.embeddings.create(
        input=phrase,
        model="text-embedding-3-small",
        encoding_format="base64"
    )
    embedding = HalfVector.from_float_base64(embedding_response.data[0].embedding)
    query = """
        SELECT
            id,
            content,
            embedding <=> $1 AS distance
        FROM items
        ORDER BY distance
        LIMIT 5
    """
    rows = await conn.fetch(query, embedding)

    ...
```
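The `<=>` operator in the query above is pgvector's cosine distance, defined as one minus the cosine similarity of the two vectors. The following plain-Python sketch shows what the value in the `distance` column represents; the function name `cosine_distance` is hypothetical and not part of this package, and the database computes the value itself.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Return 1 minus the cosine of the angle between a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Vectors pointing in the same direction: distance close to 0.
same = cosine_distance([1.0, 2.0], [2.0, 4.0])

# Orthogonal vectors: distance 1.
orthogonal = cosine_distance([1.0, 0.0], [0.0, 1.0])

# Vectors pointing in opposite directions: distance 2.
opposite = cosine_distance([1.0, 0.0], [-1.0, 0.0])
```

Smaller distances mean closer matches, which is why the query sorts by `distance` in ascending order.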
