Metadata-Version: 2.3
Name: pgai
Version: 0.10.0
Summary: AI workflows in your PostgreSQL database
Project-URL: Homepage, https://github.com/timescale/pgai
Project-URL: Repository, https://github.com/timescale/pgai
Project-URL: Bug Tracker, https://github.com/timescale/pgai/issues
Project-URL: Documentation, https://github.com/timescale/pgai/tree/main/docs
Keywords: ai,postgres
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: PostgreSQL License
Classifier: Operating System :: MacOS
Classifier: Operating System :: OS Independent
Classifier: Operating System :: POSIX
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Database
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Typing :: Typed
Requires-Python: >=3.10
Requires-Dist: boto3-stubs[s3,sts]>=1.37.30
Requires-Dist: boto3<2.0,>=1.35.0
Requires-Dist: click<9.0,>=8.0
Requires-Dist: datadog-lambda<7.0,>=6.9
Requires-Dist: docling==2.21.0
Requires-Dist: exceptiongroup<2.0,>=1.0
Requires-Dist: filetype==1.2.0
Requires-Dist: google-cloud-aiplatform<2.0,>=1.78.0
Requires-Dist: ijson>=3.3.0
Requires-Dist: langchain-openai<1.0,>=0.1
Requires-Dist: langchain-text-splitters<1.0,>=0.2
Requires-Dist: litellm<1.66.0,>=1.65.0
Requires-Dist: ollama<0.5.0,>=0.4.5
Requires-Dist: openai<2.0,>=1.44
Requires-Dist: pgvector<1.0,>=0.3
Requires-Dist: psycopg[binary]<4.0,>=3.2
Requires-Dist: pydantic<3.0,>=2.0
Requires-Dist: pymupdf4llm==0.0.17
Requires-Dist: python-dotenv<2.0,>=1.0
Requires-Dist: pytimeparse<2.0,>=1.1
Requires-Dist: semver>=3.0.4
Requires-Dist: smart-open==7.1.0
Requires-Dist: structlog<25.0,>=24.0
Requires-Dist: tiktoken<1.0,>=0.7
Requires-Dist: typing-extensions<5.0,>=4.0
Requires-Dist: voyageai<0.3.2,>=0.3.1
Provides-Extra: sqlalchemy
Requires-Dist: alembic>=1.14.0; extra == 'sqlalchemy'
Requires-Dist: sqlalchemy>=2.0.36; extra == 'sqlalchemy'
Description-Content-Type: text/markdown

<p align="center">
    <img height="200" src="https://github.com/timescale/pgai/blob/main/docs/images/pgai_logo.png?raw=true" alt="pgai"/>
</p>

<p></p>
<div align=center>

<h3>Power your RAG and Agentic applications with PostgreSQL</h3>

<div>
  <a href="https://github.com/timescale/pgai/tree/main/docs"><strong>Docs</strong></a> ·
  <a href="https://discord.gg/KRdHVXAmkp"><strong>Join the pgai Discord!</strong></a> ·
  <a href="https://tsdb.co/gh-pgai-signup"><strong>Try Timescale for free!</strong></a> ·
  <a href="https://github.com/timescale/pgai/releases"><strong>Changelog</strong></a>
</div>
</div>
<br/>

A Python library that transforms PostgreSQL into a robust, production-ready retrieval engine for RAG and Agentic applications.

- 🔄 Automatically create and synchronize vector embeddings from PostgreSQL data and S3 documents. Embeddings update automatically as data changes.

- 🔍 Powerful vector and semantic search with pgvector and pgvectorscale.

- 🛡️ Production-ready out of the box: supports batch processing for efficient embedding generation, with built-in handling for model failures, rate limits, and latency spikes.

- 🐘 Works with any PostgreSQL database, including Timescale Cloud, Amazon RDS, Supabase and more.

# Getting Started

Install the Python package:

```bash
pip install pgai
```

Then set up pgai in your database, pointing `-d` at its connection URL:

```bash
pgai install -d "postgresql://postgres:postgres@localhost:5432/postgres"
```

The key strength of pgai Vectorizer is its declarative approach to embedding
generation: you define the pipeline, and Vectorizer handles the operational
complexity of keeping embeddings in sync, even when embedding endpoints are
unreliable. A simple pipeline can be defined as follows:

```sql
SELECT ai.create_vectorizer(
    'wiki'::regclass,
    loading => ai.loading_column(column_name => 'text'),
    embedding => ai.embedding_openai(model => 'text-embedding-ada-002', dimensions => 1536),
    destination => ai.destination_table(target_table => 'wiki_embedding_storage')
);
```

The vectorizer automatically creates embeddings for every row in the `wiki`
table and, more importantly, keeps them in sync with the underlying data as it
changes. **Think of it almost like declaring an index** on the `wiki` table, but
instead of the database managing the index data structure for you, the
vectorizer manages the embeddings.
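
Once the vectorizer has populated `wiki_embedding_storage`, the embeddings can be queried like any other table. The following is a minimal sketch of a similarity search from Python, not an official pgai API. It assumes the destination table exposes `chunk` and `embedding` columns, and it uses pgvector's `<=>` cosine distance operator. The names `search_wiki` and `to_vector_literal` are hypothetical helpers introduced only for illustration.

```python
from typing import Any, Sequence

# Assumption: the destination table exposes `chunk` (text) and `embedding`
# (pgvector) columns; adjust the names to match your schema.
SEARCH_SQL = """
SELECT chunk, embedding <=> %s::vector AS distance
FROM wiki_embedding_storage
ORDER BY distance
LIMIT %s
"""


def to_vector_literal(values: Sequence[float]) -> str:
    """Format numbers in pgvector's text form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


def search_wiki(conn: Any, query_embedding: Sequence[float], limit: int = 5) -> list:
    """Return the `limit` chunks closest to `query_embedding` by cosine distance.

    `conn` is expected to behave like a psycopg 3 connection,
    e.g. the result of psycopg.connect(DB_URL).
    """
    with conn.cursor() as cur:
        cur.execute(SEARCH_SQL, (to_vector_literal(query_embedding), limit))
        return cur.fetchall()
```

With a real database, this would be called as `search_wiki(psycopg.connect(DB_URL), query_embedding)`, where `query_embedding` must come from the same model the vectorizer uses (here `text-embedding-ada-002`), so that query and stored vectors share one embedding space.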

Check out our full quick start on [GitHub](https://github.com/timescale/pgai#quick-start).
