Metadata-Version: 2.3
Name: docs2db-api
Version: 0.2.0
Summary: Query Docs2DB RAG databases with hybrid search and reranking
Keywords: rag,retrieval,embeddings,semantic-search,vector-search,hybrid-search,pgvector,postgresql,llm
Author: Ellis Low
Author-email: Ellis Low <elow@redhat.com>
License: Apache-2.0
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Dist: cachetools>=6.1.0
Requires-Dist: docling>=2.42.1
Requires-Dist: httpx>=0.28.1
Requires-Dist: pgvector>=0.3.8
Requires-Dist: psycopg[binary]>=3.2.3
Requires-Dist: psycopg-pool>=3.2.6
Requires-Dist: pydantic-settings>=2.10.1
Requires-Dist: pyyaml>=6.0.1
Requires-Dist: rich>=14.0.0
Requires-Dist: structlog>=25.4.0
Requires-Dist: torch>=2.7.1
Requires-Dist: torchvision>=0.22.1
Requires-Dist: sentence-transformers>=3.3.0
Requires-Dist: tqdm>=4.67.1
Requires-Dist: xxhash>=3.5.0
Requires-Dist: fastapi>=0.116.1
Requires-Dist: uvicorn>=0.35.0
Requires-Dist: llama-stack>=0.2.20
Requires-Python: >=3.12, <3.13
Project-URL: Changelog, https://github.com/rhel-lightspeed/docs2db-api/blob/main/CHANGELOG.md
Project-URL: Documentation, https://github.com/rhel-lightspeed/docs2db-api#readme
Project-URL: Homepage, https://github.com/rhel-lightspeed/docs2db-api
Project-URL: Issues, https://github.com/rhel-lightspeed/docs2db-api/issues
Project-URL: Repository, https://github.com/rhel-lightspeed/docs2db-api
Description-Content-Type: text/markdown

# Docs2DB-API

Query a [Docs2DB](https://github.com/rhel-lightspeed/docs2db) RAG database with modern retrieval techniques. Docs2DB-API provides a Python library for hybrid search (vector + BM25) with reranking.

**What it does:**
- Queries RAG databases created by [docs2db](https://github.com/rhel-lightspeed/docs2db)
- Hybrid search: combines vector similarity with BM25 full-text search
- Reciprocal Rank Fusion (RRF) for result combination
- Cross-encoder reranking for improved result quality
- Question refinement for query expansion
- Universal RAG engine adaptable to multiple API frameworks

**What it's for:**
- Building RAG applications and agents
- Adding document search to LLM systems
- Serving RAG APIs (FastAPI, LlamaStack, custom frameworks)

## Installation

```bash
uv add docs2db-api
```

## Quickstart

**Step 1: Create a database with [docs2db](https://github.com/rhel-lightspeed/docs2db)**

```bash
uv tool install docs2db
docs2db pipeline /path/to/documents
```

This creates `ragdb_dump.sql`.

**Step 2: Restore and query**

```bash
# Start database
uv run docs2db-api db-start

# Restore dump
uv run docs2db-api db-restore ragdb_dump.sql

# Check status
uv run docs2db-api db-status
```

**Step 3: Use in your application**

```python
import asyncio
from docs2db_api.rag.engine import UniversalRAGEngine, RAGConfig

async def main():
    # Initialize engine with defaults (auto-detects database from environment)
    engine = UniversalRAGEngine()
    await engine.start()
    
    # # Or with specific settings
    # config = RAGConfig(
    #     model_name="granite-30m-english",
    #     max_chunks=5,
    #     similarity_threshold=0.7
    # )
    # db_config = {
    #     "host": "localhost",
    #     "port": "5432",
    #     "database": "ragdb",
    #     "user": "postgres",
    #     "password": "postgres"
    # }
    # engine = UniversalRAGEngine(config=config, db_config=db_config)
    # await engine.start()
    
    # Search
    result = await engine.search_documents("How do I configure authentication?")
    for doc in result.documents:
        print(f"Score: {doc['similarity_score']:.3f}")
        print(f"Source: {doc['document_path']}")
        print(f"Text: {doc['text'][:200]}...\n")

asyncio.run(main())
```

## LlamaStack Integration

Docs2DB-API includes a native LlamaStack tool provider for agent-based RAG. See the complete demo with setup scripts and examples:

**[📁 demos/llama-stack/](demos/llama-stack/)** - LlamaStack RAG tool provider with agent demos

(this needs to be adjusted to work from pypi)

## Configuration

### Database Configuration

**Configuration precedence (highest to lowest):**
1. CLI arguments: `--host`, `--port`, `--db`, `--user`, `--password`
2. Environment variables: `DOCS2DB_DB_HOST`, `DOCS2DB_DB_PORT`, `DOCS2DB_DB_DATABASE`, `DOCS2DB_DB_USER`, `DOCS2DB_DB_PASSWORD`
3. `DOCS2DB_DB_URL`: `postgresql://user:pass@host:port/database`
4. `postgres-compose.yml` in current directory
5. Defaults: `localhost:5432`, user=`postgres`, password=`postgres`, db=`ragdb`

**Examples:**

```bash
# Use defaults
uv run docs2db-api db-status

# Environment variables
export DOCS2DB_DB_HOST=prod.example.com
export DOCS2DB_DB_DATABASE=mydb
uv run docs2db-api db-status

# DOCS2DB_DB_URL (cloud providers)
export DOCS2DB_DB_URL="postgresql://user:pass@host:5432/db"
uv run docs2db-api db-status

# CLI arguments
uv run docs2db-api db-status --host localhost --db mydb
```

**Note:** Don't mix `DOCS2DB_DB_URL` with individual `DOCS2DB_DB_*` variables.

### LLM Configuration (Query Refinement)

Configure the LLM used for query refinement:

```bash
export DOCS2DB_LLM_BASE_URL=http://localhost:11434      # OpenAI-compatible API (e.g., Ollama)
export DOCS2DB_LLM_MODEL=qwen2.5:7b-instruct            # Model name
export DOCS2DB_LLM_TIMEOUT=30.0                         # HTTP timeout (seconds)
export DOCS2DB_LLM_TEMPERATURE=0.7                      # Generation temperature
export DOCS2DB_LLM_MAX_TOKENS=500                       # Max tokens per response
```

## RAG Configuration

RAG settings control retrieval behavior (similarity thresholds, reranking, refinement, etc.) and can be stored in the database or provided at query time.

### Available Settings

- `refinement_prompt` - Custom prompt for query refinement
- `enable_refinement` (refinement) - Enable question refinement (true/false)
- `enable_reranking` (reranking) - Enable cross-encoder reranking (true/false)
- `similarity_threshold` - Similarity threshold 0.0-1.0
- `max_chunks` - Maximum chunks to return
- `max_tokens_in_context` - Maximum tokens in context window
- `refinement_questions_count` - Number of refined questions to generate

### Configuration Precedence (highest to lowest)

1. **Query parameters** - Passed directly to `engine.search_documents()` or CLI `--threshold`, `--limit`, etc.
2. **RAGConfig object** - Provided when initializing `UniversalRAGEngine`
3. **Database settings** - Stored in database via `docs2db config` command
(see [docs2db](https://github.com/rhel-lightspeed/docs2db))
4. **Code defaults** - Built-in fallback values

## Commands

### Database Management

```bash
docs2db-api db-start               # Start PostgreSQL with Podman/Docker
docs2db-api db-stop                # Stop PostgreSQL (data preserved)
docs2db-api db-destroy             # Stop and delete all data
docs2db-api db-status              # Check connection and stats
docs2db-api db-restore <file>      # Restore database from dump
docs2db-api manifest               # Generate list of documents
```

### Querying

```bash
# Basic search
docs2db-api query "How do I configure authentication?"

# Advanced options
docs2db-api query "deployment guide" \
  --model granite-30m-english \
  --limit 20 \
  --threshold 0.8 \
  --no-refine                     # Disable question refinement
```

## RAG Features

Docs2DB-API implements modern retrieval techniques:

- **Contextual chunks** - LLM-generated context situating each chunk within its document ([Anthropic's approach](https://www.anthropic.com/engineering/contextual-retrieval))
- **Hybrid search** - Combines BM25 (lexical) and vector embeddings (semantic)
- **Reciprocal Rank Fusion (RRF)** - Intelligent result combination
- **Cross-encoder reranking** - Improved result quality
- **Question refinement** - Query expansion for better matches
- **PostgreSQL full-text search** - tsvector with GIN indexing for BM25
- **pgvector similarity** - Fast vector search with HNSW indexes
- **Universal RAG engine** - Adaptable to multiple API frameworks

## License

See [LICENSE](LICENSE) for details.
