Metadata-Version: 2.4
Name: oiko-query-builder
Version: 0.1.0
Summary: Natural language to PostgreSQL SQL query builder powered by DSPy
Project-URL: Homepage, https://github.com/oiko/oiko-query-builder
Project-URL: Documentation, https://github.com/oiko/oiko-query-builder#readme
Project-URL: Repository, https://github.com/oiko/oiko-query-builder
Project-URL: Issues, https://github.com/oiko/oiko-query-builder/issues
Author-email: Oiko <oiko@example.com>
License-Expression: MIT
License-File: LICENSE
Keywords: ai,dspy,llm,natural-language,postgresql,query-builder,sql,text-to-sql
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Database
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Typing :: Typed
Requires-Python: >=3.10
Requires-Dist: dspy>=2.5
Provides-Extra: all
Requires-Dist: asyncpg>=0.29; extra == 'all'
Requires-Dist: psycopg[binary]>=3.1; extra == 'all'
Provides-Extra: async
Requires-Dist: asyncpg>=0.29; extra == 'async'
Provides-Extra: dev
Requires-Dist: pytest-asyncio>=0.23; extra == 'dev'
Requires-Dist: pytest-cov>=5.0; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Requires-Dist: ruff>=0.5; extra == 'dev'
Provides-Extra: sync
Requires-Dist: psycopg[binary]>=3.1; extra == 'sync'
Description-Content-Type: text/markdown

# oiko-query-builder

Natural language to PostgreSQL SQL query builder powered by [DSPy](https://dspy.ai).

Write questions in plain language, get optimized SQL queries back.

## Installation

```bash
# Core (SQL generation only)
pip install oiko-query-builder

# With sync PostgreSQL support (psycopg)
pip install oiko-query-builder[sync]

# With async PostgreSQL support (asyncpg)
pip install oiko-query-builder[async]

# Everything
pip install oiko-query-builder[all]
```

## Quick Start

```python
from oiko_query_builder import QueryAgent

agent = QueryAgent(
    lm="openai/gpt-4o-mini",
    api_key="sk-...",
    schema="""
    Table: users
      - id: integer, PK
      - name: varchar
      - email: varchar
      - active: boolean
      - created_at: timestamp
    Table: orders
      - id: integer, PK
      - user_id: integer, FK -> users.id
      - total: numeric
      - status: varchar
      - created_at: timestamp
    """,
)

# Generate SQL
result = agent.query("show me the 10 most recent active users")
print(result.sql)
# SELECT id, name, email, created_at
# FROM users
# WHERE active = true
# ORDER BY created_at DESC LIMIT 10
```

## With Database Execution

```python
agent = QueryAgent(
    lm="openai/gpt-4o-mini",
    api_key="sk-...",
    dsn="postgresql://user:pass@localhost:5432/mydb",
)

# Auto-discover schema
agent.introspect()

# Generate + execute
result = agent.execute("how many orders were placed this month?")
print(result.rows)    # [{"count": 142}]
print(result.sql)     # SELECT COUNT(*) AS count FROM orders WHERE ...
```

## Async Support

```python
import asyncio
from oiko_query_builder import QueryAgent

async def main():
    agent = QueryAgent(
        lm="openai/gpt-4o-mini",
        api_key="sk-...",
        dsn="postgresql://user:pass@localhost:5432/mydb",
    )
    await agent.aintrospect()

    result = await agent.aexecute("top 5 customers by total spending")
    print(result.rows)

asyncio.run(main())
```

## Multi-turn Conversations

```python
agent = QueryAgent(lm="openai/gpt-4o-mini", api_key="sk-...", schema="...")

# First question
result = agent.query("show me user John Smith")
print(result.sql)  # SELECT ... WHERE name = 'John Smith'

# Follow-up (agent remembers context)
result = agent.query("show their orders")
print(result.sql)  # SELECT ... FROM orders WHERE user_id = ...

# Set entities manually
agent.set_entity("User", "John Smith (id=7)")
result = agent.query("show their recent orders")
```

## Safety Levels

```python
from oiko_query_builder import QueryAgent, SafetyLevel

# STRICT (default) - SELECT only
agent = QueryAgent(lm="...", api_key="...", safety_level=SafetyLevel.STRICT)

# MODERATE - SELECT, INSERT, UPDATE allowed
agent = QueryAgent(lm="...", api_key="...", safety_level="moderate")

# PERMISSIVE - all operations except DROP DATABASE
agent = QueryAgent(lm="...", api_key="...", safety_level="permissive")
```

## Custom Examples

```python
import dspy
from oiko_query_builder import QueryAgent

agent = QueryAgent(lm="openai/gpt-4o-mini", api_key="sk-...", schema="...")

# Add domain-specific examples for better generation
agent.add_examples([
    dspy.Example(
        question="Show VIP customers",
        db_schema="Table: customers\n  - id: integer\n  - tier: varchar",
        context=None,
        memory_context=None,
        sql_query="SELECT id, name FROM customers WHERE tier = 'vip'",
        has_limit=False,
        limit_value=None,
    ).with_inputs("question", "db_schema", "context", "memory_context"),
])
```

## DSPy Optimization

```python
from oiko_query_builder import QueryAgent, create_training_example

agent = QueryAgent(lm="openai/gpt-4o-mini", api_key="sk-...", schema="...")

# Create training data
train = [
    create_training_example(
        question="active users count",
        schema="Table: users\n  - id: integer\n  - active: boolean",
        expected_sql="SELECT COUNT(*) FROM users WHERE active = true",
    ),
    # ... more examples
]

# Optimize prompts automatically
agent.optimize(train_examples=train)
```

## Supported LLMs

Any LLM supported by DSPy works out of the box:

```python
# OpenAI
agent = QueryAgent(lm="openai/gpt-4o-mini", api_key="sk-...")

# Anthropic
agent = QueryAgent(lm="anthropic/claude-sonnet-4-20250514", api_key="sk-ant-...")

# Ollama (local)
agent = QueryAgent(lm="ollama_chat/llama3.2", api_key="")

# Any other DSPy-supported provider
agent = QueryAgent(lm="together_ai/meta-llama/...", api_key="...")
```

## API Reference

### `QueryAgent`

| Method | Description |
|---|---|
| `query(question)` | Generate SQL from natural language |
| `execute(question)` | Generate + execute SQL |
| `aquery(question)` | Async SQL generation |
| `aexecute(question)` | Async generate + execute |
| `introspect()` | Auto-discover database schema |
| `set_schema(schema)` | Set schema manually |
| `add_examples(examples)` | Add few-shot examples |
| `optimize(train)` | Optimize with DSPy |
| `explain(sql)` | Explain SQL in natural language |
| `set_entity(key, value)` | Set entity in memory |
| `clear_memory()` | Clear conversation history |

### `GenerationResult`

| Field | Type | Description |
|---|---|---|
| `sql` | `str` | Generated SQL query |
| `success` | `bool` | Whether generation succeeded |
| `has_limit` | `bool` | Whether SQL has LIMIT |
| `limit_value` | `int?` | LIMIT value if present |
| `explanation` | `str?` | Chain-of-thought reasoning |
| `retries_used` | `int` | Number of retries used |

### `QueryResult`

| Field | Type | Description |
|---|---|---|
| `sql` | `str` | Executed SQL query |
| `success` | `bool` | Whether execution succeeded |
| `rows` | `list[dict]?` | Query result rows |
| `row_count` | `int?` | Number of rows |
| `execution_time_ms` | `float?` | Execution time |

## License

MIT
