Metadata-Version: 2.4
Name: llm-json-streaming
Version: 0.1.1
Summary: A unified interface for streaming structured JSON from OpenAI, Anthropic, and Google Gemini.
Project-URL: Homepage, https://github.com/daniel-style/llm-json-streaming
Project-URL: Bug Tracker, https://github.com/daniel-style/llm-json-streaming/issues
Project-URL: Documentation, https://github.com/daniel-style/llm-json-streaming#readme
Project-URL: Source Code, https://github.com/daniel-style/llm-json-streaming
Project-URL: PyPI Package, https://pypi.org/project/llm-json-streaming/
Project-URL: Test PyPI, https://test.pypi.org/project/llm-json-streaming/
Project-URL: API Reference, https://github.com/daniel-style/llm-json-streaming/blob/main/API_REFERENCE.md
Project-URL: Changelog, https://github.com/daniel-style/llm-json-streaming/blob/main/CHANGELOG.md
Author-email: Daniel Wu <edanielwu@gmail.com>
License-File: LICENSE
Keywords: anthropic,gemini,json,llm,openai,pydantic,streaming,structured-outputs
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.9
Requires-Dist: anthropic
Requires-Dist: google-genai
Requires-Dist: json-repair>=0.30.0
Requires-Dist: openai>=2.0.0
Requires-Dist: pydantic>=2.0
Requires-Dist: python-dotenv
Provides-Extra: dev
Requires-Dist: black>=23.0.0; extra == 'dev'
Requires-Dist: isort>=5.0.0; extra == 'dev'
Requires-Dist: mypy>=1.0.0; extra == 'dev'
Requires-Dist: pre-commit>=3.0.0; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.21.0; extra == 'dev'
Requires-Dist: pytest-cov>=4.0.0; extra == 'dev'
Requires-Dist: pytest>=7.0.0; extra == 'dev'
Description-Content-Type: text/markdown

# LLM JSON Streaming

[![PyPI Version](https://img.shields.io/pypi/v/llm-json-streaming.svg)](https://pypi.org/project/llm-json-streaming/)
[![Python Versions](https://img.shields.io/pypi/pyversions/llm-json-streaming.svg)](https://pypi.org/project/llm-json-streaming/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Tests](https://img.shields.io/badge/Tests-Passing-brightgreen.svg)](https://github.com/daniel-style/llm-json-streaming/actions)

A unified Python library for streaming structured JSON outputs from OpenAI, Anthropic (Claude), and Google Gemini.

This library abstracts the differences between providers' structured output APIs and provides a consistent interface to stream JSON data and parsed Pydantic objects.

## Features

- **Unified Interface**: Use a single API to interact with OpenAI, Anthropic, and Google Gemini.
- **JSON Streaming**: Access raw JSON chunks as they are generated (`delta`).
- **Structured Outputs**: Enforce schema validation using Pydantic models.
- **Partial Parsing**: Access accumulated JSON strings during streaming.
- **Claude Structured Outputs**: Automatically upgrades Claude Sonnet 4.5 / Opus 4.1 requests to Anthropic's structured outputs for guaranteed schema conformance.
- **Claude Prefill Strategy**: For older Claude models, schema-aware prefilling keeps responses JSON-only and streams deltas without any tool calls. Includes JSON repair for partial object support.
- **Google Gemini Support**: Native structured outputs with JSON repair for enhanced partial object support.

## Installation

### 📦 From PyPI (Recommended)

Install the package from PyPI using `pip` or `uv`:

```bash
# Using pip
pip install llm-json-streaming

# Using uv (recommended)
uv add llm-json-streaming
```

### 🧪 From Test PyPI

For testing pre-release versions:

```bash
# Using pip
pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ llm-json-streaming

# Using uv
uv add --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ llm-json-streaming
```

### 🛠️ From Source

Install from source for development:

```bash
# Clone the repository
git clone https://github.com/daniel-style/llm-json-streaming.git
cd llm-json-streaming

# Using uv (recommended)
uv sync

# Or using pip
pip install -e .
```

### 📋 Package Information

- **PyPI**: https://pypi.org/project/llm-json-streaming/
- **Test PyPI**: https://test.pypi.org/project/llm-json-streaming/
- **Current Version**: 0.1.1
- **Python**: 3.9+
- **Dependencies**: Automatically installed

## Configuration

Set your API keys in a `.env` file:

```ini
OPENAI_API_KEY=your_openai_api_key
OPENAI_BASE_URL=https://api.openai.com/v1

ANTHROPIC_API_KEY=your_anthropic_api_key
ANTHROPIC_BASE_URL=https://api.anthropic.com

GEMINI_API_KEY=your_gemini_api_key
GOOGLE_BASE_URL=https://generativelanguage.googleapis.com  # Optional
```
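
The providers read these keys from the environment. A minimal sketch of loading the `.env` file explicitly with `python-dotenv` (already a dependency) and, alternatively, passing a key directly via the `api_key` parameter mentioned in the Quick Start below:

```python
import os

from dotenv import load_dotenv
from llm_json_streaming import create_provider

# Load variables from .env into the process environment
load_dotenv()

# Keys are picked up from the environment by default
openai_provider = create_provider("openai")

# Or pass a key explicitly, e.g. when secrets are managed elsewhere
anthropic_provider = create_provider(
    "anthropic", api_key=os.environ["ANTHROPIC_API_KEY"]
)
```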

## Usage

### 🚀 Quick Start

Define your output schema using Pydantic and pass it to the provider:

```python
import asyncio
from pydantic import BaseModel
from llm_json_streaming import create_provider

# 1. Define your schema
class UserProfile(BaseModel):
    name: str
    age: int
    bio: str
    skills: list[str] = []

async def main():
    # 2. Initialize provider using the factory
    # Available: "openai", "anthropic", "claude", "google"
    # Ensure environment variables are set, or pass api_key="..."
    try:
        # For Anthropic, you can optionally specify mode:
        provider = create_provider("openai")  # Use OpenAI
        # provider = create_provider("anthropic", mode="auto")  # Anthropic with auto-detection
        # provider = create_provider("google")  # Google Gemini
    except ValueError as e:
        print(f"Provider creation error: {e}")
        return

    prompt = "Generate a profile for a fictional software engineer."

    # 3. Stream results
    print("🔄 Streaming JSON...")
    try:
        async for chunk in provider.stream_json(prompt, UserProfile):
            # Real-time partial parsed object (recommended for streaming updates)
            if "partial_object" in chunk:
                obj = chunk["partial_object"]
                # Handle both dict and Pydantic objects
                if hasattr(obj, 'name'):  # Pydantic object
                    name = obj.name or "..."
                    age = obj.age if obj.age else "?"
                else:  # Dict object
                    name = obj.get('name', "...")
                    age = obj.get('age', "?")

                print(f"\r📝 Current: {name}, Age: {age}", end="", flush=True)

            # Final parsed object (complete and validated)
            if "final_object" in chunk:
                final_profile = chunk["final_object"]
                print(f"\n\n✅ Complete: {final_profile.name}, Age: {final_profile.age}")
                print(f"📋 Bio: {final_profile.bio}")
                if final_profile.skills:
                    print(f"🛠️  Skills: {', '.join(final_profile.skills)}")
                break

    except Exception as e:
        print(f"\n❌ Error during streaming: {e}")

if __name__ == "__main__":
    asyncio.run(main())
```

### 🔧 Advanced Usage

#### Multiple Providers Comparison

```python
import asyncio
from llm_json_streaming import create_provider
from pydantic import BaseModel

class TaskResult(BaseModel):
    title: str
    status: str
    priority: int

async def compare_providers():
    providers = {
        "OpenAI": create_provider("openai"),
        "Anthropic": create_provider("anthropic", mode="auto"),
        "Google": create_provider("google")
    }

    prompt = "Create a software development task with title, status, and priority"

    results = {}
    for name, provider in providers.items():
        try:
            async for chunk in provider.stream_json(prompt, TaskResult):
                if "final_object" in chunk:
                    results[name] = chunk["final_object"]
                    print(f"✅ {name}: {results[name].title}")
                    break
        except Exception as e:
            print(f"❌ {name} failed: {e}")

    return results

# Run comparison
# asyncio.run(compare_providers())
```

#### Error Handling & Type Safety

```python
import asyncio
from llm_json_streaming import create_provider
from pydantic import BaseModel, ValidationError

class APIResponse(BaseModel):
    success: bool
    data: dict
    error_message: str = ""

async def safe_streaming_example():
    try:
        provider = create_provider("anthropic")  # Fallback provider

        async for chunk in provider.stream_json(
            "Process this user request",
            APIResponse
        ):
            if "partial_object" in chunk:
                obj = chunk["partial_object"]

                # Safe object handling
                if isinstance(obj, dict):
                    # Handle dict objects
                    success = obj.get('success', False)
                elif hasattr(obj, 'success'):
                    # Handle Pydantic objects
                    success = obj.success
                else:
                    print("⚠️  Unexpected object type")
                    continue

                # Process partial results...

            if "final_object" in chunk:
                final = chunk["final_object"]
                print(f"✅ Final result: {final}")
                break

    except ValidationError as e:
        print(f"❌ Schema validation error: {e}")
    except Exception as e:
        print(f"❌ Streaming error: {e}")
```

## Streaming Interface

The `stream_json()` method yields dictionaries with different types of content during streaming:

### Chunk Fields

- **`partial_object`**: The current best parsed object. Available from the beginning of streaming in all modes:
  - **Early stage**: Returns partial dictionaries for incomplete JSON
  - **Later stage**: Returns validated Pydantic model instances for complete/repairable JSON
- **`delta`**: Raw text characters as they are generated by the LLM (see the raw-stream sketch after this list).
- **`final_object`**: The complete, validated Pydantic object when streaming finishes.
- **`partial_json`**: The current accumulated JSON text string.
- **`final_json`**: The complete JSON text string when streaming finishes.
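
If you want the raw stream rather than parsed objects, `delta`, `partial_json`, and `final_json` can drive a plain-text display. A minimal sketch (`prompt` and `UserProfile` as in the Quick Start):

```python
async for chunk in provider.stream_json(prompt, UserProfile):
    # Raw characters as they arrive
    if "delta" in chunk:
        print(chunk["delta"], end="", flush=True)

    # JSON text accumulated so far
    if "partial_json" in chunk:
        current_json_text = chunk["partial_json"]

    # Complete JSON string at the end of the stream
    if "final_json" in chunk:
        print("\nFinal JSON:", chunk["final_json"])
        break
```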

### Recommended Usage Pattern

```python
async for chunk in provider.stream_json(prompt, UserProfile):
    # Use partial_object for real-time updates (recommended)
    if "partial_object" in chunk:
        user_profile = chunk["partial_object"]
        # Available from the beginning - starts as dict, becomes Pydantic object
        # Handle both types gracefully for consistent UI updates
        if hasattr(user_profile, 'model_dump'):
            # Pydantic model (complete/repairable JSON)
            name = user_profile.name or "..."
        else:
            # Dictionary (incomplete JSON)
            name = user_profile.get('name', "...")

        update_ui(name)  # Update UI with current best data

    # Use final_object for the final result
    if "final_object" in chunk:
        final_profile = chunk["final_object"]
        # Process the complete validated object
        save_result(final_profile)
```

## Supported Providers & Models

| Provider | Default Model | Method Used |
|----------|---------------|-------------|
| OpenAI   | `gpt-4o-2024-08-06` | `response_format` (Structured Outputs) via `beta.chat.completions` |
| Anthropic   | `claude-3-5-sonnet-20240620` (auto-switches to Structured Outputs for `claude-sonnet-4.5*` / `claude-opus-4.1*`) | Prefill JSON streaming for legacy models, Structured Outputs (`output_format` + beta header) for Sonnet 4.5 / Opus 4.1 |
| Google   | `gemini-2.5-flash` | `response_mime_type="application/json"` with structured outputs via Google GenAI SDK |

### Anthropic Mode Configuration

You can configure which strategy Anthropic models use through multiple methods:

#### Method 1: Constructor Mode (Recommended)

```python
from llm_json_streaming import create_provider

# Force structured outputs mode
provider = create_provider("anthropic", mode="structured")

# Force prefill mode
provider = create_provider("anthropic", mode="prefill")

# Auto-detection based on model (default)
provider = create_provider("anthropic", mode="auto")
```

#### Method 2: Method Parameter Override

```python
# Temporary override per request
async for chunk in provider.stream_json(prompt, UserProfile,
                                        model="claude-3-5-sonnet-20240620",
                                        use_structured_outputs=True):
    # Uses structured outputs regardless of auto-detection
    ...
```

#### Mode Priority

1. **Constructor mode** (`mode=` parameter) - Highest priority
2. **Method parameter** (`use_structured_outputs=`) - Medium priority
3. **Auto-detection** - Based on model capabilities - Lowest priority

### Anthropic Structured Outputs

Claude Sonnet 4.5 and Claude Opus 4.1 support Anthropic's structured output beta.
When using structured mode, chunks include partial JSON text and final Pydantic objects automatically.
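
A minimal sketch of forcing structured mode (`prompt` and `UserProfile` as in the Quick Start; the model string is illustrative, substitute your actual Sonnet 4.5 / Opus 4.1 identifier):

```python
from llm_json_streaming import create_provider

provider = create_provider("anthropic", mode="structured")

async for chunk in provider.stream_json(
    prompt,
    UserProfile,
    model="claude-sonnet-4-5",  # illustrative model name
):
    if "final_object" in chunk:
        print(chunk["final_object"])
        break
```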

### Anthropic Prefill Mode

All other Claude models receive schema-derived instructions and an assistant prefill (e.g., `{` or `{"field":`) so they skip generic preambles and stream JSON directly—no tool definitions or tool-use deltas required.

Enhanced with multi-level partial object support (see the sketch after this list):
- **Real-time partial objects**: Available from the first token, even with incomplete JSON
- **Progressive improvement**: Starts with partial dictionaries, upgrades to Pydantic objects when JSON becomes complete
- **JSON repair**: Automatically fixes incomplete JSON to enable better partial parsing
- **Consistent interface**: Behaves like structured outputs while maintaining backward compatibility
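
A minimal sketch of forcing prefill mode (`prompt` and `UserProfile` as in the Quick Start):

```python
from llm_json_streaming import create_provider

provider = create_provider("anthropic", mode="prefill")

async for chunk in provider.stream_json(prompt, UserProfile):
    if "partial_object" in chunk:
        obj = chunk["partial_object"]
        # Dict while the JSON is incomplete, Pydantic model once it is complete/repairable
        kind = "model" if hasattr(obj, "model_dump") else "dict"
        print(f"partial ({kind}): {obj}")
    if "final_object" in chunk:
        print("final:", chunk["final_object"])
        break
```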

### Google Gemini Support

Google Gemini models use the Google GenAI SDK with native structured outputs:

```python
from llm_json_streaming import create_provider

provider = create_provider("google")
async for chunk in provider.stream_json(prompt, UserProfile, model="gemini-2.5-flash"):
    # Handle streaming chunks
    if "partial_object" in chunk:
        print(chunk["partial_object"])
```

**Key Features:**
- **Native Structured Outputs**: Uses `response_mime_type="application/json"` for guaranteed JSON responses
- **JSON Repair**: Automatic repair of incomplete JSON for enhanced partial object support
- **Schema Validation**: Direct Pydantic schema integration for type-safe responses
- **Streaming**: Real-time partial objects with progressive enhancement

**Configuration:**
- Set `GEMINI_API_KEY` environment variable (required)
- Optionally set `GOOGLE_BASE_URL` for custom endpoints
- Default model: `gemini-2.5-flash`

## Testing

### Running Tests

To run the tests with `uv`:

```bash
# Run all tests
uv run pytest

# Run specific test file
uv run pytest tests/test_providers.py

# Run with coverage
uv run pytest --cov=llm_json_streaming
```

### Quick Validation

Test the package installation and basic functionality:

```bash
# Using the test package
git clone https://github.com/daniel-style/llm-json-streaming.git
cd llm-json-streaming/test_package

# Test with uv
cd llm-test-project
uv add llm-json-streaming==0.1.1
uv run python basics_test.py
```

## Troubleshooting

### 🔧 Common Issues

#### Installation Issues

**Problem**: `ModuleNotFoundError: No module named 'llm_json_streaming'`
```bash
# Solution: Install the package
pip install llm-json-streaming
# or
uv add llm-json-streaming
```

**Problem**: Dependency conflicts
```bash
# Solution: Use virtual environment
python -m venv myenv
source myenv/bin/activate  # Windows: myenv\Scripts\activate
pip install llm-json-streaming
```

#### API Key Issues

**Problem**: Authentication errors
```bash
# Solution: Set environment variables
export OPENAI_API_KEY="your-key"
export ANTHROPIC_API_KEY="your-key"
export GEMINI_API_KEY="your-key"

# Or create .env file
echo "OPENAI_API_KEY=your-key" > .env
```

#### Streaming Issues

**Problem**: No `final_object` received
- **Cause**: Provider might have returned incomplete JSON
- **Solution**: Check `partial_object` for partial results and improve prompt clarity (see the fallback sketch below)
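
A minimal sketch of keeping the last `partial_object` as a fallback when the stream ends without a `final_object`:

```python
last_partial = None
final = None

async for chunk in provider.stream_json(prompt, UserProfile):
    if "partial_object" in chunk:
        last_partial = chunk["partial_object"]
    if "final_object" in chunk:
        final = chunk["final_object"]
        break

if final is None:
    print("⚠️  No final_object received; falling back to the last partial result")
result = final if final is not None else last_partial
```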

**Problem**: Mixed object types (dict vs Pydantic)
```python
# Solution: Handle both types safely
if "partial_object" in chunk:
    obj = chunk["partial_object"]
    if hasattr(obj, 'field_name'):  # Pydantic object
        value = obj.field_name
    else:  # Dict object
        value = obj.get('field_name')
```

#### Provider-Specific Issues

**OpenAI**:
- Model: `gpt-4o-2024-08-06` (default)
- Rate limits: Check OpenAI API quotas
- Service issues: Check [OpenAI Status](https://status.openai.com/)

**Anthropic**:
- Model: `claude-3-5-sonnet-20240620` (default)
- Structured outputs: Available for Claude Sonnet 4.5+ and Opus 4.1+
- Mode selection: `auto`, `structured`, `prefill`

**Google Gemini**:
- Model: `gemini-2.5-flash` (default)
- API key: Required, no free tier
- Regional availability: Check [Google AI Studio](https://aistudio.google.com/)

### 🚨 Error Codes Reference

| Error Code | Description | Solution |
|------------|-------------|----------|
| 401 | Invalid API key | Check environment variables |
| 429 | Rate limit exceeded | Wait and retry, or upgrade plan |
| 503 | Service unavailable | Try again later or switch provider |
| ValueError | Invalid provider name | Use: "openai", "anthropic", "claude", "google" |
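
For transient errors such as 429 and 503, retrying the whole stream with a short backoff is usually enough. A minimal sketch; `stream_with_retry` is a hypothetical helper, not part of this library:

```python
import asyncio

async def stream_with_retry(provider, prompt, schema, attempts=3, backoff=2.0):
    """Retry the full stream on transient errors (hypothetical helper)."""
    for attempt in range(1, attempts + 1):
        try:
            async for chunk in provider.stream_json(prompt, schema):
                if "final_object" in chunk:
                    return chunk["final_object"]
            return None  # stream ended without a final object
        except Exception as exc:
            if attempt == attempts:
                raise
            print(f"Attempt {attempt} failed ({exc}); retrying in {backoff:.0f}s")
            await asyncio.sleep(backoff)
            backoff *= 2
```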

### 📞 Getting Help

1. **Check the [test results](test_package/TEST_RESULTS.md)** for known issues
2. **Review usage examples** in the test package
3. **Open an issue** on GitHub with:
   - Python version
   - Package version
   - Error message
   - Minimal reproduction code
4. **Check provider documentation**:
   - [OpenAI API](https://platform.openai.com/docs)
   - [Anthropic API](https://docs.anthropic.com)
   - [Google Gemini API](https://ai.google.dev/docs)

## Contributing

Pull requests are welcome! For major changes, please open an issue first to discuss what you would like to change.

### Development Setup

```bash
# Clone and set up development environment
git clone https://github.com/daniel-style/llm-json-streaming.git
cd llm-json-streaming

# Using uv (recommended)
uv sync

# Or using pip
pip install -e ".[dev]"
```

### Running Tests

```bash
# All tests
uv run pytest

# With coverage
uv run pytest --cov=llm_json_streaming

# Specific provider tests
uv run pytest tests/test_openai_integration.py
```

## License

[MIT](LICENSE)

## 📚 Additional Resources

- **PyPI Package**: https://pypi.org/project/llm-json-streaming/
- **Test PyPI**: https://test.pypi.org/project/llm-json-streaming/
- **GitHub Repository**: https://github.com/daniel-style/llm-json-streaming
- **Issue Tracker**: https://github.com/daniel-style/llm-json-streaming/issues
- **Documentation**: See inline code documentation and examples
