Metadata-Version: 2.4
Name: cmdop-llm
Version: 0.1.12
Summary: Python SDK for CMDOP LLM Service - OpenAI-compatible API for 200+ AI models
Project-URL: Homepage, https://cmdop.com
Project-URL: Documentation, https://sdk.cmdop.com
Project-URL: Repository, https://github.com/markolofsen/cmdop-client
Project-URL: Issues, https://github.com/markolofsen/cmdop-client/issues
Author-email: CMDOP Team <support@cmdop.com>
License-Expression: MIT
License-File: LICENSE
Keywords: ai,anthropic,api,chat,claude,cmdop,gpt,llm,openai,openrouter
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Typing :: Typed
Requires-Python: >=3.9
Requires-Dist: httpx<1,>=0.23.0
Requires-Dist: openai<3.0.0,>=1.0.0
Requires-Dist: pydantic<3,>=2.0.0
Requires-Dist: typing-extensions>=4.5.0
Provides-Extra: dev
Requires-Dist: mypy>=1.0.0; extra == 'dev'
Requires-Dist: pyright>=1.1.300; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.21.0; extra == 'dev'
Requires-Dist: pytest>=7.0.0; extra == 'dev'
Requires-Dist: respx>=0.20.0; extra == 'dev'
Requires-Dist: ruff>=0.1.0; extra == 'dev'
Description-Content-Type: text/markdown

# CMDOP LLM Python SDK

Python SDK for the CMDOP LLM Service: an OpenAI-compatible API for 200+ AI models.

## Installation

```bash
pip install cmdop-llm
```

## Quick Start

```python
from cmdop_llm import CmdopLLM

client = CmdopLLM(api_key="your-api-key")

response = client.chat.completions.create(
    model="anthropic/claude-sonnet-4",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
```

## Features

- **Drop-in OpenAI replacement** - Same API, different models
- **200+ Models** - GPT-4, Claude, Llama, Mistral, Gemini via single endpoint
- **Streaming** - Real-time token streaming
- **Tool Calling** - Function calling support
- **Structured Output** - Parse responses to Pydantic models
- **Embeddings** - Text embedding generation
- **Vision & OCR** - Image analysis and text extraction (auto model selection)
- **Image Generation** - FLUX, Gemini and other models (auto model selection)
- **Web Search** - AI-powered web search with citations
- **URL Fetch** - Fetch and analyze any URL content
- **Models API** - List, filter, and get pricing for all available models
- **Short Links** - URL shortening service (Bitly-like)
- **Async Support** - Full async/await support

## Environment Variables

```bash
export CMDOP_API_KEY="your-api-key"
export CMDOP_BASE_URL="https://llm.cmdop.com/v1"  # Optional, default
```

## Usage Examples

### Chat Completion

```python
from cmdop_llm import CmdopLLM

client = CmdopLLM()

response = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing."}
    ],
    temperature=0.7,
    max_tokens=1000,
)
print(response.choices[0].message.content)
```

### Streaming

```python
stream = client.chat.completions.create(
    model="anthropic/claude-sonnet-4",
    messages=[{"role": "user", "content": "Write a poem."}],
    stream=True,
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
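If you want the complete reply as well as live output, accumulate the deltas as they arrive. A small helper sketch (not part of the SDK):

```python
def collect_stream(stream) -> str:
    """Accumulate streamed deltas into the full reply text."""
    parts = []
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:
            print(delta, end="")   # live output, as above
            parts.append(delta)
    return "".join(parts)

# full_text = collect_stream(stream)
```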

### Tool Calling

```python
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string"}
            },
            "required": ["location"]
        }
    }
}]

response = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=tools,
    tool_choice="auto",
)

if response.choices[0].message.tool_calls:
    tool_call = response.choices[0].message.tool_calls[0]
    print(f"Function: {tool_call.function.name}")
    print(f"Arguments: {tool_call.function.arguments}")
```
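The model only *requests* the call; your code executes it and sends the result back as a `tool` message so the model can answer in natural language. A minimal round-trip sketch (the local `get_weather` implementation and the `run_tool_call` helper are illustrative, not part of the SDK):

```python
import json

# Hypothetical local implementation of the advertised tool:
def get_weather(location: str) -> str:
    return f"Sunny in {location}"

LOCAL_TOOLS = {"get_weather": get_weather}

def run_tool_call(name: str, arguments_json: str) -> str:
    """Dispatch a model tool call to the matching local function."""
    args = json.loads(arguments_json)  # arguments arrive as a JSON string
    return LOCAL_TOOLS[name](**args)

# Feed the result back for a final answer:
# messages.append(response.choices[0].message)
# messages.append({
#     "role": "tool",
#     "tool_call_id": tool_call.id,
#     "content": run_tool_call(tool_call.function.name,
#                              tool_call.function.arguments),
# })
# final = client.chat.completions.create(
#     model="openai/gpt-4o", messages=messages, tools=tools)
```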

### Web Search

```python
from cmdop_llm import CmdopLLM, UserLocation

client = CmdopLLM()

# Basic web search
result = client.search.web("What is the capital of France?")
print(result.content)

# Print citations
for citation in result.citations:
    print(f"- {citation.title}: {citation.url}")

# Search with options
result = client.search.web(
    "Latest AI news",
    max_searches=5,
    allowed_domains=["bbc.com", "cnn.com", "reuters.com"],
    user_location=UserLocation(country="US", city="New York"),
)
print(result.content)
```

### URL Fetch & Analysis

```python
# Fetch and analyze a specific URL
result = client.search.fetch(
    url="https://en.wikipedia.org/wiki/Python_(programming_language)",
    prompt="What are the key features of Python? List top 5.",
)
print(result.content)
```

### Vision Analysis

```python
# With explicit model
result = client.vision.analyze(
    image_url="https://example.com/image.jpg",
    prompt="Describe this image",
    model="google/gemini-2.0-flash-001"
)
print(result.description)

# Auto model selection with quality preset
result = client.vision.analyze(
    image_url="https://example.com/image.jpg",
    prompt="What's in this image?",
    model_quality="balanced"  # fast|balanced|best
)
print(result.extracted_text)
```

### OCR Text Extraction

```python
# Auto model selection with quality preset
result = client.ocr.extract(
    image_url="https://example.com/document.png",
    model_quality="fast"  # cheapest auto-selected model
)
print(result.text)

# With explicit model (overrides model_quality)
result = client.ocr.extract(
    image_url="https://example.com/document.png",
    model="openai/gpt-4o-mini"
)
print(result.text)
```

### Image Generation

```python
# Auto model selection with quality preset
response = client.images.generate(
    prompt="A futuristic cityscape at sunset",
    size="1024x1024",
    model_quality="balanced"  # fast|balanced|best
)
print(response.data[0].url)

# With explicit model (overrides model_quality)
response = client.images.generate(
    model="google/gemini-2.0-flash-exp:free",
    prompt="A serene mountain landscape",
    size="1024x1024",
)
print(response.data[0].url)
```

### Embeddings

```python
response = client.embeddings.create(
    model="openai/text-embedding-3-small",
    input="Hello, world!"
)
print(response.data[0].embedding[:5])  # First 5 dimensions
```
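Embedding vectors are typically compared with cosine similarity. A dependency-free sketch (in practice you might use numpy):

```python
import math

def cosine_similarity(a, b) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# e1 = client.embeddings.create(model="openai/text-embedding-3-small",
#                               input="cat").data[0].embedding
# e2 = client.embeddings.create(model="openai/text-embedding-3-small",
#                               input="kitten").data[0].embedding
# print(cosine_similarity(e1, e2))
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
```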

### Structured Output with Pydantic

```python
from pydantic import BaseModel

class Person(BaseModel):
    name: str
    age: int
    city: str

# Parse response directly into Pydantic model
response = client.beta.chat.completions.parse(
    model="openai/gpt-4o",
    messages=[
        {"role": "user", "content": "Extract: John is 30 years old and lives in Tokyo"}
    ],
    response_format=Person,
)

person = response.choices[0].message.parsed
print(f"{person.name}, {person.age}, {person.city}")  # John, 30, Tokyo
```

### JSON Schema Response Format

```python
response = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "List 3 colors"}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "colors",
            "schema": {
                "type": "object",
                "properties": {
                    "colors": {"type": "array", "items": {"type": "string"}}
                },
                "required": ["colors"]
            }
        }
    }
)
```
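With a `json_schema` response format, `message.content` arrives as a JSON string conforming to the schema; parse it with the standard library (the payload below is illustrative):

```python
import json

# Illustrative payload; in practice use response.choices[0].message.content
raw = '{"colors": ["red", "green", "blue"]}'
data = json.loads(raw)
print(data["colors"])  # ['red', 'green', 'blue']
```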

### Async Usage

```python
import asyncio
from cmdop_llm import AsyncCmdopLLM

async def main():
    client = AsyncCmdopLLM()

    # Chat
    response = await client.chat.completions.create(
        model="openai/gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}]
    )
    print(response.choices[0].message.content)

    # Web Search
    result = await client.search.web("Latest tech news")
    print(result.content)

    # Vision (auto model)
    result = await client.vision.analyze(
        image_url="https://example.com/photo.jpg"
    )
    print(result.description)

asyncio.run(main())
```
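The async client also makes fan-out straightforward: issue several requests concurrently with `asyncio.gather`. A sketch where `ask` is a hypothetical wrapper around the call shown above:

```python
import asyncio

async def gather_replies(prompts, ask):
    """Run one request per prompt concurrently and collect the replies."""
    return await asyncio.gather(*(ask(p) for p in prompts))

# async def ask(prompt):
#     resp = await client.chat.completions.create(
#         model="openai/gpt-4o",
#         messages=[{"role": "user", "content": prompt}])
#     return resp.choices[0].message.content

# Demo with a stand-in coroutine:
async def echo(prompt):
    await asyncio.sleep(0)
    return prompt.upper()

print(asyncio.run(gather_replies(["hi", "there"], echo)))  # ['HI', 'THERE']
```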

### Short Links (URL Shortening)

```python
from cmdop_llm import CmdopLLM

client = CmdopLLM()

# Create a short link
link = client.short_links.create(
    url="https://example.com/very/long/url/path",
    custom_slug="my-link",  # optional
    ttl="7d",               # optional: 1h, 24h, 7d, 30d
    max_hits=100            # optional
)
print(link.short_url)  # https://llm.cmdop.com/s/abc123

# List all links
links = client.short_links.list(limit=50, active_only=True)
for link in links.data:
    print(f"{link.code}: {link.hits} hits -> {link.target_url}")

# Get link info
info = client.short_links.get("abc123")
print(f"Hits: {info.hits}, Active: {info.is_active}")

# Get statistics
stats = client.short_links.stats()
print(f"Total: {stats.total_links}, Hits: {stats.total_hits}")

# Deactivate a link
client.short_links.delete("abc123")
```

### List Available Models

```python
from cmdop_llm import CmdopLLM

client = CmdopLLM()

# List all available models
models = client.models.list()
for model in models.data:
    print(f"{model.id}: {model.name} ({model.context_length} tokens)")

# Filter by provider
anthropic_models = client.models.list(provider="anthropic")

# Filter by vision support
vision_models = client.models.list(supports_vision=True)

# Filter by context length (min 100k tokens)
long_context = client.models.list(min_context_length=100000)

# Filter by price (max $1 per 1M prompt tokens)
cheap_models = client.models.list(max_prompt_price=1.0)

# Search by name or ID
gpt_models = client.models.list(search="gpt-4")

# Combine filters
result = client.models.list(
    provider="openai",
    supports_vision=True,
    min_context_length=128000,
    max_prompt_price=10.0,
)

# Get specific model details
model = client.models.retrieve("anthropic/claude-sonnet-4")
print(f"Name: {model.name}")
print(f"Context: {model.context_length}")
print(f"Prompt price: ${model.pricing.prompt_cost_per_million()}/1M tokens")
print(f"Vision support: {model.supports_vision}")
```

## Available Models

Access 200+ models including:

- **OpenAI**: gpt-4o, gpt-4o-mini, gpt-4.1, o1, o3-mini
- **Anthropic**: claude-sonnet-4, claude-opus-4, claude-3.5-haiku
- **Google**: gemini-2.5-pro, gemini-2.5-flash, gemini-2.0-flash
- **Meta**: llama-4-maverick, llama-3.3-70b
- **Mistral**: mistral-large, mistral-medium
- **Image Gen**: gemini-2.0-flash-exp (free), flux-pro, dall-e-3

Use model format: `provider/model-name` (e.g., `openai/gpt-4o`)

## Auto Model Selection

For Vision, OCR, and Image Generation endpoints, the `model` parameter is optional. You can control quality/cost tradeoff with `model_quality`:

```python
# Auto-select by quality level
result = client.vision.analyze(
    image_url="https://...",
    model_quality="fast"      # cheapest, auto-selected
)

result = client.vision.analyze(
    image_url="https://...",
    model_quality="balanced"  # mid-tier (llama-3.2-11b-vision)
)

result = client.vision.analyze(
    image_url="https://...",
    model_quality="best"      # highest quality (gpt-4o)
)

# Same for OCR
result = client.ocr.extract(
    image_url="https://...",
    model_quality="balanced"
)

# Same for images
response = client.images.generate(
    prompt="A sunset",
    model_quality="best"  # flux-1.1-pro
)
```

| Quality | Vision/OCR Model | Image Model |
|---------|------------------|-------------|
| `fast` | Auto cheapest | Auto cheapest |
| `balanced` | llama-3.2-11b-vision | gemini-3-pro-image |
| `best` | gpt-4o | flux-1.1-pro |

**Note**: Explicit `model` parameter always takes priority over `model_quality`.

## API Reference

### CmdopLLM

```python
CmdopLLM(
    api_key: str = None,       # From CMDOP_API_KEY env if not set
    base_url: str = None,      # Default: https://llm.cmdop.com/v1
    timeout: float = None,     # Request timeout
    max_retries: int = 2,      # Retry count
)
```

### Resources

| Resource | Description |
|----------|-------------|
| `client.chat.completions` | Chat completions (OpenAI compatible) |
| `client.beta.chat.completions.parse()` | Structured output with Pydantic |
| `client.embeddings` | Text embeddings (OpenAI compatible) |
| `client.images` | Image generation with `model_quality` support |
| `client.models` | List and filter available models with pricing |
| `client.vision` | Vision analysis with `model_quality` support |
| `client.ocr` | OCR extraction with `model_quality` support |
| `client.search` | Web search and URL fetch (CMDOP specific) |
| `client.short_links` | URL shortening service (CMDOP specific) |

### Models Methods

```python
# List models with optional filters
client.models.list(
    provider: str = None,            # Filter by provider (e.g., "openai", "anthropic")
    supports_vision: bool = None,    # Filter for vision-capable models
    min_context_length: int = None,  # Minimum context length
    max_prompt_price: float = None,  # Max prompt price per 1M tokens
    max_completion_price: float = None,  # Max completion price per 1M tokens
    search: str = None,              # Search in model ID and name
    refresh: bool = False,           # Force refresh from API
) -> ModelsResponse

# Get specific model by ID
client.models.retrieve(
    model_id: str,                   # e.g., "anthropic/claude-sonnet-4"
) -> Model
```

### Search Methods

```python
# Web search with AI-summarized results
client.search.web(
    query: str,                      # Search query
    model: str = "claude-3-5-haiku-20241022",
    max_tokens: int = 1024,
    max_searches: int = 5,           # Max web searches (1-10)
    allowed_domains: list[str] = None,
    blocked_domains: list[str] = None,
    user_location: UserLocation = None,
) -> WebSearchResponse

# Fetch and analyze URL content
client.search.fetch(
    url: str,                        # URL to fetch
    prompt: str = "Summarize this page",
    model: str = "claude-3-5-haiku-20241022",
    max_tokens: int = 1024,
) -> WebSearchResponse
```

### Short Links Methods

```python
# Create a short link
client.short_links.create(
    url: str,                        # Target URL to shorten
    custom_slug: str = None,         # Custom slug (optional)
    ttl: str = None,                 # Time to live: "1h", "24h", "7d", "30d"
    max_hits: int = None,            # Max hits before expiration
) -> ShortLinkCreateResponse

# List short links
client.short_links.list(
    limit: int = 50,                 # Max results (1-100)
    offset: int = 0,                 # Pagination offset
    active_only: bool = False,       # Only active links
) -> ShortLinkListResponse

# Get link info
client.short_links.get(
    code: str,                       # Link code or custom slug
) -> ShortLink

# Get statistics
client.short_links.stats() -> ShortLinkStatsResponse

# Deactivate a link
client.short_links.delete(
    code: str,                       # Link code or custom slug
) -> ShortLinkDeleteResponse
```

### Response Types

```python
# ModelsResponse
response.object      # "list"
response.data        # List of Model objects

# Model
model.id             # Model ID (e.g., "openai/gpt-4o")
model.name           # Display name
model.description    # Model description (optional)
model.context_length # Max context length in tokens
model.pricing        # ModelPricing object
model.architecture   # ModelArchitecture (optional)
model.top_provider   # TopProvider info (optional)
model.created        # Unix timestamp (optional)
model.owned_by       # Provider name (property)
model.supports_vision  # Vision capability (property)

# ModelPricing
pricing.prompt       # Cost per prompt token (string)
pricing.completion   # Cost per completion token (string)
pricing.image        # Cost per image (optional)
pricing.request      # Cost per request (optional)
pricing.prompt_cost_per_million()      # Returns float
pricing.completion_cost_per_million()  # Returns float

# ModelArchitecture
architecture.tokenizer     # Tokenizer type (e.g., "GPT")
architecture.instruct_type # Instruction format
architecture.modality      # e.g., "text->text", "text+image->text"

# WebSearchResponse
response.id          # Unique response ID
response.content     # AI-generated response
response.citations   # List of SearchCitation
response.model       # Model used
response.usage       # SearchUsage (input_tokens, output_tokens)
response.stop_reason # Why response stopped

# SearchCitation
citation.title       # Source page title
citation.url         # Source URL
citation.cited_text  # Quoted text (optional)

# UserLocation
UserLocation(
    country="US",    # ISO 3166-1 alpha-2 code
    city="New York",
    region="NY",
    timezone="America/New_York",
)

# VisionAnalyzeResponse
response.description    # Image description
response.extracted_text # Text found in image
response.model          # Model used
response.cost_usd       # Cost in USD
response.tokens_input   # Input tokens used
response.tokens_output  # Output tokens used

# OCRResponse
response.text          # Extracted text
response.model         # Model used
response.cost_usd      # Cost in USD
response.tokens_input  # Input tokens used
response.tokens_output # Output tokens used

# ShortLink
link.code              # Unique code
link.short_url         # Full short URL
link.target_url        # Target URL
link.custom_slug       # Custom slug (optional)
link.hits              # Current hit count
link.max_hits          # Max hits (optional)
link.expires_at        # Expiration datetime (optional)
link.is_active         # Whether link is active
link.created_at        # Creation datetime

# ShortLinkCreateResponse
response.code          # Created link code
response.short_url     # Full short URL
response.target_url    # Target URL

# ShortLinkListResponse
response.data          # List of ShortLink
response.total         # Total count
response.limit         # Page limit
response.offset        # Page offset

# ShortLinkStatsResponse
stats.total_links      # Total links count
stats.total_hits       # Total hits across all links
stats.active_links     # Active links count
```

## Error Handling

```python
from cmdop_llm import CmdopLLM, BadRequestError, RateLimitError, AuthenticationError

client = CmdopLLM()

try:
    response = client.chat.completions.create(
        model="invalid/model",
        messages=[{"role": "user", "content": "Hello"}]
    )
except BadRequestError as e:
    print(f"Invalid request: {e}")
except RateLimitError as e:
    print(f"Rate limited: {e}")
except AuthenticationError as e:
    print(f"Auth error: {e}")
```
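Rate limits are usually transient, so a simple exponential-backoff retry often suffices. A generic helper sketch (not part of the SDK):

```python
import time

def with_backoff(fn, exceptions, retries=3, base_delay=1.0):
    """Call fn, retrying with exponential backoff on the given exceptions."""
    for attempt in range(retries):
        try:
            return fn()
        except exceptions:
            if attempt == retries - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)

# response = with_backoff(
#     lambda: client.chat.completions.create(
#         model="openai/gpt-4o",
#         messages=[{"role": "user", "content": "Hello"}]),
#     (RateLimitError,),
# )
```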

## License

MIT
