Metadata-Version: 2.4
Name: revenium-python-sdk
Version: 0.1.2
Summary: The official Revenium Python SDK — unified AI metering middleware for OpenAI, Anthropic, Google, Ollama, LiteLLM, Perplexity, and fal.ai.
Author-email: Revenium <support@revenium.io>
License: MIT
Project-URL: Homepage, https://github.com/revenium/revenium-python-sdk
Project-URL: Bug Tracker, https://github.com/revenium/revenium-python-sdk/issues
Project-URL: Documentation, https://docs.revenium.io
Keywords: sdk,ai,llm,middleware,metering,revenium,openai,anthropic,google,ollama,litellm,perplexity,fal
Classifier: Development Status :: 4 - Beta
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: revenium_metering>=6.8.2
Provides-Extra: openai
Requires-Dist: wrapt; extra == "openai"
Requires-Dist: openai>=1.0.0; extra == "openai"
Requires-Dist: python-dotenv>=0.19.0; extra == "openai"
Provides-Extra: langchain
Requires-Dist: langchain>=0.1.16; extra == "langchain"
Requires-Dist: langchain-openai>=0.1.0; extra == "langchain"
Requires-Dist: langchain-core>=0.1.0; extra == "langchain"
Provides-Extra: perplexity
Requires-Dist: wrapt; extra == "perplexity"
Requires-Dist: python-dotenv>=0.19.0; extra == "perplexity"
Provides-Extra: perplexity-openai
Requires-Dist: wrapt; extra == "perplexity-openai"
Requires-Dist: openai>=1.0.0; extra == "perplexity-openai"
Requires-Dist: python-dotenv>=0.19.0; extra == "perplexity-openai"
Provides-Extra: perplexity-native
Requires-Dist: wrapt; extra == "perplexity-native"
Requires-Dist: perplexityai>=0.1.0; extra == "perplexity-native"
Requires-Dist: python-dotenv>=0.19.0; extra == "perplexity-native"
Provides-Extra: google
Requires-Dist: wrapt; extra == "google"
Provides-Extra: google-genai
Requires-Dist: wrapt; extra == "google-genai"
Requires-Dist: google-genai>=0.1.0; extra == "google-genai"
Requires-Dist: python-dotenv; extra == "google-genai"
Provides-Extra: google-vertex
Requires-Dist: wrapt; extra == "google-vertex"
Requires-Dist: vertexai>=1.0.0; extra == "google-vertex"
Requires-Dist: python-dotenv; extra == "google-vertex"
Provides-Extra: anthropic
Requires-Dist: wrapt; extra == "anthropic"
Requires-Dist: anthropic; extra == "anthropic"
Requires-Dist: python-dotenv>=0.19.0; extra == "anthropic"
Provides-Extra: ollama
Requires-Dist: wrapt; extra == "ollama"
Requires-Dist: ollama; extra == "ollama"
Provides-Extra: fal
Requires-Dist: wrapt; extra == "fal"
Requires-Dist: fal-client>=0.13.0; extra == "fal"
Requires-Dist: python-dotenv>=0.19.0; extra == "fal"
Provides-Extra: litellm
Requires-Dist: wrapt; extra == "litellm"
Requires-Dist: litellm; extra == "litellm"
Provides-Extra: litellm-proxy
Requires-Dist: wrapt; extra == "litellm-proxy"
Requires-Dist: litellm[proxy]; extra == "litellm-proxy"
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-asyncio; extra == "dev"
Requires-Dist: pytest-cov; extra == "dev"
Requires-Dist: flake8; extra == "dev"
Requires-Dist: black; extra == "dev"
Requires-Dist: mypy; extra == "dev"
Requires-Dist: freezegun; extra == "dev"
Requires-Dist: openai-responses>=0.12.0; extra == "dev"
Requires-Dist: requests; extra == "dev"
Requires-Dist: wrapt; extra == "dev"
Requires-Dist: python-dotenv>=0.19.0; extra == "dev"
Requires-Dist: anthropic; extra == "dev"
Requires-Dist: boto3; extra == "dev"
Requires-Dist: ollama; extra == "dev"
Dynamic: license-file

# Revenium Python SDK

[![PyPI version](https://img.shields.io/pypi/v/revenium-python-sdk.svg)](https://pypi.org/project/revenium-python-sdk/)
[![Python Versions](https://img.shields.io/pypi/pyversions/revenium-python-sdk.svg)](https://pypi.org/project/revenium-python-sdk/)
[![Documentation](https://img.shields.io/badge/docs-revenium.io-blue)](https://docs.revenium.io)
[![Website](https://img.shields.io/badge/website-revenium.ai-blue)](https://www.revenium.ai)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

The official Revenium Python SDK — unified AI metering middleware for deeply attributed AI usage metrics. Supports OpenAI, Anthropic, Google (Gemini/Vertex AI), fal.ai, Ollama, LiteLLM, and Perplexity.

## Features

- **Unified SDK**: Single package with middleware for all major AI providers — install only what you need
- **Zero Code Changes**: Drop-in integration — just import and all API calls are automatically metered
- **Streaming Support**: Full streaming support for all providers (both sync and async)
- **Decorator Support**: `@revenium_metadata` for automatic metadata injection and `@revenium_meter` for selective metering
- **Tool Metering**: `@meter_tool` to meter arbitrary tool/function calls alongside LLM API metering
- **Prompt Capture**: Optional capture of prompts and responses for analytics and debugging
- **Terminal Summary**: Real-time cost and usage summaries in your terminal (human-readable or JSON)
- **Distributed Tracing**: Built-in trace visualization fields for cross-service observability
- **Asynchronous Processing**: Background thread management for non-blocking metering operations
- **Graceful Shutdown**: Ensures all metering data is properly sent even during application shutdown
- **Thread-Safe**: Production-ready with `contextvars`-based context management for concurrent applications

## Supported Providers

| Provider | Extra | Install Command |
|----------|-------|----------------|
| OpenAI | `openai` | `pip install revenium-python-sdk[openai]` |
| Azure OpenAI | `openai` | `pip install revenium-python-sdk[openai]` |
| Anthropic | `anthropic` | `pip install revenium-python-sdk[anthropic]` |
| AWS Bedrock (Anthropic) | `anthropic` | `pip install revenium-python-sdk[anthropic]` |
| Google Gemini | `google-genai` | `pip install revenium-python-sdk[google-genai]` |
| Google Vertex AI | `google-vertex` | `pip install revenium-python-sdk[google-vertex]` |
| Ollama | `ollama` | `pip install revenium-python-sdk[ollama]` |
| LiteLLM (Client) | `litellm` | `pip install revenium-python-sdk[litellm]` |
| LiteLLM (Proxy) | `litellm-proxy` | `pip install revenium-python-sdk[litellm-proxy]` |
| Perplexity (via OpenAI) | `perplexity-openai` | `pip install revenium-python-sdk[perplexity-openai]` |
| Perplexity (Native SDK) | `perplexity-native` | `pip install revenium-python-sdk[perplexity-native]` |
| fal.ai | `fal` | `pip install revenium-python-sdk[fal]` |
| LangChain | `langchain` | `pip install revenium-python-sdk[langchain]` |

## Feature Matrix

| Feature | OpenAI | Anthropic | Google | Ollama | LiteLLM | Perplexity | fal.ai |
|---------|--------|-----------|--------|--------|---------|------------|--------|
| Chat Completions | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Streaming | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Embeddings | Yes | - | Yes | Yes | Yes | - | - |
| Vision/Multimodal | Yes | Yes | Yes | - | Yes | - | Yes |
| Image Generation | - | - | Yes | - | - | - | Yes |
| Video Generation | - | - | Yes | - | - | - | Yes |
| Prompt Capture | Yes | Yes | Yes | - | Yes | - | - |
| Terminal Summary | Yes | Yes | Yes | Yes | Yes | - | - |
| Azure / Bedrock | Azure | Bedrock | Vertex AI | - | All | - | - |
| LangChain Integration | Yes | - | - | - | - | - | - |
| CrewAI Integration | - | - | - | - | Yes | - | - |
| Proxy Mode | - | - | - | - | Yes | - | - |

## Installation

```bash
# Core SDK only
pip install revenium-python-sdk

# With a specific provider
pip install revenium-python-sdk[openai]

# Multiple providers
pip install "revenium-python-sdk[openai,anthropic,ollama]"
```

## Quick Start

### 1. Configure Environment Variables

Create a `.env` file in your project directory:

```env
# Required
REVENIUM_METERING_API_KEY=hak_your_revenium_api_key_here
REVENIUM_METERING_BASE_URL=https://api.revenium.ai

# Provider API keys (set whichever you use)
OPENAI_API_KEY=sk-your_openai_key
ANTHROPIC_API_KEY=sk-ant-your_anthropic_key
GOOGLE_API_KEY=your_google_key
PERPLEXITY_API_KEY=pplx_your_key
FAL_KEY=your_fal_key
FIREWORKS_API_KEY=your_fireworks_key

# Optional
# REVENIUM_LOG_LEVEL=DEBUG
```

### 2. Import and Use

Just import the middleware for your provider. That's it - all API calls are automatically metered:

```python
from dotenv import load_dotenv
load_dotenv()

import openai
import revenium_middleware_openai  # Auto-initializes on import

client = openai.OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
# Usage data automatically sent to Revenium
```

---

## Provider Usage Guides

### OpenAI

Supports chat completions, streaming, embeddings, function calling, and vision/multimodal.

```python
from dotenv import load_dotenv
load_dotenv()

import openai
import revenium_middleware_openai  # Auto-initializes

client = openai.OpenAI()

# Basic chat completion
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
    usage_metadata={
        "organizationName": "AcmeCorp",
        "productName": "customer-chatbot",
        "trace_id": "session-123",
        "task_type": "chat"
    }
)

# Streaming
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

# Embeddings
embedding = client.embeddings.create(
    model="text-embedding-3-small",
    input="The quick brown fox"
)
```

#### Azure OpenAI

The middleware automatically detects Azure OpenAI when using `AzureOpenAI()` and resolves deployment names to standard model names for accurate pricing.

```python
import os

from openai import AzureOpenAI
import revenium_middleware_openai

client = AzureOpenAI(
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    api_version="2024-02-01"
)

response = client.chat.completions.create(
    model="my-gpt4-deployment",  # Azure deployment name
    messages=[{"role": "user", "content": "Hello!"}]
)
# Model name automatically resolved for pricing
```

**Azure environment variables:**
- `AZURE_OPENAI_ENDPOINT` - Your Azure OpenAI endpoint
- `AZURE_OPENAI_API_KEY` - Your Azure OpenAI API key
- `AZURE_OPENAI_DEPLOYMENT` - Default deployment name

**Examples:** `examples/openai/` - `openai_basic.py`, `openai_streaming.py`, `azure_basic.py`, `azure_streaming.py`

---

### Anthropic

Supports messages, streaming, vision/multimodal, and AWS Bedrock integration.

```python
from dotenv import load_dotenv
load_dotenv()

import anthropic
import revenium_middleware_anthropic  # Auto-initializes

client = anthropic.Anthropic()

# Basic message
message = client.messages.create(
    model="claude-3-haiku-20240307",
    max_tokens=100,
    messages=[{"role": "user", "content": "Hello!"}],
    usage_metadata={
        "organizationName": "AcmeCorp",
        "productName": "support-bot",
        "trace_id": "session-456"
    }
)

# Streaming
with client.messages.stream(
    model="claude-3-haiku-20240307",
    max_tokens=200,
    messages=[{"role": "user", "content": "Tell me a story"}],
    usage_metadata={"task_type": "creative"}
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
```

**Note:** The middleware only wraps `messages.create` and `messages.stream` endpoints. Other Anthropic SDK features work normally but aren't metered.

#### AWS Bedrock

The middleware provides complete AWS Bedrock integration with automatic detection.

```python
import anthropic
import revenium_middleware_anthropic

# Bedrock is automatically detected when AWS credentials are available
# and base_url contains 'amazonaws.com'
client = anthropic.AnthropicBedrock(
    aws_region="us-east-1"
)

message = client.messages.create(
    model="anthropic.claude-3-haiku-20240307-v1:0",
    max_tokens=100,
    messages=[{"role": "user", "content": "Hello from Bedrock!"}]
)
```

**Provider detection** automatically routes between Bedrock and direct Anthropic API based on:
- AWS credentials availability (`aws configure`, IAM roles, environment variables)
- Base URL detection (when `base_url` contains `amazonaws.com`)
- Defaults to direct Anthropic API - Bedrock only used when explicitly configured
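
The routing rules above can be sketched as a small pure function. This is an illustration of the documented behavior, not the middleware's actual code:

```python
def detect_provider(base_url=None, is_bedrock_client=False):
    """Illustrative routing: Bedrock is used only when explicitly configured."""
    # An explicit Bedrock client (e.g. anthropic.AnthropicBedrock) wins.
    if is_bedrock_client:
        return "bedrock"
    # A base_url pointing at AWS also selects Bedrock.
    if base_url and "amazonaws.com" in base_url:
        return "bedrock"
    # Otherwise, default to the direct Anthropic API.
    return "anthropic"
```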

**Bedrock environment variables:**

| Variable | Description | Default |
|----------|-------------|---------|
| `AWS_REGION` | AWS region for Bedrock | `us-east-1` |
| `REVENIUM_BEDROCK_DISABLE` | Set to `1` to disable Bedrock support | Not set |

**AWS authentication** uses the standard credential chain: environment variables, `~/.aws/credentials`, IAM roles, AWS SSO. Required permissions: `bedrock:InvokeModel` and `bedrock:InvokeModelWithResponseStream`.

**Supported Bedrock models:**

| Anthropic Model | Bedrock Model ID |
|----------------|------------------|
| `claude-3-opus-20240229` | `anthropic.claude-3-opus-20240229-v1:0` |
| `claude-3-sonnet-20240229` | `anthropic.claude-3-sonnet-20240229-v1:0` |
| `claude-3-haiku-20240307` | `anthropic.claude-3-haiku-20240307-v1:0` |
| `claude-3-5-sonnet-20240620` | `anthropic.claude-3-5-sonnet-20240620-v1:0` |
| `claude-3-5-sonnet-20241022` | `anthropic.claude-3-5-sonnet-20241022-v2:0` |
| `claude-3-5-haiku-20241022` | `anthropic.claude-3-5-haiku-20241022-v1:0` |

For other models, the middleware uses the format `anthropic.{model_name}`.
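
The lookup-plus-fallback rule can be sketched as below (table excerpted for brevity; illustrative only, not the middleware's actual code):

```python
# A few of the documented mappings; the middleware ships the full table.
BEDROCK_MODEL_IDS = {
    "claude-3-opus-20240229": "anthropic.claude-3-opus-20240229-v1:0",
    "claude-3-5-sonnet-20241022": "anthropic.claude-3-5-sonnet-20241022-v2:0",
}

def to_bedrock_model_id(model):
    """Resolve a known mapping, else fall back to anthropic.{model_name}."""
    return BEDROCK_MODEL_IDS.get(model, "anthropic.{}".format(model))
```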

**Examples:** `examples/anthropic/` - `anthropic-basic.py`, `anthropic-streaming.py`, `anthropic-bedrock.py`, `anthropic-advanced.py`

---

### Google AI (Gemini / Vertex AI)

Supports chat completions, streaming, embeddings, image generation (Imagen), video generation, and vision/multimodal. Choose between Google AI SDK (simple API key setup) or Vertex AI SDK (production-grade with full token counting).

```bash
# Google AI SDK only (Gemini Developer API)
pip install "revenium-python-sdk[google-genai]"

# Vertex AI SDK only (recommended for production)
pip install "revenium-python-sdk[google-vertex]"
```

#### Google AI SDK

```python
from dotenv import load_dotenv
load_dotenv()

import revenium_middleware_google
from google import genai

client = genai.Client()
response = client.models.generate_content(
    model="gemini-2.0-flash-001",
    contents="Hello! Introduce yourself in one sentence.",
    usage_metadata={
        "organizationName": "AcmeCorp",
        "task_type": "chat"
    }
)
print(response.text)
```

#### Vertex AI SDK

```python
from dotenv import load_dotenv
load_dotenv()

import revenium_middleware_google
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="your-gcp-project", location="us-central1")
model = GenerativeModel("gemini-2.0-flash-001")
response = model.generate_content("Hello!")
print(response.text)
```

**Which SDK should I choose?**

| Use Case | Recommended SDK | Why |
|----------|----------------|-----|
| Quick prototyping | Google AI SDK | Simple API key setup |
| Production applications | Vertex AI SDK | Full token counting, enterprise features |
| Embeddings-heavy workloads | Vertex AI SDK | Complete token tracking for embeddings |
| Enterprise/GCP environments | Vertex AI SDK | Advanced Google Cloud integration |

**Note:** Google AI SDK embeddings don't return token counts due to API limitations, but requests are still tracked.

**Google AI environment variables:**
- `GOOGLE_API_KEY` - For Google AI SDK
- `GOOGLE_CLOUD_PROJECT` - For Vertex AI SDK
- `GOOGLE_CLOUD_LOCATION` - Vertex AI region (default: `us-central1`)

**For Vertex AI**, authenticate with: `gcloud auth application-default login`

**Examples:** `examples/google/` - `getting_started_google_ai.py`, `getting_started_vertex_ai.py`, `simple_streaming_test.py`, `simple_embeddings_test.py`

---

### Ollama

Supports chat completions, text generation, embeddings, and streaming. Works with any Ollama model.

```python
from dotenv import load_dotenv
load_dotenv()

import ollama
import revenium_middleware_ollama  # Auto-initializes

# Chat completion
response = ollama.chat(
    model='qwen2.5:0.5b',
    messages=[{'role': 'user', 'content': 'Why is the sky blue?'}],
    usage_metadata={
        "organizationName": "AcmeCorp",
        "task_type": "chat"
    }
)
print(response['message']['content'])

# Streaming
for chunk in ollama.chat(
    model='qwen2.5:0.5b',
    messages=[{'role': 'user', 'content': 'Tell me a story'}],
    stream=True
):
    print(chunk['message']['content'], end='', flush=True)

# Text generation
response = ollama.generate(model='qwen2.5:0.5b', prompt='Once upon a time')

# Embeddings (single and batch)
response = ollama.embed(model='nomic-embed-text', input='Hello world')
response = ollama.embed(model='nomic-embed-text', input=['Text 1', 'Text 2', 'Text 3'])
```

**Supported endpoints:** `ollama.chat()`, `ollama.generate()`, `ollama.embed()`

**OpenAI compatibility mode:** You can also use Ollama with the OpenAI SDK:

```python
import openai
import revenium_middleware_openai

openai.api_key = 'ollama'
openai.base_url = 'http://localhost:11434/v1/'

response = openai.chat.completions.create(
    model="gemma2:2b",
    messages=[{"role": "user", "content": "Hello!"}],
    usage_metadata={"organizationName": "AcmeCorp"}
)
```

**Prerequisites:** Ensure Ollama is running (`ollama serve`) before making API calls.

**Examples:** `examples/ollama/` - `getting_started.py`, `example_streaming.py`, `example_metadata.py`, `embeddings_example.py`

---

### LiteLLM

Supports all LLM providers available through LiteLLM with two integration patterns: client-side middleware and server-side proxy callbacks.

#### Client Mode

```python
from dotenv import load_dotenv
load_dotenv()

import revenium_middleware_litellm_client.middleware  # Auto-initializes
import litellm
import os

litellm.api_base = os.getenv("LITELLM_PROXY_URL")
litellm.api_key = os.getenv("LITELLM_API_KEY")

response = litellm.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
    usage_metadata={
        "organizationName": "AcmeCorp",
        "task_type": "chat"
    }
)
```

#### Proxy Mode

Add the callback to your LiteLLM `config.yaml` for server-side integration:

```yaml
litellm_settings:
  callbacks: ["revenium_middleware_litellm_proxy.middleware.proxy_handler_instance"]
```

When using the LiteLLM proxy, pass metadata via HTTP headers (`x-revenium-*`).
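
For example, a client pointed at the proxy can attach metadata as request headers. Only the `x-revenium-*` prefix is documented; the specific header names below are assumptions, so check your proxy configuration for the exact names:

```python
# Hypothetical header names -- only the `x-revenium-*` prefix is documented.
revenium_headers = {
    "x-revenium-organization-name": "AcmeCorp",
    "x-revenium-trace-id": "session-123",
    "x-revenium-task-type": "chat",
}

# Pass them on each request, e.g. with an OpenAI-compatible client:
#   client = OpenAI(base_url=proxy_url, default_headers=revenium_headers)
```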

#### LiteLLM Decorators

LiteLLM provides additional tracking decorators beyond the standard `@revenium_metadata` and `@revenium_meter`:

| Decorator | Purpose |
|-----------|---------|
| `@track_agent()` | Identify the AI agent |
| `@track_task()` | Classify the type of work |
| `@track_trace()` | Set trace ID for distributed tracing |
| `@track_organization()` | Track multi-tenant organizations |
| `@track_subscription()` | Track subscription-based billing |
| `@track_product()` | Track product-specific usage |
| `@track_subscriber()` | Identify end users |
| `@track_quality()` | Track response quality scores |

All decorators support static values, extraction from function arguments (`name_from_arg`), or extraction from object attributes (`name_from_attr`).
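
The `name_from_arg` extraction style can be illustrated with a minimal, self-contained sketch of how such a decorator typically binds a value from the call. This is an illustration of the pattern only, not the SDK's implementation:

```python
import functools
import inspect

def track_field_sketch(field, value=None, name_from_arg=None):
    """Minimal sketch: resolve a tracked value statically or from an argument."""
    def decorator(func):
        sig = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bound = sig.bind(*args, **kwargs)
            resolved = bound.arguments.get(name_from_arg) if name_from_arg else value
            wrapper.last_tracked = {field: resolved}  # stand-in for real metering
            return func(*args, **kwargs)
        return wrapper
    return decorator

@track_field_sketch("trace_id", name_from_arg="session_id")
def answer(session_id, question):
    return question.upper()
```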

#### CrewAI Integration

```bash
pip install "revenium-middleware-litellm[crewai]"
```

Pre-built wrapper for tracking CrewAI agent executions. **Note:** CrewAI requires Python 3.12 or earlier.

**LiteLLM environment variables:**
- `LITELLM_PROXY_URL` - Your LiteLLM proxy URL
- `LITELLM_API_KEY` - Your LiteLLM proxy API key

**Examples:** `examples/litellm/` - `getting_started.py`, `litellm_proxy_example.py`, `crewai_decorator_example.py`

---

### Perplexity

Supports both the OpenAI SDK (with Perplexity base URL) and the native Perplexity SDK, with streaming support.

#### Using OpenAI SDK

```python
from dotenv import load_dotenv
load_dotenv()

import os

from openai import OpenAI
import revenium_middleware_perplexity  # Auto-patches OpenAI

client = OpenAI(
    api_key=os.getenv("PERPLEXITY_API_KEY"),
    base_url="https://api.perplexity.ai"
)

response = client.chat.completions.create(
    model="sonar",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    usage_metadata={"organizationName": "AcmeCorp"}
)
```

#### Using Native Perplexity SDK

```python
import os

from perplexity import Perplexity
import revenium_middleware_perplexity  # Auto-patches Perplexity

client = Perplexity(api_key=os.getenv("PERPLEXITY_API_KEY"))

response = client.chat.completions.create(
    model="sonar",
    messages=[{"role": "user", "content": "Hello!"}]
)
```

Both approaches work identically - the middleware automatically detects which SDK you're using.

**Streaming:**

```python
stream = client.chat.completions.create(
    model="sonar-pro",
    messages=[{"role": "user", "content": "Write a poem"}],
    stream=True,
    usage_metadata={"task_type": "creative_writing"}
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```

**Examples:** `examples/perplexity/` - `getting_started.py`, `basic.py`, `streaming.py`, `example_decorator.py`

---

### fal.ai

Supports image, video, and audio generation through fal.ai with automatic media type detection.

```python
import revenium_middleware_fal  # Auto-activates
import fal_client

result = fal_client.subscribe(
    "fal-ai/flux/dev",
    arguments={
        "prompt": "A beautiful sunset over mountains",
        "image_size": "landscape_16_9"
    },
    usage_metadata={
        "organizationName": "AcmeCorp",
        "task_type": "image-generation"
    }
)

for image in result.get("images", []):
    print(f"Image URL: {image['url']}")
```

**Supported methods:** `fal_client.run`, `fal_client.subscribe`, `fal_client.stream` (and their async variants: `run_async`, `subscribe_async`, `stream_async`)

**Media type detection:** The middleware automatically detects the type of media being generated (image, video, audio) based on the application name for accurate cost tracking.
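
The heuristic can be sketched roughly as below. The keyword rules here are assumptions for illustration; the middleware's actual detection logic may differ:

```python
def detect_media_type(app_name):
    """Guess the generated media type from a fal.ai application name."""
    name = app_name.lower()
    if "video" in name:
        return "video"
    if "audio" in name or "tts" in name:
        return "audio"
    # Image generation (e.g. fal-ai/flux/dev) is the common default.
    return "image"
```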

**Environment variables:**
- `FAL_KEY` - Your fal.ai API key

---

### LangChain

Callback handler that automatically tracks LLM calls, chains, tools, and agent actions.

```python
from langchain_openai import ChatOpenAI
from revenium_middleware_langchain import ReveniumCallbackHandler

handler = ReveniumCallbackHandler(
    trace_id="session-123",
    agent_name="support_agent"
)

llm = ChatOpenAI(model="gpt-4", callbacks=[handler])
response = llm.invoke("Hello!")
```

**With chains:**

```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

prompt = ChatPromptTemplate.from_template("Tell me a joke about {topic}")
chain = prompt | llm | StrOutputParser()
result = chain.invoke({"topic": "programming"})
```

**With agents:**

```python
from langchain_core.messages import HumanMessage
from langchain_core.tools import tool
from langgraph.prebuilt import create_react_agent

@tool
def get_weather(city: str) -> str:
    """Get the weather for a city."""
    return f"Sunny, 72F in {city}"

agent = create_react_agent(llm, [get_weather])
result = agent.invoke(
    {"messages": [HumanMessage(content="Weather in NYC?")]},
    config={"callbacks": [handler]}
)
```

**Async support:**

```python
from revenium_middleware_langchain import AsyncReveniumCallbackHandler

handler = AsyncReveniumCallbackHandler(trace_id="async-session")
llm = ChatOpenAI(model="gpt-4", callbacks=[handler])
response = await llm.ainvoke("Hello!")
```

**Supported providers:** OpenAI, Anthropic, Google, AWS Bedrock, Azure OpenAI, Cohere, HuggingFace, Ollama. Provider is auto-detected from LangChain class name or model name prefix.

**Programmatic configuration:**

```python
from revenium_middleware_langchain import ReveniumCallbackHandler, ReveniumConfig, SubscriberConfig

config = ReveniumConfig(
    api_key="hak_your_api_key",
    environment="production",
    organization_name="my_org",
    product_name="my_product",
    subscriber=SubscriberConfig(id="user_123", email="user@example.com"),
)

handler = ReveniumCallbackHandler(config=config, trace_id="session-123")
```

---

## Metadata Fields

Add business context to any API call by passing a `usage_metadata` dictionary. All fields are optional.

| Field | Description | Use Case |
|-------|-------------|----------|
| `trace_id` | Unique session or conversation identifier | Link multiple API calls together for debugging, session analytics, or distributed tracing |
| `task_type` | Type of AI task being performed | Categorize usage by workload (e.g., `"chat"`, `"code-generation"`, `"doc-summary"`) for cost analysis |
| `subscriber.id` | Unique user identifier | Track individual user consumption for billing, rate limiting, or analytics |
| `subscriber.email` | User email address | Identify users for support, compliance, or usage reports |
| `subscriber.credential.name` | Authentication credential name | Track which API key or service account made the request |
| `subscriber.credential.value` | Authentication credential value | Associate usage with specific credentials for security auditing |
| `organizationName` | Organization or company name | Multi-tenant cost allocation, usage quotas per organization. Auto-creates if not found |
| `subscription_id` | Subscription plan identifier | Track usage against subscription limits, identify plan upgrade opportunities |
| `productName` | Your product or feature name | Attribute AI costs to specific features (e.g., `"customer-chatbot"`, `"email-assistant"`). Auto-creates if not found |
| `agent` | AI agent or bot identifier | Distinguish between multiple AI agents or automation workflows |
| `response_quality_score` | Custom quality rating (0.0-1.0) | Track user satisfaction or automated quality metrics for model performance analysis |

**Example:**

```python
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
    usage_metadata={
        "trace_id": "conv-28a7e9d4",
        "task_type": "customer-support",
        "subscriber": {
            "id": "user-1234",
            "email": "user@example.com",
            "credential": {
                "name": "engineering-api-key",
                "value": "sk-1234567890abcdef"
            }
        },
        "organizationName": "AcmeCorp",
        "subscription_id": "pro-plan-Q1",
        "productName": "customer-support-chatbot",
        "agent": "support-agent",
        "response_quality_score": 0.92
    }
)
```

**Deprecation notice:** The old field names `organizationId`, `organization_id`, `productId`, and `product_id` are still supported for backward compatibility but are deprecated. Use `organizationName` and `productName` for new implementations.

**API Reference:** [Complete metadata field documentation](https://revenium.readme.io/reference/meter_ai_completion)

---

## Trace Visualization & Distributed Tracing

Enhanced observability fields for tracking AI operations across environments, regions, and workflows. Fields can be set via environment variables (static/deployment-level defaults) or passed directly in `usage_metadata` (dynamic/per-request values). Direct values always take precedence.

### Available Fields

| Field | Environment Variable (Fallback) | Description | Use Case |
|-------|----------------------------------|-------------|----------|
| `environment` | `REVENIUM_ENVIRONMENT` (auto-detects: `ENVIRONMENT`, `DEPLOYMENT_ENV`) | Deployment environment | Track usage across `production`, `staging`, `dev` |
| `region` | `REVENIUM_REGION` (auto-detects: `AWS_REGION`, `AZURE_REGION`, `GCP_REGION`) | Cloud region identifier | Multi-region deployment tracking and latency analysis |
| `credential_alias` | `REVENIUM_CREDENTIAL_ALIAS` | Human-readable API key name | Track which credential was used for rotation and auditing |
| `trace_type` | `REVENIUM_TRACE_TYPE` | Workflow category (max 128 chars, alphanumeric/hyphens/underscores) | Group similar workflows (e.g., `"customer-support"`, `"data-analysis"`) |
| `trace_name` | `REVENIUM_TRACE_NAME` | Human-readable trace label (max 256 chars) | Label trace instances (e.g., `"Customer Support Chat"`) |
| `parent_transaction_id` | `REVENIUM_PARENT_TRANSACTION_ID` | Parent transaction ID | Link child operations to parents across microservices |
| `transaction_name` | `REVENIUM_TRANSACTION_NAME` | Human-friendly operation name | Label operations (e.g., `"Generate Response"`, `"Analyze Sentiment"`) |
| `retry_number` | `REVENIUM_RETRY_NUMBER` | Retry attempt number (0 = first attempt) | Track retry attempts for failed operations |

**Note:** `operation_type` (e.g., `CHAT`, `EMBED`, `TOOL_CALL`) and `operation_subtype` (e.g., `function_call`, `streaming`) are automatically detected by the middleware and cannot be overridden.
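
The precedence rule (direct `usage_metadata` values over environment-variable fallbacks) can be sketched as:

```python
import os

def resolve_trace_field(usage_metadata, key, env_vars):
    """Direct usage_metadata values win; env vars form the fallback chain."""
    if key in usage_metadata:
        return usage_metadata[key]
    for var in env_vars:  # e.g. ["REVENIUM_REGION", "AWS_REGION", ...]
        if os.getenv(var):
            return os.environ[var]
    return None
```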

### Usage

**Static fields via environment variables** (deployment-level defaults):

```bash
# .env file
REVENIUM_ENVIRONMENT=production
REVENIUM_REGION=us-east-1
REVENIUM_CREDENTIAL_ALIAS=prod-openai-key
REVENIUM_TRACE_TYPE=customer-support
```

**Dynamic fields via `usage_metadata`** (per-request values):

```python
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
    usage_metadata={
        "environment": "production",
        "region": "us-east-1",
        "trace_type": "customer-support",
        "trace_name": "Support Chat Session",
        "transaction_name": "Generate Response",
        "parent_transaction_id": "parent-txn-123"
    }
)
```

**Best practice:** Use environment variables for static deployment configuration (`environment`, `region`, `credential_alias`) and pass dynamic values (`trace_name`, `transaction_name`, `organizationName`) directly in `usage_metadata` or via decorators.

### Distributed Tracing Example

```python
import uuid

workflow_id = str(uuid.uuid4())

# Step 1: Parent operation
parent_response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Analyze this document"}],
    usage_metadata={
        "trace_id": "analysis-session-456",
        "transaction_name": "Document Analysis",
        "task_type": "analysis"
    }
)

# Step 2: Child operation linked to parent
child_response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarize findings"}],
    usage_metadata={
        "trace_id": "analysis-session-456",
        "parent_transaction_id": parent_response.id,
        "transaction_name": "Summarize Results",
        "task_type": "summarization"
    }
)
```

---

## Decorator Support

### `@revenium_metadata` - Automatic Metadata Injection

Automatically injects metadata into all API calls within a function's scope. Eliminates the need to pass `usage_metadata` to every API call.

```python
from revenium_middleware import revenium_metadata

@revenium_metadata(
    trace_id="session-12345",
    task_type="customer-support",
    organizationName="AcmeCorp",
    environment="production"
)
def handle_customer_query(question: str) -> str:
    # All API calls automatically include the decorator metadata
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": question}]
    )
    return response.choices[0].message.content

answer = handle_customer_query("How do I reset my password?")
```

**Features:**
- **DRY Principle**: Define metadata once, apply to all API calls in the function
- **Composable**: Decorators can be nested - inner decorators inherit and override outer ones
- **API-level override**: `usage_metadata` passed directly to API calls always takes precedence over decorator metadata
- **Async support**: Works with both sync and async functions
- **Thread-safe**: Uses `contextvars` for proper isolation
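
The thread-safety guarantee rests on `contextvars`, which give each task or thread its own isolated view of the metadata context. A standalone sketch of the mechanism (this is an illustration, not the middleware's actual code):

```python
import asyncio
import contextvars

# Each asyncio task runs in its own copy of the context, so
# concurrent metadata scopes never leak into each other.
_metadata = contextvars.ContextVar("metadata", default={})

async def scoped_call(meta):
    # Merge new keys over the inherited context, like nested decorators
    token = _metadata.set({**_metadata.get(), **meta})
    try:
        await asyncio.sleep(0)   # yield so the tasks interleave
        return _metadata.get()   # each task still sees only its own metadata
    finally:
        _metadata.reset(token)

async def main():
    return await asyncio.gather(
        scoped_call({"trace_id": "task-a"}),
        scoped_call({"trace_id": "task-b"}),
    )

results = asyncio.run(main())
```

Even though both calls run concurrently, each one observes only the metadata it set.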

**Nested decorators (metadata merging):**

```python
@revenium_metadata(organizationName="AcmeCorp", environment="production")
def outer_function():
    # Gets: organizationName, environment
    response1 = client.chat.completions.create(...)

    @revenium_metadata(trace_id="inner-trace", task_type="analysis")
    def inner_function():
        # Gets: organizationName, environment (inherited) + trace_id, task_type (added)
        response2 = client.chat.completions.create(...)
        return response2

    return inner_function()
```

**API-level override:**

```python
@revenium_metadata(organizationName="AcmeCorp", task_type="default")
def mixed_metadata():
    # Uses decorator metadata
    response1 = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello"}]
    )

    # API-level metadata overrides decorator's task_type
    response2 = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello"}],
        usage_metadata={
            "task_type": "special-override",  # Overrides decorator
            "trace_id": "api-level-trace"     # Adds new field
            # organizationName still inherited from decorator
        }
    )
```

### `@revenium_meter` - Selective Metering

Control which functions are metered when selective metering mode is enabled. This is useful for metering only specific high-value operations while ignoring others.

**Note:** This decorator only has an effect when `REVENIUM_SELECTIVE_METERING=true` is set. By default, all API calls are metered automatically.

```bash
# Enable selective metering
export REVENIUM_SELECTIVE_METERING=true
```

```python
from revenium_middleware import revenium_meter, revenium_metadata

@revenium_meter()
@revenium_metadata(task_type="premium-feature", organizationName="PremiumTier")
def premium_feature(prompt: str) -> str:
    # This WILL be metered (decorated with @revenium_meter)
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content

def free_feature(prompt: str) -> str:
    # This will NOT be metered (no @revenium_meter decorator)
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content
```

**Accepted values for `REVENIUM_SELECTIVE_METERING`:**
- `"true"`, `"1"`, `"yes"`, `"on"` (case-insensitive) - Selective metering enabled
- `"false"`, `"0"`, `"no"`, `"off"`, or unset - All calls metered (default)

**Decorator order matters:** Place `@revenium_meter` before `@revenium_metadata` (outer to inner).

---

## Tool Metering

The `@meter_tool` decorator lets you meter arbitrary tool/function calls (web scrapers, database lookups, API fetchers, image generators, etc.) alongside your automatic LLM API metering. Available via `revenium_metering` v6.8.2+.

```python
import os
from revenium_middleware import meter_tool, configure

# Configure the metering client for tool calls
configure(
    metering_url=os.getenv("REVENIUM_METERING_BASE_URL", "https://api.revenium.ai"),
    api_key=os.environ["REVENIUM_METERING_API_KEY"],
)

# Decorate any tool function to automatically meter it
@meter_tool("customer-database", operation="lookup", agent="support-bot")
def lookup_customer(customer_id: str) -> dict:
    """Timing and success/failure are automatically tracked."""
    return {"name": "Jane Smith", "plan": "Enterprise"}

# The decorator reports the tool call to Revenium automatically
result = lookup_customer("CUST-42")
```

**Manual reporting:**

```python
from revenium_middleware import report_tool_call

report_tool_call(
    tool_id="my-tool",
    operation="fetch",
    duration_ms=1234,
    success=True,
    usage_metadata={"records": 42},
)
```
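
If you time the call yourself, `duration_ms` can be derived with `time.perf_counter`. The tool function below is a stand-in, and the resulting dict shows the arguments you would pass to `report_tool_call`:

```python
import time

def fetch_records():
    """Stand-in for a real tool call (database query, HTTP fetch, etc.)."""
    return list(range(42))

start = time.perf_counter()
records = fetch_records()
duration_ms = int((time.perf_counter() - start) * 1000)

# Arguments you would then pass to report_tool_call(...)
report_kwargs = {
    "tool_id": "my-tool",
    "operation": "fetch",
    "duration_ms": duration_ms,
    "success": True,
    "usage_metadata": {"records": len(records)},
}
```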

---

## Prompt Capture

Optional capture of prompts and responses for analytics and debugging. **Disabled by default** to protect sensitive data.

### Enable

```bash
export REVENIUM_CAPTURE_PROMPTS=true
```

### What Gets Captured

| Field | Description | Source |
|-------|-------------|--------|
| `system_prompt` | System prompt content | From `system` parameter / system message |
| `input_messages` | User/assistant messages as JSON | From `messages` parameter |
| `output_response` | Assistant's response content | From response content blocks |
| `prompts_truncated` | Truncation flag | Set to `true` if any field exceeded 50,000 characters |

Each field has a maximum length of **50,000 characters**. If exceeded, it's truncated with a `...[TRUNCATED]` marker.
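
As an illustration of the documented limit (not the SDK's actual implementation, whose exact boundary may differ), truncation behaves roughly like:

```python
MAX_LEN = 50_000
MARKER = "...[TRUNCATED]"

def truncate(field):
    """Sketch of the documented rule: cap at 50,000 chars, append a marker."""
    if len(field) <= MAX_LEN:
        return field
    return field[: MAX_LEN - len(MARKER)] + MARKER
```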

### Example

```python
import os
os.environ["REVENIUM_CAPTURE_PROMPTS"] = "true"

import revenium_middleware_openai
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"}
    ],
    usage_metadata={"organizationName": "DemoOrg"}
)
# System prompt, input messages, and output response are now captured
```

Prompt capture works with both streaming and non-streaming requests, and with multimodal content (text, images, etc.).

### Security Considerations

- Prompts may contain sensitive user data
- Responses may include confidential information
- Only enable in environments where data capture is appropriate
- Ensure compliance with your data privacy policies
- Use selective metering with `@revenium_meter` to control which calls are captured

---

## Terminal Summary Output

Display a cost and usage summary in your terminal after each API request. Useful for development, debugging, and monitoring AI costs in real time.

### Configuration

| Environment Variable | Values | Description |
|---------------------|--------|-------------|
| `REVENIUM_PRINT_SUMMARY` | `false` (default), `true` or `human`, `json` | Controls output format |
| `REVENIUM_TEAM_ID` | Your team ID | Required to fetch and display cost information |

```bash
# Enable human-readable output
export REVENIUM_PRINT_SUMMARY=human

# Required for cost display (find in Revenium web app)
export REVENIUM_TEAM_ID=your-team-id-here
```

### Human-Readable Format

```
============================================================
REVENIUM USAGE SUMMARY
============================================================
Model: gpt-4o-mini
Provider: OPENAI
Duration: 1.23s

Token Usage:
  Input Tokens:  150
  Output Tokens: 250
  Total Tokens:  400

Cost: $0.000045

Trace ID: abc-123
============================================================
```

### JSON Format

```json
{"model":"gpt-4o-mini","provider":"OPENAI","durationSeconds":1.23,"inputTokenCount":150,"outputTokenCount":250,"totalTokenCount":400,"cost":0.000045,"costStatus":"available","traceId":"abc-123"}
```
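
Because the JSON summary is a single object per line, it is easy to consume programmatically, e.g. from captured process output:

```python
import json

# A captured summary line (the example output shown above)
line = '{"model":"gpt-4o-mini","provider":"OPENAI","durationSeconds":1.23,"inputTokenCount":150,"outputTokenCount":250,"totalTokenCount":400,"cost":0.000045,"costStatus":"available","traceId":"abc-123"}'

summary = json.loads(line)
print(f'{summary["model"]}: {summary["totalTokenCount"]} tokens, ${summary["cost"]:.6f}')
```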

### Cost Status

| Scenario | Display |
|----------|---------|
| Cost available | `$0.000045` |
| `REVENIUM_TEAM_ID` set, cost pending | `Pending (aggregating... check Revenium dashboard)` |
| `REVENIUM_TEAM_ID` not set | `Add REVENIUM_TEAM_ID to see pricing` |

---

## Configuration Reference

### Required Environment Variables

| Variable | Description |
|----------|-------------|
| `REVENIUM_METERING_API_KEY` | Your Revenium API key (starts with `hak_`) |

### Optional Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `REVENIUM_METERING_BASE_URL` | `https://api.revenium.ai` | Revenium API endpoint |
| `REVENIUM_LOG_LEVEL` | `INFO` | Log level: `DEBUG`, `INFO`, `WARNING`, `ERROR`, `CRITICAL` |
| `REVENIUM_CAPTURE_PROMPTS` | `false` | Enable prompt capture |
| `REVENIUM_PRINT_SUMMARY` | `false` | Terminal output: `false`, `true`/`human`, `json` |
| `REVENIUM_SELECTIVE_METERING` | `false` | Only meter `@revenium_meter` decorated functions |
| `REVENIUM_TEAM_ID` | - | Team ID for cost display in terminal summary |
| `REVENIUM_ENVIRONMENT` | - | Deployment environment (auto-detects from `ENVIRONMENT`, `DEPLOYMENT_ENV`) |
| `REVENIUM_REGION` | - | Cloud region (auto-detects from `AWS_REGION`, `AZURE_REGION`, `GCP_REGION`) |
| `REVENIUM_CREDENTIAL_ALIAS` | - | Human-readable API key name |
| `REVENIUM_TRACE_TYPE` | - | Workflow category identifier |
| `REVENIUM_TRACE_NAME` | - | Human-readable trace label |
| `REVENIUM_PARENT_TRANSACTION_ID` | - | Parent transaction ID for distributed tracing |
| `REVENIUM_TRANSACTION_NAME` | - | Human-friendly operation name |
| `REVENIUM_RETRY_NUMBER` | - | Retry attempt number |
| `REVENIUM_BEDROCK_DISABLE` | - | Set to `1` to disable Bedrock auto-detection |

### Provider-Specific Environment Variables

| Variable | Provider | Description |
|----------|----------|-------------|
| `OPENAI_API_KEY` | OpenAI | OpenAI API key |
| `AZURE_OPENAI_ENDPOINT` | Azure OpenAI | Azure endpoint URL |
| `AZURE_OPENAI_API_KEY` | Azure OpenAI | Azure API key |
| `AZURE_OPENAI_DEPLOYMENT` | Azure OpenAI | Default deployment name |
| `ANTHROPIC_API_KEY` | Anthropic | Anthropic API key |
| `AWS_REGION` | Bedrock | AWS region for Bedrock (default: `us-east-1`) |
| `GOOGLE_API_KEY` | Google AI | Google AI SDK API key |
| `GOOGLE_CLOUD_PROJECT` | Vertex AI | GCP project ID |
| `GOOGLE_CLOUD_LOCATION` | Vertex AI | GCP location (default: `us-central1`) |
| `PERPLEXITY_API_KEY` | Perplexity | Perplexity API key |
| `FAL_KEY` | fal.ai | fal.ai API key |
| `LITELLM_PROXY_URL` | LiteLLM | LiteLLM proxy URL |
| `LITELLM_API_KEY` | LiteLLM | LiteLLM proxy API key |

---

## Troubleshooting

| Issue | Solution |
|-------|----------|
| **Middleware not working** | Verify `REVENIUM_METERING_API_KEY` is set correctly (must start with `hak_`) |
| **No data in dashboard** | Enable debug logging with `REVENIUM_LOG_LEVEL=DEBUG` |
| **Import errors** | Ensure the correct extra is installed (e.g., `pip install revenium-python-sdk[openai]`) |
| **Azure: wrong model name** | Middleware auto-resolves deployment names; check with debug logging |
| **Bedrock: AccessDenied** | Ensure `bedrock:InvokeModel` and `bedrock:InvokeModelWithResponseStream` permissions |
| **Bedrock: requests go to Anthropic** | Verify AWS credentials: `aws sts get-caller-identity` |
| **Google: embeddings show 0 tokens** | Expected with Google AI SDK; use Vertex AI for full token counting |
| **Google: "No module named 'vertexai'"** | Install correct extra: `pip install "revenium-python-sdk[google-vertex]"` |
| **Vertex AI: authentication errors** | Run `gcloud auth application-default login` |
| **Ollama: connection errors** | Ensure Ollama is running: `ollama serve` |
| **LangChain: provider shows "unknown"** | Ensure you're using a supported LangChain LLM class |
| **Streaming errors** | Check provider credentials; the middleware falls back gracefully |

**Debug mode:** Set `REVENIUM_LOG_LEVEL=DEBUG` to see detailed provider detection, routing decisions, and metering payloads.

**Force direct Anthropic API:** Set `REVENIUM_BEDROCK_DISABLE=1` to disable Bedrock auto-detection.

**Check initialization status:** Use `revenium_middleware_<provider>.is_initialized()` to verify setup.

---

## Logging

The SDK uses Python's standard `logging` module. Control the log level with the `REVENIUM_LOG_LEVEL` environment variable:

```bash
# Enable debug logging
export REVENIUM_LOG_LEVEL=DEBUG

# Or when running your script
REVENIUM_LOG_LEVEL=DEBUG python your_script.py
```

Available log levels:
- `DEBUG`: Detailed debugging information (provider detection, routing decisions, metering payloads)
- `INFO`: General information (default)
- `WARNING`: Warning messages only
- `ERROR`: Error messages only
- `CRITICAL`: Critical error messages only

## Compatibility

- Python 3.8+
- Works with all supported AI provider SDKs (latest versions recommended)
- Thread-safe and production-ready for concurrent applications

## Documentation

For detailed documentation, visit [docs.revenium.io](https://docs.revenium.io)

## Contributing

See [CONTRIBUTING.md](./CONTRIBUTING.md)

## Code of Conduct

See [CODE_OF_CONDUCT.md](./CODE_OF_CONDUCT.md)

## Security

See [SECURITY.md](./SECURITY.md)

## License

This project is licensed under the MIT License - see the [LICENSE](./LICENSE) file for details.

## Support

For issues, feature requests, or contributions:

- **Website**: [www.revenium.ai](https://www.revenium.ai)
- **GitHub Repository**: [revenium/revenium-python-sdk](https://github.com/revenium/revenium-python-sdk)
- **Issues**: [Report bugs or request features](https://github.com/revenium/revenium-python-sdk/issues)
- **Documentation**: [docs.revenium.io](https://docs.revenium.io)
- **Email**: support@revenium.io

---

**Built by Revenium**
