Metadata-Version: 2.4
Name: agenteval_sdk
Version: 0.3.5
Summary: Telemetry and Observability SDK for AI Agents — with Trace Tree MRI
License-Expression: MIT
Keywords: ai,agents,observability,tracing,llm
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.8
Description-Content-Type: text/markdown
Requires-Dist: requests
Requires-Dist: flask>=2.0

# agenteval_sdk

Python SDK for the [AgentEval](https://agenteval.in) observability platform. Instrument your AI agents in minutes and get **Trace Tree MRI** — a full visual breakdown of every agent decision, latency, PII exposure, and cost.

## Install

```bash
pip install agenteval_sdk
```

## Quick start

```python
import os
from agenteval_sdk import AgentEvalClient, trace

client = AgentEvalClient(
    api_key=os.environ["AGENTEVAL_API_KEY"],
    project_id="aml-system",
)

# Decorate each agent function — span hierarchy is wired automatically
@trace(name="Orchestrator", latency_threshold=5000)
def run_case(case_id: str) -> str:
    result = kyc_agent(case_id)
    return result

@trace(name="KYC_Agent", latency_threshold=3000)
def kyc_agent(case_id: str) -> str:
    output = "KYC: PASS"
    client.log_trace(
        agent_name="KYC_Agent",
        input_text=f"Check case {case_id}",
        output_text=output,
        latency_ms=340,
        tokens_used=180,
        status="pass",
        pii_scan=True,          # lightweight regex pre-flight before Presidio
    )
    return output

run_case("TXN-9921")
```

This produces a Trace Tree visible in the AgentEval dashboard — parent/child spans, Gantt swimlane, PII audit trail, and cost estimate — with zero manual wiring.

## `@trace` decorator

```python
@trace                                           # bare
@trace(name="MyAgent")                          # custom display name
@trace(name="MyAgent", latency_threshold=2000)  # red UI alert if > 2 s
```

`latency_threshold` (ms) controls when the node appears red in the Trace Tree UI.  
Nested agents inherit the parent's threshold unless they define their own.

## `AgentEvalClient.log_trace()`

| Parameter | Type | Description |
|-----------|------|-------------|
| `agent_name` | `str` | Display name in the UI |
| `input_text` | `str` | Agent input / prompt |
| `output_text` | `str` | Agent response |
| `latency_ms` | `int` | Wall-clock time for this span |
| `tokens_used` | `int` | Token count (for cost estimate) |
| `status` | `str` | `"pass"` / `"fail"` / `"NEW"` |
| `trace_id` | `str` | Auto-generated if omitted |
| `session_id` | `str` | Group related traces (e.g. one customer case) |
| `pii_scan` | `bool` | Client-side regex PII scan before sending |
| `pii_types` | `list[str]` | Explicit PII types (overrides `pii_scan`) |
| `span_id` | `str` | Auto-populated from `@trace` ContextVar |
| `parent_span_id` | `str` | Auto-populated from `@trace` ContextVar |
| `latency_threshold` | `int` | Per-span override (ms) |

## Standalone `log_trace()`

For lighter usage without `AgentEvalClient`:

```python
from agenteval_sdk import log_trace

log_trace(
    trace_id="abc-123",
    input_text="Hello",
    output_text="World",
    latency_ms=200,
    tokens_used=50,
    status="pass",
    api_key=os.environ["AGENTEVAL_API_KEY"],
    pii_scan=True,
)
```

## Environment variables

| Variable | Default | Description |
|----------|---------|-------------|
| `AGENTEVAL_API_URL` | `https://sandbox.agenteval.in/trace` | Ingestion endpoint |
| `AGENTEVAL_API_KEY` | — | API key (or pass directly) |

## Offline / local-first

`AgentEvalClient` writes every trace to a local SQLite database before attempting the network POST. No traces are lost on network failure. Call `client.flush()` to retry unsynced traces.

## What you get in the dashboard

- **Trace Tree MRI** — interactive D3 tree of agent spans, coloured red for slow nodes
- **Gantt swimlane** — execution timeline per span  
- **PII Audit Trail** — which agents saw PII, what types, redacted by Presidio  
- **Cost estimate** — token usage → USD at GPT-4o-mini rates  
- **Kill switch** — freeze all agent traffic instantly from the operator panel

