Metadata-Version: 2.4
Name: keiro
Version: 0.10.0
Summary: Keiro client — call the EB1 multi-model ensemble API.
Author: Keiro Engineering
License-Expression: LicenseRef-Proprietary
Project-URL: Homepage, https://pypi.org/project/keiro/
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.11,<3.14
Description-Content-Type: text/markdown
Requires-Dist: requests>=2.32.2
Requires-Dist: PyYAML>=6.0.1
Requires-Dist: rich>=13.0.0
Provides-Extra: dev
Requires-Dist: pytest>=8.3.2; extra == "dev"
Requires-Dist: ruff>=0.12.0; extra == "dev"

# Keiro

EB1 multi-model ensemble inference. Run multiple frontier models in parallel
and synthesize the best response.

## Quick start

```bash
pip install keiro
keiro setup
```

```python
from keiro import models

print(models("eb1-preview", "What is machine learning?"))
```

Or from the command line:

```bash
keiro "What is machine learning?"
```

## How it works

EB1 sends your prompt to multiple frontier models (Claude, GPT, Gemini) in
parallel, then a judge synthesizes the strongest elements of each draft into
a single response. The result is typically more accurate and more complete
than any individual model's answer.
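
The fan-out/judge flow is easy to picture with plain `concurrent.futures`.
The sketch below is illustrative only: the real pipeline runs server-side,
and `call_model`, `judge`, and `ensemble` are hypothetical stand-ins, not
part of the keiro client API.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the hosted pipeline; none of these names
# exist in the keiro client library.
def call_model(backend: str, prompt: str) -> str:
    return f"[{backend}] draft answer to: {prompt}"

def judge(prompt: str, drafts: list[str]) -> str:
    # A real judge would be another model call; picking the longest
    # draft is just a placeholder synthesis step.
    return max(drafts, key=len)

def ensemble(prompt: str, backends: list[str]) -> str:
    # Fan out: query every backend model in parallel ...
    with ThreadPoolExecutor(max_workers=len(backends)) as pool:
        drafts = list(pool.map(lambda b: call_model(b, prompt), backends))
    # ... then synthesize the drafts into a single response.
    return judge(prompt, drafts)

print(ensemble("What is ML?", ["claude", "gpt", "gemini"]))
```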

## Models

| Model | Description |
|-------|-------------|
| `eb1-preview` (default) | Adaptive GNN-routed ensemble |
| `eb1-delta-preview` | Adaptive ensemble with orchestration |
| `eb1` | Standard 5-model ensemble |
| `eb1-pro` | Extended 6-model ensemble |
| `eb1-frontier` | Highest quality, max reasoning |
| `eb1-codex` | Optimized for code and SWE tasks |
| `eb1-fast` | Low latency, lighter models |
| `eb1-fast-preview` | Adaptive routing, low latency |
| `eb1-frontier-preview` | Adaptive routing, max quality |
| `claude-opus-4-6` | Direct passthrough (no ensemble) |
| `gpt-5.2` | Direct passthrough |

```python
from keiro import models

# Default adaptive ensemble
answer = models("eb1-preview", "Solve this step by step: what is 23 * 47?")

# Max quality
answer = models("eb1-frontier", "Prove that sqrt(2) is irrational.")

# Low latency
answer = models("eb1-fast", "Summarize this in one sentence.")

# Direct passthrough to a single model
answer = models("claude-opus-4-6", "Write a haiku")
```

## Prompt-first API

```python
from keiro import models

# Structured response with usage metadata
reply = models.response("eb1-preview", "Explain quantum computing.")
print(reply.text)
print(reply.usage)

# Reusable model binding with fixed parameters
creative = models.instance("eb1-preview", temperature=0.8)
print(creative("Write a limerick about debugging."))

# Streaming
for chunk in models.stream("eb1-preview", "Draft a launch email."):
    print(chunk, end="")
```

## Full client

```python
from keiro import Client

client = Client()

# Chat completions API
response = client.chat(
    messages=[{"role": "user", "content": "Explain quantum computing."}],
    model="eb1-preview",
)
print(response["choices"][0]["message"]["content"])

# Rate limit visibility
print(client.rate_limits)
# RateLimitInfo(limit_requests=1000, remaining_requests=999, ...)

client.close()
```

## CLI

```bash
keiro "What is ML?"                 # one-shot response
keiro                               # interactive REPL
keiro -m eb1-fast "Quick answer"    # specific model
echo context | keiro "Summarize"    # pipe context as input
keiro setup                         # configure credentials
keiro models                        # list available models
```

## Configuration

**Interactive setup** (recommended):

```bash
keiro setup
```

This validates your API key against the gateway and saves credentials to
`~/.keiro/credentials`.

**Environment variables**:

```bash
export KEIRO_API_KEY="your-api-key"
```

**Explicit arguments**:

```python
from keiro import Client

client = Client(api_key="your-key")
```

Precedence: explicit arguments > environment variables > credentials file.
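
The same order can be written as a small resolver. This is a sketch of the
rule above, not keiro's actual loader, and it assumes the credentials file
holds a plain-text key (the on-disk format may differ):

```python
import os
from pathlib import Path

def resolve_api_key(explicit: str | None = None) -> str | None:
    # Sketch of the documented precedence; not keiro's implementation.
    if explicit:                                # 1. explicit argument
        return explicit
    if key := os.environ.get("KEIRO_API_KEY"):  # 2. environment variable
        return key
    creds = Path.home() / ".keiro" / "credentials"
    if creds.is_file():                         # 3. credentials file
        # Assumes a plain-text key; the real format may differ.
        return creds.read_text().strip() or None
    return None
```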

## Requirements

- Python 3.11+
- No GPU required (inference runs on hosted infrastructure)
