Metadata-Version: 2.4
Name: aef-framework
Version: 0.1.1
Summary: Standalone Agent Evaluation Framework (AEF)
Project-URL: Homepage, https://github.com/HewlettPackard/AEF
Project-URL: Repository, https://github.com/HewlettPackard/AEF
Project-URL: Documentation, https://github.com/HewlettPackard/AEF/tree/main/docs
Project-URL: Issues, https://github.com/HewlettPackard/AEF/issues
Author-email: Gayathri Saranathan <gayathri.saranathan@hpe.com>, Aalap Tripathy <aalap.tripathy@hpe.com>, Tarun Kumar <tarun.kumar2@hpe.com>
Maintainer-email: Gayathri Saranathan <gayathri.saranathan@hpe.com>, Aalap Tripathy <aalap.tripathy@hpe.com>, Tarun Kumar <tarun.kumar2@hpe.com>
License: Apache-2.0
License-File: LICENSE
Keywords: a2a,agent,agent-to-agent,ai,evaluation,framework,testing
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Testing
Requires-Python: >=3.10
Requires-Dist: certifi
Requires-Dist: deprecated
Requires-Dist: fastapi
Requires-Dist: google-adk
Requires-Dist: google-genai
Requires-Dist: langfuse
Requires-Dist: litellm
Requires-Dist: openai
Requires-Dist: python-dotenv
Requires-Dist: requests
Requires-Dist: streamlit
Requires-Dist: uvicorn
Description-Content-Type: text/markdown

# AEF - Agent Evaluation Framework

AEF is a framework to **generate tests, run and evaluate trajectories, collect feedback, and self-evolve** agent behavior.

The workflow is intentionally minimal and framework-agnostic:
- `aef generate` calls the **generation component/tool**
- `aef evaluate` calls the **evaluation component/tool**
- `aef feedback` calls the **feedback component/tool**
- `aef evolve` calls the **evolution component/tool**

Internally, these are routed through an **A2A bus** so the same flow works for sub-agents implemented with different frameworks.

---

## Installation

### From PyPI (Coming Soon)

Once published, install via pip or uv:

```bash
pip install aef-framework
```

or with uv:

```bash
uv pip install aef-framework
```

### Local Development Install with uv

AEF uses [uv](https://github.com/astral-sh/uv) for fast, reliable Python package management.

#### 1. Install uv (if not already installed)

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```

#### 2. Create a virtual environment

```bash
cd AEF
uv venv --python=3.11
```

This creates a `.venv` directory with Python 3.11 (or use `3.10`, `3.12` as needed).

#### 3. Activate the virtual environment

```bash
source .venv/bin/activate  # Linux/macOS
# or
.venv\Scripts\activate     # Windows
```

#### 4. Install AEF in editable mode

```bash
uv pip install -e .
```

This installs AEF and all dependencies, making the `aef` command available.

#### 5. Verify installation

```bash
aef --help
```

### Traditional pip install (local)

If you prefer using pip:

```bash
python -m venv .venv
source .venv/bin/activate
pip install -e .
```

---

## Core Principles

- **Universal sub-agent support** via adapter contract (`python`, `cli`, `http`)
- **Single essential loop**: Generate → Evaluate → Feedback → Evolve
- **Composable A2A components** instead of tightly-coupled command logic
- **Versioned evolution profiles** with before/after evaluation comparison

---

## Essential Workflow

### 1) Generate trajectories
```bash
aef generate --config configs/fleet_ccc_run.json --n 10
```

### 2) Evaluate against a golden run
```bash
aef evaluate --config configs/fleet_ccc_run.json --golden run_YYYYMMDD_xxxxxx
```

### 3) Submit feedback
```bash
aef feedback --agent fleet_ccc --text "Agent should ask confirmation before delete operations"
```

### 4) Evolve (auto-apply + compare)
```bash
aef evolve --config configs/fleet_ccc_run.json --n 10
```

`aef evolve` performs four steps:
1. run a baseline evaluation
2. classify feedback into amendments
3. apply the evolution profile
4. re-evaluate and report the before/after score delta
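
The evolve steps can be sketched as a small loop. Everything below is a hypothetical illustration with stub scoring and classification functions, not the real AEF evaluation or classifier logic.

```python
# Illustrative sketch of the evolve loop; all functions are stand-ins.

def evaluate(profile: dict) -> float:
    """Stub scorer: each applied amendment nudges the score upward."""
    return 0.5 + 0.1 * len(profile.get("prompt_addenda", []))

def classify_feedback(texts: list[str]) -> list[str]:
    """Stub classifier: turn raw feedback text into prompt amendments."""
    return [f"Amendment: {t}" for t in texts]

def evolve(feedback: list[str]) -> dict:
    profile: dict = {"prompt_addenda": []}
    baseline = evaluate(profile)                  # 1. baseline evaluate
    amendments = classify_feedback(feedback)      # 2. classify feedback
    profile["prompt_addenda"].extend(amendments)  # 3. apply evolution profile
    after = evaluate(profile)                     # 4. re-evaluate
    return {"before": baseline, "after": after, "delta": after - baseline}

report = evolve(["Ask confirmation before delete operations"])
print(report)
```

The key property is that the same evaluator runs before and after the profile is applied, so the reported delta isolates the effect of the amendments.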

---

## Use AEF With Any Sub-Agent

Set `agent.adapter_type` in your config:

- `python`: ADK/Python agent entrypoint `module_or_file.py:agent_var`
- `cli`: shell command template using `{step}` / `{goal}` placeholders
- `http`: endpoint that accepts `{ goal, step, session_id? }`
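
As a rough illustration, a config selecting the `cli` adapter might look like the sketch below. The exact schema lives in the docs linked underneath; the `name` and `command` fields here are assumptions for illustration, not the authoritative field names.

```json
{
  "agent": {
    "name": "fleet_ccc",
    "adapter_type": "cli",
    "command": "my-agent --goal '{goal}' --step '{step}'"
  }
}
```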

See detailed usage in [docs/USING_ANY_SUBAGENT.md](docs/USING_ANY_SUBAGENT.md).

Full prerequisites and onboarding checklist:
- [docs/ADOPTING_NEW_AGENT.md](docs/ADOPTING_NEW_AGENT.md)

---

## A2A Components

AEF components exposed through the internal bus:
- `generation.generate`
- `evaluation.evaluate`
- `feedback.submit_text`
- `feedback.submit_annotations`
- `evolution.evolve`

See [docs/A2A_COMPONENTS.md](docs/A2A_COMPONENTS.md).

---

## Evolution Outputs

Evolution applies and versions runtime amendments per agent under:

- `prompts/evolution_profiles/<agent>/latest.json`
- `prompts/evolution_profiles/<agent>/profile_<timestamp>.json`

These profiles contain:
- prompt addenda
- tool policies
- generator hints
- agent hints
- rubric updates
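
A profile file might look roughly like the following sketch. The field names are inferred from the list above and the timestamp is illustrative; consult the self-evolution docs for the actual schema.

```json
{
  "version": "profile_20250101_000000",
  "prompt_addenda": ["Ask for confirmation before delete operations"],
  "tool_policies": {"delete": "require_confirmation"},
  "generator_hints": [],
  "agent_hints": [],
  "rubric_updates": []
}
```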

See [docs/SELF_EVOLUTION.md](docs/SELF_EVOLUTION.md).

---

## Minimal Command Reference

```bash
# Generate
aef generate --config <config.json> --n 10

# Direct A2A tool call
aef a2a --config <config.json> --component generation --tool generate --payload '{"n": 2}'

# Evaluate golden by run id
aef evaluate --config <config.json> --golden <run_id>

# Feedback
aef feedback --agent <agent_name> --text "..."

# Evolve
aef evolve --config <config.json> --n 10

# Compare two eval runs
aef compare --run <run_a> --vs <run_b>

# Query runs / memory
aef query runs --agent <agent_name>
aef query memory --agent <agent_name> --all-memory
aef query memory --agent <agent_name> --history
```

---

## Documentation

- [docs/AEF_WORKFLOW.md](docs/AEF_WORKFLOW.md)
- [docs/A2A_COMPONENTS.md](docs/A2A_COMPONENTS.md)
- [docs/USING_ANY_SUBAGENT.md](docs/USING_ANY_SUBAGENT.md)
- [docs/SELF_EVOLUTION.md](docs/SELF_EVOLUTION.md)
- [docs/PUBLISHING.md](docs/PUBLISHING.md) - PyPI package publishing guide

---

## Contributing

Contributions are welcome! See [CONTRIBUTING.md](CONTRIBUTING.md) for development setup and guidelines.

---

## License

AEF is released under the Apache License 2.0. See [LICENSE](LICENSE) for details.
