Metadata-Version: 2.4
Name: atomicguard
Version: 2.24.2
Summary: A Dual-State Agent Framework for reliable LLM code generation with guard-validated loops
Author-email: Matthew Thompson <thompsonson@gmail.com>
Maintainer-email: Matthew Thompson <thompsonson@gmail.com>
License: MIT
Project-URL: Homepage, https://github.com/thompsonson/atomicguard
Project-URL: Repository, https://github.com/thompsonson/atomicguard
Project-URL: Documentation, https://github.com/thompsonson/atomicguard#readme
Project-URL: Issues, https://github.com/thompsonson/atomicguard/issues
Project-URL: Changelog, https://github.com/thompsonson/atomicguard/blob/main/CHANGELOG.md
Keywords: llm,agents,code-generation,neuro-symbolic,guards,ai,validation
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Code Generators
Classifier: Typing :: Typed
Requires-Python: >=3.12
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: matplotlib>=3.10.0
Requires-Dist: openhands-ai>=0.27.0
Requires-Dist: pydantic-ai>=1.0.0
Requires-Dist: pyflakes>=3.0
Requires-Dist: click>=8.1.0
Provides-Extra: web
Requires-Dist: fastapi>=0.115.0; extra == "web"
Requires-Dist: uvicorn[standard]>=0.34.0; extra == "web"
Requires-Dist: jinja2>=3.0.0; extra == "web"
Requires-Dist: markdown>=3.7; extra == "web"
Provides-Extra: experiment
Requires-Dist: datasets>=2.0.0; extra == "experiment"
Requires-Dist: huggingface_hub>=0.20; extra == "experiment"
Requires-Dist: swebench>=4.1.0; extra == "experiment"
Requires-Dist: docker>=7.0.0; extra == "experiment"
Dynamic: license-file

# AtomicGuard

[![CI](https://github.com/thompsonson/atomicguard/actions/workflows/ci.yml/badge.svg)](https://github.com/thompsonson/atomicguard/actions/workflows/ci.yml)
[![codecov](https://codecov.io/gh/thompsonson/atomicguard/branch/main/graph/badge.svg)](https://codecov.io/gh/thompsonson/atomicguard)
[![PyPI version](https://badge.fury.io/py/atomicguard.svg)](https://badge.fury.io/py/atomicguard)
[![Python versions](https://img.shields.io/pypi/pyversions/atomicguard.svg)](https://pypi.org/project/atomicguard/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

A Dual-State Agent Framework for reliable LLM code generation.

## Why AtomicGuard?

AI agents hallucinate. Worse, those hallucinations **compound** — each generation builds on the last, and errors propagate through the workflow.

AtomicGuard addresses this by combining two principles. First, it **decomposes goals** into small, measurable tasks. Second, it enforces **Bounded Indeterminacy**: the LLM generates content, but a deterministic state machine controls the logic, and every generation is validated before the workflow advances.

| Challenge | Solution |
|-----------|----------|
| 🛡️ **Safety** | Dual-State Architecture & Atomic Action Pairs |
| 💾 **State** | Versioned Repository Items & Configuration Snapshots |
| 🌐 **Scale** | Multi-Agent Coordination via Shared DAG |
| 📈 **Improvement** | Continuous Learning from Guard Verdicts |

→ [Learn more about the architecture](docs/design/architecture.md)

> **New to AtomicGuard?** Start with the [Getting Started Guide](docs/guide/getting-started.md).

**Paper:** *Managing the Stochastic: Foundations of Learning in Neuro-Symbolic Systems for Software Engineering* (Thompson, 2025)

## Overview

AtomicGuard implements guard-validated generation loops that dramatically improve LLM reliability. The core abstraction is the **Atomic Action Pair** ⟨agen, G⟩ — coupling each generation action with a validation guard.
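The loop behind an Atomic Action Pair can be sketched in plain Python. This is an illustrative sketch of the idea, not AtomicGuard's implementation; the function names and the `r_max` parameter here are local stand-ins, not library API:

```python
# Illustrative sketch of a guard-validated generation loop:
# couple a generator with a guard, and retry until the guard
# passes or the retry budget r_max is exhausted.

def run_action_pair(generate, guard, prompt, r_max=3):
    """Retry `generate` until `guard` accepts the artifact."""
    feedback = None
    for attempt in range(1, r_max + 1):
        artifact = generate(prompt, feedback)
        ok, feedback = guard(artifact)
        if ok:
            return artifact  # guard verdict: pass
    raise RuntimeError(f"guard rejected all {r_max} attempts: {feedback}")

# Toy generator: produces a buggy artifact first, then "fixes" it
# once it sees guard feedback.
def toy_generate(prompt, feedback):
    if feedback:
        return "def add(a, b): return a + b"
    return "def add(a, b): return a - b"

# Toy guard: execute the candidate and check a unit test.
def toy_guard(code):
    ns = {}
    exec(code, ns)
    try:
        assert ns["add"](2, 3) == 5
        return True, None
    except AssertionError:
        return False, "add(2, 3) != 5"

result = run_action_pair(toy_generate, toy_guard, "write add")
print(result)  # the first artifact that passed the guard
```

Note how guard feedback flows back into the next generation attempt; this is what keeps errors from compounding across the workflow.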

Key results (Yi-Coder 9B, n=50):

| Task | Baseline | Guarded | Improvement |
|------|----------|---------|-------------|
| Template | 35% | 90% | +55pp |
| Password | 82% | 98% | +16pp |
| LRU Cache | 94% | 100% | +6pp |

## Installation

```bash
# From PyPI
pip install atomicguard

# From source
git clone https://github.com/thompsonson/atomicguard.git
cd atomicguard
uv venv && source .venv/bin/activate
uv pip install -e ".[dev,test]"
```

## Quick Start

```python
from atomicguard import (
    OllamaGenerator, SyntaxGuard, TestGuard,
    CompositeGuard, ActionPair, DualStateAgent,
    InMemoryArtifactDAG
)

# Setup
generator = OllamaGenerator(model="qwen2.5-coder:7b")
guard = CompositeGuard([SyntaxGuard(), TestGuard("assert add(2, 3) == 5")])
action_pair = ActionPair(generator=generator, guard=guard)
agent = DualStateAgent(action_pair, InMemoryArtifactDAG(), rmax=3)

# Execute
artifact = agent.execute("Write a function that adds two numbers")
print(artifact.content)
```

See [examples/](examples/) for more detailed usage, including a [mock example](examples/basic_mock.py) that works without an LLM.

## LLM Backends

AtomicGuard supports multiple LLM backends. Each generator implements `GeneratorInterface` and can be swapped in with no other code changes.
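The swap-in property comes from every backend sharing one call signature. The sketch below illustrates the pattern with a structural `Protocol`; the method name `generate` and the `EchoGenerator` class are hypothetical stand-ins, not the real `GeneratorInterface`:

```python
from typing import Protocol

# Hypothetical sketch of the backend-swapping pattern: any object that
# satisfies the shared protocol can replace any other with no code changes.

class Generator(Protocol):
    def generate(self, prompt: str) -> str: ...

class EchoGenerator:
    """Stand-in backend for running workflows without an LLM."""
    def generate(self, prompt: str) -> str:
        return f"# generated for: {prompt}"

def run(gen: Generator, prompt: str) -> str:
    # Callers depend only on the protocol, never on a concrete backend.
    return gen.generate(prompt)

out = run(EchoGenerator(), "add two numbers")
print(out)
```

This is the same reason the HuggingFace example below is a drop-in replacement for the Ollama one.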

### Ollama (local or cloud)

Uses the OpenAI-compatible API. Works with any Ollama-served model:

```python
from atomicguard.infrastructure.llm import OllamaGenerator

# Local instance (default: http://localhost:11434/v1)
generator = OllamaGenerator(model="qwen2.5-coder:7b")
```

### HuggingFace Inference API

Connects to HuggingFace Inference Providers via `huggingface_hub`. Supports any model available through the HF Inference API, including third-party providers like Together AI.

```bash
# Install the optional dependency
pip install huggingface_hub

# Set your API token
export HF_TOKEN="hf_your_token_here"
```

```python
from atomicguard.infrastructure.llm import HuggingFaceGenerator
from atomicguard.infrastructure.llm.huggingface import HuggingFaceGeneratorConfig

# Default: Qwen/Qwen2.5-Coder-32B-Instruct
generator = HuggingFaceGenerator()

# Custom model and provider
generator = HuggingFaceGenerator(HuggingFaceGeneratorConfig(
    model="Qwen/Qwen2.5-Coder-32B-Instruct",
    provider="together",       # or "auto", "hf-inference"
    temperature=0.7,
    max_tokens=4096,
))
```

Drop-in replacement in any workflow:

```python
from atomicguard import (
    SyntaxGuard, TestGuard, CompositeGuard,
    ActionPair, DualStateAgent, InMemoryArtifactDAG
)
from atomicguard.infrastructure.llm import HuggingFaceGenerator

generator = HuggingFaceGenerator()
guard = CompositeGuard([SyntaxGuard(), TestGuard("assert add(2, 3) == 5")])
action_pair = ActionPair(generator=generator, guard=guard)
agent = DualStateAgent(action_pair, InMemoryArtifactDAG(), rmax=3)

artifact = agent.execute("Write a function that adds two numbers")
print(artifact.content)
```

## Benchmarks

Run the simulation from the paper:

```bash
python -m benchmarks.simulation --model yi-coder:9b --trials 50 --task all --output results/results.db --format sqlite

# Generate report
python -m benchmarks.simulation --visualize --output results/results.db --format sqlite
```

## Project Structure

```
atomicguard/
├── src/atomicguard/     # Core library
├── benchmarks/          # Simulation code
├── docs/
│   ├── guide/           # User-facing documentation
│   ├── reference/       # Reference material (glossary, AP catalog, flow diagrams)
│   ├── design/          # Architecture, plans, extensions, decisions
│   ├── theory/          # Formal agent theory and domain definitions
│   └── blog/            # Blog posts
├── examples/            # Usage examples
└── results/             # Generated reports & charts
```

## Web Dashboard

The web dashboard lets you browse experiment results, manage workflow configurations, and inspect per-instance artifact DAGs.

### Starting the dashboard

```bash
# Auto-detect ./output directory
uv run python -m atomicguard.web --output-dir output/

# Or point to a specific experiment's artifact_dags
uv run python -m atomicguard.web --artifact-dir output/my_experiment/model/artifact_dags/

# Custom host/port
uv run python -m atomicguard.web --output-dir output/ --host 127.0.0.1 --port 3000
```

The dashboard serves at `http://localhost:8000` by default. The DAG viewer shows per-instance artifact graphs for reviewing workflow progress, guard feedback, and retries.

### Workflow management via database

Import workflow JSON configs and AP context into a SQLite database for editing in the UI:

```bash
# Import all workflows and AP context
uv run python -m atomicguard.web import-all \
  --workflows-dir examples/swe_bench_common/workflows/ \
  --ap-context examples/swe_bench_common/ap_context.json

# Export a workflow back to JSON
uv run python -m atomicguard.web export-workflow s1-tdd

# Export all AP context
uv run python -m atomicguard.web export-context
```

The database defaults to `~/.atomicguard/web.db`. Override with `--db path/to/db`.

For running the dashboard as a persistent background service (systemd, launchd, nginx reverse proxy), see [docs/guide/web-dashboard-service.md](docs/guide/web-dashboard-service.md).

### Legacy compatibility

The dashboard is also available via the old entry point:

```bash
uv run python -m examples.dashboard --output-dir output/
```

## Development

```bash
# Install dependencies
uv sync

# Run unit tests
just test

# Run all tests (unit + architecture)
just test-all

# Lint and format
just lint
just fmt

# Type check
just typecheck

# Full CI pipeline
just ci
```

See the `justfile` for all available commands.

## Citation

If you use this framework in your research, please cite the paper:

> Thompson, M. (2025). Managing the Stochastic: Foundations of Learning in Neuro-Symbolic Systems for Software Engineering. arXiv preprint arXiv:2512.20660.

```bibtex
@misc{thompson2025managing,
  title={Managing the Stochastic: Foundations of Learning in Neuro-Symbolic Systems for Software Engineering},
  author={Thompson, Matthew},
  year={2025},
  eprint={2512.20660},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2512.20660}
}
```

## License

MIT
