Metadata-Version: 2.4
Name: codex-agent-framework
Version: 0.1.8
Summary: A lightweight event-driven Codex agent runtime.
Author: Baptiste
License-Expression: MIT
Keywords: agent,ai,codex,openai,tools
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: audioop-lts; python_version >= "3.13"
Requires-Dist: beautifulsoup4
Requires-Dist: codex-backend-sdk
Requires-Dist: fastapi
Requires-Dist: filetype
Requires-Dist: get-gecko-driver
Requires-Dist: modict
Requires-Dist: numpy
Requires-Dist: odfpy
Requires-Dist: openai
Requires-Dist: openpyxl
Requires-Dist: pathspec
Requires-Dist: pillow
Requires-Dist: pydub
Requires-Dist: pypdf
Requires-Dist: pynteract
Requires-Dist: python-docx
Requires-Dist: PyYAML
Requires-Dist: regex
Requires-Dist: requests
Requires-Dist: rich
Requires-Dist: selenium
Requires-Dist: textual
Requires-Dist: tiktoken
Requires-Dist: trafilatura
Requires-Dist: uvicorn
Provides-Extra: dev
Requires-Dist: build; extra == "dev"
Requires-Dist: pytest; extra == "dev"
Dynamic: license-file

# codex-agent

`codex-agent-framework` is a lightweight Python runtime for building interactive, tool-using AI agents.

It provides a reusable `Agent` abstraction with persistent sessions, local tools, slash commands, contextual providers, voice hooks, image handling, and document-reading helpers. The package can be used as a CLI assistant or embedded as a library in your own application.

> Status: early alpha. APIs are still evolving.

## Highlights

- Local FastAPI agent server with a terminal TUI client and GTK tray controller.
- Process-isolated agent runtime so the local server stays responsive while the agent works.
- Persistent JSON sessions with `latest`, explicit session IDs, and session navigation commands.
- Durable RAG memory backed by runtime JSON, with semantic search and automatic turn summaries.
- Scheduled wakeups for one-shot or periodic autonomous agent turns.
- Built-in local tools for reading, writing, editing, showing, observing, Bash, and Python execution.
- Extensible tool, command, and provider decorators.
- Event-driven internals for UI, streaming, voice, and automation integrations.
- Document extraction helpers for folders, text files, URLs, PDFs, DOCX, XLSX, ODT, HTML, and more.
- Optional image generation, image observation, voice, and LaTeX integration points.
- Packaged prompt assets and runtime defaults.

## Requirements

- Python 3.10 or newer.
- A working model backend configuration compatible with `codex-backend-sdk` / OpenAI usage.
- On Linux, the tray controller requires GTK 3 and either Ayatana AppIndicator or AppIndicator bindings.
- Firefox / GeckoDriver may be needed for Selenium-backed web extraction paths.

## Installation

From a local checkout:

```bash
python -m pip install -e .
```

For development:

```bash
python -m pip install -e '.[dev]'
```

## Quick start

Run the local server plus terminal TUI:

```bash
codex-agent
```

Run only the long-lived local server:

```bash
codex-agent server
```

Connect a TUI to an already running server:

```bash
codex-agent chat
```

Start only the tray controller:

```bash
codex-agent tray
```

Install the user services for the server and tray controller:

```bash
codex-agent install-service
```

Or use the agent from Python:

```python
from codex_agent import Agent

agent = Agent(
    session="new",
    username="Baptiste",
    voice_enabled=False,
)

agent("Summarize this project in three bullet points.")
```

For an interactive Python-driven session:

```python
from codex_agent import Agent

agent = Agent(session="latest", voice_enabled=False)
agent.interact()
```

## Runtime directory and sessions

By default, local runtime state is stored in:

```text
~/.agent_runtime
```

This directory contains, among other things:

```text
sessions/      persisted conversation histories as JSON
workfolder/    generated or uploaded files
tools/         user runtime tools
providers/     user runtime context providers
commands/      user runtime slash commands
images/        generated or persisted image outputs
memory.json    durable RAG memory entries
wakeups.json   scheduled autonomous wakeups
tui.json       currently registered TUI client process
```

You can override the runtime location with:

```bash
AGENT_RUNTIME_DIR=/tmp/my-agent-runtime codex-agent
```
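
Scripts that need to find the same state can mirror this lookup. The following is a minimal sketch, assuming the package falls back to `~/.agent_runtime` when `AGENT_RUNTIME_DIR` is unset or empty; `resolve_runtime_dir` is a hypothetical helper, not part of the package API.

```python
import os
from pathlib import Path

DEFAULT_RUNTIME_DIR = "~/.agent_runtime"  # default location documented above


def resolve_runtime_dir(env=None) -> Path:
    """Return the runtime directory, honoring AGENT_RUNTIME_DIR when set.

    Hypothetical helper: the package's own resolution logic may differ.
    """
    env = os.environ if env is None else env
    raw = env.get("AGENT_RUNTIME_DIR") or DEFAULT_RUNTIME_DIR
    return Path(raw).expanduser()


# Explicit override, then the default with a documented subfolder appended.
print(resolve_runtime_dir({"AGENT_RUNTIME_DIR": "/tmp/my-agent-runtime"}))
print(resolve_runtime_dir({}) / "sessions")
```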

Session behavior:

- `Agent(session="new")` starts a fresh session.
- `Agent(session="latest")` resumes the newest saved session.
- `Agent(session="<session_id>")` loads a specific saved session.
- `Agent(session="/path/to/session.json")` loads a session file directly.

Session IDs are timestamp-based and lexicographically sortable.
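
Because the IDs sort lexicographically, the newest session is simply the largest ID. The sketch below illustrates the property with a hypothetical `YYYYMMDD-HHMMSS` format; the ID format the package actually uses is not specified here and may differ.

```python
from datetime import datetime


def make_session_id(moment: datetime) -> str:
    # Hypothetical format: fixed-width fields, most significant first.
    return moment.strftime("%Y%m%d-%H%M%S")


ids = [
    make_session_id(datetime(2025, 3, 9, 8, 15, 0)),
    make_session_id(datetime(2024, 12, 31, 23, 59, 59)),
    make_session_id(datetime(2025, 3, 10, 7, 0, 0)),
]

# Plain string ordering matches chronological ordering, so max() is the newest.
latest = max(ids)
print(sorted(ids))
print(latest)
```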

## Built-in slash commands

Inside the interactive agent, commands start with `/`.

Common commands:

```text
/help                         list available commands
/sessions                     list saved sessions
/new_session                  create a new session
/load_session latest          load latest session
/load_session <session_id>    load a specific session
/delete_session <session_id>  delete a session
/next_session                 move to the next/newer session
/previous_session             move to the previous/older session
/compact                      compact completed history turns
/config                       show model-related config
/config model=gpt-test verbosity=low  update model-related config
/model                        show current model
/model gpt-test               update model
/reasoning high               update reasoning effort
/verbosity low                update verbosity
```
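
To make the argument shapes above concrete, here is a rough sketch of how a command line such as `/config model=gpt-test verbosity=low` could be split into a name, positional arguments, and `key=value` pairs. `parse_command` is a hypothetical function written for illustration; the package's real parser may behave differently.

```python
import shlex


def parse_command(line: str):
    """Split '/name arg key=value' into (name, positional, keyword) parts."""
    if not line.startswith("/"):
        raise ValueError("commands must start with '/'")
    name, *tokens = shlex.split(line[1:])
    args, kwargs = [], {}
    for token in tokens:
        key, sep, value = token.partition("=")
        if sep and key.isidentifier():
            kwargs[key] = value  # key=value pair
        else:
            args.append(token)  # plain positional argument
    return name, args, kwargs


print(parse_command("/config model=gpt-test verbosity=low"))
print(parse_command("/load_session latest"))
```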

## Built-in tools

The default agent registers local tools that can be exposed to the model:

| Tool | Purpose |
| --- | --- |
| `read` | Extract text from files, folders, URLs, and common document formats. |
| `write` | Write or overwrite one or more complete UTF-8 text files. |
| `edit` | Apply exact-string replacements to local text files. |
| `python` | Execute Python in a persistent interactive shell. |
| `bash` | Execute shell commands. |
| `observe` | Load an image into the conversation for visual analysis. |
| `show` | Open a file, folder, or URL with the system default app/browser. |
| `memory_add` / `memory_edit` / `memory_delete` / `memory_search` | Manage durable semantic memory entries. |
| `schedule_wakeup` / `cancel_wakeup` / `list_wakeups` | Manage automatic future turns. |
| `open_tui` / `close_tui` | Open or close the local TUI through the running server. |

Use these with care: Bash, Python, write, and edit run with the current user's privileges.

## Local server, TUI, and tray

The server exposes the agent through a thin FastAPI bridge. Its REST and SSE endpoints intentionally mirror the Python agent surface where practical, returning the same `modict`/dict-backed payloads used internally.

Key endpoints:

```text
GET  /health
GET  /status
GET  /config
GET  /session
GET  /sessions
GET  /messages
GET  /memory
GET  /tools
GET  /wakeups
GET  /events
GET  /events/replay
POST /turns
POST /interrupt
POST /tui/open
POST /tui/close
POST /restart
```

The TUI is a visual client only. It connects over SSE, replays the latest turn when opened mid-session, tracks event sequence cursors, and reconnects after server restarts or transient stream loss. The server accepts one TUI client at a time, while allowing the same client process to replace a stale SSE subscription during reconnect.
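
The reconnect behavior depends on the client remembering the last event sequence it processed. The sketch below shows that idea in isolation, assuming every SSE event carries an `id:` field holding an integer sequence number. The wire format, the event type names, and the `parse_sse` and `CursorTracker` names are all illustrative assumptions, not the package's actual protocol.

```python
import json


def parse_sse(lines):
    """Yield (event_id, data) pairs from an iterable of SSE text lines."""
    event_id, data = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line == "":  # a blank line terminates one event
            if data:
                yield event_id, "\n".join(data)
            event_id, data = None, []
        elif line.startswith("id:"):
            event_id = line[3:].strip()
        elif line.startswith("data:"):
            data.append(line[5:].removeprefix(" "))


class CursorTracker:
    """Drop events at or below the last sequence number already seen."""

    def __init__(self):
        self.cursor = 0

    def accept(self, event_id):
        seq = int(event_id)
        if seq <= self.cursor:
            return False  # duplicate delivered again after a reconnect
        self.cursor = seq
        return True


# Hypothetical stream in which event 2 is replayed after a reconnect.
stream = [
    "id: 1", 'data: {"type": "response_start"}', "",
    "id: 2", 'data: {"type": "delta"}', "",
    "id: 2", 'data: {"type": "delta"}', "",
    "id: 3", 'data: {"type": "response_done"}', "",
]

tracker = CursorTracker()
seen = [json.loads(d)["type"] for i, d in parse_sse(stream) if tracker.accept(i)]
print(seen, tracker.cursor)
```

On reconnect, a client built this way would pass its stored cursor to the server so that replay can resume from the next event.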

The tray can start or stop the user service, open or close the TUI, and keep the local agent available independently of the terminal UI.

## Extending the agent

### Define a tool

```python
from codex_agent import Agent, tool

@tool
def add(a: int, b: int) -> int:
    """Return the sum of two integers."""
    return a + b

agent = Agent(session="new", voice_enabled=False)
agent.add_tool(add)
```

### Define a slash command

```python
from codex_agent import Agent, command, get_agent

@command
def hello(name="world"):
    return f"Hello, {name}! Current session: {get_agent().current_session_id}"

agent = Agent(session="new", voice_enabled=False)
agent.add_command(hello)
print(agent("/hello Baptiste"))
```

### Define a context provider

Providers inject ephemeral context into each model call. Their output is not persisted in the session.

```python
from codex_agent import Agent, provider

@provider
def app_context():
    return "The user is working on the codex-agent repository."

agent = Agent(session="new", voice_enabled=False)
agent.add_provider(app_context)
```

### Runtime extensions

The agent also loads Python modules from the runtime directory:

```text
~/.agent_runtime/tools/*.py
~/.agent_runtime/providers/*.py
~/.agent_runtime/commands/*.py
```

Decorated functions in those files are registered automatically when the agent starts.
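
Directory-based loading of this kind is commonly built on `importlib`. The sketch below is a simplified stand-in, not the package's loader: `REGISTRY`, the toy `tool` decorator, and `load_extensions` are hypothetical, and the decorator is injected into each module only to keep the example self-contained. A real extension file would presumably import `tool` from `codex_agent` instead.

```python
import importlib.util
import tempfile
from pathlib import Path

REGISTRY = {}  # hypothetical stand-in for the agent's tool registry


def tool(func):
    """Toy decorator that marks a function for registration."""
    func._is_tool = True
    return func


def load_extensions(folder: Path) -> None:
    """Import every *.py file in `folder` and register marked functions."""
    for path in sorted(folder.glob("*.py")):
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        module.tool = tool  # inject the toy decorator so the file can use it
        spec.loader.exec_module(module)
        for name, obj in vars(module).items():
            if callable(obj) and getattr(obj, "_is_tool", False):
                REGISTRY[name] = obj


# Write a throwaway extension file, load it, then call the registered tool.
with tempfile.TemporaryDirectory() as tmp:
    tools_dir = Path(tmp)
    (tools_dir / "shout.py").write_text(
        "@tool\ndef shout(text):\n    return text.upper() + '!'\n"
    )
    load_extensions(tools_dir)

print(REGISTRY["shout"]("hello"))
```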

## Events

`Agent` exposes an event bus for UI and automation integrations.

Example:

```python
from codex_agent import Agent, MessageAddedEvent

agent = Agent(session="new", voice_enabled=False)

@agent.on(MessageAddedEvent)
def log_message(event):
    print(event.message.type)
```

Useful exported events include:

- `MessageAddedEvent`
- `ResponseStartEvent`
- `ResponseContentDeltaEvent`
- `ResponseDoneEvent`
- `ToolCallStartEvent`
- `ToolCallDoneEvent`
- `AgentInterruptedEvent`
- `AudioPlaybackEvent`
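
For readers unfamiliar with the pattern, the following toy event bus shows how an `on(...)` decorator keyed by event class can work. It is a conceptual sketch only: `EventBus` and `MessageAdded` are hypothetical stand-ins, and the real `Agent` bus may dispatch differently (for example asynchronously or with subclass matching).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class MessageAdded:  # hypothetical stand-in for MessageAddedEvent
    text: str


class EventBus:
    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event_type):
        """Decorator that subscribes a handler to one event class."""
        def register(handler):
            self._handlers[event_type].append(handler)
            return handler
        return register

    def emit(self, event):
        # Dispatch to handlers registered for the event's exact class.
        for handler in self._handlers[type(event)]:
            handler(event)


bus = EventBus()
log = []


@bus.on(MessageAdded)
def record(event):
    log.append(event.text)


bus.emit(MessageAdded("hello"))
print(log)
```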

## Configuration

`Agent` accepts configuration through keyword arguments:

```python
agent = Agent(
    session="latest",
    model="gpt-5.4",
    reasoning_effort="medium",
    verbosity="medium",
    input_token_limit=128000,
    auto_compact=True,
    web_search_enabled=False,
    image_generation_enabled=False,
    voice_enabled=False,
)
```

Configuration is persisted to `agent_config.json` in the runtime directory when updated through agent helpers or slash commands.

## Project layout

```text
codex_agent/                    Python package
codex_agent/agent.py            Agent, AgentConfig, AgentSession
codex_agent/builtin_tools.py    Built-in local tools
codex_agent/builtin_commands.py Built-in slash commands
codex_agent/builtin_providers.py Built-in context providers
codex_agent/prompts/            Packaged prompt templates
codex_agent/get_text/           Document extraction helpers
tests/                          Test suite
pyproject.toml                  Package metadata and build config
MANIFEST.in                     Source distribution includes
```

## Testing

Run the full suite:

```bash
python -m pytest
```

The tests isolate `AGENT_RUNTIME_DIR` automatically, so they should not create or resume sessions from your real `~/.agent_runtime`.
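
The same kind of isolation can be useful in demos and one-off scripts. The sketch below is one way to do it with the standard library, assuming the package reads `AGENT_RUNTIME_DIR` when an agent is created rather than at import time; `isolated_runtime` is a hypothetical helper and is not how the test suite itself is implemented.

```python
import os
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def isolated_runtime():
    """Point AGENT_RUNTIME_DIR at a throwaway directory, then restore it."""
    previous = os.environ.get("AGENT_RUNTIME_DIR")
    with tempfile.TemporaryDirectory() as tmp:
        os.environ["AGENT_RUNTIME_DIR"] = tmp
        try:
            yield Path(tmp)
        finally:
            if previous is None:
                os.environ.pop("AGENT_RUNTIME_DIR", None)
            else:
                os.environ["AGENT_RUNTIME_DIR"] = previous


with isolated_runtime() as runtime_dir:
    # Under the assumption above, an Agent created here would use runtime_dir.
    print(os.environ["AGENT_RUNTIME_DIR"] == str(runtime_dir))
```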

Current baseline:

```text
293 passed
```

## Packaging

Build source and wheel distributions with:

```bash
python -m pip install build
python -m build
```

The distribution includes prompt text files and `codex_agent/get_text/default_gitignore` through package data and `MANIFEST.in`.

## Recent changes

- `0.1.8`: scope TUI replay/SSE catch-up to the active session and make bash/python subprocesses inherit the project Python environment, including service-launched agents.
- `0.1.7`: add durable RAG memory, scheduled wakeups, process-isolated server runtime, tray/service controls, robust SSE replay/reconnect, richer TUI status, and improved token estimates.
- `0.1.6`: add the FastAPI REST/SSE bridge, HTTP/SSE client, async-style agent mainloop, and decoupled TUI operation.
- `0.1.5`: refine the Textual chat UI with pure black backgrounds, subtler markers/scrollbars, and English UI labels.
- `0.1.4`: make the Textual REPL UI the default, add multiline input, assistant turn/step events, cleaner tool-call rendering, and `/voice` configuration.
- `0.1.3`: pass the current session id as `prompt_cache_key` to the Codex backend for prompt cache reuse.
- `0.1.2`: add `/clear` in the terminal chat to clear the screen without changing session history.

## Safety notes

This project is designed to let an AI assistant act on the local machine. That is powerful and potentially risky.

Recommended practices:

- Use a dedicated runtime directory for experiments.
- Review tool calls before enabling autonomous workflows.
- Avoid running the agent with elevated privileges.
- Keep secrets out of prompts, logs, and committed runtime files.
- Prefer temporary workfolders in tests and demos.

## License

MIT. See [LICENSE](LICENSE).
