Metadata-Version: 2.4
Name: agenterm
Version: 0.3.2
Summary: Terminal-native agent runtime with persistent sessions, branching, and MCP
Project-URL: Homepage, https://agenterm.ai
Project-URL: Documentation, https://agenterm.ai/docs
Project-URL: Repository, https://github.com/Tiziano-AI/agenterm
Project-URL: Issues, https://github.com/Tiziano-AI/agenterm/issues
Project-URL: Changelog, https://github.com/Tiziano-AI/agenterm/releases
Author: Tiziano Contorno
License-Expression: MIT
License-File: LICENSE
Keywords: agents,ai,cli,llm,mcp,openai,terminal
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Typing :: Typed
Requires-Python: >=3.12
Requires-Dist: aiosqlite>=0.22.1
Requires-Dist: mcp>=1.25.0
Requires-Dist: openai-agents[litellm]>=0.6.4
Requires-Dist: prompt-toolkit>=3.0.52
Requires-Dist: python-dotenv>=1.2.1
Requires-Dist: rich>=14.2.0
Requires-Dist: ruamel-yaml>=0.18.17
Requires-Dist: typer>=0.20.1
Description-Content-Type: text/markdown

# agenterm

**Terminal-native agent runtime** built on the OpenAI Agents SDK and MCP.

Persistent sessions. Branching conversations. Multi-provider. Local-first.

## Installation

```bash
# Recommended: isolated install (like pipx, but faster)
uv tool install agenterm

# Or try without installing
uvx agenterm --help

# Or traditional pip
pip install agenterm
```

After installation, run:

```bash
agenterm --version    # verify installation
agenterm repl         # start interactive session
```

## Required CLI tools (local)

agenterm’s local toolset wraps external CLI binaries. If any are missing, the CLI
fails fast with an install hint.

```bash
brew install ripgrep fd bat tree
```

`stat` is provided by the OS (BSD/GNU); ensure it is available on `PATH`.

On non‑macOS platforms, install the same binaries via your system package
manager and ensure they are on `PATH`.

## Overview

agenterm runs inside any project directory and provides:

- `agenterm repl` — start an interactive REPL session (prompts to save config if missing).
- `agenterm run` — execute a one‑shot run (prompt args, `--file`, or piped stdin).
- `agenterm inspect` — inspect background responses (`response`), local runs (`run`), run event ledgers (`run-events`), SDK turns (`turn`), or agent_run reports (`agent-run`).
- `agenterm config` — create and view effective configuration (`save`, `show`, `path`).
- `agenterm agents` — view/manage agents (instruction files) (`list`, `show`, `path`, `save`).
- `agenterm mcp` — diagnostics, inspection, and MCP server exposure.
- `agenterm artifacts` — browse durable artifacts produced by the agent.
- `agenterm session` — introspection and management of local Agents sessions.
- `agenterm branch` — manage session branches (list/use/new/fork/delete/runs).

The engine is a single‑path pipeline:

> CLI / REPL → AppConfig + SessionState → Agents engine (Responses + tools + MCP) → output

For product vision and target architecture, see `VISION.md` and `ARCH.md`.
This README focuses on how to **use** the CLI.

## Development Setup

For contributing or running from source:

1. Install dependencies:

   ```bash
   uv sync
   ```

2. Initialize configuration (first-time setup):

   Option A (recommended): start the REPL and save a config when prompted.

   ```bash
   uv run agenterm repl
   ```

   Option B (non-interactive): write a baseline config explicitly.

   ```bash
   uv run agenterm config save --scope global
   ```

   Use `--force` to overwrite an existing `~/.agenterm/config.yaml`.

   Optional: create a per-project override:

   ```bash
   uv run agenterm config save --scope local
   ```

   Optional: write baseline agent files:

   ```bash
   uv run agenterm agents save --scope global
   ```

3. Run the CLI:

   ```bash
   uv run agenterm --help
   ```

4. Ensure credentials are available:

   - For `openai/...` models: set `OPENAI_API_KEY` in your shell environment, or
     run `agenterm repl` and use `/config key <KEY>` to save it to `~/.agenterm/.env`.
   - For `gateway/...` models: set the env vars referenced by
     `providers.gateway.routes.*.api_key_env` (local backends may omit keys).
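
As a sketch, a gateway route might reference its key env var like this. Only the `providers.gateway.routes.*.api_key_env` path is documented above; the route name (`local`) and surrounding layout are illustrative assumptions:

```yaml
# Hypothetical sketch — route name and nesting are assumptions;
# providers.gateway.routes.*.api_key_env is the documented path.
providers:
  gateway:
    routes:
      local:
        api_key_env: MY_GATEWAY_API_KEY  # env var read at runtime; local backends may omit keys
```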

The runtime dependencies (Agents SDK, OpenAI client, MCP types) live in `.venv` and are treated as the runtime source of truth for types and behavior.

## Core commands

### Flags and output format

agenterm uses **command-scoped flags**. Put flags after the subcommand name:

```bash
uv run agenterm run --model openai/gpt-5.2-pro --config path/to/config.yaml "..."
```

Common flags:

- `--config PATH` (`-c`) — select a config file (commands that load config).
- `--format human|json` (`-o`) — select output format per command (`--format json` emits a single JSON envelope). Human output uses Rich panels/tables; use JSON for machine parsing.
- `--model ID` (`-m`) — override the configured model (`openai/<model>` or `gateway/<route>/<model>`).
- `--agent NAME` — override the configured agent name (commands that resolve agent files, like `run`, `repl`, and `agents show/path`).
- `--quiet` (`-q`) — suppress informational notices.
- `--verbose` (`-v`) — show verbose error details.
- `--version` (`-V`) — show the CLI version and exit.

### One‑shot runs (`agenterm run "..."`)

Basic usage:

```bash
uv run agenterm run "describe this repo"
```

#### Streaming behavior

Foreground runs always stream internally. By default (quiet), the CLI prints only the final answer and footer. With `--live`, the CLI prints tool calls/outputs and reasoning labels as they arrive. Background runs are the only non‑streaming mode.

#### Non‑interactive by design

`agenterm run` never prompts. Approvals are auto‑resolved and short config/agent source notes are emitted to stderr. Use the REPL for interactive approvals or config/agent save prompts.

#### Input sources

Input can come from:

- Positional prompt arguments,
- `--file FILE` (`-f`),
- `stdin` — used only when no other source is provided.

#### Rendering and output modes

Foreground runs support two orthogonal toggles:

- **Quiet (default)** — do not pass `--live`:
  - Suppresses mid‑stream labels; shows only the final answer and a run summary panel
    (attachments count, tools, usage, response/session IDs).
- **Live (`--live`)**:
  - Streams `[reasoning]`, `[tool call]`, `[tool output]` as they arrive and renders the final answer with Markdown.
- **JSON output (`--format json`)**:
  - Use `agenterm <command> --format json …` to force machine‑readable JSON output for that command.
  - Emits a single JSON envelope to stdout and suppresses human formatting. The envelope is shared across CLI surfaces and has the shape `{schema_version, trace_id, ts, result:{resource, payload}}` (see `ARCH.md`).
  - Suppresses streaming accessories even when `--live` is set (JSON mode emits exactly one document).

Defaults for `--live` / JSON output come from `run.live` and `run.json_output` in `config.yaml` (`--format` overrides per invocation).
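
A minimal sketch of those defaults in `config.yaml` — the keys `run.live` and `run.json_output` are documented above; the nesting shown is an assumption:

```yaml
# Sketch: the stated CLI defaults (quiet, human output).
run:
  live: false         # equivalent to omitting --live
  json_output: false  # equivalent to --format human
```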

#### Cancelled runs (automatic resume)

When a run is cancelled mid‑turn, agenterm records in‑flight turn items and
automatically prepends them to the next run input. Tool calls are not re‑executed;
the recorded tool outputs (including rejections/denials) are reused as‑is.

#### Attaching files and URLs

Attach local files or URLs to a `run` invocation:

```bash
uv run agenterm run \
  --attach path/to/file.py \
  --attach https://example.com/spec.pdf \
  "summarize these"
```

#### Tools and bundles

The available tools are file search, `web_search` (`web.run`), the local tools
(`parallel`/`info`/`rg`/`fd`/`bat`/`tree`/`stat`), `agent_run`, `agent_run_report`,
`apply_patch`, `plan`, `image_generation`, and `shell` (macOS only). Enable them
as follows:
- Configure tool families, bundles, and policies in `config.yaml` (see “Configuration”).
- Use `tools.bundles` and `tools.default_bundles` in `config.yaml` to control which tools and MCP servers are active.
- For one‑shot runs, override selection per invocation with:
  - `--tools-bundle NAME` (repeatable) to select bundle names.
  - `--tool KEY` (repeatable) to select individual tool keys.
  - Passing either flag overrides default bundles; selection becomes **only** the
    specified bundles/keys (`--no-tools` still disables all tools).
- Cap or extend agent loops via `agent.max_turns` (SDK turns per run, 1–500) in `config.yaml`.
  - When the limit is exceeded, the run stops with `MaxTurnsExceeded`; completed SDK turns remain in session history.
    Start a new run or raise `agent.max_turns` to continue.
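
A sketch of how these settings might look in `config.yaml` — the keys `tools.bundles`, `tools.default_bundles`, and `agent.max_turns` are documented above, but the bundle name (`read-only`), its member keys, and the name-to-list shape are illustrative assumptions:

```yaml
# Hypothetical sketch — bundle name and member list are illustrative.
tools:
  bundles:
    read-only:          # hypothetical bundle name
      - rg
      - fd
      - bat
      - tree
      - stat
  default_bundles:
    - read-only
agent:
  max_turns: 50         # SDK turns per run, 1-500
```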

#### Dangerous tools

Dangerous is a **tool label** used for gating (not a tool family). Defaults are
policy-driven but can be overridden in config.

Default dangerous refs:

- `fn:shell`, `fn:apply_patch`, `fn:user:*`, `mcp:*`
- Hosted MCP connectors: `hosted:mcp:*`

Safe by default (no side effects): `plan`, `agent_run_report`, the read-only local
tools (`parallel`, `info`, `rg`, `fd`, `bat`, `tree`, `stat`), and `agent_run`.

Overrides (config):

- `tools.dangerous.add`: force‑mark tool refs as dangerous.
- `tools.dangerous.remove`: force‑mark tool refs as safe.
  - Example: mark `web_search` (`web.run`) dangerous or mark `apply_patch` safe.
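
The example above could be sketched in `config.yaml` like this — the tool refs follow the documented ref syntax, while the exact nesting under `tools.dangerous` is inferred from the key paths:

```yaml
# Sketch: mark web_search (web.run) dangerous and apply_patch safe.
tools:
  dangerous:
    add:
      - hosted:openai:web_search   # gate web_search behind approvals/--allow-dangerous
    remove:
      - fn:apply_patch             # treat apply_patch as safe
```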

Tool ref syntax:

- explicit tool keys (e.g., `hosted:openai:web_search` for `web_search`/`web.run`, `fn:shell`, `mcp:files`)
- built-in names (`file_search`, `web_search` (`web.run`), `image_generation`, `parallel`, `shell`, `apply_patch`,
  `plan`, `agent_run_report`, `steward`, `rg`, `fd`, `bat`, `tree`, `stat`, `agent_run`)
- `fn:user:<name>` or `fn:user:*`
- `hosted:mcp:<name>` or `hosted:mcp:*`
- `mcp:<server_key>` or `mcp:*`

Behavior:

- One-shot run:
  - Without `--allow-dangerous`, dangerous tools are dropped from selection.
  - Approvals are always auto‑resolved; `agenterm run` never prompts. Use the REPL for interactive approvals.
- REPL:
  - Allows dangerous tools; approvals default to `prompt` mode (manual) and are handled via a modal approvals overlay.
  - Passing `--approvals auto` starts the session in `auto` mode.

agenterm does **not** attempt to classify arbitrary shell strings as “read-only”.

Platform notes:

- `fn:shell` is **macOS-only** (Seatbelt via `sandbox-exec`).
- `fn:apply_patch` is **cross-platform**, but is confined to the workspace root and always goes through approvals.

#### Structured outputs

You can drive structured JSON outputs via a JSON/YAML schema configured in `config.yaml`:

```bash
uv run agenterm config save --scope local
${EDITOR:-vi} .agenterm/config.yaml
uv run agenterm run "answer using this schema"
```

Set `model.text_format_file` in your config to point at a schema file (for example `examples/structured/answer_schema.yaml`). agenterm loads the schema into `model.text_format` and installs it as a structured output format, so the model emits responses that conform to it.

Structured output schemas are enforced in strict mode: object schemas are normalized to include
`additionalProperties: false`, and invalid schemas fail fast before any request is sent.
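
A minimal config sketch for this, using the documented key and example path (the nesting under `model` is assumed from the key path):

```yaml
# Sketch: point model.text_format_file at a JSON/YAML schema file.
model:
  text_format_file: examples/structured/answer_schema.yaml
```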

#### Tracing

Consolidated trace flag:

- `--trace on|off` — enable/disable tracing for this invocation
- `--trace ID:GROUP:key=val,...` — all-in-one trace configuration (implicitly enables tracing)

These map into the following fields in `AppConfig.run`:

- `run.trace_enabled`
- `run.trace_id`
- `run.group_id`
- `run.trace_metadata`

They are forwarded to the Agents `RunConfig` so your observability backend can correlate CLI runs and REPL sessions. The REPL `/status` command surfaces the active tracing configuration.
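
Since `config.yaml` maps into `AppConfig`, these fields could be sketched as follows — the field names are the documented `AppConfig.run` fields, but the values and whether you would set them statically (rather than via `--trace`) are assumptions:

```yaml
# Sketch: the AppConfig.run fields populated by --trace; values are hypothetical.
run:
  trace_enabled: true
  trace_id: trace_nightly_smoke   # hypothetical id
  group_id: nightly               # hypothetical group
  trace_metadata:
    key: val
```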

Quick lookup (non-REPL):

- `agenterm trace show` — display effective trace configuration (`--format json` for scripts).

## Background runs & inspection

### Background runs

Use `--background` to submit a server‑side (non‑streaming) run:

```bash
uv run agenterm run --no-tools --background "run this in the background"
```

Behavior:

- The `run` command prints only a `response_id` for background submissions:

  ```text
  response_id: resp_...
  ```

- Background runs only attach **hosted-plane** tools (`hosted:*`).

- All client-plane tools (`fn:*`, `mcp:*`, `fn:user:*`) are stripped automatically in background mode (background runs cannot reach your local machine), including local MCP servers.

- Hosted MCP connectors can be used in background mode by selecting:
  - `hosted:mcp:<name>` (from `mcp.connectors[*].name`),
  - and passing `--allow-dangerous` when the connector is marked dangerous
    (default: hosted MCP connectors are dangerous; overrideable via `tools.dangerous.remove`).
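
A sketch of a hosted connector plus the safety override — only `mcp.connectors[*].name` and `tools.dangerous.remove` are documented; the connector name (`docs`) and any other connector fields are assumptions:

```yaml
# Hypothetical sketch — connector name is illustrative.
mcp:
  connectors:
    - name: docs                 # selectable as hosted:mcp:docs
tools:
  dangerous:
    remove:
      - hosted:mcp:docs          # usable in background without --allow-dangerous
```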

### Inspecting background results

Use `agenterm inspect response` to inspect a previously submitted background response:

```bash
uv run agenterm inspect response RESPONSE_ID
```

This fetches a single Responses object by id and prints:

- `response_id`
- Model id (including concrete version)
- `background: true` when the run was submitted with `--background`
- Status (for example, `completed`, `failed`)
- Usage summary (`in=… out=… total=… tokens`)
- A best‑effort `output:` block with aggregated text output

This path is strictly **read‑only**:

- It does not mutate Agents sessions or local session history (SDK turns).
- It exists solely to make `--background` runs easy to inspect without leaving the CLI.

### Inspecting local runs, run events, and SDK turns

Local inspection reads the SQLite store and does not require provider access:

```bash
uv run agenterm inspect run SESSION_ID 3
uv run agenterm inspect run-events SESSION_ID 3
uv run agenterm inspect turn SESSION_ID 7
```

To inspect delegated `agent_run` reports (stored as artifacts):

```bash
uv run agenterm inspect agent-run REPORT_ID
```

## Sessions for one-shot runs

Every one-shot `agenterm run` invocation uses a local SQLite-backed session (`AgentermSQLiteSession`, an agenterm-owned wrapper around Agents `AdvancedSQLiteSession`) so transcripts and usage are persisted in a shared SQLite store.

- By default, each run creates a fresh session with a UUID `session_id`.

- To reuse a session across multiple runs and continue from local session history (SDK turns), pass:

  ```bash
  uv run agenterm run --session <UUID> "continue this agent in the same session"
  ```

- `--session <UUID>` is **resume-or-fail**: if the session does not exist locally, `agenterm` exits with an error. Use `agenterm session list` to discover valid ids.

- To target a specific branch within a session, use `--branch <BRANCH_ID>` (defaults to the session head branch):

  ```bash
  uv run agenterm run --session <UUID> --branch main "continue on a branch"
  ```

- To override the server-side store per run (without editing config), use
  `--store/--no-store`:

  ```bash
  uv run agenterm run --store "enable server-side store for this run"
  ```

  Default is `model.store: false` (provider store off). Set `model.store: true` to
  enable server-side persistence by default.

### Inspecting stored sessions

Use the session subcommands to inspect stored transcripts and usage:

- `agenterm session list` — list sessions in the local store (compact table with single-line last-message previews).
- `agenterm session show <ID>` — print metadata; includes `usage_total` when the per-request ledger is present.
- `agenterm session runs <ID> [--branch <BRANCH_ID>]` — show run history for a branch; tokens include cached/reasoning totals when available.
- `agenterm branch list <SESSION_ID>` — list branches for a session.

## Interactive REPL (`agenterm repl`)

Start the REPL:

```bash
uv run agenterm repl
```

Input defaults:
- Enter sends the prompt.
- Ctrl+J inserts a newline (Shift+Enter is not distinct in macOS iTerm2 without remapping).
- Ctrl+C clears the current input line.
- Pasting multi-line text does not auto-send; press Enter to submit.

### Key slash commands

**Core commands:**
- `/help [topic]` — tiered help; shows overview or per-command details.
- `/status` — session snapshot (model, tools, config, IDs).
- `/errors` — show captured stderr/logging lines from the REPL error buffer.
- `/last diff [preview|full]` — show the most recent apply_patch diff (preview is line‑bounded; `full` is unbounded).
- `/last tools [preview|full]` — show the most recent tool call/output (bounded preview by default; `full` is explicit).
- `/last approvals [preview|full]` — show the most recent approvals (pending + resolved; includes rejection reasons).
- `/trace [show|clear|on|off]` — tracing controls for this session.
- `/quit` — exit REPL (also Ctrl+D).
  - On exit, agenterm prints a resume command with the current session id.

**Model and tools:**
- `/model [ID]` — show or set model.
  - `/model list <provider>[/<route>] [filter]` — list cached model IDs for a provider or gateway route.
  - `/model refresh <provider>[/<route>]` — refresh the model registry (`openai` hits `/models`; gateway uses config allowlists).
  - Model ids are validated against provider registries implied by the prefix (`openai/` cache or gateway allowlist).
- `/tools [sub]` — view and adjust tool bundles and individual tools.
- `/approvals [sub]` — manage tool approvals (mode: `prompt|auto`; supports `list|show|approve|reject`).

**Settings (consolidated under `/config`):**
- `/config key <KEY>` — set API key for this session.
- `/config verbosity [low|medium|high|unset]` — adjust output detail.
- `/config reasoning [effort|summary] <VALUE>` — adjust reasoning effort/summary.

**REPL UI controls (`/repl`):**
- `/repl theme dark|light` — select the REPL prompt theme.
  - Dark/light themes use warm amber CRT‑inspired theme tokens for prompt UI and CLI accents.
- `/repl color-depth auto|mono|ansi|default|truecolor` — set prompt-toolkit color depth (default `auto`, which picks the highest supported depth; truecolor on macOS Terminal/iTerm2).
- `/repl mouse on|off` — enable/disable mouse support in the prompt UI (default on).
- `/repl completion off|commands|full` — set completion mode: `off`, `commands` (slash commands only), or `full` (slash commands plus history suggestions).
- `/repl edit-mode emacs|vi` — set editing mode (vi default, emacs optional).

**UX toggles (`/ux`):**
- `/ux markdown on|off` — toggle styled Markdown rendering for agent output.
- `/ux reasoning off|summary` — show/hide reasoning summaries (summary-only; no chain-of-thought dumps).
- `/ux diffs off|summary` — show/hide apply_patch diff artifacts in the transcript.
- `/ux stream final|live` — choose final-only vs live text streaming (Markdown renders only in final mode).
- `/ux verbosity quiet|normal|debug` — control detail level (labels-only → bounded details → raw event types).

**Session management (consolidated under `/session`):**
- `/session new` — start a fresh REPL conversation.
- `/session list` — list known sessions in the local SQLite store (compact table with single-line last-message previews).
- `/session use <ID>` — attach to an existing stored session.
- `/session runs [N]` — show recent runs for the current session.
  - Attaching to a session replaces the in-memory transcript with the stored branch history.

**Branch management (consolidated under `/branch`):**
- `/branch list` — list branches for the current session.
- `/branch use <BRANCH_ID>` — switch to a branch.
- `/branch new [BRANCH_ID]` — create a new branch from head and switch to it.
- `/branch fork <RUN_NUMBER> [BRANCH_ID]` — create a branch from a run (exclusive) and switch to it.
- `/branch delete <BRANCH_ID> [--force|-f|force]` — delete a branch (never `main`; deleting current branch requires force).
- `/branch runs [N]` — show recent runs for the active branch.
  - Switching branches replaces the in-memory transcript with the selected branch history.

**Context and control:**
- `/attach <PATH|URL>` — stage files/URLs for the next run (e.g., `/attach README.md`).
- `/attach list` — show staged and last-used attachments.
- `/attach remove <PATH|URL>|all` — unstage attachments for the next run (does not reset conversation).
- `/attach clear` — clear staged attachments and cancel any pending “use last attachments” state.
- `/agent` — show the current agent.
- `/agent list` — list available agents.
- `/agent <NAME>` — switch agent and branch from head (e.g., `/agent default`).
- `/compress` — run compression and switch to the compression continuation (snapshot/compaction, provider‑dependent); tracked as a run in the session ledger.
- `/compress show` — print the latest compression continuation (snapshot/compaction), grouped by content/citation type; UI-only output with binary payloads redacted.
- `/mcp refresh|status` — refresh MCP servers or show status/tool counts.
- `/again` — re-run the last prompt; forks from the last run when it has turns, otherwise retries in place (requires a stored session).
- `/edit` — edit the last prompt; forks from the last run when it has turns, otherwise edits in place (requires a stored session).
- ESC/Ctrl+C — cancel the current run or compression immediately (in‑flight items are captured and automatically reused).
  - If cancellation stalls, the REPL escalates to a hard cancel and persists the cancelled run; the next run resumes with the captured items.

### REPL UX and state

- The REPL is full‑screen and scrollable:
  - Transcript output renders in a dedicated scrollable window with internal scrollback.
  - Slash‑command outputs render as transcript blocks with consistent spacing.
- The layout is stable:
  - Transcript (scrollable) → adaptive completion panel → spacer → composer → spacer → status bar.
- The transcript uses mouse-wheel scrolling without a visible scrollbar.
- Stream mode is explicit: `/ux stream final|live`. Live mode streams plain text deltas (no panels); final mode renders the completed answer (Markdown when enabled).
- The REPL uses the same `AppConfig` / Agents engine as `run`, adding interactive
  state and approvals on top.
- If a run fails before a response id is recorded, the REPL discards staged items and continues with local session history replay on the next prompt.
- If you cancel a run, agenterm records the cancelled attempt; the next run resumes with the captured items on top of local session history.
- Approvals appear as a modal overlay with approve/reject actions; decisions still emit transcript blocks.
- The status bar highlights `AUTO` in bold danger red when approvals mode is automatic.
- The prompt chevron is `>`. Approval posture is shown in the HUD/placeholder notice.
- The HUD/placeholder surfaces short notices (approval pending, run running).
- The bottom toolbar is a minimal status line:
  - `agent: <name>[model] | effort: <value> | tools: <bundle> | context: <percent>%` where `<percent>` uses provider-reported
    input tokens for the **last provider request** of the last completed run, normalized by `model.context_window` when configured;
    when `model.context_window` is null, the toolbar shows the raw token count (packing uses a 128k token fallback internally).
  - `AUTO` appears when approvals mode is auto (right-aligned when width permits, otherwise appended).
  - For per-run and cumulative totals, use `agenterm session runs --branch …` / `agenterm session show <SESSION_ID>` (token ledger).
  This is informational and does not change model behavior.

### Tool approvals in the REPL

Tool approvals in the REPL are non‑blocking:

- When a `shell` or `apply_patch` tool call requires approval, the model run runs as a background task while the UI stays responsive.
- When an approval is requested, the REPL opens a modal approval overlay and emits a bounded `[approval requested]` transcript block.
  - `[approved]` / `[rejected]` blocks appear when the decision resolves.
- While a run is running:
  - Press `y`/`n` in the modal to approve/reject the next pending tool call.
  - Press `n` and supply a reason to reject with a user-provided reason (forwarded to the model).
  - Use `/approvals list` and `/approvals show <ID>` to inspect details on demand.
  - Use `/last approvals` to inspect the most recent approval requests and decisions (bounded; includes rejection reasons).
  - Use `/approvals auto` to approve everything pending and continue without prompts.
  - Use `/approvals prompt` to return to manual approvals.
  - Normal prompts and most commands are rejected with a short message instead of hanging.
- Auto‑approval is supported and shared with `run`:
  - `/approvals auto` enables auto‑approval for the session.
  - `repl.approvals.mode: auto` in config starts REPL sessions in auto‑approval mode.
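
In `config.yaml`, the documented key `repl.approvals.mode: auto` nests as:

```yaml
# Start REPL sessions in auto-approval mode.
repl:
  approvals:
    mode: auto   # default: prompt
```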

## Attachments

- Attach local files or URLs with `--attach PATH|URL` (repeatable).
- Prompts can come from:
  - Positional args,
  - `--file FILE`,
  - `stdin` (when no other source is provided).

Under the hood, agenterm maps prompts and attachments into Responses input items
based on provider capabilities (file_id, image_url, file_url).

Attachment mapping:

- Text files map to `input_text` (inline text).
- Image files are uploaded when file IDs are supported; otherwise they require an explicit `http(s)` URL
  (or inline data URLs when explicitly allowed).
- PDF files are uploaded when file IDs are supported; otherwise they require an explicit `http(s)` URL
  (or inline data URLs when explicitly allowed).
- `http(s)` URLs map to `input_image.image_url` for images or `input_file.file_url` for other files.
- Gateway (LiteLLM) lanes convert Responses inputs to Chat Completions, so they only accept
  `input_image.image_url` and `input_file.file_data` (file IDs/URLs are rejected).
- Inline data URLs are disabled by default and must be explicitly enabled via `attachments.*`.
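
The text only names the `attachments.*` namespace, so any concrete key is a guess. A purely hypothetical sketch of what such an opt-in might look like:

```yaml
# HYPOTHETICAL — the document names only the attachments.* namespace;
# this key name is invented for illustration, not a real setting.
attachments:
  allow_data_urls: true   # hypothetical key: opt in to inline data URLs
```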

Example assets in `examples/assets/`:

- `examples/assets/agents_guide.pdf` — long-form document for testing file search/attachments.
- `examples/assets/expression.jpeg` — math expression image for vision tests
  (prompt: "simplify this expression").

## Examples

Examples are runnable demos and test assets. Canonical behavior and contracts live
in `README.md` and `ARCH.md`. Reference-only material lives in `docs/ref/`.
See `docs/ref/README.md` for the reference index.

See `examples/README.md` for the full examples index.
Reference input-item example: `docs/ref/response-input-example.yaml`.

## Configuration (AppConfig)

agenterm reads a single `config.yaml` into an `AppConfig` (see `ARCH.md`). Discovery order:

1. `--config PATH` (explicit).
2. `.agenterm/config.yaml` (project-local override; when present).
3. `~/.agenterm/config.yaml` (global baseline; Linux and macOS only).

Use `agenterm config save --scope global` to create the global baseline config,
`agenterm config save --scope local` to create a per-project override, or
`agenterm config path` to see the active config location.

If no config file exists yet, agenterm runs with a fully-typed built-in default
`AppConfig`. The REPL offers a one-time prompt to save a config, while
`agenterm run` stays non-interactive and prints a short config-source note on stderr.
In greenfield mode, any config/agent file validation error is fatal; the CLI
instructs you to delete the local/global `.agenterm/` directories and retry.

### Agents (instruction files)

The **agent name** is the filename (without extension) for the instruction
file. There is no separate display name — the agent name is used everywhere.

agenterm resolves the active agent from the highest‑priority source available.
Local agent files override global ones, and REPL `/agent` overrides everything
for the active session.

On `agenterm agents save`, bundled agents are copied to:
```
~/.agenterm/agents/        # global defaults
./.agenterm/agents/        # project-local overrides
```

Local saves prefer copying global agents when they exist.

`agenterm agents show` prints the full effective agent file:
- Human output renders Markdown by default.
- Use `--plain` for raw text.
- Use `--format json` for a single JSON envelope.

Use `agenterm agents path` to report the effective source + path.

Starter instruction templates live in `docs/ref/agent-instruction-templates.md`
(reference-only); adapt them to your repo constraints.

**Resolution order:**
1. `--agent NAME` (CLI explicit override)
2. `./.agenterm/agents/<name>.md` project-local override (when present)
3. `~/.agenterm/agents/<name>.md` user default (if it exists and is readable)
4. Bundled `src/agenterm/data/agents/<name>.md` shipped with the package

In the REPL, `/agent <NAME>` switches agents and creates a new branch from
head (unless there is no session history yet, in which case the current branch metadata
is updated in place); this live override is the highest priority within that
session.

The default agent, `default`, is a generalist companion tuned for local,
tool-using work. The bundled `coding_agent` establishes:
- Quality-first principles
- Tool discipline (when to use shell vs apply_patch)
- File editing conventions
- Error handling and escalation patterns

Edit `~/.agenterm/agents/<name>.md` to customize agent behavior globally, or
`./.agenterm/agents/<name>.md` to customize it per-project.

### Additional notes

- The canonical, complete, commented starter template is generated from the live `AppConfig` schema (see `agenterm config save`), so it always stays in sync with the code.
- Runtime configuration lives in `config.yaml`, while agent files live in `.agenterm/agents/` (local/global overrides):
  - Agent/model defaults
  - Tools
  - MCP servers
  - Retries (provider/MCP/store retry policies)
  - Guardrails
  - Tool bundles
  - Shell policy
- CLI flags are per‑run overrides layered on top of this config.

### Retries (provider/MCP/store)

Retries are configured centrally in `retries` within `config.yaml`. SDK‑internal retries are
disabled; agenterm’s own retry policy is the single source of truth. Use:

- `retries.provider` for model calls (streamed/background/agent_run).
- `retries.mcp` for MCP `list_tools` / `call_tool` retries.
- `retries.store` for SQLite `SQLITE_BUSY` backoff.
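
The three policy groups above might be laid out as follows — `retries.provider`, `retries.mcp`, and `retries.store` are documented, but the per-policy field names are assumptions for illustration:

```yaml
# Sketch — field names under each policy (max_attempts, backoff_s) are
# hypothetical; only the three policy groups are documented.
retries:
  provider:
    max_attempts: 3    # hypothetical field: model-call retries
    backoff_s: 1.0     # hypothetical field: base backoff
  mcp:
    max_attempts: 2    # hypothetical field: list_tools / call_tool retries
  store:
    max_attempts: 5    # hypothetical field: SQLITE_BUSY backoff attempts
```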

### Local continuity (always)

- agenterm always replays local session history (SDK turns) for each run (bounded by packing).
- `previous_response_id` is never used.
- Store (`model.store`) controls **provider persistence only**:
  - Default is off (`model.store: false`).
  - Override per invocation with `--store` or `--no-store`.
  - Enables background inspection and response retrieval.
- `last_response_id` is stored per branch only when provider storage is enabled
  (for inspection identifiers, not for continuity).

Session history (SDK turns) and usage are persisted locally via the Agents session store:

- agenterm creates an `AgentermSQLiteSession` (an agenterm-owned wrapper around Agents `AdvancedSQLiteSession`) backed by the history DB at `~/.agenterm/<version>/history.sqlite3`.
- Both `run` and `repl` keep full transcripts without requiring additional configuration files.
- agenterm records per‑branch per‑run status in the meta DB at `~/.agenterm/<version>/store.sqlite3` to support post‑mortem debugging.

### Artifacts vault (generated images)

Some model/tool outputs, such as base64 image payloads, are too large to store directly in session history (SDK turns). agenterm treats these as **durable artifacts**:

- Location (single per-user root):
  - History DB: `~/.agenterm/<version>/history.sqlite3`
  - Meta store DB: `~/.agenterm/<version>/store.sqlite3`
  - Artifacts root: `~/.agenterm/artifacts/`
  - Images: `~/.agenterm/artifacts/images/<artifact_id>.<ext>`
- Storage model:
  - Artifact bytes live on disk.
  - The DB stores stable references (artifact id + path + metadata).
  - No deduplication/content hashing is performed (artifact ids are UUIDs).
  - Session history stores `image_generation_call.result: null`; rehydration is UI‑only and never re‑injects base64 into model input.
- UX surfacing:
  - REPL: iTerm2 inline previews bounded by the configured max width/height, plus durable paths in the transcript.
    - Commands: `/artifacts list [N]`, `/artifacts show <ID>`, `/artifacts open <ID>`, `/artifacts agent-run <ID>`,
      `/inspect agent-run <ID> [--json]`
    - Inline previews are only shown in iTerm2; other terminals show the path only.
    - Previews preserve aspect ratio and fit within the configured bounds.
  - CLI: `agenterm artifacts list|show|open|agent-run`
    - `agenterm artifacts list` supports `--session-id` and `--trace-id` filters.
  - One-shot runs: when stdout is a TTY, `agenterm run ...` prints image artifact cards even in quiet mode.
  - JSON mode: `agenterm run --format json ...` includes `result.payload.artifacts` so scripts can capture paths deterministically.
- Retention:
  - agenterm does not delete artifacts automatically (no retention caps). Delete files manually if desired.
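Given the layout above, an artifact's on-disk location follows directly from its id; a minimal sketch (the path shapes come from the list above, the helper name is ours):

```python
import uuid
from pathlib import Path

def image_artifact_path(root: Path, artifact_id: str, ext: str) -> Path:
    """Compose the on-disk location of an image artifact following the
    layout described above (illustrative helper, not agenterm's internal API)."""
    return root / "artifacts" / "images" / f"{artifact_id}.{ext}"

artifact_id = str(uuid.uuid4())  # artifact ids are UUIDs; no content hashing
path = image_artifact_path(Path.home() / ".agenterm", artifact_id, "png")
print(path.suffix)  # .png
```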

### Compression (compression.*)

Compression runs **before** a run when the packed history is too large. It creates a
compression continuation branch (snapshot/compaction) and switches the head; replay stays local.

- Strategy:
  - `snapshot`: run `steward.snapshot` (continuation snapshot).
  - `compaction_if_supported`: use `/responses/compact` when the provider declares support; fall back to snapshot otherwise.
  - `both_if_supported`: run both snapshot + compaction and select the active continuation branch via `primary_branch`.
- Trigger:
  - `ask`: prompt in the REPL only; non‑REPL runs skip compression unless `trigger: auto`.
  - `auto`: run compression without prompting.
- Threshold:
  - Percent‑only: `compression.threshold_percent` (0 < x ≤ 1).
  - Uses `model.context_window` (default 400000) as the reference budget; the threshold is capped
    by the input token budget (`context_window - max_output_tokens`).
- Drop policy:
  - `deny`: refuse to drop turns after compression (default).
  - `ask`: prompt before dropping turns.
  - `allow`: drop oldest turns without prompting when still over budget.
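The threshold arithmetic above can be sketched as follows (field names mirror the config; the exact computation inside agenterm may differ):

```python
def compression_trigger_tokens(
    threshold_percent: float,
    context_window: int,
    max_output_tokens: int,
) -> int:
    """Token count at which compression fires: a percent of the context
    window, capped by the input token budget (context_window - max_output_tokens).
    Illustrative sketch only."""
    if not (0 < threshold_percent <= 1):
        raise ValueError("threshold_percent must be in (0, 1]")
    input_budget = context_window - max_output_tokens
    return min(int(threshold_percent * context_window), input_budget)

# Defaults: 0.5 * 400000 = 200000, well under the 272000 input budget.
print(compression_trigger_tokens(0.5, 400_000, 128_000))  # 200000
```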

Example:

```yaml
compression:
  # snapshot | compaction_if_supported | both_if_supported
  strategy: compaction_if_supported
  # ask | auto
  trigger: ask
  # Compress when history reaches 50% of the context window.
  threshold_percent: 0.5
  # When running both snapshot + compaction, which continuation branch becomes active.
  primary_branch: snapshot
  # deny | ask | allow
  drop_policy: deny
```

Example config file: `examples/config/compression_percent.yaml`
Gateway route example: `examples/config/gateway_routes.yaml`

### Canonical defaults (mirrors `AppConfig` in code)

```yaml
agent:
  # Model identifier (Responses/Agents models only).
  # Model IDs are prefixed: openai/<model> or gateway/<route>/<model>.
  # Validation uses provider registries (OpenAI /models cache or gateway allowlist).
  model: openai/gpt-5.2
  # Agent name (instruction file name; bundled agents live in src/agenterm/data/agents).
  # Use /agent <NAME> in the REPL to switch and branch from head.
  name: default
  # SDK turn limit per run (1–500).
  max_turns: 250

model:
  # Sampling
  temperature: 1.0
  #top_p: 0.9
  #frequency_penalty: 0.0
  #presence_penalty: 0.0

  # Output length (visible + reasoning tokens where applicable)
  max_output_tokens: 128000

  # Reasoning configuration for gpt-5 / o-series models.
  # Set to null for non-GPT-5 models.
  reasoning:
    effort: medium
    summary: concise

  # Model verbosity (maps to Responses text.verbosity)
  # low|medium|high
  verbosity: medium

  # Include usage so the CLI can surface token summaries.
  include_usage: true

  # Advanced knobs (map directly to ModelSettings / Responses):
  tool_choice: auto
  # agenterm always disables provider-level tool parallelism (parallel_tool_calls=false).
  truncation: disabled
  store: false
  context_window: 400000           # context window in tokens (null falls back to 128000); manual value used for packing and the UI usage %
  prompt_cache_retention: 24h           # or "in_memory"
  # Optional per-request metadata forwarded to the model API.
  #metadata:
  #  run_id: demo-123
  # Structured output (JSON schema). When set, this populates model.text_format and installs
  # a structured output schema so the model emits responses that follow your JSON schema.
  #text_format:
  #  type: json_schema
  #  name: example_answer
  #  schema:
  #    type: object
  #    properties:
  #      summary:
  #        type: string
  #        description: Short natural-language summary of the result.
  #    required: ["summary"]
  #    additionalProperties: false
  #  strict: true
  #  description: Example structured answer format
  #
  # Alternatively, you can point at a JSON/YAML schema file. Exactly one of
  # text_format/text_format_file may be set.
  #text_format_file: examples/structured/answer_schema.yaml
  #top_logprobs: 0
  #extra_headers:
  #  X-Experiment: B
  #extra_query:
  #  service_tier: priority
  #extra_body:
  #  metadata:
  #    run_id: demo-123

providers:
  openai:
    # Optional OpenAI-compatible endpoint override (e.g., Azure).
    # When set, this overrides OPENAI_BASE_URL.
    base_url: null
  gateway:
    routes:
      openrouter:
        provider: openrouter
        base_url: null
        api_key_env: OPENROUTER_API_KEY
        # Optional static headers (non-secret).
        #headers:
        #  X-Route: openrouter
        model_allowlist:
          - google/gemini-3
        allow_any_model: false

retries:
  provider:
    max_retries: 10
    base_backoff_seconds: 0.5
    max_backoff_seconds: 8.0
    jitter_ratio: 0.25
    retry_after_max_seconds: 60.0
  mcp:
    max_retries: 3
    base_backoff_seconds: 1.0
    max_backoff_seconds: 8.0
    jitter_ratio: 0.25
    retry_after_max_seconds: null
  store:
    max_retries: 5
    base_backoff_seconds: 0.05
    max_backoff_seconds: 1.0
    jitter_ratio: 0.25
    retry_after_max_seconds: null

run:
  # Default per-run behavior for one-shot runs.
  background: false
  # Stream idle timeout (no stream events/model bytes; resets on any event). Server may still enforce its own limits.
  timeout_seconds: 600.0
  # Stream progress timeout (whitespace-only tool-arg deltas do not count; null disables).
  progress_timeout_seconds: 60.0
  # Max streamed tool-argument chars per tool call (sum of deltas; null disables).
  tool_args_max_chars: 200000
  live: false                      # false = quiet summary, true = stream live events
  json_output: false               # true = emit JSON envelope instead of text output

  # Optional tracing defaults; can also be overridden via CLI --trace on|off|ID:GROUP:key=val,...
  trace_enabled: true             # master tracing switch
  #trace_id: "my-trace-id"
  #group_id: "my-group-id"
  #trace_metadata:
  #  run_id: "demo-run-1"
  #trace_include_sensitive_data: false

repl:
  approvals:
    mode: prompt                  # prompt | auto
  transcript:
    tool_output_max_lines: 60
    shell_preview_max_chars: 80
    tool_detail_max_lines: 24
    tool_detail_max_chars: 240
    mcp_args_preview_max_chars: 240
    attachments_max_lines: 6
    attachments_path_max_chars: 120
  ui:
    theme: dark                    # dark | light
    color_depth: auto              # auto | mono | ansi | default | truecolor
    editing_mode: vi               # emacs | vi
    mouse: true
    completion: full               # off | commands | full
    max_transcript_entries: 400
  ux:
    markdown: true
    reasoning: summary
    reasoning_summary_max_chars: 0
    diffs: summary
    stream: final                # final | live
    verbosity: normal             # quiet | normal | debug

tools:
  # Max serialized tool output length (chars) for FunctionTools + MCP bridge.
  max_chars: 20000
  file_search:
    vector_store_ids: []
    max_num_results: 5
    include_search_results: true
    #filters: {}
    #ranking_options: {}
  web_search:
    search_context_size: high   # low|medium|high
    user_location:
      type: approximate
      country: US
    #filters: {}
  shell:
    # Maximum wall-clock time per shell command (ms).
    timeout_ms: 60000
    # Maximum output length enforced by agenterm (also forwarded to the model as a hint).
    max_chars: 20000
    sandbox:
      network: allow          # allow | deny
    # Working directory for commands (resolved under the workspace root). When omitted, the workspace root is used.
    #working_dir: .
    # Custom env vars merged with safe defaults (PATH, HOME, SHELL, etc.).
    # Set env: null to use defaults only.
    env:
      UV_CACHE_DIR: .uv-cache
  # Apply patch uses the workspace editor; no tunables.
  apply_patch: {}
  parallel: {}
  plan: {}
  agent_run:
    bundle: handoff
  agent_run_report: {}
  rg: {}
  fd: {}
  bat: {}
  tree: {}
  stat: {}
  image_generation:
    model: gpt-image-1.5
    #background: auto          # transparent|opaque|auto
    input_fidelity: high       # high|low
    moderation: low            # auto|low
    #output_compression: 75
    output_format: png         # png|webp|jpeg
    #partial_images: 0
    quality: high              # low|medium|high|auto
    #size: auto                # 1024x1024|1024x1536|1536x1024|auto
    #input_image_mask: {}
  # Function tools are declared here; the actual Python callables are
  # registered in code (see core.function_tools).
  function_tools: []
  # Example:
  #   - name: get_weather
  # Dangerous overrides (tool refs).
  dangerous:
    add: []
    remove: []

  # Tool bundles for selection. Bundles are referenced by name via
  # `tools.default_bundles` and the REPL `/tools` command. Tool keys encode
  # capability families:
  #   - hosted:openai:* (OpenAI hosted tools; background-compatible)
  #   - hosted:mcp:* (OpenAI hosted MCP connectors; background-compatible)
  #   - fn:* (built-in FunctionTools; local harness)
  #   - fn:user:* (user FunctionTools; local harness)
  #   - mcp:* (client MCP servers; local harness)
  bundles:
    inspect:
      tools:
        - fn:parallel
        - fn:rg
        - fn:fd
        - fn:bat
        - fn:tree
        - fn:stat
    plan:
      tools:
        - fn:plan
    delegate:
      tools:
        - fn:agent_run
        - fn:agent_run_report
    steward:
      tools:
        - fn:steward
    edit:
      tools:
        - fn:apply_patch
    shell:
      tools:
        - fn:shell
    integrations:
      selectors:
        - mcp:*
        - hosted:mcp:*
    extensions:
      selectors:
        - fn:user:*
    subagents:
      tools:
        - fn:agent_run
        - fn:agent_run_report
        - fn:steward
    handoff:
      tools:
        - hosted:openai:web_search
        - hosted:openai:file_search
        - hosted:openai:image_generation
      scope: delegate
  # Bundles that are active by default at startup (main scope).
  default_bundles:
    - agenterm

mcp:
  # Declare MCP servers (client-managed) and connectors (provider-hosted) here.
  connectors: []
  servers: []
  convert_schemas_to_strict: false
  bridge:
    # Hard caps for MCP tool outputs stored in history (size clamp via tools.max_chars).
    max_content_items: 50
  expose:
    enabled: false
    transport: stdio
    host: 127.0.0.1
    port: 8000
    allow_dangerous: false
    bundles: null
    tools: null
  # Example (stdio):
  # servers:
  #   - key: files
  #     kind: stdio
  #     stdio:
  #       command: uvx
  #       args: ["mcp-files-server"]
  #       cwd: .

steward:
  agent:
    # Steward agent name (bundled/local/global agent file).
    name: steward
    # Optional model override for Steward tasks.
    model: null
    # Truncation mode for Steward tasks.
    truncation: auto
    # Optional output-token cap for Steward tasks.
    max_output_tokens: 128000
    # Inline instructions (mutually exclusive with path/source).
    instructions: null
    # Path to a Steward agent file (mutually exclusive with instructions/source).
    path: null
    # Optional agent file name to resolve (mutually exclusive with instructions/path).
    source: null
  tasks:
    # Max queued Steward tasks per session/branch.
    max_pending: 1

compression:
  # snapshot | compaction_if_supported | both_if_supported
  strategy: snapshot
  # ask | auto
  trigger: ask
  # Percent of context_window (0 < x <= 1) that triggers compression.
  threshold_percent: 0.5
  # When running both snapshot + compaction, which continuation branch becomes active.
  primary_branch: snapshot
  # deny | ask | allow
  drop_policy: deny

attachments:
  # file_id_or_url_only | allow_inline_data_url
  image_input_mode: file_id_or_url_only
  # Allow inline data URLs when supported by the provider.
  allow_inline_data_url: false
  # Hard cap for inline payload bytes when enabled.
  max_inline_bytes: 0

guardrails:
  load_modules: []
  # Names must correspond to guardrails registered in `core.guardrails_registry`.
  # See `examples/guardrails/no_pii_input.py` for a reference input guardrail.
  input: []
  output: []
```

Run `agenterm config save --scope global` to generate the global baseline `~/.agenterm/config.yaml` from the schema-derived template.
When you want project-specific overrides, run `agenterm config save --scope local` to seed `.agenterm/config.yaml` (from the global config when present), then adjust only the fields you care about for that project.

## Tools, MCP, and approvals

agenterm’s tool selection is structural: you select **tool keys** (bundles + direct keys), and the engine attaches exactly the resolved selection (config is not rewritten per-session).

### Tool keys (capability families)

Tool keys are capability identifiers (not adapters). They encode family:

- `fn:<name>` — built‑in FunctionTools (`fn:user:<name>` for user‑defined).
- `hosted:openai:<tool>` — OpenAI hosted tools.
- `hosted:mcp:<connector>` — OpenAI‑hosted MCP connectors.
- `mcp:<server_key>` — client‑managed MCP servers (each server exposes many tools).

Hosted tools (including hosted MCP connectors) are **OpenAI‑plane only**. Gateway
models use Chat Completions semantics and accept **FunctionTools only**, so
selecting hosted tools with `gateway/...` models is invalid and will fail at
runtime. MCP servers remain available on both planes via the MCP‑as‑FunctionTools
bridge. Background runs attach **only** `hosted:*` tools.
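The families above can be told apart by prefix alone; a small sketch of that classification (the function name is ours, not an agenterm API):

```python
def tool_key_family(key: str) -> str:
    """Classify a tool key by its prefix (families listed above).
    Order matters: fn:user:* and hosted:mcp:* must be checked before
    their broader fn:* / hosted:* siblings. Illustrative only."""
    if key.startswith("fn:user:"):
        return "user function tool (local harness)"
    if key.startswith("fn:"):
        return "built-in function tool (local harness)"
    if key.startswith("hosted:mcp:"):
        return "hosted MCP connector (OpenAI plane only)"
    if key.startswith("hosted:openai:"):
        return "hosted OpenAI tool (OpenAI plane only)"
    if key.startswith("mcp:"):
        return "client MCP server (both planes via the bridge)"
    raise ValueError(f"unknown tool key: {key}")

print(tool_key_family("hosted:openai:web_search"))
# hosted OpenAI tool (OpenAI plane only)
```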

Common keys:

- Hosted built-ins:
  - `hosted:openai:file_search`
  - `hosted:openai:web_search`
  - `hosted:openai:image_generation`
- Built‑in FunctionTools:
  - `fn:parallel`
  - `fn:shell` (macOS-only)
  - `fn:apply_patch`
  - `fn:plan`
  - `fn:agent_run_report`
  - `fn:steward`
  - `fn:rg`
  - `fn:fd`
  - `fn:bat`
  - `fn:tree`
  - `fn:stat`
  - `fn:agent_run`
- User FunctionTools (declared in `tools.function_tools`):
  - `fn:user:<name>`
- MCP servers (declared in `mcp.servers`):
  - `mcp:<server_key>`
- MCP connectors (declared in `mcp.connectors`, provider-hosted):
  - `hosted:mcp:<name>`

### MCP: connectors vs servers

- **MCP connectors** (`mcp.connectors`) are provider-hosted and attach as the single tool named `hosted_mcp` (Agents `HostedMCPTool`). Select them via `hosted:mcp:<name>`.
- **MCP servers** (`mcp.servers`) are client-managed; their tools are discovered at runtime and exposed as function tools whose names begin with `mcp__...` (agenterm rewrites names to be globally unique and OpenAI-valid). Select servers via `mcp:<server_key>`.
  - Hosted MCP connectors are OpenAI‑plane only; gateway models use MCP servers via FunctionTools.

### Bundles and defaults

- `tools.bundles`: bundle name → bundle definition (`bundles`, `tools`, `selectors`, `scope`).
- `tools.default_bundles`: the default main-scope selection for the CLI and REPL.
- Default bundle: `agenterm` (composes all main-scope bundles: `inspect`, `plan`,
  `subagents`, `edit`, `shell`, `integrations`, `extensions`).
- `subagents` bundle: sub-agent tools (`agent_run`, `agent_run_report`, `steward`).
- Delegate-only bundle: `handoff` (tools handed off to agent_run sub-agents).
- `fn:agent_run` uses `tools.agent_run.bundle` (default `handoff`) for delegated
  one-shot runs; delegate bundles are hosted-only and not selectable by the main
  agent.
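A bundle resolves to the union of its explicit `tools` keys and whatever available keys its `selectors` patterns match; a hedged sketch (field names mirror the config above, but agenterm's actual resolver may differ):

```python
from fnmatch import fnmatch
from typing import Any, Mapping

def resolve_bundle(bundle: Mapping[str, Any], available_keys: list[str]) -> set[str]:
    """Union of explicit `tools` keys and `selectors` wildcard matches
    against the available tool keys. Illustrative sketch only."""
    selected = set(bundle.get("tools", []))
    for pattern in bundle.get("selectors", []):
        selected.update(k for k in available_keys if fnmatch(k, pattern))
    return selected

available = ["mcp:files", "hosted:mcp:github", "fn:shell"]
integrations = {"selectors": ["mcp:*", "hosted:mcp:*"]}
print(sorted(resolve_bundle(integrations, available)))
# ['hosted:mcp:github', 'mcp:files']
```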

### Tools gate

`--no-tools` (CLI) and `/tools off` (REPL) hide all tools from the model without changing the selection. When the gate is off, agenterm strips tool config and MCP tools from the runtime agent.

### Approvals summary

- REPL:
  - `shell` and `apply_patch` operations always register approvals.
  - Approval requests open a modal overlay (details via `/approvals list` / `/approvals show <ID>`).
  - While approvals are pending, type `y`/`n` to approve/reject the next pending item.
  - While approvals are pending, type `n <reason...>` to reject and supply a reason (forwarded to the model).
  - `/last approvals` prints a bounded approvals audit trail (pending + resolved) for the active REPL session.
  - `/approvals auto` enables auto‑approval and approves everything currently pending.
  - `/approvals prompt` returns to manual approvals.
- One-shot runs:
  - Approvals are auto‑resolved; `agenterm run` never prompts.
  - Dangerous tools (after applying `tools.dangerous` overrides) require
    `--allow-dangerous`; without it, runs drop them from selection.

### Tool availability vs configuration

Some tools require additional provider configuration to be truly available:

- File search:
  - Attaches a `FileSearchTool` only when `vector_store_ids` is non‑empty.
  - If selected but no IDs are configured, the engine will **not** attach `file_search` to avoid invalid requests.
- Web search:
  - Attaches a `WebSearchTool` when selected.
- Shell:
  - Attaches only on **macOS** when selected (Seatbelt sandbox via `sandbox-exec`).
- `apply_patch`:
  - Attaches when selected.
  - Uses a workspace-scoped editor that refuses to operate on paths outside the workspace root.
- `plan`:
  - Attaches when selected.
  - Supports incremental operations (`get`, `set`, `clear`, `add`, `update`, `delete`) with stable `step_id`s.
  - Persists a plan snapshot for mutating ops and returns a JSON envelope with the plan payload; the CLI renders the plan block from the persisted snapshot.
  - Payload fields: `plan_state`, `steps[{step_id, step, status}]`, `explanation`, `revision`, `plan_created_at`, `plan_updated_at`.
- `agent_run_report`:
  - Attaches when selected.
  - Returns the stored delegated run report for a `report_id`.
- Safe local tools (`rg`, `fd`, `bat`, `tree`, `stat`):
  - Attach when selected.
  - Read‑only, workspace‑confined, paginated, and bounded (paged outputs include
    a `page` object with `cursor_kind`, `cursor`, `limit`, `returned`, `has_more`,
    and `next_cursor`; envelope `truncated` is reserved for output clamping).
- `agent_run`:
  - Attaches when selected.
  - Executes a single‑run delegated agent (`max_turns=cfg.agent.max_turns`) with explicit `model`, `instructions`, and `input`.
  - Returns a summary envelope plus `report_id`; the full report is stored in the artifacts vault.
  - Inspect reports with `/artifacts agent-run <ID>` or `agenterm artifacts agent-run <ID>`.

`/status` and `/tools` reflect both:

- Declaration (which tool blocks exist), selection (bundles + keys), and
- The **actual** attached tools where possible, so you can see when a tool is configured but not yet ready (for example, `file_search` without vector store IDs).

## Includes

The Responses API supports an `include` parameter to request additional sections of the response (tool results, images, reasoning, and so on). Agenterm CLI does **not** expose this as a user‑facing knob.

Instead, Agenterm CLI always uses a fixed, high‑value include set:

- `file_search_call.results`
- `web_search_call.results`
- `web_search_call.action.sources`
- `reasoning.encrypted_content`

These values are wired into `ModelSettings.response_include` internally and are not configurable via `config.yaml` or REPL commands. This keeps the runtime predictable while still capturing the key auxiliary data needed for sessions, analytics, and inspection.

`agent_run` retries once without unsupported include values (for example,
`reasoning.encrypted_content`) and records a warning in the report when this
fallback is applied.

## MCP configuration & diagnostics

MCP servers are configured under `mcp.servers` in `config.yaml`.

- MCP servers are activated by selection: include `mcp:<server_key>` in a bundle's `tools` (or use selectors like `mcp:*` in the `integrations` bundle) and ensure that bundle is selected (typically via `tools.default_bundles` or the REPL `/tools` command).
- MCP tools are exposed to the model as function tools with deterministic names:
  - Prefer readable names: `mcp__<server_key>__<tool_name>` (when OpenAI-valid and within 64 chars).
  - Otherwise agenterm shortens the name (slug + short hash) to satisfy the OpenAI function-name contract (allowed characters and max length).
- For gateway models, MCP servers are converted into FunctionTools at run time; outputs are serialized as the MCP tool output envelope and surfaced like other FunctionTool results.
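The naming rule can be sketched as follows. The readable form is taken from the list above; the fallback (slug + short hash) illustrates the idea only, and agenterm's actual shortening scheme may differ:

```python
import hashlib
import re

# OpenAI function names: letters, digits, underscores, hyphens; max 64 chars.
OPENAI_NAME_RE = re.compile(r"^[A-Za-z0-9_-]{1,64}$")

def bridge_tool_name(server_key: str, tool_name: str) -> str:
    """Prefer the readable mcp__<server_key>__<tool_name> form; otherwise
    fall back to a slug plus short hash. Illustrative sketch only."""
    readable = f"mcp__{server_key}__{tool_name}"
    if OPENAI_NAME_RE.fullmatch(readable):
        return readable
    digest = hashlib.sha256(readable.encode()).hexdigest()[:8]
    slug = re.sub(r"[^A-Za-z0-9_-]", "_", readable)[:55]
    return f"{slug}_{digest}"  # 55 + 1 + 8 = 64 chars max

print(bridge_tool_name("files", "read"))  # mcp__files__read
```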

- Construction is pure (config → server objects): `engine.mcp_factory.build_mcp_servers`.
- Lifecycle is engine-owned (connect-before-use, cleanup-on-exit): `engine.mcp_pool.McpServerPool`.

Example (stdio):

```yaml
mcp:
  servers:
    - key: files
      kind: stdio
      name: filesystem
      tool_filter:
        allowed_tool_names: ["read", "list"]
      stdio:
        command: uvx
        args: ["mcp-files-server"]
        cwd: .
```

Diagnostics commands:

- `uv run agenterm mcp servers` — list configured MCP servers.
- `uv run agenterm mcp status` — show connection status and tool counts (`agenterm mcp status --format json` emits a single JSON envelope).
- `uv run agenterm mcp tools` — list all discovered MCP tools grouped by server.
- `uv run agenterm mcp validate` — connect and validate configuration.
- `uv run agenterm mcp inspect` — inspect tools exposed by the servers (prints names; `--out PATH` writes a JSON file).
- `uv run agenterm mcp serve` — run FastMCP to expose local FunctionTools (uses `mcp.expose` config).

From inside the REPL:

- `/mcp refresh` — triggers MCP tool discovery using the current selection.
- `/mcp status` — shows MCP server status and discovered tool counts.
- The `/tools` and `/status` summaries update accordingly.
  - When discovery fails, the status bar MCP segment and `/status` include an explicit error marker; use `/mcp status` for per-server error detail and `/mcp refresh` to retry.

## Guardrails

Guardrails are configured via:

- `guardrails.load_modules: list[str]` (optional)
- `guardrails.input: list[str]`
- `guardrails.output: list[str]`

Guardrail names are resolved via `core.guardrails_registry`, which maps names to concrete input/output guardrail implementations.

agenterm ships with one built-in input guardrail:

- `no_secrets_input` — blocks obvious credential-like strings (API keys, private keys) in user input.

To enable it:

```yaml
guardrails:
  input:
    - no_secrets_input
```
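As a rough illustration of the kind of check `no_secrets_input` performs (the patterns below are our own, not agenterm's actual detection rules):

```python
import re

# Hypothetical patterns -- illustrative only; the built-in guardrail
# uses its own detection rules.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{20,}"),                 # API-key-like token
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),  # PEM private key header
]

def looks_like_secret(text: str) -> bool:
    """Return True when the input contains an obvious credential-like string."""
    return any(p.search(text) for p in SECRET_PATTERNS)

print(looks_like_secret("here is my key sk-" + "a" * 24))  # True
print(looks_like_secret("no credentials here"))            # False
```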

For custom guardrails, list modules to import (so their `register_*_guardrail(...)` calls run at import time), then reference their registered names:

```yaml
guardrails:
  load_modules:
    - my_project.guardrails
  input:
    - my_guardrail_name
```

The `examples` directory includes a reference guardrail:

- `examples/guardrails/no_pii_input.py` — a simple “no PII in input” guardrail.

Execution behavior:

- Input guardrails run on the initial run input.
- Output guardrails run on the final output before it is returned.

## Running from source

For local development in this repo:

- Install dependencies:

  ```bash
  uv sync
  ```

- Basic smoke tests:

  ```bash
  uv run python -m compileall src
  uv run agenterm run "describe this repo"
  uv run agenterm repl
  ```

Run the gate from the repo root:

```bash
uv run devtools/gate.py              # default gate (offline; no secrets; no network)
uv run devtools/gate.py --full       # include provider/network integration tests (requires credentials)
uv run devtools/gate.py --no-tests   # skip pytest entirely (fast feedback)
```

What it runs (in order):

- `ruff check --fix src devtools` (lint rules configured in `pyproject.toml`)
- `ruff format src devtools` (applies formatting fixes)
- `basedpyright .` (scoped to `src`/`devtools` via `pyproject.toml` include; excludes `.uv-cache`/`.worktrees`)
- `python -m compileall src devtools`
- `pytest -q` (omitted with `--no-tests`; exit code 5 for “no tests collected” is treated as success)
- `pytest -q -m integration_provider` (only with `--full`; opt-in tests that hit real providers/network and require credentials)
- `agenterm --help` smoke
- ripgrep policies: no type ignores/noqa, no reflection, no silent swallows, no eval/exec, no shell=True, no empty defs, no Any/cast, no broad `object` annotations
- `python -m devtools.policy_dict_params` (policy: forbid `dict[...]` in function parameter annotations; use `Mapping`/`MutableMapping`)
- `python -m devtools.file_size_check` (≤500 lines / 18 kB per file under `src`/`devtools`)
- `uv lock --check`

## Environment & `~/.agenterm/.env`

agenterm relies on OpenAI environment variables plus any gateway route key env vars you configure.

Environment loading:

- agenterm reads environment variables from your shell (the canonical source).
- For convenience, agenterm optionally loads **one global env file**: `~/.agenterm/.env`.
  - agenterm intentionally does **not** auto-load `.env` files from your project directory to reduce the risk of accidentally committing secrets.
  - The env file does **not** override existing shell environment variables (shell wins; `override=False`).
- This repo includes `.env.example` as a template; copy it to `~/.agenterm/.env` if you want a file-based setup.
- In CI, prefer setting `OPENAI_API_KEY` as an environment variable via your secrets manager (do not rely on the global file).
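The "shell wins" rule is the standard `override=False` semantics (agenterm uses python-dotenv internally); a stdlib sketch that mimics the behavior described above:

```python
import os
import tempfile
from pathlib import Path

def load_env_file(path: Path, override: bool = False) -> None:
    """Minimal sketch of override=False loading: values from the env file
    never replace variables already set in the process environment."""
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if override or key not in os.environ:
            os.environ[key] = value

with tempfile.TemporaryDirectory() as d:
    env_file = Path(d) / ".env"
    env_file.write_text("OPENAI_API_KEY=from-file\n")
    os.environ["OPENAI_API_KEY"] = "from-shell"
    load_env_file(env_file, override=False)
    print(os.environ["OPENAI_API_KEY"])  # from-shell (shell wins)
```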

### Key variables

- `OPENAI_API_KEY` (required for `openai/...` models):
  - Used by the OpenAI client for OpenAI Responses calls.
  - `agenterm run` and `agenterm inspect response` fail fast with a clear message when an OpenAI model is selected and no key is configured.
  - The REPL:
    - Allows you to start without a key.
    - Provides `/config key <OPENAI_API_KEY>` to save it to `~/.agenterm/.env` (and set it for the current session).
    - Rejects OpenAI-backed prompts issued without a key with a user‑facing error panel instead of a traceback.

- Gateway route keys (optional):
  - Set the env vars referenced by `providers.gateway.routes.*.api_key_env`.
  - Examples: `OPENROUTER_API_KEY`, `ANTHROPIC_API_KEY`, `GOOGLE_API_KEY`, `XAI_API_KEY`.

### Optional OpenAI settings (forwarded to the SDK when set)

- `OPENAI_ORG_ID` — maps to the OpenAI client `organization`.
- `OPENAI_PROJECT_ID` — maps to the OpenAI client `project`.
- `OPENAI_BASE_URL` — overrides the API base URL (for example, Azure or another OpenAI‑compatible endpoint).
  - If `providers.openai.base_url` is set, it overrides this value.
- `OPENAI_WEBHOOK_SECRET` — webhook verification secret; documented for completeness, though Agenterm CLI does not currently expose webhook endpoints.

### Shell tool sandbox (macOS)

Availability:
- The shell tool is **macOS-only** because it relies on Seatbelt via `sandbox-exec`.
- On non-macOS platforms (e.g. Linux), the shell tool is **not exposed** (tool key `fn:shell` is unavailable).

The shell tool runs commands inside a macOS Seatbelt sandbox enforced at OS level:

- **Filesystem writes confined** to the workspace root (the cwd where `agenterm` was launched)
- **Temp directories allowed** (`/tmp`, `/private/var/folders`, `/private/tmp`)
- **Network access configurable** via `tools.shell.sandbox.network` (`allow` | `deny`)
- **Safe environment** — commands inherit safe defaults (`PATH`, `HOME`, `SHELL`, etc.) with optional custom vars via `tools.shell.env`

Configuration:

```yaml
tools:
  shell:
    sandbox:
      network: allow      # "allow" | "deny"
```

Full shell features work within the sandbox (pipes, redirection, chaining). Writes outside the workspace are blocked at OS level regardless of command structure.
