Metadata-Version: 2.4
Name: coding-cli-runtime
Version: 0.5.0
Summary: Reusable CLI runtime primitives for provider-backed automation workflows
Author-email: LLM Eval maintainers <llm-eval-maintainers@users.noreply.github.com>
License-Expression: MIT
Project-URL: Homepage, https://github.com/pj-ms/llm-eval/tree/main/packages/coding-cli-runtime
Project-URL: Repository, https://github.com/pj-ms/llm-eval
Project-URL: Issues, https://github.com/pj-ms/llm-eval/issues
Project-URL: Changelog, https://github.com/pj-ms/llm-eval/blob/main/packages/coding-cli-runtime/CHANGELOG.md
Keywords: cli,runtime,llm,automation,schema-validation
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Typing :: Typed
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: dev
Requires-Dist: build>=1.2.2; extra == "dev"
Requires-Dist: bump-my-version>=0.26.0; extra == "dev"
Requires-Dist: mypy>=1.13.0; extra == "dev"
Requires-Dist: pre-commit>=4.0.0; extra == "dev"
Requires-Dist: pytest==8.3.3; extra == "dev"
Requires-Dist: pytest-asyncio>=0.23.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0; extra == "dev"
Requires-Dist: ruff>=0.8.0; extra == "dev"
Requires-Dist: setuptools>=77.0.3; extra == "dev"
Requires-Dist: twine>=5.1.1; extra == "dev"
Requires-Dist: wheel; extra == "dev"
Dynamic: license-file

# coding-cli-runtime

[![PyPI](https://img.shields.io/pypi/v/coding-cli-runtime)](https://pypi.org/project/coding-cli-runtime/)
[![Python](https://img.shields.io/pypi/pyversions/coding-cli-runtime)](https://pypi.org/project/coding-cli-runtime/)
[![Build](https://github.com/pj-ms/llm-eval/actions/workflows/ci.yml/badge.svg)](https://github.com/pj-ms/llm-eval/actions/workflows/ci.yml)
[![License](https://img.shields.io/pypi/l/coding-cli-runtime)](LICENSE)

A Python library for orchestrating LLM coding agent CLIs — [Claude Code](https://docs.anthropic.com/en/docs/claude-code), [Codex](https://github.com/openai/codex), [Gemini CLI](https://github.com/google-gemini/gemini-cli), and [GitHub Copilot](https://docs.github.com/en/copilot).

These CLIs each have different invocation patterns, output formats, error
shapes, and timeout behaviors. This library normalizes all of that behind
a common `CliRunRequest` → `CliRunResult` contract, so your automation
code doesn't need provider-specific subprocess handling.

The package now exposes a stable core API plus preview provider adapters:

- `coding_cli_runtime` — stable metadata, launch primitives, schema helpers,
  subprocess/session execution, and provider facts.
- `coding_cli_runtime.providers` — preview provider-aware adapters that own
  run/parse/session/recovery flows for Claude, Codex, Copilot, and Gemini.

**What it does (and why not just `subprocess.run`):**

- Run any provider CLI with unified request/result types and timeout enforcement
- Query the model catalog (with user-override and live-cache fallback)
- Classify failures as retryable vs fatal per provider
- Look up provider auth, config dirs, and headless launch flags
- Build non-interactive launch commands without hardcoding provider flags
- Find session logs and preserved conversations after a run
- Run long-lived sessions with process-group cleanup and transcript mirroring
- Preview provider-aware launch/result flows with typed adapter events and
  normalized raw execution views
- No Python package dependencies — only requires the provider CLIs themselves

## Installation

```bash
pip install coding-cli-runtime
# or
uv add coding-cli-runtime
```

Requires Python 3.10+.

## Examples

### Execute a provider CLI

```python
import asyncio
from pathlib import Path
from coding_cli_runtime import CliRunRequest, run_cli_command

request = CliRunRequest(
    cmd_parts=("codex", "--model", "gpt-5.4", "--quiet", "exec", "fix the tests"),
    cwd=Path("/tmp/my-project"),
    timeout_seconds=120,
)
result = asyncio.run(run_cli_command(request))

print(result.returncode)        # 0
print(result.error_code)        # "none"
print(result.duration_seconds)  # 14.2
print(result.stdout_text[:200])
```

Swap `codex` for `claude`, `gemini`, or `copilot` — the request/result
shape stays the same. A synchronous variant `run_cli_command_sync` is also
available.

### Pick a model from the provider catalog

```python
from coding_cli_runtime import get_provider_spec

codex = get_provider_spec("codex")
print(codex.default_model)   # "gpt-5.3-codex"
print(codex.model_source)    # "codex_cli_cache", "override", or "code"

for model in codex.models:
    print(f"  {model.name}: {model.description}")
```

The catalog covers all four providers — each with model names, reasoning
levels, default settings, and visibility flags.

Model lists are resolved with a three-tier fallback:

1. **User override** — drop a JSON file at
   `~/.config/coding-cli-runtime/providers/<provider>.json` to use your own
   model list immediately, without waiting for a package update.
2. **Live CLI cache** — for Codex, the library reads
   `~/.codex/models_cache.json` (auto-refreshed by the Codex CLI) when
   present. Other providers fall through because their CLIs don't expose a
   machine-readable model list.
3. **Hardcoded fallback** — the model list shipped with the package.

Override file format:

```json
{
  "default_model": "claude-sonnet-4-7",
  "models": [
    "claude-sonnet-4-7",
    {
      "name": "claude-opus-5",
      "description": "Latest opus model",
      "controls": [
        { "name": "effort", "kind": "choice", "choices": ["low", "high"], "default": "low" }
      ]
    }
  ]
}
```

Set `CODING_CLI_RUNTIME_CONFIG_DIR` to change the config directory
(default: `~/.config/coding-cli-runtime`).

### Decide whether to retry a failed run

```python
from coding_cli_runtime import classify_provider_failure

classification = classify_provider_failure(
    provider="gemini",
    stderr_text="429 Resource exhausted: rate limit exceeded",
)

if classification.retryable:
    print(f"Retryable ({classification.category}) — will retry")
else:
    print(f"Fatal ({classification.category}) — giving up")
```

Works for all four providers. Recognizes auth failures, rate limits,
network transients, and other provider-specific error patterns.

### Use preview provider-aware adapters

```python
from pathlib import Path

from coding_cli_runtime.providers import claude

request = claude.ClaudeExecRequest(
    model="claude-sonnet-4-6",
    prompt="Summarize the repository status as JSON.",
    cwd=Path("/tmp/my-project"),
    output_format="json",
    transcript_path=Path("/tmp/claude-conversation.jsonl"),
)

preview = claude.prepare_launch(request)
print(preview.display_text)

result = claude.run_sync(request)
print(result.raw_execution.returncode)
print(result.parsed_output.structured_output)

session = claude.find_session(
    request.cwd,
    result.raw_execution.started_at,
    prompt_text=request.prompt,
)
conversation = claude.get_conversation(session)
print(conversation.line_count)
```

These preview adapters are provider-aware and provider-specific on purpose.
They are the right API when you want package-owned parsing, session lookup,
conversation retrieval, launch preview, adapter events, and provider-specific
recovery behavior.

These `coding_cli_runtime.providers.*` APIs are still preview surfaces and may
evolve faster than the stable core metadata/helpers.

### Common integration tasks

#### Check whether a provider CLI is installed

```python
from coding_cli_runtime import is_provider_installed

if not is_provider_installed("claude"):
    raise RuntimeError("Claude Code is not available on PATH")
```

This is intentionally minimal: it checks whether the provider binary exists on
PATH. Deeper CLI drift validation belongs in maintainer tooling, not the
runtime API.

#### Resolve workspace env vars and session search paths

```python
from coding_cli_runtime import (
    get_provider_contract,
    resolve_session_search_paths,
    resolve_workspace_env,
)

gemini = get_provider_contract("gemini")

# Derive provider-specific workspace env vars from contract metadata
env = resolve_workspace_env(gemini, "/tmp/run-dir")
# {"GEMINI_CLI_IDE_WORKSPACE_PATH": "/tmp/run-dir"}

# Expand concrete host paths for session log searches
paths = resolve_session_search_paths(gemini)
# (Path.home() / ".gemini" / "tmp",)
```

Use these helpers when you want the contract facts turned into concrete
filesystem/env values without rebuilding the same glue logic in your code.

### Look up provider contract metadata

```python
from coding_cli_runtime import get_provider_contract, build_env_overlay, resolve_config_paths, render_prompt

# Get structured metadata for any supported provider
contract = get_provider_contract("claude")
print(contract.binary)                        # "claude"
print(contract.auth.api_key_env_var)          # "CLAUDE_API_KEY"
print(contract.paths.config_dir)              # "~/.claude"
print(contract.headless.approval.flag)        # "--dangerously-skip-permissions"

# Build env var overlay for subprocess
env = build_env_overlay(contract, api_key="sk-...", base_url="https://custom.example.com")
# {"CLAUDE_API_KEY": "sk-...", "ANTHROPIC_BASE_URL": "https://custom.example.com"}

# Resolve config paths for container mounts
host_dir, container_dir = resolve_config_paths(contract, containerized=True)
# ("/home/user/.claude", "/root/.claude")

# Resolve prompt delivery (stdin vs flag vs activation)
payload = render_prompt(contract.headless.prompt, "Fix the bug")
# payload.args = ()            (stdin delivery for Claude)
# payload.stdin_text = "Fix the bug"
```

`ProviderContract` is structured as nested sub-contracts
(`AuthContract`, `PathContract`, `HeadlessContract`, `OutputContract`,
`IoContract`, `SessionDiscoveryContract`, `DiagnosticsContract`) so callers
can drill into whichever aspect they need. This is reference metadata,
not a command-construction control plane — callers keep their own
command assembly and adopt contract fields selectively.

### Query provider I/O conventions

```python
from coding_cli_runtime import get_provider_contract

gemini = get_provider_contract("gemini")

# Workspace env vars with value semantics
for wev in gemini.io.workspace_env_vars:
    print(f"{wev.name} = {wev.value_source}")
    # GEMINI_CLI_IDE_WORKSPACE_PATH = execution_dir

# Session discovery (where session logs live)
sd = gemini.session_discovery
print(sd.session_roots)  # ("tmp",)
print(sd.session_glob)   # "*/chats/session-*.json"

# Output format support
codex = get_provider_contract("codex")
print(codex.output.output_path_flag)    # "-o"
print(codex.output.schema_path_flag)    # "--output-schema"

# Diagnostics (Copilot only)
copilot = get_provider_contract("copilot")
if copilot.diagnostics:
    print(copilot.diagnostics.log_glob)  # "logs/process-*.log"
```

`WorkspaceEnvVar.value_source` uses a closed vocabulary:
`"execution_dir"` or `"workspace_root"`.

### Build headless launch commands

```python
from coding_cli_runtime import build_claude_headless_core, build_codex_headless_core

# Claude: binary + --print + --permission-mode + --dangerously-skip-permissions + --model
cmd = build_claude_headless_core("claude-sonnet-4-6")
cmd.extend(["--output-format", "text", "--disallowedTools", "Bash,Task"])

# Codex: binary + exec + --full-auto + --sandbox + --skip-git-repo-check + --model
cmd = build_codex_headless_core("gpt-5.4", sandbox_mode="read-only")
cmd.extend(["-C", str(workdir)])
```

Headless core helpers emit the standard flags for non-interactive runs.
Consumers append app-specific tails (tool restrictions, output paths, etc.).

### Find session logs after a run

```python
import time
from coding_cli_runtime import find_codex_session, find_claude_session

# Find the most recent Codex session log for a given working directory
session = find_codex_session("/path/to/project", since_ts=time.time() - 300)
if session:
    print(f"Session log: {session}")  # ~/.codex/sessions/.../conversation.jsonl
```

Works for Codex and Claude. Scans provider config directories for session
files matching the working directory and time window.

## Key types

| Type | Purpose |
|------|---------|
| `CliRunRequest` | Command spec: cmd, cwd, env, timeout, stream paths |
| `CliRunResult` | Result: returncode, stdout/stderr, duration, error code |
| `ErrorCode` | `none` · `spawn_failed` · `timed_out` · `non_zero_exit` |
| `ProviderSpec` | Provider catalog entry with models, controls, defaults |
| `ProviderContract` | Structured provider CLI metadata (auth, paths, headless, I/O, sessions) |
| `WorkspaceEnvVar` | Env var with value-source semantics (`execution_dir`, `workspace_root`) |
| `FailureClassification` | Classified error with retryable flag and category |

### Run long-lived CLI sessions

For CLI runs that take minutes (e.g., full app generation), use
`run_interactive_session()` instead of `run_cli_command()`. It adds:

- Process-group cleanup (kills orphaned child processes on timeout)
- Transcript mirroring (streams CLI output to a file while the process runs)
- Automatic retries on transient failures

```python
from coding_cli_runtime import run_interactive_session

result = await run_interactive_session(
    cmd_parts=("claude", "--print", "--model", "claude-sonnet-4-6"),
    cwd=workdir,
    stdin_text=prompt,
    logger=logger,
    timeout_seconds=600,
)
```

Only `cmd_parts`, `cwd`, `stdin_text`, and `logger` are required.
Other parameters have sensible defaults.

## API summary

The full public API is listed in [`__init__.py`](src/coding_cli_runtime/__init__.py).
Key function groups:

| Group | Functions |
|-------|-----------|
| Execution | `run_cli_command`, `run_cli_command_sync`, `run_interactive_session` |
| Provider metadata | `get_provider_contract`, `get_provider_spec`, `list_provider_specs` |
| Contract helpers | `build_env_overlay`, `resolve_config_paths`, `render_prompt`, `resolve_auth`, `resolve_workspace_env`, `resolve_session_search_paths` |
| Headless launch | `build_claude_headless_core`, `build_codex_headless_core`, `build_copilot_headless_core`, `build_gemini_headless_core` |
| Codex batch | `build_codex_exec_spec` |
| Failure handling | `classify_provider_failure` |
| Installation check | `is_provider_installed` |
| Session logs | `find_codex_session`, `find_claude_session` |
| Schema | `load_schema`, `validate_payload` |
| Utilities | `redact_text`, `build_model_id`, `normalize_path_str` |

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md) for development setup and quality checks.

## Prerequisites

This package does **not** bundle any CLI binaries or credentials. You must
install and authenticate the relevant provider CLI yourself before using the
execution helpers.

## Status

Pre-1.0. API may change between minor versions.

## License

MIT
