Metadata-Version: 2.4
Name: joust
Version: 1.0.1
Summary: Parallel prototype orchestrator for coding agents. Split one vague feature request into N variants, implement them in isolated git worktrees, judge, and pick a winner.
Project-URL: Homepage, https://github.com/nclandrei/joust
Project-URL: Source, https://github.com/nclandrei/joust
Project-URL: Issues, https://github.com/nclandrei/joust/issues
Author: Andrei-Mihai Nicolae
License: MIT
License-File: LICENSE
Keywords: agents,claude-code,cli,git-worktree,prototypes
Classifier: Development Status :: 5 - Production/Stable
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: MacOS
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development
Classifier: Topic :: Software Development :: Version Control :: Git
Requires-Python: >=3.11
Requires-Dist: typer>=0.12
Provides-Extra: dev
Requires-Dist: pytest>=8; extra == 'dev'
Description-Content-Type: text/markdown

# joust

**Parallel prototype orchestrator for coding agents.**

Splits one vague feature request into N variants, implements each in an
isolated git worktree, lets an agent judge them against a rubric, and gates
the final pick on a human.

Joust is in the same family as [proctor](https://github.com/nclandrei/proctor)
and [magellan](https://github.com/nclandrei/magellan): a tool whose
`--help` text is the agent's onboarding. Joust never calls an LLM itself. It
manages state, worktrees, and prompts; an agent runner (Claude Code, Codex,
or a shell script) does the thinking.

## Install

### As a Claude Code plugin (recommended)

```
/plugin marketplace add nclandrei/joust
/plugin install joust@joust
```

The plugin puts `joust` on PATH automatically at session start. See
[`docs/plugin.md`](docs/plugin.md) for the full guide.

### As a standalone CLI

```bash
uv tool install joust      # or: pipx install joust
```

Or from a clone for local development:

```bash
git clone https://github.com/nclandrei/joust.git
cd joust
uv venv
uv pip install -e .
```

Then run `joust --version` and `joust --help` to get started.

## The workflow

```
joust init <slug> --goal "what you are exploring"
joust scenarios suggest --count 3      # prints a brainstorming playbook
joust scenarios add minimal   -d "..."
joust scenarios add power     -d "..."
joust scenarios add opinionated -d "..."
joust runner suggest                   # playbook: agent asks the user which model
joust runner set --runner claude-code --model opus
joust scenarios approve --yes          # HUMAN gate: approve the scenario set
joust lock                             # materializes N git worktrees
# --- dispatch one subagent per worktree ---
joust record variant-01 --status built --summary "..."
joust record variant-02 --status built --summary "..."
joust record variant-03 --status failed --summary "..."
joust judge                            # prints a judging playbook
joust score variant-01 --correctness 4 --simplicity 5 --fit 5 --extensibility 2 --rationale "..."
joust score variant-02 ...
joust score variant-03 ...
joust compare                          # comparison table
joust diff variant-01                  # files/lines changed in this variant
joust pick variant-01 --yes            # human-only gate
joust merge variant-01 --yes           # merge winner into the base branch
joust archive --prune                  # clean up losing branches
```

At any point, `joust log` prints the append-only event timeline for the
active run (`joust log --json` for machine-readable output,
`joust log --last 10` for the most recent events).
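
As a rough illustration of what an append-only JSONL timeline implies for `--last N`, the sketch below appends one JSON object per line and reads back only the most recent entries. The field names (`type`, `ts`) and the helpers `append_event` and `last_events` are assumptions made for this example, not joust's actual schema or code.

```python
import json
import tempfile
import time
from collections import deque
from pathlib import Path


def append_event(log_path: Path, event_type: str, **payload) -> None:
    """Append one event as a single JSON line; existing lines are never rewritten."""
    event = {"type": event_type, "ts": time.time(), **payload}  # hypothetical schema
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event) + "\n")


def last_events(log_path: Path, n: int) -> list[dict]:
    """Return the n most recent events, oldest first."""
    with log_path.open(encoding="utf-8") as fh:
        tail = deque(fh, maxlen=n)  # keeps only the final n lines
    return [json.loads(line) for line in tail]


# Demo against a throwaway file rather than a real run directory.
demo_log = Path(tempfile.mkdtemp()) / "events.jsonl"
for name in ("init", "scenario_add", "approve", "lock"):
    append_event(demo_log, name)
recent_types = [e["type"] for e in last_events(demo_log, 2)]
print(recent_types)
```

Because every write is an append, reading the tail of the file is enough to answer "what happened most recently" without parsing any other state.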

For a full walkthrough with realistic output at every step, see
[`docs/examples/first-run.md`](docs/examples/first-run.md) — three
rate-limit variants on a Flask endpoint with a custom rubric.

## Design principles (borrowed from proctor/magellan)

1. **`--help` is the onboarding.** A fresh agent runs `joust --help` and
   `joust <cmd> --help` and can complete a run without reading source.
2. **No LLM calls.** Joust emits static playbooks for brainstorming and
   judging; the calling agent reads them and acts. That keeps joust
   deterministic, testable, and model-agnostic.
3. **Contract-first, append-only state.** Every run has a `manifest.json`
   plus an `events.jsonl` log. Nothing is mutated in place.
4. **Two human gates.** `joust scenarios approve --yes` sits between
   brainstorming and worktree creation (so the agent cannot silently lock a
   run with scenarios it made up). `joust pick --yes` sits between judging
   and merging (so the agent cannot silently declare a winner). Both refuse
   to run without `--yes`.
5. **Runner is recorded, not assumed.** `joust runner suggest` prints a
   playbook telling the agent to ask the user which model the subagents
   should run on (e.g. opus / sonnet / haiku). The answer is recorded with
   `joust runner set` and shows up in every variant's `PROMPT.md`. Joust
   itself never spawns a subagent — the driver does, using the recorded
   model.
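
The human-gate principle can be sketched as a command that does nothing unless an explicit confirmation flag is present. This is a minimal stand-in built on `argparse` so it runs with the standard library alone. The `gate-demo` program name, the `pick` helper, the messages, and the exit code `2` are all assumptions for illustration; joust's own implementation may look quite different.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="gate-demo")  # hypothetical CLI, not joust
    parser.add_argument("variant")
    parser.add_argument("--yes", action="store_true",
                        help="explicit human confirmation")
    return parser


def pick(argv: list[str]) -> tuple[int, str]:
    """Return (exit_code, message); refuse unless --yes was passed."""
    args = build_parser().parse_args(argv)
    if not args.yes:
        # Assumed behaviour: fail loudly and change no state.
        return 2, f"refusing to pick {args.variant} without --yes"
    return 0, f"picked {args.variant}"


print(pick(["variant-01"]))
print(pick(["variant-01", "--yes"]))
```

Making the flag opt-in means an agent that omits `--yes` gets a refusal rather than a silent default.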

## State layout

```
$JOUST_HOME/                             # default: ~/.joust
  active.json                            # repo_path -> {slug, run_id}
  runs/
    <slug>/
      <run-id>/
        manifest.json
        events.jsonl
        worktrees/
          variant-01/                    # git worktree + PROMPT.md
          variant-02/
          variant-03/
        artifacts/
          variant-01/                    # files copied by `joust record --artifact`
```

Each variant lives on a branch named `joust/<run-id>/<variant-id>`, rooted at
the SHA of the base ref captured at `joust init` time.
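
For scripts that need to locate a variant, the layout and branch pattern above can be expressed as a few path helpers. This is a sketch under stated assumptions: the helper names are hypothetical, the run id `run-001` and slug `rate-limit` are made-up placeholders (the real run-id format is not described here), and reading `JOUST_HOME` from the environment with a `~/.joust` fallback mirrors the documented default but is not taken from joust's source.

```python
import os
from pathlib import Path


def joust_home() -> Path:
    """Resolve the state root: $JOUST_HOME if set, else the documented ~/.joust."""
    return Path(os.environ.get("JOUST_HOME", "~/.joust")).expanduser()


def variant_branch(run_id: str, variant_id: str) -> str:
    """Branch name following the documented joust/<run-id>/<variant-id> pattern."""
    return f"joust/{run_id}/{variant_id}"


def variant_worktree(home: Path, slug: str, run_id: str, variant_id: str) -> Path:
    """Worktree path following the documented state layout."""
    return home / "runs" / slug / run_id / "worktrees" / variant_id


home = Path("/tmp/joust-home")  # stand-in for $JOUST_HOME in this demo
print(variant_branch("run-001", "variant-01"))
print(variant_worktree(home, "rate-limit", "run-001", "variant-01").as_posix())
```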

## How to drive it from Claude Code

The simplest usage is a one-paragraph skill:

> Read `joust --help`, then `joust scenarios suggest --help`. Propose
> scenarios and register them. Ask the human to approve them with
> `joust scenarios approve --yes`, then run `joust lock`. Then spawn one subagent per
> worktree (use the Agent tool with `isolation: "worktree"` pointing at each
> worktree path). Each subagent's prompt is the worktree's `PROMPT.md`. When
> they finish, they call `joust record`. Then call `joust judge`, fill in
> scores with `joust score`, and run `joust compare`. Stop and ask the human
> to run `joust pick`.

Joust stays out of the model layer entirely — it just gives the agent a
structured, inspectable place to put its work.
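
When the runner is a plain script rather than Claude Code, the same loop might look like the sketch below: gather each worktree's `PROMPT.md`, hand it to whatever does the work, then report back with `joust record`. The helper names are hypothetical, and the demo builds a fake `worktrees/` directory so it runs without joust installed. Only the `joust record <variant> --status ... --summary ...` shape is taken from the workflow above.

```python
import tempfile
from pathlib import Path


def collect_prompts(worktrees_dir: Path) -> dict[str, str]:
    """Map each variant directory name to the text of its PROMPT.md."""
    prompts = {}
    for variant_dir in sorted(worktrees_dir.iterdir()):
        prompt_file = variant_dir / "PROMPT.md"
        if prompt_file.is_file():
            prompts[variant_dir.name] = prompt_file.read_text(encoding="utf-8")
    return prompts


def record_command(variant_id: str, status: str, summary: str) -> list[str]:
    """argv for the documented `joust record` call; pass it to subprocess.run."""
    return ["joust", "record", variant_id, "--status", status, "--summary", summary]


# Demo on a fake worktrees/ directory (contents are illustrative only).
fake = Path(tempfile.mkdtemp()) / "worktrees"
for vid in ("variant-01", "variant-02"):
    (fake / vid).mkdir(parents=True)
    (fake / vid / "PROMPT.md").write_text(f"Implement {vid}\n", encoding="utf-8")

prompts = collect_prompts(fake)
print(sorted(prompts))
print(record_command("variant-01", "built", "token bucket limiter"))
```

Building the command as an argv list avoids shell quoting problems when the summary contains spaces or quotes.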

## Composition with proctor

If a variant is a UI or end-to-end feature, run `proctor` inside its
worktree to capture manual-test evidence. `joust` owns "which variant wins";
`proctor` owns "does this variant actually work". The two are orthogonal.

## Housekeeping

- **`joust diff <variant>`** — shows a short-stat diff for the variant's
  branch against the base, plus the list of changed files. Useful during
  judging to see at a glance which variant touched what.
- **`joust log`** — prints the append-only event timeline for the active
  run (init, scenario_add, approve, lock, record, score, pick, merge,
  archive). Supports `--json` and `--last N`.
- **`joust merge <variant> --yes`** — merges the picked variant's branch
  back into the base branch. Refuses unless a variant has been picked and
  `--yes` is passed.
- **`joust doctor`** — diagnoses drift between joust state, git's view of
  worktrees/branches, and the filesystem. Detects stale active pointers
  (when the run dir has been `rm -rf`d), missing variant worktrees, and
  orphan `joust/<run-id>/variant-*` branches left over from aborted runs.
  Run `joust doctor --fix` to clean them up.
- **`joust runs`** — lists every run for the current repo.
- **`joust use <slug> <run-id>`** — switches which run is active in the
  current repo.
- **`joust report`** — prints filesystem paths for the active run's
  manifest, events log, and worktrees.
- **`joust --version`** / **`-V`** — prints the installed joust version
  and exits.
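
To make the orphan-branch check in `joust doctor` more concrete, here is one way such a check could work: match branch names against the `joust/<run-id>/variant-*` pattern and flag those whose run id has no known state. The regular expression, the `orphan_branches` helper, and the definition of "orphan" used here are assumptions for illustration. The actual command may apply different or additional rules.

```python
import re

# Pattern derived from the documented branch naming; the exact rule is assumed.
BRANCH_RE = re.compile(r"^joust/(?P<run_id>[^/]+)/(?P<variant>variant-[^/]+)$")


def orphan_branches(branches: list[str], known_run_ids: set[str]) -> list[str]:
    """Return joust variant branches whose run id is not in known_run_ids."""
    orphans = []
    for name in branches:
        match = BRANCH_RE.match(name)
        if match and match.group("run_id") not in known_run_ids:
            orphans.append(name)
    return orphans


branches = [
    "main",
    "joust/run-a/variant-01",
    "joust/run-b/variant-01",  # run-b has no state directory in this demo
    "feature/login",
]
print(orphan_branches(branches, known_run_ids={"run-a"}))
```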

## Testing

```bash
uv pip install -e ".[dev]"
pytest tests/
```

The smoke test exercises the full pipeline end-to-end against a throwaway
git repo and asserts on the final manifest and event-log state.

## Status

v1.0 — stable, public, installable three ways: Claude Code plugin,
PyPI, or from a clone. 44 tests pass across Python 3.11/3.12/3.13 on
macOS and Linux in CI.

Milestones reached:

- **v0.2.0** — repo public, PyPI release, CI on push/PR. ✅
- **v0.3.0** — customizable rubric (`joust init --rubric ...`), richer
  `scenarios suggest` playbook that surfaces prior runs, and a full
  walkthrough at [`docs/examples/first-run.md`](docs/examples/first-run.md). ✅
- **v1.0.0** — Claude Code plugin under `plugin/` that puts `joust` on
  PATH at session start. See [`docs/plugin.md`](docs/plugin.md). ✅

See `docs/superpowers/specs/` for the design.
