# DiffGuard

**Catches the structural breaks that pass code review.**

## The catch

Flask changed `redirect()`'s default from 302 to 303 ([PR #5898](https://github.com/pallets/flask/pull/5898)). A reviewer sees a one-line diff. DiffGuard sees 7 callers that silently change behavior:

```
$ diffguard review eca5fd1d~1..eca5fd1d

⚠ DiffGuard: 2 changes need review

1. DEFAULT VALUE CHANGED: redirect(location, code=302, Response) → redirect(location, code=303, Response)
   File: src/flask/helpers.py:241
   Impact: 7 callers rely on the default:
     auth.py:25   `return redirect(url_for("auth.login"))`
     auth.py:77   `return redirect(url_for("auth.login"))`
     auth.py:105  `return redirect(url_for("index"))`
     auth.py:116  `return redirect(url_for("index"))`
     blog.py:81   `return redirect(url_for("blog.index"))`
   Review: Verify callers expect the new default value

2. DEFAULT VALUE CHANGED: App.redirect(self, location, code=302) → App.redirect(self, location, code=303)
   File: src/flask/sansio/app.py:935
   Impact: 7 callers rely on the default
   Review: Verify callers expect the new default value
```

Real output from DiffGuard run against Flask commit `eca5fd1d`. Signature display simplified for readability.

## What DiffGuard is

DiffGuard is a **verification layer** for code changes. Not a review tool — reviews give opinions, DiffGuard gives facts. It uses tree-sitter AST analysis to detect structural changes in git diffs and traces their impact through your codebase.

**What it catches:** Function signature changes, removed/renamed symbols, default value changes — and shows you every caller affected.

**What it doesn't catch:** Logic bugs, behavioral changes beyond signatures, performance issues, security vulnerabilities. DiffGuard detects a specific class of **structural breaks**, not all bugs.

When there's nothing structural to report, it stays silent (exit code 0, no output).

## Get started

```bash
pip install diffguard
diffguard review main..feature
```

Exit codes: `0` = nothing noteworthy, `1` = findings, `2` = error.

## How it works

1. **Parses the diff** using tree-sitter AST analysis (not regex)
2. **Extracts symbols** — functions, classes, signatures
3. **Detects high-signal changes** — signature changes, removed symbols, default value changes
4. **Scans for callers** — finds files that reference changed symbols
5. **Outputs actionable context** — or stays silent if nothing matters

See [How It Works](how-it-works.md) for the full technical approach.

## Why not X?

| | DiffGuard | CodeRabbit | Aider repo-map |
|---|---|---|---|
| **Setup** | `pip install` (30 seconds) | Account + GitHub app + config | Locked inside Aider |
| **Cost** | Free | $15–30/seat/month | Free (Aider-only) |
| **Privacy** | Code never leaves your machine | Code on their servers | Local |
| **Works with any agent** | Yes — CLI + JSON | GitHub PR comments only | Aider only |
| **Output** | Silent when nothing matters | Comments on every PR | N/A |

## Agent integration

Add one line to your agent config — DiffGuard is silent when nothing matters.

- **Claude Code** — Add to `CLAUDE.md` or wire as a [hook](agent-integration.md#claude-code-hook). See [snippet](claude-md-snippet.md).
- **Cursor** — Add `.cursor/rules/diffguard.mdc`. See [snippet](cursor-rule-snippet.md).
- **Any agent** — One instruction: `Before reviewing diffs, run: diffguard review <base>..HEAD`

See the full [Agent Integration Guide](agent-integration.md) for hooks, CI patterns, and examples.

## GitHub Action

```yaml
# .github/workflows/diffguard.yml
name: DiffGuard PR Review
on:
  pull_request:
    types: [opened, synchronize, reopened]
permissions:
  contents: read
  pull-requests: write
jobs:
  diffguard:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: ostehost/diffguard@main
```

When findings exist, DiffGuard posts a PR comment. When there's nothing noteworthy, it stays silent.

## Languages

- **Python** — most mature, extensive real-world validation
- TypeScript / JavaScript
- Go
- More planned (Rust, Java, C#)

## Philosophy

1. **Silence is a feature.** No findings? No output. Most diffs don't need structural analysis.
2. **Local-first.** Your code never leaves your machine. No SaaS, no API keys, no accounts.
3. **Agent-native.** CLI + JSON output. `pip install` and go. Works with any agent or workflow.
4. **Precision over recall.** We'd rather miss a minor issue than cry wolf on every PR.
# Quick Start

## Install

```bash
pip install diffguard
```

## `diffguard review` — the primary command

Surfaces high-signal structural changes. Silent when nothing is noteworthy.

```bash
# Review last commit
diffguard review HEAD~1..HEAD

# Review a PR branch
diffguard review main..feature-branch
```

### Exit codes

| Code | Meaning |
|------|---------|
| 0 | No high-signal findings (no output) |
| 1 | Findings present — read stdout |
| 2 | Error |

### Example output (text)

```
⚠ DiffGuard: 2 changes need review

1. DEFAULT VALUE CHANGED: redirect(location, code=302, Response) → redirect(location, code=303, Response)
   File: src/flask/helpers.py:241
   Impact: 7 callers rely on the default:
     auth.py:25   `return redirect(url_for("auth.login"))`
     auth.py:77   `return redirect(url_for("auth.login"))`
     auth.py:105  `return redirect(url_for("index"))`
     blog.py:81   `return redirect(url_for("blog.index"))`
   Review: Verify callers expect the new default value

2. DEFAULT VALUE CHANGED: App.redirect(self, location, code=302) → App.redirect(self, location, code=303)
   File: src/flask/sansio/app.py:935
   Impact: 7 callers rely on the default
   Review: Verify callers expect the new default value
```

*Real output from DiffGuard run against Flask commit `eca5fd1d`.*

### Example output (JSON)

```bash
diffguard review HEAD~1..HEAD --format json
```

When there are no findings:

```json
{
  "version": "0.1.0",
  "ref_range": "HEAD~1..HEAD",
  "findings": [],
  "stats": {
    "files_analyzed": 1,
    "symbols_changed": 0,
    "silence_reason": "no high-signal changes"
  }
}
```

See [Schema Reference](schema.md#review-output) for the full schema.

---

## `diffguard summarize` — full structural summary

Always produces output. Gives a complete map of what changed structurally — useful for agents that want the full picture, not just the high-signal items.

```bash
# Summarize last commit
diffguard summarize HEAD~1..HEAD

# Choose output tier
diffguard summarize HEAD~1..HEAD --format oneliner
diffguard summarize HEAD~1..HEAD --format short
diffguard summarize HEAD~1..HEAD --format json
```

### Example output (JSON)

```json
{
  "schema_version": "1.1",
  "meta": {
    "ref_range": "abc1234..def5678",
    "stats": { "files": 3, "additions": 340, "deletions": 89 },
    "warnings": [],
    "timing_ms": 187.4
  },
  "files": [
    {
      "path": "src/auth/client.ts",
      "language": "typescript",
      "change_type": "modified",
      "changes": [
        {
          "kind": "function_removed",
          "name": "authenticate",
          "signature": "authenticate(apiKey: string): Promise<Session>",
          "line": 45,
          "breaking": true
        }
      ]
    }
  ],
  "summary": {
    "change_types": { "feature": 1, "refactor": 2 },
    "breaking_changes": [...],
    "focus": ["authenticate() removed — callers need migration"]
  },
  "tiered": {
    "oneliner": "Replace API key auth with OAuth2 PKCE; 2 breaking changes",
    "short": "Removes authenticate(apiKey), adds authenticateOAuth(config)...",
    "detailed": "..."
  }
}
```

!!! note "Illustrative example"
    The JSON above is illustrative of the schema structure. Field names and types match the real schema — see [Schema Reference](schema.md#summarize-output) for details.

See [Schema Reference](schema.md) for the full output format.

---

## When to use which

| Scenario | Command |
|----------|---------|
| CI gate / pre-review check | `diffguard review` |
| Agent needs full structural map | `diffguard summarize` |
| Quick "anything breaking?" check | `diffguard review` |
| Feeding context to an AI reviewer | `diffguard summarize --format json` |

## What DiffGuard tells you

DiffGuard reports **structural facts**: which functions changed, what signatures broke, what was removed, which callers are affected.

It does **not** tell you *why* something changed, whether the logic is correct, or whether it's a good idea. That's the reviewer's job.

**Scope:** Signatures, removed symbols, default value changes. Not logic, security, or performance.
# Real-World Catches: DiffGuard vs. Historical OSS Bugs

> These are real commits in real repos where DiffGuard flags exactly what went wrong — before users found out the hard way.

---

## 1. 🏆 Flask: `redirect()` default changed from 302 → 303

**Repo:** [pallets/flask](https://github.com/pallets/flask)
**Commit:** `eca5fd1d` (merged via PR [#5898](https://github.com/pallets/flask/pull/5898))
**Issue:** [#5895](https://github.com/pallets/flask/issues/5895)
**Milestone:** Flask 3.2.0

### What changed

The `redirect()` function's `code` parameter default changed from `302` to `303`:

```python
# Before
def redirect(location: str, code: int = 302, ...) -> BaseResponse:

# After
def redirect(location: str, code: int = 303, ...) -> BaseResponse:
```

### Why it matters

HTTP 302 and 303 have subtly different semantics. 302 *sometimes* preserves the HTTP method (browser-dependent), while 303 *always* converts to GET. Any caller relying on 302's method-preservation behavior (e.g., API endpoints expecting POST→POST redirects) would silently break — no errors, just different behavior.

### DiffGuard's output

> Signature display simplified for readability — run the command yourself to see parameter type annotations.

```
⚠ DiffGuard: 2 changes need review

1. DEFAULT VALUE CHANGED: redirect(location, code=302, Response) → redirect(location, code=303, Response)
   File: src/flask/helpers.py:241
   Impact: 7 callers rely on the default:
     auth.py:25   `return redirect(url_for("auth.login"))`
     auth.py:77   `return redirect(url_for("auth.login"))`
     auth.py:105  `return redirect(url_for("index"))`
     blog.py:81   `return redirect(url_for("blog.index"))`
   Review: Verify callers expect the new default value

2. DEFAULT VALUE CHANGED: App.redirect(self, location, code=302) → App.redirect(self, location, code=303)
   File: src/flask/sansio/app.py:935
   Impact: 7 callers rely on the default
   Review: Verify callers expect the new default value
```

### Why this is a great story

- Flask is one of the most popular Python web frameworks (~70k GitHub stars)
- The change is intentional but **silently breaks callers** — no TypeError, no warning
- DiffGuard identifies the exact callers that rely on the default and need verification
- A human reviewer could easily miss the behavioral difference between 302 and 303

**Headline: "DiffGuard would have caught Flask PR #5898 before it shipped."**

---

## 2. httpx: `Request(method=)` narrows from `str | bytes` to `str`

**Repo:** [encode/httpx](https://github.com/encode/httpx)
**Commit:** `6622553` (PR [#3378](https://github.com/encode/httpx/pull/3378))

### What changed

The `Request.__init__()` `method` parameter type was narrowed from `str | bytes` to `str`:

```python
# Before
class Request:
    def __init__(self, method: str | bytes, url: URL | str, ...):

# After
class Request:
    def __init__(self, method: str, url: URL | str, ...):
```

### Why it matters

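Any code passing `method=b"GET"` (bytes) is no longer supported. Because `bytes` has its own `.upper()`, the call to `method.upper()` does not raise; the value simply stays `bytes` instead of being decoded to `str`, and the breakage surfaces later, wherever a `str` method is assumed. The PR author acknowledged this was "nominally an API change" but believed it was a "bugfix in practice." Either way, it is silent breakage for anyone using bytes.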

### DiffGuard's output

```
⚠ DiffGuard: 1 change needs review

1. SIGNATURE CHANGED: __init__(self, method: str | bytes, ...) → __init__(self, method: str, ...)
   File: httpx/_models.py:311
   Impact: 63 callers rely on the default
   Review: Check all callers handle the new signature
```

### Why this is a great story

- httpx is the modern Python HTTP client (~13k stars), used by FastAPI's test client
- Type narrowing is exactly the kind of "looks harmless" change that breaks real code
- DiffGuard catches it instantly — no need to read the diff line-by-line

---

## 3. Pydantic: `@serializer` renamed to `@field_serializer` (symbol removed)

**Repo:** [pydantic/pydantic](https://github.com/pydantic/pydantic)
**Commit:** `11edcb2c` (PR [#5331](https://github.com/pydantic/pydantic/pull/5331))

### What changed

The `@serializer` decorator was renamed to `@field_serializer`, and a new `@model_serializer` was added alongside it. The old name was removed entirely.

### DiffGuard's output

```
⚠ DiffGuard: 5 changes need review

1. PARAMETER ADDED (BREAKING): make_generic_field_serializer(serializer, mode)
   → make_generic_field_serializer(serializer, mode, type)
   Impact: Breaking change — callers will break with missing required argument

2. SYMBOL REMOVED: serializer(__field, *fields, ...)
   File: pydantic/decorators.py:341
   Impact: 19 callers will break

3-4. (additional overloads of the removed symbol)

5. SYMBOL REMOVED: serializer(__field, *fields, mode='wrap', ...)
   Impact: 19 callers will break
```

### Why this is a great story

- Pydantic v2 was the biggest Python library migration in recent memory
- DiffGuard catches both the symbol removal AND identifies 19 internal callers that reference it
- This is the kind of rename that grep can find, but DiffGuard does it *automatically* as part of review

---

## Dogfooding Notes

While running DiffGuard on these repos, I noted:

1. **Missed catch — Django `UniqueConstraint(name=None)` → `UniqueConstraint(name)` (commit `b172cbdf33`):** A parameter changed from optional (`name=None`) to required (`name`). DiffGuard returned exit 0 with no findings. This is a false negative — removing a default value is a breaking change. **Filed as a potential improvement.**

2. **Output verbosity on httpx proxy commit:** 14 findings for the proxies→proxy migration. The output is very long. A summary mode or grouping related changes (e.g., "proxy parameter added to 9 HTTP method functions") would help for large refactors.

3. **Caller detection quality:** DiffGuard correctly identifies callers in both source and test files, which is excellent. The Flask example showing `auth.py` and `blog.py` callers makes the impact immediately tangible.

4. **Speed:** All reviews completed in under 5 seconds on these repos. Fast enough for CI integration.
# How It Works

## The approach: selective trigger

Most diffs don't contain structural breaks. New functions, body-only changes, formatting — none of these affect callers. DiffGuard's core design: **stay silent unless the change is structurally significant.**

This means DiffGuard only reports when it finds:

| Trigger | What it means |
|---------|---------------|
| **Signature changed** | Function contract changed — callers may pass wrong arguments |
| **Default value changed** | Callers relying on the default get different behavior silently |
| **Symbol removed** | Dependents will break |
| **Symbol moved** | Imports need updating |

Body-only changes (same signature, different implementation) are internal refactors. They don't affect callers. DiffGuard ignores them.
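
To make the trigger table concrete, here is a minimal sketch of a signature comparison. It is not DiffGuard's implementation: it swaps tree-sitter for Python's `inspect` module and works on live function objects rather than source text, and the name `diff_signatures` is hypothetical. It only covers removed parameters, added parameters, and changed defaults; a removed default is deliberately left out.

```python
import inspect
from typing import Callable

_EMPTY = inspect.Parameter.empty
_VARIADIC = (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD)


def diff_signatures(old: Callable, new: Callable) -> list[str]:
    """Label structural differences between two versions of a function."""
    old_params = inspect.signature(old).parameters
    new_params = inspect.signature(new).parameters
    findings = []

    for name, before in old_params.items():
        after = new_params.get(name)
        if after is None:
            findings.append(f"PARAMETER REMOVED: {name}")
        elif (
            before.default is not _EMPTY
            and after.default is not _EMPTY
            and before.default != after.default
        ):
            findings.append(
                f"DEFAULT VALUE CHANGED: {name}={before.default!r} -> "
                f"{name}={after.default!r}"
            )

    for name, after in new_params.items():
        if name in old_params:
            continue
        # A new parameter without a default forces every caller to change.
        required = after.default is _EMPTY and after.kind not in _VARIADIC
        label = "PARAMETER ADDED (BREAKING)" if required else "PARAMETER ADDED"
        findings.append(f"{label}: {name}")

    return findings


def redirect_old(location, code=302, Response=None): ...
def redirect_new(location, code=303, Response=None): ...

print(diff_signatures(redirect_old, redirect_new))
# ['DEFAULT VALUE CHANGED: code=302 -> code=303']
```

Two functions with identical signatures produce an empty list here, which mirrors the "body-only changes are ignored" rule.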

## The pipeline

```
git diff → parse → extract → match → classify → scan callers → output (or silence)
```

1. **Parse the diff** — tree-sitter builds ASTs for before/after versions of each changed file. Not regex — full syntax trees.
2. **Extract symbols** — functions, classes, methods with full signatures, line numbers, and scope.
3. **Match old ↔ new** — O(n) dict-based name matching. No fuzzy rename detection (accuracy over comprehensiveness).
4. **Classify changes** — labels each symbol: added, removed, modified, moved, signature_changed. Sets `breaking` flag where applicable.
5. **Scan for callers** — two-stage: `git grep` pre-filters for speed, then tree-sitter confirms references in non-diff files.
6. **Apply selective trigger** — only produce output if high-signal changes exist AND have external callers.

Typical timing: ~200ms for a 1000-line diff.
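
The confirmation stage of step 5 can be illustrated with a small sketch that finds call sites relying on a default. This is an assumption-laden stand-in, not DiffGuard's code: it uses Python's built-in `ast` module instead of tree-sitter, handles Python only, and the function name and its `position` argument are hypothetical. Calls that pass arguments through `*args` or `**kwargs` are not resolved.

```python
import ast


def callers_relying_on_default(
    source: str, func: str, param: str, position: int
) -> list[int]:
    """Return line numbers of calls to `func` that do not pass `param`.

    `position` is the zero-based positional index of `param` in the signature.
    """
    lines = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        callee = node.func
        # Accept both `redirect(...)` and `app.redirect(...)`.
        if isinstance(callee, ast.Name):
            name = callee.id
        else:
            name = getattr(callee, "attr", None)
        if name != func:
            continue
        passed_by_keyword = any(kw.arg == param for kw in node.keywords)
        passed_by_position = len(node.args) > position
        if not (passed_by_keyword or passed_by_position):
            lines.append(node.lineno)
    return sorted(lines)


sample = '''
def login():
    return redirect(url_for("auth.login"))

def legacy():
    return redirect(url_for("index"), code=302)

def positional():
    return redirect(url_for("index"), 307)
'''

print(callers_relying_on_default(sample, "redirect", "code", position=1))
# [3]
```

Only the first call is reported, because the other two pass `code` explicitly and are unaffected by a change to the default.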

## Why tree-sitter

Tree-sitter provides C-speed parsing with pre-built binaries for 40+ languages. It gives DiffGuard real syntax trees instead of regex-based guesses. Adding a new language is mechanical: grammar + query patterns.

Currently supported: Python (most mature), TypeScript/JavaScript, Go. More planned.

## Precision over recall

We tested three iterations before landing on the current design.

Early versions tried to report on every structural change in a diff. A/B testing against 12 real commits from Flask, FastAPI, Pydantic, and httpx showed that most PRs don't benefit from structural analysis — the reviewer can read the diff fine on their own.

The selective trigger changed the results:

| Metric | Result |
|--------|--------|
| **Precision** | 100% — when it spoke, it was right |
| **Silence rate** | 58% — stayed quiet on 7 of 12 PRs |
| **False positives** | 0 |

The key insight: making silence the default turned a marginally useful tool into a precision instrument. A tool that speaks up only to say "`redirect()` default changed from 302 to 303, 7 callers affected" is reporting a checkable fact, so it earns attention. A tool that comments on every PR trains you to ignore it.

## What agents get

Without DiffGuard, an AI reviewing a PR sees:
```
-def redirect(location, code=302, Response=None):
+def redirect(location, code=303, Response=None):
```

With DiffGuard:
```
DEFAULT VALUE CHANGED: redirect(location, code=302) → redirect(location, code=303)
Impact: 7 callers rely on the default:
  auth.py:25  `return redirect(url_for("auth.login"))`
  auth.py:77  `return redirect(url_for("auth.login"))`
  blog.py:81  `return redirect(url_for("blog.index"))`
Review: Verify callers expect HTTP 303 instead of 302
```

The difference: "I see a number changed" vs. "I see a behavioral change that affects 7 call sites."

## Limitations

DiffGuard's value scales with PR size:

| PR size | Value |
|---------|-------|
| Small (<100 lines, 1-2 files) | **Minimal.** The reviewer can read the whole diff. |
| Medium (100-500 lines) | **Moderate.** Structural overview saves time. |
| Large (500+ lines, multiple files) | **Significant.** Linear reading of 1000+ lines misses structural patterns. |

DiffGuard is not magic. On small, focused PRs, you don't need it.

For detailed internals, see [Architecture](architecture.md).
# Agent Integration Guide

DiffGuard works with any AI agent that can run shell commands.

## Quick setup

```bash
pip install diffguard
```

Add one instruction to your agent's system prompt or config:

```
Before reviewing any diff, run: diffguard review <base>..HEAD
```

That's it. DiffGuard is silent (exit 0) when nothing is noteworthy, so it won't add noise.

## Two commands for agents

### `diffguard review` — selective, high-signal (primary)

```bash
diffguard review main..HEAD --format json
```

Returns only high-signal findings: signature changes, removed symbols, default value changes. Silent when nothing matters. Best for CI gates and "should I look closer?" decisions.

### `diffguard summarize` — full structural map

```bash
diffguard summarize main..HEAD --format json
```

Returns a complete structural summary of the diff (~200-300 tokens). Always produces output. Best when the agent needs a full map before reading the diff.

## Exit codes (review command)

| Code | Meaning | Agent action |
|------|---------|--------------|
| 0 | No high-signal findings | Continue normally |
| 1 | Findings present | Read stdout, address each finding |
| 2 | Error (not a repo, bad ref) | Report the error |
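
An agent harness could map these exit codes to actions along the following lines. This is a sketch rather than an official client: the action names, `next_action`, and `run_review` are hypothetical, and the only DiffGuard details it relies on are the `review` command, `--format json`, and the exit codes in the table above.

```python
import subprocess
from typing import Callable

# Action names are illustrative; the codes come from the table above.
ACTIONS = {0: "continue", 1: "address_findings", 2: "report_error"}


def next_action(exit_code: int) -> str:
    """Translate a `diffguard review` exit code into an agent action."""
    # Treat any unexpected code as an error rather than as silence.
    return ACTIONS.get(exit_code, "report_error")


def run_review(ref_range: str, run: Callable = subprocess.run) -> tuple[str, str]:
    """Run the review and return (action, stdout).

    `run` is injectable so the function can be exercised without DiffGuard
    installed.
    """
    proc = run(
        ["diffguard", "review", ref_range, "--format", "json"],
        capture_output=True,
        text=True,
        check=False,  # exit code 1 means "findings", not failure
    )
    return next_action(proc.returncode), proc.stdout
```

Passing `check=False` matters: exit code 1 is a normal result here, so letting `subprocess` raise on it would turn findings into errors.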

## Example output

### Review (text)

```
⚠ DiffGuard: 2 changes need review

1. DEFAULT VALUE CHANGED: redirect(location, code=302, Response) → redirect(location, code=303, Response)
   File: src/flask/helpers.py:241
   Impact: 7 callers rely on the default:
     auth.py:25   `return redirect(url_for("auth.login"))`
     ...
   Review: Verify callers expect the new default value
```

*Real output from Flask commit `eca5fd1d`.*

### Review (JSON)

```json
{
  "version": "0.1.0",
  "ref_range": "main..HEAD",
  "findings": [
    {
      "category": "SIGNATURE_CHANGED",
      "symbol": "authenticate",
      "file": "src/auth/users.py",
      "line": 34,
      "before_signature": "def authenticate(name, email)",
      "after_signature": "def authenticate(name, email, role=\"viewer\")",
      "impact": {
        "production_callers": 3,
        "test_callers": 2,
        "callers": [...]
      },
      "review_hint": "Check all callers handle the new signature"
    }
  ],
  "stats": {
    "files_analyzed": 5,
    "symbols_changed": 8,
    "silence_reason": null
  }
}
```

!!! note "Illustrative example"
    The JSON above shows the schema structure with realistic field values. See [Schema Reference](schema.md#review-output) for the full specification.

## Claude Code

Add the snippet to your repo's `CLAUDE.md` — see [claude-md-snippet.md](claude-md-snippet.md).

### Claude Code Hook

Wire DiffGuard as a hook that runs automatically after edits:

`.claude/settings.json`:

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write|MultiEdit",
        "hooks": [
          {
            "type": "command",
            "command": "diffguard review HEAD~1..HEAD --format text 2>/dev/null || true"
          }
        ]
      }
    ]
  }
}
```

Or as a one-shot check when a task completes:

```json
{
  "hooks": {
    "TaskCompleted": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "diffguard review $(git merge-base main HEAD)..HEAD"
          }
        ]
      }
    ]
  }
}
```

## Cursor

Add a rule file at `.cursor/rules/diffguard.mdc` — see [cursor-rule-snippet.md](cursor-rule-snippet.md).

## Integration patterns

### CI/CD pre-review

```bash
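# In your CI pipeline, before AI review
status=0
FINDINGS=$(diffguard review "$BASE_SHA..HEAD" --format json) || status=$?
if [ "$status" -eq 1 ]; then
  echo "$FINDINGS" | your-agent-review-command
elif [ "$status" -ne 0 ]; then
  echo "diffguard failed (exit $status)" >&2
  exit "$status"
fi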
```

### Git hook

```bash
#!/bin/sh
# .git/hooks/post-commit (make it executable with chmod +x)
diffguard review HEAD~1..HEAD
```

## Scope

DiffGuard catches **structural breaks**: signature changes, removed symbols, default value changes. It does **not** catch logic bugs, security issues, or performance problems. See [What DiffGuard is](index.md#what-diffguard-is).

## Supported languages

- **Python** — most mature, extensive real-world validation
- TypeScript / JavaScript
- Go
- More planned (Rust, Java, C#)
# Schema Reference

DiffGuard has two commands with different JSON output schemas.

---

## Review output

`diffguard review <ref-range> --format json`

The review command outputs a flat list of high-signal findings. When there are no findings, `findings` is empty and `silence_reason` explains why.

### Top-level

| Field | Type | Description |
|-------|------|-------------|
| `version` | `str` | Schema version (currently `"0.1.0"`) |
| `ref_range` | `str` | Git ref range analyzed |
| `findings` | `list[Finding]` | High-signal findings (may be empty) |
| `stats` | `ReviewStats` | Analysis statistics |

### `ReviewStats`

| Field | Type | Description |
|-------|------|-------------|
| `files_analyzed` | `int` | Number of files analyzed |
| `symbols_changed` | `int` | Total symbol-level changes detected |
| `silence_reason` | `str | null` | Why no findings were reported (null if findings exist) |

### `Finding`

| Field | Type | Description |
|-------|------|-------------|
| `category` | `str` | One of: `DEFAULT_VALUE_CHANGED`, `SIGNATURE_CHANGED`, `SYMBOL_REMOVED`, `PARAMETER_ADDED`, `PARAMETER_REMOVED`, `MOVED` |
| `symbol` | `str` | Symbol name |
| `file` | `str` | File path |
| `line` | `int | null` | Line number |
| `before_signature` | `str` | Previous signature (when applicable) |
| `after_signature` | `str` | New signature (when applicable) |
| `impact` | `Impact` | Caller impact analysis |
| `review_hint` | `str` | Suggested reviewer action |

### `Impact`

| Field | Type | Description |
|-------|------|-------------|
| `production_callers` | `int` | Number of non-test callers |
| `test_callers` | `int` | Number of test callers |
| `callers` | `list[Caller]` | Up to 10 caller locations |

### `Caller`

| Field | Type | Description |
|-------|------|-------------|
| `file` | `str` | File path |
| `line` | `int` | Line number |
| `source` | `str` | Source line text |
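
For consumers who prefer typed access, the tables above can be mirrored with plain dataclasses. This is a sketch under assumptions, not a shipped client library: `parse_findings` is a hypothetical helper, and `before_signature` / `after_signature` are treated as optional because the schema marks them "when applicable".

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Caller:
    file: str
    line: int
    source: str


@dataclass
class Impact:
    production_callers: int
    test_callers: int
    callers: list[Caller]


@dataclass
class Finding:
    category: str
    symbol: str
    file: str
    line: Optional[int]
    impact: Impact
    review_hint: str
    # Assumed optional: only present for signature-related categories.
    before_signature: Optional[str] = None
    after_signature: Optional[str] = None


def parse_findings(raw: str) -> list[Finding]:
    """Parse `diffguard review --format json` output into typed findings."""
    findings = []
    for item in json.loads(raw)["findings"]:
        impact = item["impact"]
        findings.append(
            Finding(
                category=item["category"],
                symbol=item["symbol"],
                file=item["file"],
                line=item.get("line"),
                impact=Impact(
                    production_callers=impact["production_callers"],
                    test_callers=impact["test_callers"],
                    callers=[Caller(**c) for c in impact["callers"]],
                ),
                review_hint=item["review_hint"],
                before_signature=item.get("before_signature"),
                after_signature=item.get("after_signature"),
            )
        )
    return findings
```

An empty `findings` list parses to an empty Python list, so callers can treat silence and findings uniformly.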

### Example

```json
{
  "version": "0.1.0",
  "ref_range": "eca5fd1d~1..eca5fd1d",
  "findings": [
    {
      "category": "DEFAULT_VALUE_CHANGED",
      "symbol": "redirect",
      "file": "src/flask/helpers.py",
      "line": 241,
      "before_signature": "def redirect(location, code=302, Response=None)",
      "after_signature": "def redirect(location, code=303, Response=None)",
      "impact": {
        "production_callers": 7,
        "test_callers": 2,
        "callers": [
          {"file": "auth.py", "line": 25, "source": "return redirect(url_for(\"auth.login\"))"}
        ]
      },
      "review_hint": "Verify callers expect the new default value"
    }
  ],
  "stats": {
    "files_analyzed": 2,
    "symbols_changed": 2,
    "silence_reason": null
  }
}
```

!!! note "Illustrative"
    This example is based on real DiffGuard output against Flask commit `eca5fd1d`, with some fields simplified for clarity. Field names and types are accurate.

---

## Summarize output

`diffguard summarize <ref-range> --format json`

The summarize command outputs a complete structural map of the diff. Defined by Pydantic v2 models in `src/diffguard/schema.py`.

### `DiffGuardOutput` (top-level)

| Field | Type | Description |
|-------|------|-------------|
| `schema_version` | `str` | Currently `"1.1"` |
| `meta` | `Meta` | Run metadata: ref range, stats, timing |
| `files` | `list[FileChange]` | Per-file semantic changes |
| `summary` | `Summary` | Aggregate: change types, breaking changes, focus areas |
| `tiered` | `TieredSummary` | Human-readable summaries at different token budgets |

### `Meta`

| Field | Type | Description |
|-------|------|-------------|
| `ref_range` | `str` | Git ref range analyzed |
| `stats` | `DiffStats` | `files`, `additions`, `deletions` counts |
| `warnings` | `list[str]` | Parse errors, truncation signals |
| `timing_ms` | `float | None` | Wall-clock time for analysis |

### `FileChange`

| Field | Type | Description |
|-------|------|-------------|
| `path` | `str` | File path relative to repo root |
| `language` | `str | None` | Detected language |
| `change_type` | `"added" | "removed" | "modified" | "renamed"` | File-level change type |
| `generated` | `bool` | Lock files, protobuf output, etc. |
| `binary` | `bool` | Binary file (skipped) |
| `parse_error` | `bool` | Tree-sitter couldn't parse this file |
| `unsupported_language` | `bool` | No grammar available |
| `changes` | `list[SymbolChange]` | Symbol-level changes |

### `SymbolChange`

| Field | Type | Description |
|-------|------|-------------|
| `kind` | `str` | One of: `function_added`, `function_removed`, `function_modified`, `class_added`, `class_removed`, `class_modified`, `signature_changed`, `moved` |
| `name` | `str` | Symbol name |
| `signature` | `str | None` | Full signature (for added/removed) |
| `before_signature` | `str | None` | Old signature (for `signature_changed`) |
| `after_signature` | `str | None` | New signature (for `signature_changed`) |
| `file_from` | `str | None` | Source file (for `moved`) |
| `line` | `int | None` | Line number in new file |
| `breaking` | `bool` | Whether this breaks the public API |
| `detail` | `dict | None` | Language-specific metadata |

### `Summary`

| Field | Type | Description |
|-------|------|-------------|
| `change_types` | `dict[str, int]` | Counts by category |
| `breaking_changes` | `list[SymbolChange]` | All breaking changes |
| `focus` | `list[str]` | Most important items for reviewer attention |

### `TieredSummary`

| Field | Type | Description |
|-------|------|-------------|
| `oneliner` | `str` | ~20 tokens |
| `short` | `str` | ~80 tokens |
| `detailed` | `str` | Full narrative |
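
One way an agent might use the tiers is to pick the richest summary that fits its remaining context budget. The sketch below is illustrative only: `pick_summary` and `estimate_tokens` are hypothetical helpers, and counting whitespace-separated words is a crude stand-in for a real tokenizer.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: one token per whitespace-separated word."""
    return len(text.split())


def pick_summary(tiered: dict[str, str], token_budget: int) -> str:
    """Return the most detailed tier whose estimated size fits the budget."""
    for tier in ("detailed", "short", "oneliner"):
        text = tiered[tier]
        if estimate_tokens(text) <= token_budget:
            return text
    # Nothing fits: fall back to the smallest tier rather than returning nothing.
    return tiered["oneliner"]
```

The fallback to `oneliner` is a design choice for this sketch; a stricter caller might prefer to skip the summary entirely when the budget is too small.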

### Design principles

- **Semantic change units.** Function/class level with signatures — not line numbers.
- **Breaking changes at top level.** Not buried in file details.
- **No opinions.** Structural facts only.
- **Graceful degradation.** Parse errors and unsupported languages are flagged, never crashes.
# Architecture

## Pipeline

```
git diff ──→ parse ──→ extract ──→ match ──→ classify ──→ summarize ──→ JSON
             │         │           │         │            │
             │         │           │         │            └─ tiered summaries
             │         │           │         └─ added/removed/modified/moved
             │         │           └─ name-match old↔new symbols (O(n) dict)
             │         └─ tree-sitter queries → functions, classes, methods
             └─ py-tree-sitter parses old + new file versions
```

**Typical timing:** ~200ms for a 1000-line diff.

## Modules

Each module has a single responsibility. No horizontal imports between engine modules.

| Module | Input | Output | Responsibility |
|--------|-------|--------|---------------|
| `cli.py` | CLI args | exit code + JSON/text | Click CLI entry point. Commands: `review`, `summarize`, `context` (hidden alias for review), `install-hook`. Orchestrates pipeline and determines output. |
| `git.py` | ref range | changed files + old/new content | All git subprocess calls. Nothing else touches git. |
| `engine/_types.py` | — | — | Shared type aliases, dataclasses (`Symbol`, `ParseResult`), and helpers (`compute_body_hash`). |
| `engine/parser.py` | source file | syntax tree | Tree-sitter parsing. No git logic, no matching. |
| `engine/matcher.py` | old symbols + new symbols | matched pairs | Name-based symbol matching. O(n) dict lookup. |
| `engine/classifier.py` | matched pairs | classified changes | Labels: added, removed, modified, moved, signature_changed. Sets `breaking` flag. |
| `engine/signatures.py` | old + new signatures | breaking change flags + category labels | Signature comparison. Detects parameter changes, return type changes, default value changes. |
| `engine/deps.py` | symbol names + git ref | external references | Dependency/caller detection. Uses `git grep` to pre-filter, then tree-sitter to confirm references in non-diff files. |
| `engine/summarizer.py` | classified changes | tiered text | Generates oneliner, short, detailed summaries. |
| `engine/pipeline.py` | ref range + content provider | `DiffGuardOutput` | Orchestrates parse → match → classify → summarize for all files. |
| `schema.py` | — | — | Pydantic models. The contract. |

## Language plugin system

The `languages/` package provides per-language tree-sitter support. Each language module (e.g., `languages/python/__init__.py`) exports:

| Function | Purpose |
|----------|---------|
| `get_language()` | Returns the `tree_sitter.Language` object |
| `extract_symbols(tree, source)` | Walks the parsed tree and returns `list[Symbol]` |

The top-level `languages/__init__.py` provides:

- `SUPPORTED_LANGUAGES` — set of supported language names
- `detect_language(filename)` — maps file extensions to language names
- `get_parser(language)` — returns a configured `tree_sitter.Parser`
- `get_language_module(language)` — dynamically imports the language module

`languages/_utils.py` contains shared helpers (e.g., `node_text()` for safe node text extraction).
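
As a rough picture of what `detect_language(filename)` does, here is a minimal sketch. The extension table is an assumption made for illustration; the real mapping in `languages/__init__.py` may cover more extensions or resolve them differently.

```python
from pathlib import Path
from typing import Optional

# Hypothetical extension table, not copied from DiffGuard's source.
EXTENSION_TO_LANGUAGE = {
    ".py": "python",
    ".ts": "typescript",
    ".tsx": "typescript",
    ".js": "javascript",
    ".jsx": "javascript",
    ".go": "go",
}

SUPPORTED_LANGUAGES = set(EXTENSION_TO_LANGUAGE.values())


def detect_language(filename: str) -> Optional[str]:
    """Map a file name to a language name, or None if unsupported."""
    return EXTENSION_TO_LANGUAGE.get(Path(filename).suffix.lower())
```

Returning `None` instead of raising lets the caller mark the file as `unsupported_language` and move on.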

### Supported languages

| Language | Module | Grammar |
|----------|--------|---------|
| Python | `languages/python/` | tree-sitter-python |
| TypeScript | `languages/typescript/` | tree-sitter-typescript |
| JavaScript | `languages/typescript/` (shared) | tree-sitter-javascript |
| Go | `languages/go/` | tree-sitter-go |

## Symbol extraction

DiffGuard uses tree-sitter to parse source files and walk the AST to extract:

- Function/method definitions with signatures
- Class/struct/interface definitions
- Line numbers and scope
- Body hashes for change detection

For each changed file, DiffGuard parses both the old and new versions, extracts symbols from each, then matches them by name and kind.

## Matching algorithm

1. Build a dict of old symbols keyed by `(name, kind)`
2. Build a dict of new symbols keyed by `(name, kind)`
3. Symbols in both → compare bodies and signatures; **modified** if either differs
4. Symbols only in old → **removed**
5. Symbols only in new → **added**
6. Removed symbol name appears in a different file as added → **moved**

Matching is O(n) in the number of symbols and handles the common case well. It deliberately does not attempt fuzzy rename detection, favoring accuracy over comprehensiveness: a renamed function shows up as one removal plus one addition.
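
Steps 1 to 5 can be sketched for a single file as two dict builds and two passes. The `Symbol` dataclass here is a hypothetical stand-in for the real model in `schema.py`, and `match_symbols` is an illustrative name; the cross-file move detection in step 6 is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    # Hypothetical shape; field names are assumptions for this sketch.
    name: str
    kind: str        # e.g. "function" or "class"
    signature: str
    body_hash: str


def match_symbols(old: list, new: list) -> dict:
    """Match old and new symbols by (name, kind) in linear time."""
    old_by_key = {(s.name, s.kind): s for s in old}
    new_by_key = {(s.name, s.kind): s for s in new}

    result = {"modified": [], "removed": [], "added": []}
    for key, old_sym in old_by_key.items():
        new_sym = new_by_key.get(key)
        if new_sym is None:
            result["removed"].append(old_sym)
        elif (old_sym.signature, old_sym.body_hash) != (new_sym.signature, new_sym.body_hash):
            # Present on both sides but something differs.
            result["modified"].append((old_sym, new_sym))
    for key, new_sym in new_by_key.items():
        if key not in old_by_key:
            result["added"].append(new_sym)
    return result
```

Keying on `(name, kind)` rather than name alone keeps a function and a class that share a name from being paired with each other.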

## Selective trigger

DiffGuard's core design principle: **stay silent when there's nothing useful to say.**

The `review` command checks for high-signal changes before producing output. If none are found, it exits with code 0 (silence). The logic lives in `cli.py::_has_high_signal_changes()`.

A change is **high-signal** if any of these are true:

| Trigger | What it means |
|---------|---------------|
| Signature changed | `before_signature` and `after_signature` both present — function contract changed |
| Breaking change | `breaking=True` — callers may break |
| Symbol removed | `kind` ends with `_removed` — dependents will break |
| Symbol moved | `kind == "moved"` — imports need updating |

Body-only changes (same signature, different implementation) are **not** high-signal. DiffGuard treats them as internal changes that leave the contract callers depend on intact; whether the new body behaves correctly is outside its scope.

Dependency references (`deps.py`) provide context about *who* is affected, but don't independently trigger output. A moved function with 12 importers is high-signal because of the move, not because of the importers.
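
The trigger table reduces to a single predicate. The sketch below is an approximation, not the code in `cli.py`: the `SymbolChange` dataclass is a hypothetical subset of the schema, and it assumes `before_signature` and `after_signature` are populated only when the signature actually changed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SymbolChange:
    # Hypothetical subset of the schema.py model, limited to the trigger fields.
    kind: str
    breaking: bool = False
    before_signature: Optional[str] = None
    after_signature: Optional[str] = None


def is_high_signal(change: SymbolChange) -> bool:
    """True if any of the four triggers applies to this change."""
    signature_changed = (
        change.before_signature is not None and change.after_signature is not None
    )
    return (
        signature_changed
        or change.breaking
        or change.kind.endswith("_removed")
        or change.kind == "moved"
    )


def has_high_signal_changes(changes: list) -> bool:
    return any(is_high_signal(c) for c in changes)
```

Dependency references do not appear in the predicate, which matches the rule that importers add context but never trigger output on their own.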

### Signature change categories

When a signature change is detected, `signatures.py::classify_signature_change()` provides a specific category label:

| Category | Meaning |
|----------|---------|
| `PARAMETER REMOVED` | Positional or keyword-only parameter removed |
| `PARAMETER ADDED (BREAKING)` | New parameter without a default value |
| `RETURN TYPE CHANGED` | Return type annotation changed |
| `DEFAULT VALUE CHANGED` | Only difference is a changed default value on existing params |
| `BREAKING SIGNATURE CHANGE` | Other breaking change (type change, reorder, etc.) |
| `SIGNATURE CHANGED` | Non-breaking signature change |
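
One way to picture how these labels could be assigned is a chain of checks over parsed parameters. This is a sketch under assumptions: `Param` and `Signature` are hypothetical types, and the order in which the checks run is a guess rather than the documented behavior of `signatures.py`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Param:
    name: str
    default: Optional[str] = None      # source text of the default; None means required
    annotation: Optional[str] = None


@dataclass(frozen=True)
class Signature:
    params: tuple
    return_type: Optional[str] = None


def classify_signature_change(old: Signature, new: Signature) -> str:
    old_names = [p.name for p in old.params]
    new_names = [p.name for p in new.params]

    if any(name not in new_names for name in old_names):
        return "PARAMETER REMOVED"
    added = [p for p in new.params if p.name not in old_names]
    if any(p.default is None for p in added):
        return "PARAMETER ADDED (BREAKING)"
    if old.return_type != new.return_type:
        return "RETURN TYPE CHANGED"

    new_by_name = {p.name: p for p in new.params}
    pairs = [(p, new_by_name[p.name]) for p in old.params]
    same_order = old_names == [n for n in new_names if n in old_names]
    same_types = all(o.annotation == n.annotation for o, n in pairs)
    became_required = any(o.default is not None and n.default is None for o, n in pairs)
    if not same_order or not same_types or became_required:
        return "BREAKING SIGNATURE CHANGE"

    value_changed = any(
        o.default is not None and n.default is not None and o.default != n.default
        for o, n in pairs
    )
    gained_default = any(o.default is None and n.default is not None for o, n in pairs)
    if value_changed and not added and not gained_default:
        return "DEFAULT VALUE CHANGED"
    return "SIGNATURE CHANGED"
```

With a `code` parameter whose default moves from `302` to `303` and nothing else changed, the sketch returns `DEFAULT VALUE CHANGED`.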

### Change kinds in schema

The `SymbolChange.kind` field uses these values:

| Kind | Description |
|------|-------------|
| `function_added` | New function |
| `function_removed` | Function deleted |
| `function_modified` | Function body changed (signature intact) |
| `class_added` | New class |
| `class_removed` | Class deleted |
| `class_modified` | Class body changed (signature intact) |
| `signature_changed` | Function/class signature changed (check `breaking` flag) |
| `moved` | Symbol moved to a different file |

## Exit codes

### `review` command

| Code | Meaning |
|------|---------|
| 0 | No high-signal findings — silence. The agent should move on. |
| 1 | Findings present — the agent should read the output. |
| 2 | Error (invalid ref range, git failure, etc.) |
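
An agent or script wrapping `diffguard review` could branch on these codes roughly as follows. The helper names `interpret_review_exit` and `run_review` are made up for this sketch; only the `diffguard review <range>` invocation and the three exit codes come from this document.

```python
import subprocess


def interpret_review_exit(code: int) -> str:
    """Map a `review` exit code to the action a caller should take."""
    if code == 0:
        return "silent"      # nothing high-signal: move on
    if code == 1:
        return "findings"    # read the output
    if code == 2:
        return "error"       # invalid ref range, git failure, etc.
    return "unknown"


def run_review(ref_range: str) -> tuple:
    """Run the review and return (action, stdout). Requires diffguard on PATH."""
    proc = subprocess.run(
        ["diffguard", "review", ref_range],
        capture_output=True,
        text=True,
    )
    return interpret_review_exit(proc.returncode), proc.stdout
```

Treating any other code as `unknown` keeps the wrapper from mistaking an unexpected failure for silence.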

### `summarize` command

| Code | Meaning |
|------|---------|
| 0 | Success |
| 3 | No changes in diff |
| 4 | Partial — parse errors in some files |

## Dependency scanning

`deps.py::find_references()` locates callers of changed symbols in files *outside* the diff:

1. **Pre-filter with `git grep`** — textually search for symbol names across the repo (fast)
2. **Confirm with tree-sitter** — parse candidate files, walk the AST for identifier nodes matching symbol names
3. **Classify context** — each reference is labeled `"import"` or `"call"` based on parent node types

This two-stage approach avoids parsing every file in the repo while maintaining accuracy.
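
The two stages can be sketched for Python files only. To stay self-contained, this sketch swaps in the standard-library `ast` module for the confirmation step; DiffGuard itself uses tree-sitter. The function names and the `*.py` pathspec are illustrative assumptions.

```python
import ast
import subprocess


def candidate_files(symbol: str, repo: str = ".") -> list:
    """Stage 1: cheap textual pre-filter using git grep."""
    proc = subprocess.run(
        ["git", "grep", "-l", "-w", "-F", "-e", symbol, "--", "*.py"],
        cwd=repo,
        capture_output=True,
        text=True,
    )
    # git grep exits 1 when nothing matches; treat that as an empty result.
    return proc.stdout.splitlines() if proc.returncode == 0 else []


def confirm_references(source: str, symbol: str) -> list:
    """Stages 2 and 3: parse, then label each real reference as import or call."""
    refs = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            if any(alias.name.split(".")[-1] == symbol for alias in node.names):
                refs.append((node.lineno, "import"))
        elif isinstance(node, ast.Call):
            func = node.func
            name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
            if name == symbol:
                refs.append((node.lineno, "call"))
    return sorted(refs)
```

The textual stage will also match comments and strings; the parse stage is what discards them, since only import and call nodes are counted.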

## Graceful degradation

- **Unsupported language:** File included in output with `unsupported_language: true`, line-level stats only.
- **Parse error:** File included with `parse_error: true`, falls back to line-level stats.
- **Binary file:** Skipped with `binary: true`.

DiffGuard never crashes on unsupported input. It always produces valid JSON.
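
The fallback behavior can be pictured as a per-file function that always returns a JSON-serializable entry. Everything in this sketch is an assumption made for illustration: the `analyze_file` and `line_stats` names, the NUL-byte check for binary content, the suffix set, and the `lines_added`/`lines_removed` field names. Only the three flag names come from the list above.

```python
import difflib
from pathlib import Path
from typing import Callable, Optional

SUPPORTED_SUFFIXES = {".py", ".ts", ".js", ".go"}  # assumed subset


def line_stats(old: str, new: str) -> dict:
    """Line-level stats used when symbol extraction is unavailable."""
    diff = list(difflib.ndiff(old.splitlines(), new.splitlines()))
    return {
        "lines_added": sum(line.startswith("+ ") for line in diff),
        "lines_removed": sum(line.startswith("- ") for line in diff),
    }


def analyze_file(path: str, old: bytes, new: bytes,
                 extract: Optional[Callable[[str], list]] = None) -> dict:
    # Heuristic: a NUL byte marks the content as binary, so skip it.
    if b"\x00" in old or b"\x00" in new:
        return {"path": path, "binary": True}

    old_text = old.decode("utf-8", "replace")
    new_text = new.decode("utf-8", "replace")
    entry = {"path": path, **line_stats(old_text, new_text)}
    if Path(path).suffix not in SUPPORTED_SUFFIXES:
        entry["unsupported_language"] = True
        return entry
    try:
        entry["symbols"] = extract(new_text) if extract else []
    except Exception:
        # Any extraction failure degrades to line-level stats.
        entry["parse_error"] = True
    return entry
```

Because every branch returns a plain dict, the caller can serialize the result with `json.dumps` regardless of which fallback was taken.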

## Stack

- **Python** — fast enough with native tree-sitter bindings
- **py-tree-sitter** — C-speed parsing, pre-built binaries for 40+ languages
- **Pydantic v2** — schema definition and validation
- **Click** — CLI framework
- **difflib** — per-function body comparison (no GumTree, no full AST diff)

### Why not these alternatives

| Alternative | Why not |
|-------------|---------|
| GumTree | O(n³), Java dependency, killed v1 |
| Rust/TypeScript core | Premature optimization. Python + native tree-sitter is fast enough. |
| difftastic | Line-oriented JSON output, not semantic. Great visual tool, wrong abstraction. |
| ast-grep | Pattern search, not a differ. Possible future add-on. |

# Roadmap

*Last updated: 2026-02-11*

## Vision

DiffGuard is a **verification layer** for code changes. Not a reviewer (opinions) — a verifier (facts).

- **Human hook:** "Catches the bugs that pass code review"
- **Technical hook:** "Precision verification for AI-native workflows"

Works for both humans (VS Code/Cursor devs) and AI agents (Claude Code, OpenClaw, pipelines). Local-first, privacy-first, agent-native CLI. Your code never leaves your machine.

## Competitive context

| Tool | Model | Limitation |
|------|-------|------------|
| CodeRabbit | SaaS, $15-30/seat | Code leaves your machine. Reviews on their servers. |
| Aider repo-map | tree-sitter + PageRank | Locked inside Aider. Not usable standalone. |
| ast-grep | Pattern search CLI | Searches, doesn't diff. No semantic change detection. |
| semgrep | Static analysis rules | Security-focused. Not a change reviewer. |
| GitHub Copilot review | SaaS, GitHub-only | Vendor lock-in. No local option. |
| claude-code-action | GitHub Action, runs on your runner | Broad review, not precision bug detection. **Complementary to DiffGuard** — DiffGuard triages, claude-code-action reviews what matters. |

Among these, DiffGuard is the only source-available (BSL 1.1), local-first, agent-native option for change verification.

## License

BSL 1.1 — see [LICENSE](https://github.com/oste-git/diffguard/blob/main/LICENSE) for details.

## Phases

### Phase 1: Ship it (Now → 4 weeks)

**Status: current**

**Goals:**
- Ship CLI to PyPI (`pip install diffguard`)
- Integration snippets for CLAUDE.md, Cursor rules, Aider
- Launch post

**Gate criteria:**
- 50 installs in first 30 days
- 5 distinct users in 30 days

**Kill signals:**
- <20 installs after 30 days with active promotion

### Phase 2: CI integration (Weeks 5–10)

**Status: planned**

**Goals:**
- GitHub Action (`diffguard-action`)
- `--ci` mode (non-interactive, structured output)
- Team config file (`.diffguard.yml`)
- `--fail-on` severity flag for CI gates
- claude-code-action + DiffGuard integration example (show them running together)
- "Bugs AI reviewers miss" benchmark — test DiffGuard against AI-generated code, publish results
- Ensure `--format json` output has severity, confidence, location fields for agent consumption

**Gate criteria:**
- \>100 weekly active users (WAU)

**Kill signals:**
- <50 WAU despite GitHub Action availability

### Phase 3: Watch mode (Weeks 11–16)

**Status: planned**

**Goals:**
- `diffguard watch` — daemon mode, incremental review on file save
- Context hints — suggest related files/symbols for the agent
- Rust and Java language support

**Gate criteria:**
- Sustained growth in WAU
- Community requests for daemon mode

**Kill signals:**
- No organic demand for watch mode after Phase 2 traction

## Future-Proof Thesis

> As AI generates more code, the need for automated verification increases, not decreases. Human review capacity is fixed. AI-generated code volume is growing exponentially. The bottleneck shifts from "who writes the code" to "who verifies the code is correct." DiffGuard is a verification engine that works regardless of whether code was written by a human, Cursor, Claude Code, or a fully autonomous agent swarm. The less human oversight there is in the loop, the more critical precision-targeted bug detection becomes.

DiffGuard is infrastructure (model-agnostic pre-processor), not a model wrapper. CLI-first design is already positioned for the agentic future.

## Kill / continue signals

| Milestone | Signal | Action |
|-----------|--------|--------|
| Month 1 | >50 installs | Continue to Phase 2 |
| Month 3 | >100 WAU, >200 GitHub stars | Continue to Phase 3 |
| Month 6 | External contributors appearing | Project has legs — invest more |
| Month 1 | <20 installs | Reassess positioning or pivot |
| Month 3 | <50 WAU | Consider stopping active development |
