# supamem

> Qdrant-backed dual-memory tooling for AI coding agents (Claude Code, Cursor, OpenCode).
> Provides a CLI to bootstrap, index, run an MCP server, install per-client hooks, and run
> retrieval evals — all backed by a locked tuned-hybrid (BM25 + MiniLM) Qdrant pipeline.
> Extracted from the SoftChat (https://app.softchat.ru) production memory stack so any
> team can run on the same battle-tested foundation.

`supamem` packages a hybrid sparse+dense semantic memory layer (Qdrant), a Model Context
Protocol server, and per-client session/edit hooks as a single Python distribution. Once
installed, AI coding assistants gain persistent semantic memory across projects.

## Core docs

- [README](README.md): Hero, quickstart, prerequisites, install matrix, CLI reference, client wiring
- [MIGRATION](MIGRATION.md): Migrating from an in-tree `dev_memory` setup to supamem
- [LICENSE](LICENSE): MIT

## Distribution

- [PyPI](https://pypi.org/project/supamem/): Released via Trusted Publisher OIDC; `pip install supamem` or `uv tool install supamem`. Current version: v0.1.5.
- [CHANGELOG](CHANGELOG.md): Per-version release notes (v0.1.0 initial, v0.1.1 update-check + AGENTS.md, v0.1.2 project-tunable regress baselines, v0.1.3 dual_memory_write + qdrant aliases, v0.1.4 SessionStart banner + supamem live dashboard)

## Translations

- [README (English, canonical)](README.md): The version PyPI renders
- [README zh-CN](README.zh-CN.md): Simplified Chinese
- [README es](README.es.md): Spanish
- [README ja](README.ja.md): Japanese
- [README ru](README.ru.md): Russian (the project author's native language)

## CLI commands

- [supamem init](README.md#cli-surface): Greenfield bootstrap — probes Qdrant, creates collection, writes `.supamem/config.toml`
- [supamem install](README.md#wiring-into-your-client): Patch a client config (claude-code, cursor, opencode) — atomic with backup. `--scope project` (default, per-workspace `.mcp.json` / `.cursor/mcp.json`) or `--scope user` (legacy global). `--enforce-search` (claude-code only) registers the opt-in PreToolUse edit-gate.
- [supamem repair](README.md#cli-surface): Migrate from legacy global install to per-workspace files. Strips stale globals, re-installs at project scope. Idempotent. (v0.2.0+)
- [supamem index](README.md#cli-surface): Embed dev memories into Qdrant using the locked tuned-hybrid pipeline (D-25)
- [supamem mcp-server](README.md#cli-surface): Run the MCP server over stdio (default) or HTTP
- [supamem hook](README.md#cli-surface): Per-client session/edit hooks called by the client itself
- [supamem doctor](README.md#cli-surface): Probe Qdrant, resolve config chain, report version drift, surface update-check status
- [supamem stats](README.md#cli-surface): Welford schema-v2 usage counters
- [supamem live](README.md#-see-it-work--supamem-live): Real-time terminal dashboard tailing the audit JSONL — visibility into PreToolUse hook injections (v0.1.4+)
- [supamem migrate](README.md#cli-surface): Brownfield migration from an existing `dev_memory` collection
- [supamem eval](README.md#cli-surface): Run the regression harness; project-tunable baselines via `[supamem.eval]` config (v0.1.2+)
- [supamem uninstall](README.md#cli-surface): Reverse `supamem install` cleanly
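
The `supamem stats` entry above refers to Welford counters. As a reminder of the technique only, here is a minimal sketch of Welford's online mean/variance update. The class and field names are hypothetical and are not taken from supamem's schema-v2 layout.

```python
from dataclasses import dataclass


@dataclass
class WelfordCounter:
    """Hypothetical running-statistics counter (not supamem's real schema)."""

    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, value: float) -> None:
        """Fold one observation into the running mean and m2 in O(1)."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        # Uses the *updated* mean, which is what keeps the recurrence stable.
        self.m2 += delta * (value - self.mean)

    @property
    def variance(self) -> float:
        """Sample variance; reported as 0.0 for fewer than two observations."""
        return self.m2 / (self.count - 1) if self.count > 1 else 0.0
```

The appeal for usage counters is that only three numbers need to be persisted per metric, and each new sample updates them without re-reading earlier samples.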

## MCP tools (v0.1.3)

- `dual_memory_search`: Hybrid (BM25+dense, RRF) retrieval over the project's Qdrant collection. Returns the top-k chunks with latency and a summary. Response shape (v0.2.0+): each `Chunk` carries `text` (full intact payload) and `preview` (display-only excerpt capped at `mcp.caps.max_preview_chars`); top-level `SearchResult.clamped_to` is set when the server clamped requested `top_k`
- `dual_memory_write`: Persist agent-authored memory — writes Markdown to `<project>/.claude/insights/_agent/<slug>.md` with YAML frontmatter, immediately upserts into Qdrant (wait=True), idempotent on topic via UUIDv5
- `qdrant_find` (alias of dual_memory_search): Backward-compat for users coming from upstream `mcp-server-qdrant`. Inherits the same caps and response shape as the canonical tool (D-17 alias parity)
- `qdrant_store` (alias of dual_memory_write): Same compat shim. Disable both aliases with `SUPAMEM_QDRANT_ALIASES=0`
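
The "idempotent on topic via UUIDv5" behaviour of `dual_memory_write` can be illustrated with the standard library. This is a sketch under assumptions: the namespace UUID, the `project:topic` key format, and the normalization are hypothetical, since the source does not say how supamem builds the key.

```python
import uuid

# Assumption: any fixed namespace works for the illustration; supamem's real
# namespace is not documented here.
HYPOTHETICAL_NAMESPACE = uuid.uuid5(
    uuid.NAMESPACE_URL, "https://example.invalid/supamem-sketch"
)


def point_id_for_topic(project: str, topic: str) -> str:
    """Derive a deterministic point ID from a topic.

    Because UUIDv5 is a hash of (namespace, name), writing the same topic
    twice yields the same ID, so an upsert overwrites instead of duplicating.
    """
    normalized = f"{project}:{topic.strip().lower()}"  # hypothetical key format
    return str(uuid.uuid5(HYPOTHETICAL_NAMESPACE, normalized))
```

With an ID derived this way, a second write for the same topic replaces the earlier point rather than adding a near-duplicate.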

## MCP response caps (v0.2.0+)

Server-side hard caps on every retrieval response. Configured under the `[supamem.mcp.caps]` TOML table; surfaced in `supamem doctor` with config-source provenance.

- `mcp.caps.max_top_k` (default: 25) — server silently clamps requested `top_k` to this value; `SearchResult.clamped_to` is populated when clamping fires so callers can detect it
- `mcp.caps.max_query_chars` (default: 250) — Pydantic `Field(max_length=...)` baked into the tool schema at registration time; queries longer than the cap are rejected with a structured MCP validation error (no silent truncation, no stdout pollution)
- `mcp.caps.max_preview_chars` (default: 200) — display preview cap on each `Chunk.preview`; the full canonical payload in `Chunk.text` is never truncated
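
To make the three caps concrete, here is a minimal sketch of the behaviour described above, using only the documented defaults. The function and class names are hypothetical, and the sketch raises a plain `ValueError` where the real server relies on Pydantic schema validation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Caps:
    """Defaults as listed above; names mirror the config keys."""

    max_top_k: int = 25
    max_query_chars: int = 250
    max_preview_chars: int = 200


def clamp_top_k(requested: int, caps: Caps) -> tuple[int, Optional[int]]:
    """Return (effective_top_k, clamped_to); clamped_to is None if no clamp fired."""
    if requested > caps.max_top_k:
        return caps.max_top_k, caps.max_top_k
    return requested, None


def validate_query(query: str, caps: Caps) -> str:
    """Reject over-long queries instead of truncating them."""
    if len(query) > caps.max_query_chars:
        raise ValueError(
            f"query has {len(query)} chars, limit is {caps.max_query_chars}"
        )
    return query


def make_preview(text: str, caps: Caps) -> str:
    """Build the display excerpt; the caller keeps the full text untouched."""
    return text[: caps.max_preview_chars]
```

A caller that receives a non-`None` `clamped_to` knows it got fewer results than it asked for and can page or narrow the query accordingly.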

## Visibility surfaces (v0.1.4+)

- `supamem live` CLI: Rich-Live terminal dashboard, real-time tail of the audit JSONL with rotation/resize/Ctrl-C handling and pipe-safe plain-JSONL fallback when stdout isn't a TTY
- `supamem hook session-start`: cross-client SessionStart banner injected via `additionalContext` (Claude Code) + `additional_context` (Cursor/OpenCode forks). Auto-detects calling client from `CLAUDECODE`/`OPENCODE`/`CURSOR_AGENT` env vars. Format: `🧠 supamem v<x.y.z> · <collection> · <N> chunks · audit <path>`. Fail-soft per hook discipline — never blocks session start

## MCP project-root resolution (v0.2.0+)

stdio MCP servers are often launched by hosts (Cursor, IDE wrappers) from a cwd that is NOT the workspace. When that happens, supamem silently falls back to the default collection (`dev_memory_tuned_hybrid`), and callers that query the project's actual collection get Qdrant 404s.

- `SUPAMEM_PROJECT_ROOT` (env var) — preferred, explicit. Auto-injected by `supamem install --scope project` into `<repo>/.mcp.json` (Claude Code) and `<repo>/.cursor/mcp.json` (Cursor) so the subprocess locates `.supamem/config.toml` regardless of cwd
- Parent-walk fallback — when the env var is unset, supamem walks parents from `Path.cwd()` looking for `.supamem/config.toml` or `pyproject.toml [tool.supamem]`. Stops at filesystem root or `$HOME` to avoid scanning above the user's home
- Stderr fallthrough warning — when neither the env var nor the parent-walk locates a project marker AND the resolved collection is still the shipped default, `supamem mcp-server --transport stdio` emits a one-line stderr warning. It reports the cwd inspected, whether the env var is present (never its value), and the fix command. Stdout stays JSON-RPC clean
- Verify with `supamem doctor` from the repo root: the resolved collection must match what the MCP client returns from `dual_memory_search`

## Multi-project install + agent-discipline hooks (v0.2.0+)

- **Per-workspace install is the default** as of v0.2.0. `supamem install --client claude-code` writes to `<repo>/.mcp.json` (Anthropic project-scope MCP file, takes precedence over user-scope per docs). `supamem install --client cursor` writes to `<repo>/.cursor/mcp.json` (Cursor per-workspace, project-level wins on conflict). Use `--scope user` to keep legacy global writes (last install wins on multi-project machines).
- **`supamem repair`** is the migration verb for users on legacy global installs. It strips supamem from BOTH project AND user scopes (defensive uninstall), then re-installs at project scope from the current cwd. Idempotent. Auto-detects clients when `--client` is omitted. Forwards `--enforce-search`.
- **Claude Code edit-gate hook** (`supamem hook claude-code-gate`, opt-in via `supamem install --enforce-search`). Registers a PreToolUse `Edit|Write|MultiEdit` matcher that DENIES the tool call when no `mcp__supamem__dual_memory_search` (or `qdrant_find` alias) is logged in the session transcript since the last user turn (strategy A — strict per-turn). Reverse-scans the transcript JSONL with a 256 KB byte cap; emits Anthropic's `permissionDecision: deny` JSON contract on stdout. Override per-session with `SUPAMEM_GATE_DISABLE=1`.
- **Cursor `beforeSubmitPrompt` advisory hook** (`supamem hook cursor-advisory`). Cursor 1.7's hooks API has no fail-closed pre-edit event, so this is advisory-only: when the user's prompt looks edit-bound (regex over `fix|refactor|rename|implement|add|...`), emit `{"continue": true, "permission": "allow", "agentMessage": "..."}` reminding the agent to call `dual_memory_search` first. Override with `SUPAMEM_ADVISORY_DISABLE=1`. Auto-installed by the Cursor installer alongside the existing sessionStart snapshot.
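
The Cursor advisory hook described above might look roughly like this. It is a sketch under assumptions: the regex contains only the five verbs named in the text (the real pattern is longer and elided in the source), the reminder wording is invented, and returning a bare `{"continue": True}` for non-matching prompts is a guess.

```python
import re
from typing import Mapping

# Only the verbs spelled out in the text; the real list is longer.
EDIT_INTENT = re.compile(r"\b(fix|refactor|rename|implement|add)\b", re.IGNORECASE)

# Hypothetical wording; the actual agentMessage is not given in the source.
REMINDER = "Call dual_memory_search before editing files."


def advisory_response(prompt: str, env: Mapping[str, str]) -> dict:
    """Build the hook's JSON payload. Never blocks: 'continue' is always True."""
    response: dict = {"continue": True}
    if env.get("SUPAMEM_ADVISORY_DISABLE") == "1":
        return response
    if EDIT_INTENT.search(prompt):
        response["permission"] = "allow"
        response["agentMessage"] = REMINDER
    return response
```

A real hook would read the prompt from the payload Cursor passes in and print `json.dumps(advisory_response(...))` to stdout; that wiring is omitted because the payload shape is not described here.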

## SessionStart banner (enriched in v0.2.0)

Format: `🧠 supamem ✓ v0.2.0 · <collection> · <N> chunks · audit <path>` (additional `· update v0.X.Y available` segment when `update_check` cache reports a newer release).

- Health flag — single character right after `supamem`: `✓` healthy / `⚠` qdrant unreachable OR resolved collection is still the shipped default (legacy global-install / wrong-cwd failure mode)
- Update hint — cache-only read of `update_check.json`; never blocks session-open on network. Healing is NEVER automatic — the banner only signals; run `supamem repair` to act
- Suppress entirely with `SUPAMEM_BANNER_DISABLE=1`
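
The banner format and health-flag rule above can be expressed as a small pure function. This is a sketch: the function and parameter names are hypothetical, and it assumes the two unhealthy conditions are simply OR-ed as the bullet describes.

```python
from typing import Optional

DEFAULT_COLLECTION = "dev_memory_tuned_hybrid"  # shipped default named in this document


def render_banner(
    version: str,
    collection: str,
    chunks: int,
    audit_path: str,
    qdrant_ok: bool,
    update_to: Optional[str] = None,
) -> str:
    """Render the one-line SessionStart banner."""
    # Unhealthy if Qdrant is unreachable OR we fell back to the default collection.
    healthy = qdrant_ok and collection != DEFAULT_COLLECTION
    flag = "✓" if healthy else "⚠"
    banner = (
        f"🧠 supamem {flag} v{version} · {collection} · "
        f"{chunks} chunks · audit {audit_path}"
    )
    if update_to:
        banner += f" · update v{update_to} available"
    return banner
```

Keeping the rendering free of I/O means the hook can gather its inputs fail-soft and still produce a banner when some of them are missing.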

## Update-check (v0.1.1+)

- Daemon-thread GitHub Releases probe; 24h TTL cache at `platformdirs.user_cache_dir("supamem")/update_check.json`
- Stderr footer on the next invocation when a newer release is available; never blocks
- Suppress with `SUPAMEM_NO_UPDATE_CHECK=1`, `NO_UPDATE_NOTIFIER=1`, or `CI=1`
- Visible in `supamem doctor` (current vs cached-latest, last-check timestamp, suppression env)
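
The suppression variables and the 24h TTL above can be sketched as a cache-only read. The JSON field names (`checked_at`, `latest`) are hypothetical, since the cache schema is not documented here, and the sketch treats only the literal value `1` as "set".

```python
import json
import time
from pathlib import Path
from typing import Mapping, Optional

TTL_SECONDS = 24 * 60 * 60
SUPPRESS_VARS = ("SUPAMEM_NO_UPDATE_CHECK", "NO_UPDATE_NOTIFIER", "CI")


def is_suppressed(env: Mapping[str, str]) -> bool:
    """True if any of the documented opt-out variables is set to '1'."""
    return any(env.get(name) == "1" for name in SUPPRESS_VARS)


def read_cached_latest(cache_file: Path, now: Optional[float] = None) -> Optional[str]:
    """Return the cached latest version if the cache is fresh, else None.

    Never touches the network; any read or parse problem yields None.
    """
    now = time.time() if now is None else now
    try:
        payload = json.loads(cache_file.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return None
    if not isinstance(payload, dict):
        return None
    checked_at = payload.get("checked_at")  # hypothetical field: epoch seconds
    latest = payload.get("latest")          # hypothetical field: version string
    if not isinstance(checked_at, (int, float)) or not isinstance(latest, str):
        return None
    return latest if now - checked_at < TTL_SECONDS else None
```

Reading only from the cache is what lets the footer and banner stay non-blocking; refreshing the file is left to the background probe.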

## Architecture

- [How it works](README.md#how-it-works): MCP server topology, hybrid retrieval, hook flow
- [Hybrid retrieval](README.md#features): Tuned BM25 + MiniLM fusion, locked schema D-25
- [Markdown chunker](README.md#features): Header-aware T-1 chunker, 200-token target / 250 soft max
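
For readers unfamiliar with the RRF fusion mentioned for hybrid retrieval, here is a minimal sketch of reciprocal rank fusion over a sparse and a dense ranking. The constant `k = 60` is the value commonly used in the literature, not supamem's tuned setting, and the function name is hypothetical; in supamem the fusion is part of the Qdrant-backed pipeline rather than client code like this.

```python
from collections import defaultdict
from typing import Iterable, Sequence


def rrf_fuse(rankings: Iterable[Sequence[str]], k: int = 60) -> list[str]:
    """Fuse ranked ID lists: each list contributes 1 / (k + rank) per document."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for deterministic output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


bm25_hits = ["a", "b", "c"]    # illustrative sparse ranking
dense_hits = ["b", "c", "d"]   # illustrative dense ranking
fused = rrf_fuse([bm25_hits, dense_hits])
```

Because RRF works on ranks rather than raw scores, BM25 and embedding similarities never need to be put on a common scale.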

## Prerequisites

- [Python 3.12+](README.md#prerequisites): macOS / Linux / Windows install commands
- [Qdrant 1.10+](README.md#prerequisites): Docker, docker compose, or Qdrant Cloud
- [MCP-compatible client](README.md#prerequisites): Claude Code, Cursor, or OpenCode

## Optional

- [Contributing](README.md#contributing): Local dev setup with uv + pytest + ruff
- [SoftChat](https://app.softchat.ru): Russian-language AI chat platform — origin project
- [SoftSkillz](https://softskillz.ai): AI-first product engineering team
- [Qdrant docs](https://qdrant.tech/documentation/): Vector database upstream
- [Model Context Protocol](https://modelcontextprotocol.io/): MCP spec
- [uv](https://docs.astral.sh/uv/): Recommended Python package manager
