Metadata-Version: 2.4
Name: skill-mem
Version: 0.1.0
Summary: Routing, utility tracking, and evolution for Agent Skills.
License-Expression: MIT
License-File: LICENSE
Keywords: agent,embeddings,memory,routing,skills
Requires-Python: >=3.11
Requires-Dist: pyyaml>=6.0
Provides-Extra: openrouter
Requires-Dist: openrouter>=0.7.11; extra == 'openrouter'
Description-Content-Type: text/markdown

# skill-mem

Routing, utility tracking, and evolution for [Agent Skills](https://agentskills.io).

```python
from skill_mem import Library, Outcome, Revision

library = Library(".agents/skills", embed=my_embed_fn)

# Route — find the right skill for a query
matches = library.route("analyze this CSV and find outliers")
skill = matches[0].skill

# Use it however you want — system prompt, tool description, few-shot examples
response = await agent.run(query, system=skill.content)

# Record what happened — the query is stored for richer routing later
library.record(skill.name, query, Outcome(success=True, reason="found 3 outliers"))

# Over time, skills that fail get rewritten by your LLM
library.evolve(skill.name, context="choked on 500k rows", rewrite=my_rewrite_fn)
```

Skills are standard [Agent Skills](https://agentskills.io/specification) `SKILL.md` files. Claude Code, Cursor, VS Code, and [30+ other tools](https://agentskills.io) already read them. This library adds the part they don't have: routing, tracking, and learning.

## Install

```bash
uv add skill-mem
```

You bring your own embedding function. Any model works — local or API:

```python
# Local (no API key needed)
from fastembed import TextEmbedding
model = TextEmbedding("BAAI/bge-small-en-v1.5")
def embed(text: str) -> list[float]:
    return list(next(model.embed([text])))

# Or OpenAI
import openai
def embed(text: str) -> list[float]:
    resp = openai.embeddings.create(input=text, model="text-embedding-3-small")
    return resp.data[0].embedding
```
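Routing is, conceptually, a nearest-neighbor search over embedded skill descriptions. A minimal stdlib-only sketch of the idea — illustrative, not the library's actual implementation (`cosine`, `route`, and the toy 2-d vectors are all assumptions for demonstration):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def route(query_vec: list[float], index: dict[str, list[float]], k: int = 3) -> list[str]:
    """Rank skill names by similarity of their cached description vectors to the query."""
    ranked = sorted(index, key=lambda name: cosine(query_vec, index[name]), reverse=True)
    return ranked[:k]

# Toy 2-d index; real vectors come from your embed function.
index = {"csv-analysis": [0.9, 0.1], "web-search": [0.1, 0.9]}
route([0.8, 0.2], index)  # → ["csv-analysis", "web-search"]
```

This is why the description field matters so much: it is the only text standing between a query and the right skill.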

## How it works

### Skills on disk

```
.agents/skills/
  csv-analysis/
    SKILL.md              ← standard Agent Skills file
  web-search/
    SKILL.md
  .skill-meta/            ← added by skill-mem (gitignored)
    embeddings.json       ← cached vectors for routing
    attempts.json         ← full attempt log (query, success, reason)
    versions.json         ← version tracking
    history/              ← full archived SKILL.md before each evolution
```

The `SKILL.md` files are portable — any Agent Skills-compatible tool reads them. The `.skill-meta/` sidecar stores the intelligence layer. Extra frontmatter fields (`license`, `metadata`, `compatibility`, etc.) are preserved through evolution.
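The attempt log alone is enough to derive a utility score. A sketch of how a success rate could fall out of `attempts.json` — the field names and JSON shape here are assumptions, not the library's actual schema:

```python
import json

# Hypothetical attempts.json shape — the real schema may differ.
attempts_json = """
{
  "csv-analysis": [
    {"query": "find outliers", "success": true, "reason": "found 3 outliers"},
    {"query": "summarize sales.csv", "success": true, "reason": "clean summary"},
    {"query": "analyze 500k row file", "success": false, "reason": "choked on size"}
  ]
}
"""

def utility(name: str, log: dict) -> float:
    """Success rate over all recorded attempts for a skill (0.0–1.0)."""
    entries = log.get(name, [])
    if not entries:
        return 0.0
    return sum(e["success"] for e in entries) / len(entries)

log = json.loads(attempts_json)
utility("csv-analysis", log)  # 2 of 3 attempts succeeded → ~0.67
```

Keeping the full log (rather than just a counter) is what lets successful queries feed back into routing later.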

### A `SKILL.md` file

```yaml
---
name: csv-analysis
description: Analyze, summarize, or query CSV files and tabular data.
---

Load the file with pandas. Use df.describe() for numeric columns.
Check nulls with df.isnull().sum(). For correlations, use df.corr()
and call out anything above 0.7.

For large files (>100k rows), sample with df.sample(10000) first.
```

The `description` field is what gets embedded for routing. The body is `skill.content` — the instructions your agent receives.
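Splitting frontmatter from body is a small amount of work with PyYAML (the project's one hard dependency). A sketch of the parse, assuming the standard `---` delimiters — `parse_skill` is an illustrative helper, not the library's API:

```python
import yaml

def parse_skill(text: str) -> tuple[dict, str]:
    """Split a SKILL.md into (frontmatter dict, instruction body)."""
    _, frontmatter, body = text.split("---", 2)
    return yaml.safe_load(frontmatter), body.strip()

raw = """---
name: csv-analysis
description: Analyze, summarize, or query CSV files and tabular data.
---

Load the file with pandas. Use df.describe() for numeric columns.
"""

meta, content = parse_skill(raw)
# meta["description"] is what gets embedded; content is what the agent receives.
```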

## API

```python
from skill_mem import Library, Skill, Match, Outcome, Revision, Stats

library = Library(path, embed)
```

### Read

```python
library.route(query, k=3) -> list[Match]       # embedding search, top-k
library.get(name) -> Skill                     # by name
library.all() -> list[Skill]                   # everything
library.utility(name) -> float                 # success rate (0.0–1.0)
library.stats(name) -> Stats                   # utility, attempts, recent log
```

### Write

```python
library.add(name, description, content) -> Skill
library.record(name, query, Outcome(success, reason))
library.evolve(name, context, rewrite=fn, validate=fn) -> Skill
library.discover(query, context, create=fn) -> Skill
```

### The loop

This is the core cycle from the [Memento-Skills paper](https://arxiv.org/abs/2603.18743):

```python
matches = library.route(query)
skill = matches[0].skill if matches else None

result = await agent.run(query, skill=skill)
outcome = judge(query, result)

if skill:
    library.record(skill.name, query, outcome)
    if not outcome.success:
        if library.utility(skill.name) < 0.4:
            library.discover(query, outcome.reason, create=my_create_fn)
        else:
            library.evolve(skill.name, outcome.reason, rewrite=my_rewrite_fn)
elif outcome.success:
    library.discover(query, result, create=my_create_fn)
```

Route. Execute. Judge. Evolve. Every cycle, the library gets better.

## Evolve and discover

`evolve` and `discover` take your functions — the library handles versioning, archival, and re-indexing.

```python
def my_rewrite(skill: Skill, context: str) -> Revision:
    """Called by library.evolve(). Returns Revision(description, content)."""
    result = llm(f"Rewrite this skill:\n{skill.content}\n\nFailure: {context}")
    return Revision(description=result.description, content=result.content)

def my_create(query: str, context: str) -> tuple[str, str, str]:
    """Called by library.discover(). Returns (name, description, content)."""
    result = llm(f"Create a reusable skill from this interaction:\n{query}\n{context}")
    return (result.name, result.description, result.content)
```

Evolution can update both `description` and `content`, so a skill that learns to handle parquet files also becomes easier to route to. An optional `validate` hook gates rewrites — if it returns `False`, the skill rolls back:

```python
library.evolve("csv", context, rewrite=my_rewrite, validate=lambda old, new: "pandas" in new.content)
```

When a skill evolves, the full old `SKILL.md` is archived to `.skill-meta/history/<name>/v<N>.md`. Successful queries are folded into routing embeddings so skills become easier to find as they accumulate evidence.
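Folding queries into a routing vector can be as simple as averaging. An illustrative sketch — the library's actual weighting scheme is not specified here, so treat `fold` and the equal-weight mean as assumptions:

```python
def fold(desc_vec: list[float], query_vecs: list[list[float]]) -> list[float]:
    """Blend a description vector with successful-query vectors.

    Equal-weight mean, purely for illustration.
    """
    vecs = [desc_vec, *query_vecs]
    return [sum(dims) / len(vecs) for dims in zip(*vecs)]

fold([1.0, 0.0], [[0.0, 1.0]])  # → [0.5, 0.5]
```

The effect is that a skill's routing vector drifts toward the queries it actually solved, which is why evidence accumulation makes skills easier to find.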

## Examples

```bash
uv run python examples/basic.py       # add skills, route, record outcomes
uv run python examples/evolving.py    # evolution loop with validation gate
```

## Paper

Inspired by [Memento-Skills: Let Agents Design Agents](https://arxiv.org/abs/2603.18743). This extracts the skill memory primitive — routing, utility tracking, and evolution — as a standalone library, and stores skills in the [Agent Skills](https://agentskills.io) format for ecosystem compatibility.
