# arandu — Complete Documentation

> This file contains the complete documentation for the arandu SDK, a long-term memory system for AI agents. Generated automatically from the source docs.

---

# arandu

**Long-term memory for AI agents.** Extract facts from conversations, resolve entities, reconcile knowledge over time, and retrieve relevant context — all backed by PostgreSQL and pgvector.

> *The name "Arandu" comes from the Guarani word meaning "wisdom acquired through experience" — literally "listening to time." Just as the Guarani concept describes knowledge built through lived experience, Arandu gives your AI agent the ability to accumulate, consolidate, and recall knowledge over time.*

---

## Why arandu?

Most AI agents are stateless. They forget everything between sessions. `arandu` gives your agent a persistent, structured memory that grows smarter over time:

- **Automatic fact extraction** — The write pipeline uses LLMs to extract entities, facts, and relationships from natural language.
- **Entity resolution** — Recognizes that "my wife Ana", "Ana", and "her" all refer to the same person, using a 3-phase resolver (exact → fuzzy → LLM).
- **Knowledge reconciliation** — Decides whether new information should ADD, UPDATE, or DELETE existing facts. No duplicates, no stale data.
- **Multi-signal retrieval** — Combines semantic search (pgvector), keyword matching, graph traversal, and recency scoring to find the most relevant facts.
- **Background maintenance** — Clustering, consolidation, and importance scoring keep memory organized and fresh — like how your brain consolidates during sleep.
- **Provider-agnostic** — Bring your own LLM and embedding provider via simple Python protocols. OpenAI provider included.

## Installation

```bash
pip install arandu
```

With OpenAI support (recommended):

```bash
pip install "arandu[openai]"
```

### Requirements

- Python 3.11+
- PostgreSQL with the [pgvector](https://github.com/pgvector/pgvector) extension

## Quick Start

```python
import asyncio
from arandu import MemoryClient
from arandu.providers.openai import OpenAIProvider

async def main():
    # 1. Set up providers
    provider = OpenAIProvider(api_key="sk-...")

    # 2. Create client
    memory = MemoryClient(
        database_url="postgresql+psycopg://user:pass@localhost/mydb",
        llm=provider,
        embeddings=provider,
    )

    # 3. Initialize tables (idempotent)
    await memory.initialize()

    # 4. Write — extracts facts automatically
    result = await memory.write(
        user_id="user_123",
        message="I live in São Paulo and work at Acme Corp as a backend engineer.",
    )
    print(f"Added {len(result.facts_added)} facts, resolved {len(result.entities_resolved)} entities")

    # 5. Retrieve — finds relevant context
    context = await memory.retrieve(
        user_id="user_123",
        query="where does the user live and work?",
    )
    print(context.context)

    # 6. Cleanup
    await memory.close()

asyncio.run(main())
```

## How It Works

### Write Pipeline

```
Message → Extract (LLM) → Resolve Entities → Reconcile → Upsert
```

Every message goes through four stages: the LLM extracts structured facts, entities are resolved to canonical records, new facts are reconciled against existing knowledge, and decisions (ADD/UPDATE/NOOP/DELETE) are executed.

→ [Learn more about the Write Pipeline](concepts/write-pipeline.md)

### Read Pipeline

```
Query → Plan (LLM) → Retrieve (semantic + keyword + graph) → Rerank → Format
```

Queries go through an LLM planner that decides the retrieval strategy; three retrieval signals then run in parallel, and their merged results are optionally reranked and compressed into a context string.

→ [Learn more about the Read Pipeline](concepts/read-pipeline.md)

### Background Jobs

```
Clustering → Consolidation → Importance Scoring → Summary Refresh
```

Periodic background jobs keep memory organized and fresh — like sleep-time processing in the brain.

→ [Learn more about Background Jobs](concepts/background-jobs.md)

## Architecture

`arandu` is designed around three principles:

1. **Protocol-based DI** — LLM and embedding providers are injected via `typing.Protocol`. No vendor lock-in.
2. **Fail-safe by default** — Every LLM call has timeouts and fallbacks. A failed extraction still logs the event. A failed reconciliation defaults to ADD.
3. **Composition over inheritance** — Small, focused modules composed into pipelines. No deep class hierarchies.

→ [Learn more about the Design Philosophy](concepts/design-philosophy.md)

## Next Steps

<div class="grid cards" markdown>

- :material-rocket-launch:{ .lg .middle } **Getting Started**

    ---

    Full setup guide: PostgreSQL, pgvector, first write and retrieve.

    [:octicons-arrow-right-24: Getting Started](getting-started.md)

- :material-brain:{ .lg .middle } **Concepts**

    ---

    Deep dive into how each pipeline works and why.

    [:octicons-arrow-right-24: Write Pipeline](concepts/write-pipeline.md)

</div>

---

# Getting Started

This guide walks you through setting up `arandu` from scratch: installing dependencies, configuring PostgreSQL with pgvector, writing your first facts, and retrieving them.

## Prerequisites

- **Python 3.11+**
- **PostgreSQL 15+** with the [pgvector](https://github.com/pgvector/pgvector) extension installed
- An **OpenAI API key** (or any LLM/embedding provider — see [Custom Providers](#custom-providers))

## Step 1: Install

```bash
pip install "arandu[openai]"
```

This installs the core SDK plus the bundled OpenAI provider. If you're using a different LLM provider, install just the core:

```bash
pip install arandu
```

## Step 2: Set Up PostgreSQL + pgvector

`arandu` stores facts, entities, and embeddings in PostgreSQL using the pgvector extension for vector similarity search.

### Option A: Docker (recommended for development)

```bash
docker run -d \
  --name memory-db \
  -e POSTGRES_USER=memory \
  -e POSTGRES_PASSWORD=memory \
  -e POSTGRES_DB=memory \
  -p 5432:5432 \
  pgvector/pgvector:pg16
```

The `pgvector/pgvector` image comes with the extension pre-installed.

### Option B: Existing PostgreSQL

If you already have PostgreSQL running, enable the pgvector extension:

```sql
CREATE EXTENSION IF NOT EXISTS vector;
```

> **pgvector installation:** If you don't have pgvector installed on your server, follow the
> [pgvector installation guide](https://github.com/pgvector/pgvector#installation).

## Step 3: Initialize the Client

```python
import asyncio
from arandu import MemoryClient, MemoryConfig
from arandu.providers.openai import OpenAIProvider

async def main():
    # Create the LLM + embedding provider
    provider = OpenAIProvider(api_key="sk-...")

    # Create the memory client
    memory = MemoryClient(
        database_url="postgresql+psycopg://memory:memory@localhost:5432/memory",
        llm=provider,
        embeddings=provider,
    )

    # Create tables (safe to call multiple times)
    await memory.initialize()

    print("Memory initialized!")
    await memory.close()

asyncio.run(main())
```

`initialize()` creates all required tables and indexes (including pgvector HNSW indexes). It's idempotent — safe to call on every startup.

## Step 4: Write Your First Facts

The `write()` method takes a natural language message and automatically:

1. Extracts entities, facts, and relationships using an LLM
2. Resolves entities to canonical records (deduplication)
3. Reconciles new facts against existing knowledge
4. Upserts the results into the database

```python
async def write_example(memory: MemoryClient):
    # First message
    result = await memory.write(
        user_id="user_123",
        message="My name is Rafael and I live in São Paulo. I work at Acme Corp as a backend engineer.",
    )
    print(f"Facts added: {len(result.facts_added)}")
    print(f"Entities resolved: {len(result.entities_resolved)}")
    print(f"Duration: {result.duration_ms:.0f}ms")

    # Second message — the system recognizes "Rafael" and updates knowledge
    result = await memory.write(
        user_id="user_123",
        message="I just moved to Rio de Janeiro. Still working at Acme though.",
    )
    print(f"Facts added: {len(result.facts_added)}")
    print(f"Facts updated: {len(result.facts_updated)}")  # "lives in São Paulo" → "lives in Rio"
```

### Understanding WriteResult

The `WriteResult` object tells you exactly what happened:

| Field | Type | Description |
|-------|------|-------------|
| `event_id` | `str` | Unique ID for this write event |
| `facts_added` | `list` | New facts created (ADD decisions) |
| `facts_updated` | `list` | Existing facts superseded (UPDATE decisions) |
| `facts_unchanged` | `int` | Facts confirmed but not changed (NOOP decisions) |
| `facts_deleted` | `int` | Facts retracted (DELETE decisions) |
| `entities_resolved` | `list` | Entities identified and resolved |
| `duration_ms` | `float` | Total pipeline duration |

## Step 5: Retrieve Context

The `retrieve()` method finds facts relevant to a query using multiple signals:

```python
async def retrieve_example(memory: MemoryClient):
    result = await memory.retrieve(
        user_id="user_123",
        query="where does Rafael live and what does he do?",
    )

    # Pre-formatted context string (ready to inject into an LLM prompt)
    print(result.context)

    # Individual scored facts
    for fact in result.facts:
        print(f"  [{fact.score:.2f}] {fact.entity_name}: {fact.value}")

    print(f"Total candidates evaluated: {result.total_candidates}")
    print(f"Duration: {result.duration_ms:.0f}ms")
```

### Understanding RetrieveResult

| Field | Type | Description |
|-------|------|-------------|
| `facts` | `list[ScoredFact]` | Ranked facts with scores |
| `context` | `str` | Pre-formatted context string for LLM prompts |
| `total_candidates` | `int` | Total facts evaluated before ranking |
| `duration_ms` | `float` | Total pipeline duration |

Each `ScoredFact` contains:

| Field | Type | Description |
|-------|------|-------------|
| `fact_id` | `str` | Unique fact identifier |
| `entity_name` | `str` | Human-readable entity name |
| `attribute_key` | `str` | Fact category/attribute |
| `value` | `str` | The fact content |
| `score` | `float` | Combined relevance score (0-1) |
| `scores` | `dict` | Breakdown by signal (semantic, recency, etc.) |

## Step 6: Configure (Optional)

Every aspect of the pipeline is configurable via `MemoryConfig`:

```python
from arandu import MemoryConfig

config = MemoryConfig(
    # Use single-pass extraction for simpler messages
    extraction_mode="single_pass",

    # Tune retrieval
    topk_facts=30,
    min_similarity=0.25,
    enable_reranker=True,

    # Adjust score weights
    score_weights={
        "semantic": 0.60,
        "recency": 0.25,
        "importance": 0.15,
    },

    # Set timezone for recency calculations
    timezone="America/Sao_Paulo",
)

memory = MemoryClient(
    database_url="postgresql+psycopg://memory:memory@localhost/memory",
    llm=provider,
    embeddings=provider,
    config=config,
)
```

All parameters have sensible defaults — you only need to override what matters for your use case.

## Step 7: Cleanup

Always close the client when done to release database connections:

```python
await memory.close()
```

Or use a `try`/`finally` block to guarantee cleanup:

```python
memory = MemoryClient(...)
await memory.initialize()
try:
    ...  # use memory
finally:
    await memory.close()
```

## Complete Example

Here's a full working example putting it all together:

```python
import asyncio
from arandu import MemoryClient
from arandu.providers.openai import OpenAIProvider

async def main():
    provider = OpenAIProvider(api_key="sk-...")
    memory = MemoryClient(
        database_url="postgresql+psycopg://memory:memory@localhost:5432/memory",
        llm=provider,
        embeddings=provider,
    )
    await memory.initialize()

    try:
        # Write some facts
        await memory.write(
            user_id="demo",
            message="I'm a software engineer living in Berlin. I love cycling and craft coffee.",
        )
        await memory.write(
            user_id="demo",
            message="My girlfriend Ana is a designer. We adopted a cat named Pixel last month.",
        )

        # Retrieve context
        result = await memory.retrieve(user_id="demo", query="tell me about this person")
        print(result.context)

        # Targeted retrieval
        result = await memory.retrieve(user_id="demo", query="who is Ana?")
        for fact in result.facts:
            print(f"  [{fact.score:.2f}] {fact.entity_name}: {fact.value}")
    finally:
        await memory.close()

asyncio.run(main())
```

## Custom Providers

`arandu` uses Python protocols for dependency injection. You can bring any LLM or embedding provider by implementing two simple interfaces:

```python
from arandu.protocols import LLMProvider, EmbeddingProvider

class MyLLMProvider:
    async def complete(
        self,
        messages: list[dict],
        temperature: float = 0,
        response_format: dict | None = None,
        max_tokens: int | None = None,
    ) -> str:
        # Call your LLM here
        ...

class MyEmbeddingProvider:
    async def embed(self, texts: list[str]) -> list[list[float]]:
        # Return embeddings for a batch
        ...

    async def embed_one(self, text: str) -> list[float] | None:
        # Return embedding for a single text
        ...
```

No inheritance required — just implement the methods with the right signatures.
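
Wiring a custom pair into the client then looks the same as with the bundled provider. A minimal sketch, assuming the `MyLLMProvider` and `MyEmbeddingProvider` classes above are fully implemented:

```python
from arandu import MemoryClient

# Any objects with matching method signatures work — no base class needed.
memory = MemoryClient(
    database_url="postgresql+psycopg://memory:memory@localhost:5432/memory",
    llm=MyLLMProvider(),
    embeddings=MyEmbeddingProvider(),
)
```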

## Next Steps

- [**Write Pipeline**](concepts/write-pipeline.md) — Understand how facts are extracted, entities resolved, and knowledge reconciled
- [**Read Pipeline**](concepts/read-pipeline.md) — Learn how multi-signal retrieval finds the most relevant facts
- [**Background Jobs**](concepts/background-jobs.md) — Set up clustering, consolidation, and importance scoring
- [**Design Philosophy**](concepts/design-philosophy.md) — Explore the neuroscience-inspired architecture

---

# Write Pipeline

The write pipeline transforms natural language messages into structured, versioned knowledge. Every call to `memory.write()` runs four stages: **Extract → Resolve → Reconcile → Upsert**.

```mermaid
flowchart LR
    A["Message"] --> B["Extract"]
    B --> C["Resolve Entities"]
    C --> D["Reconcile"]
    D --> E["Upsert"]
    E --> F["WriteResult"]
```

## Overview

When you call `memory.write(user_id, message)`, the pipeline:

1. **Creates an immutable event** — The raw message is logged as a `MemoryEvent` with its embedding. This record is never modified or deleted — it's the audit trail.
2. **Extracts entities, facts, and relationships** — An LLM parses the message to identify structured information.
3. **Resolves entities to canonical records** — Deduplicates mentions ("Ana", "my wife Ana", "her") into a single entity.
4. **Reconciles against existing knowledge** — Decides whether each fact is new (ADD), supersedes an old fact (UPDATE), is already known (NOOP), or retracts something (DELETE).
5. **Upserts the results** — Executes the reconciliation decisions in the database.

Each stage is independently fail-safe: if extraction fails, the event is still logged. If reconciliation fails for one fact, the others proceed normally.

---

## Stage 1: Extraction

The extraction stage uses an LLM to parse natural language into structured data: entities, facts, and relationships.

### Two Modes

**Multi-pass** (the default) runs three sequential LLM calls for higher accuracy:

1. **Entity scan** — Identify all entities mentioned in the message
2. **Fact extraction** — For each batch of entities, extract facts (batched by `entity_batch_size`)
3. **Relation extraction** — Identify relationships between entities

Multi-pass is best for messages that mention multiple entities or contain complex information. It falls back to single-pass automatically when the entity count is at or below `multi_pass_entity_threshold`.

**Single-pass** runs one LLM call that extracts entities, facts, and relationships together. It's faster and cheaper, but may miss nuanced relationships in complex messages. Single-pass includes an optional **reflexion pass** — a second LLM call that reviews the extraction for quality.

### Configuration

| Parameter | Default | Description |
|-----------|---------|-------------|
| `extraction_model` | `"gpt-4o"` | LLM model for extraction |
| `extraction_mode` | `"multi_pass"` | `"multi_pass"` or `"single_pass"` |
| `extraction_timeout_sec` | `30.0` | Timeout per LLM call |
| `multi_pass_entity_threshold` | `3` | Entity count below which multi-pass falls back to single-pass |
| `entity_batch_size` | `5` | Entities per fact-extraction batch in multi-pass mode |

### What Gets Extracted

For each message, the extraction stage produces:

- **Entities** — Named things: people, organizations, places, concepts, etc.
- **Facts** — Statements about entities in natural language (e.g., "lives in São Paulo", "works as a backend engineer")
- **Relations** — Connections between entities (e.g., "Rafael" → `works_at` → "Acme Corp")

Each fact includes a **confidence level**:

| Level | Score | Example |
|-------|-------|---------|
| Explicit statement | 0.95 | "I live in São Paulo" |
| Strong inference | 0.80 | "We went to the São Paulo office" (implies location) |
| Weak inference | 0.60 | Contextual implication |
| Speculation | 0.40 | Uncertain information |
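
These scores matter downstream: retrieval filters out facts below `min_confidence` (default 0.55; see the Read Pipeline). A minimal sketch that keeps only strong inferences and explicit statements:

```python
from arandu import MemoryConfig

# Only surface facts at "strong inference" confidence (0.80) or above.
config = MemoryConfig(min_confidence=0.80)
```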

### Fail-safe Behavior

If an LLM call fails (timeout, invalid JSON, rate limit), the extraction returns an empty result rather than raising an exception. The event is still logged — no data is lost. The next message may capture the same information.

> **Neuroscience parallel:** Extraction mirrors **encoding** in human memory — the process of converting sensory input (a conversation) into a memory trace. Just as human encoding is selective (we don't remember every word), the LLM extracts only salient facts and entities.

---

## Stage 2: Entity Resolution

Entity resolution maps extracted entity names to canonical records. This prevents duplicates: "Ana", "my wife Ana", and "Aninha" all resolve to the same `person:ana` entity.

### Three-Phase Resolution

```mermaid
flowchart LR
    A["Entity name"] --> B{"Exact match?"}
    B -->|Yes| F["Resolved"]
    B -->|No| C{"Fuzzy match?"}
    C -->|"≥ 0.85"| F
    C -->|"0.50–0.85"| D{"LLM decides"}
    C -->|"< 0.50"| E["Create new entity"]
    D -->|Match| F
    D -->|No match| E
    E --> F
```

**Phase 1: Exact match**

Checks the alias cache, entity slugs, and display names. Instant, no LLM call.

Includes **prefix/diminutive matching** for person entities: "Carol" matches "Carolina" (minimum 3 characters).

**Phase 2: Fuzzy match**

Uses embedding cosine similarity (via pgvector) to find candidates:

- **≥ 0.85** — High confidence match, resolves directly
- **0.50–0.85** — Ambiguous, forwards top-3 candidates to Phase 3
- **< 0.50** — No match, creates a new entity

Falls back to `difflib.SequenceMatcher` when embeddings are unavailable.
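
A minimal sketch of that fallback comparison, reusing the thresholds above:

```python
from difflib import SequenceMatcher

def fuzzy_ratio(a: str, b: str) -> float:
    """Character-level similarity in [0, 1], used when embeddings are unavailable."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

score = fuzzy_ratio("Aninha", "Ana")
if score >= 0.85:
    print("direct match")
elif score >= 0.50:
    print("ambiguous: forward to LLM")
else:
    print("create new entity")
```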

**Phase 3: LLM fallback**

Sends ambiguous candidates to an LLM (`gpt-4o-mini`) for disambiguation. The LLM sees the entity name, the candidates, and decides which (if any) is a match.

### Special Cases

- **Self-references** — "I", "me", "eu", "myself" automatically resolve to `user:self`
- **Relationship terms** — "my girlfriend", "my brother", "meu amigo" resolve with person type inference
- **Relational hints** — `"Carol (Rafael's girlfriend)"` strips the hint and forces `type="person"`

### Alias Registration

When a new alias is discovered (e.g., "Aninha" resolves to `person:ana`), it's registered in `MemoryEntityAlias` with **first-write-wins** semantics — concurrent writes won't create conflicting aliases.

### Configuration

| Parameter | Default | Description |
|-----------|---------|-------------|
| `fuzzy_threshold` | `0.85` | Cosine similarity threshold for direct fuzzy match |
| `enable_llm_resolution` | `True` | Whether to use LLM for ambiguous cases |
| `entity_resolution_model` | `"gpt-4o-mini"` | Model for LLM resolution |

> **Neuroscience parallel:** Entity resolution mirrors **associative memory** — the brain's ability to link new stimuli to existing representations. Hearing "Carol" activates the neural pattern for "Carolina" through pattern completion, just as fuzzy matching activates candidate entities through embedding similarity.

---

## Stage 3: Reconciliation

Reconciliation compares new facts against existing knowledge and decides what to do with each one. This is how the memory stays accurate over time — contradictions are resolved, redundancies are skipped, and outdated information is superseded.

### Decision Logic

For each extracted fact, the reconciler:

1. **Fetches existing facts** for the same entity
2. **Computes similarity** between the new fact and each existing fact (via embeddings)
3. **Decides the action**:

| Action | When | Example |
|--------|------|---------|
| **ADD** | New information, no similar existing fact (similarity < 0.50) | "speaks French" when no language fact exists |
| **UPDATE** | Supersedes an existing fact (similarity ≥ 0.50, LLM-decided) | "lives in Rio" supersedes "lives in São Paulo" |
| **NOOP** | Already known (high similarity) | "works at Acme" when this fact already exists |
| **DELETE** | Explicitly retracts a fact | "I no longer work at Acme" |

- **Low similarity (< 0.50)**: Auto-ADD without LLM call (fast path)
- **Ambiguous similarity (0.50+)**: LLM decides the action with full context

### Configuration

| Parameter | Default | Description |
|-----------|---------|-------------|
| `reconciliation_model` | `"gpt-4o-mini"` | Model for reconciliation decisions |

### Fail-safe Behavior

If the reconciliation LLM call fails, the system defaults to **ADD** — it's better to have a near-duplicate than to lose information. The background consolidation jobs (clustering, deduplication) clean up duplicates later.

### Fact Versioning

Facts are versioned using temporal validity windows (`valid_from`, `valid_to`):

- **Active facts** have `valid_to = NULL`
- **Updated facts** get `valid_to` set to the current timestamp, and a new fact is created with `supersedes_fact_id` pointing to the old one
- **Deleted facts** get both `valid_to` and `invalidated_at` set

This enables time-travel queries: you can ask what the system knew at any point in time.
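
A minimal sketch of such an as-of filter using raw SQLAlchemy; the `memory_facts` table name here is a hypothetical stand-in, while `valid_from`/`valid_to` follow the versioning scheme above:

```python
from datetime import datetime
from sqlalchemy import text
from sqlalchemy.ext.asyncio import AsyncSession

async def facts_as_of(session: AsyncSession, user_id: str, as_of: datetime):
    """Return fact texts that were valid at `as_of` (table name is illustrative)."""
    result = await session.execute(
        text(
            "SELECT fact_text FROM memory_facts "
            "WHERE user_id = :user_id "
            "AND valid_from <= :as_of "
            "AND (valid_to IS NULL OR valid_to > :as_of)"
        ),
        {"user_id": user_id, "as_of": as_of},
    )
    return result.scalars().all()
```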

> **Neuroscience parallel:** Reconciliation mirrors **reconsolidation** — the process by which retrieved memories become labile and can be modified. When you recall a memory ("lives in São Paulo") and encounter new information ("just moved to Rio"), the original memory is updated. The brain doesn't simply overwrite — it creates a new trace linked to the original, just as UPDATE creates a new fact with `supersedes_fact_id`.

---

## Stage 4: Upsert

The upsert stage executes the reconciliation decisions in the database:

| Decision | Database action |
|----------|-----------------|
| ADD | Create new `MemoryFact` with embedding |
| UPDATE | Close old fact (`valid_to = now`), create new one with `supersedes_fact_id` |
| NOOP | Update `last_confirmed_at` on existing fact |
| DELETE | Close fact (`valid_to = now`, `invalidated_at = now`) |

### Relationship Tracking

During upsert, extracted relationships are also persisted:

- Creates/updates `MemoryEntityRelationship` records
- Resolves source and target entities via the entity map
- **Strength reinforcement**: repeated relationships increase `strength` (up to 1.0)
- Uses `ON CONFLICT DO UPDATE` for idempotent upserts

### Transaction Safety

The entire write pipeline runs inside a database transaction. Individual fact upserts use **savepoints** (`session.begin_nested()`) so that a failure in one fact doesn't abort the entire batch:

```python
# If this fact fails, only this savepoint rolls back
async with session.begin_nested():
    session.add(new_fact)
    await session.flush()
```

The event record is created and flushed first, so it survives even if all subsequent stages fail.

---

## WriteResult

After the pipeline completes, you get a `WriteResult` with full observability:

```python
result = await memory.write(user_id="user_123", message="...")

# What happened
print(result.facts_added)       # List of facts created
print(result.facts_updated)     # List of facts superseded
print(result.facts_unchanged)   # Count of NOOP decisions
print(result.facts_deleted)     # Count of DELETE decisions
print(result.entities_resolved) # List of resolved entities
print(result.duration_ms)       # Total pipeline time
```

---

## Pipeline Diagram (Complete)

```mermaid
flowchart TD
    MSG["User message"] --> EVT["Create MemoryEvent\n(immutable log + embedding)"]
    EVT --> EXT{"Extraction mode?"}
    EXT -->|multi-pass| MP["Entity Scan → Fact Batches → Relations"]
    EXT -->|single-pass| SP["Single LLM call\n(entities + facts + relations)"]
    MP --> RES["Entity Resolution\n(exact → fuzzy → LLM)"]
    SP --> RES
    RES --> REC["Reconciliation\n(ADD / UPDATE / NOOP / DELETE)"]
    REC --> UPS["Upsert\n(with savepoints)"]
    UPS --> REL["Relationship Tracking\n(strength reinforcement)"]
    REL --> WR["WriteResult"]
```

---

# Read Pipeline

The read pipeline finds facts relevant to a query using multiple signals in parallel. Every call to `memory.retrieve()` runs five stages: **Plan → Retrieve → Enhance → Rerank → Format**.

```mermaid
flowchart LR
    A["Query"] --> B["Plan"]
    B --> C["Retrieve\n(3 signals)"]
    C --> D["Enhance"]
    D --> E["Rerank"]
    E --> F["RetrieveResult"]
```

## Overview

When you call `memory.retrieve(user_id, query)`, the pipeline:

1. **Plans the retrieval** — An LLM planner analyzes the query and decides the strategy, reformulates the query for semantic search, and identifies relevant entities for graph traversal.
2. **Retrieves candidates** — Three parallel signals (semantic, keyword, graph) find candidate facts.
3. **Enhances results** — Spreading activation expands context from seed facts along entity relationships.
4. **Reranks** — An optional LLM reranker refines the ranking based on query intent.
5. **Formats** — Facts are scored, compressed within a token budget, and formatted into a context string.

---

## Stage 1: Retrieval Agent (Planner)

The retrieval agent is an LLM-powered planner that decides **how** to retrieve, not just **what** to retrieve. It analyzes the query and produces a `RetrievalPlan`.

### What the Planner Decides

| Field | Description | Example |
|-------|-------------|---------|
| `strategy` | Retrieval strategy | `"multi_signal"` (default) or `"skip"` (for greetings) |
| `similarity_query` | Reformulated query for semantic search | "user location city" (from "where do I live?") |
| `entities` | Entity keys for graph signal | `["person:ana", "organization:acme"]` |
| `as_of_range` | Time-travel window (optional) | `{"start": "2024-01-01", "end": "2024-06-30"}` |
| `broad_query` | Whether to expand graph scope | `true` for "tell me everything about..." |
| `reason` | Explanation of the strategy | For debugging and observability |

### Query Reformulation

The planner doesn't just pass the user's query to semantic search. It **reformulates** it to improve vector similarity matching:

- `"where do I live?"` → `"user location city residence"`
- `"what does Ana do for work?"` → `"Ana profession occupation job role"`

This bridges the vocabulary gap between how users ask questions and how facts are stored.

### Schema-Aware Planning

The planner inspects the actual memory schema for this user:

- Which entity types exist (persons, organizations, places...)
- Which entities have the most facts
- What attributes are stored

This grounds the plan in reality — the planner won't search for entities that don't exist.

### Skip Strategy

For greetings and casual messages ("hi", "how are you?"), the planner returns `strategy: "skip"`, short-circuiting the pipeline. No database queries, no LLM calls, instant response.

> **Neuroscience parallel:** The retrieval agent mirrors **retrieval cues** in cognitive psychology. When you try to remember something, your brain doesn't do an exhaustive search — it uses contextual cues to narrow down the search space. The planner identifies entities and reformulates queries as cues that guide the retrieval signals.

---

## Stage 2: Multi-Signal Retrieval

Three independent signals run **in parallel** via `asyncio.gather()`, each finding candidates from a different angle:

```mermaid
flowchart TD
    P["RetrievalPlan"] --> S["Semantic Search\n(pgvector cosine)"]
    P --> K["Keyword Search\n(SQL ILIKE)"]
    P --> G["Graph Traversal\n(BFS 2-hop)"]
    S --> M["Merge & Rank\n(RRF + weighted scoring)"]
    K --> M
    G --> M
```

### Signal 1: Semantic Search

Uses pgvector cosine similarity to find facts whose embeddings are close to the query embedding.

- Embeds the reformulated query (from the planner)
- Searches the `MemoryFact` table with HNSW index
- Returns top-N candidates above `min_similarity` threshold
- Filters: `user_id`, active facts (`valid_to IS NULL`), confidence ≥ `min_confidence`

This is the primary signal — it finds facts that are **semantically similar** to the query, even if they don't share exact keywords.

### Signal 2: Keyword Search

SQL ILIKE matching on `fact_text` for exact or partial keyword hits.

- Extracts significant words (> 2 characters) from the query
- Matches against fact text (up to 5 keywords)
- Score = fraction of query words found in the fact

This complements semantic search by catching exact matches that embedding similarity might miss (e.g., proper nouns, technical terms, abbreviations).
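
A minimal sketch of that scoring rule (keyword extraction simplified for illustration):

```python
def keyword_score(query: str, fact_text: str) -> float:
    """Fraction of significant query words (> 2 chars, max 5) found in the fact."""
    words = [w for w in query.lower().split() if len(w) > 2][:5]
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in fact_text.lower())
    return hits / len(words)
```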

### Signal 3: Graph Retrieval

Traverses entity relationships to find facts connected to the query entities.

- Starts from entities identified by the planner
- BFS traversal up to 2 hops through `MemoryEntityRelationship`
- Scoring formula: `edge_strength × recency_factor × edge_recency_factor × query_bonus`
- `query_bonus`: 1.5× when the entity name appears in the query text

Graph retrieval excels at finding **contextual** facts. When you ask about a person, it also finds facts about their workplace, their relationships, and their projects.
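
A minimal sketch of the scoring formula above; the individual factors are assumed to be computed upstream:

```python
def graph_score(
    edge_strength: float,
    recency_factor: float,
    edge_recency_factor: float,
    entity_in_query: bool,
) -> float:
    """Combine graph-signal factors; entities named in the query get a 1.5x bonus."""
    query_bonus = 1.5 if entity_in_query else 1.0
    return edge_strength * recency_factor * edge_recency_factor * query_bonus
```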

### Merge & Rank

After all three signals return, results are merged:

1. **Deduplicate** by fact ID (same fact may appear in multiple signals)
2. **Apply recency decay** — Exponential decay with configurable half-life (`recency_half_life_days`, default 14)
3. **Apply confidence decay** — Older facts with lower confidence are penalized
4. **Compute combined score** — Weighted sum:

```python
score = (
    score_weights["semantic"]   * semantic_score +    # default 0.70
    score_weights["recency"]    * recency_score +     # default 0.20
    score_weights["importance"] * importance_score     # default 0.10
)
```
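
The recency term in this sum follows the half-life decay from step 2; a minimal sketch:

```python
def recency_score(age_days: float, half_life_days: float = 14.0) -> float:
    """Exponential decay: a fact loses half its recency score every half-life."""
    return 0.5 ** (age_days / half_life_days)

recency_score(0)   # 1.0   (brand new)
recency_score(14)  # 0.5   (one half-life)
recency_score(28)  # 0.25  (two half-lives)
```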

### Configuration

| Parameter | Default | Description |
|-----------|---------|-------------|
| `topk_facts` | `20` | Maximum facts to return |
| `topk_events` | `8` | Maximum events to consider |
| `min_similarity` | `0.20` | Minimum cosine similarity for semantic results |
| `min_confidence` | `0.55` | Minimum fact confidence |
| `recency_half_life_days` | `14` | Half-life for recency decay |
| `score_weights` | See above | Weights for each scoring signal |
| `enable_reranker` | `True` | Whether to use LLM reranking |

> **Neuroscience parallel:** Multi-signal retrieval mirrors **spreading activation** in semantic networks (Collins & Loftus, 1975). When you think of "doctor", activation spreads to related concepts ("hospital", "medicine", "appointment") through associative links. Similarly, graph retrieval spreads from query entities along relationship edges, while semantic search activates facts through embedding proximity.

---

## Stage 3: Enhancement

### Spreading Activation

Starting from the top-K seed facts, the pipeline expands context by following entity relationships:

- For each seed fact, find its entity's relationships
- Traverse relationships for N hops (`spreading_activation_hops`, default 2)
- Apply decay factor per hop (`spreading_decay_factor`, default 0.50)
- Return up to M additional facts per entity

This catches important context that wasn't directly matched. If you ask "what does Rafael do?", spreading activation might surface facts about his workplace, team, and projects.

### Pattern Signal

Facts that have been reinforced multiple times (confirmed by NOOP decisions in write) get an additive score boost:

- High reinforcement count → up to 0.10 extra score
- Captures frequently mentioned, well-established facts

### Configuration

| Parameter | Default | Description |
|-----------|---------|-------------|
| `spreading_activation_hops` | `2` | Maximum hops from seed facts |
| `spreading_decay_factor` | `0.50` | Score decay per hop (0.5 = halved each hop) |

---

## Stage 4: Reranking (Optional)

When `enable_reranker=True`, the top candidates are reranked by an LLM that considers query intent:

- Respects the semantic meaning of the query (not just keyword overlap)
- Can promote facts that are indirectly relevant but important
- Graceful degradation: if the reranker fails (timeout, error), the original ranking is preserved

The reranker is the most expensive stage but provides the highest quality improvement for complex queries.

---

## Stage 5: Formatting

The final stage converts ranked facts into consumable output.

### Context Compression

Facts are divided into three tiers within a token budget (`context_max_tokens`):

| Tier | Budget share | Content |
|------|-------------|---------|
| **Hot** | 50% | Most relevant facts (highest scores) |
| **Warm** | 30% | Supporting context |
| **Cold** | 20% | Background facts (lower scores but still relevant) |

### Configuration

| Parameter | Default | Description |
|-----------|---------|-------------|
| `context_budget_tokens` | `3000` | Total token budget for retrieval |
| `context_max_tokens` | `2000` | Maximum tokens in formatted context |
| `hot_tier_ratio` | `0.50` | Share of budget for top facts |
| `warm_tier_ratio` | `0.30` | Share of budget for supporting facts |
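
A minimal sketch of how the defaults above split the budget (the cold tier takes the remainder):

```python
def tier_budgets(max_tokens: int = 2000, hot: float = 0.50, warm: float = 0.30):
    """Split the context token budget into hot/warm/cold shares."""
    hot_budget = int(max_tokens * hot)
    warm_budget = int(max_tokens * warm)
    cold_budget = max_tokens - hot_budget - warm_budget  # remaining 20%
    return hot_budget, warm_budget, cold_budget

tier_budgets()  # (1000, 600, 400)
```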

### Output Format

The `context` string is formatted for direct injection into LLM prompts:

```
## Known facts about the user:
- Lives in São Paulo (confidence: 0.95)
- Works at Acme Corp as a backend engineer (confidence: 0.90)
- Wife's name is Ana (confidence: 0.92)
```

---

## RetrieveResult

```python
result = await memory.retrieve(user_id="user_123", query="...")

# Pre-formatted context (ready for LLM prompts)
print(result.context)

# Individual facts with scores
for fact in result.facts:
    print(f"[{fact.score:.2f}] {fact.entity_name}: {fact.value}")
    print(f"  Scores: {fact.scores}")  # {"semantic": 0.85, "recency": 0.72, ...}

# Pipeline stats
print(f"Candidates evaluated: {result.total_candidates}")
print(f"Duration: {result.duration_ms:.0f}ms")
```

---

## Pipeline Diagram (Complete)

```mermaid
flowchart TD
    Q["User query"] --> AG["Retrieval Agent\n(LLM planner)"]
    AG -->|skip| SKIP["Return empty\n(greeting/casual)"]
    AG -->|multi_signal| PAR["Parallel retrieval"]
    PAR --> SEM["Semantic Search\n(pgvector cosine)"]
    PAR --> KW["Keyword Search\n(SQL ILIKE)"]
    PAR --> GR["Graph Traversal\n(BFS 2-hop)"]
    SEM --> MERGE["Merge & Rank\n(dedup + weighted scoring)"]
    KW --> MERGE
    GR --> MERGE
    MERGE --> SA["Spreading Activation\n(expand context along edges)"]
    SA --> RR{"Reranker\nenabled?"}
    RR -->|yes| RERANK["LLM Rerank"]
    RR -->|no| FMT["Format & Compress"]
    RERANK --> FMT
    FMT --> RES["RetrieveResult"]
```

> **Neuroscience parallel:** The tiered compression (hot/warm/cold) mirrors **levels of activation** in working memory. In Cowan's embedded-process model, a small number of items are in the focus of attention (hot tier), surrounded by activated long-term memory (warm tier), with the rest of long-term memory available but not active (cold tier). The token budget acts as the capacity limit of working memory.

---

# Background Jobs

Background jobs maintain and optimize memory over time — clustering facts, consolidating patterns, scoring importance, and refreshing summaries. They run periodically (not during message handling) and improve retrieval quality without impacting response latency.

```mermaid
flowchart LR
    A["Scheduler\n(periodic)"] --> B["Clustering"]
    A --> C["Consolidation"]
    A --> D["Importance\nScoring"]
    A --> E["Summary\nRefresh"]
```

## Overview

`arandu` provides four categories of background jobs:

| Job | Purpose | Uses LLM? | Frequency |
|-----|---------|-----------|-----------|
| **Clustering** | Group related facts semantically | Yes (summaries) | Every 4-8 hours |
| **Consolidation** | Detect patterns, contradictions, trends | Yes | Every 4-8 hours |
| **Memify** | Convert episodic facts to procedural/semantic knowledge | Yes | Daily |
| **Sleep-time compute** | Score importance, refresh summaries, detect communities | Partially | Every 4-8 hours |

All jobs are exposed as async functions you can call directly or schedule with your preferred task runner (APScheduler, Celery, cron, etc.).

> **Neuroscience parallel:** Background jobs mirror **sleep-time processing** in the brain. During sleep, the brain consolidates memories, transfers information from hippocampus (short-term) to neocortex (long-term), prunes irrelevant connections, and strengthens important ones. These jobs perform the same operations on your agent's memory.

---

## Clustering

Clustering groups related facts into semantic clusters, making retrieval more contextual and enabling community detection.

### Fact Clustering

```python
from arandu import cluster_user_facts, ClusteringResult

result: ClusteringResult = await cluster_user_facts(
    session=db_session,
    user_id="user_123",
    llm=llm_provider,
    embeddings=embedding_provider,
    config=memory_config,
)
```

**How it works:**

1. Groups facts by `(entity_type, entity_key)` — facts about the same entity stay together
2. Generates a 2-3 sentence summary per cluster using an LLM
3. Computes and stores cluster embeddings for later community detection
4. Idempotent — updates existing clusters rather than creating duplicates

### Community Detection

```python
from arandu import detect_communities, CommunityDetectionResult

result: CommunityDetectionResult = await detect_communities(
    session=db_session,
    user_id="user_123",
    llm=llm_provider,
    embeddings=embedding_provider,
    config=memory_config,
)
```

**How it works:**

1. Compares cluster embeddings using cosine similarity
2. Groups clusters above `community_similarity_threshold` (default 0.75)
3. Creates `MemoryMetaObservation` records with type `"entity_community"`
4. Example: a "work" community might include clusters about colleagues, projects, and company facts

### Configuration

| Parameter | Default | Description |
|-----------|---------|-------------|
| `cluster_max_age_days` | `90` | Maximum age of facts to include in clustering |
| `community_similarity_threshold` | `0.75` | Cosine similarity threshold for grouping clusters |

---

## Consolidation

Consolidation analyzes recent events and facts to detect higher-order patterns — insights, contradictions, trends, and behavioral preferences.

### Periodic Consolidation (L2)

```python
from arandu import run_consolidation, ConsolidationResult

result: ConsolidationResult = await run_consolidation(
    session=db_session,
    user_id="user_123",
    llm=llm_provider,
    config=memory_config,
)
```

**How it works:**

1. Analyzes events and facts over a lookback window (`consolidation_lookback_days`)
2. Detects patterns across facts:
   - **Insights** — Emergent understanding from multiple facts
   - **Patterns** — Repeated behaviors or preferences
   - **Contradictions** — Conflicting facts that need resolution
   - **Trends** — Changes over time
3. Generates `MemoryMetaObservation` records
4. Tags events with emotions (emotion, intensity, energy level)

### Profile Consolidation (L3)

```python
from arandu import run_profile_consolidation

await run_profile_consolidation(
    session=db_session,
    user_id="user_123",
    llm=llm_provider,
    config=memory_config,
)
```

**How it works:**

1. Refreshes entity summaries via LLM — a higher-level view of each entity
2. Updates the overall profile overview
3. Triggered periodically (less frequently than L2)

### Configuration

| Parameter | Default | Description |
|-----------|---------|-------------|
| `consolidation_min_events` | `3` | Minimum events before running consolidation |
| `consolidation_lookback_days` | `7` | How far back to look for patterns |

> **Neuroscience parallel:** Consolidation mirrors the brain's **memory consolidation during sleep**. The hippocampus replays recent experiences, the neocortex detects patterns and integrates them into existing knowledge structures, and contradictions are flagged for resolution. L2 consolidation is analogous to slow-wave sleep (SWS) replay, while L3 profile consolidation is analogous to REM sleep's role in integrating memories into semantic knowledge.

---

## Memify

Memify converts episodic facts (specific events) into procedural and semantic knowledge (general knowledge). It also manages fact vitality — how "alive" a fact is based on recent mentions.

### Run Memify

```python
from arandu import run_memify, MemifyResult

result: MemifyResult = await run_memify(
    session=db_session,
    user_id="user_123",
    llm=llm_provider,
    config=memory_config,
)
```

**How it works:**

1. Groups related facts by entity and topic
2. Generates distilled summaries (procedural/semantic knowledge)
3. Checks vitality — facts mentioned recently are kept; stale facts may be deprecated
4. Merges similar procedures to prevent knowledge fragmentation

### Vitality Scoring

```python
from arandu import compute_vitality

vitality_scores = await compute_vitality(
    session=db_session,
    user_id="user_123",
    config=memory_config,
)
```

Vitality measures how "alive" a fact is based on:

- **Recency** — When was the fact last confirmed or mentioned?
- **Reinforcement** — How many times has this fact been confirmed (NOOP decisions)?
- **Importance** — How relevant is this fact to the user's profile?
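
The documentation doesn't specify how these signals are weighted internally. Purely as an illustration, a combination might look like the sketch below (the weights are invented for this example):

```python
def vitality(recency: float, reinforcement: float, importance: float) -> float:
    """Illustrative only: the weights are assumptions, not arandu's internals.
    All inputs are assumed normalized to [0, 1]."""
    return 0.4 * recency + 0.3 * reinforcement + 0.3 * importance
```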

> **Neuroscience parallel:** Memify mirrors the **forgetting curve** described by Hermann Ebbinghaus (1885). Memories decay exponentially over time unless reinforced through retrieval practice. Facts with high vitality (frequently accessed) resist decay, while low-vitality facts gradually fade — just as the brain prunes synaptic connections for unused information.

---

## Sleep-Time Compute

Sleep-time compute runs three targeted maintenance jobs that improve retrieval quality:

### Job 1: Entity Importance Scoring

```python
from arandu import compute_entity_importance, EntityImportanceResult

result: EntityImportanceResult = await compute_entity_importance(
    session=db_session,
    user_id="user_123",
    config=memory_config,
)
```

Pure SQL computation (no LLM calls). Scores each entity from 0.0 to 1.0 using four normalized signals:

| Signal | Weight | Description |
|--------|--------|-------------|
| Fact density | 0.30 | Number of facts relative to other entities |
| Recency | 0.25 | Exponential decay (30-day half-life) |
| Retrieval frequency | 0.25 | How often facts about this entity are retrieved |
| Relationship degree | 0.20 | Number of incoming + outgoing relationships |

The importance score is used as a signal in retrieval scoring and as a priority factor for summary refresh.
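
With the weights above, the combined score reduces to a weighted sum; a minimal sketch, assuming each signal is already normalized to [0, 1]:

```python
def entity_importance(
    fact_density: float,
    recency: float,
    retrieval_frequency: float,
    relationship_degree: float,
) -> float:
    """Weighted sum of the four normalized importance signals."""
    return (
        0.30 * fact_density
        + 0.25 * recency
        + 0.25 * retrieval_frequency
        + 0.20 * relationship_degree
    )
```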

### Job 2: Entity Summary Refresh

```python
from arandu import refresh_entity_summaries, SummaryRefreshResult

result: SummaryRefreshResult = await refresh_entity_summaries(
    session=db_session,
    user_id="user_123",
    llm=llm_provider,
    config=memory_config,
)
```

Refreshes stale entity summaries:

- **Stale condition**: `summary_text IS NULL` or last refresh > 7 days ago
- **Priority**: entities with higher `importance_score` refreshed first
- **Limit**: 10 entities per run (prevents timeout)
- Generates 2-3 sentence summaries from the entity's facts using an LLM

### Job 3: Entity Community Detection

```python
from arandu import detect_entity_communities

result = await detect_entity_communities(
    session=db_session,
    user_id="user_123",
    llm=llm_provider,
    embeddings=embedding_provider,
    config=memory_config,
)
```

Finds groups of related entities using the relationship graph:

1. Loads active entities and edges (strength ≥ 0.3)
2. Runs Union-Find (with path compression + union by rank) to find connected components — sketched after this list
3. Filters by minimum entity threshold
4. Generates LLM summary and embedding for each community
5. Deduplicates against existing communities (Jaccard member overlap)
6. Stores as `MemoryMetaObservation` records
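
Step 2 uses the classic Union-Find structure; a minimal, self-contained sketch of the textbook algorithm (not arandu's internal code):

```python
class UnionFind:
    """Union-Find with path compression and union by rank."""

    def __init__(self, n: int):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x: int) -> int:
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path at the root.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a: int, b: int) -> None:
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        # Union by rank: attach the shallower tree under the deeper one.
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
```

Entities that end up in the same component form a candidate community.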

> **Neuroscience parallel:** Sleep-time compute mirrors **offline processing during sleep**. The brain doesn't just passively store memories during sleep — it actively reorganizes them. Importance scoring is analogous to the brain's process of **synaptic homeostasis** (Tononi & Cirelli), where strongly activated synapses are maintained while weakly activated ones are pruned. Summary refresh mirrors the formation of **gist memories** — compressed representations that capture the essence of detailed episodes.

---

## Scheduling

`arandu` doesn't include a scheduler — you bring your own. All background functions are simple async callables that can be integrated with any scheduling system.

### Example: APScheduler

```python
from apscheduler.schedulers.asyncio import AsyncIOScheduler
from arandu import (
    cluster_user_facts,
    run_consolidation,
    compute_entity_importance,
    refresh_entity_summaries,
)

scheduler = AsyncIOScheduler()

# `get_session`, `get_active_users`, `llm`, `embeddings`, and `config` are
# assumed to be defined by your application.
async def maintenance_cycle():
    async with get_session() as session:
        for user_id in await get_active_users(session):
            await compute_entity_importance(session, user_id, config)
            await refresh_entity_summaries(session, user_id, llm, config)
            await cluster_user_facts(session, user_id, llm, embeddings, config)
            await run_consolidation(session, user_id, llm, config)

scheduler.add_job(maintenance_cycle, "interval", hours=4)
scheduler.start()
```

### Example: Simple Loop

```python
import asyncio

async def background_loop():
    while True:
        await maintenance_cycle()
        await asyncio.sleep(4 * 3600)  # every 4 hours
```

### Recommended Cadence

| Job | Frequency | Cost |
|-----|-----------|------|
| Entity importance | Every 4h | Cheap (SQL only) |
| Summary refresh | Every 4h | Moderate (LLM, limited to 10/run) |
| Clustering | Every 4-8h | Moderate (LLM for summaries) |
| Consolidation | Every 4-8h | Moderate (LLM for pattern detection) |
| Memify | Daily | Moderate (LLM for distillation) |
| Community detection | Daily | Moderate (LLM + embeddings) |

Run importance scoring first — its output is used by summary refresh to prioritize entities.

---

# Design Philosophy

`arandu` is designed around two foundations: **software engineering principles** that make it reliable and extensible, and **cognitive science models** that inform its architecture. This page covers both — the engineering decisions and the neuroscience parallels that inspired them.

---

## Engineering Principles

### Protocol-Based Dependency Injection

The SDK uses Python's `typing.Protocol` for all external dependencies (LLM, embeddings). No inheritance required — just implement the method signatures:

```python
@runtime_checkable
class LLMProvider(Protocol):
    async def complete(
        self,
        messages: list[dict],
        temperature: float = 0,
        response_format: dict | None = None,
        max_tokens: int | None = None,
    ) -> str: ...
```

**Why:** Vendor lock-in kills adoption. By using structural subtyping (duck typing), any LLM provider works without inheriting from a base class. The OpenAI provider is included for convenience, but you can swap in Anthropic, local models, or custom endpoints with zero SDK changes.
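
Because the protocol is `@runtime_checkable`, conformance can be spot-checked with `isinstance`; note the runtime check only verifies that the method exists, not its signature. A minimal sketch, reusing the `LLMProvider` protocol defined above:

```python
class EchoLLM:
    """Toy provider that satisfies LLMProvider structurally, with no inheritance."""

    async def complete(
        self,
        messages: list[dict],
        temperature: float = 0,
        response_format: dict | None = None,
        max_tokens: int | None = None,
    ) -> str:
        # Echo the last message back; a stand-in for a real LLM call.
        return messages[-1]["content"]

assert isinstance(EchoLLM(), LLMProvider)  # passes: structural check, not inheritance
```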

### Fail-Safe by Default

Every stage of the pipeline has fallback behavior:

| Stage | Failure | Fallback |
|-------|---------|----------|
| Extraction | LLM timeout/error | Return empty extraction; event still logged |
| Entity Resolution | LLM fallback fails | Create new entity (prefer duplicates over lost data) |
| Reconciliation | LLM error | Default to ADD |
| Reranking | Reranker fails | Keep original ranking |
| Background jobs | Any job fails | Other jobs proceed independently |

**Why:** In a production AI agent, memory is a supporting system — it should never crash the main flow. A degraded response (missing some context) is always better than an error.

### Composition Over Inheritance

The SDK has no abstract base classes, no deep class hierarchies. It's built from small, focused modules composed into pipelines:

- `write/extract.py` → `write/entity_resolution.py` → `write/reconcile.py` → `write/upsert.py`
- `read/retrieval_agent.py` → `read/retrieval.py` → `read/reranker.py`

**Why:** Each module has a single responsibility with clear inputs and outputs. You can understand, test, and replace any module independently. This follows the Unix philosophy: do one thing well.

### Savepoint-Based Transaction Safety

Write operations use database savepoints (`session.begin_nested()`) so that a failure in one fact doesn't abort the entire batch:

```python
async with session.begin_nested():
    # If this fails, only this savepoint rolls back
    session.add(new_fact)
    await session.flush()
```

**Why:** In a pipeline that processes multiple facts per message, atomic all-or-nothing transactions are too fragile. Savepoints give per-fact atomicity while keeping the outer transaction alive.

---

## Neuroscience Parallels

The architecture of `arandu` draws from established models in cognitive neuroscience. Each parallel below maps a system component to its biological counterpart.

### Encoding: The Write Pipeline

**System:** Message → Extract → Resolve → Reconcile → Upsert

**Brain:** Sensory input → Perception → Association → Consolidation → Storage

When you experience something, your brain doesn't record a raw video. It encodes a **selective representation** — extracting salient features, linking them to existing knowledge, and storing the result in a form that can be retrieved later. The write pipeline does the same:

- **Extraction** is perception: an LLM selects what matters from the raw message
- **Entity resolution** is association: linking new mentions to existing memory traces
- **Reconciliation** is reconsolidation: updating existing memories when new information arrives
- **Upsert** is storage: committing the processed trace to long-term memory

### Associative Memory: Entity Resolution

**System:** 3-phase resolution (exact → fuzzy → LLM)

**Brain:** Pattern completion in hippocampal-neocortical circuits

The brain doesn't store memories as isolated records — it stores them as patterns of activation across neural networks. When you encounter a partial cue ("Carol"), your brain completes the pattern to retrieve the full representation ("Carolina, my colleague from work").

Entity resolution mirrors this process:

- **Exact match** = direct retrieval (strong, well-established associations)
- **Fuzzy match** = pattern completion (partial cue activates the most similar existing pattern)
- **LLM fallback** = deliberate recall (conscious effort to disambiguate when automatic retrieval fails)

The **fuzzy threshold** (0.85) and **LLM fallback range** (0.50-0.85) model the brain's confidence gradient: strong matches are automatic, ambiguous matches require deliberation.

### Reconsolidation: Fact Reconciliation

**System:** ADD / UPDATE / NOOP / DELETE decisions

**Brain:** Memory reconsolidation (Nader, Schiller, & LeDoux, 2000)

When a memory is retrieved, it enters a **labile state** where it can be modified. This is reconsolidation — the brain's mechanism for updating memories with new information while preserving the original trace.

The reconciliation stage models this process:

- **NOOP** = retrieval without modification (memory confirmed, `last_confirmed_at` updated)
- **UPDATE** = reconsolidation (old memory superseded, new version created with provenance link via `supersedes_fact_id`)
- **ADD** = new encoding (no existing memory to reconsolidate)
- **DELETE** = active forgetting (explicit retraction, modeled by setting `invalidated_at`)

The fact versioning system (`valid_from`, `valid_to`, `supersedes_fact_id`) preserves the full history — just as the brain retains traces of original memories even after reconsolidation.

### Spreading Activation: Graph Retrieval

**System:** BFS 2-hop traversal with decay factor

**Brain:** Spreading activation in semantic networks (Collins & Loftus, 1975)

In Collins and Loftus's model, when a concept is activated (e.g., "fire engine"), activation spreads along associative links to related concepts ("red", "truck", "emergency"), with strength decreasing as distance increases.

Graph retrieval implements this directly:

- **Seed entities** from the query activate the starting nodes
- **Hop 1** activates direct neighbors (no pruning — all connections fire)
- **Hop 2** activates second-degree connections (pruned by `min_edge_strength`)
- **Decay factor** (0.50 per hop) models the attenuation of activation over distance
- **Edge strength** models the associative strength between concepts (reinforced by repeated co-mention)

The `query_bonus` (1.5×) for entities whose names appear in the query models **top-down priming** — when you explicitly mention an entity, its connections are more strongly activated.

### Sleep-Time Compute: Background Processing

**System:** Clustering, consolidation, importance scoring, summary refresh

**Brain:** Memory consolidation during sleep (Diekelmann & Born, 2010)

During sleep, the brain performs critical maintenance:

1. **Hippocampal replay** — Recent experiences are replayed in compressed form, transferring them from short-term (hippocampal) to long-term (neocortical) storage
2. **Synaptic homeostasis** — Strongly activated synapses are maintained while weakly activated ones are pruned (Tononi & Cirelli)
3. **Pattern detection** — The neocortex detects statistical regularities across episodes
4. **Gist extraction** — Detailed episodic memories are compressed into semantic knowledge

The background jobs map to these processes:

| Brain process | System job | Mechanism |
|---------------|-----------|-----------|
| Hippocampal replay | Consolidation | Reviews recent events, detects patterns and contradictions |
| Synaptic homeostasis | Importance scoring | Scores entities by density + recency + retrieval frequency + connectivity |
| Pattern detection | Community detection | Finds groups of related entities via graph analysis |
| Gist extraction | Summary refresh + Memify | Generates compressed summaries from detailed facts |

### Forgetting Curve: Vitality and Recency

**System:** Recency decay, vitality scoring, importance-based pruning

**Brain:** Ebbinghaus forgetting curve (1885)

Hermann Ebbinghaus demonstrated that memory retention decays exponentially over time, but each retrieval (practice) resets the curve and slows future decay. This is the **spacing effect** — the most robust finding in memory research.

`arandu` models this with:

- **Recency decay** — Exponential decay with configurable half-life (`recency_half_life_days`). Recent facts score higher. This models the basic forgetting curve.
- **Retrieval reinforcement** — Each NOOP decision (fact confirmed during write) updates `last_confirmed_at`, effectively "practicing" the fact and resetting its decay curve.
- **Vitality scoring** — Combines recency, reinforcement count, and importance to determine how "alive" a fact is. Low-vitality facts are candidates for consolidation or pruning.

### Selective Attention: Reranking

**System:** LLM reranker on retrieval candidates

**Brain:** Selective attention (Broadbent, 1958; Treisman, 1964)

The brain doesn't process all sensory input equally — selective attention filters and prioritizes information based on current goals. The cocktail party effect demonstrates this: you can focus on one conversation in a noisy room by filtering out irrelevant signals.

The reranker acts as the attention filter:

- Raw retrieval signals (semantic, keyword, graph) produce a broad set of candidates — like the full sensory input
- The reranker evaluates each candidate against the query intent — like attentional selection
- Only the most relevant facts pass through to the context — like the attended signal

This is why the reranker uses an LLM (not just scoring heuristics): attention is goal-directed and requires understanding the **meaning** of both query and candidates.

### Working Memory: Context Budget

**System:** Token budget with hot/warm/cold tiers

**Brain:** Working memory (Baddeley & Hitch, 1974; Cowan, 2001)

Working memory has a strict capacity limit — Cowan estimates 4±1 items can be held in the focus of attention simultaneously. The context budget models this constraint:

- **Token budget** = capacity limit (you can't send infinite context to an LLM)
- **Hot tier** (50%) = focus of attention (the most relevant facts for the current query)
- **Warm tier** (30%) = activated long-term memory (supporting context that's available but not focal)
- **Cold tier** (20%) = peripheral activation (background facts that might become relevant)

This tiered approach ensures the LLM receives a focused, prioritized context rather than a noisy dump of everything the system knows.

---

## Summary Table

| System Component | Neuroscience Model | Key Reference |
|-----------------|-------------------|---------------|
| Write Pipeline | Encoding | — |
| Entity Resolution | Associative memory / Pattern completion | — |
| Reconciliation | Reconsolidation | Nader, Schiller, & LeDoux (2000) |
| Graph Retrieval | Spreading activation | Collins & Loftus (1975) |
| Recency Decay | Forgetting curve | Ebbinghaus (1885) |
| Background Jobs | Sleep consolidation | Diekelmann & Born (2010) |
| Importance Scoring | Synaptic homeostasis | Tononi & Cirelli (SHY) |
| Summary Refresh | Gist memory formation | — |
| Reranking | Selective attention | Broadbent (1958) |
| Context Budget | Working memory capacity | Baddeley & Hitch (1974); Cowan (2001) |
| Vitality/Reinforcement | Spacing effect | Ebbinghaus (1885) |

> **These are analogies, not claims:** The parallels above are architectural inspirations, not scientific claims. `arandu` is an engineering system, not a cognitive model. The brain is vastly more complex — these parallels highlight the design intuitions, not the biological mechanisms.

---

# Write Pipeline API

> **Advanced API:** These functions are for power users who want to drive individual pipeline stages directly. Most users should use [`MemoryClient.write()`](../reference/index.md) instead, which orchestrates the full pipeline automatically.

All write pipeline functions are exported from `arandu.write`.

```python
from arandu.write import (
    classify_input, select_strategy, run_write_pipeline,
    canonicalize_attribute_key, normalize_key, validate_proposed_key,
    create_or_update_entity, get_entities_for_user, get_entity_by_key,
    detect_and_record_corrections, is_user_correction,
    get_pending, clear_pending, save_pending_execution, save_pending_selection,
)
```

---

## Pipeline Orchestrator

### run_write_pipeline

Executes the full write pipeline: **extract** → **resolve** → **reconcile** → **upsert**.

```python
async def run_write_pipeline(
    session: AsyncSession,
    user_id: str,
    message: str,
    llm: LLMProvider,
    embeddings: EmbeddingProvider,
    config: MemoryConfig,
    source: str = "api",
    recent_messages: list[str] | None = None,
) -> dict
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `session` | `AsyncSession` | Database session (caller manages transaction/commit). |
| `user_id` | `str` | Unique identifier for the user. |
| `message` | `str` | The user's message text. |
| `llm` | `LLMProvider` | Injected LLM provider. |
| `embeddings` | `EmbeddingProvider` | Injected embedding provider. |
| `config` | `MemoryConfig` | Memory configuration. |
| `source` | `str` | Source channel identifier (default `"api"`). |
| `recent_messages` | `list[str] \| None` | Optional conversation context (last N messages) for resolving pronouns and anaphora. |

**Returns:** `dict` with keys `event_id`, `facts_added`, `facts_updated`, `facts_unchanged`, `facts_deleted`, `entities_resolved`, `duration_ms`.

The pipeline creates an immutable `MemoryEvent` first (survives even if later stages fail), then runs extraction, entity resolution, reconciliation, and upsert inside a savepoint for atomicity.
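
A sketch of driving the orchestrator directly (assumes `provider` and `config` are set up as in the Quick Start; the caller owns the transaction):

```python
from arandu.db import create_engine, create_session_factory
from arandu.write import run_write_pipeline

engine = create_engine("postgresql://user:pass@localhost:5432/mydb")
SessionFactory = create_session_factory(engine)

async with SessionFactory() as session:
    result = await run_write_pipeline(
        session=session,
        user_id="user_123",
        message="My sister Clara just moved to Lisbon.",
        llm=provider,
        embeddings=provider,
        config=config,
        recent_messages=["We were talking about my family."],
    )
    await session.commit()  # caller manages the transaction
    print(result["facts_added"], result["entities_resolved"])
```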

---

## Extraction Strategy

Pure functions (no LLM, no DB) that classify input text and choose an extraction mode based on heuristics.

### InputType

```python
class InputType(str, Enum):
    SHORT = "short"        # < 500 chars
    MEDIUM = "medium"      # 500-2000 chars, unstructured
    LONG = "long"          # > 2000 chars, unstructured
    STRUCTURED = "structured"  # > 500 chars with headers/bullets/tables
```

### ExtractionMode

```python
class ExtractionMode(str, Enum):
    SINGLE_SHOT = "single_shot"
    CHUNKED = "chunked"
```

### InputClassification

Result of input text analysis.

| Field | Type | Description |
|-------|------|-------------|
| `input_type` | `InputType` | Classified input type. |
| `char_count` | `int` | Number of characters. |
| `estimated_tokens` | `int` | Estimated token count (chars // 4). |
| `has_headers` | `bool` | Whether headers were detected. |
| `has_bullets` | `bool` | Whether bullet points were detected. |
| `has_tables` | `bool` | Whether tables were detected. |
| `section_count` | `int` | Number of text sections. |
| `line_count` | `int` | Number of lines. |

### ExtractionStrategy

Selected extraction strategy.

| Field | Type | Description |
|-------|------|-------------|
| `mode` | `ExtractionMode` | Extraction mode (single_shot or chunked). |
| `reason` | `str` | Human-readable reason for the selection. |
| `max_tokens_per_call` | `int` | Max tokens per LLM call. |
| `estimated_chunks` | `int` | Number of expected chunks (1 for single-shot). |
| `chunk_context_hint` | `str \| None` | Hint about document type for chunked mode. |

### classify_input

Classify input text using heuristics (no LLM call).

```python
def classify_input(text: str) -> InputClassification
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `text` | `str` | Input text to classify. |

**Returns:** `InputClassification` with detected features.

```python
from arandu.write import classify_input, select_strategy

classification = classify_input("My wife's name is Ana and we live in Sao Paulo.")
print(classification.input_type)  # InputType.SHORT
print(classification.char_count)  # 47
```

### select_strategy

Select extraction strategy from a classification result.

```python
def select_strategy(classification: InputClassification) -> ExtractionStrategy
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `classification` | `InputClassification` | Result of `classify_input()`. |

**Returns:** `ExtractionStrategy` with mode and parameters.

```python
strategy = select_strategy(classification)
print(strategy.mode)             # ExtractionMode.SINGLE_SHOT
print(strategy.estimated_chunks) # 1
```

---

## Attribute Key Canonicalization

Pipeline: **exact match** → **alias** → **dotted variant** → **suffix** → **open catalog** → **drop**.

### normalize_key

Normalize a raw attribute key: lowercase, strip, spaces/hyphens to dots. Underscores are preserved.

```python
def normalize_key(raw: str) -> str
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `raw` | `str` | Raw attribute key string. |

**Returns:** Normalized key string.

```python
from arandu.write import normalize_key

normalize_key("Personal Info")    # "personal.info"
normalize_key("food_preference")  # "food_preference"
```

### validate_proposed_key

Validate that a proposed key meets naming rules.

```python
def validate_proposed_key(
    key: str,
    extra_namespaces: set[str] | None = None,
) -> bool
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `key` | `str` | Normalized key to validate. |
| `extra_namespaces` | `set[str] \| None` | Optional deployer-provided namespaces to accept. |

**Returns:** `True` if key is well-formed and in an allowed namespace.

### canonicalize_attribute_key

Canonicalize an attribute key via catalog, alias, and recovery strategies. This is an async function that queries the database for registry lookups.

```python
async def canonicalize_attribute_key(
    session: AsyncSession,
    user_id: str,
    raw_key: str,
    config: MemoryConfig,
) -> tuple[str | None, Literal["allow", "map", "propose", "drop"], dict[str, Any]]
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `session` | `AsyncSession` | Database session. |
| `user_id` | `str` | User identifier. |
| `raw_key` | `str` | Raw attribute key from extraction. |
| `config` | `MemoryConfig` | Memory configuration. |

**Returns:** Tuple of `(canonical_key, action, metadata)` where action is one of `"allow"`, `"map"`, `"propose"`, or `"drop"`.
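
A sketch of handling the returned action (the raw key and its result are illustrative, assuming an alias like `{"hometown": "city"}` is configured):

```python
from arandu.write import canonicalize_attribute_key

canonical, action, meta = await canonicalize_attribute_key(
    session, "user_123", "hometown", config,
)
if action == "drop":
    pass  # key was rejected; skip this attribute
else:
    print(canonical, action)  # e.g. ("city", "map") when the alias matches
```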

---

## Entity Helpers

Async CRUD operations for `MemoryEntity` records using PostgreSQL `ON CONFLICT` upsert.

### create_or_update_entity

Create a `MemoryEntity` or update if it exists.

```python
async def create_or_update_entity(
    session: AsyncSession,
    user_id: str,
    canonical_key: str,
    display_name: str | None = None,
    entity_type: str = "other",
) -> MemoryEntity
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `session` | `AsyncSession` | Database session. |
| `user_id` | `str` | User identifier. |
| `canonical_key` | `str` | Canonical entity key. |
| `display_name` | `str \| None` | Optional display name. |
| `entity_type` | `str` | Entity type (person, pet, place, etc.). Default `"other"`. |

**Returns:** The created or updated `MemoryEntity`.
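
For example (the canonical key is illustrative; `id` reflects the UUID primary key noted in the schema section):

```python
from arandu.write import create_or_update_entity, get_entity_by_key

entity = await create_or_update_entity(
    session, "user_123",
    canonical_key="person.ana",
    display_name="Ana",
    entity_type="person",
)
same = await get_entity_by_key(session, "user_123", "person.ana")
assert same is not None and same.id == entity.id  # ON CONFLICT upsert: one record
```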

### get_entity_by_key

Get a single `MemoryEntity` by user_id and canonical_key.

```python
async def get_entity_by_key(
    session: AsyncSession,
    user_id: str,
    canonical_key: str,
) -> MemoryEntity | None
```

**Returns:** `MemoryEntity` or `None` if not found.

### get_entities_for_user

List all `MemoryEntity` records for a user.

```python
async def get_entities_for_user(
    session: AsyncSession,
    user_id: str,
    active_only: bool = True,
) -> list[MemoryEntity]
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `session` | `AsyncSession` | Database session. |
| `user_id` | `str` | User identifier. |
| `active_only` | `bool` | If True, only return active entities. Default `True`. |

**Returns:** List of `MemoryEntity` records, ordered by `last_seen_at` descending.

---

## Correction Detection

Detects when users correct memory facts by comparing old vs new values for the same attribute_key.

### CorrectionResult

| Field | Type | Description |
|-------|------|-------------|
| `corrections_detected` | `int` | Number of corrections found. Default `0`. |
| `corrected_keys` | `list[str]` | Attribute keys that were corrected. |
| `facts_corrected_ids` | `list[str]` | IDs of old facts that were corrected. |

### is_user_correction

Check if a new fact corrects an old fact (same key, different value).

```python
def is_user_correction(old_fact: object, new_fact: object) -> bool
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `old_fact` | `object` | The existing fact being superseded. |
| `new_fact` | `object` | The new fact replacing it. |

**Returns:** `True` if this is a user correction.

### detect_and_record_corrections

Detect supersedes with value changes and increment correction count on old facts.

```python
async def detect_and_record_corrections(
    session: AsyncSession,
    user_id: str,
    saved_facts: list[Any],
) -> CorrectionResult
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `session` | `AsyncSession` | Database session. |
| `user_id` | `str` | User identifier. |
| `saved_facts` | `list[Any]` | List of newly saved MemoryFact objects. |

**Returns:** `CorrectionResult` with detection stats.
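
A sketch of running detection right after the pipeline saves facts (`saved_facts` stands in for the newly written `MemoryFact` rows):

```python
from arandu.write import detect_and_record_corrections

result = await detect_and_record_corrections(session, "user_123", saved_facts)
if result.corrections_detected:
    print("corrected keys:", result.corrected_keys)
```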

---

## Pending Operations

In-memory store for pending destructive operations with a 5-minute TTL. State is per-process and lost on restart.

### save_pending_selection

Save a pending selection when a search returned results awaiting user choice.

```python
def save_pending_selection(
    user_id: str,
    intent: str,
    transactions: list[Any],
    confirmation_text: str,
    edit_params: dict[str, Any] | None = None,
) -> None
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `user_id` | `str` | User identifier. |
| `intent` | `str` | The user's intent (delete, edit, etc.). |
| `transactions` | `list[Any]` | List of candidate transactions. |
| `confirmation_text` | `str` | Text to show user for confirmation. |
| `edit_params` | `dict \| None` | Optional parameters for edit operations. |

### save_pending_execution

Save a pending execution when a destructive operation was blocked.

```python
def save_pending_execution(
    user_id: str,
    tool_calls: list[Any],
    search_result: str,
    confirmation_text: str,
) -> None
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `user_id` | `str` | User identifier. |
| `tool_calls` | `list[Any]` | Blocked tool calls. |
| `search_result` | `str` | Context from the search. |
| `confirmation_text` | `str` | Text to show user for confirmation. |

### get_pending

Get pending operation if it exists and hasn't expired (5-minute TTL).

```python
def get_pending(user_id: str) -> dict[str, Any] | None
```

**Returns:** Pending operation dict, or `None` if expired/absent.

### clear_pending

Remove pending operation after execution or cancellation.

```python
def clear_pending(user_id: str) -> None
```
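
A sketch of the confirm-then-execute flow (the shape of `tool_calls` and of the returned dict is illustrative; the actual execution step depends on your agent):

```python
from arandu.write import save_pending_execution, get_pending, clear_pending

# A destructive operation was blocked; park it and ask the user to confirm.
blocked_call = {"tool": "delete_facts", "args": {"entity": "person.ana"}}  # illustrative
save_pending_execution(
    user_id="user_123",
    tool_calls=[blocked_call],
    search_result="3 facts about 'Ana' matched",
    confirmation_text="Delete all 3 facts about Ana?",
)

# Later, when the user replies "yes" (within the 5-minute TTL):
pending = get_pending("user_123")
if pending is not None:
    ...  # execute the stored operation, then:
    clear_pending("user_123")
```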

---

# Read Pipeline API

> **Advanced API:** These functions are for power users who want to drive individual retrieval stages directly. Most users should use [`MemoryClient.retrieve()`](../reference/index.md) instead, which orchestrates the full multi-signal pipeline automatically.

All read pipeline functions are exported from `arandu.read`.

```python
from arandu.read import (
    run_read_pipeline,
    plan_retrieval, expand_query,
    retrieve_relevant_events, compute_pattern_signal,
    retrieve_graph_facts, spread_activation,
    compress_context, compress_broad_context,
    materialize_emotional_trends, get_emotional_summary_for_context,
    compute_dynamic_importance,
    generate_optimized_directives, check_directive_contradiction,
    effective_confidence, invalidate_directive_cache,
)
```

---

## Pipeline Orchestrator

### run_read_pipeline

Executes the full read pipeline: **agent** → **retrieve (multi-signal)** → **rerank** → **format**.

Multi-signal retrieval runs semantic + keyword + graph in parallel via `asyncio.gather()`. The retrieval agent plans which entities to use for the graph signal and reformulates the query.

```python
async def run_read_pipeline(
    session: AsyncSession,
    user_id: str,
    query: str,
    llm: LLMProvider,
    embeddings: EmbeddingProvider,
    config: MemoryConfig,
) -> ReadResult
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `session` | `AsyncSession` | Database session (caller manages transaction). |
| `user_id` | `str` | User identifier. |
| `query` | `str` | The query to search memory for. |
| `llm` | `LLMProvider` | Injected LLM provider. |
| `embeddings` | `EmbeddingProvider` | Injected embedding provider. |
| `config` | `MemoryConfig` | Memory configuration. |

**Returns:** `ReadResult` with `facts` (list of `ScoredFact`), `context` (prompt-ready string), `total_candidates`, and `duration_ms`.
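
A sketch of a direct call (assumes the same providers, config, and session factory as the write-pipeline example):

```python
from arandu.read import run_read_pipeline

async with SessionFactory() as session:
    result = await run_read_pipeline(
        session=session,
        user_id="user_123",
        query="where does the user's sister live?",
        llm=provider,
        embeddings=provider,
        config=config,
    )

print(f"{len(result.facts)} facts from {result.total_candidates} candidates")
print(result.context)  # prompt-ready string
```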

---

## Retrieval Agent

The retrieval agent is an LLM planner that analyzes the user query and decides the retrieval strategy before any search happens.

### PatternQuery

A pattern-based query for keyword signal matching.

| Field | Type | Description |
|-------|------|-------------|
| `entity_pattern` | `str` | SQL LIKE pattern for entity_key matching. |
| `attribute_filter` | `str \| None` | Optional attribute key filter (always `None` in V5). |

### RetrievalPlan

Output of the retrieval agent. V5 runs all signals (semantic, graph, keyword) in parallel.

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `strategy` | `str` | `"multi_signal"` | `"multi_signal"` (default) or `"skip"`. |
| `entities` | `list[str]` | `[]` | Detected entity_keys for graph signal. |
| `pattern_queries` | `list[PatternQuery]` | `[]` | Pattern queries for keyword signal. |
| `similarity_query` | `str \| None` | `None` | Reformulated query for semantic signal. |
| `max_facts` | `int` | `50` | Budget per signal. |
| `reason` | `str` | `""` | Why this plan was chosen. |
| `latency_ms` | `float` | `0.0` | Time spent planning. |
| `as_of_range` | `tuple[datetime, datetime] \| None` | `None` | Optional time-travel window. |
| `broad_query` | `bool` | `False` | True for comprehensive queries. |

### plan_retrieval

Call LLM to decide retrieval strategy. Falls back to `multi_signal` on timeout/parse/API error.

```python
async def plan_retrieval(
    session: AsyncSession,
    user_id: str,
    query_text: str,
    llm: LLMProvider,
    *,
    session_context: Any | None = None,
) -> RetrievalPlan
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `session` | `AsyncSession` | Database session. |
| `user_id` | `str` | User identifier. |
| `query_text` | `str` | The user's query. |
| `llm` | `LLMProvider` | Injected LLM provider. |
| `session_context` | `Any \| None` | Optional session digest with anaphora context. |

**Returns:** `RetrievalPlan` with strategy, entities, and query parameters.

---

## Query Expansion

Post-processes a `RetrievalPlan` with entity priming: it resolves entities mentioned in the query via the knowledge graph (aliases + relationships) and injects context terms.

### ExpandedQuery

| Field | Type | Description |
|-------|------|-------------|
| `primed_entities` | `list[str]` | Entity keys discovered via alias + KG priming. |
| `temporal_range` | `tuple[datetime, datetime] \| None` | Resolved date range (from retrieval agent). |
| `expanded_terms` | `list[str]` | Additional context terms from entity facts. |

### expand_query

Expand a retrieval plan with entity priming. Fail-safe: any exception returns an empty `ExpandedQuery`.

```python
async def expand_query(
    session: AsyncSession,
    user_id: str,
    query: str,
    plan: RetrievalPlan,
    llm: object,
) -> ExpandedQuery
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `session` | `AsyncSession` | Database session. |
| `user_id` | `str` | User identifier. |
| `query` | `str` | Original user query text. |
| `plan` | `RetrievalPlan` | RetrievalPlan from the retrieval agent. |
| `llm` | `object` | LLM provider (reserved for future use). |

**Returns:** `ExpandedQuery` with primed entities, temporal range, and expanded terms.

---

## Fact Retrieval

### retrieve_relevant_events

Retrieve relevant events by embedding similarity + recency scoring.

```python
async def retrieve_relevant_events(
    session: AsyncSession,
    user_id: str,
    query_embedding: list[float],
    config: MemoryConfig,
    limit: int | None = None,
) -> list[dict[str, Any]]
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `session` | `AsyncSession` | Database session. |
| `user_id` | `str` | User identifier. |
| `query_embedding` | `list[float]` | Query embedding vector. |
| `config` | `MemoryConfig` | Memory configuration. |
| `limit` | `int \| None` | Max events to return. |

**Returns:** List of event dicts with `date`, `text`, `score`, `event_id`.

### compute_pattern_signal

Boost facts with high reinforcement counts (pattern signal). Facts that have been confirmed/reinforced multiple times get a small additive score boost (up to 0.1).

```python
def compute_pattern_signal(
    candidates: list[RetrievalCandidate],
) -> list[RetrievalCandidate]
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `candidates` | `list[RetrievalCandidate]` | Current ranked candidates. |

**Returns:** Candidates with updated scores, sorted by `final_score`.

---

## Graph Retrieval

BFS 2-hop traversal on the `MemoryEntityRelationship` knowledge graph with relevance pruning.

### GraphRetrievalResult

| Field | Type | Description |
|-------|------|-------------|
| `facts` | `list[dict[str, Any]]` | Scored fact dicts with `source="graph"`. |
| `neighbor_keys` | `list[str]` | Entity keys discovered via BFS. |
| `edges_traversed` | `int` | Total edges examined during BFS. |
| `edges` | `list[dict[str, Any]]` | Deduplicated edge dicts with display names. |

### retrieve_graph_facts

BFS 2-hop retrieval with composite scoring: `edge_strength * recency * edge_recency * query_bonus`.

```python
async def retrieve_graph_facts(
    session: AsyncSession,
    user_id: str,
    entity_keys: list[str],
    *,
    min_confidence: float = 0.3,
    as_of_start: datetime | None = None,
    as_of_end: datetime | None = None,
    broad_query: bool = False,
    max_facts: int | None = None,
    query_text: str = "",
    min_edge_strength: float = 0.5,
) -> GraphRetrievalResult
```

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `session` | `AsyncSession` | -- | Database session. |
| `user_id` | `str` | -- | User identifier. |
| `entity_keys` | `list[str]` | -- | Seed entity_keys to start BFS from. |
| `min_confidence` | `float` | `0.3` | Minimum fact confidence threshold. |
| `as_of_start` | `datetime \| None` | `None` | Start of temporal window. |
| `as_of_end` | `datetime \| None` | `None` | End of temporal window. |
| `broad_query` | `bool` | `False` | When True, allows expanded budget. |
| `max_facts` | `int \| None` | `None` | Override default limit (30). |
| `query_text` | `str` | `""` | Original query text for query_bonus scoring. |
| `min_edge_strength` | `float` | `0.5` | Minimum edge strength for hop 2+ pruning. |

**Returns:** `GraphRetrievalResult` with scored facts and graph metadata.
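
For example, seeding BFS from an entity detected by the retrieval plan (the entity key is illustrative):

```python
from arandu.read import retrieve_graph_facts

graph = await retrieve_graph_facts(
    session, "user_123",
    entity_keys=["person.ana"],
    query_text="where does Ana work?",   # enables the query_bonus term
    min_edge_strength=0.5,               # prunes weak edges at hop 2+
)
print(graph.edges_traversed, graph.neighbor_keys)
for fact in graph.facts:
    print(fact)  # scored dicts with source="graph"
```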

---

## Spreading Activation

Expands context from seed facts by following `entity_key`, `cluster_id`, and knowledge graph relationship links. Uses dynamic importance scoring with decay per hop.

### SpreadingActivationResult

| Field | Type | Description |
|-------|------|-------------|
| `candidates` | `list[RetrievalCandidate]` | Expanded candidates from hop 1-2. |
| `meta_observations` | `list[Any]` | Relevant meta-observations referencing seed facts. |
| `entities_explored` | `list[str]` | Entity keys explored during spreading. |
| `clusters_explored` | `list[str]` | Cluster IDs explored during spreading. |
| `hop1_count` | `int` | Number of facts found in hop 1. |
| `hop2_count` | `int` | Number of facts found in hop 2. |
| `kg_relationships_explored` | `int` | Number of KG relationships traversed. |

### spread_activation

Expand context from seed facts via entity_key, cluster_id, and KG relationships (hop 1-2).

```python
async def spread_activation(
    session: AsyncSession,
    user_id: str,
    seed_fact_ids: list[str],
    config: MemoryConfig,
    *,
    seed_scores: dict[str, float] | None = None,
    allowed_keys: set[str] | None = None,
) -> list[RetrievalCandidate]
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `session` | `AsyncSession` | Database session. |
| `user_id` | `str` | User identifier. |
| `seed_fact_ids` | `list[str]` | IDs of seed facts to expand from. |
| `config` | `MemoryConfig` | Memory configuration with spreading activation params. |
| `seed_scores` | `dict[str, float] \| None` | Optional dict mapping seed fact ID to score. |
| `allowed_keys` | `set[str] \| None` | Optional set of allowed attribute keys. |

**Returns:** List of `RetrievalCandidate` objects from spreading activation. Fail-safe: returns empty list on error.

---

## Context Compression

Builds a prompt-ready context string from scored facts, events, clusters, and meta-observations using a tiered system: **Hot** (Tier 1), **Warm** (Tier 2), **Cold** (Tier 3).

### CompressedContext

| Field | Type | Description |
|-------|------|-------------|
| `context_text` | `str` | Final prompt-ready context string. |
| `hot_count` | `int` | Number of facts in hot tier (Tier 1). |
| `warm_count` | `int` | Number of facts in warm tier (Tier 2). |
| `cold_count` | `int` | Number of items in cold tier (Tier 3). |
| `total_tokens` | `int` | Estimated token count of context_text. |

### compress_context

Build tiered context text within token budget.

```python
async def compress_context(
    facts: list[dict[str, Any]],
    events: list[dict[str, Any]],
    config: MemoryConfig,
    *,
    clusters: list[Any] | None = None,
    meta_observations: list[Any] | None = None,
    stale_keys: set[str] | None = None,
    stale_threshold_days: int = 90,
    now: datetime | None = None,
) -> CompressedContext
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `facts` | `list[dict]` | Scored fact dicts (must have `score`, `fact`, `entity`, `attribute`, `value`, `date` keys). |
| `events` | `list[dict]` | Event dicts with `date` and `text` keys. |
| `config` | `MemoryConfig` | Memory configuration with token budget and tier ratios. |
| `clusters` | `list \| None` | Optional cluster objects. |
| `meta_observations` | `list \| None` | Optional meta-observation objects. |
| `stale_keys` | `set[str] \| None` | Attribute keys considered always-stale. |
| `stale_threshold_days` | `int` | Days after which a fact is stale (default 90). |
| `now` | `datetime \| None` | Current timestamp (defaults to UTC now). |

**Returns:** `CompressedContext` with tiered context text.

### compress_broad_context

Build context for broad queries using clusters as primary unit.

```python
async def compress_broad_context(
    cluster_facts: dict[str, list[dict[str, Any]]],
    clusters: list[Any],
    config: MemoryConfig,
    *,
    meta_observations: list[Any] | None = None,
    events: list[dict[str, Any]] | None = None,
) -> CompressedContext
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `cluster_facts` | `dict[str, list[dict]]` | Mapping of cluster_label to fact dicts. |
| `clusters` | `list[Any]` | Cluster objects with `label`, `summary_text`, `fact_count`. |
| `config` | `MemoryConfig` | Memory configuration. |
| `meta_observations` | `list \| None` | Optional meta-observation objects. |
| `events` | `list[dict] \| None` | Optional event dicts. |

**Returns:** `CompressedContext` with cluster-first context text.

---

## Emotional Trends

Materializes emotional trends from memory events and provides formatted summaries for injection into retrieval context.

### EmotionalTrendsResult

| Field | Type | Description |
|-------|------|-------------|
| `emotion_counts` | `dict[str, int]` | Mapping of emotion to occurrence count. |
| `trend_direction` | `str` | `"increasing"`, `"decreasing"`, or `"stable"`. |
| `dominant_emotion` | `str \| None` | Most frequent emotion, or None. |
| `trigger_keywords` | `list[str]` | Top keywords from high-intensity events. |
| `avg_intensity` | `float` | Average emotion intensity across events. |
| `dominant_intensity` | `float` | Average intensity of the dominant emotion. |
| `dominant_energy` | `str` | Predominant energy level (high/medium/low). |
| `events_analyzed` | `int` | Number of events analyzed. |
| `observation_created` | `bool` | Whether a meta-observation was created/updated. |
| `observation_id` | `str \| None` | ID of the created/updated observation. |

### materialize_emotional_trends

Aggregate emotion data from events, detect trends, and materialize as a meta-observation.

```python
async def materialize_emotional_trends(
    session: AsyncSession,
    user_id: str,
    config: MemoryConfig,
) -> EmotionalTrendsResult
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `session` | `AsyncSession` | Database session. |
| `user_id` | `str` | User identifier. |
| `config` | `MemoryConfig` | Memory configuration with trend window and min events. |

**Returns:** `EmotionalTrendsResult` with aggregated trend data.

### get_emotional_summary_for_context

Return formatted emotional summary for injection into retrieval context. Returns `None` if no recent (7-day) active emotional trend exists.

```python
async def get_emotional_summary_for_context(
    session: AsyncSession,
    user_id: str,
) -> str | None
```

**Returns:** Formatted summary string, or `None`.

---

## Dynamic Importance

### compute_dynamic_importance

Compute dynamic importance score for a memory fact. Inspired by cognitive memory strength models.

Components:

- **retrieval_boost**: `log(1 + times_retrieved)`, which saturates gradually
- **recency_of_use_boost**: decays from `last_retrieved_at` (half-life 7 days)
- **correction_penalty**: `0.8^n`, where `n` is the number of user corrections
- **pattern_boost**: 1.3× if the fact is part of an active meta-observation

```python
def compute_dynamic_importance(
    base_importance: float,
    times_retrieved: int,
    last_retrieved_at: datetime | None,
    user_correction_count: int,
    is_in_active_pattern: bool,
    now: datetime | None = None,
) -> float
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `base_importance` | `float` | Base importance score (typically 0.5). |
| `times_retrieved` | `int` | Number of times this fact has been retrieved. |
| `last_retrieved_at` | `datetime \| None` | When the fact was last retrieved. |
| `user_correction_count` | `int` | Number of user corrections on this fact. |
| `is_in_active_pattern` | `bool` | Whether fact is part of an active meta-observation. |
| `now` | `datetime \| None` | Current timestamp (defaults to UTC now). |

**Returns:** Dynamic importance score, clamped to `[0.05, 3.0]`.
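
For example (values are illustrative; the comments map each argument to the component it feeds):

```python
from datetime import datetime, timedelta, timezone
from arandu.read import compute_dynamic_importance

score = compute_dynamic_importance(
    base_importance=0.5,
    times_retrieved=4,                                                  # retrieval_boost
    last_retrieved_at=datetime.now(timezone.utc) - timedelta(days=3),   # recency_of_use_boost
    user_correction_count=1,                                            # correction_penalty 0.8^1
    is_in_active_pattern=True,                                          # pattern_boost 1.3x
)
assert 0.05 <= score <= 3.0  # documented clamp range
```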

---

## Procedural Memory

A system of LLM-optimized behavioral directives that compresses the persona plus learned behavioral preferences into a cohesive instruction block.

### DirectiveBlock

| Field | Type | Description |
|-------|------|-------------|
| `text` | `str` | Cohesive behavioral instructions block. |
| `directive_count` | `int` | Number of active directives used. |
| `cache_hit` | `bool` | Whether this was served from cache. |

### ContradictionResult

| Field | Type | Description |
|-------|------|-------------|
| `has_contradiction` | `bool` | Whether a contradiction was found. |
| `conflicting_directive` | `str \| None` | Title of the conflicting directive. |
| `resolution` | `str \| None` | Explanation of how the contradiction was resolved. |

### generate_optimized_directives

Generate an LLM-optimized behavioral instructions block by integrating persona + learned directives.

```python
async def generate_optimized_directives(
    session: AsyncSession,
    user_id: str,
    llm_provider: LLMProvider,
    config: MemoryConfig,
    *,
    persona_text: str = "",
) -> DirectiveBlock
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `session` | `AsyncSession` | Database session. |
| `user_id` | `str` | User identifier. |
| `llm_provider` | `LLMProvider` | Injected LLM provider. |
| `config` | `MemoryConfig` | Memory configuration. |
| `persona_text` | `str` | Optional persona description. |

**Returns:** `DirectiveBlock` with generated text. Result is cached by hash of directive IDs + reinforcement counts. Fail-safe: returns empty `DirectiveBlock` on error.

### check_directive_contradiction

Check a new directive against existing ones for contradictions. Uses embedding similarity as pre-filter, then LLM as judge.

```python
async def check_directive_contradiction(
    session: AsyncSession,
    user_id: str,
    new_directive: str,
    embedding_provider: EmbeddingProvider,
    llm_provider: LLMProvider,
    *,
    similarity_threshold: float = 0.80,
) -> ContradictionResult
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `session` | `AsyncSession` | Database session. |
| `user_id` | `str` | User identifier. |
| `new_directive` | `str` | Text of the new directive to check. |
| `embedding_provider` | `EmbeddingProvider` | Injected embedding provider. |
| `llm_provider` | `LLMProvider` | Injected LLM provider. |
| `similarity_threshold` | `float` | Minimum similarity to trigger LLM check (default 0.80). |

**Returns:** `ContradictionResult` with check outcome. Fail-safe: returns no contradiction on error.
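
A sketch of gating a new directive before saving it (the directive text is illustrative):

```python
from arandu.read import check_directive_contradiction

result = await check_directive_contradiction(
    session, "user_123",
    new_directive="Always answer in formal Portuguese.",
    embedding_provider=provider,
    llm_provider=provider,
)
if result.has_contradiction:
    print(result.conflicting_directive, result.resolution)
```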

### effective_confidence

Apply temporal decay to directive confidence. Formula: `base_confidence * 0.95^weeks`.

```python
def effective_confidence(
    base_confidence: float,
    created_at: datetime,
    now: datetime | None = None,
) -> float
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `base_confidence` | `float` | Original confidence value (0.0-1.0). |
| `created_at` | `datetime` | When the directive was created. |
| `now` | `datetime \| None` | Current timestamp (defaults to UTC now). |

**Returns:** Decayed confidence, floored at 0.10.
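
For instance, a directive created 10 weeks ago with base confidence 0.9 decays to roughly 0.54:

```python
from datetime import datetime, timedelta, timezone
from arandu.read import effective_confidence

created = datetime.now(timezone.utc) - timedelta(weeks=10)
print(effective_confidence(0.9, created))  # 0.9 * 0.95**10 ≈ 0.54
```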

### invalidate_directive_cache

Manually invalidate the directive cache for a user.

```python
def invalidate_directive_cache(user_id: str) -> None
```

---

# Database Utilities

The `arandu.db` module provides low-level database setup functions. These are used internally by `MemoryClient` but are available for advanced use cases where you need direct control over the database engine and session lifecycle.

```python
from arandu.db import create_engine, create_session_factory, init_db
```

---

## create_engine

Create an async SQLAlchemy engine from a connection string.

Automatically converts `postgresql://` to `postgresql+psycopg://` if the async driver prefix is missing.

```python
def create_engine(database_url: str) -> AsyncEngine
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `database_url` | `str` | PostgreSQL connection string. |

**Returns:** `AsyncEngine` instance.

```python
from arandu.db import create_engine

engine = create_engine("postgresql://user:pass@localhost:5432/mydb")
# Internally becomes: postgresql+psycopg://user:pass@localhost:5432/mydb
```

---

## create_session_factory

Create an async session factory bound to the given engine.

```python
def create_session_factory(engine: AsyncEngine) -> async_sessionmaker[AsyncSession]
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `engine` | `AsyncEngine` | The async engine to bind sessions to. |

**Returns:** `async_sessionmaker[AsyncSession]` with `expire_on_commit=False`.

```python
from arandu.db import create_engine, create_session_factory

engine = create_engine("postgresql://user:pass@localhost:5432/mydb")
SessionFactory = create_session_factory(engine)

async with SessionFactory() as session:
    # Use session for queries
    ...
```

---

## init_db

Create all memory tables in the consumer's database.

Uses `Base.metadata.create_all`; it is safe to call multiple times (it creates only tables that don't already exist) and ensures all SQLAlchemy model classes are registered before the tables are created.

```python
async def init_db(engine: AsyncEngine) -> None
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `engine` | `AsyncEngine` | The async engine to create tables on. |

```python
from arandu.db import create_engine, init_db

engine = create_engine("postgresql://user:pass@localhost:5432/mydb")
await init_db(engine)
```

---

## Database Schema

The SDK defines its SQLAlchemy models in `arandu.models`. Key tables include:

| Table | Description |
|-------|-------------|
| `memory_events` | Immutable event records (user messages with embeddings). |
| `memory_facts` | Extracted facts with entity/attribute/value triples and embeddings. |
| `memory_entities` | Entity registry (people, places, pets, etc.). |
| `memory_entity_aliases` | Aliases for entity resolution. |
| `memory_entity_relationships` | Knowledge graph edges between entities. |
| `memory_clusters` | Semantic clusters of related facts. |
| `memory_meta_observations` | Detected patterns, insights, and behavioral preferences. |
| `memory_attribute_registry` | Custom attribute key registry per user. |
| `session_observations` | L1 session-level observations from the observer. |

All tables use UUID primary keys and include `user_id` for multi-tenant isolation. The `memory_facts` and `memory_events` tables have `pgvector` embedding columns for semantic search.

> **Schema Management:** For production deployments, consider using Alembic migrations instead of `init_db()`. The `init_db()` function is convenient for development and testing but does not handle schema migrations for existing tables.

---

# Data Types Reference

This page documents all dataclasses, enums, and result types used across the write pipeline, read pipeline, and background jobs that are not covered in the main [API Reference](../reference/index.md).

---

## Write Pipeline Types

### InputType

```python
class InputType(str, Enum)
```

Input text classification types, determined by heuristics in `classify_input()`.

| Value | Description |
|-------|-------------|
| `SHORT` | Less than 500 characters. |
| `MEDIUM` | 500-2000 characters, unstructured. |
| `LONG` | More than 2000 characters, unstructured. |
| `STRUCTURED` | More than 500 characters with headers, bullets, or tables. |

### ExtractionMode

```python
class ExtractionMode(str, Enum)
```

| Value | Description |
|-------|-------------|
| `SINGLE_SHOT` | Single LLM call for extraction. |
| `CHUNKED` | Input is split into chunks, each processed separately. |

### InputClassification

```python
@dataclass
class InputClassification
```

Result of `classify_input()`. See [Write Pipeline API](write-api.md#inputclassification) for full field reference.

### ExtractionStrategy

```python
@dataclass
class ExtractionStrategy
```

Result of `select_strategy()`. See [Write Pipeline API](write-api.md#extractionstrategy) for full field reference.

### CorrectionResult

```python
@dataclass
class CorrectionResult
```

Result of correction detection.

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `corrections_detected` | `int` | `0` | Number of corrections found. |
| `corrected_keys` | `list[str]` | `[]` | Attribute keys that were corrected. |
| `facts_corrected_ids` | `list[str]` | `[]` | IDs of old facts that were corrected. |

---

## Read Pipeline Types

### ExpandedQuery

```python
@dataclass
class ExpandedQuery
```

Result of query expansion (entity priming).

| Field | Type | Description |
|-------|------|-------------|
| `primed_entities` | `list[str]` | Entity keys discovered via alias + KG priming. |
| `temporal_range` | `tuple[datetime, datetime] \| None` | Resolved date range. |
| `expanded_terms` | `list[str]` | Additional context terms from entity facts. |

### PatternQuery

```python
@dataclass
class PatternQuery
```

A pattern-based query for keyword signal matching.

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `entity_pattern` | `str` | -- | SQL LIKE pattern for entity_key matching. |
| `attribute_filter` | `str \| None` | `None` | Optional attribute key filter. |

### RetrievalPlan

```python
@dataclass
class RetrievalPlan
```

Output of the retrieval agent LLM planner. See [Read Pipeline API](read-api.md#retrievalplan) for full field reference.

### GraphRetrievalResult

```python
@dataclass
class GraphRetrievalResult
```

Result of graph-based BFS 2-hop retrieval.

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `facts` | `list[dict[str, Any]]` | `[]` | Scored fact dicts with `source="graph"`. |
| `neighbor_keys` | `list[str]` | `[]` | Entity keys discovered via BFS. |
| `edges_traversed` | `int` | `0` | Total edges examined during BFS. |
| `edges` | `list[dict[str, Any]]` | `[]` | Deduplicated edge dicts with display names. |

### SpreadingActivationResult

```python
@dataclass
class SpreadingActivationResult
```

Result of spreading activation expansion.

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `candidates` | `list[RetrievalCandidate]` | `[]` | Expanded candidates from hop 1-2. |
| `meta_observations` | `list[Any]` | `[]` | Relevant meta-observations referencing seed facts. |
| `entities_explored` | `list[str]` | `[]` | Entity keys explored during spreading. |
| `clusters_explored` | `list[str]` | `[]` | Cluster IDs explored during spreading. |
| `hop1_count` | `int` | `0` | Number of facts found in hop 1. |
| `hop2_count` | `int` | `0` | Number of facts found in hop 2. |
| `kg_relationships_explored` | `int` | `0` | Number of KG relationships traversed. |

### CompressedContext

```python
@dataclass
class CompressedContext
```

Result of context compression (tiered hot/warm/cold).

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `context_text` | `str` | `""` | Final prompt-ready context string. |
| `hot_count` | `int` | `0` | Number of facts in hot tier (Tier 1). |
| `warm_count` | `int` | `0` | Number of facts in warm tier (Tier 2). |
| `cold_count` | `int` | `0` | Number of items in cold tier (Tier 3). |
| `total_tokens` | `int` | `0` | Estimated token count of context_text. |

### EmotionalTrendsResult

```python
@dataclass
class EmotionalTrendsResult
```

Result of emotional trend materialization.

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `emotion_counts` | `dict[str, int]` | `{}` | Mapping of emotion to occurrence count. |
| `trend_direction` | `str` | `"stable"` | `"increasing"`, `"decreasing"`, or `"stable"`. |
| `dominant_emotion` | `str \| None` | `None` | Most frequent emotion. |
| `trigger_keywords` | `list[str]` | `[]` | Top keywords from high-intensity events. |
| `avg_intensity` | `float` | `0.0` | Average emotion intensity. |
| `dominant_intensity` | `float` | `0.0` | Average intensity of the dominant emotion. |
| `dominant_energy` | `str` | `"medium"` | Predominant energy level. |
| `events_analyzed` | `int` | `0` | Number of events analyzed. |
| `observation_created` | `bool` | `False` | Whether a meta-observation was created/updated. |
| `observation_id` | `str \| None` | `None` | ID of the created/updated observation. |

### DirectiveBlock

```python
@dataclass
class DirectiveBlock
```

Result of directive generation (procedural memory).

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `text` | `str` | `""` | Cohesive behavioral instructions block. |
| `directive_count` | `int` | `0` | Number of active directives used. |
| `cache_hit` | `bool` | `False` | Whether this was served from cache. |

### ContradictionResult

```python
@dataclass
class ContradictionResult
```

Result of contradiction check between directives.

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `has_contradiction` | `bool` | `False` | Whether a contradiction was found. |
| `conflicting_directive` | `str \| None` | `None` | Title of the conflicting directive. |
| `resolution` | `str \| None` | `None` | Explanation of the resolution. |

---

## Background Job Result Types

### ClusteringResult

```python
@dataclass
class ClusteringResult
```

Result of fact clustering.

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `clusters_created` | `int` | `0` | Number of new clusters created. |
| `clusters_reinforced` | `int` | `0` | Number of existing clusters updated. |
| `summaries_generated` | `int` | `0` | Number of cluster summaries generated via LLM. |
| `facts_assigned` | `int` | `0` | Number of facts assigned to clusters. |

### CommunityDetectionResult

```python
@dataclass
class CommunityDetectionResult
```

Result of cross-entity community detection.

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `communities_created` | `int` | `0` | New community observations created. |
| `communities_reinforced` | `int` | `0` | Existing community observations reinforced. |
| `clusters_in_communities` | `int` | `0` | Total clusters assigned to communities. |
| `skipped` | `bool` | `False` | Whether detection was skipped. |
| `skip_reason` | `str \| None` | `None` | Reason for skipping. |

### ConsolidationResult

```python
@dataclass
class ConsolidationResult
```

Result of L2/L3 consolidation.

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `events_processed` | `int` | `0` | Number of events analyzed. |
| `observations_created` | `int` | `0` | New meta-observations created. |
| `observations_reinforced` | `int` | `0` | Existing observations reinforced. |
| `skipped` | `bool` | `False` | Whether consolidation was skipped. |
| `skip_reason` | `str \| None` | `None` | Reason for skipping. |

### MemifyResult

```python
@dataclass
class MemifyResult
```

Result of the memify pipeline (vitality scoring, staleness marking, edge management).

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `facts_scored` | `int` | `0` | Number of facts scored for vitality. |
| `facts_marked_stale` | `int` | `0` | Number of facts marked as stale. |
| `edges_reinforced` | `int` | `0` | Number of KG edges reinforced. |
| `merges_executed` | `int` | `0` | Number of entity merges executed. |

### EntityImportanceResult

```python
@dataclass
class EntityImportanceResult
```

Result of entity importance scoring (sleep-time compute).

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `entities_scored` | `int` | `0` | Number of entities scored. |
| `top_entities` | `list[tuple[str, float]]` | `[]` | Top entities by score (key, score) pairs. |

### SummaryRefreshResult

```python
@dataclass
class SummaryRefreshResult
```

Result of entity summary refresh (sleep-time compute).

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `summaries_refreshed` | `int` | `0` | Number of summaries generated. |
| `summaries_skipped` | `int` | `0` | Number of entities skipped. |

---

## Background Functions

### tag_event_emotion

Infer emotion, intensity, and energy from event text via LLM.

```python
async def tag_event_emotion(
    event_text: str,
    llm: LLMProvider,
) -> dict[str, Any] | None
```

| Parameter | Type | Description |
|-----------|------|-------------|
| `event_text` | `str` | Text to analyze. |
| `llm` | `LLMProvider` | Injected LLM provider. |

**Returns:** Dict with `emotion`, `intensity`, `energy` keys, or `None` on failure.

```python
from arandu.background import tag_event_emotion

result = await tag_event_emotion("I'm so happy today!", llm)
# {"emotion": "joy", "intensity": 0.85, "energy": "high"}
```

---

# Configuration

All memory system parameters are configured through a single `MemoryConfig` dataclass. Every parameter has a sensible default — override only what matters for your use case.

```python
from arandu import MemoryClient, MemoryConfig

config = MemoryConfig(
    extraction_mode="single_pass",
    topk_facts=30,
    enable_reranker=True,
)

memory = MemoryClient(
    database_url="postgresql+psycopg://...",
    llm=provider,
    embeddings=provider,
    config=config,
)
```

---

## Extraction

Parameters controlling how facts, entities, and relationships are extracted from messages.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `extraction_model` | `str` | `"gpt-4o"` | LLM model used for fact extraction |
| `extraction_mode` | `str` | `"multi_pass"` | Extraction strategy: `"multi_pass"` (3 LLM calls, higher accuracy) or `"single_pass"` (1 call, faster) |
| `extraction_timeout_sec` | `float` | `30.0` | Timeout per LLM call during extraction |
| `multi_pass_entity_threshold` | `int` | `3` | Entity count at or below which multi-pass falls back to single-pass |
| `entity_batch_size` | `int` | `5` | Number of entities per fact-extraction batch in multi-pass mode |
| `reflexion_timeout_sec` | `float` | `10.0` | Timeout for the optional reflexion (quality review) pass in single-pass mode |

**Tips:**

- Use `"single_pass"` for simple messages with 1-2 entities — it's faster and cheaper
- Increase `entity_batch_size` if your messages mention many entities at once
- Lower `extraction_timeout_sec` if you need faster responses at the cost of potentially missed extractions

---

## Entity Resolution

Parameters controlling how extracted entity mentions are resolved to canonical entity records.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `fuzzy_threshold` | `float` | `0.85` | Cosine similarity threshold for direct fuzzy match (above this = automatic match) |
| `enable_llm_resolution` | `bool` | `True` | Whether to use an LLM for ambiguous fuzzy matches (0.50–0.85 range) |
| `entity_resolution_model` | `str` | `"gpt-4o-mini"` | LLM model for entity disambiguation |
| `entity_resolution_timeout_sec` | `float` | `10.0` | Timeout for LLM entity resolution calls |

**Tips:**

- Lower `fuzzy_threshold` (e.g., 0.75) to be more aggressive in matching similar entity names
- Set `enable_llm_resolution=False` to skip the LLM fallback for ambiguous matches (faster, but may create more duplicate entities)

---

## Reconciliation

Parameters controlling how new facts are compared against existing knowledge.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `reconciliation_model` | `str` | `"gpt-4o-mini"` | LLM model for reconciliation decisions (ADD/UPDATE/NOOP/DELETE) |
| `reconciliation_timeout_sec` | `float` | `10.0` | Timeout for reconciliation LLM calls |

---

## Retrieval

Parameters controlling how facts are retrieved in response to queries.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `topk_facts` | `int` | `20` | Maximum number of facts to return |
| `topk_events` | `int` | `8` | Maximum number of events to consider for context |
| `event_max_scan` | `int` | `200` | Maximum events to scan during retrieval |
| `min_similarity` | `float` | `0.20` | Minimum cosine similarity for semantic search results |
| `min_confidence` | `float` | `0.55` | Minimum fact confidence to include in results |
| `recency_half_life_days` | `int` | `14` | Half-life (in days) for exponential recency decay |
| `context_budget_tokens` | `int` | `3000` | Total token budget for the retrieval pipeline |
| `enable_reranker` | `bool` | `True` | Whether to use LLM reranking on retrieval results |
| `reranker_model` | `str` | `"gpt-4o-mini"` | LLM model for reranking |
| `reranker_timeout_sec` | `float` | `5.0` | Timeout for reranker LLM calls |

**Tips:**

- Increase `topk_facts` (e.g., 50) for broader context at the cost of more noise
- Lower `min_similarity` (e.g., 0.10) to catch more distant semantic matches
- Increase `recency_half_life_days` (e.g., 30) if older facts should remain relevant longer
- Set `enable_reranker=False` for faster retrieval when precision is less critical

---

## Score Weights

Weights for the hybrid ranking formula that combines multiple retrieval signals.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `score_weights` | `dict` | `{"semantic": 0.70, "recency": 0.20, "importance": 0.10}` | Weights for each scoring signal (must sum to ~1.0) |

```python
config = MemoryConfig(
    score_weights={
        "semantic": 0.60,   # reduce semantic, boost other signals
        "recency": 0.25,
        "importance": 0.15,
    },
)
```

**Tips:**

- Increase `"recency"` weight for applications where freshness matters more than semantic relevance
- Increase `"importance"` weight to favor well-established entities and frequently mentioned facts

---

## Confidence

Parameters controlling confidence levels assigned to extracted facts.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `confidence_level_map` | `dict` | `{"explicit_statement": 0.95, "strong_inference": 0.80, "weak_inference": 0.60, "speculation": 0.40}` | Mapping from confidence level names to numeric scores |
| `confidence_default` | `float` | `0.60` | Default confidence when the LLM doesn't specify a level |

---

## Spreading Activation

Parameters controlling how context expands from seed facts along entity relationships.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `spreading_activation_hops` | `int` | `2` | Maximum number of relationship hops from seed facts |
| `spreading_decay_factor` | `float` | `0.50` | Score decay multiplier per hop (0.5 = halved each hop) |
| `spreading_max_related_entities` | `int` | `5` | Maximum related entities to follow per seed |
| `spreading_facts_per_entity` | `int` | `3` | Maximum facts to pull from each related entity |

---

## Context Compression

Parameters controlling how retrieved facts are compressed into the final context string.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `context_max_tokens` | `int` | `2000` | Maximum tokens in the formatted context output |
| `hot_tier_ratio` | `float` | `0.50` | Share of token budget for highest-scored facts |
| `warm_tier_ratio` | `float` | `0.30` | Share of token budget for supporting facts |

The remaining budget (1 - hot - warm = 0.20) goes to the cold tier (background context).
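
As a quick arithmetic check (the split itself happens inside `compress_context`):

```python
from arandu import MemoryConfig

config = MemoryConfig()  # defaults shown above
hot = int(config.context_max_tokens * config.hot_tier_ratio)    # 2000 * 0.50 = 1000
warm = int(config.context_max_tokens * config.warm_tier_ratio)  # 2000 * 0.30 = 600
cold = config.context_max_tokens - hot - warm                   # 400
```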

---

## Emotional Trends

Parameters for detecting emotional patterns in user messages.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `emotional_trend_window_days` | `int` | `30` | Window for analyzing emotional trends |
| `emotional_trend_min_events` | `int` | `5` | Minimum events required to detect a trend |

---

## Clustering

Parameters for the fact clustering background job.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `cluster_max_age_days` | `int` | `90` | Maximum age of facts to include in clustering |
| `cluster_min_facts` | `int` | `2` | Minimum facts per cluster |
| `community_similarity_threshold` | `float` | `0.75` | Cosine similarity threshold for grouping clusters into communities |
| `community_min_clusters` | `int` | `2` | Minimum clusters to form a community |

---

## Consolidation

Parameters for the consolidation background job.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `consolidation_min_events` | `int` | `3` | Minimum events before running consolidation |
| `consolidation_lookback_days` | `int` | `7` | How far back (in days) to look for patterns |

---

## Sleep-Time Compute

Parameters for background importance scoring and summary refresh.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `importance_recency_halflife_days` | `int` | `30` | Half-life for recency signal in importance scoring |
| `summary_refresh_interval_days` | `int` | `7` | Days before an entity summary is considered stale |

---

## Memify

Parameters for the memify (episodic → procedural knowledge) background job.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `vitality_stale_threshold` | `float` | `0.2` | Vitality score below which a fact is considered stale |
| `memify_merge_similarity_threshold` | `float` | `0.90` | Similarity threshold for merging similar procedures |

---

## Procedural Memory

Parameters for directive/procedural memory retrieval.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `directive_max_tokens` | `int` | `300` | Maximum tokens for procedural directives |
| `directive_cache_ttl_minutes` | `int` | `30` | Cache TTL for directive lookups |
| `contradiction_similarity_threshold` | `float` | `0.80` | Threshold for detecting contradictory directives |

---

## Locale / Deployment

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `timezone` | `str` | `"UTC"` | IANA timezone for temporal resolution in retrieval (e.g., `"America/Sao_Paulo"`) |

---

## Open Catalog (Deployer Extensions)

Parameters for extending the built-in attribute catalog with custom entries.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `extra_attribute_keys` | `set[str]` | `set()` | Additional attribute keys recognized during extraction |
| `attribute_aliases` | `dict[str, str]` | `{}` | Aliases for attribute keys (e.g., `{"hometown": "city"}`) |
| `extra_namespaces` | `set[str]` | `set()` | Additional entity namespaces beyond built-in types |
| `extra_self_references` | `frozenset[str]` | `frozenset()` | Additional words treated as self-references (e.g., `{"yo"}` for Spanish) |
| `extra_relationship_hints` | `frozenset[str]` | `frozenset()` | Additional relationship hint words for entity resolution |
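
Putting these together, a deployment that adds a custom attribute key, an alias, and extra self-references might configure the catalog like this (`favorite_team` and the Portuguese `"eu"` are illustrative, not built-in):

```python
from arandu import MemoryConfig

config = MemoryConfig(
    extra_attribute_keys={"favorite_team"},          # custom attribute key
    attribute_aliases={"hometown": "city"},          # map an alias onto a built-in key
    extra_self_references=frozenset({"yo", "eu"}),   # Spanish and Portuguese "I"
)
```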

---

## Limits

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `max_facts_per_event` | `int` | `100` | Maximum facts extracted from a single message |
| `embedding_dimensions` | `int` | `1536` | Dimensionality of embedding vectors (must match your provider) |

---

# Custom Providers

`arandu` uses Python protocols for dependency injection. You can use any LLM or embedding backend by implementing two simple interfaces — no inheritance required.

## The Protocols

The SDK defines two protocols in `arandu.protocols`:

### LLMProvider

Your LLM provider must implement a single `complete` method:

```python
class LLMProvider(Protocol):
    async def complete(
        self,
        messages: list[dict],
        temperature: float = 0,
        response_format: dict | None = None,
        max_tokens: int | None = None,
    ) -> str: ...
```

| Parameter | Description |
|-----------|-------------|
| `messages` | List of message dicts with `"role"` and `"content"` keys (OpenAI format) |
| `temperature` | Sampling temperature (0 = deterministic) |
| `response_format` | Optional format spec (e.g., `{"type": "json_object"}` for JSON mode) |
| `max_tokens` | Optional maximum tokens for the response |
| **Returns** | The assistant's response text as a string |

> **JSON mode support:** The memory pipeline relies heavily on JSON-mode responses (`response_format={"type": "json_object"}`).
> Your provider must support this — either natively or by parsing the response.
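
If your backend lacks native JSON mode, a common workaround is to instruct the model in the system prompt (as the examples below do) and strip any markdown fences before returning. A minimal, best-effort sketch (not part of the SDK):

```python
def strip_json_fences(text: str) -> str:
    """Best-effort cleanup for models that wrap JSON in markdown fences."""
    cleaned = text.strip()
    if cleaned.startswith("```"):
        # Drop the opening fence line (with its optional "json" tag) ...
        cleaned = cleaned.split("\n", 1)[1] if "\n" in cleaned else ""
        # ... and the closing fence, if present.
        if cleaned.rstrip().endswith("```"):
            cleaned = cleaned.rstrip()[:-3]
    return cleaned.strip()
```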

### EmbeddingProvider

Your embedding provider must implement two methods:

```python
class EmbeddingProvider(Protocol):
    async def embed(self, texts: list[str]) -> list[list[float]]: ...
    async def embed_one(self, text: str) -> list[float] | None: ...
```

| Method | Description |
|--------|-------------|
| `embed(texts)` | Generate embeddings for a batch of texts. Returns one vector per input. |
| `embed_one(text)` | Generate embedding for a single text. Returns `None` if text is empty/invalid. |

> **Embedding dimensions:** The default `embedding_dimensions` in `MemoryConfig` is 1536 (OpenAI `text-embedding-3-small`).
> If your provider uses different dimensions, set `MemoryConfig(embedding_dimensions=...)` accordingly.

---

## Example: Anthropic Provider

Here's a complete example implementing both protocols using the Anthropic SDK:

```python
from anthropic import AsyncAnthropic

class AnthropicLLMProvider:
    """LLM provider using Anthropic's Claude API."""

    def __init__(self, api_key: str, model: str = "claude-sonnet-4-20250514") -> None:
        self._client = AsyncAnthropic(api_key=api_key)
        self._model = model

    async def complete(
        self,
        messages: list[dict],
        temperature: float = 0,
        response_format: dict | None = None,
        max_tokens: int | None = None,
    ) -> str:
        # Convert OpenAI-format messages to Anthropic format
        system_msg = ""
        chat_messages = []
        for msg in messages:
            if msg["role"] == "system":
                system_msg = msg["content"]
            else:
                chat_messages.append({
                    "role": msg["role"],
                    "content": msg["content"],
                })

        # Add JSON instruction if json mode requested
        if response_format and response_format.get("type") == "json_object":
            system_msg += "\n\nRespond with valid JSON only. No markdown fences."

        response = await self._client.messages.create(
            model=self._model,
            system=system_msg,
            messages=chat_messages,
            temperature=temperature,
            max_tokens=max_tokens or 4096,
        )

        return response.content[0].text
```

> **Separate providers:** You can use different providers for LLM and embeddings. For example,
> use Anthropic for completions and OpenAI for embeddings:

```python
memory = MemoryClient(
    database_url="...",
    llm=AnthropicLLMProvider(api_key="sk-ant-..."),
    embeddings=OpenAIProvider(api_key="sk-..."),  # just for embeddings
)
```

---

## Example: Local Model Provider

For running with local models (e.g., via Ollama):

```python
import httpx

class OllamaProvider:
    """LLM + Embedding provider using a local Ollama server."""

    def __init__(
        self,
        base_url: str = "http://localhost:11434",
        model: str = "llama3.1",
        embedding_model: str = "nomic-embed-text",
    ) -> None:
        self._base_url = base_url
        self._model = model
        self._embedding_model = embedding_model
        self._client = httpx.AsyncClient(timeout=60.0)

    # -- LLMProvider --

    async def complete(
        self,
        messages: list[dict],
        temperature: float = 0,
        response_format: dict | None = None,
        max_tokens: int | None = None,
    ) -> str:
        payload: dict = {
            "model": self._model,
            "messages": messages,
            "stream": False,
            "options": {"temperature": temperature},
        }
        if response_format and response_format.get("type") == "json_object":
            payload["format"] = "json"

        response = await self._client.post(
            f"{self._base_url}/api/chat",
            json=payload,
        )
        response.raise_for_status()
        return response.json()["message"]["content"]

    # -- EmbeddingProvider --

    async def embed(self, texts: list[str]) -> list[list[float]]:
        # The protocol promises one vector per input, so empty texts are
        # padded with a zero vector instead of being silently dropped.
        vectors: list[list[float] | None] = []
        dims = 0
        for text in texts:
            if not text.strip():
                vectors.append(None)  # filled in once we know the dimension
                continue
            response = await self._client.post(
                f"{self._base_url}/api/embed",
                json={"model": self._embedding_model, "input": text},
            )
            response.raise_for_status()
            vector = response.json()["embeddings"][0]
            dims = len(vector)
            vectors.append(vector)
        return [v if v is not None else [0.0] * dims for v in vectors]

    async def embed_one(self, text: str) -> list[float] | None:
        if not text or not text.strip():
            return None
        results = await self.embed([text])
        return results[0] if results else None
```

> **Embedding dimensions:** When using local models, check the embedding dimensions and configure accordingly:

```python
config = MemoryConfig(
    embedding_dimensions=768,  # nomic-embed-text uses 768 dims
)
```

---

## Testing Your Provider

You can verify your provider works with the memory system before going to production:

```python
import asyncio
from arandu import MemoryClient

async def test_provider():
    provider = YourProvider(...)
    memory = MemoryClient(
        database_url="postgresql+psycopg://memory:memory@localhost/memory",
        llm=provider,
        embeddings=provider,
    )
    await memory.initialize()

    try:
        # Test write
        result = await memory.write(
            user_id="test",
            message="Testing the provider. My name is Alice and I work at Acme.",
        )
        assert len(result.facts_added) > 0, "No facts extracted — check LLM responses"
        assert len(result.entities_resolved) > 0, "No entities resolved"
        print(f"Write OK: {len(result.facts_added)} facts, {len(result.entities_resolved)} entities")

        # Test retrieve
        context = await memory.retrieve(user_id="test", query="who is Alice?")
        assert len(context.facts) > 0, "No facts retrieved — check embeddings"
        print(f"Retrieve OK: {len(context.facts)} facts found")
        print(f"Context: {context.context}")
    finally:
        await memory.close()

asyncio.run(test_provider())
```

## Key Requirements

When implementing a custom provider, keep these requirements in mind:

1. **JSON mode** — The pipeline sends `response_format={"type": "json_object"}` frequently. Your provider must return valid JSON when this is set.

2. **Async** — Both protocols are async (`async def`). If your backend SDK is synchronous, wrap calls with `asyncio.to_thread()`, as in the sketch after this list.

3. **Empty/error handling** — `embed_one` should return `None` for empty input, not raise. `embed` should return `[]` for empty input.

4. **Timeout** — Consider adding timeouts to your provider. The SDK sets timeouts on its side via `MemoryConfig`, but provider-level timeouts add an extra safety layer.

5. **Embedding dimensions** — Set `MemoryConfig(embedding_dimensions=N)` to match your provider's output dimensions. Mismatched dimensions will cause pgvector errors.
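
For requirement 2, here is a hedged sketch of adapting a blocking SDK; `sync_client` and its `complete()` method are hypothetical stand-ins for whatever synchronous client you have:

```python
import asyncio

class SyncWrappedLLM:
    """Adapts a blocking client to the async LLMProvider protocol."""

    def __init__(self, sync_client) -> None:
        self._client = sync_client  # hypothetical synchronous SDK client

    async def complete(
        self,
        messages: list[dict],
        temperature: float = 0,
        response_format: dict | None = None,
        max_tokens: int | None = None,
    ) -> str:
        # asyncio.to_thread runs the blocking call in a worker thread,
        # keeping the event loop free for other pipeline work.
        return await asyncio.to_thread(
            self._client.complete,  # hypothetical blocking method
            messages=messages,
            temperature=temperature,
            response_format=response_format,
            max_tokens=max_tokens,
        )
```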

---

# Cookbook

Complete, copy-paste-ready examples for common use cases.

---

## Basic Usage

The simplest integration: write facts from user messages and retrieve context for responses.

```python
import asyncio
from arandu import MemoryClient
from arandu.providers.openai import OpenAIProvider

async def main():
    provider = OpenAIProvider(api_key="sk-...")
    memory = MemoryClient(
        database_url="postgresql+psycopg://memory:memory@localhost:5432/memory",
        llm=provider,
        embeddings=provider,
    )
    await memory.initialize()

    try:
        # Simulate a conversation
        messages = [
            "Hi, I'm Rafael. I'm a backend engineer at Acme Corp in São Paulo.",
            "My girlfriend Ana is a UX designer. We have a cat named Pixel.",
            "I've been learning Rust lately, mostly on weekends.",
            "Actually, I just moved to Rio de Janeiro. Still remote at Acme.",
        ]

        for msg in messages:
            result = await memory.write(user_id="rafael", message=msg)
            added = len(result.facts_added)
            updated = len(result.facts_updated)
            print(f"Write: +{added} facts, ~{updated} updates ({result.duration_ms:.0f}ms)")

        # Retrieve context for different queries
        queries = [
            "where does Rafael live?",
            "tell me about Rafael's relationships",
            "what are Rafael's hobbies?",
        ]

        for query in queries:
            result = await memory.retrieve(user_id="rafael", query=query)
            print(f"\nQuery: {query}")
            print(f"Found {len(result.facts)} facts ({result.duration_ms:.0f}ms)")
            for fact in result.facts[:5]:
                print(f"  [{fact.score:.2f}] {fact.entity_name}: {fact.value}")
    finally:
        await memory.close()

asyncio.run(main())
```

---

## Custom Provider (Anthropic)

Use Claude as your LLM while keeping OpenAI for embeddings:

```python
import asyncio
from anthropic import AsyncAnthropic
from arandu import MemoryClient
from arandu.providers.openai import OpenAIProvider

class ClaudeLLM:
    """Anthropic Claude as LLM provider for arandu."""

    def __init__(self, api_key: str, model: str = "claude-sonnet-4-20250514") -> None:
        self._client = AsyncAnthropic(api_key=api_key)
        self._model = model

    async def complete(
        self,
        messages: list[dict],
        temperature: float = 0,
        response_format: dict | None = None,
        max_tokens: int | None = None,
    ) -> str:
        system_msg = ""
        chat_messages = []
        for msg in messages:
            if msg["role"] == "system":
                system_msg = msg["content"]
            else:
                chat_messages.append({"role": msg["role"], "content": msg["content"]})

        if response_format and response_format.get("type") == "json_object":
            system_msg += "\n\nYou MUST respond with valid JSON only. No markdown fences."

        response = await self._client.messages.create(
            model=self._model,
            system=system_msg,
            messages=chat_messages,
            temperature=temperature,
            max_tokens=max_tokens or 4096,
        )
        return response.content[0].text

async def main():
    # Claude for reasoning, OpenAI for embeddings
    llm = ClaudeLLM(api_key="sk-ant-...")
    embeddings = OpenAIProvider(api_key="sk-...")

    memory = MemoryClient(
        database_url="postgresql+psycopg://memory:memory@localhost/memory",
        llm=llm,
        embeddings=embeddings,
    )
    await memory.initialize()

    try:
        result = await memory.write(
            user_id="demo",
            message="I love hiking in the mountains. Last weekend I went to Serra da Mantiqueira.",
        )
        print(f"Extracted {len(result.facts_added)} facts using Claude")

        context = await memory.retrieve(user_id="demo", query="outdoor activities")
        print(context.context)
    finally:
        await memory.close()

asyncio.run(main())
```

---

## Advanced Configuration (Retrieval Tuning)

Fine-tune retrieval for different use cases:

```python
import asyncio
from arandu import MemoryClient, MemoryConfig
from arandu.providers.openai import OpenAIProvider

async def main():
    provider = OpenAIProvider(api_key="sk-...")

    # Configuration for a chatbot that needs broad, fresh context
    config = MemoryConfig(
        # Extraction: faster single-pass for real-time chat
        extraction_mode="single_pass",
        extraction_model="gpt-4o-mini",

        # Retrieval: more results, favor recency
        topk_facts=40,
        min_similarity=0.15,          # cast a wider net
        recency_half_life_days=7,     # favor recent facts more aggressively

        # Score weights: boost recency for a fast-moving conversation
        score_weights={
            "semantic": 0.50,
            "recency": 0.35,
            "importance": 0.15,
        },

        # Reranker: use fast model
        enable_reranker=True,
        reranker_model="gpt-4o-mini",

        # Context: larger budget for rich responses
        context_max_tokens=3000,
        context_budget_tokens=5000,

        # Spreading activation: wider context expansion
        spreading_activation_hops=3,
        spreading_max_related_entities=8,

        # Timezone for recency calculations
        timezone="America/Sao_Paulo",
    )

    memory = MemoryClient(
        database_url="postgresql+psycopg://memory:memory@localhost/memory",
        llm=provider,
        embeddings=provider,
        config=config,
    )
    await memory.initialize()

    try:
        # Write a series of messages
        await memory.write(user_id="demo", message="I started a new job at TechCorp today!")
        await memory.write(user_id="demo", message="My manager's name is Sarah. She seems great.")
        await memory.write(user_id="demo", message="The office is in downtown with a nice view.")

        # Retrieve with tuned settings
        result = await memory.retrieve(user_id="demo", query="what's new with the user?")
        print(f"Retrieved {len(result.facts)} facts")
        print(f"Context ({len(result.context)} chars):")
        print(result.context)

        # Check individual scores to verify tuning
        for fact in result.facts:
            print(f"\n  [{fact.score:.3f}] {fact.value}")
            print(f"    Scores: {fact.scores}")
    finally:
        await memory.close()

asyncio.run(main())
```

---

## Background Jobs Integration

Set up periodic maintenance to keep memory organized:

```python
import asyncio
from arandu import (
    MemoryConfig,
    cluster_user_facts,
    compute_entity_importance,
    detect_communities,
    refresh_entity_summaries,
    run_consolidation,
    run_memify,
)
from arandu.providers.openai import OpenAIProvider
from arandu.db import create_engine, create_session_factory

async def run_maintenance(
    database_url: str,
    user_ids: list[str],
    provider: OpenAIProvider,
    config: MemoryConfig,
) -> None:
    """Run all background maintenance jobs for a list of users."""
    engine = create_engine(database_url)
    session_factory = create_session_factory(engine)

    try:
        async with session_factory() as session:
            for user_id in user_ids:
                print(f"\n--- Maintenance for {user_id} ---")

                # 1. Importance scoring (cheap, SQL-only)
                importance = await compute_entity_importance(session, user_id, config)
                print(f"  Importance: scored {importance.entities_scored} entities")

                # 2. Summary refresh (moderate, LLM)
                summaries = await refresh_entity_summaries(
                    session, user_id, provider, config
                )
                print(f"  Summaries: refreshed {summaries.refreshed_count}")

                # 3. Clustering (moderate, LLM)
                clusters = await cluster_user_facts(
                    session, user_id, provider, provider, config
                )
                print(f"  Clustering: {clusters.clusters_created} clusters")

                # 4. Community detection
                communities = await detect_communities(
                    session, user_id, provider, provider, config
                )
                print(f"  Communities: {communities.communities_found}")

                # 5. Consolidation (moderate, LLM)
                consolidation = await run_consolidation(session, user_id, provider, config)
                print(f"  Consolidation: {consolidation.observations_created} observations")

                # 6. Memify (moderate, LLM)
                memify = await run_memify(session, user_id, provider, config)
                print(f"  Memify: {memify.facts_memified} facts processed")

            await session.commit()
    finally:
        await engine.dispose()

async def main():
    provider = OpenAIProvider(api_key="sk-...")
    config = MemoryConfig()
    database_url = "postgresql+psycopg://memory:memory@localhost/memory"

    # Run once
    await run_maintenance(database_url, ["user_123", "user_456"], provider, config)

    # Or schedule with asyncio
    # while True:
    #     await run_maintenance(database_url, user_ids, provider, config)
    #     await asyncio.sleep(4 * 3600)  # every 4 hours

asyncio.run(main())
```

---

## Multi-User Setup

Handle multiple users with isolated memory spaces:

```python
import asyncio
from arandu import MemoryClient
from arandu.providers.openai import OpenAIProvider

async def main():
    provider = OpenAIProvider(api_key="sk-...")
    memory = MemoryClient(
        database_url="postgresql+psycopg://memory:memory@localhost/memory",
        llm=provider,
        embeddings=provider,
    )
    await memory.initialize()

    try:
        # Each user has completely isolated memory
        await memory.write(
            user_id="alice",
            message="I work at Google as a PM. I live in Mountain View.",
        )
        await memory.write(
            user_id="bob",
            message="I'm a freelance designer based in Berlin.",
        )

        # Alice's context only shows Alice's facts
        alice_ctx = await memory.retrieve(user_id="alice", query="where do they work?")
        print("Alice:", alice_ctx.context)

        # Bob's context only shows Bob's facts
        bob_ctx = await memory.retrieve(user_id="bob", query="where do they work?")
        print("Bob:", bob_ctx.context)
    finally:
        await memory.close()

asyncio.run(main())
```

---

# API Reference

Auto-generated from source code docstrings. For conceptual guides on how these components work together, see the [Concepts](../concepts/write-pipeline.md) section.

---

## Client

### MemoryClient

### WriteResult

### RetrieveResult

### ScoredFact

---

## Configuration

### MemoryConfig

---

## Protocols

### LLMProvider

### EmbeddingProvider

---

## Providers

### OpenAIProvider

---

## Exceptions

### MemoryError

### ExtractionError

### ResolutionError

### ReconciliationError

### RetrievalError

### UpsertError

---

## Background Functions

### Clustering

#### cluster_user_facts

#### detect_communities

### Consolidation

#### run_consolidation

#### run_profile_consolidation

### Memify

#### run_memify

#### compute_vitality

### Sleep-Time Compute

#### compute_entity_importance

#### refresh_entity_summaries

#### detect_entity_communities

---

## Result Dataclasses

### ClusteringResult

### CommunityDetectionResult

### ConsolidationResult

### MemifyResult

### EntityImportanceResult

### SummaryRefreshResult

---

## See Also: Advanced API

For documentation of internal pipeline functions, sub-module exports, and additional data types not covered here, see the **Advanced** section:

- [Write Pipeline API](../advanced/write-api.md) -- extraction strategy, canonicalization, entity helpers, correction detection, pending operations, and `run_write_pipeline()`.
- [Read Pipeline API](../advanced/read-api.md) -- retrieval agent, query expansion, graph retrieval, spreading activation, context compression, emotional trends, dynamic importance, procedural memory, and `run_read_pipeline()`.
- [Database Utilities](../advanced/database.md) -- `create_engine()`, `create_session_factory()`, `init_db()`, and schema overview.
- [Data Types Reference](../advanced/data-types.md) -- all enums, dataclasses, and result types across write, read, and background modules.

---
