synaptic-core llms.txt

Purpose
- This file is the operational contract for human developers and AI coding agents using `synaptic-core` as a tool/library.
- It is written to remove ambiguity about API shape, runtime behavior, and integration constraints.

Package Identity
- Name: `synaptic-core`
- Version: `0.2.0`
- Python: `>=3.10`
- Primary entrypoint: `synaptic_core.SynapticMemory`
- Public API is async.

Install
- Local source install: `python -m pip install .`
- Optional real embeddings: `python -m pip install ".[real_embedding]"`
- If published to your package index, install by pinned version (example: `python -m pip install synaptic-core==0.2.0`).

Core Runtime Model
- `SynapticMemory` coordinates:
  - storage (`store`)
  - retrieval (`retrieve`)
  - direct associative activation (`activate`)
  - outcome-driven learning (`feedback`)
  - observability/status (`graph_status`, `session_summary`, `weekly_digest`, `stats`)
  - deletion (`delete`)
- Default persistence backend is SQLite (`synaptic.db`).
- You can inject a custom backend implementing the `MemoryBackend` protocol.

Important Version Rule
- Treat `pyproject.toml` and `synaptic_core.__version__` as authoritative package version sources.
- Current expected value: `0.2.0`.

Interop Contract
- Framework adapters (including Axis) live outside `synaptic-core`.
- Canonical Axis interop contract ID: `AXIS-MEMORY-PROVIDER-V1`.
- `synaptic-core` ships no framework-specific entry points.

SynapticMemory Constructor Contract
- Signature (keyword-only):
  - `SynapticMemory(*, db_path="synaptic.db", embedding_fn=None, backend=None, classifier=None, cross_tier_similarity_threshold=0.75, cross_tier_initial_weight=0.25, cross_tier_seed_top_k=25, cross_tier_seed_max_edges=8, top_k_retrieval=10, top_k_activation_seeds=5, telemetry_collector=None, activation_engine=None, outcome_detector=None, connection_learner=None, deployment_id="local", weekly_digest_cache_ttl_seconds=300, graph_confidence_refresh_query_interval=10, graduation_engine=None, graduation_scheduler_enabled=True, graduation_check_interval_queries=10, graduation_check_interval_seconds=86400.0, graduation_min_access_count=5, graduation_min_positive_outcome_ratio=0.6, graduation_min_age_hours=48.0, graduation_min_reinforced_edges=2, graduation_reinforced_edge_weight_threshold=0.45, graduation_edge_transfer_threshold=0.40, instance_optimizer=None, instance_optimizer_enabled=True, instance_optimizer_step_size=0.005, instance_optimizer_min_sample_count=10, instance_optimizer_update_interval=5, session_deserializer=None)`
- `embedding_fn` may be sync or async.
- `embedding_fn` is required for `store` and `retrieve`, even though the constructor default is `None`; calling either without it raises `RuntimeError`.
- `embedding_fn` is not required for `activate`, `feedback`, `graph_status`, `session_summary`, `weekly_digest`, `stats`, or `delete`.
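
Because `embedding_fn` may be sync or async, a caller-side wrapper can treat both forms the same way. The sketch below is an illustration only, not the library's internal code: `resolve_embedding`, `sync_embed`, `async_embed`, and the error messages are hypothetical names chosen for this example, and it assumes embeddings arrive as plain Python sequences (not NumPy arrays). The exception types mirror the ones listed for `store`.

```python
import asyncio
import inspect
from collections.abc import Sequence
from numbers import Real
from typing import Any, Callable, Optional


async def resolve_embedding(
    embedding_fn: Optional[Callable[[str], Any]], text: str
) -> list:
    """Hypothetical helper: call a sync or async embedding_fn and validate the output."""
    if embedding_fn is None:
        raise RuntimeError("embedding_fn is required")
    result = embedding_fn(text)
    # An async function returns an awaitable; a sync function returns the value itself.
    if inspect.isawaitable(result):
        result = await result
    # Strings are sequences too, so reject them explicitly.
    if isinstance(result, (str, bytes)) or not isinstance(result, Sequence):
        raise TypeError("embedding must be a numeric sequence")
    if len(result) == 0:
        raise ValueError("embedding must not be empty")
    if not all(isinstance(x, Real) and not isinstance(x, bool) for x in result):
        raise TypeError("embedding must be a numeric sequence")
    return [float(x) for x in result]


def sync_embed(text: str) -> list:
    # Toy sync embedding (assumption for the demo).
    return [float(len(text)), 1.0]


async def async_embed(text: str) -> list:
    # Toy async embedding (assumption for the demo).
    return [float(len(text)), 2.0]
```

Both `asyncio.run(resolve_embedding(sync_embed, "abc"))` and `asyncio.run(resolve_embedding(async_embed, "abc"))` return a list of floats, so downstream code never needs to know which kind of function was supplied.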

Public API Contract
- `await store(content, memory_type=None, external_id=None, source_type=None, metadata=None, **kwargs) -> Node`
  - Requires non-empty `content`.
  - Computes embedding via `embedding_fn`.
  - If `memory_type` is not provided, the classifier decides the type.
  - Always stores new nodes in `short_term`.
  - Seeds provisional cross-tier edges to long-term candidates.
  - `kwargs` are accepted for compatibility and ignored.
  - Raises:
    - `ValueError("content is required")`
    - `RuntimeError` if no `embedding_fn` is configured
    - `TypeError` if the embedding output is not a numeric sequence
    - `ValueError` if the embedding output is empty

- `await retrieve(query, top_k=None, session_id=None, **kwargs) -> RetrievalResult`
  - Requires non-empty `query`.
  - Uses keyword search + semantic search + reciprocal rank fusion + activation merge.
  - If `session_id is None` and `kwargs["namespace"]` is a string, that namespace value is used as the session id.
  - `top_k` must be `> 0` when provided.
  - Raises:
    - `ValueError("query is required")`
    - `ValueError("top_k must be > 0")`
    - The embedding errors listed under `store` (`RuntimeError`, `TypeError`, and `ValueError` for empty output)

- `await activate(node_ids, initial_energy=1.0, *, top_k=None, session_id=None) -> RetrievalResult`
  - `node_ids` must be non-empty.
  - Directly runs graph activation, bypassing keyword and semantic search.
  - Missing ids are skipped; if none resolve, raises `KeyError`.
  - `top_k` must be `> 0` when provided.
  - Raises:
    - `ValueError("node_ids must not be empty")`
    - `KeyError("none of the provided node_ids were found")`
    - `ValueError("top_k must be > 0")`

- `await feedback(query_id, outcome, *, agent_response=None, user_next_message=None, dwell_time_ms=None, active_nodes=None, corrected_nodes=None, provider="unknown", **kwargs) -> CompositeOutcome`
  - `outcome` must map to `OutcomeSignalType`.
  - If `query_id` exists in retrieval history, that retrieval is used.
  - If it is not found, the call falls back to `active_nodes`; if `active_nodes` is also missing, it raises `KeyError`.
  - `kwargs` are accepted for compatibility and ignored.
  - Raises:
    - `ValueError("unsupported outcome signal: ...")` for invalid outcome
    - `KeyError("query_id not found in retrieval history: ...")` if unknown query and no `active_nodes`

- `await graph_status() -> dict[str, Any]`
  - Returns graph counts, confidence, confidence components, confidence band, maturity percent, and 7-day health aggregates.

- `await session_summary(session_id) -> dict[str, int]`
  - Returns query/hit/miss/association/learning/cross-tier counters.
  - Uses durable SQLite aggregates when available.

- `await weekly_digest(*, now=None, force_refresh=False) -> dict[str, Any]`
  - Returns 7-day summary payload with cache TTL behavior.
  - Uses durable observability events when available.

- `await stats() -> dict[str, float]`
  - Returns `node_count`, `edge_count`, `hit_rate`, `avg_activation_steps`.

- `await delete(node_id) -> None`
  - Removes node and related edges; prunes retrieval history references containing that node.

- `await link(node_a_id, node_b_id, weight=0.3, connection_type="excitatory", formation_trigger="explicit", is_cross_tier=None) -> dict[str, Any]`
  - Creates or updates an edge between two existing nodes.
  - Raises `ValueError` for blank/same ids and `KeyError` when a node does not exist.

- `await neighbors(node_id, limit=None) -> list[dict[str, Any]]`
  - Returns neighbor summaries (`target_node_id`, `weight`, `connection_type`, plus node context fields).
  - Raises `KeyError` when the source node does not exist.

- `await recent_queries(session_id=None, limit=20) -> list[dict[str, Any]]`
  - Returns newest query events first.
  - Uses in-memory query events, with durable SQLite telemetry as a fallback when available.

- `await recent_sessions(limit=20) -> list[dict[str, Any]]`
  - Returns newest sessions first with summary counters.
  - Merges live in-memory session metrics with durable aggregate fallback.

- `await recent_outcomes(session_id=None, limit=20) -> list[dict[str, Any]]`
  - Returns newest outcomes first.
  - Uses in-memory outcome events, with durable SQLite telemetry as a fallback when available.

- `await kv_set(key, value, metadata=None, ttl=None, namespace=None) -> None`
  - Stores a key/value payload under a namespace.
  - Supports optional TTL in seconds.

- `await kv_get(key, namespace=None) -> Any | None`
  - Returns stored value for key/namespace when present and unexpired.

- `await kv_search(query, limit=10, namespace=None, filters=None) -> list[KeyValueItem]`
  - Case-insensitive key search scoped to namespace.
  - Optional metadata equality filters.

- `await kv_delete(key, namespace=None) -> bool`
  - Deletes key in namespace and returns whether a row was removed.

- `await kv_clear(namespace=None) -> int`
  - Deletes all keys in a namespace.
  - Special namespace `"*"` clears all namespaces.

- `await store_session(session) -> session`
  - Stores the serialized session payload under a key with the prefix `session:`.
  - Enforces optimistic version checks and increments `version`.

- `await retrieve_session(session_id) -> session | dict | None`
  - Loads session payload and applies `session_deserializer` when configured.

- `await update_session(session) -> session`
  - Alias for `store_session`.
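
The `feedback` lookup rule described above (retrieval history first, then `active_nodes`, otherwise `KeyError`) can be pictured with a small standalone function. This is a sketch of the documented rule, not the library's code: `resolve_feedback_nodes` and the `history` mapping are hypothetical, and treating an empty `active_nodes` list the same as a missing one is an assumption.

```python
from typing import Mapping, Optional, Sequence


def resolve_feedback_nodes(
    history: Mapping[str, Sequence[str]],
    query_id: str,
    active_nodes: Optional[Sequence[str]] = None,
) -> list:
    """Hypothetical helper: pick the node ids that a feedback call would apply to."""
    # 1. A known query_id always wins: use the nodes recorded for that retrieval.
    if query_id in history:
        return list(history[query_id])
    # 2. Unknown query_id: fall back to caller-supplied active_nodes.
    #    Assumption: an empty list counts as "not provided".
    if active_nodes:
        return list(active_nodes)
    # 3. Nothing to attach the outcome to.
    raise KeyError(f"query_id not found in retrieval history: {query_id}")
```

In practice this means an agent that may outlive the retrieval history (for example after a restart) should pass `active_nodes` on every `feedback` call.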

Enums and Constants You Should Use
- `OutcomeSignalType` values:
  - `explicit_positive`, `explicit_negative`, `memory_in_response`, `contradiction`, `re_query_miss`, `session_continuation`, `dwell_time`, `productive_continuation`, `correction`, `clarification_needed`
- `MemoryTier`: `short_term`, `long_term`
- `SourceType`: `document`, `url`, `file`, `manual`
- `ConnectionType`: `excitatory`, `inhibitory`, `cross_tier`

Returned Model Shapes (Key Fields)
- `Node`: `id`, `content`, `embedding`, `memory_type`, `memory_tier`, `created_at`, `last_accessed_at`, `access_count`, `activation_energy`, `external_id`, `source_type`, graduation/consolidation metadata.
- `RetrievalResult`: `query_id`, `nodes`, `activation_path`, `seed_node_ids`, `seed_nodes_contributed`, `seed_nodes_dead_ends`, `retrieval_latency_ms`, `activation_latency_ms`, `graph_confidence`, `activation_weight_applied`, `activation_sourced_node_ids`, `fusion_only_node_ids`, `short_term_node_ids`, `long_term_node_ids`, `cross_tier_traversals`.
- `CompositeOutcome`: `query_id`, `signals`, `composite_score`, `composite_confidence`, `learning_gated`, `oscillation_detected`, `oscillating_edge_ids`.
- `KeyValueItem`: `key`, `value`, `metadata`, `score`, `namespace`, `created_at`, `expires_at`.

Deterministic Ranking and Merge Rules
- Reciprocal Rank Fusion (`k=60` default):
  - deterministic tie-break by `node_id` (ascending).
  - duplicate ids within each source are deduplicated by first occurrence.
- Confidence merge bands:
  - `confidence <= 0.0`: fusion only.
  - `0.0 < confidence < 0.3`: fusion-ordered nodes first, then activation-only nodes.
  - `0.3 <= confidence < 0.7`: linear blend of normalized fusion and activation scores.
  - `confidence >= 0.7`: activation-led ranking, then fusion-only nodes.
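
The ranking rules above can be sketched in a few lines. This is a minimal illustration, assuming the standard reciprocal rank fusion score `1 / (k + rank)` with 1-based ranks computed after deduplication; the function names and the band labels (`fusion_only`, `fusion_first`, `linear_blend`, `activation_led`) are hypothetical and are not identifiers from `synaptic-core`.

```python
from collections import defaultdict
from typing import Iterable, Sequence


def reciprocal_rank_fusion(rankings: Iterable[Sequence[str]], k: int = 60) -> list:
    """Fuse several ranked id lists into (node_id, score) pairs, best first."""
    scores = defaultdict(float)
    for ranking in rankings:
        seen = set()
        rank = 0
        for node_id in ranking:
            if node_id in seen:
                continue  # keep only the first occurrence within this source
            seen.add(node_id)
            rank += 1  # 1-based rank within the deduplicated source
            scores[node_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by node_id ascending for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


def merge_band(confidence: float) -> str:
    """Map a graph confidence value to one of the four merge bands."""
    if confidence <= 0.0:
        return "fusion_only"
    if confidence < 0.3:
        return "fusion_first"
    if confidence < 0.7:
        return "linear_blend"
    return "activation_led"
```

For example, fusing `["b", "a", "b", "c"]` with `["a", "b"]` gives `a` and `b` identical scores, so the `node_id` tie-break places `a` first, followed by `b` and then `c`.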

Session Semantics
- Internal default session id is `"default"` when absent/blank.
- Retrieval seeds session priming from long-term nodes in the last 5 retrievals for that session.
- Use stable `session_id` for coherent analytics and priming behavior.
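
The session id rules (the `"default"` fallback here and the `namespace` fallback described for `retrieve`) can be combined into one small resolver. This is a sketch under assumptions: `resolve_session_id` is a hypothetical name, and treating a whitespace-only id as blank is a guess about what "blank" means.

```python
from typing import Any, Mapping, Optional

DEFAULT_SESSION_ID = "default"


def resolve_session_id(session_id: Optional[str], extra_kwargs: Mapping[str, Any]) -> str:
    """Hypothetical helper: pick the effective session id for a retrieval."""
    # The namespace fallback applies only when session_id is None and namespace is a string.
    if session_id is None:
        namespace = extra_kwargs.get("namespace")
        if isinstance(namespace, str):
            session_id = namespace
    # Absent or blank ids collapse to the internal default.
    # Assumption: whitespace-only strings count as blank.
    if session_id is None or not session_id.strip():
        return DEFAULT_SESSION_ID
    return session_id
```

Passing an explicit, stable `session_id` avoids relying on either fallback and keeps per-session analytics separate.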

Telemetry Contract (Best-Effort, Non-Blocking)
- `SynapticMemory` uses safe telemetry calls. Missing methods or telemetry exceptions are swallowed.
- Expected telemetry hooks if provided:
  - `record_query`
  - `record_retrieval`
  - `record_activation`
  - `record_outcome`
  - `record_session`
  - `record_node_metadata`
  - `record_connection` (via learner/orchestrator)
  - `record_graduation` (via graduation engine)
- The `TelemetryCollector` implementation writes asynchronously to local SQLite and is local-only by design.
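
The "safe call" behavior described above (missing hooks and telemetry exceptions are swallowed) follows a common pattern, sketched here. `safe_telemetry_call` and `DemoCollector` are hypothetical names for this example; whether the real hooks are sync or async is not stated in this file, so the sketch accepts both.

```python
import asyncio
import inspect
from typing import Any


async def safe_telemetry_call(collector: Any, hook: str, *args: Any, **kwargs: Any) -> None:
    """Hypothetical helper: invoke a telemetry hook without ever raising."""
    method = getattr(collector, hook, None)
    if not callable(method):
        return  # collector is None or does not implement this hook
    try:
        result = method(*args, **kwargs)
        if inspect.isawaitable(result):
            await result  # support async hooks as well as sync ones
    except Exception:
        pass  # best-effort: telemetry must never break the hot path


class DemoCollector:
    """Stand-in collector for this demo only; not the library's TelemetryCollector."""

    def __init__(self) -> None:
        self.events = []

    def record_query(self, query_id: str) -> None:
        self.events.append(query_id)

    async def record_retrieval(self, query_id: str) -> None:
        self.events.append("retrieval:" + query_id)

    def record_outcome(self, query_id: str) -> None:
        raise RuntimeError("telemetry sink unavailable")
```

A custom collector therefore only needs to implement the hooks it cares about; the trade-off is that telemetry failures are silent, so a collector should do its own logging if delivery matters.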

Graduation and Optimizer Behavior
- Graduation scheduler defaults:
  - enabled
  - due every 10 query-counted retrievals (`graduation_check_interval_queries`) or every 86400 seconds (`graduation_check_interval_seconds`)
- Graduation and optimizer failures are isolated from hot paths (exceptions are swallowed in scheduler/optimization loops).
- Instance optimizer defaults:
  - enabled
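  - step size `0.005` (`instance_optimizer_step_size`)
  - min sample count `10` (`instance_optimizer_min_sample_count`)
  - update interval: every `5` feedback calls (`instance_optimizer_update_interval`)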
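
The graduation scheduler's "10 retrievals or 86400 seconds" rule can be read as a simple due check. The sketch below assumes that either threshold on its own makes a check due (whichever is reached first); that reading, and the name `graduation_check_due`, are assumptions rather than confirmed library behavior.

```python
def graduation_check_due(
    queries_since_last: int,
    seconds_since_last: float,
    interval_queries: int = 10,
    interval_seconds: float = 86400.0,
) -> bool:
    """Hypothetical helper: True when a graduation check should run."""
    # Assumption: reaching either interval is enough to trigger a check.
    return (
        queries_since_last >= interval_queries
        or seconds_since_last >= interval_seconds
    )
```

Under this reading, a busy deployment is checked roughly every 10 retrievals, while an idle one is still checked about once a day.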

Backend Contract for Custom Integrations
- Required protocol methods:
  - `init_schema`
  - `create_node`, `get_node`, `update_node`, `delete_node`
  - `create_edge`, `get_edge`, `update_edge`, `delete_edge`
  - `keyword_search(query, top_k=...)`
  - `semantic_search(embedding, top_k=...)`
- Optional methods used when present for better fidelity/perf:
  - `semantic_search_long_term`
  - `get_neighbors`
  - `get_edge_between` / `get_edge_by_nodes` / `find_edge_between`
  - `has_edge_between`
  - `list_edges_for_node` / `get_edges_for_node` / `edges_for_node`
  - `update_edges_bulk`
  - `list_short_term_nodes`
  - `graph_snapshot`
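
Several optional methods above are listed under alternative names (for example `get_edge_between` / `get_edge_by_nodes` / `find_edge_between`). One way to picture "used when present" is a first-match lookup over those names, sketched below. `first_available`, `DemoBackend`, the left-to-right priority order, and the demo method's signature are all assumptions for illustration.

```python
from typing import Any, Callable, Optional, Sequence

# Alias groups copied from the optional-method list above.
EDGE_LOOKUP_ALIASES = ("get_edge_between", "get_edge_by_nodes", "find_edge_between")
EDGE_LISTING_ALIASES = ("list_edges_for_node", "get_edges_for_node", "edges_for_node")


def first_available(backend: Any, names: Sequence[str]) -> Optional[Callable[..., Any]]:
    """Hypothetical helper: return the first callable attribute found, else None."""
    for name in names:
        method = getattr(backend, name, None)
        if callable(method):
            return method
    return None


class DemoBackend:
    """Stand-in backend for this demo only; the signature below is hypothetical."""

    def get_edge_by_nodes(self, node_a_id: str, node_b_id: str) -> dict:
        return {"source": node_a_id, "target": node_b_id}
```

With this backend, `first_available(DemoBackend(), EDGE_LOOKUP_ALIASES)` resolves to `get_edge_by_nodes`, while the edge-listing lookup returns `None` and a caller would fall back to the required protocol methods.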

Known Compatibility Notes
- Method shapes are intentionally compatibility-friendly:
  - `store(content, ..., **kwargs)`
  - `retrieve(query, ..., **kwargs)`
  - `feedback(query_id, outcome, ..., **kwargs)`
- Current behavior of extra kwargs:
  - `store`: ignored
  - `retrieve`: only `namespace` is used as fallback session id
  - `feedback`: ignored
- If your outcomes come from an external taxonomy, map them to `OutcomeSignalType` values before calling `feedback`.
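
A mapping from an external taxonomy can be as simple as a lookup table that fails loudly on unknown labels. In the sketch below, the external labels (`thumbs_up`, `thumbs_down`, and so on) and the name `to_outcome_signal` are hypothetical; the target strings are the `OutcomeSignalType` values listed earlier in this file. The helper returns plain strings, so convert them to the enum if your call site requires it.

```python
# OutcomeSignalType value names, copied from the enum list in this file.
VALID_SIGNALS = frozenset({
    "explicit_positive", "explicit_negative", "memory_in_response",
    "contradiction", "re_query_miss", "session_continuation", "dwell_time",
    "productive_continuation", "correction", "clarification_needed",
})

# Hypothetical external labels mapped onto documented signal values.
EXTERNAL_TO_SIGNAL = {
    "thumbs_up": "explicit_positive",
    "thumbs_down": "explicit_negative",
    "user_corrected": "correction",
    "asked_to_clarify": "clarification_needed",
}


def to_outcome_signal(external_label: str) -> str:
    """Hypothetical helper: translate an external label or fail with ValueError."""
    try:
        return EXTERNAL_TO_SIGNAL[external_label]
    except KeyError:
        raise ValueError(f"no mapping for external outcome: {external_label!r}") from None
```

Failing at mapping time keeps unsupported labels from reaching `feedback`, where they would raise `ValueError("unsupported outcome signal: ...")` instead.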

AI Agent Integration Playbook (Recommended)
1. Initialize one long-lived `SynapticMemory` instance per deployment/runtime.
2. Provide a deterministic embedding function with stable dimensionality.
3. `store` durable memories once (include `external_id` and `source_type` when possible).
4. For each user turn:
   - `retrieve(query, session_id=stable_session, top_k=...)`
   - use returned nodes
   - `feedback(query_id=result.query_id, outcome=<OutcomeSignalType>, active_nodes=[...], provider=<stable-provider-id>)`
5. Call `graph_status` and `weekly_digest` for health monitoring; use `session_summary` for per-session analytics.

Hard Requirements for Agent Authors
- Do not call `store` or `retrieve` without `embedding_fn`.
- Do not pass `top_k <= 0`.
- Always `await` all public methods.
- Keep embedding dimensionality consistent across stored and queried vectors.
- Use stable `session_id` and `provider` values.
- For unknown `query_id` feedback, include `active_nodes` or expect `KeyError`.

Minimal Example
```python
import asyncio
from synaptic_core import SynapticMemory
from synaptic_core.types import OutcomeSignalType, SourceType

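# Toy deterministic embedding with a fixed dimensionality of 3 (illustration only).
def embed(text: str) -> list[float]:
    t = text.lower().strip()
    # Dimensions: word count, scaled character length, scaled character checksum.
    return [float(len(t.split())), float(len(t)) / 100.0, float(sum(map(ord, t)) % 97) / 100.0]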

async def main() -> None:
    memory = SynapticMemory(db_path="synaptic.db", embedding_fn=embed, deployment_id="prod-us-east-1")
    node = await memory.store("Customer asked for CSV monthly usage export.", external_id="doc-42", source_type=SourceType.DOCUMENT)
    result = await memory.retrieve("How do I export monthly usage as CSV?", session_id="chat-42", top_k=5)
    await memory.feedback(
        result.query_id,
        outcome=OutcomeSignalType.EXPLICIT_POSITIVE,
        active_nodes=[n.id for n in result.nodes],
        provider="assistant-api",
    )
    status = await memory.graph_status()
    print(node.id, status["graph_confidence"])

asyncio.run(main())
```
