Memory
InitRunner's memory system gives agents three capabilities: short-term session persistence for resuming conversations, long-term typed memory (semantic, episodic, and procedural), and automatic consolidation that extracts durable facts from episodic records.
- Semantic memory — facts and knowledge (e.g. "the user prefers dark mode")
- Episodic memory — what happened during tasks (e.g. "deployed v2.1 to staging, rollback needed")
- Procedural memory — learned policies and patterns (e.g. "always run tests before deploying")
All memory types are backed by a single database per agent using a configurable store backend (default: lancedb for vector similarity search). The store is dimension-agnostic — embedding dimensions are auto-detected on first use.
Quick Start
apiVersion: initrunner/v1
kind: Agent
metadata:
  name: assistant
  description: Agent with rich memory
spec:
  role: |
    You are a helpful assistant with long-term memory.
    Use the remember() tool to save important facts.
    Use the recall() tool to search your memories before answering.
    Use the learn_procedure() tool to record useful patterns.
  model:
    provider: openai
    name: gpt-4o-mini
  memory:
    max_sessions: 10
    max_resume_messages: 20
    semantic:
      max_memories: 1000
    episodic:
      max_episodes: 500
    procedural:
      max_procedures: 100
    consolidation:
      enabled: true
      interval: after_session
Minimal config (enables semantic memory with a single nested key):
memory:
  semantic:
    max_memories: 1000
# Interactive session (auto-saves history)
initrunner run role.yaml -i
# Resume where you left off
initrunner run role.yaml -i --resume
# Manage memory
initrunner memory list role.yaml
initrunner memory list role.yaml --type episodic
initrunner memory clear role.yaml
initrunner memory consolidate role.yaml
initrunner memory export role.yaml -o memories.json
initrunner memory import role.yaml memories.json
Memory in Ephemeral Mode
In initrunner run (no YAML), memory is on by default. No config file needed.
# Memory on (default)
initrunner run
# Resume previous session
initrunner run --resume
# Disable memory
initrunner run --no-memory
Ephemeral mode creates a lightweight memory store with semantic memory enabled. Use --resume to load the most recent session and pick up where you left off. Use --no-memory to start fresh every time.
Memory Types
Semantic
Facts and knowledge extracted from conversations or explicitly saved by the agent. This is the default memory type and the one used by the remember() tool.
Semantic memories are retrieved via recall() and are also the output of the consolidation process (extracting durable facts from episodic records).
Episodic
Records of what happened during agent tasks — outcomes, decisions, errors, and events. Episodic memories are created in three ways:
- The agent calls record_episode() explicitly.
- Autonomous runs auto-capture an episode when finish_task is called (see Episodic Auto-Capture).
- Daemon trigger executions auto-capture an episode after each run.
Episodic memories serve as raw material for consolidation: the consolidation process reads unconsolidated episodes, extracts semantic facts via an LLM, and marks them as consolidated.
Procedural
Learned policies, patterns, and best practices. Procedural memories are created via the learn_procedure() tool and are automatically injected into the system prompt on every agent run (see Procedural Memory Injection).
Use procedural memory for instructions the agent should always follow, like "always confirm before deleting files" or "use snake_case for Python variables".
Configuration
Memory is configured in the spec.memory section:
spec:
  memory:
    max_sessions: 10 # default: 10
    max_resume_messages: 20 # default: 20
    store_backend: lancedb # default: "lancedb"
    store_path: null # default: ~/.initrunner/memory/<agent-name>.lance
    embeddings:
      provider: "" # default: "" (derives from spec.model.provider)
      model: "" # default: "" (uses provider default)
      base_url: "" # default: "" (custom endpoint URL)
      api_key_env: "" # default: "" (env var holding API key)
    episodic:
      enabled: true # default: true
      max_episodes: 500 # default: 500
    semantic:
      enabled: true # default: true
      max_memories: 1000 # default: 1000
    procedural:
      enabled: true # default: true
      max_procedures: 100 # default: 100
    consolidation:
      enabled: true # default: true
      interval: after_session # default: "after_session"
      max_episodes_per_run: 20 # default: 20
      model_override: null # default: null (uses agent's model)
Top-Level Options
| Field | Type | Default | Description |
|---|---|---|---|
| max_sessions | int | 10 | Maximum number of sessions to keep. Oldest sessions are pruned on REPL exit. |
| max_resume_messages | int | 20 | Maximum number of messages loaded when using --resume. |
| store_backend | str | "lancedb" | Memory store backend. |
| store_path | str \| null | null | Custom path for the memory database. Default: ~/.initrunner/memory/<agent-name>.lance. |
Embedding Options
| Field | Type | Default | Description |
|---|---|---|---|
| embeddings.provider | str | "" | Embedding provider. Empty string derives from spec.model.provider. |
| embeddings.model | str | "" | Embedding model name. Empty string uses the provider default. |
| embeddings.base_url | str | "" | Custom endpoint URL. Triggers OpenAI-compatible mode. |
| embeddings.api_key_env | str | "" | Env var name holding the API key for custom endpoints. Empty uses provider default. |
Episodic Options
| Field | Type | Default | Description |
|---|---|---|---|
| episodic.enabled | bool | true | Enable episodic memory type and the record_episode() tool. |
| episodic.max_episodes | int | 500 | Maximum episodic memories to keep. Oldest are pruned when new ones are added. |
Semantic Options
| Field | Type | Default | Description |
|---|---|---|---|
| semantic.enabled | bool | true | Enable semantic memory type and the remember() tool. |
| semantic.max_memories | int | 1000 | Maximum semantic memories to keep. Oldest are pruned when new ones are added. |
Procedural Options
| Field | Type | Default | Description |
|---|---|---|---|
| procedural.enabled | bool | true | Enable procedural memory type and the learn_procedure() tool. |
| procedural.max_procedures | int | 100 | Maximum procedural memories to keep. Oldest are pruned when new ones are added. |
Consolidation Options
| Field | Type | Default | Description |
|---|---|---|---|
| consolidation.enabled | bool | true | Enable automatic consolidation of episodic memories into semantic facts. |
| consolidation.interval | str | "after_session" | When to run consolidation: after_session (on REPL exit), after_autonomous (on autonomous loop exit), or manual (CLI only). |
| consolidation.max_episodes_per_run | int | 20 | Maximum unconsolidated episodes to process per consolidation run. |
| consolidation.model_override | str \| null | null | Model to use for consolidation LLM calls. Defaults to the agent's model. |
Short-Term: Session Persistence
Session persistence saves REPL conversation history to LanceDB after each turn, enabling the --resume flag.
How It Works
- During an interactive REPL session, the full PydanticAI message history is saved after every turn.
- Each session gets a unique ID (random 12-character hex).
- When --resume is used, the most recent session for the agent is loaded.
- Only the last max_resume_messages messages are loaded to stay within context window limits.
- If the loaded history starts with a ModelResponse (which is invalid), leading ModelResponse messages are skipped until a ModelRequest is found.
Active Session History Limit
During an active REPL or dashboard session, message history is trimmed to max_resume_messages * 2 (default: 40 messages) after each turn. This prevents unbounded growth during long conversations. The trimming:
- Keeps the most recent messages (sliding window).
- Ensures the history starts with a ModelRequest (never a ModelResponse).
- Applies in both the CLI REPL (initrunner run -i) and the dashboard chat.
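The resume limit and the active-session limit both amount to a sliding window followed by a check on the first message. The sketch below illustrates that idea only; `ModelRequest` and `ModelResponse` here are hypothetical stand-in dataclasses rather than the PydanticAI classes, and `trim_history` is an assumed helper name, not InitRunner's actual function.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for the PydanticAI message classes.
@dataclass
class ModelRequest:
    text: str


@dataclass
class ModelResponse:
    text: str


def trim_history(messages: list, limit: int) -> list:
    """Keep the newest `limit` messages, then skip leading responses."""
    window = messages[-limit:] if limit > 0 else []
    start = 0
    # A history must not begin with a response, so advance to the first request.
    while start < len(window) and isinstance(window[start], ModelResponse):
        start += 1
    return window[start:]


max_resume_messages = 20
history = []
for i in range(30):
    history.append(ModelRequest(f"q{i}"))
    history.append(ModelResponse(f"a{i}"))

resumed = trim_history(history, max_resume_messages)     # as with --resume
active = trim_history(history, max_resume_messages * 2)  # during a live session
odd = trim_history(history, 3)  # window would otherwise start with a response
```

With an odd window size the first kept message is a response, so the result is one message shorter than the limit.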
System Prompt Filtering
When saving sessions, all SystemPromptPart entries are stripped from ModelRequest messages. This ensures that:
- Stale system prompts from a previous role.yaml version don't persist.
- The current spec.role is always used when resuming.
- Session data is more compact.
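As a rough illustration of the filtering step, the sketch below removes system prompt parts from request messages before they would be serialized. The three dataclasses are simplified, hypothetical stand-ins for the PydanticAI types, and `strip_system_prompts` is an assumed name.

```python
from dataclasses import dataclass, field, replace


# Simplified, hypothetical stand-ins for the PydanticAI types.
@dataclass
class SystemPromptPart:
    content: str


@dataclass
class UserPromptPart:
    content: str


@dataclass
class ModelRequest:
    parts: list = field(default_factory=list)


def strip_system_prompts(messages: list) -> list:
    """Return a copy of the history with SystemPromptPart entries removed."""
    cleaned = []
    for msg in messages:
        if isinstance(msg, ModelRequest):
            kept = [p for p in msg.parts if not isinstance(p, SystemPromptPart)]
            msg = replace(msg, parts=kept)  # leave the original message untouched
        cleaned.append(msg)
    return cleaned


original = [ModelRequest(parts=[SystemPromptPart("old role"), UserPromptPart("hi")])]
saved = strip_system_prompts(original)
```

Copying rather than mutating keeps the in-memory history intact for the running session while the stored version stays free of stale prompts.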
Session Pruning
Old sessions beyond max_sessions are deleted (oldest first). Pruning runs automatically:
- REPL mode: on session exit.
- Daemon mode: after each trigger execution (when memory is configured).
This keeps the memory database from growing indefinitely.
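The pruning rule (keep the newest max_sessions, delete the rest) can be sketched as follows. The input shape, a mapping from session ID to an ISO 8601 timestamp, and the function name are assumptions made for illustration.

```python
from datetime import datetime


def sessions_to_prune(sessions: dict[str, str], max_sessions: int) -> list[str]:
    """Return the IDs of sessions that fall outside the newest `max_sessions`.

    `sessions` maps a session ID to its ISO 8601 timestamp (assumed shape).
    """
    newest_first = sorted(
        sessions,
        key=lambda sid: datetime.fromisoformat(sessions[sid]),
        reverse=True,
    )
    return newest_first[max_sessions:]


stored = {
    "aaa111": "2025-06-01T10:00:00+00:00",
    "bbb222": "2025-06-02T10:00:00+00:00",
    "ccc333": "2025-06-03T10:00:00+00:00",
}
stale = sessions_to_prune(stored, max_sessions=2)
```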
Never-Raises Guarantee
Session saving follows a never-raises pattern: if writing to the database fails, the error is printed to stderr but the agent continues running. This prevents database issues from crashing interactive sessions.
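A never-raises wrapper can be as small as the sketch below: it catches any exception from the save call, reports it on stderr, and lets the caller continue. The function name and the boolean return value are assumptions, not the project's API.

```python
import sys
from typing import Callable


def save_session_safely(save: Callable[..., None], *args) -> bool:
    """Call `save(*args)`; report failures on stderr instead of raising."""
    try:
        save(*args)
        return True
    except Exception as exc:  # deliberately broad: saving must never crash the REPL
        print(f"Warning: could not save session: {exc}", file=sys.stderr)
        return False


def broken_save(session_id: str) -> None:
    raise OSError("disk full")


ok = save_session_safely(lambda session_id: None, "abc123")
failed = save_session_safely(broken_save, "abc123")
```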
Long-Term: Memory Tools
When spec.memory is configured, up to five tools are auto-registered depending on which memory types are enabled.
remember(content: str, category: str = "general") -> str
Stores a piece of information as a semantic memory with an embedding for later retrieval. Only registered when semantic.enabled is true.
- The category is sanitized: lowercased, non-alphanumeric characters replaced with underscores.
- An embedding is generated from the content using the configured embedding model.
- After storing, memories are pruned to semantic.max_memories (oldest removed).
- Returns a confirmation string with the memory ID and category.
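The category sanitization described above might look like the following sketch. It assumes "alphanumeric" means ASCII letters and digits and that each offending character is replaced one-for-one; the real implementation may differ in both respects, and `sanitize_category` is an assumed name.

```python
import re


def sanitize_category(category: str) -> str:
    """Lowercase the category and replace non-alphanumeric characters with '_'.

    Assumption: only ASCII a-z and 0-9 count as alphanumeric.
    """
    return re.sub(r"[^a-z0-9]", "_", category.lower())


examples = {name: sanitize_category(name) for name in ("Code Review", "User-Prefs", "general")}
```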
recall(query: str, top_k: int = 5, memory_types: list[str] | None = None) -> str
Searches all memory types by semantic similarity. Always registered when spec.memory is configured.
- Generates an embedding from the query.
- Finds the top_k most similar memories using vector search.
- Pass memory_types to filter by type (e.g. ["semantic", "procedural"]).
- Returns results formatted as:
[Type: semantic | Category: preferences | Score: 0.912 | 2025-06-01T10:30:00+00:00]
The user prefers dark mode and vim keybindings.
---
[Type: episodic | Category: autonomous_run | Score: 0.845 | 2025-06-01T09:15:00+00:00]
Deployed v2.1 to staging. Tests passed but rollback was needed due to memory leak.
The score is 1 - distance (higher is more similar).
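To make the score concrete, the sketch below converts a vector distance into a score and renders one result in the layout shown above. The function name and argument list are illustrative assumptions; only the `1 - distance` relation and the output layout come from this page.

```python
def format_result(memory_type: str, category: str, distance: float,
                  created_at: str, content: str) -> str:
    """Render one recall() hit; the score is 1 - distance."""
    score = 1 - distance
    header = (
        f"[Type: {memory_type} | Category: {category} "
        f"| Score: {score:.3f} | {created_at}]"
    )
    return f"{header}\n{content}"


hit = format_result(
    "semantic", "preferences", 0.088,
    "2025-06-01T10:30:00+00:00",
    "The user prefers dark mode and vim keybindings.",
)
```

A distance of 0.088 therefore appears as a score of 0.912.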
list_memories(category: str | None = None, limit: int = 20, memory_type: str | None = None) -> str
Lists recent memories, optionally filtered by category or type. Always registered when spec.memory is configured. Returns entries formatted as:
[semantic:preferences] (2025-06-01T10:30:00+00:00) The user prefers dark mode.
[episodic:autonomous_run] (2025-06-01T09:15:00+00:00) Deployed v2.1 to staging.
learn_procedure(content: str, category: str = "general") -> str
Stores a learned procedure, policy, or pattern as a procedural memory. Only registered when procedural.enabled is true.
- The category is sanitized the same way as remember().
- After storing, memories are pruned to procedural.max_procedures (oldest removed).
- Procedural memories are auto-injected into the system prompt on future runs (see Procedural Memory Injection).
record_episode(content: str, category: str = "general") -> str
Records an episode — what happened during a task or interaction. Only registered when episodic.enabled is true.
- The category is sanitized the same way as remember().
- After storing, memories are pruned to episodic.max_episodes (oldest removed).
- Use this to capture outcomes, decisions made, errors encountered, or other events.
Episodic Auto-Capture
In autonomous and daemon modes, episodic memories are captured automatically — the agent does not need to call record_episode() explicitly.
Autonomous Mode
When finish_task is called with a summary, the summary is persisted as an episodic memory with category autonomous_run. This happens after each autonomous loop iteration that produces a result.
Daemon Mode
After each trigger execution, the run result summary is captured as an episodic memory. The metadata includes the trigger type (e.g. cron, file_watch, webhook).
Interactive Mode
Interactive REPL sessions do not auto-capture episodic memories. Use the record_episode() tool explicitly if needed.
Never-Raises Guarantee
Episodic auto-capture follows a never-raises pattern: if embedding or storage fails, a warning is logged but the agent run is not affected.
Consolidation
Consolidation is the process of extracting durable semantic facts from episodic memories using an LLM. It reads unconsolidated episodes, sends them to the model with a structured prompt, parses CATEGORY: content lines from the output, and stores each extracted fact as a new semantic memory.
When It Runs
| consolidation.interval | Trigger |
|---|---|
| after_session | On interactive REPL exit |
| after_autonomous | On autonomous loop exit |
| manual | Only via initrunner memory consolidate CLI |
Consolidation can always be triggered manually via the CLI regardless of the interval setting.
How It Works
- Fetch up to max_episodes_per_run unconsolidated episodic memories (oldest first).
- Format them into a prompt and send to the consolidation model.
- Parse CATEGORY: content lines from the LLM output.
- Store each extracted fact as a semantic memory with metadata: {"source": "consolidation"}.
- Mark the processed episodes as consolidated (sets consolidated_at timestamp).
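The parsing step can be pictured with the sketch below, which splits each output line at the first colon and keeps only lines that have both a category and content. How the real parser treats malformed lines is not documented here, so the skipping behavior and the name `parse_facts` are assumptions.

```python
def parse_facts(llm_output: str) -> list[tuple[str, str]]:
    """Extract (category, content) pairs from 'CATEGORY: content' lines."""
    facts = []
    for line in llm_output.splitlines():
        category, sep, content = line.partition(":")
        category, content = category.strip(), content.strip()
        # Assumption: lines without a colon, a category, or content are ignored.
        if sep and category and content:
            facts.append((category, content))
    return facts


sample_output = """Here are the durable facts:
deployment: Staging deploys of v2.1 needed a rollback due to a memory leak
preferences: The user prefers dark mode

not a fact line"""
facts = parse_facts(sample_output)
```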
Failure Semantics
Consolidation follows a never-raises pattern. If the LLM call or storage fails, a warning is logged and 0 is returned. Episodes are only marked as consolidated after all semantic memories are successfully stored.
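The ordering guarantee above (store everything first, mark episodes last, never raise) is sketched below. The three callables are hypothetical hooks standing in for the LLM call and the store operations, and returning the number of stored facts on success is an assumption inferred from the "0 is returned" failure case.

```python
import logging
from typing import Callable

logger = logging.getLogger("consolidation_sketch")


def consolidate(
    episodes: list,
    extract: Callable[[list], list],             # hypothetical: LLM call + parsing
    store_fact: Callable[[str, str], None],      # hypothetical: write a semantic memory
    mark_consolidated: Callable[[list], None],   # hypothetical: set consolidated_at
) -> int:
    """Store all extracted facts, then mark episodes; return 0 on any failure."""
    try:
        facts = extract(episodes)
        for category, content in facts:
            store_fact(category, content)
        # Reached only if every fact was stored without error.
        mark_consolidated(episodes)
        return len(facts)
    except Exception as exc:
        logger.warning("Consolidation failed: %s", exc)
        return 0


marked: list = []
stored: list = []


def failing_store(category: str, content: str) -> None:
    raise RuntimeError("store unavailable")


good = consolidate(["ep1"], lambda eps: [("deploy", "fact")],
                   lambda c, t: stored.append((c, t)), marked.extend)
bad = consolidate(["ep2"], lambda eps: [("deploy", "fact")],
                  failing_store, marked.extend)
```

Because marking happens last, a failed run leaves its episodes unconsolidated and eligible for the next run.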
Procedural Memory Injection
When procedural.enabled is true, procedural memories are automatically loaded into the system prompt on every agent run. Up to 20 of the most recent procedural memories are injected as a ## Learned Procedures and Policies section:
## Learned Procedures and Policies
- [deployment] Always run tests before deploying to production
- [code_review] Check for SQL injection in any database queries
- [communication] Summarize changes in bullet points for the user
This injection happens transparently: the agent sees these as part of its system prompt and follows them as standing instructions.
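Building the injected section could look roughly like this sketch. The tuple layout `(created_at, category, content)`, the newest-first ordering within the section, and the function name are assumptions; only the heading, the bullet format, and the limit of 20 come from this page.

```python
def build_procedures_section(procedures: list[tuple[str, str, str]],
                             limit: int = 20) -> str:
    """Render up to `limit` of the most recent procedures as a prompt section.

    Each item is (created_at, category, content); timestamps are assumed to be
    ISO 8601 strings with the same UTC offset, so they sort lexicographically.
    """
    if not procedures:
        return ""
    recent = sorted(procedures, key=lambda p: p[0], reverse=True)[:limit]
    lines = ["## Learned Procedures and Policies"]
    lines += [f"- [{category}] {content}" for _, category, content in recent]
    return "\n".join(lines)


many = [
    (f"2025-06-{day:02d}T10:00:00+00:00", "deployment", f"rule {day}")
    for day in range(1, 26)
]
section = build_procedures_section(many)
```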
Database Schema
The memory store contains three LanceDB tables:
_meta
Key-value metadata (dimensions, chunk ID counters):
| Column | Type | Description |
|---|---|---|
| key | string | Metadata key (e.g. "dimensions", "embedding_model") |
| value | string | Metadata value (e.g. "1536", "openai:text-embedding-3-small") |
_sessions
| Column | Type | Description |
|---|---|---|
| id | int64 | Row ID |
| session_id | string | Unique session identifier |
| agent_name | string | Agent name from metadata.name |
| timestamp | string | ISO 8601 timestamp |
| messages_json | large_string | JSON-serialized PydanticAI message history |
_memories
| Column | Type | Description |
|---|---|---|
| id | int64 | Memory ID |
| content | large_string | Memory content |
| category | string | Category label (default: "general") |
| created_at | string | ISO 8601 creation timestamp |
| memory_type | string | One of episodic, semantic, procedural. Default: semantic. |
| metadata_json | string | Optional JSON metadata (e.g. {"trigger_type": "cron"}, {"source": "consolidation"}) |
| consolidated_at | string | ISO 8601 timestamp when the episode was consolidated. Empty for unconsolidated or non-episodic memories. |
| vector | list<float32>[N] | Vector embedding (dimension auto-detected from model) |
CLI Commands
memory clear
Clear memory data for an agent.
initrunner memory clear role.yaml # clear all (prompts for confirmation)
initrunner memory clear role.yaml --force # skip confirmation
initrunner memory clear role.yaml --what sessions # clear only sessions
initrunner memory clear role.yaml --what memories # clear only long-term memories
initrunner memory clear role.yaml --what all # clear everything (same as no --what)
initrunner memory clear role.yaml --type semantic # clear only semantic memories
initrunner memory clear role.yaml --type episodic # clear only episodic memories
| Option | Type | Default | Description |
|---|---|---|---|
| role_file | Path | (required) | Path to the role YAML file. |
| --what | str | all | What to clear: sessions, memories, or all. |
| --type | str | null | Clear only a specific memory type: episodic, semantic, or procedural. Cannot be combined with --what sessions. |
| --force | bool | false | Skip the confirmation prompt. |
If the memory store database doesn't exist, the command prints "No memory store found." and exits.
memory export
Export all long-term memories to a JSON file.
initrunner memory export role.yaml # exports to memories.json
initrunner memory export role.yaml -o my-export.json # custom output path
| Option | Type | Default | Description |
|---|---|---|---|
| role_file | Path | (required) | Path to the role YAML file. |
| -o, --output | Path | memories.json | Output JSON file path. |
The exported JSON is an array of objects:
[
  {
    "id": 1,
    "content": "The user prefers dark mode.",
    "category": "preferences",
    "created_at": "2025-06-01T10:30:00+00:00",
    "memory_type": "semantic",
    "metadata": null
  },
  {
    "id": 2,
    "content": "Deployed v2.1 to staging successfully.",
    "category": "autonomous_run",
    "created_at": "2025-06-02T14:00:00+00:00",
    "memory_type": "episodic",
    "metadata": {"trigger_type": "cron"}
  }
]
memory import
Import memories from a JSON file into an agent's memory store. Content is re-embedded using the role's embedding config, so you can transfer memories between agents that use different embedding models.
initrunner memory import role.yaml memories.json
| Option | Type | Default | Description |
|---|---|---|---|
| role_file | Path | (required) | Path to the role YAML file. |
| input_file | Path | (required) | Path to the JSON file to import. |
The input JSON must be an array of memory objects matching the export format. Each object should have at least a content field. Optional fields: category (default: "general"), memory_type (default: "semantic"), created_at (preserved from export), and metadata.
Entries with blank content are skipped. Unknown memory_type values cause a fast failure with the record index in the error message. The store allocates new IDs for imported memories (exported id values are not preserved).
Embedding is done in batches of 50 using the role's memory.embeddings config. The role directory's .env file is loaded before embedding so API keys are available.
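The import rules above (skip blank content, fail fast on an unknown memory_type, apply defaults, embed in batches of 50) can be sketched as follows. Function names, the use of ValueError, and the order of the two checks are assumptions for illustration; the embedding call itself is left out.

```python
from typing import Iterator

VALID_TYPES = {"episodic", "semantic", "procedural"}


def prepare_import(records: list[dict]) -> list[dict]:
    """Validate exported records and fill in the documented defaults."""
    prepared = []
    for index, record in enumerate(records):
        content = str(record.get("content") or "").strip()
        if not content:
            continue  # blank content is skipped
        memory_type = record.get("memory_type", "semantic")
        if memory_type not in VALID_TYPES:
            # Fail fast and name the offending record by index.
            raise ValueError(f"record {index}: unknown memory_type {memory_type!r}")
        prepared.append({
            "content": content,
            "category": record.get("category", "general"),
            "memory_type": memory_type,
            "created_at": record.get("created_at"),
            "metadata": record.get("metadata"),
        })
    return prepared


def batches(items: list, size: int = 50) -> Iterator[list]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


records = [{"content": f"fact {i}"} for i in range(120)] + [{"content": "  "}]
ready = prepare_import(records)
batch_sizes = [len(batch) for batch in batches(ready)]
```

Exported `id` values are ignored here because the store allocates new IDs on import.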
Round-trip example
# Export from one agent
initrunner memory export roles/agent-a.yaml -o /tmp/mem.json
# Import into another agent
initrunner memory import roles/agent-b.yaml /tmp/mem.json
memory list
List stored memories for an agent.
initrunner memory list role.yaml # list all (default limit: 20)
initrunner memory list role.yaml --type procedural # filter by type
initrunner memory list role.yaml --category deployment # filter by category
initrunner memory list role.yaml --limit 50 # custom limit
| Option | Type | Default | Description |
|---|---|---|---|
| role_file | Path | (required) | Path to the role YAML file. |
| --type | str | null | Filter by memory type: episodic, semantic, or procedural. |
| --category | str | null | Filter by category. |
| --limit | int | 20 | Maximum number of results. |
memory consolidate
Manually run memory consolidation — extract semantic facts from unconsolidated episodic memories.
initrunner memory consolidate role.yaml
| Option | Type | Default | Description |
|---|---|---|---|
| role_file | Path | (required) | Path to the role YAML file. |
This command always runs consolidation regardless of the consolidation.interval setting. It processes up to consolidation.max_episodes_per_run unconsolidated episodes.
Store Location
~/.initrunner/memory/<agent-name>.lance
Override with store_path in the memory config. The directory is created automatically if it doesn't exist.
Shared Memory
Multiple agents can share a single memory database, allowing one agent's remember() calls to be visible to another agent's recall(). There are two mechanisms:
- Flow: set spec.shared_memory.enabled: true in a flow definition to give all agents a common store. See Agent Flow: Shared Memory.
- Delegation: set shared_memory.store_path on a delegate tool to share memory between inline sub-agents. See Delegation: Shared Memory.
Both work by overriding store_path (and optionally semantic.max_memories) on each agent's memory config at startup, pointing them at the same LanceDB database.
Concurrent access from multiple service threads is safe — LanceDB handles contention with internal locking.
Dimension & Model Identity Tracking
The memory store tracks embedding dimensions and model identity:
- Session-only usage: the store works without knowing dimensions. The memories_vec table is created lazily on the first remember() call.
- First remember() call: dimensions and the embedding model identity are detected and written to store_meta.
- Subsequent opens: dimensions and model identity are read from store_meta. An EmbeddingModelChangedError is raised if the model has changed; a DimensionMismatchError is raised if dimensions conflict.
- Migration: pre-existing stores default to 1536 dimensions.
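The open-time check can be pictured with the sketch below. The two exception classes are local stand-ins that merely reuse the error names mentioned above, and the metadata is modeled as a plain dict of strings; neither reflects the project's actual classes or storage code.

```python
class EmbeddingModelChangedError(Exception):
    """Local stand-in reusing the error name from this page."""


class DimensionMismatchError(Exception):
    """Local stand-in reusing the error name from this page."""


def check_store_meta(meta: dict[str, str], model: str, dimensions: int) -> dict[str, str]:
    """Record identity on first use; otherwise verify it against the store."""
    stored_model = meta.get("embedding_model")
    stored_dims = meta.get("dimensions")
    if stored_model is None and stored_dims is None:
        # First remember() call: write what was detected.
        return {**meta, "embedding_model": model, "dimensions": str(dimensions)}
    if stored_model is not None and stored_model != model:
        raise EmbeddingModelChangedError(f"store uses {stored_model}, got {model}")
    if stored_dims is not None and int(stored_dims) != dimensions:
        raise DimensionMismatchError(f"store uses {stored_dims} dims, got {dimensions}")
    return meta


fresh = check_store_meta({}, "openai:text-embedding-3-small", 1536)
reopened = check_store_meta(fresh, "openai:text-embedding-3-small", 1536)
```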
Scaffold
initrunner init --name assistant --template memory
This generates a role.yaml with memory pre-configured and a system prompt that instructs the agent to use remember(), recall(), and list_memories().
Embedding Models
Memory uses the same embedding provider resolution as Ingestion:
- memory.embeddings.model: if set, used directly.
- memory.embeddings.provider: used to look up the default model.
- spec.model.provider: falls back to the agent's model provider.
Provider Defaults
| Provider | Default Embedding Model |
|---|---|
| openai | openai:text-embedding-3-small |
| anthropic | openai:text-embedding-3-small |
| google | google:text-embedding-004 |
| ollama | ollama:nomic-embed-text |
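Putting the resolution order and the defaults table together gives something like the sketch below. The function name, its keyword arguments, and the ValueError for unknown providers are assumptions; the mapping is copied from the table above.

```python
PROVIDER_DEFAULTS = {
    "openai": "openai:text-embedding-3-small",
    "anthropic": "openai:text-embedding-3-small",
    "google": "google:text-embedding-004",
    "ollama": "ollama:nomic-embed-text",
}


def resolve_embedding_model(embeddings_model: str = "",
                            embeddings_provider: str = "",
                            model_provider: str = "") -> str:
    """Apply the documented order: explicit model, embeddings provider, model provider."""
    if embeddings_model:
        return embeddings_model
    provider = embeddings_provider or model_provider
    if provider not in PROVIDER_DEFAULTS:
        # Assumption: unknown providers are an error in this sketch.
        raise ValueError(f"no default embedding model for provider {provider!r}")
    return PROVIDER_DEFAULTS[provider]


explicit = resolve_embedding_model(embeddings_model="ollama:nomic-embed-text",
                                   model_provider="openai")
from_embeddings_provider = resolve_embedding_model(embeddings_provider="google",
                                                   model_provider="openai")
from_agent_model = resolve_embedding_model(model_provider="anthropic")
```

An agent running on Anthropic therefore resolves to the OpenAI embedding model unless memory.embeddings overrides it.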