Tutorial: Build a Research Assistant

This tutorial walks you through building a research assistant that searches the web, writes briefings, remembers what it found, and eventually runs on a schedule with a team of specialized personas.

Each step adds one feature. You will start with initrunner new, try the dashboard, then layer on memory, RAG, autonomous mode, triggers, teams, and flows. By the end you will have touched every major part of InitRunner.

Prerequisites: Complete the Quickstart first. You should have InitRunner installed, an API key configured, and a basic understanding of role.yaml.

The examples below use openai/gpt-5-mini. Swap the model: block if you use a different provider. See Provider Configuration for options.

mkdir research-assistant && cd research-assistant

Step 1: Create Your Agent

The fastest way to build an agent is to describe what you want in plain English:

initrunner new "a research assistant that searches the web for a given topic and writes a concise briefing with sources"

InitRunner sends your description to the LLM, which generates a complete role.yaml. You will see a syntax-highlighted panel with the result:

+-- research-assistant -------------------- VALID --+
| apiVersion: initrunner/v1                         |
| kind: Agent                                       |
| metadata:                                         |
|   name: research-assistant                        |
|   description: ...                                |
| spec:                                             |
|   role: |                                         |
|     You are a research assistant...               |
|   model:                                          |
|     provider: openai                              |
|     name: gpt-5-mini                              |
|   tools:                                          |
|     - type: search                                |
|     - type: web_reader                            |
|     - type: datetime                              |
|     - type: filesystem                            |
|       root_path: ./reports                        |
|       read_only: false                            |
|       allowed_extensions: [.md]                   |
|   guardrails:                                     |
|     max_tokens_per_run: 30000                     |
|     max_tool_calls: 15                            |
|     timeout_seconds: 120                          |
+---------------------------------------------------+

Refine (empty to save, "quit" to discard):
>

This is the refinement loop. You can type changes and the LLM will update the YAML. Try something like:

> also add a think tool so it can reason through complex topics

The panel updates with the new tool added. When you are happy with the result, press Enter on an empty line to save.

Your YAML will look different from the example above. The LLM generates it fresh each time. That is fine. What matters is that you have a search tool (for web search), a web_reader tool (for fetching pages), and a filesystem tool pointed at ./reports with read_only: false.
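
The checklist above can be written as a quick sanity check. This is a minimal Python sketch, not part of InitRunner. It assumes you have already loaded role.yaml into a plain dictionary (for example with a YAML parser). The helper name missing_essentials is hypothetical, and treating an absent read_only as read-only is an assumption.

```python
def missing_essentials(role: dict) -> list[str]:
    """Return the tutorial's required tool settings that the role lacks."""
    tools = role.get("spec", {}).get("tools", [])
    by_type = {tool.get("type"): tool for tool in tools}
    problems = []
    for required in ("search", "web_reader", "filesystem"):
        if required not in by_type:
            problems.append(f"missing tool: {required}")
    fs = by_type.get("filesystem")
    if fs is not None:
        if fs.get("root_path") != "./reports":
            problems.append("filesystem root_path is not ./reports")
        # Assumption: an unset read_only is treated as read-only here.
        if fs.get("read_only", True):
            problems.append("filesystem is read-only")
    return problems


# A dictionary mirroring the tools in the example panel above.
example_role = {
    "spec": {
        "tools": [
            {"type": "search"},
            {"type": "web_reader"},
            {"type": "datetime"},
            {"type": "filesystem", "root_path": "./reports", "read_only": False},
        ]
    }
}
print(missing_essentials(example_role))  # → []
```

An empty list means the generated role has everything the later steps rely on.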

Prefer a template? Run initrunner new --template basic for a minimal starting point, or initrunner new --blank for the bare minimum. See Role Creation for all the options.

Now run it:

initrunner run role.yaml -p "Research the current state of AI agent frameworks and write a briefing"

The agent searches the web, reads relevant pages, and writes a briefing. Check ./reports/ for the saved file.

Shortcut: Pass --run "your prompt" straight to initrunner new and it will kick off that first run as soon as the refinement loop closes, so you don't need a separate initrunner run command the first time around.

Step 2: See It in the Dashboard

Everything you just did in the terminal also works in the browser. Launch the dashboard:

initrunner dashboard

This opens http://localhost:8100 in your browser. You will see the Launchpad with stats and starter cards. Click Agents in the sidebar to find your research assistant.

From here you can:

Run it. Click the play button on your agent's card. A slide-over drawer opens where you can type a prompt and run the agent without leaving the page.
Watch it work. During a run, the bottom panel shows live tool activity. You will see each tool call appear in real time with status indicators, durations, and a token/cost meter.
Edit it. Click into the agent detail page and open the Editor tab to modify the YAML directly in the browser.

You can also create agents entirely in the dashboard. On the agent creation page, the AI Generate tab works just like initrunner new but in the browser. There is also a Form Builder tab if you prefer filling in fields over writing YAML.

Two other top-level pages are worth a click while you're here:

MCP Hub (/mcp). A visual manager for Model Context Protocol servers. Its Playground tab lets you fire off any tool in isolation and see the raw response, and its Discover tab lists a curated set of popular servers you can copy into a role.
Cost Analytics (/cost). Breaks down token spend and estimated USD cost per agent, per model, and per day once you have a few runs logged, so you can spot the expensive agent in your roster before the bill does.

Tip: The dashboard runs alongside the CLI. Changes you make in one show up in the other. Edit in whichever feels more comfortable.

Step 3: Add Memory

Right now your agent forgets everything between runs. Add a memory block so it can remember findings across sessions.

Open role.yaml and add the following under the spec: key:

  memory:
    max_sessions: 10
    max_resume_messages: 20
    semantic:
      max_memories: 1000

This does two things. First, it saves conversation history so you can resume sessions with --resume. Second, it gives the agent long-term memory tools: remember(), recall(), list_memories(), learn_procedure(), and record_episode(). These are auto-registered when the memory block is present.

Try it in interactive mode:

initrunner run role.yaml -i

You: Research quantum computing breakthroughs from this month
Agent: [searches, reads pages, writes briefing, remembers key findings]
You: quit

Start a new session and ask about previous research:

initrunner run role.yaml -i

You: What do you remember about quantum computing?
Agent: Based on my memories, I found that...

Or pick up exactly where you left off:

initrunner run role.yaml -i --resume

This restores the full conversation history, not just the semantic memories.

To see what the agent has stored:

initrunner memory list role.yaml
initrunner memory list role.yaml --type semantic --limit 5

In the dashboard, the agent detail page has a Memory tab where you can browse stored memories visually.

For the full picture on episodic, semantic, and procedural memory, see Memory.

Step 4: Add a Knowledge Base

Your agent has been saving reports to ./reports/. You can make those searchable by adding document ingestion. This gives the agent a search_documents() tool that queries its own past work.

Add this under spec: in your role.yaml:

  ingest:
    sources:
      - ./reports/**/*.md
    chunking:
      strategy: fixed
      chunk_size: 512
      chunk_overlap: 50

If you do not have enough reports yet, run the agent a few times to build up a collection. Then ask about past research:

initrunner run role.yaml -p "What have I researched about AI agents? Summarize my previous findings."

On that run, InitRunner notices the new ingest: block, reads the matching files, splits them into chunks, generates embeddings, and stores everything in a local vector database. The agent then calls search_documents() to pull relevant chunks from your reports and cite them in the answer.
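
To get a feel for what chunk_size and chunk_overlap do, here is a minimal Python sketch of fixed-size chunking with overlap. It illustrates the general technique rather than InitRunner's implementation. In particular it counts characters, which is an assumption, since the real pipeline may count tokens or split on different boundaries.

```python
def fixed_chunks(text: str, chunk_size: int = 512, chunk_overlap: int = 50) -> list[str]:
    """Split text into fixed-size windows that overlap by chunk_overlap units."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


report = "x" * 1200  # stand-in for the contents of one report
print([len(c) for c in fixed_chunks(report)])  # → [512, 512, 276]
```

Each window starts 462 units after the previous one (512 minus 50), so the last 50 units of one chunk reappear at the start of the next. That overlap keeps a sentence that straddles a boundary retrievable from at least one chunk.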

Auto-ingest is the default. Every subsequent initrunner run does a cheap mtime check and re-indexes only the files that were added, modified, or removed since the last pass, so there is no manual re-index step to remember. If you would rather control indexing yourself, set ingest.auto: false in your role YAML and run initrunner ingest role.yaml by hand whenever you want to refresh.

To force a full rebuild regardless of timestamps, run:

initrunner ingest role.yaml --force

You want --force after swapping embedding models, after copying files with cp -p (which preserves timestamps and defeats the mtime check), or any time you just want to be sure the index matches the source of truth.
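
The following Python sketch shows how an mtime-based change check can work in general, and why preserved timestamps slip past it. It is an illustration under assumptions, not InitRunner's code. The function names and the snapshot format are hypothetical.

```python
from pathlib import Path


def snapshot(root: Path, pattern: str = "**/*.md") -> dict[str, float]:
    """Map each matching file to its last-modified time."""
    return {str(p): p.stat().st_mtime for p in root.glob(pattern) if p.is_file()}


def changed_files(previous: dict[str, float], current: dict[str, float]) -> dict[str, list[str]]:
    """Compare two snapshots and report what an incremental pass would touch."""
    return {
        "added": sorted(set(current) - set(previous)),
        "removed": sorted(set(previous) - set(current)),
        "modified": sorted(
            path for path in set(current) & set(previous)
            if current[path] != previous[path]
        ),
    }


before = {"reports/a.md": 100.0, "reports/b.md": 200.0}
after = {"reports/a.md": 100.0, "reports/b.md": 250.0, "reports/c.md": 300.0}
print(changed_files(before, after))
# → {'added': ['reports/c.md'], 'removed': [], 'modified': ['reports/b.md']}
```

A file whose content changed but whose timestamp stayed the same (reports/a.md in this example, if it had been overwritten with cp -p) shows up in none of the three lists, which is the case a forced rebuild covers.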

In the dashboard, the Ingest tab on the agent detail page lets you upload files, add URLs, re-index with a progress bar, and delete individual documents.

By default retrieval is vector search, but you can switch to hybrid search that blends vector and keyword scoring, and you can embed with an in-process local: model instead of a provider API. See Ingestion Pipeline, RAG Guide, and Providers for chunking strategies, retrieval modes, and embedding options.
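
As a rough picture of what "blending" vector and keyword scoring can mean, here is a Python sketch of one common approach, a weighted sum of the two scores. How InitRunner actually combines them is not described here, so the alpha weight, the function name, and the assumption that both scores are normalized to the range 0 to 1 are all hypothetical.

```python
def hybrid_rank(
    vector_scores: dict[str, float],
    keyword_scores: dict[str, float],
    alpha: float = 0.5,
) -> list[str]:
    """Rank chunk ids by a weighted blend of two score sets (both assumed in 0..1)."""
    ids = set(vector_scores) | set(keyword_scores)
    blended = {
        cid: alpha * vector_scores.get(cid, 0.0)
        + (1 - alpha) * keyword_scores.get(cid, 0.0)
        for cid in ids
    }
    # Highest blended score first; ties are broken by chunk id for stable output.
    return sorted(blended, key=lambda cid: (-blended[cid], cid))


vector = {"chunk-1": 0.9, "chunk-2": 0.4}    # semantic similarity
keyword = {"chunk-2": 1.0, "chunk-3": 0.6}   # exact-term matches
print(hybrid_rank(vector, keyword))  # → ['chunk-2', 'chunk-1', 'chunk-3']
```

chunk-2 wins because it scores reasonably on both signals, even though chunk-1 has the best vector score on its own. Hybrid retrieval is meant to surface this kind of result, for instance when a query contains an exact product name or identifier.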

Step 5: Go Autonomous

So far you have been giving the agent a single prompt and getting one response back. Autonomous mode lets it plan, execute, and iterate on multi-step tasks without you prompting each step.

Your guardrails block already has max_tokens_per_run and max_tool_calls. For autonomous mode, you also want to set iteration limits. Update your guardrails:

  guardrails:
    max_tokens_per_run: 50000
    max_tool_calls: 30
    timeout_seconds: 300
    max_iterations: 8
    autonomous_token_budget: 40000

max_iterations caps how many plan-execute-reflect cycles the agent runs. autonomous_token_budget is a separate token ceiling for the entire autonomous loop. Both are safety nets.
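
The way the two limits interact can be sketched in a few lines of Python. This is a simplified model, not InitRunner's loop. The step function stands in for one plan-execute-reflect cycle and simply reports how many tokens it used, and a real agent would also stop as soon as it judges the task complete.

```python
from typing import Callable


def run_loop(
    step: Callable[[int], int],
    max_iterations: int = 8,
    token_budget: int = 40000,
) -> tuple[int, int, str]:
    """Run step(i) until an iteration or token limit is hit.

    step returns the tokens it consumed.
    Returns (iterations run, tokens used, stop reason).
    """
    tokens_used = 0
    for iteration in range(1, max_iterations + 1):
        tokens_used += step(iteration)
        if tokens_used >= token_budget:
            return iteration, tokens_used, "token budget reached"
    return max_iterations, tokens_used, "iteration limit reached"


# Stand-in cycles with a fixed token cost each.
print(run_loop(lambda i: 6000))  # → (7, 42000, 'token budget reached')
print(run_loop(lambda i: 1000))  # → (8, 8000, 'iteration limit reached')
```

Whichever limit is reached first ends the run, so an expensive task stops on tokens while a cheap but chatty one stops on iterations.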

Run it:

initrunner run role.yaml -a -p "Research the top 5 AI agent frameworks, compare their strengths and weaknesses, and write a detailed comparison report"

The agent builds a plan, works through it step by step, and writes a report. You will see it iterate through multiple cycles of searching, reading, and writing.

Cost note: Autonomous mode uses more tokens than one-shot mode since it runs multiple LLM calls in a loop. Start with a low max_iterations (5-8) and adjust once you see how the agent actually behaves on your task. InitRunner estimates the USD cost of every run via genai-prices and prints it alongside the token counts when the run finishes, so you can watch real spend add up instead of guessing from token counts. The --dry-run flag lets you test the flow without making API calls.

For reasoning strategies like plan_execute, todo_driven, and reflexion, see Reasoning and Autonomous Execution.

Step 6: Run on a Schedule

Triggers let the agent run automatically. Add a cron trigger and a file sink to log results:

  triggers:
    - type: cron
      schedule: "0 9 * * 1-5"
      prompt: "Research the latest AI news from today and write a morning briefing. Compare with previous findings."
      timezone: US/Eastern
  sinks:
    - type: file
      path: ./logs/research.jsonl
      format: json

This fires every weekday at 9am Eastern. The file sink logs each result as a JSON line to ./logs/research.jsonl.
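
If cron syntax is new to you, the Python sketch below checks a five-field expression against a wall-clock time. It is a deliberately small matcher for illustration, not the scheduler InitRunner uses. It handles only "*", single numbers, lists, and ranges, ignores step values such as */5, and leaves time zone conversion out.

```python
from datetime import datetime


def field_matches(field: str, value: int) -> bool:
    """Match one cron field supporting '*', single numbers, lists, and ranges."""
    if field == "*":
        return True
    for part in field.split(","):
        low, _, high = part.partition("-")
        if int(low) <= value <= int(high or low):
            return True
    return False


def cron_matches(expression: str, moment: datetime) -> bool:
    """Check a five-field cron expression against a local wall-clock time."""
    minute, hour, day, month, weekday = expression.split()
    return (
        field_matches(minute, moment.minute)
        and field_matches(hour, moment.hour)
        and field_matches(day, moment.day)
        and field_matches(month, moment.month)
        and field_matches(weekday, moment.isoweekday() % 7)  # cron: 0 = Sunday
    )


schedule = "0 9 * * 1-5"
print(cron_matches(schedule, datetime(2025, 1, 6, 9, 0)))  # Monday 09:00 → True
print(cron_matches(schedule, datetime(2025, 1, 4, 9, 0)))  # Saturday 09:00 → False
print(cron_matches(schedule, datetime(2025, 1, 6, 9, 1)))  # Monday 09:01 → False
```

Reading the fields left to right: minute 0, hour 9, any day of the month, any month, weekdays 1 through 5 (Monday through Friday).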

For testing, temporarily change the schedule to "* * * * *" (every minute) so you do not have to wait, then start the daemon:

initrunner run role.yaml --daemon

Wait about a minute. The trigger fires, the agent runs, and the result appears in the sink file. Stop the daemon with Ctrl+C.
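
To inspect the sink file programmatically, a short Python sketch like the one below is enough. It relies only on the JSON Lines convention of one JSON value per line. The field names inside each record are not documented in this tutorial, so the sketch lists whatever keys it finds instead of assuming any.

```python
import json
from pathlib import Path


def read_jsonl(path: Path) -> list[dict]:
    """Parse a JSON Lines file: one JSON object per non-empty line."""
    records = []
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                records.append(json.loads(line))
    return records


log_path = Path("./logs/research.jsonl")
if log_path.exists():
    records = read_jsonl(log_path)
    print(f"{len(records)} runs logged")
    if records:
        # Field names depend on the sink, so just list what the latest record has.
        print("latest record keys:", sorted(records[-1]))
```

Because every line is a complete JSON document, the file can be appended to safely and processed line by line without loading a single large array.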

Change the schedule back to something practical before leaving it running.

Cap the daily bill. A scheduled agent can rack up surprising charges if a trigger fires more often than you expected, or if the agent burns through more tokens on each run than you planned for. InitRunner has USD ceilings built into guardrails for exactly this:

  guardrails:
    max_tokens_per_run: 50000
    max_tool_calls: 30
    timeout_seconds: 300
    daemon_daily_cost_budget: 2.00
    daemon_weekly_cost_budget: 10.00

The daemon estimates each run's cost before it dispatches, and once the counter hits the cap it stops firing new runs until the window resets. The daily counter resets at UTC midnight, the weekly one at the start of the ISO week.
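
The windowing behaviour described above can be modelled in Python as follows. This is a sketch under stated assumptions, not the daemon's code. The class and method names are hypothetical, and it refuses a run only once a counter has already reached its cap.

```python
from datetime import datetime, timezone


class CostBudget:
    """Track estimated spend per UTC day and per ISO week."""

    def __init__(self, daily_cap: float, weekly_cap: float) -> None:
        self.daily_cap = daily_cap
        self.weekly_cap = weekly_cap
        self.by_day: dict = {}
        self.by_week: dict = {}

    def try_spend(self, estimate: float, now: datetime) -> bool:
        """Record the run and return True only if both windows have room.

        now should be a timezone-aware datetime.
        """
        now = now.astimezone(timezone.utc)
        day = now.date()                      # new key after UTC midnight
        week = tuple(now.isocalendar())[:2]   # (ISO year, ISO week number)
        if self.by_day.get(day, 0.0) >= self.daily_cap:
            return False
        if self.by_week.get(week, 0.0) >= self.weekly_cap:
            return False
        self.by_day[day] = self.by_day.get(day, 0.0) + estimate
        self.by_week[week] = self.by_week.get(week, 0.0) + estimate
        return True


budget = CostBudget(daily_cap=2.00, weekly_cap=10.00)
monday = datetime(2025, 1, 6, 14, 0, tzinfo=timezone.utc)
print([budget.try_spend(0.50, monday) for _ in range(5)])
# → [True, True, True, True, False]
tuesday = datetime(2025, 1, 7, 0, 5, tzinfo=timezone.utc)
print(budget.try_spend(0.50, tuesday))  # → True
```

Keying the counters by UTC date and ISO week means a reset needs no timer: the first run after the boundary simply lands in a fresh, empty bucket.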

Want the agent to do multi-step research on each trigger? Use --autopilot instead of --daemon. This runs the full autonomous loop (from Step 5) each time a trigger fires, instead of a single one-shot response.

In the dashboard, the Timeline tab on the agent detail page shows a Gantt-style chart of triggered runs over the last 24 hours. Color-coded bars show success, failure, and duration at a glance.

For all trigger types (file watch, webhook, Telegram, Discord, heartbeat), see Triggers. For sink options, see Sinks.

Step 7: Build a Research Team

A single agent does everything. A team splits the work across specialized personas that collaborate on the same task.

Create a new file called team.yaml:

apiVersion: initrunner/v1
kind: Team
metadata:
  name: research-team
  description: Three-persona research team with fact-checking
spec:
  strategy: sequential
  model:
    provider: openai
    name: gpt-5-mini
    temperature: 0.3
  personas:
    researcher:
      role: |
        You are a thorough researcher. Search the web for information
        on the given topic. Focus on finding primary sources, recent
        data, and expert opinions. Pass your raw findings to the
        fact-checker.
    fact-checker:
      role: |
        You are a skeptical fact-checker. Review the researcher's
        findings. Flag anything that looks unsupported, outdated, or
        contradictory. Note which claims have strong sources and which
        need qualification.
    writer:
      role: |
        You are a concise technical writer. Take the researched and
        fact-checked material and write a clear, well-structured
        briefing. Include source links. Flag any claims the
        fact-checker marked as weak.
  tools:
    - type: search
    - type: web_reader
    - type: datetime
    - type: filesystem
      root_path: ./reports
      read_only: false
      allowed_extensions: [.md]
    - type: think
  guardrails:
    max_tokens_per_run: 50000
    timeout_seconds: 300
    team_token_budget: 150000

With strategy: sequential, each persona runs in order: researcher, then fact-checker, then writer. Each one sees the output of the previous persona.
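
The hand-off between personas can be pictured with a small Python sketch. It models only the data flow of a sequential strategy. The lambdas are stand-ins for LLM calls and the function name is hypothetical. The sketch passes along just the previous output, whereas a real team run may give each persona additional context.

```python
from typing import Callable

Persona = Callable[[str], str]


def run_sequential(personas: dict[str, Persona], prompt: str) -> list[tuple[str, str]]:
    """Run personas in declaration order, feeding each the previous output."""
    transcript = []
    current_input = prompt
    for name, persona in personas.items():
        output = persona(current_input)
        transcript.append((name, output))
        current_input = output  # the next persona sees this output
    return transcript


# Stand-ins for LLM calls: each one just tags the text it received.
team = {
    "researcher": lambda text: f"findings({text})",
    "fact-checker": lambda text: f"checked({text})",
    "writer": lambda text: f"briefing({text})",
}
for name, output in run_sequential(team, "open-source LLMs"):
    print(name, "->", output)
# researcher -> findings(open-source LLMs)
# fact-checker -> checked(findings(open-source LLMs))
# writer -> briefing(checked(findings(open-source LLMs)))
```

The nesting in the final line shows why order matters: the writer can only flag weak claims because the fact-checker's notes are already part of its input.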

Run it:

initrunner run team.yaml -p "Research the current state of open-source LLMs"

You will see three turns of output as each persona does its part.

Want them to debate? Change strategy to debate and add a debate block:

  strategy: debate
  debate:
    max_rounds: 3
    synthesize: true

The personas argue their perspectives for multiple rounds, then a synthesis step combines the best points. See Team Mode.

In the dashboard, the Teams page shows your team with a pipeline visualization. The run panel streams output from each persona in real time.

Step 8: Orchestrate with Flows

Teams share one model config and pass text between personas. Flows are for when you need separate agents with their own models, tools, and triggers, connected by routing logic.

Say you want an intake agent that watches for research requests and routes them to the right specialist. Create a directory structure:

research-flow/
├── flow.yaml
└── roles/
    ├── intake.yaml           # your existing role.yaml, with a cron trigger
    └── deep-researcher.yaml  # a second agent for in-depth work

The flow.yaml connects them:

apiVersion: initrunner/v1
kind: Flow
metadata:
  name: research-flow
  description: Intake agent routes research requests to a deep researcher
spec:
  agents:
    intake:
      role: roles/intake.yaml
      sink:
        type: delegate
        target: deep-researcher
    deep-researcher:
      role: roles/deep-researcher.yaml
      needs: [intake]

When the intake agent finishes, its output is delegated to the deep researcher. The needs field means the deep researcher only runs after intake completes.
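
Dependency ordering of this kind is a topological sort. The Python sketch below uses the standard library's graphlib to derive a run order from a mapping that mirrors the needs fields. It illustrates the concept only, and how InitRunner schedules agents internally is not shown in this tutorial.

```python
from graphlib import CycleError, TopologicalSorter


def run_order(needs: dict[str, list[str]]) -> list[str]:
    """Return an order in which every agent comes after the agents it needs."""
    return list(TopologicalSorter(needs).static_order())


# Each agent maps to the agents it needs, mirroring flow.yaml above.
flow_needs = {
    "intake": [],
    "deep-researcher": ["intake"],
}
print(run_order(flow_needs))  # → ['intake', 'deep-researcher']

# Two agents that need each other can never start.
try:
    run_order({"a": ["b"], "b": ["a"]})
except CycleError:
    print("cycle detected")
```

Catching circular dependencies before anything starts is the sort of check a validation step can run up front.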

You can also scaffold this automatically:

initrunner flow new research-flow --pattern chain --agents 2

Validate and run:

initrunner flow validate flow.yaml
initrunner flow up flow.yaml

Routing options: The sink supports four strategies. all sends to every target (broadcast). keyword parses the output for routing hints. sense uses an LLM to pick the best target based on content. ensemble broadcasts the same prompt to every target and votes on the answers. You can also set loop_back to re-run a step until its output meets a condition. See Flow for details.
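
To make the difference between broadcast and keyword routing concrete, here is a Python sketch of just those two strategies. The hint words, the second target named summarizer, and the matching rule (case-insensitive substring search) are hypothetical choices for illustration. The actual routing rules are documented under Flow.

```python
def route(strategy: str, output: str, targets: dict[str, list[str]]) -> list[str]:
    """Pick delegate targets for an agent's output.

    targets maps each target name to hypothetical hint words.
    """
    if strategy == "all":
        return list(targets)  # broadcast to every target
    if strategy == "keyword":
        text = output.lower()
        return [
            name for name, hints in targets.items()
            if any(hint in text for hint in hints)
        ]
    raise ValueError(f"strategy not covered by this sketch: {strategy}")


targets = {
    "deep-researcher": ["research", "sources"],
    "summarizer": ["summary", "tl;dr"],
}
print(route("all", "anything", targets))
# → ['deep-researcher', 'summarizer']
print(route("keyword", "Needs more research on licensing", targets))
# → ['deep-researcher']
```

Keyword routing is cheap and predictable but only as good as its hints, which is the gap an LLM-based strategy such as sense is meant to fill.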

Sharing state between agents: Agents in a flow can post, read, and claim structured entries on a shared blackboard instead of threading everything through prompt text. Add type: blackboard to a role's tools. The board is per-run and only active inside a flow. Flow runs also checkpoint their state, so a long run can resume after a restart. See Durability.
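
The post, read, and claim operations can be pictured with a tiny in-memory model in Python. The real blackboard tool's interface is not shown in this tutorial, so the class, method names, and entry fields below are hypothetical. The sketch only illustrates why structured entries are easier to coordinate on than prompt text.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Entry:
    entry_id: int
    author: str
    topic: str
    data: Any
    claimed_by: Optional[str] = None


class Blackboard:
    """In-memory board that lives for a single run."""

    def __init__(self) -> None:
        self._entries: list[Entry] = []

    def post(self, author: str, topic: str, data: Any) -> int:
        """Add an entry and return its id."""
        entry = Entry(len(self._entries) + 1, author, topic, data)
        self._entries.append(entry)
        return entry.entry_id

    def read(self, topic: str) -> list[Entry]:
        """Return every entry posted under a topic."""
        return [e for e in self._entries if e.topic == topic]

    def claim(self, entry_id: int, agent: str) -> bool:
        """Mark an entry as taken; only the first claimant succeeds."""
        entry = self._entries[entry_id - 1]
        if entry.claimed_by is not None:
            return False
        entry.claimed_by = agent
        return True


board = Blackboard()
task = board.post("intake", "request", {"topic": "open-source LLMs"})
print(board.claim(task, "deep-researcher"))  # → True
print(board.claim(task, "intake"))           # → False
print(board.read("request")[0].data)         # → {'topic': 'open-source LLMs'}
```

Claiming is what keeps two agents from picking up the same piece of work, something that is hard to guarantee when state is passed around as free text.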

In the dashboard, the Flow page has a visual editor where you can see and edit the agent graph. During a run, you can watch events stream between agents in real time.

What's Next

You have built a research assistant, given it memory and a knowledge base, made it autonomous, put it on a schedule, assembled a team, and connected agents in a flow. Here is where to go from here:

More tools: InitRunner has 28 built-in tool types including shell, Python, git, SQL, MCP servers, and more. See Tools.
Cost tracking: Monitor token usage and spending across agents. See Cost Tracking.
Dev workflow agents: Run pre-built PR review, changelog, and CI explainer agents in 10 minutes. See Dev Workflow Agents.
Telegram or Discord bot: Turn any agent into a chat bot. See Telegram and Discord.
API server: Expose your agent as an OpenAI-compatible endpoint with initrunner run role.yaml --serve. See API Server.
Security: Tool sandboxing, ABAC policies, and audit logging. See Security and InitGuard.
Browse community agents: Find and install pre-built agents at InitHub.
Full YAML reference: Every field documented. See Configuration.
