InitRunner

Testing

InitRunner includes built-in tools for testing agents before deploying them — schema validation, dry-run mode (no API calls), and an eval-style test suite runner.

Validation

Validate a role YAML against the schema without running the agent:

initrunner validate role.yaml

This checks:

  • YAML syntax and structure
  • Required fields (apiVersion, kind, metadata.name, spec.role)
  • Field types and value ranges (e.g. temperature between 0.0 and 2.0)
  • Tool configurations (valid types, required fields per type)
  • Skill references (file exists, frontmatter is valid)
  • Trigger configurations (valid cron expressions, valid paths)
  • Security policy structure

Validation exits with code 0 on success and non-zero on failure, making it suitable for CI pipelines.
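Because of that exit-code contract, validation drops straight into any CI system. As a sketch, a GitHub Actions step might look like this (the workflow name, install command, and paths are illustrative assumptions, not part of InitRunner's documentation):

```yaml
# .github/workflows/validate.yml (illustrative)
name: validate-roles
on: [push]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pip install initrunner          # assumes a pip-installable package
      - run: initrunner validate role.yaml   # non-zero exit fails the job
```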

Dry-Run Mode

Run an agent without making any LLM API calls:

initrunner run role.yaml --dry-run -p "Test prompt"

Dry-run mode replaces the configured model with a TestModel that returns deterministic placeholder responses. This lets you verify:

  • Tool registration and discovery
  • Trigger configuration and startup
  • Memory system initialization
  • Skill loading and merging
  • Guardrail enforcement logic
  • Sink configuration

No API keys are required and no tokens are consumed. Use dry-run mode during development to catch configuration errors before spending on API calls.
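Conceptually, a dry-run model is just a deterministic stand-in: same prompt in, same placeholder out, no network involved. A minimal sketch of the idea (this is illustrative, not InitRunner's actual TestModel class):

```python
class StubModel:
    """Deterministic stand-in for an LLM: no network calls, no tokens consumed."""

    def complete(self, prompt: str) -> str:
        # Always return the same placeholder for a given prompt, so runs are
        # reproducible and configuration errors surface before any API spend.
        return f"[dry-run] would have answered: {prompt!r}"


model = StubModel()
print(model.complete("Test prompt"))
# → [dry-run] would have answered: 'Test prompt'
```

Because the response is a function of the prompt alone, repeated dry runs are byte-for-byte reproducible, which is what makes them useful for catching config regressions.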

Test Suites

The initrunner test command runs structured test suites against an agent using an eval framework.

initrunner test role.yaml -s test_suite.yaml

Test Suite Format

A test suite is a YAML file defining test cases with inputs and expected outcomes:

name: support-agent-tests
description: Regression tests for the support agent

tests:
  - name: answers_product_question
    prompt: "What is the return policy?"
    assertions:
      - type: contains
        value: "30 days"
      - type: contains
        value: "refund"

  - name: rejects_off_topic
    prompt: "What's the weather like?"
    assertions:
      - type: not_contains
        value: "forecast"
      - type: max_tokens
        value: 200

  - name: uses_search_tool
    prompt: "Find articles about shipping delays"
    assertions:
      - type: tool_called
        value: search_documents
      - type: contains
        value: "shipping"

  - name: stays_within_budget
    prompt: "Write a comprehensive guide to our product line"
    assertions:
      - type: max_tokens
        value: 4096
      - type: max_tool_calls
        value: 10

Assertion Types

Type             Description
contains         Output contains the specified string (case-insensitive)
not_contains     Output does not contain the specified string
regex            Output matches the regex pattern
max_tokens       Output token count is within the limit
max_tool_calls   Number of tool calls is within the limit
tool_called      The specified tool was invoked during the run
tool_not_called  The specified tool was not invoked
exit_status      Run completed with the expected status (success or error)
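InitRunner's actual evaluation logic is internal; as a rough illustration, the string- and tool-based assertion types above could be modeled like this (the function name and signature are hypothetical, and the token-count checks are omitted because they depend on the model's tokenizer):

```python
import re


def check_assertion(assertion: dict, output: str, tool_calls: list[str]) -> bool:
    """Evaluate one assertion against a run's output and tool-call log."""
    kind, value = assertion["type"], assertion["value"]
    if kind == "contains":        # case-insensitive substring match
        return value.lower() in output.lower()
    if kind == "not_contains":
        return value.lower() not in output.lower()
    if kind == "regex":           # pattern may match anywhere in the output
        return re.search(value, output) is not None
    if kind == "tool_called":
        return value in tool_calls
    if kind == "tool_not_called":
        return value not in tool_calls
    if kind == "max_tool_calls":
        return len(tool_calls) <= value
    raise ValueError(f"unsupported assertion type: {kind}")


# Example: the 'uses_search_tool' case from the suite above
output = "Here are three articles about shipping delays."
calls = ["search_documents"]
print(check_assertion({"type": "tool_called", "value": "search_documents"}, output, calls))  # True
print(check_assertion({"type": "contains", "value": "shipping"}, output, calls))             # True
```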

Running Tests

# Run a test suite
initrunner test role.yaml -s test_suite.yaml

# Dry-run tests (no API calls, uses TestModel)
initrunner test role.yaml -s test_suite.yaml --dry-run

# Verbose output
initrunner test role.yaml -s test_suite.yaml -v

Flag           Type  Default     Description
-s, --suite    str   (required)  Path to the test suite YAML
--dry-run      bool  false       Use TestModel instead of real API calls
-v, --verbose  bool  false       Show full output for each test case

Test Output

Running suite: support-agent-tests (4 tests)

  ✓ answers_product_question (1.2s, 340 tokens)
  ✓ rejects_off_topic (0.8s, 95 tokens)
  ✓ uses_search_tool (2.1s, 520 tokens)
  ✗ stays_within_budget
      FAIL: max_tokens — expected ≤4096, got 4301

Results: 3 passed, 1 failed (4.1s total)

Testing Workflow

A practical workflow for developing and testing agents:

  1. Validate — catch schema errors early:

     initrunner validate role.yaml

  2. Dry-run — verify tool registration and config without API calls:

     initrunner run role.yaml --dry-run -p "Test prompt"

  3. Interactive test — manual testing in REPL mode:

     initrunner run role.yaml -i

  4. Suite test — run automated assertions against real model output:

     initrunner test role.yaml -s tests/regression.yaml

  5. CI integration — validate and dry-run in CI, suite tests on schedule:

     # In CI pipeline
     initrunner validate role.yaml
     initrunner test role.yaml -s tests/smoke.yaml --dry-run
