Observability

InitRunner supports opt-in distributed tracing via OpenTelemetry. When enabled, agent runs, LLM requests, tool calls, ingestion pipelines, and delegation chains all emit traces that can be visualized in any OTel-compatible backend (Jaeger, Grafana Tempo, Datadog, Honeycomb, Logfire, etc.).

The SQLite audit trail remains the lightweight default. Observability adds a second, richer signal layer — both run side-by-side.

Quick Start

See traces in under a minute — no Docker, no external services:

pip install initrunner[observability]
initrunner run traced-agent.yaml -p "What time is it?" --no-audit

JSON spans print to stderr showing the full trace hierarchy: the parent initrunner.agent.run span, the PydanticAI agent run and chat spans, and the running tool (get_current_time) span.
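
For reference, a minimal traced-agent.yaml could look like the sketch below. Only metadata.name and the spec.observability block follow fields documented on this page; the model and tools entries are illustrative assumptions about the role schema, not verified field names.

# traced-agent.yaml (illustrative sketch, not a verified role file)
metadata:
  name: traced-agent            # becomes the default service_name
spec:
  model: openai:gpt-4o-mini     # assumption: model field name/format
  tools:
    - get_current_time          # assumption: how tools are declared
  observability:
    backend: console            # console backend prints JSON spans to stderr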

Console Output Example

With backend: console, each completed span is printed to stderr as a JSON object. A typical run produces output like this (timestamps and IDs shortened for readability):

{
    "name": "running tool (get_current_time)",
    "context": {
        "trace_id": "0x3a1f...",
        "span_id": "0x8b2c...",
        "trace_state": "[]"
    },
    "kind": "SpanKind.INTERNAL",
    "parent_id": "0x4d1e...",
    "start_time": "2026-02-17T12:00:00.100000Z",
    "end_time": "2026-02-17T12:00:00.102000Z",
    "status": { "status_code": "OK" },
    "attributes": {}
}
{
    "name": "chat gpt-4o-mini",
    "context": {
        "trace_id": "0x3a1f...",
        "span_id": "0x4d1e..."
    },
    "kind": "SpanKind.CLIENT",
    "parent_id": "0x9f3a...",
    "attributes": {
        "gen_ai.operation.name": "chat",
        "gen_ai.request.model": "gpt-4o-mini",
        "gen_ai.response.model": "gpt-4o-mini-2024-07-18",
        "gen_ai.usage.input_tokens": 85,
        "gen_ai.usage.output_tokens": 24
    }
}
{
    "name": "initrunner.agent.run",
    "context": {
        "trace_id": "0x3a1f...",
        "span_id": "0x7e5b..."
    },
    "kind": "SpanKind.INTERNAL",
    "attributes": {
        "initrunner.agent_name": "traced-agent",
        "initrunner.run_id": "a1b2c3d4",
        "initrunner.tokens_total": 109,
        "initrunner.duration_ms": 1200,
        "initrunner.success": true
    }
}

Spans appear in completion order (leaf spans first, root span last). All spans share the same trace_id, forming a single trace.

Installation

pip install initrunner[observability]

This installs opentelemetry-sdk, opentelemetry-exporter-otlp, and opentelemetry-instrumentation-logging.

For the Logfire backend, install separately:

pip install logfire

Configuration

Add an observability section to your role's spec:

spec:
  observability:
    backend: otlp              # "otlp" | "logfire" | "console"
    endpoint: http://localhost:4317
    service_name: my-agent     # default: agent metadata.name
    trace_tool_calls: true
    trace_token_usage: true
    sample_rate: 1.0
    include_content: false     # include prompts/completions in spans
| Field | Type | Default | Description |
|---|---|---|---|
| backend | otlp \| logfire \| console | otlp | Exporter backend |
| endpoint | string | http://localhost:4317 | OTLP gRPC endpoint (ignored for console/logfire) |
| service_name | string | agent name | Service name in traces |
| trace_tool_calls | bool | true | Emit spans for tool calls |
| trace_token_usage | bool | true | Emit token usage metrics |
| sample_rate | float (0.0–1.0) | 1.0 | Trace sampling rate |
| include_content | bool | false | Include prompt/completion text in spans |

Quickstart with Jaeger

Docker run

docker run -d --name jaeger \
  -p 16686:16686 \
  -p 4317:4317 \
  jaegertracing/all-in-one:latest

Docker Compose

# docker-compose.yaml
services:
  jaeger:
    image: jaegertracing/all-in-one:latest
    ports:
      - "16686:16686"   # Jaeger UI
      - "4317:4317"     # OTLP gRPC
Then start it:

docker compose up -d

Run with OTLP

Add observability to your role:

spec:
  observability:
    backend: otlp
    endpoint: http://localhost:4317

Run your agent:

initrunner run role.yaml -p "Hello, world"

Open Jaeger UI at http://localhost:16686 and search for your agent's service name.

Span Hierarchy

When observability is enabled, traces follow this hierarchy:

initrunner.agent.run                    ← InitRunner parent span
├── agent run                           ← PydanticAI agent span
│   ├── chat gpt-4o                     ← LLM request span
│   ├── running tool (my_tool)          ← Tool execution span
│   └── chat gpt-4o                     ← Follow-up LLM request
└── initrunner.ingest                   ← Ingestion pipeline span (if applicable)

InitRunner-Specific Spans

| Span Name | Attributes |
|---|---|
| initrunner.agent.run | initrunner.run_id, initrunner.agent_name, initrunner.trigger_type, initrunner.tokens_total, initrunner.duration_ms, initrunner.success |
| initrunner.ingest | initrunner.agent_name, initrunner.ingest.files_processed, initrunner.ingest.chunks_created |
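
In plain OpenTelemetry terms, producing a span like initrunner.agent.run amounts to the sketch below (a rough reconstruction for orientation, not InitRunner's actual code):

# Rough reconstruction of an initrunner.agent.run span using the
# standard OTel API; attribute names are taken from the table above.
from opentelemetry import trace

tracer = trace.get_tracer("initrunner")

with tracer.start_as_current_span("initrunner.agent.run") as span:
    span.set_attribute("initrunner.agent_name", "traced-agent")
    span.set_attribute("initrunner.run_id", "a1b2c3d4")
    # ... the agent run happens here, then outcome attributes are recorded
    span.set_attribute("initrunner.tokens_total", 109)
    span.set_attribute("initrunner.success", True)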

PydanticAI Spans (Automatic)

PydanticAI emits the following spans and metrics when instrument is set on the Agent (see the sketch after this list):

  • agent run — Full agent run lifecycle
  • chat {model} — Each LLM API call (SpanKind.CLIENT)
  • running tool — Each tool execution
  • gen_ai.client.token.usage — Token usage histogram metric
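
Outside InitRunner, the same instrumentation is switched on with PydanticAI's instrument flag; InitRunner presumably does the equivalent internally when observability is configured. A minimal sketch:

# Minimal PydanticAI instrumentation sketch. instrument=True is
# PydanticAI's own flag; a TracerProvider must already be configured
# for the emitted spans to be exported anywhere.
from pydantic_ai import Agent

agent = Agent("openai:gpt-4o-mini", instrument=True)
result = agent.run_sync("What time is it?")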

Distributed Traces via Delegation

In compose orchestrations, trace context propagates automatically through delegation chains using W3C Trace Context (traceparent/tracestate headers).

initrunner.agent.run [service_a]
├── agent run [PydanticAI]
│   ├── chat gpt-4o
│   └── running tool (delegate)
└── initrunner.agent.run [service_b]    ← linked via traceparent
    └── agent run [PydanticAI]
        └── chat gpt-4o

This means you can visualize an entire multi-agent pipeline as a single distributed trace in Jaeger or your preferred backend.
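
The propagation mechanism itself is standard OpenTelemetry. The sketch below shows generic W3C Trace Context injection and extraction with the plain OTel API; it illustrates the mechanism described above and is not an InitRunner interface:

# Generic W3C Trace Context propagation with the OTel API.
from opentelemetry import trace
from opentelemetry.propagate import inject, extract

tracer = trace.get_tracer("example")

# Delegating side: serialize the active span context into a carrier,
# e.g. headers attached to the delegation request.
carrier: dict[str, str] = {}
with tracer.start_as_current_span("running tool (delegate)"):
    inject(carrier)  # writes "traceparent" (and "tracestate" if present)

# Delegated side: resume the trace from the received carrier.
ctx = extract(carrier)
with tracer.start_as_current_span("initrunner.agent.run", context=ctx):
    pass  # this span joins the caller's trace instead of starting a new one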

Backends

OTLP (Default)

Sends traces via gRPC to any OTLP-compatible collector. Uses BatchSpanProcessor for efficient batching.
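
As a rough mental model, that wiring corresponds to the standard SDK setup below; InitRunner's actual initialization may differ in detail:

# Approximate equivalent of the OTLP backend setup, in plain OTel SDK
# calls; a sketch for orientation, not InitRunner's actual code.
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

provider = TracerProvider(
    resource=Resource.create({"service.name": "my-agent"})
)
provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint="http://localhost:4317"))
)
trace.set_tracer_provider(provider)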

Console

Prints spans to stderr. Useful for quick debugging:

spec:
  observability:
    backend: console

Logfire

Uses Pydantic Logfire for managed observability:

spec:
  observability:
    backend: logfire
    service_name: my-agent

Logfire manages its own TracerProvider — InitRunner delegates to logfire.configure() and does not create a manual provider.
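
In effect, the Logfire backend boils down to a call along these lines (a sketch; the exact arguments InitRunner passes are not documented here):

# logfire.configure() installs Logfire's own TracerProvider;
# InitRunner adds no provider of its own on top of it.
import logfire

logfire.configure(service_name="my-agent")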

Audit vs Observability

Both systems record agent activity, but they serve different purposes:

| | Audit Trail | Observability |
|---|---|---|
| Purpose | Compliance, history, debugging | Distributed tracing, performance analysis |
| Backend | Local SQLite (built-in) | Any OTel collector (Jaeger, Tempo, Datadog, etc.) |
| Dependencies | None (included) | pip install initrunner[observability] |
| Default | Enabled | Opt-in |
| Granularity | One record per agent run | Nested spans (run → LLM call → tool call) |
| Multi-agent | Independent per-run records | Distributed traces across delegation chains |
| Query | SQL / initrunner audit export | Jaeger UI, Grafana, vendor dashboards |
| Retention | Auto-pruned SQLite (configurable) | Managed by your OTel backend |

Use audit when you need a lightweight, zero-dependency log of what happened — prompts, outputs, token usage, and success/failure for every run.

Use observability when you need to understand how it happened — latency breakdowns across LLM calls and tools, distributed traces across multi-agent pipelines, and integration with your existing monitoring stack.

Both can run simultaneously. See Audit Trail for audit configuration.

Log Correlation

When observability is enabled, Python log records are automatically enriched with trace_id and span_id fields via OTel's LoggingInstrumentor. This allows correlating application logs with traces in backends that support log-trace correlation (Grafana Loki + Tempo, Datadog, etc.).
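
The underlying mechanism is OTel's standard logging instrumentation. A standalone sketch of the same enrichment (InitRunner wires this up for you when observability is enabled):

# Standalone illustration of OTel log enrichment. With
# set_logging_format=True the root handler's format string is extended
# with the instrumentation's trace/span ID fields.
import logging
from opentelemetry.instrumentation.logging import LoggingInstrumentor

LoggingInstrumentor().instrument(set_logging_format=True)
logging.warning("inside a traced request")  # record now carries trace/span IDs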

Zero Overhead When Disabled

When spec.observability is not set:

  • No OTel SDK is imported
  • trace.get_tracer("initrunner") returns a no-op tracer (demonstrated after this list)
  • Span context injection/extraction are no-ops
  • CLI startup time is unaffected
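
The no-op behavior is easy to verify with the OTel API alone:

# With no TracerProvider configured, the OTel API hands back no-op
# objects, so instrumentation points cost almost nothing.
from opentelemetry import trace

tracer = trace.get_tracer("initrunner")
with tracer.start_as_current_span("noop-check") as span:
    print(span.is_recording())  # False: nothing is recorded or exported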

Troubleshooting

Missing SDK

RuntimeError: OpenTelemetry observability requires: pip install initrunner[observability]

Install the optional dependency group: pip install initrunner[observability]

No Traces Appearing

  1. Verify the OTLP endpoint is reachable (see the checks after this list). Note that plain curl http://localhost:4317 hits a gRPC port, so it will not return a normal HTTP response; a refused connection is what indicates a problem
  2. Check sample_rate is not 0.0
  3. Try backend: console to verify spans are being created
  4. Ensure the collector/Jaeger is accepting gRPC on port 4317 (not HTTP on 4318)
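
A couple of quick connectivity checks (standard tooling; adjust host and port as needed):

# Is anything listening on the OTLP gRPC port?
nc -zv localhost 4317

# If you used the Jaeger container from the quickstart, inspect its logs:
docker logs jaeger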

Duplicate Spans with Logfire

If you see duplicate spans when using backend: logfire, ensure you're not also setting up a manual TracerProvider elsewhere. Logfire manages its own providers — InitRunner correctly delegates to logfire.configure() without creating additional providers.
