# API Server
The `initrunner run <role> --serve` command exposes any agent as an OpenAI-compatible HTTP API. Use InitRunner agents as drop-in replacements for OpenAI in any client that speaks the chat completions format — including the official OpenAI SDKs, curl, and tools like Open WebUI.
## Quick Start
```bash
# Start the server
initrunner run role.yaml --serve

# With authentication
initrunner run role.yaml --serve --api-key my-secret-key

# Custom host/port
initrunner run role.yaml --serve --host 0.0.0.0 --port 3000
```

## CLI Options
See CLI Reference — Run Options for the full flag list. The key `--serve` flags:
| Option | Type | Default | Description |
|---|---|---|---|
| `--serve` | bool | `false` | Enable API server mode |
| `--host` | str | `127.0.0.1` | Host to bind to (`0.0.0.0` for all interfaces) |
| `--port` | int | `8000` | Port to listen on |
| `--api-key` | str | `null` | API key for Bearer token authentication |
| `--cors-origin` | str | `null` | Allowed CORS origin (repeatable) |
| `--audit-db` | Path | `~/.initrunner/audit.db` | Audit database path |
| `--no-audit` | bool | `false` | Disable audit logging |
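For example, `--cors-origin` can be repeated to allow several browser origins alongside authentication (the origin values below are placeholders):

```bash
initrunner run role.yaml --serve \
  --api-key my-secret-key \
  --cors-origin https://app.example.com \
  --cors-origin http://localhost:5173
```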
## Endpoints
### GET /health
Always returns 200 OK. Not protected by authentication.
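A quick liveness check from the shell (assuming the default host and port; curl's `-f` flag makes it exit non-zero on HTTP errors, so this also works in scripts and container health checks):

```bash
curl -fsS http://127.0.0.1:8000/health
```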
```json
{"status": "ok"}
```

### GET /v1/models
Lists available models. Returns the agent's `metadata.name` as the model ID.
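The endpoint can be queried with curl (assuming the default host and port); the response has the shape shown below:

```bash
curl -s http://127.0.0.1:8000/v1/models
```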
```json
{
  "object": "list",
  "data": [
    {
      "id": "my-agent",
      "object": "model",
      "created": 1700000000,
      "owned_by": "initrunner"
    }
  ]
}
```

### POST /v1/chat/completions
The main chat completions endpoint. Accepts the standard OpenAI request format.
| Field | Type | Default | Description |
|---|---|---|---|
| `model` | str | `""` | Model name (ignored — uses role config) |
| `messages` | list | `[]` | Conversation messages (role + content) |
| `stream` | bool | `false` | Enable SSE streaming |
### ChatMessage Fields
| Field | Type | Description |
|---|---|---|
| `role` | str | `"user"`, `"assistant"`, or `"system"` |
| `content` | str \| list[ContentPart] | Plain text string, or a list of content parts for multimodal input |
### Multimodal Input
The `content` field supports multimodal content parts in the standard OpenAI format. See Multimodal Input for the full reference.
### Content Part Types
| Type | Field | Description |
|---|---|---|
| `text` | `text` | Plain text content |
| `image_url` | `image_url` | Image via HTTP URL or base64 `data:` URI |
| `input_audio` | `input_audio` | Audio as base64 with format specifier |
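Local files have to be base64-encoded into a `data:` URI before they fit the `image_url` part. A minimal sketch of building a multimodal message payload — the helper names are illustrative, not part of InitRunner:

```python
import base64

def text_part(text: str) -> dict:
    """Build a plain-text content part."""
    return {"type": "text", "text": text}

def image_part_from_bytes(data: bytes, mime: str = "image/png") -> dict:
    """Build an image_url content part from raw image bytes as a data: URI."""
    b64 = base64.b64encode(data).decode("ascii")
    return {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{b64}"}}

# Assemble a multimodal user message:
message = {
    "role": "user",
    "content": [
        text_part("Describe this image."),
        image_part_from_bytes(b"\x89PNG..."),  # real PNG bytes in practice
    ],
}
```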
### Image via URL
```bash
curl http://127.0.0.1:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "What is in this image?"},
        {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}}
      ]
    }]
  }'
```

### Image via Base64
```bash
curl http://127.0.0.1:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "Describe this image."},
        {"type": "image_url", "image_url": {"url": "data:image/png;base64,iVBORw0KGgo..."}}
      ]
    }]
  }'
```

### Audio Input
```bash
curl http://127.0.0.1:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "Transcribe this audio."},
        {"type": "input_audio", "input_audio": {"data": "<base64>", "format": "mp3"}}
      ]
    }]
  }'
```

### OpenAI Python SDK (multimodal)
```python
from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:8000/v1", api_key="unused")

response = client.chat.completions.create(
    model="my-agent",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this image?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
        ],
    }],
)
print(response.choices[0].message.content)
```

## Streaming
When `stream: true` is set, the server responds with Server-Sent Events (SSE):

```text
data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"delta":{"role":"assistant"}}]}

data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"delta":{"content":"Hello"}}]}

data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"delta":{},"finish_reason":"stop"}]}

data: [DONE]
```

## Multi-Turn Conversations
Use the `X-Conversation-Id` header for server-side conversation history:

- Send a request with `X-Conversation-Id: conv-001`.
- The server stores message history after each request.
- Subsequent requests with the same ID use stored history — only the last user message is the new prompt.
- Conversations expire after 1 hour of inactivity.
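With the OpenAI Python SDK, the header can be passed per request via `extra_headers`. A sketch under the assumption that the server runs on the default host and port — `ask` is an illustrative helper, not part of InitRunner:

```python
def ask(client, conversation_id: str, text: str) -> str:
    """Send one user message under a server-side conversation."""
    resp = client.chat.completions.create(
        model="my-agent",
        messages=[{"role": "user", "content": text}],
        extra_headers={"X-Conversation-Id": conversation_id},
    )
    return resp.choices[0].message.content

# Usage, with the server running:
#   from openai import OpenAI
#   client = OpenAI(base_url="http://127.0.0.1:8000/v1", api_key="unused")
#   ask(client, "conv-001", "My name is Alice.")
#   ask(client, "conv-001", "What is my name?")  # history is server-side
```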
## Authentication
When `--api-key` is set, all `/v1/*` endpoints require:

```text
Authorization: Bearer <api-key>
```

The `/health` endpoint is never protected.
## Usage Examples
### curl
```bash
curl http://127.0.0.1:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

### curl (with auth and conversation)
```bash
# First message
curl http://127.0.0.1:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer my-secret-key" \
  -H "X-Conversation-Id: conv-001" \
  -d '{"messages": [{"role": "user", "content": "My name is Alice."}]}'

# Follow-up
curl http://127.0.0.1:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer my-secret-key" \
  -H "X-Conversation-Id: conv-001" \
  -d '{"messages": [{"role": "user", "content": "What is my name?"}]}'
```

### OpenAI Python SDK
```python
from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:8000/v1",
    api_key="my-secret-key",  # or "unused" if no --api-key is set
)
response = client.chat.completions.create(
    model="my-agent",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```

### OpenAI Python SDK (streaming)
```python
from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:8000/v1",
    api_key="unused",
)
stream = client.chat.completions.create(
    model="my-agent",
    messages=[{"role": "user", "content": "Tell me a story."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```

### OpenAI Node.js SDK
```javascript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://127.0.0.1:8000/v1",
  apiKey: "my-secret-key",
});
const response = await client.chat.completions.create({
  model: "my-agent",
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(response.choices[0].message.content);
```

## Open WebUI Integration
Open WebUI gives you a ChatGPT-like web interface for any InitRunner agent. Because `initrunner run --serve` speaks the OpenAI wire format, Open WebUI works out of the box — no plugins or adapters needed.
### Setup
This walkthrough uses the `support-agent` example, which includes a RAG knowledge base.
1. Ingest the knowledge base

```bash
initrunner ingest examples/roles/support-agent/support-agent.yaml
```

2. Start the InitRunner server

```bash
initrunner run examples/roles/support-agent/support-agent.yaml --serve --host 0.0.0.0 --port 3000
```

`--host 0.0.0.0` is required so the Docker container can reach the server.
3. Launch Open WebUI

```bash
docker run -d \
  --name open-webui \
  --network host \
  -e OPENAI_API_BASE_URL=http://127.0.0.1:3000/v1 \
  -e OPENAI_API_KEY=unused \
  -v open-webui:/app/backend/data \
  ghcr.io/open-webui/open-webui:main
```

4. Open your browser
Navigate to `http://localhost:8080`, create a local account, and select the `support-agent` model from the model dropdown. Start chatting — responses are served by your InitRunner agent.
### Cleanup
```bash
docker rm -f open-webui
docker volume rm open-webui
```

### Notes
- If you start the server with `--api-key`, set `OPENAI_API_KEY` to the same value in the `docker run` command.
- For production deployments, consider running both services behind a reverse proxy with TLS.
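As a sketch of the reverse-proxy option (hostnames, ports, and certificate paths below are placeholders; `proxy_buffering off` keeps SSE streaming responses from being buffered):

```nginx
server {
    listen 443 ssl;
    server_name agent.example.com;

    ssl_certificate     /etc/ssl/certs/agent.example.com.pem;
    ssl_certificate_key /etc/ssl/private/agent.example.com.key;

    location / {
        proxy_pass http://127.0.0.1:3000;
        proxy_set_header Host $host;
        proxy_buffering off;  # needed for SSE streaming responses
    }
}
```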