InitRunner

Docker Sandbox

The Docker sandbox runs tool subprocesses inside disposable docker run --rm --init containers. It is the portable option: works on macOS, Windows, and Linux, supports pinned OS images, and handles bridge networking natively.

For the cross-backend config reference, see Runtime Sandbox. For the Linux-native alternative with no daemon, see Bubblewrap Sandbox. For running InitRunner itself inside Docker (a different topic), see Docker.

Why Docker

  • Cross-platform. Works the same on macOS, Windows, and Linux.
  • Pinned environment. The image is the filesystem. Upgrading the host does not change what the sandbox sees.
  • Bridge networking. Docker is the only backend that supports tools needing outbound HTTP through a user-defined network, an egress allowlist, or Docker DNS aliases.
  • Standard flags. Memory (-m), CPU (--cpus), read-only rootfs (--read-only), pid limit (--pids-limit), container user (--user) — all stock docker run options.

Requirements

The sandbox requires a reachable Docker daemon. Preflight runs docker info before any tool launches and raises SandboxUnavailableError with install remediation when the daemon is missing or unreachable:

| Platform | Command |
| --- | --- |
| Debian/Ubuntu | apt install docker.io && systemctl start docker |
| Fedora | dnf install docker && systemctl start docker |
| Arch | pacman -S docker && systemctl start docker |
| macOS | brew install --cask docker, then open Docker Desktop |
| Windows | Install Docker Desktop |

Preflight also checks the configured image with docker image inspect and runs docker pull if it is missing. Private images need docker login on the host first.
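
The preflight sequence can be sketched roughly as follows. This is a minimal illustration, not InitRunner's actual code: the preflight function, the locally defined SandboxUnavailableError class, and the injectable run parameter are assumptions made for the example. Only the three docker commands come from the text above.

```python
import subprocess
from typing import Callable, Sequence


class SandboxUnavailableError(RuntimeError):
    """Stand-in for the error named in the docs (defined locally for this sketch)."""


def _run(argv: Sequence[str]) -> int:
    """Run a command quietly and return its exit code (127 if the CLI is not installed)."""
    try:
        return subprocess.run(
            list(argv), stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        ).returncode
    except FileNotFoundError:
        return 127


def preflight(image: str, run: Callable[[Sequence[str]], int] = _run) -> list[str]:
    """Check the daemon, then make sure the image is available locally.

    Returns the list of steps that were executed, which is convenient for logging.
    """
    steps = ["docker info"]
    if run(["docker", "info"]) != 0:
        raise SandboxUnavailableError("Docker daemon not reachable; install or start Docker")

    steps.append("docker image inspect")
    if run(["docker", "image", "inspect", image]) != 0:
        # Image is not present locally, so fall back to pulling it.
        steps.append("docker pull")
        if run(["docker", "pull", image]) != 0:
            raise SandboxUnavailableError(f"could not pull image {image!r}")
    return steps
```

Passing a fake run callable makes the sequence testable without a daemon; calling preflight("python:3.12-slim") with the default runner talks to the real docker CLI.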

Quick Start

apiVersion: initrunner/v1
kind: Agent
metadata:
  name: sandboxed-agent
spec:
  role: You are a code execution assistant.
  model:
    provider: openai
    name: gpt-5-mini
  tools:
    - type: shell
    - type: python
  security:
    sandbox:
      backend: docker

This runs all shell and Python tool invocations inside python:3.12-slim containers with no network access and a read-only root filesystem.

Looking for the pre-v2026.4.16 security.docker block? It was replaced by the unified security.sandbox schema. See Migration for the before/after.

Enabling it

security:
  sandbox:
    backend: docker         # or: auto (prefers bwrap on Linux, falls back to Docker)
    network: none           # none | bridge | host
    memory_limit: 256m
    cpu_limit: 1.0
    read_only_rootfs: true
    allowed_read_paths: []
    allowed_write_paths: []
    bind_mounts: []
    env_passthrough: []
    docker:
      image: python:3.12-slim
      user: auto            # "auto" | "1000:1000" | null (root)
      extra_args: []        # dangerous flags blocked by schema

Configuration reference

Cross-backend fields live under security.sandbox. Docker-specific fields live under security.sandbox.docker.

Shared fields

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| network | "none", "bridge", or "host" | "none" | Container network mode. none blocks at the kernel level. |
| memory_limit | str | "256m" | Memory cap in Docker format (256m, 1g, …). |
| cpu_limit | float | 1.0 | Fractional cores. |
| read_only_rootfs | bool | true | Mount the root filesystem read-only. A writable /tmp (64 MB, noexec, nosuid) is added automatically. |
| allowed_read_paths | list[str] | [] | Host paths mounted read-only. Validated against permitted roots at load time. |
| allowed_write_paths | list[str] | [] | Host paths mounted read-write. |
| bind_mounts | list[BindMount] | [] | Extra mounts. Each entry becomes one -v host:container[:ro] flag. |
| env_passthrough | list[str] | [] | Env var names to pass into the container, filtered through scrub_env(). |

Docker-specific fields

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| docker.image | str | "python:3.12-slim" | Image to use for containers. |
| docker.user | "auto", str, or null | "auto" | Container user. "auto" maps to the current uid:gid when writable mounts exist. null runs as root. |
| docker.extra_args | list[str] | [] | Extra docker run flags. Security-sensitive flags are rejected. |

BindMount fields

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| source | str | (required) | Host path. Relative paths resolve against the role file's directory. |
| target | str | (required) | Container path. Must be absolute. |
| read_only | bool | true | Mount as read-only. |

Isolation model

Each tool call becomes one docker run --rm --init invocation. --init runs a small init process (tini) as PID 1 that reaps zombies and forwards signals. Without it, Ctrl-C does not stop a shell that is running sleep.

Base flags

| Flag | Purpose |
| --- | --- |
| --rm | Container is deleted when the process exits. No lingering state. |
| --init | tini as PID 1 for signal handling and zombie reaping. |
| --name initrunner-<hash> | Unique name for cleanup on timeout. |
| --label initrunner.managed=true | Identifies InitRunner-managed containers for bulk cleanup. |
| --pids-limit 256 | Caps fork bombs. |
| --read-only (when read_only_rootfs: true) | Root filesystem is read-only. |
| --tmpfs /tmp:rw,noexec,nosuid,size=64m | Writable /tmp without allowing writes elsewhere. |

Network

| network: | Flag | Behavior |
| --- | --- | --- |
| none | --network none | No interfaces, no DNS, no connectivity. Kernel-level block. |
| bridge | --network bridge | Default Docker bridge; outbound traffic is NAT'd through the host. |
| host | --network host | Shares the host network stack. Equivalent to no isolation at the network layer. |
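
To make the base flags and the network mode concrete, here is a hedged sketch of how such a command line could be assembled. The function name base_docker_argv and the use of a random hex suffix for the container name are assumptions for illustration; the docs only say the name carries a hash. Adding the /tmp tmpfs only when the rootfs is read-only follows the read_only_rootfs description.

```python
import uuid

VALID_NETWORKS = ("none", "bridge", "host")


def base_docker_argv(
    image: str,
    command: list[str],
    *,
    network: str = "none",
    read_only_rootfs: bool = True,
) -> list[str]:
    """Build a `docker run` argv with the base isolation flags (illustrative only)."""
    if network not in VALID_NETWORKS:
        raise ValueError(f"network must be one of {VALID_NETWORKS}, got {network!r}")

    # Assumption: any unique suffix works; the real hash scheme is not documented here.
    name = f"initrunner-{uuid.uuid4().hex[:12]}"
    argv = [
        "docker", "run", "--rm", "--init",
        "--name", name,
        "--label", "initrunner.managed=true",
        "--pids-limit", "256",
        "--network", network,
    ]
    if read_only_rootfs:
        # Read-only root plus a small writable, non-executable /tmp.
        argv += ["--read-only", "--tmpfs", "/tmp:rw,noexec,nosuid,size=64m"]
    return argv + [image, *command]
```

Printing " ".join(base_docker_argv("python:3.12-slim", ["python3", "-c", "print(1)"])) shows the full command that would run under the defaults.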

Working directory and mounts

  • /work — the tool's cwd, bind-mounted read-write. Set as the container's working directory via -w /work.
  • /role — the role directory, read-only. Role-relative bind_mounts resolve against this path on the host.
  • bind_mounts — user-configured. Each entry becomes one -v host:container[:ro] flag. Relative source paths resolve against role_dir. Missing sources raise ValueError at build time. No silent failures.
  • Tool-internal mounts — e.g. python_exec binding a tempfile. Code-controlled, no schema validation.
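
The bind_mounts rules above (relative sources resolve against the role directory, missing sources raise ValueError, each entry becomes one -v flag) might look like this in code. The BindMount dataclass mirrors the documented fields, but the mount_flags helper is a hypothetical name and the sketch assumes POSIX-style paths.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class BindMount:
    source: str             # host path, absolute or relative to the role directory
    target: str             # absolute path inside the container
    read_only: bool = True


def mount_flags(mounts: list[BindMount], role_dir: Path) -> list[str]:
    """Translate bind mounts into `-v host:container[:ro]` flags (illustrative only)."""
    flags: list[str] = []
    for mount in mounts:
        src = Path(mount.source)
        if not src.is_absolute():
            src = role_dir / src          # relative sources resolve against role_dir
        src = src.resolve()
        if not src.exists():
            # Fail loudly instead of letting Docker create an empty directory.
            raise ValueError(f"bind mount source does not exist: {src}")
        if not mount.target.startswith("/"):
            raise ValueError(f"bind mount target must be absolute: {mount.target}")
        spec = f"{src}:{mount.target}" + (":ro" if mount.read_only else "")
        flags += ["-v", spec]
    return flags
```

Checking the source up front matters because docker run -v would otherwise create a missing host directory silently, which hides configuration mistakes.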

User mapping

The --user flag depends on docker.user and whether writable mounts exist:

| docker.user | Writable mount? | --user value |
| --- | --- | --- |
| "auto" | yes (work_dir or rw bind_mount) | <host uid>:<host gid> |
| "auto" | no | (omitted; container default user) |
| "1000:1000" (explicit) | either | 1000:1000 |
| null | either | (omitted; runs as root inside container) |

Auto mapping prevents a common pain point: the container writes files as root, then the host user cannot delete them.
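
The mapping table reduces to a small decision function. The sketch below is an assumption about shape, not the real implementation: user_flag is a hypothetical name, and os.getuid()/os.getgid() are POSIX-only, so a Windows host would need a different source for the ids.

```python
import os
from typing import Optional


def user_flag(docker_user: Optional[str], has_writable_mount: bool) -> list[str]:
    """Return the `--user` flag (or nothing) for a given docker.user setting."""
    if docker_user is None:
        return []                      # flag omitted: runs as root inside the container
    if docker_user == "auto":
        if has_writable_mount:
            # Match the host user so files written to mounts stay deletable on the host.
            return ["--user", f"{os.getuid()}:{os.getgid()}"]
        return []                      # flag omitted: container default user
    return ["--user", docker_user]     # explicit value such as "1000:1000"
```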

Environment

Container env starts clean. Host variables pass through only when:

  1. They are listed in env_passthrough and exist on the host. scrub_env() first strips variables that match sensitive prefixes (OPENAI_API_KEY, AWS_SECRET, …).
  2. The tool sets them explicitly via env={...} on its run() call.

Each becomes one -e KEY=value flag.
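
As a rough sketch of these two rules: the code below uses a stand-in scrub_env_sketch rather than the real scrub_env(), and its prefix list contains only the two names mentioned above, since the full list is not given here. Letting tool-supplied values override passthrough values is also an assumption.

```python
from typing import Mapping, Optional, Sequence

# Illustrative only: the real scrub_env() covers more prefixes than these two.
SENSITIVE_PREFIXES = ("OPENAI_API_KEY", "AWS_SECRET")


def scrub_env_sketch(env: Mapping[str, str]) -> dict[str, str]:
    """Drop variables whose names start with a sensitive prefix (stand-in for scrub_env)."""
    return {k: v for k, v in env.items() if not k.startswith(SENSITIVE_PREFIXES)}


def env_flags(
    host_env: Mapping[str, str],
    env_passthrough: Sequence[str],
    tool_env: Optional[Mapping[str, str]] = None,
) -> list[str]:
    """Build `-e KEY=value` flags from passthrough names and tool-supplied values."""
    scrubbed = scrub_env_sketch(host_env)
    # Rule 1: only names that are listed AND present on the host AND not scrubbed.
    merged = {name: scrubbed[name] for name in env_passthrough if name in scrubbed}
    # Rule 2: values the tool sets explicitly (assumed here to win on conflict).
    merged.update(tool_env or {})
    flags: list[str] = []
    for key, value in merged.items():
        flags += ["-e", f"{key}={value}"]
    return flags
```

In practice host_env would be os.environ; passing it in as a parameter keeps the function easy to test.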

Resource limits

| Field | Flag | Notes |
| --- | --- | --- |
| memory_limit | -m 256m | Container is OOM-killed at the limit. Exit code 137 triggers an auto-appended hint: "Container killed (OOM). Increase security.sandbox.memory_limit (current: 256m)." |
| cpu_limit | --cpus 1.0 | Fractional cores. |
| pids_limit | --pids-limit 256 | Always on. Caps runaway forks. |
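
A small sketch of the flag mapping and the exit-code-137 hint follows. The helper names resource_flags and with_oom_hint are hypothetical, and appending the hint on a new line after existing output is an assumption; only the hint text and the flags come from the table.

```python
def resource_flags(memory_limit: str = "256m", cpu_limit: float = 1.0) -> list[str]:
    """Map the shared limit fields onto their `docker run` flags."""
    return ["-m", memory_limit, "--cpus", str(cpu_limit)]


def with_oom_hint(returncode: int, output: str, memory_limit: str) -> str:
    """Append the OOM hint when the container exited with code 137 (128 + SIGKILL)."""
    if returncode != 137:
        return output
    hint = (
        "Container killed (OOM). Increase security.sandbox.memory_limit "
        f"(current: {memory_limit})."
    )
    return f"{output}\n{hint}" if output else hint
```

Exit code 137 means the process received SIGKILL, which is what the kernel OOM killer sends, although any other SIGKILL produces the same code.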

extra_args validation

docker.extra_args accepts additional docker run flags (e.g. --ulimit=nofile=1024). A blocklist rejects flags that defeat isolation:

  • --privileged
  • --cap-add (any form: bare, --cap-add=NET_ADMIN, --cap-add NET_ADMIN)
  • --security-opt when it disables seccomp or apparmor
  • --pid=host, --ipc=host, --uts=host, --userns=host
  • --device, --volume-driver, --runtime

Attempting to use these raises a validation error at role load time.
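
One way such a blocklist could be checked is sketched below. This is an illustration under assumptions rather than the actual schema validator: validate_extra_args is a hypothetical name, ValueError stands in for the real validation error, flags are matched by exact name, and "disables seccomp or apparmor" is interpreted as the unconfined setting.

```python
from typing import Iterator, Sequence

BLOCKED_FLAGS = ("--privileged", "--cap-add", "--device", "--volume-driver", "--runtime")
HOST_NAMESPACE_FLAGS = ("--pid", "--ipc", "--uts", "--userns")
UNCONFINED = ("seccomp=unconfined", "apparmor=unconfined")


def _flag_pairs(args: Sequence[str]) -> Iterator[tuple[str, str]]:
    """Yield (flag, value) for both the `--flag=value` and `--flag value` forms."""
    i = 0
    while i < len(args):
        arg = args[i]
        if arg.startswith("--") and "=" in arg:
            flag, value = arg.split("=", 1)
        elif arg.startswith("--") and i + 1 < len(args) and not args[i + 1].startswith("-"):
            flag, value = arg, args[i + 1]
            i += 1                      # the next token was consumed as the value
        else:
            flag, value = arg, ""
        yield flag, value
        i += 1


def validate_extra_args(args: Sequence[str]) -> None:
    """Raise ValueError if any flag would defeat the sandbox's isolation."""
    for flag, value in _flag_pairs(args):
        if flag in BLOCKED_FLAGS:
            raise ValueError(f"extra_args: {flag} is not allowed")
        if flag in HOST_NAMESPACE_FLAGS and value == "host":
            raise ValueError(f"extra_args: {flag}=host is not allowed")
        # Docker also accepts the older `seccomp:unconfined` spelling.
        if flag == "--security-opt" and value.replace(":", "=") in UNCONFINED:
            raise ValueError(f"extra_args: --security-opt {value} is not allowed")
```

Normalizing both argument forms before matching is what lets the check catch --cap-add NET_ADMIN as well as --cap-add=NET_ADMIN.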

Container cleanup on timeout

When a tool exceeds its timeout, subprocess.run kills the local docker CLI, but the container keeps running. The backend catches subprocess.TimeoutExpired and runs docker rm -f <name> to force-remove it. The backend swallows any cleanup failure so it cannot mask the original timeout error.
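
The timeout path described above can be sketched with subprocess directly. The run_with_cleanup name, the injectable run parameter, and the 30-second budget for the cleanup call are assumptions for illustration.

```python
import subprocess
from typing import Callable, Sequence


def run_with_cleanup(
    argv: Sequence[str],
    container_name: str,
    timeout: float,
    run: Callable = subprocess.run,
):
    """Run a `docker run` command; on timeout, force-remove the container and re-raise."""
    try:
        return run(list(argv), capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        # subprocess.run killed only the local docker CLI; the container may still be alive.
        try:
            run(["docker", "rm", "-f", container_name], capture_output=True, timeout=30)
        except Exception:
            pass  # a failed cleanup must not mask the original timeout
        raise
```

The bare raise at the end re-raises the original TimeoutExpired, so callers see the timeout even when the cleanup itself fails.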

Preflight

initrunner doctor --role <file> checks two things:

  1. The Docker daemon answers docker info.
  2. The configured image exists locally, or docker pull succeeds.

Run it once per role change so image pulls happen outside the hot path.

Examples

Data processing with file access

security:
  sandbox:
    backend: docker
    network: none
    memory_limit: 512m
    cpu_limit: 2.0
    bind_mounts:
      - source: ./data
        target: /data
        read_only: true
      - source: ./output
        target: /output
        read_only: false
    env_passthrough: [LANG, TZ]
    docker:
      image: python:3.12-slim

Minimal sandbox

security:
  sandbox:
    backend: docker

All defaults: python:3.12-slim, no network, 256 MB RAM, 1 CPU, read-only rootfs.

Custom image with extra args

security:
  sandbox:
    backend: docker
    memory_limit: 1g
    read_only_rootfs: false
    docker:
      image: node:20-slim
      extra_args: ["--pids-limit=100", "--ulimit=nofile=1024"]

Complete example role

See the docker-sandbox example for a ready-to-use role:

initrunner examples copy docker-sandbox
initrunner run docker-sandbox.yaml -p "Use python to compute 2**100"

Custom image requirements

A custom image must meet these requirements:

  • Interpreter on PATH. The Python tool runs python3 inside sandboxes. The script tool uses the configured interpreter (default /bin/sh). If the interpreter is missing, the container exits with "not found".
  • Writable /tmp. When read_only_rootfs: true (default), a writable /tmp is provided as a tmpfs (64 MB, noexec, nosuid). The image does not need to provide /tmp itself.
  • Working directory at /work. The tool's working directory is bind-mounted at /work. Your image should not expect a specific working directory.
  • No special init system needed. InitRunner passes --init (tini) automatically.

Running InitRunner itself in Docker

When InitRunner runs inside a container and you want sandboxed tools, the inner InitRunner still needs a Docker daemon. Two patterns:

  1. Socket passthrough (simpler, less secure) — mount /var/run/docker.sock into the InitRunner container. The inner process gets effective root on the host via the socket; use only for trusted roles.
  2. Docker-in-Docker (safer, heavier) — run a dind sidecar and point InitRunner at it with DOCKER_HOST=tcp://dind:2375.

See Docker — socket passthrough for the compose snippet.

Audit

Each call emits a sandbox.exec security event:

backend=docker argv0=/usr/bin/python rc=0 duration_ms=312

Query with:

initrunner audit security-events --event-type sandbox.exec
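
For post-processing, an event line in the form shown above can be split into typed fields. This sketch assumes the format is exactly whitespace-separated key=value pairs with no spaces inside values, as in the sample; the real output may carry more fields, and parse_sandbox_event is a hypothetical helper, not part of InitRunner.

```python
def parse_sandbox_event(line: str) -> dict:
    """Parse a `key=value key=value ...` sandbox.exec line into a dict."""
    fields = dict(part.split("=", 1) for part in line.split())
    return {
        "backend": fields["backend"],
        "argv0": fields["argv0"],
        "rc": int(fields["rc"]),                    # process exit code
        "duration_ms": int(fields["duration_ms"]),  # wall-clock duration
    }


event = parse_sandbox_event("backend=docker argv0=/usr/bin/python rc=0 duration_ms=312")
```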

Limitations

  • Per-call startup cost. A Docker container takes ~200–500 ms to start. bwrap is about 10× faster on the same host. Use backend: auto to prefer bwrap when available.
  • Daemon dependency. Every tool call needs the daemon up. If it dies, tools fail with SandboxUnavailableError.
  • Image distribution. The first run may pull the image (up to 5 minutes). Run initrunner doctor --role <file> to pull outside the hot path.
  • No seccomp customization in v1. The sandbox uses Docker's default seccomp profile. The schema does not expose custom profiles.
