Docker Sandbox

The Docker sandbox runs tool subprocesses inside disposable containers started with docker run --rm --init. It is the portable option: it works on macOS, Windows, and Linux, supports pinned OS images, and handles bridge networking natively.

For the cross-backend config reference, see Runtime Sandbox. For the Linux-native alternative with no daemon, see Bubblewrap Sandbox. For running InitRunner itself inside Docker (a different topic), see Docker.

Why Docker

Cross-platform. Works the same on macOS, Windows, and Linux.
Pinned environment. The image is the filesystem. Upgrading the host does not change what the sandbox sees.
Bridge networking. Docker is the only backend that supports tools needing outbound HTTP through a user-defined network, an egress allowlist, or Docker DNS aliases.
Standard flags. Memory (-m), CPU (--cpus), read-only rootfs (--read-only), pid limit (--pids-limit), container user (--user) — all stock docker run options.

Requirements

A reachable Docker daemon. Preflight runs docker info before any tool launches and raises SandboxUnavailableError with install remediation when the daemon is missing:

Platform	Command
Debian/Ubuntu	`apt install docker.io && systemctl start docker`
Fedora	`dnf install docker && systemctl start docker`
Arch	`pacman -S docker && systemctl start docker`
macOS	`brew install --cask docker`, then open Docker Desktop
Windows	Install Docker Desktop

Preflight also checks the configured image with docker image inspect and runs docker pull if it is missing. Private images need docker login on the host first.
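The preflight sequence described above can be sketched roughly as follows. This is an illustration, not InitRunner's code: the `preflight` function, its injectable `run` parameter, and the locally defined `SandboxUnavailableError` stand-in are assumptions. Only the three `docker` subcommands come from the text.

```python
import subprocess


class SandboxUnavailableError(RuntimeError):
    """Local stand-in for the error named in the docs (hypothetical definition)."""


def preflight(image, run=subprocess.run):
    """Check the daemon, then make sure the image is available locally.

    `run` is injectable so the sequence can be exercised without Docker.
    Returns the docker subcommands that were executed, in order.
    """
    steps = []

    def docker(*args):
        steps.append(" ".join(args))
        try:
            proc = run(["docker", *args], capture_output=True, text=True)
        except FileNotFoundError:
            return False  # docker CLI is not installed
        return proc.returncode == 0

    if not docker("info"):
        raise SandboxUnavailableError("Docker daemon not reachable; install or start Docker")
    if not docker("image", "inspect", image):
        # Image is missing locally, so try to pull it.
        if not docker("pull", image):
            raise SandboxUnavailableError(f"could not pull image {image!r}")
    return steps
```

Calling `preflight("python:3.12-slim")` on a host with a running daemon and a cached image would execute only the first two steps.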

Quick Start

apiVersion: initrunner/v1
kind: Agent
metadata:
  name: sandboxed-agent
spec:
  role: You are a code execution assistant.
  model:
    provider: openai
    name: gpt-5-mini
  tools:
    - type: shell
    - type: python
  security:
    sandbox:
      backend: docker

This runs all shell and Python tool invocations inside python:3.12-slim containers with no network access and a read-only root filesystem.

Looking for the pre-v2026.4.16 security.docker block? It was replaced by the unified security.sandbox schema. See Migration for the before/after.

Enabling it

security:
  sandbox:
    backend: docker         # or: auto (prefers bwrap on Linux, falls back to Docker)
    network: none           # none | bridge | host
    memory_limit: 256m
    cpu_limit: 1.0
    read_only_rootfs: true
    allowed_read_paths: []
    allowed_write_paths: []
    bind_mounts: []
    env_passthrough: []
    docker:
      image: python:3.12-slim
      user: auto            # "auto" | "1000:1000" | null (root)
      extra_args: []        # dangerous flags blocked by schema

Configuration reference

Cross-backend fields live under security.sandbox. Docker-specific fields live under security.sandbox.docker.

Shared fields

Field	Type	Default	Description
`network`	`"none" \| "bridge" \| "host"`	`"none"`	Container network mode. `none` blocks all network access at the kernel level.
`memory_limit`	`str`	`"256m"`	Memory cap in Docker format (`256m`, `1g`, …).
`cpu_limit`	`float`	`1.0`	Fractional cores.
`read_only_rootfs`	`bool`	`true`	Mount the root filesystem read-only. A writable `/tmp` (64 MB, `noexec,nosuid`) is added automatically.
`allowed_read_paths`	`list[str]`	`[]`	Host paths mounted read-only. Validated against permitted roots at load time.
`allowed_write_paths`	`list[str]`	`[]`	Host paths mounted read-write.
`bind_mounts`	`list[BindMount]`	`[]`	Extra mounts. Each entry becomes one `-v host:container[:ro]` flag.
`env_passthrough`	`list[str]`	`[]`	Env var names to pass into the container, filtered through `scrub_env()`.

Docker-specific fields

Field	Type	Default	Description
`docker.image`	`str`	`"python:3.12-slim"`	Image to use for containers.
`docker.user`	`"auto" \| str \| null`	`"auto"`	Container user. `"auto"` maps to the current uid:gid when writable mounts exist. `null` runs as root.
`docker.runtime`	`"runc" \| "runsc" \| "kata-runtime" \| "kata-qemu" \| "kata-fc" \| "kata-clh" \| null`	`null`	Container runtime. `null` uses Docker's default (`runc`). Validated at preflight against `docker info` registered runtimes; an unregistered choice fails with a per-runtime install hint. Since v2026.5.2. See Hardened runtimes.
`docker.extra_args`	`list[str]`	`[]`	Extra `docker run` flags. Security-sensitive flags are rejected.

`BindMount` fields

Field	Type	Default	Description
`source`	`str`	(required)	Host path. Relative paths resolve against the role file's directory.
`target`	`str`	(required)	Container path. Must be absolute.
`read_only`	`bool`	`true`	Mount as read-only.

Isolation model

Each tool call becomes one docker run --rm --init invocation. The --init flag starts a minimal init process (tini) as PID 1 that reaps zombies and forwards signals. Without it, Ctrl-C does not stop a shell that is running sleep.

Base flags

Flag	Purpose
`--rm`	Container is deleted when the process exits. No lingering state.
`--init`	tini as PID 1 for signal handling and zombie reaping.
`--name initrunner-<hash>`	Unique name for cleanup on timeout.
`--label initrunner.managed=true`	Identifies InitRunner-managed containers for bulk cleanup.
`--pids-limit 256`	Caps fork bombs.
`--read-only` (when `read_only_rootfs: true`)	Root filesystem is read-only.
`--tmpfs /tmp:rw,noexec,nosuid,size=64m`	Writable `/tmp` without allowing writes elsewhere.
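As a rough sketch, the base flags in the table above could be assembled like this. The `base_flags` helper and its `name_suffix` parameter are hypothetical. The docs only say the container name is `initrunner-<hash>`, so a random hex string stands in for the hash here. Pairing the tmpfs with `--read-only` follows the `read_only_rootfs` field description and is an assumption about the writable-rootfs case.

```python
import uuid


def base_flags(read_only_rootfs=True, name_suffix=None):
    """Return the always-on `docker run` flags from the table above."""
    # Placeholder for the <hash> part of the name; the real derivation
    # is not described in the docs.
    suffix = name_suffix or uuid.uuid4().hex[:12]
    flags = [
        "--rm",
        "--init",
        "--name", f"initrunner-{suffix}",
        "--label", "initrunner.managed=true",
        "--pids-limit", "256",
    ]
    if read_only_rootfs:
        # Read-only root plus a small writable, non-executable /tmp.
        flags += ["--read-only", "--tmpfs", "/tmp:rw,noexec,nosuid,size=64m"]
    return flags
```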

Network

`network:`	Flag	Behavior
`none`	`--network none`	No interfaces, no DNS, no connectivity. Kernel-level block.
`bridge`	`--network bridge`	Default Docker bridge; outbound traffic is NAT'd through the host.
`host`	`--network host`	Shares the host network stack. Equivalent to no isolation at the network layer.

Working directory and mounts

/work — the tool's cwd, bind-mounted read-write. Set as the container's working directory via -w /work.
/role — the role directory, read-only. Role-relative bind_mounts resolve against this path on the host.
bind_mounts — user-configured. Each entry becomes one -v host:container[:ro] flag. Relative source paths resolve against role_dir. Missing sources raise ValueError at build time. No silent failures.
Tool-internal mounts — e.g. python_exec binding a tempfile. Code-controlled, no schema validation.
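The mount rules above can be illustrated with a small helper. This is a sketch under assumptions: `mount_flags` is a hypothetical name, and bind mounts are passed as plain dicts that mirror the `BindMount` fields rather than as the real schema objects.

```python
from pathlib import Path


def mount_flags(work_dir, role_dir, bind_mounts=()):
    """Build -v/-w flags for /work, /role, and user-configured bind mounts.

    Each bind mount is a dict with `source`, `target`, and an optional
    `read_only` key (default True).
    """
    role_dir = Path(role_dir).resolve()
    flags = [
        "-v", f"{Path(work_dir).resolve()}:/work",  # read-write cwd
        "-v", f"{role_dir}:/role:ro",               # role directory, read-only
        "-w", "/work",
    ]
    for mount in bind_mounts:
        source = Path(mount["source"])
        if not source.is_absolute():
            source = role_dir / source  # relative sources resolve against role_dir
        if not source.exists():
            raise ValueError(f"bind mount source does not exist: {source}")
        if not mount["target"].startswith("/"):
            raise ValueError(f"bind mount target must be absolute: {mount['target']}")
        suffix = ":ro" if mount.get("read_only", True) else ""
        flags += ["-v", f"{source}:{mount['target']}{suffix}"]
    return flags
```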

User mapping

The --user flag depends on docker.user and whether writable mounts exist:

`docker.user`	Writable mount?	`--user` value
`"auto"`	yes (work_dir or rw bind_mount)	`<host uid>:<host gid>`
`"auto"`	no	(omitted; the container runs as the image's default user)
`"1000:1000"` (explicit)	either	`1000:1000`
`null`	either	(omitted; the container runs as the image's default user, which is root for most images)

Auto mapping prevents a common pain point: the container writes files as root, then the host user cannot delete them.
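The decision table above reduces to a few branches. A minimal sketch, assuming a hypothetical `user_flag` helper (`os.getuid` and `os.getgid` exist only on POSIX hosts):

```python
import os


def user_flag(docker_user, has_writable_mount):
    """Return the --user flags implied by the table above (possibly empty)."""
    if docker_user is None:
        return []  # no --user flag at all
    if docker_user == "auto":
        if not has_writable_mount:
            return []  # nothing to protect, keep the container default
        return ["--user", f"{os.getuid()}:{os.getgid()}"]
    return ["--user", docker_user]  # explicit "uid:gid" string
```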

Environment

Container env starts clean. Host variables pass through only when:

They are listed in env_passthrough and exist on the host. scrub_env() first strips variables whose names match sensitive prefixes (OPENAI_API_KEY, AWS_SECRET, …).
The tool sets them explicitly via env={...} on its run() call.

Each becomes one -e KEY=value flag.
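The two sources of container environment can be sketched as below. The real `scrub_env()` signature and prefix list are not shown in this document, so the filter here is a local stand-in that knows only the two prefixes named above. `env_flags` is a hypothetical helper, and letting tool-set values win over passthrough values is an assumption.

```python
import os

# Only these two prefixes are named in the text; the real list is not shown.
SENSITIVE_PREFIXES = ("OPENAI_API_KEY", "AWS_SECRET")


def env_flags(env_passthrough, tool_env=None, host_env=None):
    """Build -e KEY=value flags from passthrough names and tool-set values."""
    host_env = os.environ if host_env is None else host_env
    env = {}
    for name in env_passthrough:
        # Pass through only names that exist on the host and are not sensitive.
        if name in host_env and not name.startswith(SENSITIVE_PREFIXES):
            env[name] = host_env[name]
    env.update(tool_env or {})  # values the tool sets explicitly
    flags = []
    for key, value in env.items():
        flags += ["-e", f"{key}={value}"]
    return flags
```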

Resource limits

Field	Flag	Notes
`memory_limit`	`-m 256m`	Container is OOM-killed at the limit. Exit code 137 triggers an auto-appended hint: "Container killed (OOM). Increase security.sandbox.memory_limit (current: 256m)."
`cpu_limit`	`--cpus 1.0`	Fractional cores.
`pids_limit`	`--pids-limit 256`	Always on. Caps runaway forks.
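A short sketch of how the memory and CPU fields map to flags, and how the OOM hint could be derived from the exit code. Both helper names are hypothetical. The hint text is the one quoted in the table. Exit code 137 is 128 + 9, meaning the process was killed with SIGKILL.

```python
def resource_flags(memory_limit="256m", cpu_limit=1.0):
    """Map the two configurable limits to their docker run flags."""
    return ["-m", memory_limit, "--cpus", str(cpu_limit)]


def oom_hint(returncode, memory_limit):
    """Return the hint for an OOM kill, or None for any other exit code."""
    if returncode == 137:  # 128 + SIGKILL(9)
        return (
            "Container killed (OOM). Increase "
            f"security.sandbox.memory_limit (current: {memory_limit})."
        )
    return None
```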

`extra_args` validation

docker.extra_args accepts additional docker run flags (e.g. --ulimit=nofile=1024). A blocklist rejects flags that defeat isolation:

--privileged
--cap-add (any form: bare, --cap-add=NET_ADMIN, --cap-add NET_ADMIN)
--security-opt when it disables seccomp or apparmor
--pid=host, --ipc=host, --uts=host, --userns=host
--device, --volume-driver, --runtime

Attempting to use these raises a validation error at role load time.

Since v2026.5.2, --runtime is also a first-class field at security.sandbox.docker.runtime. Passing it through extra_args (in any form) is rejected at load time; use the schema field instead. See Hardened runtimes.
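A blocklist check along these lines could look like the sketch below. It is not the schema validator itself: `validate_extra_args` is a hypothetical name, `ValueError` stands in for the real validation error, and matching `seccomp=unconfined` and `apparmor=unconfined` is an assumption about what "disables seccomp or apparmor" covers.

```python
BLOCKED_FLAGS = ("--privileged", "--cap-add", "--device", "--volume-driver", "--runtime")
HOST_NAMESPACE_FLAGS = ("--pid", "--ipc", "--uts", "--userns")
UNCONFINED = ("seccomp=unconfined", "apparmor=unconfined")


def validate_extra_args(extra_args):
    """Raise ValueError for blocklisted flags; return the args otherwise."""
    args = list(extra_args)
    for i, arg in enumerate(args):
        flag, sep, value = arg.partition("=")
        if not sep and i + 1 < len(args):
            value = args[i + 1]  # space-separated form: "--flag value"
        if flag in BLOCKED_FLAGS:
            raise ValueError(f"extra_args: {flag} is not allowed")
        if flag in HOST_NAMESPACE_FLAGS and value == "host":
            raise ValueError(f"extra_args: {flag}=host is not allowed")
        # Older Docker syntax uses "seccomp:unconfined"; normalize it.
        if flag == "--security-opt" and value.replace(":", "=") in UNCONFINED:
            raise ValueError(f"extra_args: --security-opt {value} is not allowed")
    return args
```

Checking both the `--flag=value` and `--flag value` spellings matters because Docker accepts either form.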

Container cleanup on timeout

When a tool exceeds its timeout, subprocess.run kills the local docker CLI, but the container keeps running. The backend catches subprocess.TimeoutExpired and runs docker rm -f <name> to force-remove it. The backend swallows any cleanup failure so it cannot mask the original timeout error.
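The cleanup path described above can be sketched as follows. `run_with_cleanup` and its injectable `run` parameter are hypothetical, and the 10-second limit on the cleanup call is an assumption.

```python
import subprocess


def run_with_cleanup(argv, container_name, timeout, run=subprocess.run):
    """Run the docker CLI and force-remove the container if it times out."""
    try:
        return run(argv, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        try:
            # The CLI was killed, but the container may still be running.
            run(["docker", "rm", "-f", container_name], capture_output=True, timeout=10)
        except Exception:
            pass  # a cleanup failure must not mask the original timeout
        raise  # re-raise the original TimeoutExpired
```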

Preflight

initrunner doctor --role <file> checks two things:

The Docker daemon answers docker info.
The configured image exists locally, or docker pull succeeds.

Run it once per role change so image pulls happen outside the hot path.

Examples

Data processing with file access

security:
  sandbox:
    backend: docker
    network: none
    memory_limit: 512m
    cpu_limit: 2.0
    bind_mounts:
      - source: ./data
        target: /data
        read_only: true
      - source: ./output
        target: /output
        read_only: false
    env_passthrough: [LANG, TZ]
    docker:
      image: python:3.12-slim

Minimal sandbox

security:
  sandbox:
    backend: docker

All defaults: python:3.12-slim, no network, 256 MB RAM, 1 CPU, read-only rootfs.

Custom image with extra args

security:
  sandbox:
    backend: docker
    memory_limit: 1g
    read_only_rootfs: false
    docker:
      image: node:20-slim
      extra_args: ["--pids-limit=100", "--ulimit=nofile=1024"]

Hardened runtime (gVisor)

security:
  sandbox:
    backend: docker
    network: none
    docker:
      image: python:3.12-slim
      runtime: runsc        # gVisor; swap to kata-runtime / kata-qemu / kata-fc / kata-clh for a microVM

The runtime must be installed on the host and registered with Docker. Confirm with docker info --format '{{json .Runtimes}}'. See Hardened runtimes for the full picture.

Complete example role

See the docker-sandbox example for a ready-to-use role:

initrunner examples copy docker-sandbox
initrunner run docker-sandbox.yaml -p "Use python to compute 2**100"

Custom image requirements

When using a custom image, it must meet these requirements:

Interpreter on PATH. The Python tool runs python3 inside sandboxes. The script tool uses the configured interpreter (default /bin/sh). If the interpreter is missing, the container exits with "not found".
Writable /tmp. When read_only_rootfs: true (default), a writable /tmp is provided as a tmpfs (64 MB, noexec, nosuid). The image does not need to provide /tmp itself.
Working directory at /work. The tool's working directory is bind-mounted at /work and set with -w /work, which overrides any WORKDIR in the image, so the image should not depend on a particular working directory.
No special init system needed. InitRunner passes --init (tini) automatically.

Hardened runtimes (gVisor, Kata)

Since v2026.5.2, security.sandbox.docker.runtime accepts six values and emits --runtime <name> on every docker run call.

Runtime	Class	What it adds over `runc`
`runc`	Container	Default. Same kernel as the host.
`runsc`	Userspace kernel	gVisor. A user-space process intercepts the syscall surface, narrowing the host kernel attack surface. Most Python and Node code works unchanged; numerical kernels and io_uring-heavy code need testing.
`kata-runtime`	microVM	Kata Containers, default hypervisor. Real guest kernel inside a lightweight VM.
`kata-qemu`	microVM	Kata pinned to QEMU.
`kata-fc`	microVM	Kata pinned to Firecracker.
`kata-clh`	microVM	Kata pinned to Cloud Hypervisor.

Each runtime must be installed on the host and registered with Docker. Confirm with:

docker info --format '{{json .Runtimes}}' | jq 'keys'

If the configured runtime is not in that list, preflight fails with a per-runtime install hint and the agent does not start. There is no silent fallback to runc.
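The same check can be scripted. The sketch below parses the JSON object that `docker info --format '{{json .Runtimes}}'` prints, whose keys are the registered runtime names. The function names are hypothetical, and `RuntimeError` stands in for whatever error the real preflight raises.

```python
import json
import subprocess


def registered_runtimes(run=subprocess.run):
    """Return the sorted runtime names that Docker reports as registered."""
    proc = run(
        ["docker", "info", "--format", "{{json .Runtimes}}"],
        capture_output=True, text=True, check=True,
    )
    return sorted(json.loads(proc.stdout))  # keys of the Runtimes object


def check_runtime(runtime, available):
    """Fail loudly instead of falling back to runc."""
    if runtime is not None and runtime not in available:
        raise RuntimeError(
            f"runtime {runtime!r} is not registered with Docker "
            f"(found: {', '.join(available)})"
        )
```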

For when to pick which class, see Sandbox Comparison.

Running InitRunner itself in Docker

When InitRunner runs inside a container and you want sandboxed tools, the inner InitRunner still needs a Docker daemon. Two patterns:

Socket passthrough (simpler, less secure) — mount /var/run/docker.sock into the InitRunner container. The inner process gets effective root on the host via the socket; use only for trusted roles.
Docker-in-Docker (safer, heavier) — run a dind sidecar and point InitRunner at it with DOCKER_HOST=tcp://dind:2375.

See Docker — socket passthrough for the compose snippet.

Audit

Each call emits a sandbox.exec security event:

backend=docker argv0=/usr/bin/python rc=0 duration_ms=312

Query with:

initrunner audit security-events --event-type sandbox.exec
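For post-processing, an event line in the key=value shape shown above could be parsed with a few lines of Python. This is a sketch only: it assumes every field is a space-separated `key=value` pair with no spaces inside values, which the single sample line does not guarantee, and `parse_sandbox_event` is a hypothetical helper.

```python
def parse_sandbox_event(line):
    """Split a sandbox.exec line into a dict, converting numeric fields."""
    fields = dict(pair.split("=", 1) for pair in line.split())
    fields["rc"] = int(fields["rc"])
    fields["duration_ms"] = int(fields["duration_ms"])
    return fields
```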

Limitations

Per-call startup cost. A Docker container takes ~200–500 ms to start. bwrap is about 10× faster on the same host. Use backend: auto to prefer bwrap when available.
Daemon dependency. Every tool call needs the daemon up. If it dies, tools fail with SandboxUnavailableError.
Image distribution. The first run may pull the image (up to 5 minutes). Run initrunner doctor --role <file> to pull outside the hot path.
No seccomp customization in v1. The sandbox uses Docker's default seccomp profile. The schema does not expose custom profiles.
