Defense in depth
VoidBox uses a layered security model with five distinct isolation boundaries. Each layer provides independent protection — compromise of one layer does not grant access through subsequent layers.
Five layers of defense
Layer 1: Hardware isolation (KVM / VZ)
— Separate kernel, memory space, devices per VM
Layer 2: Seccomp-BPF (Linux/KVM)
— VMM thread restricted to KVM ioctls + vsock + networking syscalls
Layer 3: Session authentication (vsock)
— 32-byte random secret, per-VM, injected at boot
Layer 4: Guest hardening (guest-agent)
— Command allowlist, rlimits, privilege drop, timeout watchdog
Layer 5: Network isolation
— CIDR deny list (both platforms, different enforcement points)
• Linux/KVM: host-side SLIRP packet filter
• macOS/VZ: guest-side blackhole routes
— Linux/KVM only: SLIRP rate limiting and max concurrent connections
Layer 1: Hardware isolation (KVM / VZ)
Each VoidBox runs in its own micro-VM with a separate kernel, memory space, and devices. Isolation is enforced by hardware virtualization rather than by advisory process-level controls. On macOS, Apple’s Virtualization.framework provides equivalent hypervisor-level isolation.
Layer 2: Seccomp-BPF (Linux/KVM)
On Linux/KVM, the VMM event-loop thread is restricted via seccomp-BPF to only the syscalls needed for KVM operation: KVM ioctls, vsock communication, and networking syscalls. Violations kill that thread (SECCOMP_RET_KILL_THREAD), ending the run without taking down other guests on the host. On macOS/VZ this layer does not apply.
Layer 3: Session authentication (vsock)
Every VM gets a unique 32-byte random session secret, injected via kernel command line. The host authenticates each control connection with a Ping/Pong handshake before sending any ExecRequest, WriteFile, PTY, or telemetry-subscription messages.
Host                                       Guest
 |                                           |
 +-- getrandom(32 bytes)                     |
 +-- hex-encode -> kernel cmdline            |
 |   voidbox.secret=abc123...                |
 |                                           |
 |                   boot                    |
 | ----------------------------------------> |
 |                                           +-- parse /proc/cmdline
 |                                           +-- store in OnceLock
 |                                           |
 +-- Ping [secret + version]                 |
 | ----------------------------------------> |
 |                                           +-- verify secret
 |                                           +-- mark connection authenticated
 | <---------------------------------------- |
 |              Pong [version]               |
 |                                           |
 +-- ExecRequest { ... }                     |
 | ----------------------------------------> |
 |                                           +-- execute on authenticated channel
 | <---------------------------------------- |
 |           ExecResponse { ... }            |
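The secret handling in the flow above can be sketched in a few lines of Rust. This is an illustrative sketch, not VoidBox's actual code: the function names `hex_encode`, `parse_secret`, and `secrets_match` are hypothetical, and the constant-time comparison is an assumption (the flow only says the guest verifies the secret).

```rust
/// Hex-encode a secret so it can be placed on a kernel command line.
fn hex_encode(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}

/// Find `voidbox.secret=<hex>` in a cmdline string and decode it.
/// Returns None if the parameter is missing or is not exactly
/// 64 hex digits (32 bytes).
fn parse_secret(cmdline: &str) -> Option<[u8; 32]> {
    let hex = cmdline
        .split_whitespace()
        .find_map(|arg| arg.strip_prefix("voidbox.secret="))?;
    if hex.len() != 64 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let mut out = [0u8; 32];
    for (i, slot) in out.iter_mut().enumerate() {
        *slot = u8::from_str_radix(&hex[2 * i..2 * i + 2], 16).ok()?;
    }
    Some(out)
}

/// Compare the presented secret against the expected one without
/// returning early on the first mismatching byte (assumption: the
/// real agent may compare differently).
fn secrets_match(expected: &[u8; 32], presented: &[u8]) -> bool {
    if presented.len() != expected.len() {
        return false;
    }
    let mut diff = 0u8;
    for (a, b) in expected.iter().zip(presented) {
        diff |= a ^ b;
    }
    diff == 0
}
```

On the guest, `parse_secret` would be fed the contents of /proc/cmdline once at startup, and `secrets_match` would run on the secret carried by each Ping before the connection is marked authenticated.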
Layer 4: Guest hardening (guest-agent)
The guest-agent (PID 1) enforces four independent controls:
Command allowlist
Only approved binaries execute. The allowlist is read from /etc/voidbox/allowed_commands.json, provisioned by the trusted host at boot.
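A minimal sketch of the allowlist check, assuming the JSON file has already been parsed into a list of command names. The `Allowlist` type and its methods are hypothetical, and the schema of allowed_commands.json is not specified here, so parsing is left out.

```rust
use std::collections::HashSet;

/// In-memory allowlist; the entries would come from the provisioned file.
struct Allowlist {
    allowed: HashSet<String>,
}

impl Allowlist {
    /// Build the allowlist from already-parsed command names.
    fn from_names(names: &[&str]) -> Self {
        Allowlist {
            allowed: names.iter().map(|n| n.to_string()).collect(),
        }
    }

    /// Exact-match check: anything not listed verbatim is rejected,
    /// including path variants of a listed command.
    fn permits(&self, program: &str) -> bool {
        self.allowed.contains(program)
    }
}
```

Exact matching is the conservative choice in this sketch: a request for `./python3` or `/tmp/python3` is refused even when `python3` is listed.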
Resource limits
setrlimit enforces memory, file descriptor, and process count limits. Read from /etc/voidbox/resource_limits.json.
Privilege drop
Child processes run as uid 1000. The guest-agent drops privileges before executing any command, so workloads never run as root inside the VM.
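With the Rust standard library, a privilege drop of this kind can be expressed on the child command itself. The sketch below is an assumption about the approach, not the guest-agent's code; `unprivileged_command` is a hypothetical helper, and a real agent would likely also set the group ID and supplementary groups.

```rust
use std::os::unix::process::CommandExt;
use std::process::Command;

/// Build a command that switches to `uid` in the child before exec.
/// The parent must be privileged (e.g. root) for the spawn to succeed.
fn unprivileged_command(program: &str, args: &[&str], uid: u32) -> Command {
    let mut cmd = Command::new(program);
    cmd.args(args);
    // Applied in the child process between fork and exec.
    cmd.uid(uid);
    cmd
}
```

Calling `unprivileged_command("python3", &["task.py"], 1000).spawn()` from a root guest-agent would start the workload as uid 1000; the program and argument here are placeholders.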
Timeout watchdog
A watchdog timer sends SIGKILL to child processes that exceed the configured timeout, preventing runaway execution.
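One simple way to implement such a watchdog is a polling loop around the child handle. This is a sketch under assumptions: `wait_with_timeout` and `Outcome` are hypothetical names, and the 10 ms poll interval is arbitrary. On Unix, `Child::kill` sends SIGKILL.

```rust
use std::io;
use std::process::{Child, ExitStatus};
use std::thread;
use std::time::{Duration, Instant};

/// Result of waiting on a child with a deadline.
#[derive(Debug, PartialEq)]
enum Outcome {
    Exited(ExitStatus),
    TimedOut,
}

/// Wait for `child` to exit; kill it if it outlives `timeout`.
fn wait_with_timeout(child: &mut Child, timeout: Duration) -> io::Result<Outcome> {
    let deadline = Instant::now() + timeout;
    loop {
        // Non-blocking check: Some(status) once the child has exited.
        if let Some(status) = child.try_wait()? {
            return Ok(Outcome::Exited(status));
        }
        if Instant::now() >= deadline {
            // SIGKILL on Unix. The error is ignored because the child
            // may have exited between the check above and this call.
            let _ = child.kill();
            // Reap the child so it does not linger as a zombie.
            child.wait()?;
            return Ok(Outcome::TimedOut);
        }
        thread::sleep(Duration::from_millis(10));
    }
}
```

Polling keeps the sketch dependency-free; an agent running as PID 1 could instead react to SIGCHLD or use a timer thread.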
Layer 5: Network isolation
The CIDR deny list applies on both platforms, but the enforcement point differs. Linux/KVM adds two further host-side controls; macOS/VZ does not, because Apple’s NAT attachment exposes no host-side filter hook.
CIDR deny list (both platforms)
The deny list (default: 169.254.0.0/16, the link-local range that includes cloud metadata endpoints such as 169.254.169.254) is enforced at a different layer depending on the backend:
- Linux/KVM: host-side SLIRP packet filter. The smoltcp-based usermode stack consults the deny list before NATing and drops outbound packets whose destination matches. The guest never sees a response. No file is provisioned in the guest.
- macOS/VZ: guest-side blackhole routes. Apple’s VZNATNetworkDeviceAttachment has no host-side filter hook, so the host writes /etc/voidbox/network_deny_list.json before boot. The guest-agent reads it and installs blackhole routes for each entry immediately after bringing up the network, before any workload runs.
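Whichever side enforces it, the core check is whether a destination address falls inside a denied CIDR block. The following is a minimal IPv4-only sketch; the `Cidr` type and `is_denied` function are hypothetical and do not reflect the actual filter code.

```rust
use std::net::Ipv4Addr;

/// An IPv4 network stored as a masked base address plus its netmask.
#[derive(Debug, Clone, Copy)]
struct Cidr {
    network: u32,
    mask: u32,
}

impl Cidr {
    /// Parse "a.b.c.d/len"; returns None on malformed input.
    fn parse(s: &str) -> Option<Cidr> {
        let (addr, len) = s.split_once('/')?;
        let addr: Ipv4Addr = addr.parse().ok()?;
        let len: u32 = len.parse().ok()?;
        if len > 32 {
            return None;
        }
        // A /0 mask is handled separately because shifting a u32 by 32 overflows.
        let mask = if len == 0 { 0 } else { u32::MAX << (32 - len) };
        Some(Cidr { network: u32::from(addr) & mask, mask })
    }

    /// True if `ip` lies inside this network.
    fn contains(&self, ip: Ipv4Addr) -> bool {
        (u32::from(ip) & self.mask) == self.network
    }
}

/// True if the destination matches any entry on the deny list.
fn is_denied(deny_list: &[Cidr], dst: Ipv4Addr) -> bool {
    deny_list.iter().any(|c| c.contains(dst))
}
```

With the default list, a packet to 169.254.169.254 matches 169.254.0.0/16 and would be dropped, while a packet to 10.0.0.1 would pass.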
Host-side SLIRP controls (Linux/KVM only)
On top of the deny-list filter, the Linux/KVM SLIRP stack adds:
- Rate limiting on new connections — prevents connection floods from the guest.
- Maximum concurrent connection limit — bounds host resource usage.
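The two controls can be combined in a single admission check that runs when the guest opens a new connection. The sketch below uses a fixed-window counter for rate limiting, which is an assumption; the algorithm and limits the SLIRP stack actually uses are not described here, and `ConnectionGate` is a hypothetical type.

```rust
use std::time::{Duration, Instant};

/// Admission control for new guest connections.
struct ConnectionGate {
    max_concurrent: usize,
    max_new_per_window: usize,
    window: Duration,
    active: usize,
    window_start: Instant,
    opened_in_window: usize,
}

impl ConnectionGate {
    fn new(max_concurrent: usize, max_new_per_window: usize, window: Duration, now: Instant) -> Self {
        ConnectionGate {
            max_concurrent,
            max_new_per_window,
            window,
            active: 0,
            window_start: now,
            opened_in_window: 0,
        }
    }

    /// Returns true and records the connection if both limits allow it.
    fn try_open(&mut self, now: Instant) -> bool {
        // Start a fresh rate-limit window once the current one has elapsed.
        if now.duration_since(self.window_start) >= self.window {
            self.window_start = now;
            self.opened_in_window = 0;
        }
        if self.active >= self.max_concurrent
            || self.opened_in_window >= self.max_new_per_window
        {
            return false;
        }
        self.active += 1;
        self.opened_in_window += 1;
        true
    }

    /// Call when a tracked connection closes.
    fn close(&mut self) {
        self.active = self.active.saturating_sub(1);
    }
}
```

Passing `now` in explicitly keeps the gate deterministic and easy to test; a caller would normally supply `Instant::now()`.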
macOS/VZ networking
macOS uses VZNATNetworkDeviceAttachment. The VM boundary and the guest-side blackhole routes still apply, but rate limiting and the concurrent connection cap are not enforced host-side there.
Snapshot security considerations
Snapshot cloning shares identical VM state across restored instances. Three areas require awareness:
RNG entropy
Restored VMs inherit the same /dev/urandom pool state, so clones can produce identical random output until the guest kernel reseeds. Operationally, treat cloned snapshots as cloned execution state: use short-lived tasks, avoid assuming fresh entropy after restore, and rebuild snapshots when that matters.
ASLR
The guest kernel boots with nokaslr, so KASLR is disabled guest-wide — this is a guest-kernel property, not a snapshot-specific effect. Clones inherit an identical layout regardless. Mitigations: short-lived tasks, no direct network addressability (SLIRP NAT), and the command allowlist limiting attack surface.
Session isolation
Restored VMs reuse the snapshot’s stored session secret for vsock authentication (the secret is baked into the guest’s kernel cmdline in snapshot memory). Per-restore secret rotation would require guest-side support.