Sub-second VM restore
VoidBox restores a VM in under a second from a snapshot. On Linux/KVM the runtime captures vCPU registers, guest memory, and device state directly, then restores memory through a copy-on-write (COW) mmap. On macOS/VZ it wraps Apple’s native save/restore APIs. Either way, the guest resumes execution without rebooting the kernel or re-running initialization.
All snapshot features are strictly opt-in. If you never set a snapshot field, the system behaves exactly as before: it cold boots and no snapshot code runs.
The sections below describe the Linux/KVM mechanism in detail. For macOS, see macOS / VZ snapshots.
Snapshot types (Linux/KVM)
| Type | When Created | Contents | Use Case |
|---|---|---|---|
| Base | After cold boot, VM stopped | Full memory dump + all KVM state | Golden image for repeated boots |
| Diff | After dirty tracking enabled, VM stopped | Only modified pages since base | Layered caching (base + delta) |
YAML spec
Top-level snapshot
# Applies to all boxes
sandbox:
  memory_mb: 256
  snapshot: 'abc123def456'
Per-box override
pipeline:
  boxes:
    - name: analyst
      prompt: 'analyze data'
      sandbox:
        snapshot: 'def789'
    - name: coder
      prompt: 'write code'
      # no snapshot = cold boot
Rust API
use void_box::agent_box::VoidBox;
// Cold boot (default — no snapshot)
let box1 = VoidBox::new("analyst")
    .prompt("analyze data")
    .memory_mb(256)
    .build()?;
// Restore from snapshot (explicit opt-in)
let box2 = VoidBox::new("analyst")
    .prompt("analyze data")
    .snapshot("/path/to/snapshot/dir") // or hash prefix
    .build()?;
CLI commands
# Create a base snapshot after cold boot (kernel required; initramfs optional)
voidbox snapshot create --kernel /path/to/vmlinuz --initramfs /path/to/initramfs.cpio.gz
# Optional: layered diff snapshot once a base exists for the same config hash
voidbox snapshot create --kernel /path/to/vmlinuz --initramfs /path/to/initramfs.cpio.gz --diff
# List stored snapshots
voidbox snapshot list
# Delete a snapshot
voidbox snapshot delete <hash-prefix>
# Run with a snapshot (via spec)
voidbox run --file spec.yaml # spec has sandbox.snapshot set
Daemon API
# POST /v1/runs with snapshot override (daemon default: 127.0.0.1:43100)
curl -X POST http://127.0.0.1:43100/v1/runs \
  -H 'Content-Type: application/json' \
  -d '{"file": "workflow.yaml", "snapshot": "abc123def456"}'
Design principles
- No snapshot field set → cold boot, zero snapshot code runs
- No auto-detection of existing snapshots
- No auto-creation of snapshots during normal runs
- No auto-restore — only if the user passes an explicit path or hash
- No env var fallback — spec or code only
- Every new field defaults to None, so the system behaves identically to before if untouched
Performance benchmarks
Measured on Linux/KVM with 256 MB RAM, 1 vCPU, userspace virtio-vsock. These numbers are from a single test host — actual results vary with CPU, memory, and storage.
| Phase | Time | Notes |
|---|---|---|
| Cold boot | ~10 ms | |
| Base snapshot | ~420 ms | Full 256 MB memory dump |
| Base restore | ~1.3 ms | COW mmap, lazy page loading |
| Diff snapshot | ~270 ms | Only dirty pages (~1.5 MB, 0.6% of RAM) |
| Diff restore | ~3 ms | Base COW mmap + dirty page overlay |
| Base speedup | ~8x | Cold boot / base restore |
| Diff savings | 99.4% | Memory file size reduction |
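The two derived rows in the table follow directly from the measured rows. A minimal sketch of the arithmetic, where the function names are illustrative and not part of VoidBox:

```rust
/// Ratio of cold-boot time to restore time (both in milliseconds).
fn speedup(cold_boot_ms: f64, restore_ms: f64) -> f64 {
    cold_boot_ms / restore_ms
}

/// Percentage of the full memory dump that a diff snapshot avoids writing.
fn diff_savings_pct(dirty_mb: f64, total_mb: f64) -> f64 {
    (1.0 - dirty_mb / total_mb) * 100.0
}
```

With the figures above, `speedup(10.0, 1.3)` is about 7.7, which the table rounds to ~8x, and `diff_savings_pct(1.5, 256.0)` is about 99.4.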
Storage layout
~/.void-box/snapshots/
  <hash-prefix>/          # base snapshot
    state.bin             # Linux/KVM base metadata and device state
    memory.mem            # full memory dump (Linux/KVM)
  <hash-prefix>-diff/     # Linux/KVM diff snapshot
    state.bin
    memory.diff           # dirty pages only
  <hash-prefix>/          # macOS/VZ base snapshot
    vz_meta.json
    vm.vzvmsave
Restore flow
The 7-step restore process:
1. VmSnapshot::load(dir)          Read state.bin (vCPU, irqchip, PIT, vsock, config)
2. Vm::new(memory_mb)             Create KVM VM with matching memory size
3. restore_memory(mem, path)      COW mmap(MAP_PRIVATE|MAP_FIXED), lazy page loading
4. vm.restore_irqchip(state)      Restore PIC master/slave + IOAPIC
5. VirtioVsockMmio::restore()     Restore vsock device registers (userspace backend)
6. create_vcpu_restored(state)    Per-vCPU restore (see register restore order below)
7. vCPU threads resume            Guest continues execution from snapshot point
Memory restore
Memory restore relies on the kernel’s lazy loading of MAP_PRIVATE file mappings: pages are demand-faulted from the snapshot file, and writes go to private anonymous copies, so the file itself is never modified. No userfaultfd is required.
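The copy-on-write behaviour can be observed in isolation. The sketch below is not VoidBox's `restore_memory`: it maps a small scratch file with `MAP_PRIVATE`, writes through the mapping, and checks that the file on disk is unchanged. It assumes a 64-bit Linux or macOS host and Rust 1.82 or later (for `unsafe extern`). It declares `mmap`/`munmap` by hand to avoid an external crate, and it omits `MAP_FIXED`, which the real restore would need to place the mapping at the guest memory address. `cow_demo` and the constants are illustrative names.

```rust
use std::ffi::c_void;
use std::fs::{self, File};
use std::io::{self, Write};
use std::os::raw::c_int;
use std::os::unix::io::AsRawFd;
use std::path::Path;
use std::ptr;

// Hand-written declarations of the libc functions (assumption: 64-bit off_t).
unsafe extern "C" {
    fn mmap(addr: *mut c_void, len: usize, prot: c_int, flags: c_int, fd: c_int, offset: i64) -> *mut c_void;
    fn munmap(addr: *mut c_void, len: usize) -> c_int;
}

// These values are the same on Linux and macOS.
const PROT_READ: c_int = 0x1;
const PROT_WRITE: c_int = 0x2;
const MAP_PRIVATE: c_int = 0x2;
const LEN: usize = 4096;

/// Writes a scratch "memory file", maps it copy-on-write, overwrites the first
/// byte through the mapping, and returns (byte seen via mapping, byte on disk).
fn cow_demo(path: &Path) -> io::Result<(u8, u8)> {
    File::create(path)?.write_all(&[0xAAu8; LEN])?;
    // A read-only descriptor is enough: private writes never reach the file.
    let file = File::open(path)?;
    let addr = unsafe {
        mmap(ptr::null_mut(), LEN, PROT_READ | PROT_WRITE, MAP_PRIVATE, file.as_raw_fd(), 0)
    };
    if addr as isize == -1 {
        return Err(io::Error::last_os_error());
    }
    let seen = unsafe {
        let first = addr as *mut u8;
        first.write(0x55); // faults in the page and creates a private copy
        let value = first.read();
        munmap(addr, LEN);
        value
    };
    let on_disk = fs::read(path)?[0];
    fs::remove_file(path)?;
    Ok((seen, on_disk))
}
```

Calling `cow_demo` on a path under the temp directory should return the modified byte from the mapping and the original byte from the file, which is the property that lets many restored VMs share one snapshot file.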
vCPU register restore order
The restore sequence in cpu.rs is order-sensitive. Getting it wrong causes silent guest crashes (kernel panic → reboot via port 0x64).
1. MSRs                 KVM_SET_MSRS
2. sregs                KVM_SET_SREGS (segment regs, CR0/CR3/CR4)
3. LAPIC                KVM_SET_LAPIC + periodic timer bootstrap (see below)
4. IA32_TSC_DEADLINE    KVM_SET_MSRS; restored after LAPIC, which clears MSRs
5. vcpu_events          KVM_SET_VCPU_EVENTS (exception/interrupt state)
6. XCRs (XCR0)          KVM_SET_XCRS; MUST come before xsave
7. xsave (FPU/SSE)      KVM_SET_XSAVE; depends on XCR0 for feature mask
8. regs                 KVM_SET_REGS (GP registers, RIP, RFLAGS)
9. MP state             KVM_SET_MP_STATE; restores HALTED/RUNNABLE per vCPU for SMP
10. KVM_KVMCLOCK_CTRL   Marks pvclock as paused so the guest adjusts timers on resume
XCR0 restore is critical. XCR0 controls which XSAVE features (x87, SSE, AVX) are active. Without it, the guest’s XRSTORS instruction triggers a #GP because the default XCR0 only enables x87, but the guest’s XSAVE area references SSE/AVX features.
MP state matters for SMP. Without KVM_SET_MP_STATE, secondary vCPUs default to RUNNABLE instead of their actual state (usually HALTED), breaking SMP resume. KVM_KVMCLOCK_CTRL is best-effort (fails with EINVAL when kvm-clock isn’t active) and prevents the guest’s soft-lockup watchdog from panicking after the pause.
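The ordering constraints called out above (TSC deadline after LAPIC, XCRs before xsave) can be written down as data and checked mechanically. This is a sketch only: the `Step` enum, `RESTORE_ORDER`, and `order_is_valid` are hypothetical names and do not reflect how cpu.rs is structured.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Step {
    Msrs, Sregs, Lapic, TscDeadline, VcpuEvents,
    Xcrs, Xsave, Regs, MpState, KvmclockCtrl,
}

/// The ten steps in the order listed in the text.
const RESTORE_ORDER: [Step; 10] = [
    Step::Msrs, Step::Sregs, Step::Lapic, Step::TscDeadline, Step::VcpuEvents,
    Step::Xcrs, Step::Xsave, Step::Regs, Step::MpState, Step::KvmclockCtrl,
];

/// (earlier, later) pairs that the text states explicitly.
const MUST_PRECEDE: [(Step, Step); 2] = [
    (Step::Lapic, Step::TscDeadline), // LAPIC restore clears the deadline MSR
    (Step::Xcrs, Step::Xsave),        // xsave needs XCR0's feature mask
];

fn position(order: &[Step], step: Step) -> Option<usize> {
    order.iter().position(|s| *s == step)
}

/// True if every constrained pair appears in `order` in the required sequence.
fn order_is_valid(order: &[Step]) -> bool {
    MUST_PRECEDE.iter().all(|&(first, second)| {
        match (position(order, first), position(order, second)) {
            (Some(a), Some(b)) => a < b,
            _ => false,
        }
    })
}
```

A check like this could run in a unit test so that reordering the restore calls fails loudly instead of surfacing as a silent guest crash.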
LAPIC timer bootstrap
When the guest was idle (NO_HZ) at snapshot time, the LAPIC timer is masked with vector=0 (LVTT=0x10000). After restore, no timer interrupt ever fires, so the scheduler never runs. The restore code detects this state and bootstraps a periodic LAPIC timer (mode=periodic, vector=0xEC, TMICT=0x200000, TDCR=divide-by-1) to kick the scheduler back to life.
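A sketch of that detection and bootstrap, operating on the 1024-byte register page that `KVM_GET_LAPIC`/`KVM_SET_LAPIC` exchange. The register offsets and bit positions come from the x86 APIC layout. The helper names are hypothetical and the actual restore code may differ.

```rust
// xAPIC register offsets within the 1 KiB LAPIC page.
const APIC_LVTT: usize = 0x320;  // LVT Timer
const APIC_TMICT: usize = 0x380; // Timer Initial Count
const APIC_TDCR: usize = 0x3E0;  // Timer Divide Configuration

const LVT_MASKED: u32 = 1 << 16;         // mask bit, vector = 0 => 0x10000
const LVT_TIMER_PERIODIC: u32 = 1 << 17; // timer mode = periodic
const TIMER_VECTOR: u32 = 0xEC;          // vector named in the text
const TDCR_DIVIDE_BY_1: u32 = 0b1011;    // divide-by-1 encoding
const INITIAL_COUNT: u32 = 0x20_0000;    // TMICT value named in the text

fn read_reg(regs: &[u8; 1024], off: usize) -> u32 {
    u32::from_le_bytes([regs[off], regs[off + 1], regs[off + 2], regs[off + 3]])
}

fn write_reg(regs: &mut [u8; 1024], off: usize, value: u32) {
    regs[off..off + 4].copy_from_slice(&value.to_le_bytes());
}

/// If the timer was saved masked with vector 0, rewrite it as a periodic
/// timer. Returns true when the bootstrap was applied.
fn bootstrap_timer_if_idle(regs: &mut [u8; 1024]) -> bool {
    if read_reg(regs, APIC_LVTT) != LVT_MASKED {
        return false; // timer already armed; leave the saved state alone
    }
    write_reg(regs, APIC_TDCR, TDCR_DIVIDE_BY_1);
    write_reg(regs, APIC_LVTT, LVT_TIMER_PERIODIC | TIMER_VECTOR);
    write_reg(regs, APIC_TMICT, INITIAL_COUNT);
    true
}
```

The patched page would then be passed to `KVM_SET_LAPIC`. The check is an exact match on 0x10000, so a guest that saved an armed timer is left untouched.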
Vsock backend for snapshot
The userspace virtio-vsock backend must be used for VMs that will be snapshotted. The kernel vhost backend (/dev/vhost-vsock) does not expose internal vring indices, making queue state capture incomplete. The userspace backend tracks last_avail_idx/last_used_idx directly, ensuring clean snapshot/restore of the virtqueue state.
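To illustrate what tracking these indices buys, here is a minimal sketch of per-queue state that can be serialized into a snapshot and read back. The `QueueState` struct and its byte layout are assumptions for illustration, not VoidBox's on-disk format. Virtqueue indices are free-running 16-bit counters, so arithmetic on them must wrap.

```rust
/// Device-side view of one virtqueue (hypothetical snapshot record).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct QueueState {
    last_avail_idx: u16, // next available-ring entry the device will consume
    last_used_idx: u16,  // next used-ring entry the device will publish
}

impl QueueState {
    /// Little-endian encoding: avail index first, then used index.
    fn to_bytes(self) -> [u8; 4] {
        let a = self.last_avail_idx.to_le_bytes();
        let u = self.last_used_idx.to_le_bytes();
        [a[0], a[1], u[0], u[1]]
    }

    fn from_bytes(b: [u8; 4]) -> Self {
        QueueState {
            last_avail_idx: u16::from_le_bytes([b[0], b[1]]),
            last_used_idx: u16::from_le_bytes([b[2], b[3]]),
        }
    }

    /// Buffers consumed but not yet returned; wraps at 65536 like the rings do.
    fn in_flight(self) -> u16 {
        self.last_avail_idx.wrapping_sub(self.last_used_idx)
    }
}
```

If either index were missing after restore, the device would re-process or skip descriptors, which is the incomplete-capture problem described for the vhost backend.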
CID preservation
The snapshot stores the VM’s actual CID (assigned at cold boot). On restore, the same CID is reused — the guest kernel caches the CID during virtio-vsock probe and silently drops packets with mismatched dst_cid.
Opt-in plumbing
Every layer has an optional snapshot field that defaults to None:
| Layer | Field | Type | Default |
|---|---|---|---|
| SandboxBuilder | .snapshot(path) | Option<PathBuf> | None |
| BoxConfig | snapshot | Option<PathBuf> | None |
| SandboxSpec (YAML) | sandbox.snapshot | Option<String> | None |
| BoxSandboxOverride | sandbox.snapshot | Option<String> | None |
| CreateRunRequest (API) | snapshot | Option<String> | None |
Resolution chain: per-box override → top-level spec → None (cold boot).
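Because every layer is an `Option`, the chain reduces to a single `Option::or`. A minimal sketch, where `effective_snapshot` is a hypothetical helper and not a VoidBox function:

```rust
/// Per-box override wins; otherwise fall back to the top-level spec value.
/// `None` means cold boot.
fn effective_snapshot<'a>(
    per_box: Option<&'a str>,
    top_level: Option<&'a str>,
) -> Option<&'a str> {
    per_box.or(top_level)
}
```

There is deliberately no third fallback: when both inputs are `None`, the result is `None` and no snapshot code runs.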
Snapshot resolution
When a snapshot string is provided, the runtime resolves it as:
- Hash prefix → ~/.void-box/snapshots/<prefix>/ (if state.bin or vz_meta.json exists)
- Literal path → treat as directory path (if state.bin or vz_meta.json exists)
- Neither → warning printed, cold boot
Resolution is backend-agnostic: state.bin identifies a Linux/KVM snapshot, vz_meta.json a macOS/VZ snapshot. No env var fallback, no auto-detection.
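The lookup order can be sketched with the standard library alone. `Backend`, `detect`, and `resolve_snapshot` are hypothetical names, and the cache root is passed in explicitly rather than derived from the home directory. The real resolver may differ in details such as the warning text.

```rust
use std::path::{Path, PathBuf};

#[derive(Debug, PartialEq, Eq)]
enum Backend {
    Kvm, // directory contains state.bin
    Vz,  // directory contains vz_meta.json
}

/// Identifies the snapshot backend from the marker file in `dir`, if any.
fn detect(dir: &Path) -> Option<Backend> {
    if dir.join("state.bin").is_file() {
        Some(Backend::Kvm)
    } else if dir.join("vz_meta.json").is_file() {
        Some(Backend::Vz)
    } else {
        None
    }
}

/// Tries `value` as a hash prefix under `cache_root`, then as a literal
/// directory path. Returns None (cold boot) when neither holds a snapshot.
fn resolve_snapshot(cache_root: &Path, value: &str) -> Option<(PathBuf, Backend)> {
    let by_hash = cache_root.join(value);
    if let Some(backend) = detect(&by_hash) {
        return Some((by_hash, backend));
    }
    let literal = PathBuf::from(value);
    if let Some(backend) = detect(&literal) {
        return Some((literal, backend));
    }
    eprintln!("warning: snapshot '{}' not found, falling back to cold boot", value);
    None
}
```

Note that `Path::join` with an absolute path returns that path unchanged, so for absolute inputs both branches inspect the same directory.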
Cache management
- LRU eviction: evict_lru(max_bytes) removes oldest snapshots first
- Layer hashing: compute_layer_hash(base, layer, content) for deterministic cache keys
- Listing: list_snapshots() / voidbox snapshot list
- Deletion: delete_snapshot(prefix) / voidbox snapshot delete <prefix>
Snapshot cache is stored at ~/.void-box/snapshots/.
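The eviction policy can be modelled in memory. This sketch is not the real `evict_lru(max_bytes)`, which works on the snapshot directory. Here the `Entry` struct, the `last_used` timestamp field, and the `evict_lru_sketch` signature are assumptions made so the example is self-contained.

```rust
#[derive(Debug, Clone)]
struct Entry {
    name: String,    // snapshot directory name (hash prefix)
    size_bytes: u64, // total size on disk
    last_used: u64,  // e.g. seconds since epoch; smaller means older
}

/// Removes the least recently used entries until the total size fits within
/// `max_bytes`. Returns the names of the evicted entries, oldest first.
fn evict_lru_sketch(entries: &mut Vec<Entry>, max_bytes: u64) -> Vec<String> {
    entries.sort_by_key(|e| e.last_used); // oldest first
    let mut total: u64 = entries.iter().map(|e| e.size_bytes).sum();
    let mut evicted = Vec::new();
    while total > max_bytes && !entries.is_empty() {
        let oldest = entries.remove(0);
        total -= oldest.size_bytes;
        evicted.push(oldest.name);
    }
    evicted
}
```

A real implementation would also delete each evicted directory and would need to decide whether evicting a base snapshot should take its `-diff` sibling with it.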
macOS / VZ snapshots
The VZ backend wraps Apple’s native saveMachineStateToURL: / restoreMachineStateFromURL: APIs (macOS 14+) rather than serializing vCPU registers and memory directly. Apple manages the VM state blob; VoidBox manages the continuity metadata around it. There is no separate diff snapshot on VZ — each save produces a complete restorable state.
Apple refuses to restore a VM whose VZVirtualMachineConfiguration drifts from the one used at save time (memory, vCPUs, network, kernel cmdline, machine identifier). VoidBox persists a JSON sidecar alongside Apple’s save blob to survive that constraint on cold hosts.
Save and restore flow
Save:
1. Pause VM                       VZVirtualMachine.pause
2. saveMachineStateToURL:         Apple writes opaque state blob (vm.vzvmsave)
3. Write vz_meta.json sidecar     VzSnapshotMeta (VoidBox continuity fields)
4. Stop VM from paused state      No resume/pause round-trip
Restore:
1. Read vz_meta.json                      Recover identifier + saved config
2. Reconcile with caller config           Override drifting memory/vcpus/network silently
3. Build VZVirtualMachineConfiguration    using the saved identifier
4. restoreMachineStateFromURL:            Apple restores opaque state
5. Resume                                 Guest continues execution
VzSnapshotMeta sidecar fields
| Field | Purpose |
|---|---|
| session_secret | Guest-agent auth token baked into the kernel cmdline at save time |
| memory_mb, vcpus, network | Reconciliation targets — override caller config if drifting |
| boot_clock_secs | Wall-clock at save so the kernel cmdline matches at restore |
| config_hash | Continuity check against the caller’s BackendConfig |
| machine_identifier | VZGenericMachineIdentifier.dataRepresentation (required by Apple) |
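The reconciliation in step 2 of the restore flow amounts to letting the saved values win over whatever the caller asked for. A minimal sketch, assuming a simplified `VmShape` with guessed field types (the real `VzSnapshotMeta` carries more fields, and `network` may not be a boolean):

```rust
/// The subset of configuration that Apple requires to match at restore time.
#[derive(Debug, Clone, PartialEq, Eq)]
struct VmShape {
    memory_mb: u64,
    vcpus: u32,
    network: bool,
}

/// Returns the configuration to restore with (always the saved one) and the
/// names of the caller fields that were overridden.
fn reconcile(saved: &VmShape, caller: &VmShape) -> (VmShape, Vec<&'static str>) {
    let mut overridden = Vec::new();
    if caller.memory_mb != saved.memory_mb {
        overridden.push("memory_mb");
    }
    if caller.vcpus != saved.vcpus {
        overridden.push("vcpus");
    }
    if caller.network != saved.network {
        overridden.push("network");
    }
    (saved.clone(), overridden)
}
```

The text describes the override as silent. Returning the list of overridden fields is an addition in this sketch that would make the drift observable in logs or tests.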
Storage layout (VZ)
~/.void-box/snapshots/
└── <hash-prefix>/
├── vm.vzvmsave # Apple's opaque save blob
└── vz_meta.json # VzSnapshotMeta sidecar (JSON)
enable_snapshots opt-in
SandboxBuilder::enable_snapshots(true) (plumbed through SandboxConfig → BackendConfig) gates Apple’s validateSaveRestoreSupportWithError check at cold boot. Some device combinations (e.g. virtiofs shares) make Apple reject snapshot capability validation even when the VM runs fine for non-snapshot workloads, so cold boots that do not opt in skip the check and keep working.
Security considerations
Snapshot cloning shares identical VM state across restored instances — affecting RNG entropy, guest page-table layout (KASLR is disabled guest-wide regardless), and session-secret reuse for vsock auth.
See Snapshot security considerations in the Security Model page for the full rationale and mitigations.