Sub-second VM restore

VoidBox supports sub-second VM restore via snapshot/restore. On Linux/KVM the runtime captures vCPU registers, memory, and devices directly, then restores via COW mmap. On macOS/VZ it wraps Apple’s native save/restore APIs. Either way, the guest resumes execution without re-booting the kernel or re-running initialization.

All snapshot features are explicit opt-in only. If you never set a snapshot field, the system behaves exactly as before — cold boot, zero snapshot code runs.

The sections below describe the Linux/KVM mechanism in detail. For macOS, see macOS / VZ snapshots.

Snapshot types (Linux/KVM)

Type   When created                               Contents                           Use case
Base   After cold boot, VM stopped                Full memory dump + all KVM state   Golden image for repeated boots
Diff   After dirty tracking enabled, VM stopped   Only modified pages since base     Layered caching (base + delta)

YAML spec

Top-level snapshot

# Applies to all boxes
sandbox:
  memory_mb: 256
  snapshot: 'abc123def456'

Per-box override

pipeline:
  boxes:
    - name: analyst
      prompt: 'analyze data'
      sandbox:
        snapshot: 'def789'
    - name: coder
      prompt: 'write code'
      # no snapshot = cold boot

Rust API

use void_box::agent_box::VoidBox;

// Cold boot (default — no snapshot)
let box1 = VoidBox::new("analyst")
    .prompt("analyze data")
    .memory_mb(256)
    .build()?;

// Restore from snapshot (explicit opt-in)
let box2 = VoidBox::new("analyst")
    .prompt("analyze data")
    .snapshot("/path/to/snapshot/dir")   // or hash prefix
    .build()?;

CLI commands

# Create a base snapshot after cold boot (kernel required; initramfs optional)
voidbox snapshot create --kernel /path/to/vmlinuz --initramfs /path/to/initramfs.cpio.gz

# Optional: layered diff snapshot once a base exists for the same config hash
voidbox snapshot create --kernel /path/to/vmlinuz --initramfs /path/to/initramfs.cpio.gz --diff

# List stored snapshots
voidbox snapshot list

# Delete a snapshot
voidbox snapshot delete <hash-prefix>

# Run with a snapshot (via spec)
voidbox run --file spec.yaml   # spec has sandbox.snapshot set

Daemon API

# POST /v1/runs with snapshot override (daemon default: 127.0.0.1:43100)
curl -X POST http://127.0.0.1:43100/v1/runs \
  -H 'Content-Type: application/json' \
  -d '{"file": "workflow.yaml", "snapshot": "abc123def456"}'

Design principles

  • No snapshot field set → cold boot, zero snapshot code runs
  • No auto-detection of existing snapshots
  • No auto-creation of snapshots during normal runs
  • No auto-restore — only if the user passes an explicit path or hash
  • No env var fallback — spec or code only
  • Every new field defaults to None — the system behaves identically to before if untouched

Performance benchmarks

Measured on Linux/KVM with 256 MB RAM, 1 vCPU, userspace virtio-vsock. These numbers are from a single test host — actual results vary with CPU, memory, and storage.

Phase           Time      Notes
Cold boot       ~10 ms
Base snapshot   ~420 ms   Full 256 MB memory dump
Base restore    ~1.3 ms   COW mmap, lazy page loading
Diff snapshot   ~270 ms   Only dirty pages (~1.5 MB, 0.6% of RAM)
Diff restore    ~3 ms     Base COW mmap + dirty page overlay
Base speedup    ~8x       Cold boot / base restore
Diff savings    99.4%     Memory file size reduction
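
The two derived rows can be checked from the measured ones. The sketch below only redoes that arithmetic; the function names are illustrative and not part of VoidBox.

```rust
/// Ratio of cold-boot time to restore time (both in milliseconds).
fn speedup(cold_boot_ms: f64, restore_ms: f64) -> f64 {
    cold_boot_ms / restore_ms
}

/// Percentage of the full memory file that a diff snapshot avoids writing.
fn diff_savings_percent(dirty_mb: f64, total_mb: f64) -> f64 {
    (1.0 - dirty_mb / total_mb) * 100.0
}
```

With the table's inputs, `speedup(10.0, 1.3)` is about 7.7 (reported as ~8x) and `diff_savings_percent(1.5, 256.0)` is about 99.4.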

Storage layout

~/.void-box/snapshots/
  <hash-prefix>/         # base snapshot
      state.bin          # Linux/KVM base metadata and device state
      memory.mem         # full memory dump (Linux/KVM)

  <hash-prefix>-diff/    # Linux/KVM diff snapshot
      state.bin
      memory.diff        # dirty pages only

  <hash-prefix>/         # macOS/VZ base snapshot
      vz_meta.json
      vm.vzvmsave

Restore flow

The 7-step restore process:

1. VmSnapshot::load(dir)           Read state.bin (vCPU, irqchip, PIT, vsock, config)
2. Vm::new(memory_mb)              Create KVM VM with matching memory size
3. restore_memory(mem, path)       COW mmap(MAP_PRIVATE|MAP_FIXED) — lazy page loading
4. vm.restore_irqchip(state)       Restore PIC master/slave + IOAPIC
5. VirtioVsockMmio::restore()      Restore vsock device registers (userspace backend)
6. create_vcpu_restored(state)     Per-vCPU restore (see register restore order below)
7. vCPU threads resume             Guest continues execution from snapshot point

Memory restore

Memory restore uses kernel MAP_PRIVATE lazy page loading — pages are demand-faulted from the file, writes create anonymous copies. No userfaultfd required.
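
To make the copy-on-write behaviour concrete, here is a minimal standalone sketch, not VoidBox's restore_memory. It maps a file with MAP_PRIVATE, writes through the mapping, and confirms the file on disk is unchanged. It declares mmap/munmap directly instead of using a bindings crate, assumes a 64-bit Linux target for the constant values and the offset type, and omits MAP_FIXED because it does not place the mapping at a guest-memory address. The function name cow_demo is hypothetical.

```rust
use std::fs::{self, File};
use std::io;
use std::os::raw::{c_int, c_void};
use std::os::unix::io::AsRawFd;
use std::path::Path;
use std::ptr;

// Hand-written libc bindings (assumption: 64-bit Linux, where off_t is i64).
extern "C" {
    fn mmap(addr: *mut c_void, len: usize, prot: c_int, flags: c_int, fd: c_int, offset: i64) -> *mut c_void;
    fn munmap(addr: *mut c_void, len: usize) -> c_int;
}

const PROT_READ: c_int = 0x1;
const PROT_WRITE: c_int = 0x2;
const MAP_PRIVATE: c_int = 0x2;

/// Writes a 4 KiB file of 0xAA bytes, maps it copy-on-write, overwrites the
/// first byte through the mapping, and returns
/// (byte seen through the mapping, byte still stored in the file).
fn cow_demo(path: &Path) -> io::Result<(u8, u8)> {
    const LEN: usize = 4096;
    fs::write(path, vec![0xAAu8; LEN])?;

    // A read-only descriptor is enough: MAP_PRIVATE writes never reach the file.
    let file = File::open(path)?;
    let addr = unsafe {
        mmap(ptr::null_mut(), LEN, PROT_READ | PROT_WRITE, MAP_PRIVATE, file.as_raw_fd(), 0)
    };
    if addr as isize == -1 {
        return Err(io::Error::last_os_error());
    }

    let mem = addr as *mut u8;
    let seen = unsafe {
        mem.write(0x55); // first write faults in a private anonymous copy of the page
        mem.read()
    };
    let _ = unsafe { munmap(addr, LEN) };

    let on_disk = fs::read(path)?[0];
    Ok((seen, on_disk))
}
```

The mapping sees the modified byte while the backing file keeps the original one, which is why a single memory file can back many restored VMs.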

vCPU register restore order

The restore sequence in cpu.rs is order-sensitive. Getting it wrong causes silent guest crashes (kernel panic → reboot via port 0x64).

1. MSRs              KVM_SET_MSRS
2. sregs             KVM_SET_SREGS (segment regs, CR0/CR3/CR4)
3. LAPIC             KVM_SET_LAPIC + periodic timer bootstrap (see below)
4. IA32_TSC_DEADLINE KVM_SET_MSRS — restored after LAPIC, which clears MSRs
5. vcpu_events       KVM_SET_VCPU_EVENTS (exception/interrupt state)
6. XCRs (XCR0)       KVM_SET_XCRS — MUST come before xsave
7. xsave (FPU/SSE)   KVM_SET_XSAVE — depends on XCR0 for feature mask
8. regs              KVM_SET_REGS (GP registers, RIP, RFLAGS)
9. MP state          KVM_SET_MP_STATE — restores HALTED/RUNNABLE per vCPU for SMP
10. KVM_KVMCLOCK_CTRL Marks pvclock as paused so the guest adjusts timers on resume

XCR0 restore is critical. XCR0 controls which XSAVE features (x87, SSE, AVX) are active. Without it, the guest’s XRSTORS instruction triggers a #GP because the default XCR0 only enables x87, but the guest’s XSAVE area references SSE/AVX features.

MP state matters for SMP. Without KVM_SET_MP_STATE, secondary vCPUs default to RUNNABLE instead of their actual state (usually HALTED), breaking SMP resume. KVM_KVMCLOCK_CTRL is best-effort (fails with EINVAL when kvm-clock isn’t active) and prevents the guest’s soft-lockup watchdog from panicking after the pause.
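
One way to keep the sequence from regressing is to encode it as data and assert the constraints called out above. This is a sketch under assumptions: the enum, the constant, and both functions are illustrative names, not the contents of cpu.rs.

```rust
/// Restore steps in the order listed above (variant names are illustrative).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RestoreStep {
    Msrs,
    Sregs,
    Lapic,
    TscDeadline,
    VcpuEvents,
    Xcrs,
    Xsave,
    Regs,
    MpState,
    KvmclockCtrl,
}

const RESTORE_ORDER: [RestoreStep; 10] = [
    RestoreStep::Msrs,
    RestoreStep::Sregs,
    RestoreStep::Lapic,
    RestoreStep::TscDeadline,
    RestoreStep::VcpuEvents,
    RestoreStep::Xcrs,
    RestoreStep::Xsave,
    RestoreStep::Regs,
    RestoreStep::MpState,
    RestoreStep::KvmclockCtrl,
];

/// Returns true when `first` appears before `second` in `order`.
fn runs_before(order: &[RestoreStep], first: RestoreStep, second: RestoreStep) -> bool {
    let pos = |step: RestoreStep| order.iter().position(|&s| s == step);
    match (pos(first), pos(second)) {
        (Some(a), Some(b)) => a < b,
        _ => false,
    }
}

/// Checks the two ordering constraints stated in the text:
/// XCRs before xsave, and the TSC deadline MSR after the LAPIC.
fn order_is_valid(order: &[RestoreStep]) -> bool {
    runs_before(order, RestoreStep::Xcrs, RestoreStep::Xsave)
        && runs_before(order, RestoreStep::Lapic, RestoreStep::TscDeadline)
}
```

A unit test that calls `order_is_valid(&RESTORE_ORDER)` would fail as soon as someone swaps the XCR0 and xsave steps.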

LAPIC timer bootstrap

When the guest was idle (NO_HZ) at snapshot time, the LAPIC timer is masked with vector=0 (LVTT=0x10000). After restore, no timer interrupt ever fires, so the scheduler never runs. The restore code detects this state and bootstraps a periodic LAPIC timer (mode=periodic, vector=0xEC, TMICT=0x200000, TDCR=divide-by-1) to kick the scheduler back to life.
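
The detection and bootstrap can be sketched against the raw 1 KiB LAPIC register page, which is the layout KVM's LAPIC state uses. The register offsets, the mask and periodic bits, and the divide-by-1 encoding come from the x86 APIC register layout. The function name and the exact detection rule (masked with vector 0) are assumptions drawn from the description above, not the actual restore code.

```rust
// Offsets within the 1 KiB LAPIC register page.
const APIC_LVTT: usize = 0x320; // LVT Timer
const APIC_TMICT: usize = 0x380; // Timer Initial Count
const APIC_TDCR: usize = 0x3E0; // Timer Divide Configuration

const LVT_MASKED: u32 = 1 << 16; // interrupt mask bit
const LVT_TIMER_PERIODIC: u32 = 1 << 17; // timer mode = periodic
const TDCR_DIVIDE_BY_1: u32 = 0b1011; // divide configuration encoding for /1

fn read_reg(regs: &[u8; 1024], off: usize) -> u32 {
    u32::from_le_bytes([regs[off], regs[off + 1], regs[off + 2], regs[off + 3]])
}

fn write_reg(regs: &mut [u8; 1024], off: usize, val: u32) {
    regs[off..off + 4].copy_from_slice(&val.to_le_bytes());
}

/// If the timer LVT is masked with vector 0 (the idle/NO_HZ state described
/// above), program a periodic timer and return true; otherwise leave the
/// registers untouched and return false.
fn bootstrap_timer_if_idle(regs: &mut [u8; 1024]) -> bool {
    let lvtt = read_reg(regs, APIC_LVTT);
    let masked = (lvtt & LVT_MASKED) != 0;
    let vector = lvtt & 0xFF;
    if !(masked && vector == 0) {
        return false;
    }
    write_reg(regs, APIC_TDCR, TDCR_DIVIDE_BY_1);
    write_reg(regs, APIC_TMICT, 0x20_0000);
    write_reg(regs, APIC_LVTT, LVT_TIMER_PERIODIC | 0xEC); // unmasked, vector 0xEC
    true
}
```

In a real restore path the modified page would then be handed to KVM_SET_LAPIC; that call is outside this sketch.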

Vsock backend for snapshot

The userspace virtio-vsock backend must be used for VMs that will be snapshotted. The kernel vhost backend (/dev/vhost-vsock) does not expose internal vring indices, making queue state capture incomplete. The userspace backend tracks last_avail_idx/last_used_idx directly, ensuring clean snapshot/restore of the virtqueue state.

CID preservation

The snapshot stores the VM’s actual CID (assigned at cold boot). On restore, the same CID is reused — the guest kernel caches the CID during virtio-vsock probe and silently drops packets with mismatched dst_cid.

Opt-in plumbing

Every layer has an optional snapshot field that defaults to None:

Layer                    Field              Type              Default
SandboxBuilder           .snapshot(path)    Option<PathBuf>   None
BoxConfig                snapshot           Option<PathBuf>   None
SandboxSpec (YAML)       sandbox.snapshot   Option<String>    None
BoxSandboxOverride       sandbox.snapshot   Option<String>    None
CreateRunRequest (API)   snapshot           Option<String>    None

Resolution chain: per-box override → top-level spec → None (cold boot).
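
Because every layer is an Option, the chain reduces to a single fallback. A minimal sketch, where effective_snapshot is a hypothetical helper rather than a VoidBox function:

```rust
/// Picks the snapshot for one box: the per-box override wins, then the
/// top-level spec value. `None` means cold boot.
fn effective_snapshot(per_box: Option<&str>, top_level: Option<&str>) -> Option<String> {
    per_box.or(top_level).map(str::to_owned)
}
```

For the YAML examples earlier, the analyst box resolves to 'def789' while a box without an override inherits the top-level value, or cold boots if there is none.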

Snapshot resolution

When a snapshot string is provided, the runtime resolves it as:

  1. Hash prefix → ~/.void-box/snapshots/<prefix>/ (if state.bin or vz_meta.json exists)
  2. Literal path → treat as directory path (if state.bin or vz_meta.json exists)
  3. Neither → warning printed, cold boot

Resolution is backend-agnostic: state.bin identifies a Linux/KVM snapshot, vz_meta.json a macOS/VZ snapshot. No env var fallback, no auto-detection.
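
The lookup order can be sketched as follows. This is an illustration under assumptions: resolve_snapshot, detect_backend, and the Backend enum are hypothetical names, the snapshot root is passed in rather than derived from the home directory, and the prefix is matched as an exact directory name.

```rust
use std::path::{Path, PathBuf};

#[derive(Debug, PartialEq)]
enum Backend {
    Kvm, // directory contains state.bin
    Vz,  // directory contains vz_meta.json
}

/// Identifies the backend from the marker file inside a snapshot directory.
fn detect_backend(dir: &Path) -> Option<Backend> {
    if dir.join("state.bin").is_file() {
        Some(Backend::Kvm)
    } else if dir.join("vz_meta.json").is_file() {
        Some(Backend::Vz)
    } else {
        None
    }
}

/// Tries `value` as a hash prefix under `snapshots_root`, then as a literal
/// directory path. Returns None (cold boot) when neither holds a snapshot.
fn resolve_snapshot(snapshots_root: &Path, value: &str) -> Option<(PathBuf, Backend)> {
    for dir in vec![snapshots_root.join(value), PathBuf::from(value)] {
        if let Some(backend) = detect_backend(&dir) {
            return Some((dir, backend));
        }
    }
    eprintln!("warning: snapshot '{}' not found, falling back to cold boot", value);
    None
}
```

Returning the backend alongside the path keeps the caller from re-probing the directory when it picks the KVM or VZ restore path.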

Cache management

  • LRU eviction: evict_lru(max_bytes) removes oldest snapshots first
  • Layer hashing: compute_layer_hash(base, layer, content) for deterministic cache keys
  • Listing: list_snapshots() / voidbox snapshot list
  • Deletion: delete_snapshot(prefix) / voidbox snapshot delete <prefix>

The snapshot cache is stored at ~/.void-box/snapshots/.
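
The eviction policy can be illustrated with an in-memory model. This is not the real evict_lru: the SnapshotEntry struct, the use of a last-used timestamp as the age, and the function name are all assumptions made for the sketch, and nothing here touches the filesystem.

```rust
use std::time::SystemTime;

#[derive(Debug, Clone, PartialEq)]
struct SnapshotEntry {
    hash_prefix: String,
    size_bytes: u64,
    last_used: SystemTime,
}

/// Drops least-recently-used entries until the total size fits `max_bytes`.
/// Returns the evicted hash prefixes in eviction order.
fn evict_lru_sketch(entries: &mut Vec<SnapshotEntry>, max_bytes: u64) -> Vec<String> {
    entries.sort_by_key(|e| e.last_used); // oldest first
    let mut total: u64 = entries.iter().map(|e| e.size_bytes).sum();
    let mut evicted = Vec::new();
    while total > max_bytes && !entries.is_empty() {
        let oldest = entries.remove(0);
        total -= oldest.size_bytes;
        evicted.push(oldest.hash_prefix);
    }
    evicted
}
```

A real implementation would also delete each evicted directory and would need to decide whether a base snapshot may be evicted while a diff that depends on it remains.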

macOS / VZ snapshots

The VZ backend wraps Apple’s native saveMachineStateToURL: / restoreMachineStateFromURL: APIs (macOS 14+) rather than serializing vCPU registers and memory directly. Apple manages the VM state blob; VoidBox manages the continuity metadata around it. There is no separate diff snapshot on VZ — each save produces a complete restorable state.

Apple refuses to restore a VM whose VZVirtualMachineConfiguration drifts from the one used at save time (memory, vCPUs, network, kernel cmdline, machine identifier). VoidBox persists a JSON sidecar alongside Apple’s save blob to survive that constraint on cold hosts.

Save and restore flow

Save:
  1. Pause VM                         VZVirtualMachine.pause
  2. saveMachineStateToURL:           Apple writes opaque state blob (vm.vzvmsave)
  3. Write vz_meta.json sidecar       VzSnapshotMeta (VoidBox continuity fields)
  4. Stop VM from paused state        No resume/pause round-trip

Restore:
  1. Read vz_meta.json                Recover identifier + saved config
  2. Reconcile with caller config     Override drifting memory/vcpus/network silently
  3. Build VZVirtualMachineConfiguration using the saved identifier
  4. restoreMachineStateFromURL:      Apple restores opaque state
  5. Resume                           Guest continues execution

VzSnapshotMeta sidecar fields

Field                       Purpose
session_secret              Guest-agent auth token baked into the kernel cmdline at save time
memory_mb, vcpus, network   Reconciliation targets; override caller config if drifting
boot_clock_secs             Wall-clock at save so the kernel cmdline matches at restore
config_hash                 Continuity check against the caller’s BackendConfig
machine_identifier          VZGenericMachineIdentifier.dataRepresentation (required by Apple)
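
The reconciliation step in the restore flow amounts to letting the saved values win wherever the caller's config has drifted. A minimal sketch, assuming field types (u64, u32, bool) that the source does not specify; VmShape and reconcile are hypothetical names, not the VzSnapshotMeta API.

```rust
/// The subset of VM configuration that must match the saved state.
#[derive(Debug, Clone, PartialEq)]
struct VmShape {
    memory_mb: u64,
    vcpus: u32,
    network: bool,
}

/// Returns the shape to build the VM with (always the saved one) and the
/// names of the caller fields that were overridden.
fn reconcile(saved: &VmShape, caller: &VmShape) -> (VmShape, Vec<&'static str>) {
    let mut drifted = Vec::new();
    if caller.memory_mb != saved.memory_mb {
        drifted.push("memory_mb");
    }
    if caller.vcpus != saved.vcpus {
        drifted.push("vcpus");
    }
    if caller.network != saved.network {
        drifted.push("network");
    }
    (saved.clone(), drifted)
}
```

The override itself is silent per the flow above; the returned list of drifted fields is only there so a caller could log it at debug level if desired.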

Storage layout (VZ)

~/.void-box/snapshots/
  └── <hash-prefix>/
      ├── vm.vzvmsave          # Apple's opaque save blob
      └── vz_meta.json         # VzSnapshotMeta sidecar (JSON)

enable_snapshots opt-in

SandboxBuilder::enable_snapshots(true) (plumbed through SandboxConfig → BackendConfig) gates Apple’s validateSaveRestoreSupportWithError check at cold boot. Some device combinations (e.g. virtiofs shares) make Apple reject snapshot capability validation even when the VM runs fine for non-snapshot workloads, so cold boots that do not opt in skip the check and keep working.

Security considerations

Snapshot cloning shares identical VM state across restored instances — affecting RNG entropy, guest page-table layout (KASLR is disabled guest-wide regardless), and session-secret reuse for vsock auth.

See Snapshot security considerations in the Security Model page for the full rationale and mitigations.