Observability Setup

VoidBox exposes structured run events and guest telemetry by default. When built with the opentelemetry feature, it also exports OTLP traces and metrics: pipeline and stage spans, tool-call child spans, token/cost/duration metrics, and guest procfs telemetry. All output is correlated through the [vm:NAME] log prefix, which allows per-box filtering.
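As a rough illustration of per-box filtering, the sketch below keeps only the log lines tagged for one box. It assumes the `[vm:NAME]` tag appears verbatim in each line; the helper name `lines_for_vm` and the sample log text are hypothetical and not part of VoidBox.

```rust
/// Returns the lines that carry the `[vm:<name>]` tag for one box.
/// Assumption: the tag appears verbatim somewhere in each log line.
fn lines_for_vm<'a>(log: &'a str, vm_name: &str) -> Vec<&'a str> {
    let tag = format!("[vm:{}]", vm_name);
    log.lines()
        .filter(|line| line.contains(tag.as_str()))
        .collect()
}

fn main() {
    // Hypothetical log output; real message text will differ.
    let log = "[vm:data_analyst] guest-agent started\n\
               [vm:quant_analyst] guest-agent started\n\
               [vm:data_analyst] stage finished";
    for line in lines_for_vm(log, "data_analyst") {
        println!("{line}");
    }
}
```

The same effect is usually achievable with a plain text filter in a log viewer; the point is only that the tag is the join key between log lines and a box.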

Enable OTLP

Build with the opentelemetry feature flag:

cargo build --features opentelemetry

Then set the endpoint environment variable when running an observable example:

VOIDBOX_OTLP_ENDPOINT=http://localhost:4317 \
cargo run --features opentelemetry --example playground_pipeline

OTLP export is gRPC only today — HTTP/protobuf endpoints are not supported.

Trace structure

When observable execution is enabled, traces follow a hierarchy from the pipeline level down to individual tool calls within each VM stage. The box name is encoded in the span name, not as an attribute.

pipeline:<pipeline_name>
  ├─ stage:data_analyst
  │    └─ claude.exec
  │         ├─ claude.tool.Read
  │         ├─ claude.tool.Bash
  │         └─ attributes:
  │             gen_ai.usage.input_tokens
  │             gen_ai.usage.output_tokens
  │             gen_ai.request.model
  │             claude.total_cost_usd
  │             claude.tools_count
  └─ stage:quant_analyst
       └─ ...

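Tool calls are recorded as child spans (claude.tool.<tool_name>), not span events, nested under each box's claude.exec span. Each stage carries token, cost, and model attributes: the gen_ai.* names follow the OpenTelemetry gen_ai.* semantic conventions, while the claude.* names (claude.total_cost_usd, claude.tools_count) fall outside that namespace.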
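Because the box name lives in the span name rather than in an attribute, a trace query has to match on or parse the name. The following sketch mirrors the naming scheme described above; the helper functions are hypothetical illustrations, not VoidBox APIs.

```rust
/// Span-name helpers mirroring the documented hierarchy.
fn pipeline_span(pipeline: &str) -> String {
    format!("pipeline:{pipeline}")
}

fn stage_span(box_name: &str) -> String {
    format!("stage:{box_name}")
}

fn tool_span(tool: &str) -> String {
    format!("claude.tool.{tool}")
}

/// Recovers the box name from a stage span name.
/// Returns None for any span that is not a stage span.
fn box_name_from_stage_span(span_name: &str) -> Option<&str> {
    span_name.strip_prefix("stage:")
}

fn main() {
    // Print a two-stage hierarchy with two tool calls per stage.
    println!("{}", pipeline_span("demo"));
    for box_name in ["data_analyst", "quant_analyst"] {
        println!("  {}", stage_span(box_name));
        println!("    claude.exec");
        for tool in ["Read", "Bash"] {
            println!("      {}", tool_span(tool));
        }
    }
}
```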

Guest telemetry

The guest-agent inside each micro-VM periodically reads /proc/stat and /proc/meminfo, then sends TelemetryBatch messages over vsock to the host. On the host side, the TelemetryAggregator ingests these batches and exports them as OTLP metrics.

Guest telemetry gives you per-VM resource utilization without any agent-side instrumentation. CPU and memory metrics flow automatically as long as the guest-agent is running.
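To make the procfs step concrete, here is a minimal sketch of how CPU and memory figures can be derived from /proc/stat and /proc/meminfo text. It is not the guest-agent's code: which fields the agent reads, the `CpuSample` type, and the function names are all assumptions, and the TelemetryBatch layout is not modelled.

```rust
/// Aggregate CPU counters from the first line of /proc/stat, in clock ticks.
#[derive(Debug, Clone, Copy, PartialEq)]
struct CpuSample {
    idle: u64,  // idle + iowait
    total: u64, // sum of user, nice, system, idle, iowait, irq, softirq, steal
}

/// Parses the aggregate "cpu " line; returns None if it is missing or malformed.
fn parse_cpu_line(stat: &str) -> Option<CpuSample> {
    let line = stat.lines().find(|l| l.starts_with("cpu "))?;
    let fields: Vec<u64> = line
        .split_whitespace()
        .skip(1) // skip the "cpu" label
        .take(8)
        .map(|f| f.parse::<u64>())
        .collect::<Result<_, _>>()
        .ok()?;
    if fields.len() < 4 {
        return None;
    }
    let idle = fields[3] + fields.get(4).copied().unwrap_or(0);
    Some(CpuSample { idle, total: fields.iter().sum() })
}

/// Busy fraction between two samples; None if counters did not advance.
fn cpu_busy_fraction(prev: CpuSample, cur: CpuSample) -> Option<f64> {
    let total = cur.total.checked_sub(prev.total)?;
    let idle = cur.idle.checked_sub(prev.idle)?;
    if total == 0 {
        return None;
    }
    Some(1.0 - idle as f64 / total as f64)
}

/// Looks up one /proc/meminfo entry (for example "MemTotal") in kB.
fn meminfo_kb(meminfo: &str, key: &str) -> Option<u64> {
    meminfo.lines().find_map(|line| {
        let (name, rest) = line.split_once(':')?;
        if name != key {
            return None;
        }
        rest.split_whitespace().next()?.parse().ok()
    })
}

fn main() {
    // On Linux, read the live files; elsewhere this prints nothing.
    if let Ok(stat) = std::fs::read_to_string("/proc/stat") {
        println!("{:?}", parse_cpu_line(&stat));
    }
    if let Ok(mem) = std::fs::read_to_string("/proc/meminfo") {
        println!("MemTotal kB: {:?}", meminfo_kb(&mem, "MemTotal"));
    }
}
```

CPU counters in /proc/stat are cumulative, so utilization only makes sense as a delta between two periodic samples, which is why the sketch takes a previous and a current reading.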

Configuration

Environment variable	Description
`VOIDBOX_OTLP_ENDPOINT`	OTLP gRPC endpoint (e.g. `http://localhost:4317`). Takes precedence over `OTEL_EXPORTER_OTLP_ENDPOINT`.
`OTEL_EXPORTER_OTLP_ENDPOINT`	Standard OpenTelemetry fallback for the endpoint.
`VOIDBOX_SERVICE_NAME`	Service name for exported telemetry (default: `void-box`).
`OTEL_EXPORTER_OTLP_HEADERS`	Read by the upstream OTLP SDK for auth headers (`key=value,key2=value2`). VoidBox itself doesn’t inject them.
`VOIDBOX_OTLP_HEADERS`	Parsed internally but not yet wired into the exporter path; do not rely on it for collector auth.
`VOIDBOX_OTEL_DEBUG`	Set to `1` to log the resolved OTLP config at startup.
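
The precedence and default rules in the table can be sketched as follows. This is an illustration under stated assumptions, not VoidBox's actual resolution code: the `OtlpConfig` struct, the `resolve` and `parse_headers` functions, and the choice to treat empty values as unset are all hypothetical. `parse_headers` only demonstrates the `key=value,key2=value2` format.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
struct OtlpConfig {
    endpoint: Option<String>,
    service_name: String,
}

/// Resolves the endpoint and service name from an environment snapshot.
/// Assumption: an empty value is treated the same as an unset variable.
fn resolve(env: &HashMap<String, String>) -> OtlpConfig {
    let get = |key: &str| env.get(key).filter(|v| !v.is_empty()).cloned();
    OtlpConfig {
        // VOIDBOX_OTLP_ENDPOINT takes precedence over the standard variable.
        endpoint: get("VOIDBOX_OTLP_ENDPOINT")
            .or_else(|| get("OTEL_EXPORTER_OTLP_ENDPOINT")),
        service_name: get("VOIDBOX_SERVICE_NAME")
            .unwrap_or_else(|| "void-box".to_string()),
    }
}

/// Parses `key=value,key2=value2`, skipping entries without `=` or a key.
fn parse_headers(raw: &str) -> Vec<(String, String)> {
    raw.split(',')
        .filter_map(|pair| {
            let (k, v) = pair.split_once('=')?;
            let (k, v) = (k.trim(), v.trim());
            if k.is_empty() {
                None
            } else {
                Some((k.to_string(), v.to_string()))
            }
        })
        .collect()
}

fn main() {
    let mut env = HashMap::new();
    env.insert(
        "OTEL_EXPORTER_OTLP_ENDPOINT".to_string(),
        "http://collector:4317".to_string(),
    );
    env.insert(
        "VOIDBOX_OTLP_ENDPOINT".to_string(),
        "http://localhost:4317".to_string(),
    );
    // The VoidBox-specific variable wins; the service name falls back to the default.
    println!("{:?}", resolve(&env));
    println!("{:?}", parse_headers("authorization=Bearer abc,x-tenant=demo"));
}
```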

The default sample rate is 1.0, so every pipeline, stage, and tool-call span is exported.

Playground

The repository includes a ready-to-run observability stack in playground/. The compose file launches a single grafana/otel-lgtm container — Grafana + Loki + Tempo + Mimir bundled together — so you get traces, logs, and metrics from one endpoint at localhost:4317.

Start it with the wrapper script, which also provisions the KVM artifacts, picks a provider (Anthropic or Ollama), and runs playground_pipeline against the stack:

playground/up.sh

See the playground/ directory in the repository for details.