Observability Setup

VoidBox exposes structured run events and guest telemetry by default. OTLP trace and metric export — covering pipeline and stage spans, tool-call child spans, token/cost/duration metrics, and guest procfs telemetry — is available when built with the opentelemetry feature. All output is correlated with the [vm:NAME] log prefix for per-box filtering.
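
Because every line carries the [vm:NAME] prefix, per-box filtering works with ordinary text tools. A minimal sketch, assuming the output was captured to a file and that the prefix appears literally as [vm:<box name>]; filter_vm_logs and run.log are illustrative names, not part of VoidBox:

```shell
# filter_vm_logs NAME: print only the stdin lines that carry the
# [vm:NAME] prefix. grep -F treats the brackets as literal text,
# not as a regular-expression character class.
filter_vm_logs() {
    grep -F -- "[vm:$1]"
}

# Usage (run.log is a hypothetical capture of VoidBox output):
#   filter_vm_logs data_analyst < run.log
```

The match is anchored on the full bracketed prefix, so a box named data does not also match lines from data_analyst.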

Enable OTLP

Build with the opentelemetry feature flag:

cargo build --features opentelemetry

Then set the endpoint environment variable when running an observable example:

VOIDBOX_OTLP_ENDPOINT=http://localhost:4317 \
cargo run --features opentelemetry --example playground_pipeline

OTLP export is gRPC only today — HTTP/protobuf endpoints are not supported.
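
A misconfigured endpoint is easy to catch before launch. The sketch below is a heuristic only: it flags endpoints that look like OTLP/HTTP by the conventional port 4318 or a /v1/traces, /v1/metrics, or /v1/logs path. Those conventions come from the OTLP specification's defaults, not from VoidBox, and a collector may listen elsewhere. looks_like_otlp_http is a hypothetical helper:

```shell
# looks_like_otlp_http URL: exit 0 when URL resembles an OTLP/HTTP
# endpoint (conventional port 4318, or a /v1/<signal> path), else exit 1.
looks_like_otlp_http() {
    case "$1" in
        *:4318|*:4318/*|*/v1/traces|*/v1/metrics|*/v1/logs) return 0 ;;
        *) return 1 ;;
    esac
}

# Warn before launching if the configured endpoint looks wrong.
if looks_like_otlp_http "${VOIDBOX_OTLP_ENDPOINT:-}"; then
    echo "warning: endpoint looks like OTLP/HTTP; export is gRPC only" >&2
fi
```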

Trace structure

When observable execution is enabled, traces follow a hierarchy from the pipeline level down to individual tool calls within each VM stage. The box name is encoded in the span name, not as an attribute.

pipeline:<pipeline_name>
  ├─ stage:data_analyst
  │    └─ claude.exec
  │         ├─ claude.tool.Read
  │         ├─ claude.tool.Bash
  │         └─ attributes:
  │             gen_ai.usage.input_tokens
  │             gen_ai.usage.output_tokens
  │             gen_ai.request.model
  │             claude.total_cost_usd
  │             claude.tools_count
  └─ stage:quant_analyst
       └─ ...

Tool calls are recorded as child spans (claude.tool.<tool_name>) under each box's claude.exec span, not as span events. Each stage span carries token, cost, and model attributes; the token and model attributes follow the OpenTelemetry gen_ai.* semantic conventions, while cost and tool count use claude.*-prefixed names.
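
Since the box name lives in the span name rather than in an attribute, finding one box's spans means matching on name. The playground stack described later on this page bundles Tempo, which accepts TraceQL, and the name intrinsic there does exactly that. A small sketch; the helper functions are hypothetical conveniences for building the strings, not VoidBox tooling:

```shell
# Span-name helpers that mirror the naming shown in the tree above.
stage_span_name() { printf 'stage:%s\n' "$1"; }
tool_span_name()  { printf 'claude.tool.%s\n' "$1"; }

# traceql_by_name NAME: print a TraceQL selector that matches spans
# whose name equals NAME exactly.
traceql_by_name() { printf '{ name = "%s" }\n' "$1"; }

# Usage: paste the printed selector into Grafana's Tempo query editor.
#   traceql_by_name "$(stage_span_name data_analyst)"
#   traceql_by_name "$(tool_span_name Bash)"
```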

Guest telemetry

The guest-agent inside each micro-VM periodically reads /proc/stat and /proc/meminfo, then sends TelemetryBatch messages over vsock to the host. On the host side, the TelemetryAggregator ingests these batches and exports them as OTLP metrics.

Guest telemetry gives you per-VM resource utilization without any agent-side instrumentation. CPU and memory metrics flow automatically as long as the guest-agent is running.
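
To make the data source concrete, the sketch below derives CPU and memory figures from procfs-formatted text the way a sampler might. It is not the guest-agent's code. The function names are hypothetical, and the choice of fields (idle plus iowait as idle time, MemTotal minus MemAvailable as used memory) is an assumption about what is worth reporting, not a description of what TelemetryBatch contains:

```shell
# cpu_totals: read /proc/stat-formatted text on stdin and print
# "<total> <idle>" jiffies for the aggregate "cpu" line.
# total sums user..steal (fields 2-9); guest time is already
# included in user and nice, so it is not added again.
cpu_totals() {
    awk '$1 == "cpu" { t = 0; for (i = 2; i <= 9; i++) t += $i; print t, $5 + $6; exit }'
}

# cpu_busy_percent T1 I1 T2 I2: busy share between two samples, in percent.
cpu_busy_percent() {
    awk -v t1="$1" -v i1="$2" -v t2="$3" -v i2="$4" 'BEGIN {
        dt = t2 - t1
        if (dt <= 0) { print "0.0"; exit }
        printf "%.1f\n", 100 * (dt - (i2 - i1)) / dt
    }'
}

# mem_used_kib: read /proc/meminfo-formatted text on stdin and print
# MemTotal minus MemAvailable, in KiB.
mem_used_kib() {
    awk '$1 == "MemTotal:" { total = $2 }
         $1 == "MemAvailable:" { avail = $2 }
         END { print total - avail }'
}

# Usage on a Linux host or guest:
#   a=$(cpu_totals < /proc/stat); sleep 1; b=$(cpu_totals < /proc/stat)
#   cpu_busy_percent $a $b
#   mem_used_kib < /proc/meminfo
```

CPU counters in /proc/stat are cumulative since boot, which is why utilization needs two samples and a difference rather than a single read.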

Configuration

Environment variable         Description
VOIDBOX_OTLP_ENDPOINT        OTLP gRPC endpoint (e.g. http://localhost:4317). Takes precedence over OTEL_EXPORTER_OTLP_ENDPOINT.
OTEL_EXPORTER_OTLP_ENDPOINT  Standard OpenTelemetry fallback for the endpoint.
VOIDBOX_SERVICE_NAME         Service name for exported telemetry (default: void-box).
OTEL_EXPORTER_OTLP_HEADERS   Read by the upstream OTLP SDK for auth headers (key=value,key2=value2). VoidBox itself doesn’t inject them.
VOIDBOX_OTLP_HEADERS         Parsed internally, but not yet wired into the exporter path; do not rely on it yet for collector auth.
VOIDBOX_OTEL_DEBUG           Set to 1 to log the resolved OTLP config at startup.
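
The precedence and default described above can be mirrored in a wrapper script, for example to print what a run will use before starting it. A minimal sketch: the function names are hypothetical, and treating an empty variable the same as an unset one is an assumption, since the table does not say how VoidBox handles empty values:

```shell
# resolve_otlp_endpoint: print the endpoint implied by the documented
# precedence: VOIDBOX_OTLP_ENDPOINT first, then
# OTEL_EXPORTER_OTLP_ENDPOINT. Prints an empty line when neither is set.
# Assumption: an empty value is treated like an unset one.
resolve_otlp_endpoint() {
    printf '%s\n' "${VOIDBOX_OTLP_ENDPOINT:-${OTEL_EXPORTER_OTLP_ENDPOINT:-}}"
}

# resolve_service_name: VOIDBOX_SERVICE_NAME, or the documented default.
resolve_service_name() {
    printf '%s\n' "${VOIDBOX_SERVICE_NAME:-void-box}"
}

# Usage:
#   echo "exporting as $(resolve_service_name) to $(resolve_otlp_endpoint)"
```

For the authoritative answer, set VOIDBOX_OTEL_DEBUG=1 and read the resolved config that VoidBox logs at startup.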

Default sample rate is 1.0 — every pipeline, stage, and tool-call span is exported.

Playground

The repository includes a ready-to-run observability stack in playground/. The compose file launches a single grafana/otel-lgtm container — Grafana + Loki + Tempo + Mimir bundled together — so you get traces, logs, and metrics from one endpoint at localhost:4317.

Start it with the wrapper script, which also provisions the KVM artifacts, picks a provider (Anthropic or Ollama), and runs playground_pipeline against the stack:

playground/up.sh

See the playground/ directory in the repository for details.