rtb-telemetry v0.1: Opt-in anonymous usage telemetry
Status: IMPLEMENTED — 12 unit + 6 BDD acceptance criteria green;
T6 used an insta snapshot accepted in-commit.
Target crate: rtb-telemetry
Parent contract: §17 of the framework spec
and the two-level opt-in policy in CLAUDE.md.
1. Motivation
GTB documents a two-level opt-in: tool authors enable the telemetry feature at compile time, users opt in at runtime. Events carry enough to inform development (command name, duration, tool version, salted machine ID) and nothing else — no PII, no args, no paths, no file contents.
v0.1 ships the types and the three sinks that support local dev and
CI smoke-testing. The OTLP pipeline, HTTP sink, and rtb-cli
wiring land in v0.2.
2. Scope boundaries (explicit)
In scope for v0.1
- Event struct: timestamp, event name, tool name, tool version, salted machine ID, custom attrs (HashMap<String, String>).
- TelemetrySink async trait: emit(&Event), flush().
- Built-in sinks: NoopSink, FileSink (JSONL), MemorySink (tests).
- TelemetryContext: holds tool metadata + active sink + opt-in flag; record(event_name) is the main user surface.
- MachineId::derive(&salt) -> String: salted SHA-256 of machine_uid::get(). Never the raw ID.
- CollectionPolicy::{Disabled, Enabled}: the runtime opt-in switch. Disabled is always honoured: no machine ID derivation, no sink calls, no events retained.
- TelemetryError with miette::Diagnostic.
Deferred
- OTLP exporter: pulls in opentelemetry + opentelemetry-otlp with its own dep tree. Lands in v0.2.
- HTTP JSON sink: simple reqwest POST to a downstream endpoint; also v0.2, once we wire a real opt-in prompt.
- Batching + retry on sinks: v0.1 sinks are synchronous-on-emit.
- rtb-cli telemetry subcommand (enable/disable/status/reset): lands with rtb-cli v0.2 once this crate is stable.
- Automatic redaction of attrs: callers are responsible for passing only redacted values. A v0.2 hook can integrate with an rtb-redact crate.
3. Public API
3.1 Crate root
pub use context::{CollectionPolicy, TelemetryContext};
pub use error::TelemetryError;
pub use event::Event;
pub use machine::MachineId;
pub use sink::{FileSink, MemorySink, NoopSink, TelemetrySink};
3.2 Event
#[derive(Debug, Clone, serde::Serialize)]
pub struct Event {
    pub name: String,
    pub tool: String,
    pub tool_version: String,
    pub machine_id: String,    // salted SHA-256 hex
    pub timestamp_utc: String, // RFC 3339
    pub attrs: std::collections::HashMap<String, String>,
}
Construction:
- Event::new(name, tool, tool_version, machine_id): required shape; attrs starts empty.
- with_attr(k, v) -> Self: fluent setter.
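The construction surface above can be sketched with the standard library alone. This is an illustration under assumptions, not the crate's implementation: the serde derive is omitted, the parameter types are guesses, and rfc3339_utc is a hypothetical helper (the spec does not say how the timestamp is produced). The helper formats Unix seconds as an RFC 3339 UTC string using the standard days-to-civil-date conversion.

```rust
use std::collections::HashMap;
use std::time::{SystemTime, UNIX_EPOCH};

/// Illustrative stand-in for the crate's `Event` (serde derive omitted).
#[derive(Debug, Clone)]
pub struct Event {
    pub name: String,
    pub tool: String,
    pub tool_version: String,
    pub machine_id: String,
    pub timestamp_utc: String,
    pub attrs: HashMap<String, String>,
}

impl Event {
    /// Required shape: four identifying fields; `attrs` starts empty.
    pub fn new(name: &str, tool: &str, tool_version: &str, machine_id: &str) -> Self {
        // Falls back to the epoch if the system clock reads before 1970.
        let secs = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .map(|d| d.as_secs())
            .unwrap_or(0);
        Self {
            name: name.to_owned(),
            tool: tool.to_owned(),
            tool_version: tool_version.to_owned(),
            machine_id: machine_id.to_owned(),
            timestamp_utc: rfc3339_utc(secs),
            attrs: HashMap::new(),
        }
    }

    /// Fluent setter: consumes the event and returns it with one more attribute.
    pub fn with_attr(mut self, key: &str, value: &str) -> Self {
        self.attrs.insert(key.to_owned(), value.to_owned());
        self
    }
}

/// Hypothetical helper (an assumption, not part of the spec): formats Unix
/// seconds as `YYYY-MM-DDTHH:MM:SSZ`.
pub fn rfc3339_utc(secs: u64) -> String {
    let days = (secs / 86_400) as i64;
    let rem = secs % 86_400;
    // Shift the epoch to 0000-03-01 so leap days fall at the end of a year.
    let z = days + 719_468;
    let era = z.div_euclid(146_097);
    let doe = z.rem_euclid(146_097); // day of 400-year era
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365;
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of March-based year
    let mp = (5 * doy + 2) / 153; // March-based month index, 0..=11
    let day = doy - (153 * mp + 2) / 5 + 1;
    let month = if mp < 10 { mp + 3 } else { mp - 9 };
    let year = yoe + era * 400 + if month <= 2 { 1 } else { 0 };
    format!(
        "{:04}-{:02}-{:02}T{:02}:{:02}:{:02}Z",
        year, month, day, rem / 3_600, (rem % 3_600) / 60, rem % 60
    )
}
```

A caller would chain the setter, for example Event::new("command.run", "mytool", "1.2.3", &id).with_attr("command", "build"), where the event and attribute names are placeholders.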
3.3 TelemetrySink
#[async_trait::async_trait]
pub trait TelemetrySink: Send + Sync + 'static {
    /// Deliver one event to the sink.
    async fn emit(&self, event: &Event) -> Result<(), TelemetryError>;
    /// Defaults to a no-op that returns Ok(()).
    async fn flush(&self) -> Result<(), TelemetryError> { Ok(()) }
}
3.4 Built-in sinks
- NoopSink: emit is a no-op; always Ok. Default when CollectionPolicy::Disabled.
- FileSink: JSONL into a PathBuf. emit appends serde_json::to_string(&event) followed by \n atomically (open/append/close per event for simplicity; batching in v0.2).
- MemorySink: Arc<Mutex<Vec<Event>>>; tests inspect via MemorySink::snapshot().
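The open/append/close-per-event behaviour described for FileSink might look roughly like the sketch below. It is synchronous and std-only for brevity, and append_jsonl_line is a hypothetical free function, not the crate's API; it takes an already-serialised JSON string where the real sink would call serde_json::to_string. It also creates missing parent directories, as T5 requires.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper: appends one already-serialised JSON document as a
/// single JSONL line, creating the file and its parent directories if needed.
pub fn append_jsonl_line(path: &Path, json: &str) -> io::Result<()> {
    // Create missing parent directories first (skipped for bare file names).
    if let Some(parent) = path.parent() {
        if !parent.as_os_str().is_empty() {
            fs::create_dir_all(parent)?;
        }
    }

    // Open in append mode; the handle is closed when `file` goes out of scope.
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;

    // Build the full line up front so the record and its newline are handed
    // to the OS together rather than in two separate writes.
    let mut line = String::with_capacity(json.len() + 1);
    line.push_str(json);
    line.push('\n');
    file.write_all(line.as_bytes())
}
```

Opening per event keeps the sink stateless at the cost of one open/close per record, which is the trade-off the spec accepts until batching arrives in v0.2.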
3.5 TelemetryContext
pub struct TelemetryContext {
    tool: String,
    tool_version: String,
    machine_id: String,
    sink: Arc<dyn TelemetrySink>,
    policy: CollectionPolicy,
}
impl TelemetryContext {
    pub fn builder() -> TelemetryContextBuilder;
    pub async fn record(&self, event_name: &str) -> Result<(), TelemetryError>;
    pub async fn record_with_attrs(
        &self,
        event_name: &str,
        attrs: HashMap<String, String>,
    ) -> Result<(), TelemetryError>;
    pub async fn flush(&self) -> Result<(), TelemetryError>;
}
When policy == Disabled:
- record short-circuits to Ok(()) without building an event.
- flush short-circuits too.
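The short-circuit rule can be illustrated with a reduced, synchronous model. Everything here is a stand-in chosen to keep the sketch std-only: Sink replaces the async TelemetrySink, Context replaces TelemetryContext, and events are reduced to their names. The point is only the ordering of checks: the policy is consulted before anything touches the sink.

```rust
use std::sync::{Arc, Mutex};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CollectionPolicy {
    Disabled,
    Enabled,
}

/// Synchronous stand-in for `TelemetrySink` (assumption: the real trait is async).
pub trait Sink: Send + Sync + 'static {
    fn emit(&self, event_name: &str);
}

/// In-memory sink in the spirit of `MemorySink`; stores event names only.
#[derive(Clone, Default)]
pub struct MemorySink {
    events: Arc<Mutex<Vec<String>>>,
}

impl MemorySink {
    /// Returns a copy of everything emitted so far, in emission order.
    pub fn snapshot(&self) -> Vec<String> {
        self.events.lock().unwrap().clone()
    }
}

impl Sink for MemorySink {
    fn emit(&self, event_name: &str) {
        self.events.lock().unwrap().push(event_name.to_owned());
    }
}

/// Reduced stand-in for `TelemetryContext`: just a sink and a policy.
#[derive(Clone)]
pub struct Context {
    sink: Arc<dyn Sink>,
    policy: CollectionPolicy,
}

impl Context {
    pub fn new(sink: Arc<dyn Sink>, policy: CollectionPolicy) -> Self {
        Self { sink, policy }
    }

    /// Checks the policy first; a Disabled context never reaches the sink.
    pub fn record(&self, event_name: &str) {
        if self.policy == CollectionPolicy::Disabled {
            return;
        }
        self.sink.emit(event_name);
    }
}
```

Because MemorySink clones share one buffer, a test can keep a handle, pass a clone to the context, and inspect snapshot() afterwards.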
3.6 MachineId
pub struct MachineId;
impl MachineId {
    /// Salted SHA-256 of `machine_uid::get()`. Hex-encoded.
    /// Falls back to a random `uuid::Uuid` when the OS doesn't
    /// expose a machine ID (containers, WASI).
    pub fn derive(salt: &str) -> String;
}
Because the output is machine-dependent, tests do not assert a specific value. They check only that it is a hex string of length 64 (T7) and that it is stable for a fixed salt (T8).
3.7 TelemetryError
#[derive(Debug, thiserror::Error, miette::Diagnostic)]
#[non_exhaustive]
pub enum TelemetryError {
    #[error("sink I/O error: {0}")]
    #[diagnostic(code(rtb::telemetry::io))]
    Io(#[from] std::io::Error),

    #[error("serialisation error: {0}")]
    #[diagnostic(code(rtb::telemetry::serde))]
    Serde(String),
}
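For readers unfamiliar with thiserror, the sketch below hand-writes roughly what the derive on this enum provides: the Display messages, the error source, and the From<std::io::Error> conversion that lets sink code use the ? operator. The miette diagnostic codes are omitted, and read_events is a hypothetical function that exists only to show the conversion in use.

```rust
use std::fmt;
use std::io;
use std::path::Path;

#[derive(Debug)]
pub enum TelemetryError {
    Io(io::Error),
    Serde(String),
}

impl fmt::Display for TelemetryError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::Io(e) => write!(f, "sink I/O error: {e}"),
            Self::Serde(msg) => write!(f, "serialisation error: {msg}"),
        }
    }
}

impl std::error::Error for TelemetryError {
    // Expose the wrapped I/O error as the underlying cause.
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        match self {
            Self::Io(e) => Some(e),
            Self::Serde(_) => None,
        }
    }
}

// Hand-written equivalent of the conversion `#[from]` requests.
impl From<io::Error> for TelemetryError {
    fn from(e: io::Error) -> Self {
        Self::Io(e)
    }
}

/// Hypothetical helper: `?` converts the `io::Error` into `TelemetryError::Io`.
pub fn read_events(path: &Path) -> Result<String, TelemetryError> {
    Ok(std::fs::read_to_string(path)?)
}
```

Serde carries a String rather than a wrapped error type, so a caller would build it explicitly, for example with map_err(|e| TelemetryError::Serde(e.to_string())).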
4. Acceptance criteria
4.1 Unit tests (T#)
- T1: TelemetrySink is object-safe; Arc<dyn TelemetrySink> compiles.
- T2: NoopSink::emit is Ok and does nothing observable.
- T3: MemorySink::emit records an event; snapshot returns the emitted event.
- T4: FileSink appends JSONL; write two events, read the file, assert two well-formed JSON lines.
- T5: FileSink creates parent dirs; passing /tmp/xyz/.../events.jsonl with a non-existent parent succeeds.
- T6: Event serialises with the expected field names. Insta snapshot of a fixed event.
- T7: MachineId::derive is hex/64; format sanity.
- T8: MachineId::derive is stable for a fixed salt; calling twice returns the same hash.
- T9: TelemetryContext::record emits through the sink when policy is Enabled.
- T10: TelemetryContext::record no-ops when policy is Disabled.
- T11: TelemetryContext::record_with_attrs attaches the supplied attrs to the emitted event.
- T12: TelemetryContext is Clone + Send + Sync.
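T1 and T12 are compile-time properties, so a test for them typically contains no runtime assertion at all. One common pattern is sketched below with stand-in types (Sink, NoopSink, and Context are simplified placeholders, not the crate's definitions): a generic function whose bounds fail to compile if the type does not satisfy them.

```rust
use std::sync::Arc;

/// Simplified, synchronous stand-in for `TelemetrySink`.
pub trait Sink: Send + Sync + 'static {
    fn emit(&self, event_name: &str);
}

pub struct NoopSink;

impl Sink for NoopSink {
    fn emit(&self, _event_name: &str) {}
}

/// Simplified stand-in for `TelemetryContext`.
#[derive(Clone)]
pub struct Context {
    pub sink: Arc<dyn Sink>,
}

/// T12-style check: instantiating this for `T` compiles only if `T` is
/// `Clone + Send + Sync`.
pub fn assert_clone_send_sync<T: Clone + Send + Sync>() {}

/// T1-style check: this coercion compiles only if `Sink` is object-safe.
pub fn boxed_noop() -> Arc<dyn Sink> {
    Arc::new(NoopSink)
}
```

A test body then reduces to calling assert_clone_send_sync::<Context>() and boxed_noop(); if either property regresses, the test crate stops compiling.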
4.2 Gherkin scenarios (S#)
- S1: record with Disabled policy emits nothing.
- S2: record with Enabled policy + MemorySink emits one event.
- S3: record_with_attrs sets the attrs map on the event.
- S4: Two sequential records are observable as two events, in the order they were recorded.
- S5: FileSink writes JSONL lines to disk.
- S6: MachineId::derive with a fixed salt returns the same hash across two calls.
5. Security & operational requirements
- #![forbid(unsafe_code)].
- Machine ID is derived lazily, only when policy is Enabled. A Disabled context never touches machine_uid::get().
- No logging of raw machine ID. Only the salted hash.
- FileSink writes with mode 0644 on Unix (no special perms).
- Event::attrs keys and values are caller-supplied Strings. The framework does not redact them; callers are on the hook. v0.2 will integrate an rtb-redact helper.
6. Non-goals
- No backoff / retry / batching. Every emit completes its write before returning. Downstream crates wanting durability use FileSink (or the OTLP sink once it lands in v0.2) layered with their own buffering.
- No event schema versioning. v0.1 Event is documented as #[non_exhaustive] so we can extend without breaking.
7. Rollout plan
- Land spec + tests + impl in one feat(telemetry) commit.
- v0.2 adds HttpSink + OtlpSink and the telemetry subcommand in rtb-cli.
8. Open questions
- O1: Sync or async emit? Async, for symmetry with the credential store and future OTLP backend. FileSink wraps std::fs::OpenOptions in tokio::task::spawn_blocking.
- O2: Should Event::timestamp_utc be time::OffsetDateTime? Strings keep the serde shape stable and avoid pulling timezone crates into consumers. Lean: String.
- O3: Should MachineId::derive hash a version/discriminator with the salt so rotating the salt invalidates old IDs? Yes; the salt itself is the discriminator. Callers choose a per-tool salt.
- O4: Where should FileSink default its path? v0.1 requires a caller-supplied path. directories::ProjectDirs::data_dir() would be a sensible default; deferred until rtb-cli wires this up.