Goal
Build the telemetry data collection layer that all downstream ML use cases share. The engine should emit three streams with <5% tick-loop overhead.
Three Streams
Stream 1: Config Snapshot (once per run)
{
  "run_id": "...",
  "biome": "TEMPERATE",
  "sim_config_merged": { ... },
  "world_rates": { ... },
  "species_count": 12,
  "entity_count_initial": 45,
  "grid_dimensions": [32, 32, 32],
  "seed": 42
}
Emitted at engine init. Captures the full merged config (defaults + JSON overrides) so every telemetry row is traceable back to its parameter set.
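A minimal sketch of how the merged config and snapshot record could be produced, assuming the defaults and the JSON overrides are both plain nested dicts. The function names `merge_config` and `build_config_snapshot` are hypothetical, not existing engine API; the record keys follow the example above.

```python
import json
from copy import deepcopy


def merge_config(defaults: dict, overrides: dict) -> dict:
    """Recursively overlay overrides on defaults without mutating either input."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def build_config_snapshot(run_id, biome, defaults, overrides, world_rates,
                          species_count, entity_count_initial,
                          grid_dimensions, seed) -> str:
    """Return the Stream 1 record serialized as a single JSON line."""
    snapshot = {
        "run_id": run_id,
        "biome": biome,
        "sim_config_merged": merge_config(defaults, overrides),
        "world_rates": world_rates,
        "species_count": species_count,
        "entity_count_initial": entity_count_initial,
        "grid_dimensions": list(grid_dimensions),
        "seed": seed,
    }
    return json.dumps(snapshot)
```

Storing the merged result rather than only the overrides keeps a run reproducible even if the defaults change later.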
Stream 2: Time-Series Aggregates (every K ticks)
Not per-entity — that's O(entities × ticks). Instead aggregate:
- Per-species: mean/std of hunger, energy, hydration, health, reproductive_drive, colony_health
- Voxel grid stats: mean, min, max, spatial variance per layer (not individual cells)
- Event counts in windows: deaths, reproductions, predations, pollinations per K-tick window
- Population counts by species and state
The sampling interval K is configurable (default 50 ticks, i.e. 5 seconds at 10 Hz).
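One way to get per-species mean/std without storing per-entity rows is Welford's online algorithm, which needs only three numbers per tracked field. The sketch below assumes entity state is available as a dict of floats and that `observe()` is called from a phase that already iterates entities on a sampling tick. `RunningStat`, `SpeciesAggregator`, and the species name used in the usage note are hypothetical.

```python
import math
from collections import defaultdict


class RunningStat:
    """Welford's online mean/variance: O(1) memory per tracked field."""

    __slots__ = ("n", "mean", "m2")

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def add(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def summary(self) -> dict:
        # Population standard deviation; 0.0 with fewer than two samples.
        std = math.sqrt(self.m2 / self.n) if self.n > 1 else 0.0
        return {"mean": self.mean, "std": std, "n": self.n}


class SpeciesAggregator:
    """Accumulates per-species statistics and emits one row per sample."""

    def __init__(self, fields):
        self.fields = tuple(fields)
        self._stats = defaultdict(
            lambda: {f: RunningStat() for f in self.fields}
        )

    def observe(self, species: str, entity_state: dict) -> None:
        stats = self._stats[species]
        for field in self.fields:
            stats[field].add(entity_state[field])

    def sample_and_reset(self, tick: int) -> dict:
        row = {
            "tick": tick,
            "species": {
                sp: {f: s.summary() for f, s in stats.items()}
                for sp, stats in self._stats.items()
            },
        }
        self._stats.clear()
        return row
```

For example, after `agg = SpeciesAggregator(["hunger", "energy"])` and one `agg.observe("deer", state)` call per entity, `agg.sample_and_reset(tick)` returns a single row whose size depends on the number of species, not the number of entities.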
Stream 3: Event Log (batched)
The existing EventRecord stream is already structured. Batch with tick ranges for compact storage.
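Batching with tick ranges might look like the sketch below. The shape of `EventRecord` is not shown here, so this assumes each event can be represented as a dict with at least a `tick` key; `EventBatcher`, `batch_size`, and the `tick_start`/`tick_end`/`count` keys are hypothetical names chosen for illustration.

```python
import json


class EventBatcher:
    """Groups events into batches annotated with the tick range they cover."""

    def __init__(self, batch_size: int = 256):
        self.batch_size = batch_size
        self._pending = []

    def add(self, event: dict):
        """Queue one event; return a JSON line when the batch fills, else None."""
        self._pending.append(event)
        if len(self._pending) >= self.batch_size:
            return self.flush()
        return None

    def flush(self):
        """Serialize and clear pending events; return None if nothing is queued."""
        if not self._pending:
            return None
        ticks = [e["tick"] for e in self._pending]
        batch = {
            "tick_start": min(ticks),
            "tick_end": max(ticks),
            "count": len(self._pending),
            "events": self._pending,
        }
        self._pending = []
        return json.dumps(batch)
```

The tick range on each batch lets a reader skip whole lines when looking for a time window, without parsing the individual events.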
Output Format
- JSONL by default (one line per sample, easy to tail/stream)
- Optional parquet writer for training pipeline ingestion
- File naming: {run_id}_telemetry.jsonl, {run_id}_events.jsonl
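A stdlib-only JSONL writer that follows the naming scheme above could be as small as the following sketch. `JsonlWriter` and its `stream` argument are hypothetical; the parquet path is left out because it would need an optional dependency.

```python
import json
from pathlib import Path


class JsonlWriter:
    """Appends one compact JSON object per line to {run_id}_{stream}.jsonl."""

    def __init__(self, out_dir, run_id: str, stream: str):
        self.path = Path(out_dir) / f"{run_id}_{stream}.jsonl"
        self._fh = open(self.path, "a", encoding="utf-8")

    def write(self, record: dict) -> None:
        # Compact separators keep lines short; one record per line.
        self._fh.write(json.dumps(record, separators=(",", ":")) + "\n")

    def flush(self) -> None:
        self._fh.flush()

    def close(self) -> None:
        self._fh.close()
```

Opening in append mode means an interrupted run leaves complete lines on disk, which fits the tail/stream use mentioned above.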
Module Structure
server/ecosim/telemetry.py # TelemetryEmitter class
server/ecosim/telemetry_aggregates.py # aggregation functions (per-species, voxel stats)
scripts/run_batch.py # batch runner: Sobol sampling → N runs → collect telemetry
Design Constraints
- <5% overhead on tick loop (aggregate in-place during existing phases, don't add a new phase)
- No external dependencies for the emitter itself (stdlib only — numpy/pandas optional for parquet output)
- Configurable via world JSON or engine constructor arg
- Works with existing WebSocket tick packets (telemetry is additive, not replacing anything)
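The constraints above suggest an emitter that does nothing on most ticks and only collects data on sampling ticks. The following is a rough sketch under those assumptions, not the intended implementation: the method names `on_init()`, `on_tick()`, and `flush()` follow the acceptance criteria, while the `sink` callable, the `collect` callback, and the `enabled` flag are hypothetical.

```python
import json


class TelemetryEmitter:
    """Buffers telemetry lines and hands them to a sink on flush()."""

    def __init__(self, sink, sample_interval: int = 50, enabled: bool = True):
        self.sink = sink  # hypothetical: any callable that accepts one JSON line
        self.sample_interval = sample_interval
        self.enabled = enabled
        self._buffer = []

    def on_init(self, config_snapshot: dict) -> None:
        """Record the Stream 1 config snapshot once per run."""
        if self.enabled:
            self._buffer.append(json.dumps(config_snapshot))

    def on_tick(self, tick: int, collect) -> None:
        """Call collect() only on sampling ticks so other ticks stay cheap."""
        if not self.enabled or tick % self.sample_interval != 0:
            return
        row = {"tick": tick}
        row.update(collect())
        self._buffer.append(json.dumps(row))

    def flush(self) -> int:
        """Send buffered lines to the sink; return how many were written."""
        count = len(self._buffer)
        for line in self._buffer:
            self.sink(line)
        self._buffer.clear()
        return count
```

Passing a callback instead of precomputed aggregates keeps the non-sampling path to a single modulo check, which is the part that matters for the overhead budget. The sink could be a file writer's `write` method or simply `list.append` in tests.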
Acceptance Criteria
- TelemetryEmitter class with on_init(), on_tick(), flush() API
- scripts/run_batch.py runs N configs and produces telemetry files
Related
- Docs: docs/ECOSIM_PARAMETER_TELEMETRY_SPACE.md (Phase 1)
- Feeds into: surrogate model training, sensitivity analysis, learned diffusion data collection