Telemetry emitter: config snapshot + time-series aggregates + event batching #77

Description

@biosynthart

Goal

Build the telemetry data collection layer that all downstream ML use cases share. The engine should emit three streams with <5% tick-loop overhead.

Three Streams

Stream 1: Config Snapshot (once per run)

{
  "run_id": "...",
  "biome": "TEMPERATE",
  "sim_config_merged": { ... },
  "world_rates": { ... },
  "species_count": 12,
  "entity_count_initial": 45,
  "grid_dimensions": [32, 32, 32],
  "seed": 42
}

Emitted at engine init. Captures the full merged config (defaults + JSON overrides) so every telemetry row is traceable back to its parameter set.
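A minimal sketch of how the merged snapshot could be assembled, assuming the defaults and JSON overrides are plain nested dicts. `merge_config`, `build_config_snapshot`, and the example parameter names (`metabolism`, `hunger_rate`, `thirst_rate`) are hypothetical and only illustrate the idea; the snapshot field names are the ones listed above.

```python
import copy
import json


def merge_config(defaults: dict, overrides: dict) -> dict:
    """Recursively overlay overrides onto a deep copy of defaults."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def build_config_snapshot(run_id, biome, defaults, overrides, world_rates,
                          species_count, entity_count_initial,
                          grid_dimensions, seed):
    """Assemble the once-per-run snapshot with the fields from the issue."""
    return {
        "run_id": run_id,
        "biome": biome,
        "sim_config_merged": merge_config(defaults, overrides),
        "world_rates": world_rates,
        "species_count": species_count,
        "entity_count_initial": entity_count_initial,
        "grid_dimensions": list(grid_dimensions),
        "seed": seed,
    }


snapshot = build_config_snapshot(
    run_id="demo-run",  # hypothetical run id
    biome="TEMPERATE",
    # Hypothetical parameter names, for illustration only.
    defaults={"metabolism": {"hunger_rate": 0.01, "thirst_rate": 0.02}},
    overrides={"metabolism": {"hunger_rate": 0.05}},
    world_rates={},
    species_count=12,
    entity_count_initial=45,
    grid_dimensions=(32, 32, 32),
    seed=42,
)
line = json.dumps(snapshot, sort_keys=True)  # one JSONL line
```

Storing the merged result rather than the override file alone means a row can be traced to its full parameter set even if the defaults change later.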

Stream 2: Time-Series Aggregates (every K ticks)

Do not emit per-entity rows: that volume grows as O(entities × ticks). Aggregate instead:

  • Per-species: mean/std of hunger, energy, hydration, health, reproductive_drive, colony_health
  • Voxel grid stats: mean, min, max, spatial variance per layer (not individual cells)
  • Event counts in windows: deaths, reproductions, predations, pollinations per K-tick window
  • Population counts by species and state

The sampling interval K is configurable (default: 50 ticks, i.e. 5 seconds at a 10 Hz tick rate).
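One way to compute these aggregates in a single pass is a Welford running accumulator, sketched below with the stdlib only. This is an illustration under assumptions, not the engine's implementation: entities are modelled as plain dicts with a `species` key, "spatial variance" is read as the variance across a layer's cells, and `RunningStat`, `aggregate_species`, `layer_stats`, and `should_sample` are hypothetical names.

```python
import math
from collections import defaultdict

STATE_VARS = ("hunger", "energy", "hydration", "health",
              "reproductive_drive", "colony_health")


class RunningStat:
    """Welford accumulator: mean and variance in one pass, O(1) memory."""

    __slots__ = ("n", "mean", "m2")

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0

    def add(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def variance(self) -> float:
        # Population variance; 0.0 for an empty accumulator.
        return self.m2 / self.n if self.n else 0.0

    def std(self) -> float:
        return math.sqrt(self.variance())


def aggregate_species(entities):
    """entities: iterable of dicts with 'species' plus every state variable."""
    stats = defaultdict(lambda: {v: RunningStat() for v in STATE_VARS})
    for entity in entities:
        per_var = stats[entity["species"]]
        for var in STATE_VARS:
            per_var[var].add(entity[var])
    return {
        species: {v: {"mean": s.mean, "std": s.std()} for v, s in per_var.items()}
        for species, per_var in stats.items()
    }


def layer_stats(cells):
    """Mean, min, max, and variance over one voxel layer's cell values."""
    stat = RunningStat()
    lo, hi = math.inf, -math.inf
    for value in cells:
        stat.add(value)
        lo = min(lo, value)
        hi = max(hi, value)
    if stat.n == 0:
        return {"mean": None, "min": None, "max": None, "var": None}
    return {"mean": stat.mean, "min": lo, "max": hi, "var": stat.variance()}


def should_sample(tick: int, k: int = 50) -> bool:
    """True on every K-th tick (default K = 50)."""
    return tick % k == 0


# Two hypothetical entities of one species.
base = {v: 0.5 for v in STATE_VARS}
entities = [
    {"species": "rabbit", **base, "hunger": 0.2},
    {"species": "rabbit", **base, "hunger": 0.4},
]
species_agg = aggregate_species(entities)
water_layer = layer_stats([0.0, 1.0, 2.0, 3.0])
```

Because `RunningStat.add` is O(1), it could be called from inside the existing per-entity phases rather than from a separate pass, which is the pattern the overhead budget calls for.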

Stream 3: Event Log (batched)

The existing EventRecord stream is already structured. Batch events and tag each batch with its tick range for compact storage.
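A rough sketch of tick-range batching follows. The real `EventRecord` fields are not shown in this issue, so events are modelled here as dicts with hypothetical `tick` and `kind` keys, and `EventBatcher` is an illustrative name. The sketch assumes events arrive in non-decreasing tick order.

```python
from collections import Counter


class EventBatcher:
    """Buffer events and close a batch once it spans batch_ticks ticks."""

    def __init__(self, batch_ticks: int = 50):
        self.batch_ticks = batch_ticks
        self._pending = []

    def add(self, event: dict):
        """Buffer one event; return a finished batch if the window rolled over."""
        finished = None
        if self._pending:
            span = event["tick"] - self._pending[0]["tick"]
            if span >= self.batch_ticks:
                finished = self.flush()
        self._pending.append(event)
        return finished

    def flush(self):
        """Close the current batch, or return None if nothing is buffered."""
        if not self._pending:
            return None
        batch = {
            "tick_start": self._pending[0]["tick"],
            "tick_end": self._pending[-1]["tick"],
            "counts": dict(Counter(e["kind"] for e in self._pending)),
            "events": self._pending,
        }
        self._pending = []
        return batch


batcher = EventBatcher(batch_ticks=50)
first = batcher.add({"tick": 1, "kind": "death"})
second = batcher.add({"tick": 10, "kind": "predation"})
third = batcher.add({"tick": 60, "kind": "death"})  # closes the first window
tail = batcher.flush()
```

The per-batch `counts` field doubles as the event window count used by Stream 2, so the same pass could feed both streams.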

Output Format

  • JSONL by default (one line per sample, easy to tail/stream)
  • Optional parquet writer for training pipeline ingestion
  • File naming: {run_id}_telemetry.jsonl, {run_id}_events.jsonl
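The naming scheme and append-only JSONL writes could look roughly like this. `telemetry_paths` and `append_jsonl` are hypothetical helpers, and the sample rows are made up for the demonstration.

```python
import json
import tempfile
from pathlib import Path


def telemetry_paths(out_dir, run_id):
    """Return (telemetry, events) paths following the naming scheme above."""
    out = Path(out_dir)
    return out / f"{run_id}_telemetry.jsonl", out / f"{run_id}_events.jsonl"


def append_jsonl(path, records):
    """Append each record as one compact JSON line."""
    with open(path, "a", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(record, separators=(",", ":")) + "\n")


with tempfile.TemporaryDirectory() as tmp:
    telemetry_path, events_path = telemetry_paths(tmp, "demo-run")
    append_jsonl(telemetry_path, [{"tick": 50, "population": 45}])
    append_jsonl(telemetry_path, [{"tick": 100, "population": 47}])
    text = telemetry_path.read_text(encoding="utf-8")
    rows = [json.loads(line) for line in text.splitlines()]
    telemetry_name = telemetry_path.name
    events_name = events_path.name
```

Opening in append mode keeps each sample on its own line, so a partially written run can still be tailed or parsed line by line.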

Module Structure

server/ecosim/telemetry.py             # TelemetryEmitter class
server/ecosim/telemetry_aggregates.py  # aggregation functions (per-species, voxel stats)
scripts/run_batch.py                   # batch runner: Sobol sampling → N runs → collect telemetry
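To illustrate the sampling step of the batch runner without adding a dependency, the sketch below uses a Halton sequence as a stdlib stand-in. Halton is a different low-discrepancy sequence from Sobol; a real Sobol sampler is available as `scipy.stats.qmc.Sobol` if SciPy is acceptable for the script. `radical_inverse`, `sample_configs`, and the parameter names are hypothetical.

```python
PRIMES = (2, 3, 5, 7, 11, 13)  # one base per sampled dimension


def radical_inverse(index: int, base: int) -> float:
    """Mirror the base-`base` digits of index around the radix point."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result


def sample_configs(param_ranges: dict, n_runs: int) -> list:
    """Map Halton points in [0, 1)^d onto the given (low, high) ranges."""
    names = sorted(param_ranges)
    if len(names) > len(PRIMES):
        raise ValueError("add more prime bases for this many parameters")
    configs = []
    for i in range(1, n_runs + 1):  # skip index 0, which maps to all lows
        config = {}
        for name, base in zip(names, PRIMES):
            low, high = param_ranges[name]
            config[name] = low + (high - low) * radical_inverse(i, base)
        configs.append(config)
    return configs


# Hypothetical parameter ranges, for illustration only.
configs = sample_configs({"growth_rate": (0.0, 1.0), "rain": (10.0, 20.0)}, 3)
```

Each returned dict would become the JSON override set for one run, and the run's config snapshot records the merged result.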

Design Constraints

  • <5% overhead on tick loop (aggregate in-place during existing phases, don't add a new phase)
  • No external dependencies for the emitter itself (stdlib only — numpy/pandas optional for parquet output)
  • Configurable via world JSON or engine constructor arg
  • Works with existing WebSocket tick packets (telemetry is additive, not replacing anything)
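The overhead budget can be checked by timing the same tick loop with and without telemetry. The sketch below is one possible harness, with stand-in tick functions in place of the real engine; `time_ticks` and `relative_overhead` are hypothetical names. Taking the minimum over several repeats reduces noise from other processes.

```python
import time


def time_ticks(tick_fn, n_ticks: int) -> float:
    """Wall-clock seconds to run tick_fn for n_ticks ticks."""
    start = time.perf_counter()
    for tick in range(n_ticks):
        tick_fn(tick)
    return time.perf_counter() - start


def relative_overhead(baseline_fn, instrumented_fn, n_ticks=2000, repeats=5):
    """(instrumented - baseline) / baseline, using best-of-N timings."""
    base = min(time_ticks(baseline_fn, n_ticks) for _ in range(repeats))
    inst = min(time_ticks(instrumented_fn, n_ticks) for _ in range(repeats))
    return (inst - base) / base


def baseline_tick(tick):
    # Stand-in for the real tick work.
    sum(i * i for i in range(200))


counter = {"samples": 0}


def instrumented_tick(tick):
    baseline_tick(tick)
    if tick % 50 == 0:  # sample every K = 50 ticks
        counter["samples"] += 1


overhead = relative_overhead(baseline_tick, instrumented_tick,
                             n_ticks=500, repeats=3)
within_budget = overhead < 0.05  # the <5% target from this issue
```

On a noisy machine the measured value can fluctuate or even come out slightly negative, so a benchmark in CI would likely need longer runs and a tolerance.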

Acceptance Criteria

  • TelemetryEmitter class with on_init(), on_tick(), flush() API
  • Config snapshot emitted at engine init
  • Per-species state var aggregates every K ticks
  • Voxel layer statistics (mean/min/max/var) every K ticks
  • Event window counts every K ticks
  • JSONL output with run_id prefix
  • scripts/run_batch.py runs N configs and produces telemetry files
  • Benchmark: tick loop overhead <5% with telemetry enabled
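A skeleton of the `on_init()` / `on_tick()` / `flush()` surface might look like the following. Only the three method names come from this issue; the constructor arguments, the `collect` callback, and the `type` field on each record are assumptions made for illustration.

```python
import io
import json


class TelemetryEmitter:
    """Buffers telemetry records and writes them as JSONL on flush()."""

    def __init__(self, run_id: str, sink, sample_interval: int = 50):
        self.run_id = run_id
        self.sink = sink  # any object with a write(str) method
        self.sample_interval = sample_interval
        self._buffer = []

    def on_init(self, config_snapshot: dict) -> None:
        """Record the once-per-run config snapshot."""
        record = {"type": "config", "run_id": self.run_id, **config_snapshot}
        self._buffer.append(record)

    def on_tick(self, tick: int, collect) -> None:
        """Every K ticks, call collect() and buffer the aggregates it returns."""
        if tick % self.sample_interval == 0:
            record = {"type": "sample", "run_id": self.run_id,
                      "tick": tick, **collect()}
            self._buffer.append(record)

    def flush(self) -> int:
        """Write buffered records to the sink; return how many were written."""
        for record in self._buffer:
            self.sink.write(json.dumps(record) + "\n")
        written = len(self._buffer)
        self._buffer.clear()
        return written


sink = io.StringIO()
emitter = TelemetryEmitter("demo-run", sink, sample_interval=50)
emitter.on_init({"seed": 42})
for tick in range(1, 101):
    emitter.on_tick(tick, lambda: {"population": 45})
written = emitter.flush()
records = [json.loads(line) for line in sink.getvalue().splitlines()]
```

Passing a callback to `on_tick` means the aggregates are only materialized on sampling ticks, so the other ticks pay for a single modulo check.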

Related

  • Docs: docs/ECOSIM_PARAMETER_TELEMETRY_SPACE.md (Phase 1)
  • Feeds into: surrogate model training, sensitivity analysis, learned diffusion data collection

Metadata

Assignees

No one assigned

Labels

shelved: Deferred until after scalability work
