Telemetry emitter: config snapshot + time-series aggregates + event batching #77

## Goal

Build the telemetry data collection layer that all downstream ML use cases share. The engine should emit three streams with <5% tick-loop overhead.

## Three Streams

### Stream 1: Config Snapshot (once per run)

```json
{
  "run_id": "...",
  "biome": "TEMPERATE",
  "sim_config_merged": { ... },
  "world_rates": { ... },
  "species_count": 12,
  "entity_count_initial": 45,
  "grid_dimensions": [32, 32, 32],
  "seed": 42
}
```

Emitted at engine init. Captures the full merged config (defaults + JSON overrides) so every telemetry row is traceable back to its parameter set.
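A minimal sketch of how the snapshot could be assembled, assuming the merge is a recursive dict merge in which JSON overrides win over defaults. The helper names `deep_merge` and `build_config_snapshot` and all example values are hypothetical, not the engine's actual API.

```python
import copy
import json


def deep_merge(defaults: dict, overrides: dict) -> dict:
    """Return a new dict: defaults with overrides applied recursively."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def build_config_snapshot(run_id: str, defaults: dict, overrides: dict,
                          **run_fields) -> dict:
    """Assemble the once-per-run record; run_fields carries biome, seed, etc."""
    snapshot = {"run_id": run_id,
                "sim_config_merged": deep_merge(defaults, overrides)}
    snapshot.update(run_fields)
    return snapshot


# Hypothetical parameter names and values, for illustration only.
defaults = {"metabolism": {"hunger_rate": 0.01, "thirst_rate": 0.02}}
overrides = {"metabolism": {"hunger_rate": 0.05}}

snapshot = build_config_snapshot(
    "run-0001", defaults, overrides,
    biome="TEMPERATE", grid_dimensions=[32, 32, 32], seed=42,
)
snapshot_line = json.dumps(snapshot, sort_keys=True)  # one JSONL row
```

Copying the defaults before merging keeps the engine's default table untouched, so repeated runs in one process start from the same baseline.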

### Stream 2: Time-Series Aggregates (every K ticks)

Do not emit per-entity rows: that costs O(entities × ticks). Instead, aggregate:

- **Per-species:** mean/std of hunger, energy, hydration, health, reproductive_drive, colony_health
- **Voxel grid stats:** mean, min, max, spatial variance per layer (not individual cells)
- **Event counts in windows:** deaths, reproductions, predations, pollinations per K-tick window
- **Population counts** by species and state

The sampling interval K is configurable (default: 50 ticks, i.e. 5 seconds at 10 Hz).
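One way to compute these aggregates without storing per-entity history is a Welford-style running accumulator, sketched below with the standard library only. It assumes entities are exposed as plain dicts with a `species` key and that a voxel layer can be iterated as a flat sequence of cell values; both are illustrative assumptions, as are the names `RunningStats`, `aggregate_species`, and `layer_stats`.

```python
import math

# State variables named in the per-species bullet above.
STATE_VARS = ("hunger", "energy", "hydration", "health",
              "reproductive_drive", "colony_health")


class RunningStats:
    """Welford accumulator: mean/variance/min/max in O(1) memory per stream."""

    __slots__ = ("n", "mean", "_m2", "min", "max")

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean
        self.min = math.inf
        self.max = -math.inf

    def add(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)
        self.min = min(self.min, x)
        self.max = max(self.max, x)

    def summary(self) -> dict:
        if self.n == 0:
            return {"n": 0}  # avoid emitting inf/-inf, which is not valid JSON
        var = self._m2 / self.n  # population variance
        return {"n": self.n, "mean": self.mean, "var": var,
                "std": math.sqrt(var), "min": self.min, "max": self.max}


def aggregate_species(entities) -> dict:
    """Per-species summary of each state variable (entities are dicts here)."""
    acc: dict = {}
    for entity in entities:
        per_var = acc.get(entity["species"])
        if per_var is None:
            per_var = acc[entity["species"]] = {v: RunningStats() for v in STATE_VARS}
        for var in STATE_VARS:
            if var in entity:
                per_var[var].add(entity[var])
    return {sp: {v: s.summary() for v, s in per_var.items()}
            for sp, per_var in acc.items()}


def layer_stats(cells) -> dict:
    """Spatial mean/min/max/variance over one voxel layer's cell values."""
    stats = RunningStats()
    for value in cells:
        stats.add(value)
    return stats.summary()


sample = aggregate_species([
    {"species": "deer", "hunger": 0.2, "energy": 0.9},
    {"species": "deer", "hunger": 0.4, "energy": 0.7},
])
layer = layer_stats([1.0, 2.0, 3.0, 4.0])
```

Because `add()` is a handful of arithmetic operations, it can be called from inside an existing phase that already visits each entity, instead of in a separate pass.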

### Stream 3: Event Log (batched)

The existing `EventRecord` stream is already structured. Batch the records and tag each batch with the tick range it covers, for compact storage.
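A rough sketch of tick-range batching follows. The fields of `EventRecord` are not specified here, so the example stands in plain dicts with hypothetical `tick` and `type` keys, and it assumes windows are aligned to multiples of the window size starting at tick 0.

```python
def batch_events(events, window: int = 50) -> list:
    """Group events into fixed-width tick windows, each tagged with its range."""
    batches: dict = {}
    for ev in events:
        start = (ev["tick"] // window) * window
        batch = batches.get(start)
        if batch is None:
            batch = batches[start] = {
                "tick_start": start,
                "tick_end": start + window - 1,  # inclusive upper bound
                "counts": {},
                "events": [],
            }
        # Per-type counts double as the "event counts in windows" aggregate.
        batch["counts"][ev["type"]] = batch["counts"].get(ev["type"], 0) + 1
        batch["events"].append(ev)
    return [batches[key] for key in sorted(batches)]


# Stand-in records; the real EventRecord shape may differ.
events = [
    {"tick": 3, "type": "death"},
    {"tick": 49, "type": "predation"},
    {"tick": 50, "type": "death"},
]
batches = batch_events(events, window=50)
```

Each batch serializes to one line of `{run_id}_events.jsonl`, and the per-type counts can be copied into the Stream 2 sample for the same window.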

## Output Format

- JSONL by default (one line per sample, easy to tail/stream)
- Optional parquet writer for training pipeline ingestion
- File naming: `{run_id}_telemetry.jsonl`, `{run_id}_events.jsonl`
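The JSONL path and naming scheme above could look roughly like this with the standard library alone. The helper names `telemetry_paths` and `append_jsonl`, the output directory, and the sample record are assumptions for illustration.

```python
import json
import tempfile
from pathlib import Path


def telemetry_paths(out_dir, run_id: str):
    """Return (telemetry_path, events_path) following the naming scheme above."""
    out = Path(out_dir)
    return out / f"{run_id}_telemetry.jsonl", out / f"{run_id}_events.jsonl"


def append_jsonl(path: Path, records) -> None:
    """Append one compact JSON object per line so the file can be tailed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(record, separators=(",", ":")) + "\n")


# Demo against a temporary directory; a real run would pass its own out_dir.
with tempfile.TemporaryDirectory() as tmp:
    telemetry_path, events_path = telemetry_paths(tmp, "run-0001")
    append_jsonl(telemetry_path, [{"tick": 50, "population": {"deer": 12}}])
    lines = telemetry_path.read_text(encoding="utf-8").splitlines()
```

Opening in append mode means a crashed run still leaves every previously flushed sample readable.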

## Module Structure

```
server/ecosim/telemetry.py          # TelemetryEmitter class
server/ecosim/telemetry_aggregates.py  # aggregation functions (per-species, voxel stats)
scripts/run_batch.py                # batch runner: Sobol sampling → N runs → collect telemetry
```

## Design Constraints

- <5% overhead on tick loop (aggregate in-place during existing phases, don't add a new phase)
- No external dependencies for the emitter itself (stdlib only; numpy/pandas are optional and needed only for parquet output)
- Configurable via world JSON or engine constructor arg
- Works alongside the existing WebSocket tick packets (telemetry is additive and replaces nothing)
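To check the overhead constraint, one possible harness times the same loop with and without telemetry and reports the relative difference. This is a sketch under assumptions: `measure_overhead` and the two stand-in tick functions are hypothetical, and a real benchmark would drive the engine's actual tick loop.

```python
import time


def measure_overhead(tick, tick_with_telemetry,
                     n_ticks: int = 2000, repeats: int = 5) -> float:
    """Return (t_with - t_base) / t_base using the best of several repeats."""

    def best_time(fn) -> float:
        timings = []
        for _ in range(repeats):
            start = time.perf_counter()
            for i in range(n_ticks):
                fn(i)
            timings.append(time.perf_counter() - start)
        return min(timings)  # min is the least noise-sensitive estimate

    base = best_time(tick)
    with_telemetry = best_time(tick_with_telemetry)
    return (with_telemetry - base) / base


# Stand-in workloads; replace with the real engine tick.
def fake_tick(i: int) -> int:
    return sum(range(200))


sampled = []


def fake_tick_with_telemetry(i: int) -> int:
    result = fake_tick(i)
    if i % 50 == 0:  # sample every K = 50 ticks
        sampled.append(i)
    return result


overhead = measure_overhead(fake_tick, fake_tick_with_telemetry)
```

The returned fraction can then be compared against the 0.05 budget; timings vary by machine, so the comparison is best run several times on representative hardware.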

## Acceptance Criteria

- [ ] `TelemetryEmitter` class with `on_init()`, `on_tick()`, `flush()` API
- [ ] Config snapshot emitted at engine init
- [ ] Per-species state var aggregates every K ticks
- [ ] Voxel layer statistics (mean/min/max/var) every K ticks
- [ ] Event window counts every K ticks
- [ ] JSONL output with run_id prefix
- [ ] `scripts/run_batch.py` runs N configs and produces telemetry files
- [ ] Benchmark: tick loop overhead <5% with telemetry enabled
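As a starting point for the first criterion, the `on_init()` / `on_tick()` / `flush()` surface might be shaped as follows. Only the three method names come from this issue; the constructor arguments, the `collect` callback, and the `kind` field are assumptions made for the sketch.

```python
import json
import tempfile
from pathlib import Path


class TelemetryEmitter:
    """Buffers telemetry rows in memory and appends them to a JSONL file."""

    def __init__(self, run_id: str, out_dir, sample_interval: int = 50) -> None:
        self.run_id = run_id
        self.sample_interval = sample_interval
        self._path = Path(out_dir) / f"{run_id}_telemetry.jsonl"
        self._buffer: list = []

    def on_init(self, config_snapshot: dict) -> None:
        """Write the once-per-run config snapshot immediately."""
        self._buffer.append({"kind": "config", "run_id": self.run_id,
                             **config_snapshot})
        self.flush()

    def on_tick(self, tick: int, collect) -> None:
        """Call collect() only on sampled ticks so other ticks cost one modulo."""
        if tick % self.sample_interval != 0:
            return
        self._buffer.append({"kind": "sample", "run_id": self.run_id,
                             "tick": tick, **collect()})

    def flush(self) -> None:
        """Append buffered rows to disk and clear the buffer."""
        if not self._buffer:
            return
        self._path.parent.mkdir(parents=True, exist_ok=True)
        with self._path.open("a", encoding="utf-8") as fh:
            for row in self._buffer:
                fh.write(json.dumps(row) + "\n")
        self._buffer.clear()


# Demo: 100 ticks with K = 50 yields one config row and two samples.
with tempfile.TemporaryDirectory() as tmp:
    emitter = TelemetryEmitter("run-0001", tmp, sample_interval=50)
    emitter.on_init({"seed": 42, "biome": "TEMPERATE"})
    for tick in range(1, 101):
        emitter.on_tick(tick, lambda: {"population": {"deer": 12}})
    emitter.flush()
    text = (Path(tmp) / "run-0001_telemetry.jsonl").read_text(encoding="utf-8")
    rows = [json.loads(line) for line in text.splitlines()]
```

Passing a callback instead of precomputed aggregates keeps the aggregation work off the non-sampled ticks, which is one way to stay inside the overhead budget.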

## Related

- Docs: `docs/ECOSIM_PARAMETER_TELEMETRY_SPACE.md` (Phase 1)
- Feeds into: surrogate model training, sensitivity analysis, learned diffusion data collection
