In experimental data acquisition, we often need to batch together many measurements/"shots" taken at the same setpoints (inputs). Duplicating their general metadata for every shot is cumbersome, and reading such files during analysis becomes highly non-contiguous.
## Meshes
In the Hardware Aware AI (HAAI) beamline data acquisition schema draft by @ericcropp, SLAC started adding extra dimensions to the records (variables) inside a snapshot to capture multiple shots.
Example: a 2D quad scan (to measure 4D emittance) prepends two extra "batch" dimensions, as the slowest-varying indices, to each data record's feature dimensions.
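A minimal sketch of this layout, with hypothetical shapes and names that are not taken from the actual HAAI schema draft:

```python
import numpy as np

# Hypothetical example: a beam image record with feature dimensions
# (ny, nx), scanned over a 5 x 7 grid of quad setpoints.
n_k1, n_k2 = 5, 7    # the two "batch" dimensions (slowest-varying indices)
ny, nx = 480, 640    # feature dimensions of a single shot

# The batch dimensions are prepended, so one record in the snapshot
# holds all shots of the scan:
images = np.zeros((n_k1, n_k2, ny, nx), dtype=np.float32)

# A single shot at setpoint (i, j) is then the contiguous slice:
i, j = 2, 4
shot = images[i, j]  # shape (ny, nx)
```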
## Particles
A natural way to extend openPMD dataframes / tables is to add an extra column (1D record) for every batch dimension and to concatenate the data of all batches.
The new columns then identify which particle (row) belongs to which batch.
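A sketch of this table extension for a single batch dimension, with assumed record names (`x`, `px`, `batch_index`) chosen purely for illustration:

```python
import numpy as np

# Hypothetical particle batches: each entry is one shot's particle table.
batches = [
    {"x": np.random.rand(1000), "px": np.random.rand(1000)},
    {"x": np.random.rand(1200), "px": np.random.rand(1200)},
]

# Concatenate the data of all batches and add an extra 1D record
# ("batch_index") identifying which batch each particle (row) belongs to.
x = np.concatenate([b["x"] for b in batches])
px = np.concatenate([b["px"] for b in batches])
batch_index = np.concatenate(
    [np.full(len(b["x"]), i, dtype=np.int64) for i, b in enumerate(batches)]
)

# Selecting all particles of a given batch is then a simple mask:
mask = batch_index == 1
x_batch1 = x[mask]
```

For a multi-dimensional batch such as the 2D quad scan above, one such column would be added per batch dimension.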
## Possible Complication
As a real-life complication, not every "diagnostic"/"shot" has the same output frequency: in a batch of 100 shots, some records contribute only every N-th time, and they might not all start at exactly the same time.
Possible approach: as long as the batch can be time-aligned in some way and the frequency of contributions to records is stable within the batch, one can:
- use a global shot (e.g., the first common shot) as the openPMD snapshot number, e.g., 100
- add an attribute stating the maximum number of entries in the batch, e.g., "50 snapshots in this batch"
- store an attribute similar to a NumPy slice / interval on each record to describe its contribution slicing, e.g., "105:121:5": this record has global snapshots 105, 110, 115, and 120 inside the 100-150 shot batch interval (see the sketch below)
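A small sketch of how a reader could expand such a slice attribute into the global snapshot numbers a record contributes to; the helper name is hypothetical:

```python
import numpy as np

def slice_to_snapshots(spec: str) -> np.ndarray:
    """Expand a 'start:stop:step' contribution-slice attribute
    (e.g., '105:121:5') into the global snapshot numbers it covers."""
    start, stop, step = (int(s) for s in spec.split(":"))
    return np.arange(start, stop, step)

# The record with attribute "105:121:5" contributes to global
# snapshots 105, 110, 115, and 120 of the 100-150 batch interval:
print(slice_to_snapshots("105:121:5"))  # [105 110 115 120]
```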
Discussed on March 4, 2026: This feature did not resonate well in the HAAI meeting, as it considerably complicates the logic for both human and machine reading. We would limit a batching implementation to equal contributions.