Hi, thanks for releasing the preprocessed DreamZero-DROID dataset and the conversion script.
I am trying to align external per-frame / per-clip annotations derived from raw DROID 1.0.1 with the released GEAR-Dreams/DreamZero-DROID-Data LeRobot dataset.
From the docs, I understand that the released dataset is generated from DROID 1.0.1 by:
- applying droid_sample_ranges_v1_0_1.json to remove idle frames,
- filtering failed episodes,
- filtering episodes without language annotations,
- reindexing the remaining episodes into LeRobot format.
I have two questions:
- Episode-level mapping:
Is there a released mapping from each processed LeRobot episode, e.g.
data/chunk-XXX/episode_YYYYYY.parquet / videos/.../episode_YYYYYY.mp4,
back to the original DROID trajectory path, e.g.
gs://xembodiment_data/r2d2/r2d2-data-full/.../trajectory.h5?
If not, is the intended way to reproduce this mapping by rerunning scripts/data/convert_droid.py on raw DROID 1.0.1 and reconstructing the kept_registry?
- Frame-level mapping:
For a processed episode, are the video frame indices exactly the concatenation of the raw DROID frame ranges in droid_sample_ranges_v1_0_1.json?
For example, if:
keep_ranges = [[0, 24], [31, 171]],
should processed video frame indices map as:
- processed frames 0..23 -> raw frames 0..23
- processed frames 24..163 -> raw frames 31..170
In other words, is the released video frame index always the idle-filtered raw DROID frame index with the gaps removed?
This mapping would be very useful for aligning external annotations such as PointWorld-DROID flow clips, whose HDF5 groups are named like {start}:{end}.
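For concreteness, here is a minimal sketch of the frame mapping I am currently assuming. It treats each entry of keep_ranges as a half-open [start, end) raw frame range and assumes the kept frames are concatenated in order with nothing else dropped or resampled. Both of those are my assumptions, not something I have confirmed from convert_droid.py, and the function names are my own.

```python
from typing import Dict, List, Sequence


def processed_to_raw(keep_ranges: Sequence[Sequence[int]]) -> List[int]:
    """Return a list where index = processed frame and value = raw DROID frame.

    Assumes each range is half-open [start, end) and that the ranges are
    concatenated in the order given (an assumption, not confirmed behavior).
    """
    raw_indices: List[int] = []
    for start, end in keep_ranges:
        raw_indices.extend(range(start, end))
    return raw_indices


def raw_to_processed(keep_ranges: Sequence[Sequence[int]]) -> Dict[int, int]:
    """Invert the mapping; raw frames inside removed gaps are simply absent."""
    return {raw: proc for proc, raw in enumerate(processed_to_raw(keep_ranges))}


keep_ranges = [[0, 24], [31, 171]]
forward = processed_to_raw(keep_ranges)
inverse = raw_to_processed(keep_ranges)

print(forward[23], forward[24], forward[163])  # raw frames for processed 23, 24, 163
print(inverse.get(31), inverse.get(27))        # raw 27 falls in the removed gap
```

If this matches the actual conversion, a raw-indexed clip could be moved into processed indices by looking up its raw frames in the inverse mapping and discarding any that fall in a removed gap.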
Thanks!