Add source_video link field to PoseEstimation #56

Open
h-mayorquin wants to merge 4 commits into rly:main from h-mayorquin:create_formal_links

@h-mayorquin
Contributor

Adds an optional source_video link to ImageSeries on PoseEstimation, consistent with how TrainingFrame already links to its source video. This provides a formal NWB reference to the source video instead of relying on the string paths in original_videos, which can become stale or break when files are moved (e.g., dandi/dandi-cli#1817).

Closes #12 and supersedes #13 with a simpler approach that requires no custom IO mapper.

To discuss: the current approach keeps original_videos for backwards compatibility, but I am unsure whether we should remove it and bump the schema source. I think we can keep it here and decide later.
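
The staleness problem described above can be illustrated with a toy model. This is only a sketch with plain dataclasses: `VideoSeries` and `ToyPoseEstimation` are hypothetical stand-ins, not the pynwb `ImageSeries` or ndx-pose `PoseEstimation` classes, and the file paths are made up. It assumes that whatever relocates a video updates the path stored on the video object itself.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VideoSeries:
    """Hypothetical stand-in for an ImageSeries that points at a video file."""
    name: str
    external_file: str


@dataclass
class ToyPoseEstimation:
    """Hypothetical stand-in for PoseEstimation; not the ndx-pose class."""
    # Path strings copied at creation time (analogous to original_videos).
    original_videos: List[str] = field(default_factory=list)
    # Reference to the video object itself (analogous to source_video).
    source_video: Optional[VideoSeries] = None


video = VideoSeries(name="behavior_video", external_file="raw/camera1.mp4")
pose = ToyPoseEstimation(
    original_videos=[video.external_file],
    source_video=video,
)

# Simulate a tool that moves the video and updates the one place it knows about.
video.external_file = "sub-01/sub-01_image/camera1.mp4"

stale_path = pose.original_videos[0]            # still the old string
linked_path = pose.source_video.external_file   # follows the update
print(stale_path)
print(linked_path)
```

The copied string keeps the old location, while the reference resolves to the updated one, which is the behavior the link field is meant to provide.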

@h-mayorquin
Contributor Author

We are discussing whether this should be source_videos.

original_videos is plural because the container is meant to support multiple cameras, as in DANNCE:

https://github.com/spoonsso/dannce

@h-mayorquin
Contributor Author

Add labeled_video as well for consistency.

@rly
Owner

rly commented Apr 24, 2026

@h-mayorquin and I discussed this over zoom. PoseEstimation does not actually work well for the multi-camera use case. PoseEstimationSeries stores pose estimate time series for a single keypoint for a single camera (coordinates are noted as pixel-based). There is no defined way to store the pose estimates in a global / experimental space, e.g., 3D coordinates after triangulation from multiple cameras. This should probably be stored in a different way, or at least labeled differently. So we were thinking of keeping PoseEstimation for the single camera case, which is the most common use case that I am aware of. Separately, we can add a MultiCameraPoseEstimation object to handle storage of multiple camera videos, linking between PoseEstimationSeries and camera / camera video, and storage of the 3D pose estimates in a global space. It could also store calibration information and positions of the cameras relative to one another. This should be driven by actual multi-camera use cases like DANNCE.

@h-mayorquin
Contributor Author

Thanks, @rly. I have now added the requested parameter and the changelog entry.

@CodyCBakerPhD

> So we were thinking of keeping PoseEstimation for the single camera case, which is the most common use case that I am aware of. Separately, we can add a MultiCameraPoseEstimation object to handle storage of multiple camera videos, linking between PoseEstimationSeries and camera / camera video, and storage of the 3D pose estimates in a global space.

Sounds good to me!

alessandratrapani added a commit to alessandratrapani/ndx-pose that referenced this pull request May 19, 2026
…` neurodata types

## Add `MultiCameraPoseEstimation`, `CameraView`, and `CameraCalibration` neurodata types

### Motivation

`PoseEstimation` is designed for single-camera, pixel-space pose data (e.g. DeepLabCut).
It does not model multi-camera 3D setups well: there is no place to store calibration
parameters, the relationship between cameras and their videos is implicit (order-based
string paths), and 3D world-space estimates have no structural separation from 2D pixel
estimates.

This PR adds three new neurodata types to handle multi-camera 3D pose estimation
(DANNCE, Anipose, etc.).

---

### New types

#### `CameraCalibration` (`NWBDataInterface`)

Stores intrinsic and extrinsic calibration parameters for a set of cameras. Each row of
every dataset corresponds to one camera, in the same order as the linked `Device` objects.

| Field | Required | Shape | Description |
|-------|----------|-------|-------------|
| `intrinsic_matrix` | yes | `(n_cameras, 3, 3)` | Camera matrix K |
| `rotation_matrix` | no | `(n_cameras, 3, 3)` | Rotation from world to camera frame |
| `translation_vector` | no | `(n_cameras, 3)` | Translation from world to camera frame |
| `distortion_coefficients` | no | `(n_cameras, N)` | Lens distortion coefficients |
| `devices` | no | — | Links to `Device` objects, one per camera (must already be in the NWBFile) |
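
As a rough illustration of how these fields relate, the sketch below projects one world-space point into pixel coordinates for a single camera. It assumes the common pinhole convention `x_cam = R @ x_world + t` followed by `K @ x_cam` and division by depth, which matches the "world to camera frame" wording above but is not spelled out in the spec. Lens distortion is ignored, and `project_point` and all numeric values are hypothetical.

```python
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def mat_vec(m: Matrix, v: Sequence[float]) -> List[float]:
    """Multiply a matrix (sequence of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def project_point(
    K: Matrix, R: Matrix, t: Sequence[float], point_world: Sequence[float]
) -> Tuple[float, float]:
    """Project a 3D world point to pixels with an undistorted pinhole model."""
    # World frame -> camera frame (one row of rotation_matrix / translation_vector).
    cam = [c + ti for c, ti in zip(mat_vec(R, point_world), t)]
    if cam[2] <= 0:
        raise ValueError("point is not in front of the camera")
    # Camera frame -> homogeneous pixel coordinates (one row of intrinsic_matrix).
    u, v, w = mat_vec(K, cam)
    return (u / w, v / w)


# Made-up parameters for one camera.
K = [[1000.0, 0.0, 320.0],
     [0.0, 1000.0, 240.0],
     [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 2.0]

pixel = project_point(K, R, t, [0.5, -0.25, 0.0])
print(pixel)  # (570.0, 115.0)
```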

#### `CameraView` (`NWBDataInterface`)

Represents a single camera's contribution to the pose estimation pipeline. Groups a
camera device, its raw video, and optionally its per-camera 2D keypoint estimates in
pixel space.

| Field | Required | Description |
|-------|----------|-------------|
| `device` | yes | Link to a `Device` already in the NWBFile |
| `source_video` | no | Link to an `ImageSeries` stored in acquisition |
| `pose_estimation_series` | no | 2D `PoseEstimationSeries` children (pixel-space estimates for this camera) |

#### `MultiCameraPoseEstimation` (`NWBDataInterface`)

Top-level container for 3D multi-camera pose estimation. Holds 3D world-space keypoints,
one `CameraView` per camera, and optionally a calibration object and skeleton link.

| Field | Required | Description |
|-------|----------|-------------|
| `pose_estimation_series` | no | 3D `PoseEstimationSeries` children in world-space coordinates |
| `camera_views` | no | `CameraView` children, one per camera |
| `calibration` | no | `CameraCalibration` child |
| `skeleton` | no | Link to a `Skeleton` in a `Skeletons` object |
| `description` | no | Description of the pose estimation procedure |
| `scorer` | no | Name of the scorer / algorithm |
| `source_software` | no | Name of the software tool |
| `source_software_version` | no | Version string of the software tool |
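
The containment described by the tables above can be pictured with plain dataclasses. This is a sketch only: the `Toy*` classes are hypothetical, they are not the classes implemented in `pose.py`, and devices and videos are represented by name strings instead of real links. Only a subset of the fields is shown.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ToySeries:
    """Hypothetical stand-in for a PoseEstimationSeries."""
    name: str    # keypoint name
    n_dims: int  # 2 for pixel space, 3 for world space


@dataclass
class ToyCameraView:
    """Hypothetical stand-in for CameraView."""
    device: str                         # required
    source_video: Optional[str] = None  # optional
    pose_estimation_series: List[ToySeries] = field(default_factory=list)


@dataclass
class ToyMultiCameraPoseEstimation:
    """Hypothetical stand-in for MultiCameraPoseEstimation."""
    pose_estimation_series: List[ToySeries] = field(default_factory=list)
    camera_views: Dict[str, ToyCameraView] = field(default_factory=dict)
    scorer: Optional[str] = None


container = ToyMultiCameraPoseEstimation(
    pose_estimation_series=[ToySeries("nose", 3)],  # 3D, world space
    camera_views={
        "camera_left": ToyCameraView("Camera1", "video_left", [ToySeries("nose", 2)]),
        "camera_right": ToyCameraView("Camera2", "video_right", [ToySeries("nose", 2)]),
    },
)

# Everything about one camera is reachable through its named view.
view = container.camera_views["camera_right"]
dims_per_view = {
    name: [s.n_dims for s in v.pose_estimation_series]
    for name, v in container.camera_views.items()
}
print(view.device, view.source_video, dims_per_view)
```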

---

### Design decisions

- **`labeled_video` omitted**: labeled videos (pose overlays) are derived visualization
  artifacts, not raw data. They do not belong in the primary data storage path and are
  therefore excluded from `CameraView`.
- **`CameraView` as a named container per camera**: each camera has its own named group,
  making the device ↔ video ↔ 2D-estimates relationship explicit and structural rather
  than relying on implicit ordering of parallel lists.
- **Links to `ImageSeries` in acquisition**: `CameraView.source_video` follows the same
  pattern introduced in PR rly#56 for `PoseEstimation.source_video` — the `ImageSeries`
  lives in acquisition and is linked from the pose container.
- **Calibration row order matches device link order**: `CameraCalibration` datasets are
  row-indexed to match the order of the linked `Device` list, making the
  camera-to-parameters mapping unambiguous.
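
The row-order convention in the last bullet can be sketched as a lookup keyed by device. The helper `rows_by_device` is hypothetical and not part of this PR; devices are represented by name strings and the translation values are made up.

```python
from typing import Dict, Sequence, TypeVar

Row = TypeVar("Row")


def rows_by_device(devices: Sequence[str], rows: Sequence[Row]) -> Dict[str, Row]:
    """Pair each calibration row with the device at the same position."""
    if len(devices) != len(rows):
        raise ValueError(
            f"expected one row per device, got {len(rows)} rows for {len(devices)} devices"
        )
    if len(set(devices)) != len(devices):
        raise ValueError("device names must be unique")
    return dict(zip(devices, rows))


devices = ["Camera1", "Camera2", "Camera3"]
translation_vector = [
    [0.0, 0.0, 0.0],
    [0.5, 0.0, 0.0],
    [0.0, 0.5, 0.0],
]

translation_by_device = rows_by_device(devices, translation_vector)
print(translation_by_device["Camera2"])  # [0.5, 0.0, 0.0]
```

Checking the lengths up front is one way to surface a mismatch between the device list and a calibration dataset, which an order-based convention would otherwise leave silent.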

---

### Files changed

| File | Change |
|------|--------|
| `spec/ndx-pose.extensions.yaml` | Added `CameraCalibration`, `CameraView`, `MultiCameraPoseEstimation` type definitions |
| `spec/ndx-pose.namespace.yaml` | Version bumped to `0.2.2` |
| `src/pynwb/ndx_pose/pose.py` | Implemented all three classes |
| `src/pynwb/ndx_pose/io/pose.py` | Added `MultiCameraPoseEstimationMap` for `source_software_version` IO mapping |
| `src/pynwb/ndx_pose/__init__.py` | Exported `CameraCalibration`, `CameraView` |
| `src/pynwb/ndx_pose/testing/mock/pose.py` | Added `mock_CameraCalibration`, `mock_CameraView`; rewrote `mock_MultiCameraPoseEstimation` |
| `src/pynwb/tests/unit/test_pose.py` | Added `TestCameraCalibrationConstructor`, `TestCameraViewConstructor`; rewrote `TestMultiCameraPoseEstimationConstructor` |
| `src/pynwb/tests/integration/hdf5/test_pose.py` | Added `TestCameraCalibrationRoundtrip`, `TestCameraViewRoundtrip`; rewrote `TestMultiCameraPoseEstimationRoundtrip` and `TestMultiCameraPoseEstimationRoundtripPyNWB` |

@h-mayorquin
Contributor Author

@rly, maybe we can merge this now that @alessandratrapani is working on #57.

@pauladkisson
Contributor

Would love to see this feature!

The generated spec YAML had the new source_video and labeled_video links
and updated original_videos/labeled_videos docs added by hand, but the
generator was not updated. Add the two NWBLinkSpec entries, update the
dataset docs, and bump the version to 0.2.1 so re-running the generator
reproduces the committed schema.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Successfully merging this pull request may close these issues.

original_videos should be a link
