viz/layers: stereo QuadLayer (paired mailbox + per-eye binding)#534
Draft
farbod-nv wants to merge 4 commits into
QuadLayer::Config gains ``stereo`` and ``stereo_baseline_m``. Stereo
layers allocate two DeviceImages per mailbox slot (left + right) and
expose a two-arg submit:
void submit(left, right, stream = 0);
Both eye copies and the single cuda_done_writing signal are enqueued
on the same CUDA stream, so stream ordering (not a second semaphore)
guarantees that the renderer never sees a half-matched pair.
Strict invariants: submit(src) throws std::logic_error on a stereo
layer, submit(left, right) throws on a mono layer.
record() in kXr binds the left descriptor for view 0 and the right
for view 1. Window / offscreen (single-view) renders left only.
Baseline disparity is applied host-side by translating each eye's
placement position by ±stereo_baseline_m/2 along the placement's
local +x axis before composing the MVP — no shader change.
Mip generation runs on both eyes in the pre-render-pass when stereo
+ generate_mipmaps are both on. Descriptor pool sizing + writes
double when stereo.
Memory: stereo layers allocate 2× per slot (~112 MB at 1080p RGBA8
with 7 slots, vs ~56 MB mono).
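The memory figures can be sanity-checked with a few lines of arithmetic. This is only a back-of-envelope sketch: it assumes tightly packed pixels and ignores the mip chain, alignment and allocator padding, none of which the commit message quantifies.

```python
# Back-of-envelope check of the per-layer mailbox footprint.
# Assumptions (not from the PR): tightly packed pixels, no mip chain,
# no allocator padding or alignment.
WIDTH, HEIGHT = 1920, 1080   # 1080p
BYTES_PER_PIXEL = 4          # RGBA8
SLOTS = 7                    # mailbox slots, as quoted in the commit message


def mailbox_bytes(stereo: bool) -> int:
    """Bytes for all slots; a stereo layer holds two images per slot."""
    images_per_slot = 2 if stereo else 1
    return WIDTH * HEIGHT * BYTES_PER_PIXEL * images_per_slot * SLOTS


mono_mib = mailbox_bytes(stereo=False) / 2**20
stereo_mib = mailbox_bytes(stereo=True) / 2**20
print(f"mono   ~{mono_mib:.1f} MiB")
print(f"stereo ~{stereo_mib:.1f} MiB")
```

Under these assumptions the totals come out at roughly 55 MiB and 111 MiB, in line with the quoted ~56 MB and ~112 MB.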
Tests (catch2, gpu-gated where needed):
- ctor rejects non-finite stereo_baseline_m
- stereo allocates paired DeviceImages for every slot (distinct
vk_image + cuda_array per eye)
- mono device_image_right is null
- mono submit(left, right) throws
- stereo submit(left) throws
- stereo submit lands matching L/R pair in the latest slot
- stereo rapid submits keep every L/R pair atomic (decodes the
per-submit index from the left pixel and proves the right pixel
is the matching one for every written slot — covers torn writes
+ cross-eye aliasing in one shot)
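The idea behind the last test can be illustrated on the host without a GPU. The sketch below is not the Catch2 test itself: the pixel encoding (submit index in RGB, eye tag in alpha) and all helper names are hypothetical, chosen only to show how one decoded pixel per eye can catch both torn writes and cross-eye aliasing.

```python
from typing import Tuple

Pixel = Tuple[int, int, int, int]  # (r, g, b, a), one byte each

# Hypothetical encoding: the submit index lives in the RGB bytes and the
# alpha byte tags the eye. The real test may encode this differently.
LEFT_TAG, RIGHT_TAG = 0, 255


def encode_pixel(index: int, eye_tag: int) -> Pixel:
    """Pack a 24-bit submit index plus an eye tag into one RGBA8 pixel."""
    return ((index >> 16) & 0xFF, (index >> 8) & 0xFF, index & 0xFF, eye_tag)


def decode_index(pixel: Pixel) -> int:
    """Recover the submit index from the RGB bytes."""
    r, g, b, _ = pixel
    return (r << 16) | (g << 8) | b


def pair_is_atomic(left: Pixel, right: Pixel) -> bool:
    """True when both eyes carry the same submit index and the right tags."""
    tags_ok = left[3] == LEFT_TAG and right[3] == RIGHT_TAG
    return tags_ok and decode_index(left) == decode_index(right)
```

A mismatched index indicates a torn pair, while a wrong eye tag indicates that one eye's image was bound or copied in place of the other.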
Pythonization of the stereo path that landed in C++:
QuadLayerConfig
* stereo: bool (default False)
* stereo_baseline_mm: float (default 0.0). Disparity in *millimeters*
rather than meters — typical IPDs / camera baselines are 50–80 mm
and meter values felt off when typed out.
QuadLayer.submit(left, right=None, stream=0)
* One method, accepts a VizBuffer OR any __cuda_array_interface__-
exposing object (CuPy / PyTorch / Numba / numpy on a CUDA pointer)
for each of left / right. Conversion + validation moved into
bindings_helpers.hpp::cuda_array_to_viz_buffer, callable from the
binding lambda twice (once per eye).
* Mono layer + right buffer → RuntimeError("...mono...").
* Stereo layer + missing right → RuntimeError("...stereo...").
* submit_cuda_array removed; in-tree callers
(camera_viz/pipeline/runner.py + viz_init.py + the python_tests
file) updated to the new shape.
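The mono/stereo invariants on submit() can be sketched in pure Python. `FakeQuadLayer` is a hypothetical stand-in, not the real binding, and the error strings are invented for illustration; the commit message only says that the real messages mention "mono" and "stereo".

```python
from typing import Any, Optional, Tuple


class FakeQuadLayer:
    """Hypothetical stand-in that mirrors only the argument checks."""

    def __init__(self, stereo: bool = False) -> None:
        self.stereo = stereo
        self.last_submit: Optional[Tuple[Any, Any, int]] = None

    def submit(self, left: Any, right: Optional[Any] = None, stream: int = 0) -> None:
        # Mono layer must not receive a right-eye buffer.
        if not self.stereo and right is not None:
            raise RuntimeError("mono layer: submit() takes a single buffer")
        # Stereo layer must always receive both eyes.
        if self.stereo and right is None:
            raise RuntimeError("stereo layer: submit() needs left and right")
        self.last_submit = (left, right, stream)


def raises_runtime_error(fn) -> bool:
    """Small helper for the usage lines below."""
    try:
        fn()
    except RuntimeError:
        return True
    return False


mono, stereo = FakeQuadLayer(stereo=False), FakeQuadLayer(stereo=True)
print(raises_runtime_error(lambda: mono.submit("L", "R")))  # mismatch
print(raises_runtime_error(lambda: stereo.submit("L")))     # mismatch
```

Keeping a single method with an optional `right` means callers only branch on whether a right-eye buffer exists, which is how the runner dispatch is described later in this PR.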
Tests added to test_offscreen_session.py:
* test_stereo_quad_layer_round_trip — distinct L/R sources, render
in offscreen mode, readback must show the LEFT buffer (per the
single-view fallback documented on the layer).
* test_stereo_invariants — submit(left, right) on mono and
submit(left) on stereo both raise the right RuntimeError.
* test_stereo_invalid_baseline_rejected — NaN baseline rejected at
add_quad_layer time.
* Module-level skipif gate keeps these no-ops on the GPU-less CI
runners (same as the existing tests in this file).
C++ rename: QuadLayer::Config::stereo_baseline_m → stereo_baseline_mm
and the corresponding C++ test. The host-side translation in record()
multiplies by 0.0005f (×0.5 to halve per eye, ×0.001 to convert mm→m
for the world-meters placement.pose.position).
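As a rough illustration of the host-side translation, the following sketch offsets a placement position by half the baseline along the placement's local +x axis, using the same 0.0005 factor. It is a Python model of the arithmetic, not the C++ in record(). The (x, y, z, w) quaternion layout and the choice of which view gets the negative offset are assumptions; the commit message only states ±baseline/2.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]
Quat = Tuple[float, float, float, float]  # assumed layout: (x, y, z, w), unit length


def rotate(q: Quat, v: Vec3) -> Vec3:
    """Rotate v by unit quaternion q: v' = v + w*t + u x t, with t = 2*(u x v)."""
    x, y, z, w = q
    vx, vy, vz = v
    tx = 2.0 * (y * vz - z * vy)
    ty = 2.0 * (z * vx - x * vz)
    tz = 2.0 * (x * vy - y * vx)
    return (
        vx + w * tx + (y * tz - z * ty),
        vy + w * ty + (z * tx - x * tz),
        vz + w * tz + (x * ty - y * tx),
    )


def eye_position(position: Vec3, orientation: Quat,
                 baseline_mm: float, view: int) -> Vec3:
    """Shift the placement by half the baseline along its local +x axis."""
    half_m = baseline_mm * 0.0005          # x0.5 per eye, x0.001 for mm -> m
    sign = -1.0 if view == 0 else 1.0      # ASSUMPTION: view 0 gets the -x offset
    ax, ay, az = rotate(orientation, (1.0, 0.0, 0.0))
    px, py, pz = position
    return (px + sign * half_m * ax, py + sign * half_m * ay, pz + sign * half_m * az)
```

With an identity orientation and a 64 mm baseline, the two views end up 0.032 m either side of the original position along world x.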
Plumbing for stereo all the way through the local pipeline:
pipeline.Frame
* New ``image_right: Optional[Any]`` field. None on mono frames;
set by stereo sources to the right-eye buffer. The source must
deliver both eyes from the same capture instant; the QuadLayer
mailbox only guarantees that the pair stays atomic on the GPU side.
pipeline.runner
* VizRunner dispatches submit(image, image_right, stream) when
``frame.image_right`` is set, else falls back to submit(image, stream).
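The dispatch rule can be sketched as below. `Frame` here is a trimmed stand-in with only the two image fields, and `RecordingLayer` is a hypothetical test double. The mono branch passes `stream` by keyword because, with the signature submit(left, right=None, stream=0), a positional second argument would bind to `right`.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Tuple


@dataclass
class Frame:
    """Trimmed stand-in for pipeline.Frame (the real class has more fields)."""
    image: Any
    image_right: Optional[Any] = None  # None on mono frames


class RecordingLayer:
    """Hypothetical test double that records submit() calls."""

    def __init__(self) -> None:
        self.calls: List[Tuple[Any, Any, int]] = []

    def submit(self, left: Any, right: Optional[Any] = None, stream: int = 0) -> None:
        self.calls.append((left, right, stream))


def submit_frame(layer: Any, frame: Frame, stream: int = 0) -> None:
    """Route a frame to the stereo or mono form of submit()."""
    if frame.image_right is not None:
        layer.submit(frame.image, frame.image_right, stream)
    else:
        # Keyword form so the stream never lands in the `right` slot.
        layer.submit(frame.image, stream=stream)


layer = RecordingLayer()
submit_frame(layer, Frame(image="L0", image_right="R0"), stream=7)
submit_frame(layer, Frame(image="M0"), stream=7)
print(layer.calls)
```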
sources
* ``SyntheticStereoSource`` — animated test pattern with a
configurable horizontal pixel disparity between the eyes. Lets
camera_viz exercise the stereo path without stereo camera hardware.
* ``PairedFrameSource(name, left, right)`` — generic wrapper that
merges two per-eye FrameSources into one stereo source. Skips the
publish until both eyes are ready, so the layer never sees an
unpaired update.
* ``build_local_camera``: when ``cameras.<cam>.stereo: true``,
wraps the existing per-eye ZED / OAK-D sources in
PairedFrameSource (or directly returns SyntheticStereoSource for
synthetic). v4l2 + stereo combo is rejected — USB UVC is mono.
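The "skip the publish until both eyes are ready" behaviour can be modelled with a small buffer. `PairingBuffer` is a hypothetical illustration, not the real PairedFrameSource. In particular, the policy of letting a newer frame overwrite an unpaired older one for the same eye is an assumption.

```python
from typing import Any, Optional, Tuple


class PairingBuffer:
    """Hypothetical pairing logic: emit a pair only when both eyes are fresh."""

    def __init__(self) -> None:
        self._left: Optional[Any] = None
        self._right: Optional[Any] = None

    def push_left(self, image: Any) -> Optional[Tuple[Any, Any]]:
        self._left = image          # ASSUMPTION: newest frame wins per eye
        return self._take_pair()

    def push_right(self, image: Any) -> Optional[Tuple[Any, Any]]:
        self._right = image
        return self._take_pair()

    def _take_pair(self) -> Optional[Tuple[Any, Any]]:
        # Publish nothing until both eyes have an unconsumed frame.
        if self._left is None or self._right is None:
            return None
        pair = (self._left, self._right)
        self._left = self._right = None   # consume both so no eye is reused
        return pair


buf = PairingBuffer()
print(buf.push_left("L0"))    # no pair yet
print(buf.push_right("R0"))   # both eyes ready, pair is published
```

Clearing both slots after a publish means a stale eye is never re-paired with a newer frame from the other eye.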
camera_viz.py
* SourceEntry promoted to a dataclass carrying (source, placement,
stereo, stereo_baseline_mm). YAML keys:
cameras.<cam>.stereo : producer-side toggle
placements.<cam>.stereo_baseline_mm: render-side disparity (mm)
The layer-builder loop sets QuadLayerConfig.stereo and
.stereo_baseline_mm from the entry.
* RTP entries leave stereo=false for now — paired-stream RTP is C4.
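A minimal sketch of how the two YAML keys might flow into a SourceEntry, working from an already-parsed config dict. The helper `entry_from_config` and the camera name `front` are hypothetical, and the fallback defaults (False and 0.0) are assumed to match the QuadLayerConfig defaults rather than taken from camera_viz.py.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class SourceEntry:
    source: Any
    placement: Any
    stereo: bool = False
    stereo_baseline_mm: float = 0.0


def entry_from_config(cfg: Dict[str, Any], cam: str, source: Any) -> SourceEntry:
    """Hypothetical helper: read producer- and render-side stereo keys."""
    cam_cfg = cfg.get("cameras", {}).get(cam, {})
    placement = cfg.get("placements", {}).get(cam, {})
    return SourceEntry(
        source=source,
        placement=placement,
        stereo=bool(cam_cfg.get("stereo", False)),            # producer-side toggle
        stereo_baseline_mm=float(placement.get("stereo_baseline_mm", 0.0)),
    )


# Equivalent of a parsed YAML file; "front" is a made-up camera name.
cfg = {
    "cameras": {"front": {"stereo": True}},
    "placements": {"front": {"stereo_baseline_mm": 64}},
}
entry = entry_from_config(cfg, "front", source=object())
print(entry.stereo, entry.stereo_baseline_mm)
```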
configs/synthetic_stereo.yaml — hardware-free demo config exercising
the whole stack end to end.
C4 of the stereo plan: stereo cameras over the RTP path send two
independent H.264 streams (rtp.port for left, rtp.port_right for
right). The receiver opens both, pairs them via PairedFrameSource,
and feeds the QuadLayer mailbox — atomicity guarantees come back at
the GPU side, not on the wire. Wire-level drift is acceptable per
the agreed spec.
camera_streamer
* ``_pick_mono_source`` → ``_eye_sources`` returns [src] for mono
cameras, [left, right] for stereo (unwrapped from the
PairedFrameSource that build_local_camera returns).
* ``_build_sender`` → ``_build_senders`` returns a list; one
RtpH264Sender per eye. Supervisor starts both, polls both for
liveness, restarts both as a pair if either drops — keeping eyes
in lockstep across reconnects.
* Requires ``rtp.port_right`` when the camera is stereo. Mono
config is unchanged.
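The paired-restart policy can be sketched as a single supervisor step. `FakeSender` and `supervise_once` are hypothetical names for illustration; the real RtpH264Sender interface is not shown in this PR description.

```python
from typing import List


class FakeSender:
    """Hypothetical stand-in for one per-eye RTP sender."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.alive = False
        self.starts = 0

    def start(self) -> None:
        self.alive = True
        self.starts += 1

    def stop(self) -> None:
        self.alive = False

    def is_alive(self) -> bool:
        return self.alive


def supervise_once(senders: List[FakeSender]) -> bool:
    """Restart every sender if any one has died; return True on restart."""
    if all(s.is_alive() for s in senders):
        return False
    # Stop the survivors too, so both eyes come back up together.
    for s in senders:
        s.stop()
    for s in senders:
        s.start()
    return True


left, right = FakeSender("cam.left"), FakeSender("cam.right")
for s in (left, right):
    s.start()
right.alive = False                      # simulate one eye dropping
print(supervise_once([left, right]))     # both senders are restarted
```

A mono camera is just the one-element case of the same list, which is presumably why the builders return lists for both modes.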
camera_viz._build_rtp_entries
* Stereo cameras open two RtpH264Source instances (one per port),
named "<cam>.left" / "<cam>.right", and wrap them in
PairedFrameSource. SourceEntry carries stereo + baseline through
to the QuadLayerConfig the same way the local path does.
* Mono RTP cameras unchanged.
sources.PairedFrameSource
* Expose ``.left`` / ``.right`` properties so camera_streamer can
fan out to per-eye senders without re-opening the camera.