diff --git a/docs/source/getting_started/quick_start.rst b/docs/source/getting_started/quick_start.rst index d9c3bed4d..99f746ef8 100644 --- a/docs/source/getting_started/quick_start.rst +++ b/docs/source/getting_started/quick_start.rst @@ -343,6 +343,8 @@ More Information - :doc:`teleop_session` — learn how ``TeleopSession`` works and how to build custom retargeting pipelines +- :doc:`televiz` — visualize robot camera and sensor feeds in an XR headset with + the Televiz compositor (and share a single XR session with your teleop pipeline) - :doc:`build_from_source/index` — build the C++ core, Python bindings, and plugins from source diff --git a/docs/source/getting_started/televiz.rst b/docs/source/getting_started/televiz.rst new file mode 100644 index 000000000..95080b014 --- /dev/null +++ b/docs/source/getting_started/televiz.rst @@ -0,0 +1,412 @@ +.. SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +.. SPDX-License-Identifier: Apache-2.0 + +Televiz +======= + +Televiz (``isaacteleop.viz``) is a lightweight compositor for Isaac Teleop. It composites camera and +sensor feeds — with 3D rendered content coming soon — into an XR headset, a desktop window, or an +offscreen buffer, integrating directly with the device-tracking and retargeting pipeline. + +It is a **compositor**, not a capture or streaming layer: it consumes GPU frames and assembles them +into a final image. Camera capture, decode, and network transport live in the application (see +:doc:`/references/camera_streaming`). + +The compositor is implemented in C++ (``namespace viz``, built on Vulkan + OpenXR + CUDA with no +external rendering-framework dependency) and exposed through a pybind11 binding. This page uses the +Python API, which mirrors the C++ names one-to-one — see `C++ API`_ to link against the library +directly. + +.. contents:: On this page + :local: + :depth: 2 + +Overview +-------- + +The central object is :code-file:`VizSession `, +which owns the Vulkan context, the display target, the OpenXR session (in XR mode), and a registry +of **layers**. Content producers submit GPU buffers to layers; the session composites every layer +into one frame each time you call ``render()``. + +The built-in layer type today is +:code-file:`QuadLayer ` — a CUDA-fed 2D texture +plane (mono or stereo), optionally placed in 3D space. Use it for camera feeds. + +.. note:: + + **Coming soon:** ``ProjectionLayer``, a full-view stereo RGBD layer for external renderers + (gsplat, nvblox, neural reconstruction) that produce per-view ``(color, depth)`` buffers, + Z-composited with quads. It is not yet available in this release — see `ProjectionLayer + (coming soon)`_ below. + +All symbols are imported from the top-level module:: + + import isaacteleop.viz as televiz + +Display modes +------------- + +A session runs in exactly one display mode, set on the config: + +.. list-table:: + :header-rows: 1 + :widths: 20 80 + + * - ``DisplayMode`` + - Behavior + * - ``kXr`` + - OpenXR + Vulkan. Per-eye swapchains, stereo rendering, depth composition layer. Requires a + running OpenXR runtime (e.g. CloudXR). + * - ``kWindow`` + - GLFW desktop window. Layers are aspect-fit tiled; stereo layers show the left eye. + * - ``kOffscreen`` + - No display. Composite to an internal target and pull pixels back with + ``readback_to_host()``. Useful for tests and headless rendering. + +Quick start +----------- + +A minimal offscreen render-and-readback (no GPU display, no headset): + +.. code-block:: python + + import cupy as cp + import isaacteleop.viz as televiz + + cfg = televiz.VizSessionConfig() + cfg.mode = televiz.DisplayMode.kOffscreen + cfg.window_width = 1024 + cfg.window_height = 1024 + + session = televiz.VizSession.create(cfg) + + layer_cfg = televiz.QuadLayerConfig() + layer_cfg.name = "cam" + layer_cfg.resolution = televiz.Resolution(1024, 1024) + layer = session.add_quad_layer(layer_cfg) + + # Any __cuda_array_interface__ array (CuPy / PyTorch / Numba) or a VizBuffer. + frame = cp.zeros((1024, 1024, 4), dtype=cp.uint8) # RGBA8 + layer.submit(frame) + + info = session.render() # wait + composite + present + img = session.readback_to_host() # HostImage; numpy.asarray(img) for pixels + + session.destroy() + +For a window or headset, set ``mode`` to ``DisplayMode.kWindow`` or ``DisplayMode.kXr`` instead. The +layer setup is identical; you just drive a frame loop (see `Frame loop`_) rather than the one-shot +``readback_to_host()``, which is offscreen-only. + +Session configuration +--------------------- + +``VizSessionConfig`` fields: + +.. list-table:: + :header-rows: 1 + :widths: 28 16 56 + + * - Field + - Default + - Description + * - ``mode`` + - — + - ``DisplayMode.kXr`` / ``kWindow`` / ``kOffscreen``. Required. + * - ``window_width`` / ``window_height`` + - — + - Render size for window and offscreen modes. Ignored in XR (the runtime dictates per-eye + resolution; query it with ``get_recommended_resolution()``). + * - ``app_name`` + - ``"televiz"`` + - OpenXR application name. + * - ``required_extensions`` + - ``[]`` + - Extra OpenXR instance extensions to enable when Televiz hosts the session and downstream + components (e.g. ``TeleopSession`` trackers) need them. Televiz already enables its own + rendering extensions. See `Sharing the XR session`_. + * - ``xr_near_z`` / ``xr_far_z`` + - — + - Near / far planes for the XR projection. + * - ``xr_system_wait_seconds`` + - — + - How long to wait for the OpenXR system (headset) to become available at create time. + * - ``clear_color`` + - — + - Background color as an ``(r, g, b, a)`` sequence in ``[0, 1]``. + * - ``gpu_timing`` + - — + - Enable GPU timestamp queries, surfaced via ``get_gpu_timing()``. + +Construct the session with the factory; never call the class directly: + +.. code-block:: python + + session = televiz.VizSession.create(cfg) + +Layers +------ + +Layers render in **insertion order** — the first added renders first (underneath). A layer is owned +by the session; ``add_quad_layer`` returns a **non-owning** handle, so don't keep it past the +session's lifetime. + +QuadLayer +^^^^^^^^^ + +A 2D plane fed by a CUDA buffer. Configure it with ``QuadLayerConfig``: + +.. list-table:: + :header-rows: 1 + :widths: 26 14 60 + + * - Field + - Default + - Description + * - ``name`` + - — + - Layer name (used as the placement key in app config). + * - ``resolution`` + - — + - Source texture size, a ``Resolution``. Submitted buffers must match it. + * - ``format`` + - — + - ``PixelFormat`` of the source (typically ``kRGBA8``). + * - ``placement`` + - — + - Optional ``QuadLayerPlacement`` (``pose`` + ``size_meters``) for 3D placement in XR. + * - ``stereo`` + - ``False`` + - Per-eye stereo. When ``True``, ``submit`` requires both eyes' buffers; view 0 (left) samples + the left buffer, view 1 (right) the right. Memory doubles. + * - ``stereo_baseline_mm`` + - ``0`` + - Horizontal disparity between the left/right planes (mm), along the placement's local +x axis. + ``0`` → both eyes see the same world quad. XR + stereo only. + * - ``generate_mipmaps`` + - ``True`` + - Allocate + regenerate a capped mip chain each frame; sampler uses trilinear filtering. + +Submit and place a frame: + +.. code-block:: python + + layer = session.add_quad_layer(layer_cfg) + + # Mono: pass exactly one buffer. Stereo: layer.submit(left, right). + layer.submit(rgba_array) # optional: stream= + + # 3D placement (XR). Pose is OpenXR stage space: position (x,y,z), + # orientation quaternion (w,x,y,z). size_meters is (width, height). + placement = televiz.QuadLayerPlacement( + televiz.Pose3D(position=(0.0, 0.0, -1.5), orientation=(1.0, 0.0, 0.0, 0.0)), + size_meters=(1.0, 0.5625), + ) + layer.set_placement(placement) + layer.set_visible(True) + +``submit(left, right=None, stream=0)`` accepts a ``VizBuffer`` or any +``__cuda_array_interface__`` object; the binding converts it and releases the GIL across the copy. +For a stereo layer both buffers are copied on the same stream and signaled together, so the renderer +never sees a half-matched pair. Lock-mode placement strategies (``world`` / ``head`` / ``lazy``) are +**application policy** and ship in the sample, not in the module. + +ProjectionLayer (coming soon) +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. note:: + + ``ProjectionLayer`` is under active development and **not yet available in this release**. The + description below is a preview of the planned API and may change. + +A planned full-view RGBD layer for in-loop renderers — gsplat, nvblox, or neural reconstruction +engines that produce per-view ``(color, depth)`` buffers. Unlike ``QuadLayer``, the renderer will +run **in-loop** with the XR frame loop: render against the predicted view poses from the current +frame, then submit between ``begin_frame()`` and ``end_frame()``. Output is composited with depth, +so it Z-combines with quad layers. + +Frame loop +---------- + +Two API levels drive the frame loop. Both release the GIL during blocking waits. + +**Convenience** — ``render()`` does wait + composite + present in one call and returns a +``FrameInfo``. Internally it checks ``should_render`` and skips the GPU pass when the runtime +says the frame won't be visible; producers' ``submit`` writes still land in the back buffer. + +.. code-block:: python + + while running: + cam_layer.submit(camera_frame) + info = session.render() + +**Explicit** — ``begin_frame()`` / ``end_frame()``, when the app needs the +``FrameInfo`` *before* submitting (e.g. to read the predicted view poses before rendering, or to +skip expensive decode when not visible): + +.. code-block:: python + + while running: + info = session.begin_frame() + if info.should_render: + cam_layer.submit(decode_camera()) # skip decode when not visible + session.end_frame() + +``FrameInfo`` carries ``frame_index``, ``predicted_display_time`` (XR time in ns; 0 outside +XR), ``delta_time`` (CPU wall-clock seconds — usable without any XR knowledge), ``should_render``, +``resolution``, and ``views``. Each ``ViewInfo`` in ``views`` has ``viewport``, ``fov``, and +``pose`` — 2 entries in XR stereo, 1 (identity pose) in window / offscreen. + +Session state +------------- + +A session moves through ``SessionState``: + +``kUninitialized → kReady → kRunning → kStopping → kLost → kDestroyed`` + +- ``kReady`` after ``create`` — add layers and submit content. +- ``kRunning`` once the frame loop is active. +- ``kStopping`` (XR) — the runtime is stopping; ``end_frame`` submits empty frames. +- ``kLost`` (XR) — the session was lost; ``render`` / ``begin_frame`` raise. Destroy and recreate + the ``VizSession`` (Televiz supports clean in-process recreation). +- ``kDestroyed`` after ``destroy``. + +Query it with ``get_state()``; in window mode ``should_close()`` reports the window-close +request. OpenXR events are polled inside ``begin_frame``, which drives the XR-specific transitions. + +.. _sharing-the-xr-session: + +Sharing the XR session +---------------------- + +Only one OpenXR session is allowed per process. In XR mode ``VizSession`` creates a **graphics-bound** +session (Isaac Teleop's own ``OpenXRSession`` is headless and cannot render). When you use Televiz +*and* ``TeleopSession`` together, let Televiz own the session and hand its live handles to +``TeleopSession`` so trackers attach to the same session — one CloudXR connection, synchronized +timing. + +Declare the extensions your trackers need in ``required_extensions`` (Televiz adds its own rendering +extensions automatically), then pass the handles through: + +.. code-block:: python + + import isaacteleop.viz as televiz + from isaacteleop.teleop_session_manager import TeleopSession, TeleopSessionConfig + from isaacteleop.deviceio import DeviceIOSession + from teleopcore.oxr import OpenXRSessionHandles + + cfg = televiz.VizSessionConfig() + cfg.mode = televiz.DisplayMode.kXr + # Aggregate the XR extensions downstream trackers need so they're present + # on the XrInstance Televiz is about to create. + cfg.required_extensions = DeviceIOSession.get_required_extensions(trackers) + viz_session = televiz.VizSession.create(cfg) + + config = TeleopSessionConfig( + app_name="MyApp", + pipeline=pipeline, + oxr_handles=OpenXRSessionHandles(*viz_session.get_oxr_handles()), + ) + with TeleopSession(config) as session: + while running: + session.step() + cam_layer.submit(camera_frame) + viz_session.render() + +``get_oxr_handles()`` returns ``(instance, session, space, proc_addr)`` as raw ``uint64`` +values (or ``None`` outside ``kXr``); wrap them with ``OpenXRSessionHandles(*tuple)`` for +``TeleopSessionConfig.oxr_handles``. ``VizSession`` and ``TeleopSession`` keep **independent** state +machines and lifecycles — either can run without the other. In the unified pattern the underlying +OpenXR session is shared, so if it is lost both must be recreated. See +:doc:`teleop_session` for the ``TeleopSession`` side. + +API reference +------------- + +VizSession +^^^^^^^^^^ + +- ``create(config) -> VizSession`` *(static)* — validate config + initialize Vulkan / display backend. +- ``render() -> FrameInfo`` — wait + composite + present. +- ``begin_frame() -> FrameInfo`` / ``end_frame()`` — explicit two-phase frame loop. +- ``add_quad_layer(config) -> QuadLayer`` — construct + register a layer; returns a non-owning handle. +- ``readback_to_host() -> HostImage`` — most recent frame as RGBA8 host pixels (``kOffscreen`` only). +- ``get_state() -> SessionState``, ``should_close() -> bool``, ``is_xr_mode() -> bool``. +- ``get_recommended_resolution() -> Resolution`` — runtime per-eye resolution (XR). +- ``head_pose_now() -> Optional[Pose3D]`` — current head pose (``kXr`` only; ``None`` on tracking loss). +- ``get_oxr_handles() -> Optional[tuple]`` — ``(instance, session, space, proc_addr)`` as raw ``uint64``. +- ``get_frame_timing_stats() -> FrameTimingStats`` / ``get_gpu_timing() -> GpuFrameTiming``. +- ``destroy()`` — release all resources (idempotent). +- Properties ``vk_device`` / ``vk_physical_device`` / ``vk_queue_family_index`` — raw handles for + wiring Televiz into a foreign Vulkan app. Most users won't touch these. + +QuadLayer +^^^^^^^^^ + +- ``submit(left, right=None, stream=0)`` — submit a frame (mono: ``left`` only; stereo: both). +- ``set_placement(placement)`` / ``placement()`` — 3D placement (``None`` → fullscreen, window mode). +- ``set_visible(visible)`` / ``is_visible()``. +- Properties ``resolution``, ``format``, ``aspect_ratio``, ``name``. + +Data types +^^^^^^^^^^ + +- ``VizBuffer`` — non-owning 2D pixel buffer descriptor. Device buffers expose + ``__cuda_array_interface__`` (``cupy.asarray(buf)``); host buffers expose ``__array_interface__`` + (``numpy.asarray(buf)``). +- ``HostImage`` — owning host pixel buffer returned by ``readback_to_host``; wrap with + ``numpy.asarray``. +- ``Resolution`` ``(width, height)``, ``Pose3D`` (``position``, ``orientation`` as + ``(w, x, y, z)``), ``Fov``, ``ViewInfo``. +- Enums ``DisplayMode``, ``PixelFormat`` (``kRGBA8`` / ``kD32F``), + ``MemorySpace``, ``SessionState``. + +C++ API +------- + +Televiz is a C++ library; ``isaacteleop.viz`` is a thin pybind11 binding over it. The Python and C++ +APIs share the same type and method names (``VizSession``, ``QuadLayer``, ``submit``, +``set_placement``, ``DisplayMode::kXr``, …), so everything on this page maps directly to C++. All +symbols live in ``namespace viz``, and headers use nested include paths:: + + #include + #include + #include + +Enable ``BUILD_VIZ`` (default ``OFF``; requires Vulkan, the CUDA toolkit, and ``glslangValidator``) +and link the relevant CMake target: + +.. list-table:: + :header-rows: 1 + :widths: 18 18 64 + + * - Target + - Alias + - Provides + * - ``viz_core`` + - ``viz::core`` + - Core types (``VizBuffer``, ``Pose3D``, ``HostImage``, ``DeviceImage``) and Vulkan / CUDA infrastructure + * - ``viz_layers`` + - ``viz::layers`` + - ``LayerBase`` and the built-in layers (``QuadLayer``, …) + * - ``viz_session`` + - ``viz::session`` + - ``VizSession``, the compositor, ``FrameInfo``, window / offscreen backends + * - ``viz_xr`` + - ``viz::xr`` + - OpenXR backend — per-eye swapchains, depth composition layer + +Public headers live under :code-dir:`src/viz//cpp/inc/viz/ `. One difference from +the Python bindings: in C++, layers are added with a single templated +``VizSession::add_layer(args...)`` method, which also accepts your own ``LayerBase`` subclasses — +the route for plugging in a custom renderer. See :doc:`/references/build` for build options and +output locations. + +More information +---------------- + +- :doc:`/references/camera_streaming` — the reference ``camera_viz`` sample built on Televiz +- :doc:`teleop_session` — how ``TeleopSession`` works and how to share its OpenXR session +- :code-dir:`src/viz/ ` — module source, organized as ``core`` / ``layers`` / ``session`` / + ``xr`` / ``shaders`` / ``python`` sub-modules diff --git a/docs/source/index.rst b/docs/source/index.rst index 8313f9aeb..01da0f42c 100644 --- a/docs/source/index.rst +++ b/docs/source/index.rst @@ -24,6 +24,7 @@ Key features: - **Standardized device interface** for XR headsets, gloves, foot pedals, body trackers - **Flexible retargeting** to different robot embodiments - **Plugin system** for adding new devices and use cases +- **Visualization module (Televiz)** for streaming robot camera and sensor feeds, plus 3D reconstruction content to XR headsets .. tip:: @@ -50,6 +51,7 @@ Table of Contents getting_started/build_from_source/index getting_started/teleop_session getting_started/teleop_control_state_machine + getting_started/televiz getting_started/contributing .. toctree:: @@ -60,6 +62,7 @@ Table of Contents references/build device/index references/retargeting/index + references/camera_streaming references/mcap_record_replay references/oob_teleop_control references/license diff --git a/docs/source/overview/architecture.rst b/docs/source/overview/architecture.rst index 63795114f..8efcb5f54 100644 --- a/docs/source/overview/architecture.rst +++ b/docs/source/overview/architecture.rst @@ -36,3 +36,17 @@ Data Interface - Standardized `data schema `_ defined in the FlatBuffers (fbs) format. - Data recording & playback with `mcap `_ - Dataset interoperability with `LeRobot `_ + +Visualization (Televiz) +----------------------- + +Televiz (``isaacteleop.viz``) is a lightweight compositor module for visualizing what the operator +sees — camera and sensor feeds, plus 3D rendered content — in an XR headset or a desktop window. + +- Composites multiple sources into one view: 2D camera/sensor planes (``QuadLayer``) today, with full-view stereo RGBD (``ProjectionLayer``) for 3D rendered content coming soon +- Per-eye stereo rendering and 3D placement in XR; the same API drives windowed and offscreen output +- Zero-copy submission of GPU frames straight from CuPy, PyTorch, or any CUDA memory object +- Shares one OpenXR session with the teleop device trackers, so rendering and tracking can run over a single CloudXR connection + +See :doc:`../getting_started/televiz` for the module API and :doc:`../references/camera_streaming` for +the reference camera-streaming sample. diff --git a/docs/source/references/camera_streaming.rst b/docs/source/references/camera_streaming.rst new file mode 100644 index 000000000..e5619736e --- /dev/null +++ b/docs/source/references/camera_streaming.rst @@ -0,0 +1,242 @@ +.. SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +.. SPDX-License-Identifier: Apache-2.0 + +Camera Streaming +================ + +``camera_viz`` is the reference camera-streaming sample built on :doc:`Televiz +` (``isaacteleop.viz``). It captures frames from one or more cameras and +visualizes them on a desktop window or an XR headset — one plane per camera, aspect-fit — and can +stream a robot's cameras to a remote workstation over the network. + +The sample lives at :code-dir:`examples/camera_viz/ `; this page summarizes how +it works. For the exact command surface and flags, see the +:code-file:`README `. + +.. contents:: On this page + :local: + :depth: 2 + +Modes +----- + +.. list-table:: + :header-rows: 1 + :widths: 16 84 + + * - Mode + - What it does + * - **Direct** + - The workstation runs the viewer with cameras attached locally (``source: local``). + * - **Split** + - The robot runs a sender that ships RTP H.264 to a workstation receiver (``source: rtp``). + Wired Ethernet only. + +Supported cameras +----------------- + +The camera kind is selected per entry by the YAML ``type:`` field: + +.. list-table:: + :header-rows: 1 + :widths: 16 84 + + * - ``type:`` + - Notes + * - ``synthetic`` + - GPU test pattern — no hardware. ``stereo: true`` adds a ``disparity_px`` offset between eyes. + * - ``v4l2`` + - USB / UVC cameras — anything ``v4l2-ctl --list-formats-ext`` reports. + * - ``oakd`` + - OAK-D mono RGB / LEFT / RIGHT (stereo not yet wired). + * - ``zed`` + - ZED 2 / Mini / X One; mono or ``stereo: true`` (per-eye SDK retrieve, zero-copy on the GPU). + +Output goes to a window or an XR headset. Stereo cameras render true side-by-side stereo in XR; +window mode shows the left eye. XR placement lock modes are ``world`` / ``head`` / ``lazy``. + +How it works +------------ + +The sample is organized so that capture, transport, and visualization are cleanly separated, with +Televiz as the compositor at the end of the chain: + +.. code-block:: text + :class: code-100col + + camera_viz/ + ├── camera_viz.sh — CLI: setup / loopback / run / deploy / service-* + ├── camera_viz.py — receiver / viewer (drives a Televiz VizSession) + ├── camera_streamer.py — robot-side RTP sender (per-camera supervisor) + ├── pipeline/ — source ABC + threaded runner + ├── placements/ — XR lock-mode strategies (world / head / lazy) + ├── sources/ — V4L2 / OAK-D / ZED / synthetic / rtp_h264 + ├── transports/ — RTP sender + receiver (native + GStreamer) + ├── codec/ — native NVENC / NVDEC pybind module + ├── configs/ — one YAML per camera kind + └── scripts/ — installer + systemd unit template + +- **Sources** (:code-dir:`sources/ `) implement a common source ABC and + hand frames to a threaded runner in :code-dir:`pipeline/ `. Each + source produces GPU frames where possible — e.g. the ZED source uses ``retrieve_image(MEM.GPU)`` so + BGRA8 stays in VRAM and a CUDA kernel channel-swaps into contiguous RGBA with no host round-trip. +- **The viewer** (:code-file:`camera_viz.py `) creates a + ``VizSession`` and adds one ``QuadLayer`` per enabled camera, then submits each frame to its layer + and calls ``render()`` once per frame. Stereo cameras submit both eyes. +- **Transport** (:code-dir:`transports/ `) carries the split mode: + an RTP H.264 sender on the robot and a receiver on the workstation, with a native NVENC/NVDEC codec + module (:code-dir:`codec/ `) or a GStreamer fallback. +- **Placement** (:code-dir:`placements/ `) holds the XR lock-mode + strategies. Placement is application policy — Televiz only renders a layer at whatever pose the app + sets each frame. + +Setup +----- + +One-time setup builds the Televiz wheel and installs the sample's Python environment: + +.. code-block:: bash + + cmake -B build -DBUILD_VIZ=ON + cmake --build build --target python_wheel --parallel + examples/camera_viz/camera_viz.sh setup + source examples/camera_viz/.venv/bin/activate + +``setup`` installs every Python dependency into ``.venv/`` via ``uv``, builds the native NVENC/NVDEC +codec, and probes system packages (GStreamer plugins, cairo / girepository headers, JetPack +``cuda-nvrtc`` + ``ld.so`` wiring). When something is missing it prints the exact ``apt-get`` line +and prompts ``[y/N]`` — answering ``n`` or running non-interactively aborts. + +Useful flags: ``--no-{v4l2,oakd,rtp}``, ``--with-zed``, ``--sender-only``, ``--jetson``. Pass +``--venv PATH`` to install into an existing virtual environment. + +Running +------- + +Direct (cameras on the workstation) +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Set ``source: local`` in the config and run the viewer: + +.. code-block:: bash + + ./camera_viz.sh run configs/v4l2.yaml + +Swap the config for ``oakd.yaml``, ``zed.yaml``, ``synthetic.yaml``, ``synthetic_stereo.yaml``, or +``multi_camera.yaml``. + +Split (robot → workstation over RTP) +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. warning:: + + **Wired networks only.** There is no retransmit or FEC — one lost packet corrupts one frame until + the next IDR (default every 5 s). + +Set ``source: rtp`` in the config, export the robot/streaming credentials once per shell, then +deploy the sender and run the viewer: + +.. code-block:: bash + + export REMOTE_HOST=10.0.0.5 REMOTE_USER=nvidia + export STREAMING_HOST=10.0.0.42 # workstation IP + + ./camera_viz.sh deploy configs/v4l2.yaml # full deploy + systemd unit on the robot + ./camera_viz.sh run configs/v4l2.yaml # viewer on the workstation + +``deploy`` rsyncs the source to the robot, installs sender dependencies, renders a +``camera-streamer.service`` systemd user unit (injecting ``--host`` from ``$STREAMING_HOST`` without +editing the YAML on disk), and enables it. Operate the running unit with +``./camera_viz.sh service-{status,logs,restart}``. The sender retries forever across unplug, SDK +errors, and network blips. + +Loopback +^^^^^^^^ + +``./camera_viz.sh loopback configs/v4l2.yaml`` runs the sender and viewer together on ``127.0.0.1`` +— the quickest way to smoke-test the RTP path on one machine. + +Configuration +------------- + +A single YAML drives both capture and visualization. Each ``cameras:`` entry becomes its own plane +(and, in split mode, its own RTP port). Abbreviated: + +.. code-block:: yaml + + source: local | rtp + streaming: + host: 192.168.1.100 # workstation IP (overridden at deploy time) + encoder: auto | native | gstreamer + + cameras: + - name: cam + enabled: true + type: v4l2 # v4l2 | oakd | zed | synthetic + width: 2560 + height: 720 + fps: 30 + stereo: false # zed / synthetic only — per-eye capture + SBS in XR + rtp: + port: 5000 # left eye when stereo + port_right: 5001 # required when stereo + source: rtp + bitrate_mbps: 15 + + display: + mode: window | xr + window: { width, height } + xr: { near_z, far_z } + clear_color: [r, g, b, a] + placements: + cam: + lock_mode: lazy # world | head | lazy + distance: 1.5 + # size: [w_m, h_m] + # stereo_baseline_mm: 0 + +See the :code-dir:`configs/ ` directory for a complete, commented YAML +per camera kind. + +Lock modes (XR) +^^^^^^^^^^^^^^^ + +How a camera plane is positioned relative to the operator's head each frame: + +.. list-table:: + :header-rows: 1 + :widths: 16 84 + + * - Mode + - Behavior + * - ``world`` + - Placed once in front of you and stays put. + * - ``head`` + - Follows your head every frame. + * - ``lazy`` + - World-locked, but re-snaps in front of you when you look away (default). + +Lazy-mode knobs live under ``placements.``: ``look_away_angle_deg``, ``reposition_distance``, +``reposition_delay_s``, ``transition_duration_s``. + +Sharing the XR session with TeleopSession +----------------------------------------- + +Only one OpenXR session is allowed per process, so when this sample runs alongside teleoperation the +``VizSession`` owns the session and hands its handles to ``TeleopSession`` / ``DeviceIOSession``, +which then skip creating their own. The viewer builds the session with the trackers' required +extensions and forwards the handles through ``TeleopSessionConfig.oxr_handles``: + +.. code-block:: python + + cfg.required_extensions = DeviceIOSession.get_required_extensions(trackers) + viz_session = televiz.VizSession.create(cfg) + + config = TeleopSessionConfig( + app_name="MyApp", + pipeline=pipeline, + oxr_handles=OpenXRSessionHandles(*viz_session.get_oxr_handles()), + ) + +See the *Sharing the XR session* section of :doc:`/getting_started/televiz` for the full pattern +(imports, frame loop, and how the two sessions' lifecycles relate), and +:doc:`/getting_started/teleop_session` for the ``TeleopSession`` side.