Add distance_to_camera and distance_to_image_plane camera presets to cartpole, dexsuite_kuka, and shadow_hand with test coverage#6101
Conversation
cartpole, dexsuite_kuka, and shadow_hand with test coverage.
There was a problem hiding this comment.
📋 Code Review Summary
This PR adds distance_to_camera and distance_to_image_plane camera presets across three environments (cartpole, dexsuite_kuka, shadow_hand) with comprehensive golden image test coverage. The changes are well-structured and follow established patterns in the codebase.
✅ Strengths
Consistent Implementation Pattern
- The new presets correctly follow the existing pattern used for
depthacross all three environments - Proper use of
BaseCartpoleTiledCameraCfg,BASE_CAMERA_CFG.replace(), and_ShadowHandBaseTiledCameraCfgconstructors
Good Documentation
- Clear docstrings with appropriate warnings about benchmarking-only usage for shadow_hand presets
- Well-written changelog entry with proper RST formatting and cross-references to affected classes
Thoughtful Validation Logic Update
- The refactored
validate_config()inshadow_hand_vision_env_cfg.pycorrectly generalizes depth-like type handling using set difference rather than direct equality
Comprehensive Test Coverage
- 20 golden images added covering all physics/renderer combinations (newton, physx, ovphysx × isaacsim_rtx_renderer, ovrtx_renderer)
rendering_test_utils.pyproperly updated to include new data types and disable CNN for these depth-like types
💭 Minor Observations
-
Observation space consistency (cartpole_direct_camera_env_cfg.py lines 77-78): The
observation_space=[1, 100, 100]correctly matchesdepth(single-channel output), which is appropriate. -
Validation edge case: The updated condition
if self.tiled_camera.data_types and not non_depth_data_types and self.feature_extractor.enabledcorrectly handles the case where all data types are depth-like AOVs. The addedself.tiled_camera.data_typescheck prevents issues with empty data_types lists. -
DexSuite preset ordering (camera_cfg.py): The new distance presets are added after
depthinBaseTiledCameraCfgbut aftersemantic_segmentationinWristTiledCameraCfg. This is a cosmetic inconsistency but doesn't affect functionality. Consider aligning the ordering for maintainability.
📝 Summary
Clean, well-tested PR that extends camera preset support for distance AOV types. The implementation is consistent with existing patterns and the validation logic update is correctly generalized. No blocking issues identified.
Greptile SummaryThis PR extends three existing camera environments (cartpole, dexsuite_kuka, shadow_hand) with
Confidence Score: 4/5The config and validation changes are straightforward and correct; the main open question is whether the identical golden-image hashes for the two depth modes under ovrtx_renderer reflect genuine renderer behaviour or a capture error. All three environment configs are extended consistently and the shadow_hand depth-guard generalisation is logically sound. The only concern is that across cartpole, dexsuite_kuka, and shadow_hand the ovrtx_renderer golden images for distance_to_camera and distance_to_image_plane are bit-for-bit identical, meaning the CI tests will not catch a future regression where these two modes diverge under that renderer. If it turns out the images were captured from the same render pass, the golden-image protection for ovrtx_renderer is weaker than intended. The LFS golden image files under source/isaaclab_tasks/test/golden_images/ for the ovrtx_renderer combinations, and source/isaaclab_tasks/test/rendering_test_utils.py where the shadow_hand depth guard is updated. Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[validate_config called] --> B{renderer_type == newton_warp?}
B -- Yes --> C{unsupported data types?}
C -- Yes --> D[raise ValueError: warp only supports rgb/depth]
C -- No --> E[Continue]
B -- No --> E
E --> F{data_types non-empty?}
F -- No --> G[OK - no error]
F -- Yes --> H{all types in depth-like set?\ndepth / distance_to_camera /\ndistance_to_image_plane}
H -- No --> G
H -- Yes --> I{feature_extractor.enabled?}
I -- No --> G
I -- Yes --> J[raise ValueError: depth-only for benchmarking only]
Reviews (1): Last reviewed commit: "Added Richard Lei to CONTRIBUTORS.md" | Re-trigger Greptile |
| @@ -0,0 +1,3 @@ | |||
| version https://git-lfs.github.com/spec/v1 | |||
| oid sha256:ea4e6ba2251666e4df0c937fbd123489f2a06939bda9e674d5f84eb5c8831f7d | |||
There was a problem hiding this comment.
Identical golden-image hash for distance_to_camera vs distance_to_image_plane under ovrtx_renderer
Across all three environments (cartpole, dexsuite_kuka, shadow_hand) the ovrtx_renderer golden images for distance_to_camera and distance_to_image_plane share the exact same LFS hash, meaning the test is comparing both AOVs against an identical reference. These two depth modes are geometrically distinct — distance_to_camera is the Euclidean distance from the camera origin, while distance_to_image_plane applies an angular (cosine) correction — so they should produce different pixel values for any off-centre object. If ovrtx_renderer genuinely returns the same buffer for both modes (an upstream limitation), the golden-image tests for ovrtx_renderer will never detect a future regression where they diverge.
In addition, for dexsuite and shadow_hand the isaacsim_rtx_renderer-distance_to_image_plane hash equals the ovrtx_renderer-distance_to_camera hash, which is even harder to explain by renderer behaviour alone. Could you confirm whether the ovrtx_renderer is known not to differentiate these two modes, and if the golden images were captured from distinct rendering passes?
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
There was a problem hiding this comment.
Interesting! Will investigate
There was a problem hiding this comment.
Sounds good — thanks for looking into it! Let me know if you have any questions.
There was a problem hiding this comment.
Sounds good — let me know if you need a second pair of eyes on it! 👍
There was a problem hiding this comment.
Per https://github.com/isaac-sim/IsaacLab/blob/develop/source/isaaclab_ov/isaaclab_ov/renderers/ovrtx_usd.py#L18, in the ovrtx renderer path, "depth", "distance_to_image_plane", and "distance_to_camera" are all collapsed to use the DistanceToImagePlaneSD render var, which explains why those 3 images are the same.
The question is if that is the intended behavior, as distance_to_camera vs image_plane geometrically mean different things - @pbarejko @huidongc
There was a problem hiding this comment.
Thanks for digging into this — that's a clear explanation. Looking at get_render_var_config, all three (depth, distance_to_image_plane, distance_to_camera) do indeed resolve to the same DistanceToImagePlaneSD render var, which confirms why the golden images are identical.
Semantically, distance_to_camera (Euclidean distance from camera origin to point) and distance_to_image_plane (perpendicular distance along the camera's optical axis) measure different things — especially for pixels far from the optical center, the difference can be meaningful for downstream tasks. So this does seem like a gap worth clarifying.
Good call tagging @pbarejko and @huidongc — understanding whether this is a current renderer limitation or intentional simplification will help determine if the tests here should expect identical images or if the renderer should eventually distinguish them.
Description
Add distance_to_camera and distance_to_image_plane camera presets to cartpole, dexsuite_kuka, and shadow_hand with test coverage.
Checklist
pre-commitchecks with./isaaclab.sh --formatconfig/extension.tomlfileCONTRIBUTORS.mdor my name already exists there