feat(actuators): add GRU and GRU-residual network actuator models by vidurv-nvidia · Pull Request #6083 · isaac-sim/IsaacLab

vidurv-nvidia · 2026-06-09T19:45:46Z

Description

Adds two recurrent (GRU) actuator network models to isaaclab.actuators, alongside the existing MLP and LSTM models:

ActuatorNetGRU — explicit actuator; a GRU predicts the total joint effort from the joint position error and velocity. Derives from IdealPDActuator, so the predicted effort is clipped to the actuator's effort limit.
ActuatorNetGRUResidual — implicit-PD actuator that adds a GRU-predicted residual feed-forward effort. The physics engine applies the PD term using the configured stiffness/damping; the desired positions/velocities are preserved on return.

Both load a TorchScript network that exposes a .gru submodule and implements forward(x, hidden) -> (output, hidden) with a 2-dim input [position_error, velocity] (mirroring ActuatorNetLSTM). Each input and the output support optional (mean, std) normalization via config fields (None = identity). A non-finite network output is sanitized to zero effort before it reaches the engine.

These run on the standard (torch) actuator path. The Newton native-actuator path (use_newton_actuators=True) does not yet provide a GRU controller, so these configs are not authored onto that path; adding a Newton-side GRU controller would be a separate change.

Fixes # (no associated issue)

Type of change

New feature (non-breaking change which adds functionality)

Screenshots

N/A — no visual change.

Checklist

I have read and understood the contribution guidelines
I have run the pre-commit checks with ./isaaclab.sh --format
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
I have added a changelog fragment under source/<pkg>/changelog.d/ for every touched package (do not edit CHANGELOG.rst or bump extension.toml — CI handles that)
I have added my name to the CONTRIBUTORS.md or my name already exists there

Add ActuatorNetGRU, an explicit actuator whose GRU network predicts the total joint effort, and ActuatorNetGRUResidual, an implicit-PD actuator that adds a GRU-predicted residual feed-forward effort. Both support a configurable set of input columns (position, position_error, velocity, solver_pd) and optional per-column input normalization and output denormalization carried directly on the config. ActuatorNetGRU derives from IdealPDActuator: it predicts the total effort, so the simple symmetric effort-limit clamp applies and the velocity-dependent DC-motor torque-speed parameters are not required. ActuatorNetGRUResidual derives from ImplicitActuator and injects the residual as feed-forward effort while preserving the desired joint positions and velocities for the engine-side PD term; effort limits are enforced by the simulation via effort_limit_sim, as for any implicit actuator. Add unit tests, an API documentation entry, and a changelog fragment.

isaaclab-review-bot

🤖 Isaac Lab Review Bot — PR #6083

feat(actuators): add GRU and GRU-residual network actuator models

Summary

Well-structured PR that adds GRU and GRU-residual actuator models following the established LSTM pattern. The shared _GRUActuatorMixin cleanly factors the common GRU logic, and the test suite (530 lines, 74 tests) is comprehensive. CI passes on pre-commit and builds; installation tests pending.

Findings

✅ ~~Suggestion — ActuatorNetGRU.reset() should call super().reset(env_ids) for future-proofing~~ Fixed in 63903b9.

✅ ~~Warning — _resolve_normalization silently floors std but doesn't warn on suspiciously small values~~ Fixed in 63903b9 — proper logger.warning() with field identification added.

🔵 Suggestion — Test coverage: consider adding a multi-step partial-reset interleaving test

The existing test_actuator_net_gru_reset validates that resetting env 0 zeros its hidden state while env 1's state persists. A stronger test would: (1) step both envs for N steps, (2) reset env 0, (3) step both envs again, (4) verify env 0's output matches a fresh-start trajectory while env 1's output matches the continued trajectory. This would catch subtle bugs where the view-based reshape accidentally aliases memory incorrectly across batch dimensions.

🟡 Warning — ActuatorNetGRUResidual.compute() — control_action.joint_positions must not be None

In ActuatorNetGRUResidual.compute(), after _predict_gru_effort (which validates joint_positions is not None), the method proceeds to compute error_pos = control_action.joint_positions - joint_pos. This is safe because _predict_gru_effort raises if positions are None. However, if the residual variant is ever refactored to skip calling _predict_gru_effort first (e.g., a conditional path), this would NPE. A defensive comment or assertion documenting this invariant would help maintainability.

🔵 Suggestion — Minor: docstring for output_normalization uses reversed y * std + mean vs standard (y - mean) / std

The config docstring correctly states the output is denormalized as y * std + mean (inverse of z-score normalization). This is intentional and correct — just noting that users familiar with standard normalization (x - mean) / std might initially misread the output direction. The docstring is already clear; no action needed.

Verdict

✅ Looks good overall. The implementation cleanly follows the established LSTM actuator pattern, the mixin factoring is sensible, error handling (non-finite sanitization, input validation) is solid, and test coverage is thorough. The remaining findings are non-blocking suggestions.

🔄 Update (2025-06-09)

Reviewed incremental changes (1f10825 → 63903b9). Two previously flagged items were addressed:

✅ super().reset(env_ids) now called in ActuatorNetGRU.reset()
✅ _resolve_normalization now emits logger.warning() with field name when std is below floor

Additional changes: CONTRIBUTORS.md updated (author added). No new issues introduced. LGTM.

isaaclab-review-bot · 2026-06-09T19:50:21Z

+    Operations.
+    """
+
+    def reset(self, env_ids: Sequence[int]):


🔵 Suggestion: Consider calling super().reset(env_ids) here for consistency with ActuatorNetGRUResidual.reset() and to future-proof against base class changes.

Suggested change

def reset(self, env_ids: Sequence[int]):

def reset(self, env_ids: Sequence[int]):

super().reset(env_ids)

self._reset_gru_state(env_ids)

Done in 63903b9 — ActuatorNetGRU.reset() now calls super().reset(env_ids) before resetting the hidden state, matching ActuatorNetGRUResidual.

Thanks for the fix! 👍

isaaclab-review-bot · 2026-06-09T19:50:21Z

+
+    def _resolve_normalization(self, stats: tuple[float, float] | None) -> tuple[float, float]:
+        """Return the ``(mean, std)`` to apply, defaulting to identity and flooring the std."""
+        if stats is None:


🟡 Warning: If std is passed as 0.0 (e.g., from a misconfigured normalization config), this silently floors to 1e-8, which can make the normalized inputs explode to ~O(1e8). Consider logging a warning when the input std is below the floor.

Done in 63903b9 — _resolve_normalization now logs a warning when a configured std is below the floor (and identifies which normalization field), since a near-zero std amplifies the normalized values.

Thanks, the warning with field identification is exactly what I had in mind. LGTM ✅

- Call super().reset() in ActuatorNetGRU.reset for consistency with the residual variant and to future-proof against base-class changes. - Warn when a configured normalization std is floored, since a near-zero std can amplify the normalized values. - Add Vidur Vij to CONTRIBUTORS.

greptile-apps · 2026-06-09T20:00:58Z

Greptile Summary

This PR adds two GRU-based actuator models — ActuatorNetGRU (explicit, total-torque) and ActuatorNetGRUResidual (implicit-PD with a GRU residual) — alongside the existing MLP and LSTM models. Both are implemented via a shared _GRUActuatorMixin that loads a TorchScript network, validates its shape, manages the recurrent hidden state per environment, and supports optional (mean, std) normalization on each of the three inputs (position, position error, velocity) and the output.

The mixin faithfully mirrors the ActuatorNetLSTM pattern (input assembly outside inference_mode, hidden-state copy via in-place slice assignment inside the block) and adds normalization, NaN sanitization, and a negative-std guard that the LSTM lacks.
ActuatorNetGRUCfg derives from IdealPDActuatorCfg with stiffness/damping nulled out; ActuatorNetGRUResidualCfg derives from ImplicitActuatorCfg and preserves PD gains so the physics engine handles the PD term while the GRU contributes a residual feed-forward effort.
The test file provides comprehensive coverage: both variants, multi-layer GRU, effort clipping, NaN sanitization, input/output normalization math, selective env reset, error paths, and a production-sized 2-layer export architecture.

Confidence Score: 5/5

Safe to merge — the new GRU actuator path is self-contained, fully tested, and does not modify any existing code paths.

All three code paths (input assembly, GRU inference, hidden-state reset) are additive and isolated from the existing LSTM and MLP actuators. The mixin design avoids touching any shared state, the normalization guard correctly raises on negative std and floors near-zero values, and the NaN sanitization prevents non-finite network output from reaching the physics engine. Tests cover the full surface area including edge cases.

No files require special attention; the single note in actuator_net.py is a warning-message wording improvement.

Important Files Changed

Filename	Overview
source/isaaclab/isaaclab/actuators/actuator_net.py	Adds _GRUActuatorMixin, ActuatorNetGRU, and ActuatorNetGRUResidual; mirrors LSTM pattern with a 3-input GRU and optional normalization for each channel; one misleading warning message for near-zero output_normalization std.
source/isaaclab/isaaclab/actuators/actuator_net_cfg.py	Adds ActuatorNetGRUCfg and ActuatorNetGRUResidualCfg; normalization fields and network_file are clearly documented.
source/isaaclab/test/actuators/test_actuator_net_gru.py	Comprehensive test suite covering both actuator variants, multi-layer GRU, effort clipping, NaN sanitization, input/output normalization, selective env reset, and error validation.
source/isaaclab/isaaclab/actuators/init.pyi	Adds GRU classes and configs to all and public imports; no issues.
docs/source/api/lab/isaaclab.actuators.rst	Adds GRU actuator entries to the API reference in the same style as the existing LSTM section.

Sequence Diagram

sequenceDiagram
    participant Sim as Simulation Loop
    participant GRU as ActuatorNetGRU / ActuatorNetGRUResidual
    participant Mix as _GRUActuatorMixin
    participant Net as TorchScript GRU Network

    Sim->>GRU: compute(control_action, joint_pos, joint_vel)
    GRU->>Mix: _predict_gru_effort(...)
    Mix->>Mix: assemble sea_input [position, pos_error, velocity]
    Mix->>Mix: apply input normalization (outside inference_mode)
    Mix->>Net: forward(sea_input, sea_hidden_state) [inference_mode]
    Net-->>Mix: (output, new_hidden_state)
    Mix->>Mix: denormalize output and nan_to_num
    Mix-->>GRU: effort [num_envs, num_joints]

    alt ActuatorNetGRU (explicit)
        GRU->>GRU: clip effort via _clip_effort
        GRU->>Sim: "joint_efforts set, joint_positions/velocities = None"
    else ActuatorNetGRUResidual (implicit-PD)
        GRU->>GRU: accumulate residual into joint_efforts
        GRU->>GRU: compute approximate total effort for telemetry
        GRU->>Sim: joint_efforts augmented, joint_positions/velocities preserved
    end

    Sim->>GRU: reset(env_ids)
    GRU->>Mix: _reset_gru_state(env_ids)
    Mix->>Mix: "sea_hidden_state_per_env[:, env_ids] = 0"

_{Reviews (2): Last reviewed commit: "Use fixed 3-input GRU contract and harde..." | Re-trigger Greptile}

greptile-apps · 2026-06-09T20:01:02Z

+
+        Args:
+            control_action: The joint action instance holding the desired joint positions.
+            joint_pos: The current joint positions. Shape is (num_envs, num_joints).
+            joint_vel: The current joint velocities. Shape is (num_envs, num_joints).
+
+        Returns:
+            The predicted effort [N·m or N, depending on joint type]. Shape is
+            (num_envs, num_joints).
+
+        Raises:
+            ValueError: If ``control_action.joint_positions`` is None.
+        """
+        if control_action.joint_positions is None:
+            raise ValueError("GRU actuator input requires control_action.joint_positions to be set.")
+        # normalized [position_error, velocity] inputs
+        pos_error = (control_action.joint_positions - joint_pos).flatten()
+        self.sea_input[:, 0, 0] = (pos_error - self._pos_error_norm[0]) / self._pos_error_norm[1]
+        self.sea_input[:, 0, 1] = (joint_vel.flatten() - self._vel_norm[0]) / self._vel_norm[1]
+
+        # run inference, then denormalize and guard against a non-finite output
+        with torch.inference_mode():
+            output, self.sea_hidden_state[:] = self.network(self.sea_input, self.sea_hidden_state)


Negative std silently corrected to 1e-8 instead of raising an error

max(float(std), self._GRU_STD_FLOOR) will replace any negative std (e.g., -2.0) with 1e-8 — effectively dividing by nearly zero — while the warning message says "flooring it, which can amplify the normalized values". A user who accidentally swaps the (mean, std) tuple order (passing (std, mean)) or passes a negative value gets input amplification of up to 1 / 1e-8 = 1e8 with a warning that does not clearly signal the sign correction. Only positive-but-small std values belong in the "floor" category; a negative std should raise ValueError to force the misconfiguration to be fixed.

Fixed in e1dea0f — _resolve_normalization now raises ValueError for a negative std (only 0 ≤ std < 1e-8 is floored, with the warning). Added a regression test (test_actuator_net_gru_negative_std_raises).

Excellent — raising on negative std while flooring the near-zero case is the right split. Regression test seals it. 👍

greptile-apps · 2026-06-09T20:01:03Z

+    """
+
+    def reset(self, env_ids: Sequence[int]):
+        super().reset(env_ids)
+        self._reset_gru_state(env_ids)
+


Denormalization applied to an inference-mode tensor outside the context block

output returned by the network inside torch.inference_mode() is an inference tensor. The denormalization output * self._output_norm[1] + self._output_norm[0] and the subsequent nan_to_num are executed outside the with block, so they run with gradient tracking enabled (if a grad context is active). In most deployment scenarios this is harmless, but it creates unnecessary ops when called from within a torch.no_grad() or outer inference_mode() scope. Consider moving the denormalization and sanitization inside the with torch.inference_mode(): block for consistency and to avoid any surprise interactions with autograd.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Fixed in e1dea0f — the denormalization and nan_to_num now run inside the with torch.inference_mode() block.

Thanks for the fix! Confirmed — having everything under the inference_mode() context is cleaner and correct. 👍

Thanks for the quick fix! Verified that the denormalization and nan_to_num are now properly scoped inside the inference_mode() block. 👍

- Use a fixed [position, position_error, velocity] network input (3 inputs) to match the trained actuator-model feature set, instead of the 2-input [position_error, velocity] contract; add position_normalization. - Raise on a negative normalization std and reject bidirectional GRU networks instead of silently mis-handling them. - Run output denormalization and the non-finite guard inside inference_mode. - Strengthen tests: check the explicit actuator against an independent reference forward and cover the negative-std error path.

AntoineRichard · 2026-06-10T13:31:40Z

Hi @vidurv-nvidia could you target newton's actuators? We will be deprecating our internal actuators, I think it would make sense to use the shared actuators they built this way IsaacSim and Newton could use them too. Or at least could these actuators modify the newton actuators rather than our own? @jvonmuralt for viz.

github-actions Bot added documentation Improvements or additions to documentation isaac-lab Related to Isaac Lab team labels Jun 9, 2026

isaaclab-review-bot Bot reviewed Jun 9, 2026

View reviewed changes

vidurv-nvidia marked this pull request as ready for review June 9, 2026 19:54

vidurv-nvidia requested review from Mayankm96, jtigue-bdai and kellyguo11 as code owners June 9, 2026 19:54

vidurv-nvidia marked this pull request as draft June 9, 2026 19:55

greptile-apps Bot reviewed Jun 9, 2026

View reviewed changes

Conversation

vidurv-nvidia commented Jun 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Type of change

Screenshots

Checklist

Uh oh!

isaaclab-review-bot Bot left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

🤖 Isaac Lab Review Bot — PR #6083

Summary

Findings

Verdict

🔄 Update (2025-06-09)

Uh oh!

isaaclab-review-bot Bot Jun 9, 2026

Choose a reason for hiding this comment

Uh oh!

vidurv-nvidia Jun 9, 2026

Choose a reason for hiding this comment

Uh oh!

isaaclab-review-bot Bot Jun 9, 2026

Choose a reason for hiding this comment

Uh oh!

isaaclab-review-bot Bot Jun 9, 2026

Choose a reason for hiding this comment

Uh oh!

vidurv-nvidia Jun 9, 2026

Choose a reason for hiding this comment

Uh oh!

isaaclab-review-bot Bot Jun 9, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot commented Jun 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Sequence Diagram

Uh oh!

greptile-apps Bot Jun 9, 2026

Choose a reason for hiding this comment

Uh oh!

vidurv-nvidia Jun 9, 2026

Choose a reason for hiding this comment

Uh oh!

isaaclab-review-bot Bot Jun 9, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Jun 9, 2026

Choose a reason for hiding this comment

Uh oh!

vidurv-nvidia Jun 9, 2026

Choose a reason for hiding this comment

Uh oh!

isaaclab-review-bot Bot Jun 9, 2026

Choose a reason for hiding this comment

Uh oh!

isaaclab-review-bot Bot Jun 9, 2026

Choose a reason for hiding this comment

Uh oh!

AntoineRichard commented Jun 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

vidurv-nvidia commented Jun 9, 2026 •

edited

Loading

isaaclab-review-bot Bot left a comment •

edited

Loading

greptile-apps Bot commented Jun 9, 2026 •

edited

Loading