Skip to content

feat(phyai): Support gr00t#40

Open
yuerqiqi wants to merge 5 commits into
MEmbodied:mainfrom
yuerqiqi:feat-support-gr00t
Open

feat(phyai): Support gr00t#40
yuerqiqi wants to merge 5 commits into
MEmbodied:mainfrom
yuerqiqi:feat-support-gr00t

Conversation

@yuerqiqi

@yuerqiqi yuerqiqi commented Jul 3, 2026

Copy link
Copy Markdown

Summary

This PR adds native GR00T-N1.7 model support to PhyAI.

Changes

  • Register the gr00t_n17 model entry in the engine.
  • Add GR00T-N1.7 model components:
    • Qwen3-VL/Cosmos backbone adapter
    • action head
    • WS1 scheduler
    • model runner
  • Add GR00T preprocessing utilities under phyai-utils-tools.
  • Update package dependencies required by the GR00T processor/model path.

Summary by CodeRabbit

  • New Features
    • Added support for the GR00T-N1.7 model, including a checkpoint-agnostic processor for multimodal preprocessing and action decoding.
    • Introduced native deterministic image handling plus configurable state/action normalization and relative-to-absolute action conversion.
    • Added a new single-GPU scheduling path with an inference runner that returns normalized action outputs for later decoding.
    • Enabled the model to be discovered by the engine as a registered plugin, with optional CUDA-graph acceleration when compatible.
  • Chores
    • Updated dependencies (including image support) and added an optional fast tokenizer extra.

@yuerqiqi yuerqiqi requested a review from chenghuaWang as a code owner July 3, 2026 05:25
@coderabbitai

coderabbitai Bot commented Jul 3, 2026

Copy link
Copy Markdown

Review Change Stack

Warning

Review limit reached

@yuerqiqi, you've reached your PR review limit, so we couldn't start this review.

Next review available in: 2 minutes

Enable usage-based reviews in Billing to review now. Otherwise, wait until the next included review is available.
You're only billed for reviews past your plan's rate limits ($0.25/file).

How can I continue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based reviews.

How do review limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window.

Please refer docs for additional details.

Review details
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 6a65fd8b-dae8-4168-8386-ffa777338129

📥 Commits

Reviewing files that changed from the base of the PR and between c8acab4 and 949842c.

📒 Files selected for processing (1)
  • CLAUDE.md
📝 Walkthrough

Walkthrough

This PR adds GR00T-N1.7 preprocessing utilities and a native GR00T-N1.7 model stack, including engine registration, configuration, native Qwen3-VL backbone code, action-head runners, a scheduler, and a plugin entry.

Changes

phyai-utils-tools GR00T preprocessing package

Layer / File(s) Summary
Dependency updates
phyai-utils-tools/pyproject.toml
Adds pillow and the fastokenizer optional dependency group, alongside the GR00T-related dependency pins.
GR00T ops: image transforms and action/state math
phyai-utils-tools/src/phyai_utils_tools/models/gr00t/ops_gr00t.py
Adds embodiment canonicalization, config loading, deterministic image transforms, Qwen3-VL image preprocessing, normalization, and rotation/transform conversions.
GR00TProcessor configuration and core processing
phyai-utils-tools/src/phyai_utils_tools/models/gr00t/__init__.py, phyai-utils-tools/src/phyai_utils_tools/models/gr00t/processor_gr00t.py
Adds the processor dataclasses, checkpoint loading, observation validation, preprocessing, decoding, and inverse normalization logic.
Vision-language preprocessing paths
phyai-utils-tools/src/phyai_utils_tools/models/gr00t/processor_gr00t.py
Implements injected and native VLM preprocessing paths, token expansion, tokenizer/image parameter caching, and public exports.

phyai GR00T-N1.7 model and engine plugin

Layer / File(s) Summary
Engine wiring and package exports
phyai/src/phyai/engine.py, phyai/src/phyai/models/gr00t_n17/__init__.py
Registers the plugin import and defines package-level re-exports for GR00T-N1.7 symbols.
Configuration dataclasses
phyai/src/phyai/models/gr00t_n17/configuration_gr00t_n17.py
Adds validated processor, backbone, DiT, VL self-attention, action-head, and top-level config dataclasses.
Native Qwen3-VL backbone implementation
phyai/src/phyai/models/gr00t_n17/qwen3_vl_native.py
Implements the native vision tower, text decoder, merged backbone core, graph planning, and conditional-generation wrapper.
GR00T-N1.7 backbone, DiT action head, and top-level model
phyai/src/phyai/models/gr00t_n17/modeling_gr00t_n17.py
Implements the backbone wrapper, embodiment-conditioned layers, DiT blocks, denoising loop, and top-level model container.
CUDA-graph model runners
phyai/src/phyai/models/gr00t_n17/model_runner_gr00t_n17.py
Implements backbone and action-head runners with graph capture/replay and eager fallback paths.
WS1 scheduler
phyai/src/phyai/models/gr00t_n17/scheduler_ws1_gr00t_n17.py
Adds the request dataclass and scheduler that manage runner lifecycle and per-step execution.
Engine plugin entry
phyai/src/phyai/models/gr00t_n17/main_gr00t_n17.py
Adds GR00TN17Args and GR00TN17Entry with remap composition, setup, step, close, and target dumping.

Estimated code review effort: 5 (Critical) | ~120 minutes

Suggested reviewers: chenghuaWang

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly matches the main change: adding GR00T-N1.7 support to PhyAI.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@chenghuaWang chenghuaWang changed the title Feat support gr00t feat(phyai): Support gr00t Jul 3, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces native inference support for the GR00T-N1.7 model, moving processor operations to a workspace leaf package (phyai-utils-tools) and implementing native Qwen3-VL/Cosmos-Reason2 backbone and action head components in phyai. The review feedback highlights several critical improvements: validating batch dimensions in ops_gr00t.py to prevent silent truncation during zipping; invoking ReplicatedLinear instances directly instead of calling .forward() to ensure PyTorch hooks are triggered, while updating subclasses to call super().forward() to avoid infinite recursion; and adding shape and index bounds checks in processor_gr00t.py to prevent silent incorrect slicing of actions and unhandled IndexError exceptions when processing image tokens.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread phyai-utils-tools/src/phyai_utils_tools/models/gr00t/ops_gr00t.py
Comment thread phyai/src/phyai/models/gr00t_n17/modeling_gr00t_n17.py
Comment thread phyai/src/phyai/models/gr00t_n17/modeling_gr00t_n17.py Outdated
Comment thread phyai/src/phyai/models/gr00t_n17/qwen3_vl_native.py
Comment thread phyai/src/phyai/models/gr00t_n17/qwen3_vl_native.py Outdated
Comment thread phyai-utils-tools/src/phyai_utils_tools/models/gr00t/processor_gr00t.py Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 15

🧹 Nitpick comments (5)
phyai/src/phyai/models/gr00t_n17/main_gr00t_n17.py (1)

107-152: 🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick win

No cleanup if setup fails mid-way.

If load_pretrained (or _load_qwen3vl_model) raises after self.model is assigned, self.model is left populated with a half-initialized (no scheduler) model — dump_targets() would then expose an incompletely loaded model instead of failing cleanly, and any GPU memory already allocated for the model is not released.

🛡️ Suggested fix
     def setup(self, args: EntryArgs) -> None:
         if not isinstance(args, GR00TN17Args):
             raise TypeError(
                 "GR00TN17Entry.setup expected GR00TN17Args, got "
                 f"{type(args).__name__}."
             )
-        eng = get_engine_config()
+        eng = get_engine_config()
+        try:
+            self._setup_impl(args, eng)
+        except Exception:
+            self.model = None
+            self.scheduler = None
+            raise
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@phyai/src/phyai/models/gr00t_n17/main_gr00t_n17.py` around lines 107 - 152,
The GR00TN17Entry.setup path can leave a partially initialized model behind if
_load_qwen3vl_model or load_pretrained fails after self.model is assigned.
Update setup to clean up any allocated state on initialization failure by
wrapping the model/scheduler construction in error handling and resetting
self.model and any related fields (and releasing resources if applicable) before
re-raising. Keep the fix localized around GR00TN17Entry.setup, GR00TN17Model,
and GR00TN17WS1Scheduler so dump_targets() cannot observe a half-loaded model.
phyai/src/phyai/models/gr00t_n17/configuration_gr00t_n17.py (1)

176-179: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Duplicate attention_backend validation across three configs.

The same {"eager", "sdpa", "flashinfer"} membership check and error message pattern is repeated in GR00TN17BackboneConfig, GR00TN17DiTConfig, and GR00TN17VLSelfAttentionConfig. Extract a small shared validator to reduce duplication and keep the allowed backend set in one place.

♻️ Proposed helper
+_ATTENTION_BACKENDS = {"eager", "sdpa", "flashinfer"}
+
+
+def _validate_attention_backend(name: str, value: str) -> None:
+    if value not in _ATTENTION_BACKENDS:
+        raise ValueError(f"{name} must be one of: eager, sdpa, flashinfer.")

Then replace each inline check, e.g.:

-        if self.attention_backend not in {"eager", "sdpa", "flashinfer"}:
-            raise ValueError(
-                "backbone.attention_backend must be one of: eager, sdpa, flashinfer."
-            )
+        _validate_attention_backend("backbone.attention_backend", self.attention_backend)

Also applies to: 227-230, 260-263

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@phyai/src/phyai/models/gr00t_n17/configuration_gr00t_n17.py` around lines 176
- 179, The attention_backend membership check and error message are duplicated
in GR00TN17BackboneConfig, GR00TN17DiTConfig, and GR00TN17VLSelfAttentionConfig;
extract a small shared validator or constant for the allowed backends so the set
and message live in one place. Update the validation in the config classes to
call that shared helper instead of repeating the inline {"eager", "sdpa",
"flashinfer"} check, keeping the existing behavior and error text consistent.
phyai/src/phyai/models/gr00t_n17/scheduler_ws1_gr00t_n17.py (1)

50-71: 🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick win

No validation that use_cuda_graph=True requires a CUDA device.

device defaults to "cpu" (Line 63) while use_cuda_graph defaults to True (Line 56/64). If a caller instantiates the scheduler with the defaults on a CPU-only host, CUDA graph capture in GR00TN17ActionHeadRunner would likely fail at setup()/first forward() with an unclear low-level error instead of a descriptive one raised here.

Consider guarding at construction time, e.g. disabling/raising when use_cuda_graph is set but self.device.type != "cuda".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@phyai/src/phyai/models/gr00t_n17/scheduler_ws1_gr00t_n17.py` around lines 50
- 71, The scheduler initializer in GR00TN17SchedulerWS1GR00TN17 accepts
use_cuda_graph=True with a default CPU device, but it never validates that CUDA
graphs require a CUDA device. Add a construction-time guard in __init__ after
self.device is set and before creating GR00TN17ActionHeadRunner, so that when
use_cuda_graph is enabled and self.device.type is not "cuda" you either raise a
clear error or disable CUDA graph usage explicitly. Reference the existing
__init__, self.device, use_cuda_graph, and GR00TN17ActionHeadRunner wiring so
the fix stays localized.
phyai/src/phyai/models/gr00t_n17/modeling_gr00t_n17.py (2)

39-77: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Route PHYAI_GR00T_DIT_SDPA_MATH through phyai.env.

This adds a new PHYAI_* knob but reads os.environ directly, bypassing the repository’s env registry. Move the declaration/parsing to phyai/src/phyai/env.py and consume that helper here. As per coding guidelines, **/*.py: “Declare new PHYAI_* environment variables in phyai/src/phyai/env.py instead of reading os.environ ad hoc.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@phyai/src/phyai/models/gr00t_n17/modeling_gr00t_n17.py` around lines 39 - 77,
The SDPA math toggle is being read directly from os.environ in
_dit_math_sdpa_context, which bypasses the centralized PHYAI env registry. Move
the PHYAI_GR00T_DIT_SDPA_MATH declaration/parsing into phyai.env, then update
modeling_gr00t_n17.py to consume that helper instead of reading the environment
ad hoc. Use the existing _DIT_SDPA_MATH_ENV symbol and _dit_math_sdpa_context as
the integration points.

Source: Coding guidelines


818-839: 🚀 Performance & Scalability | 🔵 Trivial | ⚡ Quick win

Default the attention backend to flashinfer.

The constructor-level default and docstring currently make direct GR00TN17Attention usage prefer sdpa; keep the masked SDPA fallback, but make the default backend flashinfer.

Proposed change
-    The default backend is ``"sdpa"`` (phyai's recommended path, CUDA-graph
+    The default backend is ``"flashinfer"`` (phyai's recommended path, CUDA-graph
@@
-        backend: str = "sdpa",
+        backend: str = "flashinfer",

As per coding guidelines, phyai/**/*.py: “Use flashinfer by default for attention.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@phyai/src/phyai/models/gr00t_n17/modeling_gr00t_n17.py` around lines 818 -
839, Update GR00TN17Attention so the constructor default backend is flashinfer
instead of sdpa, and adjust the docstring text to reflect that flashinfer is the
default path while sdpa remains only the masked/cross-attention fallback. Keep
the _SUPPORTED_BACKENDS set unchanged and make sure the __init__ signature and
any backend-selection logic in modeling_gr00t_n17.py are consistent with the new
default.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@phyai-utils-tools/src/phyai_utils_tools/models/gr00t/ops_gr00t.py`:
- Around line 461-468: Validate batch dimensions in relative_non_eef_to_absolute
and the EEF conversion path before any broadcasting or zip-based iteration. The
current logic in relative_non_eef_to_absolute and the zip(rel, state) flow can
silently truncate or mix mismatched batch sizes. Add explicit shape checks in
these helpers, and only allow a single reference batch to be broadcast
intentionally; otherwise raise on incompatible batch dimensions.
- Around line 405-415: The normalization helpers currently create outputs with
np.zeros_like(values), which preserves the input dtype and can truncate
fractional results for integer arrays. Update normalize_values_minmax and the
other normalization functions referenced in this block to always allocate
floating-point output (for example by using a float dtype based on the input or
explicit casting) so assignments of normalized values keep decimals. Keep the
fix localized to the normalization helper implementations so their return values
are consistently float arrays.

In `@phyai-utils-tools/src/phyai_utils_tools/models/gr00t/processor_gr00t.py`:
- Around line 514-521: The state normalization logic in processor_gr00t.py is
ignoring the global use_mean_std flag unless the key is also in
mean_std_embedding_keys, so both the normalize and unnormalize paths can
incorrectly fall back to min/max. Update the branching in the relevant state
handling inside the processor methods that use normalize_values_meanstd /
normalize_values_minmax and their inverse counterparts so that use_mean_std is
honored globally, while still allowing mean_std_embedding_keys to override
per-key behavior where intended.
- Around line 298-302: The tokenizer setup in the GR00T processor only left-pads
the VLM processor’s tokenizer, but injected tokenizers passed through the
processor keep their default padding side. Update the tokenizer initialization
path in the processor class so the injected tokenizer assigned to
self._tokenizer is also forced to padding_side = "left", matching the behavior
already applied to self.vlm_processor.tokenizer. Apply the same fix in the other
tokenizer setup block referenced by the comment so both code paths stay
consistent.
- Line 349: The override merge in ProcessorGr00t from_pretrained is dropping
explicit None values, so intentional None-based constructor settings never reach
initialization. Update the kwargs/overrides handling in the from_pretrained path
(the kwargs.update(...) logic in ProcessorGr00t) so explicit None entries are
preserved and only truly absent values are ignored, keeping named options like
crop/resize knobs able to be disabled via None.
- Around line 590-621: Add cross-modal batch validation in _validate_observation
for GR00TObservation so state, language, and video inputs share the same batch
size before preprocessing. In _process_vlm, explicitly check that each modality
tensor/list uses the same leading batch dimension instead of relying on zip()
truncation, and raise a clear ValueError when state, action_mask, embodiment_id,
language, or video batch sizes diverge. Use the existing _validate_observation
and _process_vlm symbols to place the new checks near the current shape
validations.

In `@phyai/src/phyai/models/gr00t_n17/main_gr00t_n17.py`:
- Around line 137-144: The GR00TN17 setup path is calling a private backbone
helper, so make the Qwen3VL loader part of the public API. Rename
`_load_qwen3vl_model()` to `load_qwen3vl_model()` in `modeling_gr00t_n17.py`,
then update `GR00TN17Entry.setup()` to call the new public method and adjust any
other internal references to the renamed symbol so the backbone surface is
consistent.

In `@phyai/src/phyai/models/gr00t_n17/model_runner_gr00t_n17.py`:
- Around line 10-35: The new PHYAI_GR00T_BACKBONE_CUDA_GRAPH flag is being read
directly in model_runner_gr00t_n17.py, bypassing centralized validation. Move
its definition/parsing into phyai.env and replace _backbone_cuda_graph_enabled()
in GR00T N17 model_runner to use the shared env helper instead of os.environ, so
invalid values are handled consistently with the other PHYAI_* settings.
- Around line 85-90: The CUDA-graph branch in model_runner_gr00t_n17.py is
passing backbone_inputs positionally into backbone.backbone_graph_plan, but that
API is keyword-only and will fail when _backbone_cuda_graph_enabled() is true.
Update the call in Gr00tN17ModelRunner around the backbone_graph_plan /
_replay_or_capture path to unpack the graph-plan inputs into the expected
keyword arguments before calling backbone_graph_plan, then keep using the
returned core_fn, buffers, key, and model_inputs for build_graph_output.
- Line 299: The replayed action outputs from graph.replay(inputs) are returning
a tensor that aliases static CUDA-graph buffers, so later replays can overwrite
values still held by the caller. In the model runner path that returns
graph.replay(inputs), clone the replayed output before returning it, following
the same pattern used in the backbone path notes, so each call hands back an
independent tensor.

In `@phyai/src/phyai/models/gr00t_n17/modeling_gr00t_n17.py`:
- Around line 1518-1524: The `denoise_step` path in
`GR00TN17NativeImplementationError` handling currently ignores `action_mask` by
leaving `vel_strength` as all ones, so masked requests can slip through
unconstrained. Update `prepare_action_input`/`denoise_step` in
`modeling_gr00t_n17.py` to either apply `action_mask` when computing
`vel_strength` or explicitly reject any non-None `action_mask` alongside
`action` with a clear `GR00TN17NativeImplementationError`, and make the same fix
in the duplicated logic around `denoise_step` in the later block.
- Around line 1091-1100: The block selection in GR00TN17BasicTransformerBlock
should not depend on index parity; instead, use each block’s is_cross_attention
state to decide whether it is an alternate VL-DiT/self-attention block. Update
the block construction in the GR00TN17 model so cross_attention_dim is set based
on that block property, and mirror the same logic in the forward path that
currently branches on idx % 2. Make sure the code paths referenced by
use_alternate_vl_dit and interleave_self_attention stay consistent so action
embeddings are never routed into cross-attention projections with mismatched
dimensions.

In `@phyai/src/phyai/models/gr00t_n17/qwen3_vl_native.py`:
- Around line 368-381: The attention constructors in qwen3_vl_native.py still
default to sdpa, so callers that omit attention_backend bypass the required
production default; update the default to flashinfer in each affected
constructor and keep the existing CP-environment fallback to MagiAttention where
applicable. Make the change consistently across the named init paths (including
the Attention setup and any wrapper modules using _attention_backend_kwargs) so
the native attention stack resolves to flashinfer unless CP is set.
- Around line 653-673: The _vision_preamble method’s return annotation is wrong
because it currently declares only three torch.Tensor values while it actually
returns four values, including seqlens as a list[int]. Update the
_vision_preamble signature in qwen3_vl_native.py to match the real return shape
and types, keeping the tuple annotation aligned with pos_embed, rotary_pos_emb,
cu_seqlens, and seqlens so the function remains correctly typed and consistent
with its callers.
- Around line 1447-1459: The graph path in qwen3_vl_native.py bypasses the
image-token/feature alignment validation that the eager path does via
get_placeholder_mask, so malformed inputs can reach index_copy_ during CUDA
graph capture or replay. Add the same pre-capture count check in the graph-path
setup around image_index and the buffers assembly, using the existing
image-token count logic from the eager path, so invalid preprocessing fails
before capture.

---

Nitpick comments:
In `@phyai/src/phyai/models/gr00t_n17/configuration_gr00t_n17.py`:
- Around line 176-179: The attention_backend membership check and error message
are duplicated in GR00TN17BackboneConfig, GR00TN17DiTConfig, and
GR00TN17VLSelfAttentionConfig; extract a small shared validator or constant for
the allowed backends so the set and message live in one place. Update the
validation in the config classes to call that shared helper instead of repeating
the inline {"eager", "sdpa", "flashinfer"} check, keeping the existing behavior
and error text consistent.

In `@phyai/src/phyai/models/gr00t_n17/main_gr00t_n17.py`:
- Around line 107-152: The GR00TN17Entry.setup path can leave a partially
initialized model behind if _load_qwen3vl_model or load_pretrained fails after
self.model is assigned. Update setup to clean up any allocated state on
initialization failure by wrapping the model/scheduler construction in error
handling and resetting self.model and any related fields (and releasing
resources if applicable) before re-raising. Keep the fix localized around
GR00TN17Entry.setup, GR00TN17Model, and GR00TN17WS1Scheduler so dump_targets()
cannot observe a half-loaded model.

In `@phyai/src/phyai/models/gr00t_n17/modeling_gr00t_n17.py`:
- Around line 39-77: The SDPA math toggle is being read directly from os.environ
in _dit_math_sdpa_context, which bypasses the centralized PHYAI env registry.
Move the PHYAI_GR00T_DIT_SDPA_MATH declaration/parsing into phyai.env, then
update modeling_gr00t_n17.py to consume that helper instead of reading the
environment ad hoc. Use the existing _DIT_SDPA_MATH_ENV symbol and
_dit_math_sdpa_context as the integration points.
- Around line 818-839: Update GR00TN17Attention so the constructor default
backend is flashinfer instead of sdpa, and adjust the docstring text to reflect
that flashinfer is the default path while sdpa remains only the
masked/cross-attention fallback. Keep the _SUPPORTED_BACKENDS set unchanged and
make sure the __init__ signature and any backend-selection logic in
modeling_gr00t_n17.py are consistent with the new default.

In `@phyai/src/phyai/models/gr00t_n17/scheduler_ws1_gr00t_n17.py`:
- Around line 50-71: The scheduler initializer in GR00TN17SchedulerWS1GR00TN17
accepts use_cuda_graph=True with a default CPU device, but it never validates
that CUDA graphs require a CUDA device. Add a construction-time guard in
__init__ after self.device is set and before creating GR00TN17ActionHeadRunner,
so that when use_cuda_graph is enabled and self.device.type is not "cuda" you
either raise a clear error or disable CUDA graph usage explicitly. Reference the
existing __init__, self.device, use_cuda_graph, and GR00TN17ActionHeadRunner
wiring so the fix stays localized.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 6abfde87-5c60-4765-8cea-7f93188970ab

📥 Commits

Reviewing files that changed from the base of the PR and between 1644b93 and 3b59a6c.

📒 Files selected for processing (12)
  • phyai-utils-tools/pyproject.toml
  • phyai-utils-tools/src/phyai_utils_tools/models/gr00t/__init__.py
  • phyai-utils-tools/src/phyai_utils_tools/models/gr00t/ops_gr00t.py
  • phyai-utils-tools/src/phyai_utils_tools/models/gr00t/processor_gr00t.py
  • phyai/src/phyai/engine.py
  • phyai/src/phyai/models/gr00t_n17/__init__.py
  • phyai/src/phyai/models/gr00t_n17/configuration_gr00t_n17.py
  • phyai/src/phyai/models/gr00t_n17/main_gr00t_n17.py
  • phyai/src/phyai/models/gr00t_n17/model_runner_gr00t_n17.py
  • phyai/src/phyai/models/gr00t_n17/modeling_gr00t_n17.py
  • phyai/src/phyai/models/gr00t_n17/qwen3_vl_native.py
  • phyai/src/phyai/models/gr00t_n17/scheduler_ws1_gr00t_n17.py

Comment thread phyai-utils-tools/src/phyai_utils_tools/models/gr00t/ops_gr00t.py Outdated
Comment thread phyai-utils-tools/src/phyai_utils_tools/models/gr00t/ops_gr00t.py
Comment thread phyai-utils-tools/src/phyai_utils_tools/models/gr00t/processor_gr00t.py Outdated
Comment thread phyai-utils-tools/src/phyai_utils_tools/models/gr00t/processor_gr00t.py Outdated
Comment thread phyai/src/phyai/models/gr00t_n17/modeling_gr00t_n17.py
Comment on lines +1518 to +1524
dt = 1.0 / self.num_inference_timesteps
vel_strength = torch.ones_like(actions)
if action_input.action is not None:
raise GR00TN17NativeImplementationError(
"RTC/inpainting action inputs are not implemented in the native "
"GR00T-N1.7 action-head path yet."
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Do not silently ignore action_mask.

prepare_action_input carries action_mask, but denoise_step leaves vel_strength as all ones and only rejects action. Either apply the mask semantics or reject it as unsupported to avoid unconstrained masked-action requests.

Proposed safe rejection
         dt = 1.0 / self.num_inference_timesteps
         vel_strength = torch.ones_like(actions)
+        if action_input.action_mask is not None:
+            raise GR00TN17NativeImplementationError(
+                "action_mask inputs are not implemented in the native "
+                "GR00T-N1.7 action-head path yet."
+            )
         if action_input.action is not None:
             raise GR00TN17NativeImplementationError(

Also applies to: 1635-1640

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@phyai/src/phyai/models/gr00t_n17/modeling_gr00t_n17.py` around lines 1518 -
1524, The `denoise_step` path in `GR00TN17NativeImplementationError` handling
currently ignores `action_mask` by leaving `vel_strength` as all ones, so masked
requests can slip through unconstrained. Update
`prepare_action_input`/`denoise_step` in `modeling_gr00t_n17.py` to either apply
`action_mask` when computing `vel_strength` or explicitly reject any non-None
`action_mask` alongside `action` with a clear
`GR00TN17NativeImplementationError`, and make the same fix in the duplicated
logic around `denoise_step` in the later block.

Comment on lines +368 to +381
attention_backend: str = "sdpa",
) -> None:
super().__init__()
self.dim = int(config.hidden_size)
self.num_heads = int(config.num_heads)
self.head_dim = self.dim // self.num_heads
self.scaling = self.head_dim**-0.5
self.attention_backend = attention_backend
self.attn = Attention(
self.num_heads,
self.head_dim,
causal=False,
backend=attention_backend,
backend_kwargs=_attention_backend_kwargs(attention_backend),

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀 Performance & Scalability | 🟠 Major | ⚡ Quick win

Default the native attention path to flashinfer.

These constructors default to sdpa, so any caller that omits attention_backend bypasses the project’s required production default. Change the default to flashinfer throughout this stack, keeping the existing CPU fallback behavior. As per coding guidelines, "phyai/**/*.py: Use flashinfer by default for attention; when CP environment variable is set, use MagiAttention instead."

Proposed fix
-        attention_backend: str = "sdpa",
+        attention_backend: str = "flashinfer",

Apply the same default change to each constructor in the listed ranges.

Also applies to: 465-488, 528-558, 785-802, 930-940, 992-1012, 1199-1219, 1559-1569

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@phyai/src/phyai/models/gr00t_n17/qwen3_vl_native.py` around lines 368 - 381,
The attention constructors in qwen3_vl_native.py still default to sdpa, so
callers that omit attention_backend bypass the required production default;
update the default to flashinfer in each affected constructor and keep the
existing CP-environment fallback to MagiAttention where applicable. Make the
change consistently across the named init paths (including the Attention setup
and any wrapper modules using _attention_backend_kwargs) so the native attention
stack resolves to flashinfer unless CP is set.

Source: Coding guidelines

Comment thread phyai/src/phyai/models/gr00t_n17/qwen3_vl_native.py
Comment thread phyai/src/phyai/models/gr00t_n17/qwen3_vl_native.py
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant