Skip to content

feat(phyai): Add PhyAI native LIBERO pi0.5 benchmark support#21

Open
rebecca26358 wants to merge 1 commit into
MEmbodied:mainfrom
rebecca26358:feature/phyai-libero-native-pi05
Open

feat(phyai): Add PhyAI native LIBERO pi0.5 benchmark support#21
rebecca26358 wants to merge 1 commit into
MEmbodied:mainfrom
rebecca26358:feature/phyai-libero-native-pi05

Conversation

@rebecca26358

@rebecca26358 rebecca26358 commented Jun 9, 2026

Copy link
Copy Markdown

Summary

  • Add a PhyAI model server for vla-evaluation-harness LIBERO benchmark integration.
  • Add PI05LiberoPolicy and PI05LiberoPipeline for LIBERO observation preprocessing and action postprocessing.
  • Add OpenPI pi05_libero -> PhyAI safetensors/config/processor conversion tool.
  • Update pi0.5 scheduler/model/backend logic for camera-mode support and numerical alignment.

Validation

  • python3 -m py_compile on changed Python entry points.
  • uv run pytest phyai-utils-tools/tests/test_pi05_libero_pipeline.py
  • LIBERO smoke test: 1/1 success.
  • LIBERO-Spatial first task, 50 episodes: 49/50 success (98.0%).

Notes

  • This PR uses the converted checkpoint path /mnt/data2/shared_models/pi05_libero_phyai_converted in the server config.
  • Auxiliary local alignment scripts and tmp_compare_pi05 were not included in the commit.

Summary by CodeRabbit

  • New Features

    • Added a new LIBERO policy adapter with unified checkpoint loading and inference.
    • Introduced environment variables to configure camera mode and tokenizer path.
    • Improved handling of multi-camera observations, prompt generation, and normalization/unnormalization.
  • Bug Fixes

    • Enhanced checkpoint compatibility via weight key remapping.
    • Improved consistency of pre/post-processing using checkpoint-derived statistics, including compat-mode validation.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request integrates the PhyAI pi0.5 model server with LIBERO, adding observation processing pipelines, high-level policy wrappers, a model server for the evaluation harness, and a checkpoint conversion tool. It also refactors the eager attention backend to support non-contiguous KV cache layouts and adds a CPU fallback for gated activations. Key feedback includes clipping digitized state values to prevent out-of-bounds bin indexing, safely handling null image resolutions in the configuration, flattening state arrays to handle extra batch dimensions robustly, and validating gated activation functions early when forcing the PyTorch fallback.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines +180 to +182
state_np = states.detach().cpu().numpy()
bins = np.linspace(-1.0, 1.0, 257)[:-1]
discretized = np.digitize(state_np, bins=bins) - 1

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

If any state value in state_np is less than -1.0 (due to noise or numerical precision), np.digitize will return 0, resulting in a bin index of -1. This -1 will be formatted as "-1" in the prompt, which tokenizes into two separate tokens ("-" and "1") instead of a single bin token, breaking the prompt format. Consider clipping the discretized bin indices to [0, 255] to ensure they are always valid.

Suggested change
state_np = states.detach().cpu().numpy()
bins = np.linspace(-1.0, 1.0, 257)[:-1]
discretized = np.digitize(state_np, bins=bins) - 1
state_np = states.detach().cpu().numpy()
bins = np.linspace(-1.0, 1.0, 257)[:-1]
discretized = np.clip(np.digitize(state_np, bins=bins) - 1, 0, 255)

Comment on lines +38 to +42
self.image_size = int(
image_size
or self.config.get("image_resolution", [224, 224])[0]
or 224
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

If "image_resolution" is explicitly set to null in the JSON configuration file, self.config.get("image_resolution", [224, 224]) will return None, causing None[0] to raise a TypeError: 'NoneType' object is not subscriptable. Consider safely checking the type and value of "image_resolution" before indexing.

        image_res = self.config.get("image_resolution")
        self.image_size = int(
            image_size
            or (image_res[0] if isinstance(image_res, list) and image_res else None)
            or 224
        )

Comment on lines +229 to +231
arr = np.asarray(state, dtype=np.float32)
if arr.shape != (8,):
raise ValueError(f"Expected LIBERO state shape (8,), got {arr.shape}")

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

If the state has an extra batch dimension (e.g., shape (1, 8)), arr.shape != (8,) will raise a ValueError. Flattening the array first ensures it is robust to such shape variations.

Suggested change
arr = np.asarray(state, dtype=np.float32)
if arr.shape != (8,):
raise ValueError(f"Expected LIBERO state shape (8,), got {arr.shape}")
arr = np.asarray(state, dtype=np.float32).flatten()
if arr.shape != (8,):
raise ValueError(f"Expected LIBERO state shape (8,), got {arr.shape}")

Comment thread phyai/src/phyai/layers/mlp/dense_mlp.py Outdated
Comment on lines +109 to +110
if os.environ.get("PHYAI_FORCE_TORCH_GATED_ACT") == "1":
return _torch_gated_act(canon)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

When PHYAI_FORCE_TORCH_GATED_ACT is set to "1", _torch_gated_act(canon) is returned immediately without validating if canon is a supported activation function. This delays the ValueError until the first forward pass. Consider validating canon inside _torch_gated_act or before returning it to fail early during model construction.

Suggested change
if os.environ.get("PHYAI_FORCE_TORCH_GATED_ACT") == "1":
return _torch_gated_act(canon)
if os.environ.get("PHYAI_FORCE_TORCH_GATED_ACT") == "1":
if canon not in _GATED_ACTS:
raise ValueError(
f"Unsupported gated activation {name!r}; expected one of {_GATED_ACTS!r}."
)
return _torch_gated_act(canon)

@chenghuaWang chenghuaWang changed the title Add PhyAI native LIBERO pi0.5 benchmark support feat(phyai): Add PhyAI native LIBERO pi0.5 benchmark support Jun 9, 2026
Comment thread configs/model_servers/phyai/libero.yaml Outdated
@@ -0,0 +1,19 @@
# PhyAI pi0.5 model server — LIBERO smoke/integration config.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. do not use chinese comments.
  2. why using PHYAI_FORCE_TORCH_GATED_ACT ?

def _resolve_tokenizer_path(self, name_or_path: str) -> str:
local_candidates = {
"google/paligemma-3b-pt-224": [
Path("/mnt/data2/shared_models/paligemma-3b-pt-224"),

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. absolute path is not allowed

"numpy",
"safetensors",
"torch==2.11",
"transformers==5.8.1",

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. is transformers really needed

@coderabbitai

coderabbitai Bot commented Jun 15, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 12314d44-3324-483f-9650-3ae8cbc61574

📥 Commits

Reviewing files that changed from the base of the PR and between 02e940a and 02e4160.

📒 Files selected for processing (3)
  • phyai/src/phyai/env.py
  • phyai/src/phyai/policies/__init__.py
  • phyai/src/phyai/policies/pi05_libero.py
✅ Files skipped from review due to trivial changes (1)
  • phyai/src/phyai/policies/init.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • phyai/src/phyai/policies/pi05_libero.py

📝 Walkthrough

Walkthrough

Adds PI05LiberoPolicy as a public adapter that loads checkpoint-driven settings, converts LIBERO observations into model inputs, runs phyai.engine.Engine, and returns postprocessed NumPy actions. It also adds two policy-related environment variables.

Changes

PI05LiberoPolicy LIBERO Adapter

Layer / File(s) Summary
Weight remap and package export
phyai/src/phyai/policies/pi05_libero.py, phyai/src/phyai/policies/__init__.py
_lerobot_pi05_weight_remap strips model. prefixes and drops lm_head.weight. PI05LiberoPolicy is re-exported as the sole public symbol of phyai.policies.
Policy construction and configuration resolution
phyai/src/phyai/env.py, phyai/src/phyai/policies/pi05_libero.py
PI05LiberoPolicy.__init__ reads checkpoint config and JSON metadata, resolves cameras/tokenizer/modes, validates compat stats, builds PI05Processor, and wires phyai.engine.Engine. PHYAI_CAMERA_MODE and PHYAI_TOKENIZER_PATH are added to the env registry.
Observation preprocessing and tokenization
phyai/src/phyai/policies/pi05_libero.py
observation_to_raw and observation_to_request_inputs extract images, state, and task; standard mode delegates to PI05Processor.preprocess; compat mode builds pixel values, normalizes state, and tokenizes prompts. Image extraction, resize/pad, state normalization, and prompt formatting are implemented in the same flow.
Inference execution and action postprocessing
phyai/src/phyai/policies/pi05_libero.py
infer() builds a PI05Request, runs engine.step() under torch.inference_mode(), and returns {"actions": ...}. _postprocess_actions handles standard and compat-mode action conversion, and close() shuts down the engine.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐇 I hopped through configs, tidy and neat,
With camera names and tokens to greet.
I nibbled prefixes, trimmed them away,
Then spun raw obs into actions to play.
My little engine hummed, vroom-vroom, hooray!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 19.48% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly matches the main change: adding native PhyAI support for the LIBERO pi0.5 benchmark.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
phyai-utils-tools/tests/test_pi05_libero_pipeline.py (1)

23-24: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Make this test hermetic; avoid machine-specific absolute checkpoint paths.

Hard-coding /mnt/data2/... makes this test environment-dependent and brittle in CI/dev machines that lack that mount.

Minimal stabilization option
+import os
+import pytest
 ...
-    checkpoint = Path("/mnt/data2/shared_models/pi05_libero_finetuned_v044")
+    checkpoint_env = os.environ.get("PI05_LIBERO_TEST_CHECKPOINT")
+    if not checkpoint_env:
+        pytest.skip("Set PI05_LIBERO_TEST_CHECKPOINT to run this integration-backed test.")
+    checkpoint = Path(checkpoint_env)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@phyai-utils-tools/tests/test_pi05_libero_pipeline.py` around lines 23 - 24,
The test has a hard-coded machine-specific absolute path
`/mnt/data2/shared_models/pi05_libero_finetuned_v044` for the checkpoint
parameter in the PI05LiberoPipeline instantiation, which makes the test fail on
machines without that mount and breaks in CI environments. Replace this absolute
path with a hermetic approach such as using a pytest fixture that provides a
valid checkpoint path, mocking the checkpoint parameter, using a relative test
resource path, or employing an environment variable with a fallback to test data
bundled with the repository. Ensure the checkpoint assignment on line 23 no
longer references the machine-specific mount path.
🧹 Nitpick comments (2)
tools/convert_openpi_pi05_to_phyai.py (1)

185-186: 💤 Low value

Prefer direct attribute access over getattr with constant attribute.

Using getattr(node, "shape") provides no additional safety over direct attribute access when the attribute name is a constant string.

Suggested fix
-        shape = tuple(getattr(node, "shape"))
-        dtype = getattr(node, "dtype", None)
+        shape = tuple(node.shape)
+        dtype = getattr(node, "dtype", None)  # dtype may be absent, keep getattr with default
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/convert_openpi_pi05_to_phyai.py` around lines 185 - 186, Replace the
getattr calls with direct attribute access since both "shape" and "dtype" are
constant attribute names. Change the line `shape = tuple(getattr(node,
"shape"))` to `shape = tuple(node.shape)`. For the dtype line using
`getattr(node, "dtype", None)`, either change it to direct attribute access
`dtype = node.dtype` if the attribute is guaranteed to exist, or use a hasattr
check if the attribute may not be present, rather than relying on getattr with a
default value for a constant attribute name.

Source: Linters/SAST tools

phyai-utils-tools/src/phyai_utils_tools/pipeline/pi05_libero.py (1)

323-327: ⚡ Quick win

Harden mean/std unnormalization against stat/action dimensional mismatches.

_unnormalize_action() handles dim mismatch in the quantile path but not in the mean/std path. A checkpoint with padded stats can trigger a broadcast failure here.

Proposed fix
         mean = self._unnormalizer_stats.get("action.mean")
         std = self._unnormalizer_stats.get("action.std")
         if mean is None or std is None:
             return action
-        return action * torch.clamp(std.to(action), min=1e-8) + mean.to(action)
+        mean_t = mean.to(action)
+        std_t = torch.clamp(std.to(action), min=1e-8)
+        dim = min(action.shape[-1], mean_t.shape[-1], std_t.shape[-1])
+        head = action[..., :dim] * std_t[..., :dim] + mean_t[..., :dim]
+        if dim == action.shape[-1]:
+            return head
+        return torch.cat([head, action[..., dim:]], dim=-1)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@phyai-utils-tools/src/phyai_utils_tools/pipeline/pi05_libero.py` around lines
323 - 327, The mean/std unnormalization path in the _unnormalize_action() method
does not handle dimensional mismatches between the stats tensors and the action
tensor, unlike the quantile path. When checkpoint stats are padded, the
broadcast operation on the return line can fail. Add dimensional mismatch
handling to the mean and std tensors before they are used in the unnormalization
calculation (reshape or slice them to match the action tensor dimensions),
similar to the pattern used in the quantile unnormalization path within the same
method.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@phyai/src/phyai/models/pi05/modeling_pi05.py`:
- Line 603: The docstring for the embed_lang() function contains outdated
information about vision embedding scaling that no longer reflects the current
behavior after the recent changes to embed_scale and removal of vision-side
scaling. Update the docstring to remove or correct the reference to
sqrt(projection_dim) scaling for vision embeddings, ensuring the documentation
accurately describes the current scaling behavior and does not mislead future
modifications.

In `@tools/convert_openpi_pi05_to_phyai.py`:
- Around line 545-549: The temporary file created by mkstemp is not cleaned up
if save_file or os.replace raises an exception, leading to orphaned files
accumulating on disk. Wrap the save_file and os.replace calls in a try-finally
block to ensure the temporary file is properly deleted even when exceptions
occur. In the finally block, check if the temporary file still exists and remove
it using os.remove to guarantee cleanup regardless of success or failure.

---

Duplicate comments:
In `@phyai-utils-tools/tests/test_pi05_libero_pipeline.py`:
- Around line 23-24: The test has a hard-coded machine-specific absolute path
`/mnt/data2/shared_models/pi05_libero_finetuned_v044` for the checkpoint
parameter in the PI05LiberoPipeline instantiation, which makes the test fail on
machines without that mount and breaks in CI environments. Replace this absolute
path with a hermetic approach such as using a pytest fixture that provides a
valid checkpoint path, mocking the checkpoint parameter, using a relative test
resource path, or employing an environment variable with a fallback to test data
bundled with the repository. Ensure the checkpoint assignment on line 23 no
longer references the machine-specific mount path.

---

Nitpick comments:
In `@phyai-utils-tools/src/phyai_utils_tools/pipeline/pi05_libero.py`:
- Around line 323-327: The mean/std unnormalization path in the
_unnormalize_action() method does not handle dimensional mismatches between the
stats tensors and the action tensor, unlike the quantile path. When checkpoint
stats are padded, the broadcast operation on the return line can fail. Add
dimensional mismatch handling to the mean and std tensors before they are used
in the unnormalization calculation (reshape or slice them to match the action
tensor dimensions), similar to the pattern used in the quantile unnormalization
path within the same method.

In `@tools/convert_openpi_pi05_to_phyai.py`:
- Around line 185-186: Replace the getattr calls with direct attribute access
since both "shape" and "dtype" are constant attribute names. Change the line
`shape = tuple(getattr(node, "shape"))` to `shape = tuple(node.shape)`. For the
dtype line using `getattr(node, "dtype", None)`, either change it to direct
attribute access `dtype = node.dtype` if the attribute is guaranteed to exist,
or use a hasattr check if the attribute may not be present, rather than relying
on getattr with a default value for a constant attribute name.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: b2b82cf2-b245-43ad-9c57-ef2603549d59

📥 Commits

Reviewing files that changed from the base of the PR and between 73335ef and 2dd7b90.

📒 Files selected for processing (9)
  • phyai-utils-tools/pyproject.toml
  • phyai-utils-tools/src/phyai_utils_tools/pipeline/__init__.py
  • phyai-utils-tools/src/phyai_utils_tools/pipeline/pi05_libero.py
  • phyai-utils-tools/src/phyai_utils_tools/pipeline/processors.py
  • phyai-utils-tools/tests/test_pi05_libero_pipeline.py
  • phyai/src/phyai/models/pi05/modeling_pi05.py
  • phyai/src/phyai/policies/__init__.py
  • phyai/src/phyai/policies/pi05_libero.py
  • tools/convert_openpi_pi05_to_phyai.py

Comment thread phyai/src/phyai/models/pi05/modeling_pi05.py
Comment thread tools/convert_openpi_pi05_to_phyai.py Outdated
@rebecca26358 rebecca26358 force-pushed the feature/phyai-libero-native-pi05 branch from bcf6d97 to 02e940a Compare June 24, 2026 14:17

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@phyai/src/phyai/policies/pi05_libero.py`:
- Around line 159-170: The camera mode and tokenizer path logic in PI05Libero is
reading PHYAI_CAMERA_MODE and PHYAI_TOKENIZER_PATH directly from os.environ;
move both settings into phyai.env first and then consume that shared env API
from _resolve_camera_names and _resolve_tokenizer_name. Update the policy to use
the project’s environment accessors/constants instead of ad hoc os.environ
lookups so these new PHYAI_* settings are declared centrally in phyai.env.
- Around line 264-268: The camera selection in pi05_libero’s observation lookup
is too permissive because the fallback through
candidates.extend(images.values()) can silently pick an arbitrary camera like
agentview instead of failing when a wrist alias is missing. Update the lookup
logic in the policy code around the candidate resolution to only accept the
explicit aliases from keys (and any direct obs lookup), and remove the catch-all
images.values() fallback so missing expected cameras surface as a clear KeyError
rather than returning the wrong image.
- Around line 86-91: The compat normalization loading in the policy
initialization should fail fast instead of allowing missing stats to fall back
to raw scaling. Update the `_load_processor_state` usage in `PI05LiberoPolicy`
so that when `_use_phyai_compat` is enabled, missing `policy_preprocessor.json`
or `policy_postprocessor.json` tensors trigger an explicit error during setup
rather than continuing with unset normalizer/unnormalizer state; keep the check
close to `_normalizer_stats` and `_unnormalizer_stats` assignment so bad
checkpoint conversions are surfaced immediately.
- Around line 363-370: The prompt-building path in the state discretization
logic currently allows out-of-range bin IDs because `np.digitize(...)-1` can
produce `-1` for values below the lower bound. Update the `discretized` handling
in the `pi05_libero.py` prompt construction flow so the per-sample `state_bins`
are clamped into the valid bin index range before `state_str` is assembled,
keeping the `prompts.append(...)` output free of negative state IDs.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 55cbf657-91ab-441f-aaea-cc8373a37aba

📥 Commits

Reviewing files that changed from the base of the PR and between bcf6d97 and 02e940a.

📒 Files selected for processing (2)
  • phyai/src/phyai/policies/__init__.py
  • phyai/src/phyai/policies/pi05_libero.py
✅ Files skipped from review due to trivial changes (1)
  • phyai/src/phyai/policies/init.py

Comment thread phyai/src/phyai/policies/pi05_libero.py
Comment thread phyai/src/phyai/policies/pi05_libero.py Outdated
Comment thread phyai/src/phyai/policies/pi05_libero.py
Comment thread phyai/src/phyai/policies/pi05_libero.py
@rebecca26358 rebecca26358 force-pushed the feature/phyai-libero-native-pi05 branch from 02e940a to 02e4160 Compare June 24, 2026 14:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants