feat(phyai): Add PhyAI native LIBERO pi0.5 benchmark support by rebecca26358 · Pull Request #21 · MEmbodied/phyai

rebecca26358 · 2026-06-09T04:30:10Z

Summary

Add a PhyAI model server for vla-evaluation-harness LIBERO benchmark integration.
Add PI05LiberoPolicy and PI05LiberoPipeline for LIBERO observation preprocessing and action postprocessing.
Add OpenPI pi05_libero -> PhyAI safetensors/config/processor conversion tool.
Update pi0.5 scheduler/model/backend logic for camera-mode support and numerical alignment.

Validation

python3 -m py_compile on changed Python entry points.
uv run pytest phyai-utils-tools/tests/test_pi05_libero_pipeline.py
LIBERO smoke test: 1/1 success.
LIBERO-Spatial first task, 50 episodes: 49/50 success (98.0%).

Notes

This PR uses the converted checkpoint path /mnt/data2/shared_models/pi05_libero_phyai_converted in the server config.
Auxiliary local alignment scripts and tmp_compare_pi05 were not included in the commit.

Summary by CodeRabbit

New Features
- Added a new LIBERO policy adapter with unified checkpoint loading and inference.
- Introduced environment variables to configure camera mode and tokenizer path.
- Improved handling of multi-camera observations, prompt generation, and normalization/unnormalization.
Bug Fixes
- Enhanced checkpoint compatibility via weight key remapping.
- Improved consistency of pre/post-processing using checkpoint-derived statistics, including compat-mode validation.

gemini-code-assist

Code Review

This pull request integrates the PhyAI pi0.5 model server with LIBERO, adding observation processing pipelines, high-level policy wrappers, a model server for the evaluation harness, and a checkpoint conversion tool. It also refactors the eager attention backend to support non-contiguous KV cache layouts and adds a CPU fallback for gated activations. Key feedback includes clipping digitized state values to prevent out-of-bounds bin indexing, safely handling null image resolutions in the configuration, flattening state arrays to handle extra batch dimensions robustly, and validating gated activation functions early when forcing the PyTorch fallback.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

gemini-code-assist · 2026-06-09T04:32:24Z

+            state_np = states.detach().cpu().numpy()
+            bins = np.linspace(-1.0, 1.0, 257)[:-1]
+            discretized = np.digitize(state_np, bins=bins) - 1


If any state value in state_np is less than -1.0 (due to noise or numerical precision), np.digitize will return 0, resulting in a bin index of -1. This -1 will be formatted as "-1" in the prompt, which tokenizes into two separate tokens ("-" and "1") instead of a single bin token, breaking the prompt format. Consider clipping the discretized bin indices to [0, 255] to ensure they are always valid.

Suggested change

state_np = states.detach().cpu().numpy()

bins = np.linspace(-1.0, 1.0, 257)[:-1]

discretized = np.digitize(state_np, bins=bins) - 1

state_np = states.detach().cpu().numpy()

bins = np.linspace(-1.0, 1.0, 257)[:-1]

discretized = np.clip(np.digitize(state_np, bins=bins) - 1, 0, 255)

gemini-code-assist · 2026-06-09T04:32:24Z

+        self.image_size = int(
+            image_size
+            or self.config.get("image_resolution", [224, 224])[0]
+            or 224
+        )


If "image_resolution" is explicitly set to null in the JSON configuration file, self.config.get("image_resolution", [224, 224]) will return None, causing None[0] to raise a TypeError: 'NoneType' object is not subscriptable. Consider safely checking the type and value of "image_resolution" before indexing.

image_res = self.config.get("image_resolution") self.image_size = int( image_size or (image_res[0] if isinstance(image_res, list) and image_res else None) or 224 )

gemini-code-assist · 2026-06-09T04:32:24Z

+        arr = np.asarray(state, dtype=np.float32)
+        if arr.shape != (8,):
+            raise ValueError(f"Expected LIBERO state shape (8,), got {arr.shape}")


If the state has an extra batch dimension (e.g., shape (1, 8)), arr.shape != (8,) will raise a ValueError. Flattening the array first ensures it is robust to such shape variations.

Suggested change

arr = np.asarray(state, dtype=np.float32)

if arr.shape != (8,):

raise ValueError(f"Expected LIBERO state shape (8,), got {arr.shape}")

arr = np.asarray(state, dtype=np.float32).flatten()

if arr.shape != (8,):

raise ValueError(f"Expected LIBERO state shape (8,), got {arr.shape}")

gemini-code-assist · 2026-06-09T04:32:24Z

+    if os.environ.get("PHYAI_FORCE_TORCH_GATED_ACT") == "1":
+        return _torch_gated_act(canon)


When PHYAI_FORCE_TORCH_GATED_ACT is set to "1", _torch_gated_act(canon) is returned immediately without validating if canon is a supported activation function. This delays the ValueError until the first forward pass. Consider validating canon inside _torch_gated_act or before returning it to fail early during model construction.

Suggested change

if os.environ.get("PHYAI_FORCE_TORCH_GATED_ACT") == "1":

return _torch_gated_act(canon)

if os.environ.get("PHYAI_FORCE_TORCH_GATED_ACT") == "1":

if canon not in _GATED_ACTS:

raise ValueError(

f"Unsupported gated activation {name!r}; expected one of {_GATED_ACTS!r}."

)

return _torch_gated_act(canon)

chenghuaWang · 2026-06-09T05:20:43Z

@@ -0,0 +1,19 @@
+# PhyAI pi0.5 model server — LIBERO smoke/integration config.


do not use chinese comments.

why using PHYAI_FORCE_TORCH_GATED_ACT ?

chenghuaWang · 2026-06-09T05:21:35Z

+    def _resolve_tokenizer_path(self, name_or_path: str) -> str:
+        local_candidates = {
+            "google/paligemma-3b-pt-224": [
+                Path("/mnt/data2/shared_models/paligemma-3b-pt-224"),


absolute path is not allowed

chenghuaWang · 2026-06-09T05:22:26Z

+    "numpy",
+    "safetensors",
+    "torch==2.11",
+    "transformers==5.8.1",


is transformers really needed

coderabbitai · 2026-06-15T13:10:29Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 12314d44-3324-483f-9650-3ae8cbc61574

📥 Commits

Reviewing files that changed from the base of the PR and between 02e940a and 02e4160.

📒 Files selected for processing (3)

phyai/src/phyai/env.py
phyai/src/phyai/policies/__init__.py
phyai/src/phyai/policies/pi05_libero.py

✅ Files skipped from review due to trivial changes (1)

phyai/src/phyai/policies/init.py

🚧 Files skipped from review as they are similar to previous changes (1)

phyai/src/phyai/policies/pi05_libero.py

📝 Walkthrough

Walkthrough

Adds PI05LiberoPolicy as a public adapter that loads checkpoint-driven settings, converts LIBERO observations into model inputs, runs phyai.engine.Engine, and returns postprocessed NumPy actions. It also adds two policy-related environment variables.

Changes

PI05LiberoPolicy LIBERO Adapter

Layer / File(s)	Summary
Weight remap and package export `phyai/src/phyai/policies/pi05_libero.py`, `phyai/src/phyai/policies/__init__.py`	`_lerobot_pi05_weight_remap` strips `model.` prefixes and drops `lm_head.weight`. `PI05LiberoPolicy` is re-exported as the sole public symbol of `phyai.policies`.
Policy construction and configuration resolution `phyai/src/phyai/env.py`, `phyai/src/phyai/policies/pi05_libero.py`	`PI05LiberoPolicy.__init__` reads checkpoint config and JSON metadata, resolves cameras/tokenizer/modes, validates compat stats, builds `PI05Processor`, and wires `phyai.engine.Engine`. `PHYAI_CAMERA_MODE` and `PHYAI_TOKENIZER_PATH` are added to the env registry.
Observation preprocessing and tokenization `phyai/src/phyai/policies/pi05_libero.py`	`observation_to_raw` and `observation_to_request_inputs` extract images, state, and task; standard mode delegates to `PI05Processor.preprocess`; compat mode builds pixel values, normalizes state, and tokenizes prompts. Image extraction, resize/pad, state normalization, and prompt formatting are implemented in the same flow.
Inference execution and action postprocessing `phyai/src/phyai/policies/pi05_libero.py`	`infer()` builds a `PI05Request`, runs `engine.step()` under `torch.inference_mode()`, and returns `{"actions": ...}`. `_postprocess_actions` handles standard and compat-mode action conversion, and `close()` shuts down the engine.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐇 I hopped through configs, tidy and neat,
With camera names and tokens to greet.
I nibbled prefixes, trimmed them away,
Then spun raw obs into actions to play.
My little engine hummed, vroom-vroom, hooray!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 19.48% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly matches the main change: adding native PhyAI support for the LIBERO pi0.5 benchmark.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

_{Comment @coderabbitai help to get the list of available commands.}

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (1)

phyai-utils-tools/tests/test_pi05_libero_pipeline.py (1)

23-24: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Make this test hermetic; avoid machine-specific absolute checkpoint paths.

Hard-coding /mnt/data2/... makes this test environment-dependent and brittle in CI/dev machines that lack that mount.

Minimal stabilization option

+import os
+import pytest
 ...
-    checkpoint = Path("/mnt/data2/shared_models/pi05_libero_finetuned_v044")
+    checkpoint_env = os.environ.get("PI05_LIBERO_TEST_CHECKPOINT")
+    if not checkpoint_env:
+        pytest.skip("Set PI05_LIBERO_TEST_CHECKPOINT to run this integration-backed test.")
+    checkpoint = Path(checkpoint_env)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@phyai-utils-tools/tests/test_pi05_libero_pipeline.py` around lines 23 - 24,
The test has a hard-coded machine-specific absolute path
`/mnt/data2/shared_models/pi05_libero_finetuned_v044` for the checkpoint
parameter in the PI05LiberoPipeline instantiation, which makes the test fail on
machines without that mount and breaks in CI environments. Replace this absolute
path with a hermetic approach such as using a pytest fixture that provides a
valid checkpoint path, mocking the checkpoint parameter, using a relative test
resource path, or employing an environment variable with a fallback to test data
bundled with the repository. Ensure the checkpoint assignment on line 23 no
longer references the machine-specific mount path.

🧹 Nitpick comments (2)

tools/convert_openpi_pi05_to_phyai.py (1)

185-186: 💤 Low value

Prefer direct attribute access over getattr with constant attribute.

Using getattr(node, "shape") provides no additional safety over direct attribute access when the attribute name is a constant string.

Suggested fix

-        shape = tuple(getattr(node, "shape"))
-        dtype = getattr(node, "dtype", None)
+        shape = tuple(node.shape)
+        dtype = getattr(node, "dtype", None)  # dtype may be absent, keep getattr with default

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/convert_openpi_pi05_to_phyai.py` around lines 185 - 186, Replace the
getattr calls with direct attribute access since both "shape" and "dtype" are
constant attribute names. Change the line `shape = tuple(getattr(node,
"shape"))` to `shape = tuple(node.shape)`. For the dtype line using
`getattr(node, "dtype", None)`, either change it to direct attribute access
`dtype = node.dtype` if the attribute is guaranteed to exist, or use a hasattr
check if the attribute may not be present, rather than relying on getattr with a
default value for a constant attribute name.

Source: Linters/SAST tools

phyai-utils-tools/src/phyai_utils_tools/pipeline/pi05_libero.py (1)

323-327: ⚡ Quick win

Harden mean/std unnormalization against stat/action dimensional mismatches.

_unnormalize_action() handles dim mismatch in the quantile path but not in the mean/std path. A checkpoint with padded stats can trigger a broadcast failure here.

Proposed fix

         mean = self._unnormalizer_stats.get("action.mean")
         std = self._unnormalizer_stats.get("action.std")
         if mean is None or std is None:
             return action
-        return action * torch.clamp(std.to(action), min=1e-8) + mean.to(action)
+        mean_t = mean.to(action)
+        std_t = torch.clamp(std.to(action), min=1e-8)
+        dim = min(action.shape[-1], mean_t.shape[-1], std_t.shape[-1])
+        head = action[..., :dim] * std_t[..., :dim] + mean_t[..., :dim]
+        if dim == action.shape[-1]:
+            return head
+        return torch.cat([head, action[..., dim:]], dim=-1)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@phyai-utils-tools/src/phyai_utils_tools/pipeline/pi05_libero.py` around lines
323 - 327, The mean/std unnormalization path in the _unnormalize_action() method
does not handle dimensional mismatches between the stats tensors and the action
tensor, unlike the quantile path. When checkpoint stats are padded, the
broadcast operation on the return line can fail. Add dimensional mismatch
handling to the mean and std tensors before they are used in the unnormalization
calculation (reshape or slice them to match the action tensor dimensions),
similar to the pattern used in the quantile unnormalization path within the same
method.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@phyai/src/phyai/models/pi05/modeling_pi05.py`:
- Line 603: The docstring for the embed_lang() function contains outdated
information about vision embedding scaling that no longer reflects the current
behavior after the recent changes to embed_scale and removal of vision-side
scaling. Update the docstring to remove or correct the reference to
sqrt(projection_dim) scaling for vision embeddings, ensuring the documentation
accurately describes the current scaling behavior and does not mislead future
modifications.

In `@tools/convert_openpi_pi05_to_phyai.py`:
- Around line 545-549: The temporary file created by mkstemp is not cleaned up
if save_file or os.replace raises an exception, leading to orphaned files
accumulating on disk. Wrap the save_file and os.replace calls in a try-finally
block to ensure the temporary file is properly deleted even when exceptions
occur. In the finally block, check if the temporary file still exists and remove
it using os.remove to guarantee cleanup regardless of success or failure.

---

Duplicate comments:
In `@phyai-utils-tools/tests/test_pi05_libero_pipeline.py`:
- Around line 23-24: The test has a hard-coded machine-specific absolute path
`/mnt/data2/shared_models/pi05_libero_finetuned_v044` for the checkpoint
parameter in the PI05LiberoPipeline instantiation, which makes the test fail on
machines without that mount and breaks in CI environments. Replace this absolute
path with a hermetic approach such as using a pytest fixture that provides a
valid checkpoint path, mocking the checkpoint parameter, using a relative test
resource path, or employing an environment variable with a fallback to test data
bundled with the repository. Ensure the checkpoint assignment on line 23 no
longer references the machine-specific mount path.

---

Nitpick comments:
In `@phyai-utils-tools/src/phyai_utils_tools/pipeline/pi05_libero.py`:
- Around line 323-327: The mean/std unnormalization path in the
_unnormalize_action() method does not handle dimensional mismatches between the
stats tensors and the action tensor, unlike the quantile path. When checkpoint
stats are padded, the broadcast operation on the return line can fail. Add
dimensional mismatch handling to the mean and std tensors before they are used
in the unnormalization calculation (reshape or slice them to match the action
tensor dimensions), similar to the pattern used in the quantile unnormalization
path within the same method.

In `@tools/convert_openpi_pi05_to_phyai.py`:
- Around line 185-186: Replace the getattr calls with direct attribute access
since both "shape" and "dtype" are constant attribute names. Change the line
`shape = tuple(getattr(node, "shape"))` to `shape = tuple(node.shape)`. For the
dtype line using `getattr(node, "dtype", None)`, either change it to direct
attribute access `dtype = node.dtype` if the attribute is guaranteed to exist,
or use a hasattr check if the attribute may not be present, rather than relying
on getattr with a default value for a constant attribute name.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: b2b82cf2-b245-43ad-9c57-ef2603549d59

📥 Commits

Reviewing files that changed from the base of the PR and between 73335ef and 2dd7b90.

📒 Files selected for processing (9)

phyai-utils-tools/pyproject.toml
phyai-utils-tools/src/phyai_utils_tools/pipeline/__init__.py
phyai-utils-tools/src/phyai_utils_tools/pipeline/pi05_libero.py
phyai-utils-tools/src/phyai_utils_tools/pipeline/processors.py
phyai-utils-tools/tests/test_pi05_libero_pipeline.py
phyai/src/phyai/models/pi05/modeling_pi05.py
phyai/src/phyai/policies/__init__.py
phyai/src/phyai/policies/pi05_libero.py
tools/convert_openpi_pi05_to_phyai.py

coderabbitai

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@phyai/src/phyai/policies/pi05_libero.py`:
- Around line 159-170: The camera mode and tokenizer path logic in PI05Libero is
reading PHYAI_CAMERA_MODE and PHYAI_TOKENIZER_PATH directly from os.environ;
move both settings into phyai.env first and then consume that shared env API
from _resolve_camera_names and _resolve_tokenizer_name. Update the policy to use
the project’s environment accessors/constants instead of ad hoc os.environ
lookups so these new PHYAI_* settings are declared centrally in phyai.env.
- Around line 264-268: The camera selection in pi05_libero’s observation lookup
is too permissive because the fallback through
candidates.extend(images.values()) can silently pick an arbitrary camera like
agentview instead of failing when a wrist alias is missing. Update the lookup
logic in the policy code around the candidate resolution to only accept the
explicit aliases from keys (and any direct obs lookup), and remove the catch-all
images.values() fallback so missing expected cameras surface as a clear KeyError
rather than returning the wrong image.
- Around line 86-91: The compat normalization loading in the policy
initialization should fail fast instead of allowing missing stats to fall back
to raw scaling. Update the `_load_processor_state` usage in `PI05LiberoPolicy`
so that when `_use_phyai_compat` is enabled, missing `policy_preprocessor.json`
or `policy_postprocessor.json` tensors trigger an explicit error during setup
rather than continuing with unset normalizer/unnormalizer state; keep the check
close to `_normalizer_stats` and `_unnormalizer_stats` assignment so bad
checkpoint conversions are surfaced immediately.
- Around line 363-370: The prompt-building path in the state discretization
logic currently allows out-of-range bin IDs because `np.digitize(...)-1` can
produce `-1` for values below the lower bound. Update the `discretized` handling
in the `pi05_libero.py` prompt construction flow so the per-sample `state_bins`
are clamped into the valid bin index range before `state_str` is assembled,
keeping the `prompts.append(...)` output free of negative state IDs.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 55cbf657-91ab-441f-aaea-cc8373a37aba

📥 Commits

Reviewing files that changed from the base of the PR and between bcf6d97 and 02e940a.

📒 Files selected for processing (2)

phyai/src/phyai/policies/__init__.py
phyai/src/phyai/policies/pi05_libero.py

✅ Files skipped from review due to trivial changes (1)

phyai/src/phyai/policies/init.py

gemini-code-assist Bot reviewed Jun 9, 2026

View reviewed changes

chenghuaWang changed the title ~~Add PhyAI native LIBERO pi0.5 benchmark support~~ feat(phyai): Add PhyAI native LIBERO pi0.5 benchmark support Jun 9, 2026

chenghuaWang reviewed Jun 9, 2026

View reviewed changes

coderabbitai Bot reviewed Jun 15, 2026

View reviewed changes

Comment thread phyai/src/phyai/models/pi05/modeling_pi05.py

Comment thread tools/convert_openpi_pi05_to_phyai.py Outdated

rebecca26358 force-pushed the feature/phyai-libero-native-pi05 branch from bcf6d97 to 02e940a Compare June 24, 2026 14:17

coderabbitai Bot reviewed Jun 24, 2026

View reviewed changes

Comment thread phyai/src/phyai/policies/pi05_libero.py

Comment thread phyai/src/phyai/policies/pi05_libero.py Outdated

Comment thread phyai/src/phyai/policies/pi05_libero.py

Comment thread phyai/src/phyai/policies/pi05_libero.py

feat(phyai): add PI05 LIBERO policy adapter

02e4160

rebecca26358 force-pushed the feature/phyai-libero-native-pi05 branch from 02e940a to 02e4160 Compare June 24, 2026 14:31

		if os.environ.get("PHYAI_FORCE_TORCH_GATED_ACT") == "1":
		return _torch_gated_act(canon)

		@@ -0,0 +1,19 @@
		# PhyAI pi0.5 model server — LIBERO smoke/integration config.

Uh oh!

Conversation

rebecca26358 commented Jun 9, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Validation

Notes

Summary by CodeRabbit

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Jun 9, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 9, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 9, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 9, 2026

Choose a reason for hiding this comment

Uh oh!

chenghuaWang Jun 9, 2026

Choose a reason for hiding this comment

Uh oh!

chenghuaWang Jun 9, 2026

Choose a reason for hiding this comment

Uh oh!

chenghuaWang Jun 9, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot commented Jun 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

rebecca26358 commented Jun 9, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 15, 2026 •

edited

Loading