Add script for e2e agentic scene gen by qianl-nv · Pull Request #778 · isaac-sim/IsaacLab-Arena

qianl-nv · 2026-06-11T07:29:24Z

Summary

Add environment_generation_runner for running the agentic environment generation end to end

Detailed description

What was the reason for the change?
We need an entry point to run the full agentic env gen pipeline for integration testing.
What has been changed?
A new script is added to run the agentic env gen. It does the following steps:
- Takes user prompt from CLI, pass through EnvGenAgent to generate EnvIntentSpec
- EnvIntentSpec compiles to UnresolvedArenaEnvGraphSpec (asset match into registry), then ArenaEnvGraphSpec (constraints populated). The final ArenaEnvGraphSpec serialized to yaml.
- ArenaEnvGraphSpec is loaded back from yaml, and build into an ArenaEnvironment
- Run zero_action policy simulation for visualizing output (with RelationSolver initializing layouts)
What is the impact of this change?
- The env generation agent is harden to reduce server side failure rate from ~30% to <1%.
- Small fixes are added to the agent: add anchor relation, and add lights, and the generate arena env is valid & can be visualized correctly in sim.
- Known limitations: no support for ObjectSet or ObjectReference. No user in the loop editing.

To run:
python -m isaaclab_arena.agentic_environment_generation.environment_generation_runner --num_envs 16 --env_spacing 1.5 --num-steps 2000 --viz kit --prompt "franka pick up avocado from the maple table and place it into a bowl on the table. there are other veggies on the table as distractor."

2026-4-14-agentic-arena-env-generation-design-google-docs (1)

mkdir(parents=True, exist_ok=True) on the output path's parent so a first-time run on a fresh checkout doesn't crash with FileNotFoundError when the llm_generated/ directory doesn't exist yet. Signed-off-by: Qian Lin <qianl@nvidia.com>

Replace all hard-coded [:10] / [:5] slices in TraceEvent candidate lists with a single named constant to make the limit explicit and easy to change in one place. Signed-off-by: Qian Lin <qianl@nvidia.com>

Replace frozenset with a plain set literal and add an inline comment to each entry explaining when that error stage fires. Signed-off-by: Qian Lin <qianl@nvidia.com>

Replace the Attributes block in TraceEvent with per-field docstrings, and move the numbered resolution strategy steps from resolve_item's docstring into inline comments at each corresponding code branch. Signed-off-by: Qian Lin <qianl@nvidia.com>

Consistent with background and item nodes, set id=query on the embodiment node so task params that reference the robot by the agent-emitted name (e.g. "franka") resolve correctly against known_ids, rather than silently missing when the resolved name differs ("franka_ik"). Signed-off-by: Qian Lin <qianl@nvidia.com>

When two items share the same query and neither has an instance_name they produce the same node ID silently. Mark the spot for a follow-up. Signed-off-by: Qian Lin <qianl@nvidia.com>

Signed-off-by: Qian Lin <qianl@nvidia.com>

Now that resolve_embodiment returns None on miss (no silent fallback), "embodiment.miss" is an error stage. Split the old combined test into two: one verifying tag-pool relaxation is non-fatal, and one asserting an unknown embodiment is correctly reported as a resolution error. Signed-off-by: Qian Lin <qianl@nvidia.com>

Move graph and trace printing into helpers so main() stays readable, and sanitize env_name when writing proposal YAML.

Add _agent_node_id so background, embodiment, and item resolvers share one rule: graph node ids match agent-emitted reference strings.

Align relation resolution trace events with SpatialRelationSpec.reference.

IntentCompiler.compile() produces UnresolvedArenaEnvGraphSpec; AssetMatcher handles catalog binding. Keeps UnresolvedArenaEnvGraphSpec.resolve() unchanged.

Route item, background, and embodiment matching through one resolve_name API with required/preferred tags and consistent _best_match ordering. IntentCompiler calls it directly.

IntentCompiler passes item, background, or embodiment so trace stages stay decoupled from required_tags like object.

isaaclab-review-bot

Code Review: Add script for e2e agentic scene gen

Nice work integrating the full agentic environment generation pipeline! The end-to-end runner is well-structured, and the fixes for anchor relations, dome lighting, and reasoning_content handling address real issues. A few observations below:

🔴 High Priority

1. Overly broad exception catch in retry loop (environment_generation_agent.py)

except (json.JSONDecodeError, Exception) as exc:
    last_exc = exc

(json.JSONDecodeError, Exception) is equivalent to just except Exception since every exception is an Exception. This means non-retryable errors (e.g. pydantic.ValidationError indicating a genuine schema mismatch, TypeError, or even KeyboardInterrupt propagating) will be silently swallowed and retried. Consider narrowing to:

except (json.JSONDecodeError, pydantic.ValidationError) as exc:
    last_exc = exc

This way, truly unexpected errors still bubble up immediately rather than wasting retry budget.

2. assert statements are not retried — inconsistent error handling

The assert choices and assert route != "empty" statements inside the retry loop raise AssertionError, which is not caught by the except block. This means:

Empty choices (content filter / guardrail) → immediate crash (no retry)
Malformed JSON → retried

If the intent is to also retry on empty/filtered responses (which are often transient), replace the asserts with explicit checks that raise a retriable exception:

if not choices:
    raise RuntimeError("No choices returned — content filter or rate limit")

And catch RuntimeError in the retry handler.

🟡 Medium Priority

3. Pose.identity() only applied for is_anchor — potential ordering concern

In arena_env_graph_conversion_utils.py:

if spatial_constraint.kind == "is_anchor" and subject_asset.get_initial_pose() is None:
    subject_asset.set_initial_pose(Pose.identity())

This works when spatial constraints are processed in a single pass, but if the same asset appears in multiple constraints and a later constraint sets a pose, it will overwrite the identity pose. The logic is fine today, but a comment noting that identity is intentionally a fallback-only default (and won't clobber an explicit pose) would prevent future confusion.

4. AssetRegistry() instantiation in fallback path

light_cls = AssetRegistry().get_asset_by_name("light")
scene_assets.append(light_cls())

If AssetRegistry is not a singleton, this creates a fresh registry every time the fallback fires. If it is a singleton, a brief comment confirming that would help readers. Also: get_asset_by_name("light") could return None if the registry doesn't have a "light" entry — a guard/assertion would prevent a confusing TypeError: 'NoneType' object is not callable.

5. Runner has no graceful cleanup on mid-pipeline failure

If the runner fails between environment construction (Step 5) and env.close() (end of Step 6), the Isaac Sim app may leak GPU resources. A try/finally around the env lifecycle would be more robust:

env = builder.make_registered()
try:
    # ... run policy ...
finally:
    env.close()

🟢 Low Priority / Suggestions

6. _safe_stem could produce collisions for different prompts

The regex re.sub(r"[^\w.-]+", "_", name) collapses distinct env names into the same stem (e.g. "pick & place" and "pick / place" both → "pick_place"). Adding a short hash suffix (e.g. first 6 chars of the prompt's MD5) would prevent silent overwrites in --out-dir.

7. No tests for the new runner or the retry logic

The new environment_generation_runner.py (217 lines) and the retry mechanism in generate_spec don't have corresponding tests. At minimum, a unit test for the retry path (mocking client.chat.completions.create to return malformed JSON then valid JSON) would prevent regressions. The extract_response_text preference change (JSON-prefixed reasoning_content) also warrants a test given it affects all DeepSeek calls.

8. System prompt reformatting — functional change buried in style commit

The system prompt now includes two new behavioral rules (is_anchor requirement, lighting auto-injection). These are meaningful semantic changes but the commit message says "Reformat system prompt for better readability". Consider documenting these behavioral additions in the PR description or a separate commit for traceability.

Summary

The core logic is sound — the retry mechanism, dome light injection, and anchor handling all address real failure modes. The main concern is the exception handling in the retry loop (items 1-2): as written, it catches too broadly while simultaneously not catching the assertion failures it shares scope with. A focused fix there would make the retry behavior predictable and robust.

qianl-nv added 30 commits June 9, 2026 17:53

Add resolver impl

d36aaf6

Simplify intent resolver validations

a93f653

Add docstrings

632a5d4

split unit test

adc1d13

Use tags on embodiment for pulling ik variants

1283adc

IntentResovler outputs UnresolvedArenaEnvGraphSpec

1d74ae9

IntentResolver directly copies tasks

bceab51

Extract magic candidate limit to _MAX_CANDIDATES constant

60ea964

Replace all hard-coded [:10] / [:5] slices in TraceEvent candidate lists with a single named constant to make the limit explicit and easy to change in one place. Signed-off-by: Qian Lin <qianl@nvidia.com>

Change _ERROR_TRACE_STAGES to plain set with per-entry comments

3256044

Replace frozenset with a plain set literal and add an inline comment to each entry explaining when that error stage fires. Signed-off-by: Qian Lin <qianl@nvidia.com>

Remove embodiment fallback

95a311f

Add TODO for duplicate item node ID detection

34176b9

When two items share the same query and neither has an instance_name they produce the same node ID silently. Mark the spot for a follow-up. Signed-off-by: Qian Lin <qianl@nvidia.com>

Fix ArenaEnvGraphNodeType import in test_asset_resolver

b3de23d

Signed-off-by: Qian Lin <qianl@nvidia.com>

Extract dump helpers in try_environment_intent_schema

6a80006

Move graph and trace printing into helpers so main() stays readable, and sanitize env_name when writing proposal YAML.

Centralize agent node id policy in IntentResolver

f9befc6

Add _agent_node_id so background, embodiment, and item resolvers share one rule: graph node ids match agent-emitted reference strings.

Add name in todo

33054ad

Rename unknown_parent trace stage to unknown_reference

6847875

Align relation resolution trace events with SpatialRelationSpec.reference.

Rename IntentResolver and AssetResolver to compiler/matcher

51f822d

IntentCompiler.compile() produces UnresolvedArenaEnvGraphSpec; AssetMatcher handles catalog binding. Keeps UnresolvedArenaEnvGraphSpec.resolve() unchanged.

Unify AssetMatcher on resolve_name with tag pools

e5e180b

Route item, background, and embodiment matching through one resolve_name API with required/preferred tags and consistent _best_match ordering. IntentCompiler calls it directly.

Add explicit trace_prefix to AssetMatcher.resolve_name

ab1ef28

IntentCompiler passes item, background, or embodiment so trace stages stay decoupled from required_tags like object.

Simpify asset matcher logic

b82faad

Make asset match a helper function and move to intent_compiler

4d613e8

rename to fuzzy_match

14b3657

Improve doces

334333c

Add missing trace error

0f0e89b

Add env_gen e2e script with zero_policy runner

b56db59

Increase max token to avoid truncating response

2708ae7

qianl-nv added 6 commits June 10, 2026 20:48

Fix missing anchor and set anchor init pose

24b3639

Reformat system prompt for better readibility

430ddcf

Handling reasoning_content better

84b59cc

Add agent retries

37db33a

Merge branch 'main' into qianl/dev/agentic_e2e

1a3482a

Inject dome light in intent compiler

c40cb3d

isaaclab-review-bot Bot reviewed Jun 11, 2026

View reviewed changes

Base automatically changed from qianl/dev/agentic_resolver to main June 11, 2026 15:03

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add script for e2e agentic scene gen#778

Add script for e2e agentic scene gen#778
qianl-nv wants to merge 36 commits into
mainfrom
qianl/dev/agentic_e2e

qianl-nv commented Jun 11, 2026 •

edited

Loading

Uh oh!

isaaclab-review-bot Bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

qianl-nv commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Detailed description

Uh oh!

isaaclab-review-bot Bot left a comment

Choose a reason for hiding this comment

Code Review: Add script for e2e agentic scene gen

🔴 High Priority

🟡 Medium Priority

🟢 Low Priority / Suggestions

Summary

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

qianl-nv commented Jun 11, 2026 •

edited

Loading