Add agent intent to arena graph spec conversion#770
Conversation
There was a problem hiding this comment.
Code Review: Add resolver for agent intent to arena graph spec conversion
Overall, this is a well-structured implementation with good separation of concerns. The trace-based error reporting pattern (instead of exceptions) is a solid choice for a pipeline where partial resolution is acceptable. The test suite is comprehensive and well-documented. Below are specific findings:
1. 🔴 ObjectRelationLibraryRegistry() instantiated per spatial constraint (Performance / Correctness)
File: intent_resolver.py, line ~143 (inside _build_spatial_constraint)
relation_registry = ObjectRelationLibraryRegistry()This creates a new registry instance for every spatial constraint in the initial state graph. If ObjectRelationLibraryRegistry uses SingletonMeta (like AssetRegistry), this is merely wasteful. If it doesn't, it could be re-parsing/re-loading data on each call.
Suggestion: Instantiate once in __init__ or resolve() and reuse:
def __init__(self, registry: AssetRegistry | None = None) -> None:
self.registry = registry or AssetRegistry()
self._relation_registry = ObjectRelationLibraryRegistry()
...2. 🟡 resolve_embodiment hardcodes fallback without checking registry (Silent failure risk)
File: asset_resolver.py, lines ~109-111
self.trace.append(TraceEvent("embodiment.miss", name, None, note="falling back to franka_ik"))
return "franka_ik"If "franka_ik" is not registered in the catalog (e.g., in a deployment with a different robot fleet), the downstream _resolve_embodiment_node will create a node named "franka_ik" that doesn't correspond to any actual asset. The fallback assumption isn't validated.
Suggestion: Either verify "franka_ik" is registered before returning it, or make the fallback configurable:
FALLBACK_EMBODIMENT = "franka_ik"
# ...
if not self.registry.is_registered(FALLBACK_EMBODIMENT):
self.trace.append(TraceEvent("embodiment.fallback_missing", name, None))
return name # or raise3. 🟡 Node ID collision when multiple items resolve to same query string
File: intent_resolver.py, line ~96
id=item.instance_name or item.query,If two Item objects share the same query (e.g., two "bowl" items for different positions) and neither has an instance_name, both would produce the same node ID. Since nodes is a list (not a dict), duplicates would silently appear, and known_ids (a set) would mask the second.
Suggestion: Add duplicate-ID detection after building nodes, or document that callers must provide instance_name for duplicate queries:
seen_ids: set[str] = set()
node_id = item.instance_name or item.query
if node_id in seen_ids:
self.trace.append(TraceEvent("item.duplicate_id", node_id, None))
continue # or auto-suffix
seen_ids.add(node_id)4. 🟡 _LLM_GENERATED_DIR written unconditionally without mkdir
File: try_environment_intent_schema.py, line ~133
out_path = _LLM_GENERATED_DIR / f"{env_graph_spec.env_name}_proposal.yaml"
env_graph_spec.write_yaml(out_path)If the llm_generated/ directory doesn't exist, this will raise FileNotFoundError. The path is resolved relative to the source tree (Path(__file__).resolve().parents[2]), so first-time runs in a fresh checkout would fail.
Suggestion:
out_path.parent.mkdir(parents=True, exist_ok=True)
env_graph_spec.write_yaml(out_path)5. 🟢 _pool_for returns empty list on empty intersection — consider trace event
File: asset_resolver.py, lines ~145-150
def _pool_for(self, tags: list[str]) -> list[str]:
assets = None
for tag in tags:
tagged = {a.name for a in self.registry.get_assets_by_tag(tag)}
assets = tagged if assets is None else assets & tagged
return sorted(assets or [])When tags is an empty list, assets remains None and the method returns []. This is handled by callers (e.g., item.tag_pool_empty trace), but calling _pool_for([]) from resolve_name with required_tag=None goes to self.registry.get_all_keys() instead — which is correct but relies on the caller never passing an empty tag list to _pool_for. A defensive check or docstring note would clarify intent.
6. 🟢 Test file is excellent — minor gap in resolve_name fuzzy path coverage
File: tests/test_resolver.py
The test suite thoroughly covers item resolution strategies, embodiment fallbacks, spatial constraints, and task wiring. One gap: resolve_name with required_tag=None (the fallback-to-all-keys path) is exercised only implicitly via the background resolver. A direct unit test for this branch would strengthen coverage, e.g.:
def test_resolve_name_fuzzy_without_tag():
resolver = _make_resolver()
cls = resolver.asset_resolver.resolve_name("maple_tabl", required_tag=None)
assert cls is not None and cls.name == "maple_table"Summary: Clean implementation with strong test coverage. The main actionable items are the per-constraint registry instantiation (#1) and the potential for node ID collisions (#3). The other findings are lower-priority hardening improvements.
There was a problem hiding this comment.
🤖 Isaac Lab Review Bot
Summary
This PR adds AssetResolver and IntentResolver classes for converting agent-generated EnvironmentIntentSpec to ArenaEnvGraphSpec as part of the agentic environment generation pipeline. The implementation is well-structured with comprehensive tracing for debugging and a defensive "never raise" design that logs errors instead of failing. The test coverage is thorough with 530 lines of pure-Python unit tests covering all resolution strategies.
Design Assessment
Design is sound. The resolver architecture appropriately separates concerns:
AssetResolverhandles catalog binding (exact → substring → fuzzy matching)IntentResolverorchestrates the full conversion pipeline with proper trace aggregation
The "trace events" pattern for error tracking is a good choice for this LLM-in-the-loop workflow where partial success needs to be communicated back to the agent.
Findings
🟡 Warning: ObjectRelationLibraryRegistry instantiated on every constraint — intent_resolver.py:143
A new ObjectRelationLibraryRegistry() instance is created inside _build_spatial_constraint which is called for each spatial relation. If this registry has initialization overhead (singleton lookup, catalog loading), this could be inefficient for specs with many constraints.
Consider moving the registry instantiation to resolve() or caching it as an instance attribute.
🟡 Warning: Hardcoded fallback embodiment may not exist — asset_resolver.py:117
When embodiment resolution fails completely, the code falls back to "franka_ik" unconditionally. If franka_ik isn't registered in the actual registry (e.g., in a test or reduced deployment), the downstream code will receive a string that doesn't correspond to a real asset.
Consider validating that the fallback exists, or making it configurable.
🔵 Suggestion: Type hint return could be more specific — asset_resolver.py:52
The resolve_item method returns type | None but the actual return is the asset class from the registry. If the registry returns something more specific (e.g., a protocol or base class), consider tightening the return type for better IDE support.
🔵 Suggestion: _derive_env_name could handle edge cases — intent_resolver.py:95
If spec.tasks is empty, first_kind becomes "task", resulting in generic names like llm_gen_table_task. This is handled but might benefit from a comment explaining the design intent for empty-task specs.
Test Coverage
- Coverage adequate: Yes — The 530-line test file comprehensively covers:
- All resolution strategies (exact, substring, fuzzy, tag-pool relaxation, miss)
- Embodiment family defaults and fallback behavior
- Spatial constraint construction (binary/unary relations, skip paths)
- Task spec wiring and multi-task scenarios
- Error tracking via
has_resolution_errors - Edge cases (empty state graph, instance name overrides)
- Test quality: Good — Tests use proper dependency injection via
FakeAssetRegistry, isolating the resolver logic from Isaac Sim/Kit dependencies.
CI Status
Pre-commit checks are currently in progress.
Verdict
Minor fixes needed
The implementation is well-designed and thoroughly tested. The warnings above are optimizations/robustness improvements rather than correctness issues. The resolver correctly handles all expected inputs and degrades gracefully on unexpected ones.
Update (commit 5021dd3): Previous concerns addressed. No new issues in the incremental changes.
Status of previous findings:
-
✅ ObjectRelationLibraryRegistry per-constraint instantiation — Fixed. The entire
ObjectRelationLibraryRegistryusage was removed from_build_spatial_constraint. Kind validation now happens upstream inSpatialRelationSpec._validate_kind_and_arity, eliminating the per-constraint registry lookup entirely. -
✅ Hardcoded IK_DEFAULTS dict — Fixed. The static
IK_DEFAULTSmapping was replaced with a dynamic tag-based lookup (_pool_for(["embodiment", "ik"])). Embodiment classes now carrytags = ["embodiment", "ik"], making family → IK variant resolution data-driven. -
⚠️ Hardcoded fallback to"franka_ik"— Still present (not changed in this commit). The original inline comment stands.
Other improvements in this commit:
- Comprehensive docstrings added to all
AssetResolverpublic methods - Test suite refactored: monolithic
test_resolver.pysplit intotest_asset_resolver.py+test_intent_resolver.pywith shared fixtures in_resolver_test_helpers.py - Removed dead code:
"in"relation skip logic and"relation.initial.unsupported_kind"error stage (validation moved to model layer)---
Update (commit 5d5bb67): Major feature commit adding UnresolvedArenaEnvGraphSpec.resolve() and TaskTransition system. No new issues found.
New in this commit:
task_transition.py: DeclarativeTaskTransition/Relocate/SetStatedataclasses describing how task success changes env-graph stateUnresolvedArenaEnvGraphSpec.resolve(): Chains sequential tasks into state specs (N tasks → N+1 states withis_delta=Truedeltas)ArenaEnvGraphSpecBaseextracted as shared base class forArenaEnvGraphSpecandUnresolvedArenaEnvGraphSpecPickAndPlaceTask.success_state_transition()andRotateRevoluteJointTask.success_state_transition()implementedTaskBase.success_state_transition()abstract base withNotImplementedErrorfallbackArenaEnvGraphStateSpec.is_deltafield added to distinguish full snapshots from differential statesgraph_spec_utils.pyrefactored:assert_references_exist→assert_constraint_references+assert_task_wiring(cleaner separation)- Comprehensive new tests:
test_task_transitions.py,test_arena_env_graph_spec.py(resolve chain tests), groundtruth init YAML try_environment_intent_schema.pynow invokes the full resolve pipeline and writes YAML output
Status of previous findings:
⚠️ Hardcoded fallback to"franka_ik"— Still present, unchanged. Original inline comment stands.- ✅ All other previously-noted issues remain addressed.---
Update (commit 415f840): All remaining concerns addressed. Clean incremental improvement.
Status of previous findings:
-
✅ Hardcoded fallback to
"franka_ik"— Fixed!resolve_embodimentnow returnsNoneon miss instead of silently falling back. The caller (_resolve_embodiment_node) correctly handlesNoneby omitting the node. This is a much cleaner design — the caller decides what to do when resolution fails. -
✅ Embodiment node ID = resolved name (P1 issue) — Fixed!
_resolve_embodiment_nodenow assignsid=query(original agent-emitted name) consistent with other node types. Tasks referencing the robot by its original name will now resolve correctly againstknown_ids.
Other improvements in this commit:
_build_task_specs→_trace_tasks: Tasks now pass through without being needlessly copied — the method only emits trace events for unknown param references. Cleaner separation of concerns._MAX_CANDIDATESmodule-level constant replaces magic number10throughoutasset_resolver.py_ERROR_TRACE_STAGESexpanded to include"embodiment.miss"(now that it is a true terminal failure)frozenset→ mutablesetfor_ERROR_TRACE_STAGES(minor style; no functional impact)- Docstrings reformatted to inline comments (style preference)
mkdir(parents=True, exist_ok=True)guards output directory intry_environment_intent_schema.py- Tests updated to validate new node-ID semantics and embodiment-miss behavior
No new issues found. The incremental changes are focused and well-tested.---
Update (commit 29c97fd): Minor test and import maintenance. No new issues.
Changes:
- Import path updated:
arena_env_graph_spec→arena_env_graph_types(module rename) - Test
test_has_resolution_errors_false_when_only_relaxation_or_fallbacksplit into two focused tests:test_has_resolution_errors_false_when_only_tag_relaxation— validates tag-pool relaxation is still non-fataltest_has_resolution_errors_true_when_embodiment_unknown— validates unknown embodiment IS a resolution error (consistent withembodiment.missbeing in_ERROR_TRACE_STAGES)
Status: ✅ All previous findings remain addressed. Tests correctly reflect the embodiment-miss-as-error semantic from the prior commit.---
Update (b82faad): Clean rename/refactor commit. No new issues.
Changes:
AssetResolver→AssetMatcher,IntentResolver→IntentCompiler(class + file renames)- Test helpers renamed accordingly (
_intent_compiler_test_helpers.py) - Embodiment
tags = ["embodiment", "ik"]added toDroidDifferentialIK,FrankaIK,G1WBCPink,GR1T2Pink ArenaEnvGraphSpecBaseextracted;from_yaml/from_dict/write_yaml/to_dict/nodes_by_idmoved to basegraph_spec_utils:assert_references_existsplit intoassert_constraint_references+assert_task_wiringtry_environment_intent_schema.pynow runs the full compile → resolve → write-YAML pipeline- Test YAML groundtruth updated to use delta states (
is_delta: true)
Status: ✅ All previous findings remain addressed. No new issues introduced.---
Update (0f0e89b): Architecture simplification — AssetMatcher class inlined as module-level functions. No new issues.
Changes:
asset_matcher.pydeleted;TraceEvent+ matching logic moved intointent_compiler.pyasmatch_asset()and_fuzzy_match()free functionsIntentCompilerno longer owns anAssetMatcherinstance;_resolve_*_nodemethods callmatch_asset()directly and extendself.tracewith returned events_ASSET_ERROR_STAGESexpanded:empty_poolstages now correctly treated as resolution errors for all three categories (background, embodiment, item)- Test consolidation:
test_asset_matcher.py+_intent_compiler_test_helpers.pymerged intotest_intent_compiler.pywith inlined fixtures - New tests added:
test_embodiment_required_tag_pool_empty_records_error,test_item_required_tag_pool_empty_records_error - Integration test
test_get_assets_with_all_tagsadded totest_asset_registry.py
Assessment: Good simplification. The functional approach (match_asset returning (result, events) tuple) is cleaner than the stateful class approach — trace ownership is now explicit at call sites. The expanded error stages correctly catch empty-pool scenarios that were previously silent omissions.
Status: ✅ All previous findings remain addressed. No new issues introduced.
Greptile SummaryThis PR adds
Confidence Score: 4/5Safe to merge with one fix: G1WBCAgilePinkEmbodiment is missing the ik tag and will resolve incorrectly for agile G1 queries. G1WBCAgilePinkEmbodiment, which uses PINK IK control, was not tagged ik alongside the other three IK embodiments changed in this PR. Any agent query targeting the agile G1 IK variant will silently fall back to the full embodiment pool and may select the wrong robot. The rest of the PR is logically sound and well covered. isaaclab_arena/embodiments/g1/g1.py - G1WBCAgilePinkEmbodiment is missing the ik tag Important Files Changed
Reviews (6): Last reviewed commit: "Append asset-matcher trace events in pla..." | Re-trigger Greptile |
5021dd3 to
5d5bb67
Compare
mkdir(parents=True, exist_ok=True) on the output path's parent so a first-time run on a fresh checkout doesn't crash with FileNotFoundError when the llm_generated/ directory doesn't exist yet. Signed-off-by: Qian Lin <qianl@nvidia.com>
Replace all hard-coded [:10] / [:5] slices in TraceEvent candidate lists with a single named constant to make the limit explicit and easy to change in one place. Signed-off-by: Qian Lin <qianl@nvidia.com>
Replace frozenset with a plain set literal and add an inline comment to each entry explaining when that error stage fires. Signed-off-by: Qian Lin <qianl@nvidia.com>
Replace the Attributes block in TraceEvent with per-field docstrings, and move the numbered resolution strategy steps from resolve_item's docstring into inline comments at each corresponding code branch. Signed-off-by: Qian Lin <qianl@nvidia.com>
Consistent with background and item nodes, set id=query on the
embodiment node so task params that reference the robot by the
agent-emitted name (e.g. "franka") resolve correctly against known_ids,
rather than silently missing when the resolved name differs ("franka_ik").
Signed-off-by: Qian Lin <qianl@nvidia.com>
When two items share the same query and neither has an instance_name they produce the same node ID silently. Mark the spot for a follow-up. Signed-off-by: Qian Lin <qianl@nvidia.com>
Signed-off-by: Qian Lin <qianl@nvidia.com>
Now that resolve_embodiment returns None on miss (no silent fallback), "embodiment.miss" is an error stage. Split the old combined test into two: one verifying tag-pool relaxation is non-fatal, and one asserting an unknown embodiment is correctly reported as a resolution error. Signed-off-by: Qian Lin <qianl@nvidia.com>
29c97fd to
6921657
Compare
Move graph and trace printing into helpers so main() stays readable, and sanitize env_name when writing proposal YAML.
Add _agent_node_id so background, embodiment, and item resolvers share one rule: graph node ids match agent-emitted reference strings.
Align relation resolution trace events with SpatialRelationSpec.reference.
IntentCompiler.compile() produces UnresolvedArenaEnvGraphSpec; AssetMatcher handles catalog binding. Keeps UnresolvedArenaEnvGraphSpec.resolve() unchanged.
Route item, background, and embodiment matching through one resolve_name API with required/preferred tags and consistent _best_match ordering. IntentCompiler calls it directly.
IntentCompiler passes item, background, or embodiment so trace stages stay decoupled from required_tags like object.
alexmillane
left a comment
There was a problem hiding this comment.
Looks clean!
A bunch of nits, nothing substantial!
Rename TraceEvent to IntentResolutionTraceEvent, use absolute imports, document the three-stage match_asset fallback order, and record ambiguous fuzzy matches in the trace note. Signed-off-by: Qian Lin <qianl@nvidia.com>
Move match_asset helpers out of IntentCompiler into a dedicated module and add test_asset_matcher with shared fake-registry fixtures. Signed-off-by: Qian Lin <qianl@nvidia.com>
Pass the compiler trace into match_asset so callers no longer unpack (chosen, events) tuples, and add a stage-3-only fuzzy match test. Signed-off-by: Qian Lin <qianl@nvidia.com>
qianl-nv
left a comment
There was a problem hiding this comment.
Thanks for the helpful nits Alex! Implemented all your comments except the UnresolvedArenaEnvGraphSpec renamimg, opening a separate MR for it.
Summary
Add AssetResolver and IntentResolver converting agent output to arena spec
Detailed description
This implements Part 2b of the Agentic Env Gen pipeline, converting agent output (IntentSpec) to UnresolvedArenaEnvGraphSpec
Add match_asset to find loose match of names to exact object from the asset registry
Add ik tags to embodiments. This allows matching a robot family to its ik variant if multiple variants exist.
Add IntentCompiler for converting the agent generated intent spec to an arena env spec
This adds a building block in the agentic env gen pipeline. It has no impact on existing functions
To run: python -m isaaclab_arena_examples.agentic_environment_genenration.try_environment_intent_schema