Robot Skills Framework — ROS 2 Jazzy

A discovery-based robot/instrument orchestration framework. Every skill — hardware-bound (MoveIt2 arm motion, pylabrobot instrument calls) or software-only (LLM-authored scripts) — is exposed as a self-advertising ROS 2 action. The orchestrator never hardcodes endpoints; it subscribes to latched <node>/skills manifests and dispatches goals over DDS.

Built on ROS 2 Jazzy + MoveIt2 + a Python BehaviorTree.CPP-v4-compatible executor. All action servers are Python; arm atoms drive MoveIt2 over its native action/service surface (no per-language MoveGroupInterface shim). See docs/adr/0001-action-server-language.md for the rationale.

Architecture

Three views of the same system.

Runtime architecture (docs/architecture.svg · docs/architecture.excalidraw)

What runs where: browser + agent host on the outside, orchestrator PC, per-provider PCs, the DDS plumbing between them.

Abstraction architecture (docs/abstraction.svg · docs/abstraction.excalidraw)

The Python class hierarchy in lib/robot_skills_py/: two parallel bases (SkillNode / InstrumentMultiActionNode), plus the @action decorator that every multi-action server is built from.

Dispatch chain (docs/dispatch.svg · docs/dispatch.excalidraw)

How a BT XML tag actually reaches a working skill: RosActionNode (BT-side adapter) ↔ SkillManifest (the contract published over the latched <node>/skills topic) ↔ SkillNode / InstrumentMultiActionNode (server-side base, hosts the ActionServer, dispatches its own ActionClient to MoveIt2 / pylabrobot / vendor SDKs). Includes an explicit "does MoveIt need a SkillNode?" answer.

Full text writeup: docs/architecture.md. Open the editable sources at excalidraw.com.

Core principles

A ROS action is the only skill API. Arm atoms (@action-decorated methods on RobotArmActionServer), instrument atoms (InstrumentMultiActionNode subclasses wrapping pylabrobot / vendor SDKs), and agent-authored scripts all advertise the same way and are dispatched the same way.
Discovery, not registration. Every node hosting skills publishes a latched <node>/skills manifest with TRANSIENT_LOCAL durability and LIVELINESS_AUTOMATIC QoS. Restart and late-join are handled by DDS, not heartbeats.
Topology = trust. Code runs on the host that owns its execution context. The orchestrator never executes agent-authored code; it dispatches to ROS endpoints.
One process per provider host. Each robot/instrument PC runs a single multi-action node hosting all of that host's atoms — RobotArmActionServer for arms, InstrumentMultiActionNode subclasses for instruments.
Each top-level directory maps to one deployment role.

Class hierarchy (in lib/robot_skills_py/) — two parallel bases keyed off shape, not vendor:

Base class	Use for	Examples
`SkillNode` (single-action)	One action per process	`FrankaGripperSkillNode`, `ImagingSimNode`
`RobotSkillNode` (extends `SkillNode`)	MoveIt-coupled single-action atom	held in reserve — single-action arm atoms are rare today
`InstrumentSkillNode` (extends `SkillNode`)	Single-action atom with a sim/real backend hook	held in reserve
`InstrumentMultiActionNode` (parallel base)	Many actions on one shared device / FSM	`RobotArmActionServer` (12 arm atoms · Meca500/FR3), `MockRobotArmActionServer` (13 mock atoms), `RosbagSkillsNode`, `LiconicActionServer`, `HamiltonStarActionServer` (14 actions on a 12-state FSM)

@action(action_type, action_name, …) decorates each method on a multi-action subclass; the base class wires ActionServer constructors, manifest publication via SkillAdvertiser, parameter declarations, deferred-init, and (optionally) an AsyncioBridge for async-native backends. Architecture diagram: docs/abstraction.png (editable: docs/abstraction.excalidraw).
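
A minimal sketch of the decorator-plus-base pattern described above, in plain Python with no ROS dependency. The names `MultiActionBase`, `DemoIncubator`, the `_action_spec` attribute and the `demo_msgs/...` type strings are hypothetical stand-ins, not the real robot_skills_py API. ActionServer construction, manifest publication, parameter declaration and deferred init are all left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class ActionSpec:
    """Metadata attached to a decorated method (hypothetical shape)."""
    action_type: str
    action_name: str


def action(action_type: str, action_name: str) -> Callable:
    """Mark a method as an action handler; the base class does the wiring."""
    def decorator(func: Callable) -> Callable:
        func._action_spec = ActionSpec(action_type, action_name)
        return func
    return decorator


class MultiActionBase:
    """Stand-in for a multi-action base: collects decorated methods at init."""

    def __init__(self) -> None:
        self.handlers: Dict[str, Callable] = {}
        self.specs: Dict[str, ActionSpec] = {}
        # dir() is sorted, so discovery order is alphabetical by method name.
        for attr_name in dir(type(self)):
            member = getattr(type(self), attr_name, None)
            spec = getattr(member, "_action_spec", None)
            if isinstance(spec, ActionSpec):
                self.specs[spec.action_name] = spec
                self.handlers[spec.action_name] = getattr(self, attr_name)

    def manifest(self) -> List[Dict[str, str]]:
        """One entry per decorated method, as a skills manifest might list."""
        return [{"name": s.action_name, "type": s.action_type}
                for s in self.specs.values()]


class DemoIncubator(MultiActionBase):
    """Hypothetical two-action instrument used only for illustration."""

    @action("demo_msgs/action/TakeIn", "take_in")
    def take_in(self, goal: dict) -> str:
        return f"stored {goal['plate']}"

    @action("demo_msgs/action/Fetch", "fetch")
    def fetch(self, goal: dict) -> str:
        return f"fetched {goal['plate']}"
```

Instantiating `DemoIncubator()` yields a `handlers` map keyed by action name; a real base class would turn each entry into one action server and one manifest row.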

Directory layout

Top-level dir	Role	Runs on
lib/	Shared libraries — host-agnostic, no runtime	Consumed by every host
src/	Orchestrator processes + assets	Orchestrator PC
agent/	MCP-host processes + agent-local services	Agent's machine (anywhere on DDS)
providers/	Per-robot/instrument code	The robot/instrument PC

Providers

Provider	Code	Skills
Mecademic Meca500 6-DoF arm	providers/meca500/	All 12 arm atoms hosted in one `RobotArmActionServer` Python process talking to MoveIt2 and ros2_control over their native ROS interfaces (MoveToNamed/Joint/Cartesian, MoveCartesianLinear, Gripper, SetDIO, RobotEnable, CheckSystemReady, CheckCollision, UpdatePlanningScene, DetectObject, CapturePointCloud) — plus `RecordRosbag` / `StopRecording` from a sibling `rosbag_skills_node`.
Franka FR3 7-DoF arm + Franka Hand	providers/fr3/	Same 12 arm atoms via the unified `RobotArmActionServer` (re-parameterized for FR3 / `panda_arm` planning group). `franka_gripper_skill` adds a `FrankaGripperControl` atom that bridges `robot_skills_msgs/FrankaGripperControl` onto the upstream `franka_gripper` Move/Grasp/Homing action set.
Arm mock sim (Meca500 + FR3)	providers/arm_mock_sim/	Director-managed mock arm atoms. Single `MockRobotArmActionServer` (`InstrumentMultiActionNode`) hosts all 13 mock atoms — same action interface as the production server, no MoveIt2. Launched per-robot under `/meca500` or `/fr3` namespace by the Director's `meca500-mock-sim-run` / `fr3-mock-sim-run` tasks.
PBI Liconic STX44 incubator	providers/pbi_liconic/	`TakeIn`, `Fetch` (pylabrobot) — single `InstrumentMultiActionNode`.
Hamilton STAR liquid handler	providers/pbi_liconic/	14 actions on one `InstrumentMultiActionNode` (12-state FSM gated through `gate_goal`): `MoveResource`, `HandoffTransfer`, `PickUpCoreGripper`, `ReturnCoreGripper`, `Aspirate` / `Dispense` / `PickUpTips` / `DropTips` (single-channel and 96-channel), `JogChannel`, `Transfer`.
Imaging station (sim)	providers/imaging_station/	`ImagePlate` (idempotent) — single-action `SkillNode`. Sim backend writes placeholder PNGs; real driver (BMG / Tecan Spark / BioTek Cytation / microscope) swaps in via the same action.

Both Liconic and Hamilton are pulled from the guyEIT/pbi_liconic upstream as a git subtree. See CLAUDE.md for the subtree pull/push commands.

The imaging-station provider is a fresh in-tree provider, not a subtree — it owns the robot_skills_msgs/action/ImagePlate interface and ships a sim backend so the campaign behaviour tree can run a full Liconic ↔ Hamilton ↔ Imager ↔ Hamilton ↔ Liconic loop without imager hardware. Real driver backends (BMG / Tecan / BioTek / microscope) plug into the same action interface.

Quick start

Prerequisites

Pixi (Linux desktop). The Docker Compose path is deprecated — keep it only for legacy bring-up; new work uses pixi natively.

One-time bootstrap (single host)

None — pixi install -e <env> resolves the four interfaces/ packages and the three framework helpers in lib/ via path-deps. No conda channel preload required.

Optional: populate `~/channel` for cross-host distribution

pixi run channel-pack

Builds every workspace ROS package and harvests the resulting .conda artifacts into ~/channel/. Run on the Director PC once, then pixi run channel-serve-up exposes the channel over HTTP so worker PCs can pull pre-built artifacts instead of rebuilding from source. See Distributed launch.

Run the whole lab in sim (director-managed)

pixi run -e director director-up
# In the dashboard's Topology panel: pick "lab-sim" → Launch

Brings up every provider sim (mock arm atoms for Meca500 + FR3, sim backends for Liconic + Hamilton) as independent director-managed processes, then the orch_lab_sim orchestrator. Each can be killed and restarted from the dashboard independently.

Per-provider sim (one provider in isolation)

pixi run -e director meca500-mock-sim-run   # mock Meca500 atoms under /meca500 (no MoveIt2)
pixi run -e director fr3-mock-sim-run       # mock FR3 atoms under /fr3 (no MoveIt2)
pixi run hamilton-sim-test                  # STAR sim backend + skill_server
pixi run liconic-sim-test                   # Liconic sim backend + skill_server

Submit the matching test tree from src/robot_behaviors/trees/.

Single-box deployments

Mode	Command	What runs
`real-native`	`pixi run real-native-up`	full real-robot stack on one PC

Distributed production

Per-PC pixi envs install only what each box needs. The four ROS interface packages and three framework helpers are path-deps inside the workspace (built locally on each host), or — for fleet deploys — fetched as pre-built .conda artifacts from the Director's HTTP channel (see "Distributed launch" below).

Env	Target PC	Tasks
`orchestrator`	control PC (legacy single-instance)	`orchestrator-up`
★ `director`	control PC (fleet-wide, Tier 1 dashboard, central pixi channel)	`director-up`, `channel-serve-up`, `meca500-mock-sim-run`, `fr3-mock-sim-run`
★ `launch-agent`	every PC the Director should manage	`launch-agent-up`
`meca500-host`	Meca500 robot PC	`meca500-real-run`
`fr3-host`	FR3 robot PC	`fr3-real-run`
`liconic-host`	Liconic / Hamilton PC	`liconic-up`, `liconic-sim-up`, `hamilton-up`, `hamilton-sim-up`
`real-native`	single-box real robot	`real-native-up`

Distributed launch (Topology Director) ★

Three control layers move the "which pixi run task lives on which PC" knowledge out of operators' shells and onto a single dashboard / MCP / YAML surface:

Topology Director (src/topology_director/) on the control PC reads config/topology.yaml, supervises N skill-server orchestrator instances (each in its own ROS namespace so multiple BTs can run in parallel), and fans launch / stop / kill commands out to per-host launch agents.
Launch agents (src/launch_agent/) — one per PC — own subprocess spawn + SIGINT → SIGTERM → SIGKILL escalation + kill_node for arbitrary ROS nodes.
Central pixi channel — python -m http.server over ~/channel on the Director PC. Worker hosts pull pre-built binary packages (ros-jazzy-robot-skills-msgs, ros-jazzy-franka-msgs, ros-jazzy-liconic-msgs, ros-jazzy-hamilton-star-msgs, ros-jazzy-launch-agent, …) instead of each rebuilding from source.
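
The SIGINT → SIGTERM → SIGKILL escalation owned by the launch agents can be sketched with the standard library alone. This is an illustrative, POSIX-only sketch, not the launch agent's actual code: the function name `stop_with_escalation` and the `grace_s` parameter are assumptions, and it signals only the direct child rather than a whole process tree.

```python
import signal
import subprocess
from typing import Sequence

# Order of escalation as described in the text above.
ESCALATION: Sequence[signal.Signals] = (
    signal.SIGINT,
    signal.SIGTERM,
    signal.SIGKILL,
)


def stop_with_escalation(proc: subprocess.Popen, grace_s: float = 5.0) -> str:
    """Send SIGINT, then SIGTERM, then SIGKILL, waiting grace_s after each.

    Returns the name of the signal after which the process was seen to exit,
    or "already-exited" if it had finished before any signal was sent.
    """
    if proc.poll() is not None:
        return "already-exited"
    for sig in ESCALATION:
        proc.send_signal(sig)
        try:
            proc.wait(timeout=grace_s)
            return sig.name
        except subprocess.TimeoutExpired:
            continue  # still alive: move on to the next, harsher signal
    proc.wait()  # SIGKILL cannot be ignored, so this only reaps the child
    return signal.SIGKILL.name
```

Which signal ends a given child depends on how that child handles SIGINT and SIGTERM, so callers should treat the return value as diagnostic information rather than a guarantee.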

All Pixi environments include the shared ros-network activation feature. The same defaults also live in scripts/ros-network-env.sh, which is sourced before any ROS process starts by daemon startup, Director/launch-agent foreground startup, the self-restart helpers, and launch-agent supervised service launches:

RMW_IMPLEMENTATION=rmw_fastrtps_cpp
ROS_DOMAIN_ID=0
ROS_AUTOMATIC_DISCOVERY_RANGE=SUBNET
ROS_STATIC_PEERS=10.6.104.87        # Director (callus) — every host unicasts SPDP here

Only ROS-framework env vars — no DDS vendor XML, no per-host interface pinning. Fast DDS (the Jazzy default, REP-2000 Tier-1) listens on every NIC out of the box.

Why static peers instead of pure multicast: the lab's 10.6.104.0/24 and 10.6.105.0/24 subnets are L3-routed via a common gateway. ROS 2 SPDP multicast uses TTL=1 so packets don't survive the gateway hop. Setting ROBOT_BEHAVIOURS_ROS_PEERS to the Director's IP means every worker unicasts its SPDP announcement directly to the Director; the Director learns each worker's address from the received packet and can reply. No per-worker IP config needed on the Director. When the network admin enables multicast routing between the two subnets (see the polite request below), static peers can be removed and SUBNET multicast will handle everything automatically.

Per-host overrides: create scripts/ros-peers.local (gitignored, sourced by ros-network-env.sh) to add further peers without touching git — useful for worker-to-worker traffic or temporary IPs:

# scripts/ros-peers.local (on any host that needs extra peers)
ROBOT_BEHAVIOURS_ROS_PEERS="10.6.104.87 10.6.105.23"

Verifying multicast forwarding (to confirm when the network admin has made the change):

# Run simultaneously — listener on Director, sender on a worker
pixi run -e director multicast-smoke -- --listen --seconds 20
pixi run -e launch-agent multicast-smoke -- --send --count 10

multicast_received=yes means SUBNET multicast works across the subnets and static peers are no longer needed.

General DDS smoke tests:

# Inspect whether a worker launch-agent is visible from the Director
pixi run -e director ros-dds-check -- --host meca500-control

# Cross-host raw DDS beacon test
pixi run -e launch-agent ros-dds-beacon
pixi run -e director ros-dds-check -- --listen meca500-control

Note: meca500-control as a hostname resolves to its Tailscale IP via MagicDNS — that's irrelevant to lab discovery. DDS uses the host's 10.6.10x.* interface; the Tailscale address is never involved. pixi run ros-network-setup prints the effective values on any host; daemon logs record them at start.

Director PC bringup

git pull
pixi install -e director                      # resolves Director + launch-agent + dashboard envs (path-deps; no channel preload required)

pixi run channel-pack                         # OPTIONAL: harvests every workspace .conda into ~/channel for worker PCs
pixi run channel-serve-up                     # background daemon: serves ~/channel over HTTP on :8082
pixi run launch-agent-up                      # background daemon: lets the Director manage tasks on this PC too
pixi run director-up                          # background daemon: Director + rosbridge + dashboard

Open http://<director-host>:8081/ for the Tier 1 home (topology + instance + kill controls). Tier 2 per-instance dashboards open from any instance row at /instance/<name>.

Worker PC bringup (Meca500, FR3, Liconic, …)

Each worker only needs the launch agent — the Director will tell it what to spawn at runtime via /launch_agents/<hostname>/launch_task.

git clone <repo>                              # or git pull on an existing clone
cd robot_behaviours

# Get the prebuilt msgs + launch_agent packages from the Director's HTTP channel
# (or just `pixi install -e launch-agent` locally if you'd rather build from source on the worker).
echo 'http://<director-host>:8082' | pixi config append --workspace channels -

pixi install -e launch-agent                  # resolves the agent + msgs from the Director's channel
pixi run -e launch-agent launch-agent-up      # background daemon: agent listens at /launch_agents/<hostname>/

That's it. The agent's heartbeat on /launch_agents/<hostname>/info (latched JSON) auto-registers the worker with the Director. Afterwards, pixi run -e launch-agent launch-agent-status confirms it's alive, …-logs tails the daemon log, and …-down stops it.

The agent shells out to pixi run -e <env> <task> on demand, so the worker also needs whichever per-host env hosts the actual workload (meca500-host, fr3-host, liconic-host, …) installed alongside launch-agent. Adding the Director's HTTP channel means those envs solve from prebuilt binaries too — no C++ toolchain required on the worker.

Remote launch-agent updates

Fleet machines follow the production git branch for operator-triggered updates. Development can continue on main; promote a known-good commit by fast-forwarding production to that commit and pushing it:

git checkout main
git pull --ff-only
git checkout production
git merge --ff-only main
git push origin production

Each launch agent owns updates for its local checkout. The Director only relays the request; the target host runs the git and pixi work itself:

/director/update_launch_agent receives a host name, or JSON such as {"host":"meca500-control","branch":"production","strategy":"ff-only","git":true}.
The Director forwards that JSON to /launch_agents/<host>/update_self.
The launch agent fetches origin/<branch>, refuses to continue if the checkout is dirty, fast-forwards the branch, runs pixi install -e launch-agent, sources scripts/ros-network-env.sh, then restarts its own daemon from a detached helper process.
The host briefly disappears from the dashboard while the daemon restarts, then re-advertises /launch_agents/<host>/info.

The default launch-agent parameters are self_update_remote:=origin, self_update_branch:=production, and self_update_strategy:=ff-only. reset-hard exists for locked-down production checkouts where local edits must be discarded, but ff-only is the normal safe mode. The heartbeat includes git branch, short SHA, and dirty state so the Tier 1 dashboard can show which revision each host is running.
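
As a rough illustration of the request handling described above, the sketch below accepts either a bare host name or a JSON object and fills in the documented defaults. The `UpdateRequest` dataclass and `parse_update_request` function are hypothetical names; treating `git` as defaulting to true and rejecting strategies other than `ff-only` and `reset-hard` are assumptions, not confirmed behaviour.

```python
import json
from dataclasses import dataclass

# The two strategies named in the text; anything else is rejected here.
ALLOWED_STRATEGIES = ("ff-only", "reset-hard")


@dataclass(frozen=True)
class UpdateRequest:
    host: str
    remote: str = "origin"        # self_update_remote default
    branch: str = "production"    # self_update_branch default
    strategy: str = "ff-only"     # self_update_strategy default
    git: bool = True              # assumed default: do the git step


def parse_update_request(payload: str) -> UpdateRequest:
    """Accept a bare host name or a JSON object with optional overrides."""
    text = payload.strip()
    if not text:
        raise ValueError("empty update request")
    if not text.startswith("{"):
        return UpdateRequest(host=text)  # bare host name: all defaults
    data = json.loads(text)
    if not isinstance(data, dict) or not data.get("host"):
        raise ValueError("JSON update request needs a 'host' field")
    request = UpdateRequest(
        host=str(data["host"]),
        branch=str(data.get("branch", "production")),
        strategy=str(data.get("strategy", "ff-only")),
        git=bool(data.get("git", True)),
    )
    if request.strategy not in ALLOWED_STRATEGIES:
        raise ValueError(f"unknown strategy: {request.strategy}")
    return request
```

For example, `parse_update_request("meca500-control")` and the JSON form with only a `host` key produce the same request.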

Restart without git/pixi is separate: /director/restart_launch_agent relays to /launch_agents/<host>/restart_self, which only bounces the daemon.

Daemon controls

Foreground	Background daemon	Stop	Status	Logs
`channel-serve`	`channel-serve-up`	`channel-serve-down`	`channel-serve-status`	`channel-serve-logs`
`launch-agent-run`	`launch-agent-up`	`launch-agent-down`	`launch-agent-status`	`launch-agent-logs`
`director-run`	`director-up`	`director-down`	`director-status`	`director-logs`

Cross-platform (macOS bash 3 + Linux bash 5), no systemd / launchctl dependency. PIDs and logs under ~/.local/state/robot_behaviours/daemons/<name>/. -up is idempotent; -down does TERM → 5 s grace → KILL of the whole subprocess tree. scripts/daemon.sh start ... sources scripts/ros-network-env.sh and writes the effective ROS network values to the daemon log before detaching.

Bringing up a fleet from one place

# CLI
ros2 service call /director/launch_profile robot_skills_msgs/srv/LaunchProfile \
  '{profile_name: "real-parallel", continue_on_failure: false}'

# Headline capability: real-parallel spawns two skill_server instances under
# /orch_meca and /orch_fr3 so two BT campaigns run concurrently.
ros2 action send_goal /orch_meca/skill_server/execute_behavior_tree ... &
ros2 action send_goal /orch_fr3/skill_server/execute_behavior_tree  ... &

# Selective kill — only that one orchestrator dies, everything else keeps running.
ros2 service call /director/kill_node robot_skills_msgs/srv/KillRosNode \
  '{node_name: "/orch_meca/skill_server"}'

# Hard kill the whole fleet (escape hatch)
ros2 service call /director/kill_all robot_skills_msgs/srv/CancelActiveTask '{reason: ""}'

Every service is also exposed via the MCP server (launch_profile, kill_node, spawn_orch_instance, terminate_orch_instance, list_orch_instances, get_topology, …) and via the Tier 1 dashboard.

config/topology.yaml ships five reference profiles: lab-sim, real-full-serial, real-parallel (the headline parallel-BT capability), real-meca-only, and parallel-sim. Edit the file and call /director/reload_spec to pick up changes without restarting.

Multi-instance namespacing caveat. orchestrator.launch.py accepts namespace:=… and pushes the orchestrator nodes under it, but the skill_server still publishes and subscribes on absolute /skill_server/... paths in many places, and absolute names are not moved by a namespace push. Running a single instance at the root namespace works today (the legacy single-orchestrator path); running disjoint instances under /orch_meca and /orch_fr3 requires the absolute-topic conversion tracked in src/robot_skill_server/NAMESPACE_AUDIT.md. Tier 2 dashboards display a banner when they're viewing a non-root instance.
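
The caveat comes down to how ROS names resolve: a relative name is joined onto the node's namespace, while a name with a leading slash is used as-is. The function below is a simplified model of that rule for illustration only; `resolve_name` is a hypothetical helper, and private (`~`) names and remapping are deliberately not handled.

```python
def resolve_name(name: str, namespace: str = "/") -> str:
    """Resolve a topic/service/action name against a node namespace.

    Simplified model: absolute names (leading "/") are returned unchanged,
    relative names are joined onto the namespace.
    """
    if not name:
        raise ValueError("name must not be empty")
    if name.startswith("/"):
        return name  # absolute: the namespace push has no effect
    return f"{namespace.rstrip('/')}/{name}"


# Two instances using a relative name end up on disjoint paths...
relative = [resolve_name("skill_server/execute_behavior_tree", ns)
            for ns in ("/orch_meca", "/orch_fr3")]

# ...while an absolute name collapses both onto the same path.
absolute = [resolve_name("/skill_server/execute_behavior_tree", ns)
            for ns in ("/orch_meca", "/orch_fr3")]
```

In this model `relative` holds two different names and `absolute` holds the same name twice, which is the collision the audit document tracks.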

Common workflows

# Rebuild after editing a .msg/.srv/.action — pixi-build-ros tracks the
# extra-input-globs in interfaces/<pkg>/pixi.toml and rebuilds the path-dep'd
# .conda automatically on the next install. Just re-run the env install:
pixi install -e <env>

# Open a sourced ROS shell
pixi run lite-native-shell        # or real-native-shell, meca500-host shell, etc.

# Inspect the running ROS graph
pixi run status

# Run the test suite
pixi run test

Calling skills

# List the skill registry (merged view of every */skills topic)
ros2 service call /skill_server/get_skill_descriptions \
  robot_skills_msgs/srv/GetSkillDescriptions \
  '{include_compounds: true, include_pddl: false}'

# Execute a behavior tree XML
ros2 action send_goal /skill_server/execute_behavior_tree \
  robot_skills_msgs/action/ExecuteBehaviorTree \
  "$(python3 -c 'import yaml,sys; xml=open(sys.argv[1]).read(); print(yaml.safe_dump({"tree_xml": xml, "tree_name": "demo", "target_mode": 0}, default_style="|"))' src/robot_behaviors/trees/move_to_home.xml)"

# Compose a tree from skill steps
ros2 service call /skill_server/compose_task \
  robot_skills_msgs/srv/ComposeTask \
  '{
    task_name: "my_task",
    sequential: true,
    steps: [
      {skill_name: "move_to_named_config", parameters_json: "{\"config_name\": \"home\"}"},
      {skill_name: "gripper_control",       parameters_json: "{\"command\": \"open\"}"}
    ]
  }'

# Sanity-check a plan against PDDL preconditions / effects before running it
ros2 service call /skill_server/validate_plan \
  robot_skills_msgs/srv/ValidatePlan \
  '{
    initial_state: ["robot_initialized", "gripper_open"],
    steps: [
      {skill_name: "move_to_named_config", parameters_json: "{\"config_name\": \"home\"}"},
      {skill_name: "pick_object",           parameters_json: "{}"}
    ]
  }'
# Returns valid=true + final_state, OR valid=false + first_failing_step
# + missing_preconditions[] pointing at the offending entry.
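
The precondition/effect check that validate_plan performs can be pictured as a forward simulation over a set of facts. The sketch below is an assumption-laden illustration, not the service's implementation: the `SKILLS` table, its `pre` / `add` / `delete` keys, and the facts `at_named_config`, `object_detected` and `holding_object` are invented for the example, and steps are reduced to bare skill names.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical skill models: preconditions, added facts, deleted facts.
SKILLS: Dict[str, Dict[str, Set[str]]] = {
    "move_to_named_config": {
        "pre": {"robot_initialized"},
        "add": {"at_named_config"},
        "delete": set(),
    },
    "pick_object": {
        "pre": {"gripper_open", "object_detected"},
        "add": {"holding_object"},
        "delete": {"gripper_open"},
    },
}


def validate_plan(initial_state: Iterable[str], steps: List[str]) -> dict:
    """Apply each step's effects in order; stop at the first unmet precondition.

    Raises KeyError if a step names a skill missing from SKILLS.
    """
    state = set(initial_state)
    for index, skill_name in enumerate(steps):
        spec = SKILLS[skill_name]
        missing = sorted(spec["pre"] - state)
        if missing:
            return {
                "valid": False,
                "first_failing_step": index,
                "missing_preconditions": missing,
            }
        state = (state - spec["delete"]) | spec["add"]
    return {"valid": True, "final_state": sorted(state)}
```

With this invented model, the two-step plan from the command above fails at step index 1 because `object_detected` is not in the initial state; adding that fact makes the plan valid.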

target_mode on ExecuteBehaviorTree: 0 = MODE_REAL (default, back-compat), 1 = MODE_SIM (one-shot dry-run), 2 = MODE_SIM_THEN_REAL (sim → operator approval gate → real).

Sim-before-real workflow ★

Long-running plans (multi-step assays, hours-long incubations) get a fast pre-flight against a paired /sim/* action surface, then a human approval gate before the real phase runs.

Each provider launch (make_robot_skill_server_launch) accepts namespace_prefix:=/sim and wraps its action servers in a PushRosNamespace group.
SkillDiscovery filters out /sim/* manifests so the registry is single-source-of-truth on real entries.
BtExecutor reads a sim_namespace_prefix parameter (default /sim) and prepends it to every server name during the SIM phase — same XML, both phases. sim_lab.launch.py overrides this to "" because lab-sim has no separate real backend; MODE_SIM and MODE_REAL then both resolve to the bare-path atoms. lab-up keeps the default and brings up paired sim+real action servers on one box for end-to-end approval-gate testing.
On a successful sim phase, BtExecutor latches a DryRunStatus on /skill_server/dryrun_status and waits on /skill_server/approve_dry_run (ApproveDryRun.srv). The dashboard surfaces an approve/reject modal.
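
A small sketch of how one tree can target both surfaces: the mode selects which phases run, and the sim phase prepends the configured prefix to each server name. `TargetMode` mirrors the documented `target_mode` values; `phases_for`, `server_name_for` and the example action path are hypothetical, and the approval gate between phases is not modelled.

```python
from enum import IntEnum
from typing import List


class TargetMode(IntEnum):
    MODE_REAL = 0           # default, back-compat
    MODE_SIM = 1            # one-shot dry-run
    MODE_SIM_THEN_REAL = 2  # sim, then approval gate, then real


def phases_for(mode: TargetMode) -> List[str]:
    """Which phases a goal with this mode runs, in order."""
    if mode is TargetMode.MODE_SIM:
        return ["sim"]
    if mode is TargetMode.MODE_SIM_THEN_REAL:
        return ["sim", "real"]
    return ["real"]


def server_name_for(server_name: str, phase: str,
                    sim_namespace_prefix: str = "/sim") -> str:
    """Prepend the sim prefix during the sim phase; otherwise pass through.

    An empty prefix (as in the lab-sim setup described above) makes the
    sim and real phases resolve to the same bare-path server.
    """
    if phase == "sim" and sim_namespace_prefix:
        prefix = sim_namespace_prefix.rstrip("/")
        return f"{prefix}/{server_name.lstrip('/')}"
    return server_name
```

Keeping the prefix as a parameter rather than baking it into the tree XML is what lets the same file serve both phases.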

Long-lived campaign workflow ★

For trees that run for weeks — operators trickling plates in and out of the Liconic, cycling each through the Hamilton-iSWAP to the imaging station and back — the framework persists tree state to SQLite and resumes after a skill_server crash without losing progress.

What survives a restart:

A SQLite-backed persistent blackboard: any key prefixed persistent. is mirrored to ~/.local/state/skill_server/tasks/{task_id}/state.db (WAL mode). Type-checked at write — only JSON-serialisable values land on disk.
Per-node tick checkpoints on every control / decorator / loop node (Sequence index, Repeat iteration, RetryUntilSuccessful attempt, WaitUntil deadline). On resume, the executor re-ticks from the root and each Checkpointable node hydrates its index — no work is repeated past the last successful child.
Action-inflight reconciliation: every RosActionNode records (node_path, server_name, goal_uuid, idempotent) before submitting. If a goal was in flight at crash time, the resume path inspects the row — idempotent skills auto-resubmit; non-idempotent skills refuse and surface an alert for the operator to resolve via OperatorDecision.
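
The persistent-blackboard idea in the first item can be sketched with `sqlite3` and `json` from the standard library. This is a simplified illustration under stated assumptions: the class name `PersistentBlackboard`, the single `bb` table and its columns are invented here, and the real on-disk layout may differ.

```python
import json
import sqlite3
from typing import Any

PREFIX = "persistent."  # only keys with this prefix are mirrored to disk


class PersistentBlackboard:
    """In-memory dict whose 'persistent.' keys are mirrored to SQLite."""

    def __init__(self, db_path: str) -> None:
        self._mem: dict = {}
        self._db = sqlite3.connect(db_path)
        self._db.execute("PRAGMA journal_mode=WAL")
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS bb ("
            "key TEXT PRIMARY KEY, value TEXT NOT NULL)"
        )
        self._db.commit()
        # Hydrate from disk so state survives a process restart.
        for key, value in self._db.execute("SELECT key, value FROM bb"):
            self._mem[key] = json.loads(value)

    def set(self, key: str, value: Any) -> None:
        if key.startswith(PREFIX):
            # json.dumps raises TypeError for non-serialisable values,
            # so nothing invalid reaches the database or the dict.
            encoded = json.dumps(value)
            with self._db:  # commits on success, rolls back on error
                self._db.execute(
                    "INSERT OR REPLACE INTO bb (key, value) VALUES (?, ?)",
                    (key, encoded),
                )
        self._mem[key] = value

    def get(self, key: str, default: Any = None) -> Any:
        return self._mem.get(key, default)

    def close(self) -> None:
        self._db.close()
```

Reopening the same file after a simulated crash returns the `persistent.` keys and drops everything else, which is the split between durable campaign state and scratch values.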

New control / utility nodes in tree_executor.py:

KeepRunningUntilFailure, Repeat num_cycles="N", WhileDoElse
WaitUntil timestamp="{...}" — wall-clock-aware sleep (deadline-preserving across restart)
BlackboardCondition key="..." expected="..." — gate a subtree on a persistent flag
PopFromQueue / PushToQueue — list-valued blackboard queues for operator-driven work
AdvancePlate — post-cycle bookkeeping (increments cycle, recomputes next_due_at, retires when target reached)
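
To make the bookkeeping concrete, here is a hedged sketch of what AdvancePlate and a deadline-preserving wait might compute. The `Plate` fields (`cadence_s`, `target_cycles`, `retired`) and both function names are assumptions for illustration, and scheduling the next cycle from the current time rather than from the previous due time is a guess at the policy.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Plate:
    name: str
    cycle: int            # completed cycles so far
    target_cycles: int    # retire once this many cycles are done
    cadence_s: float      # seconds between cycles
    next_due_at: float    # absolute wall-clock time, epoch seconds
    retired: bool = False


def advance_plate(plate: Plate, now: float) -> Plate:
    """Post-cycle bookkeeping: bump the cycle, then reschedule or retire."""
    cycle = plate.cycle + 1
    if cycle >= plate.target_cycles:
        return replace(plate, cycle=cycle, retired=True)
    return replace(plate, cycle=cycle, next_due_at=now + plate.cadence_s)


def seconds_until(deadline: float, now: float) -> float:
    """Remaining wait for an absolute deadline; never negative.

    Because the deadline is a wall-clock timestamp, recomputing this after
    a restart preserves the original due time instead of restarting a timer.
    """
    return max(0.0, deadline - now)
```

A plate due at t=1000 that is checked at t=400 waits 600 s; if the process restarts and checks again at t=900, the remaining wait is 100 s, not a fresh 600 s.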

Operator services — split between bb_operator (campaign-level state) and skill_server (framework-level execution control):

Service	Owner	Purpose
`/bb_operator/add_plate`	sidecar	append a plate dict to `persistent.plate_queue` (trickle-in)
`/bb_operator/retire_plate`	sidecar	flag `plates.{name}.retiring = true` so the in-flight cycle finishes naturally and isn't re-queued
`/bb_operator/pause_campaign`	sidecar	toggle `persistent.paused`; `BlackboardCondition` gate halts the next iteration boundary
`/bb_operator/operator_decision`	sidecar	resolve a stuck non-idempotent action (`retry` / `skip-as-success` / `skip-as-failure` / `abort-tree`)
`/skill_server/pause_execution`	bt_executor	framework-level pause that sets `ctx.paused`, honoured at step boundaries by every control-flow node — works for any tree, not just campaigns
`/skill_server/cancel_active_task`	bt_executor	session-independent hard cancel; walks `_current_ctx`, sets `cancelled`, and tears down in-flight goals — reachable by any client (the action-cancel handshake requires the original goal id, which a restarted dashboard doesn't have)

Live dashboard view. The dashboard's Campaign panel (added 2026-04-27) subscribes to /skill_server/persistent_state — a latched JSON snapshot of the active task's persistent blackboard, republished by bb_operator after every service handler + a 1 Hz timer. Renders the plate queue + per-name index as a table with cycle / cadence / next-due / status; surfaces Add Plate (modal dialog), Pause after step / Resume, Cancel (hard halt via /skill_server/cancel_active_task), and per-row Retire trash icons. The Campaign preset layout (Layouts → CAMPAIGN) tiles it alongside Task Monitor + BT Tree + Executor + Logs. Validated end-to-end via Playwright + chromium-headless: empty-state → submit campaign tree → AddPlate → Pause → Resume → Cancel.

The campaign is defined by a campaign file (src/robot_behaviors/campaigns/plate_imaging_standard.campaign.xml) whose phases name behavior trees; the orchestrator (campaign_manager + bb_operator) owns the scheduling / dispatch / advancement loop. The per-plate cycle plate_imaging_cycle.xml is LiconicFetch → ObsImagingSequence (ObsBot PTZ, 3 presets) → LiconicTakeIn — the plate is imaged in place on the Liconic transfer tray (the Liconic shovel presents it; the ObsBot camera images it). The Hamilton iSWAP is not in the imaging leg — it only loads/unloads the incubator at campaign start/end.

Skill idempotency is declared per atom in SkillDescription.idempotent (defaults to false). The resume path uses it to decide whether to auto-resubmit on goal-gone or to halt and ask the operator.
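
The resume decision reduces to a branch on the recorded `idempotent` flag. The sketch below uses the row fields and operator choices named in this section; the types `InflightRow` and `ResumeAction` and the functions `reconcile` and `check_operator_decision` are hypothetical, and the example paths and UUID are placeholders.

```python
from dataclasses import dataclass
from enum import Enum

# Choices an operator can make for a stuck non-idempotent action.
OPERATOR_DECISIONS = ("retry", "skip-as-success", "skip-as-failure", "abort-tree")


class ResumeAction(Enum):
    RESUBMIT = "resubmit"          # safe to send the goal again
    ASK_OPERATOR = "ask-operator"  # halt and wait for a human decision


@dataclass(frozen=True)
class InflightRow:
    """What is recorded before a goal is submitted."""
    node_path: str
    server_name: str
    goal_uuid: str
    idempotent: bool


def reconcile(row: InflightRow) -> ResumeAction:
    """Decide what to do with a goal that was in flight at crash time."""
    if row.idempotent:
        return ResumeAction.RESUBMIT
    return ResumeAction.ASK_OPERATOR


def check_operator_decision(decision: str) -> str:
    """Reject anything outside the documented set of operator choices."""
    if decision not in OPERATOR_DECISIONS:
        raise ValueError(f"unknown operator decision: {decision}")
    return decision
```

Defaulting `idempotent` to false means an atom that forgets to declare it errs towards asking the operator, never towards repeating a physical action.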

MCP / agent surface

Agents (LLMs, MCP clients) drive the lab through agent/robot_skill_mcp/ — a FastMCP stdio server bridged into ROS:

Tool	Purpose
`list_skills`, `list_trees`	introspect the runtime registry
`compose_task`	build a BT XML from steps
`execute_tree`	dispatch `/skill_server/execute_behavior_tree`
`register_script`, `list_scripts`, `delete_script`	session-scoped agent-authored skills
`get_dryrun_status`, `approve_dry_run`	drive the sim-then-real gate
`get_pending_agent_prompts`, `submit_agent_response`	answer in-tree `Agent*` leaves (yes/no, choice, freeform, image analysis)
`read_image_snapshot`	fetch a PNG written by `AgentAnalyzeImage` (path-traversal-guarded to `~/.local/state/skill_server/agent_snapshots/`)
★ `get_topology`, `list_orch_instances`	inspect fleet state — hosts, tracked tasks, running orchestrator instances, recent events
★ `launch_profile`, `stop_profile`, `stop_all`, `kill_all`	bring up / tear down a named profile from `config/topology.yaml`
★ `kill_node`	hard-kill a ROS node by FQN (Director fans out to every reachable launch agent)
★ `spawn_orch_instance`, `terminate_orch_instance`	stand up an extra orchestrator instance ad-hoc for a parallel BT
★ `refresh_host_env`, `reload_topology_spec`	trigger `pixi install` on a worker; re-read `config/topology.yaml`
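
The path-traversal guard mentioned for `read_image_snapshot` can be approximated with `pathlib`: resolve the requested path and require that it still sits under the snapshot root. This is a generic sketch of the technique, not the MCP server's code; `resolve_snapshot` is a hypothetical name and the `.png` suffix check is an added assumption.

```python
from pathlib import Path


def resolve_snapshot(root: Path, requested: str) -> Path:
    """Return the resolved path for `requested`, or raise if it escapes root.

    Resolving first collapses ".." segments and symlinks, so the containment
    check is made on the real location rather than on the raw string.
    """
    root_resolved = root.expanduser().resolve()
    # Joining with an absolute `requested` discards the root entirely,
    # which the containment check below then rejects.
    candidate = (root_resolved / requested).resolve()
    if root_resolved not in candidate.parents:
        raise PermissionError(f"{requested!r} is outside the snapshot directory")
    if candidate.suffix.lower() != ".png":
        raise ValueError("only PNG snapshots can be read")
    return candidate
```

A caller would pass the snapshot directory as `root` and the client-supplied file name as `requested`, then open the returned path only if no exception was raised.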

Agent-authored scripts run on the agent's host via agent/robot_script_server/ — the orchestrator never executes agent code; from its point of view a registered script is just another RunScript-typed action.

In-tree agent decision steps (AgentConfirm, AgentInput, AgentDecide, AgentAnalyzeImage) flow the other direction: the BT publishes AgentPrompt on /skill_server/agent_prompts, an MCP client pulls it via get_pending_agent_prompts, and the response goes back through submit_agent_response. Sample tree at src/robot_behaviors/trees/test_agent_inspect.xml.

Live update

pixi install -e <env> uses symlink-install for Python and XML share files; edits are live on save.

Change type	Action
Behavior tree XML	save the file — `BtExecutor` polls every 2 s, dashboard updates over the latched topic
Python skill / orchestrator code	save the file — symlinked. Restart the host node if it caches state at startup
`interfaces/<msgs>/*.msg/.srv/.action`	`pixi install -e <env>` rebuilds the path-dep'd `.conda` and consumers, then restart the node
`lib/robot_skills_py/` base classes	`pixi install -e <env>` (pixi-build-ros rebuilds the path-dep'd `.conda` and consumers) + node restart
Frontend	`vite build` + `pixi install -e director` (or `pixi run dashboard-dev` for hot reload)

Adding a skill

See docs/adding-skills.md for the three authoring patterns: (A) an @action method on RobotArmActionServer, where most arm-shaped atoms land; (B) a standalone SkillNode subclass for one-off single-action atoms; (C) an InstrumentMultiActionNode subclass for multi-action instruments. Claude Code commands are also provided:

/new-skill-atom — generic primitive (lib/robot_arm_skills) or provider-specific atom
/new-compound-skill — vetted, persisted compound skill
/new-behavior-tree — XML tree
/debug-skill — diagnose a failing skill or tree
