
guyEIT/robot_behaviours


Robot Skills Framework — ROS 2 Jazzy

A discovery-based robot/instrument orchestration framework. Every skill — hardware-bound (MoveIt2 arm motion, pylabrobot instrument calls) or software-only (LLM-authored scripts) — is exposed as a self-advertising ROS 2 action. The orchestrator never hardcodes endpoints; it subscribes to latched <node>/skills manifests and dispatches goals over DDS.

Built on ROS 2 Jazzy + MoveIt2 + a Python BehaviorTree.CPP-v4-compatible executor. All action servers are Python; arm atoms drive MoveIt2 over its native action/service surface (no per-language MoveGroupInterface shim). See docs/adr/0001-action-server-language.md for the rationale.

Architecture

Three views of the same system.

Runtime architecture (docs/architecture.svg · docs/architecture.excalidraw)


What runs where: browser + agent host on the outside, orchestrator PC, per-provider PCs, the DDS plumbing between them.

Abstraction architecture (docs/abstraction.svg · docs/abstraction.excalidraw)


The Python class hierarchy in lib/robot_skills_py/: two parallel bases (SkillNode / InstrumentMultiActionNode), plus the @action decorator that every multi-action server is built from.

Dispatch chain (docs/dispatch.svg · docs/dispatch.excalidraw)


How a BT XML tag actually reaches a working skill: RosActionNode (BT-side adapter) ↔ SkillManifest (the contract published over the latched <node>/skills topic) ↔ SkillNode / InstrumentMultiActionNode (server-side base, hosts the ActionServer, dispatches its own ActionClient to MoveIt2 / pylabrobot / vendor SDKs). Includes an explicit "does MoveIt need a SkillNode?" answer.

Full text writeup: docs/architecture.md. Open the editable sources at excalidraw.com.

Core principles

  1. A ROS action is the only skill API. Arm atoms (@action-decorated methods on the RobotArmActionServer), instrument atoms (InstrumentMultiActionNodes wrapping pylabrobot / vendor SDKs), and agent-authored scripts all advertise the same way and are dispatched the same way.
  2. Discovery, not registration. Every node hosting skills publishes a latched <node>/skills manifest with TRANSIENT_LOCAL durability and LIVELINESS_AUTOMATIC QoS. Restart and late-join are handled by DDS, not heartbeats.
  3. Topology = trust. Code runs on the host that owns its execution context. The orchestrator never executes agent-authored code; it dispatches to ROS endpoints.
  4. One process per provider host. Each robot/instrument PC runs a single multi-action node hosting all of that host's atoms — RobotArmActionServer for arms, InstrumentMultiActionNode subclasses for instruments.
  5. Each top-level directory maps to one deployment role.
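
Principle 2 can be pictured as a small in-memory registry that is driven only by manifest arrivals and liveliness events. The sketch below is plain Python with no ROS dependency; `SkillRegistry`, its method names, and the node and skill names are hypothetical and are not taken from the framework.

```python
from typing import Dict, List


class SkillRegistry:
    """Merged view of every manifest currently advertised (illustrative only)."""

    def __init__(self) -> None:
        self._by_node: Dict[str, List[str]] = {}

    def on_manifest(self, node: str, skills: List[str]) -> None:
        # A latched publisher re-delivers its last manifest to late joiners,
        # so a new message simply replaces whatever was known for that node.
        self._by_node[node] = list(skills)

    def on_liveliness_lost(self, node: str) -> None:
        # Losing liveliness drops the node's skills; no heartbeat timer is needed.
        self._by_node.pop(node, None)

    def resolve(self, skill: str) -> List[str]:
        """Return the nodes that currently offer a skill."""
        return sorted(n for n, s in self._by_node.items() if skill in s)


registry = SkillRegistry()
registry.on_manifest("/meca500/arm", ["move_to_named_config", "gripper_control"])
registry.on_manifest("/fr3/arm", ["move_to_named_config"])
print(registry.resolve("move_to_named_config"))
registry.on_liveliness_lost("/fr3/arm")
print(registry.resolve("move_to_named_config"))
```

In a real deployment the two callbacks would presumably be fed by a TRANSIENT_LOCAL subscription and a liveliness event; that wiring is omitted here.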

Class hierarchy (in lib/robot_skills_py/) — two parallel bases keyed off shape, not vendor:

| Base class | Use for | Examples |
| --- | --- | --- |
| SkillNode (single-action) | One action per process | FrankaGripperSkillNode, ImagingSimNode |
| RobotSkillNode (extends SkillNode) | MoveIt-coupled single-action atom | held in reserve; single-action arm atoms are rare today |
| InstrumentSkillNode (extends SkillNode) | Single-action atom with a sim/real backend hook | held in reserve |
| InstrumentMultiActionNode (parallel base) | Many actions on one shared device / FSM | RobotArmActionServer (12 arm atoms · Meca500/FR3), MockRobotArmActionServer (13 mock atoms), RosbagSkillsNode, LiconicActionServer, HamiltonStarActionServer (14 actions on a 12-state FSM) |

@action(action_type, action_name, …) decorates each method on a multi-action subclass; the base class wires ActionServer constructors, manifest publication via SkillAdvertiser, parameter declarations, deferred-init, and (optionally) an AsyncioBridge for async-native backends. Architecture diagram: docs/abstraction.png (editable: docs/abstraction.excalidraw).
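
To make the decorator pattern concrete, here is a minimal, ROS-free sketch of how a decorator can tag methods and a base class can collect them into a manifest. It is not the framework's implementation: `ActionSpec`, `MultiActionBase`, `DemoIncubator`, and the `demo_msgs/...` type strings are hypothetical stand-ins, and the real base class also wires ActionServer constructors, parameters, and deferred init.

```python
from typing import Callable, Dict, List, NamedTuple


class ActionSpec(NamedTuple):
    """What the decorator records about one atom (hypothetical shape)."""
    action_type: str
    action_name: str
    method_name: str


def action(action_type: str, action_name: str) -> Callable:
    """Tag a method as an action handler; the base class collects the tags."""
    def decorate(func: Callable) -> Callable:
        func._action_spec = ActionSpec(action_type, action_name, func.__name__)
        return func
    return decorate


class MultiActionBase:
    """Stand-in for a multi-action base: discovers tagged methods at init."""

    def __init__(self) -> None:
        self.specs: List[ActionSpec] = []
        self.handlers: Dict[str, Callable] = {}
        for name in dir(self):
            # Inspect the class attribute so properties are never evaluated.
            attr = getattr(type(self), name, None)
            spec = getattr(attr, "_action_spec", None)
            if spec is not None:
                self.specs.append(spec)
                self.handlers[spec.action_name] = getattr(self, name)

    def manifest(self) -> List[dict]:
        """The list a real node would publish on its latched skills topic."""
        return [{"name": s.action_name, "type": s.action_type} for s in self.specs]


class DemoIncubator(MultiActionBase):
    @action("demo_msgs/action/TakeIn", "take_in")
    def take_in(self, goal: dict) -> str:
        return f"stored {goal['plate']}"

    @action("demo_msgs/action/Fetch", "fetch")
    def fetch(self, goal: dict) -> str:
        return f"fetched {goal['plate']}"


node = DemoIncubator()
print(node.manifest())
print(node.handlers["fetch"]({"plate": "P1"}))
```

The point of the pattern is that adding an atom is a one-method change: the manifest and the dispatch table are both derived from the decorated methods.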

Directory layout

| Top-level dir | Role | Runs on |
| --- | --- | --- |
| lib/ | Shared libraries (host-agnostic, no runtime) | Consumed by every host |
| src/ | Orchestrator processes + assets | Orchestrator PC |
| agent/ | MCP-host processes + agent-local services | Agent's machine (anywhere on DDS) |
| providers/ | Per-robot/instrument code | The robot/instrument PC |

Providers

| Provider | Code | Skills |
| --- | --- | --- |
| Mecademic Meca500 6-DoF arm | providers/meca500/ | All 12 arm atoms hosted in one RobotArmActionServer Python process talking to MoveIt2 and ros2_control over their native ROS interfaces (MoveToNamed/Joint/Cartesian, MoveCartesianLinear, Gripper, SetDIO, RobotEnable, CheckSystemReady, CheckCollision, UpdatePlanningScene, DetectObject, CapturePointCloud), plus RecordRosbag / StopRecording from a sibling rosbag_skills_node. |
| Franka FR3 7-DoF arm + Franka Hand | providers/fr3/ | Same 12 arm atoms via the unified RobotArmActionServer (re-parameterized for FR3 / panda_arm planning group). franka_gripper_skill adds a FrankaGripperControl atom that bridges robot_skills_msgs/FrankaGripperControl onto the upstream franka_gripper Move/Grasp/Homing action set. |
| Arm mock sim (Meca500 + FR3) | providers/arm_mock_sim/ | Director-managed mock arm atoms. Single MockRobotArmActionServer (InstrumentMultiActionNode) hosts all 13 mock atoms: same action interface as the production server, no MoveIt2. Launched per-robot under /meca500 or /fr3 namespace by the Director's meca500-mock-sim-run / fr3-mock-sim-run tasks. |
| PBI Liconic STX44 incubator | providers/pbi_liconic/ | TakeIn, Fetch (pylabrobot); single InstrumentMultiActionNode. |
| Hamilton STAR liquid handler | providers/pbi_liconic/ | 14 actions on one InstrumentMultiActionNode (12-state FSM gated through gate_goal): MoveResource, HandoffTransfer, PickUpCoreGripper, ReturnCoreGripper, Aspirate / Dispense / PickUpTips / DropTips (single-channel and 96-channel), JogChannel, Transfer. |
| Imaging station (sim) | providers/imaging_station/ | ImagePlate (idempotent); single-action SkillNode. Sim backend writes placeholder PNGs; real driver (BMG / Tecan Spark / BioTek Cytation / microscope) swaps in via the same action. |

Both Liconic and Hamilton are pulled from the guyEIT/pbi_liconic upstream as a git subtree. See CLAUDE.md for the subtree pull/push commands.

The imaging-station provider is a fresh in-tree provider, not a subtree — it owns the robot_skills_msgs/action/ImagePlate interface and ships a sim backend so the campaign behaviour tree can run a full Liconic ↔ Hamilton ↔ Imager ↔ Hamilton ↔ Liconic loop without imager hardware. Real driver backends (BMG / Tecan / BioTek / microscope) plug into the same action interface.

Quick start

Prerequisites

  • Pixi (Linux desktop). The Docker Compose path is deprecated — keep it only for legacy bring-up; new work uses pixi natively.

One-time bootstrap (single host)

None — pixi install -e <env> resolves the four interfaces/ packages and the three framework helpers in lib/ via path-deps. No conda channel preload required.

Optional: populate ~/channel for cross-host distribution

pixi run channel-pack

Builds every workspace ROS package and harvests the resulting .conda artifacts into ~/channel/. Run on the Director PC once, then pixi run channel-serve-up exposes the channel over HTTP so worker PCs can pull pre-built artifacts instead of rebuilding from source. See Distributed launch.

Run the whole lab in sim (director-managed)

pixi run -e director director-up
# In the dashboard's Topology panel: pick "lab-sim" → Launch

Brings up every provider sim (mock arm atoms for Meca500 + FR3, sim backends for Liconic + Hamilton) as independent director-managed processes, then the orch_lab_sim orchestrator. Each can be killed and restarted from the dashboard independently.

Per-provider sim (one provider in isolation)

pixi run -e director meca500-mock-sim-run   # mock Meca500 atoms under /meca500 (no MoveIt2)
pixi run -e director fr3-mock-sim-run       # mock FR3 atoms under /fr3 (no MoveIt2)
pixi run hamilton-sim-test                  # STAR sim backend + skill_server
pixi run liconic-sim-test                   # Liconic sim backend + skill_server

Submit the matching test tree from src/robot_behaviors/trees/.

Single-box deployments

| Mode | Command | What runs |
| --- | --- | --- |
| real-native | pixi run real-native-up | full real-robot stack on one PC |

Distributed production

Per-PC pixi envs install only what each box needs. The four ROS interface packages and three framework helpers are path-deps inside the workspace (built locally on each host), or — for fleet deploys — fetched as pre-built .conda artifacts from the Director's HTTP channel (see "Distributed launch" below).

| Env | Target PC | Tasks |
| --- | --- | --- |
| orchestrator | control PC (legacy single-instance) | orchestrator-up |
| director | control PC (fleet-wide, Tier 1 dashboard, central pixi channel) | director-up, channel-serve-up, meca500-mock-sim-run, fr3-mock-sim-run |
| launch-agent | every PC the Director should manage | launch-agent-up |
| meca500-host | Meca500 robot PC | meca500-real-run |
| fr3-host | FR3 robot PC | fr3-real-run |
| liconic-host | Liconic / Hamilton PC | liconic-up, liconic-sim-up, hamilton-up, hamilton-sim-up |
| real-native | single-box real robot | real-native-up |

Distributed launch (Topology Director) ★

Three control layers move the "which pixi run task lives on which PC" knowledge out of operators' shells and onto a single dashboard / MCP / YAML surface:

  1. Topology Director (src/topology_director/) on the control PC reads config/topology.yaml, supervises N skill-server orchestrator instances (each in its own ROS namespace so multiple BTs can run in parallel), and fans launch / stop / kill commands out to per-host launch agents.
  2. Launch agents (src/launch_agent/) — one per PC — own subprocess spawn + SIGINT → SIGTERM → SIGKILL escalation + kill_node for arbitrary ROS nodes.
  3. Central pixi channel: python -m http.server over ~/channel on the Director PC. Worker hosts pull pre-built binary packages (ros-jazzy-robot-skills-msgs, ros-jazzy-franka-msgs, ros-jazzy-liconic-msgs, ros-jazzy-hamilton-star-msgs, ros-jazzy-launch-agent, …) instead of each rebuilding from source.
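
The SIGINT → SIGTERM → SIGKILL escalation that the launch agents own can be sketched with the standard library alone. This is an illustration of the technique on POSIX hosts, not the agent's actual code: `stop_with_escalation`, `spawn_stubborn_child`, and the grace period are assumptions.

```python
import signal
import subprocess
import sys

# Child that ignores SIGINT and SIGTERM, so only SIGKILL can stop it.
STUBBORN_CHILD = (
    "import signal, time;"
    "signal.signal(signal.SIGINT, signal.SIG_IGN);"
    "signal.signal(signal.SIGTERM, signal.SIG_IGN);"
    "print('ready', flush=True);"
    "time.sleep(60)"
)


def spawn_stubborn_child() -> subprocess.Popen:
    """Start the test child and wait until its signal handlers are installed."""
    proc = subprocess.Popen([sys.executable, "-c", STUBBORN_CHILD],
                            stdout=subprocess.PIPE)
    proc.stdout.readline()  # blocks until the child prints 'ready'
    return proc


def stop_with_escalation(proc: subprocess.Popen, grace_s: float = 2.0) -> int:
    """Send SIGINT, then SIGTERM, then SIGKILL, waiting grace_s after each."""
    for sig in (signal.SIGINT, signal.SIGTERM, signal.SIGKILL):
        if proc.poll() is not None:
            break  # already exited, nothing more to send
        proc.send_signal(sig)
        try:
            proc.wait(timeout=grace_s)
        except subprocess.TimeoutExpired:
            pass  # still alive: escalate to the next signal
    return proc.wait()  # SIGKILL cannot be ignored, so this returns


if __name__ == "__main__":
    child = spawn_stubborn_child()
    code = stop_with_escalation(child, grace_s=0.5)
    child.stdout.close()
    print("exit code:", code)
```

A well-behaved ROS node normally exits on the first SIGINT; the later stages exist for processes that hang during shutdown.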

All Pixi environments include the shared ros-network activation feature. The same defaults also live in scripts/ros-network-env.sh, which daemon startup, Director/launch-agent foreground startup, self-restart helpers, and launch-agent supervised service launches source before starting ROS processes:

RMW_IMPLEMENTATION=rmw_fastrtps_cpp
ROS_DOMAIN_ID=0
ROS_AUTOMATIC_DISCOVERY_RANGE=SUBNET
ROS_STATIC_PEERS=10.6.104.87        # Director (callus) — every host unicasts SPDP here

Only ROS-framework env vars — no DDS vendor XML, no per-host interface pinning. Fast DDS (the Jazzy default, REP-2000 Tier-1) listens on every NIC out of the box.

Why static peers instead of pure multicast: the lab's 10.6.104.0/24 and 10.6.105.0/24 subnets are L3-routed via a common gateway. ROS 2 SPDP multicast uses TTL=1, so discovery packets do not survive the gateway hop. Setting ROBOT_BEHAVIOURS_ROS_PEERS to the Director's IP means every worker unicasts its SPDP announcement directly to the Director; the Director learns each worker's address from the received packet and can reply. No per-worker IP config is needed on the Director. Once the network admin enables multicast routing between the two subnets, static peers can be removed and SUBNET multicast will handle discovery automatically.
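
The subnet reasoning can be checked with Python's `ipaddress` module. The helper names below are hypothetical, and 10.6.104.12 is a made-up worker address used only for illustration; the other two addresses and the /24 prefix come from the text above.

```python
import ipaddress
from typing import Iterable, List


def same_subnet(a: str, b: str, prefix: int = 24) -> bool:
    """True when b lies inside a's subnet for the given prefix length."""
    network = ipaddress.ip_network(f"{a}/{prefix}", strict=False)
    return ipaddress.ip_address(b) in network


def peers_needing_unicast(local_ip: str, remote_ips: Iterable[str],
                          prefix: int = 24) -> List[str]:
    """Remote hosts that TTL=1 multicast cannot reach from local_ip."""
    return [ip for ip in remote_ips if not same_subnet(local_ip, ip, prefix)]


director = "10.6.104.87"
workers = ["10.6.104.12", "10.6.105.23"]  # first address is hypothetical
print(peers_needing_unicast(director, workers))
```

Hosts that share the Director's /24 are reachable by link-local multicast; anything on the other side of the gateway needs a static peer entry until multicast routing is enabled.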

Per-host overrides: create scripts/ros-peers.local (gitignored, sourced by ros-network-env.sh) to add further peers without touching git — useful for worker-to-worker traffic or temporary IPs:

# scripts/ros-peers.local (on any host that needs extra peers)
ROBOT_BEHAVIOURS_ROS_PEERS="10.6.104.87 10.6.105.23"

Verifying multicast forwarding (to confirm when the network admin has made the change):

# Run simultaneously — listener on Director, sender on a worker
pixi run -e director multicast-smoke -- --listen --seconds 20
pixi run -e launch-agent multicast-smoke -- --send --count 10

multicast_received=yes means SUBNET multicast works across the subnets and static peers are no longer needed.

General DDS smoke tests:

# Inspect whether a worker launch-agent is visible from the Director
pixi run -e director ros-dds-check -- --host meca500-control

# Cross-host raw DDS beacon test
pixi run -e launch-agent ros-dds-beacon
pixi run -e director ros-dds-check -- --listen meca500-control

Note: meca500-control as a hostname resolves to its Tailscale IP via MagicDNS — that's irrelevant to lab discovery. DDS uses the host's 10.6.10x.* interface; the Tailscale address is never involved. pixi run ros-network-setup prints the effective values on any host; daemon logs record them at start.

Director PC bringup

git pull
pixi install -e director                      # resolves Director + launch-agent + dashboard envs (path-deps; no channel preload required)

pixi run channel-pack                         # OPTIONAL: harvests every workspace .conda into ~/channel for worker PCs
pixi run channel-serve-up                     # background daemon: serves ~/channel over HTTP on :8082
pixi run launch-agent-up                      # background daemon: lets the Director manage tasks on this PC too
pixi run director-up                          # background daemon: Director + rosbridge + dashboard

Open http://<director-host>:8081/ for the Tier 1 home (topology + instance + kill controls). Tier 2 per-instance dashboards open from any instance row at /instance/<name>.

Worker PC bringup (Meca500, FR3, Liconic, …)

Each worker only needs the launch agent — the Director will tell it what to spawn at runtime via /launch_agents/<hostname>/launch_task.

git clone <repo>                              # or git pull on an existing clone
cd robot_behaviours

# Get the prebuilt msgs + launch_agent packages from the Director's HTTP channel
# (or just `pixi install -e launch-agent` locally if you'd rather build from source on the worker).
echo 'http://<director-host>:8082' | pixi config append --workspace channels -

pixi install -e launch-agent                  # resolves the agent + msgs from the Director's channel
pixi run -e launch-agent launch-agent-up      # background daemon: agent listens at /launch_agents/<hostname>/

That's it. The agent's heartbeat on /launch_agents/<hostname>/info (latched JSON) auto-registers the worker with the Director. After that, pixi run -e launch-agent launch-agent-status confirms it's alive, …-logs tails the daemon log, and …-down stops it.

The agent shells out to pixi run -e <env> <task> on demand, so the worker also needs whichever per-host env hosts the actual workload (meca500-host, fr3-host, liconic-host, …) installed alongside launch-agent. Adding the Director's HTTP channel means those envs solve from prebuilt binaries too — no C++ toolchain required on the worker.

Remote launch-agent updates

Fleet machines follow the production git branch for operator-triggered updates. Development can continue on main; promote a known-good commit by fast-forwarding production to that commit and pushing it:

git checkout main
git pull --ff-only
git checkout production
git merge --ff-only main
git push origin production

Each launch agent owns updates for its local checkout. The Director only relays the request; the target host runs the git and pixi work itself:

  1. /director/update_launch_agent receives a host name, or JSON such as {"host":"meca500-control","branch":"production","strategy":"ff-only","git":true}.
  2. The Director forwards that JSON to /launch_agents/<host>/update_self.
  3. The launch agent fetches origin/<branch>, refuses to continue if the checkout is dirty, fast-forwards the branch, runs pixi install -e launch-agent, sources scripts/ros-network-env.sh, then restarts its own daemon from a detached helper process.
  4. The host briefly disappears from the dashboard while the daemon restarts, then re-advertises /launch_agents/<host>/info.

The default launch-agent parameters are self_update_remote:=origin, self_update_branch:=production, and self_update_strategy:=ff-only. reset-hard exists for locked-down production checkouts where local edits must be discarded, but ff-only is the normal safe mode. The heartbeat includes git branch, short SHA, and dirty state so the Tier 1 dashboard can show which revision each host is running.
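
As a rough illustration of the request handling described in this section, the sketch below accepts either a bare host name or a JSON object and fills in the documented defaults. The function names are hypothetical. Treating `git` as defaulting to true, and letting reset-hard proceed on a dirty checkout, are assumptions inferred from the prose rather than confirmed behaviour.

```python
import json

# Defaults taken from the documented launch-agent parameters;
# "git": True is an assumption.
DEFAULTS = {"branch": "production", "strategy": "ff-only", "git": True}
STRATEGIES = ("ff-only", "reset-hard")


def parse_update_request(payload: str) -> dict:
    """Accept a bare host name or a JSON object and apply the defaults."""
    text = payload.strip()
    request = json.loads(text) if text.startswith("{") else {"host": text}
    if not request.get("host"):
        raise ValueError("update request needs a host")
    merged = {**DEFAULTS, **request}
    if merged["strategy"] not in STRATEGIES:
        raise ValueError(f"unknown strategy: {merged['strategy']}")
    return merged


def may_update(request: dict, checkout_dirty: bool) -> bool:
    """ff-only refuses a dirty checkout; reset-hard discards local edits."""
    return request["strategy"] == "reset-hard" or not checkout_dirty


print(parse_update_request("meca500-control"))
print(parse_update_request('{"host": "meca500-control", "strategy": "reset-hard"}'))
```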

Restart without git/pixi is separate: /director/restart_launch_agent relays to /launch_agents/<host>/restart_self, which only bounces the daemon.

Daemon controls

| Foreground | Background daemon | Stop | Status | Logs |
| --- | --- | --- | --- | --- |
| channel-serve | channel-serve-up | channel-serve-down | channel-serve-status | channel-serve-logs |
| launch-agent-run | launch-agent-up | launch-agent-down | launch-agent-status | launch-agent-logs |
| director-run | director-up | director-down | director-status | director-logs |

Cross-platform (macOS bash 3 + Linux bash 5), no systemd / launchctl dependency. PIDs and logs under ~/.local/state/robot_behaviours/daemons/<name>/. -up is idempotent; -down does TERM → 5 s grace → KILL of the whole subprocess tree. scripts/daemon.sh start ... sources scripts/ros-network-env.sh and writes the effective ROS network values to the daemon log before detaching.

Bringing up a fleet from one place

# CLI
ros2 service call /director/launch_profile robot_skills_msgs/srv/LaunchProfile \
  '{profile_name: "real-parallel", continue_on_failure: false}'

# Headline capability: real-parallel spawns two skill_server instances under
# /orch_meca and /orch_fr3 so two BT campaigns run concurrently.
ros2 action send_goal /orch_meca/skill_server/execute_behavior_tree ... &
ros2 action send_goal /orch_fr3/skill_server/execute_behavior_tree  ... &

# Selective kill — only that one orchestrator dies, everything else keeps running.
ros2 service call /director/kill_node robot_skills_msgs/srv/KillRosNode \
  '{node_name: "/orch_meca/skill_server"}'

# Hard kill the whole fleet (escape hatch)
ros2 service call /director/kill_all robot_skills_msgs/srv/CancelActiveTask '{reason: ""}'

Every service is also exposed via the MCP server (launch_profile, kill_node, spawn_orch_instance, terminate_orch_instance, list_orch_instances, get_topology, …) and via the Tier 1 dashboard.

config/topology.yaml ships five reference profiles: lab-sim, real-full-serial, real-parallel (the headline parallel-BT capability), real-meca-only, and parallel-sim. Edit the file and call /director/reload_spec to pick up changes without restarting.

Multi-instance namespacing caveat. orchestrator.launch.py accepts namespace:=… and pushes the orchestrator nodes under it, but the skill_server still publishes/subscribes on absolute /skill_server/... paths in many places. Running multiple instances at the root namespace works today (legacy single-orchestrator path); running disjoint instances under /orch_meca / /orch_fr3 requires the absolute-topic conversion tracked in src/robot_skill_server/NAMESPACE_AUDIT.md. Tier 2 dashboards display a banner when they're viewing a non-root instance.
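
The caveat comes down to how ROS resolves names: a relative name is prefixed with the node's namespace, while a name that starts with `/` is left untouched. The simplified resolver below illustrates this; it ignores the private `~` form and remapping, and `resolve_name` is a hypothetical helper rather than a ROS API.

```python
def resolve_name(namespace: str, name: str) -> str:
    """Simplified ROS-style name resolution (no '~' expansion, no remaps)."""
    if name.startswith("/"):
        return name  # absolute: the namespace is ignored
    ns = namespace.strip("/")
    return f"/{ns}/{name}" if ns else f"/{name}"


relative = "skill_server/execute_behavior_tree"
absolute = "/skill_server/execute_behavior_tree"

# Relative names stay disjoint across instances.
print(resolve_name("/orch_meca", relative))
print(resolve_name("/orch_fr3", relative))

# Absolute names collide: both instances end up on the same path.
print(resolve_name("/orch_meca", absolute) == resolve_name("/orch_fr3", absolute))
```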

Common workflows

# Rebuild after editing a .msg/.srv/.action — pixi-build-ros tracks the
# extra-input-globs in interfaces/<pkg>/pixi.toml and rebuilds the path-dep'd
# .conda automatically on the next install. Just re-run the env install:
pixi install -e <env>

# Open a sourced ROS shell
pixi run lite-native-shell        # or real-native-shell, meca500-host shell, etc.

# Inspect the running ROS graph
pixi run status

# Run the test suite
pixi run test

Calling skills

# List the skill registry (merged view of every */skills topic)
ros2 service call /skill_server/get_skill_descriptions \
  robot_skills_msgs/srv/GetSkillDescriptions \
  '{include_compounds: true, include_pddl: false}'

# Execute a behavior tree XML
ros2 action send_goal /skill_server/execute_behavior_tree \
  robot_skills_msgs/action/ExecuteBehaviorTree \
  "$(python3 -c 'import yaml,sys; xml=open(sys.argv[1]).read(); print(yaml.safe_dump({"tree_xml": xml, "tree_name": "demo", "target_mode": 0}, default_style="|"))' src/robot_behaviors/trees/move_to_home.xml)"

# Compose a tree from skill steps
ros2 service call /skill_server/compose_task \
  robot_skills_msgs/srv/ComposeTask \
  '{
    task_name: "my_task",
    sequential: true,
    steps: [
      {skill_name: "move_to_named_config", parameters_json: "{\"config_name\": \"home\"}"},
      {skill_name: "gripper_control",       parameters_json: "{\"command\": \"open\"}"}
    ]
  }'

# Sanity-check a plan against PDDL preconditions / effects before running it
ros2 service call /skill_server/validate_plan \
  robot_skills_msgs/srv/ValidatePlan \
  '{
    initial_state: ["robot_initialized", "gripper_open"],
    steps: [
      {skill_name: "move_to_named_config", parameters_json: "{\"config_name\": \"home\"}"},
      {skill_name: "pick_object",           parameters_json: "{}"}
    ]
  }'
# Returns valid=true + final_state, OR valid=false + first_failing_step
# + missing_preconditions[] pointing at the offending entry.
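
For intuition, the check that validate_plan performs can be sketched as a forward pass over STRIPS-style preconditions and effects. The skill table below is invented for illustration (the real one comes from the skill manifests), and the result is a plain dict rather than the actual service response type.

```python
from typing import Dict, Iterable, List

# Hypothetical precondition/effect table, for illustration only.
SKILLS: Dict[str, Dict[str, set]] = {
    "move_to_named_config": {
        "pre": {"robot_initialized"},
        "add": {"at_named_config"},
        "delete": set(),
    },
    "pick_object": {
        "pre": {"at_named_config", "gripper_open"},
        "add": {"holding_object"},
        "delete": {"gripper_open"},
    },
}


def validate_plan(initial_state: Iterable[str], steps: List[str],
                  skills: Dict[str, Dict[str, set]] = SKILLS) -> dict:
    """Apply each step's effects in order; stop at the first unmet precondition."""
    state = set(initial_state)
    for index, skill_name in enumerate(steps):
        spec = skills[skill_name]
        missing = sorted(spec["pre"] - state)
        if missing:
            return {"valid": False, "first_failing_step": index,
                    "missing_preconditions": missing}
        state = (state - spec["delete"]) | spec["add"]
    return {"valid": True, "final_state": sorted(state)}


plan = ["move_to_named_config", "pick_object"]
print(validate_plan(["robot_initialized", "gripper_open"], plan))
print(validate_plan(["robot_initialized"], plan))
```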

target_mode on ExecuteBehaviorTree: 0 = MODE_REAL (default, back-compat), 1 = MODE_SIM (one-shot dry-run), 2 = MODE_SIM_THEN_REAL (sim → operator approval gate → real).
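
Client code that builds goals may find it clearer to name these values than to pass bare integers. The enum below mirrors the three documented values; `phases_for` and its phase labels are hypothetical and only summarize the documented order of phases.

```python
from enum import IntEnum
from typing import List


class TargetMode(IntEnum):
    MODE_REAL = 0           # default, back-compat
    MODE_SIM = 1            # one-shot dry-run
    MODE_SIM_THEN_REAL = 2  # sim, then operator approval gate, then real


def phases_for(mode: int) -> List[str]:
    """Ordered phases implied by a target_mode value (labels are illustrative)."""
    mode = TargetMode(mode)  # raises ValueError for unknown values
    if mode is TargetMode.MODE_REAL:
        return ["real"]
    if mode is TargetMode.MODE_SIM:
        return ["sim"]
    return ["sim", "approval", "real"]


goal = {"tree_name": "demo", "target_mode": int(TargetMode.MODE_SIM_THEN_REAL)}
print(goal, phases_for(goal["target_mode"]))
```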

Sim-before-real workflow ★

Long-running plans (multi-step assays, hours-long incubations) get a fast pre-flight against a paired /sim/* action surface, then a human approval gate before the real phase runs.

  • Each provider launch (make_robot_skill_server_launch) accepts namespace_prefix:=/sim and wraps its action servers in a PushRosNamespace group.
  • SkillDiscovery filters out /sim/* manifests so the registry is single-source-of-truth on real entries.
  • BtExecutor reads a sim_namespace_prefix parameter (default /sim) and prepends it to every server name during the SIM phase — same XML, both phases. sim_lab.launch.py overrides this to "" because lab-sim has no separate real backend; MODE_SIM and MODE_REAL then both resolve to the bare-path atoms. lab-up keeps the default and brings up paired sim+real action servers on one box for end-to-end approval-gate testing.
  • On a successful sim phase, BtExecutor latches a DryRunStatus on /skill_server/dryrun_status and waits on /skill_server/approve_dry_run (ApproveDryRun.srv). The dashboard surfaces an approve/reject modal.
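
The prefixing and filtering rules in the list above reduce to two small string functions. This is a sketch under stated assumptions: the function names are hypothetical, and `/meca500/move_to_named_config` is an illustrative server path rather than a confirmed one.

```python
def server_name_for_phase(server_name: str, phase: str,
                          sim_prefix: str = "/sim") -> str:
    """Prepend the sim prefix during the sim phase; leave real names alone."""
    if phase == "sim" and sim_prefix:
        return sim_prefix.rstrip("/") + "/" + server_name.lstrip("/")
    return server_name


def is_sim_manifest(topic: str, sim_prefix: str = "/sim") -> bool:
    """True for manifests that discovery should filter out of the registry."""
    return topic.startswith(sim_prefix.rstrip("/") + "/")


name = "/meca500/move_to_named_config"  # illustrative server path
print(server_name_for_phase(name, "sim"))                 # paired sim surface
print(server_name_for_phase(name, "real"))                # unchanged
print(server_name_for_phase(name, "sim", sim_prefix=""))  # lab-sim style override
print(is_sim_manifest("/sim/meca500/arm/skills"))
```

With an empty prefix both phases resolve to the same bare path, which matches the lab-sim behaviour described above.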

Long-lived campaign workflow ★

For trees that run for weeks — operators trickling plates in and out of the Liconic, cycling each through the Hamilton-iSWAP to the imaging station and back — the framework persists tree state to SQLite and resumes after a skill_server crash without losing progress.

What survives a restart:

  • A SQLite-backed persistent blackboard: any key prefixed persistent. is mirrored to ~/.local/state/skill_server/tasks/{task_id}/state.db (WAL mode). Type-checked at write — only JSON-serialisable values land on disk.
  • Per-node tick checkpoints on every control / decorator / loop node (Sequence index, Repeat iteration, RetryUntilSuccessful attempt, WaitUntil deadline). On resume, the executor re-ticks from the root and each Checkpointable node hydrates its index — no work is repeated past the last successful child.
  • Action-inflight reconciliation: every RosActionNode records (node_path, server_name, goal_uuid, idempotent) before submitting. If a goal was in flight at crash time, the resume path inspects the row — idempotent skills auto-resubmit; non-idempotent skills refuse and surface an alert for the operator to resolve via OperatorDecision.
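
The persistent-blackboard idea can be sketched with `sqlite3` and `json` from the standard library. The class below is not the framework's implementation: the `kv` table layout, the class name, and the method names are assumptions. It only demonstrates the three documented properties: the `persistent.` prefix, WAL mode, and rejection of values that are not JSON-serialisable.

```python
import json
import sqlite3
import tempfile
from pathlib import Path
from typing import Any, Dict


class PersistentBlackboard:
    """In-memory dict whose 'persistent.'-prefixed keys are mirrored to SQLite."""

    PREFIX = "persistent."

    def __init__(self, db_path: Path) -> None:
        self._mem: Dict[str, Any] = {}
        self._db = sqlite3.connect(str(db_path))
        self._db.execute("PRAGMA journal_mode=WAL")
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value TEXT NOT NULL)")
        self._db.commit()
        # Hydrate from disk so a restarted process sees the previous state.
        for key, value in self._db.execute("SELECT key, value FROM kv"):
            self._mem[key] = json.loads(value)

    def set(self, key: str, value: Any) -> None:
        if key.startswith(self.PREFIX):
            encoded = json.dumps(value)  # TypeError if not JSON-serialisable
            self._db.execute(
                "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)", (key, encoded))
            self._db.commit()
        self._mem[key] = value

    def get(self, key: str, default: Any = None) -> Any:
        return self._mem.get(key, default)

    def close(self) -> None:
        self._db.close()


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "state.db"
    bb = PersistentBlackboard(path)
    bb.set("persistent.plate_queue", ["P1", "P2"])
    bb.set("scratch", 42)  # not persisted: no prefix
    bb.close()

    restarted = PersistentBlackboard(path)  # simulates a skill_server restart
    print(restarted.get("persistent.plate_queue"), restarted.get("scratch"))
    restarted.close()
```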

New control / utility nodes in tree_executor.py:

  • KeepRunningUntilFailure, Repeat num_cycles="N", WhileDoElse
  • WaitUntil timestamp="{...}" — wall-clock-aware sleep (deadline-preserving across restart)
  • BlackboardCondition key="..." expected="..." — gate a subtree on a persistent flag
  • PopFromQueue / PushToQueue — list-valued blackboard queues for operator-driven work
  • AdvancePlate — post-cycle bookkeeping (increments cycle, recomputes next_due_at, retires when target reached)
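
As an example of the bookkeeping these nodes do, the AdvancePlate step can be approximated by a pure function over a plate dict. The field names (`cycle`, `target_cycles`, `cadence_s`, `next_due_at`, `status`) are guesses for illustration and are not taken from the framework's schema.

```python
from typing import Optional


def advance_plate(plate: dict, now_s: float) -> dict:
    """Increment the cycle, recompute next_due_at, retire at the target."""
    updated = dict(plate)  # leave the caller's dict untouched
    updated["cycle"] = plate["cycle"] + 1
    if updated["cycle"] >= plate["target_cycles"]:
        updated["status"] = "retired"
        updated["next_due_at"] = None
    else:
        updated["status"] = "active"
        updated["next_due_at"] = now_s + plate["cadence_s"]
    return updated


def seconds_until_due(next_due_at: Optional[float], now_s: float) -> float:
    """Deadline-preserving wait: computed from wall-clock time, not elapsed ticks."""
    if next_due_at is None:
        return 0.0
    return max(0.0, next_due_at - now_s)


plate = {"name": "P1", "cycle": 0, "target_cycles": 2, "cadence_s": 3600.0}
after_first = advance_plate(plate, now_s=1000.0)
after_second = advance_plate(after_first, now_s=4600.0)
print(after_first)
print(after_second)
```

Because the deadline is stored as an absolute timestamp, a restart in the middle of a wait only shortens the remaining sleep instead of restarting it.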

Operator services — split between bb_operator (campaign-level state) and skill_server (framework-level execution control):

| Service | Owner | Purpose |
| --- | --- | --- |
| /bb_operator/add_plate | sidecar | append a plate dict to persistent.plate_queue (trickle-in) |
| /bb_operator/retire_plate | sidecar | flag plates.{name}.retiring = true so the in-flight cycle finishes naturally and isn't re-queued |
| /bb_operator/pause_campaign | sidecar | toggle persistent.paused; BlackboardCondition gate halts the next iteration boundary |
| /bb_operator/operator_decision | sidecar | resolve a stuck non-idempotent action (retry / skip-as-success / skip-as-failure / abort-tree) |
| /skill_server/pause_execution | bt_executor | framework-level pause that sets ctx.paused, honoured at step boundaries by every control-flow node; works for any tree, not just campaigns |
| /skill_server/cancel_active_task | bt_executor | session-independent hard cancel; walks _current_ctx, sets cancelled, and tears down in-flight goals; reachable by any client (the action-cancel handshake requires the original goal id, which a restarted dashboard doesn't have) |

Live dashboard view. The dashboard's Campaign panel (added 2026-04-27) subscribes to /skill_server/persistent_state — a latched JSON snapshot of the active task's persistent blackboard, republished by bb_operator after every service handler + a 1 Hz timer. Renders the plate queue + per-name index as a table with cycle / cadence / next-due / status; surfaces Add Plate (modal dialog), Pause after step / Resume, Cancel (hard halt via /skill_server/cancel_active_task), and per-row Retire trash icons. The Campaign preset layout (Layouts → CAMPAIGN) tiles it alongside Task Monitor + BT Tree + Executor + Logs. Validated end-to-end via Playwright + chromium-headless: empty-state → submit campaign tree → AddPlate → Pause → Resume → Cancel.

The campaign is defined by a campaign file (src/robot_behaviors/campaigns/plate_imaging_standard.campaign.xml) whose phases name behavior trees; the orchestrator (campaign_manager + bb_operator) owns the scheduling / dispatch / advancement loop. The per-plate cycle plate_imaging_cycle.xml is LiconicFetch → ObsImagingSequence (ObsBot PTZ, 3 presets) → LiconicTakeIn — the plate is imaged in place on the Liconic transfer tray (the Liconic shovel presents it; the ObsBot camera images it). The Hamilton iSWAP is not in the imaging leg — it only loads/unloads the incubator at campaign start/end.

Skill idempotency is declared per atom in SkillDescription.idempotent (defaults to false). The resume path uses it to decide whether to auto-resubmit on goal-gone or to halt and ask the operator.
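
The resume decision can be summarized in a few lines. The row fields follow the tuple recorded by RosActionNode (node_path, server_name, goal_uuid, idempotent) and the four operator choices come from the operator_decision service. The function names and the returned outcome labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InflightRow:
    node_path: str
    server_name: str
    goal_uuid: str
    idempotent: bool


def reconcile(row: InflightRow) -> str:
    """Decide what the resume path does with a goal that was in flight at crash time."""
    return "resubmit" if row.idempotent else "ask_operator"


# Mapping from operator choices to illustrative outcomes for the stuck node.
OPERATOR_OUTCOMES = {
    "retry": "resubmit",
    "skip-as-success": "node_success",
    "skip-as-failure": "node_failure",
    "abort-tree": "halt_tree",
}


def apply_operator_decision(choice: str) -> str:
    if choice not in OPERATOR_OUTCOMES:
        raise ValueError(f"unknown operator decision: {choice}")
    return OPERATOR_OUTCOMES[choice]


imaging = InflightRow("root/cycle/image", "/imager/image_plate", "uuid-1", True)
transfer = InflightRow("root/cycle/transfer", "/hamilton/transfer", "uuid-2", False)
print(reconcile(imaging), reconcile(transfer))
print(apply_operator_decision("skip-as-success"))
```

The server paths and UUIDs in the usage lines are placeholders.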

MCP / agent surface

Agents (LLMs, MCP clients) drive the lab through agent/robot_skill_mcp/ — a FastMCP stdio server bridged into ROS:

| Tool | Purpose |
| --- | --- |
| list_skills, list_trees | introspect the runtime registry |
| compose_task | build a BT XML from steps |
| execute_tree | dispatch /skill_server/execute_behavior_tree |
| register_script, list_scripts, delete_script | session-scoped agent-authored skills |
| get_dryrun_status, approve_dry_run | drive the sim-then-real gate |
| get_pending_agent_prompts, submit_agent_response | answer in-tree Agent* leaves (yes/no, choice, freeform, image analysis) |
| read_image_snapshot | fetch a PNG written by AgentAnalyzeImage (path-traversal-guarded to ~/.local/state/skill_server/agent_snapshots/) |
| get_topology, list_orch_instances | inspect fleet state: hosts, tracked tasks, running orchestrator instances, recent events |
| launch_profile, stop_profile, stop_all, kill_all | bring up / tear down a named profile from config/topology.yaml |
| kill_node | hard-kill a ROS node by FQN (Director fans out to every reachable launch agent) |
| spawn_orch_instance, terminate_orch_instance | stand up an extra orchestrator instance ad-hoc for a parallel BT |
| refresh_host_env, reload_topology_spec | trigger pixi install on a worker; re-read config/topology.yaml |
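
The path-traversal guard mentioned for read_image_snapshot is a common pattern worth spelling out. The sketch below shows one way to do it with `pathlib` (Python 3.9+ for `is_relative_to`); it is not the MCP server's code, and `resolve_snapshot` is a hypothetical name.

```python
import tempfile
from pathlib import Path


def resolve_snapshot(root: Path, requested: str) -> Path:
    """Resolve a client-supplied name and refuse anything outside root."""
    root = root.resolve()
    candidate = (root / requested).resolve()  # collapses '..' and symlinks
    if not candidate.is_relative_to(root):
        raise PermissionError(f"{requested!r} escapes the snapshot directory")
    return candidate


with tempfile.TemporaryDirectory() as tmp:
    snapshots = Path(tmp)
    print(resolve_snapshot(snapshots, "plate_1.png").name)
    for attempt in ("../secret.png", "/etc/passwd"):
        try:
            resolve_snapshot(snapshots, attempt)
        except PermissionError as err:
            print("rejected:", err)
```

Resolving before comparing matters: a plain string-prefix check on the unresolved path would accept `snapshots/../secret.png`.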

Agent-authored scripts run on the agent's host via agent/robot_script_server/ — the orchestrator never executes agent code; from its point of view a registered script is just another RunScript-typed action.

In-tree agent decision steps (AgentConfirm, AgentInput, AgentDecide, AgentAnalyzeImage) flow the other direction: the BT publishes AgentPrompt on /skill_server/agent_prompts, an MCP client pulls it via get_pending_agent_prompts, and the response goes back through submit_agent_response. Sample tree at src/robot_behaviors/trees/test_agent_inspect.xml.

Live update

pixi install -e <env> uses symlink-install for Python and XML share files; edits are live on save.

| Change type | Action |
| --- | --- |
| Behavior tree XML | save the file; BtExecutor polls every 2 s, dashboard updates over the latched topic |
| Python skill / orchestrator code | save the file (symlinked). Restart the host node if it caches state at startup |
| interfaces/<msgs>/*.msg/.srv/.action | pixi install -e <env> rebuilds the path-dep'd .conda and consumers, then restart the node |
| lib/robot_skills_py/ base classes | pixi install -e <env> (pixi-build-ros rebuilds the path-dep'd .conda and consumers) + node restart |
| Frontend | vite build + pixi install -e director (or pixi run dashboard-dev for hot reload) |

Adding a skill

See docs/adding-skills.md for the three authoring patterns (A: an @action method on RobotArmActionServer — most arm-shaped atoms land here; B: a standalone SkillNode subclass for one-off single-action atoms; C: an InstrumentMultiActionNode subclass for multi-action instruments). Plus the Claude Code commands:

  • /new-skill-atom — generic primitive (lib/robot_arm_skills) or provider-specific atom
  • /new-compound-skill — vetted, persisted compound skill
  • /new-behavior-tree — XML tree
  • /debug-skill — diagnose a failing skill or tree
