Re: prompt-injection threat model
Want to surface a concern that I think is materially under-weighted in the security-posture section. After tracing through the tool inventory, the dominant issue isn't actually
The chain. Three pieces:
The combination of (1) + (2) + (3) is a full local-data exfiltration primitive: any page the agent reads can persuade it to read sensitive files and POST them to an attacker-controlled URL. No
Pre-PR vs post-PR. The local-read capability has been there a while. What was missing was an arbitrary-destination outbound channel
Worst-case payload sketch. No special phrasing tricks beyond what already works:
If the LLM follows the chain, the user's provider API keys are exfiltrated in two tool calls and they see a clean summary of the article they asked about.
Defense-in-depth options, rough order of effort/disruption:
The skills loader and matcher are clean, and the schema work is solid. The concern is specifically the combination of new outbound capability + existing local-read capability + indirect injection vector from page content. That deserves a tighter guardrail than a prompt-level "do not exfiltrate credentials" directive before un-gated rollout.
Browser-Harness Integration — Detailed Change Description
Intent
Let Rowboat's in-app assistant benefit from the browser-use/browser-harness domain-skills library
(currently 69 markdown files covering GitHub, LinkedIn, Amazon, Booking, Etsy, Letterboxd, etc.)
without running Python or replacing our existing embedded-browser tool stack. Skills are fetched
on demand from GitHub, cached locally, and surfaced to the agent as site-specific reference
knowledge that it reads before acting on a page.
Two upstream capabilities that the skills rely on (js(...) for in-page eval, http_get(...) for
API calls) are added as first-class tool actions so the skills' recipes port over with minimal
translation.
Architecture / Flow
"document.querySelector('form[action$="/star"]').submit()" })
│
└─ browser-control({ action: "read-page" }) — verify the form flipped to /unstar
The loader fetches from GitHub only on first use (or every 24h); subsequent suggestedSkills
lookups and load-browser-skill calls are pure local reads.
File-by-file
payload.
cached skills.
shapes, handling circular refs ([circular]), functions/symbols ([function]), bigints (string),
and capping results at 200 KB. Arrays and plain objects are walked recursively; unknown types are
stringified.
() => { … })() so the agent can await freely, runs it via the existing private
executeOnActiveTab path (which itself calls webContents.executeJavaScript(..., /* userGesture */
true)), and returns { ok: true, result } | { ok: false, error }.
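executeJavaScript runs in the active tab's page context; this is not host code execution. It can
read the DOM, document.cookie, and localStorage, and make fetch calls under that page's origin
policy (same-origin requests carry credentials; cross-origin requests are still possible, subject
to CORS and the page's CSP).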
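The serialization rules listed above (marker strings for circular refs and functions/symbols, bigints as strings, recursive walk of arrays and plain objects, a 200 KB cap) could be sketched roughly as follows. This is an illustration, not the code in this PR: toJsonSafe, serializeResult, and MAX_CHARS are hypothetical names, the cap is approximated in UTF-16 code units rather than bytes, and truncation on overflow is an assumption since the description only says results are capped.

```typescript
// Hypothetical sketch; only the marker strings and the 200 KB figure come from the description.
const MAX_CHARS = 200 * 1024; // approximates the 200 KB cap in UTF-16 code units

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

function toJsonSafe(value: unknown, seen: WeakSet<object> = new WeakSet()): Json {
  if (value === null || value === undefined) return null;
  if (typeof value === "string" || typeof value === "boolean") return value;
  if (typeof value === "number") return Number.isFinite(value) ? value : String(value);
  if (typeof value === "bigint") return value.toString();
  if (typeof value === "function" || typeof value === "symbol") return "[function]";

  const obj = value as object;
  if (seen.has(obj)) return "[circular]";
  seen.add(obj);

  let out: Json;
  const proto = Object.getPrototypeOf(obj);
  if (Array.isArray(obj)) {
    out = obj.map((item) => toJsonSafe(item, seen));
  } else if (proto === Object.prototype || proto === null) {
    const rec: { [key: string]: Json } = {};
    for (const [key, inner] of Object.entries(obj)) rec[key] = toJsonSafe(inner, seen);
    out = rec;
  } else {
    out = String(obj); // unknown types (Date, Map, DOM nodes, ...) are stringified
  }

  // Remove on the way out so only true cycles are flagged, not repeated references.
  seen.delete(obj);
  return out;
}

function serializeResult(value: unknown): string {
  const text = JSON.stringify(toJsonSafe(value));
  // Assumption: over-cap output is truncated rather than rejected.
  return text.length > MAX_CHARS ? text.slice(0, MAX_CHARS) : text;
}
```

Tracking the current ancestor chain (rather than every object ever visited) means a value referenced twice without a cycle is serialized twice; the real implementation may choose differently.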
@x/core/dist/application/browser-skills/index.js, plus the SuggestedBrowserSkill type.
empty | error), runs the URL matcher against the index, returns up to 5 matches. Wrapped in
try/catch so a skills-system failure never breaks browser-control itself.
their normal work and spread { suggestedSkills } onto the success result when
non-empty.
browserViewManager.executeScript, returns { ...success, result }.
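The "never break browser-control" behavior described above (try/catch around the matcher, at most 5 matches, spread onto the result only when non-empty) can be illustrated with a small sketch. The helper names and the fields of SuggestedBrowserSkill shown here are assumptions for illustration; the sketch also assumes the index is already in memory so the matcher can be called synchronously.

```typescript
// Field names are assumed; the PR only names the SuggestedBrowserSkill type.
interface SuggestedBrowserSkill {
  id: string;
  title: string;
  site: string;
}

const MAX_SUGGESTIONS = 5;

// Hypothetical helper: a failure in the skills system degrades to "no suggestions".
function suggestSkillsSafely(
  url: string,
  match: (url: string) => SuggestedBrowserSkill[],
): SuggestedBrowserSkill[] {
  try {
    return match(url).slice(0, MAX_SUGGESTIONS);
  } catch {
    return [];
  }
}

// Hypothetical helper: only add the key when there is something to report.
function withSuggestedSkills<T extends object>(
  result: T,
  suggestedSkills: SuggestedBrowserSkill[],
) {
  return suggestedSkills.length > 0 ? { ...result, suggestedSkills } : result;
}
```

Omitting the key entirely on an empty match keeps existing consumers of the success result unaffected.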
A self-contained module. Nothing else in core depends on it; builtin-tools and
control-service consume it.
loader.ts — fetch + cache + read.
TTL 24h, fetch timeout 20s.
manifest.json # SkillsIndex: fetchedAt, treeSha,
entries[]
domain-skills//.md # verbatim markdown mirror
/git/trees/?recursive=1, filters for domain-skills/**/*.md blobs.
Unauthenticated — subject to GitHub's 60 req/hr anon limit, but we only make two
calls per refresh.
into the mirror, builds the manifest entry (parsing # H1 as the title), sorts by id,
persists manifest.json. Per-file failures are logged and skipped (partial cache is
better than no cache).
and kick off a background refresh. Subsequent calls reuse the in-flight promise
(inFlightRefresh module var).
returns { ok, content, entry } | { ok: false, error }.
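The stale-cache behavior described for the loader (serve what is cached, kick off a background refresh, reuse one in-flight promise) might look roughly like the sketch below. Only the 24h TTL, the SkillsIndex field names, and the inFlightRefresh variable name come from the description; refreshOnce, getIndex, the doRefresh callback, and the numeric fetchedAt timestamp are assumptions.

```typescript
const TTL_MS = 24 * 60 * 60 * 1000; // 24h TTL from the description

// Field types are assumed; the description lists only the field names.
interface SkillsIndex {
  fetchedAt: number; // assumed to be epoch milliseconds
  treeSha: string;
  entries: { id: string; title: string; site: string }[];
}

let inFlightRefresh: Promise<SkillsIndex> | null = null;

// Concurrent callers share one refresh instead of each hitting GitHub.
function refreshOnce(doRefresh: () => Promise<SkillsIndex>): Promise<SkillsIndex> {
  if (!inFlightRefresh) {
    inFlightRefresh = doRefresh().finally(() => {
      inFlightRefresh = null;
    });
  }
  return inFlightRefresh;
}

async function getIndex(
  cached: SkillsIndex | null,
  doRefresh: () => Promise<SkillsIndex>,
  now: number = Date.now(),
): Promise<SkillsIndex> {
  if (!cached) return refreshOnce(doRefresh); // first use: nothing to serve, so wait
  if (now - cached.fetchedAt > TTL_MS) {
    // Stale: answer from cache now, refresh in the background, swallow refresh errors.
    void refreshOnce(doRefresh).catch(() => undefined);
  }
  return cached;
}
```

Clearing the module variable in finally means a failed refresh does not block the next attempt.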
matcher.ts — URL → skills.
by site, and for each site checks whether any candidate matches the hostname exactly,
as a suffix (.), or as a substring. Collects all skill entries from
matching sites, caps at limit.
mail.google.com for a google folder and shop.etsy.com for etsy. If a folder ever
collides (e.g. news matching too broadly), we'd add an override map at that point;
not worth building upfront.
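As an illustration of the hostname heuristic described above (exact, suffix, or substring match against per-site candidates, capped at a limit), here is a minimal sketch. How the real matcher derives its candidates is not shown in this description, so candidatesFor is a guess (folder name plus folder name with .com), and SkillEntry, hostMatches, and matchSkills are hypothetical names.

```typescript
interface SkillEntry {
  id: string;
  title: string;
  site: string; // folder name, e.g. "google" or "etsy"
}

// Assumption: candidates are derived from the folder name alone.
function candidatesFor(site: string): string[] {
  const s = site.toLowerCase();
  return [s, `${s}.com`];
}

function hostMatches(hostname: string, candidate: string): boolean {
  return (
    hostname === candidate || // exact
    hostname.endsWith(`.${candidate}`) || // suffix
    hostname.includes(candidate) // substring (subsumes the two checks above)
  );
}

function matchSkills(url: string, entries: SkillEntry[], limit = 5): SkillEntry[] {
  let hostname: string;
  try {
    hostname = new URL(url).hostname.toLowerCase();
  } catch {
    return []; // not a parseable URL, so nothing to suggest
  }
  return entries
    .filter((entry) => candidatesFor(entry.site).some((c) => hostMatches(hostname, c)))
    .slice(0, limit);
}
```

The substring check is what lets mail.google.com reach a google folder, and it is also the source of the over-broad matching risk noted for short folder names such as news.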
index.ts — barrel export for the two above.
Two new tools registered.
http-fetch (inserted before browser-control):
(text | json), timeoutMs (1–60000, default 15000).
bodyPreview (first 2 KB) for debugging.
The tool is explicitly framed in its own description as "unauthenticated API calls"—
the guidance points at browser-control({ action: "eval" }) + in-page fetch() when
cookies are needed.
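A rough sketch of how the timeout bounds and body limits described for http-fetch could fit together is shown below. The numbers (1 to 60000 ms with a 15000 ms default, 500 KB body cap, 2 KB preview) come from this description; everything else is assumed, including the function names, the result shape beyond bodyPreview, clamping out-of-range timeouts instead of rejecting them, and measuring limits in characters rather than bytes.

```typescript
const DEFAULT_TIMEOUT_MS = 15000;
const MAX_BODY_CHARS = 500 * 1024; // approximates the 500 KB cap
const PREVIEW_CHARS = 2 * 1024; // approximates the 2 KB preview

// Assumption: out-of-range values are clamped; the real tool may reject them in its schema.
function clampTimeout(timeoutMs?: number): number {
  if (timeoutMs === undefined || Number.isNaN(timeoutMs)) return DEFAULT_TIMEOUT_MS;
  return Math.min(60000, Math.max(1, Math.floor(timeoutMs)));
}

// Hypothetical shape of an unauthenticated GET with a timeout and size limits.
async function httpFetchSketch(url: string, opts: { timeoutMs?: number } = {}) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), clampTimeout(opts.timeoutMs));
  try {
    const res = await fetch(url, { signal: controller.signal });
    const body = (await res.text()).slice(0, MAX_BODY_CHARS);
    return {
      ok: res.ok,
      status: res.status,
      body,
      bodyPreview: body.slice(0, PREVIEW_CHARS),
    };
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  } finally {
    clearTimeout(timer); // runs on success and on failure
  }
}
```

No cookies or other credentials are attached here, which matches the tool's framing as a channel for unauthenticated API calls.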
load-browser-skill (inserted before browser-control):
skills: [{id, title, site}], cacheAgeMs, refreshing }. Exposed so the agent can
discover skills proactively (e.g. "list all github skills" before navigating).
SHA. Useful as a manual escape hatch when the agent suspects a stale cache.
(modified)
The assistant-facing prompt for the browser-control skill. Four additions:
agent is told that suggestedSkills appears on navigate / new-tab / read-page responses and must
be inspected before acting. Re-check after a cross-domain navigation.
example, and a security note warning against exfiltrating cookies/localStorage to third-party
origins.
(when cookies are needed).
(github/repo-actions vs github/scraping), frames the skills as reference knowledge that must be
translated into our action vocabulary (the skills are Python-harness-shaped), mentions the
proactive list variant.
actions over eval when both work, but reach for eval when sites fight synthetic events or require
form-submit semantics. One bullet reinforcing "for read-only data, try http-fetch before DOM
scraping."
Security posture
The marginal capability jump from this change is smaller than it looks, but it's non-zero and
worth naming:
and issue same-origin fetch with credentials. It cannot execute host code; it's bounded by
Electron's webContents.executeJavaScript sandbox and the tab's origin policies. The existing
click/type on a logged-in tab already allows ordering, posting, and confirming on behalf of the
user; the new delta is silent exfiltration (scripted reads that don't appear in the browser pane
as visible interactions).
and 500 KB body cap. No localhost/private-IP blocklist is applied; this is something to add if
SSRF concerns arise.
Not executed directly. Worst case, a skill tricks the LLM into calling eval with
attacker-supplied JS. The LLM would see the JS, though, and the instruction-following bar for
"make the agent exfiltrate cookies" via a public-PR skill is high. Still: sandboxing by origin,
not authenticity, is the defense.
The in-skill prompt text contains an explicit "do not exfiltrate credentials" directive for the
assistant.
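For reference, the localhost/private-IP blocklist mentioned above as a possible addition could start from something like the sketch below. It is not part of this change; isPrivateIPv4 and isBlockedDestination are hypothetical names, and the check only inspects the literal hostname in the URL.

```typescript
// Hypothetical pre-flight check for an outbound fetch tool.
function isPrivateIPv4(host: string): boolean {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!m) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 0 || // "this network"
    a === 10 || // 10.0.0.0/8
    a === 127 || // loopback
    (a === 169 && b === 254) || // link-local
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168) // 192.168.0.0/16
  );
}

function isBlockedDestination(rawUrl: string): boolean {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return true; // unparseable input is refused
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return true;
  const host = url.hostname.toLowerCase();
  if (host === "localhost" || host.endsWith(".localhost")) return true;
  if (host === "[::1]") return true; // IPv6 loopback; URL.hostname keeps the brackets
  return isPrivateIPv4(host);
}
```

A literal-hostname check like this does not cover public DNS names that resolve to private addresses, redirects to internal hosts, or IPv6 ranges other than loopback, so it narrows the SSRF surface without closing it. It also does nothing about exfiltration to public attacker-controlled hosts, which would need a destination allowlist or a confirmation step instead.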
Known gaps / not included
snapshot tarball at build time and unpack on first use if the manifest is missing.
refreshing).
folder named x for twitter.com), the heuristic matcher will miss it. A
~/.rowboat/config/browser-skills-hosts.json with folder→hostnames overrides would fix this when
it becomes a problem.
refresh (branch + tree) plus N raw.githubusercontent.com pulls (not counted against the API
budget). In practice this is fine for 69 skills on a 24h TTL.
localhost. Low priority for a local Electron app but worth a blocklist if this ever runs in a
shared/hosted context.
our eval, http_get(...) → http-fetch, wait(n) → our wait. The prompt tells it to do this, but
some skills are going to be lossy on first try. A post-translation cache (agent rewrites the
skill into our action vocab once and stores it alongside the original) is a natural follow-up.
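The folder→hostnames override file proposed in the gaps above could be consumed along these lines. This is a sketch of one possible shape only: the file format, the HostOverrides type, the example entry, and matchesWithOverrides are all assumptions, since the override mechanism is not built yet.

```typescript
// Assumed file shape: { "<folder>": ["<hostname>", ...] }
type HostOverrides = Record<string, string[]>;

// Illustrative contents for the proposed browser-skills-hosts.json.
const exampleOverrides: HostOverrides = {
  x: ["twitter.com", "x.com"],
};

// Returns true/false when an override exists for the folder, or undefined to signal
// "no override, fall back to the heuristic matcher".
function matchesWithOverrides(
  hostname: string,
  folder: string,
  overrides: HostOverrides,
): boolean | undefined {
  const hosts = overrides[folder];
  if (!hosts) return undefined;
  const host = hostname.toLowerCase();
  return hosts.some((h) => host === h || host.endsWith(`.${h}`));
}
```

Using exact and suffix matching only for overridden folders avoids reintroducing the substring over-matching the override is meant to correct.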