[ comparison between percentage of source maps recovered by tool and which programs other OSS might miss ]
Authenticated web bundle scraper, JavaScript unpacker, and deobfuscator.
SrcApe drives a real Chromium against a target, captures every JavaScript bundle the SPA loads (including async / lazy / code-split chunks), and recovers the original source tree from any source maps that ship with the deployment. Works authenticated via Playwright storage state, cookies, or custom headers — so it can reach the parts of an app that exist behind login, where the interesting code actually lives. Built on top of existing libraries (Playwright, source-map-js) plus custom code for the crawl + recovery orchestration.
Existing source-map tools assume you have a .map URL in hand and walk you through extracting that one. Real SPAs ship dozens of chunks across routes — many lazy-loaded only after the app initializes for an authenticated user. None of the existing OSS tools cross the gap from "I have a domain" to "I have the entire source tree."
The closest existing tool, denandz/sourcemapper, is excellent for one map at a time and added basic --header support in early 2024. SrcApe is what happens when you push the use case all the way: drive a real browser, log in, capture everything.
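To make the "domain to source tree" gap concrete: once a bundle has been captured, a crawler still has to work out where that bundle's map lives. The sketch below shows one common way to do that, by reading the trailing `sourceMappingURL` comment and resolving it against the bundle's own URL. `findSourceMapUrl` is a hypothetical name used for illustration, not part of SrcApe, and the sketch ignores the `SourceMap` response header that some servers send instead of a comment.

```javascript
// Hypothetical helper (not SrcApe's actual code): locate the source map
// a captured bundle points at.
function findSourceMapUrl(bundleUrl, bundleText) {
  // Matches `//# sourceMappingURL=...` and the legacy `//@` form.
  const re = /\/\/[#@]\s*sourceMappingURL=(\S+)/g;

  // If a bundle contains several such comments, keep the last one.
  let last = null;
  for (const match of bundleText.matchAll(re)) {
    last = match[1];
  }
  if (last === null) return null;

  // Inline maps are embedded as data: URIs and need no extra fetch.
  if (last.startsWith("data:")) {
    return { kind: "inline", url: last };
  }

  // Relative references are resolved against the bundle's own URL.
  return { kind: "external", url: new URL(last, bundleUrl).href };
}

const found = findSourceMapUrl(
  "https://app.example.com/static/js/main.abc123.js",
  "console.log(1);\n//# sourceMappingURL=main.abc123.js.map\n"
);
console.log(found);
```

Repeating this for every captured chunk, rather than for one hand-picked `.map` URL, is the part that single-map tools leave to the user.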
- Bug bounty recon. Recover the original source tree of a target's deployed JS so you can read what's actually there, not the minified blob.
- AppSec / threat modeling. Pull the source of your own SaaS to feed into other analysis tools (incl. StoXSS — its sibling, which consumes SrcApe output to look for stored-XSS-shaped patterns).
- Reverse engineering / learning. Read how production React/Vue/Svelte apps are actually structured.
- CI / build verification. Confirm your prod builds aren't accidentally shipping source maps to the public.
Early development. CLI shape and output schema may change.
git clone https://github.com/b2d3/SrcApe
cd SrcApe
npm install
node srcape.mjs https://your-target.com

Output lands in out/<host>/:
out/your-target.com/
  bundles/             Raw JS responses + their .map files
  sources/             Recovered original sources (reconstructed tree)
  recovery-report.md   Human summary
  recovery.json        Machine-readable summary
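How a map in bundles/ could turn into files under sources/ can be sketched as follows. This is a minimal illustration under stated assumptions, not SrcApe's implementation: it handles only flat Source Map v3 files that embed `sourcesContent`, it ignores `sourceRoot` and index maps with `sections`, and both `toSafeRelativePath` and `recoverSources` are hypothetical names. The path cleanup matters because `sources` entries are attacker-controlled strings such as `webpack:///./src/App.tsx` or `../../x`.

```javascript
// Hypothetical helpers (illustration only, not SrcApe's API).

// Turn a `sources` entry into a relative path that cannot escape the
// output directory.
function toSafeRelativePath(source) {
  // Drop scheme prefixes such as "webpack://" or "webpack:///".
  const withoutScheme = source.replace(/^[a-zA-Z][a-zA-Z0-9+.-]*:\/\/+/, "");
  const parts = [];
  for (const segment of withoutScheme.split(/[\\/]+/)) {
    // Skip empty, current-directory, and parent-directory segments.
    if (segment === "" || segment === "." || segment === "..") continue;
    parts.push(segment);
  }
  return parts.join("/") || "unknown";
}

// Pair each `sources` entry with its embedded content, if any.
function recoverSources(map) {
  const contents = map.sourcesContent || [];
  const recovered = [];
  map.sources.forEach((source, i) => {
    const content = contents[i];
    // Some maps list a file without embedding its text; skip those.
    if (typeof content !== "string") return;
    recovered.push({ path: toSafeRelativePath(source), content });
  });
  return recovered;
}

const exampleMap = {
  version: 3,
  sources: ["webpack:///./src/App.tsx", "../../etc/passwd", "webpack://my-app/./src/index.js"],
  sourcesContent: ["export const App = 1;", "x", null],
  mappings: "",
};
console.log(recoverSources(exampleMap).map((f) => f.path));
```

Dropping `..` segments can make two different sources collide on one path; a real tool would need a policy for that, which this sketch leaves out.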
The marketing pages an unauthenticated crawl sees load a small subset of an SPA's codebase. The interesting code — profile editors, dashboards, messaging, admin panels — lives behind login on app.<target>.com-style deploys. SrcApe takes auth state any of four ways. Pick whichever matches how you already have a logged-in session:
If you're already signed in to the target in Chrome (or any browser):
- Open DevTools on a logged-in page → Network tab → click any authenticated request
- Copy the value of the Cookie: request header (or right-click → Copy → Copy value)
- Paste into a file, then:
node srcape.mjs https://app.example.com --cookies-file cookies.txt

Or inline if it's short:
node srcape.mjs https://app.example.com --cookies "session=abc123; user=42"

--cookies and --cookies-file auto-detect the format. Three accepted shapes:
- Raw Cookie: header line (with or without the Cookie: prefix)
- Netscape cookies.txt (the format exported by "Get cookies.txt LOCALLY" or similar extensions)
- JSON (a Playwright cookie array, or a full storage-state file's {cookies: [...]})
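The auto-detection across those three shapes could look roughly like the sketch below. It is an assumption about how such detection might work, not SrcApe's actual parser: `parseCookieInput` and `netscapeFields` are hypothetical names, and the `domain` argument is needed because a raw Cookie: header carries no domain of its own. The output objects use the field names Playwright expects for cookies (`name`, `value`, `domain`, `path`, and so on).

```javascript
// Hypothetical sketch of cookie-format detection (not SrcApe's code).

// Parse one Netscape cookies.txt line, or return null if it is not one.
function netscapeFields(line) {
  const prefix = "#HttpOnly_"; // curl-style marker for HttpOnly cookies
  const httpOnly = line.startsWith(prefix);
  if (line.startsWith("#") && !httpOnly) return null; // ordinary comment
  const fields = (httpOnly ? line.slice(prefix.length) : line).split("\t");
  return fields.length === 7 ? { httpOnly, fields } : null;
}

function parseCookieInput(text, domain) {
  const trimmed = text.trim();

  // Shape 3: JSON, either a cookie array or a storage-state object.
  if (trimmed.startsWith("[") || trimmed.startsWith("{")) {
    const parsed = JSON.parse(trimmed);
    return Array.isArray(parsed) ? parsed : parsed.cookies || [];
  }

  // Shape 2: Netscape cookies.txt, seven tab-separated fields per line.
  const rows = trimmed.split(/\r?\n/).map(netscapeFields).filter(Boolean);
  if (rows.length > 0) {
    return rows.map(({ httpOnly, fields }) => {
      const [cookieDomain, , path, secure, expires, name, value] = fields;
      return {
        name,
        value,
        domain: cookieDomain,
        path,
        secure: secure === "TRUE",
        httpOnly,
        // Netscape uses 0 for session cookies; Playwright uses -1.
        expires: Number(expires) || -1,
      };
    });
  }

  // Shape 1: raw header line, with or without the "Cookie:" prefix.
  return trimmed
    .replace(/^cookie:\s*/i, "")
    .split(";")
    .map((pair) => pair.trim())
    .filter(Boolean)
    .map((pair) => {
      const eq = pair.indexOf("="); // values may themselves contain "="
      const name = eq === -1 ? pair : pair.slice(0, eq);
      const value = eq === -1 ? "" : pair.slice(eq + 1);
      return { name, value, domain, path: "/" };
    });
}

console.log(parseCookieInput("Cookie: session=abc123; user=42", "app.example.com"));
```

Checking JSON first and the raw header last keeps the loosest format as the fallback, so a malformed file is less likely to be misread as a cookie header.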
For SaaS APIs that authenticate via Authorization: Bearer … rather than cookies:
node srcape.mjs https://api.example.com \
  --header "Authorization=Bearer eyJ..." \
  --header "X-API-Key=..."

Headers are repeatable and apply to every request the browser makes: page load, bundle fetches, and .map fetches.
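One detail of the Name=Value form is worth spelling out: header values such as base64-padded tokens can contain `=` themselves, so the split has to happen at the first `=` only. The sketch below shows that under the assumption that repeated flags arrive as an array of strings; `parseHeaderFlags` is a hypothetical name, not SrcApe's API.

```javascript
// Hypothetical sketch: turn repeated --header "Name=Value" flags into a
// single headers object. Not SrcApe's actual code.
function parseHeaderFlags(values) {
  const headers = {};
  for (const raw of values) {
    // Split at the FIRST "=" only, so values may contain "=" themselves.
    const eq = raw.indexOf("=");
    if (eq <= 0) {
      throw new Error(`Expected Name=Value, got: ${raw}`);
    }
    const name = raw.slice(0, eq).trim();
    const value = raw.slice(eq + 1).trim();
    headers[name] = value; // a repeated name overwrites the earlier value
  }
  return headers;
}

console.log(parseHeaderFlags(["Authorization=Bearer abc.def==", "X-API-Key=k1"]));
```

An object of this shape is what Playwright's `extraHTTPHeaders` context option accepts, which would be one way to apply the headers to every request; whether SrcApe wires it up exactly this way is an assumption.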
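For modern SPAs that put auth tokens in localStorage / sessionStorage rather than cookies, pass a full storage-state JSON, which carries cookies and per-origin localStorage in one file (Playwright's storage-state export does not persist sessionStorage, so tokens held only there may not survive replay):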
node srcape.mjs https://app.example.com --storage-state auth.json

To export storage state from a real session, use the Playwright CLI:
npx playwright codegen --save-storage auth.json https://app.example.com
# log in manually in the window that opens, then close the browser

Beyond --storage-state, getting the full surface usually means seeding multiple URLs. Three modes:
# Multiple positional args (same host crawls in one browser session)
node srcape.mjs https://app.example.com/dashboard \
https://app.example.com/profile \
https://app.example.com/messages
# Auto-discover via /sitemap.xml + /robots.txt
node srcape.mjs example.com --sitemap --sitemap-limit 30
# BFS link-follow from the seed (same-site, capped depth + page count)
node srcape.mjs https://app.example.com --bfs 2 --bfs-limit 30
# BFS through an authenticated SPA — the dashboard, every route it links to,
# and every route those link to. This is how you get to the lazy-loaded
# admin / messaging / billing chunks that the seed page alone never touches.
node srcape.mjs https://app.example.com/dashboard \
--storage-state auth.json --bfs 2 --bfs-limit 40
# Read from a file
node srcape.mjs example.com --urls hunting-targets.txt

The --urls file also pairs well with recon tools. Generate a URL list elsewhere, feed it in:
# Pair with katana (ProjectDiscovery's modern crawler)
katana -u https://app.example.com -d 3 -silent > urls.txt
node srcape.mjs app.example.com --urls urls.txt --storage-state auth.json
# Pair with waybackurls for historical pages
waybackurls example.com | grep -v static | sort -u > urls.txt
node srcape.mjs example.com --urls urls.txt
# Pair with gau, hakrawler, gospider, etc. (any tool that outputs URLs).

SrcApe also exposes a programmatic API for tools that want to build on the recovery pipeline:
import { recover, renderRecoveryReport } from "srcape";

const result = await recover({
  urls: ["https://app.example.com/dashboard"],
  outDir: "./out/app.example.com",
  auth: {
    storageStateFile: "./auth.json",
    headers: { Authorization: "Bearer …" },
  },
  useSitemap: true,
  log: console.log,
});
console.log(`Wrote ${result.stats.sourcesWritten} source files`);
const md = renderRecoveryReport("app.example.com", result);

Sibling exports:
- import { crawl } from 'srcape/crawl' (just the Playwright crawl)
- import { expandSourceMap, loadConsumer } from 'srcape/sourcemaps' (just the source-map work)
- import { discoverSitemap } from 'srcape/sitemap' (URL discovery via sitemap.xml)
Use SrcApe only against targets you are authorized to test:
- Your own applications and infrastructure
- Bug-bounty programs whose scope explicitly includes the target
- CTFs and lab environments
Only use auth state from a session you own. Never use cookies or storage state obtained from anywhere else. Most bug-bounty programs explicitly allow authenticated scanning against your own test session; using someone else's session is a different and much more serious thing.
SrcApe stands on a lot of work by other people. It's meant to complement existing tools, not replace them.
- denandz/sourcemapper: the canonical single-map extractor. Go binary, BSD-3-licensed, 1.3k+ stars. Use it when you have one specific .map URL. Use SrcApe when you want to recover everything an authenticated SPA loads.
- jsmap: alternative single-map extractor in Go.
- unwebpack-sourcemap — Python tool with similar single-map scope.
- Burp Source Mapper — Burp Pro extension that injects sourceMappingURL pragmas so DevTools loads originals. Requires Burp Pro + manual browsing.
- BitMapper — browser-side equivalent of Burp Source Mapper.
- Playwright — the underlying browser automation. SrcApe is basically a focused frontend on top of Playwright's network-interception + storage-state APIs.
- source-map-js — the pure-JS source-map parser SrcApe uses for VLQ decoding and position mapping.
Issues and PRs welcome once the surface stabilizes. The most useful contribution right now is trying it on a target you have permission to test and opening an issue with edge cases — odd source-map formats, auth flows that don't survive storage-state replay, SPAs whose chunks load only after specific user interactions.
MIT. See LICENSE.