Skip to content

Commit c3c35eb

Browse files
committed
Add E2E Playwright spec for full extract pipeline (PDF upload → ingest → run → CSV)
The spec at frontend/tests/e2e/extract-pdf-workflow.spec.ts drives login, corpus creation, two PDF uploads, ingest+embedding wait, extract creation with one column, run via real OpenAI, and CSV export. Gated on E2E_RUN_LLM_TESTS=true so it never runs in CI until the LLM call is mocked. New helpers in frontend/tests/e2e/helpers.ts: uploadPdfViaUI, waitForDocumentReady, createExtractViaUI, openExtractByName, addColumnViaUI, addDocumentsToExtractViaUI, runExtractAndWaitForFinish. Two new PDF fixtures (USC Title 1 + Eton agreement) committed under frontend/tests/fixtures/, picked because the existing doc_*_pdf_file.pdf fixtures are byte-identical clones of USC Title 1 and would not give us two distinct titles. Production-side touches are purely additive DOM attributes (data-testid='document-card', data-title, data-processing on /documents-view variants) added to frontend/src/views/Documents.tsx and frontend/src/components/documents/ModernDocumentItem.tsx so the wait-for-ready helper can poll the backendLock-driven UI signal without depending on hover-only action menus. Local-dev infrastructure: extended the celeryworker watchfiles ignore-paths in compose/local/django/celery/worker/start to cover media, staticfiles, docs, .claude, .playwright-mcp, playwright-report-e2e, test-results — without this, Playwright artifact writes and editor-driven file events were restarting the worker mid-task. local.yml's celeryworker command now points at the volume-mounted script so future edits take effect on container restart instead of needing an image rebuild. Also adds frontend/playwright.video.config.ts, a tiny config override that forces video recording for demo capture (default config keeps 'retain-on-failure' to avoid bloat).
1 parent 5559621 commit c3c35eb

11 files changed

Lines changed: 697 additions & 2 deletions

File tree

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -9,6 +9,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
99

1010
### Added
1111

12+
- **`frontend/tests/e2e/extract-pdf-workflow.spec.ts`** — full-stack Playwright E2E spec for the extract pipeline: login → create corpus → upload two PDFs (`frontend/tests/fixtures/{usc-title-1,eton-agreement}.pdf`) → wait for parse + embedding → create extract with one column → run with a real OpenAI call → CSV export → assert non-empty cells. Adds new helpers to `frontend/tests/e2e/helpers.ts` (`uploadPdfViaUI`, `waitForDocumentReady`, `createExtractViaUI`, `openExtractByName`, `addColumnViaUI`, `addDocumentsToExtractViaUI`, `runExtractAndWaitForFinish`). Gated on `E2E_RUN_LLM_TESTS=true`; skipped in CI until LLM responses can be mocked over the wire. Runs on the live `local.yml` stack; required tweaks to disable Auth0 (`.envs/.local/.django USE_AUTH0=false`) and to widen the celeryworker `watchfiles --ignore-paths` (in `compose/local/django/celery/worker/start` and the `local.yml` command pointer) so editor / Playwright artifact writes don't hot-reload the worker mid-task. Also adds `data-testid="document-card"` + `data-title` (+ `data-processing` on the `/documents`-view variants) to `frontend/src/views/Documents.tsx` and `data-testid="document-card"` + `data-title` to `frontend/src/components/documents/ModernDocumentItem.tsx`, so tests can poll for the `backendLock` UI signal without depending on hover-only action menus.
1213
- **Mypy: type analyzer, shared, agents, badges, worker_uploads; introduce shared protocols** (Issue #1335): Brought the five smaller, interface-rich target packages over the ≥70% return-annotation bar called for by the issue and seeded `opencontractserver/types/protocols.py` with the four protocols requested in the scope:
1314
- `VectorStoreProtocol` — minimum surface (`search` / `async_search`) implemented by `CoreAnnotationVectorStore` (`opencontractserver/llms/vector_stores/core_vector_stores.py`); imported and re-exported from that module so consumers can annotate against the protocol rather than the concrete dataclass.
1415
- `PipelineComponentProtocol``title` / `description` / `author` / `dependencies` surface that the pipeline registry duck-types against; imported from `opencontractserver/pipeline/base/base_component.py` so any future parser/embedder/thumbnailer registered outside the inheritance hierarchy still type-checks against the same contract.

compose/local/django/celery/worker/start

100644100755
Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -4,4 +4,4 @@ set -o errexit
44
set -o nounset
55

66

7-
watchfiles --target-type command "celery -A config.celery_app worker -l INFO --concurrency=1 -Q celery,worker_uploads"
7+
watchfiles --ignore-paths frontend,node_modules,.git,__pycache__,.mypy_cache,.pytest_cache,media,staticfiles,docs,.claude,.playwright-mcp,playwright-report-e2e,test-results --target-type command "celery -A config.celery_app worker -l INFO --concurrency=1 -Q celery,worker_uploads"
Lines changed: 16 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,16 @@
1+
import baseConfig from "./playwright.config";
2+
import { defineConfig } from "@playwright/test";
3+
4+
/**
5+
* Playwright config override that forces video recording for every test.
6+
* Use only for capturing demo footage (`yarn playwright test -c playwright.video.config.ts ...`);
7+
* the default config keeps `video: "retain-on-failure"` to avoid bloat.
8+
*/
9+
export default defineConfig({
10+
...baseConfig,
11+
use: {
12+
...baseConfig.use,
13+
video: "on",
14+
viewport: { width: 1280, height: 800 },
15+
},
16+
});

frontend/src/components/documents/ModernDocumentItem.tsx

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1286,6 +1286,8 @@ export const ModernDocumentItem: React.FC<ModernDocumentItemProps> = ({
12861286
<>
12871287
<CardContainer
12881288
ref={setNodeRef}
1289+
data-testid="document-card"
1290+
data-title={title || ""}
12891291
className={`${is_selected ? "is-selected" : ""} ${
12901292
isProcessing ? "backend-locked" : ""
12911293
} ${isFailed ? "failed" : ""} ${
@@ -1497,6 +1499,8 @@ export const ModernDocumentItem: React.FC<ModernDocumentItemProps> = ({
14971499
<>
14981500
<ListContainer
14991501
ref={setNodeRef}
1502+
data-testid="document-card"
1503+
data-title={title || ""}
15001504
className={`${is_selected ? "is-selected" : ""} ${
15011505
isProcessing ? "backend-locked" : ""
15021506
} ${isFailed ? "failed" : ""} ${isLongPressing ? "long-pressing" : ""}`}

frontend/src/views/Documents.tsx

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1380,6 +1380,9 @@ export const Documents = () => {
13801380
key={doc.id}
13811381
role="button"
13821382
tabIndex={0}
1383+
data-testid="document-card"
1384+
data-title={doc.title || ""}
1385+
data-processing={doc.backendLock ? "true" : "false"}
13831386
aria-label={`Open document ${doc.title || "Untitled"}`}
13841387
$selected={selected_document_ids.includes(doc.id)}
13851388
onClick={() => handleDocumentClick(doc)}
@@ -1502,6 +1505,9 @@ export const Documents = () => {
15021505
key={doc.id}
15031506
role="row"
15041507
tabIndex={0}
1508+
data-testid="document-card"
1509+
data-title={doc.title || ""}
1510+
data-processing={doc.backendLock ? "true" : "false"}
15051511
aria-label={`Open document ${doc.title || "Untitled"}`}
15061512
$selected={selected_document_ids.includes(doc.id)}
15071513
onClick={() => handleDocumentClick(doc)}
@@ -1569,6 +1575,9 @@ export const Documents = () => {
15691575
key={doc.id}
15701576
role="listitem"
15711577
tabIndex={0}
1578+
data-testid="document-card"
1579+
data-title={doc.title || ""}
1580+
data-processing={doc.backendLock ? "true" : "false"}
15721581
aria-label={`Open document ${doc.title || "Untitled"}`}
15731582
$selected={selected_document_ids.includes(doc.id)}
15741583
onClick={() => handleDocumentClick(doc)}
Lines changed: 202 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,202 @@
1+
/**
2+
* E2E integration test: PDF upload → ingest → extract → CSV export.
3+
*
4+
* Drives the full Vite + Django + Postgres + Celery + OpenAI stack:
5+
*
6+
* 1. Logs in via the password form.
7+
* 2. Creates a corpus.
8+
* 3. Uploads two distinct PDFs into the corpus.
9+
* 4. Polls until both documents finish parsing + embedding.
10+
* 5. Creates a new Extract on the corpus with one column ("Document
11+
* Title") prompting for each PDF's title.
12+
* 6. Runs the extract, polls until cells finish.
13+
* 7. Exports to CSV and asserts each row produced *some* non-empty
14+
* title-related content (body-text or metadata fallback).
15+
*
16+
* Gated on `E2E_RUN_LLM_TESTS=true` because step 6 makes a real OpenAI
17+
* call. CI does not set the gate, so this spec is skipped there until
18+
* we have a way to mock LLM responses over the wire.
19+
*
20+
* INTENTIONAL ASSERTION SCOPE: this spec validates the *pipeline*
21+
* (upload → parse → embed → extract → export), not LLM commit behavior.
22+
* The default extraction model tends to echo upload-time metadata when
23+
* the prompt is permissive, and to enter a `failure_mode=no_final_response`
24+
* tool-loop when the prompt is strict (verbatim from page 1). Both
25+
* behaviors are tracked separately in the follow-up issue at
26+
* `docs/superpowers/specs/2026-04-29-followup-issue-no-final-response.md`.
27+
* Until that lands, the assertions here accept either body-text or
28+
* description-fallback cell contents — empty cells still fail loudly.
29+
*/
30+
31+
import { test, expect } from "./fixtures";
32+
import {
33+
TEST_USER,
34+
loginViaUI,
35+
createCorpusViaUI,
36+
uploadPdfViaUI,
37+
waitForDocumentReady,
38+
createExtractViaUI,
39+
openExtractByName,
40+
addColumnViaUI,
41+
addDocumentsToExtractViaUI,
42+
runExtractAndWaitForFinish,
43+
} from "./helpers";
44+
import fs from "fs";
45+
import path from "path";
46+
47+
const FIXTURE_DIR = path.resolve(__dirname, "../fixtures");
48+
const FIXTURE_USC = path.join(FIXTURE_DIR, "usc-title-1.pdf");
49+
const FIXTURE_ETON = path.join(FIXTURE_DIR, "eton-agreement.pdf");
50+
51+
// Unique per-run names so back-to-back local runs don't collide on
52+
// existing rows (the test does not currently clean up).
53+
const RUN_ID = Date.now();
54+
const CORPUS_TITLE = `E2E Extract PDF Corpus ${RUN_ID}`;
55+
const CORPUS_DESCRIPTION = "Corpus created by extract-pdf-workflow E2E spec.";
56+
const DOC_USC_TITLE = `USC Title 1 ${RUN_ID}`;
57+
const DOC_ETON_TITLE = `Eton Agreement ${RUN_ID}`;
58+
const EXTRACT_NAME = `Extract Titles ${RUN_ID}`;
59+
const COLUMN_NAME = "Document Title";
60+
// Permissive query. A strict "read first page verbatim" wording reliably
61+
// triggers the `failure_mode=no_final_response` issue — the agent reads
62+
// every byte sequentially and never commits — see the follow-up issue
63+
// for the agent-behavior fix. For this E2E test we only want to exercise
64+
// the upload → ingest → extract → CSV-export *pipeline*, not validate
65+
// model commit behavior, so we keep the query simple and the assertions
66+
// tolerant of either body-text or metadata-fallback answers.
67+
const COLUMN_QUERY = "What is the title of this document?";
68+
69+
test.describe("Extract PDF workflow (LLM-gated)", () => {
70+
test.skip(
71+
process.env.E2E_RUN_LLM_TESTS !== "true",
72+
"Requires E2E_RUN_LLM_TESTS=true and a backend OPENAI_API_KEY. " +
73+
"Local-only until LLM responses are mocked in CI."
74+
);
75+
76+
test.setTimeout(20 * 60 * 1000);
77+
78+
test("uploads two PDFs, runs an extract, exports CSV", async ({ page }) => {
79+
await test.step("login", async () => {
80+
await loginViaUI(page, TEST_USER.username, TEST_USER.password);
81+
});
82+
83+
await test.step("create corpus", async () => {
84+
await createCorpusViaUI(page, CORPUS_TITLE, CORPUS_DESCRIPTION);
85+
});
86+
87+
await test.step("upload USC Title 1 PDF", async () => {
88+
await uploadPdfViaUI(
89+
page,
90+
FIXTURE_USC,
91+
DOC_USC_TITLE,
92+
"USC Title 1 fixture",
93+
CORPUS_TITLE
94+
);
95+
});
96+
97+
await test.step("upload Eton agreement PDF", async () => {
98+
await uploadPdfViaUI(
99+
page,
100+
FIXTURE_ETON,
101+
DOC_ETON_TITLE,
102+
"Eton agreement fixture",
103+
CORPUS_TITLE
104+
);
105+
});
106+
107+
await test.step("wait for USC Title 1 to finish ingest", async () => {
108+
await waitForDocumentReady(page, DOC_USC_TITLE);
109+
});
110+
111+
await test.step("wait for Eton agreement to finish ingest", async () => {
112+
await waitForDocumentReady(page, DOC_ETON_TITLE);
113+
});
114+
115+
await test.step("create extract on the corpus", async () => {
116+
await createExtractViaUI(page, EXTRACT_NAME, CORPUS_TITLE);
117+
});
118+
119+
await test.step("open extract detail", async () => {
120+
await openExtractByName(page, EXTRACT_NAME);
121+
});
122+
123+
await test.step("add 'Document Title' column", async () => {
124+
await addColumnViaUI(page, COLUMN_NAME, COLUMN_QUERY);
125+
});
126+
127+
await test.step("add both documents to the extract", async () => {
128+
await addDocumentsToExtractViaUI(page, [DOC_USC_TITLE, DOC_ETON_TITLE]);
129+
});
130+
131+
await test.step("run extract and wait for finish", async () => {
132+
await runExtractAndWaitForFinish(page);
133+
});
134+
135+
await test.step("each row's title cell is non-empty", async () => {
136+
// Both rows present in the grid.
137+
await expect(page.getByText(DOC_USC_TITLE).first()).toBeVisible();
138+
await expect(page.getByText(DOC_ETON_TITLE).first()).toBeVisible();
139+
140+
// We assert every row has a non-empty Document Title cell whose
141+
// content isn't just the row's own document title. This catches a
142+
// regression where extraction silently produces empty cells, but
143+
// does NOT validate that the LLM reads the PDF body — that's a
144+
// separate concern tracked in the follow-up issue (see
145+
// docs/superpowers/specs/2026-04-29-followup-issue-no-final-response.md).
146+
// AG-Grid uses role="cell" for data cells.
147+
for (const docTitle of [DOC_USC_TITLE, DOC_ETON_TITLE]) {
148+
const row = page.getByRole("row").filter({ hasText: docTitle });
149+
const cells = row.getByRole("cell");
150+
const cellCount = await cells.count();
151+
expect(cellCount).toBeGreaterThan(0);
152+
let nonEmptySeen = false;
153+
for (let i = 0; i < cellCount; i++) {
154+
const text = (await cells.nth(i).textContent())?.trim() ?? "";
155+
if (text.length > 0 && text !== docTitle) {
156+
nonEmptySeen = true;
157+
break;
158+
}
159+
}
160+
expect(
161+
nonEmptySeen,
162+
`Row "${docTitle}" has no non-empty extracted cell — extract may have failed`
163+
).toBe(true);
164+
}
165+
});
166+
167+
await test.step("export CSV and verify contents", async () => {
168+
const downloadPromise = page.waitForEvent("download");
169+
await page
170+
.getByRole("button", { name: /Export CSV/i })
171+
.first()
172+
.click();
173+
const download = await downloadPromise;
174+
175+
const csvPath = await download.path();
176+
expect(
177+
csvPath,
178+
"Playwright did not give us a download path"
179+
).not.toBeNull();
180+
const csv = fs.readFileSync(csvPath!, "utf-8");
181+
182+
// Header line.
183+
expect(csv).toMatch(/Document Title/);
184+
185+
// At least three non-empty lines: header + one row per document.
186+
const dataLines = csv.split("\n").filter((l) => l.trim().length > 0);
187+
expect(dataLines.length).toBeGreaterThanOrEqual(3);
188+
189+
// CSV must contain SOME title-related text for each document.
190+
// Match either the body-text title (best case) or the upload-time
191+
// description fallback. The pipeline-only assertion is intentional:
192+
// the agent's tendency to echo metadata over body text is tracked
193+
// separately (see the follow-up issue). If ALL rows are empty, the
194+
// pipeline failed, and these regexes won't match anything either,
195+
// so the test still catches that.
196+
// USC: body says "TITLE 1 — GENERAL PROVISIONS"; description fallback "USC Title 1 fixture"
197+
expect(csv).toMatch(/general provisions|usc\s*title|title\s*1/i);
198+
// Eton: body says "EXCLUSIVE LICENSE AND PRODUCT DEVELOPMENT AGREEMENT"; fallback "Eton agreement fixture"
199+
expect(csv).toMatch(/exclusive license|development agreement|eton/i);
200+
});
201+
});
202+
});

0 commit comments

Comments
 (0)