Drop data-title attr; rely on filter({ hasText }) instead
Codecov was flagging data-title={x ?? ""} as a partial-coverage branch
on every card-view variant (Documents.tsx had 3, ModernDocumentItem.tsx
had 1). The fallback was only there because the JSX value can be null
or undefined; both forms (|| and ??) trigger Codecov's branch tracker
even with no semantic difference at runtime.
Switching the helpers to the standard Playwright filter-by-text idiom
removes the need for the attribute entirely, which means we can drop
the JSX expression and its branch.
data-testid=document-card and data-processing={String(Boolean(...))}
remain; those are the load-bearing affordances, and neither has a
branch Codecov tracks.
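
A minimal sketch of the filter-by-text idiom described above, not the actual contents of `frontend/tests/e2e/helpers.ts`. The helper names `documentCard` and `isDocumentReady` are hypothetical, and the two small interfaces are stand-ins so the snippet compiles without Playwright installed; a real Playwright `Page` and `Locator` should satisfy them structurally. Treating `data-processing="false"` as "ready" is an assumption based on the `String(Boolean(...))` expression mentioned in the message.

```typescript
// Stand-in types (assumption): the subset of Playwright's Page/Locator used here.
interface LocatorLike {
  filter(options: { hasText?: string | RegExp }): LocatorLike;
  getAttribute(name: string): Promise<string | null>;
}

interface PageLike {
  locator(selector: string): LocatorLike;
}

// Hypothetical helper: select a card by test id, then narrow by visible title.
// No data-title attribute is needed, so the JSX fallback branch can go away.
function documentCard(page: PageLike, title: string): LocatorLike {
  return page
    .locator('[data-testid="document-card"]')
    .filter({ hasText: title });
}

// Hypothetical helper: reads data-processing, which the component renders as
// the string "true" or "false". Assumes "false" means processing has finished.
async function isDocumentReady(page: PageLike, title: string): Promise<boolean> {
  const value = await documentCard(page, title).getAttribute("data-processing");
  return value === "false";
}
```

With a string argument, Playwright's `hasText` matches case-insensitively on a substring, so a title that is a prefix of another title can match more than one card; passing a `RegExp` is one way to tighten the match. Per the CHANGELOG entry, `data-processing` is only present on the `/documents`-view variants.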
CHANGELOG.md (1 addition, 1 deletion)
@@ -11,7 +11,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
-**VCR.py wrapper for LLM calls in `doc_extract_query_task`** — `opencontractserver/utils/vcr_replay.py` exposes a `maybe_vcr_cassette()` context manager that, when `OC_LLM_VCR_MODE` and `OC_LLM_VCR_CASSETTE` are set on the celery worker, records or replays every HTTP call to LLM provider hosts (currently `api.openai.com` / `api.anthropic.com`). A custom request-body matcher strips volatile values (millisecond timestamps, Django document PKs, OpenAI tool-call IDs, UUIDs) so a cassette recorded against one DB replays cleanly against another. With the env vars unset the wrapper is a no-op — production behavior is unchanged. Pre-recorded cassette for the E2E extract spec lives at `opencontractserver/tests/fixtures/cassettes/e2e_extract_pdf_workflow/extract.yaml`. Replay was verified end-to-end against a deliberately-fake `OPENAI_API_KEY` to confirm no real network call is made. See `docs/development/e2e_vcr.md` for record / replay / debug instructions.
-**`.github/workflows/frontend-e2e-extract.yml`** — CI workflow (currently `workflow_dispatch` only) that runs the new E2E extract spec against the full `local.yml` stack with `OC_LLM_VCR_MODE=replay` and a fake `OPENAI_API_KEY`. Manual-only because LlamaParse is not yet covered by the cassette and would otherwise be called for real on each run; activating on every PR is a follow-up that needs to extend `_LLM_HOSTS` and re-record.
-
- **`frontend/tests/e2e/extract-pdf-workflow.spec.ts`** — full-stack Playwright E2E spec for the extract pipeline: login → create corpus → upload two PDFs (`frontend/tests/fixtures/{usc-title-1,eton-agreement}.pdf`) → wait for parse + embedding → create extract with one column → run with a real OpenAI call → CSV export → assert non-empty cells. Adds new helpers to `frontend/tests/e2e/helpers.ts` (`uploadPdfViaUI`, `waitForDocumentReady`, `createExtractViaUI`, `openExtractByName`, `addColumnViaUI`, `addDocumentsToExtractViaUI`, `runExtractAndWaitForFinish`). Gated on `E2E_RUN_LLM_TESTS=true`; skipped in CI until LLM responses can be mocked over the wire. Runs on the live `local.yml` stack; required tweaks to disable Auth0 (`.envs/.local/.django USE_AUTH0=false`) and to widen the celeryworker `watchfiles --ignore-paths` (in `compose/local/django/celery/worker/start` and the `local.yml` command pointer) so editor / Playwright artifact writes don't hot-reload the worker mid-task. Also adds `data-testid="document-card"` + `data-title` (+ `data-processing` on the `/documents`-view variants) to `frontend/src/views/Documents.tsx` and `data-testid="document-card"` + `data-title` to `frontend/src/components/documents/ModernDocumentItem.tsx`, so tests can poll for the `backendLock` UI signal without depending on hover-only action menus.
+
- **`frontend/tests/e2e/extract-pdf-workflow.spec.ts`** — full-stack Playwright E2E spec for the extract pipeline: login → create corpus → upload two PDFs (`frontend/tests/fixtures/{usc-title-1,eton-agreement}.pdf`) → wait for parse + embedding → create extract with one column → run with a real OpenAI call → CSV export → assert non-empty cells. Adds new helpers to `frontend/tests/e2e/helpers.ts` (`uploadPdfViaUI`, `waitForDocumentReady`, `createExtractViaUI`, `openExtractByName`, `addColumnViaUI`, `addDocumentsToExtractViaUI`, `runExtractAndWaitForFinish`). Gated on `E2E_RUN_LLM_TESTS=true`; skipped in CI until LLM responses can be mocked over the wire. Runs on the live `local.yml` stack; required tweaks to disable Auth0 (`.envs/.local/.django USE_AUTH0=false`) and to widen the celeryworker `watchfiles --ignore-paths` (in `compose/local/django/celery/worker/start` and the `local.yml` command pointer) so editor / Playwright artifact writes don't hot-reload the worker mid-task. Also adds `data-testid="document-card"` (+ `data-processing` on the `/documents`-view variants) to `frontend/src/views/Documents.tsx` and `data-testid="document-card"` to `frontend/src/components/documents/ModernDocumentItem.tsx`, so tests can poll for the `backendLock` UI signal without depending on hover-only action menus. Cards are matched by `[data-testid="document-card"]` filtered with the visible title text — the standard Playwright pattern.
-**Mypy: type analyzer, shared, agents, badges, worker_uploads; introduce shared protocols** (Issue #1335): Brought the five smaller, interface-rich target packages over the ≥70% return-annotation bar called for by the issue and seeded `opencontractserver/types/protocols.py` with the four protocols requested in the scope:
-`VectorStoreProtocol` — minimum surface (`search` / `async_search`) implemented by `CoreAnnotationVectorStore` (`opencontractserver/llms/vector_stores/core_vector_stores.py`); imported and re-exported from that module so consumers can annotate against the protocol rather than the concrete dataclass.
-`PipelineComponentProtocol` — `title` / `description` / `author` / `dependencies` surface that the pipeline registry duck-types against; imported from `opencontractserver/pipeline/base/base_component.py` so any future parser/embedder/thumbnailer registered outside the inheritance hierarchy still type-checks against the same contract.