Open-Source-Legal
diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 25 additions & 0 deletions
diff --git a/config/graphql/action_queries.py b/config/graphql/action_queries.py
Lines changed: 7 additions & 6 deletions
diff --git a/config/graphql/agent_mutations.py b/config/graphql/agent_mutations.py
Lines changed: 3 additions & 3 deletions
diff --git a/config/graphql/agent_types.py b/config/graphql/agent_types.py
Lines changed: 14 additions & 12 deletions
@@ -9,6 +9,16 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ### Added
 
+- **Return-type annotations across `config/graphql/` resolvers and mutations** (Issue #1332, follow-up to #1331): The largest, least-typed subtree in the backend (459 function definitions, ~4.4% return-annotation coverage at baseline) is now at 100% return-annotation coverage. Touched files include every `*_mutations.py`, every `*_queries.py`, every `*_types.py`, plus `filters.py`, `base.py`, `base_types.py`, `security.py`, `optimized_file_resolvers.py`, `permissioning/permission_annotator/{middleware,mixins,utils}.py`, and the small utility modules. No behavioral changes — annotations only.
+  - **`mutate(...)` on `graphene.Mutation` subclasses**: typed as forward references to the enclosing class (`-> "ClassName"`). Discovered and fixed a latent bug in `config/graphql/analysis_mutations.py:179` (`DeleteAnalysisMutation.mutate`) where the success path had no `return` statement; the annotation is `-> "DeleteAnalysisMutation | None"`, and an explicit `return None` was added to preserve the original implicit-None behavior on success.
+  - **`resolve_*` methods**: typed as `-> Any` by default, refined where the GraphQL field type makes the runtime return obvious (e.g. `resolve_in_use -> bool`, `resolve_datacell_count -> int`).
+  - **`AnnotatePermissionsForReadMixin`** (`config/graphql/permissioning/permission_annotator/mixins.py`): per the issue's specific guidance, `resolve_my_permissions -> list[str]`, `resolve_is_published -> bool`, `resolve_object_shared_with -> list[dict[str, Any]]`. The pre-existing wrong annotation `list[PermissionTypes]` (an Enum, while the implementation returns plain strings) was corrected to `list[str]`. The now-unused `PermissionTypes` import was removed.
+  - **Filter / queryset helpers** (`filter_by_*`, `text_search_method`, `get_node`, `get_queryset`, `_get_*`, etc.) typed as `-> Any` to keep the change conservative; tightening to `QuerySet[Model]` is a follow-up.
+  - **`config/graphql/permissioning/permission_annotator/utils.py`** had a broken import (`config.graphql.permission_annotator.middleware` instead of `config.graphql.permissioning.permission_annotator.middleware`) — fixed in passing.
+  - **`config/graphql/conversation_types.py`**: replaced `base64.binascii.Error` with a direct `binascii.Error` import (pre-existing — `base64` re-exports `binascii` at runtime but `mypy` doesn't see the re-export).
+  - **Var-annotated additions**: `id_to_children: dict[Any, list[Any]]` in `base_types.py`, `read_only_fields: list[str]` in `serializers.py`, `this_model_permission_id_map: dict[int, str]` etc. in middleware.
+  - **Five modules graduated from the mypy baseline** (`mypy.ini` → no longer `ignore_errors = True`): `config.graphql.base_types`, `config.graphql.conversation_types`, `config.graphql.permissioning.permission_annotator.middleware`, `config.graphql.permissioning.permission_annotator.utils`, `config.graphql.serializers`. Their entries in `docs/typing/mypy_baseline.txt` (11 lines) were also pruned. Future PRs can graduate the remaining baselined files as the structural issues they expose (custom `visible_to_user` manager method not seen by `django-stubs`, `set_permissions_for_obj_to_user` signature mismatch, mixin `_meta` access) are addressed.
+  - **Tooling**: zero new `# type: ignore` markers; black & isort applied; `flake8 config/graphql/` clean. `mypy --config-file mypy.ini opencontractserver config` passes with the updated baseline.
 - **Mypy: graduated `opencontractserver/users/tasks.py` out of the baseline** (Issue #1333 follow-up): `tasks.py` was the last `opencontractserver.users` module still suppressed in `mypy.ini`. PR #1370 left it untyped because the file is only loaded when `settings.USE_AUTH0=True`, so it never failed at runtime under the test settings; the typing gap kept the package short of the issue's "all four packages at ≥80% return-annotation coverage" Done-When criterion. Added return + parameter annotations to all five Auth0 sync tasks (`get_new_auth0_token`, `apply_data_to_user`, `sync_remote_user`, `ensure_valid_auth0_token`, `get_user_details_async`), introduced a module-level docstring documenting the `USE_AUTH0` gating, and removed the `[mypy-opencontractserver.users.tasks] ignore_errors = True` section. Local `data` rebound from request body (`dict[str, str]`) to response payload (`dict[str, Any]`) was split into two distinctly-named variables (`request_data` / `payload`) so the types are unambiguous; behavior is unchanged. No callers needed updating — `config/graphql_auth0_auth/utils.py` still consumes `sync_remote_user.delay(...)` exactly as before.
 
 ### Fixed
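
The `conversation_types.py` entry in the hunk above relies on `base64.binascii` being an ordinary module attribute rather than a declared re-export. The sketch below illustrates that relationship, assuming CPython's standard library. The helper `decode_cursor` is hypothetical and is not taken from the repository.

```python
import base64
import binascii
from typing import Optional


def decode_cursor(raw: str) -> Optional[bytes]:
    """Hypothetical helper: decode a base64 string, or return None if malformed."""
    try:
        return base64.b64decode(raw, validate=True)
    except binascii.Error:
        # Importing binascii directly gives type checkers a name they can resolve.
        return None


# At runtime base64 imports binascii itself, so both spellings reach one class.
# getattr() is used only so this line does not trip a type checker.
same_class = getattr(base64, "binascii").Error is binascii.Error
```

Because both names point at the same exception class at runtime, switching the `except` clause to the direct import should not change which errors are caught.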
@@ -23,6 +33,17 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
   - **Tests** — `opencontractserver/tests/test_extraction_grounding.py`:
     - `TestGroundingPipelinePDFIntegration` (new class): builds a synthetic two-page PAWLS payload (no real PDF binary needed), runs grounding through `build_translation_layer`, and verifies (a) annotations land on the correct page, (b) re-running grounding is idempotent, and (c) when PlasmaPDF returns `page=None` the annotation is **skipped** instead of being saved on page 1.
     - `test_ground_text_document_is_idempotent`: regression for the duplicate-annotation bug on the SPAN_LABEL path.
+
+- **`CreateCorpusActionModal` opened with the wrong default agent instructions for document triggers** (Issue #1385, `frontend/src/components/corpuses/CreateCorpusActionModal.tsx:136-144,168-171`): the `inlineAgentInstructions` state was initialised with `DEFAULT_MODERATOR_INSTRUCTIONS` even though the default trigger is `add_document` (a document trigger). The trigger-change handler at line 611 swaps to `DEFAULT_DOCUMENT_AGENT_INSTRUCTIONS`, but a user who created an inline agent on the default-selected trigger without first re-selecting the trigger would submit the moderator copy as the new agent's system instructions. Initialised both the `useState` default and `resetForm()` to `DEFAULT_DOCUMENT_AGENT_INSTRUCTIONS` so the pre-interaction value matches the default trigger. Updated `frontend/tests/CreateCorpusActionModal.ct.tsx` "inline-agent create: full happy path" mutation mock to expect `DEFAULT_DOCUMENT_AGENT_INSTRUCTIONS` — the previous mock variable masked this bug because `MockedProvider` was matching the stale moderator default rather than the trigger-appropriate one.
+
+### Changed
+
+- **Test/type cleanup follow-ups from the PR #1383 review** (Issue #1385):
+  - Pinned the `isProcessing` contract for SYNC_CONTENT in `frontend/tests/CorpusChat.ct.tsx` "SYNC_CONTENT renders a complete message immediately": added an `expect(input).toBeEnabled()` assertion after the reply renders, locking the documented invariant that `setIsProcessing(true)` is owned solely by `ASYNC_START` and that a SYNC_CONTENT-only reply must never disable the input.
+  - Consolidated the duplicated `::: oc-component` fence dispatcher: extracted `OcComponentBlock` interface and a new `buildOcComponentCustomBlocks(renderMarkdown)` helper into `frontend/src/utils/camlComponents.ts`. Both `frontend/src/hooks/useCamlComponentRenderer.tsx` and `frontend/src/components/corpuses/caml/CamlDirectiveRenderer.tsx` now share the same helper instead of each casting `block` independently.
+  - Replaced `route: any` and `page: any` escape hatches with the proper `Route` and `Page` types from `@playwright/test` in `frontend/tests/CorpusDescriptionEditor.ct.tsx` (`setupMdRoute` and the abort-route test).
+  - Migrated `.version-number` CSS-class locators in `frontend/tests/CorpusDescriptionEditor.ct.tsx` to a semantic `data-testid="version-number"` matcher (`page.getByTestId("version-number")`); added the test id to the rendered version-row in `frontend/src/components/corpuses/CorpusDescriptionEditor.tsx`.
+
 - **`test_superuser_sees_all_queryset` miscounts personal corpuses by 1** (Issue #1394, `opencontractserver/tests/test_visibility_managers.py`, `opencontractserver/tests/test_resolvers.py`): Two `VisibleToUserTests.test_superuser_sees_all_queryset` cases asserted that `Corpus.objects.visible_to_user(superuser).count() == 4` (public + private + 2 personal), but the actual count is 5 because the test DB starts with a pre-existing personal corpus owned by django-guardian's `AnonymousUser` (created during fixture setup before/around the username-based skip in `opencontractserver/users/signals.py::user_created_signal`). The assertion is now scoped to corpuses created by the test's two users (`creator__in=[self.user, self.superuser]`), making it resilient to any fixture-level corpuses that exist at test DB init time. Production code is unchanged.
 - **Merged `frontend` Codecov flag drops to ~33% on every commit where Frontend CI's CT job fails** (`frontend/package.json` `test:coverage:ct`): the script chained `playwright test ... && mkdir -p ... && nyc report ...`, so a failing CT run short-circuited before `nyc report` could turn the per-test JSON files in `.nyc_output` into an `lcov.info`. The downstream `Upload CT Coverage to Codecov` step (`if: success() || failure()`) then errored with "No coverage reports found" and `frontend-component` did not upload for that SHA. Codecov's server-side aggregation of the `frontend` flag was left with only `frontend-unit` (~23%) and `frontend-e2e` (~24%), pulling the merged number down to ~33% even though the previous commit was at ~67% — observed on six consecutive main commits 2026-04-26T01:02..02:58Z (`2d7033f8`..`be5bcfc8`) before recovering on `30298391`. Mirrored the existing `test:e2e:coverage` pattern (`; CT_EXIT=$?; nyc report ... || echo "No coverage data to report"; exit $CT_EXIT`) so `nyc report` runs regardless of test outcome and the lcov ships even on red CT runs. `frontend-component` will still report a slightly lower number when tests fail (failed tests register fewer hits), but it will report — keeping the merged `frontend` flag's denominator stable.
 - **`User.__init__` shared-state mutation re-introduced by branch merge** (`opencontractserver/users/models.py:172-180` removed): PR #1374 (commit `50ed6740`) deleted the `User.__init__` override that mutated `Field.validators[0]` on every instantiation, but a subsequent merge (`b68c1cb4 → 6d2cddbf`) resurrected the override along with its mypy-narrowing changes. The current main on commit `6d2cddbf` therefore reproduced the original `#1358` bug: `User(...)` rebound `username_field.validators[0]` and clobbered any third-party validator prepended to the list. Removed the `__init__` override entirely; the class-body declaration `validators=[UserUnicodeUsernameValidator()]` on the `username` field (still present from PR #1374) is the canonical and only declaration. Also dropped the now-unused `Field` import. Regression coverage from PR #1374 (`opencontractserver/tests/test_user_username_validator.py`) was already on main and is what surfaced the regression in CI.
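
The `User.__init__` entry above describes per-instance code overwriting slot 0 of a validator list that every instance shares. The sketch below reproduces that failure mode with plain Python stand-ins. `Field`, `BuggyUser`, `FixedUser`, and both validators are hypothetical and do not mirror the Django classes in the repository.

```python
from typing import Callable, List

Validator = Callable[[str], None]


def default_username_validator(value: str) -> None:
    """Stand-in for the project's own username validator."""
    if not value:
        raise ValueError("username must not be empty")


def third_party_validator(value: str) -> None:
    """Stand-in for a validator that another package prepends."""
    if value == "admin":
        raise ValueError("reserved username")


class Field:
    """Toy field: one Field object is shared by every instance of a class."""

    def __init__(self, validators: List[Validator]) -> None:
        self.validators = list(validators)


class BuggyUser:
    username_field = Field([default_username_validator])

    def __init__(self, username: str) -> None:
        # Rebinding slot 0 on the shared field overwrites whatever sits there.
        self.username_field.validators[0] = default_username_validator
        self.username = username


class FixedUser:
    # Declared once in the class body; nothing is mutated per instance.
    username_field = Field([default_username_validator])

    def __init__(self, username: str) -> None:
        self.username = username


def surviving_validators(user_cls: type) -> List[Validator]:
    """Prepend a third-party validator, instantiate once, and report the list."""
    user_cls.username_field.validators.insert(0, third_party_validator)
    user_cls("alice")
    return list(user_cls.username_field.validators)
```

With `BuggyUser`, a single instantiation replaces the prepended validator. With `FixedUser`, the prepended validator stays at the front of the list.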
@@ -37,6 +58,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
   - Regression coverage: `opencontractserver/tests/test_corpus_isolation_vector_store.py` — six tests covering cross-corpus leak, deletion-aware drop, orphan-set leak, document-scoped retrieval still returns structural rows, viewer-without-doc-permission excluded, creator still sees own row.
 - **Test-only**: `opencontractserver/tests/test_pydantic_ai_agents.py`, `opencontractserver/tests/test_structural_annotation_portability.py` — `Document.objects.create(...)` calls in `TransactionTestCase` setUp now pass `processing_started=timezone.now()` to short-circuit `process_doc_on_create_atomic`, which would otherwise eagerly chain a Celery PDF-ingest task that fails on the (file-less) test document and aborts the whole test class. Pre-existing failure, exposed cleanly when the regression suite was added.
 
+### Fixed
+
+- **`Embedding.embedder_path` could be NULL but was typed `str`** (Issue #1357, `opencontractserver/annotations/models.py:461-465`, `opencontractserver/annotations/models.py:584-585`, `opencontractserver/annotations/migrations/0068_enforce_embedder_path_not_null.py`): The Django field was declared `null=True, blank=True` while the Python annotation claimed `str`, causing a long-standing mypy `assignment` error and — more importantly — silently gutting the partial unique constraints added in migration 0059. Each `unique_embedding_per_{document,annotation,note,conversation,message}_embedder` constraint is conditioned on `<parent>__isnull=False` and keys on `(embedder_path, <parent>)`, so any row with `embedder_path IS NULL` bypassed duplicate prevention for its parent. Every production code path that creates an `Embedding` (`Embedding.objects.store_embedding()`, `HasEmbeddingMixin.add_embedding()`, `worker_uploads._store_embeddings()`) already supplies a concrete `embedder_path` or skips creation when empty, so enforcing non-null at the DB level matches actual behaviour rather than constraining it. New migration 0068 backfills any legacy NULL rows with `settings.DEFAULT_EMBEDDER` (deleting rows that would collide with an existing `(default_embedder_path, parent)` row under the partial unique constraint — they were previously unreachable via any query path since all call sites filter on a concrete embedder path), then `AlterField`s the column to `NOT NULL`. Removed the now-unreachable `or 'Unknown Model'` fallback in `Embedding.__str__`. Migration runs with `atomic = False` so the RunPython backfill commits before `AlterField` takes the `ACCESS EXCLUSIVE` lock to set `NOT NULL`, matching the pattern established by migration 0059.
+
 ### Added
 
 - **Coverage: raise Corpus Chat & Agent Management component tests** (Issue #1276): added 36 new Playwright CT tests across the four lowest-ROI corpus components to drive coverage toward the ≥60% target. Breakdown:
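
The `Embedding.embedder_path` entry above says that rows with a NULL `embedder_path` slipped past the partial unique constraints. The sketch below shows the same effect with SQLite as a stand-in database. The table and column names are simplified assumptions, not the project's schema. PostgreSQL's default treatment of NULLs in unique indexes is the same, which is why the bypass would occur there too.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory table with a partial unique index (simplified schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE embedding ("
        " id INTEGER PRIMARY KEY,"
        " embedder_path TEXT,"  # nullable, like the column before the migration
        " document_id INTEGER)"
    )
    conn.execute(
        "CREATE UNIQUE INDEX unique_embedding_per_document_embedder"
        " ON embedding (embedder_path, document_id)"
        " WHERE document_id IS NOT NULL"
    )
    return conn


def try_insert(conn: sqlite3.Connection, embedder_path, document_id) -> bool:
    """Return True if the row was accepted, False if the unique index rejected it."""
    try:
        conn.execute(
            "INSERT INTO embedding (embedder_path, document_id) VALUES (?, ?)",
            (embedder_path, document_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

A second `("model-a", 1)` row is rejected, while any number of `(None, 1)` rows is accepted, because NULL values never compare equal inside a unique index.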
 
@@ -3,6 +3,7 @@
 """
 
 import logging
+from typing import Any
 
 import graphene
 from graphene_django.fields import DjangoConnectionField
@@ -33,7 +34,7 @@ class ActionQueryMixin:
     )
 
     @login_required
-    def resolve_corpus_action_templates(self, info, **kwargs):
+    def resolve_corpus_action_templates(self, info, **kwargs) -> Any:
         """Return available corpus action templates.
 
         Templates are system-level and read-only — any authenticated user
@@ -58,7 +59,7 @@ def resolve_corpus_action_templates(self, info, **kwargs):
     )
 
     @login_required
-    def resolve_corpus_actions(self, info, **kwargs):
+    def resolve_corpus_actions(self, info, **kwargs) -> Any:
         """
         Resolver for corpus_actions that returns actions visible to the current user.
         Can be filtered by corpus_id, trigger type, and disabled status.
@@ -93,7 +94,7 @@ def resolve_corpus_actions(self, info, **kwargs):
     )
 
     @login_required
-    def resolve_agent_action_results(self, info, **kwargs):
+    def resolve_agent_action_results(self, info, **kwargs) -> Any:
         """
         Resolver for agent_action_results that returns results visible to the current user.
         Can be filtered by corpus_action_id, document_id, and status.
@@ -142,7 +143,7 @@ def resolve_agent_action_results(self, info, **kwargs):
     )
 
     @login_required
-    def resolve_corpus_action_executions(self, info, **kwargs):
+    def resolve_corpus_action_executions(self, info, **kwargs) -> Any:
         """
         Resolver for corpus_action_executions that returns executions visible to
         the current user.
@@ -220,7 +221,7 @@ def resolve_corpus_action_executions(self, info, **kwargs):
     )
 
     @login_required
-    def resolve_corpus_action_trail_stats(self, info, corpus_id, since=None):
+    def resolve_corpus_action_trail_stats(self, info, corpus_id, since=None) -> Any:
         """
         Resolver for corpus_action_trail_stats that returns aggregated statistics
         for corpus action executions.
@@ -291,7 +292,7 @@ def resolve_corpus_action_trail_stats(self, info, corpus_id, since=None):
         corpus_id=graphene.ID(required=False),
     )
 
-    def resolve_document_corpus_actions(self, info, document_id, corpus_id=None):
+    def resolve_document_corpus_actions(self, info, document_id, corpus_id=None) -> Any:
         """
         Resolve document actions (corpus actions, extracts, analysis rows) with proper
         permission filtering.
 
@@ -80,7 +80,7 @@ def mutate(
         avatar_url=None,
         corpus_id=None,
         is_public=True,
-    ):
+    ) -> "CreateAgentConfigurationMutation":
         user = info.context.user
 
         try:
@@ -204,7 +204,7 @@ def mutate(
         avatar_url=None,
         is_active=None,
         is_public=None,
-    ):
+    ) -> "UpdateAgentConfigurationMutation":
         user = info.context.user
 
         try:
@@ -282,7 +282,7 @@ class Arguments:
 
     @login_required
     @graphql_ratelimit(rate=RateLimits.WRITE_LIGHT)
-    def mutate(root, info, agent_id):
+    def mutate(root, info, agent_id) -> "DeleteAgentConfigurationMutation":
         user = info.context.user
 
         try:
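
The `mutate` annotations in the hunks above are written as strings because the enclosing class name is not yet bound while the class body executes. The sketch below shows how such a forward reference resolves later, and how an explicit `return None` pairs with a `"... | None"` annotation. Both classes are hypothetical stand-ins. `graphene` is not imported, and `@staticmethod` is an assumption made only to keep the example self-contained.

```python
import typing


class CreateThingMutation:
    """Stand-in for a graphene.Mutation subclass."""

    def __init__(self, ok: bool) -> None:
        self.ok = ok

    @staticmethod
    def mutate(root: typing.Any, info: typing.Any) -> "CreateThingMutation":
        # The string annotation is resolved lazily, after the class exists.
        return CreateThingMutation(ok=True)


class DeleteThingMutation:
    """Stand-in for a mutation whose success path returns nothing."""

    def __init__(self, ok: bool) -> None:
        self.ok = ok

    @staticmethod
    def mutate(
        root: typing.Any, info: typing.Any, fail: bool = False
    ) -> "DeleteThingMutation | None":
        if fail:
            return DeleteThingMutation(ok=False)
        # Explicit return keeps the former implicit-None behavior visible.
        return None


# Resolve the forward reference; localns is passed explicitly so the lookup
# does not depend on where this snippet is pasted.
hints = typing.get_type_hints(
    CreateThingMutation.mutate,
    localns={"CreateThingMutation": CreateThingMutation},
)
```

Here `hints["return"]` is the `CreateThingMutation` class itself, while the `DeleteThingMutation.mutate` annotation stays an unevaluated string until something asks for it.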
 
@@ -1,5 +1,7 @@
 """GraphQL type definitions for agent and action types."""
 
+from typing import Any
+
 import graphene
 from graphene import relay
 from graphene_django import DjangoObjectType
@@ -37,7 +39,7 @@ class Meta:
             "source_template__id": ["exact"],
         }
 
-    def resolve_pre_authorized_tools(self, info):
+    def resolve_pre_authorized_tools(self, info) -> Any:
         """Resolve pre_authorized_tools as a list of strings."""
         return self.pre_authorized_tools or []
 
@@ -61,15 +63,15 @@ class Meta:
             "creator__id": ["exact"],
         }
 
-    def resolve_tools_executed(self, info):
+    def resolve_tools_executed(self, info) -> Any:
         """Resolve tools_executed as a list of JSON objects."""
         return self.tools_executed or []
 
-    def resolve_execution_metadata(self, info):
+    def resolve_execution_metadata(self, info) -> Any:
         """Resolve execution_metadata as JSON dict."""
         return self.execution_metadata or {}
 
-    def resolve_duration_seconds(self, info):
+    def resolve_duration_seconds(self, info) -> Any:
         """Resolve duration from the model property."""
         return self.duration_seconds
 
@@ -100,19 +102,19 @@ class Meta:
             "creator__id": ["exact"],
         }
 
-    def resolve_duration_seconds(self, info):
+    def resolve_duration_seconds(self, info) -> Any:
         """Resolve duration from the model property."""
         return self.duration_seconds
 
-    def resolve_wait_time_seconds(self, info):
+    def resolve_wait_time_seconds(self, info) -> Any:
         """Resolve wait time from the model property."""
         return self.wait_time_seconds
 
-    def resolve_affected_objects(self, info):
+    def resolve_affected_objects(self, info) -> Any:
         """Resolve affected_objects as a list of JSON objects."""
         return self.affected_objects or []
 
-    def resolve_execution_metadata(self, info):
+    def resolve_execution_metadata(self, info) -> Any:
         """Resolve execution_metadata as JSON dict."""
         return self.execution_metadata or {}
 
@@ -180,17 +182,17 @@ class Meta:
             "corpus": ["exact"],
         }
 
-    def resolve_mention_format(self, info):
+    def resolve_mention_format(self, info) -> Any:
         """Return the @ mention format for this agent."""
         if self.slug:
             return f"@agent:{self.slug}"
         return None
 
-    def resolve_available_tools(self, info):
+    def resolve_available_tools(self, info) -> Any:
         """Resolve available_tools as a list of strings, ensuring proper array type."""
         return self.available_tools if self.available_tools else []
 
-    def resolve_permission_required_tools(self, info):
+    def resolve_permission_required_tools(self, info) -> Any:
         """Resolve permission_required_tools as a list of strings, ensuring proper array type."""
         return self.permission_required_tools if self.permission_required_tools else []
 
@@ -261,5 +263,5 @@ class Meta:
             "created",
         )
 
-    def resolve_pre_authorized_tools(self, info):
+    def resolve_pre_authorized_tools(self, info) -> Any:
         return self.pre_authorized_tools or []
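
The resolvers above are all annotated `-> Any`, which the changelog describes as the conservative default. For two of them, the body already pins down a narrower type. The sketch below shows what a later tightening might look like. `AgentRow` is a hypothetical stand-in for the Django model, and the narrower annotations are an assumption about a possible follow-up, not part of this PR.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AgentRow:
    """Hypothetical stand-in for the model behind the GraphQL type."""

    slug: str = ""
    available_tools: Optional[List[str]] = None


def resolve_mention_format(agent: AgentRow) -> Optional[str]:
    """Return the @ mention format, or None when the agent has no slug."""
    if agent.slug:
        return f"@agent:{agent.slug}"
    return None


def resolve_available_tools(agent: AgentRow) -> List[str]:
    """Always return a list, substituting an empty one for a missing value."""
    return agent.available_tools if agent.available_tools else []
```

Narrowing in this way would let a type checker flag callers that forget the `None` case of `resolve_mention_format`, at the cost of having to model the JSON-backed fields precisely.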