Skip to content

Commit f760d87

Browse files
authored
Merge pull request #1400 from Open-Source-Legal/claude/resolve-issue-1335-cRUxc
typing: annotate analyzer/shared/agents/badges/worker_uploads + shared protocols (#1335)
2 parents 2c18acb + 6eac19b commit f760d87

37 files changed

Lines changed: 548 additions & 159 deletions

CHANGELOG.md

Lines changed: 15 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -9,6 +9,21 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
99

1010
### Added
1111

12+
- **Mypy: type analyzer, shared, agents, badges, worker_uploads; introduce shared protocols** (Issue #1335): Brought the five smaller, interface-rich target packages over the ≥70% return-annotation bar called for by the issue and seeded `opencontractserver/types/protocols.py` with the four protocols requested in the scope:
13+
- `VectorStoreProtocol` — minimum surface (`search` / `async_search`) implemented by `CoreAnnotationVectorStore` (`opencontractserver/llms/vector_stores/core_vector_stores.py`); imported and re-exported from that module so consumers can annotate against the protocol rather than the concrete dataclass.
14+
- `PipelineComponentProtocol``title` / `description` / `author` / `dependencies` surface that the pipeline registry duck-types against; imported from `opencontractserver/pipeline/base/base_component.py` so any future parser/embedder/thumbnailer registered outside the inheritance hierarchy still type-checks against the same contract.
15+
- `ToolProtocol``name` / `description` / `parameters` / `requires_approval` mirror of `CoreTool` (`opencontractserver/llms/tools/tool_factory.py`); framework adapters can accept any object satisfying it, no inheritance required.
16+
- `PermissionedQueryManagerProtocol``visible_to_user(user) -> QuerySet` contract that `BaseVisibilityManager`, `PermissionManager`, `DocumentManager`, `AnnotationManager`, and `NoteManager` all satisfy (`opencontractserver/shared/Managers.py`); imported there so callers receiving "a permissioned manager" can type against the protocol instead of a concrete class.
17+
- `StreamObserverProtocol` — duplicated `__call__(event)` shape from `opencontractserver/llms/types.StreamObserver` for callers in non-LLM modules (notifications, websockets) that need the contract without importing `llms.types`.
18+
- **Coverage delta** (return-annotation coverage measured by AST walk, excluding `__init__.py`):
19+
- `analyzer/`: 12.5% → **87.5%** (target ≥70%)
20+
- `shared/`: 38.1% → **96.3%** (target ≥70%)
21+
- `agents/`: 34.4% → **71.9%** (target ≥70%)
22+
- `badges/`: 47.1% → **100%** (target ≥70%)
23+
- `worker_uploads/`: 46.4% → **100%** (target ≥70%)
24+
- **Files touched** (annotations only, zero behavior changes): `analyzer/{apps,checks,signals,startup,utils,admin,admin_views}.py`, `analyzer/management/commands/sync_doc_analyzers.py`, `agents/{apps,admin,memory}.py`, `badges/{apps,signals,models}.py`, `worker_uploads/{apps,auth,serializers,views,models,tasks}.py`, `shared/{utils,defaults,fields,mixins,Managers,QuerySets,decorators}.py`, plus the four protocol-consumer wirings above.
25+
- **Bare-generic promotion**: every `dict` / `list` / `set` in newly-touched public signatures was widened to a parametrised form (`dict[str, Any]`, `list[Any]`, `dict[str, int]`, etc.). The `HasEmbeddingMixin` docstring example was tightened to match (`-> dict[str, Any]`).
26+
- **`# type: ignore` audit**: every bare `# type: ignore` comment in the codebase now carries a specific error code. `opencontractserver/llms/tools/core_tools.py:413,416,418,420` (channels/asgiref import + `partial` kwarg-aware wrapper) tightened to `[import-not-found]` / `[call-arg]` with a one-line comment explaining why; `opencontractserver/utils/embeddings.py:171` to `[attr-defined]`; `opencontractserver/tests/test_pipeline_utils.py:395,423,431,438,446` cleaned up the duplicated `# type: ignore; type: ignore` markers and scoped them to `[import-not-found]`; `opencontractserver/tests/test_core_tool_factory.py:34,35,47` widened from bare to `[attr-defined]`. The total count went from 62 → 61, and the ratio of bare-to-scoped dropped to zero.
1227
- **Return-type annotations across `config/graphql/` resolvers and mutations** (Issue #1332, follow-up to #1331): The largest, least-typed subtree in the backend (459 function definitions, ~4.4% return-annotation coverage at baseline) is now at 100% return-annotation coverage. Touched files include every `*_mutations.py`, every `*_queries.py`, every `*_types.py`, plus `filters.py`, `base.py`, `base_types.py`, `security.py`, `optimized_file_resolvers.py`, `permissioning/permission_annotator/{middleware,mixins,utils}.py`, and the small utility modules. No behavioral changes — annotations only.
1328
- **`mutate(...)` on `graphene.Mutation` subclasses**: typed as forward references to the enclosing class (`-> "ClassName"`). Discovered and fixed the latent bug in `config/graphql/analysis_mutations.py:179` (`DeleteAnalysisMutation.mutate`) where the success path had no `return` statement; annotation is `-> "DeleteAnalysisMutation | None"` and an explicit `return None` was added to preserve the original implicit-None behavior on success.
1429
- **`resolve_*` methods**: typed as `-> Any` by default, refined where the GraphQL field type makes the runtime return obvious (e.g. `resolve_in_use -> bool`, `resolve_datacell_count -> int`).

opencontractserver/agents/admin.py

Lines changed: 11 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,6 @@
11
from django.contrib import admin
2+
from django.db.models import QuerySet
3+
from django.http import HttpRequest
24
from django.utils.html import format_html
35
from guardian.admin import GuardedModelAdmin
46

@@ -146,7 +148,7 @@ class AgentActionResultAdmin(GuardedModelAdmin):
146148
),
147149
)
148150

149-
def status_badge(self, obj):
151+
def status_badge(self, obj: AgentActionResult) -> str:
150152
"""Display status as a colored badge."""
151153
colors = {
152154
"pending": "#6c757d", # gray
@@ -165,7 +167,7 @@ def status_badge(self, obj):
165167
status_badge.short_description = "Status"
166168
status_badge.admin_order_field = "status"
167169

168-
def corpus_action_link(self, obj):
170+
def corpus_action_link(self, obj: AgentActionResult) -> str:
169171
"""Link to the corpus action in admin."""
170172
if obj.corpus_action:
171173
name = (
@@ -182,7 +184,7 @@ def corpus_action_link(self, obj):
182184

183185
corpus_action_link.short_description = "Corpus Action"
184186

185-
def document_link(self, obj):
187+
def document_link(self, obj: AgentActionResult) -> str:
186188
"""Link to the document in admin."""
187189
if obj.document:
188190
title = (
@@ -199,7 +201,7 @@ def document_link(self, obj):
199201

200202
document_link.short_description = "Document"
201203

202-
def tools_count(self, obj):
204+
def tools_count(self, obj: AgentActionResult) -> str:
203205
"""Display count of tools executed."""
204206
if obj.tools_executed:
205207
count = len(obj.tools_executed)
@@ -212,7 +214,7 @@ def tools_count(self, obj):
212214

213215
tools_count.short_description = "Tools"
214216

215-
def duration_display(self, obj):
217+
def duration_display(self, obj: AgentActionResult) -> str:
216218
"""Display execution duration."""
217219
duration = obj.duration_seconds
218220
if duration is not None:
@@ -228,7 +230,7 @@ def duration_display(self, obj):
228230

229231
duration_display.short_description = "Duration"
230232

231-
def agent_response_display(self, obj):
233+
def agent_response_display(self, obj: AgentActionResult) -> str:
232234
"""Display agent response with formatting."""
233235
if obj.agent_response:
234236
# Truncate very long responses in admin
@@ -240,7 +242,7 @@ def agent_response_display(self, obj):
240242

241243
agent_response_display.short_description = "Agent Response"
242244

243-
def tools_executed_display(self, obj):
245+
def tools_executed_display(self, obj: AgentActionResult) -> str:
244246
"""Display tools executed as formatted JSON."""
245247
if obj.tools_executed:
246248
import json
@@ -254,7 +256,7 @@ def tools_executed_display(self, obj):
254256

255257
tools_executed_display.short_description = "Tools Executed"
256258

257-
def execution_metadata_display(self, obj):
259+
def execution_metadata_display(self, obj: AgentActionResult) -> str:
258260
"""Display execution metadata as formatted JSON."""
259261
if obj.execution_metadata:
260262
import json
@@ -265,7 +267,7 @@ def execution_metadata_display(self, obj):
265267

266268
execution_metadata_display.short_description = "Execution Metadata"
267269

268-
def get_queryset(self, request):
270+
def get_queryset(self, request: HttpRequest) -> QuerySet[AgentActionResult]:
269271
"""Optimize queryset with select_related."""
270272
return (
271273
super()

opencontractserver/agents/apps.py

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -3,6 +3,6 @@
33

44

55
class AgentsConfig(AppConfig):
6-
default_auto_field = "django.db.models.BigAutoField"
7-
name = "opencontractserver.agents"
6+
default_auto_field: str = "django.db.models.BigAutoField"
7+
name: str = "opencontractserver.agents"
88
verbose_name = _("Agents")

opencontractserver/agents/memory.py

Lines changed: 9 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -11,7 +11,7 @@
1111
import logging
1212
import re
1313
from datetime import datetime, timezone
14-
from typing import TYPE_CHECKING
14+
from typing import TYPE_CHECKING, Any
1515

1616
from channels.db import database_sync_to_async
1717
from django.core.files.base import ContentFile
@@ -74,7 +74,7 @@ def _build_empty_memory(corpus_id: int) -> str:
7474
# ---------------------------------------------------------------------------
7575

7676

77-
async def get_or_create_memory_document(corpus: Corpus, user) -> Document:
77+
async def get_or_create_memory_document(corpus: Corpus, user: Any) -> Document:
7878
"""Get the existing memory document or create a new empty one.
7979
8080
If the corpus already has a valid ``memory_document``, return it.
@@ -95,7 +95,7 @@ async def get_or_create_memory_document(corpus: Corpus, user) -> Document:
9595

9696
from opencontractserver.corpuses.models import Corpus as CorpusModel
9797

98-
def _get_or_create_sync():
98+
def _get_or_create_sync() -> Document:
9999
# Phase 1: Check under lock whether a memory document already exists.
100100
with transaction.atomic():
101101
locked = CorpusModel.objects.select_for_update().get(pk=corpus.pk)
@@ -164,7 +164,7 @@ async def read_memory_content(corpus: Corpus) -> str:
164164
if not corpus.memory_document_id:
165165
return ""
166166

167-
def _read():
167+
def _read() -> str:
168168
doc = corpus.memory_document
169169
if doc is None or not doc.txt_extract_file:
170170
return ""
@@ -187,7 +187,9 @@ def _read():
187187
return await database_sync_to_async(_read)()
188188

189189

190-
async def update_memory_content(corpus: Corpus, new_content: str, user) -> Document:
190+
async def update_memory_content(
191+
corpus: Corpus, new_content: str, user: Any
192+
) -> Document:
191193
"""Update the memory document with new content.
192194
193195
Creates the memory document if it doesn't exist. Overwrites the
@@ -215,7 +217,7 @@ async def update_memory_content(corpus: Corpus, new_content: str, user) -> Docum
215217

216218
doc = await get_or_create_memory_document(corpus, user)
217219

218-
def _update():
220+
def _update() -> Document:
219221
from django.db import transaction
220222

221223
with transaction.atomic():
@@ -431,7 +433,7 @@ def merge_curation_into_memory(
431433
current_content: str,
432434
collection_patterns: list[str],
433435
query_patterns: list[str],
434-
refinements: list[dict],
436+
refinements: list[dict[str, Any]],
435437
) -> str:
436438
"""Merge curation results into the existing memory document.
437439

opencontractserver/analyzer/admin.py

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
from django.contrib import admin
2-
from django.urls import path
2+
from django.urls import URLPattern, path
33
from guardian.admin import GuardedModelAdmin
44

55
from opencontractserver.analyzer.admin_views import AnalyzerSyncView
@@ -16,7 +16,7 @@ class AnalyzerAdmin(GuardedModelAdmin):
1616
list_display = ["id", "description", "task_name", "host_gremlin"]
1717
change_list_template = "admin/analyzer/analyzer_changelist.html"
1818

19-
def get_urls(self):
19+
def get_urls(self) -> list[URLPattern]:
2020
urls = super().get_urls()
2121
custom_urls = [
2222
path(

opencontractserver/analyzer/admin_views.py

Lines changed: 9 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,8 @@
1+
from typing import Any
2+
13
from django.contrib import messages
24
from django.contrib.admin.views.decorators import staff_member_required
5+
from django.http import HttpRequest, HttpResponse
36
from django.shortcuts import redirect, render
47
from django.urls import reverse
58
from django.utils.decorators import method_decorator
@@ -17,11 +20,11 @@
1720
class AnalyzerSyncView(View):
1821
"""Custom admin view for syncing doc analyzer tasks"""
1922

20-
template_name = "admin/analyzer/analyzer_sync.html"
23+
template_name: str = "admin/analyzer/analyzer_sync.html"
2124

22-
def get_available_analyzers(self):
25+
def get_available_analyzers(self) -> list[dict[str, Any]]:
2326
"""Get info about all available doc analyzer tasks"""
24-
analyzers = []
27+
analyzers: list[dict[str, Any]] = []
2528

2629
for task_name in celery_app.tasks.keys():
2730
analyzer_task = get_doc_analyzer_task_by_name(task_name)
@@ -48,8 +51,8 @@ def get_available_analyzers(self):
4851

4952
return sorted(analyzers, key=lambda x: (x["exists"], x["task_name"]))
5053

51-
def get(self, request):
52-
context = {
54+
def get(self, request: HttpRequest) -> HttpResponse:
55+
context: dict[str, Any] = {
5356
"title": "Sync Doc Analyzer Tasks",
5457
"analyzers": self.get_available_analyzers(),
5558
"opts": Analyzer._meta,
@@ -59,7 +62,7 @@ def get(self, request):
5962
}
6063
return render(request, self.template_name, context)
6164

62-
def post(self, request):
65+
def post(self, request: HttpRequest) -> HttpResponse:
6366
if not request.user.has_perm("analyzer.add_analyzer"):
6467
messages.error(request, "You don't have permission to create analyzers.")
6568
return redirect(reverse("admin:analyzer_sync"))

opencontractserver/analyzer/apps.py

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -3,10 +3,10 @@
33

44

55
class AnnotationsConfig(AppConfig):
6-
default_auto_field = "django.db.models.BigAutoField"
7-
name = "opencontractserver.analyzer"
6+
default_auto_field: str = "django.db.models.BigAutoField"
7+
name: str = "opencontractserver.analyzer"
88

9-
def ready(self):
9+
def ready(self) -> None:
1010
try:
1111
import opencontractserver.analyzer.signals # noqa F401
1212
from opencontractserver.analyzer.models import GremlinEngine

opencontractserver/analyzer/checks.py

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,13 +1,15 @@
1+
from typing import Any
2+
13
from django.core.checks import Warning, register
24

35

46
@register()
5-
def check_unsynced_analyzers(app_configs, **kwargs):
7+
def check_unsynced_analyzers(app_configs: Any, **kwargs: Any) -> list[Warning]:
68
"""
79
Check if there are doc_analyzer_task decorated functions
810
that haven't been synced to the database.
911
"""
10-
warnings = []
12+
warnings: list[Warning] = []
1113

1214
try:
1315
from opencontractserver.analyzer.models import Analyzer

opencontractserver/analyzer/management/commands/sync_doc_analyzers.py

Lines changed: 5 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,7 @@
1+
from typing import Any
2+
13
from django.contrib.auth import get_user_model
2-
from django.core.management.base import BaseCommand
4+
from django.core.management.base import BaseCommand, CommandParser
35

46
from opencontractserver.analyzer.models import Analyzer
57
from opencontractserver.analyzer.utils import auto_create_doc_analyzers
@@ -8,14 +10,14 @@
810
class Command(BaseCommand):
911
help = "Synchronize doc_analyzer_task decorated functions with Analyzer database"
1012

11-
def add_arguments(self, parser):
13+
def add_arguments(self, parser: CommandParser) -> None:
1214
parser.add_argument(
1315
"--dry-run",
1416
action="store_true",
1517
help="Show what would be created without making changes",
1618
)
1719

18-
def handle(self, *args, **options):
20+
def handle(self, *args: Any, **options: Any) -> None:
1921
UserModel = get_user_model()
2022

2123
if options["dry_run"]:

opencontractserver/analyzer/signals.py

Lines changed: 5 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,5 @@
11
import logging
2+
from typing import Any
23

34
from celery import chain
45
from django.db import transaction
@@ -11,8 +12,9 @@
1112
logger = logging.getLogger(__name__)
1213

1314

14-
def install_gremlin_on_creation(sender, instance, created, **kwargs):
15-
15+
def install_gremlin_on_creation(
16+
sender: Any, instance: Any, created: bool, **kwargs: Any
17+
) -> None:
1618
# When we create a Gremlin object in DB, need to run async task to set it up
1719
if created:
1820
transaction.on_commit(
@@ -27,7 +29,7 @@ def install_gremlin_on_creation(sender, instance, created, **kwargs):
2729
)
2830

2931

30-
def handle_analysis_completion(sender, instance, **kwargs):
32+
def handle_analysis_completion(sender: Any, instance: Any, **kwargs: Any) -> None:
3133
"""Handle analysis completion."""
3234
if hasattr(instance, "status") and instance.status == "COMPLETE":
3335
logger.info(f"Analysis {instance.id} completed")

0 commit comments

Comments
 (0)