Commit f5a0bdf
committed
Address review: corpus in lookup key, dedup annotations, jsonb note
- Move corpus from defaults into the get_or_create lookup so a document
shared across multiple corpora gets a distinct annotation row per
corpus; previously the second corpus's grounding silently reused the
first corpus's row, leaving datacell.sources pointing at an annotation
whose corpus mismatched the extract (breaking
MIN(document_permission, corpus_permission)). Applies to both PDF and
text/DOCX paths.
- Deduplicate the returned annotations list by primary key so
len(annotations) == datacell.sources.count() when the same span
resolves to a single get_or_create row from multiple alignment hits.
- Update span-annotation docstring: NullableJSONField → jsonb in
Postgres, so dict key order is moot for the get_or_create lookup —
we still construct the dict in stable order for forward compatibility.
- Add regression test verifying that grounding the same document under
two corpora produces disjoint annotation sets with correct corpus FKs.
1 parent 73e3198 commit f5a0bdf
2 files changed
Lines changed: 112 additions & 14 deletions
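The first two points of the commit message can be illustrated with a minimal in-memory sketch. It is not the project's code: `Annotation`, `AnnotationStore`, and `ground` are hypothetical names, and the store only imitates the `(object, created)` contract of Django's `get_or_create(defaults=None, **kwargs)`. It shows why a field passed through `defaults` does not take part in the match, and how de-duplicating by primary key collapses repeated hits on the same row.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Annotation:
    # Hypothetical stand-in for the real annotation model.
    pk: int
    document: str
    corpus: str
    span: tuple


class AnnotationStore:
    """In-memory imitation of a manager exposing get_or_create."""

    def __init__(self):
        self.rows = []
        self._ids = count(1)

    def get_or_create(self, defaults=None, **lookup):
        # Only the lookup kwargs are used for matching; defaults are
        # applied solely when a new row has to be created.
        for row in self.rows:
            if all(getattr(row, k) == v for k, v in lookup.items()):
                return row, False
        row = Annotation(pk=next(self._ids), **lookup, **(defaults or {}))
        self.rows.append(row)
        return row, True


def ground(store, document, corpus, spans, corpus_in_lookup):
    """Return one annotation per distinct row, in first-seen order."""
    hits = []
    for span in spans:
        if corpus_in_lookup:
            # Behaviour described by the commit: corpus is part of the key.
            row, _ = store.get_or_create(
                document=document, span=span, corpus=corpus
            )
        else:
            # Earlier behaviour: corpus is only a default, so an existing
            # row for another corpus still matches.
            row, _ = store.get_or_create(
                document=document, span=span, defaults={"corpus": corpus}
            )
        hits.append(row)
    # De-duplicate by primary key; dicts preserve insertion order.
    return list({a.pk: a for a in hits}.values())


# Usage: the same document and span grounded under two corpora.
old_store = AnnotationStore()
old_a = ground(old_store, "doc1", "corpusA", [(0, 5)], corpus_in_lookup=False)
old_b = ground(old_store, "doc1", "corpusB", [(0, 5)], corpus_in_lookup=False)

new_store = AnnotationStore()
new_a = ground(new_store, "doc1", "corpusA", [(0, 5)], corpus_in_lookup=True)
new_b = ground(new_store, "doc1", "corpusB", [(0, 5)], corpus_in_lookup=True)

# Two alignment hits on the same span resolve to a single row.
dedup = ground(new_store, "doc1", "corpusA", [(0, 5), (0, 5)], corpus_in_lookup=True)
```

With `corpus` only in `defaults`, `old_b` reuses the row created for `corpusA`; with `corpus` in the lookup, `new_b` gets its own row, and `dedup` contains a single annotation even though the span was hit twice.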
File 1 of 2 (file name and diff content not captured): 74 lines added as new lines 619-692.
File 2 of 2 (file name and diff content not captured): 38 lines added and 14 removed, in hunks around original lines 254, 302-316, and 343-359.