Restrict hard deletion of versioned concepts (solves https://github.com/OpenConceptLab/ocl_issues/issues/2570)#879
Conversation
* Add CONCEPT_HARD_DELETE_REQUIRES_HEAD_ONLY to enforce deletion policy * Implement belongs_to_non_head_source_version to detect snapshotted concepts * Prevent non-admin hard deletes on non-HEAD concepts in view logic * Add integration tests to validate deletion protection and access control
|
This solves OpenConceptLab/ocl_issues#2570 @snyaggarwal @jamlung-ri |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f5bc09a683
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…eding * Queue seed tasks with transaction.on_commit to ensure consistency * Detect pending source version seeds to block unsafe deletions * Update view logic to prevent hard deletes when seeds are pending * Add tests to validate deletion blocking and normal behavior
| parent_id=self.parent_id, | ||
| versioned_object_id=versioned_object_id, | ||
| ).values_list('id', flat=True) | ||
| return Concept.sources.through.objects.filter( |
There was a problem hiding this comment.
this can cover both Concept and ConceptVersion by:
self.sources.exclude(version=HEAD).exists()
There was a problem hiding this comment.
The versioned-object row resolved here (id == versioned_object_id) is never is_latest_version, and Source.seed_concepts only snapshots the is_latest_version=True rows — so self.sources.exclude(version=HEAD) on the versioned object is always empty even when a version of the concept was released. We need to aggregate across all versions sharing versioned_object_id. Kept that, but simplified it to a single through-table query. 👍
There was a problem hiding this comment.
Concept.sources (or concept_sources) many to many relation ideally should only have source_id and concept_id where concept_id is the concept version and not concept (versioned_object) itself.
Thats why self.sources is best and it covers both Concept and interim versions as well
There was a problem hiding this comment.
Applied the prefetch_related + self.sources approach.
The hard-delete operation still works at the concept entity/root level: deleting the versioned-object row lets CASCADE remove all related concept versions. Because of that, the guard still needs to answer the entity-level question:
“Has any version of this concept entity been snapshotted into a released source version?”
The updated implementation keeps that behavior, but uses self.sources on each version after prefetching all sources in batch:
flowchart TD
Root["Concept root / versioned-object row<br/>id = versioned_object_id"] -->|"hard delete target"| Delete["Delete root row"]
Delete -->|"CASCADE"| V1["Concept version v1"]
Delete -->|"CASCADE"| V2["Concept version v2"]
Delete -->|"CASCADE"| V3["Concept version v3"]
SV1["Released Source Version SV1<br/>version != HEAD"] -->|"M2M snapshot"| V1
HEAD["HEAD Source"] -. "working state" .-> Root
Guard["Guard before hard delete"] --> Q1["Query 1:<br/>fetch all Concept versions<br/>with same versioned_object_id"]
Q1 --> Q2["Query 2:<br/>prefetch_related('sources')<br/>loads sources for all versions in batch"]
Q2 --> Py["Python check:<br/>for each version,<br/>check version.sources from prefetch cache"]
Py --> Check{"Any source.version != HEAD?"}
Check -->|"yes"| Block["Block hard delete"]
Check -->|"no"| Allow["Allow root delete"]
Allow --> Delete
style Root stroke-width:3px
style Guard stroke-width:3px
style Block stroke-width:3px
style Delete stroke-width:3px
So instead of checking only self.sources on the versioned-object row, the method now:
- fetches all concept versions sharing the same versioned_object_id;
- prefetches their sources in one batch;
- checks source.version != HEAD in Python using the prefetched cache.
That keeps the same deletion rule: hard delete is allowed only if none of the concept versions are linked to a non-HEAD source version.
The permission behavior is unchanged. The existing guards in _db_hard_delete and _hard_delete still apply only for non-admin users, and check_object_permissions is untouched. The change is limited to the internal implementation of belongs_to_non_head_source_version().
One side note: the name sources is still a bit confusing here because Source is also a domain entity in the project. In this context, concept.sources behaves more like “source versions/snapshots that include this concept version” rather than the owning Source itself. I think our knowledge base would benefit from a short ubiquitous-language glossary for terms like Source, SourceVersion, HEAD, versioned object, concept version, and snapshot.
* Allow non-admin editors to trigger asynchronous hard deletes while keeping db bypass restricted to admins * Enforce safety checks to block deletion of concepts in snapshotted source versions * Optimize queries for version checks and pending seed detection * Add tests for editor permissions, conflict responses, and access control
… deletes * Add CanAdministerParentDictionary permission for sensitive delete operations * Update view to use dictionary-level permissions instead of admin-only checks * Preserve HEAD-only safety validation for snapshotted concepts * Add tests verifying dictionary member delete permissions
* Bypass duplicate version checks when updating retired flag on locales * Restrict bypass to requests explicitly modifying retired field
* Refactor belongs_to_non_head_source_version to iterate through related concept versions and sources * Improve accuracy by leveraging prefetching and direct object inspection
| def latest_source_version(self): | ||
| return self.sources.exclude(version=HEAD).order_by('-created_at').first() | ||
|
|
||
| def belongs_to_non_head_source_version(self): |
There was a problem hiding this comment.
Can we consider Mapping in this PR as well? If yes, then:
this should move to SourceChildMixin so that Mapping can benefit for that as well.
same goes for has_pending_source_version_seed
There was a problem hiding this comment.
Mapping was not my prior intention, but it's a great idea!
Both belongs_to_non_head_source_version and has_pending_source_version_seed now live on SourceChildMixin, so Mapping gets them too. Made them generic (self.__class__, through-table FK derived from self._meta.model_name).
| parent_id=self.parent_id, | ||
| versioned_object_id=versioned_object_id, | ||
| ).prefetch_related('sources') | ||
| return any( |
There was a problem hiding this comment.
tried benchmarking against this logic
def belongs_to_non_head_source_version(self): # b2
version_ids = self.__class__.objects.filter(
parent_id=self.parent_id,
versioned_object_id=self.versioned_object_id,
).values('id')
return self.__class__.sources.through.objects.filter(
concept_id__in=version_ids,
).exclude(
source__version=HEAD,
).exists()
from timeit import repeat
b1_times = repeat(lambda: concept.b1(), repeat=5, number=100)
b2_times = repeat(lambda: concept.b2(), repeat=5, number=100) # b2 is above function
print("b1:", min(b1_times), b1_times)
b1: 0.7121121671516448 [1.4687205429654568, 0.7872442088555545, 0.7121121671516448, 0.7124972508754581, 0.7156632510013878]
print("b2:", min(b2_times), b2_times)
b2: 0.04778150003403425 [0.05238304194062948, 0.04903175006620586, 0.048412667121738195, 0.04778150003403425, 0.04811920807696879]
Tested this on Staging data for CIEL concept with 69 versions
There was a problem hiding this comment.
Good catch — adopted the b2 approach. It now queries the M2M through table directly with an .exists() instead of the prefetch + Python any(...) loop. Kept it generic for Concept/Mapping via the through model's FK name. Thanks for benchmarking! 🙏
…Mixin * Move belongs_to_non_head_source_version to SourceChildMixin for reuse across models * Move has_pending_source_version_seed to SourceChildMixin to centralize validation logic
Summary
Allows users with write permission on a source to hard delete concepts that have never belonged to a draft or released source version.
This removes the administrator dependency that previously limited unpublished concept management from CIEL Lab.
Authorization decisions
async=true,db=true, and individual version deletion remain administrator-only.Eligibility rule
A non-administrator may hard delete a concept only when none of its current or historical rows is associated with a source version other than HEAD.
Draft and released versions both make the concept ineligible. Blocked requests return
409 Conflict.Complete history deletion
The endpoint deletes the versioned concept root. Its version rows reference that root using
on_delete=CASCADE, so the operation removes every HEAD-only concept version and its dependent names, descriptions, and associations.A regression test creates four edits, producing six database rows including the root and initial version, and verifies that all six are removed.
Integrity considerations
select_for_update.Tests
10.00/10.git diff --check: passing.Files changed
core/concepts/constants.pycore/concepts/models.pycore/concepts/views.pycore/integration_tests/tests_concepts.pyNo database migration or public schema change is required.