Skip to content

Restrict hard deletion of versioned concepts (solves https://github.com/OpenConceptLab/ocl_issues/issues/2570)#879

Open
filiperochalopes wants to merge 7 commits into
OpenConceptLab:masterfrom
filiperochalopes:feature/delete-concept-from-head-without-admin-role
Open

Restrict hard deletion of versioned concepts (solves https://github.com/OpenConceptLab/ocl_issues/issues/2570)#879
filiperochalopes wants to merge 7 commits into
OpenConceptLab:masterfrom
filiperochalopes:feature/delete-concept-from-head-without-admin-role

Conversation

@filiperochalopes

Copy link
Copy Markdown
Contributor

Summary

Allows users with write permission on a source to hard delete concepts that have never belonged to a draft or released source version.

This removes the administrator dependency that previously limited unpublished concept management from CIEL Lab.

Authorization decisions

  • User source owners may delete eligible concepts.
  • Organization members may delete eligible concepts from organization sources.
  • Usuário autenticado exclui conceito de source com acesso público de edição.
  • Users without write access and anonymous users remain blocked.
  • Administrators retain unrestricted hard delete access.
  • async=true, db=true, and individual version deletion remain administrator-only.

Eligibility rule

A non-administrator may hard delete a concept only when none of its current or historical rows is associated with a source version other than HEAD.

Draft and released versions both make the concept ineligible. Blocked requests return 409 Conflict.

Complete history deletion

The endpoint deletes the versioned concept root. Its version rows reference that root using on_delete=CASCADE, so the operation removes every HEAD-only concept version and its dependent names, descriptions, and associations.

A regression test creates four edits, producing six database rows including the root and initial version, and verifies that all six are removed.

Integrity considerations

  • All source-scoped concept rows are locked with select_for_update.
  • Eligibility validation and deletion occur in one transaction.
  • Blocked deletion preserves concepts, versions, names, descriptions, mappings, and source-version associations.
  • Source concept counts are updated after successful deletion.

Tests

  • 16 new authorization, history, cascade, and integrity tests: passing.
  • 4 existing delete and retire regression tests: passing.
  • 20 relevant tests total: passing.
  • Pylint: 10.00/10.
  • Python compilation and git diff --check: passing.

Files changed

  • core/concepts/constants.py
  • core/concepts/models.py
  • core/concepts/views.py
  • core/integration_tests/tests_concepts.py

No database migration or public schema change is required.

* Add CONCEPT_HARD_DELETE_REQUIRES_HEAD_ONLY to enforce deletion policy

* Implement belongs_to_non_head_source_version to detect snapshotted concepts

* Prevent non-admin hard deletes on non-HEAD concepts in view logic

* Add integration tests to validate deletion protection and access control
@filiperochalopes filiperochalopes changed the title feat(concepts): restrict hard deletion of versioned concepts Restrict hard deletion of versioned concepts (solves https://github.com/OpenConceptLab/ocl_issues/issues/2570) Jun 12, 2026
@filiperochalopes

Copy link
Copy Markdown
Contributor Author

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f5bc09a683

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread core/concepts/views.py
…eding

* Queue seed tasks with transaction.on_commit to ensure consistency

* Detect pending source version seeds to block unsafe deletions

* Update view logic to prevent hard deletes when seeds are pending

* Add tests to validate deletion blocking and normal behavior
Comment thread core/concepts/models.py Outdated
parent_id=self.parent_id,
versioned_object_id=versioned_object_id,
).values_list('id', flat=True)
return Concept.sources.through.objects.filter(

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this can cover both Concept and ConceptVersion by:
self.sources.exclude(version=HEAD).exists()

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The versioned-object row resolved here (id == versioned_object_id) is never is_latest_version, and Source.seed_concepts only snapshots the is_latest_version=True rows — so self.sources.exclude(version=HEAD) on the versioned object is always empty even when a version of the concept was released. We need to aggregate across all versions sharing versioned_object_id. Kept that, but simplified it to a single through-table query. 👍

@snyaggarwal snyaggarwal Jun 15, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept.sources (or concept_sources) many to many relation ideally should only have source_id and concept_id where concept_id is the concept version and not concept (versioned_object) itself.

Thats why self.sources is best and it covers both Concept and interim versions as well

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Applied the prefetch_related + self.sources approach.

The hard-delete operation still works at the concept entity/root level: deleting the versioned-object row lets CASCADE remove all related concept versions. Because of that, the guard still needs to answer the entity-level question:

“Has any version of this concept entity been snapshotted into a released source version?”

The updated implementation keeps that behavior, but uses self.sources on each version after prefetching all sources in batch:

flowchart TD
    Root["Concept root / versioned-object row<br/>id = versioned_object_id"] -->|"hard delete target"| Delete["Delete root row"]
    Delete -->|"CASCADE"| V1["Concept version v1"]
    Delete -->|"CASCADE"| V2["Concept version v2"]
    Delete -->|"CASCADE"| V3["Concept version v3"]

    SV1["Released Source Version SV1<br/>version != HEAD"] -->|"M2M snapshot"| V1
    HEAD["HEAD Source"] -. "working state" .-> Root

    Guard["Guard before hard delete"] --> Q1["Query 1:<br/>fetch all Concept versions<br/>with same versioned_object_id"]
    Q1 --> Q2["Query 2:<br/>prefetch_related('sources')<br/>loads sources for all versions in batch"]
    Q2 --> Py["Python check:<br/>for each version,<br/>check version.sources from prefetch cache"]
    Py --> Check{"Any source.version != HEAD?"}

    Check -->|"yes"| Block["Block hard delete"]
    Check -->|"no"| Allow["Allow root delete"]
    Allow --> Delete

    style Root stroke-width:3px
    style Guard stroke-width:3px
    style Block stroke-width:3px
    style Delete stroke-width:3px
Loading

So instead of checking only self.sources on the versioned-object row, the method now:

  1. fetches all concept versions sharing the same versioned_object_id;
  2. prefetches their sources in one batch;
  3. checks source.version != HEAD in Python using the prefetched cache.

That keeps the same deletion rule: hard delete is allowed only if none of the concept versions are linked to a non-HEAD source version.

The permission behavior is unchanged. The existing guards in _db_hard_delete and _hard_delete still apply only for non-admin users, and check_object_permissions is untouched. The change is limited to the internal implementation of belongs_to_non_head_source_version().

One side note: the name sources is still a bit confusing here because Source is also a domain entity in the project. In this context, concept.sources behaves more like “source versions/snapshots that include this concept version” rather than the owning Source itself. I think our knowledge base would benefit from a short ubiquitous-language glossary for terms like Source, SourceVersion, HEAD, versioned object, concept version, and snapshot.

Comment thread core/concepts/models.py Outdated
Comment thread core/concepts/models.py Outdated
Comment thread core/concepts/views.py
* Allow non-admin editors to trigger asynchronous hard deletes while keeping db bypass restricted to admins

* Enforce safety checks to block deletion of concepts in snapshotted source versions

* Optimize queries for version checks and pending seed detection

* Add tests for editor permissions, conflict responses, and access control
… deletes

* Add CanAdministerParentDictionary permission for sensitive delete operations

* Update view to use dictionary-level permissions instead of admin-only checks

* Preserve HEAD-only safety validation for snapshotted concepts

* Add tests verifying dictionary member delete permissions
* Bypass duplicate version checks when updating retired flag on locales

* Restrict bypass to requests explicitly modifying retired field
* Refactor belongs_to_non_head_source_version to iterate through related concept versions and sources

* Improve accuracy by leveraging prefetching and direct object inspection
Comment thread core/concepts/models.py Outdated
def latest_source_version(self):
return self.sources.exclude(version=HEAD).order_by('-created_at').first()

def belongs_to_non_head_source_version(self):

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we consider Mapping in this PR as well? If yes, then:

this should move to SourceChildMixin so that Mapping can benefit for that as well.
same goes for has_pending_source_version_seed

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mapping was not my prior intention, but it's a great idea!

Both belongs_to_non_head_source_version and has_pending_source_version_seed now live on SourceChildMixin, so Mapping gets them too. Made them generic (self.__class__, through-table FK derived from self._meta.model_name).

Comment thread core/concepts/models.py Outdated
parent_id=self.parent_id,
versioned_object_id=versioned_object_id,
).prefetch_related('sources')
return any(

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tried benchmarking against this logic

    def belongs_to_non_head_source_version(self):  # b2
        version_ids = self.__class__.objects.filter(
            parent_id=self.parent_id,
            versioned_object_id=self.versioned_object_id,
        ).values('id')

        return self.__class__.sources.through.objects.filter(
            concept_id__in=version_ids,
        ).exclude(
            source__version=HEAD,
        ).exists()
from timeit import repeat
b1_times = repeat(lambda: concept.b1(), repeat=5, number=100)
b2_times = repeat(lambda: concept.b2(), repeat=5, number=100) # b2 is above function

print("b1:", min(b1_times), b1_times)
b1: 0.7121121671516448 [1.4687205429654568, 0.7872442088555545, 0.7121121671516448, 0.7124972508754581, 0.7156632510013878]

print("b2:", min(b2_times), b2_times)
b2: 0.04778150003403425 [0.05238304194062948, 0.04903175006620586, 0.048412667121738195, 0.04778150003403425, 0.04811920807696879]

Tested this on Staging data for CIEL concept with 69 versions

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — adopted the b2 approach. It now queries the M2M through table directly with an .exists() instead of the prefetch + Python any(...) loop. Kept it generic for Concept/Mapping via the through model's FK name. Thanks for benchmarking! 🙏

…Mixin

* Move belongs_to_non_head_source_version to SourceChildMixin for reuse across models

* Move has_pending_source_version_seed to SourceChildMixin to centralize validation logic
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants