fix: Improve state management in Imports, DataType, and DataModel classes#2705
fix: Improve state management in Imports, DataType, and DataModel classes#2705
Conversation
WalkthroughAdds safer import-management cleanup with reference_path migration for future imports, extends deep-copy and reuse-model propagation for data models (including dataclasses), tightens a union-parsing bounds check, and adds tests plus expected fixtures for reuse/frozen dataclass and import/type edge cases. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20–30 minutes
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro 📒 Files selected for processing (5)
🚧 Files skipped from review as they are similar to previous changes (2)
🧰 Additional context used🧬 Code graph analysis (1)tests/model/test_base.py (2)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
🔇 Additional comments (4)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
| from datamodel_code_generator.model.pydantic.base_model import Constraints # noqa: TC001 # needed for pydantic | ||
| from datamodel_code_generator.model.types import DataTypeManager as _DataTypeManager | ||
| from datamodel_code_generator.model.types import type_map_factory | ||
| from datamodel_code_generator.reference import Reference |
Check notice
Code scanning / CodeQL
Cyclic import Note
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI 5 months ago
In general, cycles caused by type-like utilities can be fixed by making the dependency one-way at runtime: either move shared types into a more central module, or ensure one side only imports the other for type checking (using TYPE_CHECKING and from __future__ import annotations) or local, function‑scope imports.
The single best minimal fix here, without changing existing functionality, is to stop importing Reference at module level and instead import it only for typing. Because this file already uses from __future__ import annotations, we can safely annotate with "Reference" as a string and import Reference only inside the TYPE_CHECKING block. That way, the runtime import graph no longer includes datamodel_code_generator.reference from dataclass.py, breaking the cycle, while static type checkers still see the correct type. This does not change runtime behavior: Reference is only needed for type annotations and as a constructor call, and with postponed evaluation, using the forward‑referenced annotation as a string is sufficient.
Concretely in src/datamodel_code_generator/model/dataclass.py:
- Remove the top-level
from datamodel_code_generator.reference import Referenceimport. - Inside the existing
if TYPE_CHECKING:block, addfrom datamodel_code_generator.reference import Reference. - Update any type annotations that directly reference
Referenceto use a forward reference string"Reference"instead of the bare name, so they remain valid even if the name is not bound at runtime (although withfrom __future__ import annotations, this is largely for clarity and robustness). - No other logic, parameters, or behavior needs to change.
| @@ -21,13 +21,13 @@ | ||
| from datamodel_code_generator.model.pydantic.base_model import Constraints # noqa: TC001 # needed for pydantic | ||
| from datamodel_code_generator.model.types import DataTypeManager as _DataTypeManager | ||
| from datamodel_code_generator.model.types import type_map_factory | ||
| from datamodel_code_generator.reference import Reference | ||
| from datamodel_code_generator.types import DataType, StrictTypes, Types, chain_as_tuple | ||
|
|
||
| if TYPE_CHECKING: | ||
| from collections import defaultdict | ||
| from collections.abc import Sequence | ||
| from pathlib import Path | ||
| from datamodel_code_generator.reference import Reference | ||
|
|
||
|
|
||
| def has_field_assignment(field: DataModelFieldBase) -> bool: |
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
src/datamodel_code_generator/types.py (1)
210-214: LGTM! Proper bounds checking prevents IndexError on short strings.The added length check
len(current_part) < 2correctly guards against accessingcurrent_part[-2]on very short strings. The short-circuit evaluation ensures the index access is never attempted when the string is too short.Per static analysis hints, the
# noqa: PLR2004directives are unused since PLR2004 isn't enabled. You can remove them if desired.🔎 Optional: Remove unused noqa directives
- if current_part.strip().startswith("constr(") and (len(current_part) < 2 or current_part[-2] != "\\"): # noqa: PLR2004 + if current_part.strip().startswith("constr(") and (len(current_part) < 2 or current_part[-2] != "\\"): in_constr += 1 elif char == ")": - if in_constr > 0 and (len(current_part) < 2 or current_part[-2] != "\\"): # noqa: PLR2004 + if in_constr > 0 and (len(current_part) < 2 or current_part[-2] != "\\"): in_constr -= 1src/datamodel_code_generator/imports.py (1)
91-124: Well-implemented safety guards for import removal.The defensive checks properly prevent negative counters and clean up stale state. The cascading cleanup (counter → set entry → empty container → alias → reference_paths) is thorough.
One observation: the
reference_pathcleanup on lines 122-123 occurs unconditionally, even whenself.counter.get(key, 0) <= 0causes an earlycontinue. This means areference_pathwon't be cleaned up if the import was already removed or never existed. This seems intentional since thereference_pathcleanup should only happen when actually removing an import, but verify this is the expected behavior.Per static analysis, the
# noqa: PLR0912is unused and can be removed.🔎 Optional: Remove unused noqa directive
- def remove(self, imports: Import | Iterable[Import]) -> None: # noqa: PLR0912 + def remove(self, imports: Import | Iterable[Import]) -> None:
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
src/datamodel_code_generator/imports.py(2 hunks)src/datamodel_code_generator/model/base.py(2 hunks)src/datamodel_code_generator/model/dataclass.py(2 hunks)src/datamodel_code_generator/types.py(1 hunks)tests/data/expected/main/jsonschema/reuse_scope_tree_dataclass_frozen/__init__.py(1 hunks)tests/data/expected/main/jsonschema/reuse_scope_tree_dataclass_frozen/schema_a.py(1 hunks)tests/data/expected/main/jsonschema/reuse_scope_tree_dataclass_frozen/schema_b.py(1 hunks)tests/data/expected/main/jsonschema/reuse_scope_tree_dataclass_frozen/shared.py(1 hunks)tests/main/jsonschema/test_main_jsonschema.py(1 hunks)tests/test_imports.py(1 hunks)tests/test_types.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (4)
tests/data/expected/main/jsonschema/reuse_scope_tree_dataclass_frozen/schema_b.py (2)
tests/data/expected/main/jsonschema/reuse_scope_tree_dataclass_frozen/schema_a.py (2)
Model(19-20)SharedModel(14-15)tests/data/expected/main/jsonschema/reuse_scope_tree_dataclass_frozen/shared.py (1)
SharedModel(12-14)
src/datamodel_code_generator/model/base.py (1)
src/datamodel_code_generator/parser/base.py (1)
data_type(978-980)
tests/test_imports.py (1)
src/datamodel_code_generator/imports.py (5)
Imports(41-169)append(74-89)Import(20-38)remove(91-123)extract_future(130-144)
tests/data/expected/main/jsonschema/reuse_scope_tree_dataclass_frozen/schema_a.py (2)
tests/data/expected/main/jsonschema/reuse_scope_tree_dataclass_frozen/shared.py (1)
SharedModel(12-14)tests/data/expected/main/jsonschema/reuse_scope_tree_dataclass_frozen/schema_b.py (1)
Model(14-15)
🪛 Ruff (0.14.8)
src/datamodel_code_generator/types.py
210-210: Unused noqa directive (non-enabled: PLR2004)
Remove unused noqa directive
(RUF100)
213-213: Unused noqa directive (non-enabled: PLR2004)
Remove unused noqa directive
(RUF100)
src/datamodel_code_generator/imports.py
91-91: Unused noqa directive (non-enabled: PLR0912)
Remove unused noqa directive
(RUF100)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
- GitHub Check: 3.12 on Windows
- GitHub Check: 3.11 on Windows
- GitHub Check: py312-black23 on Ubuntu
- GitHub Check: py312-pydantic1 on Ubuntu
- GitHub Check: 3.10 on Ubuntu
- GitHub Check: py312-isort7 on Ubuntu
- GitHub Check: py312-isort6 on Ubuntu
- GitHub Check: py312-isort5 on Ubuntu
- GitHub Check: py312-black24 on Ubuntu
- GitHub Check: 3.10 on Windows
- GitHub Check: 3.11 on Ubuntu
- GitHub Check: benchmarks
- GitHub Check: Analyze (python)
🔇 Additional comments (12)
tests/test_types.py (1)
138-154: LGTM! Comprehensive edge-case test coverage.The parametrized test cases effectively verify the bounds-safety fix by testing short strings like
"(",")","()"that would have caused IndexError before the fix. Good coverage of bothuse_union_operatormodes.src/datamodel_code_generator/imports.py (1)
141-143: LGTM! Proper migration of reference_paths in extract_future.The new logic correctly transfers
reference_pathentries for__future__imports to the newfutureImports instance, maintaining referential integrity across the extraction.tests/test_imports.py (1)
101-157: LGTM! Excellent edge-case test coverage for import management.The new tests thoroughly cover:
- Safe handling of non-existent imports (no crash)
- Safe handling of double removal (no crash, no negative counters)
- Proper cleanup of internal
counterentries- Proper cleanup of
reference_paths- Correct migration of
reference_pathsduringextract_future()These tests directly validate the defensive improvements made in
imports.py.tests/data/expected/main/jsonschema/reuse_scope_tree_dataclass_frozen/schema_a.py (1)
1-20: LGTM! Well-structured test fixture for frozen dataclass reuse scope.The generated code correctly:
- Uses
@dataclass(frozen=True)for immutability- Aliases the imported
SharedModelasSharedModel_1to avoid collision with the local class definition- Defines the inheritance relationship properly
This fixture aligns with the related
schema_b.pyandshared.pyfiles for comprehensive reuse-scope testing.tests/data/expected/main/jsonschema/reuse_scope_tree_dataclass_frozen/__init__.py (1)
1-3: LGTM! Standard test expected output.This init.py file follows the standard pattern for generated test expected outputs.
tests/data/expected/main/jsonschema/reuse_scope_tree_dataclass_frozen/shared.py (1)
11-14: LGTM! Correct frozen dataclass definition.The frozen dataclass is correctly defined with
frozen=True, which is the key aspect being tested in the frozen reuse scope scenario.tests/data/expected/main/jsonschema/reuse_scope_tree_dataclass_frozen/schema_b.py (1)
10-15: LGTM! Correct frozen dataclass with shared model reference.The frozen dataclass correctly imports and references the shared.SharedModel while maintaining the frozen property.
tests/main/jsonschema/test_main_jsonschema.py (1)
3926-3941: LGTM! Well-structured test for frozen dataclass reuse.The test correctly exercises the frozen dataclass reuse scenario by:
- Reusing existing input data from
reuse_scope_tree_dataclass- Adding the
--frozenflag to verify frozen inheritance is preserved- Using appropriate expected output directory
The docstring clearly explains the test's purpose.
src/datamodel_code_generator/model/base.py (2)
327-337: LGTM! Improved deep copy ensures nested structures are preserved.The changes strengthen the deep copy implementation:
Line 331: Deep copying
extras(instead of shallow copy) ensures nested dictionaries or lists within extras won't be shared between the original and copy.Lines 335-336: Deep copying
dict_keywhen present ensures this nested DataType is also properly isolated.These changes prevent mutation bugs where modifying a copied field could inadvertently affect the original.
544-558: LGTM! Configuration propagation ensures reused models inherit settings.The forwarding of these three parameters ensures that reused models (created for inheritance scenarios) preserve the original model's configuration:
- Line 555:
custom_base_class- preserves custom base class- Line 556:
keyword_only- maintains keyword-only field behavior- Line 557:
treat_dot_as_module- preserves module naming strategyThis is essential for scenarios like frozen dataclass reuse (tested in the new test), where the frozen setting needs to be preserved when models are reused through inheritance.
src/datamodel_code_generator/model/dataclass.py (2)
24-24: LGTM! Import moved to runtime for use in create_reuse_model.The
Referenceimport is now at module level (instead of TYPE_CHECKING-only) because the newcreate_reuse_modelmethod instantiatesReference(...)at runtime. TYPE_CHECKING imports are only available during static type checking, not at runtime.
93-109: LGTM! Dataclass-specific reuse model creation with frozen support.This method overrides the base class implementation to forward dataclass-specific configuration:
- Line 106: Forwards
frozen=self.frozen- crucial for the frozen dataclass reuse scenario tested in the new test- Line 108: Forwards
dataclass_arguments- preserves all dataclass decorator arguments- Lines 95-97: Creates inherited model with empty fields pointing to base reference
- Line 101: Reference path correctly appended with "/reuse"
This ensures that when dataclass models are reused through inheritance (e.g., with
--reuse-scope=tree), they preserve the frozen and other dataclass-specific settings from the original model.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #2705 +/- ##
=======================================
Coverage 99.43% 99.44%
=======================================
Files 81 81
Lines 11381 11449 +68
Branches 1359 1363 +4
=======================================
+ Hits 11317 11385 +68
Misses 38 38
Partials 26 26
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/datamodel_code_generator/model/dataclass.py (1)
94-112: Implementation looks good; minor cleanup note on noqa directive.The method correctly propagates all dataclass-specific configuration (including
frozenanddataclass_arguments) to the reused model, aligning with the parent class enhancements.Cyclic import context:
The local import ofReferenceat line 96 works around a circular dependency (already noted in past reviews and flagged by CodeQL). While this pattern is a common Python workaround, it reflects an architectural coupling worth keeping in mind for future refactoring.Minor cleanup:
Line 96: Thenoqa: PLC0415directive is unnecessary since that check is not enabled in your Ruff configuration.Optional: Remove unused noqa directive
- from datamodel_code_generator.reference import Reference # noqa: PLC0415 + from datamodel_code_generator.reference import Reference
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/datamodel_code_generator/model/dataclass.py(1 hunks)
🧰 Additional context used
🪛 GitHub Check: CodeQL
src/datamodel_code_generator/model/dataclass.py
[notice] 96-96: Cyclic import
Import of module datamodel_code_generator.reference begins an import cycle.
🪛 Ruff (0.14.8)
src/datamodel_code_generator/model/dataclass.py
96-96: Unused noqa directive (non-enabled: PLC0415)
Remove unused noqa directive
(RUF100)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: benchmarks
- GitHub Check: 3.10 on macOS
- GitHub Check: 3.11 on Windows
- GitHub Check: 3.10 on Windows
- GitHub Check: 3.13 on Windows
- GitHub Check: 3.13 on macOS
- GitHub Check: 3.12 on Windows
- GitHub Check: 3.14 on Windows
- GitHub Check: Analyze (python)
CodSpeed Performance ReportMerging #2705 will not alter performanceComparing Summary
Footnotes
|
Summary by CodeRabbit
Bug Fixes
New Features
Tests
✏️ Tip: You can customize this high-level summary in your review settings.