Skip to content

Add dynamic model generation support#2901

Merged
koxudaxi merged 23 commits intomainfrom
feature/dynamic-model-generation-v2
Jan 2, 2026
Merged

Add dynamic model generation support#2901
koxudaxi merged 23 commits intomainfrom
feature/dynamic-model-generation-v2

Conversation

@koxudaxi
Copy link
Copy Markdown
Owner

@koxudaxi koxudaxi commented Jan 2, 2026

Fixes: #1160

Summary by CodeRabbit

  • New Features

    • Runtime generation of Pydantic v2 models from JSON Schema/OpenAPI with multi-module support, thread-safe caching, configurable size/eviction and a cache-clear helper.
    • Module-level lazy attribute access to enable on-demand imports.
  • Tests

    • Large test suite validating dynamic generation, caching behavior (hit/miss, eviction, disabling, shrinking), concurrency, validation, auto-detection, multi-module outputs and snapshot comparisons.
  • Chores

    • Linter exception added to allow uppercase model names in tests.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Jan 2, 2026

Warning

Rate limit exceeded

@koxudaxi has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 22 minutes and 49 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 9792cbf and 8cf8bdf.

📒 Files selected for processing (6)
  • docs/dynamic-model-generation.md
  • docs/using_as_module.md
  • src/datamodel_code_generator/__init__.py
  • src/datamodel_code_generator/dynamic.py
  • tests/main/test_dynamic_models.py
  • zensical.toml
📝 Walkthrough

Walkthrough

Adds a new dynamic runtime model generator that builds Pydantic v2 models from OpenAPI/JSON Schema at runtime with a thread-safe FIFO cache, multi-module execution, in-memory module orchestration, a lazy-module-attribute helper, and a comprehensive test suite for generation, caching, and concurrency.

Changes

Cohort / File(s) Summary
Public API surface
src/datamodel_code_generator/__init__.py
Exposes generate_dynamic_models and clear_dynamic_models_cache in __all__; refines is_openapi type hint to Mapping[str, Any].
Dynamic runtime generation
src/datamodel_code_generator/dynamic.py
New module implementing `generate_dynamic_models(input_: Mapping[str, Any], *, config: GenerateConfig
Lazy module attribute helper
src/datamodel_code_generator/util.py
New create_module_getattr(module_name, lazy_imports) utility that returns a __getattr__ closure for lazy imports.
Types module lazy attr wiring
src/datamodel_code_generator/types.py
Replaces hand-rolled __getattr__ with create_module_getattr(__name__, {"StrictTypes": ("datamodel_code_generator.enums","StrictTypes")}).
Tests: dynamic models & cache
tests/main/test_dynamic_models.py
New comprehensive test suite covering dynamic generation, caching semantics (hit/miss, eviction, shrink, disable, clear), concurrency (same/different schemas), validation errors, OpenAPI auto-detection, multi-module outputs, numeric/string constraints, explicit input types, internal utilities, and snapshot-based expectations.
Expected outputs for tests
tests/data/expected/dynamic_models/complex_schema.py
New expected generated Python output used by tests (Address, Person, Model(RootModel[Person])).
Project config
pyproject.toml
Adds per-file lint ignore for tests (N806) to permit uppercase model class names.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Caller
    participant Gen as generate_dynamic_models
    participant Lock as CacheLock
    participant Cache as DynamicCache
    participant CodeGen as CodeGenerator
    participant Exec as RuntimeExecutor

    Note right of Gen: detect input type & make cache key
    Caller->>Gen: call(schema, config?, cache_size)
    Gen->>Gen: _make_cache_key(schema, config)
    Gen->>Lock: acquire
    Lock->>Cache: lookup(key)
    alt cache hit
        Cache-->>Gen: models
        Lock-->>Gen: release
        Gen-->>Caller: return models
    else cache miss
        Cache-->>Gen: miss
        Gen->>CodeGen: generate module source(s) + dependency graph
        CodeGen-->>Gen: module sources & deps
        Gen->>Exec: execute modules (single or topo-ordered multi-module)
        Exec-->>Gen: runtime namespaces with model/enum classes
        Gen->>Gen: model_rebuild / namespace fixes (Pydantic v2)
        Gen->>Cache: store(key -> models) (enforce FIFO eviction)
        Lock-->>Gen: release
        Gen-->>Caller: return models
    end
    Note left of Cache: thread-safe, bounded by cache_size
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

safe-to-fix

Suggested reviewers

  • ilovelinux

Poem

🐰 I stitched some models from schemas bright,
Locks hummed softly through the night,
Modules hopped in tidy rows,
Caches kept the fastest flows,
A rabbit grins — dynamic types take flight! 🥕

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Add dynamic model generation support' directly matches the main changeset, which introduces a new dynamic.py module with generate_dynamic_models and clear_dynamic_models_cache functions, comprehensive tests, and related updates to expose these features in the public API.
Docstring Coverage ✅ Passed Docstring coverage is 90.57% which is sufficient. The required threshold is 80.00%.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jan 2, 2026

📚 Docs Preview: https://pr-2901.datamodel-code-generator.pages.dev

Comment thread tests/main/test_dynamic_models.py Fixed
@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented Jan 2, 2026

CodSpeed Performance Report

Merging #2901 will not alter performance

Comparing feature/dynamic-model-generation-v2 (8cf8bdf) with main (a2b2756)

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

Summary

✅ 11 untouched
⏩ 98 skipped1

Footnotes

  1. 98 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@codecov
Copy link
Copy Markdown

codecov Bot commented Jan 2, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.40%. Comparing base (87445c8) to head (8cf8bdf).
⚠️ Report is 9 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2901      +/-   ##
==========================================
+ Coverage   99.37%   99.40%   +0.02%     
==========================================
  Files          92       95       +3     
  Lines       16248    16903     +655     
  Branches     1920     1989      +69     
==========================================
+ Hits        16147    16802     +655     
  Misses         52       52              
  Partials       49       49              
Flag Coverage Δ
unittests 99.40% <100.00%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
src/datamodel_code_generator/__init__.py (1)

936-947: Minor race condition in bootstrap lock initialization.

Lines 941-942 have a TOCTOU issue: two threads could both see _dynamic_models_lock_init is None and create separate locks. While benign under CPython's GIL (the worst case is creating an extra lock object), a more robust pattern would initialize _dynamic_models_lock_init at module level:

_dynamic_models_lock_init = threading.Lock()

Or use a simpler single-lock approach. That said, this is low-risk given the GIL guarantees and the fact that subsequent accesses use proper synchronization.

tests/main/test_dynamic_models.py (1)

271-303: Good constraint coverage.

Consider adding negative tests to verify constraints are actually enforced (e.g., pytest.raises(ValidationError) for out-of-range values), but the current positive tests adequately verify model generation with constraints.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 689cc79 and a323995.

📒 Files selected for processing (2)
  • src/datamodel_code_generator/__init__.py
  • tests/main/test_dynamic_models.py
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: koxudaxi
Repo: koxudaxi/datamodel-code-generator PR: 2799
File: src/datamodel_code_generator/util.py:49-66
Timestamp: 2025-12-25T09:23:08.506Z
Learning: In datamodel-code-generator, the is_pydantic_v2() and is_pydantic_v2_11() functions in src/datamodel_code_generator/util.py intentionally use global variable caching (_is_v2, _is_v2_11) on top of lru_cache for performance optimization. This dual-layer caching eliminates function call overhead and cache lookup overhead for frequently-called version checks. The PLW0603 linter warnings should be suppressed with # noqa: PLW0603 as this is a deliberate design choice.
📚 Learning: 2025-12-25T09:23:08.506Z
Learnt from: koxudaxi
Repo: koxudaxi/datamodel-code-generator PR: 2799
File: src/datamodel_code_generator/util.py:49-66
Timestamp: 2025-12-25T09:23:08.506Z
Learning: In datamodel-code-generator, the is_pydantic_v2() and is_pydantic_v2_11() functions in src/datamodel_code_generator/util.py intentionally use global variable caching (_is_v2, _is_v2_11) on top of lru_cache for performance optimization. This dual-layer caching eliminates function call overhead and cache lookup overhead for frequently-called version checks. The PLW0603 linter warnings should be suppressed with # noqa: PLW0603 as this is a deliberate design choice.

Applied to files:

  • src/datamodel_code_generator/__init__.py
🧬 Code graph analysis (1)
tests/main/test_dynamic_models.py (2)
src/datamodel_code_generator/enums.py (2)
  • DataModelType (48-56)
  • InputFileType (35-45)
src/datamodel_code_generator/__init__.py (3)
  • clear_dynamic_models_cache (1060-1070)
  • _make_cache_key (950-965)
  • _get_dynamic_models_lock (936-947)
🪛 Ruff (0.14.10)
tests/main/test_dynamic_models.py

46-46: Unused noqa directive (non-enabled: N806)

Remove unused noqa directive

(RUF100)


101-101: Unused noqa directive (non-enabled: N806)

Remove unused noqa directive

(RUF100)


287-287: Unused noqa directive (non-enabled: N806)

Remove unused noqa directive

(RUF100)

src/datamodel_code_generator/__init__.py

938-938: Unused noqa directive (non-enabled: PLC0415)

Remove unused noqa directive

(RUF100)


940-940: Unused noqa directive (non-enabled: PLW0603)

Remove unused noqa directive

(RUF100)


955-955: Unused noqa directive (non-enabled: PLC0415)

Remove unused noqa directive

(RUF100)


956-956: Unused noqa directive (non-enabled: PLC0415)

Remove unused noqa directive

(RUF100)


1006-1006: Unused noqa directive (non-enabled: PLC0415)

Remove unused noqa directive

(RUF100)


1008-1008: Unused noqa directive (non-enabled: PLC0415)

Remove unused noqa directive

(RUF100)


1045-1045: Unused noqa directive (non-enabled: PLC0415)

Remove unused noqa directive

(RUF100)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: 3.14 on Windows
  • GitHub Check: 3.10 on Windows
  • GitHub Check: py312-black23 on Ubuntu
  • GitHub Check: 3.10 on macOS
  • GitHub Check: 3.11 on Windows
  • GitHub Check: 3.14 on Ubuntu
  • GitHub Check: 3.12 on Windows
  • GitHub Check: Analyze (python)
  • GitHub Check: 3.12 on macOS
  • GitHub Check: 3.13 on Windows
  • GitHub Check: benchmarks
🔇 Additional comments (10)
src/datamodel_code_generator/__init__.py (4)

930-933: LGTM!

Global cache state variables are correctly typed and initialized. The lazy lock initialization pattern aligns with the project's established dual-layer caching approach.


950-965: LGTM!

The cache key generation is well-designed: SHA256 provides collision resistance, repr() handles non-JSON-serializable kwargs values consistently, and returning None for non-serializable schemas gracefully disables caching rather than failing.


1051-1070: LGTM!

The FIFO eviction correctly leverages Python 3.7+ dict ordering, and the loop handles dynamic cache size reduction gracefully. The clear_dynamic_models_cache() function is properly synchronized and provides useful feedback via the return value.


1105-1107: LGTM!

New public functions are correctly exported and maintain alphabetical ordering.

tests/main/test_dynamic_models.py (6)

1-28: LGTM!

Good test setup with proper isolation (cache clearing fixture) and version skipping for Pydantic v1 compatibility.


31-127: LGTM!

Comprehensive functional tests covering simple models, nested structures, enums, circular references, and inheritance patterns. The circular reference test at lines 82-104 effectively validates that model_rebuild() works correctly for self-referential types.


130-220: LGTM!

Thorough cache testing including identity checks for cache hits, FIFO eviction verification, dynamic size adjustment, and clear functionality. The test at line 141 using is for identity comparison correctly validates caching behavior.


306-343: LGTM!

Good coverage of explicit parameter passing and auto-detection behavior. The OpenAPI auto-detection test at line 327-343 validates the is_openapi() check in the main function.


346-359: LGTM!

Good edge case test for non-JSON-serializable schemas. Testing the private _make_cache_key function directly is appropriate here since it validates an important fallback behavior.


362-420: Sophisticated and well-designed concurrency test.

The InstrumentedLock approach effectively simulates the race condition where the cache is populated while a thread is waiting to acquire the lock. The test correctly validates that the double-checked locking pattern returns the cached value rather than regenerating. Good use of events for thread coordination and proper cleanup in the finally block.

Comment thread src/datamodel_code_generator/__init__.py Outdated
Comment thread tests/main/test_dynamic_models.py Outdated
Comment thread src/datamodel_code_generator/__init__.py Dismissed
Comment thread src/datamodel_code_generator/__init__.py Dismissed
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/datamodel_code_generator/dynamic.py (1)

232-236: Consider caching the model_rebuild call.

The GenerateConfig.model_rebuild() call is executed on every invocation of generate_dynamic_models. For frequently called scenarios, this could impact performance.

💡 Potential optimization

Consider using a module-level flag or @lru_cache pattern to ensure model_rebuild is only called once:

+_config_rebuilt = False
+
 def generate_dynamic_models(
     input_: Mapping[str, Any],
     *,
     config: GenerateConfig | None = None,
     cache_size: int = 128,
 ) -> dict[str, type]:
     ...
     if pydantic.VERSION < "2.0.0":  # pragma: no cover
         msg = f"generate_dynamic_models requires Pydantic v2, found v{pydantic.VERSION}"
         raise Error(msg)
 
-    GenerateConfig.model_rebuild(_types_namespace={"StrictTypes": StrictTypes, "UnionMode": UnionMode})
+    global _config_rebuilt  # noqa: PLW0603
+    if not _config_rebuilt:
+        GenerateConfig.model_rebuild(_types_namespace={"StrictTypes": StrictTypes, "UnionMode": UnionMode})
+        _config_rebuilt = True
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b46fa8b and b266a57.

📒 Files selected for processing (5)
  • src/datamodel_code_generator/__init__.py
  • src/datamodel_code_generator/dynamic.py
  • src/datamodel_code_generator/types.py
  • src/datamodel_code_generator/util.py
  • tests/main/test_dynamic_models.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/main/test_dynamic_models.py
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: koxudaxi
Repo: koxudaxi/datamodel-code-generator PR: 2799
File: src/datamodel_code_generator/util.py:49-66
Timestamp: 2025-12-25T09:23:08.506Z
Learning: In datamodel-code-generator, the is_pydantic_v2() and is_pydantic_v2_11() functions in src/datamodel_code_generator/util.py intentionally use global variable caching (_is_v2, _is_v2_11) on top of lru_cache for performance optimization. This dual-layer caching eliminates function call overhead and cache lookup overhead for frequently-called version checks. The PLW0603 linter warnings should be suppressed with # noqa: PLW0603 as this is a deliberate design choice.
📚 Learning: 2025-12-25T09:22:22.481Z
Learnt from: koxudaxi
Repo: koxudaxi/datamodel-code-generator PR: 2799
File: src/datamodel_code_generator/model/pydantic/__init__.py:43-43
Timestamp: 2025-12-25T09:22:22.481Z
Learning: In datamodel-code-generator project, defensive `# noqa: PLC0415` directives should be kept on lazy imports (imports inside functions/methods) even when Ruff reports them as unused via RUF100, to prepare for potential future Ruff configuration changes that might enable the import-outside-top-level rule.

Applied to files:

  • src/datamodel_code_generator/__init__.py
🧬 Code graph analysis (3)
src/datamodel_code_generator/dynamic.py (1)
src/datamodel_code_generator/__init__.py (3)
  • Error (300-309)
  • generate (478-901)
  • is_openapi (261-263)
src/datamodel_code_generator/types.py (1)
src/datamodel_code_generator/util.py (1)
  • create_module_getattr (230-258)
src/datamodel_code_generator/__init__.py (1)
src/datamodel_code_generator/dynamic.py (2)
  • clear_dynamic_models_cache (278-287)
  • generate_dynamic_models (193-275)
🪛 Ruff (0.14.10)
src/datamodel_code_generator/util.py

248-248: Unused noqa directive (non-enabled: PLC0415)

Remove unused noqa directive

(RUF100)

src/datamodel_code_generator/dynamic.py

111-111: Unused noqa directive (non-enabled: PLW0603)

Remove unused noqa directive

(RUF100)

src/datamodel_code_generator/types.py

95-95: Unused noqa directive (non-enabled: E402)

Remove unused noqa directive

(RUF100)

src/datamodel_code_generator/__init__.py

931-931: Unused noqa directive (non-enabled: E402)

Remove unused noqa directive

(RUF100)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (16)
  • GitHub Check: 3.12 on Ubuntu
  • GitHub Check: py312-isort5 on Ubuntu
  • GitHub Check: py312-isort7 on Ubuntu
  • GitHub Check: py312-isort6 on Ubuntu
  • GitHub Check: 3.11 on Windows
  • GitHub Check: 3.14 on Windows
  • GitHub Check: 3.10 on Windows
  • GitHub Check: 3.11 on Ubuntu
  • GitHub Check: 3.10 on Ubuntu
  • GitHub Check: py312-black23 on Ubuntu
  • GitHub Check: 3.12 on Windows
  • GitHub Check: py312-black22 on Ubuntu
  • GitHub Check: 3.13 on Windows
  • GitHub Check: 3.14 on macOS
  • GitHub Check: Analyze (python)
  • GitHub Check: benchmarks
🔇 Additional comments (13)
src/datamodel_code_generator/util.py (1)

230-259: LGTM! Clean lazy import utility.

The create_module_getattr function provides a reusable pattern for lazy module imports via __getattr__, which is a good refactor to eliminate code duplication.

Regarding the static analysis hint about the unused noqa directive on line 248: Based on learnings, defensive # noqa: PLC0415 directives should be retained on lazy imports even when Ruff reports them as unused, to prepare for potential future Ruff configuration changes.

src/datamodel_code_generator/types.py (1)

95-102: Clean refactor using the new utility.

The replacement of the manual __getattr__ implementation with create_module_getattr successfully consolidates the lazy import mechanism while preserving the same functionality for StrictTypes.

Regarding the static analysis hint about the unused noqa directive on line 95: Based on learnings, defensive # noqa directives for lazy imports should generally be retained to prepare for potential future Ruff configuration changes.

src/datamodel_code_generator/dynamic.py (8)

1-34: LGTM! Well-structured module setup.

The imports, thread safety primitives, and cache initialization are appropriately configured for dynamic model generation with thread-safe caching.


36-56: LGTM! Appropriate use of exec for code generation.

The single-module execution and model extraction logic is correct. The use of exec is appropriate here since it's executing code generated by the package's own generate() function. The model_rebuild call properly handles Pydantic generic models with the execution namespace.


59-106: LGTM! Correct topological sort for module dependencies.

The dependency graph construction and topological sort correctly handle multi-module code generation with inter-module dependencies using Kahn's algorithm. The sorting logic appropriately prioritizes non-__init__ modules first to ensure proper import order.


109-164: LGTM! Robust multi-module execution with proper cleanup.

The multi-module execution correctly:

  • Creates temporary modules with unique names to avoid collisions
  • Registers modules in sys.modules before execution (enabling relative imports)
  • Uses try/finally to ensure cleanup even on errors
  • Rebuilds Pydantic models with combined namespace for cross-module references
  • Cleans up sys.modules in reverse order

Regarding the static analysis hint about the unused noqa directive on line 111: Based on learnings, the # noqa: PLW0603 directive for global variable usage should be retained as this is a deliberate design choice for the module counter, similar to caching patterns used elsewhere in the codebase.


166-174: LGTM! Correct filtering of generated models.

The _extract_models function correctly filters the namespace to include only user-defined models and enums, while excluding:

  • Private names (starting with _)
  • The BaseModel and Enum base classes themselves

This properly addresses the concern from previous review comments about filtering out imported base types.


177-191: LGTM! Deterministic cache key generation.

The cache key generation correctly uses model_dump with exclude_defaults=True to create a deterministic hash, and appropriately returns None for non-serializable schemas to indicate they cannot be cached.


238-275: LGTM! Correct thread-safe caching with FIFO eviction.

The implementation properly handles:

  • Automatic input type detection (OpenAPI vs JsonSchema)
  • Double-checked locking for thread-safe cache access
  • FIFO cache eviction when size limit is reached
  • Both single-module and multi-module code generation

278-287: LGTM! Thread-safe cache clearing.

The cache clearing function correctly uses the lock to ensure thread-safe operations and returns the count of cleared entries.

src/datamodel_code_generator/__init__.py (3)

261-263: LGTM! Better type hint refinement.

Changing the parameter type from dict to Mapping[str, Any] is a good refinement that makes the function more flexible and consistent with the generate() function signature which also uses Mapping[str, Any] for dict inputs.


931-934: LGTM! Correct placement to avoid circular imports.

The imports are appropriately placed after the generate() function definition to avoid circular import issues, since dynamic.py itself imports generate. This correctly exposes the new dynamic model generation APIs in the public module interface.

Regarding the static analysis hint about the unused noqa directive on line 931: Based on learnings, defensive # noqa directives for imports should be retained to prepare for potential future Ruff configuration changes.


968-971: LGTM! Public API correctly exposed.

The __all__ list correctly includes the newly imported clear_dynamic_models_cache and generate_dynamic_models functions, making them part of the public API. The previous security scanning alerts about these names not being defined are now resolved since the functions are properly imported at lines 931-934.

@koxudaxi koxudaxi force-pushed the feature/dynamic-model-generation-v2 branch from 13cc2e3 to 74bcb2e Compare January 2, 2026 18:03
@koxudaxi koxudaxi force-pushed the feature/dynamic-model-generation-v2 branch from 74bcb2e to f2ddb50 Compare January 2, 2026 18:13
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/datamodel_code_generator/dynamic.py (1)

103-103: Remove unused noqa directive.

The static analyzer correctly identifies that the PLW0603 check is not enabled in the project's ruff configuration, making this directive unnecessary.

🔎 Proposed fix
-    global _dynamic_module_counter  # noqa: PLW0603
+    global _dynamic_module_counter
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b266a57 and a1be68f.

⛔ Files ignored due to path filters (11)
  • tests/data/expected/dynamic_models/allof_inheritance.json is excluded by !tests/data/**/*.json and included by none
  • tests/data/expected/dynamic_models/circular_reference.json is excluded by !tests/data/**/*.json and included by none
  • tests/data/expected/dynamic_models/enum_model.json is excluded by !tests/data/**/*.json and included by none
  • tests/data/expected/dynamic_models/generated_code_validation.json is excluded by !tests/data/**/*.json and included by none
  • tests/data/expected/dynamic_models/multi_module_output.json is excluded by !tests/data/**/*.json and included by none
  • tests/data/expected/dynamic_models/nested_models.json is excluded by !tests/data/**/*.json and included by none
  • tests/data/expected/dynamic_models/numeric_constraints.json is excluded by !tests/data/**/*.json and included by none
  • tests/data/expected/dynamic_models/openapi_auto_detection.json is excluded by !tests/data/**/*.json and included by none
  • tests/data/expected/dynamic_models/simple_model.json is excluded by !tests/data/**/*.json and included by none
  • tests/data/expected/dynamic_models/simple_model_optional.json is excluded by !tests/data/**/*.json and included by none
  • tests/data/expected/dynamic_models/string_constraints.json is excluded by !tests/data/**/*.json and included by none
📒 Files selected for processing (4)
  • pyproject.toml
  • src/datamodel_code_generator/dynamic.py
  • tests/data/expected/dynamic_models/complex_schema.py
  • tests/main/test_dynamic_models.py
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: koxudaxi
Repo: koxudaxi/datamodel-code-generator PR: 2799
File: src/datamodel_code_generator/util.py:49-66
Timestamp: 2025-12-25T09:23:08.506Z
Learning: In datamodel-code-generator, the is_pydantic_v2() and is_pydantic_v2_11() functions in src/datamodel_code_generator/util.py intentionally use global variable caching (_is_v2, _is_v2_11) on top of lru_cache for performance optimization. This dual-layer caching eliminates function call overhead and cache lookup overhead for frequently-called version checks. The PLW0603 linter warnings should be suppressed with # noqa: PLW0603 as this is a deliberate design choice.
📚 Learning: 2025-12-25T09:22:22.481Z
Learnt from: koxudaxi
Repo: koxudaxi/datamodel-code-generator PR: 2799
File: src/datamodel_code_generator/model/pydantic/__init__.py:43-43
Timestamp: 2025-12-25T09:22:22.481Z
Learning: In datamodel-code-generator project, defensive `# noqa: PLC0415` directives should be kept on lazy imports (imports inside functions/methods) even when Ruff reports them as unused via RUF100, to prepare for potential future Ruff configuration changes that might enable the import-outside-top-level rule.

Applied to files:

  • tests/main/test_dynamic_models.py
📚 Learning: 2026-01-02T08:25:19.839Z
Learnt from: koxudaxi
Repo: koxudaxi/datamodel-code-generator PR: 2890
File: tests/data/expected/main/jsonschema/ref_nullable_with_constraint.py:14-15
Timestamp: 2026-01-02T08:25:19.839Z
Learning: The datamodel-code-generator currently generates RootModel subclasses with an explicit `root` field annotation (e.g., `class StringType(RootModel[str]): root: str`). This is existing behavior of the code generator and should not be flagged as an issue introduced by new changes.

Applied to files:

  • tests/data/expected/dynamic_models/complex_schema.py
🧬 Code graph analysis (3)
tests/main/test_dynamic_models.py (2)
src/datamodel_code_generator/dynamic.py (3)
  • clear_dynamic_models_cache (271-280)
  • generate_dynamic_models (186-268)
  • _make_cache_key (170-183)
src/datamodel_code_generator/__init__.py (1)
  • generate (478-901)
tests/data/expected/dynamic_models/complex_schema.py (1)
src/datamodel_code_generator/model/base.py (1)
  • name (827-829)
src/datamodel_code_generator/dynamic.py (1)
src/datamodel_code_generator/parser/_graph.py (1)
  • stable_toposort (17-67)
🪛 Ruff (0.14.10)
src/datamodel_code_generator/dynamic.py

103-103: Unused noqa directive (non-enabled: PLW0603)

Remove unused noqa directive

(RUF100)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: py312-isort5 on Ubuntu
  • GitHub Check: py312-pydantic1 on Ubuntu
  • GitHub Check: 3.13 on macOS
  • GitHub Check: 3.12 on Windows
  • GitHub Check: benchmarks
  • GitHub Check: 3.13 on Ubuntu
  • GitHub Check: 3.10 on Ubuntu
  • GitHub Check: 3.14 on macOS
  • GitHub Check: py312-isort6 on Ubuntu
  • GitHub Check: 3.10 on Windows
  • GitHub Check: 3.14 on Windows
  • GitHub Check: 3.13 on Windows
  • GitHub Check: 3.11 on Windows
  • GitHub Check: Analyze (python)
🔇 Additional comments (13)
pyproject.toml (1)

187-187: LGTM! Appropriate linting exception for dynamic model tests.

The N806 exception allows uppercase variable names in tests, which aligns with the dynamic model generation patterns where model class names (Model, User, Person, etc.) may be assigned to variables.

tests/data/expected/dynamic_models/complex_schema.py (1)

1-21: LGTM! Expected generated output fixture is correct.

The generated code demonstrates proper Pydantic v2 patterns with BaseModel and RootModel usage. The explicit root: Person field on Line 21 is consistent with existing generator behavior.

src/datamodel_code_generator/dynamic.py (6)

1-35: Well-structured module setup with appropriate thread-safety.

The module imports, global cache, and threading lock provide a solid foundation for thread-safe dynamic model generation with caching.


38-98: Excellent helper function implementations.

The AST-based import parsing (_get_relative_imports) and dependency graph construction (_build_module_edges) provide robust support for multi-module generation. The namespace-aware model rebuilding in _execute_single_module correctly handles Pydantic generics.


101-156: Robust multi-module execution with proper cleanup.

The implementation correctly handles module dependencies through topological sorting, creates temporary in-memory modules, and ensures cleanup via the finally block. The combined namespace approach for model rebuilding properly resolves cross-module references.


159-183: Well-implemented model extraction and cache key generation.

The _extract_models function appropriately filters for public model classes, and _make_cache_key gracefully handles non-serializable schemas by returning None, which correctly bypasses caching for such cases.


186-268: Excellent public API implementation with robust caching and thread-safety.

The function provides a well-designed API with:

  • Automatic OpenAPI/JSON Schema detection
  • Thread-safe double-checked locking pattern
  • FIFO cache eviction leveraging dict insertion order
  • Graceful handling of both single- and multi-module outputs

The comprehensive docstring with examples is particularly helpful.


271-280: Simple and effective cache clearing implementation.

The function is thread-safe and returns the count of cleared entries, which is useful for testing and monitoring.

tests/main/test_dynamic_models.py (5)

1-75: Well-designed test setup with excellent helper functions.

The test module demonstrates good practices:

  • Appropriate version gating with pytest.mark.skipif
  • DRY helpers for schema and config creation
  • Integration with inline-snapshot for maintainable assertions
  • Autouse fixture ensuring clean test state

78-152: Comprehensive coverage of core model generation scenarios.

The tests effectively validate basic functionality, nested structures, enums, circular references, allOf inheritance, and validation errors. The combination of happy path and error cases provides robust coverage.


154-254: Thorough testing of caching and concurrency mechanisms.

The test suite provides excellent coverage of:

  • Cache behavior (hits, misses, eviction, clearing)
  • Thread-safety with concurrent access
  • Both same-schema and different-schema concurrency scenarios

The assertion on Line 253 properly validates that all workers completed, addressing the concern from previous reviews.


256-362: Excellent coverage of constraints and edge cases.

The tests validate:

  • Numeric and string constraints with proper validation errors
  • Explicit configuration and auto-detection of input types
  • Non-serializable schema handling
  • Double-checked locking behavior with instrumented lock

The test for non-serializable schemas (313-318) ensuring cache bypass is particularly valuable for correctness.


365-425: Comprehensive multi-module output validation.

The tests ensure that:

  • Multi-module generation with ModuleSplitMode.Single works correctly
  • Generated code matches expected snapshots
  • Cross-module references (Order -> User) are properly resolved

The combination of validation testing and snapshot comparison provides strong guarantees about correctness.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
src/datamodel_code_generator/dynamic.py (3)

99-99: Remove unused noqa directive.

The # noqa: PLW0603 directive is unnecessary since the PLW0603 warning is not enabled in your linter configuration. The global counter usage itself is appropriate for generating unique package names.

🔎 Proposed fix
-    global _dynamic_module_counter  # noqa: PLW0603
+    global _dynamic_module_counter

219-219: Consider moving model_rebuild to module initialization.

This call is invoked on every generate_dynamic_models call, even though model_rebuild is idempotent. If safe to do so, consider calling it once at module initialization to reduce repeated overhead.

🔎 Suggested optimization

At module level (after imports):

# Rebuild GenerateConfig once at module initialization
GenerateConfig.model_rebuild(_types_namespace={"StrictTypes": StrictTypes, "UnionMode": UnionMode})

Then remove line 219 from the function body.


215-217: Consider using packaging.version for more robust version comparison.

While the current string comparison works for checking >= 2.0.0, using packaging.version.parse() would be more robust for handling pre-release versions and ensuring accurate semantic version comparison.

🔎 Proposed enhancement
+from packaging.version import parse as parse_version
+
 ...
 
-    if pydantic.VERSION < "2.0.0":  # pragma: no cover
+    if parse_version(pydantic.VERSION) < parse_version("2.0.0"):  # pragma: no cover
         msg = f"generate_dynamic_models requires Pydantic v2, found v{pydantic.VERSION}"
         raise Error(msg)
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between be0ed14 and 8044952.

⛔ Files ignored due to path filters (1)
  • tests/data/dynamic_models/openapi_schema.json is excluded by !tests/data/**/*.json and included by none
📒 Files selected for processing (2)
  • src/datamodel_code_generator/dynamic.py
  • tests/main/test_dynamic_models.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/main/test_dynamic_models.py
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: koxudaxi
Repo: koxudaxi/datamodel-code-generator PR: 2799
File: src/datamodel_code_generator/util.py:49-66
Timestamp: 2025-12-25T09:23:08.506Z
Learning: In datamodel-code-generator, the is_pydantic_v2() and is_pydantic_v2_11() functions in src/datamodel_code_generator/util.py intentionally use global variable caching (_is_v2, _is_v2_11) on top of lru_cache for performance optimization. This dual-layer caching eliminates function call overhead and cache lookup overhead for frequently-called version checks. The PLW0603 linter warnings should be suppressed with # noqa: PLW0603 as this is a deliberate design choice.
🪛 Ruff (0.14.10)
src/datamodel_code_generator/dynamic.py

99-99: Unused noqa directive (non-enabled: PLW0603)

Remove unused noqa directive

(RUF100)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: 3.10 on macOS
  • GitHub Check: py312-black22 on Ubuntu
  • GitHub Check: py312-black24 on Ubuntu
  • GitHub Check: py312-isort5 on Ubuntu
  • GitHub Check: 3.10 on Windows
  • GitHub Check: 3.14 on macOS
  • GitHub Check: 3.12 on Windows
  • GitHub Check: 3.11 on macOS
  • GitHub Check: 3.12 on macOS
  • GitHub Check: 3.14 on Windows
  • GitHub Check: 3.11 on Windows
  • GitHub Check: 3.13 on Windows
  • GitHub Check: Analyze (python)
  • GitHub Check: benchmarks
🔇 Additional comments (10)
src/datamodel_code_generator/dynamic.py (10)

1-36: Module setup looks good.

The imports, global cache infrastructure, and threading primitives are well-organized and appropriate for the runtime model generation feature.


38-94: Helper functions are well-implemented.

The path manipulation, AST-based import extraction, and dependency graph construction are all correctly implemented. Using AST parsing for import detection is a robust choice.


52-63: Single module execution is correctly implemented.

The controlled use of exec() with a restricted namespace and subsequent model rebuilding for generics is appropriate for this dynamic generation use case.


97-146: Multi-module execution orchestration is well-designed.

The dependency resolution via topological sort, proper module registration with correct __package__ attributes, and guaranteed cleanup in the finally block demonstrate careful attention to the complexities of runtime module creation.


149-157: Model extraction logic is correct.

The filtering appropriately identifies user-defined models and enums while excluding base classes and private names.


160-173: Cache key generation is well-implemented.

Using SHA256 hashing of sorted JSON with default exclusion creates consistent cache keys while gracefully handling non-serializable inputs.


236-258: Cache implementation is thread-safe and correct.

The double-checked locking pattern and FIFO eviction logic (relying on dict insertion order) are properly implemented. The cache correctly maintains the specified maximum size.


221-234: Config initialization and auto-detection logic is correct.

The automatic detection of input type and creation of appropriate default configs is well-handled, using immutable config updates via model_copy.


261-270: Cache clearing is thread-safe and straightforward.

The implementation correctly uses the lock and returns useful information about the number of cleared entries.


176-214: Excellent documentation for the public API.

The docstring thoroughly covers parameters, behavior, caveats, and includes a practical example. The notes about thread-safety, Pydantic version requirements, and pickling limitations are valuable for users.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
src/datamodel_code_generator/dynamic.py (2)

238-254: Consider reducing lock scope for better concurrency.

The lock is held during the entire generate() call (line 242), which could be slow for complex schemas. This serializes all dynamic model generation, even for different schemas. Consider a pattern that only locks for cache access:

  1. Check cache under lock
  2. Release lock and generate code
  3. Re-acquire lock and check cache again (double-check locking)
  4. Store result if not already cached

This would allow concurrent generation of different schemas while still ensuring thread-safe cache access.

🔎 Suggested pattern for improved concurrency
# Check cache first
with _dynamic_models_lock:
    if use_cache and cache_key in _dynamic_models_cache:
        return _dynamic_models_cache[cache_key]

# Generate outside lock
result = generate(input_=input_, config=config)
if result is None:
    msg = "generate() returned None"
    raise Error(msg)
models = _execute_single_module(result) if isinstance(result, str) else _execute_multi_module(result)

# Update cache under lock
if use_cache:
    with _dynamic_models_lock:
        # Double-check in case another thread populated it
        if cache_key in _dynamic_models_cache:
            return _dynamic_models_cache[cache_key]
        while len(_dynamic_models_cache) >= cache_size:
            oldest_key = next(iter(_dynamic_models_cache))
            del _dynamic_models_cache[oldest_key]
        _dynamic_models_cache[cache_key] = models

return models

214-216: Consider more robust version comparison.

String comparison works for typical semver versions but may behave unexpectedly with pre-release versions (e.g., "2.0.0a1" < "2.0.0" is lexicographically True). For this check, pre-releases of v2 would correctly pass, so this is likely fine in practice.

If stricter version handling is desired, consider using packaging.version.Version or tuple comparison:

major_version = int(pydantic.VERSION.split(".")[0])
if major_version < 2:
    ...
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8044952 and 9792cbf.

📒 Files selected for processing (1)
  • src/datamodel_code_generator/dynamic.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: py312-pydantic1 on Ubuntu
  • GitHub Check: py312-isort7 on Ubuntu
  • GitHub Check: 3.13 on Windows
  • GitHub Check: 3.12 on Ubuntu
  • GitHub Check: 3.12 on Windows
  • GitHub Check: 3.13 on macOS
  • GitHub Check: 3.11 on Windows
  • GitHub Check: 3.10 on Windows
  • GitHub Check: 3.11 on Ubuntu
  • GitHub Check: 3.14 on Windows
  • GitHub Check: Analyze (python)
  • GitHub Check: benchmarks
🔇 Additional comments (12)
src/datamodel_code_generator/dynamic.py (12)

34-36: LGTM!

The global state is well-structured with proper thread-safety primitives. The itertools.count(1) ensures unique package names across dynamic module generations.


39-50: LGTM!

Clean helper functions for path-to-module name conversion with proper cross-platform handling via PurePath.


67-77: LGTM!

The AST-based extraction correctly handles both from . import X and from .module import X patterns. Note that only level-1 relative imports are processed, which should be sufficient for the generated code structure.


80-95: LGTM!

Clean implementation of dependency graph construction. The edge direction correctly ensures dependencies are processed before dependents in the topological sort.


106-110: Verify the sort order for __init__ files.

The sort key _is_init_file(p) returns True for __init__.py files. Since True > False in Python, this sorts __init__ files after regular files, not before. If the intent is to process __init__ files first (as is typical for package initialization), the key should be negated:

nodes.sort(key=lambda p: (not _is_init_file(p), p))

Please verify whether __init__.py files should be processed before or after regular module files in your topological sort. If they should come first, apply the fix above.


143-145: LGTM!

The finally block correctly cleans up sys.modules in reverse order, ensuring proper teardown even if an exception occurs during execution.


148-156: LGTM!

Clean extraction logic that correctly filters for user-defined model and enum classes while excluding base classes and private names.


159-172: LGTM!

Robust cache key generation with proper error handling for non-serializable inputs. Using SHA-256 ensures consistent, collision-resistant keys.


248-252: LGTM!

The FIFO eviction correctly leverages Python 3.7+ dict insertion order guarantee. The while loop ensures the cache never exceeds cache_size.


257-266: LGTM!

Clean implementation with proper thread-safety and useful return value for diagnostics.


175-213: LGTM!

Excellent docstring with clear explanation of behavior, caveats, and a practical usage example. The type hints are accurate and the function signature is well-designed with sensible defaults.


60-62: Consider calling model_rebuild() on all BaseModel subclasses, not just generics.

Per Pydantic v2 documentation, model_rebuild() should be called on models with forward references to resolve them after all types are defined. The current condition restricts rebuilding to models with __pydantic_generic_metadata__ (present only on generic models), which may skip non-generic models that also need schema resolution. Since the dynamically generated code may contain forward references across any model type, consider rebuilding all extracted BaseModel subclasses unconditionally with the populated namespace.

@koxudaxi koxudaxi merged commit 44934ce into main Jan 2, 2026
39 checks passed
@koxudaxi koxudaxi deleted the feature/dynamic-model-generation-v2 branch January 2, 2026 20:47
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jan 3, 2026

🎉 Released in 0.52.1

This PR is now available in the latest release. See the release notes for details.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants