feat: Add --module-split-mode option to generate one file per model (#1170) #2685

Merged
koxudaxi merged 11 commits into main from feature/module-split-mode-single
Dec 18, 2025
@koxudaxi
Owner

@koxudaxi koxudaxi commented Dec 18, 2025

Summary by CodeRabbit

Release Notes

  • New Features

    • Added --module-split-mode CLI option to control how generated model classes are organized in output (single file or separate files per model).
  • Documentation

    • Updated CLI reference documentation with the new --module-split-mode option, including configuration examples and usage guidance.

@coderabbitai

coderabbitai Bot commented Dec 18, 2025

Walkthrough

This PR introduces a --module-split-mode CLI option that enables splitting generated models into separate files (one per model class). The feature includes new enum and parameter definitions, CLI argument registration, parser logic modifications for import resolution, and supporting test data.

Changes

| Cohort / File(s) | Change Summary |
|---|---|
| Documentation: docs/cli-reference/general-options.md, docs/cli-reference/index.md, docs/cli-reference/quick-reference.md | Adds CLI option documentation for --module-split-mode across all reference documents, updating table entries, navigation indexes, and alphabetical listings. |
| Public API & Enum: src/datamodel_code_generator/__init__.py | Introduces ModuleSplitMode enum with Single value; extends generate() signature with module_split_mode parameter; exports new enum in __all__. |
| CLI Configuration: src/datamodel_code_generator/__main__.py, src/datamodel_code_generator/arguments.py, src/datamodel_code_generator/cli_options.py | Adds --module-split-mode argument to CLI parser; extends Config model with module_split_mode field; registers option metadata in CLI option registry. |
| Core Parser Logic: src/datamodel_code_generator/parser/base.py | Threads module_split_mode through parse() method; extends __change_from_import() to accept model_path_to_module_name mapping; adjusts import resolution, module naming, and sorting logic for per-model file splitting when Single mode is active. |
| Utility Refactoring: src/datamodel_code_generator/util.py, src/datamodel_code_generator/reference.py | Moves camel_to_snake function from reference.py to util.py with added LRU caching; updates imports accordingly. |
| Test Data & Test Cases: tests/data/jsonschema/module_split_single/input.json, tests/data/expected/main/jsonschema/module_split_single/*, tests/main/test_main_general.py | Adds JSON Schema input fixture and corresponding expected output files (module split across __init__.py, model.py, order.py, user.py); implements end-to-end test exercising --module-split-mode=single. |
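
As a rough illustration of the "Public API & Enum" and "CLI Configuration" rows, the sketch below shows one common way to wire an enum-backed option into argparse. It is not the project's code. The string value "single" is an assumption inferred from the --module-split-mode=single usage in the test row, and parse_split_mode is a hypothetical helper introduced only for this example.

```python
from __future__ import annotations

import argparse
from enum import Enum


class ModuleSplitMode(Enum):
    # Assumed value, inferred from the `--module-split-mode=single` CLI usage.
    Single = "single"


def parse_split_mode(argv: list[str]) -> ModuleSplitMode | None:
    """Hypothetical helper: parse only the split-mode flag from argv."""
    parser = argparse.ArgumentParser(prog="sketch")
    parser.add_argument(
        "--module-split-mode",
        choices=[mode.value for mode in ModuleSplitMode],
        default=None,  # omitting the flag leaves splitting disabled
    )
    raw = parser.parse_args(argv).module_split_mode
    # Convert the validated string into the enum member, or keep None.
    return ModuleSplitMode(raw) if raw is not None else None
```

Returning None when the flag is absent mirrors the backward-compatibility point raised in the review: callers that never pass the option should see no change in behavior.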

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~65 minutes

Areas requiring extra attention:

  • src/datamodel_code_generator/parser/base.py: Dense logic changes to import resolution pathways, module naming, and path calculations when module_split_mode.Single is active. Pay close attention to:
    • New model_path_to_module_name parameter threading and its use in relative import calculations
    • Module sorting key derivation with camel_to_snake applied to class names
    • Adjustments to ancestor checks and module path depth calculations
    • Backward-compatibility preservation when module_split_mode is not provided
  • __change_from_import method signature: New optional parameter with substantial logic impact across multiple conditional branches
  • Test coverage validation: Ensure expected output files correctly reflect module split structure and import statements

Poem

🐰 A rabbit hops through code today,
With modes that split the gen-code way—
One model, one file—crisp and clean!
The finest split you've ever seen. ✨
Module dance? We've got the groove—
One class per file, we're in the move!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 60.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: adding a --module-split-mode option to generate one file per model, which is the core feature across all modified files. |
🤖 Generated by GitHub Actions
Comment thread src/datamodel_code_generator/parser/base.py Fixed
@codspeed-hq

codspeed-hq Bot commented Dec 18, 2025

CodSpeed Performance Report

Merging #2685 will not alter performance

Comparing feature/module-split-mode-single (71c7b0c) with main (0574548)

Summary

✅ 52 untouched
⏩ 10 skipped (see footnote 1)

Footnotes

  1. 10 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.

@codecov

codecov Bot commented Dec 18, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.53%. Comparing base (0574548) to head (71c7b0c).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2685   +/-   ##
=======================================
  Coverage   99.53%   99.53%           
=======================================
  Files          81       81           
  Lines       11334    11358   +24     
  Branches     1354     1357    +3     
=======================================
+ Hits        11281    11305   +24     
  Misses         32       32           
  Partials       21       21           
Flag Coverage Δ
unittests 99.53% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown.

@koxudaxi koxudaxi force-pushed the feature/module-split-mode-single branch from 5797470 to 456dab0 Compare December 18, 2025 16:07
Comment thread src/datamodel_code_generator/util.py Fixed

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (5)
docs/cli-reference/general-options.md (1)

17-17: Clarify --module-split-mode defaults and constraints

The new docs read well and align with the generated example. Two small clarifications would help advanced users:

  • Explicitly state that omitting --module-split-mode keeps the existing behavior (no per-model splitting) and that single is currently the only supported value.
  • Call out that --module-split-mode=single requires an output directory (not a single file path), since modular output is assumed downstream.

These are documentation-only tweaks; the implementation looks fine.

Also applies to: 1676-1784

src/datamodel_code_generator/util.py (1)

10-11: Shared camel_to_snake helper looks correct; consider explicit test coverage

The regex-based implementation and caching are sound and appropriate for reuse across the codebase. To guard against regressions, especially for acronyms (HTTPResponse → http_response) and mixed alphanumerics, consider adding a small focused test for camel_to_snake if one doesn’t already exist via the module-splitting tests.

Also applies to: 147-155
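
For readers unfamiliar with the helper, here is a minimal sketch of a cached, regex-based camel_to_snake using the widely used two-pass approach. The two patterns are an assumption chosen for illustration and may differ from the ones in util.py; the assertions show the kind of focused test suggested above.

```python
import re
from functools import lru_cache

# Pass 1 splits before a capitalized word: "HTTPResponse" -> "HTTP_Response".
_FIRST_PASS = re.compile(r"(.)([A-Z][a-z]+)")
# Pass 2 splits a lowercase letter or digit from a following capital: "userID" -> "user_ID".
_SECOND_PASS = re.compile(r"([a-z0-9])([A-Z])")


@lru_cache(maxsize=None)
def camel_to_snake(name: str) -> str:
    """Convert CamelCase or mixedCase to snake_case (illustrative sketch)."""
    partial = _FIRST_PASS.sub(r"\1_\2", name)
    return _SECOND_PASS.sub(r"\1_\2", partial).lower()


# Acronyms and mixed alphanumerics, the cases called out in the review.
assert camel_to_snake("HTTPResponse") == "http_response"
assert camel_to_snake("Item2Price") == "item2_price"
assert camel_to_snake("userID") == "user_id"
```

Because class names repeat heavily during generation, caching on the input string avoids re-running both substitutions for every reference to the same model.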

src/datamodel_code_generator/__main__.py (1)

24-35: Config/CLI wiring for module_split_mode is correct; drop the new noqa

The new ModuleSplitMode import, Config.module_split_mode field, and propagation into generate() via run_generate_from_config are consistent with how other enum-style options are handled and look correct.

Ruff is flagging the new # noqa: UP045 on module_split_mode as an unused directive; unlike the legacy ones, this line is newly introduced in this PR, so it’s easy to keep it clean.

Suggested diff to remove the redundant noqa on the new field
-    module_split_mode: Optional[ModuleSplitMode] = None  # noqa: UP045
+    module_split_mode: Optional[ModuleSplitMode] = None

Also applies to: 468-468, 663-673, 770-771

src/datamodel_code_generator/parser/base.py (2)

26-35: Module-split Single grouping and import resolution are consistent with existing design

The parser changes for ModuleSplitMode.Single are internally coherent:

  • Grouping models per file via module_key() using camel_to_snake(data_model.class_name) yields one logical module per class while preserving directory structure (*data_model.module_path).
  • model_path_to_module_name correctly decouples a model’s logical module name from its original module_name, so __change_from_import() computes relative imports against the per-class module when split mode is enabled, and falls back to the original behavior otherwise.
  • The use of target_full_name and the existing relative()/exact_import() helpers ensures that with use_exact_imports=True you get clean imports like from .user import User, matching the new documentation example.
  • Interaction with circular-import handling and tree-scope reuse appears safe: when models are relocated into _internal/shared modules, the absence of a mapping causes the code to fall back to data_type.full_name, which is updated with the new path, so imports still resolve correctly.

Given the complexity, it’s worth making sure there is test coverage for:

  • --module-split-mode=single both with and without --use-exact-imports, and
  • combinations with --reuse-model --reuse-scope=tree and scenarios that previously triggered circular-import resolution.

But the current implementation itself looks sound.
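
The per-class grouping described in the first bullet can be sketched as follows. This is a simplified model of the idea, not the code in parser/base.py: FakeModel is a hypothetical stand-in for DataModel, group_per_model is an invented helper, and the inline camel_to_snake uses assumed regexes.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeModel:
    """Hypothetical stand-in for DataModel with only the fields used here."""

    class_name: str
    module_path: tuple[str, ...] = ()


def camel_to_snake(name: str) -> str:
    # Assumed two-pass conversion; the project's helper may differ.
    partial = re.sub(r"(.)([A-Z][a-z]+)", r"\1_\2", name)
    return re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", partial).lower()


def module_key(model: FakeModel) -> tuple[str, ...]:
    # Keep the directory structure and append one module name per class.
    return (*model.module_path, camel_to_snake(model.class_name))


def group_per_model(models: list[FakeModel]) -> dict[tuple[str, ...], list[str]]:
    """Group class names by their per-class module key, in sorted key order."""
    grouped: dict[tuple[str, ...], list[str]] = {}
    for model in sorted(models, key=module_key):
        grouped.setdefault(module_key(model), []).append(model.class_name)
    return grouped
```

With top-level models named Model, Order, and User, the keys come out as ("model",), ("order",), and ("user",), which lines up with the model.py, order.py, and user.py fixtures listed for this PR.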

Also applies to: 72-72, 1066-1150, 2411-2421, 2454-2463, 2475-2482, 2546-2553


1066-1066: Remove unused noqa codes on __change_from_import

Ruff reports the # noqa: PLR0913, PLR0914 on __change_from_import as unused because those rules aren’t enabled. Since this function is already long and complex, keeping the signature as-is is fine, but the noqa itself is now just noise.

Suggested diff to drop the unused noqa
-    def __change_from_import(  # noqa: PLR0913, PLR0914
+    def __change_from_import(
         self,
         models: list[DataModel],
         imports: Imports,
         scoped_model_resolver: ModelResolver,
         *,
         init: bool,
         internal_modules: set[tuple[str, ...]] | None = None,
         model_path_to_module_name: dict[str, str] | None = None,
     ) -> None:
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0574548 and 71c7b0c.

📒 Files selected for processing (16)
  • docs/cli-reference/general-options.md (2 hunks)
  • docs/cli-reference/index.md (2 hunks)
  • docs/cli-reference/quick-reference.md (2 hunks)
  • src/datamodel_code_generator/__init__.py (4 hunks)
  • src/datamodel_code_generator/__main__.py (3 hunks)
  • src/datamodel_code_generator/arguments.py (2 hunks)
  • src/datamodel_code_generator/cli_options.py (1 hunks)
  • src/datamodel_code_generator/parser/base.py (8 hunks)
  • src/datamodel_code_generator/reference.py (1 hunks)
  • src/datamodel_code_generator/util.py (2 hunks)
  • tests/data/expected/main/jsonschema/module_split_single/__init__.py (1 hunks)
  • tests/data/expected/main/jsonschema/module_split_single/model.py (1 hunks)
  • tests/data/expected/main/jsonschema/module_split_single/order.py (1 hunks)
  • tests/data/expected/main/jsonschema/module_split_single/user.py (1 hunks)
  • tests/data/jsonschema/module_split_single/input.json (1 hunks)
  • tests/main/test_main_general.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (8)
tests/data/expected/main/jsonschema/module_split_single/order.py (2)
src/datamodel_code_generator/util.py (1)
  • BaseModel (140-144)
tests/data/expected/main/jsonschema/module_split_single/user.py (1)
  • User (11-13)
src/datamodel_code_generator/__main__.py (1)
src/datamodel_code_generator/__init__.py (1)
  • ModuleSplitMode (303-309)
src/datamodel_code_generator/arguments.py (1)
src/datamodel_code_generator/__init__.py (1)
  • ModuleSplitMode (303-309)
tests/data/expected/main/jsonschema/module_split_single/__init__.py (3)
tests/data/expected/main/jsonschema/module_split_single/model.py (1)
  • Model (11-12)
tests/data/expected/main/jsonschema/module_split_single/order.py (1)
  • Order (13-15)
tests/data/expected/main/jsonschema/module_split_single/user.py (1)
  • User (11-13)
tests/data/expected/main/jsonschema/module_split_single/user.py (1)
src/datamodel_code_generator/model/base.py (1)
  • name (599-601)
src/datamodel_code_generator/parser/base.py (2)
src/datamodel_code_generator/__init__.py (1)
  • ModuleSplitMode (303-309)
src/datamodel_code_generator/util.py (1)
  • camel_to_snake (152-155)
src/datamodel_code_generator/reference.py (1)
src/datamodel_code_generator/util.py (1)
  • camel_to_snake (152-155)
src/datamodel_code_generator/__init__.py (1)
src/datamodel_code_generator/model/enum.py (1)
  • Enum (39-120)
🪛 Ruff (0.14.8)
src/datamodel_code_generator/__main__.py

468-468: Unused noqa directive (non-enabled: UP045)

Remove unused noqa directive

(RUF100)

src/datamodel_code_generator/parser/base.py

1066-1066: Unused noqa directive (non-enabled: PLR0913, PLR0914)

Remove unused noqa directive

(RUF100)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: 3.14 on Windows
  • GitHub Check: dev
  • GitHub Check: benchmarks
  • GitHub Check: Analyze (python)
🔇 Additional comments (10)
src/datamodel_code_generator/reference.py (1)

39-39: LGTM! Good refactoring to centralize utility function.

Moving camel_to_snake to util.py eliminates code duplication and provides caching benefits via @lru_cache as shown in the relevant code snippets.

Also applies to: 284-284

src/datamodel_code_generator/arguments.py (2)

25-25: LGTM! Import follows established pattern.

The ModuleSplitMode import aligns with other enum imports in this module.


327-332: LGTM! CLI argument definition follows established patterns.

The argument structure is consistent with other enum-based options like --all-exports-collision-strategy. Help text clearly describes the functionality.

tests/main/test_main_general.py (1)

1032-1059: LGTM! Test follows established patterns.

The test structure is consistent with other CLI tests in this file. The cli_doc marker provides proper documentation metadata, and the test correctly exercises the module-split functionality with appropriate flag combinations.

src/datamodel_code_generator/cli_options.py (1)

209-209: LGTM! Metadata entry correctly structured.

The CLI option metadata follows the established pattern and is appropriately categorized as OptionCategory.GENERAL.

docs/cli-reference/quick-reference.md (1)

147-147: LGTM! Documentation entries are clear and correctly formatted.

The option is properly documented in both the categorized list and alphabetical index with consistent formatting and appropriate descriptions.

Also applies to: 218-218

tests/data/expected/main/jsonschema/module_split_single/order.py (1)

1-15: Order fixture matches per-model split expectations

The model definition, type hints, and relative import from .user all look correct for the module_split_single layout; nothing to change here.

docs/cli-reference/index.md (1)

17-18: Index entry for --module-split-mode is consistent

The category count, jump index, and M section entry all correctly reference the new option and its anchor. No further changes needed.

Also applies to: 22-22, 105-108

tests/data/expected/main/jsonschema/module_split_single/__init__.py (1)

1-14: Re-exporting models from per-file modules looks correct

The generated __init__.py cleanly re-exports Model, Order, and User from their respective modules and defines __all__ accordingly; this matches the documented split-mode example.

src/datamodel_code_generator/__init__.py (1)

303-310: ModuleSplitMode integration into the public API looks solid

Defining ModuleSplitMode here, threading it through generate() into parser.parse(), and exporting it via __all__ is consistent with the existing enum-based configuration pattern and should be fully backwards-compatible for callers that don’t use the new option.

Also applies to: 382-483, 725-732, 860-875

2 participants