Add --use-default-factory-for-optional-nested-models option#2711
Add --use-default-factory-for-optional-nested-models option#2711
Conversation
|
Warning Rate limit exceeded@koxudaxi has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 18 minutes and 0 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📒 Files selected for processing (24)
WalkthroughAdds a boolean option Changes
Sequence Diagram(s)mermaid Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Possibly related PRs
Suggested labels
Suggested reviewers
Poem
Pre-merge checks and finishing touches✅ Passed checks (5 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #2711 +/- ##
========================================
Coverage 99.33% 99.34%
========================================
Files 81 81
Lines 11480 11582 +102
Branches 1367 1401 +34
========================================
+ Hits 11404 11506 +102
Misses 45 45
Partials 31 31
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (5)
tests/data/expected/main/jsonschema/default_factory_nested_model_with_dict_dataclass.py (1)
26-26: Consider the semantic implications of union types with default_factory.The
metadatafield has typedict[str, str] | Contact | Nonebut usesdefault_factory=Contact. While syntactically valid, this means the default value is always aContactinstance, never a dict, even though the type signature suggests dict is equally valid. This could be confusing for API consumers who see dict in the type but never encounter it as a default.This appears to be intentional behavior where the generator prioritizes model types over primitive types when selecting default_factory values. Consider documenting this behavior in user-facing docs or comments.
tests/data/expected/main/jsonschema/default_factory_nested_model_with_dict_pydantic_v2.py (1)
23-23: Same union type + default_factory consideration applies here.As with the dataclass variant,
metadatahas typedict[str, str] | Contact | NonewithField(default_factory=Contact). The default will always be aContactinstance, not a dict. This is consistent across output formats but may warrant documentation about the generator's type prioritization strategy.tests/data/expected/main/jsonschema/default_factory_nested_model_with_dict_msgspec.py (1)
23-23: Union type + default_factory pattern consistent across formats.Like the dataclass and Pydantic variants,
metadatahas typedict[str, str] | Contact | UnsetTypewithfield(default_factory=Contact). The consistency across all output formats confirms this is intentional generator behavior, but documenting the type prioritization logic would help users understand why dict is in the type but never the default.src/datamodel_code_generator/model/msgspec.py (1)
250-250: Remove unused# noqa: PLR0912on__str__Ruff flags the
# noqa: PLR0912on__str__as unused. Since PLR0912 isn’t enabled here, the directive is redundant and can be safely removed to reduce noise.Proposed cleanup
- def __str__(self) -> str: # noqa: PLR0912 + def __str__(self) -> str:tests/main/jsonschema/test_main_jsonschema.py (1)
4420-4445: LGTM! Well-structured test for the new feature.The test follows established patterns in this file and appropriately covers three output model types. Consider adding a
@pytest.mark.cli_docdecorator to generate user documentation for this new CLI option, similar to how other feature tests document their flags (see examples liketest_main_use_frozen_fieldat line 4376 ortest_main_jsonschema_type_aliasat line 3628).Optional: Add CLI documentation decorator
+@pytest.mark.cli_doc( + options=["--use-default-factory-for-optional-nested-models"], + input_schema="jsonschema/default_factory_nested_model.json", + cli_args=["--use-default-factory-for-optional-nested-models"], + model_outputs={ + "dataclass": "main/jsonschema/default_factory_nested_model_dataclass.py", + "pydantic_v2": "main/jsonschema/default_factory_nested_model_pydantic_v2.py", + "msgspec": "main/jsonschema/default_factory_nested_model_msgspec.py", + }, +) @pytest.mark.parametrize( ("output_model", "expected_file"), [
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (21)
src/datamodel_code_generator/__init__.py(2 hunks)src/datamodel_code_generator/__main__.py(2 hunks)src/datamodel_code_generator/arguments.py(1 hunks)src/datamodel_code_generator/cli_options.py(1 hunks)src/datamodel_code_generator/model/base.py(1 hunks)src/datamodel_code_generator/model/dataclass.py(2 hunks)src/datamodel_code_generator/model/msgspec.py(3 hunks)src/datamodel_code_generator/model/pydantic/base_model.py(2 hunks)src/datamodel_code_generator/parser/base.py(2 hunks)src/datamodel_code_generator/parser/graphql.py(2 hunks)src/datamodel_code_generator/parser/jsonschema.py(3 hunks)src/datamodel_code_generator/parser/openapi.py(2 hunks)tests/data/expected/main/jsonschema/default_factory_nested_model_dataclass.py(1 hunks)tests/data/expected/main/jsonschema/default_factory_nested_model_msgspec.py(1 hunks)tests/data/expected/main/jsonschema/default_factory_nested_model_pydantic_v2.py(1 hunks)tests/data/expected/main/jsonschema/default_factory_nested_model_with_dict_dataclass.py(1 hunks)tests/data/expected/main/jsonschema/default_factory_nested_model_with_dict_msgspec.py(1 hunks)tests/data/expected/main/jsonschema/default_factory_nested_model_with_dict_pydantic_v2.py(1 hunks)tests/data/jsonschema/default_factory_nested_model.json(1 hunks)tests/data/jsonschema/default_factory_nested_model_with_dict.json(1 hunks)tests/main/jsonschema/test_main_jsonschema.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (7)
src/datamodel_code_generator/model/dataclass.py (4)
src/datamodel_code_generator/parser/base.py (1)
data_type(980-982)src/datamodel_code_generator/reference.py (1)
reference(76-78)src/datamodel_code_generator/model/pydantic/dataclass.py (1)
DataClass(17-21)src/datamodel_code_generator/model/base.py (2)
class_name(646-648)class_name(651-655)
tests/data/expected/main/jsonschema/default_factory_nested_model_msgspec.py (1)
src/datamodel_code_generator/model/msgspec.py (1)
field(243-248)
tests/data/expected/main/jsonschema/default_factory_nested_model_with_dict_pydantic_v2.py (1)
tests/data/expected/main/jsonschema/default_factory_nested_model_pydantic_v2.py (3)
Address(10-12)Contact(15-17)Model(20-23)
tests/main/jsonschema/test_main_jsonschema.py (2)
tests/test_main_kr.py (1)
output_file(44-46)tests/main/conftest.py (2)
output_file(94-96)run_main_and_assert(196-352)
src/datamodel_code_generator/model/pydantic/base_model.py (4)
src/datamodel_code_generator/model/msgspec.py (1)
_get_default_factory_for_optional_nested_model(427-438)src/datamodel_code_generator/parser/base.py (1)
data_type(980-982)src/datamodel_code_generator/reference.py (1)
reference(76-78)src/datamodel_code_generator/model/base.py (2)
class_name(646-648)class_name(651-655)
tests/data/expected/main/jsonschema/default_factory_nested_model_with_dict_msgspec.py (4)
src/datamodel_code_generator/model/msgspec.py (1)
field(243-248)src/datamodel_code_generator/model/base.py (1)
field(305-307)src/datamodel_code_generator/model/dataclass.py (1)
field(139-144)src/datamodel_code_generator/model/pydantic/base_model.py (1)
field(89-103)
src/datamodel_code_generator/model/msgspec.py (4)
src/datamodel_code_generator/model/pydantic/base_model.py (1)
_get_default_factory_for_optional_nested_model(155-166)src/datamodel_code_generator/parser/base.py (1)
data_type(980-982)src/datamodel_code_generator/reference.py (1)
reference(76-78)src/datamodel_code_generator/model/base.py (2)
class_name(646-648)class_name(651-655)
🪛 Ruff (0.14.8)
src/datamodel_code_generator/model/msgspec.py
250-250: Unused noqa directive (non-enabled: PLR0912)
Remove unused noqa directive
(RUF100)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: 3.10 on Ubuntu
- GitHub Check: 3.11 on Ubuntu
- GitHub Check: 3.10 on Windows
- GitHub Check: py312-black24 on Ubuntu
- GitHub Check: py312-black22 on Ubuntu
- GitHub Check: 3.13 on Windows
- GitHub Check: 3.11 on Windows
- GitHub Check: 3.12 on Windows
- GitHub Check: Analyze (python)
- GitHub Check: benchmarks
🔇 Additional comments (21)
tests/data/jsonschema/default_factory_nested_model.json (1)
1-26: LGTM! Well-structured test fixture.The JSON Schema correctly defines a root model with required
nameand optional nestedAddressandContactobjects, providing a clean test case for the new default_factory behavior.tests/data/expected/main/jsonschema/default_factory_nested_model_msgspec.py (1)
1-23: LGTM! Correct msgspec output with default_factory.The generated code properly uses
field(default_factory=Address)andfield(default_factory=Contact)for optional nested models, aligning with the PR's objective to make nested models safely traversable without None checks.tests/data/expected/main/jsonschema/default_factory_nested_model_pydantic_v2.py (1)
1-23: LGTM! Clean Pydantic v2 output with default_factory.The generated code correctly uses
Field(default_factory=Address)andField(default_factory=Contact)for optional nested models, enabling safe field traversal without None checks as intended by the PR.tests/data/expected/main/jsonschema/default_factory_nested_model_dataclass.py (1)
1-26: LGTM! Clean dataclass output with default_factory.The generated code correctly uses
field(default_factory=Address)andfield(default_factory=Contact)for optional nested models, achieving the PR's goal of making nested fields safely traversable.src/datamodel_code_generator/parser/openapi.py (1)
276-276: LGTM! Proper parameter propagation.The new
use_default_factory_for_optional_nested_modelsparameter is correctly added to the signature and forwarded to the base class without introducing any logic issues.Also applies to: 373-373
src/datamodel_code_generator/parser/graphql.py (1)
193-193: LGTM! Parameter propagation is correct.The new
use_default_factory_for_optional_nested_modelsparameter is properly added to the GraphQL parser signature with a safe default ofFalseand correctly threaded through to the parent class.Also applies to: 288-288
src/datamodel_code_generator/model/base.py (1)
169-169: LGTM! Configuration field added correctly.The new boolean field is properly added to
DataModelFieldBasewith a safe default value ofFalse, maintaining backward compatibility.src/datamodel_code_generator/cli_options.py (1)
95-97: LGTM! CLI metadata entry is well-formed.The documentation metadata for the new option is correctly structured and categorized under Model Customization, following the established pattern.
src/datamodel_code_generator/arguments.py (1)
597-603: LGTM! CLI argument definition is well-structured.The argparse definition correctly implements the flag with:
- Clear, descriptive help text with concrete examples
- Appropriate
store_trueaction for boolean behavior- Default of
Noneto allow config file overrides- Proper placement in the field_options group
src/datamodel_code_generator/__main__.py (1)
462-462: LGTM! Config propagation is complete.The new configuration field is properly:
- Declared in the Config model with the correct type and default value
- Passed through to the
generate()function inrun_generate_from_config()Also applies to: 765-765
src/datamodel_code_generator/__init__.py (1)
473-473: LGTM! Public API extension is complete.The new parameter is correctly:
- Added to the
generate()function signature with a safe default ofFalse- Forwarded to the parser instantiation, completing the propagation chain
Also applies to: 721-721
src/datamodel_code_generator/model/pydantic/base_model.py (2)
155-166: LGTM! Helper method correctly identifies nested model types.The
_get_default_factory_for_optional_nested_model()helper properly:
- Iterates through data types to find BaseModel references
- Skips dict types (which cannot be used as default_factory)
- Returns the appropriate class name or alias for use as a factory
- Returns None when no suitable model is found
This implementation aligns with the similar pattern used in msgspec.py.
220-227: LGTM! Default factory logic is correctly implemented.The conditional logic properly applies the default_factory when all conditions are met:
- No explicit default_factory is already set
- The feature flag is enabled
- The field is optional (not required)
- No explicit default value is provided (None or UNDEFINED)
This correctly implements the desired behavior from the linked issue #1426.
tests/data/jsonschema/default_factory_nested_model_with_dict.json (1)
1-31: LGTM! Test input covers important edge cases.This JSON Schema test file properly validates the new feature with:
- Nested model references (
Address,Contact)- Optional fields to trigger default_factory logic
- A
oneOfcontaining both a dict type and a model reference (themetadatafield)This edge case ensures the implementation correctly handles unions with dict types, which should be skipped when determining default_factory candidates.
src/datamodel_code_generator/model/dataclass.py (1)
146-157: Helper for resolving nested dataclass default_factory looks correctThe
_get_default_factory_for_nested_model()implementation is consistent with the existing Pydantic helper (skips dicts, prefers alias over class_name) and should behave as expected for nestedDataClassreferences.src/datamodel_code_generator/model/msgspec.py (2)
250-298:__str__default_factory handling for optional nested Structs looks goodThe new block that sets
data["default_factory"]for optional nested Struct fields integrates cleanly with existing default/default_factory handling:
- It only applies when
use_default_factory_for_optional_nested_modelsis enabled, the field is not required, has no explicit default/default_factory, and resolves to a Struct-backed data type.- It correctly removes any incidental
"default"value (e.g.,UNSET/None) before emittingfield(default_factory=NestedStruct).This matches the Pydantic helper’s behavior and keeps the rest of the kwargs machinery (including alias/name) intact.
427-438: Nested Struct default_factory resolver is consistent with existing patterns
_get_default_factory_for_optional_nested_model()correctly:
- Ignores dict-shaped data types (
is_dict).- Returns either the alias or the Struct class_name for referenced Struct-backed types.
This mirrors the existing pydantic helper (see
model/pydantic/base_model.py) and should give consistent behavior across backends.src/datamodel_code_generator/parser/jsonschema.py (2)
599-605: JsonSchema parser correctly threads the new flag into the baseParserThe new
use_default_factory_for_optional_nested_modelsparameter is added toJsonSchemaParser.__init__and forwarded tosuper().__init__with the same name. This keeps the Parser API aligned and ensures the flag is available onselffor downstream logic (e.g., inget_object_field).Also applies to: 608-701
1008-1041: Passing the flag intoget_object_fieldcorrectly targets nested fields
get_object_field()now passesuse_default_factory_for_optional_nested_models=self.use_default_factory_for_optional_nested_modelsintodata_model_field_type, so only object properties (i.e., true “nested fields”) opt into the new behavior. Array/root and other special-case fields that bypassget_object_fieldremain unchanged, which matches the intended scope of the feature.src/datamodel_code_generator/parser/base.py (1)
737-743: Flag plumbing onParseris correctAdding
use_default_factory_for_optional_nested_modelsat the tail ofParser.__init__and storing it onselfis API-safe and gives subclasses access to the flag. All parser subclasses (JsonSchemaParser, GraphQLParser, OpenAPIParser) accept and forward this kwarg tosuper().__init__(), so there is no divergence.tests/main/jsonschema/test_main_jsonschema.py (1)
4448-4473: LGTM! Good edge case coverage.This test appropriately covers the behavior when optional nested models are in a union with dict types. The parametrization and test structure are correct.
306b1d4 to
80465df
Compare
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/datamodel_code_generator/model/msgspec.py (1)
250-260: Optional nested Structdefault_factorybehavior is sound; remove unusednoqaThe new
_get_default_factory_for_optional_nested_modelhelper and the__str__block correctly:
- Detect nested
Struct-backed types (skipping dicts) and pickaliasorclass_name.- Only inject
default_factoryfor non‑required fields whosedefaultisNone/UNDEFINEDand where no explicitdefault_factoryis present.- Drop any synthesized
"default"entry when switching todefault_factory, keeping the finalfield(...)call coherent.One small cleanup: Ruff reports the
# noqa: PLR0912on__str__as unused. Since that rule isn’t enabled anymore, it can be removed to satisfy RUF100.Proposed diff to drop unused
noqa- def __str__(self) -> str: # noqa: PLR0912 + def __str__(self) -> str:Also applies to: 287-297, 426-437
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (21)
src/datamodel_code_generator/__init__.py(2 hunks)src/datamodel_code_generator/__main__.py(2 hunks)src/datamodel_code_generator/arguments.py(1 hunks)src/datamodel_code_generator/cli_options.py(1 hunks)src/datamodel_code_generator/model/base.py(1 hunks)src/datamodel_code_generator/model/dataclass.py(2 hunks)src/datamodel_code_generator/model/msgspec.py(3 hunks)src/datamodel_code_generator/model/pydantic/base_model.py(2 hunks)src/datamodel_code_generator/parser/base.py(2 hunks)src/datamodel_code_generator/parser/graphql.py(2 hunks)src/datamodel_code_generator/parser/jsonschema.py(3 hunks)src/datamodel_code_generator/parser/openapi.py(2 hunks)tests/data/expected/main/jsonschema/default_factory_nested_model_dataclass.py(1 hunks)tests/data/expected/main/jsonschema/default_factory_nested_model_msgspec.py(1 hunks)tests/data/expected/main/jsonschema/default_factory_nested_model_pydantic_v2.py(1 hunks)tests/data/expected/main/jsonschema/default_factory_nested_model_with_dict_dataclass.py(1 hunks)tests/data/expected/main/jsonschema/default_factory_nested_model_with_dict_msgspec.py(1 hunks)tests/data/expected/main/jsonschema/default_factory_nested_model_with_dict_pydantic_v2.py(1 hunks)tests/data/jsonschema/default_factory_nested_model.json(1 hunks)tests/data/jsonschema/default_factory_nested_model_with_dict.json(1 hunks)tests/main/jsonschema/test_main_jsonschema.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (17)
- tests/data/expected/main/jsonschema/default_factory_nested_model_dataclass.py
- src/datamodel_code_generator/arguments.py
- tests/data/jsonschema/default_factory_nested_model_with_dict.json
- tests/data/expected/main/jsonschema/default_factory_nested_model_pydantic_v2.py
- src/datamodel_code_generator/parser/openapi.py
- tests/data/expected/main/jsonschema/default_factory_nested_model_with_dict_dataclass.py
- src/datamodel_code_generator/cli_options.py
- src/datamodel_code_generator/init.py
- tests/data/jsonschema/default_factory_nested_model.json
- tests/data/expected/main/jsonschema/default_factory_nested_model_with_dict_msgspec.py
- src/datamodel_code_generator/parser/base.py
- src/datamodel_code_generator/parser/graphql.py
- src/datamodel_code_generator/model/pydantic/base_model.py
- tests/data/expected/main/jsonschema/default_factory_nested_model_with_dict_pydantic_v2.py
- src/datamodel_code_generator/model/base.py
- tests/data/expected/main/jsonschema/default_factory_nested_model_msgspec.py
- src/datamodel_code_generator/parser/jsonschema.py
🧰 Additional context used
🧬 Code graph analysis (3)
src/datamodel_code_generator/model/msgspec.py (4)
src/datamodel_code_generator/model/pydantic/base_model.py (1)
_get_default_factory_for_optional_nested_model(155-166)src/datamodel_code_generator/parser/base.py (1)
data_type(980-982)src/datamodel_code_generator/reference.py (1)
reference(76-78)src/datamodel_code_generator/model/base.py (2)
class_name(646-648)class_name(651-655)
src/datamodel_code_generator/model/dataclass.py (3)
src/datamodel_code_generator/parser/base.py (1)
data_type(980-982)src/datamodel_code_generator/reference.py (1)
reference(76-78)src/datamodel_code_generator/model/base.py (2)
class_name(646-648)class_name(651-655)
tests/main/jsonschema/test_main_jsonschema.py (2)
tests/test_main_kr.py (1)
output_file(44-46)tests/main/conftest.py (2)
output_file(94-96)run_main_and_assert(196-352)
🪛 Ruff (0.14.8)
src/datamodel_code_generator/model/msgspec.py
250-250: Unused noqa directive (non-enabled: PLR0912)
Remove unused noqa directive
(RUF100)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: 3.12 on Windows
- GitHub Check: benchmarks
🔇 Additional comments (3)
src/datamodel_code_generator/model/dataclass.py (1)
146-157: Nested dataclassdefault_factorywiring looks correct and consistentThe
_get_default_factory_for_nested_modelhelper and the new__str__branch correctly:
- Detect nested
DataClassreferences while skipping dict-like types.- Only add
default_factoryfor non‑required fields withdefaultofNone/UNDEFINED, without overriding any explicitdefault_factoryor other kwargs.- Preserve the existing
field(...)formatting path so additional options (metadata, kw_only, etc.) are still honored.This matches the behavior implemented for Pydantic/msgspec fields and aligns with the new flag’s intent.
Also applies to: 177-185
src/datamodel_code_generator/__main__.py (1)
462-462: Config flag and plumbing intogenerate()look consistentThe new
use_default_factory_for_optional_nested_modelsfield onConfigand its propagation intorun_generate_from_config()are wired consistently with existing boolean options and should expose the flag correctly to the generator without affecting existing behavior.Also applies to: 665-777
tests/main/jsonschema/test_main_jsonschema.py (1)
4420-4473: New JSON Schema tests for default_factory flag are well‑scopedThe two parametrized tests cover the new
--use-default-factory-for-optional-nested-modelsflag across dataclasses, Pydantic v2, and msgspec, including the union‑with‑dict case. They follow existingrun_main_and_assertpatterns and are correctly guarded withLEGACY_BLACK_SKIP, so they should integrate cleanly into the current test suite.
80465df to
1cc29ee
Compare
🤖 Generated by GitHub Actions
Fixes: #1426
Summary by CodeRabbit
New Features
Tests
✏️ Tip: You can customize this high-level summary in your review settings.