Add --use-standard-primitive-types option #2736

Merged
koxudaxi merged 2 commits into main from feature/use-standard-primitive-types
Dec 22, 2025

Conversation

@koxudaxi
Owner

@koxudaxi koxudaxi commented Dec 22, 2025

Fixes: #1784

Summary by CodeRabbit

Release Notes

  • New Features

    • Added --use-standard-primitive-types CLI option to use Python standard library types (UUID, IPv4Address, IPv6Address, Path) for string format fields instead of plain strings in dataclass, msgspec, and TypedDict model outputs. Pydantic models already use these types by default.
  • Documentation

    • Updated CLI reference and typing customization documentation to describe the new --use-standard-primitive-types option and its effects.

@coderabbitai

coderabbitai Bot commented Dec 22, 2025

Walkthrough

Adds a new CLI flag --use-standard-primitive-types to enable dataclass and msgspec models to use Python standard library types (UUID, IPv4Address, IPv6Address, Path) for string formats instead of plain str, achieving parity with Pydantic's type handling. The flag is threaded through CLI arguments, config, parsers, and type managers to enable conditional primitive-type mapping.
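As a rough illustration of the effect described in this walkthrough, the sketch below contrasts a dataclass whose formatted fields stay plain str with one that uses the standard library types. The class and field names are hypothetical and are not copied from the PR's golden output.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address
from pathlib import Path
from uuid import UUID


@dataclass
class ModelWithoutFlag:
    # Hypothetical shape without the flag: formatted strings stay plain str.
    id: str
    address: str
    location: str


@dataclass
class ModelWithFlag:
    # Hypothetical shape with --use-standard-primitive-types enabled.
    id: UUID
    address: IPv4Address
    location: Path


# Building an instance with real standard library values.
record = ModelWithFlag(
    id=UUID("12345678-1234-5678-1234-567812345678"),
    address=IPv4Address("192.0.2.1"),
    location=Path("/tmp/example"),
)
```

Dataclasses do not validate or coerce values at runtime, so the annotations mainly help type checkers and any serialization layer that reads them.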

Changes

| Cohort / File(s) | Change Summary |
| --- | --- |
| Documentation Updates: docs/cli-reference/index.md, docs/cli-reference/quick-reference.md, docs/cli-reference/typing-customization.md | Added CLI option documentation and reference for --use-standard-primitive-types flag with descriptions, usage examples, and affected output formats. |
| CLI Infrastructure: src/datamodel_code_generator/arguments.py, src/datamodel_code_generator/cli_options.py | Added --use-standard-primitive-types argument definition and CLI option metadata entry under Typing category. |
| Core Generate & Config: src/datamodel_code_generator/__init__.py, src/datamodel_code_generator/__main__.py | Added use_standard_primitive_types parameter to generate() function and Config class, with propagation to parser instantiation and config flow. |
| Type System Foundation: src/datamodel_code_generator/imports.py, src/datamodel_code_generator/types.py | Added Import constants for IPv4Address, IPv6Address, IPv4Network, IPv6Network; introduced standard_primitive_type_map_factory() function and threaded use_standard_primitive_types through base DataTypeManager. |
| Dataclass Model: src/datamodel_code_generator/model/dataclass.py | Added use_standard_primitive_types parameter to DataClass and DataTypeManager constructors; integrated standard primitive type mappings into type map when flag is enabled. |
| Msgspec Model: src/datamodel_code_generator/model/msgspec.py | Added use_standard_primitive_types parameter and conditional standard primitive type map integration into DataTypeManager. |
| Pydantic Type Managers: src/datamodel_code_generator/model/pydantic/types.py, src/datamodel_code_generator/model/pydantic_v2/types.py | Added use_standard_primitive_types parameter to DataTypeManager constructors; refactored to keyword arguments in pydantic/types.py. |
| Parser Classes: src/datamodel_code_generator/parser/base.py, src/datamodel_code_generator/parser/graphql.py, src/datamodel_code_generator/parser/jsonschema.py, src/datamodel_code_generator/parser/openapi.py | Added use_standard_primitive_types parameter to parser __init__ signatures and propagated through superclass initialization. |
| Test Data & Tests: tests/data/jsonschema/use_standard_primitive_types.json, tests/data/expected/main/use_standard_primitive_types.py, tests/main/test_main_general.py | Added JSON Schema input with uuid, ipv4, and path formats; generated dataclass output with standard library types; added test case validating end-to-end feature functionality. |

Sequence Diagram

sequenceDiagram
    actor User
    participant CLI as CLI Arguments
    participant Config as Config
    participant Generator as generate()
    participant Parser as Parser
    participant DTM as DataTypeManager
    participant TypeMap as Type Map Factory
    
    User->>CLI: --use-standard-primitive-types
    CLI->>Config: use_standard_primitive_types=True
    Config->>Generator: run_generate_from_config(use_standard_primitive_types=True)
    Generator->>Parser: Parser(..., use_standard_primitive_types=True)
    Parser->>DTM: DataTypeManager(..., use_standard_primitive_types=True)
    DTM->>TypeMap: standard_primitive_type_map_factory()
    TypeMap-->>DTM: {UUID→Type, IPv4Address→Type, ...}
    DTM->>DTM: Merge primitive mappings into type_map
    DTM-->>Parser: Configured type mappings
    Parser-->>Generator: Generate models with standard lib types
    Generator-->>User: Output with UUID, IPv4Address, Path, etc.
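The flow in the sequence diagram can be sketched as a flag that each layer accepts and forwards unchanged. This is a minimal sketch with hypothetical stand-in classes (SketchTypeManager, SketchParser, sketch_generate); the real signatures in datamodel_code_generator carry many more parameters.

```python
class SketchTypeManager:
    """Hypothetical stand-in for a DataTypeManager."""

    def __init__(self, *, use_standard_primitive_types: bool = False) -> None:
        self.use_standard_primitive_types = use_standard_primitive_types


class SketchParser:
    """Hypothetical stand-in for a parser that owns a type manager."""

    def __init__(self, source: str, *, use_standard_primitive_types: bool = False) -> None:
        self.source = source
        # The parser does not interpret the flag; it only forwards it.
        self.type_manager = SketchTypeManager(
            use_standard_primitive_types=use_standard_primitive_types
        )


def sketch_generate(source: str, *, use_standard_primitive_types: bool = False) -> SketchParser:
    # Entry point: the False default keeps existing behaviour unless opted in.
    return SketchParser(source, use_standard_primitive_types=use_standard_primitive_types)
```

Because every layer defaults the flag to False, callers that never mention it see no change in behaviour.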

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Repetitive pattern: The same parameter addition and threading pattern is applied across ~10 files (parsers, type managers), which reduces cognitive load per file.
  • New logic: standard_primitive_type_map_factory() and integration into type map construction requires careful review to ensure correct type mappings.
  • File spread: Changes span documentation, CLI infrastructure, core generation, type system, and multiple model backends, requiring familiarity with multiple code areas.
  • Special cases: pydantic_v2/types.py adds the parameter but doesn't consume it—verify if this is intentional or incomplete.
  • Test coverage: Duplication in test_main_general.py (test function appears twice) should be consolidated.

Possibly related PRs

  • PR #2673: Extends the CLI documentation infrastructure and CLI_OPTION_META structure used by the new --use-standard-primitive-types flag registration.
  • PR #2733: Adds another CLI/config option and modifies the same parser/DataTypeManager constructor call sites and threading pattern as this PR.
  • PR #2688: Introduces a related CLI flag and modifies the same code paths (generate/__init__, __main__, arguments, cli_options, and OpenAPIParser.__init__).

Suggested labels

safe-to-fix

Suggested reviewers

  • ilovelinux

Poem

🐰 Hops with joy at types so clear,
No more strings for formats dear!
UUIDs bloom and paths take flight,
Dataclasses now shine so bright!
Standard lib types, here at last,
Parity with Pydantic—cast!

Pre-merge checks and finishing touches

✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The pull request title directly and specifically describes the main change: adding a new CLI option --use-standard-primitive-types to use standard library types for formatted strings. |
| Linked Issues check | ✅ Passed | The PR successfully implements the core objective from issue #1784: enabling dataclass generation to use standard library types (UUID, IPv4Address, Path) for formatted string fields instead of plain str, matching Pydantic behavior. |
| Out of Scope Changes check | ✅ Passed | All changes are within scope and directly support the --use-standard-primitive-types feature, from CLI arguments to parser initialization to type mapping and test cases. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

@codecov

codecov Bot commented Dec 22, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.36%. Comparing base (a231d46) to head (568c21e).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2736   +/-   ##
=======================================
  Coverage   99.36%   99.36%           
=======================================
  Files          83       83           
  Lines       12107    12123   +16     
  Branches     1458     1458           
=======================================
+ Hits        12030    12046   +16     
  Misses         45       45           
  Partials       32       32           
Flag Coverage Δ
unittests 99.36% <100.00%> (+<0.01%) ⬆️

@codspeed-hq

codspeed-hq Bot commented Dec 22, 2025

CodSpeed Performance Report

Merging #2736 will not alter performance

Comparing feature/use-standard-primitive-types (568c21e) with main (a231d46)

Summary

✅ 70 untouched
⏩ 10 skipped (see footnote 1)

Footnotes

  1. 10 benchmarks were skipped, so the baseline results were used instead.


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/datamodel_code_generator/model/pydantic/types.py (1)

173-201: Clarify intentional no-op flag and clean up unused noqa directives

DataTypeManager.__init__ now accepts use_standard_primitive_types but doesn’t forward it to super().__init__, effectively making the flag a no-op for Pydantic v1. That’s reasonable given the docs explicitly say Pydantic already uses these standard types by default, and the parameter is needed mainly so this class can be constructed with the shared kwargs set.

Ruff is flagging the newly added # noqa entries here (FBT001, FBT002) as unused. To satisfy the linter while keeping the intentional “unused-arg” suppression, consider trimming the comments as follows:

Suggested change to align with Ruff RUF100
-        use_standard_primitive_types: bool = False,  # noqa: FBT001, FBT002, ARG002
-        treat_dot_as_module: bool | None = None,  # noqa: FBT001
-        use_serialize_as_any: bool = False,  # noqa: FBT001, FBT002
+        use_standard_primitive_types: bool = False,  # noqa: ARG002
+        treat_dot_as_module: bool | None = None,
+        use_serialize_as_any: bool = False,

This keeps the constructor signature compatible with the shared parser kwargs, avoids new RUF100 warnings, and leaves behavior unchanged.
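The compatibility-shim pattern described in this comment can be sketched as follows. BaseManager, AlreadyStandardManager, and the strict parameter are hypothetical names used only for illustration; the point is that a subclass can accept a keyword it deliberately does not forward, so a caller can pass one shared kwargs set to every manager class.

```python
class BaseManager:
    """Hypothetical base class that knows nothing about the new flag."""

    def __init__(self, *, strict: bool = False) -> None:
        self.strict = strict


class AlreadyStandardManager(BaseManager):
    """Hypothetical manager whose output already uses standard types."""

    def __init__(
        self,
        *,
        strict: bool = False,
        use_standard_primitive_types: bool = False,  # noqa: ARG002
    ) -> None:
        # Accepted only for signature compatibility; intentionally not forwarded.
        super().__init__(strict=strict)


# One shared kwargs set can now construct this class without a TypeError.
shared_kwargs = {"strict": True, "use_standard_primitive_types": True}
manager = AlreadyStandardManager(**shared_kwargs)
```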

🧹 Nitpick comments (4)
docs/cli-reference/typing-customization.md (1)

22-23: --use-standard-primitive-types docs and example are accurate; minor markdownlint nit

The option description, “Related” links, and dataclass example (UUID / IPv4Address / Path) accurately reflect the new behavior and match the test/golden output wiring.

markdownlint (MD050) flags the **Related:** line here for strong-style; if that rule is enforced, consider switching to __Related:__ (or relaxing the rule) to keep the linter happy.

Also applies to: 2959-3024

src/datamodel_code_generator/model/pydantic_v2/types.py (1)

72-100: Tidy up unused noqa codes on use_standard_primitive_types

The extra ctor parameter is fine as an API compatibility shim (Parser can always pass it) and leaving it unused here preserves existing Pydantic v2 behavior. Given Ruff’s RUF100 hint, you can drop the FBT001, FBT002 parts from the noqa and keep only ARG002:

Proposed tweak
-        use_standard_primitive_types: bool = False,  # noqa: FBT001, FBT002, ARG002
+        use_standard_primitive_types: bool = False,  # noqa: ARG002

src/datamodel_code_generator/model/types.py (1)

10-21: Standard primitive mapping is well factored; consider trimming unused noqa

The new standard_primitive_type_map_factory() and the use_standard_primitive_types switch in DataTypeManager.__init__ cleanly layer stdlib types (UUID, IPv4Address, IPv6Address, networks, Path) on top of the existing type_map_factory while preserving previous defaults when the flag is False.

Minor lint nit: Ruff reports unused noqa directives for FBT001/FBT002 on the use_standard_primitive_types, treat_dot_as_module, and use_serialize_as_any parameters. If those codes aren’t enabled in your config, you can safely drop them and keep only the ones you actually need.

Also applies to: 77-141
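The layering described in this comment amounts to a conditional dictionary merge. The sketch below uses plain format-name strings and Python types as a simplification; the project's own factories are described as working with its Types and DataType objects, so the function names and keys here are assumptions.

```python
from __future__ import annotations

from ipaddress import IPv4Address, IPv6Address
from pathlib import Path
from uuid import UUID


def sketch_type_map() -> dict[str, type]:
    # Hypothetical default mapping: every string format falls back to str.
    return {"uuid": str, "ipv4": str, "ipv6": str, "path": str, "string": str}


def sketch_standard_primitive_map() -> dict[str, type]:
    # Hypothetical overrides, applied only when the flag is on.
    return {"uuid": UUID, "ipv4": IPv4Address, "ipv6": IPv6Address, "path": Path}


def build_type_map(*, use_standard_primitive_types: bool = False) -> dict[str, type]:
    overrides = sketch_standard_primitive_map() if use_standard_primitive_types else {}
    # Later entries win, so overrides replace defaults; an empty dict changes nothing.
    return {**sketch_type_map(), **overrides}
```

Merging an empty dict when the flag is off leaves the defaults untouched, which is why the disabled path keeps the previous behaviour.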

src/datamodel_code_generator/model/dataclass.py (1)

220-220: Remove unused noqa directive.

The noqa directive for FBT001 and FBT002 is no longer needed according to Ruff.

🔎 Proposed fix
-        use_standard_primitive_types: bool = False,  # noqa: FBT001, FBT002
+        use_standard_primitive_types: bool = False,

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a231d46 and 568c21e.

📒 Files selected for processing (21)
  • docs/cli-reference/index.md
  • docs/cli-reference/quick-reference.md
  • docs/cli-reference/typing-customization.md
  • src/datamodel_code_generator/__init__.py
  • src/datamodel_code_generator/__main__.py
  • src/datamodel_code_generator/arguments.py
  • src/datamodel_code_generator/cli_options.py
  • src/datamodel_code_generator/imports.py
  • src/datamodel_code_generator/model/dataclass.py
  • src/datamodel_code_generator/model/msgspec.py
  • src/datamodel_code_generator/model/pydantic/types.py
  • src/datamodel_code_generator/model/pydantic_v2/types.py
  • src/datamodel_code_generator/model/types.py
  • src/datamodel_code_generator/parser/base.py
  • src/datamodel_code_generator/parser/graphql.py
  • src/datamodel_code_generator/parser/jsonschema.py
  • src/datamodel_code_generator/parser/openapi.py
  • src/datamodel_code_generator/types.py
  • tests/data/expected/main/use_standard_primitive_types.py
  • tests/data/jsonschema/use_standard_primitive_types.json
  • tests/main/test_main_general.py
🧰 Additional context used
🧬 Code graph analysis (4)
src/datamodel_code_generator/model/msgspec.py (1)
src/datamodel_code_generator/model/types.py (2)
  • standard_primitive_type_map_factory (77-96)
  • type_map_factory (34-74)
src/datamodel_code_generator/model/types.py (1)
src/datamodel_code_generator/types.py (3)
  • DataType (285-757)
  • Types (767-803)
  • from_import (344-368)
src/datamodel_code_generator/model/dataclass.py (3)
src/datamodel_code_generator/model/types.py (2)
  • standard_primitive_type_map_factory (77-96)
  • type_map_factory (34-74)
src/datamodel_code_generator/model/pydantic_v2/types.py (1)
  • type_map_factory (116-150)
src/datamodel_code_generator/types.py (2)
  • Types (767-803)
  • DataType (285-757)
tests/main/test_main_general.py (3)
tests/conftest.py (1)
  • freeze_time (285-288)
tests/main/conftest.py (2)
  • output_file (98-100)
  • run_main_and_assert (244-408)
tests/test_main_kr.py (1)
  • output_file (44-46)
🪛 GitHub Check: CodeQL
src/datamodel_code_generator/model/msgspec.py

[notice] 37-37: Cyclic import
Import of module datamodel_code_generator.model.types begins an import cycle.

src/datamodel_code_generator/model/dataclass.py

[notice] 23-23: Cyclic import
Import of module datamodel_code_generator.model.types begins an import cycle.

🪛 markdownlint-cli2 (0.18.1)
docs/cli-reference/typing-customization.md

2968-2968: Strong style
Expected: underscore; Actual: asterisk

(MD050, strong-style)


2968-2968: Strong style
Expected: underscore; Actual: asterisk

(MD050, strong-style)

🪛 Ruff (0.14.8)
src/datamodel_code_generator/types.py

827-827: Unused noqa directive (non-enabled: FBT001, FBT002)

Remove unused noqa directive

(RUF100)

src/datamodel_code_generator/model/pydantic/types.py

183-183: Unused noqa directive (non-enabled: FBT001, FBT002)

Remove unused noqa directive

(RUF100)


185-185: Unused noqa directive (non-enabled: FBT001)

Remove unused noqa directive

(RUF100)


186-186: Unused noqa directive (non-enabled: FBT001, FBT002)

Remove unused noqa directive

(RUF100)

src/datamodel_code_generator/model/pydantic_v2/types.py

82-82: Unused noqa directive (non-enabled: FBT001, FBT002)

Remove unused noqa directive

(RUF100)

src/datamodel_code_generator/model/msgspec.py

506-506: Unused noqa directive (non-enabled: FBT001, FBT002)

Remove unused noqa directive

(RUF100)

src/datamodel_code_generator/model/types.py

114-114: Unused noqa directive (non-enabled: FBT001, FBT002)

Remove unused noqa directive

(RUF100)


116-116: Unused noqa directive (non-enabled: FBT001)

Remove unused noqa directive

(RUF100)


117-117: Unused noqa directive (non-enabled: FBT001, FBT002)

Remove unused noqa directive

(RUF100)

src/datamodel_code_generator/model/dataclass.py

220-220: Unused noqa directive (non-enabled: FBT001, FBT002)

Remove unused noqa directive

(RUF100)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: 3.13 on Windows
  • GitHub Check: benchmarks
🔇 Additional comments (18)
docs/cli-reference/quick-reference.md (1)

44-44: Quick reference entries for --use-standard-primitive-types are consistent

Anchor targets, wording, and placement in both the Typing Customization table and the alphabetical index all look correct and aligned with the detailed typing-customization docs.

Also applies to: 278-278

docs/cli-reference/index.md (1)

12-12: CLI index updates for --use-standard-primitive-types are consistent

The Typing Customization option count and the new --use-standard-primitive-types entry in the U-section both line up with the detailed docs and quick-reference table.

Also applies to: 181-181

src/datamodel_code_generator/parser/openapi.py (1)

181-283: OpenAPI parser correctly threads use_standard_primitive_types to the base parser

Adding use_standard_primitive_types to the OpenAPIParser initializer and passing it through to super().__init__ keeps the constructor API compatible while allowing the shared parser/type manager logic to honor this flag.

Also applies to: 365-383

src/datamodel_code_generator/__init__.py (1)

393-501: Plumbing of use_standard_primitive_types through generate() looks correct

The new use_standard_primitive_types argument is added to generate() with a safe default and is forwarded into the chosen parser class, keeping existing behavior unchanged unless the flag is explicitly set.

Also applies to: 646-749

tests/data/jsonschema/use_standard_primitive_types.json (1)

1-18: Test schema for standard primitive types matches the documented behavior

The schema cleanly exercises the uuid, ipv4, and path string formats that the new flag is intended to map to UUID, IPv4Address, and Path, respectively.
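For readers without the fixture at hand, a schema of this kind might look like the sketch below, written as a Python dict. The property names are hypothetical; only the three format values (uuid, ipv4, path) come from the review comment.

```python
# Hypothetical JSON Schema exercising the three formats named in the review.
schema = {
    "type": "object",
    "properties": {
        "id": {"type": "string", "format": "uuid"},
        "address": {"type": "string", "format": "ipv4"},
        "location": {"type": "string", "format": "path"},
    },
}

# Collect the format declared for each property.
formats = {name: prop["format"] for name, prop in schema["properties"].items()}
```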

tests/main/test_main_general.py (1)

1062-1092: End-to-end test for --use-standard-primitive-types is well wired

The new CLI-doc test correctly ties together the JSON Schema fixture, the dataclasses output model type, the --use-standard-primitive-types flag, and the expected golden file, while freezing time for stable headers. This gives good coverage for the new option’s behavior in the main entry point.

tests/data/expected/main/use_standard_primitive_types.py (1)

1-16: LGTM! Expected output correctly demonstrates the feature.

The expected test output properly shows the intended behavior: dataclass fields use Python standard library types (UUID, IPv4Address, Path) instead of plain str for formatted string fields. The imports and type annotations are correct.

src/datamodel_code_generator/arguments.py (1)

306-313: LGTM! Clear and well-documented CLI argument.

The help text accurately describes the feature and appropriately notes that Pydantic models already use standard library types by default, helping users understand when this flag is relevant.
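A boolean opt-in flag of this shape is commonly declared with argparse's store_true action. The sketch below assumes that form and uses placeholder help text; it is not the project's actual argument definition.

```python
import argparse

parser = argparse.ArgumentParser(prog="sketch")
parser.add_argument(
    "--use-standard-primitive-types",
    action="store_true",
    default=False,
    help="Placeholder help text: use stdlib types for formatted strings.",
)

# argparse converts the dashes in the option name to underscores.
enabled = parser.parse_args(["--use-standard-primitive-types"])
disabled = parser.parse_args([])
```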

src/datamodel_code_generator/parser/jsonschema.py (1)

594-694: LGTM! Parameter correctly threaded to base class.

The use_standard_primitive_types parameter is properly added to the constructor signature and forwarded to the parent class using keyword arguments, which makes the code clear and maintainable.

However, note that the effectiveness of this change depends on the DataTypeManager implementation (in types.py), which currently has a critical issue where the parameter is unused.

src/datamodel_code_generator/types.py (1)

827-827: This review comment is incorrect and should be dismissed.

The use_standard_primitive_types parameter is fully implemented and actively used. The parameter appears in the abstract base class DataTypeManager where it's marked ARG002 because the base class itself doesn't directly consume it. However, concrete subclasses like DataTypeManager in model/types.py (lines 134–140) explicitly use it to conditionally create and merge a standard_primitive_map into self.type_map via standard_primitive_type_map_factory(). The same pattern is applied in dataclass.py, msgspec.py, and other model implementations. The feature is fully functional with passing tests (test_use_standard_primitive_types) and expected golden output files demonstrating correct type generation (UUID, IPv4Address, IPv6Address, etc.).

Likely an incorrect or invalid review comment.

src/datamodel_code_generator/parser/base.py (1)

677-793: Clean wiring of use_standard_primitive_types into DataTypeManager

The new kw-only parameter is added conservatively with a default and passed straight into data_type_manager_type, so existing callers remain compatible and behavior is opt‑in via the manager. No issues here.

src/datamodel_code_generator/__main__.py (1)

372-479: Config/CLI plumbing for use_standard_primitive_types looks consistent

The new Config field and the pass-through in run_generate_from_config() correctly integrate the option into generation, and pyproject / CLI command generation will handle it naturally (bool default False, no --no- variant needed). No further changes needed here.

Also applies to: 671-786

src/datamodel_code_generator/cli_options.py (1)

144-171: Doc metadata entry for --use-standard-primitive-types is correct

Option is categorized under Typing and matches the argparse spelling, so it will appear in generated CLI docs and sync tests should pass.

src/datamodel_code_generator/parser/graphql.py (1)

98-199: GraphQLParser correctly forwards use_standard_primitive_types

The constructor mirrors the base Parser.__init__ signature and passes the flag through to super().__init__, so GraphQL parsing benefits from the same primitive‑type handling without changing defaults.

src/datamodel_code_generator/imports.py (1)

199-234: New ipaddress Import constants are correct

The added IMPORT_IPV4ADDRESS, IMPORT_IPV6ADDRESS, IMPORT_IPV4NETWORK, and IMPORT_IPV6NETWORK correctly use ipaddress.* and integrate cleanly with existing import constants.

src/datamodel_code_generator/model/msgspec.py (1)

37-38: msgspec DataTypeManager correctly integrates standard primitive map; retain FBT noqa codes

The msgspec DataTypeManager correctly:

  • Accepts use_standard_primitive_types and forwards it to the base manager (line 522).
  • Builds self.type_map from type_map_factory, msgspec-specific datetime_map, and (optionally) standard_primitive_type_map_factory, so UUID/IP/path formats resolve to stdlib types only when the flag is enabled (lines 542-546).

The # noqa: FBT001, FBT002 codes on all boolean parameters (lines 503-510) should be kept: these parameters are positional-or-keyword arguments, so FBT001 and FBT002 warnings legitimately apply and the suppression codes are necessary.

No import cycle issues detected.
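The FBT001/FBT002 point in this comment concerns how boolean parameters are declared. In the sketch below, with hypothetical function names, the first signature is the kind Ruff's flake8-boolean-trap rules report when enabled, while the keyword-only form is not reported.

```python
def positional_flag(enabled: bool = False) -> bool:
    # Positional-or-keyword boolean: FBT001 (bool type hint) and
    # FBT002 (bool default) would apply if those rules are enabled.
    return enabled


def keyword_only_flag(*, enabled: bool = False) -> bool:
    # Keyword-only boolean: callers must write enabled=True.
    return enabled
```

Whether a noqa comment is needed therefore depends both on the signature style and on which rule codes the project's Ruff configuration enables.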

src/datamodel_code_generator/model/dataclass.py (2)

235-235: Parameter propagation and map creation logic confirmed as correct.

The use_standard_primitive_types parameter is properly accepted by the superclass _DataTypeManager.__init__ (line 114 in types.py) and correctly propagated at line 235. The conditional map creation pattern (lines 252–259) mirrors the existing datetime_map implementation, and merging an empty dict when disabled is harmless.


23-23: No cyclic import exists at this location.

The import at line 23 is valid. dataclass.py imports from types.py, but types.py does not import back from dataclass.py, creating a one-way dependency rather than a cycle. The CodeQL cyclic import notice is a false positive.

Development

Successfully merging this pull request may close these issues.

Dataclass generation does not include specified string format like uuid

2 participants