
Fix use_union_operator with Python builtin type field names#2968

Merged
koxudaxi merged 14 commits into main from fix/builtin-type-field-names on Feb 10, 2026

@koxudaxi
Owner

@koxudaxi koxudaxi commented Jan 19, 2026

Fixes: #2964

Summary by CodeRabbit

  • Bug Fixes

    • Field names that conflict with Python built-ins are now sanitized: internal attribute names receive a trailing underscore while preserving original names as public aliases. This applies to primitives, container and complex builtin names, and respects target-Python-version differences.
  • Tests

    • Added and expanded tests covering builtin-name collisions (primitives, containers, snake_case, x-python-type cases) and target-Python-version behaviors.

@coderabbitai

coderabbitai Bot commented Jan 19, 2026

📝 Walkthrough

Adds Python-builtin-aware name collision detection to the parser and renames conflicting model fields by appending a trailing underscore (e.g., int_), while preserving the original names via Pydantic Field(..., alias=...). Updates generated expectations and adds tests covering builtin-name collisions across options and target Python versions.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Core Parser Logic: `src/datamodel_code_generator/parser/base.py` | Add runtime builtin-name detection and Python-version-aware helpers; introduce `_BUILTIN_CONTAINER_COLLISION_FLAGS`, `_is_builtin_type_collision`, and `_python_version_key`; initialize `self.builtin_names`; modify `__change_field_name` to rename fields colliding with builtins and preserve aliases via `Field(..., alias=...)`. |
| Generated OpenAPI models: `tests/data/expected/main/openapi/builtin_type_field_names.py`, `tests/data/expected/main/openapi/builtin_type_field_names_no_union_operator.py` | New/generated Pydantic models map builtin-like names to safe attribute names (e.g., `int_`, `str_`) and use `Field(..., alias=...)` to retain original names. |
| Generated JSONSchema models: `tests/data/expected/main/jsonschema/builtin_field_names.py`, `tests/data/expected/main/jsonschema/builtin_field_names_snake_case.py`, `tests/data/expected/main/jsonschema/builtin_field_names_container_types.py`, `tests/data/expected/main/jsonschema/builtin_field_names_container_types_no_use_standard_collections.py`, `tests/data/expected/main/jsonschema/builtin_field_names_target_python_version_310.py`, `tests/data/expected/main/jsonschema/builtin_field_names_target_python_version_313.py`, `tests/data/expected/main/jsonschema/x_python_type_builtin_dict_collision.py` | Generated models use safe attribute names with `Field(..., alias=...)`; variants cover snake_case, container handling, standard-collections toggle, and target-Python-version-specific naming. |
| Tests (OpenAPI): `tests/main/openapi/test_main_openapi.py` | Add parameterized `test_main_builtin_type_field_names` to validate builtin-type field name sanitization for Pydantic v2 models (duplicate definition present in diff). |
| Tests (JSONSchema): `tests/main/jsonschema/test_main_jsonschema.py` | Add multiple benchmark-marked tests for snake_case handling, container-type builtin collisions, target-Python-version variants, and x-python-type dict collision handling (some test blocks appear duplicated in diff). |
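
To make the detection step in the table concrete, here is a minimal sketch of a builtin-collision check. It is not the code from `parser/base.py`: the names `BUILTIN_NAMES`, `is_builtin_type_collision`, and `safe_field` are hypothetical, and the sketch assumes the annotation has already been reduced to the set of type names it mentions.

```python
import builtins
from typing import Optional, Set, Tuple

# Hypothetical stand-in for the parser's builtin-name set:
# every builtin that is a class (int, str, list, dict, ...).
BUILTIN_NAMES = frozenset(
    name for name in dir(builtins) if isinstance(getattr(builtins, name), type)
)


def is_builtin_type_collision(field_name: str, annotation_names: Set[str]) -> bool:
    """True when the field is named after a builtin type that its own annotation uses."""
    return field_name in BUILTIN_NAMES and field_name in annotation_names


def safe_field(field_name: str, annotation_names: Set[str]) -> Tuple[str, Optional[str]]:
    """Return (attribute_name, alias); alias is None when no rename is needed."""
    if is_builtin_type_collision(field_name, annotation_names):
        # Append a trailing underscore and keep the original name as the alias.
        return f"{field_name}_", field_name
    return field_name, None


print(safe_field("int", {"int"}))
print(safe_field("int", {"str"}))
```

In this sketch a field is renamed only when both conditions hold, so a field named `int` with a `str` annotation is left alone.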

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Suggested labels

breaking-change-analyzed

Suggested reviewers

  • ilovelinux

Poem

🐇
I nibble names that almost bite,
add an underscore — now polite.
Alias keeps the old name near,
Generated models hop with cheer,
Tests clap paws — hooray, delight!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly addresses the main issue: fixing union operator handling when Python builtin type names are used as field names. |
| Linked Issues check | ✅ Passed | The PR fully addresses issue #2964: it detects collisions between field names and Python builtins, then renames affected fields to prevent identifier shadowing in union-type annotations. |
| Out of Scope Changes check | ✅ Passed | All changes focus on builtin name collision detection and field renaming for union-operator compatibility; no unrelated modifications detected. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 80.00% which is sufficient. The required threshold is 80.00%. |

@github-actions
Contributor

github-actions Bot commented Jan 19, 2026

📚 Docs Preview: https://pr-2968.datamodel-code-generator.pages.dev

@codspeed-hq

codspeed-hq Bot commented Jan 19, 2026

CodSpeed Performance Report

Merging this PR will not alter performance

Comparing fix/builtin-type-field-names (1f22222) with main (9554fb6)

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

Summary

✅ 11 untouched benchmarks
⏩ 98 skipped benchmarks [1]

Footnotes

  1. 98 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@tests/main/openapi/test_main_openapi.py`:
- Around line 4715-4727: The test test_main_builtin_type_field_names is
asserting code that uses PEP 604 union operator but doesn't pass the flag to
enable it; update the extra_args array in that test (the extra_args passed into
run_main_and_assert) to include "--use-union-operator" so output uses the `|`
operator (matching bool | None, dict[str, Any] | None, etc.) when run with
"--output-model-type pydantic_v2.BaseModel".
🧹 Nitpick comments (1)
src/datamodel_code_generator/parser/base.py (1)

2164-2164: Remove unused noqa directive.

The static analysis tool indicates that PLR0912 is not enabled in the project's linter configuration, making this directive unnecessary.

Suggested fix
-    def __change_field_name(  # noqa: PLR0912
+    def __change_field_name(

Comment thread tests/main/openapi/test_main_openapi.py
@codecov

codecov Bot commented Jan 19, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (9554fb6) to head (1f22222).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff            @@
##              main     #2968   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           94        94           
  Lines        17813     17859   +46     
  Branches      2055      2061    +6     
=========================================
+ Hits         17813     17859   +46     
| Flag | Coverage Δ |
|---|---|
| unittests | 100.00% <100.00%> (ø) |

@koxudaxi koxudaxi force-pushed the fix/builtin-type-field-names branch from 3279508 to a3988fc Compare January 19, 2026 23:31

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/datamodel_code_generator/parser/base.py`:
- Line 2164: Remove the unused noqa PLR0912 suppression on the private function
__change_field_name: edit the function definition for __change_field_name to
delete the trailing "# noqa: PLR0912" comment so the linter warning is not
masked and the code uses normal linting rules for that function.
- Around line 2212-2235: The builtin-shadowing check should operate on the
current field.name (which may have been normalized by ModelResolver) rather than
the original filed_name, and only assign field.alias when no alias already
exists; update the condition from checking filed_name to checking field.name
against _BUILTIN_NAMES, use field.name in the per-type checks (dt.type,
dt.is_list, etc.) to determine should_rename, and if should_rename set
field.name = f"{field.name}_" and set field.alias = filed_name only if
field.alias is None so existing aliases (e.g., from snake_case_field
normalization) are preserved; add a regression test for a case like original
"Float" with snake_case_field=True to assert alias remains "Float" and name
becomes "float_".
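
The second suggestion above can be sketched roughly as follows. This is only an illustration under stated assumptions: `FieldSketch` and `rename_if_shadowing` are hypothetical stand-ins, not the generator's field class or the real `__change_field_name`, and the per-type checks are collapsed into a single `type_name` string.

```python
from dataclasses import dataclass
from typing import Optional

# Builtin type names taken from the list in this PR's release notes.
BUILTIN_TYPE_NAMES = {
    "int", "float", "bool", "str", "bytes",
    "list", "dict", "set", "frozenset", "tuple",
}


@dataclass
class FieldSketch:
    """Hypothetical stand-in for the generator's field object."""

    name: str                    # current name, possibly already normalized
    type_name: str               # simplified: the single builtin the annotation uses
    alias: Optional[str] = None  # alias set by an earlier normalization step, if any


def rename_if_shadowing(field: FieldSketch, original_name: str) -> None:
    """Rename a field that shadows its own builtin type, keeping any existing alias."""
    # Check the current name, not the original one.
    if field.name in BUILTIN_TYPE_NAMES and field.name == field.type_name:
        field.name = f"{field.name}_"
        # Only set the alias when an earlier step has not already set one.
        if field.alias is None:
            field.alias = original_name


# Regression case from the review: original "Float" normalized to "float".
field = FieldSketch(name="float", type_name="float", alias="Float")
rename_if_shadowing(field, original_name="Float")
print(field)
```

Checking the normalized name catches the `Float` to `float` case, and the `alias is None` guard keeps the alias that snake_case normalization already recorded.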

Comment thread src/datamodel_code_generator/parser/base.py Outdated
Comment thread src/datamodel_code_generator/parser/base.py Outdated
Comment thread src/datamodel_code_generator/parser/base.py Outdated
Comment thread src/datamodel_code_generator/parser/base.py Outdated
Comment thread tests/data/expected/main/openapi/builtin_type_field_names.py
Comment thread src/datamodel_code_generator/parser/base.py Fixed
Comment thread src/datamodel_code_generator/parser/base.py Fixed
@koxudaxi koxudaxi merged commit cdd3c27 into main Feb 10, 2026
38 checks passed
@koxudaxi koxudaxi deleted the fix/builtin-type-field-names branch February 10, 2026 08:40
@github-actions
Contributor

Breaking Change Analysis

Result: Breaking changes detected

Reasoning: This PR changes generated code output in a way that affects existing users. Before the PR, fields like int: int | None = None were generated with the builtin name directly as the field name. After the PR, such fields are renamed to int_: int | None = Field(None, alias='int'). This is a breaking change because: (1) existing generated code will differ on regeneration, (2) users who were accessing model.int will need to access model.int_ instead in Python code, (3) custom templates may need updating if they handle field names/aliases differently. The change only applies when a field's name matches a Python builtin AND the field's type uses that same builtin (e.g., int: int but not int: str).

Content for Release Notes

Code Generation Changes

  • Field names matching Python builtins are now automatically sanitized - When a field name matches a Python builtin type AND the field's type annotation uses that same builtin (e.g., int: int, list: list[str], dict: dict[str, Any]), the field is now renamed with a trailing underscore (e.g., int_) and an alias is added to preserve the original JSON field name. This prevents the generated field from shadowing the builtin type. Previously, such fields were generated as-is (e.g., int: int | None = None), which produced code that shadows Python builtins. After this change, the same field becomes int_: int | None = Field(None, alias='int'). This affects fields named int, float, bool, str, bytes, list, dict, set, frozenset, tuple, and other Python builtins when their type annotation uses the matching builtin type. (Fix use_union_operator with Python builtin type field names #2968)

This analysis was performed by Claude Code Action

@github-actions
Contributor

🎉 Released in 0.54.0

This PR is now available in the latest release. See the release notes for details.

Development

Successfully merging this pull request may close these issues.

use_union_operator=True does not work with specific field names

3 participants