
Support incompatible Python types in x-python-type extension#2841

Merged
koxudaxi merged 2 commits into main from feature/x-python-type-generalization
Dec 28, 2025

@koxudaxi
Owner

@koxudaxi koxudaxi commented Dec 28, 2025

Summary by CodeRabbit

  • New Features

    • Support for an x-python-type attribute to override generated Python type hints.
    • Automatic import resolution for custom Python types and compatibility checks with schema types.
  • Tests

    • Added tests covering callable types, fully-qualified/custom type paths, sets, anyOf, and missing schema type scenarios.


@coderabbitai

coderabbitai Bot commented Dec 28, 2025

📝 Walkthrough

This PR adds x-python-type compatibility handling to the JSON Schema parser: it detects when x-python-type conflicts with schema types, provides DataType overrides with appropriate Import objects, and integrates this into data-type and object-field resolution.
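For orientation, here is a minimal sketch of the kind of schema this change targets, written as a Python dict. The property names `handler` and `name` and the helper `python_type_hints` are hypothetical and are not taken from the PR's test data; the helper only collects the declared `x-python-type` values and does not reproduce the parser's logic.

```python
from __future__ import annotations

# Hypothetical schema: the property names are illustrative only.
# "handler" declares a JSON Schema type of "string" but asks for a Callable,
# which is the "incompatible" situation this PR addresses.
SCHEMA = {
    "type": "object",
    "properties": {
        "handler": {"type": "string", "x-python-type": "Callable[[str], str]"},
        "name": {"type": "string"},
    },
}


def python_type_hints(schema: dict) -> dict[str, str]:
    """Map each property name to its x-python-type value, if it declares one."""
    return {
        name: prop["x-python-type"]
        for name, prop in schema.get("properties", {}).items()
        if "x-python-type" in prop
    }
```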

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| JSON Schema Parser Core: src/datamodel_code_generator/parser/jsonschema.py | Added COMPATIBLE_PYTHON_TYPES and PYTHON_TYPE_IMPORTS attributes; added helpers _get_python_type_base(), _is_compatible_python_type(), _get_python_type_override(); updated get_data_type() and get_object_field() to apply Python-type overrides; import Import added. |
| Tests (jsonschema parser): tests/main/jsonschema/test_main_jsonschema.py | Added six tests for x-python-type handling: test_x_python_type_callable, test_x_python_type_callable_anyof, test_x_python_type_compatible_set, test_x_python_type_fqpath, test_x_python_type_no_schema_type, test_x_python_type_custom_fqpath. |
| Expected outputs (generated stubs): tests/data/expected/main/jsonschema/x_python_type_*.py | Added expected generated TypedDict stubs for cases above (x_python_type_callable.py, x_python_type_callable_anyof.py, x_python_type_compatible_set.py, x_python_type_custom_fqpath.py, x_python_type_fqpath.py, x_python_type_no_schema_type.py). |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I nibble through schemas, neat and spry,

Where x-python-types leap and fly.
Callables, sets, and paths so bold,
I stitch their imports, warm and cold.
Hooray — new hints snug in code, oh my!

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly and clearly summarizes the main change: adding support for incompatible Python types in the x-python-type extension, which is exactly what the changeset implements. |
| Docstring coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b7c2f17 and 19bf66e.

⛔ Files ignored due to path filters (6)
  • tests/data/jsonschema/x_python_type_callable.json is excluded by !tests/data/**/*.json and included by none
  • tests/data/jsonschema/x_python_type_callable_anyof.json is excluded by !tests/data/**/*.json and included by none
  • tests/data/jsonschema/x_python_type_compatible_set.json is excluded by !tests/data/**/*.json and included by none
  • tests/data/jsonschema/x_python_type_custom_fqpath.json is excluded by !tests/data/**/*.json and included by none
  • tests/data/jsonschema/x_python_type_fqpath.json is excluded by !tests/data/**/*.json and included by none
  • tests/data/jsonschema/x_python_type_no_schema_type.json is excluded by !tests/data/**/*.json and included by none
📒 Files selected for processing (8)
  • src/datamodel_code_generator/parser/jsonschema.py
  • tests/data/expected/main/jsonschema/x_python_type_callable.py
  • tests/data/expected/main/jsonschema/x_python_type_callable_anyof.py
  • tests/data/expected/main/jsonschema/x_python_type_compatible_set.py
  • tests/data/expected/main/jsonschema/x_python_type_custom_fqpath.py
  • tests/data/expected/main/jsonschema/x_python_type_fqpath.py
  • tests/data/expected/main/jsonschema/x_python_type_no_schema_type.py
  • tests/main/jsonschema/test_main_jsonschema.py
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-12-25T09:22:14.661Z
Learnt from: koxudaxi
Repo: koxudaxi/datamodel-code-generator PR: 2799
File: src/datamodel_code_generator/model/pydantic/__init__.py:43-43
Timestamp: 2025-12-25T09:22:14.661Z
Learning: In datamodel-code-generator project, defensive `# noqa: PLC0415` directives should be kept on lazy imports (imports inside functions/methods) even when Ruff reports them as unused via RUF100, to prepare for potential future Ruff configuration changes that might enable the import-outside-top-level rule.

Applied to files:

  • src/datamodel_code_generator/parser/jsonschema.py
📚 Learning: 2025-12-18T13:43:16.235Z
Learnt from: koxudaxi
Repo: koxudaxi/datamodel-code-generator PR: 2681
File: tests/cli_doc/test_cli_doc_coverage.py:82-82
Timestamp: 2025-12-18T13:43:16.235Z
Learning: In datamodel-code-generator project, Ruff preview mode is enabled via `lint.preview = true` in pyproject.toml. This enables preview rules like PLR6301 (no-self-use), so `noqa: PLR6301` directives are necessary and should not be removed even if RUF100 suggests they are unused.

Applied to files:

  • src/datamodel_code_generator/parser/jsonschema.py
🧬 Code graph analysis (7)
tests/data/expected/main/jsonschema/x_python_type_fqpath.py (2)
src/datamodel_code_generator/model/typed_dict.py (1)
  • TypedDict (49-114)
tests/data/expected/main/jsonschema/x_python_type_callable.py (1)
  • Model (13-14)
tests/data/expected/main/jsonschema/x_python_type_callable_anyof.py (1)
tests/data/expected/main/jsonschema/x_python_type_callable.py (1)
  • Model (13-14)
src/datamodel_code_generator/parser/jsonschema.py (1)
src/datamodel_code_generator/imports.py (2)
  • Import (20-38)
  • from_full_path (35-38)
tests/main/jsonschema/test_main_jsonschema.py (2)
tests/test_main_kr.py (1)
  • output_file (44-46)
tests/main/conftest.py (2)
  • output_file (98-100)
  • run_main_and_assert (244-408)
tests/data/expected/main/jsonschema/x_python_type_callable.py (1)
tests/data/expected/main/jsonschema/x_python_type_fqpath.py (1)
  • Model (13-14)
tests/data/expected/main/jsonschema/x_python_type_custom_fqpath.py (5)
src/datamodel_code_generator/model/typed_dict.py (1)
  • TypedDict (49-114)
tests/data/expected/main/jsonschema/x_python_type_callable.py (1)
  • Model (13-14)
tests/data/expected/main/jsonschema/x_python_type_compatible_set.py (1)
  • Model (12-13)
tests/data/expected/main/jsonschema/x_python_type_fqpath.py (1)
  • Model (13-14)
tests/data/expected/main/jsonschema/x_python_type_no_schema_type.py (1)
  • Model (12-13)
tests/data/expected/main/jsonschema/x_python_type_compatible_set.py (6)
src/datamodel_code_generator/model/typed_dict.py (1)
  • TypedDict (49-114)
tests/data/expected/main/jsonschema/x_python_type_callable.py (1)
  • Model (13-14)
tests/data/expected/main/jsonschema/x_python_type_callable_anyof.py (1)
  • Model (13-14)
tests/data/expected/main/jsonschema/x_python_type_custom_fqpath.py (1)
  • Model (13-14)
tests/data/expected/main/jsonschema/x_python_type_fqpath.py (1)
  • Model (13-14)
tests/data/expected/main/jsonschema/x_python_type_no_schema_type.py (1)
  • Model (12-13)
🪛 Ruff (0.14.10)
src/datamodel_code_generator/parser/jsonschema.py

1319-1319: Unused noqa directive (non-enabled: PLR6301)

Remove unused noqa directive

(RUF100)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: py312-isort6 on Ubuntu
  • GitHub Check: 3.10 on Windows
  • GitHub Check: py312-isort7 on Ubuntu
  • GitHub Check: py312-pydantic1 on Ubuntu
  • GitHub Check: 3.11 on Windows
  • GitHub Check: 3.12 on Windows
  • GitHub Check: 3.13 on Windows
  • GitHub Check: 3.14 on Windows
  • GitHub Check: benchmarks
  • GitHub Check: Analyze (python)
🔇 Additional comments (11)
tests/data/expected/main/jsonschema/x_python_type_custom_fqpath.py (1)

1-14: LGTM! Generated test expectation file structure is correct.

The TypedDict model properly imports and uses the custom Handler type with generic parameters, wrapped in NotRequired for optional field handling.

tests/data/expected/main/jsonschema/x_python_type_fqpath.py (1)

1-14: LGTM! Correctly uses modern Callable import.

The file properly demonstrates x-python-type handling with collections.abc.Callable, which is the recommended import location for Python 3.9+.

tests/data/expected/main/jsonschema/x_python_type_no_schema_type.py (1)

1-13: LGTM! Appropriate fallback for missing schema type.

When x-python-type is specified without a corresponding JSON Schema type, falling back to Any is a sensible default behavior.

tests/data/expected/main/jsonschema/x_python_type_callable.py (1)

1-14: LGTM! Standard Callable type handling.

The file correctly demonstrates x-python-type with Callable type annotation, using modern collections.abc import and proper generic syntax.

tests/data/expected/main/jsonschema/x_python_type_compatible_set.py (1)

1-13: LGTM! Demonstrates compatible type preservation.

The field correctly uses set[str] instead of list[str], showing that when x-python-type is compatible with the JSON Schema type (set is compatible with array), the Python type is preserved rather than overridden.

tests/data/expected/main/jsonschema/x_python_type_callable_anyof.py (1)

1-14: LGTM! Proper union type handling.

The file correctly demonstrates x-python-type with anyOf/nullable, producing the modern union syntax (Callable[[str], str] | None) with proper type handling.

src/datamodel_code_generator/parser/jsonschema.py (4)

49-49: LGTM! Import addition is necessary.

Adding Import to the imports enables creating Import objects for custom Python types in the _get_python_type_override method.


537-571: LGTM! Well-defined type compatibility mappings.

The COMPATIBLE_PYTHON_TYPES and PYTHON_TYPE_IMPORTS class variables provide comprehensive mappings for:

  • Determining which Python types are compatible with JSON Schema types
  • Automatically importing common generic types (Callable, Iterable, Pattern, etc.)

The mappings cover appropriate types for each JSON Schema primitive, with special handling for container types (array/object).
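A minimal sketch of what two such lookup tables could look like. The entries below are an assumed, abbreviated subset chosen from names mentioned in this conversation; the real tables in parser/jsonschema.py are larger and may be shaped differently (for example, holding Import objects rather than strings). The helper `import_statement` is hypothetical.

```python
from __future__ import annotations

# Assumed, abbreviated contents: JSON Schema type -> Python base types that
# count as "compatible" with it.
COMPATIBLE_PYTHON_TYPES: dict[str, frozenset[str]] = {
    "string": frozenset({"str"}),
    "array": frozenset({"list", "List", "set", "Set", "Sequence"}),
    "object": frozenset({"dict", "Dict", "Mapping"}),
}

# Assumed shape: base type name -> fully qualified import path.
PYTHON_TYPE_IMPORTS: dict[str, str] = {
    "Callable": "collections.abc.Callable",
    "Iterable": "collections.abc.Iterable",
    "Pattern": "re.Pattern",
}


def import_statement(base_type: str) -> str | None:
    """Return the import line for a known base type, or None if it has no entry."""
    full_path = PYTHON_TYPE_IMPORTS.get(base_type)
    if full_path is None:
        return None
    module, _, name = full_path.rpartition(".")
    return f"from {module} import {name}"
```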


1186-1188: LGTM! Appropriate early-return pattern.

Checking for python_type_override at the start of get_data_type ensures that incompatible x-python-type annotations take precedence over schema-derived types, which is the correct behavior.


1327-1360: LGTM! Solid compatibility and override logic.

The helper methods implement sensible x-python-type handling:

  • _is_compatible_python_type: Returns True for None (no schema type) and Union/Optional, otherwise checks the compatibility mapping
  • _get_python_type_override: Creates a DataType with appropriate imports for incompatible types, handling both predefined types and custom fully qualified paths

One edge case to consider: When obj.type is a list (multiple types), schema_type becomes None (line 1343), which causes _is_compatible_python_type to return True, preventing override. This means x-python-type won't override for schemas with multiple types (e.g., ["string", "null"]). This may be intentional for complex union scenarios, but it's worth verifying that this behavior aligns with expectations.
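The edge case can be reproduced with a simplified model of the logic described above. This is a sketch under stated assumptions, not the parser's code: `COMPATIBLE`, `is_compatible`, and `needs_override` are hypothetical stand-ins, and the compatibility table is abbreviated.

```python
from __future__ import annotations

# Abbreviated, assumed compatibility table.
COMPATIBLE = {"string": {"str"}, "array": {"list", "set", "Set"}}


def is_compatible(schema_type: str | None, python_type: str) -> bool:
    """Mirror the described rules: None and Union/Optional are always compatible."""
    if schema_type is None:
        return True
    base = python_type.split("[", maxsplit=1)[0].strip()
    if base in {"Union", "Optional"}:
        return True
    return base in COMPATIBLE.get(schema_type, set())


def needs_override(raw_type: str | list[str] | None, python_type: str) -> bool:
    """A list-valued `type` collapses to None, so no override is produced."""
    schema_type = raw_type if isinstance(raw_type, str) else None
    return not is_compatible(schema_type, python_type)
```

In this model, `needs_override("string", "Callable[[str], str]")` is true, while `needs_override(["string", "null"], "Callable[[str], str]")` is false, which is the behavior the comment asks to have verified.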

If you'd like, I can generate a script to search for test cases or usage examples that combine x-python-type with multiple schema types to verify this behavior is intentional.

tests/main/jsonschema/test_main_jsonschema.py (1)

6970-7033: New x-python-type JSON Schema tests are well-structured and consistent

These six tests cleanly exercise the new x-python-type behaviors (Callable, anyOf, compatible Set, FQ paths, missing schema type, and custom FQ paths) using the existing run_main_and_assert/assert_file_content pattern and auto-derived expected filenames. Arguments and fixtures align with other JSON Schema tests; no issues from the test harness perspective.


@github-actions
Contributor

github-actions Bot commented Dec 28, 2025

📚 Docs Preview: https://pr-2841.datamodel-code-generator.pages.dev

@codspeed-hq

codspeed-hq Bot commented Dec 28, 2025

CodSpeed Performance Report

Merging #2841 will not alter performance

Comparing feature/x-python-type-generalization (19bf66e) with main (a67163c)

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

Summary

✅ 11 untouched
⏩ 98 skipped [1]

Footnotes

  1. 98 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.

@codecov

codecov Bot commented Dec 28, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.49%. Comparing base (a67163c) to head (19bf66e).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2841   +/-   ##
=======================================
  Coverage   99.49%   99.49%           
=======================================
  Files          90       90           
  Lines       14290    14336   +46     
  Branches     1705     1713    +8     
=======================================
+ Hits        14218    14264   +46     
  Misses         37       37           
  Partials       35       35           
Flag Coverage Δ
unittests 99.49% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.



@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
src/datamodel_code_generator/parser/jsonschema.py (1)

559-571: Consider documenting the coverage scope of PYTHON_TYPE_IMPORTS.

The mapping covers collections.abc and re module types appropriately. This appears to focus on types that require non-root imports.

For clarity, consider adding a docstring or comment explaining the scope:

  • Why certain types are included (collections.abc, re.Pattern)
  • Why typing module types (List, Dict, Optional, etc.) are not included
  • Guidance for extending this mapping if needed

This would help future maintainers understand the design decision.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a67163c and b7c2f17.

📒 Files selected for processing (2)
  • src/datamodel_code_generator/parser/jsonschema.py
  • tests/main/jsonschema/test_main_jsonschema.py
🧰 Additional context used
🧬 Code graph analysis (1)
src/datamodel_code_generator/parser/jsonschema.py (1)
src/datamodel_code_generator/imports.py (2)
  • Import (20-38)
  • from_full_path (35-38)
🪛 Ruff (0.14.10)
src/datamodel_code_generator/parser/jsonschema.py

1319-1319: Unused noqa directive (non-enabled: PLR6301)

Remove unused noqa directive

(RUF100)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: 3.12 on Windows
  • GitHub Check: 3.12 on macOS
  • GitHub Check: 3.10 on Windows
  • GitHub Check: py312-isort5 on Ubuntu
  • GitHub Check: 3.10 on Ubuntu
  • GitHub Check: 3.11 on Windows
  • GitHub Check: py312-pydantic1 on Ubuntu
  • GitHub Check: 3.13 on Windows
  • GitHub Check: 3.11 on Ubuntu
  • GitHub Check: 3.11 on macOS
  • GitHub Check: 3.14 on Windows
  • GitHub Check: 3.13 on macOS
  • GitHub Check: Analyze (python)
  • GitHub Check: benchmarks
🔇 Additional comments (5)
tests/main/jsonschema/test_main_jsonschema.py (1)

6970-7082: New x-python-type tests look solid and align with the intended semantics

The added tests exercise the key scenarios for x-python-type overrides (direct property, anyOf/nullable, compatible container override via Set[str], fully-qualified collections.abc.Callable, and missing type falling back to Any). Usage of generate with an in-memory JSON string and InputFileType.JsonSchema matches the documented module API, and the assertions are precise enough without being brittle against formatting. I don’t see any functional or style issues here.

src/datamodel_code_generator/parser/jsonschema.py (4)

49-49: LGTM! Import addition is appropriate.

The Import class is correctly imported and will be used by the new PYTHON_TYPE_IMPORTS mapping and _get_python_type_override method.


1186-1188: LGTM! Clean integration with early return pattern.

The override check is appropriately placed at the beginning of get_data_type, following the guard clause pattern. This ensures that incompatible x-python-type extensions take precedence over standard type resolution.


537-557: The COMPATIBLE_PYTHON_TYPES mapping is complete and correctly designed. It includes all concrete container types that represent valid JSON array/object alternatives. Iterable, Iterator, and Generator are intentionally excluded because they are abstract protocols and lazy-evaluation types that don't semantically map to JSON array containers. These types remain available in PYTHON_TYPE_IMPORTS for use as incompatible overrides (e.g., when a field should be a Callable or a Generator), which is the correct design pattern.


1327-1349: The code appears sound. The edge cases raised in the original comment are actually handled correctly by design:

  1. Malformed type annotations (e.g., "List["): The _get_python_type_base() method correctly extracts the base type using split("[", maxsplit=1)[0], so "List[" yields "List" without error.

  2. Missing imports (import_=None): This is intentional. The DataType.import_ field defaults to None (line 330 in types.py: import_: Optional[Import] = None), which is valid and expected. Only the 11 stdlib types in PYTHON_TYPE_IMPORTS (Callable, Iterable, Iterator, Generator, Awaitable, Coroutine, AsyncIterable, AsyncIterator, AsyncGenerator, Pattern, Match) get explicit import entries. Custom types like "MyCustomType" correctly receive import_=None, making users responsible for providing their own imports—a reasonable design choice.

The compatibility checking and override logic is sound and properly tested.
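Both points can be illustrated with a small sketch of the behavior as this review describes it for the revision under review. `SketchDataType`, `KNOWN_IMPORTS`, and `build_override` are hypothetical stand-ins; the real DataType and Import classes carry more state, and later commits in this PR also handle custom fully qualified paths.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class SketchDataType:
    """Hypothetical stand-in for DataType; import_path plays the role of import_."""

    type: str
    import_path: str | None = None


# Assumed subset of the predefined import table.
KNOWN_IMPORTS = {
    "Callable": "collections.abc.Callable",
    "Pattern": "re.Pattern",
}


def build_override(python_type: str) -> SketchDataType:
    # Splitting on the first "[" tolerates malformed input such as "List[".
    base = python_type.split("[", maxsplit=1)[0]
    # Unknown bases get no import; the user is expected to supply one.
    return SketchDataType(type=python_type, import_path=KNOWN_IMPORTS.get(base))
```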

Comment thread src/datamodel_code_generator/parser/jsonschema.py
Owner Author

@koxudaxi koxudaxi left a comment


The # noqa: PLR6301 directive is actually needed. When it is removed, the linter reports PLR6301: "Method _get_python_type_base could be a function, class method, or static method". The directive suppresses this warning because we want to keep it as an instance method for consistency with the other methods in the class.
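A minimal sketch of the pattern in question. `SketchParser` is a hypothetical stand-in for the real parser class, and the method body is inferred from the review comment above rather than copied from the source.

```python
class SketchParser:
    """Hypothetical stand-in for the real parser class."""

    def _get_python_type_base(self, python_type: str) -> str:  # noqa: PLR6301
        # `self` is unused here, which is what PLR6301 (no-self-use) flags when
        # Ruff preview rules are enabled; the directive keeps the instance-method
        # form without tripping the rule.
        return python_type.split("[", maxsplit=1)[0]
```

The alternative would be to mark the helper as a `@staticmethod`, which satisfies the rule but departs from the instance-method convention the author wants to preserve.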

@koxudaxi koxudaxi merged commit 0b07112 into main Dec 28, 2025
38 checks passed
@koxudaxi koxudaxi deleted the feature/x-python-type-generalization branch December 28, 2025 17:25
@github-actions
Contributor

Breaking Change Analysis

Result: No breaking changes detected

Reasoning: This PR adds support for incompatible Python types in the x-python-type extension. Previously, when x-python-type specified a type incompatible with the JSON schema type (e.g., Callable with type: string), the extension was silently ignored and the schema type was used. Now, the x-python-type value is respected even for incompatible types. This is a feature enhancement rather than a breaking change because: (1) users who specified incompatible x-python-type values clearly wanted that type to be used, so honoring it is the expected behavior; (2) compatible types (Set, Mapping, Sequence, etc. with matching schema types) continue to work exactly as before through the existing _get_python_type_flags mechanism; (3) no existing valid use of x-python-type is affected negatively - only previously-ignored configurations now work as intended.
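The before/after behavior described in this analysis can be modeled with a simplified sketch. Everything here is an assumption for illustration: `annotation_for`, the `honor_incompatible` switch, and both tables are hypothetical, the model only covers a single string-valued `type`, and it returns the hint verbatim rather than normalizing it the way the generator does.

```python
from __future__ import annotations

# Abbreviated, assumed tables.
SCHEMA_TO_PYTHON = {"string": "str", "array": "list", "object": "dict"}
COMPATIBLE = {
    "string": {"str"},
    "array": {"list", "Set", "Sequence"},
    "object": {"dict", "Mapping"},
}


def annotation_for(schema: dict, *, honor_incompatible: bool) -> str:
    """Return the annotation this simplified model would emit for `schema`."""
    schema_type = schema.get("type")
    if not isinstance(schema_type, str):
        # Missing or list-valued types are out of scope for this model.
        return "Any"
    fallback = SCHEMA_TO_PYTHON.get(schema_type, "Any")
    hint = schema.get("x-python-type")
    if hint is None:
        return fallback
    base = hint.split("[", maxsplit=1)[0]
    if base in COMPATIBLE.get(schema_type, set()):
        return hint  # compatible hints worked before and after the change
    # Before: the incompatible hint was ignored. After: it is respected.
    return hint if honor_incompatible else fallback
```

With `honor_incompatible=False` the model reproduces the old outcome (`str` for a `Callable` hint on a string schema); with `True` it returns the hint, while compatible hints such as `Set[str]` on an array are unaffected either way.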


This analysis was performed by Claude Code Action

@github-actions
Contributor

github-actions Bot commented Jan 1, 2026

🎉 Released in 0.51.0

This PR is now available in the latest release. See the release notes for details.
