Add automatic handling of unserializable types in --input-model #2851

Merged
koxudaxi merged 1 commit into main from feature/input-model-callable-type-handling
Dec 29, 2025
@koxudaxi
Owner

@koxudaxi koxudaxi commented Dec 29, 2025

Summary by CodeRabbit

  • New Features

    • Enhanced input-model feature to properly preserve and serialize complex Python typing annotations, including Callable signatures and Type references, in generated schemas.
  • Tests

    • Added extensive test coverage for various Callable patterns, Type fields, nested callables, and union types in input-model processing.

@coderabbitai

coderabbitai Bot commented Dec 29, 2025

📝 Walkthrough

This pull request introduces a comprehensive mechanism to preserve and serialize complex Python typing information (including Callable, Type, Union, and generics) into JSON Schema via x-python-type annotations for Pydantic v2 models. It adds schema post-processing helpers, type override detection logic, and test coverage for edge cases involving Callable and custom types.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Schema Serialization & Post-Processing: `src/datamodel_code_generator/__main__.py` | Introduces _UNSERIALIZABLE_MARKER, serialization helpers (_serialize_python_type_full, _serialize_callable, _is_callable_origin, _is_type_origin, etc.), lazy-initialized InputModelJsonSchema class, and post-processing flow via _load_model_schema to annotate x-python-type for unserializable types in JSON schemas. |
| Type Override & Compatibility Logic: `src/datamodel_code_generator/parser/jsonschema.py` | Adds "Type" to PYTHON_TYPE_IMPORTS, introduces PYTHON_TYPE_OVERRIDE_ALWAYS set with {"Callable", "Type"}, extends _is_compatible_python_type to check override requirements, adds _extract_all_type_names and _get_python_type_override to compute DataType overrides for incompatible types, and updates parse_item to apply overrides. |
| Test Model Definitions: `tests/data/python/input_model/pydantic_models.py` | Adds six new Pydantic v2 models: ModelWithCallableTypes, NestedCallableModel, ModelWithNestedCallable, CustomClass, ModelWithCustomClass, and ModelWithUnionCallable to cover Callable signatures, Type fields, nested callables, and custom type handling. |
| Expected Output Update: `tests/data/expected/main/jsonschema/x_python_type_no_schema_type.py` | Updates callback field type annotation from NotRequired[Any] to NotRequired[Callable[[str], str]] and replaces typing.Any import with collections.abc.Callable. |
| Test Coverage: `tests/test_input_model.py` | Adds ten new test functions (test_input_model_callable_basic, test_input_model_callable_multi_param, test_input_model_variadic, test_input_model_no_param, test_input_model_callable_optional, test_input_model_type_field, test_input_model_nested_callable, test_input_model_nested_model_with_callable, test_input_model_custom_class, test_input_model_union_callable), all marked to skip on Pydantic v1. |

Sequence Diagram(s)

sequenceDiagram
    participant Input as Input Model <br/>(Pydantic v2)
    participant Generator as Schema Generator <br/>(Generator)
    participant Processor as Post-Processor <br/>(x-python-type Annotator)
    participant Parser as Parser <br/>(Override Check)
    participant Output as Generated <br/>Schema/DataType

    Input->>Generator: model_json_schema() via custom generator
    Generator->>Processor: raw JSON schema
    
    rect rgb(200, 220, 255)
    Note over Processor: _add_python_type_for_unserializable
    Processor->>Processor: traverse $defs & properties
    Processor->>Processor: detect unserializable types (Callable, Type, Union)
    Processor->>Processor: mark with _UNSERIALIZABLE_MARKER
    end
    
    Processor->>Processor: _add_python_type_info post-processing
    Processor->>Output: annotated schema (x-python-type fields)
    
    rect rgb(220, 255, 220)
    Note over Parser: During parsing
    Parser->>Parser: _is_compatible_python_type check
    alt Type in PYTHON_TYPE_OVERRIDE_ALWAYS
        Parser->>Parser: _get_python_type_override
        Parser->>Parser: build DataType override with imports
        Parser->>Output: return override DataType
    else Compatible type
        Parser->>Output: standard type resolution
    end
    end
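
The parser-side branch in the diagram ("Type in PYTHON_TYPE_OVERRIDE_ALWAYS") can be sketched roughly as follows. This is a simplified stand-in, not the code in `parser/jsonschema.py`: the names `OVERRIDE_ALWAYS`, `extract_type_names`, and `needs_override` are hypothetical, and the regex-based name extraction is an assumption about one workable approach.

```python
import re

# Mirrors the {"Callable", "Type"} set described in the walkthrough;
# the constant name here is hypothetical.
OVERRIDE_ALWAYS = frozenset({"Callable", "Type"})


def extract_type_names(type_str: str) -> set[str]:
    """Collect identifiers from a type string, dropping any module prefix."""
    names = set()
    for dotted in re.findall(r"[A-Za-z_][\w.]*", type_str):
        names.add(dotted.rsplit(".", 1)[-1])
    return names


def needs_override(type_str: str) -> bool:
    """Return True when the x-python-type string mentions an always-override type."""
    return not OVERRIDE_ALWAYS.isdisjoint(extract_type_names(type_str))


for candidate in ("Callable[[str], int]", "typing.Type[BaseModel]", "list[str]"):
    print(candidate, "->", needs_override(candidate))
```

Checking every name in the string, rather than only the outermost one, is what lets a Callable nested inside a union or Optional still trigger the override path.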

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested labels

breaking-change-analyzed

Poem

🐰 Twitching whiskers with joy!
Callables now dance through JSON's embrace,
Type[] fields preserved without a trace,
Unserializable types find their place—
Schema marshals complex typing grace! ✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: adding automatic handling of unserializable types (like Callable and Type) in the --input-model feature. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

@github-actions
Contributor

github-actions Bot commented Dec 29, 2025

📚 Docs Preview: https://pr-2851.datamodel-code-generator.pages.dev

@codspeed-hq

codspeed-hq Bot commented Dec 29, 2025

CodSpeed Performance Report

Merging #2851 will degrade performance by 17.32%

Comparing feature/input-model-callable-type-handling (fa4acb5) with main (055f8ed)

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

Summary

❌ 11 regressions
⏩ 98 skipped [1]

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

| Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|
| WallTime | test_perf_multiple_files_input | 3.2 s | 3.7 s | -15.58% |
| WallTime | test_perf_deep_nested | 5.2 s | 6.3 s | -16.45% |
| WallTime | test_perf_complex_refs | 1.8 s | 2.1 s | -15.67% |
| WallTime | test_perf_all_options_enabled | 5.7 s | 6.7 s | -15.09% |
| WallTime | test_perf_duplicate_names | 865.7 ms | 1,032.4 ms | -16.15% |
| WallTime | test_perf_kubernetes_style_pydantic_v2 | 2.3 s | 2.7 s | -15.93% |
| WallTime | test_perf_stripe_style_pydantic_v2 | 1.8 s | 2.1 s | -15.88% |
| WallTime | test_perf_openapi_large | 2.5 s | 3 s | -16.57% |
| WallTime | test_perf_graphql_style_pydantic_v2 | 715.7 ms | 846.2 ms | -15.42% |
| WallTime | test_perf_aws_style_openapi_pydantic_v2 | 1.7 s | 2 s | -16.21% |
| WallTime | test_perf_large_models_pydantic_v2 | 3.1 s | 3.8 s | -17.32% |

Footnotes

  1. 98 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.

@codecov

codecov Bot commented Dec 29, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.49%. Comparing base (055f8ed) to head (fa4acb5).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2851      +/-   ##
==========================================
- Coverage   99.50%   99.49%   -0.01%     
==========================================
  Files          90       90              
  Lines       14605    14740     +135     
  Branches     1748     1771      +23     
==========================================
+ Hits        14533    14666     +133     
- Misses         37       38       +1     
- Partials       35       36       +1     
Flag Coverage Δ
unittests 99.49% <100.00%> (-0.01%) ⬇️


@koxudaxi koxudaxi force-pushed the feature/input-model-callable-type-handling branch from 8835d01 to f4f4d11 Compare December 29, 2025 04:55
@koxudaxi koxudaxi force-pushed the feature/input-model-callable-type-handling branch from f4f4d11 to fa4acb5 Compare December 29, 2025 05:46

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (3)
src/datamodel_code_generator/parser/jsonschema.py (1)

2732-2739: Drop unused noqa codes on parse_item definition

Ruff reports # noqa: PLR0911, PLR0912, PLR0914 here as unused (those rules aren’t enabled), causing RUF100. You can simply remove the directive to silence the warning while keeping behavior unchanged.

Proposed clean-up
-    def parse_item(  # noqa: PLR0911, PLR0912, PLR0914
+    def parse_item(
         self,
         name: str,
         item: JsonSchemaObject,
src/datamodel_code_generator/__main__.py (2)

602-815: Unserializable-type preservation pipeline is solid; add a small guard for items

The new marker-based flow (_UNSERIALIZABLE_MARKER, _serialize_python_type_full, _process_unserializable_property, _add_python_type_for_unserializable) cleanly annotates Pydantic v2 schemas with x-python-type for otherwise-unserializable annotations (Callable, Type, custom classes, nested generics) and aligns with the parser’s new override logic.

One defensive improvement:

  • In _process_unserializable_property, the items branch assumes prop["items"] is a dict:

    elif "items" in prop and prop["items"].get(_UNSERIALIZABLE_MARKER):
        prop["x-python-type"] = _serialize_python_type_full(annotation)
        prop["items"].pop(_UNSERIALIZABLE_MARKER, None)

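    Older JSON Schema drafts allow items to be a list (tuple validation), and items can also be a boolean schema. Guarding with isinstance(prop.get("items"), dict) avoids a potential AttributeError if Pydantic ever emits a non-dict items value on a property that reaches this branch.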

Proposed defensive fix for the items branch
-    elif "items" in prop and prop["items"].get(_UNSERIALIZABLE_MARKER):
-        prop["x-python-type"] = _serialize_python_type_full(annotation)
-        prop["items"].pop(_UNSERIALIZABLE_MARKER, None)
+    elif isinstance(prop.get("items"), dict) and prop["items"].get(_UNSERIALIZABLE_MARKER):
+        prop["x-python-type"] = _serialize_python_type_full(annotation)
+        prop["items"].pop(_UNSERIALIZABLE_MARKER, None)
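
The effect of the proposed guard can be checked in isolation. The sketch below is a standalone approximation rather than the PR's `_process_unserializable_property`: the marker key `"x-unserializable"` and the function name `annotate_items` are placeholders, since the real value of `_UNSERIALIZABLE_MARKER` is not shown in this thread.

```python
# Placeholder marker key; the actual value of _UNSERIALIZABLE_MARKER is not shown here.
MARKER = "x-unserializable"


def annotate_items(prop: dict, python_type: str) -> dict:
    """Annotate a property only when its "items" value is a dict carrying the marker."""
    items = prop.get("items")
    # The isinstance check means a list- or bool-valued "items" is skipped
    # instead of raising AttributeError on .get().
    if isinstance(items, dict) and items.get(MARKER):
        prop["x-python-type"] = python_type
        items.pop(MARKER, None)
    return prop


print(annotate_items({"type": "array", "items": {MARKER: True}}, "list[Callable[[str], int]]"))
print(annotate_items({"type": "array", "items": [{"type": "string"}]}, "unused"))
```

With the unguarded form, the second call would fail because a list has no `.get` method; with the guard it simply leaves the property unchanged.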

606-707: Clean up unused noqa directives on new helpers

Ruff flags several of the new helpers for unused # noqa directives (e.g. PLR0911, PLC0415, PLR6301), resulting in RUF100. Since these rules aren’t enabled in your config, the suppressions are unnecessary and can be dropped without changing behavior.

Examples include:

  • Line 606: # noqa: PLR0911 on _serialize_python_type_full
  • Lines 618, 619, 665, 707, 738, 765: # noqa: PLC0415 on local imports
  • Lines 712, 723: # noqa: PLR6301 on methods

You can either remove these comments or enable the corresponding rules in Ruff; removing them is simplest.

Illustrative clean-up (subset)
-def _serialize_python_type_full(tp: type) -> str:  # noqa: PLR0911
+def _serialize_python_type_full(tp: type) -> str:
@@
-    import types  # noqa: PLC0415
-    from typing import Union, get_args, get_origin  # noqa: PLC0415
+    import types
+    from typing import Union, get_args, get_origin
@@
-    from collections.abc import Callable as ABCCallable  # noqa: PLC0415
+    from collections.abc import Callable as ABCCallable
@@
-    from pydantic.json_schema import GenerateJsonSchema  # noqa: PLC0415
+    from pydantic.json_schema import GenerateJsonSchema
@@
-        def handle_invalid_for_json_schema(  # noqa: PLR6301
+        def handle_invalid_for_json_schema(
@@
-        def callable_schema(  # noqa: PLR6301
+        def callable_schema(
@@
-    from typing import get_origin  # noqa: PLC0415
+    from typing import get_origin
@@
-    from typing import Union, get_args, get_origin  # noqa: PLC0415
+    from typing import Union, get_args, get_origin

(Apply similarly to the remaining new helpers.)

Also applies to: 712-723, 738-742, 765-765

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 055f8ed and fa4acb5.

📒 Files selected for processing (5)
  • src/datamodel_code_generator/__main__.py
  • src/datamodel_code_generator/parser/jsonschema.py
  • tests/data/expected/main/jsonschema/x_python_type_no_schema_type.py
  • tests/data/python/input_model/pydantic_models.py
  • tests/test_input_model.py
🧰 Additional context used
🧬 Code graph analysis (1)
src/datamodel_code_generator/__main__.py (3)
src/datamodel_code_generator/model/base.py (1)
  • name (827-829)
src/datamodel_code_generator/reference.py (2)
  • get (983-985)
  • add (906-981)
src/datamodel_code_generator/parser/base.py (1)
  • add (2468-2471)
🪛 Ruff (0.14.10)
src/datamodel_code_generator/__main__.py

606-606: Unused noqa directive (non-enabled: PLR0911)

Remove unused noqa directive

(RUF100)


618-618: Unused noqa directive (non-enabled: PLC0415)

Remove unused noqa directive

(RUF100)


619-619: Unused noqa directive (non-enabled: PLC0415)

Remove unused noqa directive

(RUF100)


665-665: Unused noqa directive (non-enabled: PLC0415)

Remove unused noqa directive

(RUF100)


707-707: Unused noqa directive (non-enabled: PLC0415)

Remove unused noqa directive

(RUF100)


712-712: Unused noqa directive (non-enabled: PLR6301)

Remove unused noqa directive

(RUF100)


723-723: Unused noqa directive (non-enabled: PLR6301)

Remove unused noqa directive

(RUF100)


738-738: Unused noqa directive (non-enabled: PLC0415)

Remove unused noqa directive

(RUF100)


765-765: Unused noqa directive (non-enabled: PLC0415)

Remove unused noqa directive

(RUF100)

src/datamodel_code_generator/parser/jsonschema.py

2732-2732: Unused noqa directive (non-enabled: PLR0911, PLR0912, PLR0914)

Remove unused noqa directive

(RUF100)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: 3.14 on Windows
  • GitHub Check: 3.10 on Windows
  • GitHub Check: 3.13 on Windows
  • GitHub Check: 3.12 on Windows
  • GitHub Check: 3.10 on macOS
  • GitHub Check: 3.11 on Windows
  • GitHub Check: 3.12 on macOS
  • GitHub Check: 3.14 on macOS
  • GitHub Check: 3.11 on macOS
  • GitHub Check: 3.13 on macOS
  • GitHub Check: Analyze (python)
  • GitHub Check: benchmarks
🔇 Additional comments (8)
tests/data/python/input_model/pydantic_models.py (2)

5-6: LGTM! Proper imports for Callable and Type support.

The addition of Callable from collections.abc and Any, Type from typing correctly supports the new test models. Using collections.abc.Callable follows modern Python best practices.


44-88: Excellent test model coverage for Callable and unserializable types.

The new test models comprehensively cover edge cases:

  • Various Callable signatures (single/multi-param, variadic, no-param)
  • Optional and Union combinations with Callable
  • Nested structures (list[Callable[[str], int]])
  • Cross-model references for $defs processing
  • Custom arbitrary types with model_config

The models are well-structured, properly documented, and follow Pydantic v2 conventions.

tests/test_input_model.py (1)

609-723: Excellent comprehensive test coverage for Callable and unserializable types.

The new test suite thoroughly validates the preservation of complex type annotations:

Comprehensive coverage:

  • Various Callable signatures (multi-param, variadic, no-param, optional)
  • Nested structures (list[Callable[[str], int]])
  • Union combinations with Callable
  • Type[BaseModel] handling
  • Custom arbitrary types
  • Cross-model references for $defs processing

Best practices:

  • All tests properly gated with SKIP_PYDANTIC_V1
  • Clear, descriptive docstrings
  • Consistent use of helper functions
  • Appropriate assertions for expected output
  • Follows established test patterns

The test structure is well-organized and maintainable.

tests/data/expected/main/jsonschema/x_python_type_no_schema_type.py (2)

14-14: Excellent improvement in type preservation!

The change from NotRequired[Any] to NotRequired[Callable[[str], str]] successfully preserves the specific callable signature instead of falling back to a generic Any type. This aligns perfectly with the PR objective of handling unserializable types and provides better type safety for code using this generated model.


7-8: No changes needed. The use of collections.abc.Callable is correct for this project, which targets Python >=3.10. PEP 585 generics like collections.abc.Callable are available in Python 3.9+, so there is no version compatibility issue.

Likely an incorrect or invalid review comment.

src/datamodel_code_generator/parser/jsonschema.py (2)

560-580: Callable/Type override wiring and imports look correct

Adding "Type" to PYTHON_TYPE_IMPORTS and introducing PYTHON_TYPE_OVERRIDE_ALWAYS = {"Callable", "Type"} cleanly aligns the parser with the new x-python-type producer: Type[...] and Callable[...] are now always routed through the override path with the right imports. No functional issues spotted here.


1354-1411: x-python-type override logic for Callable/Type is robust

The combination of _is_compatible_python_type, _extract_all_type_names, and _get_python_type_override correctly:

  • Forces override when Callable/Type appear at the top level or nested inside Union/Optional.
  • Handles fully-qualified names by stripping module prefixes and constructing appropriate Import objects.
  • Adds nested imports for inner ABCs (e.g., Iterable in Callable[[Iterable[str]], str]) without disturbing JSON Schema–driven typing for other cases.

This should give the parser exactly the extra information produced by the new input-model schema generator without regressing existing path-based type resolution.
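
As a rough illustration of the prefix stripping and import collection described above, the following sketch shortens fully-qualified names in a type string and records which imports the shortened form would need. It does not use the project's `Import` or `DataType` classes; `KNOWN_IMPORTS` and `shorten_and_collect` are hypothetical names, and the mapping is a small assumed subset.

```python
import re

# Hypothetical subset of a name-to-module mapping for unqualified names.
KNOWN_IMPORTS = {
    "Callable": "collections.abc",
    "Iterable": "collections.abc",
    "Type": "typing",
}


def shorten_and_collect(type_str: str) -> tuple[str, set[tuple[str, str]]]:
    """Strip module prefixes and return (short type string, {(module, name), ...})."""
    imports: set[tuple[str, str]] = set()

    def replace(match: re.Match) -> str:
        module, _, name = match.group(0).rpartition(".")
        # Fall back to the known mapping when the name arrives unqualified.
        module = module or KNOWN_IMPORTS.get(name, "")
        if module:
            imports.add((module, name))
        return name

    short = re.sub(r"[A-Za-z_][\w.]*", replace, type_str)
    return short, imports


short, needed = shorten_and_collect(
    "collections.abc.Callable[[collections.abc.Iterable[str]], str]"
)
print(short)
for module, name in sorted(needed):
    print(f"from {module} import {name}")
```

Builtins such as `str` have no module prefix and no entry in the mapping, so they pass through without producing an import.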

src/datamodel_code_generator/__main__.py (1)

1154-1167: Pydantic v2 schema generator customization integrates correctly; keep version compatibility in mind

Switching the Pydantic BaseModel path in _load_model_schema to:

  • Use a custom GenerateJsonSchema subclass via schema_generator=_get_input_model_json_schema_class(), and
  • Post-process with _add_python_type_for_unserializable before the existing _add_python_type_info

is a good way to preserve full Python typing information (especially for Callable and Type) for --input-model.

Because this relies on Pydantic’s model_json_schema(schema_generator=...) hook and specific generator method names (handle_invalid_for_json_schema, callable_schema), it’s worth ensuring your test matrix covers the supported Pydantic v2 range so that signature or behavior changes won’t silently regress this flow.

@koxudaxi koxudaxi merged commit 0f7a6c9 into main Dec 29, 2025
35 of 37 checks passed
@koxudaxi koxudaxi deleted the feature/input-model-callable-type-handling branch December 29, 2025 10:12
@github-actions
Contributor

github-actions Bot commented Jan 1, 2026

🎉 Released in 0.51.0

This PR is now available in the latest release. See the release notes for details.
