Skip to content

Fix duplicate model generation for $ref with nullable#2890

Merged
koxudaxi merged 1 commit intomainfrom
fix/ref-nullable-duplicate-models
Jan 2, 2026
Merged

Fix duplicate model generation for $ref with nullable#2890
koxudaxi merged 1 commit intomainfrom
fix/ref-nullable-duplicate-models

Conversation

@koxudaxi
Copy link
Copy Markdown
Owner

@koxudaxi koxudaxi commented Jan 2, 2026

Fixes: #1792

Summary by CodeRabbit

  • Bug Fixes

    • Improved handling of JSON Schema refs that are paired only with nullable: true to avoid redundant model generation and ensure correct optional typing.
    • Refined behavior when nullable-only refs appear with constraints, metadata, or vendor extras to preserve expected model shapes.
  • Tests

    • Added tests covering nullable-only refs, strict-nullable mode, and interactions with constraints, metadata, and extra properties.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Jan 2, 2026

📝 Walkthrough

Walkthrough

Adds a cached property is_ref_with_nullable_only to JsonSchemaObject and updates parser control flow to short‑circuit merging when a schema uses $ref together only with nullable: true, changing how refs are resolved (including Optional wrapping under strict_nullable) and reducing duplicate model creation.

Changes

Cohort / File(s) Summary
Core Parser Logic
src/datamodel_code_generator/parser/jsonschema.py
Added is_ref_with_nullable_only cached_property to JsonSchemaObject. Adjusted control flow in _handle_allof_root_model_with_constraints, parse_combined_schema, parse_item, and parse_obj to short-circuit merging for $ref + nullable: true and to return referenced type (wrapped in Optional when strict_nullable).
Generated/Expected Outputs
tests/data/expected/main/jsonschema/ref_nullable_only.py, .../ref_nullable_only_strict.py, .../ref_nullable_with_constraint.py, .../ref_nullable_with_extra.py, .../ref_nullable_with_metadata.py
Added/updated expected generated Pydantic model files capturing behaviors for nullable-only refs, strict-nullable handling, refs combined with constraints, vendor extras, and metadata (Field) cases.
Tests
tests/main/jsonschema/test_main_jsonschema.py
Added five integration tests exercising nullable-only $ref handling and variants: test_ref_nullable_only_no_duplicate_model, test_ref_nullable_only_strict_nullable, test_ref_nullable_with_metadata_no_duplicate_model, test_ref_nullable_with_constraint_creates_model, test_ref_nullable_with_extra_creates_model.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested labels

breaking-change-analyzed

Poem

🐇 I sniffed a ref that only wanted "null",
So I skipped the merge and stayed quite small.
One cached twitch, one tidy resolve,
No duplicate models — problems solved! 🥕

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Fix duplicate model generation for $ref with nullable' is directly related to the main objective of the pull request, which addresses duplicate model generation when $ref is combined with nullable attributes.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jan 2, 2026

📚 Docs Preview: https://pr-2890.datamodel-code-generator.pages.dev

@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented Jan 2, 2026

CodSpeed Performance Report

Merging #2890 will not alter performance

Comparing fix/ref-nullable-duplicate-models (0e08667) with main (1d221da)

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

Summary

✅ 11 untouched
⏩ 98 skipped1

Footnotes

  1. 98 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
tests/main/jsonschema/test_main_jsonschema.py (1)

7410-7492: New $ref + nullable tests accurately capture the intended merge vs non-merge behavior

The five tests cleanly cover:

  • nullable-only refs (with and without --strict-nullable) avoiding duplicate models,
  • metadata-only additions (title/description) still reusing the referenced type, and
  • constraint / schema-affecting extras (minLength, if/then/else) correctly forcing a merged model.

Names, docstrings, and expected_file/extra_args combinations all line up with the PR objective of preventing spurious duplicate models while still merging when the schema truly changes. If you ever want to reduce repetition, these could be parameterized over (input, expected, extra_args), but that’s purely optional given the clarity of the current tests.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1d221da and 3220f8a.

⛔ Files ignored due to path filters (4)
  • tests/data/jsonschema/ref_nullable_only.yaml is excluded by !tests/data/**/*.yaml and included by none
  • tests/data/jsonschema/ref_nullable_with_constraint.yaml is excluded by !tests/data/**/*.yaml and included by none
  • tests/data/jsonschema/ref_nullable_with_extra.yaml is excluded by !tests/data/**/*.yaml and included by none
  • tests/data/jsonschema/ref_nullable_with_metadata.yaml is excluded by !tests/data/**/*.yaml and included by none
📒 Files selected for processing (7)
  • src/datamodel_code_generator/parser/jsonschema.py
  • tests/data/expected/main/jsonschema/ref_nullable_only.py
  • tests/data/expected/main/jsonschema/ref_nullable_only_strict.py
  • tests/data/expected/main/jsonschema/ref_nullable_with_constraint.py
  • tests/data/expected/main/jsonschema/ref_nullable_with_extra.py
  • tests/data/expected/main/jsonschema/ref_nullable_with_metadata.py
  • tests/main/jsonschema/test_main_jsonschema.py
🧰 Additional context used
🧬 Code graph analysis (2)
tests/data/expected/main/jsonschema/ref_nullable_with_metadata.py (1)
tests/data/expected/main/jsonschema/ref_nullable_with_extra.py (2)
  • User (18-19)
  • Model (14-15)
tests/data/expected/main/jsonschema/ref_nullable_with_extra.py (2)
src/datamodel_code_generator/model/base.py (1)
  • name (827-829)
tests/data/expected/main/jsonschema/ref_nullable_with_metadata.py (2)
  • Model (14-19)
  • User (10-11)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: 3.10 on macOS
  • GitHub Check: 3.14 on Windows
  • GitHub Check: 3.10 on Windows
  • GitHub Check: 3.11 on macOS
  • GitHub Check: 3.11 on Windows
  • GitHub Check: 3.12 on Windows
  • GitHub Check: 3.13 on Windows
  • GitHub Check: Analyze (python)
  • GitHub Check: benchmarks
🔇 Additional comments (8)
src/datamodel_code_generator/parser/jsonschema.py (3)

480-500: LGTM! Well-designed property to detect nullable-only refs.

The is_ref_with_nullable_only property correctly identifies when a $ref is combined exclusively with nullable: true, excluding metadata-only fields and extension fields (x-*). This enables the parser to avoid unnecessary schema merging and duplicate model generation in such cases.


2724-2729: LGTM! Early resolution for nullable-only refs prevents duplicate models.

The early return correctly handles $ref with only nullable: true by directly resolving the reference and optionally wrapping it as Optional when strict_nullable is enabled. This bypasses unnecessary schema merging and avoids creating duplicate models.


3553-3554: LGTM! Skip merge for nullable-only refs.

The modified condition correctly skips schema merging when is_ref_with_nullable_only is true, delegating to the direct reference handling in parse_item. This is consistent with the overall approach to avoid duplicate model generation.

tests/data/expected/main/jsonschema/ref_nullable_with_metadata.py (1)

1-19: LGTM! Generated output correctly handles $ref with nullable and metadata.

The generated code demonstrates the fix working as intended:

  • The User model is defined once (no duplicate)
  • The Model.user_with_title field correctly uses User | None as the type
  • Metadata fields (description, title) are preserved in the Field annotation
  • This confirms that metadata-only fields don't trigger unnecessary schema merging
tests/data/expected/main/jsonschema/ref_nullable_only.py (1)

1-17: LGTM! Generated output correctly handles multiple nullable refs without duplication.

The generated code demonstrates the core fix:

  • The User model is defined once
  • Multiple nullable references (user_a, user_b, user_c) all use User | None
  • No duplicate models are created for each nullable reference
  • This validates the primary objective of the PR
tests/data/expected/main/jsonschema/ref_nullable_with_constraint.py (1)

1-15: LGTM! Generated output correctly creates separate model for nullable ref with constraint.

The generated code validates that constraints are treated as schema-affecting keywords:

  • StringType is created as a separate RootModel because the ref has a constraint (min_length=5)
  • The Model.constrained_string field correctly uses constr(min_length=5) | None
  • This demonstrates that nullable-only logic doesn't apply when constraints are present
tests/data/expected/main/jsonschema/ref_nullable_with_extra.py (1)

1-19: LGTM! Generated output correctly creates separate models for nullable refs with extras.

The generated code validates that non-metadata extra fields trigger model creation:

  • UserWithExtra and User are created as separate models
  • The Model.user_with_extra field uses UserWithExtra | None
  • This demonstrates that the nullable-only logic correctly excludes refs that have schema-affecting extras
tests/data/expected/main/jsonschema/ref_nullable_only_strict.py (1)

1-17: Strict-nullable expected fixture is consistent and correct

The generated models and nullable User | None = None fields match the described behavior for $ref + nullable under --strict-nullable; nothing to fix here.

@codecov
Copy link
Copy Markdown

codecov Bot commented Jan 2, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.37%. Comparing base (1d221da) to head (0e08667).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2890   +/-   ##
=======================================
  Coverage   99.37%   99.37%           
=======================================
  Files          92       92           
  Lines       16152    16179   +27     
  Branches     1906     1912    +6     
=======================================
+ Hits        16051    16078   +27     
  Misses         52       52           
  Partials       49       49           
Flag Coverage Δ
unittests 99.37% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@koxudaxi koxudaxi force-pushed the fix/ref-nullable-duplicate-models branch from 3220f8a to 0e08667 Compare January 2, 2026 08:17
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3220f8a and 0e08667.

⛔ Files ignored due to path filters (4)
  • tests/data/jsonschema/ref_nullable_only.yaml is excluded by !tests/data/**/*.yaml and included by none
  • tests/data/jsonschema/ref_nullable_with_constraint.yaml is excluded by !tests/data/**/*.yaml and included by none
  • tests/data/jsonschema/ref_nullable_with_extra.yaml is excluded by !tests/data/**/*.yaml and included by none
  • tests/data/jsonschema/ref_nullable_with_metadata.yaml is excluded by !tests/data/**/*.yaml and included by none
📒 Files selected for processing (7)
  • src/datamodel_code_generator/parser/jsonschema.py
  • tests/data/expected/main/jsonschema/ref_nullable_only.py
  • tests/data/expected/main/jsonschema/ref_nullable_only_strict.py
  • tests/data/expected/main/jsonschema/ref_nullable_with_constraint.py
  • tests/data/expected/main/jsonschema/ref_nullable_with_extra.py
  • tests/data/expected/main/jsonschema/ref_nullable_with_metadata.py
  • tests/main/jsonschema/test_main_jsonschema.py
🚧 Files skipped from review as they are similar to previous changes (5)
  • tests/data/expected/main/jsonschema/ref_nullable_with_metadata.py
  • src/datamodel_code_generator/parser/jsonschema.py
  • tests/main/jsonschema/test_main_jsonschema.py
  • tests/data/expected/main/jsonschema/ref_nullable_with_extra.py
  • tests/data/expected/main/jsonschema/ref_nullable_only.py
🧰 Additional context used
🧬 Code graph analysis (2)
tests/data/expected/main/jsonschema/ref_nullable_only_strict.py (1)
tests/data/expected/main/jsonschema/ref_nullable_only.py (2)
  • User (10-11)
  • Model (14-17)
tests/data/expected/main/jsonschema/ref_nullable_with_constraint.py (4)
tests/data/expected/main/jsonschema/ref_nullable_only.py (1)
  • Model (14-17)
tests/data/expected/main/jsonschema/ref_nullable_only_strict.py (1)
  • Model (14-17)
tests/data/expected/main/jsonschema/ref_nullable_with_extra.py (1)
  • Model (14-15)
tests/data/expected/main/jsonschema/ref_nullable_with_metadata.py (1)
  • Model (14-19)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: 3.12 on Windows
  • GitHub Check: 3.10 on Ubuntu
  • GitHub Check: 3.10 on Windows
  • GitHub Check: py312-isort7 on Ubuntu
  • GitHub Check: 3.11 on Windows
  • GitHub Check: 3.11 on Ubuntu
  • GitHub Check: 3.12 on Ubuntu
  • GitHub Check: 3.14 on macOS
  • GitHub Check: 3.13 on Windows
  • GitHub Check: 3.14 on Windows
  • GitHub Check: Analyze (python)
  • GitHub Check: benchmarks
🔇 Additional comments (3)
tests/data/expected/main/jsonschema/ref_nullable_only_strict.py (1)

1-17: LGTM! Test expectation correctly demonstrates the fix.

This expected output file correctly validates the fix for duplicate model generation with $ref + nullable. Key observations:

  • Only one User model is generated (lines 10-11), demonstrating that the fix prevents duplicates
  • All three nullable references (user_a, user_b, user_c) correctly use User | None = None with Pydantic v2 syntax
  • The from __future__ import annotations import (line 5) properly enables the modern union syntax
  • Structure matches the related non-strict version in ref_nullable_only.py

The file correctly represents the expected behavior after applying the is_ref_with_nullable_only optimization mentioned in the AI summary.

tests/data/expected/main/jsonschema/ref_nullable_with_constraint.py (2)

5-7: LGTM!

The imports are correctly set up for Pydantic v2, with future annotations enabled for modern union syntax.


10-11: LGTM!

The Model class correctly demonstrates the combination of a constrained type (constr(min_length=5)) with nullable union syntax, which is the key scenario this PR addresses.

Comment thread tests/data/expected/main/jsonschema/ref_nullable_with_constraint.py
@koxudaxi koxudaxi merged commit 5306aef into main Jan 2, 2026
38 checks passed
@koxudaxi koxudaxi deleted the fix/ref-nullable-duplicate-models branch January 2, 2026 08:29
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jan 2, 2026

Breaking Change Analysis

Result: Breaking changes detected

Reasoning: This PR changes the generated code output for a specific pattern: when a JSON Schema uses $ref with nullable: true. Previously, the generator would create duplicate/merged models (e.g., UserA, UserB) for each property referencing the same type. After this change, the generator reuses the original referenced type directly with an Optional annotation.

While this is a bug fix that produces cleaner output, it is a breaking change because:

  1. Generated class names will differ - users who previously had code referencing UserA or UserB classes will find those classes no longer exist
  2. Import statements may need to change if users were importing the merged model classes
  3. Type annotations in downstream code that referenced the old class names will break

The change only affects the $ref + nullable: true pattern without other schema-affecting keywords (constraints like minLength, or conditional keywords like if/then/else still trigger model merging as before).

Content for Release Notes

Code Generation Changes

  • Different output for $ref with nullable: true - When a JSON Schema property has a $ref combined with only nullable: true (and optionally metadata like title/description), the generator now uses the referenced type directly with Optional annotation instead of creating a new merged model. For example, a schema with multiple properties referencing User with nullable: true will now generate user_a: User | None instead of creating separate UserA, UserB model classes. This is a bug fix that reduces redundant model generation, but existing code that depends on the previously generated class names will break. (Fix duplicate model generation for $ref with nullable #2890)

    Before:

    class UserA(BaseModel):
        name: str
    
    class UserB(BaseModel):
        name: str
    
    class Model(BaseModel):
        user_a: UserA | None = None
        user_b: UserB | None = None

    After:

    class User(BaseModel):
        name: str
    
    class Model(BaseModel):
        user_a: User | None = None
        user_b: User | None = None

This analysis was performed by Claude Code Action

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jan 2, 2026

🎉 Released in 0.52.0

This PR is now available in the latest release. See the release notes for details.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant