
Add model-level json_schema_extra support for Pydantic v2 #2803

Merged
koxudaxi merged 3 commits into main from feature/model-extra-keys-json-schema-extra
Dec 25, 2025

Conversation

@koxudaxi
Owner

@koxudaxi koxudaxi commented Dec 25, 2025

Fixes: #2129

Summary by CodeRabbit

  • New Features

    • Two new CLI options to include model-level schema extensions and to strip an "x-" prefix.
    • Pydantic v2 models can include model-level json_schema_extra metadata.
  • Documentation

    • CLI reference and quick reference updated with the new options, examples, and usage guidance.
  • Tests

    • Added CLI tests for applying extras, stripping the prefix, and no-match scenarios.


@coderabbitai

coderabbitai Bot commented Dec 25, 2025

Warning

Rate limit exceeded

@koxudaxi has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 10 minutes and 15 seconds before requesting another review.


📥 Commits

Reviewing files that changed from the base of the PR and between cab6435 and a0932cd.

📒 Files selected for processing (3)
  • src/datamodel_code_generator/parser/jsonschema.py
  • tests/data/expected/main/jsonschema/model_extras_no_match_v2.py
  • tests/main/jsonschema/test_main_jsonschema.py
Walkthrough

Adds two CLI options, --model-extra-keys and --model-extra-keys-without-x-prefix, and threads them through generate()/Config, parser constructors, and JSON Schema extraction to collect model-level x-* schema extensions into template data and merge them into Pydantic v2 ConfigDict.json_schema_extra in generated models.

Changes

| Cohort / File(s) | Summary |
|---|---|
| CLI docs & quick reference: docs/cli-reference/index.md, docs/cli-reference/quick-reference.md, docs/cli-reference/model-customization.md | Added documentation entries for --model-extra-keys and --model-extra-keys-without-x-prefix; moved index entries (A->M) and updated counts. |
| CLI arg parsing & metadata: src/datamodel_code_generator/arguments.py, src/datamodel_code_generator/cli_options.py, src/datamodel_code_generator/prompt_data.py | Added CLI argument definitions, metadata and human-readable descriptions for the two new options. |
| Public API surface: src/datamodel_code_generator/__init__.py, src/datamodel_code_generator/__main__.py | Extended generate() signature and Config model with model_extra_keys and model_extra_keys_without_x_prefix; forwarded them into generation flow. |
| Parser base & subclass propagation: src/datamodel_code_generator/parser/base.py, src/datamodel_code_generator/parser/jsonschema.py, src/datamodel_code_generator/parser/openapi.py, src/datamodel_code_generator/parser/graphql.py | Added model_extra_keys and model_extra_keys_without_x_prefix params to Parser and parser subclasses; initialized internal sets and forwarded via super() calls. |
| JSON Schema extraction logic: src/datamodel_code_generator/parser/jsonschema.py | In set_schema_extensions(), collect model-level extras from schema extras into extra_template_data[path]["model_extras"], honoring keys and optional x- prefix stripping. |
| Pydantic v2 model assembly: src/datamodel_code_generator/model/pydantic_v2/__init__.py, src/datamodel_code_generator/model/pydantic_v2/base_model.py | Added json_schema_extra: Optional[Dict[str, Any]] to ConfigDict; merge model_extras from template data into final json_schema_extra when assembling model config. |
| Tests & expected outputs: tests/main/jsonschema/test_main_jsonschema.py, tests/data/expected/main/jsonschema/*_v2.py | Added tests for model-level extras (with/without x- prefix and no-match) and expected Pydantic v2 model files demonstrating model_config = ConfigDict(json_schema_extra=...). |
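The extraction step summarized above can be sketched in plain Python. This is a rough illustration only, not the code in set_schema_extensions(): the helper name collect_model_extras, the sample extras dict, and the exact matching rule (keys in the "without prefix" set have a leading x- removed) are assumptions made for the example.

```python
from typing import Any, Optional


def collect_model_extras(
    schema_extras: dict[str, Any],
    model_extra_keys: Optional[set[str]] = None,
    model_extra_keys_without_x_prefix: Optional[set[str]] = None,
) -> dict[str, Any]:
    """Hypothetical helper: pick model-level extension keys out of a schema's extras."""
    keep = model_extra_keys or set()
    strip = model_extra_keys_without_x_prefix or set()
    collected: dict[str, Any] = {}
    for key, value in schema_extras.items():
        if key in strip and key.startswith("x-"):
            # Assumed rule: a key listed in the "without prefix" set loses its "x-".
            collected[key[2:]] = value
        elif key in keep or key in strip:
            # Keys listed without an "x-" prefix to strip are copied unchanged.
            collected[key] = value
    return collected


# Illustrative input; the real fixture (model_extras.json) is not shown in this PR page.
extras = {"x-custom-metadata": {"key1": "value1"}, "x-version": 1, "x-ignored": True}

kept = collect_model_extras(extras, model_extra_keys={"x-custom-metadata"})
stripped = collect_model_extras(
    extras, model_extra_keys_without_x_prefix={"x-custom-metadata", "x-version"}
)
no_match = collect_model_extras(extras, model_extra_keys={"x-missing"})
```

Under these assumptions, `stripped` has the same shape as the json_schema_extra dict in the without-x-prefix expected output, and `no_match` is empty, which corresponds to the no-match test scenario.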

Sequence Diagram

sequenceDiagram
    participant User
    participant CLI as CLI (args)
    participant API as generate()
    participant Config as Config
    participant Parser as JsonSchemaParser
    participant Extract as set_schema_extensions()
    participant Template as extra_template_data
    participant ModelGen as Pydantic v2 assembler

    User->>CLI: pass --model-extra-keys x-custom-metadata
    CLI->>API: include model_extra_keys param
    API->>Config: build Config (model_extra_keys=...)
    Config->>Parser: Parser.__init__(model_extra_keys=...)
    Parser->>Parser: store self.model_extra_keys
    Parser->>Extract: parse schema, call set_schema_extensions()
    Extract->>Template: store matched keys as model_extras
    Template->>ModelGen: supply extra_template_data (model_extras)
    ModelGen->>ModelGen: merge model_extras into ConfigDict.json_schema_extra
    ModelGen-->>User: output model with json_schema_extra

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly Related PRs

Suggested Labels

breaking-change-analyzed

Suggested Reviewers

  • ilovelinux

Poem

🐰 I nibbled through schemas in twilight cheer,

x-* whispers gathered, now held near,
ConfigDict petals open wide,
Model extras hop in, side by side,
A tiny leap — generated delight!

Pre-merge checks and finishing touches

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: adding model-level json_schema_extra support for Pydantic v2, which directly matches the PR objectives and linked issue #2129. |
| Linked Issues check | ✅ Passed | The PR successfully addresses issue #2129 by implementing model-level json_schema_extra support through new CLI options (--model-extra-keys, --model-extra-keys-without-x-prefix) and corresponding API parameters, enabling type-level x-* metadata preservation in generated Pydantic v2 models. |
| Out of Scope Changes check | ✅ Passed | All changes are directly related to implementing model-level json_schema_extra support; no unrelated modifications were introduced outside the stated objectives. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |



@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (3)
src/datamodel_code_generator/model/pydantic_v2/__init__.py (1)

9-9: ConfigDict json_schema_extra wiring looks correct; consider trimming new noqa codes.

The added Dict import and json_schema_extra: Optional[Dict[str, Any]] field are the right way to surface model‑level extensions into ConfigDict for Pydantic v2, and they align with how BaseModel now builds config_parameters["json_schema_extra"].

Ruff is flagging the new # noqa: UP035 (line 9) and # noqa: UP006, UP045 (line 47) as unused because those rules are currently disabled. To keep lint clean, it’s worth either:

  • Dropping these noqa fragments on the new lines, or
  • Enabling the corresponding rules project‑wide if you still want them suppressed here.

Functionally everything looks good.

Also applies to: 47-47

src/datamodel_code_generator/model/pydantic_v2/base_model.py (1)

247-252: Model-level extras merge is correct; double‑check precedence and type assumptions.

This block correctly pulls model_extras from extra_template_data and merges it into config_parameters["json_schema_extra"], which is exactly what’s needed for the new --model-extra-keys* options.

Two minor considerations:

  • Precedence: {**existing, **model_extras} means values from model_extras win on key collisions with any pre‑existing json_schema_extra coming from extra_template_data["config"]. If you ever introduce user‑supplied json_schema_extra via extra_template_data, you may want the user config to override the schema‑derived extras instead, i.e. {**model_extras, **existing}.

  • Type safety: This assumes that any existing json_schema_extra is a dict‑like mapping. If in the future you support callables or other non‑mapping types there, a small guard like if isinstance(existing, dict) before merging would keep this safe.

Right now, for the current use cases, the implementation looks solid.
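To make the two considerations concrete, here is a minimal sketch of the merge with both suggestions applied as options. The function name merge_json_schema_extra and the existing_wins switch are hypothetical and not part of base_model.py; only the {**existing, **model_extras} expression comes from the review above.

```python
from typing import Any


def merge_json_schema_extra(
    config_parameters: dict[str, Any],
    model_extras: dict[str, Any],
    *,
    existing_wins: bool = False,
) -> dict[str, Any]:
    """Hypothetical helper: merge schema-derived extras into json_schema_extra."""
    if not model_extras:
        return config_parameters
    existing = config_parameters.get("json_schema_extra")
    if existing is None:
        merged = dict(model_extras)
    elif isinstance(existing, dict):
        if existing_wins:
            # Suggested alternative: pre-existing config overrides schema extras.
            merged = {**model_extras, **existing}
        else:
            # Precedence described in the review: model_extras win on collisions.
            merged = {**existing, **model_extras}
    else:
        # Guard: a non-mapping value (for example a callable) is left untouched.
        return config_parameters
    return {**config_parameters, "json_schema_extra": merged}


params = {"json_schema_extra": {"version": 0, "title": "t"}}
extras_win = merge_json_schema_extra(params, {"version": 1})
config_wins = merge_json_schema_extra(params, {"version": 1}, existing_wins=True)
```

With the default ordering the colliding `version` key takes the schema-derived value; with `existing_wins=True` the pre-existing value is kept.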

src/datamodel_code_generator/arguments.py (1)

627-639: CLI flags are well-defined; confirm grouping under field_options is intentional.

The new arguments:

field_options.add_argument("--model-extra-keys", ...)
field_options.add_argument("--model-extra-keys-without-x-prefix", ...)

have appropriate names, types, and help text for driving model‑level json_schema_extra. The only slightly surprising bit is that they’re attached to field_options rather than model_options, whereas the docs/CLI metadata treat them as “Model Customization”.

This only affects how they’re grouped in --help output, not behavior. If you prefer them to show up under “Model customization” in the CLI help, consider moving them to model_options; otherwise the current setup is functionally fine.
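The grouping point can be checked with a small standalone argparse sketch. The group titles, the nargs="+" setting, and the help strings here are assumptions for illustration and may differ from what arguments.py defines; the example deliberately puts one flag in each group to show that only the --help layout changes.

```python
import argparse

parser = argparse.ArgumentParser(prog="demo")
# Hypothetical group titles; the real project defines its own groups.
field_options = parser.add_argument_group("Field customization")
model_options = parser.add_argument_group("Model customization")

# Attached to the field group, as in the reviewed code.
field_options.add_argument(
    "--model-extra-keys", nargs="+", default=None, help="model-level keys to keep"
)
# Attached to the model group, as the review suggests as an alternative.
model_options.add_argument(
    "--model-extra-keys-without-x-prefix",
    nargs="+",
    default=None,
    help="model-level keys to keep with the x- prefix removed",
)

args = parser.parse_args(
    [
        "--model-extra-keys", "x-custom-metadata",
        "--model-extra-keys-without-x-prefix", "x-version",
    ]
)
help_text = parser.format_help()
```

Parsing yields the same namespace attributes whichever group an argument belongs to; only the section under which each flag is listed in `help_text` differs.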

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6fe1422 and 8979699.

⛔ Files ignored due to path filters (1)
  • tests/data/jsonschema/model_extras.json is excluded by !tests/data/**/*.json and included by none
📒 Files selected for processing (17)
  • docs/cli-reference/index.md
  • docs/cli-reference/model-customization.md
  • docs/cli-reference/quick-reference.md
  • src/datamodel_code_generator/__init__.py
  • src/datamodel_code_generator/__main__.py
  • src/datamodel_code_generator/arguments.py
  • src/datamodel_code_generator/cli_options.py
  • src/datamodel_code_generator/model/pydantic_v2/__init__.py
  • src/datamodel_code_generator/model/pydantic_v2/base_model.py
  • src/datamodel_code_generator/parser/base.py
  • src/datamodel_code_generator/parser/graphql.py
  • src/datamodel_code_generator/parser/jsonschema.py
  • src/datamodel_code_generator/parser/openapi.py
  • src/datamodel_code_generator/prompt_data.py
  • tests/data/expected/main/jsonschema/model_extras_v2.py
  • tests/data/expected/main/jsonschema/model_extras_without_x_prefix_v2.py
  • tests/main/jsonschema/test_main_jsonschema.py
🧰 Additional context used
🧬 Code graph analysis (4)
tests/data/expected/main/jsonschema/model_extras_v2.py (3)
src/datamodel_code_generator/model/pydantic_v2/base_model.py (1)
  • BaseModel (162-348)
src/datamodel_code_generator/model/pydantic_v2/__init__.py (1)
  • ConfigDict (29-55)
tests/data/expected/main/jsonschema/model_extras_without_x_prefix_v2.py (1)
  • ModelExtras (10-14)
tests/main/jsonschema/test_main_jsonschema.py (2)
tests/main/conftest.py (2)
  • output_file (98-100)
  • run_main_and_assert (244-408)
tests/test_main_kr.py (1)
  • output_file (44-46)
src/datamodel_code_generator/parser/jsonschema.py (1)
src/datamodel_code_generator/model/base.py (1)
  • path (902-904)
tests/data/expected/main/jsonschema/model_extras_without_x_prefix_v2.py (2)
src/datamodel_code_generator/model/pydantic_v2/base_model.py (1)
  • BaseModel (162-348)
src/datamodel_code_generator/model/pydantic_v2/__init__.py (1)
  • ConfigDict (29-55)
🪛 Ruff (0.14.10)
src/datamodel_code_generator/__main__.py

503-503: Unused noqa directive (non-enabled: UP045)

Remove unused noqa directive

(RUF100)


504-504: Unused noqa directive (non-enabled: UP045)

Remove unused noqa directive

(RUF100)

src/datamodel_code_generator/model/pydantic_v2/__init__.py

9-9: Unused noqa directive (non-enabled: UP035)

Remove unused noqa directive

(RUF100)


47-47: Unused noqa directive (non-enabled: UP006, UP045)

Remove unused noqa directive

(RUF100)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: 3.13 on Windows
  • GitHub Check: py312-isort7 on Ubuntu
  • GitHub Check: py312-black24 on Ubuntu
  • GitHub Check: 3.10 on Windows
  • GitHub Check: 3.11 on Windows
  • GitHub Check: py312-black22 on Ubuntu
  • GitHub Check: 3.14 on Windows
  • GitHub Check: 3.11 on Ubuntu
  • GitHub Check: 3.10 on macOS
  • GitHub Check: 3.12 on Windows
  • GitHub Check: benchmarks
🔇 Additional comments (18)
src/datamodel_code_generator/cli_options.py (1)

118-121: New CLI option metadata entries look consistent.

--model-extra-keys and --model-extra-keys-without-x-prefix are correctly added under the MODEL category and match the documented anchors, so they should integrate cleanly with the CLI docs sync tests.

docs/cli-reference/model-customization.md (1)

22-23: New --model-extra-keys* documentation is clear and aligned with behavior.

The additions to the options table and the new sections for:

  • --model-extra-keys
  • --model-extra-keys-without-x-prefix

are consistent with the generated Pydantic v2 examples and the expected test outputs (including the stripping of the x- prefix in the without‑x‑prefix variant). The anchor IDs and cross‑links from quick reference all line up.

No changes needed from my side.

Also applies to: 3334-3443

tests/data/expected/main/jsonschema/model_extras_without_x_prefix_v2.py (1)

1-14: Expected output matches the documented behavior for --model-extra-keys-without-x-prefix.

The generated ModelExtras class correctly uses:

model_config = ConfigDict(
    json_schema_extra={'custom-metadata': {'key1': 'value1'}, 'version': 1},
)

which aligns with both the CLI flag semantics and the docs. This is a good regression fixture for the new feature.

docs/cli-reference/quick-reference.md (1)

98-99: Quick reference updates are consistent with the new model extras options.

The new entries for --model-extra-keys and --model-extra-keys-without-x-prefix are correctly:

  • Placed under “Model Customization” in the category table.
  • Included in the alphabetical index with matching anchors and concise descriptions.

This keeps the quick reference in sync with both the CLI metadata and the detailed model-customization docs.

Also applies to: 249-250

src/datamodel_code_generator/__main__.py (2)

838-839: Config → generate wiring for model extras looks consistent

model_extra_keys and model_extra_keys_without_x_prefix are passed through to generate() alongside the existing field-level options; the plumbing here looks correct.


503-504: Remove unused # noqa: UP045 on new config fields

Ruff reports these noqa pragmas as unused; UP045 isn't enabled, so the comments are redundant and can be dropped.

Suggested diff
-    model_extra_keys: Optional[set[str]] = None  # noqa: UP045
-    model_extra_keys_without_x_prefix: Optional[set[str]] = None  # noqa: UP045
+    model_extra_keys: Optional[set[str]] = None
+    model_extra_keys_without_x_prefix: Optional[set[str]] = None
⛔ Skipped due to learnings
Learnt from: koxudaxi
Repo: koxudaxi/datamodel-code-generator PR: 2799
File: src/datamodel_code_generator/model/pydantic/__init__.py:43-43
Timestamp: 2025-12-25T09:22:14.661Z
Learning: In datamodel-code-generator project, defensive `# noqa: PLC0415` directives should be kept on lazy imports (imports inside functions/methods) even when Ruff reports them as unused via RUF100, to prepare for potential future Ruff configuration changes that might enable the import-outside-top-level rule.
Learnt from: koxudaxi
Repo: koxudaxi/datamodel-code-generator PR: 2681
File: tests/cli_doc/test_cli_doc_coverage.py:82-82
Timestamp: 2025-12-18T13:43:16.235Z
Learning: In datamodel-code-generator project, Ruff preview mode is enabled via `lint.preview = true` in pyproject.toml. This enables preview rules like PLR6301 (no-self-use), so `noqa: PLR6301` directives are necessary and should not be removed even if RUF100 suggests they are unused.
src/datamodel_code_generator/prompt_data.py (1)

66-67: New prompt descriptions for model-level extras are consistent

The added descriptions for --model-extra-keys and --model-extra-keys-without-x-prefix accurately reflect the behavior and align with the rest of OPTION_DESCRIPTIONS.

tests/data/expected/main/jsonschema/model_extras_v2.py (1)

1-14: Expected Pydantic v2 output for model-level extras looks correct

The generated model correctly places the x-custom-metadata payload under ConfigDict(json_schema_extra=...), matching the intended model-level extras behavior.

src/datamodel_code_generator/parser/openapi.py (1)

242-243: OpenAPIParser correctly exposes and forwards model-level extras configuration

The new model_extra_keys and model_extra_keys_without_x_prefix parameters are typed and defaulted consistently with the field-level options and are correctly forwarded to the base JsonSchemaParser/Parser via keyword arguments.

Also applies to: 359-360

docs/cli-reference/index.md (1)

14-14: CLI index updates align with new model extras options

The Model Customization count and the “M” section now include --model-extra-keys and --model-extra-keys-without-x-prefix, with --module-split-mode correctly grouped there; the index remains alphabetically ordered and consistent with the new feature.

Also applies to: 116-118

src/datamodel_code_generator/parser/base.py (1)

738-740: Parser now cleanly supports model-level extras configuration

Adding model_extra_keys and model_extra_keys_without_x_prefix to the constructor and storing them as sets mirrors the existing field-level options and provides a consistent configuration surface for downstream parsers.

Also applies to: 856-857
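A simplified picture of this constructor pattern, assuming the optional arguments are normalized to empty sets when omitted. The class names ParserSketch and JsonSchemaParserSketch are stand-ins invented for this example and do not reproduce the real Parser signature, which takes many more parameters.

```python
from typing import Optional


class ParserSketch:
    """Stand-in for the base parser; only the two new options are modeled."""

    def __init__(
        self,
        *,
        model_extra_keys: Optional[set[str]] = None,
        model_extra_keys_without_x_prefix: Optional[set[str]] = None,
    ) -> None:
        # Assumed normalization: None becomes an empty set, so later membership
        # checks need no None handling.
        self.model_extra_keys: set[str] = model_extra_keys or set()
        self.model_extra_keys_without_x_prefix: set[str] = (
            model_extra_keys_without_x_prefix or set()
        )


class JsonSchemaParserSketch(ParserSketch):
    """Stand-in subclass that forwards the options through super()."""

    def __init__(
        self,
        *,
        model_extra_keys: Optional[set[str]] = None,
        model_extra_keys_without_x_prefix: Optional[set[str]] = None,
    ) -> None:
        super().__init__(
            model_extra_keys=model_extra_keys,
            model_extra_keys_without_x_prefix=model_extra_keys_without_x_prefix,
        )


default_parser = JsonSchemaParserSketch()
configured_parser = JsonSchemaParserSketch(model_extra_keys={"x-custom-metadata"})
```

Because both options default to None and collapse to empty sets in this sketch, a parser built without them collects nothing, which matches the additive, opt-in behavior the PR describes.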

tests/main/jsonschema/test_main_jsonschema.py (2)

6371-6406: Model-level extras test wiring looks solid

The parametrization, CLI doc metadata, and run_main_and_assert call for test_main_jsonschema_model_extras are consistent with existing field‑extras tests and match the new --model-extra-keys flag semantics. Limiting coverage to pydantic_v2.BaseModel aligns with the Pydantic v2–only feature surface.


6410-6447: Good coverage for prefix-stripping behavior

test_main_jsonschema_model_extras_without_x_prefix correctly exercises the --model-extra-keys-without-x-prefix option with multiple keys and validates the dedicated golden output. The CLI doc marker and model_outputs mapping also follow the established pattern used for field-level extras.

src/datamodel_code_generator/parser/graphql.py (1)

160-162: New model_extra_keys parameters are correctly threaded through

The added model_extra_keys and model_extra_keys_without_x_prefix kwargs are typed, defaulted, and forwarded to Parser.__init__ consistently with existing field extras, so GraphQLParser stays API‑compatible with the core parser configuration.

Also applies to: 275-276

src/datamodel_code_generator/__init__.py (2)

436-437: LGTM!

The new parameters follow the established pattern of field_extra_keys and field_extra_keys_without_x_prefix, maintaining API consistency.


726-727: LGTM!

The new parameters are correctly passed to the parser, consistent with how field-level extra keys are handled.

src/datamodel_code_generator/parser/jsonschema.py (2)

593-594: LGTM!

The new parameters follow the established pattern of field-level extra keys.


707-708: LGTM!

Correct propagation to the base parser.

Comment thread src/datamodel_code_generator/parser/jsonschema.py
@codspeed-hq

codspeed-hq Bot commented Dec 25, 2025

CodSpeed Performance Report

Merging #2803 will not alter performance

Comparing feature/model-extra-keys-json-schema-extra (a0932cd) with main (6fe1422)

Summary

✅ 73 untouched
⏩ 10 skipped [1]

Footnotes

  1. 10 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@codecov

codecov Bot commented Dec 25, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.51%. Comparing base (767714f) to head (a0932cd).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2803   +/-   ##
=======================================
  Coverage   99.51%   99.51%           
=======================================
  Files          89       89           
  Lines       13715    13745   +30     
  Branches     1613     1619    +6     
=======================================
+ Hits        13648    13678   +30     
  Misses         36       36           
  Partials       31       31           
Flag Coverage Δ
unittests 99.51% <100.00%> (+<0.01%) ⬆️
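The headline percentages can be reproduced from the Hits and Lines rows of the diff above. This quick check assumes coverage is simply hits divided by total lines; Codecov's exact formula is not stated on this page.

```python
def coverage_percent(hits: int, lines: int) -> float:
    """Coverage as a percentage, assuming hits / lines."""
    return 100.0 * hits / lines


# Figures taken from the Coverage Diff table above.
base = coverage_percent(13648, 13715)  # main
head = coverage_percent(13678, 13745)  # this PR
delta = head - base

# Misses and partials are unchanged, so hits plus misses plus partials
# equals the line count on both sides.
base_total = 13648 + 36 + 31
head_total = 13678 + 36 + 31
```

Both values round to 99.51%, and the difference is positive but below 0.01 percentage points, consistent with the (+<0.01%) shown for the unittests flag.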


@koxudaxi koxudaxi force-pushed the feature/model-extra-keys-json-schema-extra branch from cab6435 to a0932cd Compare December 25, 2025 16:56
@koxudaxi koxudaxi merged commit 355a344 into main Dec 25, 2025
38 checks passed
@koxudaxi koxudaxi deleted the feature/model-extra-keys-json-schema-extra branch December 25, 2025 17:12
@github-actions
Contributor

Breaking Change Analysis

Result: No breaking changes detected

Reasoning: This PR adds two new opt-in CLI options (--model-extra-keys and --model-extra-keys-without-x-prefix) and corresponding Python API parameters for adding model-level schema extensions to Pydantic v2's ConfigDict json_schema_extra. All changes are purely additive: new parameters default to None, existing behavior is unchanged when the new flags aren't used, and custom templates continue to work since the new model_extras data is handled internally. There are no changes to default behavior, no removal of existing functionality, and no modifications that would affect existing users' generated code.


This analysis was performed by Claude Code Action


Development

Successfully merging this pull request may close these issues.

Class level json_schema_extra

1 participant