Add schema version detection and feature flags #2924
Conversation
📝 Walkthrough
Added version enums and a feature-mode enum; implemented schema/OpenAPI version detection and feature dataclasses; exposed the enums and detection functions via lazy imports in the package init; updated the JSON Schema parser to use injectable data-format mappings; added comprehensive tests for detection, features, and the public API.
Sequence Diagram
sequenceDiagram
participant Client
participant Init as Package __init__.py
participant Detector as schema_version
participant Features as JsonSchemaFeatures / OpenAPISchemaFeatures
Client->>Init: import detect_jsonschema_version / enums (lazy)
Init-->>Client: provide lazy-access handles
Client->>Detector: detect_jsonschema_version(data) / detect_openapi_version(data)
rect rgb(235,245,255)
Note over Detector: inspect fields like "$schema", "$defs", "definitions", "openapi"\napply regex/heuristics → resolve enum
Detector->>Detector: parse & map to JsonSchemaVersion / OpenAPIVersion
end
Detector-->>Client: return version enum
Client->>Features: call .from_version(enum) / .from_openapi_version(enum)
rect rgb(235,255,235)
Note over Features: map enum → feature flags (null_in_type_array, prefix_items, nullable_keyword, etc.)
end
Features-->>Client: return feature dataclass
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
📚 Docs Preview: https://pr-2924.datamodel-code-generator.pages.dev
CodSpeed Performance Report
Merging #2924 will not alter performance
Actionable comments posted: 0
🧹 Nitpick comments (2)
src/datamodel_code_generator/parser/schema_version.py (1)
137-178: LGTM! Sound version detection logic with sensible fallbacks.
The detection priority is well-designed:
- Explicit $schema field (most reliable)
- Heuristic detection via $defs/prefixItems/definitions presence
- Fallback to Draft7 (most widely used)
The isinstance(schema_url, str) check prevents crashes on malformed schemas.
🔎 Optional: Remove unused noqa directive
Based on Ruff static analysis, the noqa: PLR0911 directive at line 137 is unused (the rule is not enabled in your configuration):
-def detect_jsonschema_version(data: dict[str, Any]) -> JsonSchemaVersion:  # noqa: PLR0911
+def detect_jsonschema_version(data: dict[str, Any]) -> JsonSchemaVersion:
src/datamodel_code_generator/__init__.py (1)
952-992: LGTM! Public API exports updated correctly.
The new enums and detection functions are properly added to __all__ in alphabetical order, maintaining consistency with existing exports.
🔎 Optional: Remove unused noqa directives
Ruff reports that the # noqa: F822 directives at lines 987-989 are unused. If these are not needed for other linters (mypy, flake8), consider removing them for consistency:
-    "clear_dynamic_models_cache",  # noqa: F822
+    "clear_dynamic_models_cache",
-    "detect_jsonschema_version",  # noqa: F822
+    "detect_jsonschema_version",
-    "detect_openapi_version",  # noqa: F822
+    "detect_openapi_version",
Note: Line 991 (generate_dynamic_models) also has the same unused directive, suggesting this might be a codebase-wide pattern that could be cleaned up separately.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
src/datamodel_code_generator/__init__.py
src/datamodel_code_generator/enums.py
src/datamodel_code_generator/parser/schema_version.py
tests/parser/test_schema_version.py
🧰 Additional context used
🧬 Code graph analysis (4)
src/datamodel_code_generator/__init__.py (1)
src/datamodel_code_generator/enums.py (3)
JsonSchemaVersion(243-254)
OpenAPIVersion(257-266)
VersionMode(269-277)
tests/parser/test_schema_version.py (2)
src/datamodel_code_generator/enums.py (3)
JsonSchemaVersion(243-254)
OpenAPIVersion(257-266)
VersionMode(269-277)
src/datamodel_code_generator/parser/schema_version.py (6)
JsonSchemaFeatures(16-76)
OpenAPISchemaFeatures(80-130)
detect_jsonschema_version(137-178)
detect_openapi_version(181-199)
from_version(39-76)
from_openapi_version(94-130)
src/datamodel_code_generator/parser/schema_version.py (1)
src/datamodel_code_generator/enums.py (2)
JsonSchemaVersion(243-254)
OpenAPIVersion(257-266)
src/datamodel_code_generator/enums.py (1)
src/datamodel_code_generator/model/enum.py (1)
Enum(39-121)
🪛 Ruff (0.14.10)
src/datamodel_code_generator/__init__.py
987-987: Unused noqa directive (unused: F822)
Remove unused noqa directive
(RUF100)
988-988: Unused noqa directive (unused: F822)
Remove unused noqa directive
(RUF100)
989-989: Unused noqa directive (unused: F822)
Remove unused noqa directive
(RUF100)
src/datamodel_code_generator/parser/schema_version.py
137-137: Unused noqa directive (non-enabled: PLR0911)
Remove unused noqa directive
(RUF100)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: 3.11 on Windows
- GitHub Check: 3.14 on Windows
- GitHub Check: 3.13 on Windows
- GitHub Check: 3.10 on Windows
- GitHub Check: 3.14 on macOS
- GitHub Check: 3.12 on Windows
- GitHub Check: 3.13 on macOS
- GitHub Check: 3.11 on macOS
- GitHub Check: Analyze (python)
- GitHub Check: benchmarks
🔇 Additional comments (11)
src/datamodel_code_generator/enums.py (4)
243-254: LGTM! Well-defined JSON Schema version enum.
The enum values correctly represent JSON Schema draft versions with appropriate string values, and the Auto option provides sensible default behavior for automatic detection.
257-266: LGTM! Well-defined OpenAPI version enum.
The enum values correctly represent OpenAPI/Swagger specification versions, and the Auto option enables automatic version detection.
269-277: LGTM! Clear validation mode enum.
The Lenient and Strict modes provide clear semantic meaning for schema validation behavior, with sensible defaults.
280-307: LGTM! Public API exports are correctly updated.
The new enums are properly added to __all__ in alphabetical order, maintaining consistency with the existing export list.
tests/parser/test_schema_version.py (1)
1-248: LGTM! Excellent comprehensive test coverage.
The test suite thoroughly validates:
- JSON Schema and OpenAPI version detection with explicit $schema/openapi fields
- Heuristic-based detection using $defs, definitions, and prefixItems
- Fallback behavior for missing or invalid version indicators
- Feature flag correctness across all supported versions
- Immutability of feature dataclasses
- Lazy import exposure through the public API
This provides strong confidence in the correctness of the version detection and feature flag logic.
src/datamodel_code_generator/parser/schema_version.py (4)
15-76: LGTM! Accurate feature flag mappings for JSON Schema versions.
The feature flags correctly represent the evolution of JSON Schema specifications:
- Draft 4: Basic schema support with "id" field
- Draft 6/7: Added boolean schemas, switched to "$id"
- Draft 2019-09: Introduced $defs replacement for definitions
- Draft 2020-12: Added prefixItems and null in type arrays
The Auto fallback to latest features is appropriate for lenient mode operation.
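The mapping described in this comment can be sketched as a frozen dataclass with a version-keyed factory. This is only an illustration, not the PR's code: the enum member names and values, and the field names defs_keyword, boolean_schemas, and id_field, are assumptions made for the example (only null_in_type_array and prefix_items appear in the walkthrough), and the flag values simply mirror the bullet list above.

```python
from dataclasses import FrozenInstanceError, dataclass
from enum import Enum


class JsonSchemaVersion(Enum):
    """Stand-in enum; member names and values are assumed for this sketch."""

    Draft4 = "draft-04"
    Draft6 = "draft-06"
    Draft7 = "draft-07"
    Draft201909 = "2019-09"
    Draft202012 = "2020-12"
    Auto = "auto"


@dataclass(frozen=True)
class JsonSchemaFeatures:
    """Immutable feature flags; field names are illustrative only."""

    null_in_type_array: bool
    prefix_items: bool
    defs_keyword: bool  # hypothetical: "$defs" replaces "definitions"
    boolean_schemas: bool  # hypothetical
    id_field: str  # hypothetical: "id" or "$id"

    @classmethod
    def from_version(cls, version: JsonSchemaVersion) -> "JsonSchemaFeatures":
        latest = cls(True, True, True, True, "$id")
        older = {
            JsonSchemaVersion.Draft4: cls(False, False, False, False, "id"),
            JsonSchemaVersion.Draft6: cls(False, False, False, True, "$id"),
            JsonSchemaVersion.Draft7: cls(False, False, False, True, "$id"),
            JsonSchemaVersion.Draft201909: cls(False, False, True, True, "$id"),
        }
        # Auto and any unlisted version fall back to the latest feature set.
        return older.get(version, latest)
```

Because the dataclass is frozen, assigning to a field of a returned instance raises FrozenInstanceError, which is what makes the flags safe to share between parser instances.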
79-130: LGTM! Correct OpenAPI feature mappings with proper JSON Schema alignment.
The implementation correctly maps OpenAPI versions to their corresponding JSON Schema capabilities:
- OpenAPI 2.0 → JSON Schema Draft 4 semantics
- OpenAPI 3.0 → Uses nullable keyword, does NOT support boolean schemas
- OpenAPI 3.1 → Full JSON Schema 2020-12 compatibility
The inline comment at line 109 helpfully clarifies the boolean schema limitation in OpenAPI 3.0.
181-199: LGTM! Straightforward OpenAPI version detection.
The detection correctly handles both openapi and swagger fields, uses prefix matching to handle patch versions (e.g., "3.0.3"), and has a sensible fallback to V31 for maximum JSON Schema compatibility.
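A minimal sketch of the behavior described here, assuming an enum with V20, V30, and V31 members (V31 is the only member name this thread confirms) and assuming that a swagger value starting with "2." maps to V20. It is not the PR's implementation, only an illustration of prefix matching with a fallback.

```python
from enum import Enum
from typing import Any


class OpenAPIVersion(Enum):
    """Stand-in enum; V20 and V30 and all values are assumed for this sketch."""

    V20 = "2.0"
    V30 = "3.0"
    V31 = "3.1"


def detect_openapi_version(data: dict[str, Any]) -> OpenAPIVersion:
    """Detect the OpenAPI version from the top-level version fields."""
    swagger = data.get("swagger")
    if isinstance(swagger, str) and swagger.startswith("2."):
        return OpenAPIVersion.V20

    version = data.get("openapi")
    if isinstance(version, str):
        # Prefix matching so that patch releases such as "3.0.3" resolve.
        if version.startswith("3.0"):
            return OpenAPIVersion.V30
        if version.startswith("3.1"):
            return OpenAPIVersion.V31

    # Missing or unrecognized field: fall back to the newest version.
    return OpenAPIVersion.V31
```

The isinstance checks mean a YAML document that parses the version as a float (for example openapi: 3.0 without quotes) falls through to the default rather than raising.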
202-208: LGTM! Public API exports are complete.
All new classes, type variables, and detection functions are properly exposed via __all__.
src/datamodel_code_generator/__init__.py (2)
30-55: LGTM! New enums properly imported.
The three new enums (JsonSchemaVersion, OpenAPIVersion, VersionMode) are correctly imported from the enums module in alphabetical order.
934-939: LGTM! Lazy imports configured correctly.
The detection functions are properly registered in _LAZY_IMPORTS to enable deferred loading via __getattr__, improving startup performance.
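For readers unfamiliar with the pattern, deferred loading through a module-level __getattr__ (PEP 562) can be sketched as below. The helper make_lazy_getattr and the shape of the mapping (attribute name to module path) are assumptions for this example; the project's actual _LAZY_IMPORTS structure is not shown in this thread. A stdlib target is used so the sketch runs on its own.

```python
import importlib
from typing import Any, Callable


def make_lazy_getattr(
    module_name: str, lazy_imports: dict[str, str]
) -> Callable[[str], Any]:
    """Build a module-level __getattr__ that imports attributes on first use."""

    def __getattr__(name: str) -> Any:
        if name in lazy_imports:
            # The import happens only when the attribute is first requested.
            module = importlib.import_module(lazy_imports[name])
            return getattr(module, name)
        raise AttributeError(f"module {module_name!r} has no attribute {name!r}")

    return __getattr__


# Hypothetical mapping: attribute name -> module that defines it.
_LAZY_IMPORTS = {"dumps": "json"}

# In a package __init__.py this would be assigned to the name __getattr__.
lazy = make_lazy_getattr("demo_package", _LAZY_IMPORTS)
```

Python only calls a module's __getattr__ when normal attribute lookup fails, so eagerly imported names such as the enums are unaffected by this hook.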
846d77c to b1d78c6
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## main #2924 +/- ##
==========================================
Coverage 100.00% 100.00%
==========================================
Files 92 94 +2
Lines 16969 17114 +145
Branches 1976 1988 +12
==========================================
+ Hits 16969 17114 +145
Actionable comments posted: 0
🧹 Nitpick comments (2)
src/datamodel_code_generator/parser/schema_version.py (1)
164-178: Verify the default return value when openapi field is missing.
The function returns OpenAPIVersion.V31 by default when the openapi field is missing or doesn't match known patterns. While this provides forward compatibility, it might mask invalid OpenAPI documents that lack the required openapi field. Consider whether this should return a default value or raise an error for missing/invalid openapi fields.
src/datamodel_code_generator/__init__.py (1)
987-989: Clean up unused noqa directives.
The # noqa: F822 comments on lines 987-989 are no longer needed according to Ruff. These were likely added for older linters but are now unnecessary.
🔎 Proposed fix
-    "VersionMode",
-    "clear_dynamic_models_cache",  # noqa: F822
-    "detect_jsonschema_version",  # noqa: F822
-    "detect_openapi_version",  # noqa: F822
+    "VersionMode",
+    "clear_dynamic_models_cache",
+    "detect_jsonschema_version",
+    "detect_openapi_version",
Note: Line 991 "generate_dynamic_models" also has an unused noqa: F822 that should be cleaned up as part of this fix (though it's not shown in the changed lines).
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
src/datamodel_code_generator/__init__.py
src/datamodel_code_generator/enums.py
src/datamodel_code_generator/parser/schema_version.py
tests/parser/test_schema_version.py
🚧 Files skipped from review as they are similar to previous changes (2)
- src/datamodel_code_generator/enums.py
- tests/parser/test_schema_version.py
🧰 Additional context used
🧬 Code graph analysis (2)
src/datamodel_code_generator/parser/schema_version.py (1)
src/datamodel_code_generator/enums.py (2)
JsonSchemaVersion(243-254)
OpenAPIVersion(257-265)
src/datamodel_code_generator/__init__.py (1)
src/datamodel_code_generator/enums.py (3)
JsonSchemaVersion(243-254)
OpenAPIVersion(257-265)
VersionMode(268-276)
🪛 Ruff (0.14.10)
src/datamodel_code_generator/__init__.py
987-987: Unused noqa directive (unused: F822)
Remove unused noqa directive
(RUF100)
988-988: Unused noqa directive (unused: F822)
Remove unused noqa directive
(RUF100)
989-989: Unused noqa directive (unused: F822)
Remove unused noqa directive
(RUF100)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: py312-pydantic1 on Ubuntu
- GitHub Check: 3.13 on Windows
- GitHub Check: 3.12 on Windows
- GitHub Check: py312-isort6 on Ubuntu
- GitHub Check: 3.10 on Ubuntu
- GitHub Check: 3.10 on macOS
- GitHub Check: 3.14 on Windows
- GitHub Check: 3.11 on Windows
- GitHub Check: 3.10 on Windows
- GitHub Check: Analyze (python)
- GitHub Check: benchmarks
🔇 Additional comments (8)
src/datamodel_code_generator/parser/schema_version.py (5)
15-37: LGTM!
The JsonSchemaFeatures dataclass is well-structured with clear documentation. Using a frozen dataclass for immutable feature flags is an appropriate design choice.
38-77: LGTM!
The factory method correctly maps JSON Schema versions to their corresponding feature flags. The catch-all case defaulting to Draft 2020-12 features provides sensible forward compatibility for Auto detection and newer drafts.
80-92: LGTM!
The OpenAPISchemaFeatures properly extends JsonSchemaFeatures with OpenAPI-specific attributes while maintaining the frozen dataclass pattern.
94-119: LGTM!
The factory method correctly maps OpenAPI versions to feature flags. The catch-all case defaulting to V31 features is appropriate for Auto detection and provides forward compatibility.
135-161: LGTM!
The detection logic is well-structured with clear priority: explicit $schema declaration, then heuristics, then a sensible fallback to Draft 7 (the most widely adopted version). The heuristics correctly distinguish between newer drafts using $defs and older ones using definitions.
src/datamodel_code_generator/__init__.py (3)
46-54: LGTM!
The new enum imports (JsonSchemaVersion, OpenAPIVersion, VersionMode) are correctly added and maintain alphabetical ordering within the import statement.
936-937: LGTM!
The lazy import entries for the detection functions are correctly configured and follow the established pattern used by other lazy imports in this module.
974-991: LGTM!
The new exports are correctly added to __all__ with proper alphabetical ordering. The enum exports (JsonSchemaVersion, OpenAPIVersion, VersionMode) are direct imports, while the detection functions are lazy imports handled by __getattr__, which is the correct pattern for this module.
* Add format registry with separation of OpenAPI-specific formats
* Use snapshot for full format comparison in tests
* Integrate format registry with parsers (backward compatible)
Actionable comments posted: 0
🧹 Nitpick comments (2)
src/datamodel_code_generator/parser/schema_version.py (2)
188-246: Remove unused noqa directive.
Ruff reports that the # noqa: PLC0415 directive is unused (the rule is not enabled in your configuration).
🔎 Proposed fix
 def _get_common_data_formats() -> DataFormatMapping:
     """Get common data formats valid for both JsonSchema and OpenAPI."""
-    from datamodel_code_generator.types import Types  # noqa: PLC0415
+    from datamodel_code_generator.types import Types
249-258: Remove unused noqa directive.
Similar to the previous function, the # noqa: PLC0415 directive is unused.
🔎 Proposed fix
 def _get_openapi_only_formats() -> DataFormatMapping:
     """Get formats specific to OpenAPI (not valid in pure JsonSchema)."""
-    from datamodel_code_generator.types import Types  # noqa: PLC0415
+    from datamodel_code_generator.types import Types
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
src/datamodel_code_generator/parser/jsonschema.py
src/datamodel_code_generator/parser/schema_version.py
tests/parser/test_schema_version.py
🧰 Additional context used
🧬 Code graph analysis (3)
src/datamodel_code_generator/parser/jsonschema.py (1)
src/datamodel_code_generator/types.py (1)
Types(955-994)
src/datamodel_code_generator/parser/schema_version.py (3)
src/datamodel_code_generator/enums.py (2)
JsonSchemaVersion(243-254)
OpenAPIVersion(257-265)
src/datamodel_code_generator/types.py (1)
Types(955-994)
src/datamodel_code_generator/__init__.py (1)
is_openapi(264-266)
tests/parser/test_schema_version.py (4)
src/datamodel_code_generator/enums.py (3)
JsonSchemaVersion(243-254)
OpenAPIVersion(257-265)
VersionMode(268-276)
src/datamodel_code_generator/parser/schema_version.py (7)
JsonSchemaFeatures(20-81)
OpenAPISchemaFeatures(85-123)
detect_jsonschema_version(139-165)
detect_openapi_version(168-182)
from_version(43-81)
from_openapi_version(99-123)
get_data_formats(261-277)
src/datamodel_code_generator/types.py (1)
Types(955-994)
src/datamodel_code_generator/__init__.py (1)
is_openapi(264-266)
🪛 Ruff (0.14.10)
src/datamodel_code_generator/parser/schema_version.py
190-190: Unused noqa directive (non-enabled: PLC0415)
Remove unused noqa directive
(RUF100)
251-251: Unused noqa directive (non-enabled: PLC0415)
Remove unused noqa directive
(RUF100)
🔇 Additional comments (9)
src/datamodel_code_generator/parser/jsonschema.py (3)
512-527: LGTM! Clean parameter injection pattern.
The optional data_formats parameter enables future customization of type/format mappings while maintaining backward compatibility by defaulting to the global json_schema_data_formats. The logic correctly handles the fallback to default formats and issues appropriate warnings for unknown formats.
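The injection pattern praised here can be illustrated with a small standalone function. Everything below is a sketch under stated assumptions: get_type, DEFAULT_DATA_FORMATS, and the string values are hypothetical stand-ins (the real parser maps to a Types enum and uses the global json_schema_data_formats), and the warning text is invented for the example.

```python
import warnings
from typing import Mapping, Optional

DataFormatMapping = Mapping[str, Mapping[str, str]]

# Hypothetical stand-in for the module-level default mapping.
DEFAULT_DATA_FORMATS: DataFormatMapping = {
    "string": {"default": "string", "date-time": "date_time", "uuid": "uuid"},
    "integer": {"default": "integer", "int64": "int64"},
}


def get_type(
    type_: str,
    format_: Optional[str] = None,
    data_formats: Optional[DataFormatMapping] = None,
) -> str:
    """Resolve a (type, format) pair, using an injected mapping if given."""
    # Callers that pass nothing keep the existing default behavior.
    formats = DEFAULT_DATA_FORMATS if data_formats is None else data_formats
    by_format = formats[type_]
    if format_ is None:
        return by_format["default"]
    if format_ not in by_format:
        warnings.warn(
            f"format {format_!r} not understood for type {type_!r}; using default",
            stacklevel=2,
        )
        return by_format["default"]
    return by_format[format_]
```

Passing None as the sentinel, rather than the mapping itself as a default argument, keeps the default lookup late-bound, so a test or subclass can swap the mapping without touching every call site.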
741-748: Good extensibility pattern for future format separation.
The _data_formats cached property currently returns the global mapping but provides a clean extension point for future separation of JSON Schema and OpenAPI formats in strict mode, as noted in the docstring.
750-765: LGTM! Consistent use of injected data formats.
The method correctly obtains data formats from self._data_formats and propagates them through to _get_type, ensuring custom mappings are consistently applied throughout the type resolution pipeline.
tests/parser/test_schema_version.py (1)
1-401: Excellent comprehensive test coverage!
The test suite thoroughly validates:
- Version detection for all JSON Schema drafts and OpenAPI versions
- Heuristic fallbacks and edge cases (non-string values, missing fields)
- Feature flag correctness across versions
- Immutability of frozen dataclasses
- Inheritance relationships between feature classes
- Lazy import behavior and public API exposure
- Data format mappings for both JSON Schema and OpenAPI
The use of inline snapshots makes test expectations explicit and maintainable.
src/datamodel_code_generator/parser/schema_version.py (5)
42-81: LGTM! Well-structured feature factory method.
The match/case statement correctly maps each JSON Schema version to its corresponding feature flags. The default case handles Auto and future versions by returning the latest (Draft 2020-12) features.
Note: The past review comment about "mixed implicit and explicit returns" appears to be a false positive from static analysis, as all branches explicitly return.
98-123: LGTM! Correct OpenAPI feature mapping.
The method correctly maps OpenAPI 3.0 to its specific feature set (with nullable_keyword=True) and defaults other versions to 3.1 behavior (JSON Schema 2020-12 alignment).
139-165: Robust version detection with sensible fallbacks.
The detection priority is well-designed:
- Explicit $schema field (most authoritative)
- Heuristics ($defs vs definitions)
- Fallback to Draft 7 (most widely used)
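The three-step priority above can be sketched as follows. This is an illustration rather than the PR's code: the enum members, the substring markers matched against $schema, and the specific versions chosen for the $defs and definitions heuristics (2020-12 and Draft 7) are assumptions for the example.

```python
from enum import Enum
from typing import Any


class JsonSchemaVersion(Enum):
    """Stand-in enum; member names and values are assumed for this sketch."""

    Draft4 = "draft-04"
    Draft6 = "draft-06"
    Draft7 = "draft-07"
    Draft201909 = "2019-09"
    Draft202012 = "2020-12"


# Substrings that identify each draft inside a "$schema" URL (assumed).
_SCHEMA_URL_MARKERS = (
    ("draft-04", JsonSchemaVersion.Draft4),
    ("draft-06", JsonSchemaVersion.Draft6),
    ("draft-07", JsonSchemaVersion.Draft7),
    ("2019-09", JsonSchemaVersion.Draft201909),
    ("2020-12", JsonSchemaVersion.Draft202012),
)


def detect_jsonschema_version(data: dict[str, Any]) -> JsonSchemaVersion:
    """Detect the draft: explicit $schema, then heuristics, then Draft 7."""
    schema_url = data.get("$schema")
    # 1. An explicit "$schema" is the most authoritative signal.
    if isinstance(schema_url, str):
        for marker, version in _SCHEMA_URL_MARKERS:
            if marker in schema_url:
                return version
    # 2. Heuristics: "$defs" suggests a newer draft, "definitions" an older one.
    if "$defs" in data:
        return JsonSchemaVersion.Draft202012
    if "definitions" in data:
        return JsonSchemaVersion.Draft7
    # 3. Fallback for schemas that give no hint.
    return JsonSchemaVersion.Draft7
```

The isinstance guard is what keeps a malformed document (for instance, a numeric $schema) from raising inside the substring check; it simply drops through to the heuristics.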
168-182: LGTM! Simple and correct OpenAPI version detection.
The function correctly parses the openapi field and defaults to 3.1 when absent or invalid, which is appropriate for forward compatibility.
261-277: LGTM! Correct format merging logic.
The function correctly merges common formats with OpenAPI-specific formats when requested, ensuring OpenAPI-only formats (like binary and password) are only included for OpenAPI schemas.
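A minimal sketch of the merge described in this comment, assuming a keyword-only is_openapi flag and plain string values. The helper names echo those quoted earlier in the review, but their bodies, the signature of get_data_formats, and the sample formats are assumptions; the real mappings use the Types enum.

```python
from typing import Dict

DataFormatMapping = Dict[str, Dict[str, str]]


def _get_common_data_formats() -> DataFormatMapping:
    """Formats assumed valid for both JSON Schema and OpenAPI (sample only)."""
    return {
        "string": {"default": "string", "date": "date", "date-time": "date_time"},
        "integer": {"default": "integer", "int32": "int32", "int64": "int64"},
    }


def _get_openapi_only_formats() -> DataFormatMapping:
    """Formats assumed to exist only in OpenAPI (sample only)."""
    return {"string": {"binary": "binary", "password": "password"}}


def get_data_formats(*, is_openapi: bool = False) -> DataFormatMapping:
    """Return the common formats, plus OpenAPI-only ones when requested."""
    # Copy the inner dicts so callers can never mutate a shared mapping.
    formats = {t: dict(by_format) for t, by_format in _get_common_data_formats().items()}
    if is_openapi:
        for type_, extra in _get_openapi_only_formats().items():
            formats.setdefault(type_, {}).update(extra)
    return formats
```

Merging per type, instead of replacing the whole "string" entry, is what keeps the common string formats available alongside binary and password in the OpenAPI case.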
Breaking Change Analysis
Result: No breaking changes detected
Reasoning: PR #2924 adds new public API features (JsonSchemaVersion, OpenAPIVersion, VersionMode enums and detect_jsonschema_version, detect_openapi_version functions) without modifying any existing behavior. Internal changes to
This analysis was performed by Claude Code Action
🎉 Released in 0.53.0
This PR is now available in the latest release. See the release notes for details.
Fixes: #1592
Summary by CodeRabbit
New Features
Tests