Skip to content

docstrings that can be single line to be formatted on a single line#3107

Open
kevin-paulson-mindbridge-ai wants to merge 8 commits intokoxudaxi:mainfrom
kevin-paulson-mindbridge-ai:keep_single_line_docstrings_on_a_single_line
Open

docstrings that can be single line to be formatted on a single line#3107
kevin-paulson-mindbridge-ai wants to merge 8 commits intokoxudaxi:mainfrom
kevin-paulson-mindbridge-ai:keep_single_line_docstrings_on_a_single_line

Conversation

@kevin-paulson-mindbridge-ai
Copy link
Copy Markdown
Contributor

@kevin-paulson-mindbridge-ai kevin-paulson-mindbridge-ai commented Apr 28, 2026

Inspired by unnecessary-multiline-docstring (D200):

PEP 257 recommends that docstrings that can fit on one line should be formatted on a single line, for consistency and readability.

I think this makes sense to change the output in this way for the project, but if not preferred I'm happy to abandon this pull request. Thank you!

Summary by CodeRabbit

  • New Features

    • Enhanced docstring formatting in generated code to use compact single-line format when appropriate, reducing verbosity while preserving documentation content.
  • Tests

    • Updated test expectations to reflect improved docstring formatting behavior across generated output.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 28, 2026

📝 Walkthrough

Walkthrough

This PR introduces a format_docstring helper function to standardize docstring formatting with trimming, escaping, and configurable indentation. The function is exposed as a Jinja2 template filter alongside the existing escape_docstring. Inline field docstring generation is updated to use this new helper. Test expectations are updated to reflect the new single-line docstring formatting behavior.

Changes

Cohort / File(s) Summary
Core Implementation
src/datamodel_code_generator/model/base.py
Adds format_docstring(value, indent_spaces=0) helper for trimming, escaping, and formatting docstrings (compact single-line for one-liners, multi-line blocks otherwise). Updates inline_field_docstring to use it and exposes the function to Jinja2 environments while retaining escape_docstring for backward compatibility.
GraphQL Expected Output – Scalar Aliases
tests/data/expected/main/graphql/*.py, tests/data/expected/parser/graphql/*.py
~60 GraphQL test expected output files where Boolean, String, ID, Int scalar type alias docstrings are reformatted from multi-line triple-quoted blocks to single-line triple-quoted strings.
JSON Schema & OpenAPI Expected Output
tests/data/expected/main/jsonschema/*.py, tests/data/expected/main/openapi/*.py, tests/data/expected/main/use_attribute_docstrings.py
Field and class docstrings reformatted from multi-line to single-line format in JSON Schema, OpenAPI, and attribute docstring expected output files.
Korean Language Variant Expected Output
tests/data/expected/main_kr/*/output.py
Model and field docstrings in Korean language test variants reformatted from multi-line to single-line format.
Test Assertions
tests/model/test_base.py
Updated test expectation for inline_field_docstring with triple-quote content to expect trailing space before closing quotes.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~15 minutes

Possibly related PRs

  • #2796: Modifies src/datamodel_code_generator/model/base.py to add escape_docstring as a Jinja2 template filter; this PR extends that work by adding format_docstring and updating how docstrings are formatted.

Suggested labels

safe-to-fix

Suggested reviewers

  • koxudaxi
  • ilovelinux

Poem

🐰 Docstrings now sparkle, neat and trim,
From verbose prose to single slim,
Jinja filters bloom with format's might,
Making templates sing just right! ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly describes the main change: converting multi-line docstrings to single-line format when they fit on one line, which aligns with the comprehensive changes across dozens of test files and the core format_docstring implementation.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@kevin-paulson-mindbridge-ai kevin-paulson-mindbridge-ai marked this pull request as ready for review April 28, 2026 21:38
@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented Apr 28, 2026

Merging this PR will degrade performance by 15.03%

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

❌ 11 regressed benchmarks
⏩ 98 skipped benchmarks1

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

Mode Benchmark BASE HEAD Efficiency
WallTime test_perf_stripe_style_pydantic_v2 1.8 s 2 s -10.89%
WallTime test_perf_aws_style_openapi_pydantic_v2 1.7 s 2 s -14.07%
WallTime test_perf_complex_refs 2 s 2.3 s -12.92%
WallTime test_perf_duplicate_names 964.4 ms 1,119.7 ms -13.87%
WallTime test_perf_kubernetes_style_pydantic_v2 2.4 s 2.7 s -13.15%
WallTime test_perf_deep_nested 5.6 s 6.3 s -10.52%
WallTime test_perf_graphql_style_pydantic_v2 751.9 ms 871.3 ms -13.7%
WallTime test_perf_large_models_pydantic_v2 3.3 s 3.9 s -14.71%
WallTime test_perf_openapi_large 2.6 s 3 s -13.18%
WallTime test_perf_multiple_files_input 3.4 s 3.9 s -15.03%
WallTime test_perf_all_options_enabled 6 s 6.9 s -13.28%

Comparing kevin-paulson-mindbridge-ai:keep_single_line_docstrings_on_a_single_line (6398d94) with main (92bdc27)

Open in CodSpeed

Footnotes

  1. 98 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 28, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (92bdc27) to head (6398d94).

Additional details and impacted files
@@            Coverage Diff            @@
##              main     #3107   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           87        87           
  Lines        18306     18320   +14     
  Branches      2091      2094    +3     
=========================================
+ Hits         18306     18320   +14     
Flag Coverage Δ
unittests 100.00% <100.00%> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (2)
tests/data/expected/main/graphql/simple_star_wars_extra_fields_allow.py (1)

17-17: Note: Very long single-line docstrings may exceed line length limits.

The ID scalar docstring (line 17, ~330 characters) and String scalar docstring (line 25, ~200+ characters) significantly exceed typical line length limits (79-120 characters). While this aligns with PEP 257's guidance for single-line docstrings, it may trigger line-length linter warnings.

Consider whether docstrings exceeding a certain character threshold should remain multi-line to balance PEP 257 compliance with line-length conventions.

Also applies to: 25-25

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/data/expected/main/graphql/simple_star_wars_extra_fields_allow.py` at
line 17, The single-line docstrings for the `ID` and `String` scalar definitions
are extremely long and will trigger line-length linters; change them from very
long single-line triple-quoted strings to properly wrapped multi-line docstrings
(split into multiple shorter lines within the triple quotes) for the `ID` scalar
and the `String` scalar so each line stays within your project's limit while
preserving the exact text and punctuation.
src/datamodel_code_generator/model/base.py (1)

94-101: Consider a max-length guard before forcing single-line output.

Current logic collapses any one-line input, even very long text. Adding a length threshold (or configurable limit) would better match “fits on one line” and reduce downstream lint noise.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/datamodel_code_generator/model/base.py` around lines 94 - 101, The
current single-line collapse logic should only apply when the one-line text is
short; update the branch that checks len(value.splitlines()) to also check a
max-length guard (e.g. len(escaped) <= MAX_SINGLE_LINE_LENGTH) before returning
the compact form; if the text exceeds that threshold, fall through to the
multi-line path that builds indent_text. Introduce a configurable
MAX_SINGLE_LINE_LENGTH constant or an optional parameter and use that in the
condition (referencing value, escaped, indent_spaces, and indent_text) so long
strings stay multi-line and lint noise is reduced.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/datamodel_code_generator/model/base.py`:
- Around line 95-97: The code mutates the docstring by appending a trailing
space when escaped.endswith('"'), which alters original content; instead, detect
the quote-edge case and return a multiline/docstring fallback without changing
the original text. Locate the logic that sets/uses the variable escaped (in the
docstring rendering routine, e.g., the function that composes docstrings) and
replace the escaped += " " mutation with code that emits a multiline/'''...'''
or triple-quoted representation of the original docstring value (or an explicit
multiline fallback path) so the original content is preserved while avoiding
quote-edge parsing issues.

In `@tests/data/expected/main/graphql/custom_formatters.py`:
- Line 18: The docstring generator format_docstring() currently emits
single-line triple-quoted docstrings regardless of length; update
format_docstring() to detect when the resulting docstring (after escaping)
exceeds a target line length (e.g., 79 or configurable max_line_length) and then
reformat it into a multi-line triple-quoted block wrapped to that width (use a
wrapper like textwrap.fill or equivalent), preserving existing indentation and
any leading/trailing blank lines and ensuring existing multi-line inputs are not
double-wrapped; locate format_docstring() in base.py and implement the length
check + wrapping logic so long single-line inputs like the ID scalar docstring
are split into readable lines inside the triple quotes.

In `@tests/data/expected/main/graphql/field_aliases.py`:
- Line 20: The format_docstring function currently treats any single-line
docstring (checked via value.splitlines()) as exempt from wrapping, which allows
very long single-line docstrings (e.g., 172 chars) to bypass style limits;
update format_docstring to enforce a maximum line length for single-line
docstrings by detecting when len(value) (or a configurable max_length) is
exceeded and then wrap/split that single-line into a short-summary first line
plus subsequent wrapped lines (using the same wrapping logic used for multi-line
docstrings), ensuring the resulting output remains PEP 257 compatible while
respecting common linter width limits; reference the format_docstring function
and its single-line branch that currently checks value.splitlines().

---

Nitpick comments:
In `@src/datamodel_code_generator/model/base.py`:
- Around line 94-101: The current single-line collapse logic should only apply
when the one-line text is short; update the branch that checks
len(value.splitlines()) to also check a max-length guard (e.g. len(escaped) <=
MAX_SINGLE_LINE_LENGTH) before returning the compact form; if the text exceeds
that threshold, fall through to the multi-line path that builds indent_text.
Introduce a configurable MAX_SINGLE_LINE_LENGTH constant or an optional
parameter and use that in the condition (referencing value, escaped,
indent_spaces, and indent_text) so long strings stay multi-line and lint noise
is reduced.

In `@tests/data/expected/main/graphql/simple_star_wars_extra_fields_allow.py`:
- Line 17: The single-line docstrings for the `ID` and `String` scalar
definitions are extremely long and will trigger line-length linters; change them
from very long single-line triple-quoted strings to properly wrapped multi-line
docstrings (split into multiple shorter lines within the triple quotes) for the
`ID` scalar and the `String` scalar so each line stays within your project's
limit while preserving the exact text and punctuation.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: ebe078d0-2766-4318-ae78-72e7b80ba3c9

📥 Commits

Reviewing files that changed from the base of the PR and between 92bdc27 and 6398d94.

⛔ Files ignored due to path filters (16)
  • src/datamodel_code_generator/model/template/Enum.jinja2 is excluded by none and included by none
  • src/datamodel_code_generator/model/template/ScalarTypeAliasAnnotation.jinja2 is excluded by none and included by none
  • src/datamodel_code_generator/model/template/ScalarTypeAliasType.jinja2 is excluded by none and included by none
  • src/datamodel_code_generator/model/template/ScalarTypeStatement.jinja2 is excluded by none and included by none
  • src/datamodel_code_generator/model/template/TypeAliasAnnotation.jinja2 is excluded by none and included by none
  • src/datamodel_code_generator/model/template/TypeAliasType.jinja2 is excluded by none and included by none
  • src/datamodel_code_generator/model/template/TypeStatement.jinja2 is excluded by none and included by none
  • src/datamodel_code_generator/model/template/TypedDictClass.jinja2 is excluded by none and included by none
  • src/datamodel_code_generator/model/template/TypedDictFunction.jinja2 is excluded by none and included by none
  • src/datamodel_code_generator/model/template/dataclass.jinja2 is excluded by none and included by none
  • src/datamodel_code_generator/model/template/msgspec.jinja2 is excluded by none and included by none
  • src/datamodel_code_generator/model/template/pydantic_v2/BaseModel.jinja2 is excluded by none and included by none
  • src/datamodel_code_generator/model/template/pydantic_v2/RootModel.jinja2 is excluded by none and included by none
  • src/datamodel_code_generator/model/template/pydantic_v2/RootModelTypeAlias.jinja2 is excluded by none and included by none
  • src/datamodel_code_generator/model/template/pydantic_v2/dataclass.jinja2 is excluded by none and included by none
  • tests/data/templates_schema_id/pydantic_v2/BaseModel.jinja2 is excluded by none and included by none
📒 Files selected for processing (61)
  • src/datamodel_code_generator/model/base.py
  • tests/data/expected/main/graphql/additional_imports.py
  • tests/data/expected/main/graphql/annotated.py
  • tests/data/expected/main/graphql/annotated_field_aliases.py
  • tests/data/expected/main/graphql/annotated_use_default_kwarg.py
  • tests/data/expected/main/graphql/annotated_use_standard_collections.py
  • tests/data/expected/main/graphql/annotated_use_standard_collections_use_union_operator.py
  • tests/data/expected/main/graphql/annotated_use_union_operator.py
  • tests/data/expected/main/graphql/casing.py
  • tests/data/expected/main/graphql/custom_formatters.py
  • tests/data/expected/main/graphql/custom_scalar_types.py
  • tests/data/expected/main/graphql/default_values_required_use_default.py
  • tests/data/expected/main/graphql/different_types_of_fields.py
  • tests/data/expected/main/graphql/empty_list_default.py
  • tests/data/expected/main/graphql/enum_literals_all.py
  • tests/data/expected/main/graphql/enum_literals_one.py
  • tests/data/expected/main/graphql/enums.py
  • tests/data/expected/main/graphql/enums_ignore_enum_constraints.py
  • tests/data/expected/main/graphql/enums_no_specialized.py
  • tests/data/expected/main/graphql/enums_specialized.py
  • tests/data/expected/main/graphql/enums_typed_dict.py
  • tests/data/expected/main/graphql/enums_using_subclass.py
  • tests/data/expected/main/graphql/field_aliases.py
  • tests/data/expected/main/graphql/graphql_multiple_aliases_pydantic_v2.py
  • tests/data/expected/main/graphql/interface_mro.py
  • tests/data/expected/main/graphql/no_typename.py
  • tests/data/expected/main/graphql/pydantic_v2_empty_list_default.py
  • tests/data/expected/main/graphql/simple_star_wars.py
  • tests/data/expected/main/graphql/simple_star_wars_class_name_prefix.py
  • tests/data/expected/main/graphql/simple_star_wars_dataclass.py
  • tests/data/expected/main/graphql/simple_star_wars_dataclass_arguments.py
  • tests/data/expected/main/graphql/simple_star_wars_dataclass_frozen_kw_only.py
  • tests/data/expected/main/graphql/simple_star_wars_extra_fields_allow.py
  • tests/data/expected/main/graphql/split_graphql_schemas.py
  • tests/data/expected/main/graphql/type_alias.py
  • tests/data/expected/main/graphql/type_alias_py312.py
  • tests/data/expected/main/graphql/union.py
  • tests/data/expected/main/graphql/union_class_name_prefix.py
  • tests/data/expected/main/graphql/union_snake_case_field.py
  • tests/data/expected/main/graphql/use_standard_collections.py
  • tests/data/expected/main/graphql/use_union_operator.py
  • tests/data/expected/main/jsonschema/allof_mro.py
  • tests/data/expected/main/jsonschema/type_alias_with_field_description.py
  • tests/data/expected/main/jsonschema/type_alias_with_field_description_py312.py
  • tests/data/expected/main/openapi/all_of_with_relative_ref.py
  • tests/data/expected/main/openapi/allof_array_ref_override.py
  • tests/data/expected/main/use_attribute_docstrings.py
  • tests/data/expected/main_kr/main_allof_with_description_only/output.py
  • tests/data/expected/main_kr/main_docstring_special_chars/output.py
  • tests/data/expected/main_kr/main_use_field_description/output.py
  • tests/data/expected/main_kr/main_use_field_description_example/output.py
  • tests/data/expected/main_kr/main_use_field_description_example_dataclass/output.py
  • tests/data/expected/main_kr/main_use_field_description_with_example/output.py
  • tests/data/expected/main_kr/main_use_inline_field_description_example_only/output.py
  • tests/data/expected/main_kr/main_use_schema_description/output.py
  • tests/data/expected/parser/graphql/field-default-enum.py
  • tests/data/expected/parser/graphql/no_typename.py
  • tests/data/expected/parser/graphql/union-aliased-bug.py
  • tests/data/expected/parser/graphql/union-commented.py
  • tests/data/expected/parser/graphql/union_with_prefix.py
  • tests/model/test_base.py

Comment thread src/datamodel_code_generator/model/base.py
"""
The `ID` scalar type represents a unique identifier, often used to refetch an object or as key for a cache. The ID type appears in a JSON response as a String; however, it is not intended to be human-readable. When expected as an input type, any string (such as `"4"`) or integer (such as `4`) input value will be accepted as an ID.
"""
"""The `ID` scalar type represents a unique identifier, often used to refetch an object or as key for a cache. The ID type appears in a JSON response as a String; however, it is not intended to be human-readable. When expected as an input type, any string (such as `"4"`) or integer (such as `4`) input value will be accepted as an ID."""
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's read the file and check line 18
cat -n tests/data/expected/main/graphql/custom_formatters.py | head -25

Repository: koxudaxi/datamodel-code-generator

Length of output: 1268


🏁 Script executed:

# Get the exact character count of line 18
sed -n '18p' tests/data/expected/main/graphql/custom_formatters.py | wc -c

Repository: koxudaxi/datamodel-code-generator

Length of output: 81


🏁 Script executed:

# Let's find the formatter implementation and understand this test structure
find . -name "*.py" -path "*/format*" -type f | grep -E "(format|docstring)" | head -20

Repository: koxudaxi/datamodel-code-generator

Length of output: 327


🏁 Script executed:

# Let's look at the formatter implementation
cat -n ./src/datamodel_code_generator/format.py | head -100

Repository: koxudaxi/datamodel-code-generator

Length of output: 3356


🏁 Script executed:

# Continue reading the format.py file to find docstring handling
wc -l ./src/datamodel_code_generator/format.py

Repository: koxudaxi/datamodel-code-generator

Length of output: 122


🏁 Script executed:

# Search for docstring formatting and line length handling in format.py
rg -n "docstring|line.length|format_docstring" ./src/datamodel_code_generator/format.py -A 3

Repository: koxudaxi/datamodel-code-generator

Length of output: 342


🏁 Script executed:

# Search for any docstring-specific formatting logic
rg -n "format_docstring" . -A 5

Repository: koxudaxi/datamodel-code-generator

Length of output: 17528


🏁 Script executed:

# Let's read the format_docstring function implementation
sed -n '70,120p' ./src/datamodel_code_generator/model/base.py

Repository: koxudaxi/datamodel-code-generator

Length of output: 2003


🏁 Script executed:

# Let's check the related test for field_aliases.py mentioned in the review
cat -n tests/data/expected/main/graphql/field_aliases.py | head -30

Repository: koxudaxi/datamodel-code-generator

Length of output: 1152


🏁 Script executed:

# Check character count on line 20 of field_aliases.py
sed -n '20p' tests/data/expected/main/graphql/field_aliases.py | wc -c

Repository: koxudaxi/datamodel-code-generator

Length of output: 81


🏁 Script executed:

# Search for tests related to custom_formatters.py to understand the input
find . -name "*custom*scalar*" -type f | grep -E "\.(graphql|json)" | head -10

Repository: koxudaxi/datamodel-code-generator

Length of output: 172


🏁 Script executed:

# Check the GraphQL input file
cat -n ./tests/data/graphql/custom-scalar-types.graphql

Repository: koxudaxi/datamodel-code-generator

Length of output: 170


🏁 Script executed:

# Check if there's any configuration or mechanism for handling line length in docstring formatting
rg -n "line.length|wrap|break|docstring" ./src/datamodel_code_generator/model/base.py | head -20

Repository: koxudaxi/datamodel-code-generator

Length of output: 1338


🏁 Script executed:

# Search for any line length handling or wrapping logic
rg -n "MAX_LINE_LENGTH|line_length|wrap.*line|break.*line" ./src/datamodel_code_generator/model/base.py

Repository: koxudaxi/datamodel-code-generator

Length of output: 59


🏁 Script executed:

# Check if Black formatting is applied to handle line length after code generation
rg -n "black|format.*code" ./src/datamodel_code_generator/format.py | head -15

Repository: koxudaxi/datamodel-code-generator

Length of output: 888


🏁 Script executed:

# Check how tests use the expected output files and whether Black is applied
rg -n "custom_formatters\|expected.*main.*graphql" ./tests -l | head -10

Repository: koxudaxi/datamodel-code-generator

Length of output: 59


Implement line length checking in format_docstring() to wrap long docstrings.

The ID scalar docstring on line 18 is 339 characters, significantly exceeding PEP 8 recommendations (79-120 characters). The format_docstring() function in src/datamodel_code_generator/model/base.py:70 currently only checks if the input is a single line, then wraps it in triple quotes without validating character length. This generates docstrings that violate line length guidelines and are difficult to read without horizontal scrolling.

The function should check the final docstring length and wrap to multiple lines when it exceeds a reasonable limit (e.g., 79-120 characters), following PEP 8 and best practices for code readability.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/data/expected/main/graphql/custom_formatters.py` at line 18, The
docstring generator format_docstring() currently emits single-line triple-quoted
docstrings regardless of length; update format_docstring() to detect when the
resulting docstring (after escaping) exceeds a target line length (e.g., 79 or
configurable max_line_length) and then reformat it into a multi-line
triple-quoted block wrapped to that width (use a wrapper like textwrap.fill or
equivalent), preserving existing indentation and any leading/trailing blank
lines and ensuring existing multi-line inputs are not double-wrapped; locate
format_docstring() in base.py and implement the length check + wrapping logic so
long single-line inputs like the ID scalar docstring are split into readable
lines inside the triple quotes.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the output didn't wrap text like this before I think this is fine. If we wanted to do this is could be done in a separate pull request.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kevin-paulson-mindbridge-ai, that's a fair point — this PR focuses on collapsing short docstrings to a single line per PEP 257/D200, not on introducing line-length wrapping, which was never done before. Deferring the wrapping concern to a follow-up PR makes sense. I'll drop this suggestion.

"""
The `String` scalar type represents textual data, represented as UTF-8 character sequences. The String type is most often used by GraphQL to represent free-form human-readable text.
"""
"""The `String` scalar type represents textual data, represented as UTF-8 character sequences. The String type is most often used by GraphQL to represent free-form human-readable text."""
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Check format_docstring implementation for line length handling

# Search for format_docstring function and line length logic
rg -A 20 -B 2 'def format_docstring' --type=py

# Also check for any line length constants or configuration
rg -n 'max.*length|line.*length|MAX.*LINE' --type=py -g '**/model/base.py'

Repository: koxudaxi/datamodel-code-generator

Length of output: 1966


🏁 Script executed:

# Read the full format_docstring implementation
sed -n '/^def format_docstring/,/^def /p' src/datamodel_code_generator/model/base.py | head -60

Repository: koxudaxi/datamodel-code-generator

Length of output: 1525


🏁 Script executed:

# Check the test file at line 20
sed -n '18,22p' tests/data/expected/main/graphql/field_aliases.py

Repository: koxudaxi/datamodel-code-generator

Length of output: 306


🏁 Script executed:

# Search for any line length related constants or configuration
rg -n 'MAX_LINE|max_line|LINE_LENGTH|line_length|MAX.*CHAR|max.*char' src/datamodel_code_generator/model/base.py

Repository: koxudaxi/datamodel-code-generator

Length of output: 59


The format_docstring implementation does not include line length handling for single-line docstrings.

The function at line 20 in tests/data/expected/main/graphql/field_aliases.py contains a 172-character single-line docstring. The format_docstring implementation strictly follows PEP 257, which mandates that docstrings fitting on one line remain on a single line. The function checks only if len(value.splitlines()) == 1 and does not enforce any character-based limits. While this is PEP 257 compliant, the resulting 172-character line will likely exceed common linter configurations (typically 79–120 characters) and may trigger style warnings.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/data/expected/main/graphql/field_aliases.py` at line 20, The
format_docstring function currently treats any single-line docstring (checked
via value.splitlines()) as exempt from wrapping, which allows very long
single-line docstrings (e.g., 172 chars) to bypass style limits; update
format_docstring to enforce a maximum line length for single-line docstrings by
detecting when len(value) (or a configurable max_length) is exceeded and then
wrap/split that single-line into a short-summary first line plus subsequent
wrapped lines (using the same wrapping logic used for multi-line docstrings),
ensuring the resulting output remains PEP 257 compatible while respecting common
linter width limits; reference the format_docstring function and its single-line
branch that currently checks value.splitlines().

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the output didn't wrap text like this before I think this is fine. If we wanted to do this is could be done in a separate pull request.

(Same comment as #3107 (comment))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kevin-paulson-mindbridge-ai — That's a fair point. Since line wrapping wasn't applied before this PR either, maintaining the same behaviour here is consistent. If line-length enforcement for single-line docstrings is ever desired, it can be addressed in a follow-up PR. I'll close this comment.


🧠 Learnings used
Learnt from: kevin-paulson-mindbridge-ai
Repo: koxudaxi/datamodel-code-generator PR: 3107
File: src/datamodel_code_generator/model/base.py:95-97
Timestamp: 2026-04-28T21:48:20.138Z
Learning: In `src/datamodel_code_generator/model/base.py`, the `format_docstring` function intentionally adds a trailing space to single-line docstrings whose escaped content ends with `"` (e.g., `"""some text\\"\\"\\" """`). This matches Black's handling of the quote-edge case and is not a content mutation bug — it prevents a four-consecutive-quote syntax issue. Do not flag this as a defect.

Learnt from: kevin-paulson-mindbridge-ai
Repo: koxudaxi/datamodel-code-generator PR: 3107
File: src/datamodel_code_generator/model/base.py:95-97
Timestamp: 2026-04-28T21:48:20.138Z
Learning: In `src/datamodel_code_generator/model/base.py`, `format_docstring` intentionally appends a trailing space to single-line docstrings whose escaped content ends with `"` (producing e.g. `"""text\\"\\"\\" """`). This prevents a parse error from four consecutive quotes and mirrors the leading-space pattern that Black uses when a docstring begins with `"`. This is intentional and should not be flagged as a defect or content mutation.

Learnt from: koxudaxi
Repo: koxudaxi/datamodel-code-generator PR: 2890
File: tests/data/expected/main/jsonschema/ref_nullable_with_constraint.py:14-15
Timestamp: 2026-01-02T08:25:22.111Z
Learning: The datamodel-code-generator currently generates RootModel subclasses with an explicit `root` field annotation (e.g., `class StringType(RootModel[str]): root: str`). This is existing behavior of the code generator and should not be flagged as an issue introduced by new changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant