fix(cz_customize): set commit_parser default that captures change_type #1970

Open
bearomorphism wants to merge 2 commits into commitizen-tools:master from bearomorphism:fix/466-cz-customize-commit-parser-default

bearomorphism (Collaborator) commented May 9, 2026

Description

Closes #466.

Why

cz_customize lets users define their own commit schema via changelog_pattern, change_type_order, and change_type_map — all the ingredients for a grouped changelog. However, until now, running cz changelog with those settings produced a single flat bullet list with no ### Feat / ### Fix headings, making the feature effectively unusable without an extra, non-obvious step.

The root cause is the commit_parser default inherited from BaseCommitizen. BaseCommitizen.commit_parser is r"(?P<message>.*)" (commitizen/cz/base.py:58) — a catch-all with no change_type named group. The changelog generator in commitizen/changelog.py:195 calls msg.pop("change_type", None) to key commits into sections; without that group, every commit yields change_type = None and lands in an ungrouped bucket (changelog.py:198).
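
A minimal sketch of that failure mode, using only the standard `re` module. The helper `change_type_of` is hypothetical and only mimics the `msg.pop("change_type", None)` lookup described above; it is not commitizen's actual code.

```python
import re

# Catch-all default quoted in the description (BaseCommitizen.commit_parser).
BASE_PARSER = r"(?P<message>.*)"


def change_type_of(parser: str, subject: str):
    """Hypothetical helper: return the change_type a parser yields for a subject."""
    match = re.compile(parser, re.MULTILINE).match(subject)
    if match is None:
        return None
    groups = match.groupdict()
    # Mirrors the pop-with-default lookup: a missing group falls back to None.
    return groups.pop("change_type", None)


subject = "feat(DL-4567): new feature test"
base_groups = re.match(BASE_PARSER, subject).groupdict()
base_change_type = change_type_of(BASE_PARSER, subject)
```

With the catch-all pattern the only named group is `message`, so the lookup has nothing to return and every commit ends up keyed by `None`.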

A maintainer confirmed the gap in December 2021 (issue comment by @woile): "I think the commit_parser regex is not exposed in the customize… and it looks like the customize is using the default from the base." Users who stumbled across the fix documented it in later comments — e.g. a community workaround posted in 2025 showing the exact four-group regex that had to be copy-pasted into every cz_customize config. Triage against master (v4.15.1) in #1964 verified the bug is still present: cz changelog --dry-run with changelog_pattern set still produces an ungrouped list.

The fix gives CustomizeCommitsCz its own commit_parser class attribute — a conventional-commits-shaped default that works for any schema following the <type>[(scope)][!]: <message> convention — so users who only set changelog_pattern get grouping for free.

What changed

| File | Change |
|---|---|
| commitizen/cz/customize/customize.py | Added commit_parser class attribute (conventional-commits-shaped default with change_type, scope, breaking, and message named groups) |
| tests/test_cz_customize.py | Added test_commit_parser_default_extracts_change_type, a regression test verifying all four named groups are extracted without any customize.commit_parser config key |

How it works

  • The new class-level attribute (commitizen/cz/customize/customize.py:32–45 after the PR is applied) sits between the existing change_type_order default and __init__. It is a four-group regex:
    commit_parser = (
        r"^(?P<change_type>\w+)"
        r"(?:\((?P<scope>[^()\r\n]*)\))?"
        r"(?P<breaking>!)?:\s*(?P<message>.*)$"
    )
  • The __init__ override loop (commitizen/cz/customize/customize.py:40–50) already iterates over "commit_parser" and calls setattr if the user provided their own value. The class-level default is therefore transparently superseded by any customize.commit_parser key — zero behaviour change for existing configs that already set one.
  • Why \w+ instead of a fixed alternation? cz_customize is format-agnostic: it cannot know which types a project has defined. Hard-coding feat|fix|chore|… would silently drop commits whose types aren't in the list. \w+ captures any single-token type made of word characters, which covers the usual conventional-commits type names; a hyphenated type would not match. An alternative, deriving the alternation from changelog_pattern at runtime, was considered but is fragile: changelog_pattern is a broad filter regex, not a structured grammar, so parsing it reliably would require a regex-of-regexes approach.
  • Why not just document "set commit_parser yourself"? The three other customisable attributes (changelog_pattern, change_type_order, change_type_map) all have sensible inherited defaults; a user who sets only those reasonably expects changelog grouping to work.
  • generate_tree_from_commits (commitizen/changelog.py:92–171) compiles commit_parser with re.MULTILINE and calls map_pat.match(commit.message) per commit. process_commit_message (changelog.py:174–198) then calls parsed.groupdict() — so the named groups flow directly into the changes dict that drives section headings.
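
The four-group regex listed above can be exercised on its own. The sketch below compiles it with `re.MULTILINE`, as the description says the changelog generator does; the `parse` helper is hypothetical and stands in for the `match` + `groupdict()` steps.

```python
import re

# Regex copied from the "How it works" section of this PR description.
COMMIT_PARSER = (
    r"^(?P<change_type>\w+)"
    r"(?:\((?P<scope>[^()\r\n]*)\))?"
    r"(?P<breaking>!)?:\s*(?P<message>.*)$"
)
pattern = re.compile(COMMIT_PARSER, re.MULTILINE)


def parse(subject: str):
    """Hypothetical helper: return the named groups, or None when nothing matches."""
    match = pattern.match(subject)
    return match.groupdict() if match else None


scoped = parse("feat(DL-4567): new feature test")
breaking = parse("fix!: drop legacy flag")
unmatched = parse("ticket-123 do stuff")
```

A subject without the `<type>:` prefix returns no match at all under this version of the pattern, which is the case the follow-up commit in this PR addresses.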

Backward compatibility

  • Users who already set customize.commit_parser explicitly are unaffected — the __init__ loop (customize.py:40–50) overwrites the class default with their value via setattr.
  • Users who rely on the old flat-list behaviour (i.e., intentionally omitted commit_parser knowing it would match everything to None) will now see headings. This is the desired fix; the old behaviour was a bug, not a feature.
  • All 66 existing test_cz_customize.py tests still pass — the new default only changes what happens when no commit_parser is configured.
  • BaseCommitizen.commit_parser (commitizen/cz/base.py:58) is unchanged; other built-in cz_* plugins are unaffected.
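
The override behaviour described in this list can be illustrated with a small stand-alone model. `BaseSketch`, `CustomizeSketch`, and `OVERRIDABLE` are hypothetical names invented for this sketch; the real classes read their settings from the commitizen config and handle more attributes.

```python
class BaseSketch:
    # Catch-all default quoted in the description (no change_type group).
    commit_parser = r"(?P<message>.*)"


class CustomizeSketch(BaseSketch):
    # Class-level default added by this PR, copied from the description.
    commit_parser = (
        r"^(?P<change_type>\w+)"
        r"(?:\((?P<scope>[^()\r\n]*)\))?"
        r"(?P<breaking>!)?:\s*(?P<message>.*)$"
    )

    # Hypothetical subset of the attribute names the override loop walks over.
    OVERRIDABLE = ("commit_parser", "changelog_pattern")

    def __init__(self, customize_settings: dict):
        for attr_name in self.OVERRIDABLE:
            value = customize_settings.get(attr_name)
            if value:
                # A user-provided value shadows the class default on this instance only.
                setattr(self, attr_name, value)


user_regex = r"^(?P<change_type>\w+)-(?P<message>.*)$"
default_cz = CustomizeSketch({})
custom_cz = CustomizeSketch({"commit_parser": user_regex})
```

Because `setattr` writes to the instance, the class-level default and the base-class default both stay untouched when a user supplies their own parser.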

Checklist

Was generative AI tooling used to co-author this PR?

  • Yes (please specify the tool below)

Generated-by: Claude following the guidelines

Code Changes

  • Add test cases to all the changes you introduce
  • Run uv run poe all locally to ensure this change passes linter check and tests
  • Manually test the changes (see "Steps to Test" below)
  • Update the documentation for the changes

Expected Behavior

| Scenario | Outcome |
|---|---|
| cz_customize with changelog_pattern + change_type_order, no commit_parser key | cz changelog --dry-run renders ### feat, ### fix, ### chore headings |
| cz_customize with an explicit customize.commit_parser | That custom regex is used unchanged; the default is never applied |
| Any other built-in cz_* (e.g. cz_conventional_commits) | Unaffected; BaseCommitizen.commit_parser is unchanged |
| cz_customize with commits that don't match the new default (e.g. ticket-123 do stuff) | Commits still appear under change_type = None, same as before; changelog_pattern controls what's included |

Steps to Test This Pull Request

git fetch fork fix/466-cz-customize-commit-parser-default
git checkout fork/fix/466-cz-customize-commit-parser-default

# 1. Targeted regression test.
uv run pytest tests/test_cz_customize.py::test_commit_parser_default_extracts_change_type -v

# 2. Reproduce the bug, then verify the fix.

mkdir /tmp/cz466 && cd /tmp/cz466
git init -b main
git config user.name test && git config user.email test@test.com

cat > .cz.yaml << 'EOF'
commitizen:
  name: cz_customize
  version: 0.1.0
  customize:
    schema_pattern: '(feat|fix|chore)(\(\S+\))?!?:(\s.*)'
    changelog_pattern: '^(feat|fix|chore)(\(.+\))?(!)?'
    change_type_order: ["BREAKING CHANGE", "feat", "fix", "chore"]
EOF

git add .cz.yaml
git commit -m "chore: initial"
git commit --allow-empty -m "feat(DL-4567): new feature test"
git commit --allow-empty -m "fix(DL-1234): qweqwe"
git commit --allow-empty -m "chore: update deps"

# Before the fix: flat list, no headings.
# After the fix: grouped by ### feat / ### fix / ### chore.
cz changelog --dry-run

CustomizeCommitsCz previously inherited the `commit_parser =
r"(?P<message>.*)"` default from `BaseCommitizen`. That default
matches everything but does not capture a `change_type` named group,
so even when a `cz_customize` user configured `changelog_pattern`,
`change_type_map` and `change_type_order`, the changelog generator
could not group commits and emitted a single ungrouped bullet list.

Set a conventional-commits-style default that captures
`change_type`, `scope`, `breaking` and `message` named groups.
Users with a different commit format can still override via
`customize.commit_parser`.

End-to-end check on the exact reproducer from the issue now produces a
properly grouped changelog:

    ## Unreleased
    ### feat
    - **DL-4567**: new feature test
    ### fix
    - **DL-1234**: qweqwe
    ### chore
    - update deps
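
As a rough illustration of how the captured groups can drive the headings above, the sketch below groups commit subjects by `change_type` and renders them in `change_type_order`. The `render` function and its bullet formatting are assumptions made for this example; commitizen's real renderer uses templates and handles many more cases.

```python
import re
from collections import defaultdict

# Default parser from this PR, compiled the way the description states.
COMMIT_PARSER = re.compile(
    r"^(?P<change_type>\w+)"
    r"(?:\((?P<scope>[^()\r\n]*)\))?"
    r"(?P<breaking>!)?:\s*(?P<message>.*)$",
    re.MULTILINE,
)
# Same ordering as change_type_order in the reproducer config.
CHANGE_TYPE_ORDER = ["BREAKING CHANGE", "feat", "fix", "chore"]


def render(subjects):
    """Hypothetical renderer: group subjects by change_type, then emit headings."""
    groups = defaultdict(list)
    for subject in subjects:
        match = COMMIT_PARSER.match(subject)
        if match is None:
            continue  # this sketch simply skips subjects the parser rejects
        parsed = match.groupdict()
        scope, message = parsed["scope"], parsed["message"]
        bullet = f"- **{scope}**: {message}" if scope else f"- {message}"
        groups[parsed["change_type"]].append(bullet)

    lines = ["## Unreleased"]
    for change_type in CHANGE_TYPE_ORDER:
        if change_type in groups:  # types outside the order list are ignored here
            lines.append(f"### {change_type}")
            lines.extend(groups[change_type])
    return "\n".join(lines)


changelog = render([
    "chore: update deps",
    "fix(DL-1234): qweqwe",
    "feat(DL-4567): new feature test",
])
```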

Closes commitizen-tools#466

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

codecov Bot commented May 9, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.23%. Comparing base (4b93a50) to head (4b78268).
⚠️ Report is 3 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1970   +/-   ##
=======================================
  Coverage   98.23%   98.23%           
=======================================
  Files          61       61           
  Lines        2779     2780    +1     
=======================================
+ Hits         2730     2731    +1     
  Misses         49       49           

Copilot AI (Contributor) left a comment

Pull request overview

This PR addresses cz_customize changelog grouping by providing a default commit_parser that extracts change_type, enabling sectioned changelog output when users configure changelog_pattern but don’t explicitly set customize.commit_parser.

Changes:

  • Add a conventional-commits-shaped commit_parser default to CustomizeCommitsCz so change_type can be derived for grouping.
  • Add a regression test asserting the default parser extracts change_type, scope, breaking, and message without explicit config.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
|---|---|
| commitizen/cz/customize/customize.py | Adds a new class-level default commit_parser intended to extract change_type for changelog grouping. |
| tests/test_cz_customize.py | Adds a regression test verifying the new default parser extracts the expected named groups. |

Comment thread commitizen/cz/customize/customize.py Outdated
Comment thread tests/test_cz_customize.py
Wrap the conventional-commits prefix in an optional group so subjects that don't follow the format (e.g. 'bug fix: x', 'ticket-123 do stuff', '✨ feature: x') are still captured under <message> with change_type=None instead of being silently dropped by generate_tree_from_commits.

Also broaden <change_type> from \w+ to [\w-]+ to accommodate hyphenated types.
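
The revised regex itself is not quoted in this commit message, so the pattern below is only a guess at its shape: the conventional-commits prefix is wrapped in an optional non-capturing group and `change_type` uses `[\w-]+`. Treat `RELAXED_PARSER` and `parse_relaxed` as hypothetical names for illustration.

```python
import re

# ASSUMPTION: one possible form of the relaxed parser, not the PR's actual regex.
RELAXED_PARSER = re.compile(
    r"^(?:(?P<change_type>[\w-]+)"
    r"(?:\((?P<scope>[^()\r\n]*)\))?"
    r"(?P<breaking>!)?:\s*)?"
    r"(?P<message>.*)$",
    re.MULTILINE,
)


def parse_relaxed(subject: str) -> dict:
    """Hypothetical helper: every single-line subject matches, with or without a prefix."""
    return RELAXED_PARSER.match(subject).groupdict()


conventional = parse_relaxed("feat(DL-4567): new feature test")
hyphenated = parse_relaxed("build-deps: bump lockfile")
free_form = parse_relaxed("ticket-123 do stuff")
spaced_prefix = parse_relaxed("bug fix: x")
```

When the prefix is absent or malformed, the optional group is skipped and the whole subject lands in `message` with `change_type` left as `None`, which is the behaviour the commit message describes.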

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

changelog generation is not working as expected in case of cz_customize?

2 participants