refactor: extract _SUPERSCRIPT_MAP constant and simplify scientific() #313

Open
maayanmatsliah-tech wants to merge 2 commits into python-humanize:main from maayanmatsliah-tech:main

@maayanmatsliah-tech

Cool repo! I made a small cleanup to the scientific() function in number.py.

I extracted the exponents dict as a module-level constant so it's not recreated on every function call. I also simplified the character mapping loop into a generator expression.

I'm happy to make any changes if necessary. Hope this helps!

@hugovk added the "changelog: Changed" label (for changes in existing functionality) on May 14, 2026
@codecov

codecov Bot commented May 14, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.55%. Comparing base (55dc47c) to head (6e297f9).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #313      +/-   ##
==========================================
- Coverage   99.56%   99.55%   -0.01%     
==========================================
  Files          12       12              
  Lines         911      907       -4     
==========================================
- Hits          907      903       -4     
  Misses          4        4              
| Flag | Coverage Δ |
| --- | --- |
| macos-latest | 97.57% <100.00%> (-0.02%) ⬇️ |
| ubuntu-latest | 97.57% <100.00%> (-0.02%) ⬇️ |
| windows-latest | 95.70% <100.00%> (-0.02%) ⬇️ |
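
The 0.01% drop in the report above is arithmetic rather than lost coverage: the PR removes 4 lines that were all covered, so the 4 existing misses become a slightly larger share of a smaller total. A small sketch of that calculation follows. It assumes the report truncates percentages to two decimals instead of rounding, which is what the figures here suggest but is not confirmed in this thread.

```python
import math


def coverage_percent(hits, lines):
    """Line coverage as a percentage."""
    return 100 * hits / lines


def truncate_2dp(pct):
    """Truncate (not round) to two decimals.

    Assumption: this mirrors how the report displays values.
    """
    return math.floor(pct * 100) / 100


base = coverage_percent(hits=907, lines=911)  # main
head = coverage_percent(hits=903, lines=907)  # this PR

print(f"base: {truncate_2dp(base):.2f}%")
print(f"head: {truncate_2dp(head):.2f}%")
```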

Member

@hugovk hugovk left a comment

Thanks for the PR!

Comment thread src/humanize/number.py
from .i18n import _ngettext_noop as NS_
from .i18n import _pgettext as P_

_SUPERSCRIPT_MAP = {

Please can you move this after the TYPE_CHECKING block?

@hugovk

hugovk commented May 14, 2026

I've just added benchmarking to the CI, so I updated this PR from main to see what it looks like :)

@codspeed-hq

codspeed-hq Bot commented May 14, 2026

Merging this PR will not alter performance

✅ 15 untouched benchmarks


Comparing maayanmatsliah-tech:main (6e297f9) with main (55dc47c)

Comment thread src/humanize/number.py
-    final_str = part1 + " x 10" + "".join(new_part2)
-
-    return final_str
+    return part1 + " x 10" + "".join(_SUPERSCRIPT_MAP[char] for char in part2)

@hugovk hugovk May 14, 2026

No detected change from the CodSpeed benchmark.

This microbenchmark shows a slight improvement from this PR, and a list comprehension variant is slightly faster still:

Details
import timeit


# OLD implementation (dict rebuilt each call, list append loop)
def scientific_old(value, precision=2):
    import math
    import re

    exponents = {
        "0": "⁰",
        "1": "¹",
        "2": "²",
        "3": "³",
        "4": "⁴",
        "5": "⁵",
        "6": "⁶",
        "7": "⁷",
        "8": "⁸",
        "9": "⁹",
        "-": "⁻",
    }
    try:
        value = float(value)
        if not math.isfinite(value):
            return ""
    except (ValueError, TypeError):
        return str(value)
    fmt = f"{{:.{int(precision)}e}}"
    n = fmt.format(value)
    part1, part2 = n.split("e")
    part2 = re.sub(r"^\+?(\-?)0*(.+)$", r"\1\2", part2)
    new_part2 = []
    for char in part2:
        new_part2.append(exponents[char])
    return part1 + " x 10" + "".join(new_part2)


# NEW implementation (module-level dict, generator expression)
_SUPERSCRIPT_MAP = {
    "0": "⁰",
    "1": "¹",
    "2": "²",
    "3": "³",
    "4": "⁴",
    "5": "⁵",
    "6": "⁶",
    "7": "⁷",
    "8": "⁸",
    "9": "⁹",
    "-": "⁻",
}


def scientific_new(value, precision=2):
    import math
    import re

    try:
        value = float(value)
        if not math.isfinite(value):
            return ""
    except (ValueError, TypeError):
        return str(value)
    fmt = f"{{:.{int(precision)}e}}"
    n = fmt.format(value)
    part1, part2 = n.split("e")
    part2 = re.sub(r"^\+?(\-?)0*(.+)$", r"\1\2", part2)
    return part1 + " x 10" + "".join(_SUPERSCRIPT_MAP[char] for char in part2)


# Variant: list comprehension
def scientific_listcomp(value, precision=2):
    import math
    import re

    try:
        value = float(value)
        if not math.isfinite(value):
            return ""
    except (ValueError, TypeError):
        return str(value)
    fmt = f"{{:.{int(precision)}e}}"
    n = fmt.format(value)
    part1, part2 = n.split("e")
    part2 = re.sub(r"^\+?(\-?)0*(.+)$", r"\1\2", part2)
    return part1 + " x 10" + "".join([_SUPERSCRIPT_MAP[char] for char in part2])


if __name__ == "__main__":
    N = 200_000
    cases = [
        ("old (rebuilt dict + list.append)", scientific_old),
        ("new (module dict + generator)", scientific_new),
        ("variant (module dict + listcomp)", scientific_listcomp),
    ]
    for label, fn in cases:
        t = timeit.timeit(lambda fn=fn: fn(-1.234e-50), number=N)
        print(f"{label:40s} {t * 1e6 / N:7.3f} µs/call  ({t:.3f}s for {N:,})")

Output:

old (rebuilt dict + list.append)           1.339 µs/call  (0.268s for 200,000)
new (module dict + generator)              1.259 µs/call  (0.252s for 200,000)
variant (module dict + listcomp)           1.136 µs/call  (0.227s for 200,000)

Not a huge difference! But we might as well take it:

Suggested change
-    return part1 + " x 10" + "".join(_SUPERSCRIPT_MAP[char] for char in part2)
+    return part1 + " x 10" + "".join([_SUPERSCRIPT_MAP[char] for char in part2])
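
A likely explanation for the small gap, based on how CPython implements `str.join`: it needs the total length before building the result, so any argument that is not already a list or tuple is first collected into a list. Passing a generator therefore pays the generator overhead and still ends up building a list. A minimal sketch that isolates just the join step, with hypothetical helper names `join_generator` and `join_listcomp`:

```python
import timeit

_SUPERSCRIPT_MAP = {
    "0": "⁰", "1": "¹", "2": "²", "3": "³", "4": "⁴",
    "5": "⁵", "6": "⁶", "7": "⁷", "8": "⁸", "9": "⁹",
    "-": "⁻",
}


def join_generator(part2):
    """Join via a generator expression (the PR's version)."""
    return "".join(_SUPERSCRIPT_MAP[char] for char in part2)


def join_listcomp(part2):
    """Join via a list comprehension (the suggested version)."""
    return "".join([_SUPERSCRIPT_MAP[char] for char in part2])


if __name__ == "__main__":
    for fn in (join_generator, join_listcomp):
        t = timeit.timeit(lambda fn=fn: fn("-50"), number=200_000)
        print(f"{fn.__name__:16s} {t:.3f}s for 200,000 calls")
```

Both produce identical strings; for inputs this short the difference is tiny, which is consistent with the numbers reported above.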
