Sort checksums.txt by filename in generate-checksums.sh#139
Merged
Conversation
The trailing `| sort` ordered lines by the leading sha256 prefix (effectively random), not by filename. Output was deterministic but not human-useful. Switch to `LC_ALL=C sort -k2,2` so lines are sorted by the filename column with locale-independent byte order. Tighten the bats coverage too: the membership-only assertion is now an order-asserting check that would have caught the original quirk.
The first sort orders the filename list before sha256sum and feeds the "Release assets:" diagnostic print. Without LC_ALL=C it was locale-dependent, so the same input could produce a different diagnostic order on a different machine. The final checksums.txt sort was already locale-pinned by the previous commit; matching the intermediate sort removes the inconsistency.
The ci-tools image generates en_US.UTF-8 specifically so tests can
verify locale-independent behavior. Exercise that for the
checksums.txt sort: stage filenames where C-order ('A', 'C', 'b' —
byte order) and en_US.UTF-8-order ('A', 'b', 'C' — case-insensitive)
differ, invoke the script with LC_ALL=en_US.UTF-8 in the caller's
env, and assert C-order output. A future refactor that drops the
in-script LC_ALL=C now fails this test instead of slipping by on
ASCII-only fixtures.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
The checksums file produced by
scripts/generate-checksums.shwas sorted by the leading sha256 prefix instead of by filename, since the trailing| sortcompared lines starting at column 1. Output was deterministic but not human-useful and didn't match the apparent intent of the sort. Switch toLC_ALL=C sort -k2,2so lines are ordered by filename with locale-independent byte semantics, and pin the intermediate filename sort to the same locale so the diagnostic output stays reproducible too.Related Issues
Fixes #138
Changes
generate-checksums.sh: sortchecksums.txtlines by the filename column (sort -k2,2) and pin both sort steps toLC_ALL=Cfor byte-identical output across environments.tests/bats/scripts/generate-checksums.bats: convert the membership-only assertion to an order assertion (the previously-pinned quirk would now fail), and add a dedicated locale-independence test that runs the script withLC_ALL=en_US.UTF-8in the caller's env and asserts C-order output — future refactors that drop the in-scriptLC_ALL=Cfail here instead of slipping by on ASCII fixtures.Further Comments
The ci-tools image generates
en_US.UTF-8precisely so tests can exercise locale invariants (seeimages/ci-tools/Dockerfile:35-39). This is the first test in the suite to actually use that capability.