Skip to content

Add regression tests for current output#99

Open
DanielEScherzer wants to merge 4 commits into
rust-lang:masterfrom
DanielEScherzer:add-regression-tests
Open

Add regression tests for current output#99
DanielEScherzer wants to merge 4 commits into
rust-lang:masterfrom
DanielEScherzer:add-regression-tests

Conversation

@DanielEScherzer

Copy link
Copy Markdown
Contributor

No description provided.

@DanielEScherzer

DanielEScherzer commented Apr 19, 2026

Copy link
Copy Markdown
Contributor Author

Apparently Youngsuk_Kim was supposed to be in all caps as YOUNGSUK_KIM in 1.52.0 - even though it wasn't when I reran cargo run --release locally...

@Kobzol Kobzol left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Having tests and an easy way to bless them seems generally helpful.

Comment thread tests/regression-output-test.rs Outdated
@xtqqczze

Copy link
Copy Markdown

Is such an extensive regression test necessary, considering the changes in this patch are ~6 MiB?

@Kobzol

Kobzol commented Apr 29, 2026

Copy link
Copy Markdown
Member

Well, testing the generated HTML is likely the simplest way to check the output of thanks at the moment. The files are very similar and git will likely compress the delta down to a few hundreds KiBs at most, if not less, so I wouldn't worry about that much.

@Kobzol Kobzol left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I think that this is useful, but I also want to hear from @Mark-Simulacrum.

@DanielEScherzer

Copy link
Copy Markdown
Contributor Author

Hmm, one disadvantage of the tests is that when the rust-lang/rust mailmap changes, they might start to fail
E.g. I'm pretty sure that the current failure was caused by rust-lang/rust@80777dd, though there aren't any warnings from parsing the mailmap? Asked on the r-l/r PR

I'll wait to hear from Mark-Simulacrum before updating the expected content

We could also add some logic to skip the test on the scheduled website updates and only run it on push and pull_request

@xtqqczze

xtqqczze commented Apr 29, 2026

Copy link
Copy Markdown

The repo is currently 3.9MB on disk, so it’d be great if this didn’t increase it to 10MB just for a single test fixture. Perhaps tests/expected could be stored as a compressed archive in the repo and unpacked during the test instead?

@Kobzol

Kobzol commented Apr 29, 2026

Copy link
Copy Markdown
Member

The repo is currently 3.9MB on disk, so it’d be great if this didn’t increase it to 10MB just for a single test fixture. Perhaps tests/expected could be stored as a compressed archive in the repo and unpacked during the test instead?

That sounds like unnecessary complexity to me. The size of the test suite on disk won't really increase dramatically in the future, and 10 MiB seems fine. If I do cargo build, the target directory of this project on Linux has almost 500 MiB, so :)

Hmm, one disadvantage of the tests is that when the rust-lang/rust mailmap changes, they might start to fail
E.g. I'm pretty sure that the current failure was caused by rust-lang/rust@80777dd, though there aren't any warnings from parsing the mailmap? Asked on the r-l/r PR

Hmm, yeah, that's a good point. It's almost as if we should have a fixed mailmap.. Or just a test repo, but then that won't exercise all the quirks that occur in rust-lang/rust.

@DanielEScherzer

DanielEScherzer commented Apr 29, 2026

Copy link
Copy Markdown
Contributor Author

Hmm, one disadvantage of the tests is that when the rust-lang/rust mailmap changes, they might start to fail
E.g. I'm pretty sure that the current failure was caused by rust-lang/rust@80777dd, though there aren't any warnings from parsing the mailmap? Asked on the r-l/r PR

Hmm, yeah, that's a good point. It's almost as if we should have a fixed mailmap.. Or just a test repo, but then that won't exercise all the quirks that occur in rust-lang/rust.

We could run the tests with a fixed mailmap:

  • [test] manually clone r-l/r to the expected place, replace the mailmap with the known/fixed one, then run cargo to generate output, compare
  • [build] just run cargo normally

@DanielEScherzer DanielEScherzer force-pushed the add-regression-tests branch from 2e9062c to 6623aba Compare May 4, 2026 03:11
@DanielEScherzer

Copy link
Copy Markdown
Contributor Author

I don't have the rights to rerun failed jobs, so I rebased, the failure should be resolved by now (see #103)

@DanielEScherzer DanielEScherzer force-pushed the add-regression-tests branch from 6623aba to fae7c9f Compare May 5, 2026 22:18
@Mark-Simulacrum

Copy link
Copy Markdown
Member

I think mailmap updates breaking this seems like a real concern. Maybe we can leave the contents here with either fixed mailmap or no mailmap at all? Though I believe https://github.com/rust-lang/thanks/blob/master/src/reviewers.rs changes will also require updates.

re:repository size, one of the concerns I'd have is that overtime each change will add a new copy of all the output to the git history, which means we'd grow potentially rapidly. One option could be that we just store hashes of the output: if you break it, it should be possible to generate the old and the new output locally and do a proper diff. But that would prevent meaningful growth of the actual repository (hashes are tiny, after all). If we did that then we could probably avoid fixed mailmaps too? There's not that much churn on the mailmap in practice I think.

@DanielEScherzer

Copy link
Copy Markdown
Contributor Author

I think mailmap updates breaking this seems like a real concern. Maybe we can leave the contents here with either fixed mailmap or no mailmap at all? Though I believe https://github.com/rust-lang/thanks/blob/master/src/reviewers.rs changes will also require updates.

I think having no mailmap at all would mean that we lose the testing of the merging of mailmap entries, which we might not want to lose

But having a fixed mailmap shouldn't be an issue - since the mailmap is fetched from the current HEAD commit, we would just have the tests check out a specific older revision of the r-l/r repo to run the tests on

Though I believe https://github.com/rust-lang/thanks/blob/master/src/reviewers.rs changes will also require updates.

Those changes would be to this repo and so my thinking was that it wouldn't be problematic to update the tests at the same time

re:repository size, one of the concerns I'd have is that overtime each change will add a new copy of all the output to the git history, which means we'd grow potentially rapidly.

We don't actually need to add each new release (or even all of the historical ones, I just did that since they were all available)

One option could be that we just store hashes of the output: if you break it, it should be possible to generate the old and the new output locally and do a proper diff. But that would prevent meaningful growth of the actual repository (hashes are tiny, after all). If we did that then we could probably avoid fixed mailmaps too? There's not that much churn on the mailmap in practice I think.

That would work, but it means that the process of reviewing changes that change the output would be a bit harder. What if instead the list of contributors was re-formatted in the tests to remove a bunch of the HTML? E.g. instead of (looking at 0.1.0.html)

            <tbody>
                <tr>
                    <td class="bn">1</td>
                    <td class="bn">Brian Anderson</td>
                    <td class="bn">1958</td>
                </tr>
                <tr>
                    <td class="bn">2</td>
                    <td class="bn">Graydon Hoare</td>
                    <td class="bn">1526</td>
                </tr>
...

we had

            <tbody>
1 |  Brian Anderson | 1958
2 | Graydon Hoare | 1526
...

it wouldn't be correct HTML, but it would make output changes easy to review (perhaps even easier than they are with the HTML) and reduce the size needed to store historical releases (saving ~155 characters per entry in the table)

@Mark-Simulacrum

Copy link
Copy Markdown
Member

it wouldn't be correct HTML, but it would make output changes easy to review (perhaps even easier than they are with the HTML) and reduce the size needed to store historical releases (saving ~155 characters per entry in the table)

I believe git compresses within the .git storage, so probably those size wins would be fairly minor. I think producing a reasonable diff could be done by comparing against the published artifacts for the most part? Those are presumably 1:1 with the local generated HTML?

I don't have a strong opinion here either way though.

@DanielEScherzer

DanielEScherzer commented May 9, 2026

Copy link
Copy Markdown
Contributor Author

it wouldn't be correct HTML, but it would make output changes easy to review (perhaps even easier than they are with the HTML) and reduce the size needed to store historical releases (saving ~155 characters per entry in the table)

I believe git compresses within the .git storage, so probably those size wins would be fairly minor. I think producing a reasonable diff could be done by comparing against the published artifacts for the most part? Those are presumably 1:1 with the local generated HTML?

I don't have a strong opinion here either way though.

We could check against the published artifacts, but that requires a network connection and fetching them individually

I've updated this to do the testing against a pinned version of the mailmap (see #104 to confirm that the mailmap is actually read from the pinned version rather than the current latest - by pinning to an older version, the tests start to fail)


Before applying that cleanup, du -sh tests/expected/ returned 6.2M, afterwards 1.4M, so I think the simplifications help and I've applied them. We could further simplify things by replacing the <nav class="flex flex-row justify-center justify-end-l items-center flex-wrap ph2 pl3-ns pr4-ns"> and parts of the <head> tag, but I think this is now small enough

@DanielEScherzer DanielEScherzer force-pushed the add-regression-tests branch from 2757875 to c03a90f Compare May 9, 2026 20:24
The point of the regression tests is to confirm that the logic to count
contributors and their commits does not break, we don't need to store full
copies of the HTML to check that. Instead, simplify the contributor entries in
the HTML before comparing with the expected output, reducing the disk space
needed to store the expected output by just over 77%.
@DanielEScherzer DanielEScherzer force-pushed the add-regression-tests branch from c03a90f to 89e4a99 Compare May 9, 2026 20:40
@DanielEScherzer DanielEScherzer requested a review from Kobzol May 12, 2026 19:37
@Kobzol

Kobzol commented Jun 15, 2026

Copy link
Copy Markdown
Member

Sorry for the delay, I didn't have much time for reviews in the past few months. Diffing against the published artifacts sounds like it might work, though once it changes you can't easily go back in history and diff against older versions.

The main thing that we want to test is:

  1. Names of the contributors
  2. Their contribution counts

For specific historical releases (doesn't even have to be all of them). We don't really need to check any HTML output, that's not so interesting for us to have a test for, I think. Is there a simple way to hook into the current rendering logic to produce e.g. a MD file or a JSON file, as you suggested, and store only that to git, and diff that?

@DanielEScherzer

Copy link
Copy Markdown
Contributor Author

Sorry for the delay, I didn't have much time for reviews in the past few months. Diffing against the published artifacts sounds like it might work, though once it changes you can't easily go back in history and diff against older versions.

The main thing that we want to test is:

  1. Names of the contributors
  2. Their contribution counts

For specific historical releases (doesn't even have to be all of them). We don't really need to check any HTML output, that's not so interesting for us to have a test for, I think. Is there a simple way to hook into the current rendering logic to produce e.g. a MD file or a JSON file, as you suggested, and store only that to git, and diff that?

My worry is that the more logic we add to make the tests simpler, the further they stray from the actual output
The easiest option would be to just replace templates/stats.hbs with something like (so that we still test the release title, in progress, counting, etc.):

{{ release_title }} Contributors

{{#if in_progress }}
We have had {{ count }} individuals contribute to {{ release }} so far.
{{else}}
We have had {{ count }} individuals contribute to {{ release }}.
{{/if}}

Rank | Name | Contributions
{{#each scores as |score| }}
{{score.rank}} | {{score.author}} | {{score.commits}}
{{/each}}

and that way we don't have all of the other parts of the HTML, and don't need to modify the production code. Though personally I would prefer to do the modifications in the test assert_file_content_matches() rather than by switching out the template

As for reducing the number of releases to test against, #37 (comment) reported that there were only unexpected changes in 1 release (1.34.0) so if we don't test every release we run the risk of an untested release page having unexpected changes

@DanielEScherzer

Copy link
Copy Markdown
Contributor Author

Before additional simplification:

root@ea7197ba2d80:/rust/thanks# du -sh tests/expected/
1.4M    tests/expected/

after removing the sections: 1.3M
after then removing the

sections: 1.2M
removing the remaining HTML bits: 1.1M

So we can drop the other stuff, giving us expected files like

Rust 1.93.1 Contributors
We have had 7 individuals contribute to 1.93.1.
Thank you so much!

1 | Josh Stone | 3
2 | Jieyou Xu | 2
3 | Alex Crichton | 1
3 | Guillaume Gomez | 1
3 | Jonathan Brouwer | 1
3 | Linshu Yang | 1
3 | bors | 1

but it wouldn't save much space on disk. Let me know if I should apply those simplifications

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants