GH-49930: [CI][C++] Pin MinGW MSYS2 packages to unblock CI#49931

Merged
raulcd merged 7 commits into apache:main from tadeja:ci-mingw-fix
May 7, 2026

@tadeja
Contributor

@tadeja tadeja commented May 5, 2026

Rationale for this change

Temporary workaround for #49930!
Both MinGW jobs have failed on every run since April 30, 2026, because of two MSYS2 updates:

  1. MINGW64: the gcc 15.2 -> 16.1 update deterministically breaks 4 tests (arrow-async-utility-test, arrow-threading-utility-test, arrow-dataset-dataset-writer-test, arrow-dataset-file-test).
    See #49930 ([CI][C++] MinGW jobs fail every run after MSYS2 toolchain updates) for per-test status, and for the connection to #49272 ([C++][CI] arrow-json-test segfaults occasionally on AMD64 Windows MinGW MINGW64 C++ job) and #49462 (GH-49272: [C++][CI] Fix intermittent segfault in arrow-json-test with MinGW).
  2. MINGW64 and CLANG64: arrow-s3fs-test fails. The aws-sdk-cpp 1.11.479 -> 1.11.801 update stopped sending Content-MD5 on DeleteObjects, but the bundled MinIO RELEASE.2024-09-13 still requires it.
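
For context on item 2: a Content-MD5 header value is the base64 encoding of the raw 16-byte MD5 digest of the request body (RFC 1864). The snippet below is a minimal Python sketch of computing that value for a DeleteObjects-style body. The XML payload and key name are made up for illustration, and this is not how aws-sdk-cpp or MinIO implement it.

```python
import base64
import hashlib


def content_md5(body: bytes) -> str:
    """Return a Content-MD5 value: base64 of the raw 16-byte MD5 digest."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")


# Hypothetical DeleteObjects request body, for illustration only.
delete_body = (
    b'<Delete xmlns="http://s3.amazonaws.com/doc/2006-03-01/">'
    b"<Object><Key>example-key</Key></Object>"
    b"</Delete>"
)

# A client that still sends the header would attach it like this.
headers = {"Content-MD5": content_md5(delete_body)}
print(headers)
```

A server that insists on this header rejects a request that omits it, which matches the MissingContentMD5 symptom reported later in this thread.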

What changes are included in this PR?

a) Workaround for 1.: a new temporary "Pin MSYS2 packages" step on MINGW64 (CLANG64 is unaffected). It pins gcc-libs to 15.2, plus C++ packages for ABI compatibility.
The step can be removed once all failures tracked in #49930 pass on current upstream MSYS2.

b) Resolves 2.: bumps the bundled MinIO to RELEASE.2025-01-20T14-49-07Z to match ci/scripts/install_minio.sh (per review). See #47908 for further discussion about migrating away from MinIO.
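
MinIO release tags embed a UTC timestamp, so two tags can be ordered by parsing it. The sketch below is an illustration only, not part of this PR. The helper names are hypothetical, and the minimum used is simply the version this PR bumps to, not a claim about which MinIO release first stopped requiring Content-MD5.

```python
from datetime import datetime, timezone


def minio_release_time(tag: str) -> datetime:
    """Parse the timestamp embedded in a MinIO release tag.

    Accepts 'RELEASE.2025-01-20T14-49-07Z', the same tag with a 'minio.' or
    'mc.' prefix, and the date-only shorthand 'RELEASE.2024-09-13'.
    """
    stamp = tag.split("RELEASE.", 1)[1]
    fmt = "%Y-%m-%dT%H-%M-%SZ" if "T" in stamp else "%Y-%m-%d"
    return datetime.strptime(stamp, fmt).replace(tzinfo=timezone.utc)


def is_at_least(tag: str, minimum: str) -> bool:
    """True if `tag` is the same release as `minimum` or a newer one."""
    return minio_release_time(tag) >= minio_release_time(minimum)


BUMPED_TO = "RELEASE.2025-01-20T14-49-07Z"  # version this PR bumps to

print(is_at_least("RELEASE.2024-09-13", BUMPED_TO))                  # False
print(is_at_least("minio.RELEASE.2025-01-20T14-49-07Z", BUMPED_TO))  # True
```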

Are these changes tested?

CI: failing without pins -> passing

Are there any user-facing changes?

No

@github-actions github-actions Bot added the awaiting review label May 5, 2026
@github-actions

github-actions Bot commented May 5, 2026

⚠️ GitHub issue #49930 has been automatically assigned in GitHub to PR creator.

@tadeja
Contributor Author

tadeja commented May 6, 2026

With the temporary CI changes here, the previously failing MinGW tests now pass...
The one exception is an arrow-flight-test Timeout on CLANG64. Could we rerun this (@kou?): https://github.com/apache/arrow/actions/runs/22742119641/job/65957707054?pr=49402#step:14:1080

This arrow-flight-test issue is already logged (for ODBC) in #49465.

@tadeja
Contributor Author

tadeja commented May 6, 2026

@raulcd do you agree that we could temporarily use the solution from this draft PR (which addresses the failures listed in the newer, broader issue #49930) to resolve the MinGW C++ failures on CI while waiting on the stuck PR #49462 (for your older issue #49272)?

Member

@raulcd raulcd left a comment

I think we faced this in the past on other jobs, and we updated the MinIO version, which contains the fix for newer aws-sdk-cpp. Shouldn't we just update the MinIO version? This is what we use on other jobs:

# Use specific versions for minio server and client to avoid CI failures on new releases.
minio_version="minio.RELEASE.2025-01-20T14-49-07Z"
mc_version="mc.RELEASE.2024-09-16T17-43-14Z"

As for the segfault issue, I think we should deal with it independently and possibly apply the proposed fix?

@github-actions github-actions Bot added awaiting changes and removed awaiting review labels May 6, 2026
@kou
Member

kou commented May 6, 2026

It seems that there are some MinIO alternatives, such as https://github.com/rustfs/rustfs.
Can we migrate from MinIO to solve this problem?

@tadeja
Contributor Author

tadeja commented May 6, 2026

> It seems that there are some MinIO alternatives, such as https://github.com/rustfs/rustfs. Can we migrate from MinIO to solve this problem?

Interesting point! @kou, rok mentions there's an existing issue for this; perhaps it's best to discuss alternatives there? #47908

@raulcd
Member

raulcd commented May 6, 2026

> It seems that there are some MinIO alternatives, such as https://github.com/rustfs/rustfs. Can we migrate from MinIO to solve this problem?

For this specific issue I think that's unnecessary. The problem is that MinGW was using an old version of aws-sdk that was compatible with the MinIO version in use. Now that they've updated aws-sdk, we just need to use the newer MinIO version. This exact same issue happened in the past with other jobs; basically, AWS shipped a breaking change. I think updating the version is enough for this one.

I do agree that we could (even should) migrate away from MinIO, as they are not supporting their OSS offering anymore:
minio/minio@27742d4

Should we create a different issue for moving away from MinIO?

@raulcd
Member

raulcd commented May 6, 2026

oh! there's an issue already, yeah, I agree we should discuss there. Thanks @tadeja !

@github-actions github-actions Bot added awaiting change review and removed awaiting changes labels May 6, 2026
@tadeja
Contributor Author

tadeja commented May 6, 2026

@tadeja tadeja marked this pull request as ready for review May 6, 2026 12:06
@tadeja tadeja requested review from assignUser, jonkeane and kou as code owners May 6, 2026 12:06
Member

@raulcd raulcd left a comment

@tadeja I'm not sure I understand why aws-sdk needs to be pinned for this failure, since the failure is unrelated to AWS (it's in arrow-json). Is it because of the other pins?

Can you try removing the pins (maybe temporarily) to validate that the S3 issue is solved? Based on the assessment, only the arrow-json failure should show up now.

@github-actions github-actions Bot added awaiting changes and removed awaiting change review labels May 6, 2026
@github-actions github-actions Bot added awaiting change review and removed awaiting changes labels May 6, 2026
@tadeja
Contributor Author

tadeja commented May 6, 2026

I hope I'm in the right forest 🙃 🌳 @raulcd I ran with all pins disabled, just the MinIO bump.
If you're OK leaving these 5 MINGW64 failures until #49462 is solved (#49272 symptoms), I'll drop the pin step entirely. Otherwise happy to restore or try a smaller pin subset.

  1. MINGW64: 5 tests fail
    https://github.com/apache/arrow/actions/runs/25442601074/job/74637722454?pr=49931#step:13:1301 - all symptoms of #49272 ([C++][CI] arrow-json-test segfaults occasionally on AMD64 Windows MinGW MINGW64 C++ job), right?
The following tests FAILED:
	 41 - arrow-async-utility-test (Exit code 0xc0000374)   arrow-tests unittest
	 44 - arrow-threading-utility-test (Timeout)            arrow-tests unittest
	 62 - arrow-dataset-dataset-writer-test (Failed)        arrow_dataset unittest
	 65 - arrow-dataset-file-test (Failed)                  arrow_dataset unittest
	 76 - arrow-s3fs-test (Timeout)                         arrow-tests filesystem unittest

TestS3FS.GetFileInfoGeneratorStress is the only s3fs subtest that hits the thread pool. The MinIO bump shifted its symptom from MissingContentMD5 to a _fut.Wait() timeout but didn't eliminate the failure.


  2. CLANG64: 100% pass
    ( https://github.com/apache/arrow/actions/runs/25442601074/job/74637722360?pr=49931#step:13:300 )
98/98 Test #76: arrow-s3fs-test ..............................   Passed   52.10 sec
100% tests passed, 0 tests failed out of 98

aws-sdk-cpp 1.11.801 + MinIO RELEASE.2025-01-20T14-49-07Z works fine. This is the current PR change
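
When comparing runs like the two above, it can help to pull the failing tests out of the ctest log mechanically. Here is a minimal Python sketch that parses a "The following tests FAILED:" summary in the shape quoted above and groups tests by failure reason. The function name and regular expression are assumptions based only on that quoted output, not on a documented ctest format guarantee.

```python
import re
from collections import defaultdict

# Matches summary lines such as:
#   "  41 - arrow-async-utility-test (Exit code 0xc0000374)   arrow-tests unittest"
FAILED_LINE = re.compile(r"^\s*(\d+)\s+-\s+(\S+)\s+\(([^)]*)\)")


def parse_failed_tests(log: str) -> list[tuple[int, str, str]]:
    """Return (test number, test name, reason) for each FAILED summary entry."""
    failures = []
    in_summary = False
    for line in log.splitlines():
        if line.strip() == "The following tests FAILED:":
            in_summary = True
            continue
        if not in_summary:
            continue
        match = FAILED_LINE.match(line)
        if match is None:
            if line.strip():
                break  # first non-blank, non-matching line ends the summary
            continue
        failures.append((int(match.group(1)), match.group(2), match.group(3)))
    return failures


# Summary copied from the MINGW64 run quoted above.
sample_log = """\
The following tests FAILED:
     41 - arrow-async-utility-test (Exit code 0xc0000374)   arrow-tests unittest
     44 - arrow-threading-utility-test (Timeout)            arrow-tests unittest
     62 - arrow-dataset-dataset-writer-test (Failed)        arrow_dataset unittest
     65 - arrow-dataset-file-test (Failed)                  arrow_dataset unittest
     76 - arrow-s3fs-test (Timeout)                         arrow-tests filesystem unittest
"""

by_reason = defaultdict(list)
for _, name, reason in parse_failed_tests(sample_log):
    by_reason[reason].append(name)

for reason, names in by_reason.items():
    print(f"{reason}: {', '.join(names)}")
```

For reference, 0xc0000374 is the Windows NTSTATUS code STATUS_HEAP_CORRUPTION.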

@tadeja
Contributor Author

tadeja commented May 7, 2026

@raulcd Correction on my last comment 😞 I reviewed the latest CI run of PR #49462 (https://github.com/apache/arrow/actions/runs/25458933029/job/74695767495?pr=49462):
arrow-json-test passes there, but the other 4 tests (arrow-async-utility-test, arrow-threading-utility-test, arrow-dataset-dataset-writer-test, arrow-dataset-file-test) still fail with the same symptoms: std::bad_weak_ptr in BatchWriteConcurrent, heap corruption in async-utility, etc. So #49462's current_thread_pool_ fix isn't a complete fix for the gcc-16-induced failures. We may have 2-4 separate bugs.

I'm updating the body of my linked issue #49930 with per-test status.

Reverted to our previously working pins for now.
Awaiting your thoughts on next steps 👍

Member

@raulcd raulcd left a comment

@kou what are your thoughts here? Should we temporarily merge this and try to fix the test failures as a separate issue? Should we just merge the minio update and keep working on fixes for the other issues on a separate PR?
Those pins seem too heavy and I am worried that we might forget them and carry those pins for a long time.

@github-actions github-actions Bot added awaiting changes and removed awaiting change review labels May 7, 2026
@kou
Member

kou commented May 7, 2026

Let's merge this workaround and fix the test failures as separate, high-priority issues. I agree that we should not keep those pins for a long time.

@tadeja
Contributor Author

tadeja commented May 7, 2026

(To let you know: I just extensively updated the linked umbrella issue #49930 (comment). New PRs can be linked to it, or new issues split out of it, as you prefer.)

Member

@kou kou left a comment

+1

@github-actions github-actions Bot added awaiting merge and removed awaiting changes labels May 7, 2026
Member

@raulcd raulcd left a comment

ok, let's merge it 👍

@raulcd
Member

raulcd commented May 7, 2026

@tadeja can you update the description and the title to match the update to minio and the pin? Description is slightly outdated now. I'll merge afterwards

@tadeja
Contributor Author

tadeja commented May 7, 2026

@raulcd updated! Let me know if anything is still missing there.

@tadeja
Contributor Author

tadeja commented May 7, 2026

@raulcd I've drafted the future PR #49945, which would remove the temporary pin step (- name: Pin MSYS2 packages), and already linked it to the existing issue #49930, which should stay open. I hope that's good.

Edit: issue #49930 was closed by the merged PR. Re-linking the draft PR to the new follow-up #49948.

@tadeja
Contributor Author

tadeja commented May 7, 2026

OK, CI is green again! @raulcd, @kou thank you for the very helpful reviews & wise thoughts!

@raulcd raulcd merged commit a0d2885 into apache:main May 7, 2026
18 checks passed
@raulcd raulcd removed the awaiting merge label May 7, 2026
@conbench-apache-arrow

After merging your PR, Conbench analyzed the 0 benchmarking runs that have been run so far on merge-commit a0d2885.

None of the specified runs were found on the Conbench server.

The full Conbench report has more details.
