AVX-512 detection and Argon2 support by NexusXe · Pull Request #330 · tevador/RandomX

NexusXe · 2026-05-24T00:08:38Z

This PR introduces an AVX-512F optimized implementation of the Argon2 round function used during dataset initialization. By reducing instruction cache and decoder pressure, this implementation yields a consistent minor hashrate improvement in benchmarks.

To prevent performance regressions on early Intel AVX-512 implementations (e.g., Skylake-X) that suffer from severe frequency/power state-transition penalties, this path is additionally gated on VAES presence (VAES appears alongside AVX-512 only on more recent microarchitectures). This ensures the AVX-512 path is auto-enabled only on architectures where the power-scaling behavior has been fixed (Ice Lake / Zen 4 and newer), so the wider instructions can be used without transition penalties.
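A minimal sketch of the gating rule described above, written as pure functions over raw CPUID/XCR0 values so it can be checked without running `cpuid`. The function and constant names are hypothetical and are not taken from the PR. The bit positions (CPUID leaf 7, sub-leaf 0: EBX bit 16 for AVX-512F, ECX bit 9 for VAES) follow the vendor manuals. The XCR0 check for OS-enabled ZMM state is an assumption about what a complete detector would also want; the PR description does not say whether it performs one.

```cpp
#include <cstdint>

// Feature bits from CPUID leaf 7, sub-leaf 0.
constexpr uint32_t kLeaf7EbxAvx512F = 1u << 16; // EBX bit 16: AVX-512 Foundation
constexpr uint32_t kLeaf7EcxVaes    = 1u << 9;  // ECX bit 9: vector AES (VAES)

// XCR0 bits the OS must set before ZMM registers are usable:
// SSE (1), AVX (2), opmask (5), ZMM_Hi256 (6), Hi16_ZMM (7).
// Assumption: the PR does not state that it checks this.
constexpr uint64_t kXcr0ZmmState = 0xE6;

// AVX-512F is usable: the CPU reports it and the OS saves ZMM state.
constexpr bool hasAvx512F(uint32_t leaf7Ebx, uint64_t xcr0) {
    return (leaf7Ebx & kLeaf7EbxAvx512F) != 0
        && (xcr0 & kXcr0ZmmState) == kXcr0ZmmState;
}

// "Good" AVX-512 in the sense of the PR: AVX-512F plus VAES.
// VAES alone is not enough, since some CPUs report VAES without AVX-512F.
constexpr bool autoEnableAvx512Argon2(uint32_t leaf7Ebx, uint32_t leaf7Ecx,
                                      uint64_t xcr0) {
    return hasAvx512F(leaf7Ebx, xcr0) && (leaf7Ecx & kLeaf7EcxVaes) != 0;
}
```

With this split, a Skylake-X-style CPU (AVX-512F set, VAES clear) still reports `hasAvx512F` as true, so a user could opt in manually, while `autoEnableAvx512Argon2` stays false.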

Support was also added to tests and benchmarks.

tevador · 2026-05-24T08:54:56Z

Can you post some benchmark results to compare AVX2 vs AVX512 cache init performance?

Also the build is failing on most platforms.

SChernykh · 2026-05-24T09:17:56Z

I don't expect this to save more than a quarter of a second compared to AVX2. Argon2 is already pretty fast on Zen 4/Ice Lake.

NexusXe added 7 commits May 23, 2026 18:34

CPU feature detection

3477b1b

Adds AVX-512F feature detection and uses VAES presence alongside to detect "good" AVX-512 support, present on Ice Lake/Zen 4 and later. This is to prevent "bad" implementations (specifically early Intel implementations) from automatically being used.

AVX-512F blamka round implementation

4d68a07

Based on src/blake2/blamka-round-avx2.h

AVX-512F Argon2 implementation

1166344

Based on src/argon2_avx2.c

Use AVX-512F Argon2

33fe836

Add AVX-512F to benchmarks & tests

1bf1e08

Add AVX-512 Argon2 files to MSVC and Clang config files

51f8369

Remove old comment

e940c9b

I was unsure if extensions past AVX-512F would be needed, but it turned out that since the primary data element for this code is a 64-bit integer, only AVX-512F is needed.
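The remark about 64-bit elements can be illustrated with a scalar sketch of the BlaMka mixing step that the Argon2 round function is built from. This is not the PR's code; the names are hypothetical and the structure follows the Argon2 reference design. Every operation here works on 64-bit lanes and has a direct AVX-512F counterpart (`_mm512_add_epi64`, `_mm512_mul_epu32`, `_mm512_xor_si512`, `_mm512_ror_epi64`), which is consistent with no extension beyond AVX-512F being required.

```cpp
#include <cstdint>

// Rotate a 64-bit word right by n bits (0 <= n < 64).
inline uint64_t rotr64(uint64_t x, unsigned n) {
    return (x >> n) | (x << ((64 - n) & 63));
}

// BlaMka mixing: x + y + 2 * lo32(x) * lo32(y), all modulo 2^64.
inline uint64_t fBlaMka(uint64_t x, uint64_t y) {
    const uint64_t lo = (x & 0xFFFFFFFFull) * (y & 0xFFFFFFFFull);
    return x + y + 2 * lo;
}

// One G application on four 64-bit words. A vectorized version runs the
// same sequence on eight independent 64-bit lanes per 512-bit register.
inline void blamkaG(uint64_t& a, uint64_t& b, uint64_t& c, uint64_t& d) {
    a = fBlaMka(a, b); d = rotr64(d ^ a, 32);
    c = fBlaMka(c, d); b = rotr64(b ^ c, 24);
    a = fBlaMka(a, b); d = rotr64(d ^ a, 16);
    c = fBlaMka(c, d); b = rotr64(b ^ c, 63);
}
```

An AVX2 version has to emulate the 64-bit rotations with shuffles or shift pairs, whereas AVX-512F offers a native 64-bit rotate. That may be one source of the reduced instruction count the description mentions, though the PR does not break this down.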

NexusXe wants to merge 7 commits into tevador:master from NexusXe:master
