AVX-512 detection and Argon2 support#330

Open
NexusXe wants to merge 7 commits into tevador:master from NexusXe:master
NexusXe commented May 24, 2026

This PR introduces an AVX-512F-optimized implementation of the Argon2 round function used during dataset initialization. By reducing instruction cache and decoder pressure, this implementation yields a small but consistent hashrate improvement in benchmarks.

To prevent performance regressions on early Intel AVX-512 implementations (e.g., Skylake-X), which suffer from severe frequency and power-state transition penalties, this path is additionally gated on VAES presence. VAES appears alongside AVX-512 only on more recent microarchitectures, so the AVX-512 path is auto-enabled only on architectures where this power-scaling behavior has been fixed (Ice Lake / Zen 4 and newer) and the wider instructions can be used without transition penalties.

Support was also added to tests and benchmarks.
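The gating rule described above can be sketched as a pure function over raw CPUID output. This is only an illustration, not the code in this PR: the name `avx512_auto_enable` and its parameters are hypothetical, and the XCR0 check for OS-enabled ZMM state is an added assumption that the description does not mention. The bit positions (AVX-512F in EBX bit 16 and VAES in ECX bit 9 of CPUID leaf 7, subleaf 0) follow the Intel SDM.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID.(EAX=7,ECX=0) feature bits, per the Intel SDM. */
#define CPUID7_EBX_AVX512F (UINT32_C(1) << 16)
#define CPUID7_ECX_VAES    (UINT32_C(1) << 9)

/* XCR0 state bits the OS must enable before ZMM registers are usable:
 * SSE (1), AVX (2), opmask (5), ZMM_Hi256 (6), Hi16_ZMM (7).
 * Assumption: the PR description does not say whether XCR0 is checked. */
#define XCR0_AVX512_STATE UINT64_C(0xE6)

/* Hypothetical helper: returns true only when AVX-512F and VAES are both
 * reported and the OS saves the full ZMM state. The caller supplies the
 * EBX/ECX values from CPUID leaf 7 (subleaf 0) and the XCR0 value. */
static bool avx512_auto_enable(uint32_t leaf7_ebx, uint32_t leaf7_ecx,
                               uint64_t xcr0)
{
    bool has_avx512f  = (leaf7_ebx & CPUID7_EBX_AVX512F) != 0;
    bool has_vaes     = (leaf7_ecx & CPUID7_ECX_VAES) != 0;
    bool os_saves_zmm = (xcr0 & XCR0_AVX512_STATE) == XCR0_AVX512_STATE;

    /* A Skylake-X-like CPU reports AVX-512F without VAES and is rejected. */
    return has_avx512f && has_vaes && os_saves_zmm;
}
```

Taking the register values as arguments keeps the decision logic testable on any host; an actual build would still need a platform-specific CPUID/XGETBV wrapper to obtain them.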

NexusXe added 7 commits May 23, 2026 18:34
Adds AVX-512F feature detection and uses VAES presence alongside it to
detect "good" AVX-512 support, present on Ice Lake/Zen 4 and later.

This is to prevent "bad" implementations (specifically early Intel
implementations) from automatically being used.
Based on src/blake2/blamka-round-avx2.h
Based on src/argon2_avx2.c
I was unsure if extensions past AVX-512F would be needed, but it turned
out that since the primary data element for this code is a 64-bit
integer, only AVX-512F is needed.
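To illustrate why 64-bit lanes are the only data type involved, here is a scalar sketch of the BlaMka mixing step as it appears in the Argon2 reference code. It is not taken from this PR, and the function names are illustrative. Each operation (64-bit add, 32x32-to-64-bit multiply, XOR, 64-bit rotate) has, to my knowledge, a direct AVX-512F counterpart such as `_mm512_add_epi64`, `_mm512_mul_epu32`, `_mm512_xor_si512`, and `_mm512_ror_epi64`.

```c
#include <stdint.h>

/* Rotate a 64-bit word right by n bits (n must be in 1..63). */
static uint64_t rotr64(uint64_t x, unsigned n)
{
    return (x >> n) | (x << (64 - n));
}

/* BlaMka primitive: x + y + 2 * lo32(x) * lo32(y), all modulo 2^64. */
static uint64_t fblamka(uint64_t x, uint64_t y)
{
    const uint64_t m = UINT64_C(0xFFFFFFFF);
    return x + y + 2 * ((x & m) * (y & m));
}

/* One G mixing step on four 64-bit words, with the rotation amounts
 * 32, 24, 16, 63 used by the Argon2 reference implementation. */
static void blamka_g(uint64_t *a, uint64_t *b, uint64_t *c, uint64_t *d)
{
    *a = fblamka(*a, *b); *d = rotr64(*d ^ *a, 32);
    *c = fblamka(*c, *d); *b = rotr64(*b ^ *c, 24);
    *a = fblamka(*a, *b); *d = rotr64(*d ^ *a, 16);
    *c = fblamka(*c, *d); *b = rotr64(*b ^ *c, 63);
}
```

A vectorized version would apply the same sequence to eight 64-bit lanes per ZMM register, which matches the observation that no extension beyond AVX-512F is required.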
@tevador
Owner

tevador commented May 24, 2026

Can you post some benchmark results to compare AVX2 vs AVX512 cache init performance?

Also the build is failing on most platforms.

@SChernykh
Collaborator

I don't expect more than a quarter of a second saved compared to AVX2. Argon2 is already pretty fast on Zen 4/Ice Lake.
