Skip to content

IITM: extend n-fold azimuthal FFT acceleration to generic complex types#12

Merged
lucifer1004 merged 1 commit into
mainfrom
iitm-generic-fft
Jun 7, 2026
Merged

IITM: extend n-fold azimuthal FFT acceleration to generic complex types#12
lucifer1004 merged 1 commit into
mainfrom
iitm-generic-fft

Conversation

@lucifer1004

@lucifer1004 lucifer1004 commented Jun 7, 2026

Copy link
Copy Markdown
Member

The n-fold IITM (src/IITM/nfold.jl) accelerates the azimuthal integral with an FFT, but the fast path was gated on CT <: ComplexF64 (FFTW only). That gate matters a lot: the azimuthal cis sum sits in the innermost of the O(nmax⁴) (p,q)-pair loop, so the FFT removes a whole Nφ factor — measured ~31× faster than the direct sum at Nφ=40, nmax=8. Double64 / BigFloat / Arb shapes fell back to the direct path and paid that penalty on top of their slower arithmetic.

Now every complex type the package uses gets an FFT backend (n-fold path):

  • ComplexF64 → FFTW (unchanged; cached plan)
  • Complex{Double64}, Complex{BigFloat} → GenericFFT (AbstractFFTs.plan_fft, fresh plan
    per call so a BigFloat plan can't go stale across precision changes)
  • Acb → Arblib.dft! (native Arb acb_dft, per column)
  • anything else → the existing direct cis fallback

Also: preallocate-and-reuse all azimuthal scratch in the workspace instead of re-allocating it on every one of the Nr radial steps. The workspace (built once) now owns coeff/coeff_inv (generic) and col_in/col_out/coeff_mat/coeff_inv_mat (the new _AcbFourierWorkspace); the coefficient functions write in place. This removes ~Nr × (4nmax+1)×Nϑ allocations per T-matrix on the generic path and, more importantly, the per-step Arb/Flint allocations on the Acb path. Reuse is safe: the coefficients are fully consumed by the (read-only, threaded) pair loop within the same radial step before the next step overwrites them.

Changes:

  • Project.toml: GenericFFT promoted to an explicit dep; using GenericFFT in src/TransitionMatrices.jl loads its generic plan_fft methods.
  • src/IITM/fourier.jl: _iitm_fft_capable predicate; _AzimuthalFourierWorkspace{CT,P} (now also holding coeff/coeff_inv); new _AcbFourierWorkspace; backend-dispatched forward column DFT; Acb coefficients via AcbMatrix/AcbVector (avoids Matrix{Acb} GC hazards). Workspace constructor takes nₘₐₓ and (Acb) prec.
  • src/IITM/nfold.jl: the three CT <: ComplexF64 gates → _iitm_fft_capable(CT); pass nₘₐₓ and prec = Arblib.precision(s.m) (Acb) to the workspace. Retires the previously-unreachable ComplexF64 direct path (scalar- bug).
  • src/IITM/arbitrary.jl, src/IITM/linearization.jl: updated to the new workspace constructor arity (still ComplexF64-only FFT there — generic FFT is the n-fold path).

Verified: cross sections match the ComplexF64 (FFTW) reference for the SAME shape & discretisation to ~1e-15 (Complex{Double64} and Acb, Prism{4}); FFT confirmed active for both new backends by flat Nφ-scaling; generic coeff allocation drops to 0 B/step. Full suite: 48,193 tests pass.

Summary by CodeRabbit

  • New Features

    • FFT-based acceleration now supports generic complex types and arbitrary-precision arithmetic, broadening high-precision simulation capability.
    • Performance improvements in transform handling for native double and non-native complex types.
  • Tests

    • Added tests validating generic-FFT and extended-/arbitrary-precision computation paths against existing references.
  • Chores

    • Introduced support for a generic FFT backend.

@coderabbitai

coderabbitai Bot commented Jun 7, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: c799ec3e-9409-4cfb-a3ee-dc8b6d2ce6b7

📥 Commits

Reviewing files that changed from the base of the PR and between 12d2b80 and dc6331c.

📒 Files selected for processing (6)
  • Project.toml
  • src/IITM/arbitrary.jl
  • src/IITM/fourier.jl
  • src/IITM/linearization.jl
  • src/IITM/nfold.jl
  • src/TransitionMatrices.jl
🚧 Files skipped from review as they are similar to previous changes (6)
  • Project.toml
  • src/IITM/arbitrary.jl
  • src/TransitionMatrices.jl
  • src/IITM/linearization.jl
  • src/IITM/nfold.jl
  • src/IITM/fourier.jl

📝 Walkthrough

Walkthrough

Generalizes FFT acceleration from ComplexF64-only to support any Complex{<:AbstractFloat} and Acb/Arb via GenericFFT: adds capability checks, parametric azimuthal workspaces, dual FFT plan/DFT paths, Acb coefficient extraction, dependency updates, and tests.

Changes

FFT Support Generalization

Layer / File(s) Summary
Dependency and module-level setup
Project.toml, src/TransitionMatrices.jl
Add GenericFFT to deps/compat and import it in the main module.
FFT capability predicate & parametric workspace
src/IITM/fourier.jl (lines 8–39)
Add _iitm_fft_capable predicate, parametric _AzimuthalFourierWorkspace{CT,P}, and _AcbFourierWorkspace with preallocated Arblib scratch/storage.
Generic FFT plan generation and DFT application
src/IITM/fourier.jl (lines 52–129)
Implement fresh plan generation for generic Complex types, retain cached FFTW plan for ComplexF64, and add separate _apply_forward_dft! overloads for FFTW vs GenericFFT.
Acb-specific Fourier coefficient extraction
src/IITM/fourier.jl (lines 142–188)
Implement coefficient extraction for Acb using inferred precision, preallocated buffers, column-wise Arblib.dft!, and workspace-backed coeff/coeff_inv.
Propagate workspace signature updates
src/IITM/arbitrary.jl, src/IITM/linearization.jl
Update calls to _azimuthal_fourier_workspace to pass nₘₐₓ for complex branches.
FFT path selection and resource construction
src/IITM/nfold.jl
Replace CT <: ComplexF64 checks with _iitm_fft_capable(CT); conditionally build FFT resources and forward precision for Acb.
Generic FFT validation tests
src/IITM/nfold.jl (lines 218–269)
Add tests asserting Complex{Double64} and Acb routes match ComplexF64 reference within tolerances.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

I nibble through arrays and plans so neat,
From ComplexF64 to GenericFFT I leap,
With Acb's precision tucked in my pack,
Preallocations save me from looking back. 🐇

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely summarizes the main objective of the PR—extending n-fold azimuthal FFT acceleration to support generic complex types beyond ComplexF64.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch iitm-generic-fft

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
src/IITM/fourier.jl (1)

151-152: 💤 Low value

Minor: consider caching one_prec in the workspace to eliminate allocation per radial step.

Acb(1; prec = prec) allocates a new Arb/Flint object on each call to _azimuthal_fourier_coefficients. While it's only one allocation per call (not per element), this could be avoided by storing a unit Acb in _AcbFourierWorkspace during construction.

That said, this is a very small overhead compared to the Nφ×Nϑ work, so it's optional.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/IITM/fourier.jl` around lines 151 - 152, Cache the unit complex Arb value
instead of allocating it every call: add a field (e.g. one_prec or unit_acb) to
_AcbFourierWorkspace and initialize it once in the workspace constructor using
Acb(1; prec = Arblib.precision(ε[1,1])) (or using the workspace precision), then
modify _azimuthal_fourier_coefficients to use that cached field rather than
calling Acb(1; prec = prec) on each invocation; ensure the cached value matches
the required precision when the workspace is created.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@Project.toml`:
- Line 29: In the Project.toml [compat] section the GenericFFT entry is
currently pinned as a caret range "0.1.7" which makes the minimum 0.1.7 and is
unsatisfiable because the latest published GenericFFT is v0.1.5; update the
GenericFFT compat entry (the line `GenericFFT = "0.1.7"`) to a compatible range
such as `GenericFFT = "0.1"` (or at minimum `GenericFFT = "0.1.5"`) so the
manifest accepts available releases.

In `@src/IITM/nfold.jl`:
- Around line 49-52: The call Arblib.precision(s.m) can raise a MethodError for
Acb because Arblib does not define precision(::Acb); update the conditional to
obtain the precision directly from the Acb object (or its field/accessor) when
CT === Acb and pass that to _azimuthal_fourier_workspace instead of
Arblib.precision(s.m); locate the creation of fourier_workspace in the
_iitm_fft_capable(CT) branch and replace the Arblib.precision(s.m) call with the
correct accessor on the Acb value (or use a safe helper that returns 0 for
non-Acb types) so _azimuthal_fourier_workspace receives a valid numeric
precision.

---

Nitpick comments:
In `@src/IITM/fourier.jl`:
- Around line 151-152: Cache the unit complex Arb value instead of allocating it
every call: add a field (e.g. one_prec or unit_acb) to _AcbFourierWorkspace and
initialize it once in the workspace constructor using Acb(1; prec =
Arblib.precision(ε[1,1])) (or using the workspace precision), then modify
_azimuthal_fourier_coefficients to use that cached field rather than calling
Acb(1; prec = prec) on each invocation; ensure the cached value matches the
required precision when the workspace is created.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 79cb62df-f318-45df-8d3d-36ef5311ef33

📥 Commits

Reviewing files that changed from the base of the PR and between ab67608 and 12d2b80.

📒 Files selected for processing (6)
  • Project.toml
  • src/IITM/arbitrary.jl
  • src/IITM/fourier.jl
  • src/IITM/linearization.jl
  • src/IITM/nfold.jl
  • src/TransitionMatrices.jl

Comment thread Project.toml Outdated
Comment thread src/IITM/nfold.jl
Comment on lines +49 to +52
fourier_workspace = _iitm_fft_capable(CT) ?
_azimuthal_fourier_workspace(CT, Nφ, Nϑ, nₘₐₓ;
prec = CT === Acb ? Arblib.precision(s.m) : 0) : nothing
fourier_modes = _iitm_fft_capable(CT) ? _azimuthal_fourier_mode_bins(nₘₐₓ, N) : nothing

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

Does Arblib.jl version 1.0.0 provide an Arblib.precision function that accepts an Acb argument and returns the working precision in bits?

💡 Result:

Arblib.jl version 1.0.0 does not provide a function named Arblib.precision that specifically accepts an Acb argument to return its working precision in bits [1][2][3]. While Arblib.jl uses Julia's precision function in some contexts—typically via Base.precision for types or by accessing a precision field within its Julia structs (like Arb and Acb)—it does not expose an exported Arblib.precision method for this purpose [1][4][3]. In Arblib.jl, Julia structs such as Arb and Acb contain a precision field that stores the precision used by that specific object [1][4][3]. You can typically access this field directly on an instance (e.g., my_acb.precision). Furthermore, Arblib follows a naming convention where many Arb C-library functions are wrapped and "Juliafied," often mapping to keyword arguments (like prec) rather than dedicated precision-querying functions [1][3]. For querying precision related to the underlying Arb/FLINT library, users may also rely on C-level functions like acb_bits if accessing low-level wrappers [5][6].

Citations:


Fix possible API mismatch in Arblib.precision(s.m) for Acb (src/IITM/nfold.jl:49-52)
When CT === Acb, the code calls Arblib.precision(s.m); Arblib.jl v1.0.0 sources indicate there isn’t an Arblib.precision(::Acb) API, so this can throw a MethodError. Use the correct Arblib precision accessor for Acb (e.g., the precision stored on the Acb object) instead.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/IITM/nfold.jl` around lines 49 - 52, The call Arblib.precision(s.m) can
raise a MethodError for Acb because Arblib does not define precision(::Acb);
update the conditional to obtain the precision directly from the Acb object (or
its field/accessor) when CT === Acb and pass that to
_azimuthal_fourier_workspace instead of Arblib.precision(s.m); locate the
creation of fourier_workspace in the _iitm_fft_capable(CT) branch and replace
the Arblib.precision(s.m) call with the correct accessor on the Acb value (or
use a safe helper that returns 0 for non-Acb types) so
_azimuthal_fourier_workspace receives a valid numeric precision.

@codecov

codecov Bot commented Jun 7, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 98.64865% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 95.87%. Comparing base (ab67608) to head (dc6331c).

Files with missing lines Patch % Lines
src/IITM/fourier.jl 98.50% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #12      +/-   ##
==========================================
+ Coverage   95.78%   95.87%   +0.08%     
==========================================
  Files          35       35              
  Lines        4726     4779      +53     
==========================================
+ Hits         4527     4582      +55     
+ Misses        199      197       -2     
Flag Coverage Δ
julia-1-macos-latest 97.04% <95.83%> (+0.03%) ⬆️
julia-1-ubuntu-latest 97.04% <95.83%> (+0.03%) ⬆️
julia-1-windows-latest 97.04% <95.83%> (+0.03%) ⬆️
julia-1.10-ubuntu-latest 95.80% <98.64%> (+0.08%) ⬆️
julia-nightly-ubuntu-latest 96.98% <95.83%> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

The n-fold IITM (src/IITM/nfold.jl) accelerates the azimuthal integral with an FFT,
but the fast path was gated on `CT <: ComplexF64` (FFTW only). That gate matters a lot:
the azimuthal `cis` sum sits in the innermost of the O(nmax⁴) (p,q)-pair loop, so the
FFT removes a whole Nφ factor — measured ~31× faster than the direct sum at Nφ=40,
nmax=8. Double64 / BigFloat / Arb shapes fell back to the direct path and paid that
penalty on top of their slower arithmetic.

Now every complex type the package uses gets an FFT backend (n-fold path):
- ComplexF64                         → FFTW (unchanged; cached plan)
- Complex{Double64}, Complex{BigFloat} → GenericFFT (AbstractFFTs.plan_fft, fresh plan
  per call so a BigFloat plan can't go stale across precision changes)
- Acb                                → Arblib.dft! (native Arb acb_dft, per column)
- anything else                      → the existing direct `cis` fallback

Also: preallocate-and-reuse all azimuthal scratch in the workspace instead of
re-allocating it on every one of the Nr radial steps. The workspace (built once) now
owns `coeff`/`coeff_inv` (generic) and `col_in`/`col_out`/`coeff_mat`/`coeff_inv_mat`
(the new `_AcbFourierWorkspace`); the coefficient functions write in place. This removes
~Nr × (4nmax+1)×Nϑ allocations per T-matrix on the generic path and, more importantly,
the per-step Arb/Flint allocations on the Acb path. Reuse is safe: the coefficients are
fully consumed by the (read-only, threaded) pair loop within the same radial step before
the next step overwrites them.

Changes:
- Project.toml: GenericFFT promoted to an explicit dep; `using GenericFFT` in
  src/TransitionMatrices.jl loads its generic plan_fft methods.
- src/IITM/fourier.jl: `_iitm_fft_capable` predicate; `_AzimuthalFourierWorkspace{CT,P}`
  (now also holding coeff/coeff_inv); new `_AcbFourierWorkspace`; backend-dispatched
  forward column DFT; Acb coefficients via AcbMatrix/AcbVector (avoids Matrix{Acb} GC
  hazards). Workspace constructor takes `nₘₐₓ` and (Acb) `prec`.
- src/IITM/nfold.jl: the three `CT <: ComplexF64` gates → `_iitm_fft_capable(CT)`; pass
  `nₘₐₓ` and `prec = Arblib.precision(s.m)` (Acb) to the workspace. Retires the
  previously-unreachable ComplexF64 direct path (scalar-`wφ` bug).
- src/IITM/arbitrary.jl, src/IITM/linearization.jl: updated to the new workspace
  constructor arity (still ComplexF64-only FFT there — generic FFT is the n-fold path).

Verified: cross sections match the ComplexF64 (FFTW) reference for the SAME shape &
discretisation to ~1e-15 (Complex{Double64} and Acb, Prism{4}); FFT confirmed active for
both new backends by flat Nφ-scaling; generic coeff allocation drops to 0 B/step. Full
suite: 48,193 tests pass.
@lucifer1004 lucifer1004 merged commit 9383750 into main Jun 7, 2026
12 checks passed
@lucifer1004 lucifer1004 deleted the iitm-generic-fft branch June 7, 2026 11:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant