IITM: extend n-fold azimuthal FFT acceleration to generic complex types#12
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (6)
🚧 Files skipped from review as they are similar to previous changes (6)
📝 WalkthroughWalkthroughGeneralizes FFT acceleration from ComplexF64-only to support any Complex{<:AbstractFloat} and Acb/Arb via GenericFFT: adds capability checks, parametric azimuthal workspaces, dual FFT plan/DFT paths, Acb coefficient extraction, dependency updates, and tests. ChangesFFT Support Generalization
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
src/IITM/fourier.jl (1)
151-152: 💤 Low valueMinor: consider caching
one_precin the workspace to eliminate allocation per radial step.
Acb(1; prec = prec)allocates a new Arb/Flint object on each call to_azimuthal_fourier_coefficients. While it's only one allocation per call (not per element), this could be avoided by storing a unit Acb in_AcbFourierWorkspaceduring construction.That said, this is a very small overhead compared to the Nφ×Nϑ work, so it's optional.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/IITM/fourier.jl` around lines 151 - 152, Cache the unit complex Arb value instead of allocating it every call: add a field (e.g. one_prec or unit_acb) to _AcbFourierWorkspace and initialize it once in the workspace constructor using Acb(1; prec = Arblib.precision(ε[1,1])) (or using the workspace precision), then modify _azimuthal_fourier_coefficients to use that cached field rather than calling Acb(1; prec = prec) on each invocation; ensure the cached value matches the required precision when the workspace is created.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@Project.toml`:
- Line 29: In the Project.toml [compat] section the GenericFFT entry is
currently pinned as a caret range "0.1.7" which makes the minimum 0.1.7 and is
unsatisfiable because the latest published GenericFFT is v0.1.5; update the
GenericFFT compat entry (the line `GenericFFT = "0.1.7"`) to a compatible range
such as `GenericFFT = "0.1"` (or at minimum `GenericFFT = "0.1.5"`) so the
manifest accepts available releases.
In `@src/IITM/nfold.jl`:
- Around line 49-52: The call Arblib.precision(s.m) can raise a MethodError for
Acb because Arblib does not define precision(::Acb); update the conditional to
obtain the precision directly from the Acb object (or its field/accessor) when
CT === Acb and pass that to _azimuthal_fourier_workspace instead of
Arblib.precision(s.m); locate the creation of fourier_workspace in the
_iitm_fft_capable(CT) branch and replace the Arblib.precision(s.m) call with the
correct accessor on the Acb value (or use a safe helper that returns 0 for
non-Acb types) so _azimuthal_fourier_workspace receives a valid numeric
precision.
---
Nitpick comments:
In `@src/IITM/fourier.jl`:
- Around line 151-152: Cache the unit complex Arb value instead of allocating it
every call: add a field (e.g. one_prec or unit_acb) to _AcbFourierWorkspace and
initialize it once in the workspace constructor using Acb(1; prec =
Arblib.precision(ε[1,1])) (or using the workspace precision), then modify
_azimuthal_fourier_coefficients to use that cached field rather than calling
Acb(1; prec = prec) on each invocation; ensure the cached value matches the
required precision when the workspace is created.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 79cb62df-f318-45df-8d3d-36ef5311ef33
📒 Files selected for processing (6)
Project.tomlsrc/IITM/arbitrary.jlsrc/IITM/fourier.jlsrc/IITM/linearization.jlsrc/IITM/nfold.jlsrc/TransitionMatrices.jl
| fourier_workspace = _iitm_fft_capable(CT) ? | ||
| _azimuthal_fourier_workspace(CT, Nφ, Nϑ, nₘₐₓ; | ||
| prec = CT === Acb ? Arblib.precision(s.m) : 0) : nothing | ||
| fourier_modes = _iitm_fft_capable(CT) ? _azimuthal_fourier_mode_bins(nₘₐₓ, N) : nothing |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
Does Arblib.jl version 1.0.0 provide an Arblib.precision function that accepts an Acb argument and returns the working precision in bits?
💡 Result:
Arblib.jl version 1.0.0 does not provide a function named Arblib.precision that specifically accepts an Acb argument to return its working precision in bits [1][2][3]. While Arblib.jl uses Julia's precision function in some contexts—typically via Base.precision for types or by accessing a precision field within its Julia structs (like Arb and Acb)—it does not expose an exported Arblib.precision method for this purpose [1][4][3]. In Arblib.jl, Julia structs such as Arb and Acb contain a precision field that stores the precision used by that specific object [1][4][3]. You can typically access this field directly on an instance (e.g., my_acb.precision). Furthermore, Arblib follows a naming convention where many Arb C-library functions are wrapped and "Juliafied," often mapping to keyword arguments (like prec) rather than dedicated precision-querying functions [1][3]. For querying precision related to the underlying Arb/FLINT library, users may also rely on C-level functions like acb_bits if accessing low-level wrappers [5][6].
Citations:
- 1: https://www.juliapackages.com/p/arblib
- 2: https://github.com/kalmarek/Arblib.jl/releases/tag/v1.0.0
- 3: https://github.com/kalmarek/Arblib.jl/
- 4: https://github.com/kalmarek/Arblib.jl
- 5: http://arblib.org/acb.html
- 6: https://flintlib.org/doc/acb.html
Fix possible API mismatch in Arblib.precision(s.m) for Acb (src/IITM/nfold.jl:49-52)
When CT === Acb, the code calls Arblib.precision(s.m); Arblib.jl v1.0.0 sources indicate there isn’t an Arblib.precision(::Acb) API, so this can throw a MethodError. Use the correct Arblib precision accessor for Acb (e.g., the precision stored on the Acb object) instead.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/IITM/nfold.jl` around lines 49 - 52, The call Arblib.precision(s.m) can
raise a MethodError for Acb because Arblib does not define precision(::Acb);
update the conditional to obtain the precision directly from the Acb object (or
its field/accessor) when CT === Acb and pass that to
_azimuthal_fourier_workspace instead of Arblib.precision(s.m); locate the
creation of fourier_workspace in the _iitm_fft_capable(CT) branch and replace
the Arblib.precision(s.m) call with the correct accessor on the Acb value (or
use a safe helper that returns 0 for non-Acb types) so
_azimuthal_fourier_workspace receives a valid numeric precision.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #12 +/- ##
==========================================
+ Coverage 95.78% 95.87% +0.08%
==========================================
Files 35 35
Lines 4726 4779 +53
==========================================
+ Hits 4527 4582 +55
+ Misses 199 197 -2
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
The n-fold IITM (src/IITM/nfold.jl) accelerates the azimuthal integral with an FFT,
but the fast path was gated on `CT <: ComplexF64` (FFTW only). That gate matters a lot:
the azimuthal `cis` sum sits in the innermost of the O(nmax⁴) (p,q)-pair loop, so the
FFT removes a whole Nφ factor — measured ~31× faster than the direct sum at Nφ=40,
nmax=8. Double64 / BigFloat / Arb shapes fell back to the direct path and paid that
penalty on top of their slower arithmetic.
Now every complex type the package uses gets an FFT backend (n-fold path):
- ComplexF64 → FFTW (unchanged; cached plan)
- Complex{Double64}, Complex{BigFloat} → GenericFFT (AbstractFFTs.plan_fft, fresh plan
per call so a BigFloat plan can't go stale across precision changes)
- Acb → Arblib.dft! (native Arb acb_dft, per column)
- anything else → the existing direct `cis` fallback
Also: preallocate-and-reuse all azimuthal scratch in the workspace instead of
re-allocating it on every one of the Nr radial steps. The workspace (built once) now
owns `coeff`/`coeff_inv` (generic) and `col_in`/`col_out`/`coeff_mat`/`coeff_inv_mat`
(the new `_AcbFourierWorkspace`); the coefficient functions write in place. This removes
~Nr × (4nmax+1)×Nϑ allocations per T-matrix on the generic path and, more importantly,
the per-step Arb/Flint allocations on the Acb path. Reuse is safe: the coefficients are
fully consumed by the (read-only, threaded) pair loop within the same radial step before
the next step overwrites them.
Changes:
- Project.toml: GenericFFT promoted to an explicit dep; `using GenericFFT` in
src/TransitionMatrices.jl loads its generic plan_fft methods.
- src/IITM/fourier.jl: `_iitm_fft_capable` predicate; `_AzimuthalFourierWorkspace{CT,P}`
(now also holding coeff/coeff_inv); new `_AcbFourierWorkspace`; backend-dispatched
forward column DFT; Acb coefficients via AcbMatrix/AcbVector (avoids Matrix{Acb} GC
hazards). Workspace constructor takes `nₘₐₓ` and (Acb) `prec`.
- src/IITM/nfold.jl: the three `CT <: ComplexF64` gates → `_iitm_fft_capable(CT)`; pass
`nₘₐₓ` and `prec = Arblib.precision(s.m)` (Acb) to the workspace. Retires the
previously-unreachable ComplexF64 direct path (scalar-`wφ` bug).
- src/IITM/arbitrary.jl, src/IITM/linearization.jl: updated to the new workspace
constructor arity (still ComplexF64-only FFT there — generic FFT is the n-fold path).
Verified: cross sections match the ComplexF64 (FFTW) reference for the SAME shape &
discretisation to ~1e-15 (Complex{Double64} and Acb, Prism{4}); FFT confirmed active for
both new backends by flat Nφ-scaling; generic coeff allocation drops to 0 B/step. Full
suite: 48,193 tests pass.
12d2b80 to
dc6331c
Compare
The n-fold IITM (src/IITM/nfold.jl) accelerates the azimuthal integral with an FFT, but the fast path was gated on
CT <: ComplexF64(FFTW only). That gate matters a lot: the azimuthalcissum sits in the innermost of the O(nmax⁴) (p,q)-pair loop, so the FFT removes a whole Nφ factor — measured ~31× faster than the direct sum at Nφ=40, nmax=8. Double64 / BigFloat / Arb shapes fell back to the direct path and paid that penalty on top of their slower arithmetic.Now every complex type the package uses gets an FFT backend (n-fold path):
per call so a BigFloat plan can't go stale across precision changes)
cisfallbackAlso: preallocate-and-reuse all azimuthal scratch in the workspace instead of re-allocating it on every one of the Nr radial steps. The workspace (built once) now owns
coeff/coeff_inv(generic) andcol_in/col_out/coeff_mat/coeff_inv_mat(the new_AcbFourierWorkspace); the coefficient functions write in place. This removes ~Nr × (4nmax+1)×Nϑ allocations per T-matrix on the generic path and, more importantly, the per-step Arb/Flint allocations on the Acb path. Reuse is safe: the coefficients are fully consumed by the (read-only, threaded) pair loop within the same radial step before the next step overwrites them.Changes:
using GenericFFTin src/TransitionMatrices.jl loads its generic plan_fft methods._iitm_fft_capablepredicate;_AzimuthalFourierWorkspace{CT,P}(now also holding coeff/coeff_inv); new_AcbFourierWorkspace; backend-dispatched forward column DFT; Acb coefficients via AcbMatrix/AcbVector (avoids Matrix{Acb} GC hazards). Workspace constructor takesnₘₐₓand (Acb)prec.CT <: ComplexF64gates →_iitm_fft_capable(CT); passnₘₐₓandprec = Arblib.precision(s.m)(Acb) to the workspace. Retires the previously-unreachable ComplexF64 direct path (scalar-wφbug).Verified: cross sections match the ComplexF64 (FFTW) reference for the SAME shape & discretisation to ~1e-15 (Complex{Double64} and Acb, Prism{4}); FFT confirmed active for both new backends by flat Nφ-scaling; generic coeff allocation drops to 0 B/step. Full suite: 48,193 tests pass.
Summary by CodeRabbit
New Features
Tests
Chores