feat(simulator): support infinite-granularity updates (dw_min=0) #778
Open
Zhaoxian-Wu wants to merge 1 commit into
…zero Signed-off-by: Zhaoxian Wu <wuzhaoxian97@gmail.com>
Background
AIHWKit can simulate various hardware non-idealities, such as limited state (`dw_min`), response function, variation, and noise.

For diagnostic purposes it is often necessary to isolate these effects. Concretely, a user studying device behavior wants to answer "how much of this error/degradation comes from finite `dw_min` granularity, versus from the response function or from variation?"

Today there is no clean way to remove only the `dw_min` contribution: shrinking `dw_min` toward zero makes the pulse count diverge, which is both prohibitively slow and dominated by accumulated stochastic noise, so the `dw_min → 0` limit is never actually reachable.

This PR makes that limit a first-class, exactly-computed mode. Setting `dw_min = 0` decouples the impact of `dw_min` from the other update non-idealities (response function and variation), so a user can compare runs with finite `dw_min` against the ideal zero-granularity baseline while keeping every other device characteristic fixed.

What it does
Setting `dw_min = 0` on any `PulsedDevice` activates infinite-granularity (IG) mode: instead of simulating a stochastic pulse train, the tile applies the exact mean-field limit of the update in a single deterministic step. The result is a noise-free update that still respects bounds and the device's weight-dependent response, but carries no `dw_min` granularity and no `dw_min`-related variation.

In IG mode the per-coincidence stochastic update is replaced by its expectation:

$$\Delta w_{ij} = -\,\mathrm{lr}\; d_i\, x_j\; q(w_{ij})$$

where `q(w)` is the device-specific response function (the weight-dependent scale normally applied per pulse coincidence), and `x`/`d` are the forward and backward signals.

To keep `q(w)` intact while removing only the granularity:

- `dw_min`-related variation (device-to-device `dw_min_dtod` and cycle-to-cycle `dw_min_std`) is dropped;
- `populate()` treats `dw_min = 0` as unit response (`effective_dw_min = 1`) so the per-element scales encode `q(w)` directly;
- the `UpdateParameters` fields `desired_bl`, `update_bl_management`, `update_management`, and `fixed_bl` are bypassed and have no effect in IG mode (the dispatch returns before the bit-line maker is ever invoked).

Usage
IG mode is opt-in through a single config field: set `dw_min = 0` on any pulsed device; no other API change is needed. It works the same at the layer level and the tile level.

Diagnostic use case: isolating the `dw_min` contribution

Run the same device at two granularities while keeping the response function and variation fixed. The difference is then exactly the effect of finite `dw_min`.

Note that `dw_min = 0` removes only the granularity; variation that is not tied to `dw_min` is still applied. To reach the exact mean-field limit `w -= lr * (dᵀx)`, disable the remaining non-idealities as well.

Key changes
- Dispatch (`rpu_weight_updater.cpp`): when the weight granularity is `≤ 0`, `updateVectorWithDevice` routes through the IG path (`initUpdateCycle` → `doInfiniteGranularityUpdate` → `finishUpdateCycle`) instead of the stochastic pulsed updater.
- Base device (`rpu_pulsed_device.{h,cpp}`): new virtual `doInfiniteGranularityUpdate(...)` with a default ConstantStep-style (weight-independent) implementation, plus the `IG_UPDATE_W_LOOP_INNER` helper macro. `populate()` switches to unit response and disables `dw_min` d-to-d variation when `dw_min = 0`.
- Per-device overrides (CPU `.cpp` + CUDA `.cu`): weight-dependent response for ConstantStep, LinearStep, ExpStep, PowStep, PiecewiseStep, SoftBoundsReference, and PowStepReference devices.
- Config docs (`configs/devices.py`): documents the `dw_min = 0` IG behavior on `PulsedDevice.dw_min`.

Test coverage
New `tests/test_infinite_granularity.py` covering:

- … `lr·dᵀx`;
- … LinearStep matches its known response formula);
- `dw_min > 0` still uses the stochastic path (no regression);
- … `dw_min` scale of the IG result.

Built with `make build_inplace_cuda` (CUDA 12.9 + MKL, Python 3.10).
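The diagnostic comparison this PR targets (a finite `dw_min` run against the `dw_min = 0` baseline) can be illustrated with a small sketch that uses only the Python standard library. It is not AIHWKit code and does not call the simulator: `ig_update`, `finite_granularity_update`, `soft_bounds_q`, and `max_abs_diff` are hypothetical helper names, the soft-bounds-style response is an assumed example of `q(w)`, and stochastic rounding to whole multiples of `dw_min` is a simplified stand-in for the real pulse-train simulation.

```python
import math
import random


def soft_bounds_q(w, step, w_min=-1.0, w_max=1.0):
    """Hypothetical response q(w): a step shrinks as w nears the bound it moves toward."""
    return 1.0 - w / w_max if step > 0 else 1.0 - w / w_min


def ig_update(w, x, d, lr, q=None, w_min=-1.0, w_max=1.0):
    """Deterministic mean-field step: w_ij <- clip(w_ij - lr * d_i * x_j * q(w_ij))."""
    out = []
    for i, row in enumerate(w):
        new_row = []
        for j, w_ij in enumerate(row):
            step = -lr * d[i] * x[j]
            if q is not None:
                step *= q(w_ij, step)  # weight-dependent scale; q=None means unit response
            new_row.append(min(w_max, max(w_min, w_ij + step)))
        out.append(new_row)
    return out


def finite_granularity_update(w, x, d, lr, dw_min, rng, w_min=-1.0, w_max=1.0):
    """Assumed stand-in for a pulsed update: the ideal step is stochastically
    rounded to a whole number of pulses of size dw_min (unit response only)."""
    out = []
    for i, row in enumerate(w):
        new_row = []
        for j, w_ij in enumerate(row):
            n = -lr * d[i] * x[j] / dw_min  # ideal, fractional pulse count
            pulses = math.floor(n)
            if rng.random() < n - pulses:   # round up with probability equal to the fraction
                pulses += 1
            new_row.append(min(w_max, max(w_min, w_ij + pulses * dw_min)))
        out.append(new_row)
    return out


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two nested lists."""
    return max(abs(u - v) for ra, rb in zip(a, b) for u, v in zip(ra, rb))


rng = random.Random(0)
w0 = [[0.0] * 3 for _ in range(2)]
x = [0.5, -0.25, 1.0]   # forward signal
d = [0.2, -0.4]         # backward signal
lr = 0.1

baseline = ig_update(w0, x, d, lr)  # zero-granularity reference
for dw_min in (1e-2, 1e-3, 1e-4):
    w_fin = finite_granularity_update(w0, x, d, lr, dw_min, rng)
    dev = max_abs_diff(w_fin, baseline)
    print(f"dw_min={dw_min:g}: max deviation from IG baseline = {dev:.2e}")
```

In this toy model each rounded step lands within one `dw_min` of the ideal step, so the deviation from the baseline is bounded by `dw_min`; the `dw_min = 0` mode corresponds to computing the baseline directly instead of approaching it with ever more pulses.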