
[experiment] JIT: Introduce KnownBits #129082

Draft
EgorBo wants to merge 6 commits into dotnet:main from EgorBo:knownbits

Conversation

Member

@EgorBo EgorBo commented Jun 6, 2026

Closes #105333
Closes #111567

Most optimizing compilers (e.g. LLVM, MSVC) use KnownBits to track ranges instead of tracking Min-Max values.
We have Range, which is 32-bit only. Making it 64-bit would likely take a lot of effort: fixing casts between 32-bit and 64-bit and fighting all kinds of overflow and correctness issues. Even then, it is only good for representing contiguous ranges.

KnownBits looks like this:

struct KnownBits
{
    uint64_t knownZero; // bits known to be 0
    uint64_t knownOne;  // bits known to be 1
};
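To make the representation concrete, here is a minimal sketch of how such a pair of masks can be combined for AND/OR. It is not the code from this PR: every name prefixed with `Sketch` is hypothetical, and the transfer rules are just the textbook ones for a known-zero/known-one lattice.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the PR's KnownBits; illustration only.
struct SketchKnownBits
{
    uint64_t knownZero; // bit i set => bit i of the value is definitely 0
    uint64_t knownOne;  // bit i set => bit i of the value is definitely 1

    // A bit cannot be both known-zero and known-one.
    bool IsConsistent() const { return (knownZero & knownOne) == 0; }

    // Every bit is known, so the value is exactly knownOne.
    bool IsConstant() const { return (knownZero | knownOne) == UINT64_MAX; }

    static SketchKnownBits FromConstant(uint64_t c) { return {~c, c}; }
    static SketchKnownBits Unknown() { return {0, 0}; }
};

// x & y: a result bit is 0 if either input bit is 0, and 1 only if both are 1.
inline SketchKnownBits SketchAnd(SketchKnownBits a, SketchKnownBits b)
{
    assert(a.IsConsistent() && b.IsConsistent());
    return {a.knownZero | b.knownZero, a.knownOne & b.knownOne};
}

// x | y: a result bit is 1 if either input bit is 1, and 0 only if both are 0.
inline SketchKnownBits SketchOr(SketchKnownBits a, SketchKnownBits b)
{
    assert(a.IsConsistent() && b.IsConsistent());
    return {a.knownZero & b.knownZero, a.knownOne | b.knownOne};
}
```

For example, `SketchAnd(SketchKnownBits::Unknown(), SketchKnownBits::FromConstant(0xFF))` knows nothing about the low 8 bits but knows bits 8..63 are zero, which is a fact a Min-Max range can also express. A value such as `x | 1` (bit 0 known one, everything else unknown) is the kind of non-contiguous fact a range cannot hold.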

My estimate is that it can bring 500k-600k diffs for win-x64 if extended (I've seen 450k with +400 LOC); the current version should be -320kb on win-x64, with somewhat nice improvements in non-test collections.

I'm not sure it can fully replace Range just like Range can't replace KnownBits, but some usages definitely can be re-routed to KnownBits.

This PR is basically an attempt to mimic LLVM's implementation.

Not done:

  • Enable assertions for 64-bit VNs, today we don't spawn them
  • Migrate more opts from MergeEdgeAssertions to MergeKnownBitsAssertions
  • Support more operations, benefit from Compute range for various GT_AND, GT_OR in assertion prop, etc.

Diffs: -662,244 bytes on linux-arm64

Copilot AI review requested due to automatic review settings June 6, 2026 21:42
@github-actions github-actions Bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jun 6, 2026
@dotnet-policy-service
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Contributor

Copilot AI left a comment


Pull request overview

This PR adds a new JIT-internal KnownBits analysis (LLVM-style “known zero/known one” bit lattice) and wires it into assertion propagation so the JIT can fold more comparisons (including TYP_LONG) and prove more casts redundant/overflow-safe using bit-level facts.

Changes:

  • Introduces KnownBits / KnownBitsOps (bit lattice + transfer functions) and a KnownBits::Compute VN/assertion-driven analysis.
  • Uses KnownBits in global assertion propagation to fold relops and to remove/relax casts when provably safe.
  • Updates JIT build wiring to compile the new implementation.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/coreclr/jit/knownbits.h | New KnownBits lattice + transfer helpers (And/Or/UDiv/Cast/EvalRelop) and analysis entrypoint. |
| src/coreclr/jit/knownbits.cpp | Implements VN/assertion-based KnownBits computation, including PHI merging and assertion refinement. |
| src/coreclr/jit/CMakeLists.txt | Adds KnownBits sources/headers to the JIT build. |
| src/coreclr/jit/assertionprop.cpp | Hooks KnownBits into global relop folding and cast simplification paths; widens some assertion creation gates to include TYP_LONG. |

Comment thread src/coreclr/jit/knownbits.h
EgorBo and others added 3 commits June 7, 2026 03:17
Build on the initial KnownBits analysis by porting more LLVM-style transfer
functions and wiring KnownBits into more assertion-prop consumers. Each addition
was measured per-feature via SPMI asmdiffs on libraries.pmi and benchmarks.run,
and additions that did not clear an 80-byte bar (or regressed) were dropped.

Transfer functions (knownbits.h): add Mul (leading-zeros + low-bits), constant
LSH/RSZ/RSH shifts, and URem, ported from llvm/lib/Support/KnownBits.cpp.
(XOR/NOT/ADD/SUB/NEG were prototyped but trimmed: each <80 bytes or net harmful.)

ComputeWorker (knownbits.cpp): handle VNF_MUL/UMOD/LSH/RSH/RSZ.

MergeKnownBitsAssertions: read "num u< otherVN" / "num u<= otherVN" (num inherits
otherVN's leading-zero bits) and pin the last unknown bit from "num != const".

Consumers (assertionprop.cpp):
 * optAssertionProp_RangeProperties now derives non-negative/non-zero from known
   bits, covering TYP_LONG and bit patterns the interval range cannot express.
 * optAssertionProp_AddMulSub clears the overflow flag for TYP_LONG ADD/SUB/MUL
   when known bits prove the operation cannot overflow.
 * 64-bit (TYP_LONG) relop assertions are generated and consumed end-to-end.
(Bounds-check elimination and generic constant-folding were prototyped but
trimmed: <80 bytes and/or net regressions.)

Add JitEnableKnownBits (default 1) to disable the analysis and all its consumers.

Diffs (win-x64 libraries.pmi): -16,220 bytes, 572 contexts (488 improvements,
3 regressions); linux-arm64: -16,184 bytes. SuperPMI replay is clean on win-x64,
linux-x64 and linux-arm64.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
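The two assertion refinements described in the commit message above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's MergeKnownBitsAssertions: the `Sketch*` names are hypothetical, values are treated as zero-extended 64-bit, and contradictory inputs are not handled.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the PR's KnownBits; illustration only.
struct SketchKnownBits
{
    uint64_t knownZero;
    uint64_t knownOne;
};

// Number of leading bits that are known to be zero (0..64).
inline unsigned SketchCountMinLeadingZeros(SketchKnownBits kb)
{
    unsigned n = 0;
    for (uint64_t bit = 1ull << 63; bit != 0 && (kb.knownZero & bit) != 0; bit >>= 1)
    {
        n++;
    }
    return n;
}

// "num u< other" (or "num u<= other"): num cannot exceed other's maximum,
// so num has at least as many leading zero bits as other.
inline SketchKnownBits SketchRefineULessThan(SketchKnownBits num, SketchKnownBits other)
{
    assert((num.knownZero & num.knownOne) == 0);
    const unsigned lz       = SketchCountMinLeadingZeros(other);
    const uint64_t highMask = (lz == 64) ? UINT64_MAX : ~(UINT64_MAX >> lz);
    num.knownZero |= highMask;
    return num;
}

// "num != c": if exactly one bit of num is unknown and all known bits agree
// with c, the unknown bit must be the opposite of c's bit.
inline SketchKnownBits SketchRefineNotEqual(SketchKnownBits num, uint64_t c)
{
    const uint64_t unknown        = ~(num.knownZero | num.knownOne);
    const bool     oneUnknownBit  = (unknown != 0) && ((unknown & (unknown - 1)) == 0);
    const bool     knownBitsMatch = (c & ~unknown) == num.knownOne;
    if (oneUnknownBit && knownBitsMatch)
    {
        if ((c & unknown) != 0)
        {
            num.knownZero |= unknown; // c has 1 here, so num must have 0
        }
        else
        {
            num.knownOne |= unknown; // c has 0 here, so num must have 1
        }
    }
    return num;
}
```

With these rules, a value known to be 0 or 1 plus the assertion `num != 0` becomes the constant 1, and an index asserted to be `u<` a value below 256 inherits 56 known-zero upper bits.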
Add per-function citations to the corresponding routines in
llvm/lib/Support/KnownBits.cpp (udiv, mul, urem/remGetLowBits, shl/lshr/ashr,
eq/ne/ult/...), note the intentional simplifications (constant-shift only;
udiv/mul refinements dropped), and document that Intersect/Union are LLVM's
unionWith/intersectWith with the names inverted (they describe the value-set
operation). Also note the lattice is a fixed-width adaptation of LLVM's
APInt-based KnownBits. Comments only; no behavior change.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Two new assertion-prop consumers that use KnownBits, plus the XOR transfer
function that feeds them:

 * optAssertionProp_KnownBitsSimplify: remove a redundant constant mask on
   AND/OR -- "x & C" => "x" when every possibly-set bit of x is set in C,
   "x | C" => "x" when every set bit of C is already known-one in x, and
   "x & C" => 0 when x has no possibly-set bit in C.
 * optAssertionProp_BndsChk: drop a bounds check when known bits prove
   (uint)index < (uint)length (e.g. masked indices).
 * ComputeWorker now handles VNF_XOR.

Mask simplification is the main win and scales with code volume (it fires
~1,550x on libraries_tests.run). Diffs vs the prior commit: libraries_tests.run
-10,583 bytes, libraries.pmi -777 bytes; net improvement on every collection
measured. SuperPMI replay clean on win-x64.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
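As a rough illustration of the mask-simplification and bounds-check rules listed in the commit message above, the checks reduce to a few mask operations. The names below are hypothetical and this is only a sketch, assuming values are zero-extended to 64 bits; it is not the code of optAssertionProp_KnownBitsSimplify or optAssertionProp_BndsChk.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the PR's KnownBits; illustration only.
struct SketchKnownBits
{
    uint64_t knownZero;
    uint64_t knownOne;
};

enum class SketchMaskFold
{
    None,      // keep the operation
    ToOperand, // "x op C" => "x"
    ToZero     // "x & C"  => 0
};

inline SketchMaskFold SketchFoldAnd(SketchKnownBits x, uint64_t c)
{
    assert((x.knownZero & x.knownOne) == 0);
    const uint64_t maybeSet = ~x.knownZero; // bits of x that could be 1
    if ((maybeSet & c) == 0)
    {
        return SketchMaskFold::ToZero; // no possibly-set bit survives the mask
    }
    if ((maybeSet & ~c) == 0)
    {
        return SketchMaskFold::ToOperand; // the mask clears nothing that could be set
    }
    return SketchMaskFold::None;
}

inline SketchMaskFold SketchFoldOr(SketchKnownBits x, uint64_t c)
{
    // Every set bit of C is already known-one in x, so the OR adds nothing.
    return ((c & ~x.knownOne) == 0) ? SketchMaskFold::ToOperand : SketchMaskFold::None;
}

// (uint)index < (uint)length is proven when the largest possible index
// is below the smallest possible length.
inline bool SketchIndexInBounds(SketchKnownBits index, SketchKnownBits length)
{
    const uint64_t indexUMax  = ~index.knownZero;
    const uint64_t lengthUMin = length.knownOne;
    return indexUMax < lengthUMin;
}
```

For a masked index such as `i & 7`, only the low three bits can be set, so a later `& 0xFF` folds away and an access into an array whose length is known to be at least 8 passes `SketchIndexInBounds`.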
Copilot AI review requested due to automatic review settings June 7, 2026 02:20
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Comment on lines +79 to +83
// Is bit "pos" known to be 0?
bool IsBitZero(unsigned pos) const
{
    return (knownZero & (1ull << pos)) != 0;
}
EgorBo and others added 2 commits June 7, 2026 05:20
MergeKnownBitsAssertions handled unsigned OAK_LT_UN/OAK_LE_UN against a constant
but ignored the signed OAK_LT/OAK_LE forms, so a pattern like

    if (a > 10 && a < 1000)  // a is long
        ... checked((int)a) ...

did not learn that 'a' fits in an int: the 'a > 10' assertion proved a >= 0 (sign
bit 0) but the 'a < 1000' assertion was dropped, leaving the upper bits unknown and
the overflow check in place.

Now a signed 'num < C' / 'num <= C' assertion with a non-negative bound records a
candidate upper bound, which is applied after the assertion loop once num is also
known non-negative: num is then in [0, C-1], so its upper bits are known zero. This
lets optAssertionProp_Cast drop the checked cast's overflow check in the example.

Diffs on libraries.pmi win-x64: -80 bytes, 7 improvements, 0 regressions (plus
PerfScore wins from removed overflow branches). SuperPMI replay clean on libraries.pmi.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
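The upper-bound step from the commit message above can be sketched like this: once the sign bit is known zero and `num < C` holds for a positive C, every bit above the highest set bit of C-1 is known zero. The helper names are hypothetical, and the int32 check is only a sufficient condition for the non-negative case, not the logic of optAssertionProp_Cast.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the PR's KnownBits; illustration only.
struct SketchKnownBits
{
    uint64_t knownZero;
    uint64_t knownOne;
};

// Applies a signed "num < exclusiveBound" fact, but only when num is already
// known non-negative (sign bit known zero) and the bound is positive.
inline SketchKnownBits SketchApplySignedUpperBound(SketchKnownBits num, int64_t exclusiveBound)
{
    assert((num.knownZero & num.knownOne) == 0);
    const bool nonNegative = (num.knownZero >> 63) != 0;
    if (!nonNegative || (exclusiveBound <= 0))
    {
        return num; // nothing can be learned in this sketch
    }

    // num is in [0, C-1]; smear the highest set bit of C-1 downward to get a
    // mask covering every value in that interval.
    uint64_t mask = (uint64_t)exclusiveBound - 1;
    mask |= mask >> 1;
    mask |= mask >> 2;
    mask |= mask >> 4;
    mask |= mask >> 8;
    mask |= mask >> 16;
    mask |= mask >> 32;

    num.knownZero |= ~mask;
    return num;
}

// Sufficient (not necessary) test: a value whose bits 31..63 are all known
// zero is a non-negative number that fits in int32.
inline bool SketchFitsInInt32(SketchKnownBits kb)
{
    const uint64_t highBits = 0xFFFFFFFF80000000ull;
    return (kb.knownZero & highBits) == highBits;
}
```

In the `a > 10 && a < 1000` example, the first assertion supplies the known-zero sign bit and the second leaves only the low 10 bits unknown, so the checked `(int)a` cast is provably safe.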
In ComputeWorker the known bits of a comparison VN were always reported as the
[0, 1] bound. When the operands' known bits already settle the comparison, use
KnownBitsOps::EvalRelop to report the exact constant instead (0 = false,
1 = true), falling back to [0, 1] when undetermined.

This lets a comparison whose result is statically known propagate as a constant
into the operations and comparisons that consume it (e.g. a nested relop operand),
rather than only being known to be 0 or 1.

Diffs on libraries.pmi win-x64: -252 bytes, 9 improvements, 0 regressions.
SuperPMI replay clean on libraries.pmi.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
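A comparison can be settled from known bits roughly as sketched below. This is not KnownBitsOps::EvalRelop itself: the names are hypothetical, only unsigned less-than and equality are shown, and the unsigned minimum/maximum are assumed to be `knownOne` and `~knownZero` (all unknown bits set to 0 or 1 respectively).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the PR's KnownBits; illustration only.
struct SketchKnownBits
{
    uint64_t knownZero;
    uint64_t knownOne;
};

enum class SketchTri
{
    No,
    Yes,
    Unknown
};

inline uint64_t SketchUMin(SketchKnownBits kb) { return kb.knownOne; }
inline uint64_t SketchUMax(SketchKnownBits kb) { return ~kb.knownZero; }

// a u< b
inline SketchTri SketchEvalULT(SketchKnownBits a, SketchKnownBits b)
{
    assert((a.knownZero & a.knownOne) == 0 && (b.knownZero & b.knownOne) == 0);
    if (SketchUMax(a) < SketchUMin(b))
    {
        return SketchTri::Yes;
    }
    if (SketchUMin(a) >= SketchUMax(b))
    {
        return SketchTri::No;
    }
    return SketchTri::Unknown;
}

// a == b
inline SketchTri SketchEvalEQ(SketchKnownBits a, SketchKnownBits b)
{
    // Some bit is known 1 on one side and known 0 on the other.
    if (((a.knownOne & b.knownZero) | (a.knownZero & b.knownOne)) != 0)
    {
        return SketchTri::No;
    }
    const bool aConst = (a.knownZero | a.knownOne) == UINT64_MAX;
    const bool bConst = (b.knownZero | b.knownOne) == UINT64_MAX;
    return (aConst && bConst) ? SketchTri::Yes : SketchTri::Unknown;
}

// Known bits of the relop's own value: the constant 1, the constant 0,
// or the [0, 1] fallback where only bit 0 is unknown.
inline SketchKnownBits SketchRelopResult(SketchTri t)
{
    switch (t)
    {
        case SketchTri::Yes:
            return {~1ull, 1};
        case SketchTri::No:
            return {UINT64_MAX, 0};
        default:
            return {~1ull, 0};
    }
}
```

Reporting the exact constant matters for nested patterns: a consumer that sees the `{UINT64_MAX, 0}` result can itself be evaluated, whereas the `[0, 1]` fallback leaves it undetermined.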
Copilot AI review requested due to automatic review settings June 7, 2026 04:43
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

Comment on lines +79 to +83
// Is bit "pos" known to be 0?
bool IsBitZero(unsigned pos) const
{
    return (knownZero & (1ull << pos)) != 0;
}
Comment on lines +4124 to +4140
if (isUnsigned)
{
    const T aMax = (T)a.GetUMax(width);
    const T bMax = (T)b.GetUMax(width);
    switch (oper)
    {
        case GT_ADD:
            return !CheckedOps::AddOverflows<T>(aMax, bMax, CheckedOps::Unsigned);
        case GT_SUB:
            // Unsigned a - b underflows iff a < b; safe iff umin(a) >= umax(b).
            return a.GetUMin(width) >= b.GetUMax(width);
        case GT_MUL:
            return !CheckedOps::MulOverflows<T>(aMax, bMax, CheckedOps::Unsigned);
        default:
            return false;
    }
}
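For readers following the unsigned branch quoted above, here is a standalone sketch of the same reasoning without the JIT's CheckedOps helpers. The names are hypothetical, and it assumes GetUMin/GetUMax behave like "all unknown bits are 0" and "all unknown bits are 1" within the given width, which the snippet itself does not show.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the PR's KnownBits; illustration only.
struct SketchKnownBits
{
    uint64_t knownZero;
    uint64_t knownOne;
};

inline uint64_t SketchWidthMask(unsigned width)
{
    assert(width == 32 || width == 64);
    return (width == 64) ? UINT64_MAX : ((1ull << width) - 1);
}

// Smallest possible unsigned value: every unknown bit is 0.
inline uint64_t SketchUMin(SketchKnownBits kb, unsigned width)
{
    return kb.knownOne & SketchWidthMask(width);
}

// Largest possible unsigned value: every unknown bit is 1.
inline uint64_t SketchUMax(SketchKnownBits kb, unsigned width)
{
    return ~kb.knownZero & SketchWidthMask(width);
}

// Unsigned a + b cannot overflow if even the two maxima fit in the width.
inline bool SketchUnsignedAddCannotOverflow(SketchKnownBits a, SketchKnownBits b, unsigned width)
{
    const uint64_t limit = SketchWidthMask(width);
    return SketchUMax(a, width) <= limit - SketchUMax(b, width);
}

// Unsigned a - b underflows iff a < b, so it is safe when umin(a) >= umax(b).
inline bool SketchUnsignedSubCannotOverflow(SketchKnownBits a, SketchKnownBits b, unsigned width)
{
    return SketchUMin(a, width) >= SketchUMax(b, width);
}
```

The addition check is written as `aMax <= limit - bMax` so that the comparison itself cannot wrap around in 64-bit arithmetic.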
Comment on lines +4218 to +4228
// Known-bits based no-overflow proof. This also covers TYP_LONG operations (which the range-based
// path above does not handle) and bit patterns an interval range cannot express.
if (!optLocalAssertionProp && tree->gtOverflow() && varTypeIsIntegral(tree))
{
    const unsigned  width = (genActualType(tree) == TYP_LONG) ? 64 : 32;
    const KnownBits kb1   = KnownBits::Compute(this, optConservativeNormalVN(tree->gtGetOp1()), assertions);
    const KnownBits kb2   = KnownBits::Compute(this, optConservativeNormalVN(tree->gtGetOp2()), assertions);
    if (knownBitsOperCannotOverflow(tree->OperGet(), tree->IsUnsigned(), kb1, kb2, width))
    {
        tree->ClearOverflow();
        return optAssertionProp_Update(tree, tree, stmt);

Labels

area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Unnecessary overflow check with checked on 64-bit ryujit
Unnecessary overflow check for checked BigMul on x64

2 participants