Skip to content

feat: add memory bounds check for rvr#2803

Open
mansur20478 wants to merge 12 commits into
develop-v2.1.0-rvrfrom
feat/rvr-untrusted-bounds-checking
Open

feat: add memory bounds check for rvr#2803
mansur20478 wants to merge 12 commits into
develop-v2.1.0-rvrfrom
feat/rvr-untrusted-bounds-checking

Conversation

@mansur20478
Copy link
Copy Markdown

@mansur20478 mansur20478 commented May 21, 2026

Memory read and write functions now have an optional check_bounds invocation before accessing the memory. check_bounds checks that the access lies within the VM's addressable memory region and aborts otherwise. The same is applied for openvm_io.h functions that work with the user IO address space in data memory.

To turn off protected mode, add the openvm-cli/unprotected feature.

Mirrors the interpreter's check_bounds and panic_oob functions, and unprotected Cargo feature.

closes INT-7702

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 21, 2026

Claude finished @mansur20478's task in 1m 56s —— View job


Code Review: Memory Bounds Check for RVR (#2803)

  • Gather context and read changed files
  • Review Makefile changes
  • Review openvm_state.h changes
  • Review compile.rs changes
  • Post final review

Overall this is a clean, well-scoped PR. The bounds-checking logic matches the interpreter's MmapMemory::check_bounds semantics, the feature-flag plumbing is correct, and the always_inline + #ifndef approach means zero overhead in unprotected mode. A few issues worth addressing:

Bug: OOB access still proceeds after detection

check_mem_bounds prints to stderr but does not prevent the subsequent memory access (openvm_state.h:80-87). The PR description acknowledges this ("Final OOB-handling semantics are in progress"), but this is worth calling out as a known gap — in protected mode, a detected OOB will still read/write past the buffer, potentially causing a segfault or silent corruption. The interpreter's protected-mode path panics on OOB (memmap.rs:42-43). Consider at minimum adding a return / early-exit with a default value, or setting has_exited / exit_code on the RvState to halt execution gracefully.

Fix this →

Semantic mismatch with interpreter

The interpreter has two modes (memmap.rs:37-57):

  • Protected (#[cfg(not(feature = "unprotected"))]): panics on OOB
  • Unprotected (#[cfg(feature = "unprotected")]): debug_assert! only (no-op in release builds)

The C side has:

  • Protected (#ifndef OPENVM_UNPROTECTED): logs to stderr, continues
  • Unprotected (-DOPENVM_UNPROTECTED): no check at all

So in protected mode the C version is weaker (log vs. panic), and in unprotected mode both are no-ops in release. This is fine as a draft, but the final version should match the interpreter's panic semantics (e.g., abort() or __builtin_trap()).

Nit: RVR_CFLAGS could clobber user-supplied flags

Makefile:8-9:

RVR_CFLAGS ?=
CFLAGS += $(RVR_CFLAGS)

And compile.rs:216:

make_args.push("RVR_CFLAGS=-DOPENVM_UNPROTECTED".to_string());

Passing RVR_CFLAGS=... as a make argument overrides the ?= default, so any user-set RVR_CFLAGS from the environment is lost. This is probably fine since RVR_CFLAGS is new and only used for this define, but if you anticipate users or extensions also using RVR_CFLAGS, consider appending (RVR_CFLAGS+=-DOPENVM_UNPROTECTED) or using a dedicated variable like RVR_BOUNDS_CFLAGS.

Nit: (size_t)num_words * sizeof(uint32_t) overflow

In rd_mem_u32_range and wr_mem_u32_range (openvm_state.h:183,193), the expression (size_t)num_words * sizeof(uint32_t) is computed before check_mem_bounds. Since num_words is uint32_t and sizeof(uint32_t) is 4, the max product is ~16 GiB which fits in size_t on 64-bit, so this is safe in practice. Just noting for awareness on 32-bit targets where size_t is 32-bit — unlikely given march=native but worth a comment if this code is ever portable.

Minor: trailing whitespace

Several of the #endif lines in the read/write functions have trailing blank lines with spaces (e.g., openvm_state.h:103,111,119). These are cosmetic but show up in git diff --check.


Summary: The core implementation is correct and the feature-flag plumbing through Makefile and cfg! works well. The main action item is that check_mem_bounds should actually prevent the OOB access (abort/trap/exit) rather than just logging — even for a draft, continuing past a detected OOB risks masking bugs during development. Everything else is minor.

@github-actions

This comment has been minimized.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 21, 2026

Claude finished @mansur20478's task in 1m 41s —— View job


Code Review: Memory Bounds Check for RVR (#2803)

  • Gather context and read changed files
  • Review diff and all changed files
  • Check interpreter semantics for comparison
  • Post final review

Good iteration from the first commit. The second commit (0e17244) addresses the major concern from the prior review: check_mem_bounds now calls abort() on OOB, which matches the interpreter's panic semantics. The Makefile plumbing was also cleaned up nicely. Overall this is clean and correct.

Remaining items

1. always_inline on a function that calls fprintf + abort

check_mem_bounds is marked always_inline (openvm_state.h:81), but the error path calls fprintf and abort. Inlining this into every memory accessor bloats the hot code with cold error-handling logic, which can hurt icache performance. The happy path (the unlikely branch-not-taken) is already cheap. Consider either:

  • Splitting the cold path into a separate __attribute__((noinline, cold)) helper (similar to the interpreter's #[cold] #[inline(never)] fn panic_oob at memmap.rs:191-194), or
  • Removing always_inline from check_mem_bounds and letting the compiler decide (it will likely inline the fast path and outline the cold path with -O3).

Fix this →

2. MEMORY_MASK + 1u type width

At openvm_state.h:82:

const size_t mem_size = MEMORY_MASK + 1u;

MEMORY_MASK is a uint32_t constant. If MEMORY_MASK == 0xFFFFFFFF (full 4 GiB address space), then MEMORY_MASK + 1u overflows to 0 in 32-bit arithmetic before the widening assignment to size_t. This would make mem_size = 0, causing every access to appear out-of-bounds. If MEMORY_MASK is always < UINT32_MAX this is fine in practice, but it's worth a defensive cast:

const size_t mem_size = (size_t)MEMORY_MASK + 1u;

Fix this →

3. Minor: cfg! vs #[cfg] for dead-code elimination

At compile.rs:215:

if cfg!(feature = "unprotected") {
    make_args.push("OPENVM_UNPROTECTED=1".to_string());
}

cfg!() evaluates to a bool at compile time so the branch is still compiled (the dead arm is just constant-folded away). This is fine here since the body is trivial, but #[cfg(feature = "unprotected")] would be the more idiomatic pattern for feature-gated code in Rust. Not a blocker, just a style note.

Looks good

  • The bounds check condition (start > mem_size || size > mem_size - start) correctly mirrors the interpreter (memmap.rs:41) and handles both start past end and start+size wrapping.
  • Feature-flag plumbing through Makefile (OPENVM_UNPROTECTED ?= / ifeq) and compile.rs (cfg!(feature = "unprotected")) is clean and correct.
  • Using #ifndef OPENVM_UNPROTECTED in each accessor means zero overhead in unprotected mode.
  • The abort() call ensures OOB detection actually halts execution, matching the interpreter's panic behavior.
  • Trailing whitespace from the first commit was cleaned up in the second commit.

Summary: This is ready to merge once the always_inline / cold-path concern is addressed (item 1). Item 2 is a minor defensive fix worth including. Nice work iterating on the review feedback.


@mansur20478 mansur20478 requested review from gdmlcjs and removed request for gdmlcjs May 21, 2026 20:04
@github-actions

This comment has been minimized.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 21, 2026

Claude finished @mansur20478's task in 1m 54s —— View job


Code Review: Memory Bounds Check for RVR (#2803)

  • Gather context and read changed files
  • Review diff against base branch
  • Review openvm_state.h changes
  • Review Makefile changes
  • Review compile.rs changes
  • Post final review

Nice iteration across the three commits. The latest commit (ccd2dfa) addresses both major items from the prior review: the cold error path is now split into a separate __attribute__((noinline, cold)) abort_oob function, and the (size_t) cast on MEMORY_MASK prevents the theoretical overflow. The cfg!()#[cfg()] change in compile.rs is also cleaner. Overall this looks correct and ready.

One minor item

Unused state parameter in check_mem_bounds (openvm_state.h:89)

check_mem_bounds takes RvState* restrict state but never uses it — mem_size is derived from the compile-time constant MEMORY_MASK. This compiles fine today because of -Wno-unused-parameter in the Makefile, but the parameter is dead weight. Consider either:

  • Removing it (callers just pass state, addr, sizeaddr, size)
  • Or keeping it if you anticipate needing state later (e.g., per-instance memory sizes, or setting has_exited instead of abort()). If so, a (void)state; cast would make the intent explicit, though the warning suppression makes it cosmetic.

Not a blocker either way — just noting for awareness.

What looks good

  • Cold-path split: abort_oob is noinline, cold while check_mem_bounds stays always_inline — the compiler will keep the hot path tight and the error handler out-of-line. This mirrors the interpreter's #[cold] #[inline(never)] fn panic_oob pattern.
  • Bounds check logic: start > mem_size || size > mem_size - start correctly detects both start-past-end and start+size wrapping, matching the interpreter's semantics.
  • (size_t)MEMORY_MASK + 1u: The defensive cast prevents 32-bit overflow. With MEM_BITS=29 (MEMORY_MASK=0x1FFFFFFF) this was safe before, but the cast future-proofs it.
  • #[cfg(feature = "unprotected")] in compile.rs: Idiomatic and ensures true dead-code elimination at compile time.
  • Makefile plumbing: OPENVM_UNPROTECTED ?= / ifeq is clean — the make variable is only set when explicitly passed from the Rust side, and the #ifndef in each accessor means zero overhead in unprotected mode.
  • abort() semantics: Correctly halts execution on OOB, matching the interpreter's panic behavior. fflush(stderr) before abort() ensures the diagnostic message is actually printed.

Summary: This is clean and correct. The only nit is the unused state parameter. LGTM for merge.


@github-actions

This comment has been minimized.

@mansur20478 mansur20478 marked this pull request as ready for review May 21, 2026 21:47
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 21, 2026

Claude finished @mansur20478's task in 1m 51s —— View job


Code Review: Memory Bounds Check for RVR (#2803)

  • Gather context and read changed files
  • Review openvm_state.h changes
  • Review Makefile changes
  • Review compile.rs changes
  • Post final review

This is a clean, well-structured PR. The bounds checking correctly mirrors the interpreter's MmapMemory::check_bounds semantics (memmap.rs:39-43), the cold/hot path split follows the same #[cold] #[inline(never)] panic_oob pattern from the Rust side, and the feature-flag chain (openvm-cliopenvm-sdkopenvm-circuit → C -DOPENVM_UNPROTECTED) is properly wired end-to-end. The #ifndef guards mean zero overhead in unprotected mode. Benchmarks show no measurable regression.

One item worth addressing

abort_oob is static in a header included by multiple TUs (openvm_state.h:82)

abort_oob is static __attribute__((noinline, cold)) — the static keyword means each translation unit that includes openvm_state.h gets its own copy. Since several headers include it (openvm_tracer_pure.h, openvm_tracer_metered.h, openvm_tracer_metered_cost.h, rv_muldiv.h), the linker will emit multiple copies of abort_oob in the final .so. This is functionally correct, but a bit wasteful for a cold function.

Two options:

  1. Move the definition to a .c file and declare extern in the header — single copy, but adds a source file.
  2. Leave as-is — it's a cold path, code size impact is negligible, and the static avoids ODR issues. Perfectly fine.

Not a blocker, just noting the tradeoff.

Looks good

  • Bounds check logic (openvm_state.h:91): start > mem_size || size > mem_size - start exactly matches the interpreter's condition at memmap.rs:41 — handles both start-past-end and start+size wrapping without overflow.
  • Cold-path split: abort_oob is noinline, cold while check_mem_bounds is always_inline — the compiler keeps the fast path tight with the error handler out-of-line. Correctly mirrors #[cold] #[inline(never)] fn panic_oob in memmap.rs:191-192.
  • (size_t)MEMORY_MASK + 1u (openvm_state.h:90): The cast to size_t prevents 32-bit overflow. With MEM_BITS=29 (MEMORY_MASK=0x1FFFFFFF) this wasn't strictly necessary, but it future-proofs the code.
  • abort() semantics (openvm_state.h:86): Halts execution on OOB, matching the interpreter's panic. fflush(stderr) before abort() ensures the diagnostic is flushed.
  • Feature-flag plumbing: Makefile uses OPENVM_UNPROTECTED ?= / ifeq to conditionally add -DOPENVM_UNPROTECTED. compile.rs:215-216 uses #[cfg(feature = "unprotected")] (idiomatic Rust) to pass the make variable. The feature chain through Cargo.toml files is correctly wired.
  • Unused state parameter (openvm_state.h:89): check_mem_bounds takes state but doesn't use it since mem_size is derived from the compile-time MEMORY_MASK. This is fine given -Wno-unused-parameter in the Makefile, and keeping it is reasonable if you anticipate per-instance memory sizes later.
  • Coverage: All 10 memory accessor functions (5 reads + 3 writes + 2 range ops) have bounds checks, matching the interpreter's pattern.

Summary: LGTM. The implementation is correct, well-structured, and faithfully mirrors the interpreter. The only observation (static abort_oob in a header) is cosmetic. Nice work iterating on the prior review feedback.


@mansur20478 mansur20478 self-assigned this May 21, 2026
Comment thread crates/rvr/rvr-openvm/c/openvm_state.h Outdated
/* ── Per-width memory reads ──────────────────────────────────────── */

static __attribute__((always_inline)) inline uint8_t rd_mem_u8(RvState* restrict state, uint32_t addr) {
#ifndef OPENVM_UNPROTECTED
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better to have the ifndef inside check_mem_bounds and make it noop otherwise

but actually, let's just avoid ifndef altogether. we don't use any preprocessor directives in this code since they can make compilation slower. let's define an externally statically linked check_mem_bounds function that is noop for unprotected and does bounds checking for protected. we can have two separate header or c files

Comment thread crates/rvr/rvr-openvm/c/openvm_state.h Outdated

static __attribute__((always_inline)) inline int8_t rd_mem_i8(RvState* restrict state, uint32_t addr) {
#ifndef OPENVM_UNPROTECTED
check_mem_bounds(state, addr, sizeof(int8_t));
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

either inline checks or create separate check_mem_bounds_xx functions and just use

if (unlikely(start > mem_size - sizeof(int8_t))) abort_oob(...);

@@ -135,12 +186,20 @@ static __attribute__((always_inline)) inline void wr_mem_u32(RvState* restrict s
* so these lower to a single memcpy. */
static __attribute__((always_inline)) inline void rd_mem_u32_range(RvState* restrict state, uint32_t base_addr,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this probably is the only one that needs both checks in the if statement

Comment thread crates/rvr/rvr-openvm/c/openvm_state.h Outdated
abort();
}

static __attribute__((always_inline)) inline void check_mem_bounds(RvState* restrict state, uint32_t start, size_t size) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove state

Comment thread crates/rvr/rvr-openvm/c/openvm_state.h Outdated
const size_t mem_size = (size_t)MEMORY_MASK + 1u;
if (unlikely(start > mem_size || size > mem_size - start)) {
abort_oob(start, size, mem_size);
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a __builtin_assume((start & ~(uint32_t)MEMORY_MASK) == 0u); after if. maybe it helps

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 22, 2026

Claude finished @mansur20478's task in 3m 4s —— View job


Code Review: Memory Bounds Check for RVR (#2803)

  • Gather context and read changed files
  • Review diff against base branch
  • Review openvm_state.h changes
  • Review openvm_check_mem_bounds.h and .c files
  • Review Makefile changes
  • Review compile.rs and project.rs changes
  • Post final review

Good iteration addressing the prior review feedback. The latest commit (7a7aaac) splits the bounds check into separate .c files as the reviewer requested — no preprocessor directives in the logic, just Makefile-level file selection. The overall approach is clean: header declares the per-width functions, protected .c implements the real checks with abort(), unprotected .c provides no-op stubs.

Issues to address

1. __builtin_assume in externally-linked functions is useless

openvm_check_mem_bounds_protected.c:24,31,38,45,52 all have:

__builtin_assume((start & ~(uint32_t)MEMORY_MASK) == 0u);

These assumptions live inside a separate translation unit. Since check_mem_bounds_* are externally linked (not inlined into the callers in openvm_state.h), the compiler cannot propagate these assumptions to the call sites. The __builtin_assume only affects optimization within the .c file itself — where it does nothing useful because the function returns immediately after. With thin LTO enabled the compiler might inline these across TU boundaries, but without LTO (which is disabled in some code paths per compile.rs:171) the assumptions are dead.

Either:

  • Move the __builtin_assume to after the call site in openvm_state.h (inside each rd_mem_*/wr_mem_* function, after the check_mem_bounds_* call), or
  • Remove them — mem_ptr already has assume(addr <= MEMORY_MASK) at openvm_state.h:76.

Actually, looking again: mem_ptr already does assume(addr <= MEMORY_MASK) — so the __builtin_assume in the .c file is redundant even if it could propagate. Consider removing it to keep the code simple.

Fix this →

2. Missing trailing newlines

openvm_check_mem_bounds.h and openvm_check_mem_bounds_unprotected.c are both missing the POSIX-required trailing newline. Some compilers warn about this (-Wnewline-eof), and it's generally best practice.

Fix this →

3. Unnecessary #include <stdio.h> and #include <stdlib.h> in openvm_state.h

openvm_state.h:13-14 adds #include <stdio.h> and #include <stdlib.h>, but neither is used by anything in openvm_state.h itself — fprintf/abort live in the separate .c file now. These includes pull in heavy headers into every TU that includes openvm_state.h, increasing compile time unnecessarily.

Fix this →

4. check_mem_bounds_u32_range is missing __builtin_assume (inconsistency)

Even if you keep the __builtin_assume calls (per item 1), check_mem_bounds_u32_range at openvm_check_mem_bounds_protected.c:55-59 doesn't have one — inconsistent with the other 5 functions. This is fine to leave if you remove them all, but worth noting.

Minor observations (non-blocking)

Duplicated unlikely functionopenvm_check_mem_bounds_protected.c:6-8 defines its own unlikely() that's identical to openvm_state.h:21. Since the .c file doesn't include openvm_state.h this compiles fine, but it's a maintenance hazard. Consider extracting unlikely/likely/assume into a shared utility header (e.g., openvm_builtins.h), or just use __builtin_expect directly in the .c file since it's only used in a few places.

Semantic gap with interpreter in unprotected mode — The interpreter's unprotected check_bounds (memmap.rs:48-57) still has a debug_assert! that fires in debug builds. The C unprotected stubs are unconditional no-ops. This means debug builds of the C code won't catch OOB in unprotected mode, while the Rust interpreter would. This is fine for now but worth being aware of if you rely on unprotected-mode debug runs for testing.

What looks good

  • Clean separation: header declares, Makefile selects, each .c provides the implementation. No preprocessor directives in the hot path — addresses the reviewer's core concern.
  • Bounds check logic (protected.c:22,29,36,43,50): start > mem_size - sizeof(T) is the correct single-comparison form for fixed-width types. The range variant (protected.c:58) correctly uses the two-check form base_addr > mem_size || size > mem_size - base_addr.
  • Cold path split: abort_oob is noinline, cold, noreturn — keeps the hot path tight.
  • (size_t)MEMORY_MASK + 1u: Correct defensive cast preventing 32-bit overflow.
  • Feature-flag chain: openvm-cliopenvm-sdkopenvm-circuitOPENVM_UNPROTECTED=1 make arg → Makefile filter-out. Clean and correct.
  • project.rs emits both .c files: Both protected and unprotected .c files are always written to the output dir, and the Makefile selects which to compile. This is robust.

Summary: The refactor to separate .c files is well done and addresses the reviewer's feedback. Three actionable items: (1) remove or relocate the __builtin_assume calls that can't propagate across TU boundaries, (2) add trailing newlines to two files, (3) remove the now-unnecessary stdio.h/stdlib.h includes from openvm_state.h.


@github-actions

This comment has been minimized.

@shuklaayush
Copy link
Copy Markdown
Collaborator

can you also add a test somewhere under feature = "rvr" with should_panic

Copy link
Copy Markdown

@gdmlcjs gdmlcjs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should also add checking for the public values address space.

Comment on lines +248 to +261
let bounds_prot_path = self
.output_dir
.join("openvm_check_mem_bounds_protected.c");
fs::write(
&bounds_prot_path,
include_str!("../../c/openvm_check_mem_bounds_protected.c"),
)?;
let bounds_unprot_path = self
.output_dir
.join("openvm_check_mem_bounds_unprotected.c");
fs::write(
&bounds_unprot_path,
include_str!("../../c/openvm_check_mem_bounds_unprotected.c"),
)?;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better if we did the specialization in this file like how it's done for the tracer header. We can do that with a single #[cfg(feature = "unprotected")] branch instead of passing an argument to the Makefile.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 22, 2026

Claude finished @mansur20478's task in 3m 8s —— View job


Code Review: Memory Bounds Check for RVR (#2803)

  • Gather context and read changed files
  • Review diff against base branch
  • Review C header and source files
  • Review Rust-side changes (compile.rs, project.rs, Cargo.toml)
  • Review test additions
  • Check outstanding reviewer feedback
  • Post final review

Good iteration across the 5 commits. The latest state (9a3f170) addresses the core reviewer feedback: separate .c files for protected/unprotected, per-width check_mem_bounds_* functions, removed state parameter, abort() semantics, cold-path split, and a #[should_panic] test. The project.rs specialization via #[cfg(feature = "unprotected")] on include_str! is clean and follows gdmlcjs's suggestion to avoid Makefile-level toggling.

Issues to address

1. Missing trailing newline in Cargo.toml (crates/rvr/rvr-openvm/Cargo.toml)

The file ends with unprotected = [] with no trailing newline. This shows up in the diff as \ No newline at end of file and can cause issues with some tooling.

Fix this →

2. __builtin_assume after check_mem_bounds_* is redundant in protected mode, incorrect in unprotected mode

Every accessor in openvm_state.h has this pattern:

check_mem_bounds_u32(addr);
__builtin_assume((addr & ~(uint32_t)MEMORY_MASK) == 0u);

In protected mode, check_mem_bounds_u32 calls abort() on OOB, so addr is guaranteed in-bounds after the call — but the compiler can't see this across the TU boundary (the function isn't noreturn on success, it just returns normally). The __builtin_assume after the call is the correct way to communicate this to the compiler. However, mem_ptr at line 74 already has assume(addr <= MEMORY_MASK) which is equivalent. So the __builtin_assume is redundant — it just restates what mem_ptr already promises.

In unprotected mode, the stubs are no-ops, so addr has NOT been validated. The __builtin_assume then tells the compiler a property that may not hold, which is UB. This is intentional (you're opting into unprotected mode), and mem_ptr already does the same thing, so it doesn't make things worse — but it's worth being explicit about this in a comment.

Recommendation: remove the __builtin_assume lines from all accessors. mem_ptr already provides the same hint. Less duplication, no ambiguity about which hint is load-bearing.

Fix this →

3. check_mem_bounds_u32_range allows base_addr == mem_size when num_words == 0

if (unlikely(base_addr > mem_size || size > mem_size - base_addr))

When num_words == 0, size == 0, and the check passes for any base_addr <= mem_size — including base_addr == mem_size which is one past the buffer. This is technically fine since the subsequent memcpy with size 0 is a no-op, but it's a semantic difference from the single-width checks which use start > mem_size - sizeof(T) (always checking against the type's width). Not a bug, just noting the divergence.

4. Duplicated unlikely function (openvm_check_mem_bounds_protected.c:6-8)

unlikely() is defined identically in both openvm_state.h:19 and openvm_check_mem_bounds_protected.c:6-8. Since the .c file doesn't include openvm_state.h, this compiles fine, but it's a maintenance hazard. Consider either:

  • Using __builtin_expect(!!(x), 0) directly in the .c file (only 3 call sites), or
  • Extracting a shared openvm_builtins.h with unlikely/likely/assume

Not blocking.

Reviewer feedback status

Reviewer Feedback Status
shuklaayush Separate .c files instead of #ifndef ✅ Addressed in 7a7aaac
shuklaayush Per-width check_mem_bounds_* functions ✅ Addressed
shuklaayush Remove state parameter ✅ Addressed
shuklaayush Add __builtin_assume after check ✅ Added (see note above about redundancy)
shuklaayush should_panic test under feature = "rvr" ✅ Added in extensions/rv32im/tests/src/lib.rs:351-368
gdmlcjs Specialization in project.rs like tracer header ✅ Addressed in 9a3f170 — uses #[cfg(feature = "unprotected")] on include_str!
gdmlcjs Bounds checking for public values address space ❓ Not addressed yet — public values writes go through openvm_reveal (a host callback in openvm_io.c), not through rd_mem/wr_mem. Whether bounds checking is needed there depends on how the host callback validates its arguments. Worth clarifying with gdmlcjs.

What looks good

  • Bounds check logic: start > mem_size - sizeof(T) for fixed-width types correctly detects OOB with a single comparison. The range variant uses the two-check form base_addr > mem_size || size > mem_size - base_addr matching the interpreter at memmap.rs:41.
  • Cold-path split: abort_oob is noinline, cold, noreturn while the callers are normal functions — the compiler keeps the fast path tight.
  • (size_t)MEMORY_MASK + 1u: Defensive cast prevents 32-bit overflow.
  • Feature-flag chain: openvm-cli → openvm-sdk → openvm-circuit → rvr-openvm via unprotected = ["rvr-openvm/unprotected"]. project.rs uses #[cfg(feature = "unprotected")] to select the .c file at build time. Clean.
  • Test: out_of_bound_mem_access.rs uses black_box to prevent the compiler from optimizing away the wild pointer read, and targets 0x2000_0004 (just past the 512 MiB region). #[should_panic] correctly expects abort().
  • No Makefile changes needed: The previous RVR_CFLAGS/OPENVM_UNPROTECTED Makefile plumbing was removed in favor of the cleaner project.rs approach.

Summary: The implementation is correct and well-structured. Main items: (1) add trailing newline to Cargo.toml, (2) consider removing the redundant __builtin_assume lines from accessors. The test coverage is good. The outstanding question is gdmlcjs's comment about public values address space bounds checking.


@github-actions

This comment has been minimized.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 22, 2026

Claude finished @mansur20478's task in 2m 6s —— View job


Code Review: Memory Bounds Check for RVR (#2803)

  • Gather context and read changed files
  • Review diff against base branch
  • Review C header and source files
  • Review Rust-side changes (compile.rs, project.rs, Cargo.toml)
  • Review test additions
  • Check outstanding reviewer feedback
  • Post final review

Good iteration across the 6 commits. The architecture is clean: header declares per-width functions, project.rs selects the implementation at compile time via #[cfg(feature = "unprotected")], and the Makefile's *.c wildcard picks up whichever .c gets emitted. The bounds-check logic correctly mirrors the interpreter's semantics, and the abort() on OOB matches panic_oob in memmap.rs:193-194.

Issues to address

1. __builtin_assume in every accessor is redundant with mem_ptr

Every accessor (openvm_state.h:89-173) follows this pattern:

check_mem_bounds_u32(addr);
__builtin_assume((addr & ~(uint32_t)MEMORY_MASK) == 0u);
// ... mem_ptr(state, addr) ...

But mem_ptr at line 74 already has assume(addr <= MEMORY_MASK), which is semantically equivalent. These __builtin_assume calls are pure duplication — they don't give the compiler any new information since mem_ptr is inlined into the same function body. Removing them would clean up 10 lines of noise without any codegen change.

Additionally, in unprotected mode, the check is a no-op, so addr has NOT been validated — but the __builtin_assume unconditionally asserts it's in-bounds, which is technically UB if the assumption is false. This is harmless because mem_ptr already does the same thing, but it's worth being aware of.

Fix this →

2. Test doesn't verify the child actually failed

test_out_of_bound_mem_access (lib.rs:383-393) unconditionally panics with the child's stderr, regardless of whether the child exited with success or failure:

let output = Command::new(std::env::current_exe().unwrap())
    .args(["--exact", "tests::test_out_of_bound_mem_access", "--nocapture"])
    .env("OPENVM_OOB_CHILD", "1")
    .output()
    .expect("failed to spawn self");
panic!("{}", String::from_utf8_lossy(&output.stderr));

If the child somehow succeeds (e.g., the OOB address lands within bounds due to a config change, or the bounds check is accidentally compiled as a no-op), the test would still panic with whatever noise is on stderr and #[should_panic] would match if any unrelated stderr output contains "Memory access out of bounds". Consider adding a check:

if output.status.success() {
    panic!("child process succeeded — OOB access was not caught");
}

This makes the test assertion tighter and prevents false passes.

3. Duplicated unlikely function

openvm_check_mem_bounds_protected.c:6-8 defines its own unlikely() that's identical to the one in openvm_state.h:19. Since the .c file doesn't include openvm_state.h this compiles fine, but it's a maintenance hazard if someone changes one and not the other. Consider either:

  • Using __builtin_expect(!!(x), 0) directly in the .c file (only 6 call sites), or
  • Extracting unlikely/likely/assume into a shared utility header (e.g., openvm_builtins.h)

Not blocking, but worth a note.

4. Missing trailing newline in Cargo.toml

crates/rvr/rvr-openvm/Cargo.toml ends with unprotected = [] with no trailing newline (visible as \ No newline at end of file in the diff). Minor cosmetic issue.

Fix this →

Reviewer feedback status

Reviewer Feedback Status
shuklaayush Avoid #ifndef, use separate .c files ✅ Addressed in 7a7aaac
shuklaayush Per-width check_mem_bounds_* functions ✅ Addressed
shuklaayush Remove state parameter ✅ Addressed
shuklaayush Add __builtin_assume after check ✅ Added (see note above about redundancy with mem_ptr)
shuklaayush should_panic test under feature = "rvr" ✅ Added in 14cc871
gdmlcjs Specialization in project.rs like tracer header ✅ Addressed in 9a3f170
gdmlcjs Bounds checking for public values address space ❓ Not addressed — openvm_reveal goes through a host callback (openvm_io.c:44-45), not through rd_mem/wr_mem. Whether bounds checking belongs there depends on the host callback's validation contract. Worth clarifying.

What looks good

  • Bounds check logic: start > mem_size - sizeof(T) for fixed-width types is the correct single-comparison form. The range variant (protected.c:53) correctly uses the two-check form base_addr > mem_size || size > mem_size - base_addr, matching the interpreter at memmap.rs:41.
  • Cold-path split: abort_oob is noinline, cold, noreturn — keeps the hot path tight. Correctly mirrors #[cold] #[inline(never)] fn panic_oob in memmap.rs:191-192.
  • (size_t)MEMORY_MASK + 1u: Defensive cast prevents 32-bit overflow. With MEM_BITS=29 (MEMORY_MASK=0x1FFFFFFF) this was safe before, but the cast future-proofs it.
  • Feature-flag chain: openvm-cli → openvm-sdk → openvm-circuit → rvr-openvm via unprotected = ["rvr-openvm/unprotected"]. project.rs uses #[cfg(feature = "unprotected")] to select the .c file at build time — clean and follows the tracer header pattern.
  • No Makefile changes needed: The SRCS = $(wildcard *.c) picks up whichever .c file project.rs emits. The previous RVR_CFLAGS plumbing was correctly removed.
  • Test design: The parent/child process pattern is a pragmatic solution for testing abort() from C code. The black_box usage in the guest program correctly prevents the compiler from optimizing away the wild pointer read.

Summary: The implementation is correct and well-structured. Main items: (1) remove the redundant __builtin_assume lines from all accessors, (2) tighten the test to verify the child actually failed. Nice work iterating on the review feedback.


@github-actions

This comment has been minimized.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 22, 2026

Claude finished @mansur20478's task in 2m 31s —— View job


Code Review: Memory Bounds Check for RVR (#2803)

  • Gather context and read changed files
  • Review diff against base branch
  • Review C header and source files
  • Review Rust-side changes (compile.rs, project.rs, Cargo.toml)
  • Review test additions
  • Review public values address space changes
  • Check outstanding reviewer feedback
  • Post final review

Good iteration across 7 commits. The latest commit (33fc9c9) adds Rust-side bounds checking for the public values address space (host callbacks), addressing gdmlcjs's review comment. The overall architecture is clean: header declares per-width functions, project.rs selects protected vs. unprotected .c at compile time via #[cfg], and the Makefile's *.c wildcard picks up whichever gets emitted.

Bug: Missing guest program out_of_bound_print_str

test_out_of_bound_print_str (extensions/rv32im/tests/src/lib.rs:402-447) references a guest program "out_of_bound_print_str", but this file does not exist under extensions/rv32im/tests/programs/examples/. Only out_of_bound_mem_access.rs is present. This test will fail at the build_example_program_at_path call with a file-not-found error. You need to create the guest program (e.g., extensions/rv32im/tests/programs/examples/out_of_bound_print_str.rs) that triggers print_str with an OOB pointer.

Fix this →

__builtin_assume in every accessor is redundant with mem_ptr

Every accessor in openvm_state.h (lines 90, 97, 104, 114, 124, 136, 143, 151, 164, 173) has:

check_mem_bounds_u32(addr);
__builtin_assume((addr & ~(uint32_t)MEMORY_MASK) == 0u);

But mem_ptr at line 74 already has assume(addr <= MEMORY_MASK), which is semantically equivalent. These __builtin_assume calls are pure duplication — they don't give the compiler any new information since mem_ptr is inlined into the same function body. Removing them would clean up 10 lines of noise without any codegen change.

Additionally, in unprotected mode, the check is a no-op so addr has NOT been validated — but the __builtin_assume unconditionally asserts it's in-bounds. This is technically UB if the assumption is false. It's harmless in practice because mem_ptr already does the same thing, but it's still worth removing to avoid the duplication and ambiguity about which hint is load-bearing.

Fix this →

Duplicated unlikely function

openvm_check_mem_bounds_protected.c:6-8 defines its own unlikely() that's identical to openvm_state.h:19. Since the .c file doesn't include openvm_state.h this compiles fine, but it's a maintenance hazard if someone changes one and not the other. Consider either:

  • Using __builtin_expect(!!(x), 0) directly in the .c file (only 6 call sites), or
  • Extracting unlikely/likely/assume into a shared utility header (e.g., openvm_builtins.h)

Not blocking.

Minor: host_reveal has a potential usize overflow on 32-bit

At io.rs:156:

let start = ptr as usize + offset as usize;

On 64-bit this is safe, but on a 32-bit host both ptr and offset are u32, so ptr + offset could overflow usize. This is academic since the code targets 64-bit hosts, but a .checked_add() or explicit widening to u64 would be strictly correct.

Not blocking.

Reviewer feedback status

Reviewer Feedback Status
shuklaayush Separate .c files instead of #ifndef ✅ Addressed in 7a7aaac
shuklaayush Per-width check_mem_bounds_* functions ✅ Addressed
shuklaayush Remove state parameter ✅ Addressed
shuklaayush Add __builtin_assume after check ✅ Added (see note above about redundancy)
shuklaayush should_panic test under feature = "rvr" ✅ Added in 14cc871 + 33fc9c9
gdmlcjs Specialization in project.rs like tracer header ✅ Addressed in 9a3f170
gdmlcjs Bounds checking for public values address space ✅ Addressed in 33fc9c9host_print_str, host_hint_storew, host_hint_buffer now call check_mem_bounds_range. host_reveal already had an assert! for public_values bounds.

What looks good

  • Bounds check logic: start > mem_size - sizeof(T) for fixed-width types is the correct single-comparison form. The range variant (protected.c:53) correctly uses the two-check form base_addr > mem_size || size > mem_size - base_addr matching the interpreter at memmap.rs:41.
  • Cold-path split: abort_oob is noinline, cold, noreturn — keeps the hot path tight. Correctly mirrors #[cold] #[inline(never)] fn panic_oob in memmap.rs.
  • (size_t)MEMORY_MASK + 1u: Defensive cast prevents 32-bit overflow.
  • Feature-flag chain: openvm-cli → openvm-sdk → openvm-circuit → rvr-openvm via unprotected = ["rvr-openvm/unprotected"]. project.rs uses #[cfg(feature = "unprotected")] to select the .c file at build time — clean and follows the tracer header pattern.
  • Rust-side host callback bounds checking (io.rs:43-63): Clean dual implementation with #[cfg]/#[cfg(not)] for protected/unprotected. Correctly checks host_print_str, host_hint_storew, and host_hint_buffer.
  • Test design: The parent/child process pattern is a pragmatic solution for testing abort() from C code. The black_box usage correctly prevents the compiler from optimizing away the OOB access.
  • Benchmarks: No measurable regression across all benchmarks.

Summary: The implementation is correct and well-structured. One blocking item: the missing out_of_bound_print_str guest program means test_out_of_bound_print_str won't compile. Two non-blocking items: the redundant __builtin_assume lines can be cleaned up, and the duplicated unlikely function is a minor maintenance hazard.


@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 22, 2026

Claude finished @mansur20478's task in 2m 37s —— View job


Code Review: Memory Bounds Check for RVR (#2803)

  • Gather context and read the full diff
  • Review C header and source files (openvm_state.h, openvm_check_mem_bounds*.h/.c)
  • Review Rust-side changes (compile.rs, project.rs, io.rs, Cargo.toml)
  • Review test additions (guest programs + test harness)
  • Check outstanding reviewer feedback
  • Post final review

Clean PR with good iteration across 8 commits. The architecture is well-structured: header declares per-width check_mem_bounds_* functions, project.rs selects protected vs. unprotected .c at compile time via #[cfg], and the Makefile's *.c wildcard picks up whichever gets emitted. The bounds-check logic correctly mirrors the interpreter's MmapMemory::check_bounds (memmap.rs:39-43), and the abort() on OOB matches the interpreter's panic_oob. Benchmarks show no measurable regression.

Items to address

1. __builtin_assume in every accessor is redundant with mem_ptr

Every accessor in openvm_state.h (lines 90, 97, 104, 114, 124, 136, 143, 151, 164, 173) has:

check_mem_bounds_u32(addr);
__builtin_assume((addr & ~(uint32_t)MEMORY_MASK) == 0u);

But mem_ptr at line 74 already has assume(addr <= MEMORY_MASK), which is semantically equivalent — both tell the compiler the high bits are zero. Since mem_ptr is inlined into the same function body, these __builtin_assume calls give the compiler no new information and are pure duplication.

Additionally, in unprotected mode, the check is a no-op so addr has NOT been validated — the __builtin_assume unconditionally asserts a property that may not hold, which is technically UB. This is "harmless" only because mem_ptr already does the same thing — but the duplication makes it harder to reason about which hint is load-bearing.

Recommendation: remove all 10 __builtin_assume lines from the accessors. mem_ptr's assume is the single source of truth.

Fix this →

2. Duplicated unlikely function across TUs

openvm_check_mem_bounds_protected.c:6-8 defines its own unlikely() that's identical to openvm_state.h:19. Since the .c file doesn't include openvm_state.h this compiles fine, but it's a maintenance hazard — someone could change one and not the other. Consider either using __builtin_expect(!!(x), 0) directly in the .c file (only 6 call sites), or extracting unlikely/likely/assume into a shared utility header.

3. Missing trailing newline in Cargo.toml

crates/rvr/rvr-openvm/Cargo.toml ends with unprotected = [] with no trailing newline (visible as \ No newline at end of file in the diff).

Fix this →

Minor observations (non-blocking)

check_mem_bounds_u32_range semantics with num_words == 0: When num_words == 0, size == 0, and the check passes for any base_addr <= mem_size — including base_addr == mem_size (one past the buffer). This is fine since the subsequent memcpy with size 0 is a no-op, but it's a subtle divergence from the fixed-width checks which always validate against the type's width. Not a bug, just noting for awareness.

check_mem_bounds_* functions are not LTO-inlinable: Since these are externally linked functions in a separate TU, the compiler can't inline them at the call site without thin-LTO. With the current setup (the protected checks are small — one comparison + branch), the function call overhead per memory access is minimal, but worth noting if profiling ever shows hot-path overhead from the indirect call.

Semantic gap with interpreter in unprotected mode: The interpreter's unprotected check_bounds (memmap.rs:46-57) retains a debug_assert! that fires in debug builds. The C unprotected stubs are unconditional no-ops. This means debug builds of the C code won't catch OOB in unprotected mode, while the Rust interpreter would.

Reviewer feedback status

Reviewer Feedback Status
shuklaayush Separate .c files instead of #ifndef ✅ Addressed
shuklaayush Per-width check_mem_bounds_* functions ✅ Addressed
shuklaayush Remove state parameter ✅ Addressed
shuklaayush Add __builtin_assume after check ✅ Added (see note above about redundancy)
shuklaayush should_panic test under feature = "rvr" ✅ Added (test_out_of_bound_mem_access + test_out_of_bound_print_str)
gdmlcjs Specialization in project.rs like tracer header ✅ Addressed via #[cfg(feature = "unprotected")] on include_str!
gdmlcjs Bounds checking for public values address space ✅ Addressed — host_print_str, host_hint_storew, host_hint_buffer now call check_mem_bounds_range. host_reveal already had an assert! for public_values bounds.

What looks good

  • Bounds check logic: start > mem_size - sizeof(T) for fixed-width types is the correct single-comparison form. The range variant (protected.c:53) correctly uses the two-check form base_addr > mem_size || size > mem_size - base_addr, matching the interpreter at memmap.rs:41.
  • Cold-path split: abort_oob is noinline, cold, noreturn — keeps the hot path tight. Correctly mirrors #[cold] #[inline(never)] fn panic_oob in memmap.rs.
  • (size_t)MEMORY_MASK + 1u: Defensive cast prevents 32-bit overflow. With MEM_BITS=29 (MEMORY_MASK=0x1FFFFFFF) this was safe before, but the cast future-proofs it.
  • Feature-flag chain: openvm-cli → openvm-sdk → openvm-circuit → rvr-openvm via unprotected = [...] at each level. project.rs uses #[cfg(feature = "unprotected")] to select the .c file at build time — clean and follows the tracer header pattern.
  • Rust-side host callback bounds checking (io.rs:43-63): Clean dual implementation with #[cfg]/#[cfg(not)] for protected/unprotected. Correctly checks host_print_str, host_hint_storew, and host_hint_buffer. The condition start > MEM_SIZE || num_bytes > MEM_SIZE - start exactly matches the interpreter.
  • Test design: The parent/child process pattern is a pragmatic solution for testing abort() from C code. The black_box usage in guest programs correctly prevents the compiler from optimizing away the OOB access. Both C-side OOB (mem access) and Rust-side OOB (print_str hostcall) paths are tested.
  • No Makefile changes needed: The SRCS = $(wildcard *.c) picks up whichever .c file project.rs emits. The previous RVR_CFLAGS plumbing was correctly removed.
  • Benchmarks: No measurable regression across all benchmarks.

Summary: The implementation is correct and well-structured, faithfully mirroring the interpreter's bounds-checking semantics. The main cleanup item is the redundant __builtin_assume in every accessor (item 1). The duplicated unlikely function (item 2) and missing trailing newline (item 3) are minor. All reviewer feedback has been addressed. Nice work iterating on the review feedback.


@github-actions

This comment has been minimized.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 22, 2026

Claude finished @mansur20478's task in 2m 32s —— View job


Code Review: Memory Bounds Check for RVR (#2803)

  • Gather context and read changed files
  • Review diff against base branch
  • Review C header and source files
  • Review Rust-side changes
  • Review test additions
  • Check outstanding reviewer feedback
  • Post final review

Clean PR with good iteration across 9 commits. The architecture is well-structured: a header declares per-width check_mem_bounds_* functions, project.rs selects the protected vs. unprotected .c implementation at compile time via #[cfg(feature = "unprotected")], and the Makefile's *.c wildcard picks up whichever file gets emitted. The bounds-check logic correctly mirrors the interpreter's semantics, and abort() on OOB matches the interpreter's panic behavior. Benchmarks confirm no measurable regression.

1. __builtin_assume in every accessor is redundant with mem_ptr

Every accessor in openvm_state.h (lines 90, 97, 104, 114, 124, 136, 143, 151, 164, 173) follows:

check_mem_bounds_u32(addr);
__builtin_assume((addr & ~(uint32_t)MEMORY_MASK) == 0u);

But mem_ptr at line 74 already has assume(addr <= MEMORY_MASK), which is semantically equivalent — both tell the compiler the high bits of addr are zero. Since mem_ptr is inlined into the same function body, these __builtin_assume calls give the compiler no new information.

Additionally, in unprotected mode the check is a no-op, so addr has NOT been validated — but the __builtin_assume unconditionally asserts a property that may not hold (UB). This is "safe" only because mem_ptr already does the same thing. Removing these 10 lines eliminates duplication and makes mem_ptr's assume the single source of truth.

Fix this →

2. Duplicated unlikely function across TUs

openvm_check_mem_bounds_protected.c:6-8 defines its own unlikely():

static __attribute__((always_inline)) inline int unlikely(int x) {
    return __builtin_expect(!!(x), 0);
}

This is identical to openvm_state.h:19. Since the .c file doesn't include openvm_state.h it compiles fine, but it's a maintenance hazard — someone could change one and not the other. Consider either:

  • Using __builtin_expect(!!(x), 0) directly in the .c file (only 6 call sites), or
  • Extracting unlikely/likely/assume into a shared utility header (e.g., openvm_builtins.h)

3. check_mem_bounds_* calls are not inlinable across TU boundaries without LTO

Since check_mem_bounds_* are externally linked functions in a separate .c file, the compiler can't inline them at the call sites in openvm_state.h without thin-LTO. Looking at compile.rs:170-172:

if cfg!(target_os = "macos") {
    project.enable_lto = false;
}

On macOS, LTO is disabled, so every memory access incurs a function call to check_mem_bounds_*. The protected check body is small (one comparison + branch), so the overhead per memory access is low, but it's worth being aware of if profiling ever shows hot-path overhead on macOS.

Not a blocker — the separate-TU approach was explicitly requested by shuklaayush.

4. Minor: Rust-side check_mem_bounds_range uses different MEM_BITS source

The C side uses MEMORY_MASK derived from rvr_openvm_ext_ffi_common::MEM_BITS (line constants.rs:7,32), while the Rust-side check_mem_bounds_range in io.rs:52 uses openvm_platform::memory::MEM_BITS. Both are currently 29, but if they ever diverge, the C and Rust bounds checks would use different memory sizes. Consider using the same constant source, or adding a const_assert! that they match.

Reviewer feedback status

Reviewer Feedback Status
shuklaayush Separate .c files instead of #ifndef ✅ Addressed
shuklaayush Per-width check_mem_bounds_* functions ✅ Addressed
shuklaayush Remove state parameter ✅ Addressed
shuklaayush Add __builtin_assume after check ✅ Added (see note above about redundancy)
shuklaayush should_panic test under feature = "rvr" ✅ Added — both C-side OOB and Rust-side hostcall OOB
gdmlcjs Specialization in project.rs like tracer header ✅ Addressed via #[cfg] on include_str!
gdmlcjs Bounds checking for public values address space ✅ Addressed — host_print_str, host_hint_storew, host_hint_buffer now call check_mem_bounds_range

What looks good

  • Bounds check logic: start > mem_size - sizeof(T) for fixed-width types is the correct single-comparison form. The range variant correctly uses the two-check form base_addr > mem_size || size > mem_size - base_addr, avoiding overflow.
  • Cold-path split: abort_oob is noinline, cold, noreturn — keeps the hot path tight. Correctly mirrors the interpreter's #[cold] #[inline(never)] fn panic_oob pattern.
  • (size_t)MEMORY_MASK + 1u: Defensive cast prevents 32-bit overflow. With MEM_BITS=29 this was safe anyway, but the cast future-proofs it.
  • Feature-flag chain: openvm-circuit → rvr-openvm via unprotected = ["rvr-openvm/unprotected"]. project.rs uses #[cfg(feature = "unprotected")] to select the .c file at build time — clean and follows the tracer header pattern.
  • Rust-side host callback bounds checking (io.rs:43-63): Clean dual implementation with #[cfg]/#[cfg(not)]. Correctly checks host_print_str, host_hint_storew, and host_hint_buffer. The check condition exactly matches the interpreter's.
  • Test design: The parent/child process pattern is a pragmatic solution for testing abort() from C code. Both tests correctly verify the child failed (output.status.success() check). black_box prevents the compiler from optimizing away the OOB access. Both C-side OOB (mem access) and Rust-side OOB (print_str hostcall) are covered.
  • All files have trailing newlines — the prior review's concern about missing newlines in .h/.c/Cargo.toml has been addressed.
  • No Makefile changes needed: The SRCS = $(wildcard *.c) picks up whichever .c file project.rs emits.
  • Benchmarks: No measurable regression.

Summary: The implementation is correct and well-structured, faithfully mirroring the interpreter's bounds-checking semantics. The main cleanup item is the redundant __builtin_assume in every accessor (item 1). The duplicated unlikely (item 2) and MEM_BITS source divergence (item 4) are minor. All reviewer feedback has been addressed. Nice work.


@github-actions

This comment has been minimized.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 22, 2026

Claude finished @mansur20478's task in 3m 3s —— View job


Code Review: Memory Bounds Check for RVR (#2803)

  • Gather context and read changed files
  • Review C header and source files
  • Review Rust-side changes (compile.rs, project.rs, io.rs, Cargo.toml)
  • Review test additions
  • Check outstanding reviewer feedback
  • Post final review

Clean PR with good iteration across 10 commits. The architecture is well-structured: a header (openvm_check_mem_bounds.h) declares per-width functions, project.rs selects the protected vs. unprotected .c implementation at compile time via #[cfg(feature = "unprotected")], and the Makefile's *.c wildcard picks up whichever file gets emitted. The bounds-check logic correctly mirrors the interpreter's semantics, and abort() on OOB matches the interpreter's panic behavior. Benchmarks confirm no measurable regression. All reviewer feedback from shuklaayush and gdmlcjs has been addressed.

1. __builtin_assume in every accessor is redundant with mem_ptr

Every accessor in openvm_state.h (lines 90, 97, 104, 114, 124, 136, 143, 151, 164, 173) follows:

check_mem_bounds_u32(addr);
__builtin_assume((addr & ~(uint32_t)MEMORY_MASK) == 0u);
// ... mem_ptr(state, addr) ...

But mem_ptr at line 74 already has assume(addr <= MEMORY_MASK), which is semantically equivalent — both tell the compiler the high bits of addr are zero. Since mem_ptr is inlined into the same function body, these __builtin_assume calls give the compiler no new information.

Additionally, in unprotected mode the check is a no-op, so addr has NOT been validated — but the __builtin_assume unconditionally asserts a property that may not hold (UB). This is "safe" only because mem_ptr already does the same thing. Removing these 10 lines eliminates duplication and makes mem_ptr's assume the single source of truth.

Fix this →

2. Duplicated unlikely function across TUs

openvm_check_mem_bounds_protected.c:6-8 defines its own unlikely():

static __attribute__((always_inline)) inline int unlikely(int x) {
    return __builtin_expect(!!(x), 0);
}

This is identical to openvm_state.h:19. Since the .c file doesn't include openvm_state.h it compiles fine, but it's a maintenance hazard — someone could change one and not the other. Consider either:

  • Using __builtin_expect(!!(x), 0) directly in the .c file (only 6 call sites), or
  • Extracting unlikely/likely/assume into a shared utility header (e.g., openvm_builtins.h)

Not blocking.

3. check_mem_bounds_* calls are not inlinable across TU boundaries without LTO

Since check_mem_bounds_* are externally linked functions in a separate .c file, the compiler can't inline them at the call sites in openvm_state.h without thin-LTO. Looking at compile.rs:170-172:

if cfg!(target_os = "macos") {
    project.enable_lto = false;
}

On macOS, LTO is disabled, so every memory access incurs a function call to check_mem_bounds_*. The protected check body is small (one comparison + branch), so the overhead per memory access is low, but it's worth being aware of if profiling ever shows hot-path overhead on macOS.

Not a blocker — the separate-TU approach was explicitly requested by shuklaayush.

4. Minor: Two MEM_BITS sources could diverge

The C side uses MEMORY_MASK derived from rvr_openvm_ext_ffi_common::MEM_BITS (in constants.rs:7,32), while the Rust-side check_mem_bounds_range in io.rs:52 uses openvm_platform::memory::MEM_BITS. Both are currently 29, and there's a cross-check in abi_consts, but a const_assert! or static_assert! at the io.rs usage site would make this airtight.

Not blocking.

Reviewer feedback status

Reviewer Feedback Status
shuklaayush Separate .c files instead of #ifndef ✅ Addressed
shuklaayush Per-width check_mem_bounds_* functions ✅ Addressed
shuklaayush Remove state parameter ✅ Addressed
shuklaayush Add __builtin_assume after check ✅ Added (see note above about redundancy with mem_ptr)
shuklaayush should_panic test under feature = "rvr" ✅ Added — both C-side OOB and Rust-side hostcall OOB
gdmlcjs Specialization in project.rs like tracer header ✅ Addressed via #[cfg] on include_str!
gdmlcjs Bounds checking for public values address space ✅ Addressed — host_print_str, host_hint_storew, host_hint_buffer now call check_mem_bounds_range. host_reveal already had an assert!.

What looks good

  • Bounds check logic: start > mem_size - sizeof(T) for fixed-width types is the correct single-comparison form. The range variant (protected.c:53) correctly uses the two-check form base_addr > mem_size || size > mem_size - base_addr, matching the interpreter.
  • Cold-path split: abort_oob is noinline, cold, noreturn — keeps the hot path tight. Correctly mirrors the interpreter's #[cold] #[inline(never)] fn panic_oob pattern.
  • (size_t)MEMORY_MASK + 1u (protected.c:21): Defensive cast prevents 32-bit overflow. With MEM_BITS=29 (MEMORY_MASK=0x1FFFFFFF) this was safe anyway, but the cast future-proofs it.
  • Feature-flag chain: openvm-cli → openvm-sdk → openvm-circuit → rvr-openvm via unprotected = [...] at each level. project.rs uses #[cfg(feature = "unprotected")] to select the .c file at build time — clean and follows the tracer header pattern.
  • Rust-side host callback bounds checking (io.rs:43-63): Clean dual implementation with #[cfg]/#[cfg(not)]. Correctly checks host_print_str, host_hint_storew, and host_hint_buffer. The check condition exactly matches the interpreter's.
  • Test design: The parent/child process pattern is a pragmatic solution for testing abort() from C code. Both tests correctly verify the child failed (output.status.success() check). black_box prevents the compiler from optimizing away the OOB access. Both C-side OOB (mem access) and Rust-side OOB (print_str hostcall) paths are covered.
  • No Makefile changes needed: The SRCS = $(wildcard *.c) picks up whichever .c file project.rs emits.
  • Trailing newlines: All files now have proper trailing newlines (fixed from prior review feedback).
  • Benchmarks: No measurable regression across all benchmarks.

Summary: The implementation is correct and well-structured, faithfully mirroring the interpreter's bounds-checking semantics. The main cleanup item is the redundant __builtin_assume in every accessor (item 1). The duplicated unlikely (item 2) and the cross-TU inlining limitation (item 3) are minor observations. All reviewer feedback has been addressed. Nice work iterating on the review feedback.


@github-actions

This comment has been minimized.

Copy link
Copy Markdown

@gdmlcjs gdmlcjs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Only the lint failure seems like a problem

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 22, 2026

Claude finished @mansur20478's task in 2m 35s —— View job


Code Review: Memory Bounds Check for RVR (#2803)

  • Gather context and read full diff
  • Review C header and source files
  • Review Rust-side changes (project.rs, io.rs, Cargo.toml)
  • Review test additions
  • Post final review

Clean PR with good iteration across 12 commits. The architecture is well-structured: a header declares per-width check_mem_bounds_* functions, project.rs selects the protected vs. unprotected .c at compile time via #[cfg(feature = "unprotected")], and the Makefile's *.c wildcard picks up whichever gets emitted. Bounds-check logic correctly mirrors the interpreter's MmapMemory::check_bounds (memmap.rs:39-43), and abort() on OOB matches the interpreter's panic_oob. All reviewer feedback from shuklaayush and gdmlcjs has been addressed. Benchmarks show no measurable regression.

1. __builtin_assume in every accessor is redundant with mem_ptr

Every accessor in openvm_state.h follows this pattern:

check_mem_bounds_u32(addr);
__builtin_assume((addr & ~(uint32_t)MEMORY_MASK) == 0u);
// ... mem_ptr(state, addr) ...

But mem_ptr at line 77 already has assume(addr <= MEMORY_MASK), which is semantically equivalent — both tell the compiler the high bits of addr are zero. Since mem_ptr is inlined into the same function body, these __builtin_assume calls give the compiler no new information and are pure duplication.

Additionally, in unprotected mode the check is a no-op, so addr has NOT been validated — but the __builtin_assume unconditionally asserts a property that may not hold (UB). This is "safe" only because mem_ptr already does the same thing. Removing all 10 __builtin_assume lines from the accessors makes mem_ptr's assume the single source of truth.

Fix this →

2. Duplicated unlikely function across TUs

openvm_check_mem_bounds_protected.c:6-8 defines its own unlikely():

static __attribute__((always_inline)) inline int unlikely(int x) {
    return __builtin_expect(!!(x), 0);
}

This is identical to openvm_state.h:19. Since the .c file doesn't include openvm_state.h it compiles fine, but it's a maintenance hazard — someone could change one and not the other. Consider either using __builtin_expect(!!(x), 0) directly in the .c file (only 6 call sites), or extracting unlikely/likely/assume into a shared utility header (e.g., openvm_builtins.h).

Not blocking.

3. check_mem_bounds_* calls are not inlinable across TU boundaries without LTO

Since check_mem_bounds_* are externally linked functions in a separate .c file, the compiler can't inline them at the call sites in openvm_state.h without thin-LTO. Looking at compile.rs:170-172, LTO is disabled on macOS, so every memory access incurs a function call overhead there. The check body is small (one comparison + branch), so the per-access overhead is low, but worth being aware of if profiling ever shows hot-path overhead.

Not a blocker — the separate-TU approach was explicitly requested by shuklaayush.

4. Two MEM_BITS sources could theoretically diverge

The C side derives MEMORY_MASK from rvr_openvm_ffi_common::MEM_BITS (via constants.rs:32,42), while the Rust-side check_mem_bounds_range in io.rs:52 uses openvm_platform::memory::MEM_BITS. Both are currently 29, but if they ever diverge, the C and Rust bounds checks would use different memory sizes. A const_assert! or static_assert! at the io.rs usage site would make this airtight.

Not blocking.

What looks good

  • Bounds check logic: start > mem_size - sizeof(T) for fixed-width types is the correct single-comparison form. The range variant (protected.c:53) correctly uses the two-check form base_addr > mem_size || size > mem_size - base_addr, matching the interpreter at memmap.rs:41.
  • Cold-path split: abort_oob is noinline, cold, noreturn — keeps the hot path tight. Correctly mirrors the interpreter's #[cold] #[inline(never)] fn panic_oob pattern at memmap.rs:191-192.
  • (size_t)MEMORY_MASK + 1u (protected.c:21): Defensive cast prevents 32-bit overflow. With MEM_BITS=29 (MEMORY_MASK=0x1FFFFFFF) this was safe anyway, but the cast future-proofs it.
  • Feature-flag chain: openvm-circuit → rvr-openvm via unprotected = ["rvr-openvm/unprotected"]. project.rs uses #[cfg(feature = "unprotected")] to select the .c file at build time — clean and follows the tracer header pattern per gdmlcjs's request.
  • Rust-side host callback bounds checking (io.rs:43-63): Clean dual implementation with #[cfg]/#[cfg(not)] for protected/unprotected. Correctly checks host_print_str, host_hint_storew, and host_hint_buffer. The check condition exactly matches the interpreter's semantics. host_reveal already had an assert! for public_values bounds.
  • Test design: The parent/child process pattern (lib.rs:354-439) is a pragmatic solution for testing abort() from C code. Both tests correctly verify the child failed before panicking with stderr output. black_box in guest programs prevents the compiler from optimizing away the OOB access. Both C-side OOB (mem access) and Rust-side OOB (reveal hostcall) paths are covered.
  • No Makefile changes needed: The SRCS = $(wildcard *.c) picks up whichever .c file project.rs emits.
  • Benchmarks: No measurable regression across all benchmarks.

Summary: The implementation is correct and well-structured, faithfully mirroring the interpreter's bounds-checking semantics. The main cleanup item is the redundant __builtin_assume in every accessor (item 1). Everything else is minor observations. All reviewer feedback has been addressed. Nice work iterating on the review feedback.


@github-actions
Copy link
Copy Markdown

group app.proof_time_ms app.cycles leaf.proof_time_ms
fibonacci 3,748 12,000,265 918
keccak 18,590 18,655,329 3,275
sha2_bench 10,158 14,793,960 1,451
regex 1,390 4,137,067 353
ecrecover 600 123,583 253
pairing 894 1,745,757 262
kitchen_sink 1,897 2,579,903 409

Note: cells_used metrics omitted because CUDA tracegen does not expose unpadded trace heights.

Commit: 69da03b

Benchmark Workflow

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants