fix: use chunked reading for SHA256 to prevent OOM crash on large model files #23

Open
amathxbt wants to merge 2 commits into nesaorg:main from amathxbt:fix/memory-exhaustion-sha256-checksum

amathxbt commented May 9, 2026

Bug

check_model_files() in demo/nesa/download.py reads entire model files into memory before computing their SHA256 checksum:

# Before (broken)
with open(output_folder / sha256[i][0], "rb") as f:
    bytes = f.read()  # loads entire file into RAM
    file_hash = hashlib.sha256(bytes).hexdigest()

Impact: Model files are routinely 4–70 GB. Reading one fully into RAM before hashing can exhaust memory and crash the process on most consumer hardware, so integrity verification is unusable for the files it is meant to check.

Also note: the variable is named bytes, shadowing the Python built-in.
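
A minimal sketch of why the shadowing matters. The function name `shadowing_demo` is hypothetical and not part of download.py; it only shows that once `bytes` is rebound to file contents, the built-in is unreachable by that name for the rest of the scope:

```python
def shadowing_demo():
    # Rebinding the name hides the built-in bytes type inside this scope.
    bytes = b"model data"
    try:
        # With the built-in visible this would build a 4-byte zero-filled value;
        # here it tries to call a bytes *instance* instead.
        bytes(4)
    except TypeError as exc:
        return type(exc).__name__
    return "no error"
```

In the original code this was harmless only because nothing later in the block called `bytes(...)`.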

Fix

The single f.read() call is replaced with hashlib's incremental update API, reading the file in 8 KB chunks. The resulting hash is identical, and memory use stays constant regardless of file size:

# After (fixed)
with open(output_folder / sha256[i][0], "rb") as f:
    hasher = hashlib.sha256()
    for chunk in iter(lambda: f.read(8192), b""):
        hasher.update(chunk)
    file_hash = hasher.hexdigest()

amathxbt added 2 commits May 9, 2026 19:42
…models

check_model_files() was loading entire model files into memory with
f.read() before computing SHA256. Model files can be several gigabytes,
causing out-of-memory errors on systems with limited RAM.

Replaced with chunked 8KB iteration using hashlib's incremental update
API, which computes the same hash while using constant memory.

check_model_files() was loading entire model files into memory before
computing SHA256 checksums. Model files are often several gigabytes,
causing out-of-memory crashes on systems with limited RAM.

Replaced with chunked 8KB iteration using hashlib incremental update
API, which produces the same hash while using O(1) memory.