
(WIP) Modifications for large files and improved support across dtypes #6

Open
frheault wants to merge 4 commits into tee-ar-ex:main from frheault:fixes_for_benchmark

Conversation

frheault commented Jun 17, 2026

Contributor

Disclaimer: The modifications to the code were generated with antigravity-cli using the gemini-3.5-pro model, then double-checked by me and thoroughly tested on various datasets. This is more of a draft exploration to allow benchmarking across languages ([see here](https://github.com/tee-ar-ex/trx-manuscript-2026-benchmark)).

When I tried to benchmark the JS implementation on large files or on files written by other implementations (Rust, C++, etc.), I hit a few errors. For example, very large TRK files were hitting V8's heap constraints, so I tried reading the file in chunks to allow loading a full TRK. Separately, our TRX Rust implementation saved every internal file of the ZIP as a "large_file" ZIP64 entry, which was breaking the ZIP header metadata (leading to incorrect coordinates, offsets, headers, etc.).

So these are the observations that led to the proposed fixes (but since I do not program in JS, I believe an expert eye would help a lot). At least I can guarantee that single-language round trips (load/save) and cross-language compatibility now work. I then benchmarked on big files (not so big that everything breaks), and my modifications do not seem to be problematic in terms of resource usage.

@neurolabusc

Copilot AI left a comment


Pull request overview

This PR updates the tractography IO utilities to better handle large/ZIP64 TRX archives and to make TRK parsing more memory-efficient, alongside a small repo hygiene change.

Changes:

  • Reworked readTRK() to parse via DataView and pre-size arrays (avoids over-provisioning large typed arrays).
  • Added ZIP central directory parsing to work around ZIP64 size issues when reading TRX; improved dtype handling/alignment and returns additional metadata (groups, positions_dtype).
  • Updated .gitignore with Node-related ignores.
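
The ZIP64 workaround described in the list above can be illustrated with a minimal sketch. It assumes the reader already has a `DataView` over the archive and the offset of one central directory entry; the helper name `readCentralEntrySizes` is hypothetical and is not taken from streamlineIO.mjs. When a writer marks an entry as ZIP64, the 32-bit size fields hold `0xFFFFFFFF` and the real values live in the extra field with header ID `0x0001`.

```javascript
// Hypothetical helper (not from the PR): read the true sizes of one
// central directory entry, falling back to the ZIP64 extra field.
// `view` is a DataView over the archive, `pos` the entry's byte offset.
function readCentralEntrySizes(view, pos) {
  if (view.getUint32(pos, true) !== 0x02014b50) {
    throw new Error("Not a central directory file header");
  }
  let compressedSize = view.getUint32(pos + 20, true);
  let uncompressedSize = view.getUint32(pos + 24, true);
  const nameLen = view.getUint16(pos + 28, true);
  const extraLen = view.getUint16(pos + 30, true);
  let localHeaderOffset = view.getUint32(pos + 42, true);

  // Walk the extra field records looking for the ZIP64 one (ID 0x0001).
  let p = pos + 46 + nameLen;
  const end = p + extraLen;
  while (p + 4 <= end) {
    const id = view.getUint16(p, true);
    const len = view.getUint16(p + 2, true);
    if (id === 0x0001) {
      let q = p + 4;
      // A 64-bit value is present only for each saturated 32-bit field,
      // always in this order. Number() is exact only below 2^53.
      if (uncompressedSize === 0xffffffff) {
        uncompressedSize = Number(view.getBigUint64(q, true));
        q += 8;
      }
      if (compressedSize === 0xffffffff) {
        compressedSize = Number(view.getBigUint64(q, true));
        q += 8;
      }
      if (localHeaderOffset === 0xffffffff) {
        localHeaderOffset = Number(view.getBigUint64(q, true));
      }
      break;
    }
    p += 4 + len;
  }
  return { compressedSize, uncompressedSize, localHeaderOffset };
}
```

Reading sizes from the central directory rather than the local file headers is one way to stay correct for archives where every entry is flagged as ZIP64, even small ones.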

Reviewed changes

Copilot reviewed 1 out of 2 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| streamlineIO.mjs | Updates TRK/TRX readers for large-file handling, ZIP64 size correctness, and broader dtype support. |
| .gitignore | Adds Node-related ignore patterns. |


Comment thread streamlineIO.mjs
Comment on lines +250 to +251
let dataView = new DataView(buffer, hdr_sz);
let totalWords = dataView.byteLength / 4;
Comment thread streamlineIO.mjs
Comment on lines +256 to +261
while (w < totalWords) {
  let n_pts = dataView.getInt32(w * 4, true);
  num_streamlines++;
  num_points += n_pts;
  w += 1 + n_pts * (3 + n_scalars) + n_properties;
}
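
The counting pass quoted above trusts every `n_pts` value it reads. A more defensive variant might look like the following minimal sketch, which assumes the same record layout (one int32 point count, then `3 + n_scalars` words per point, then `n_properties` words). The function name `countTrk` and its parameters are illustrative and do not appear in the PR.

```javascript
// Hypothetical helper (not from the PR): count streamlines and points in
// the body of a TRK file so output arrays can be sized exactly.
// `hdrSize` is the header length in bytes (1000 for standard TRK files).
function countTrk(buffer, hdrSize, nScalars, nProperties) {
  const view = new DataView(buffer, hdrSize);
  // Ignore trailing bytes that do not form a whole 32-bit word.
  const totalWords = Math.floor(view.byteLength / 4);
  let numStreamlines = 0;
  let numPoints = 0;
  let w = 0;
  while (w < totalWords) {
    const nPts = view.getInt32(w * 4, true); // little-endian point count
    const recordWords = 1 + nPts * (3 + nScalars) + nProperties;
    // A negative count, or a record that runs past the end of the data,
    // suggests a truncated or corrupt file.
    if (nPts < 0 || w + recordWords > totalWords) {
      throw new RangeError(`Corrupt TRK record at word ${w}`);
    }
    numStreamlines++;
    numPoints += nPts;
    w += recordWords;
  }
  return { numStreamlines, numPoints };
}
```

With the totals known up front, a caller could allocate a `Float32Array` of `numPoints * 3` elements for the positions instead of over-provisioning and trimming later.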
Comment thread streamlineIO.mjs
Comment on lines +891 to +898
if (parts[0] === "dpg") {
  dpg.push({
    id: parts[1] + "/" + tag, // e.g. "AF_R/volume"
    fname: parts.slice(1).join("/"), // e.g. "AF_R/volume.uint32"
    vals: vals.slice(),
  });
  continue;
}
Comment thread streamlineIO.mjs
Comment on lines +786 to +793
function getAlignedArray(constructor, dataArray) {
  const bytes = constructor.BYTES_PER_ELEMENT;
  if (dataArray.byteOffset % bytes === 0) {
    return new constructor(dataArray.buffer, dataArray.byteOffset, dataArray.byteLength / bytes);
  } else {
    return new constructor(dataArray.slice().buffer);
  }
}
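
To show why the alignment check in `getAlignedArray` matters, here is a small self-contained sketch. The helper name `toTypedArray` is hypothetical; it follows the same idea as the quoted function but rounds the element count down and copies through an explicit `Uint8Array`. The demonstration assumes a little-endian host, since typed arrays use the platform byte order.

```javascript
// Hypothetical variant (not from the PR): view `bytes` as a typed array,
// copying only when the byte offset is not a multiple of the element size.
function toTypedArray(Ctor, bytes) {
  const size = Ctor.BYTES_PER_ELEMENT;
  const length = Math.floor(bytes.byteLength / size);
  if (bytes.byteOffset % size === 0) {
    // Aligned: create a zero-copy view on the existing buffer.
    return new Ctor(bytes.buffer, bytes.byteOffset, length);
  }
  // Misaligned: copy into a fresh buffer, which always starts at offset 0.
  const copy = new Uint8Array(length * size);
  copy.set(bytes.subarray(0, length * size));
  return new Ctor(copy.buffer);
}

// Demonstration: a float32 stored at byte offset 1 of a 9-byte buffer.
const raw = new Uint8Array(9);
new DataView(raw.buffer).setFloat32(1, 1.5, true);
const misaligned = raw.subarray(1, 9); // byteOffset 1, 8 bytes long
// new Float32Array(raw.buffer, 1, 2) would throw a RangeError here.
const floats = toTypedArray(Float32Array, misaligned);
console.log(floats.length, floats[0]);
```

One design note: in Node.js, `Buffer.prototype.slice` returns a view on the same memory rather than a copy, so this sketch avoids `slice()` in case the input is a `Buffer`.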
Comment thread streamlineIO.mjs
Comment on lines +800 to +810
const fd = fs.openSync(url, 'r');
const chunkSize = 512 * 1024 * 1024;
let offset = 0;
while (offset < size) {
  const bytesToRead = Math.min(chunkSize, size - offset);
  const bytesRead = fs.readSync(fd, uint8Array, offset, bytesToRead, offset);
  if (bytesRead === 0) break;
  offset += bytesRead;
}
fs.closeSync(fd);
data = Buffer.from(arrayBuffer);
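
The chunked read quoted above could be wrapped so that the file descriptor is always closed and a short read is reported instead of silently leaving zeros in the buffer. This is only a sketch under stated assumptions: the function name `readInChunks` is hypothetical, and the `fs` module is passed in as a parameter so the example does not depend on the module system. It uses only `openSync`, `readSync`, and `closeSync`, with the same argument order as in the snippet.

```javascript
// Hypothetical helper (not from the PR): read `size` bytes of `path` into
// one Uint8Array, at most `chunkSize` bytes per readSync call.
// `fs` is Node's fs module (or any object with the same three methods).
function readInChunks(fs, path, size, chunkSize = 512 * 1024 * 1024) {
  const out = new Uint8Array(size);
  const fd = fs.openSync(path, "r");
  try {
    let offset = 0;
    while (offset < size) {
      const want = Math.min(chunkSize, size - offset);
      // Arguments: fd, target buffer, offset in buffer, length, file position.
      const got = fs.readSync(fd, out, offset, want, offset);
      if (got === 0) {
        // The file ended before `size` bytes were read.
        throw new Error(`Unexpected end of file at byte ${offset} of ${size}`);
      }
      offset += got;
    }
  } finally {
    // Runs on success and on error, so the descriptor is not leaked.
    fs.closeSync(fd);
  }
  return out;
}
```

A caller in Node.js might obtain `size` from `fs.statSync(path).size` and then pass the result to the existing parser.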