perf: stream SFTP uploads/downloads instead of buffering whole file #195

Open
Yaminyam wants to merge 1 commit into lablup:main from Yaminyam:feat/streaming-file-transfer

Conversation

Member

@Yaminyam Yaminyam commented May 6, 2026

Summary

Upload (upload_file, upload_dir_recursive) used tokio::fs::read to load the entire local file into a Vec<u8> before calling write_all, and download (download_file, download_dir_recursive) used read_to_end into a pooled buffer + clone() to a separate Vec before writing locally. For multi-GB transfers this means peak RSS scales with file size and large files OOM the client.
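For reference, a minimal sketch of the buffered pattern being replaced (identifiers, signature, and error handling are simplified assumptions, not the exact code in the repo):

```rust
use tokio::io::{AsyncWrite, AsyncWriteExt};

// Simplified sketch of the old upload path: the whole local file is
// materialized in memory before a single write_all to the remote handle.
async fn upload_buffered<W>(local_path: &std::path::Path, remote: &mut W) -> std::io::Result<()>
where
    W: AsyncWrite + Unpin,
{
    let data = tokio::fs::read(local_path).await?; // Vec<u8> the size of the file
    remote.write_all(&data).await?;                // one multi-MB write to the SFTP handle
    Ok(())
}
```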

Replace each path with a small stream_copy() helper that loops on 256 KiB reads and writes through the existing AsyncRead/AsyncWrite implementations on tokio::fs::File and russh_sftp::client::fs::File. Buffer size matches the SFTP MAX_WRITE_LENGTH so each chunk maps to a single SFTP packet without further fragmentation.
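A minimal sketch of what such a helper can look like, written against tokio's `AsyncRead`/`AsyncWrite` traits (the name, signature, and error handling are assumptions; `CHUNK_SIZE` here stands in for the SFTP `MAX_WRITE_LENGTH` constant):

```rust
use tokio::io::{AsyncRead, AsyncReadExt, AsyncWrite, AsyncWriteExt};

/// Assumed chunk size; in the actual change this matches the SFTP MAX_WRITE_LENGTH.
const CHUNK_SIZE: usize = 256 * 1024;

/// Copy `src` to `dst` in fixed-size chunks so peak memory stays at one buffer
/// regardless of file size. Returns the number of bytes copied.
async fn stream_copy<R, W>(src: &mut R, dst: &mut W) -> std::io::Result<u64>
where
    R: AsyncRead + Unpin,
    W: AsyncWrite + Unpin,
{
    let mut buf = vec![0u8; CHUNK_SIZE];
    let mut total = 0u64;
    loop {
        let n = src.read(&mut buf).await?;
        if n == 0 {
            break; // EOF
        }
        dst.write_all(&buf[..n]).await?;
        total += n as u64;
    }
    dst.flush().await?;
    Ok(total)
}
```

Under this sketch, the upload path passes a `tokio::fs::File` as `src` and the remote SFTP file as `dst`, and the download path does the reverse.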

Measured impact

Verified locally on macOS arm64 against bssh-server v2.1.3 over loopback with a 1 GiB file:

| Op       | Build     | real   | Peak RSS |
|----------|-----------|--------|----------|
| upload   | unpatched | 38.65s | 3.23 GB  |
| upload   | streaming | 3.47s  | 20 MB    |
| download | unpatched | 3.93s  | 2.17 GB  |
| download | streaming | 3.41s  | 16 MB    |

Peak RSS drops ~160x and uploads complete ~11x faster (a single multi-MB write_all apparently serializes much worse through the SFTP pipeline than 256 KiB chunked writes).

Test plan

  • `cargo check` clean
  • `cargo build --release` clean
  • 1 GiB upload via the patched client to `bssh-server` v2.1.3 over loopback; file integrity verified (md5 matches source)
  • 1 GiB download verified the same way
  • CI / cargo test on PR
