Add LMCache-compatible offload connector by yhl-amd · Pull Request #1 · yhl-amd/ATOM

yhl-amd · 2026-06-03T08:11:43Z

Summary

Replace the current LMCache offload worker save/load path with LMCache CacheEngine.store() / CacheEngine.retrieve() while keeping ATOM's AITER KV layout as opaque raw bytes.

Key pieces:

add ATOMRawBytesLMCacheMetadata and ATOMLMCacheGPUConnector;
pass the ATOM GPU connector into LMCacheEngineBuilder.get_or_create();
add device staging APIs in ATOMKVByteCodec;
keep scheduler-side load/save/deferred-free semantics;
fix save-frontier accounting so LMCache/HBM-hit prefixes are not re-saved.

Why

The first LMCache-compatible version had a regression where a prefix already hit in LMCache/HBM could still be saved again with skip=0. A bad 128K case looked like:

[OFFLOAD-LOAD-SKIP] seq=5 hbm_cached=129856 lmc_cached=129792 need=-64 reason=hbm_satisfies_after_alloc
[OFFLOAD-SAVE-PROF] req=5 toks=129792 skip=0 store_ms=1442/1463ms

This PR tracks the save frontier established by an LMCache hit and rolls it back only on load failure, so warm requests save only newly computed suffix chunks.
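
The frontier rule can be sketched roughly as below. This is an illustration only, not the code in this PR: the class SaveFrontier, its method names, and the 256-token chunk size are hypothetical. Only the behaviour (advance the frontier on a cache hit, roll it back on load failure, store just the suffix) is taken from the description above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class SaveFrontier:
    """Hypothetical per-request tracker of how many prefix tokens are already stored."""

    saved_tokens: int = 0  # tokens assumed to be present in the cache already
    rollback_to: int = 0   # frontier value to restore if the load fails

    def on_cache_hit(self, hit_tokens: int) -> None:
        # A prefix hit means those tokens are already stored; never move backwards.
        self.rollback_to = self.saved_tokens
        self.saved_tokens = max(self.saved_tokens, hit_tokens)

    def on_load_failed(self) -> None:
        # Only a failed load invalidates the hit, so the prefix must be saved again.
        self.saved_tokens = self.rollback_to

    def plan_save(self, computed_tokens: int, chunk_size: int = 256) -> Tuple[int, int]:
        """Return (skip, count) in tokens, rounded down to whole chunks."""
        full = (computed_tokens // chunk_size) * chunk_size
        skip = min(self.saved_tokens, full)
        return skip, full - skip


# Warm request: the whole 129792-token prefix was a cache hit.
warm = SaveFrontier()
warm.on_cache_hit(129792)
warm_plan = warm.plan_save(129792)         # nothing new to store
grown_plan = warm.plan_save(129792 + 256)  # only the new suffix chunk

# Load failure: the frontier rolls back and the prefix is saved again.
failed = SaveFrontier()
failed.on_cache_hit(129792)
failed.on_load_failed()
failed_plan = failed.plan_save(129792)
```

With the token count from the log above, the warm plan skips all 129792 tokens instead of re-storing them with skip=0, while the failed-load plan falls back to a full save.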

Validation

Host:

python3 -m py_compile atom/kv_transfer/offload/*.py
python3 -m pytest tests/test_lmcache_offload_connector.py -q
# 19 passed, 18 skipped
git diff --check

Docker yhl_kvoff_009:

cd /host_009/ATOM
python3 -m py_compile atom/kv_transfer/offload/*.py
python3 -m pytest tests/test_lmcache_offload_connector.py -q
# 37 passed

Bench Notes

Compared against the previous no-fastpath segment_indexed CPU offload:

128K c2/s2 follow avg TTFT: 2.6139s -> 1.9139s; load avg 675.43ms -> 459.30ms.
64K c2/s4 follow avg TTFT: 2.3633s -> 2.2717s; load avg 478.18ms -> 457.24ms.
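
For reading the deltas, a small helper can turn the figures above into relative reductions. The function name and the dictionary labels are illustrative; the numbers are copied from the two lines above.

```python
def relative_reduction(before: float, after: float) -> float:
    """Fractional reduction from before to after (positive means lower/faster)."""
    return (before - after) / before


# (before, after) pairs taken from the bench notes above.
results = {
    "128K c2/s2 TTFT (s)": (2.6139, 1.9139),
    "128K c2/s2 load (ms)": (675.43, 459.30),
    "64K c2/s4 TTFT (s)": (2.3633, 2.2717),
    "64K c2/s4 load (ms)": (478.18, 457.24),
}

reductions = {name: relative_reduction(b, a) for name, (b, a) in results.items()}

for name, (before, after) in results.items():
    print(f"{name}: {before} -> {after} ({reductions[name]:.1%} lower)")
```

By this arithmetic the 128K case improves by roughly 27% in TTFT and 32% in load time, while the 64K case improves by about 4% on both.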

Server logs confirm enable_prefix_caching=True and enable_chunked_prefill=True; no failed load markers were observed.

Add LMCache-compatible offload connector

d23a9f4

yhl-amd changed the base branch from feature/lmcache-offload-scheme-a to main June 3, 2026 08:14

yhl-amd changed the base branch from main to feature/lmcache-offload-scheme-a June 3, 2026 08:22

yhl-amd mentioned this pull request Jun 4, 2026

Review LMCache offload compatible connector and bounded staging #3

Open

yhl-amd wants to merge 1 commit into feature/lmcache-offload-scheme-a from feature/lmcache-compatible-connector
