Add AMD ROCm (gfx942) support for the image→3D generation stack #72
Open
ZJLi2013 wants to merge 1 commit into
Swap the CUDA-only generation stack for verified ROCm builds (spconv_rocm, nvdiffrast@rocm, amd_gsplat, pytorch3d ROCm wheel, FA2-Triton) plus two runtime shims: a kaolin sitecustomize bypass (texture-stage only) and a spconv KRSC->Native checkpoint-load bridge. All additive under docker/; CUDA paths unchanged. Verified e2e on AMD Instinct MI300X / ROCm 6.4.3 / torch 2.6: SAM3D image->3D produces splat.ply (28.9s, 9.74GB VRAM).
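For readers unfamiliar with the `sitecustomize` bypass idea, here is a minimal sketch of how a `kaolin` stub could be registered. It is not the contents of `docker/kaolin_stub.py`. The function name `install_kaolin_stub` is hypothetical, and the `check_tensor` signature is assumed to mirror kaolin's; the stand-in validates only shape and dtype.

```python
import importlib.util
import sys
import types


def check_tensor(tensor, shape=None, dtype=None, device=None, throw=True):
    """Simplified stand-in: validates shape and dtype; `device` is accepted but ignored."""
    problem = None
    if shape is not None:
        actual = tuple(tensor.shape)
        same_rank = len(actual) == len(shape)
        # A None entry in `shape` acts as a wildcard for that dimension.
        if not same_rank or any(w is not None and w != g for w, g in zip(shape, actual)):
            problem = f"shape {actual} does not match {tuple(shape)}"
    if problem is None and dtype is not None and tensor.dtype != dtype:
        problem = f"dtype {tensor.dtype} does not match {dtype}"
    if problem is None:
        return True
    if throw:
        raise ValueError(problem)
    return False


def install_kaolin_stub():
    """Register stub modules unless kaolin is already loaded or installed (hypothetical helper)."""
    if "kaolin" in sys.modules:
        return  # already imported (real package or an earlier stub)
    if importlib.util.find_spec("kaolin") is not None:
        return  # a real kaolin is installed; leave it alone
    names = ["kaolin", "kaolin.utils", "kaolin.utils.testing"]
    modules = {name: types.ModuleType(name) for name in names}
    # Wire up attribute access so `kaolin.utils.testing.check_tensor` resolves.
    modules["kaolin"].utils = modules["kaolin.utils"]
    modules["kaolin.utils"].testing = modules["kaolin.utils.testing"]
    modules["kaolin.utils.testing"].check_tensor = check_tensor
    sys.modules.update(modules)
```

Calling `install_kaolin_stub()` at the top level of a `sitecustomize.py` on the interpreter's path would make `from kaolin.utils.testing import check_tensor` succeed before any application code runs. Anything else imported from `kaolin` would still fail, which matches the texture stage being out of scope.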
Summary
Enable EmbodiedGen's image→3D generation to run on AMD GPUs (ROCm/HIP) by swapping the
CUDA-only libraries for verified ROCm builds plus two small runtime shims. All changes are
additive (new files under `docker/`); no existing CUDA code path is modified.

Verified end-to-end on an AMD Instinct MI300X: `python -m embodied_gen.models.sam3d` (SAM3D backend, no GPT, no texture-bake) produces `outputs/splat.ply` (a 6.5 MB 3D Gaussian splat) from the bundled `sample_00.jpg`.

Changes (all new files)
- `docker/install_rocm.sh`: one-shot ROCm install. It installs the requirements minus the CUDA libs, pins `numpy<2`, applies the ROCm dependency swaps (table below), deploys the two shims as `sitecustomize`, and runs an import smoke test (PASS/FAIL map).
- `docker/Dockerfile.rocm`: full-generation ROCm image (`rocm/pytorch:rocm6.4.3...2.6.0`) that runs `install_rocm.sh`.
- `docker/spconv_rocm_compat.py`: converts spconv KRSC checkpoints to the Native layout at load time (see Related issue).
- `docker/kaolin_stub.py`: `sitecustomize` bypass for the CUDA-only `kaolin` (used only in the texture-backprojection / mesh-IO stage; the core geometry path only calls `kaolin.utils.testing.check_tensor`).
- `docker/README.rocm.md`: user-facing run-through.

CUDA → ROCm dependency map
| CUDA | ROCm |
| --- | --- |
| `spconv-cu120/121` | `ZJLi2013/spconv_rocm` (2.3.8+rocm1, source) |
| `nvdiffrast` | `ZJLi2013/nvdiffrast@rocm` |
| `gsplat` | `amd_gsplat` (pypi.amd.com/rocm-6.4.3; import name stays `gsplat`) |
| `pytorch3d` | ROCm wheel |
| `flash-attn` | FA2-Triton (`FLASH_ATTENTION_TRITON_AMD_ENABLE=TRUE` at install + runtime) |
| `xformers` | `sdpa` |
| `numpy` (base = 2.x) | `<2` (diffusers/transformers requirement) |
| `kaolin` (no ROCm wheel) | `sitecustomize` stub (`docker/kaolin_stub.py`) |
| `diff-gaussian-rasterization` | `gsplat` is the default |

Tested on
`rocm/pytorch:rocm6.4.3_ubuntu24.04_py3.12_pytorch_release_2.6.0`

Results
`outputs/splat.ply` (6.5 MB) from `apps/assets/example_image/sample_00.jpg`

Notes / scope
- Backward-compatible: only adds files under `docker/`; CUDA users are unaffected.
- Out of scope (documented gaps, not regressions): texture-backprojection (`kaolin` is CUDA-only and stubbed) and the GPT quality-checkers (which need an API key). Core image→3D (segmentation → SAM3D geometry + gaussian + mesh export) runs without them.
- Optional follow-up (happy to include if desired): make the `kaolin` imports in `embodied_gen/data/utils.py` lazy/optional so the stub isn't needed.

Depends on / related: spconv KRSC checkpoint loading on ROCm, `ZJLi2013/spconv_rocm#<pr>`. Until merged, `docker/spconv_rocm_compat.py` provides the equivalent fix consumer-side.

License: this PR is for study/research purposes only and adds ROCm build/integration
scripts; it ships no model weights. Any models used (e.g. SAM-3D-Objects, TRELLIS, Kolors,
SD3.5) remain governed by their own respective licenses; please refer to each model's
license before use.
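As an illustration of the import smoke test (PASS/FAIL map) that `docker/install_rocm.sh` is described as running, a check of that kind could look roughly like the sketch below. This is not the script's actual code. The helper name `import_smoke` and the module list are assumptions chosen from the dependency map above.

```python
import importlib


def import_smoke(module_names):
    """Try importing each module and return a name -> 'PASS' / 'FAIL: <error>' map."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = "PASS"
        except Exception as exc:  # ImportError, missing shared libraries, etc.
            results[name] = f"FAIL: {type(exc).__name__}"
    return results


if __name__ == "__main__":
    # Assumed import names for the swapped dependencies; adjust to the real install.
    modules = ["torch", "spconv", "nvdiffrast", "gsplat", "pytorch3d", "flash_attn", "kaolin"]
    for name, status in import_smoke(modules).items():
        print(f"{name:12s} {status}")
```

Catching a broad `Exception` rather than only `ImportError` is deliberate in this sketch, since a ROCm build can import at the Python level yet fail while loading its compiled extension.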