Hi PTv3 team,
Following up on the spirit of #159 (CUDA 12.8 / Blackwell compatibility), I wanted to share that the PTv3 standalone model.py runs on AMD Instinct GPUs with ROCm: zero changes to model code, just 3 dependency swaps.
Environment
Component | Value
-- | --
GPU | AMD Instinct MI300X
PyTorch | 2.10.0+rocm7.2.0
ROCm / HIP | 7.2.26015
Docker | rocm/pytorch:rocm7.2_ubuntu24.04_py3.13_pytorch_release_2.10.0
3-Step Drop-in Guide
1. spconv → spconv_rocm
git clone -b rocm https://github.com/ZJLi2013/spconv_rocm.git
pip install -e spconv_rocm/
spconv_rocm replaces the original pccm/cumm CUDA codegen (~33k lines) with self-contained HIP kernels (indice pairs via Murmur3 hash table), C++ JIT extensions, and torch.mm GEMM dispatch (~1.5k lines). PTv3 only uses SubMConv3d, which maps cleanly. 28 unit tests pass on MI300X.
2. flash-attn → Triton AMD path
FLASH_ATTENTION_TRITON_AMD_ENABLE=TRUE pip install flash-attn --no-build-isolation
export FLASH_ATTENTION_TRITON_AMD_ENABLE=TRUE
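If exporting the variable in the shell is inconvenient (notebooks, launchers), the same flag can be set from Python. The sketch below assumes flash-attn reads `FLASH_ATTENTION_TRITON_AMD_ENABLE` when it is imported, so the flag has to be set first. The helper name `enable_triton_amd_backend` and its parameters are hypothetical and not part of flash-attn.

```python
import os
import sys

ENV_VAR = "FLASH_ATTENTION_TRITON_AMD_ENABLE"


def enable_triton_amd_backend(environ=None, loaded_modules=None) -> bool:
    """Set the Triton AMD flag for flash-attn (hypothetical helper).

    Returns False if flash_attn was already imported, in which case the
    flag may have been set too late to take effect (assumption: the flag
    is read at import time).
    """
    environ = os.environ if environ is None else environ
    loaded_modules = sys.modules if loaded_modules is None else loaded_modules
    already_imported = "flash_attn" in loaded_modules
    environ[ENV_VAR] = "TRUE"
    return not already_imported
```

Call it at the top of the entry script, before anything imports `flash_attn` or the PTv3 model.py.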
3. torch_scatter → pure PyTorch shim
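A minimal sketch of what such a shim could look like, not the shim used for the results below. It assumes model.py only needs `torch_scatter.segment_csr` with a 1-D `indptr` (the post does not list which functions are covered), and it registers a stand-in module under the name `torch_scatter` so that model.py's import keeps working unchanged. The reduction is built on `Tensor.scatter_reduce_`, which is a swapped-in technique, not how torch_scatter implements it.

```python
import sys
import types

import torch

# torch_scatter reduce names -> Tensor.scatter_reduce_ reduce names
_REDUCE = {"sum": "sum", "mean": "mean", "max": "amax", "min": "amin"}


def segment_csr(src, indptr, reduce="sum"):
    """Reduce contiguous row segments of `src` delimited by CSR pointers.

    Segment i covers rows indptr[i]:indptr[i + 1]. Empty segments yield 0.
    """
    if indptr.dim() != 1:
        raise NotImplementedError("this sketch handles 1-D indptr only")
    num_segments = indptr.numel() - 1
    lengths = indptr[1:] - indptr[:-1]
    # Segment id for every row that falls inside some segment.
    seg_ids = torch.repeat_interleave(
        torch.arange(num_segments, device=indptr.device), lengths
    ).to(src.device)
    rows = src[int(indptr[0]) : int(indptr[-1])]
    # Broadcast the ids over the trailing feature dimensions.
    index = seg_ids.view(-1, *([1] * (rows.dim() - 1))).expand_as(rows)
    out = rows.new_zeros((num_segments, *rows.shape[1:]))
    out.scatter_reduce_(0, index, rows, reduce=_REDUCE[reduce], include_self=False)
    return out


# Expose the function under the module name model.py imports.
# setdefault leaves an already-imported torch_scatter untouched.
_shim = types.ModuleType("torch_scatter")
_shim.segment_csr = segment_csr
sys.modules.setdefault("torch_scatter", _shim)
```

Importing this file before model.py means `import torch_scatter` resolves to the stand-in, which is what keeps the model code untouched.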
Test Results
Tested on ModelNet40 (40 object categories, 10k points each,
| flash_attn | Samples | Categories | Pass Rate |
| -- | -- | -- | -- |
| True | 20 | 1 (airplane) | 100% |
| False | 20 | 1 (airplane) | 100% |
| True | 40 | 40 (all) | 100% |
