# Issue/1336: Add NVIDIA Marlin repack ops and make Add graph-capture safe #1337

Open
qinyiqun wants to merge 2 commits into InfiniTensor:main from qinyiqun:marlin_repack
@qinyiqun

Collaborator

Summary

  • Add AWQ/GPTQ Marlin repack infiniop operators for NVIDIA.
  • Guard Marlin repack CUDA implementations with ENABLE_NVIDIA_API so non-NVIDIA builds do not compile kernel code.
  • Update NVIDIA Add to avoid CUDA graph replay issues caused by capturing temporary host input vectors.
  • Keep non-NVIDIA Add backend interfaces unchanged via a thin forwarding path.
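The `ENABLE_NVIDIA_API` guard from the second bullet can be illustrated with a minimal sketch. Only the macro name comes from this PR; `Device`, `Status`, `marlin_repack`, and `marlin_repack_nvidia` are hypothetical names chosen for illustration and are not the actual infiniop interface. The NVIDIA path is a host stub here so the sketch compiles with a plain C++ compiler.

```cpp
#include <cstdint>

// Hypothetical types for illustration only; not the real infiniop API.
enum class Device { Cpu, Nvidia };
enum class Status { Success, DeviceTypeNotSupported };

#ifdef ENABLE_NVIDIA_API
// Assumption: in a real build this would live in a .cu file and launch the
// repack kernel. It is a stub here and is only compiled when the macro is set.
Status marlin_repack_nvidia(const std::uint32_t *in, std::uint32_t *out, int n) {
    (void)in;
    (void)out;
    (void)n;
    return Status::Success;
}
#endif

// Backend-agnostic entry point: the NVIDIA case exists only in NVIDIA builds,
// so other builds never reference (or compile) the kernel code.
Status marlin_repack(Device dev, const std::uint32_t *in, std::uint32_t *out, int n) {
    switch (dev) {
#ifdef ENABLE_NVIDIA_API
    case Device::Nvidia:
        return marlin_repack_nvidia(in, out, n);
#endif
    default:
        (void)in;
        (void)out;
        (void)n;
        return Status::DeviceTypeNotSupported;
    }
}
```

Built without `-DENABLE_NVIDIA_API`, a call with `Device::Nvidia` falls through to the default branch and reports the device as unsupported, rather than failing at compile or link time.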

qinyiqun requested a review from a team on June 25, 2026 06:37
xgqdut2016 and others added 2 commits June 25, 2026 06:50
Squash AWQ/GPTQ Marlin repack operator commits from issue/1192-clean.
2 participants