[CUDA] New 4bit GEMM kernels for inference #1949

Merged
matthewdouglas merged 8 commits into main from gemm4bit
May 21, 2026

XPU/MPS: add K%blocksize guard for gemv fallback in gemm_4bit

f1b7d9e

Annotations

1 warning and 1 notice
build-xpu (windows-2025)
succeeded May 20, 2026 in 6m 38s