sm121
Here are 8 public repositories matching this topic...
7.67× LoRA / 8.35× full fine-tuning speedup for Qwen3.5 (0.8B–27B) on NVIDIA DGX Spark, reaching wall-clock parity with a rented H100. Lossless within BF16. A three-command interactive wizard handles the model picker, data validator, training, and merge.
Updated May 19, 2026 · Python
DGX Spark (GB10/SM121) platform support for Meta's KernelAgent — auto-detect, hardware constraints, safe Triton configs
Updated Mar 14, 2026 · Python
Empirical kernel scheduling characterization for NVIDIA GB10 (SM121a). Sweeps GEMM tile configurations, classifies PTX instruction paths, captures hardware telemetry
Updated May 10, 2026 · C++
Pre-built PyTorch wheels and build scripts for NVIDIA DGX Spark (GB10, sm_121, Blackwell, CUDA 13.0, ARM64)
Updated May 23, 2026 · Shell
Patches + recipe to deploy festr2/MiMo-V2.5-Pro-NVFP4-MXFP8-attn-TP8 on 8-node DGX Spark sm_121 (Ray + vLLM, TP=8). Fixes the fused-qkv loader bug that mis-slotted Q values as K/V on 7 of 8 ranks.
Updated May 19, 2026 · Python