Pure Rust Inference Engine
Updated Jun 2, 2026 - Rust
One-command vLLM installation for NVIDIA DGX Spark with Blackwell GB10 GPUs (sm_121 architecture)
Headless remote desktop setup for NVIDIA DGX Spark using Sunshine streaming
Reproducible local LLM setup and benchmark evidence for AMD Strix Halo / Ryzen AI MAX+ 395: 63-98.5 t/s direct Qwen MoE, 101.1 t/s MTP.
Serve the home! An inference stack for your NVIDIA DGX Spark, aka the Grace Blackwell AI supercomputer on your desk. Mostly vLLM-based for now, and single-Spark only. For the not-so-rich buddies.
Bleeding-edge ComfyUI for NVIDIA DGX Spark (GB10/Blackwell/sm_121a). CUDA 13 + SageAttention v3 (sm_121a) + NVFP4 + 14 custom-node packs + Flux 2 Dev / LTX 2.3 22B / ACE-Step v1.5 XL Turbo pre-bundled with abliterated text-encoder paths.
Local diagnostic CLI for NVIDIA DGX Spark (GB10). Detects power caps, unified memory pressure, thermal risk, Docker/runtime issues, and validates vLLM/Ollama/llama.cpp/SGLang recipes.
vLLM + Qwen3.5-122B-A10B-NVFP4 on NVIDIA DGX Spark (GB10/SM121) — single-GPU NVFP4 W4A4 with MTP speculative decoding, self-contained Docker build
Some benchmark results of small models and quants that fit on DGX Spark
Headless remote desktop to your DGX Spark in crystal-clear 4K
GPU-accelerated WhisperX on NVIDIA Blackwell (SM_121) - DGX Spark compatible
Private AI stack in one command - GB10 arm64 dgx-spark
Operator-grade GPU monitor for NVIDIA GPUs with native GB10 / DGX Spark coherent UMA support — PSI pressure, clock detection, ConnectX-7 network layer