Working on DGX Spark (ARM64 + CUDA 13) - Setup Notes
#23, opened by logos-flux
Got VibeVoice-Realtime-0.5B running on DGX Spark with full GPU acceleration. Sharing setup notes since the official docs focus on x86_64.
The Issue:
The PyTorch build that pip installs by default on Spark may not have CUDA enabled. You'll see CUDA available: False even though the GPU is present. This is common on ARM64, where the default wheel is often a CPU-only build.
The Fix:
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu130
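To confirm the reinstall worked, a quick check along these lines may help. It is a minimal sketch, not part of VibeVoice: the helper name cuda_report is made up for illustration, and it only uses standard PyTorch attributes (torch.version.cuda, torch.cuda.is_available, torch.cuda.get_device_name).

```python
import importlib.util
import platform


def cuda_report():
    """Summarize whether the installed PyTorch build can see a CUDA device.

    Hypothetical helper for illustration; not part of VibeVoice or PyTorch.
    """
    report = {
        "machine": platform.machine(),  # expected to be "aarch64" on DGX Spark
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "torch_version": None,
        "built_with_cuda": None,
        "cuda_available": False,
        "device_name": None,
    }
    if not report["torch_installed"]:
        return report

    import torch

    report["torch_version"] = torch.__version__
    # torch.version.cuda is None for CPU-only builds.
    report["built_with_cuda"] = torch.version.cuda
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        report["device_name"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If built_with_cuda prints None, the installed wheel is a CPU-only build and the install command above likely did not replace it (for example, because it ran in a different virtual environment).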
Performance:
- RTF 0.48x on GB10 (generates audio roughly 2x faster than real-time)
- ~300ms to first audio chunk in streaming mode
- 53 seconds of audio generated in 26 seconds
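For anyone unfamiliar with RTF (real-time factor): it is generation time divided by audio duration, so values below 1.0 mean faster than real-time. The sketch below works through the numbers above; the function name is illustrative. The rounded figures of 26 and 53 seconds give about 0.49, so the reported 0.48 presumably comes from the unrounded timings.

```python
def real_time_factor(generation_seconds, audio_seconds):
    """Return generation time divided by audio duration (lower is faster)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return generation_seconds / audio_seconds


rtf = real_time_factor(26, 53)  # rounded timings from the list above
speedup = 1 / rtf               # how many times faster than real-time

print(f"RTF: {rtf:.2f}")            # RTF: 0.49
print(f"Speedup: {speedup:.2f}x")   # Speedup: 2.04x
```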
Notes:
- Flash Attention not needed - SDPA fallback works fine
- The "APEX FusedRMSNorm not available" warning is harmless
- All 7 preset voices work (Emma, Mike, Carter, Davis, Frank, Grace, Samuel)
Built a full voice pipeline (Whisper + Ollama + VibeVoice) with sentence-level streaming that achieves ~766ms to first audio.
Happy to share code if there's interest.
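In the meantime, here is a rough sketch of the general idea behind sentence-level streaming: buffer the LLM's streamed text and hand each sentence to TTS as soon as it is complete, instead of waiting for the full reply. This is not the actual pipeline code. The name stream_sentences and the simple punctuation-based splitting rule are assumptions for illustration, and the Whisper, Ollama, and VibeVoice calls are left out entirely.

```python
import re

# A sentence ends at '.', '!' or '?' followed by whitespace.
# Waiting for the whitespace avoids splitting inside numbers like "3.14",
# but abbreviations such as "Dr. Smith" will still be split.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def stream_sentences(token_stream):
    """Yield complete sentences as soon as they appear in a stream of text chunks."""
    buffer = ""
    for token in token_stream:
        buffer += token
        parts = _SENTENCE_END.split(buffer)
        # Every part except the last is a finished sentence.
        for sentence in parts[:-1]:
            if sentence.strip():
                yield sentence.strip()
        # The last part may still be growing, so keep it buffered.
        buffer = parts[-1]
    # Flush whatever remains once the stream ends.
    if buffer.strip():
        yield buffer.strip()


if __name__ == "__main__":
    tokens = ["Hello", " there.", " How", " are you?", " Fine"]
    for sentence in stream_sentences(tokens):
        print(sentence)  # each sentence would go to the TTS step here
```

The point of the design is that the first sentence can start synthesizing while the LLM is still generating the rest, which is where the low time to first audio comes from.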