Working on DGX Spark (ARM64 + CUDA 13) - Setup Notes

#23
by logos-flux - opened

Got VibeVoice-Realtime-0.5B running on DGX Spark with full GPU acceleration. Sharing setup notes since the official docs focus on x86_64.

The Issue:
The PyTorch build installed on Spark may not have CUDA enabled. In that case you'll see CUDA available: False even though the GPU is present. This is a common issue.

The Fix:

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu130
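To confirm that the reinstall actually picked up a CUDA build, a quick check along these lines may help. This is a minimal sketch: `cuda_report` is a helper name made up for this post, and it only uses standard PyTorch calls (`torch.version.cuda`, `torch.cuda.is_available()`, `torch.cuda.get_device_name()`).

```python
def cuda_report() -> dict:
    """Summarize whether the installed PyTorch build can use the GPU.

    Hypothetical helper for illustration; not part of VibeVoice.
    """
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment at all.
        return {"torch_installed": False, "cuda_available": False}

    report = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # None here means the wheel was built without CUDA (CPU-only).
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }
    if report["cuda_available"]:
        report["device_name"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If `built_with_cuda` comes back as `None`, the wheel is CPU-only and the install command above is the thing to retry.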

Performance:

  • RTF 0.48x on GB10 (generates audio about 2x faster than real-time)
  • ~300ms to first audio chunk in streaming mode
  • 53 seconds of audio generated in 26 seconds
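For anyone comparing numbers, here is how the figures above relate, assuming RTF (real-time factor) is defined as generation time divided by audio duration. That definition is an assumption on my part, but it is consistent with the list above. The function names are just for illustration.

```python
def real_time_factor(generation_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent generating / duration of the audio produced.

    Values below 1.0 mean generation is faster than real-time.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return generation_seconds / audio_seconds


def speedup(generation_seconds: float, audio_seconds: float) -> float:
    """How many times faster than real-time (the inverse of RTF)."""
    return 1.0 / real_time_factor(generation_seconds, audio_seconds)


if __name__ == "__main__":
    # Rounded figures from the list above: 53 s of audio in 26 s.
    print(f"RTF:     {real_time_factor(26, 53):.2f}")
    print(f"Speedup: {speedup(26, 53):.2f}x")
```

With the rounded 26 s and 53 s this works out to roughly 0.49, slightly off from the reported 0.48x, which is presumably because the durations above are rounded to whole seconds.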

Notes:

  • Flash Attention not needed - SDPA fallback works fine
  • The "APEX FusedRMSNorm not available" warning is harmless
  • All 7 preset voices work (Emma, Mike, Carter, Davis, Frank, Grace, Samuel)

Built a full voice pipeline (Whisper + Ollama + VibeVoice) with sentence-level streaming that achieves ~766ms to first audio.
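The idea behind sentence-level streaming is to hand each complete sentence to TTS as soon as the LLM has produced it, instead of waiting for the full reply. The sketch below is not the pipeline code itself, just a rough illustration of the buffering step; `stream_sentences` is a hypothetical name and the sentence boundary rule is deliberately simple.

```python
import re
from typing import Iterable, Iterator

# Sentence boundary: '.', '!' or '?' followed by whitespace.
# A simplification: abbreviations such as "Dr. Smith" will split too early.
_BOUNDARY = re.compile(r"(?<=[.!?])\s+")


def stream_sentences(chunks: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences as text chunks arrive from an LLM stream."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        parts = _BOUNDARY.split(buffer)
        # Every part except the last ended at a boundary, so it is complete.
        for sentence in parts[:-1]:
            if sentence.strip():
                yield sentence.strip()
        # The last part may still be growing; keep it buffered.
        buffer = parts[-1]
    # Flush whatever remains once the stream ends.
    if buffer.strip():
        yield buffer.strip()


if __name__ == "__main__":
    llm_chunks = ["Hello the", "re. How are", " you? Fine"]
    for sentence in stream_sentences(llm_chunks):
        # In a real pipeline, each sentence would go to the TTS model here.
        print(sentence)
```

Each yielded sentence can be sent to the TTS model right away, so time to first audio depends on the first sentence rather than the whole response.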

Happy to share code if there's interest.
