htdemucs_ft: WebGPU / onnxruntime-web build

The htdemucs_ft 4-model fine-tuned ensemble (Meta's Demucs v4), exported to ONNX so it runs in the browser on WebGPU via onnxruntime-web, with no Python and no server.

Built for loukai's in-browser karaoke creator.

What makes this different from other htdemucs ONNX exports

There are several htdemucs ONNX exports on the Hub already, but they are CUDA/CPU server exports and fail to load on the onnxruntime-web WebGPU execution provider: they contain an in-graph STFT and many ScatterND ops that the WebGPU EP can't place (verified: session creation throws in transformer_memcpy). This build is shaped specifically for the browser:

  • STFT/iSTFT pulled out of the graph (done in JS), using the real-magnitude input contract: mix [1,2,343980] + mag [1,4,2048,336] → x [1,4,4,2048,336] (freq mask) + xt [1,4,2,343980] (time). Masking is applied in JS (see demucs-web).
  • fp16 weights for speed/size, with the variance/normalization prologue pinned to CPU (forceCpuNodeNames) because that op overflows fp16 on WebGPU and produces NaNs. fp16 output matches fp32 (correlation ~1.0).
  • Legacy torch.onnx export (opset 17, no dynamo), because the dynamo path decomposes ops in ways that produce NaNs on WebGPU.
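As a sanity check on the tensor contract above, the following sketch computes the element counts for each tensor and shows how the 336-frame axis can be derived from the 343980-sample segment. The hop length of 1024 and the 44100 Hz sample rate are assumptions taken from the usual Demucs defaults, not from this README, and the helper names (numel, frameCount, assertLength) are hypothetical.

```typescript
// Tensor shapes quoted in this README (batch dimension first).
const SHAPES = {
  mix: [1, 2, 343980],
  mag: [1, 4, 2048, 336],
  x: [1, 4, 4, 2048, 336],
  xt: [1, 4, 2, 343980],
} as const;

// Assumptions: common Demucs defaults, not stated in this README.
const SAMPLE_RATE = 44100;
const HOP_LENGTH = 1024;

// Number of elements in a tensor of the given shape.
function numel(shape: readonly number[]): number {
  return shape.reduce((acc, dim) => acc * dim, 1);
}

// Number of STFT frames for a segment, assuming ceil(samples / hop) framing.
function frameCount(samples: number, hop: number = HOP_LENGTH): number {
  return Math.ceil(samples / hop);
}

// Throws if a flat buffer does not match the expected element count.
function assertLength(name: keyof typeof SHAPES, data: ArrayLike<number>): void {
  const expected = numel(SHAPES[name]);
  if (data.length !== expected) {
    throw new Error(`${name}: expected ${expected} elements, got ${data.length}`);
  }
}

// Segment duration in seconds under the assumed sample rate.
const segmentSeconds = SHAPES.mix[2] / SAMPLE_RATE;
```

Checking buffer lengths with assertLength before creating input tensors tends to surface STFT padding mistakes earlier than a shape error raised inside the runtime.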

Files

  • htdemucs_ft_{drums,bass,other,vocals}_safe16.onnx: the 4 specialist models (~84 MB each, fp16). Stem k is taken from model k (the bag's one-hot weights).
  • ft_cpu_nodes.json: per-stem forceCpuNodeNames lists.
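A minimal sketch of how the two files could be wired together when building session options. The JSON schema is an assumption (an object keyed by stem name, each value an array of node names); the actual layout of ft_cpu_nodes.json may differ. The helper names modelFileName and sessionOptionsForStem are hypothetical; forceCpuNodeNames itself is the WebGPU execution provider option named above.

```typescript
const STEMS = ["drums", "bass", "other", "vocals"] as const;
type Stem = (typeof STEMS)[number];

// Assumed schema for ft_cpu_nodes.json: { "drums": ["node_a", ...], ... }.
type CpuNodeMap = Partial<Record<Stem, string[]>>;

// File name pattern from the Files list.
function modelFileName(stem: Stem): string {
  return `htdemucs_ft_${stem}_safe16.onnx`;
}

// Builds session options that pin the listed nodes to CPU for one stem.
// Falls back to an empty list if the stem has no entry.
function sessionOptionsForStem(cpuNodes: CpuNodeMap, stem: Stem) {
  return {
    executionProviders: [
      {
        name: "webgpu" as const,
        forceCpuNodeNames: [...(cpuNodes[stem] ?? [])],
      },
    ],
  };
}

// In the browser, the result would be passed as the second argument to
// ort.InferenceSession.create(modelUrl, options) from onnxruntime-web.
```

Keeping the node lists in a separate JSON file, rather than hard-coding them, means each stem's list can be updated alongside its model without touching the loader.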

Usage

Runs via the loukai-htdemucs-ft ensemble runner (createEnsembleSessions / runEnsemble) on top of demucs-web for the STFT. See the loukai repo for the full in-browser pipeline (Demucs + Whisper + CREPE, all WebGPU).
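The "stem k is taken from model k" rule can be illustrated independently of the runner. The sketch below is not the runEnsemble implementation. It assumes each model yields a flat, row-major buffer laid out as [1, 4, channels, samples], and that both the models and the stem axis follow the order drums, bass, other, vocals. The function name pickSpecialistStems is hypothetical.

```typescript
// Assumed order of both the specialist models and the stem axis.
const STEM_ORDER = ["drums", "bass", "other", "vocals"] as const;

// From the k-th model's output, keep only the k-th stem (one-hot selection).
// Each output is a flat row-major buffer of shape [1, 4, channels, samples].
function pickSpecialistStems(
  modelOutputs: readonly Float32Array[],
  channels: number,
  samples: number,
): Float32Array[] {
  const stemSize = channels * samples;
  const fullSize = STEM_ORDER.length * stemSize;
  return STEM_ORDER.map((stem, k) => {
    const out = modelOutputs[k];
    if (!out || out.length !== fullSize) {
      throw new Error(`model ${k} (${stem}): expected ${fullSize} elements`);
    }
    // Stem k occupies the k-th contiguous block of size channels * samples.
    return out.slice(k * stemSize, (k + 1) * stemSize);
  });
}
```

Because the bag weights are one-hot, no averaging across models is needed: three of the four stems each model produces are simply discarded.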

Credit

Models exported from Demucs (htdemucs_ft, MIT). Export approach builds on the timcsy / gianlourbano demucs-web-onnx work.
