USEF-TSE ONNX exports (audio-only, 8 kHz)

ONNX exports of the USEF-TSE target speaker extraction models from github.com/ZBang/USEF-TSE.

License

CC BY-NC 4.0, inherited from the upstream weights. Non-commercial use only.

Models

Two architectures × three training datasets = six exports. All inputs are 8 kHz float32 mono PCM.

File                                Architecture  Training set             Size
usef_tse_tfgridnet_wsj0-2mix.onnx   TF-GridNet    WSJ0-2mix (clean)        60 MB
usef_tse_tfgridnet_wham.onnx        TF-GridNet    WHAM! (noisy)            60 MB
usef_tse_tfgridnet_whamr.onnx       TF-GridNet    WHAMR! (noisy+reverb)    60 MB
usef_tse_sepformer_wsj0-2mix.onnx   SepFormer     WSJ0-2mix (clean)        131 MB
usef_tse_sepformer_wham.onnx        SepFormer     WHAM! (noisy)            131 MB
usef_tse_sepformer_whamr.onnx       SepFormer     WHAMR! (noisy+reverb)    131 MB

Per-dataset manifests in manifest_*.json carry SHA-256 checksums and the PyTorch ↔ ONNX parity numbers measured on real audio fixtures.
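A downloaded export can be checked against the SHA-256 recorded in its manifest. The sketch below is a minimal, standard-library-only way to do that. The helper names `sha256_of_file` and `matches_expected` are illustrative, and because the manifest's JSON layout is not described here, the expected digest is passed in as a plain string rather than read from a specific key.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare a file's digest with a hex string copied from the manifest.

    The manifest schema is not assumed; the caller looks up the value.
    """
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Reading in chunks keeps memory use flat for the 131 MB SepFormer files.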

Inference contract

  • Inputs:
    • mixture: [1, 16000] float32, 2 seconds @ 8 kHz mono
    • enrollment: [1, 64000] float32, 8 seconds @ 8 kHz mono (zero-pad if shorter)
  • Output:
    • extracted: [1, 16000] float32, 2 seconds @ 8 kHz mono (same length as mixture)
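The contract above can be exercised with a small wrapper like the following sketch. It assumes the graph's tensor names are literally `mixture`, `enrollment`, and `extracted`, as listed in the contract, and that `session` behaves like an `onnxruntime.InferenceSession`. The helpers `fit_length` and `extract_window` are illustrative names, and truncating an enrollment longer than 8 s is an assumption, since only the zero-padding case is specified.

```python
import numpy as np

MIX_LEN = 16000     # 2 s at 8 kHz
ENROLL_LEN = 64000  # 8 s at 8 kHz


def fit_length(audio, length):
    """Zero-pad 1-D audio to `length` samples (truncating if longer is an assumption)."""
    audio = np.asarray(audio, dtype=np.float32).reshape(-1)[:length]
    out = np.zeros(length, dtype=np.float32)
    out[: audio.shape[0]] = audio
    return out


def extract_window(session, mixture, enrollment):
    """Run one 2 s window through the model and return a [16000] float32 array."""
    mixture = np.asarray(mixture, dtype=np.float32).reshape(-1)
    if mixture.shape[0] != MIX_LEN:
        raise ValueError(f"mixture must have {MIX_LEN} samples, got {mixture.shape[0]}")
    feeds = {
        "mixture": mixture[None, :],                                # [1, 16000]
        "enrollment": fit_length(enrollment, ENROLL_LEN)[None, :],  # [1, 64000]
    }
    extracted = session.run(["extracted"], feeds)[0]
    return extracted[0]


# Typical use (requires onnxruntime and a downloaded export):
#   import onnxruntime as ort
#   session = ort.InferenceSession("usef_tse_tfgridnet_wham.onnx")
#   out = extract_window(session, mixture_8k, enrollment_8k)
```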

The 2 s mixture window is fixed because TF-GridNet bakes unfold constants into its ONNX graph. Longer audio must be chunked into 2 s windows and the outputs concatenated.
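The chunk-and-concatenate procedure can be sketched as below. `extract_long` is a hypothetical helper, and `extract_fn` stands for any callable that maps one 16000-sample window to a 16000-sample output (for instance, a closure around an ONNX Runtime session). Zero-padding the final partial window and trimming the result back to the input length is an assumption; the windows do not overlap and the outputs are joined without crossfading.

```python
import numpy as np

WINDOW = 16000  # 2 s at 8 kHz


def extract_long(mixture, extract_fn, window=WINDOW):
    """Process arbitrary-length 8 kHz audio in fixed, non-overlapping windows."""
    mixture = np.asarray(mixture, dtype=np.float32).reshape(-1)
    n = mixture.shape[0]
    if n == 0:
        return np.zeros(0, dtype=np.float32)
    pieces = []
    for start in range(0, n, window):
        chunk = mixture[start:start + window]
        if chunk.shape[0] < window:
            # Assumption: pad the last partial window with zeros.
            chunk = np.pad(chunk, (0, window - chunk.shape[0]))
        pieces.append(np.asarray(extract_fn(chunk), dtype=np.float32))
    # Drop the samples that correspond to padding in the last window.
    return np.concatenate(pieces)[:n]
```

Because each window is processed independently, audible discontinuities at the 2 s boundaries are possible; overlapping windows with a crossfade would be one way to reduce them, at the cost of extra inference calls.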

Exporter

Generated by iOS/scripts/export_usef_tse_onnx.py via the legacy TorchScript exporter at opset 17, with TF-GridNet's torch.stft/torch.istft replaced by conv1d/conv_transpose1d-based equivalents (the legacy exporter rejects complex-typed STFT outputs).
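To illustrate why a convolution can stand in for an STFT, the sketch below computes the real and imaginary parts of each frame's spectrum by correlating the signal with fixed, windowed cosine and sine kernels at a stride equal to the hop size. It is written in NumPy to stay self-contained and is not the exporter's code: the function names are illustrative, there is no centre padding, and the frame-by-kernel matrix product stands in for what a `conv1d` with these kernels and `stride=hop` would compute. Returning two real arrays instead of one complex array is what sidesteps complex-typed outputs.

```python
import numpy as np


def dft_kernels(n_fft):
    """Cosine and negative-sine DFT rows for the n_fft // 2 + 1 non-negative bins."""
    k = np.arange(n_fft // 2 + 1)[:, None]
    n = np.arange(n_fft)[None, :]
    angle = 2.0 * np.pi * k * n / n_fft
    return np.cos(angle), -np.sin(angle)


def conv_stft(signal, n_fft, hop, window):
    """Return (real, imag), each shaped [n_frames, n_fft // 2 + 1]."""
    signal = np.asarray(signal, dtype=np.float64)
    real_k, imag_k = dft_kernels(n_fft)
    real_k = real_k * window  # fold the analysis window into the kernels
    imag_k = imag_k * window
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop:i * hop + n_fft] for i in range(n_frames)])
    # Each output row is one stride step of the equivalent 1-D convolution.
    return frames @ real_k.T, frames @ imag_k.T
```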

PyTorch ↔ ONNX parity on real 16 kHz audio fixtures (downsampled to 8 kHz for inference): cosine similarity = 1.000 across all 18 cells; max absolute difference ≤ 2.2e-3.
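The two parity figures can be computed from a pair of output waveforms as in the sketch below. How the manifests' numbers were actually produced is not documented here, so this is one plausible definition: cosine similarity over the flattened waveforms and the largest element-wise absolute difference. `parity_metrics` is an illustrative name.

```python
import numpy as np


def parity_metrics(reference, candidate):
    """Return (cosine_similarity, max_abs_difference) for two equal-shape outputs."""
    ref = np.asarray(reference, dtype=np.float64).reshape(-1)
    cand = np.asarray(candidate, dtype=np.float64).reshape(-1)
    if ref.shape != cand.shape:
        raise ValueError("outputs must have the same number of samples")
    denom = np.linalg.norm(ref) * np.linalg.norm(cand)
    # Cosine similarity is undefined if either signal is all zeros.
    cosine = float(ref @ cand / denom) if denom > 0 else float("nan")
    max_abs = float(np.max(np.abs(ref - cand)))
    return cosine, max_abs
```

Cosine similarity is insensitive to overall scale, which is why a maximum absolute difference is a useful companion check.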
