DF Arena 1B — Speech Anti-Spoofing Arena results

RAPTOR universal anti-spoofing model. A wav2vec 2.0 XLS-R 1B self-supervised front-end whose per-layer hidden states are combined by learnable attention pooling (a layer-wise sigmoid gate over an attention-pooled summary), then passed through a 4-block Conformer head with a class token to a 2-way classifier. Inference runs in FP32 on a deterministic window: the first 64600 samples (~4.04 s at 16 kHz), tile-repeated if the clip is shorter, with no random crop and no resampling, so input is expected at 16 kHz. The score is softmax(logits)[bonafide]; higher means more bona fide. Weights are the official Speech-Arena-2025/DF_Arena_1B_V_1 checkpoint.
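
The windowing and scoring rules above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the checkpoint's own preprocessing code: the function names fixed_window and bonafide_score are hypothetical, and the bona fide class index (1 here) is an assumption that should be checked against the model's label map.

```python
import numpy as np

WINDOW = 64600  # samples; about 4.04 s at 16 kHz, as stated above


def fixed_window(wave, n: int = WINDOW) -> np.ndarray:
    """Return exactly n samples: the first n if long enough, else tile-repeat."""
    wave = np.asarray(wave, dtype=np.float32).reshape(-1)
    if wave.size == 0:
        raise ValueError("empty waveform")
    if wave.size >= n:
        return wave[:n]              # deterministic head of the clip, no random crop
    reps = -(-n // wave.size)        # ceil(n / len(wave)) using integer arithmetic
    return np.tile(wave, reps)[:n]   # repeat the clip, then trim to n samples


def bonafide_score(logits, bonafide_index: int = 1) -> float:
    """softmax(logits)[bonafide]. bonafide_index=1 is an assumption."""
    z = np.asarray(logits, dtype=np.float64)
    z = z - z.max()                  # subtract the max for numerical stability
    p = np.exp(z) / np.exp(z).sum()
    return float(p[bonafide_index])
```

For a 1-second clip at 16 kHz (16000 samples), the tile-repeat path repeats the clip five times and trims the result to 64600 samples.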

Paper: arXiv:2603.06164 · Params: 1148M · Checkpoint: SpeechAntiSpoofingBenchmarks/DF_Arena_1B_V_1

Arena standing

Live leaderboard: DF Arena 1B on the Speech Anti-Spoofing Arena

Per-dataset results (24 datasets; mean EER 3.66% over the 22 EER rows)

Dataset	Metric	Score
J-SPAW_LA	EER	0%
ArAD	EER	3.72%
DFADD	EER	0%
SONAR	EER	1.06%
DeepVoice	EER	8.29%
EmoFake_test	EER	1.9%
LibriSeVoc	EER	0.15%
CD-ADD	EER	1.72%
ODSS	EER	6.03%
InTheWild	EER	0.91%
DECRO	EER	4.41%
CFAD	EER	8.32%
ASVspoof2019_LA	EER	1.06%
HABLA	EER	0.86%
CVoiceFake_small	EER	5.84%
ASVspoof2021_LA	EER	4.81%
PyAra	EER	4.19%
XMAD	EER	1.86%
ASVspoof2021_DF	EER	1.88%
ASVspoof5	EER	17.34%
ADD22_eval_31	EER	1.12%
ADD2023_track12_test_r1	EER	5.09%
EmoSpoofTTS	1-SRR	0.39%
LRLspoof	1-SRR	0.45%

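EER = Equal Error Rate (lower is better). 1-SRR is reported for the spoof-only datasets, where an EER cannot be computed: it is the complement of the Spoof Recall Rate, i.e. the fraction of spoofs not flagged, with the threshold fixed at the model's own DeepVoice EER operating point (lower is better). All rows are scoring-verified (reproduce --scoring, Δ 0.0) and were computed with the TensorRT engine (parity-verified vs PyTorch).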
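
As a rough illustration of the two metrics, the sketch below computes an EER and its threshold from bona fide and spoof scores by a brute-force threshold sweep, then applies a fixed threshold to a spoof-only set to get 1-SRR. It is not the arena's scoring code: the function names, the `>=` decision rule, and the choice of the sweep point where the two error rates are closest are all assumptions. The last lines check that the 22 EER rows of the table average to the 3.66% quoted in the heading.

```python
import numpy as np


def compute_eer(bonafide, spoof):
    """Return (eer, threshold). Higher score = more bona fide (as in this card)."""
    bonafide = np.asarray(bonafide, dtype=np.float64)
    spoof = np.asarray(spoof, dtype=np.float64)
    if bonafide.size == 0 or spoof.size == 0:
        raise ValueError("EER needs both bona fide and spoof scores")
    # Candidate thresholds: every observed score, plus +inf so FAR can reach 0.
    thresholds = np.append(np.unique(np.concatenate([bonafide, spoof])), np.inf)
    # Assumed decision rule: accept as bona fide when score >= threshold.
    far = np.array([(spoof >= t).mean() for t in thresholds])    # spoofs accepted
    frr = np.array([(bonafide < t).mean() for t in thresholds])  # bona fide rejected
    i = int(np.argmin(np.abs(far - frr)))                        # closest crossing
    return float((far[i] + frr[i]) / 2.0), float(thresholds[i])


def one_minus_srr(spoof, threshold: float) -> float:
    """Fraction of spoofs NOT flagged at a fixed threshold (complement of recall)."""
    spoof = np.asarray(spoof, dtype=np.float64)
    return float((spoof >= threshold).mean())


# EER values (in %) copied from the 22 EER rows of the table above.
EER_ROWS = [0, 3.72, 0, 1.06, 8.29, 1.9, 0.15, 1.72, 6.03, 0.91, 4.41,
            8.32, 1.06, 0.86, 5.84, 4.81, 4.19, 1.86, 1.88, 17.34, 1.12, 5.09]
MEAN_EER = round(sum(EER_ROWS) / len(EER_ROWS), 2)
```

In the arena's setup the threshold passed to the 1-SRR step would be the one found on DeepVoice; here it is simply whatever `compute_eer` returns for the scores supplied.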

Usage

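from transformers import pipeline
import librosa

# "antispoofing" is a custom task loaded from the model repo, hence trust_remote_code=True.
pipe = pipeline("antispoofing", model="SpeechAntiSpoofingBenchmarks/DF_Arena_1B_V_1", trust_remote_code=True, device="cuda")
# Load at 16 kHz (librosa resamples on load); the model itself does no resampling.
audio, sr = librosa.load("sample.wav", sr=16000)
print(pipe(audio))   # {'label': 'bonafide'|'spoof', 'all_scores': {...}}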

Citation

@misc{kulkarni2026compactsslbackbonesmatter,
  title={Do Compact SSL Backbones Matter for Audio Deepfake Detection? A Controlled Study with RAPTOR},
  author={Ajinkya Kulkarni and Sandipana Dowerah and Atharva Kulkarni and Tanel Alumäe and Mathew Magimai Doss},
  year={2026},
  eprint={2603.06164},
  archivePrefix={arXiv},
  primaryClass={cs.SD},
  url={https://arxiv.org/abs/2603.06164}
}

Paper for SpeechAntiSpoofingBenchmarks/DF_Arena_1B_V_1: Do Compact SSL Backbones Matter for Audio Deepfake Detection? A Controlled Study with RAPTOR (arXiv:2603.06164, published Mar 6)