system:
  name: "XLSR-SLS"
  slug: "xlsr-sls"
  description: >
    wav2vec 2.0 (XLS-R 300M) self-supervised front-end with the SLS (Sensitive
    Layer Selection) classifier for audio deepfake detection. SLS gates the
    hidden states of all XLS-R transformer layers, each of which contributes
    distinct discriminative cues, with a per-layer sigmoid attention and sums
    the weighted features. A BN + max-pool + two-layer MLP head then emits a
    2-way log-softmax. Official QiShanZhang/SLSforASVspoof-2021-DF checkpoint
    (model_15, dev EER 1.45%), trained on ASVspoof2019 LA. FP32; deterministic
    window of the first 64600 samples (no random crop).
  code: "https://github.com/QiShanZhang/SLSforASVspoof-2021-DF"
  checkpoint: "https://huggingface.co/SpeechAntiSpoofingBenchmarks/XLSR-SLS"
  params_millions: 340.7900
  paper:
    arxiv_id: "10.1145/3664647.3681345" # no arXiv exists; ACM MM 2024 DOI (per user decision 2026-06-05)
    url: "https://doi.org/10.1145/3664647.3681345"
    bibtex: |
      @inproceedings{zhang2024audio,
        title={Audio Deepfake Detection with Self-Supervised XLS-R and SLS Classifier},
        author={Zhang, Qishan and Wen, Shuangbing and Hu, Tao},
        booktitle={Proceedings of the 32nd ACM International Conference on Multimedia},
        pages={6765--6773},
        year={2024},
        doi={10.1145/3664647.3681345}
      }
  notes: >
    XLS-R 300M (wav2vec 2.0) front-end + SLS (Sensitive Layer Selection) classifier,
    from QiShanZhang/SLSforASVspoof-2021-DF (ACM MM 2024). The architecture is built
    from the base xlsr2_300m.pt model config (shared with the W2V2-AASIST submission),
    then every weight is overwritten by the fine-tuned checkpoint. SLS pools every
    transformer layer's hidden state, gates each with a learned sigmoid attention, and
    fuses them before a small MLP head. Input is a deterministic window of the first
    64600 samples (no random crop); the head's fc1 expects this fixed length.
    Score = log-softmax output for class 1 (bona fide); higher = more bona fide
    (source main.py: batch_score = batch_out[:, 1]).