Add meta.yaml (HF download-stats query file) to enable download tracking
meta.yaml
ADDED
@@ -0,0 +1,37 @@
+system:
+  name: "Nes2Net"
+  slug: "nes2net"
+  description: >
+    wav2vec 2.0 (XLS-R 300M) self-supervised front-end fine-tuned end-to-end with
+    a Nes2Net-X (Nested Res2Net TDNN) back-end for speech anti-spoofing. The
+    nested Res2Net structure couples multi-scale residual groups with squeeze-
+    excitation, replacing dimensionality-reducing necks; mean temporal pooling +
+    linear classifier. Only ~0.51M back-end params. Official Nes2Net-X single
+    checkpoint (ASVspoof2021 LA 1.73% / DF 1.65% EER as reported), trained on
+    ASVspoof2019 LA with RawBoost, FP32, deterministic first-64600-sample window
+    (no random crop).
+  code: "https://github.com/Liu-Tianchi/Nes2Net_ASVspoof_ITW"
+  checkpoint: "https://huggingface.co/SpeechAntiSpoofingBenchmarks/Nes2Net"
+  params_millions: 317.9026
+  paper:
+    arxiv_id: "2504.05657"
+    url: "https://arxiv.org/abs/2504.05657"
+    bibtex: |
+      @article{Nes2Net,
+        author={Liu, Tianchi and Truong, Duc-Tuan and Das, Rohan Kumar and Lee, Kong Aik and Li, Haizhou},
+        journal={IEEE Transactions on Information Forensics and Security},
+        title={Nes2Net: A Lightweight Nested Architecture for Foundation Model Driven Speech Anti-Spoofing},
+        year={2025},
+        volume={20},
+        pages={12005--12018},
+        doi={10.1109/TIFS.2025.3626963}
+      }
+  notes: >
+    XLS-R 300M (wav2vec 2.0) front-end + Nes2Net-X back-end, the single (non-averaged)
+    checkpoint from Liu-Tianchi/Nes2Net_ASVspoof_ITW (Nes_ratio [8,8], SE_ratio [1],
+    pool_func 'mean', dilation 2). Architecture is built from the base xlsr2_300m.pt
+    model config, then every weight is overwritten by the fine-tuned checkpoint.
+    Deterministic first-64600-sample window (no random crop), matching the source
+    data_utils_SSL.py::pad used at eval (default --test_protocol 4sec). score = output
+    logit for class 1 (bona fide); higher = more bona fide. Back-end params ~0.51M;
+    params_millions reports the full deployed model incl. the XLS-R front-end.
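
For readers who want to reproduce the evaluation conventions described in the notes field, the following is a minimal Python sketch, not the repository's code. It illustrates the two rules the notes state: the input is the first 64600 samples with no random crop, and the score is the class-1 (bona fide) logit. The function names first_window and bonafide_score are hypothetical. The handling of clips shorter than 64600 samples (repeat, then truncate) is an assumption, since the notes only describe the fixed window. The front-end parameter estimate is simple arithmetic on the two figures in the file and inherits the approximation in "~0.51M".

```python
import numpy as np

WINDOW = 64600  # samples, from the notes (roughly 4 s, assuming 16 kHz audio)


def first_window(wave, window=WINDOW):
    """Return a deterministic fixed-length window: the first `window` samples.

    Clips shorter than `window` are repeated and then truncated. That padding
    rule is an assumption made for this sketch; the notes only say that the
    first 64600 samples are used and that no random crop is applied.
    """
    wave = np.asarray(wave, dtype=np.float32).reshape(-1)
    if wave.size == 0:
        raise ValueError("empty waveform")
    if wave.size >= window:
        return wave[:window]
    repeats = window // wave.size + 1
    return np.tile(wave, repeats)[:window]


def bonafide_score(logits):
    """Score convention from the notes: logit of class 1 (bona fide).

    Higher values mean the utterance looks more bona fide.
    """
    logits = np.asarray(logits, dtype=np.float32).reshape(-1)
    return float(logits[1])


# Rough split of the reported parameter count (values in millions).
TOTAL_PARAMS_M = 317.9026   # params_millions: full deployed model
BACKEND_PARAMS_M = 0.51     # approximate back-end size from the notes
FRONTEND_PARAMS_M = TOTAL_PARAMS_M - BACKEND_PARAMS_M  # about 317.39
```

Because the window is always taken from the start of the clip, repeated runs over the same file see the same input, which is what makes the reported scores reproducible.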