korallll committed on
Commit 129c587 · verified · 1 Parent(s): 8242d1b

Add meta.yaml (HF download-stats query file) to enable download tracking

Files changed (1)
  1. meta.yaml +37 -0
meta.yaml ADDED
@@ -0,0 +1,37 @@
+ system:
+ name: "Nes2Net"
+ slug: "nes2net"
+ description: >
+ wav2vec 2.0 (XLS-R 300M) self-supervised front-end fine-tuned end-to-end with
+ a Nes2Net-X (Nested Res2Net TDNN) back-end for speech anti-spoofing. The
+ nested Res2Net structure couples multi-scale residual groups with
+ squeeze-excitation, replacing dimensionality-reducing necks; mean temporal pooling +
+ linear classifier. Only ~0.51M back-end params. Official Nes2Net-X single
+ checkpoint (ASVspoof2021 LA 1.73% / DF 1.65% EER as reported), trained on
+ ASVspoof2019 LA with RawBoost, FP32, deterministic first-64600-sample window
+ (no random crop).
+ code: "https://github.com/Liu-Tianchi/Nes2Net_ASVspoof_ITW"
+ checkpoint: "https://huggingface.co/SpeechAntiSpoofingBenchmarks/Nes2Net"
+ params_millions: 317.9026
+ paper:
+ arxiv_id: "2504.05657"
+ url: "https://arxiv.org/abs/2504.05657"
+ bibtex: |
+ @article{Nes2Net,
+ author={Liu, Tianchi and Truong, Duc-Tuan and Das, Rohan Kumar and Lee, Kong Aik and Li, Haizhou},
+ journal={IEEE Transactions on Information Forensics and Security},
+ title={Nes2Net: A Lightweight Nested Architecture for Foundation Model Driven Speech Anti-Spoofing},
+ year={2025},
+ volume={20},
+ pages={12005--12018},
+ doi={10.1109/TIFS.2025.3626963}
+ }
+ notes: >
+ XLS-R 300M (wav2vec 2.0) front-end + Nes2Net-X back-end, the single (non-averaged)
+ checkpoint from Liu-Tianchi/Nes2Net_ASVspoof_ITW (Nes_ratio [8,8], SE_ratio [1],
+ pool_func 'mean', dilation 2). Architecture is built from the base xlsr2_300m.pt
+ model config, then every weight is overwritten by the fine-tuned checkpoint.
+ Deterministic first-64600-sample window (no random crop), matching the source
+ data_utils_SSL.py::pad used at eval (default --test_protocol 4sec). score = output
+ logit for class 1 (bona fide); higher = more bona fide. Back-end params ~0.51M;
+ params_millions reports the full deployed model incl. the XLS-R front-end.
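
The notes describe two evaluation conventions: a deterministic window over the first 64600 samples, and a score taken from the class-1 logit. The sketch below illustrates both in plain Python. It is not the code from data_utils_SSL.py. The function names `fixed_window` and `bona_fide_score` are hypothetical, and the repeat-padding of inputs shorter than the window is an assumption, since the notes only state what happens to the start of the signal.

```python
from typing import List, Sequence

MAX_LEN = 64600  # window length in samples, as stated in the notes


def fixed_window(samples: Sequence[float], max_len: int = MAX_LEN) -> List[float]:
    """Return exactly max_len samples, always starting at sample 0."""
    n = len(samples)
    if n == 0:
        raise ValueError("cannot window an empty signal")
    if n >= max_len:
        # Long input: keep the first max_len samples (no random crop).
        return list(samples[:max_len])
    # Short input: repeat the signal until it covers max_len, then truncate.
    # Assumption: repeat-padding rather than zero-padding.
    repeats = max_len // n + 1
    return (list(samples) * repeats)[:max_len]


def bona_fide_score(logits: Sequence[float]) -> float:
    """Pick the class-1 logit; a higher value means more bona fide."""
    return float(logits[1])
```

Because the window always starts at sample 0, repeated runs over the same file see identical input, which is what makes the reported scores reproducible.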