Spoof-SUPERB / README.md
AdupaNithinSai

metadata
title: Linearhead Leaderboard
emoji: 🃏
colorFrom: gray
colorTo: yellow
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
license: apache-2.0
short_description: Comprehensive comparison of Linear-Head classifiers

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

πŸŽ™οΈ Linear-Head Model Leaderboard

This leaderboard presents a comprehensive comparison of Linear-Head classifiers trained on a variety of Self-Supervised Learning (SSL) speech representations from the S3PRL library. It highlights model performance across multiple spoofing datasets, codecs, and TTS attacks in the context of audio deepfake detection.


Frontend – SSL Feature Extractors

The frontend of each model is a frozen SSL feature extractor from S3PRL: its weights are not updated while the classifier is trained, and it only supplies speech embeddings. These extractors are pre-trained on large-scale audio corpora and capture different aspects of speech acoustics and phonetic content. The leaderboard includes models built on several SSL backbones, such as:

  • WavLM-Large
  • Wav2Vec 2.0 XLS-R (xls_r_300m)
  • NPC 960 hr
  • HuBERT, APC, and others

Each extractor converts input waveforms into frame-level representations, serving as the foundation for downstream spoof detection.
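
To give a feel for what "frame-level" means, the sketch below counts how many feature frames a waveform yields. It assumes the convolutional feature encoder shared by wav2vec 2.0, HuBERT, and WavLM (seven unpadded 1-D convolutions, total stride of 320 samples) and 16 kHz input. Other extractors in the list, such as NPC and APC, use a different front end, so the numbers do not carry over to them. The function name `num_frames` is illustrative and is not part of S3PRL.

```python
# (kernel_size, stride) of the seven conv layers in the wav2vec 2.0 style
# feature encoder; no padding is applied.
CONV_LAYERS = [(10, 5), (3, 2), (3, 2), (3, 2), (3, 2), (2, 2), (2, 2)]


def num_frames(num_samples: int) -> int:
    """Number of frame-level vectors produced for a waveform of num_samples."""
    length = num_samples
    for kernel, stride in CONV_LAYERS:
        if length < kernel:
            return 0  # waveform too short to produce a single frame
        length = (length - kernel) // stride + 1
    return length


if __name__ == "__main__":
    one_second = 16_000  # assumption: 16 kHz audio
    print(num_frames(one_second))  # roughly one frame every 20 ms
```

Under these assumptions, each embedding frame covers about 25 ms of audio and consecutive frames are 20 ms apart.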


Backend – Classifier Models

On top of these SSL embeddings, four downstream classifier architectures are implemented. Among them, the Linear-Head model serves as a lightweight yet highly effective backend. It projects the SSL features into spoof/bonafide decision scores using a single fully connected layer trained with a binary classification loss. Because the head adds so little capacity of its own, it adapts quickly and keeps the comparison across SSL frontends fair: differences in results mostly reflect the frontend rather than the classifier.
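
A minimal sketch of such a head in plain Python follows. Several details are assumptions rather than facts about this project: frame-level features are mean-pooled over time into one vector per utterance, the loss is sigmoid plus binary cross-entropy, label 1 means spoof, and training uses plain SGD. The names `mean_pool` and `LinearHead` are hypothetical.

```python
import math
from typing import List, Sequence


def mean_pool(frames: Sequence[Sequence[float]]) -> List[float]:
    """Average frame-level features over time (assumed pooling strategy)."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class LinearHead:
    """Single fully connected layer: one embedding in, one spoof logit out."""

    def __init__(self, dim: int) -> None:
        self.weights = [0.0] * dim
        self.bias = 0.0

    def logit(self, embedding: Sequence[float]) -> float:
        return sum(w * x for w, x in zip(self.weights, embedding)) + self.bias

    def prob_spoof(self, embedding: Sequence[float]) -> float:
        return sigmoid(self.logit(embedding))

    def sgd_step(self, embedding: Sequence[float], label: int, lr: float = 0.1) -> float:
        """One SGD step on binary cross-entropy (label 1 = spoof, 0 = bonafide).

        Returns the loss measured before the update.
        """
        p = self.prob_spoof(embedding)
        eps = 1e-12  # guards against log(0)
        loss = -(label * math.log(p + eps) + (1 - label) * math.log(1 - p + eps))
        grad = p - label  # derivative of the loss with respect to the logit
        self.weights = [w - lr * grad * x for w, x in zip(self.weights, embedding)]
        self.bias -= lr * grad
        return loss


if __name__ == "__main__":
    # Made-up frame-level features: 3 frames x 2 dimensions per utterance.
    spoof = mean_pool([[1.0, 0.0], [0.8, 0.2], [1.2, -0.2]])
    bonafide = mean_pool([[-1.0, 0.0], [-0.9, 0.1], [-1.1, -0.1]])
    head = LinearHead(dim=2)
    for _ in range(100):
        head.sgd_step(spoof, 1)
        head.sgd_step(bonafide, 0)
    print(head.prob_spoof(spoof), head.prob_spoof(bonafide))
```

Only the weights and bias of the head are updated here, which mirrors the frozen-frontend setup: the embeddings are treated as fixed inputs.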


What the Leaderboard Shows

The leaderboard summarizes key results from extensive evaluations. It includes separate sections for:

  • Main Leader Board – Overall ranking based on average EER (equal error rate) or TNR (true negative rate).
  • Models Performance on Each Data – Per-dataset or per-attack breakdowns.
  • TTS Difficulty Level Per Model – Shows which TTS generators most effectively fool the models.
  • Performance on Codecs – Evaluates robustness under various compression schemes.
  • Best Model per Attack – Highlights the top-performing model for each individual attack type.
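
The metrics and rankings listed above can be sketched as follows. This is an illustration under stated assumptions, not the leaderboard's actual evaluation code: a higher score means "more likely spoof", bonafide is treated as the negative class for TNR, and EER is approximated by scanning the observed scores as thresholds. All function names, model names, attack names, and numbers are hypothetical placeholders.

```python
from typing import Dict, List, Sequence, Tuple


def error_rates(bonafide: Sequence[float], spoof: Sequence[float],
                threshold: float) -> Tuple[float, float]:
    """Return (FPR, FNR) when scores >= threshold are flagged as spoof."""
    fpr = sum(s >= threshold for s in bonafide) / len(bonafide)
    fnr = sum(s < threshold for s in spoof) / len(spoof)
    return fpr, fnr


def equal_error_rate(bonafide: Sequence[float], spoof: Sequence[float]) -> float:
    """Approximate EER: the operating point where FPR and FNR are closest."""
    thresholds = sorted(set(bonafide) | set(spoof))
    thresholds.append(thresholds[-1] + 1.0)  # a threshold that flags nothing
    fpr, fnr = min((error_rates(bonafide, spoof, t) for t in thresholds),
                   key=lambda rates: abs(rates[0] - rates[1]))
    return (fpr + fnr) / 2


def true_negative_rate(bonafide: Sequence[float], threshold: float) -> float:
    """Fraction of bonafide utterances that are correctly left unflagged."""
    return sum(s < threshold for s in bonafide) / len(bonafide)


def rank_by_average(eer: Dict[str, Dict[str, float]]) -> List[Tuple[str, float]]:
    """Sort models by mean EER across attacks or datasets (lower is better)."""
    averages = {model: sum(v.values()) / len(v) for model, v in eer.items()}
    return sorted(averages.items(), key=lambda item: item[1])


def best_model_per_attack(eer: Dict[str, Dict[str, float]]) -> Dict[str, str]:
    """For each attack, pick the model with the lowest EER."""
    attacks = {attack for v in eer.values() for attack in v}
    return {
        attack: min((m for m in eer if attack in eer[m]),
                    key=lambda m: eer[m][attack])
        for attack in attacks
    }


if __name__ == "__main__":
    # Placeholder scores and EER values, not real leaderboard results.
    bonafide_scores = [0.1, 0.2, 0.3, 0.6]
    spoof_scores = [0.4, 0.7, 0.8, 0.9]
    print(equal_error_rate(bonafide_scores, spoof_scores))
    print(true_negative_rate(bonafide_scores, threshold=0.5))

    toy_eer = {
        "model_a": {"attack_1": 0.10, "attack_2": 0.30},
        "model_b": {"attack_1": 0.20, "attack_2": 0.10},
    }
    print(rank_by_average(toy_eer))
    print(best_model_per_attack(toy_eer))
```

A lower EER is better, while a higher TNR is better, so a ranking by TNR would sort in the opposite direction.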

Purpose

The goal of this leaderboard is to provide a transparent, unified view of how SSL-based frontends and lightweight classifier backends perform in deepfake speech detection tasks. It enables researchers and engineers to identify the most robust combinations of feature extractors and classifier heads, supporting future improvements in generalization, efficiency, and security of speech authentication systems.