Commit 1d4f6ee · 1 parent: e162e78
Committed by AdupaNithinSai
Message: initial

Files changed:
- README copy.md +48 -0
- README.md +49 -0
- app.py +145 -0
- banner.png +0 -0
- data.xlsx +0 -0
- requirements.txt +3 -0
README copy.md
ADDED
@@ -0,0 +1,48 @@
+# Linear-Head Model Leaderboard
+
+This leaderboard compares **Linear-Head classifiers** trained on a variety of **Self-Supervised Learning (SSL)** speech representations from the **S3PRL** library. It reports model performance across multiple spoofing datasets, codecs, and TTS attacks in the context of **audio deepfake detection**.
+
+---
+
+## Frontend: SSL Feature Extractors
+
+The **frontend** of each model is a frozen SSL feature extractor from **S3PRL** that produces speech embeddings.
+These extractors are pre-trained on large-scale audio corpora and capture different aspects of speech acoustics and phonetic content.
+The leaderboard includes models built on several SSL backbones, such as:
+
+* **WavLM-Large**
+* **Wav2Vec 2.0 XLSR (xls_r_300m)**
+* **NPC 960 hr**
+* **HuBERT**, **APC**, and others
+
+Each extractor converts input waveforms into frame-level representations, which serve as the input to downstream spoof detection.
+
+---
+
+## Backend: Classifier Models
+
+On top of these SSL embeddings, four **downstream classifier architectures** are implemented.
+Among them, the **Linear-Head model** serves as a lightweight yet highly effective backend.
+It projects the SSL features into spoof/bonafide decision scores using a single fully connected layer trained with a binary classification loss.
+The simplicity of this approach allows fast adaptation and fair benchmarking across different SSL frontends.
+
+---
+
+## What the Leaderboard Shows
+
+The leaderboard summarizes key results from the evaluations.
+It includes separate sections for:
+
+* **Main Leader Board**: Overall ranking based on average EER or TNR.
+* **Models Performance on Each Data**: Per-dataset or per-attack breakdowns.
+* **TTS Difficulty Level Per Model**: Shows which TTS generators most effectively fool the models.
+* **Performance on Codecs**: Evaluates robustness under various compression schemes.
+* **Best Model per Attack**: Highlights the top-performing model for each individual attack type.
+
+---
+
+## Purpose
+
+The goal of this leaderboard is to provide a transparent, unified view of **how SSL-based frontends and lightweight classifier backends perform in deepfake speech detection tasks**.
+It enables researchers and engineers to identify the most robust combinations of feature extractors and classifier heads, supporting future improvements in generalization, efficiency, and security of speech authentication systems.
+
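The Backend section of this README describes the Linear-Head model as a single fully connected layer over frozen SSL features, trained with a binary classification loss. The following is a minimal sketch of that idea in plain Python, not the repository's training code. Several details are assumptions the README does not state: frame-level features are mean-pooled over time, the loss is sigmoid plus binary cross-entropy, label 1 means bonafide, and the toy random features merely stand in for real S3PRL outputs.

```python
import math
import random


def mean_pool(frames):
    """Average frame-level features (T x D) into one utterance vector (D).
    Mean pooling is an assumption; the README does not say how frames are aggregated."""
    dim = len(frames[0])
    return [sum(frame[d] for frame in frames) / len(frames) for d in range(dim)]


def linear_head(x, weights, bias):
    """Single fully connected layer producing one spoof/bonafide logit."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def bce_loss(prob, label):
    """Binary cross-entropy for one example (assumed loss; label 1 = bonafide)."""
    eps = 1e-12
    return -(label * math.log(prob + eps) + (1 - label) * math.log(1 - prob + eps))


def train_step(x, label, weights, bias, lr=0.1):
    """One SGD step on the head only; the frontend features are treated as fixed."""
    prob = sigmoid(linear_head(x, weights, bias))
    grad = prob - label  # derivative of BCE with respect to the logit
    new_weights = [w - lr * grad * xi for w, xi in zip(weights, x)]
    new_bias = bias - lr * grad
    return new_weights, new_bias, bce_loss(prob, label)


# Toy stand-in for frozen SSL features: 5 frames x 4 dimensions.
random.seed(0)
frames = [[random.gauss(0.5, 0.1) for _ in range(4)] for _ in range(5)]
x = mean_pool(frames)

weights, bias = [0.0] * 4, 0.0
losses = []
for _ in range(20):
    weights, bias, loss = train_step(x, 1, weights, bias)
    losses.append(loss)
```

Because only `weights` and `bias` are updated, swapping the frontend changes nothing but the input dimension, which is what makes this kind of head convenient for comparing SSL backbones.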
README.md
CHANGED
@@ -12,3 +12,52 @@ short_description: Comprehensive comparison of Linear-Head classifiers
 ---
 
 Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+
+# Linear-Head Model Leaderboard
+
+This leaderboard compares **Linear-Head classifiers** trained on a variety of **Self-Supervised Learning (SSL)** speech representations from the **S3PRL** library. It reports model performance across multiple spoofing datasets, codecs, and TTS attacks in the context of **audio deepfake detection**.
+
+---
+
+## Frontend: SSL Feature Extractors
+
+The **frontend** of each model is a frozen SSL feature extractor from **S3PRL** that produces speech embeddings.
+These extractors are pre-trained on large-scale audio corpora and capture different aspects of speech acoustics and phonetic content.
+The leaderboard includes models built on several SSL backbones, such as:
+
+* **WavLM-Large**
+* **Wav2Vec 2.0 XLSR (xls_r_300m)**
+* **NPC 960 hr**
+* **HuBERT**, **APC**, and others
+
+Each extractor converts input waveforms into frame-level representations, which serve as the input to downstream spoof detection.
+
+---
+
+## Backend: Classifier Models
+
+On top of these SSL embeddings, four **downstream classifier architectures** are implemented.
+Among them, the **Linear-Head model** serves as a lightweight yet highly effective backend.
+It projects the SSL features into spoof/bonafide decision scores using a single fully connected layer trained with a binary classification loss.
+The simplicity of this approach allows fast adaptation and fair benchmarking across different SSL frontends.
+
+---
+
+## What the Leaderboard Shows
+
+The leaderboard summarizes key results from the evaluations.
+It includes separate sections for:
+
+* **Main Leader Board**: Overall ranking based on average EER or TNR.
+* **Models Performance on Each Data**: Per-dataset or per-attack breakdowns.
+* **TTS Difficulty Level Per Model**: Shows which TTS generators most effectively fool the models.
+* **Performance on Codecs**: Evaluates robustness under various compression schemes.
+* **Best Model per Attack**: Highlights the top-performing model for each individual attack type.
+
+---
+
+## Purpose
+
+The goal of this leaderboard is to provide a transparent, unified view of **how SSL-based frontends and lightweight classifier backends perform in deepfake speech detection tasks**.
+It enables researchers and engineers to identify the most robust combinations of feature extractors and classifier heads, supporting future improvements in generalization, efficiency, and security of speech authentication systems.
+
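The README ranks models by EER and TNR without defining them. As a hedged illustration, the sketch below computes both from raw detector scores. It assumes that a higher score means "more likely bonafide", that TNR is the fraction of spoofed utterances rejected at a fixed threshold, and that EER is approximated by scanning observed scores as thresholds. The score lists are made-up toy values, not leaderboard data, and the leaderboard's actual evaluation protocol may differ.

```python
def tnr(spoof_scores, threshold):
    """Fraction of spoofed utterances correctly rejected (score below threshold)."""
    rejected = sum(1 for s in spoof_scores if s < threshold)
    return rejected / len(spoof_scores)


def eer(bonafide_scores, spoof_scores):
    """Approximate equal error rate: the operating point where the false
    acceptance rate (spoof accepted) is closest to the false rejection rate
    (bonafide rejected), scanning every observed score as a threshold."""
    best_gap, best_eer = float("inf"), None
    for t in sorted(set(bonafide_scores) | set(spoof_scores)):
        far = sum(1 for s in spoof_scores if s >= t) / len(spoof_scores)
        frr = sum(1 for s in bonafide_scores if s < t) / len(bonafide_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


# Toy scores for illustration only.
bonafide = [0.9, 0.8, 0.7, 0.4]
spoof = [0.6, 0.3, 0.2, 0.1]

print(eer(bonafide, spoof))  # 0.25: one spoof accepted and one bonafide rejected at t = 0.6
print(tnr(spoof, 0.5))       # 0.75: three of four spoofed utterances rejected
```

Lower EER and higher TNR both indicate a better detector, which matches the arrows used in the table descriptions in `app.py`.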
app.py
ADDED
@@ -0,0 +1,145 @@
+# app.py
+# Clean, read-only leaderboard with a banner image and tabbed pages (no buttons/uploads).
+# Improved styling so the explanation text is clearly visible in both light & dark themes.
+#
+# Usage:
+#   pip install -r requirements.txt
+#   python app.py
+#
+# Optional: set env vars:
+#   LB_DATA_PATH   -> path to your Excel (defaults to "data.xlsx")
+#   LB_BANNER_PATH -> path to a top banner image (defaults to "banner.png")
+
+import os
+import pandas as pd
+import gradio as gr
+
+DATA_PATH = os.environ.get("LB_DATA_PATH", "data.xlsx")
+BANNER_PATH = os.environ.get("LB_BANNER_PATH", "banner.png")  # change if your file is different
+
+# ---- Explanations shown above each table ----
+SHEET_DESCRIPTIONS = {
+    "Main Leader Board": (
+        "<b>Overview.</b> This table compares models at a glance across the full deepfake detection suite. "
+        "Use it to spot <i>overall leaders</i> and identify systems that maintain strong performance "
+        "under diverse conditions. Where present, <b>Avg EER (↓)</b> is the primary error metric "
+        "(lower is better). If you see robustness columns such as <b>Codec (↑)</b>, "
+        "<b>TTS (↑)</b>, or <b>Cross-Gen (↑)</b>, higher values indicate stronger generalization."
+    ),
+    "Models Performance on Each Data": (
+        "<b>Per-dataset breakdown.</b> Each column corresponds to a dataset, subset, or attack group "
+        "(e.g., ASVspoof splits, Famous Figures, MLAAD; sometimes A-IDs like A07, A15). "
+        "Interpretation: <b>EER (↓)</b> lower is better; <b>TNR (↑)</b> higher is better. "
+        "Use this table to pinpoint which datasets or attacks are <i>hardest</i> and which are <i>easiest</i> "
+        "for each model, and to diagnose domain-specific weaknesses."
+    ),
+    "TTS Difficultly Level Per Model": (
+        "<b>TTS stress-test.</b> This table shows how challenging different <b>TTS generators</b> are for each "
+        "detection model. For <b>TNR</b>, lower values mean the TTS fools the model more (i.e., harder); "
+        "higher values mean the model rejects that TTS more reliably (easier).<br><br>"
+        "<b>Key finding (from mean TNR across all TTS systems and datasets):</b> "
+        "<span style='white-space:nowrap;'>Hardest: <b>ASVSpoof 5 Eval - A31</b></span> "
+        "with <b>Mean TNR = 0.0221</b>; "
+        "<span style='white-space:nowrap;'>Easiest: <b>tts_models_it_mai_female_vits</b> (MLAAD)</span> "
+        "with <b>Mean TNR = 0.9961</b>."
+    ),
+    "Performance On Codecs": (
+        "<b>Compression robustness.</b> Columns represent codec/bitrate conditions; performance reflects whether "
+        "compression hides or amplifies spoof cues. If metrics are <b>EER (↓)</b>, lower is better; "
+        "if <b>TNR (↑)</b>, higher is better. Use this table to compare which models remain stable when "
+        "audio is encoded for streaming, storage, or telephony."
+    ),
+    # NEW: tab for per-attack winners
+    "Best Model per Attack": (
+        "<b>Per-attack winners.</b> For each attack (e.g., A07, A15, A31), this table lists the "
+        "<i>single best-performing model</i> along with its corresponding <b>TNR (↑)</b>. "
+        "Use it to quickly see which model you should trust most against each specific attack family."
+    ),
+}
+
+def load_sheets(path: str):
+    if not os.path.exists(path):
+        raise FileNotFoundError(
+            f"Excel file not found at '{path}'. "
+            "Place your workbook next to app.py as 'data.xlsx' or set LB_DATA_PATH."
+        )
+    # sheet_name=None reads every sheet in one pass and returns {sheet name: DataFrame},
+    # so "Best Model per Attack" is included automatically if it is present.
+    return pd.read_excel(path, sheet_name=None)
+
+def build_app():
+    sheets = load_sheets(DATA_PATH)
+
+    with gr.Blocks(
+        title="Benchmarking Linear-Head Classifiers Built on S3PRL Embeddings",
+        css="""
+        .gradio-container { max-width: 1200px !important; }
+
+        /* Title */
+        #title h1 {
+            text-align: center;
+            font-size: 2.1em;
+            margin: 0.5rem 0 0.75rem 0;
+            line-height: 1.25;
+        }
+
+        /* Banner */
+        #banner { border-radius: 16px; margin: 0.5rem auto 0.25rem auto; }
+
+        /* Sheet description card: visible in both light and dark themes */
+        [data-theme="light"] .sheet-card {
+            background: #f9fafb;
+            color: #111827;
+            border: 1px solid #e5e7eb;
+            border-radius: 12px;
+            padding: 14px 16px;
+            box-shadow: 0 1px 0 rgba(0,0,0,0.02);
+        }
+        [data-theme="dark"] .sheet-card {
+            background: #111827;
+            color: #e5e7eb;
+            border: 1px solid #374151;
+            border-radius: 12px;
+            padding: 14px 16px;
+            box-shadow: 0 0 0 rgba(0,0,0,0);
+        }
+        .sheet-card p { margin: 0.4rem 0; font-size: 15px; line-height: 1.6; }
+
+        /* Dataframe spacing */
+        .gr-dataframe { margin-top: 10px; }
+        """
+    ) as demo:
+        # --- Banner Image and Title ---
+        if os.path.exists(BANNER_PATH):
+            gr.Image(value=BANNER_PATH, show_label=False, elem_id="banner")
+        gr.Markdown(
+            "<h1>Benchmarking Linear-Head Classifiers Built on S3PRL Embeddings</h1>",
+            elem_id="title",
+        )
+
+        # --- Tabs / Pages ---
+        with gr.Tabs():
+            # Keep workbook order, but ensure our description card appears if we know the sheet name.
+            for sheet_name, df in sheets.items():
+                with gr.TabItem(sheet_name):
+                    desc = SHEET_DESCRIPTIONS.get(
+                        sheet_name,
+                        "This table shows the original sheet data from the analysis workbook."
+                    )
+                    # Explanation card
+                    gr.Markdown(f"<div class='sheet-card'>{desc}</div>")
+                    # Single raw table (read-only)
+                    gr.Dataframe(
+                        value=df,
+                        interactive=False,
+                        wrap=True,
+                        label=None,
+                        elem_id=f"df_{sheet_name.replace(' ', '_').lower()}",
+                    )
+    return demo
+
+demo = build_app()
+
+if __name__ == "__main__":
+    # For a public link while testing locally, use: demo.launch(share=True)
+    demo.launch()
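The TTS description in `app.py` hard-codes a "hardest" and an "easiest" generator by mean TNR. A rough sketch of how such a ranking could be derived is shown below. The table layout (`{model: {tts_system: tnr}}`), the model and TTS names, and all numbers are hypothetical and are not taken from `data.xlsx`; the workbook's real layout may differ.

```python
from statistics import mean

# Hypothetical TNR values per model and TTS system (illustrative only).
tnr_table = {
    "model_a": {"tts_x": 0.10, "tts_y": 0.95, "tts_z": 0.60},
    "model_b": {"tts_x": 0.20, "tts_y": 0.99, "tts_z": 0.40},
}


def mean_tnr_per_tts(table):
    """Average each TTS system's TNR over all models that report it."""
    systems = sorted({tts for row in table.values() for tts in row})
    return {
        tts: mean(row[tts] for row in table.values() if tts in row)
        for tts in systems
    }


def hardest_and_easiest(table):
    """Lowest mean TNR = hardest to detect; highest mean TNR = easiest."""
    means = mean_tnr_per_tts(table)
    hardest = min(means, key=means.get)
    easiest = max(means, key=means.get)
    return hardest, easiest, means


hardest, easiest, means = hardest_and_easiest(tnr_table)
print(hardest, easiest)
```

Computing the finding from the loaded sheet at startup, rather than embedding fixed numbers in the description string, would keep the text in sync if the workbook is updated.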
banner.png
ADDED
data.xlsx
ADDED
Binary file (24.4 kB)
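`app.py` looks up each sheet name from this workbook in `SHEET_DESCRIPTIONS` by exact string match, so a difference in case or spelling silently falls back to the generic description. The README and `app.py` already spell two names differently ("TTS Difficulty" vs "TTS Difficultly", "Performance on Codecs" vs "Performance On Codecs"), and which spelling the workbook uses cannot be seen from this diff. The sketch below shows one possible tolerant lookup; `normalize`, `find_description`, and the `cutoff` value are hypothetical helpers, not part of the commit.

```python
import difflib


def normalize(name):
    """Lowercase and collapse whitespace so case and spacing differences do not matter."""
    return " ".join(name.lower().split())


def find_description(sheet_name, descriptions, default, cutoff=0.9):
    """Return the description whose key best matches sheet_name, or default.
    Tries a normalized exact match first, then a close fuzzy match."""
    by_norm = {normalize(key): text for key, text in descriptions.items()}
    target = normalize(sheet_name)
    if target in by_norm:
        return by_norm[target]
    close = difflib.get_close_matches(target, list(by_norm), n=1, cutoff=cutoff)
    return by_norm[close[0]] if close else default


# Shortened stand-ins for the real description strings.
descriptions = {
    "TTS Difficultly Level Per Model": "tts text",
    "Performance On Codecs": "codec text",
}

print(find_description("Performance on Codecs", descriptions, "fallback"))
print(find_description("TTS Difficulty Level Per Model", descriptions, "fallback"))
print(find_description("Unknown Sheet", descriptions, "fallback"))
```

A high cutoff keeps unrelated sheet names from picking up the wrong description, at the cost of missing larger spelling differences.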
requirements.txt
ADDED
@@ -0,0 +1,3 @@
+gradio>=4.44.0
+pandas>=2.0.0
+openpyxl>=3.1.0