Model Card: Spectra-AASIST (anti-spoofing / bonafide vs spoof)

Spectra-AASIST is a model for speech spoofing detection (binary classification: bonafide vs spoof) from raw audio waveforms. Architecture: SSL encoder (Wav2Vec2) → MLP projection → AASIST 2-class classifier.

  • Input: waveform (float32), shape (batch, num_samples) (typically 16 kHz).
  • Output: logits of shape (batch, 2), where index 0 = spoof, index 1 = bonafide.

On first run, the model will automatically download the SSL encoder facebook/wav2vec2-xls-r-300m via transformers.
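Since the model returns raw 2-class logits rather than probabilities, a softmax can be applied to interpret them. A minimal sketch (the dummy `logits` tensor below stands in for a real model output of shape `(batch, 2)`, following the index convention above):

```python
import torch

# Dummy logits standing in for a model output of shape (batch, 2);
# index 0 = spoof, index 1 = bonafide, per the I/O spec above.
logits = torch.tensor([[-2.3, 1.7]])

probs = torch.softmax(logits, dim=-1)  # (batch, 2), each row sums to 1
p_spoof = probs[0, 0].item()
p_bonafide = probs[0, 1].item()
label = "bonafide" if p_bonafide >= p_spoof else "spoof"
```

Note that argmax over the softmax is equivalent to thresholding `logit_bonafide - logit_spoof` at 0; the repo's `classify()` instead thresholds `logit_bonafide` alone (see below).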

Evaluation Results

| Model | ASVspoof19 LA | ASVspoof21 LA | ASVspoof21 DF | ASVspoof5 | ADD2022 | In-the-Wild | AD2R1 | AD2R2 | AD3R1 | AD3R2 |
|---|---|---|---|---|---|---|---|---|---|---|
| Res2TCNGuard | 7.487 | 19.130 | 19.883 | 37.620 | 49.538 | 49.246 | 34.683 | 35.343 | 48.051 | 39.558 |
| AASIST3 | 27.585 | 37.407 | 33.099 | 41.001 | 47.192 | 39.626 | 36.581 | 37.351 | 41.333 | 44.278 |
| XSLS | 0.231 | 7.714 | 4.220 | 17.688 | 33.951 | 7.453 | 14.386 | 15.743 | 19.368 | 21.095 |
| TCM-ADD | 0.152 | 6.655 | 3.444 | 19.505 | 35.252 | 7.767 | 16.951 | 17.688 | 21.913 | 18.627 |
| DF Arena 1B | 43.793 | 40.137 | 42.994 | 35.333 | 42.139 | 17.598 | 12.442 | 13.292 | 33.381 | 43.420 |
| Spectra-0 | 0.181 | 6.475 | 5.410 | 14.426 | 14.716 | 1.026 | 1.578 | 2.372 | 6.535 | 15.154 |
| Spectra-AASIST | 0.159 | 5.164 | 2.568 | 14.056 | 15.205 | 1.461 | 0.939 | 1.802 | 6.427 | 12.968 |
| Spectra-AASIST3 | 0.723 | 4.506 | 1.998 | 13.820 | 15.187 | 0.961 | 0.727 | 1.806 | 6.502 | 14.481 |

Quickstart

Clone from Hugging Face

This repository is hosted on Hugging Face Hub: https://huggingface.co/lab260/spectra_aasist.

git lfs install
git clone https://huggingface.co/lab260/spectra_aasist
cd spectra_aasist

Install dependencies

pip install -U torch torchaudio transformers huggingface_hub safetensors soundfile

Single-file inference (example preprocessing)

import random
import torch
import torchaudio
import soundfile as sf

from model import spectra_aasist


def pad_random(x: torch.Tensor, max_len: int = 64600) -> torch.Tensor:
    # x: (num_samples,) or (1, num_samples)
    if x.ndim > 1:
        x = x.squeeze()
    x_len = x.shape[0]
    if x_len >= max_len:
        start = random.randint(0, x_len - max_len)
        return x[start:start + max_len]
    num_repeats = int(max_len / x_len) + 1
    return x.repeat(num_repeats)[:max_len]


def load_audio_mono(path: str) -> torch.Tensor:
    audio, sr = sf.read(path, dtype="float32")
    audio = torch.from_numpy(audio)
    if audio.ndim > 1:
        # (num_samples, channels) -> mono
        audio = audio.mean(dim=1)
    if sr != 16000:
        audio = torchaudio.functional.resample(audio, sr, 16000)
    return audio


device = "cuda" if torch.cuda.is_available() else "cpu"
model = spectra_aasist.from_pretrained(pretrained_model_name_or_path=".").eval().to(device)

audio = load_audio_mono("path/to/audio.wav")
audio = torchaudio.functional.preemphasis(audio.unsqueeze(0))  # (1, T)
audio = pad_random(audio.squeeze(0), 64600).unsqueeze(0)       # (1, 64600)

with torch.inference_mode():
    logits = model(audio.to(device))  # (1, 2)
    score_spoof = logits[0, 0].item()
    score_bonafide = logits[0, 1].item()

print({"score_bonafide": score_bonafide, "score_spoof": score_spoof})

Threshold-based classification (and how to tune it)

In model.py, the SpectraAASIST class provides a classify() method whose default threshold was chosen as the “optimal” value for the original evaluation setting:

  • Default threshold: -1.140625 (it thresholds logit_bonafide = logits[:, 1])
  • Note: this threshold may not be optimal on a different dataset/domain. It’s recommended to tune the threshold on your dataset using EER (Equal Error Rate) or a target FAR/FRR.

Example:

with torch.inference_mode():
    pred = model.classify(audio.to(device), threshold=-1.140625)  # 1=bonafide, 0=spoof

Tuning the threshold via EER (typical workflow)

  1. Run the model on a labeled set and collect the bonafide-logit scores (logits[:, 1]) for both classes.

  2. Compute the EER and the threshold at which FAR = FRR, then pass that threshold to classify().
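The steps above can be sketched with NumPy. This is a minimal, brute-force EER computation over all observed scores (it assumes higher scores indicate bonafide, matching the logit_bonafide convention; the array names are illustrative):

```python
import numpy as np


def compute_eer(bonafide_scores: np.ndarray, spoof_scores: np.ndarray):
    """Return (eer, threshold), where threshold is the score at which
    FAR and FRR are closest. Higher scores should mean bonafide."""
    # Candidate thresholds: every observed score.
    thresholds = np.sort(np.concatenate([bonafide_scores, spoof_scores]))
    # FRR: fraction of bonafide samples scored below the threshold (rejected).
    frr = np.array([(bonafide_scores < t).mean() for t in thresholds])
    # FAR: fraction of spoof samples scored at/above the threshold (accepted).
    far = np.array([(spoof_scores >= t).mean() for t in thresholds])
    idx = int(np.argmin(np.abs(far - frr)))
    eer = (far[idx] + frr[idx]) / 2
    return float(eer), float(thresholds[idx])
```

Run compute_eer on your development set's scores and use the returned threshold with classify(). For large score sets, a sorted-array implementation (or sklearn.metrics.roc_curve) avoids the quadratic loop.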

Limitations and notes

  • This is a pre-release model.
  • Significantly stronger models are planned for Q3–Q4 2026 — stay tuned.

License

MIT (see the license field in the model repo header).

Contacts

  • TG channel: https://t.me/korallll_ai
  • Email: k.n.borodin@mtuci.ru
  • Website: https://lab260.ru/
