metadata
language:
  - ja
license: mit
tags:
  - audio
  - speech
  - preference
  - anime
library_name: transformers
pipeline_tag: audio-classification

AnimeScore

Try the interactive demo: AnimeScore Demo Space.

A learned scorer for anime-like speech style. Given an audio clip, it returns a scalar score; higher is more anime-like.

This is the official Hugging Face model repository for the paper "AnimeScore: A Preference-Based Dataset and Framework for Evaluating Anime-Like Speech Style".

For more details, please visit our GitHub Repository.

Checkpoint

We release the HuBERT-based model, which achieved the best performance among the backbones we evaluated (pairwise accuracy 82.4%, AUC 0.908).

| File | Size | Notes |
|---|---|---|
| model.safetensors | ~9 MB | Released head weights |
| config.json | | Model config |
| modeling_animescore.py | | Custom modeling code (loaded via trust_remote_code=True) |
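
As a rough illustration of what the pairwise accuracy figure quoted for this checkpoint measures, the sketch below assumes it is the fraction of labelled pairs in which the scorer ranks the human-preferred clip higher. The function name `pairwise_accuracy`, the tuple layout, and the toy scores are hypothetical; the paper's exact evaluation protocol (for example, how ties are handled) may differ.

```python
from typing import Iterable, Tuple


def pairwise_accuracy(pairs: Iterable[Tuple[float, float]]) -> float:
    """Fraction of pairs where the preferred clip gets the higher score.

    Each pair is (score_preferred, score_other). Ties count as incorrect
    here; that is an assumption of this sketch, not the paper's rule.
    """
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one pair")
    correct = sum(1 for preferred, other in pairs if preferred > other)
    return correct / len(pairs)


# Toy scores, invented for illustration only.
toy_pairs = [(1.2, 0.3), (0.1, 0.9), (2.0, -0.5), (0.4, 0.4)]
accuracy = pairwise_accuracy(toy_pairs)
print(f"pairwise accuracy: {accuracy:.2f}")
```

In the toy data two of the four pairs are ordered correctly, one is reversed, and one is a tie.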

How to use

Install the dependencies:

pip install -r requirements.txt

Then load the model and score a clip:

import torch, torchaudio
from transformers import AutoModel

device = "cuda" if torch.cuda.is_available() else "cpu"
model = AutoModel.from_pretrained(
    "spellbrush/animescore",
    trust_remote_code=True,
).eval().to(device)

wav, sr = torchaudio.load("sample.wav")
if wav.size(0) > 1:
    wav = wav.mean(0, keepdim=True)         # mono
if sr != 16000:
    wav = torchaudio.functional.resample(wav, sr, 16000)

with torch.no_grad():
    s = model.score(wav.to(device)).item()
print(f"AnimeScore: {s:.3f}")
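
To compare several clips, one option is to score each and sort by score. The sketch below is independent of the model: `score_fn` is a hypothetical callable that maps a file path to a float (for instance, a small wrapper around the loading, resampling, and `model.score` steps above), and `rank_clips` is an illustrative name, not part of this repository.

```python
from typing import Callable, Iterable, List, Tuple


def rank_clips(
    paths: Iterable[str],
    score_fn: Callable[[str], float],
) -> List[Tuple[str, float]]:
    """Return (path, score) pairs sorted from most to least anime-like."""
    scored = [(path, float(score_fn(path))) for path in paths]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Stand-in scorer with made-up values so the example runs without the model.
fake_scores = {"a.wav": 0.42, "b.wav": 1.75, "c.wav": -0.30}
ranking = rank_clips(fake_scores, fake_scores.__getitem__)
for path, score in ranking:
    print(f"{score:+.3f}  {path}")
```

Passing the scorer as a callable keeps the ranking logic testable without downloading the checkpoint.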

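Pairwise probability that clip A is judged more anime-like than clip B:

# wav_a and wav_b are mono 16 kHz tensors prepared like `wav` above
with torch.no_grad():
    sa = model.score(wav_a.to(device))
    sb = model.score(wav_b.to(device))
p_a_gt_b = torch.sigmoid(sa - sb).item()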
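
The same relation can be checked on plain floats. The sketch below assumes only what the snippet above shows, namely that the preference probability is the logistic sigmoid of the score difference; `preference_probability` is a hypothetical helper name and the scores are made up.

```python
import math


def preference_probability(score_a: float, score_b: float) -> float:
    """Sigmoid of the score difference: P(A is rated more anime-like than B)."""
    diff = score_a - score_b
    # Branch on the sign so math.exp never receives a large positive argument.
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    z = math.exp(diff)
    return z / (1.0 + z)


p_ab = preference_probability(1.5, 0.5)  # A scores one point higher than B
p_ba = preference_probability(0.5, 1.5)  # same clips, arguments swapped
print(f"P(A > B) = {p_ab:.3f}, P(B > A) = {p_ba:.3f}")
```

Equal scores give 0.5, and swapping the arguments gives the complementary probability, so only the score difference matters, not the absolute values.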

CLI: python example_inference.py --ckpt . --wav sample.wav

Alternatively, deploy this directory as a Hugging Face Space (SDK = gradio).

Citation

@inproceedings{park2026animescore,
  title     = {AnimeScore: A Preference-Based Dataset and Framework for
               Evaluating Anime-Like Speech Style},
  author    = {Park, Joonyong and Li, Jerry},
  booktitle = {Interspeech},
  year      = {2026}
}

License

MIT License.