---
library_name: transformers
tags:
- speech-emotion-recognition
license: mit
datasets:
- litagin/Galgame_Speech_SER_16kHz
pipeline_tag: audio-classification
---

A speech emotion recognition model trained on [litagin/Galgame_Speech_SER_16kHz](https://huggingface.co/datasets/litagin/Galgame_Speech_SER_16kHz). It classifies an audio clip into one of 10 emotion labels.

## Usage

```python
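import torch
from transformers import pipeline

REPO_ID = "litagin/anime_speech_emotion_classification"

# Use the GPU when one is available; otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

pipe = pipeline(
    "audio-classification",
    model=REPO_ID,
    feature_extractor=REPO_ID,
    trust_remote_code=True,  # the repo ships custom model code
    device=device,
)

# Path to the audio file to classify.
audio_path = "path/to/audio.wav"
result = pipe(audio_path)
print(result)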
```

Example output:
```python
[{'score': 0.5655683279037476, 'label': 'Angry'},
 {'score': 0.12489483505487442, 'label': 'Disgusted'},
 {'score': 0.11449059844017029, 'label': 'Embarrassed'},
 {'score': 0.06627542525529861, 'label': 'Surprised'},
 {'score': 0.06157735362648964, 'label': 'Sad'},
 {'score': 0.031055787578225136, 'label': 'Neutral'},
 {'score': 0.022820966318249702, 'label': 'Happy'},
 {'score': 0.00791135337203741, 'label': 'Fearful'},
 {'score': 0.00540440296754241, 'label': 'Sexual1'},
 {'score': 8.61035857724346e-07, 'label': 'Sexual2'}]
```
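
The pipeline returns a list of `{"score", "label"}` dicts like the one above. As a minimal sketch of post-processing, the snippet below picks the highest-scoring label and returns `None` when the model is not confident enough. The helper name `top_prediction` and the `min_score` threshold of 0.5 are illustrative assumptions, not part of this repository.

```python
def top_prediction(results, min_score=0.5):
    """Return the best label, or None if its score is below min_score.

    `results` is a list of {"score": float, "label": str} dicts,
    in the shape returned by the audio-classification pipeline.
    `min_score` is a hypothetical confidence threshold.
    """
    if not results:
        return None
    best = max(results, key=lambda r: r["score"])
    return best["label"] if best["score"] >= min_score else None


# First three entries of the example output above.
sample = [
    {"score": 0.5655683279037476, "label": "Angry"},
    {"score": 0.12489483505487442, "label": "Disgusted"},
    {"score": 0.11449059844017029, "label": "Embarrassed"},
]
print(top_prediction(sample))  # prints "Angry"
```

Raising `min_score` trades coverage for precision; a suitable value depends on the application.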

## Labels

```json
{
  "id2label": {
    "0": "Angry",
    "1": "Disgusted",
    "2": "Embarrassed",
    "3": "Fearful",
    "4": "Happy",
    "5": "Sad",
    "6": "Surprised",
    "7": "Neutral",
    "8": "Sexual1",
    "9": "Sexual2"
  }
}
```

`Sexual1` covers NSFW erotic voices such as moaning (喘ぎ). `Sexual2` covers oral/slurping sound effects (チュパ音).
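
For applications that should ignore the two sexual classes, one possible approach is to drop them from the pipeline output and renormalize the remaining scores. The sketch below assumes that both `Sexual1` and `Sexual2` should be treated as NSFW. The helper `drop_nsfw` and the sample scores are made up for illustration and are not part of this repository.

```python
# Label ids copied from the id2label mapping above.
ID2LABEL = {
    0: "Angry", 1: "Disgusted", 2: "Embarrassed", 3: "Fearful", 4: "Happy",
    5: "Sad", 6: "Surprised", 7: "Neutral", 8: "Sexual1", 9: "Sexual2",
}
# Inverse mapping, handy for looking up a class index by name.
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}

# Assumption: both sexual classes count as NSFW for filtering purposes.
NSFW_LABELS = {"Sexual1", "Sexual2"}


def drop_nsfw(results):
    """Remove NSFW labels and rescale the remaining scores to sum to 1."""
    kept = [r for r in results if r["label"] not in NSFW_LABELS]
    total = sum(r["score"] for r in kept)
    if total == 0:
        return kept  # nothing to rescale
    return [{"label": r["label"], "score": r["score"] / total} for r in kept]


# Made-up scores, in the shape returned by the pipeline.
sample = [
    {"score": 0.6, "label": "Angry"},
    {"score": 0.2, "label": "Sexual1"},
    {"score": 0.2, "label": "Neutral"},
]
print(drop_nsfw(sample))
```

Renormalizing is a design choice: it keeps the filtered scores comparable to probabilities, but it hides how much mass the model assigned to the removed classes.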