---
library_name: transformers
tags:
- speech-emotion-recognition
license: mit
datasets:
- litagin/Galgame_Speech_SER_16kHz
pipeline_tag: audio-classification
---
A speech emotion recognition model trained on the [litagin/Galgame_Speech_SER_16kHz](https://huggingface.co/datasets/litagin/Galgame_Speech_SER_16kHz) dataset.
## Usage
```python
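from transformers import pipeline

REPO_ID = "litagin/anime_speech_emotion_classification"

# trust_remote_code=True allows transformers to run code bundled with the repo.
pipe = pipeline(
    "audio-classification",
    model=REPO_ID,
    feature_extractor=REPO_ID,
    trust_remote_code=True,
    device="cuda",  # use "cpu" if no GPU is available
)

audio_path = "path/to/audio.wav"
result = pipe(audio_path)  # list of {"score", "label"} dicts
print(result)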
```
Result:
```python
[{'score': 0.5655683279037476, 'label': 'Angry'},
{'score': 0.12489483505487442, 'label': 'Disgusted'},
{'score': 0.11449059844017029, 'label': 'Embarrassed'},
{'score': 0.06627542525529861, 'label': 'Surprised'},
{'score': 0.06157735362648964, 'label': 'Sad'},
{'score': 0.031055787578225136, 'label': 'Neutral'},
{'score': 0.022820966318249702, 'label': 'Happy'},
{'score': 0.00791135337203741, 'label': 'Fearful'},
{'score': 0.00540440296754241, 'label': 'Sexual1'},
{'score': 8.61035857724346e-07, 'label': 'Sexual2'}]
```
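The result is a list of `{"score", "label"}` dictionaries, so picking a single emotion is a small post-processing step. The following is a minimal sketch, not part of the model's API: `top_label` and its `min_score` threshold are hypothetical names introduced here for illustration, and the sample data is copied (truncated) from the result above.

```python
# Sample output copied from the result above, truncated to three entries.
result = [
    {"score": 0.5655683279037476, "label": "Angry"},
    {"score": 0.12489483505487442, "label": "Disgusted"},
    {"score": 0.11449059844017029, "label": "Embarrassed"},
]


def top_label(predictions, min_score=0.5):
    """Return the highest-scoring label, or None if its score is below min_score.

    `min_score` is a hypothetical threshold chosen for illustration; tune it
    for your own data.
    """
    best = max(predictions, key=lambda p: p["score"])
    return best["label"] if best["score"] >= min_score else None


print(top_label(result))
```

Returning `None` for low-confidence predictions lets the caller decide how to handle ambiguous clips instead of forcing a label.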
## Label
```json
"id2label": {
"0": "Angry",
"1": "Disgusted",
"2": "Embarrassed",
"3": "Fearful",
"4": "Happy",
"5": "Sad",
"6": "Surprised",
"7": "Neutral",
"8": "Sexual1", # NSFW erotic voices such as moaning (喘ぎ)
"9": "Sexual2" # Blowjob / oral slurping SFX (チュパ音)
},
```
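If you need to look up class ids by name, or want to treat the two `Sexual` classes as a single NSFW signal, the mapping above can be used directly. This is only a sketch under assumptions: `label2id`, `NSFW_LABELS`, and `nsfw_score` are hypothetical helpers defined here, not something the model or `transformers` provides, and grouping the two classes together is an illustrative choice.

```python
# Label map copied from the "id2label" block above (keys are strings there).
id2label = {
    "0": "Angry",
    "1": "Disgusted",
    "2": "Embarrassed",
    "3": "Fearful",
    "4": "Happy",
    "5": "Sad",
    "6": "Surprised",
    "7": "Neutral",
    "8": "Sexual1",
    "9": "Sexual2",
}

# Reverse lookup: label name -> integer class id.
label2id = {label: int(idx) for idx, label in id2label.items()}

# Hypothetical grouping: treat both "Sexual" classes as NSFW.
NSFW_LABELS = {"Sexual1", "Sexual2"}


def nsfw_score(predictions):
    """Sum the scores of all NSFW labels in a pipeline-style result list."""
    return sum(p["score"] for p in predictions if p["label"] in NSFW_LABELS)


# A few entries copied from the example result in the Usage section.
sample = [
    {"score": 0.5655683279037476, "label": "Angry"},
    {"score": 0.00540440296754241, "label": "Sexual1"},
    {"score": 8.61035857724346e-07, "label": "Sexual2"},
]

print(label2id["Neutral"])
print(nsfw_score(sample))
```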