---
library_name: transformers
tags:
- speech-emotion-recognition
license: mit
datasets:
- litagin/Galgame_Speech_SER_16kHz
pipeline_tag: audio-classification
---

Speech emotion recognition model trained on [litagin/Galgame_Speech_SER_16kHz](https://huggingface.co/datasets/litagin/Galgame_Speech_SER_16kHz).

## Usage

```python
from transformers import pipeline

REPO_ID = "litagin/anime_speech_emotion_classification"

# trust_remote_code=True runs the custom code shipped in the model repository.
pipe = pipeline(
    "audio-classification",
    model=REPO_ID,
    feature_extractor=REPO_ID,
    trust_remote_code=True,
    device="cuda",  # use "cpu" if no GPU is available
)

audio_path = "path/to/audio.wav"
result = pipe(audio_path)
print(result)
```

Result:

```python
[{'score': 0.5655683279037476, 'label': 'Angry'},
 {'score': 0.12489483505487442, 'label': 'Disgusted'},
 {'score': 0.11449059844017029, 'label': 'Embarrassed'},
 {'score': 0.06627542525529861, 'label': 'Surprised'},
 {'score': 0.06157735362648964, 'label': 'Sad'},
 {'score': 0.031055787578225136, 'label': 'Neutral'},
 {'score': 0.022820966318249702, 'label': 'Happy'},
 {'score': 0.00791135337203741, 'label': 'Fearful'},
 {'score': 0.00540440296754241, 'label': 'Sexual1'},
 {'score': 8.61035857724346e-07, 'label': 'Sexual2'}]
```

## Label

```json
"id2label": {
  "0": "Angry",
  "1": "Disgusted",
  "2": "Embarrassed",
  "3": "Fearful",
  "4": "Happy",
  "5": "Sad",
  "6": "Surprised",
  "7": "Neutral",
  "8": "Sexual1",
  "9": "Sexual2"
},
```

- `Sexual1`: NSFW erotic voices such as moaning (喘ぎ)
- `Sexual2`: oral sex slurping sound effects (チュパ音)
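
The pipeline returns a list of `{"score", "label"}` dictionaries, so a small helper can reduce it to one decision. The following is a minimal sketch, not part of the model or of `transformers`: the `summarize` function, the `NSFW_LABELS` set, and the `min_score` threshold are hypothetical names introduced here for illustration, and the sample data is a shortened copy of the result shown in the Usage section.

```python
# Labels this model card describes as NSFW (see the label list).
NSFW_LABELS = {"Sexual1", "Sexual2"}


def summarize(result, min_score=0.5):
    """Return (label, score, is_nsfw) for the highest-scoring entry.

    `result` is the list of {"score", "label"} dicts returned by the pipeline.
    The label falls back to "Uncertain" when the top score is below
    `min_score`, an illustrative threshold that the model does not define.
    """
    top = max(result, key=lambda item: item["score"])
    label = top["label"] if top["score"] >= min_score else "Uncertain"
    return label, top["score"], top["label"] in NSFW_LABELS


# Shortened copy of the example result from the Usage section.
example = [
    {"score": 0.5655683279037476, "label": "Angry"},
    {"score": 0.12489483505487442, "label": "Disgusted"},
    {"score": 8.61035857724346e-07, "label": "Sexual2"},
]

print(summarize(example))  # ('Angry', 0.5655683279037476, False)
```

In practice `summarize(pipe(audio_path))` would be called on real pipeline output. The threshold is worth tuning per use case, since the scores sum to roughly 1 across the ten labels and a top score near 0.5 can still leave several plausible emotions.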