HuBERT Emotion Recognition Model

Fine-tuned HuBERT model for emotion recognition in speech audio.

Model Description

This model classifies speech audio into 5 emotion categories:

Angry/Fearful - Expressions of anger or fear
Happy/Laugh - Joyful or laughing expressions
Neutral/Calm - Neutral or calm speech
Sad/Cry - Expressions of sadness or crying
Surprised/Amazed - Surprised or amazed reactions

Quick Start

from transformers import pipeline

# Load the model
classifier = pipeline("audio-classification", model="YOUR_USERNAME/hubert-emotion-recognition")

# Predict emotion; returns a list of {"label": ..., "score": ...} dicts, highest score first
result = classifier("audio.wav")
print(result)
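
The pipeline call returns a list of dictionaries, each with a `label` and a `score`. The sketch below shows one way to pick the top prediction and discard low-confidence results. The helper name `top_emotion` and the 0.5 threshold are illustrative assumptions, not part of this model or of Transformers, and the label strings are assumed to match the five categories listed in Model Description.

```python
def top_emotion(results, min_score=0.5):
    """Return (label, score) for the best prediction, or None if below min_score.

    `results` is expected to look like audio-classification pipeline output:
    a list of {"label": str, "score": float} dictionaries.
    """
    if not results:
        return None
    best = max(results, key=lambda r: r["score"])
    if best["score"] < min_score:
        return None
    return best["label"], best["score"]


# Hypothetical output, hard-coded so the sketch runs without downloading the model.
example = [
    {"label": "Happy/Laugh", "score": 0.81},
    {"label": "Neutral/Calm", "score": 0.12},
    {"label": "Surprised/Amazed", "score": 0.07},
]
print(top_emotion(example))
```

With the classifier loaded as in Quick Start, `top_emotion(classifier("audio.wav"))` would apply the same check to real predictions.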

Detailed Usage

from transformers import AutoModelForAudioClassification, Wav2Vec2FeatureExtractor
import torch
import librosa

# Load model and processor
model = AutoModelForAudioClassification.from_pretrained("YOUR_USERNAME/hubert-emotion-recognition")
processor = Wav2Vec2FeatureExtractor.from_pretrained("YOUR_USERNAME/hubert-emotion-recognition")

# Load audio, resampled to 16 kHz
audio, sr = librosa.load("audio.wav", sr=16000)

# Prepare inputs
inputs = processor(audio, sampling_rate=sr, return_tensors="pt")

# Get predictions (no gradients are needed for inference)
with torch.no_grad():
    outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=-1)[0]
    pred_id = torch.argmax(probs).item()

# Show results
emotions = ["Angry/Fearful", "Happy/Laugh", "Neutral/Calm", "Sad/Cry", "Surprised/Amazed"]
print(f"Emotion: {emotions[pred_id]}")
print(f"Confidence: {probs[pred_id].item():.3f}")
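
The snippet above reports only the most likely class. To inspect the full distribution, the logits can be turned into probabilities and sorted. This is a minimal sketch in plain Python. The helper `rank_emotions` is hypothetical, the logits are made up for illustration, and the label order is assumed to be the one used in the `emotions` list above.

```python
import math

EMOTIONS = ["Angry/Fearful", "Happy/Laugh", "Neutral/Calm", "Sad/Cry", "Surprised/Amazed"]


def rank_emotions(logits, labels=EMOTIONS):
    """Apply softmax to raw logits and return (label, probability) pairs, most likely first."""
    if len(logits) != len(labels):
        raise ValueError("expected one logit per label")
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)


# Made-up logits, for illustration only.
for label, prob in rank_emotions([0.2, 2.1, 1.3, -0.5, 0.0]):
    print(f"{label}: {prob:.3f}")
```

With the model loaded as above, `rank_emotions(outputs.logits[0].tolist())` would supply real logits.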

Model Details

Base Model: HuBERT
Task: Audio Classification
Sample Rate: 16 kHz
Max Duration: 3 seconds
Framework: PyTorch + Transformers
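
At 16 kHz, a 3-second clip is 48,000 samples. The card does not say how longer recordings should be handled, so the following is only one possible approach: split the clip into consecutive, non-overlapping windows and classify each one separately. The helper `window_bounds` is a hypothetical name, and the non-overlapping scheme is an assumption.

```python
SAMPLE_RATE = 16_000  # Hz, from Model Details
MAX_SECONDS = 3.0     # seconds, from Model Details


def window_bounds(num_samples, sample_rate=SAMPLE_RATE, max_seconds=MAX_SECONDS):
    """Return (start, end) sample indices splitting a clip into windows of at most max_seconds."""
    window = int(sample_rate * max_seconds)
    if window <= 0:
        raise ValueError("window length must be positive")
    return [
        (start, min(start + window, num_samples))
        for start in range(0, num_samples, window)
    ]


# A 7-second clip at 16 kHz has 112,000 samples.
print(window_bounds(7 * SAMPLE_RATE))
# [(0, 48000), (48000, 96000), (96000, 112000)]
```

Each pair can slice the waveform directly, for example `audio[start:end]` on the array returned by `librosa.load`. The last window may be shorter than 3 seconds.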

Training Data

[Describe your training dataset here - name, size, speakers, etc.]

Performance

[Add your evaluation metrics here]

Example format (illustrative values):

Accuracy: 87.3%
F1 Score: 85.1%
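
For reference, accuracy and F1 can be computed from reference and predicted labels as sketched below. The card does not state which F1 average is intended, so macro-averaging (the unweighted mean of per-class F1) is an assumption here, and the labels in the example are made up rather than taken from any evaluation of this model.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every class seen in either list."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    scores = []
    for cls in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        # F1 = 2*TP / (2*TP + FP + FN); the denominator is positive for any seen class.
        scores.append(2 * tp / (2 * tp + fp + fn))
    return sum(scores) / len(scores)


# Made-up labels, for illustration only.
y_true = ["Happy/Laugh", "Sad/Cry", "Sad/Cry", "Neutral/Calm"]
y_pred = ["Happy/Laugh", "Sad/Cry", "Neutral/Calm", "Neutral/Calm"]
print(f"Accuracy: {accuracy(y_true, y_pred):.3f}")
print(f"Macro F1: {macro_f1(y_true, y_pred):.3f}")
```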

Limitations

Optimized for English speech
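Works best with clear audio clips of about 3 seconds, the maximum duration listed under Model Details
Performance may degrade with background noise
Emotion expression varies across cultures, so predictions may not transfer between them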

Intended Uses

✅ Call center analytics
✅ Mental health monitoring
✅ Voice assistants
✅ Media analysis
✅ Research in affective computing

License

Apache 2.0

Citation

@misc{hubert_emotion_2024,
  author = {YOUR_NAME},
  title = {HuBERT Emotion Recognition},
  year = {2024},
  publisher = {Hugging Face},
  url = {https://huggingface.co/YOUR_USERNAME/hubert-emotion-recognition}
}

Safetensors

Model size: 94.6M params
Tensor type: F32
