HuBERT Emotion Recognition Model

Fine-tuned HuBERT model for emotion recognition in speech audio.

Model Description

This model classifies speech audio into 5 emotion categories:

  1. Angry/Fearful - Expressions of anger or fear
  2. Happy/Laugh - Joyful or laughing expressions
  3. Neutral/Calm - Neutral or calm speech
  4. Sad/Cry - Expressions of sadness or crying
  5. Surprised/Amazed - Surprised or amazed reactions

Quick Start

from transformers import pipeline

# Load the model
classifier = pipeline("audio-classification", model="YOUR_USERNAME/hubert-emotion-recognition")

# Predict emotion
result = classifier("audio.wav")
print(result)
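
The audio-classification pipeline returns a list of dictionaries, each with a "label" and a "score". As a minimal sketch of how that output might be consumed, the snippet below picks the top entry and rejects low-confidence predictions. The helper name top_prediction, the 0.5 threshold, and the sample scores are illustrative assumptions, and the label strings are assumed to match the category names above.

```python
# Illustrative values only; a real call to classifier("audio.wav") produces the scores.
sample_result = [
    {"score": 0.71, "label": "Happy/Laugh"},
    {"score": 0.14, "label": "Surprised/Amazed"},
    {"score": 0.09, "label": "Neutral/Calm"},
    {"score": 0.04, "label": "Angry/Fearful"},
    {"score": 0.02, "label": "Sad/Cry"},
]


def top_prediction(result, min_score=0.5):
    """Return (label, score) for the best entry, or None if it is below min_score.

    Hypothetical helper, not part of transformers.
    """
    best = max(result, key=lambda item: item["score"])
    if best["score"] < min_score:
        return None
    return best["label"], best["score"]


print(top_prediction(sample_result))
```

Returning None for weak predictions lets the caller decide how to handle uncertain clips instead of silently accepting the highest-scoring label.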

Detailed Usage

from transformers import AutoModelForAudioClassification, Wav2Vec2FeatureExtractor
import torch
import librosa

# Load model and processor
model = AutoModelForAudioClassification.from_pretrained("YOUR_USERNAME/hubert-emotion-recognition")
processor = Wav2Vec2FeatureExtractor.from_pretrained("YOUR_USERNAME/hubert-emotion-recognition")

# Load audio (16kHz)
audio, sr = librosa.load("audio.wav", sr=16000)

# Prepare inputs
inputs = processor(audio, sampling_rate=16000, return_tensors="pt")

# Get predictions
with torch.no_grad():
    outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=1)[0]
    pred_id = torch.argmax(probs).item()

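# Show results
# NOTE: this order is assumed to match the label indices used during training;
# model.config.id2label is the authoritative mapping if it is populated.
emotions = ["Angry/Fearful", "Happy/Laugh", "Neutral/Calm", "Sad/Cry", "Surprised/Amazed"]
print(f"Emotion: {emotions[pred_id]}")
print(f"Confidence: {probs[pred_id].item():.3f}")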
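
The softmax and argmax step above can be illustrated without loading the model. The sketch below uses only the standard library to turn a vector of logits into probabilities and rank all five emotions. The logits are made up for illustration, rank_emotions is a hypothetical helper, and the label order is assumed to match the list used above.

```python
import math

# Assumed label order, copied from the usage example above.
EMOTIONS = ["Angry/Fearful", "Happy/Laugh", "Neutral/Calm", "Sad/Cry", "Surprised/Amazed"]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rank_emotions(logits, labels=EMOTIONS):
    """Return (label, probability) pairs sorted from most to least likely."""
    probs = softmax(logits)
    return sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)


# Made-up logits standing in for outputs.logits[0].tolist()
ranked = rank_emotions([0.2, 2.1, 0.5, -1.0, 1.3])
for label, prob in ranked:
    print(f"{label}: {prob:.3f}")
```

Looking at the full ranking rather than only the argmax can help when two categories score closely.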

Model Details

  • Base Model: HuBERT
  • Task: Audio Classification
  • Sample Rate: 16kHz
  • Max Duration: 3 seconds
  • Framework: PyTorch + Transformers
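
At a 16 kHz sample rate, the 3-second maximum duration corresponds to 48,000 samples. As a rough sketch, a clip could be truncated or zero-padded to that length before being passed to the processor. The helper fix_length is hypothetical, and zero-padding is an assumption: the card does not state how short clips were handled during training.

```python
SAMPLE_RATE = 16_000                     # from Model Details
MAX_SECONDS = 3                          # from Model Details
MAX_SAMPLES = SAMPLE_RATE * MAX_SECONDS  # 48,000 samples


def fix_length(samples, target=MAX_SAMPLES):
    """Truncate or zero-pad a sequence of samples to exactly `target` items.

    Hypothetical helper; zero-padding is an assumption, not the documented
    training setup.
    """
    clipped = list(samples[:target])
    return clipped + [0.0] * (target - len(clipped))


short_clip = [0.1] * 1_000      # shorter than 3 seconds: padded with zeros
long_clip = [0.1] * 100_000     # longer than 3 seconds: truncated
print(len(fix_length(short_clip)), len(fix_length(long_clip)))
```

The function returns a plain list, so an array loaded with librosa would need to be converted back (for example with numpy.asarray) if array operations are required afterwards.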

Training Data

[Describe your training dataset here - name, size, speakers, etc.]

Performance

[Add your evaluation metrics here]

Example format (placeholder values, to be replaced with measured results):

  • Accuracy: 87.3%
  • F1 Score: 85.1%
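
The card does not say how the F1 score is averaged. As a sketch, assuming macro averaging over the five classes, accuracy and F1 could be computed from integer class ids as follows. The toy labels are made up and say nothing about this model's actual performance.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, num_classes=5):
    """Unweighted mean of per-class F1 (macro averaging is an assumption)."""
    scores = []
    for c in range(num_classes):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class with no true or predicted samples contributes 0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / num_classes


# Toy data: class ids 0..4 in the order of the category list.
y_true = [0, 1, 2, 3, 4, 1]
y_pred = [0, 1, 2, 3, 1, 1]
print(f"Accuracy: {accuracy(y_true, y_pred):.3f}")
print(f"Macro F1: {macro_f1(y_true, y_pred):.3f}")
```

Macro averaging weights every emotion equally, which matters if some categories are rarer than others in the evaluation set.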

Limitations

  • Optimized for English speech
  • Works best with clear audio clips of about 3 seconds
  • Performance may vary with background noise
  • Emotion expression varies across cultures
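
Because the model works best on clips of about 3 seconds, one possible way to handle a longer recording is to split it into 3-second windows and classify each window separately. The sketch below only computes the sample ranges. The helper window_bounds and the non-overlapping hop are assumptions, not part of the model's documented usage.

```python
def window_bounds(num_samples, sample_rate=16_000, window_seconds=3.0, hop_seconds=3.0):
    """Return (start, end) sample indices covering a recording in fixed windows.

    Hypothetical helper. The last window may be shorter than the others.
    """
    window = int(window_seconds * sample_rate)
    hop = int(hop_seconds * sample_rate)
    if num_samples <= window:
        return [(0, num_samples)]
    bounds = []
    start = 0
    while start < num_samples:
        end = min(start + window, num_samples)
        bounds.append((start, end))
        if end == num_samples:
            break
        start += hop
    return bounds


# A 7-second recording at 16 kHz has 112,000 samples.
for start, end in window_bounds(112_000):
    print(start, end)
```

Each (start, end) pair can be used to slice the loaded audio array, and the per-window predictions can then be aggregated, for example by majority vote.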

Intended Uses

✅ Call center analytics
✅ Mental health monitoring
✅ Voice assistants
✅ Media analysis
✅ Research in affective computing

License

Apache 2.0

Citation

@misc{hubert_emotion_2024,
  author = {YOUR_NAME},
  title = {HuBERT Emotion Recognition},
  year = {2024},
  publisher = {Hugging Face},
  url = {https://huggingface.co/YOUR_USERNAME/hubert-emotion-recognition}
}