---
license: mit
language:
  - en
pipeline_tag: audio-classification
---

Emotion Recognition from Speech (Deep Learning Model)

This model predicts emotions from short audio recordings using deep learning techniques. It extracts acoustic features (MFCC, ZCR, and RMSE) from the input and returns a primary emotion prediction along with confidence scores for every emotion class.
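To make the output format concrete, here is a minimal sketch of how a model's raw logits could be turned into the primary label plus per-class confidence scores. The function names and the example logits are hypothetical; in practice the logits would come from `model.predict`, and the class order must match the model's output layer:

```python
import numpy as np

# Assumed class order; must match the trained model's output layer.
EMOTIONS = ["happy", "sad", "angry", "fear", "neutral"]

def softmax(logits: np.ndarray) -> np.ndarray:
    z = logits - logits.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def decode_prediction(logits: np.ndarray) -> tuple:
    """Return (primary emotion, {emotion: confidence}) for one prediction."""
    probs = softmax(logits)
    scores = {label: float(p) for label, p in zip(EMOTIONS, probs)}
    primary = EMOTIONS[int(np.argmax(probs))]
    return primary, scores

# Made-up logits standing in for a real model.predict(...) call
primary, scores = decode_prediction(np.array([0.2, 0.1, 2.5, 0.3, 0.4]))
```

Here `primary` would be the top-scoring label and `scores` a dictionary of confidences summing to 1.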

Model Details

  • Architecture: Custom CNN-based Keras model
  • Features Used:
    • MFCC (Mel-Frequency Cepstral Coefficients)
    • ZCR (Zero Crossing Rate)
    • RMSE (Root Mean Square Energy)
  • Framework: TensorFlow / Keras
  • Trained On: Processed speech emotion datasets
  • Output:
    • Primary emotion label
    • Confidence scores for each emotion
  • Emotion Classes:
    • happy, sad, angry, fear, neutral
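As a rough illustration of two of the features above, ZCR and RMSE of a single audio frame can be computed with plain NumPy (MFCCs are usually extracted with a library such as librosa via `librosa.feature.mfcc`). The frame below is a synthetic 440 Hz sine; the frame length and sample rate are made up for the example:

```python
import numpy as np

def zcr(frame: np.ndarray) -> float:
    """Zero Crossing Rate: fraction of adjacent samples with a sign change."""
    signs = np.sign(frame)
    signs[signs == 0] = 1  # treat exact zeros as positive to avoid spurious crossings
    return float(np.mean(signs[:-1] != signs[1:]))

def rmse(frame: np.ndarray) -> float:
    """Root Mean Square Energy of the frame."""
    return float(np.sqrt(np.mean(frame ** 2)))

# Synthetic example: one 2048-sample frame of a 440 Hz sine at 22.05 kHz
sr = 22050
t = np.arange(2048) / sr
frame = np.sin(2 * np.pi * 440 * t)
zcr_val = zcr(frame)    # ~2 crossings per period -> roughly 2 * 440 / sr
rmse_val = rmse(frame)  # a full-scale sine has RMS amplitude ~1/sqrt(2)
```

A real pipeline would compute these per frame across the whole recording and stack them with the MFCCs before feeding the model.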

Evaluation

[Figure: training and validation accuracy curves over 50 epochs]

The accuracy plot demonstrates a clear upward trend for both training and validation datasets over 50 epochs. Initially, the model showed rapid improvement, reaching over 90% accuracy by epoch 15. From epochs 20 to 50, both curves stabilize above 95%, indicating consistent learning with no significant overfitting. By the final epoch, training accuracy approaches 0.99, and validation accuracy mirrors this trend closely, demonstrating excellent generalization capability.

More Details

Visit: https://documentation-fyp.vercel.app/