# emotion-classification-model

This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on the [dair-ai/emotion dataset](https://huggingface.co/datasets/dair-ai/emotion). It classifies English text into six emotional categories: sadness (0), joy (1), love (2), anger (3), fear (4), surprise (5).
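The numeric ids above map to emotion names as listed on the dataset card; a small dictionary makes the mapping explicit for downstream code:

```python
# Label ids used by dair-ai/emotion and this model.
id2label = {0: "sadness", 1: "joy", 2: "love", 3: "anger", 4: "fear", 5: "surprise"}
label2id = {name: i for i, name in id2label.items()}

print(id2label[3])       # anger
print(label2id["fear"])  # 4
```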
It achieves the following results:

- **Validation Accuracy:** 94.25%
- **Test Accuracy:** 93.2%
## Model Description

This model uses the DistilBERT architecture, a lighter and faster variant of BERT. It has been fine-tuned specifically for emotion classification, making it suitable for tasks such as sentiment analysis, customer feedback analysis, and user emotion detection.

### Key Features

- Efficient and lightweight for deployment.
- High accuracy for emotion detection tasks.
- Pretrained on a diverse corpus and fine-tuned for high specificity to emotions.

## Intended Uses & Limitations

### Intended Uses

- Emotion analysis in text data.
- Sentiment detection in customer reviews, tweets, or user feedback.
- Psychological or behavioral studies analyzing emotional tone in communications.

### Limitations

- May not generalize well to highly domain-specific language.
- Might struggle with sarcasm, irony, or other nuanced forms of language.
- The model is English-specific and may not perform well on non-English text.
## Training and Evaluation Data

### Training Dataset

- **Dataset:** [dair-ai/emotion](https://huggingface.co/datasets/dair-ai/emotion)
- **Training Set Size:** 16,000 examples
- **Dataset Description:** English sentences labeled with six emotional categories: sadness (0), joy (1), love (2), anger (3), fear (4), surprise (5).

### Results

- **Training Time:** ~226 seconds
- **Training Loss:** 0.0520
- **Validation Accuracy:** 94.25%
- **Test Accuracy:** 93.2%
## Training Procedure

### Hyperparameters

- **Learning Rate:** 5e-05
- **Batch Size:** 16 (train and evaluation)
- **Epochs:** 3
- **Seed:** 42
- **Optimizer:** AdamW (betas=(0.9, 0.999), epsilon=1e-08)
- **Learning Rate Scheduler:** Linear
- **Mixed Precision Training:** Native AMP
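With a linear scheduler and no warmup, the learning rate decays from 5e-05 to zero over the total number of optimizer steps (16,000 examples / batch size 16 × 3 epochs = 3,000 steps). A minimal sketch of that schedule (assuming zero warmup steps, which is not stated above):

```python
def linear_lr(step: int, base_lr: float = 5e-5, total_steps: int = 3000) -> float:
    """Linearly decay the learning rate from base_lr to 0 over total_steps."""
    return base_lr * max(0.0, 1.0 - step / total_steps)

print(linear_lr(0))     # start of training: 5e-05
print(linear_lr(1500))  # halfway: 2.5e-05
print(linear_lr(3000))  # final step: 0.0
```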
### Training and Validation Results

| Epoch | Training Loss | Validation Loss | Validation Accuracy |
|-------|---------------|-----------------|---------------------|
| 1     | 0.5383        | 0.1845          | 92.90%              |
| 2     | 0.2254        | 0.1589          | 93.55%              |
| 3     | 0.0520        | 0.1485          | 94.25%              |
### Final Evaluation

- **Validation Loss:** 0.1485
- **Validation Accuracy:** 94.25%
- **Test Loss:** 0.1758
- **Test Accuracy:** 93.2%

### Performance Metrics

- **Training Speed:** ~212 samples/second
- **Evaluation Speed:** ~1144 samples/second
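The reported training speed is consistent with the other numbers: 16,000 training examples × 3 epochs processed in ~226 seconds works out to roughly 212 samples/second. A quick sanity check:

```python
train_examples = 16_000
epochs = 3
training_seconds = 226

samples_per_second = train_examples * epochs / training_seconds
print(round(samples_per_second))  # ~212
```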
## Usage Example

```python
from transformers import pipeline

# Load the fine-tuned model
classifier = pipeline("text-classification", model="Panda0116/emotion-classification-model")

# Example usage
text = "I am so happy to see you!"
emotion = classifier(text)
print(emotion)
```
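The text-classification pipeline returns a list of dicts with `label` and `score` keys (by default only the top prediction; passing `top_k=None` returns scores for all six classes). A sketch of picking the highest-scoring label — the sample output below is illustrative, not an actual model run:

```python
# Illustrative pipeline-style output; real scores come from the model.
predictions = [
    {"label": "joy", "score": 0.998},
    {"label": "love", "score": 0.001},
]

top = max(predictions, key=lambda p: p["score"])
print(top["label"])  # joy
```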