| # 🧪 Testing Guide for Multilingual Emotion Classifier | |
| This guide explains how to test the `rmtariq/multilingual-emotion-classifier` model with the `test_model.py` script: a quick check, a comprehensive suite, interactive testing, and a speed benchmark. | |
| ## 🚀 Quick Start | |
| ### Installation | |
| ```bash | |
| # Install requirements | |
| pip install -r requirements_testing.txt | |
| # Or install manually | |
| pip install torch transformers numpy pandas scikit-learn | |
| ``` | |
| ### Basic Usage | |
| ```bash | |
| # Quick test (recommended for first-time users) | |
| python test_model.py --test-type quick | |
| # Comprehensive test | |
| python test_model.py --test-type comprehensive | |
| # Interactive testing | |
| python test_model.py --test-type interactive | |
| # Performance benchmark | |
| python test_model.py --test-type benchmark | |
| # Run all tests | |
| python test_model.py --test-type all | |
| ``` | |
| ## 📋 Test Types | |
| ### 1. 🚀 Quick Test | |
| **Purpose**: Fast validation of core functionality | |
| **Duration**: ~30 seconds | |
| **Coverage**: 13 essential test cases (English + Malay) | |
| ```bash | |
| python test_model.py --test-type quick | |
| ``` | |
| **What it tests**: | |
| - ✅ Basic English emotions (6 cases) | |
| - ✅ Basic Malay emotions (4 cases) | |
| - ✅ Previously problematic cases (3 cases) | |
| **Expected Results**: >90% accuracy | |
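
The accuracy figure is the share of test cases whose predicted label matches the expected one. The following is a minimal sketch of that calculation, not the code in `test_model.py`. The names `accuracy`, `keyword_predict`, and `sample_cases` are hypothetical, and `keyword_predict` is a keyword lookup standing in for the real model so the example runs offline.

```python
from typing import Callable, Iterable, Tuple


def accuracy(cases: Iterable[Tuple[str, str]], predict: Callable[[str], str]) -> float:
    """Return the fraction of (text, expected_label) cases that predict() gets right."""
    cases = list(cases)
    if not cases:
        raise ValueError("no test cases supplied")
    correct = sum(1 for text, expected in cases if predict(text) == expected)
    return correct / len(cases)


def keyword_predict(text: str) -> str:
    """Hypothetical stand-in for the model: a keyword lookup, for illustration only."""
    lowered = text.lower()
    if "gembira" in lowered or "happy" in lowered:
        return "happy"
    if "marah" in lowered or "angry" in lowered:
        return "anger"
    return "sadness"


# A few cases taken from this guide's example lists.
sample_cases = [
    ("I am so happy today!", "happy"),
    ("This makes me really angry!", "anger"),
    ("Saya sangat gembira!", "happy"),
    ("Aku marah dengan keadaan ini", "anger"),
]

score = accuracy(sample_cases, keyword_predict)
print(f"Accuracy: {score:.1%}")
```

To score the real model, pass a function that wraps the classifier in place of `keyword_predict`.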
| ### 2. 🔬 Comprehensive Test | |
| **Purpose**: Thorough validation across all categories | |
| **Duration**: ~2 minutes | |
| **Coverage**: 24 test cases across multiple categories | |
| ```bash | |
| python test_model.py --test-type comprehensive | |
| ``` | |
| **Test Categories**: | |
| - **English Basic**: Core English emotion expressions | |
| - **Malay Basic**: Core Malay emotion expressions | |
| - **Malay Fixed Issues**: Previously problematic cases (now fixed) | |
| - **Edge Cases**: Boundary and special cases | |
| **Expected Results**: >85% overall accuracy | |
| ### 3. 🎮 Interactive Test | |
| **Purpose**: Manual testing with custom inputs | |
| **Duration**: User-controlled | |
| **Coverage**: Unlimited custom test cases | |
| ```bash | |
| python test_model.py --test-type interactive | |
| ``` | |
| **Features**: | |
| - Real-time emotion classification | |
| - Confidence scoring | |
| - Emoji visualization | |
| - Easy exit (type 'quit') | |
| **Example Session**: | |
| ``` | |
| 💬 Your text: I am so excited! | |
| 🎭 Result: 😊 happy | |
| 📊 Confidence: 99.8% | |
| 💪 High confidence! | |
| 💬 Your text: Saya gembira! | |
| 🎭 Result: 😊 happy | |
| 📊 Confidence: 99.9% | |
| 💪 High confidence! | |
| ``` | |
| ### 4. ⚡ Benchmark Test | |
| **Purpose**: Performance and speed evaluation | |
| **Duration**: ~1 minute | |
| **Coverage**: 100 predictions for timing analysis | |
| ```bash | |
| python test_model.py --test-type benchmark | |
| ``` | |
| **Metrics Measured**: | |
| - Total processing time | |
| - Average time per prediction | |
| - Predictions per second | |
| - Performance rating (for example, the `SLOW` warning described under Troubleshooting) | |
| **Expected Results**: >5 predictions/second | |
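
The timing metrics listed above can be derived from one timed loop. This is a rough sketch under the assumption that the benchmark simply calls the predictor repeatedly; the function name `benchmark` and the dictionary keys are hypothetical and may not match what `test_model.py` does internally.

```python
import time
from typing import Callable, Dict, Sequence


def benchmark(predict: Callable[[str], object], texts: Sequence[str], runs: int = 100) -> Dict[str, float]:
    """Time `runs` calls to predict(), cycling through the sample texts."""
    if runs <= 0 or not texts:
        raise ValueError("need at least one run and one sample text")
    start = time.perf_counter()
    for i in range(runs):
        predict(texts[i % len(texts)])
    total = time.perf_counter() - start
    return {
        "total_seconds": total,
        "seconds_per_prediction": total / runs,
        # Guard against a zero reading from a very fast stand-in predictor.
        "predictions_per_second": runs / total if total > 0 else float("inf"),
    }


# Stand-in predictor so the sketch runs without the model.
stats = benchmark(lambda text: "happy", ["I am so happy today!", "Saya sangat gembira!"])
print(f"{stats['predictions_per_second']:.1f} predictions/second")
```

With the real model, exclude model loading from the timed region, since a one-time load would otherwise dominate the result.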
| ## 🎯 Supported Emotions | |
| The model classifies text into 6 emotion categories: | |
| | Emotion | Emoji | Description | Example (English) | Example (Malay) | | |
| |---------|-------|-------------|-------------------|-----------------| | |
| | **anger** | 😠 | Frustration, rage | "I'm so angry!" | "Marah betul!" | | |
| | **fear** | 😨 | Anxiety, worry | "I'm scared!" | "Takut sangat!" | | |
| | **happy** | 😊 | Joy, excitement | "I'm so happy!" | "Gembira sangat!" | | |
| | **love** | ❤️ | Affection, care | "I love you!" | "Sayang kamu!" | | |
| | **sadness** | 😢 | Sorrow, grief | "I'm so sad" | "Sedih betul" | | |
| | **surprise** | 😲 | Amazement, shock | "What a surprise!" | "Terkejut betul!" | | |
| ## 🔧 Advanced Usage | |
| ### Custom Model Testing | |
| ```bash | |
| # Test a different model | |
| python test_model.py --model "your-model-name" --test-type quick | |
| # Test local model | |
| python test_model.py --model "./path/to/local/model" --test-type comprehensive | |
| ``` | |
| ### Programmatic Usage | |
| ```python | |
| from test_model import EmotionModelTester | |
| # Initialize tester | |
| tester = EmotionModelTester("rmtariq/multilingual-emotion-classifier") | |
| # Run specific tests | |
| quick_accuracy = tester.quick_test() | |
| comprehensive_accuracy = tester.comprehensive_test() | |
| speed = tester.benchmark_test() | |
| print(f"Quick test accuracy: {quick_accuracy:.1%}") | |
| print(f"Comprehensive accuracy: {comprehensive_accuracy:.1%}") | |
| print(f"Speed: {speed:.1f} predictions/second") | |
| ``` | |
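
If `test_model.py` is not at hand, the model can also be called through the Hugging Face `pipeline` API. The sketch below assumes the model works with the standard `text-classification` pipeline and that its labels match the emotion names in this guide; `load_classifier` and `top_prediction` are hypothetical helper names.

```python
from typing import Callable, Tuple


def load_classifier(model_name: str = "rmtariq/multilingual-emotion-classifier"):
    """Build a text-classification pipeline (downloads the model on first use)."""
    # Imported lazily so top_prediction() can be used without transformers installed.
    from transformers import pipeline
    return pipeline("text-classification", model=model_name)


def top_prediction(classifier: Callable, text: str) -> Tuple[str, float]:
    """Return (label, score) for one text.

    For a single string, a text-classification pipeline returns a list
    holding one dict with 'label' and 'score' keys.
    """
    best = classifier(text)[0]
    return best["label"], float(best["score"])


# Usage (left commented out because it downloads the model):
# classifier = load_classifier()
# label, score = top_prediction(classifier, "Saya gembira!")
# print(f"{label} ({score:.1%})")
```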
| ## 📊 Expected Performance | |
| ### Accuracy Targets | |
| - **Quick Test**: >90% accuracy | |
| - **Comprehensive Test**: >85% accuracy | |
| - **English Performance**: >95% accuracy | |
| - **Malay Performance**: >85% accuracy | |
| ### Speed Targets | |
| - **CPU Performance**: >5 predictions/second | |
| - **GPU Performance**: >20 predictions/second | |
| ### Confidence Levels | |
| - **High Confidence**: >90% (💪) | |
| - **Good Confidence**: 70-90% (👍) | |
| - **Low Confidence**: <70% (⚠️) | |
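
The thresholds above translate into a small lookup. This sketch assumes confidence is expressed as a fraction between 0 and 1 and treats exactly 90% as "Good", following the ranges as written; `confidence_level` is a hypothetical name.

```python
def confidence_level(confidence: float) -> str:
    """Map a confidence score in [0, 1] to the levels used in this guide."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence > 0.90:
        return "💪 High confidence"
    if confidence >= 0.70:
        return "👍 Good confidence"
    return "⚠️ Low confidence"


for value in (0.998, 0.82, 0.41):
    print(f"{value:.1%}: {confidence_level(value)}")
```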
| ## 🐛 Troubleshooting | |
| ### Common Issues | |
| #### 1. Model Loading Errors | |
| ``` | |
| ❌ Error loading model: ... | |
| ``` | |
| **Solutions**: | |
| - Check internet connection | |
| - Verify model name spelling | |
| - Try: `pip install --upgrade transformers` | |
| #### 2. CUDA/GPU Issues | |
| ``` | |
| CUDA out of memory | |
| ``` | |
| **Solutions**: | |
| - The test script automatically falls back to CPU | |
| - Reduce batch size if using custom code | |
| - Use `--device cpu` flag if available | |
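
In custom code, the device can be chosen explicitly. The sketch below returns the integer index that `transformers.pipeline` accepts for its `device` argument (`0` for the first GPU, `-1` for CPU). The helper `pick_device` is hypothetical and is not part of `test_model.py`.

```python
from typing import Optional


def pick_device(cuda_available: Optional[bool] = None) -> int:
    """Return a pipeline device index: 0 for the first GPU, -1 for CPU."""
    if cuda_available is None:
        try:
            import torch  # imported lazily; treated as CPU-only if missing
            cuda_available = torch.cuda.is_available()
        except ImportError:
            cuda_available = False
    return 0 if cuda_available else -1


# Usage with the pipeline API (commented out because it downloads the model):
# from transformers import pipeline
# classifier = pipeline(
#     "text-classification",
#     model="rmtariq/multilingual-emotion-classifier",
#     device=pick_device(),
# )
```

Passing `pick_device(False)` forces CPU, which is one way to work around a `CUDA out of memory` error.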
| #### 3. Slow Performance | |
| ``` | |
| ⚠️ SLOW. Consider optimization. | |
| ``` | |
| **Solutions**: | |
| - Use GPU if available | |
| - Close other applications | |
| - Consider model quantization for production | |
| ### Getting Help | |
| If you encounter issues: | |
| 1. **Check Requirements**: Ensure all dependencies are installed | |
| 2. **Update Libraries**: `pip install --upgrade transformers torch` | |
| 3. **Check Model Status**: Visit [model page](https://huggingface.co/rmtariq/multilingual-emotion-classifier) | |
| 4. **Report Issues**: Create an issue on the repository | |
| ## 🎯 Test Case Examples | |
| ### English Test Cases | |
| ```python | |
| # Basic emotions | |
| "I am so happy today!" # → happy | |
| "This makes me really angry!" # → anger | |
| "I love you so much!" # → love | |
| "I'm scared of spiders" # → fear | |
| "This news makes me sad" # → sadness | |
| "What a surprise!" # → surprise | |
| ``` | |
| ### Malay Test Cases | |
| ```python | |
| # Basic emotions | |
| "Saya sangat gembira!" # → happy | |
| "Aku marah dengan keadaan ini" # → anger | |
| "Aku sayang kamu" # → love | |
| "Saya takut dengan ini" # → fear | |
| "Sedih betul dengan berita" # → sadness | |
| "Terkejut dengan kejadian" # → surprise | |
| # Fixed issues (previously problematic) | |
| "Ini adalah hari jadi terbaik" # → happy (was: anger) | |
| "Terbaik!" # → happy (was: surprise) | |
| "Ini adalah hari yang baik" # → happy (was: anger) | |
| ``` | |
| ## 📈 Performance History | |
| ### Version 2.1 (Current) | |
| - ✅ **Overall Accuracy**: 85.0% | |
| - ✅ **English Performance**: 100% | |
| - ✅ **Malay Performance**: 100% (fixed issues) | |
| - ✅ **Speed**: 5-20 predictions/second | |
| ### Key Improvements | |
| - 🔧 Fixed Malay birthday context classification | |
| - 🔧 Fixed "baik/terbaik" positive expression recognition | |
| - 🔧 Improved confidence scores | |
| - 🔧 Enhanced robustness | |
| ## 🏆 Success Criteria | |
| A successful test run should show: | |
| - ✅ **Quick Test**: >90% accuracy | |
| - ✅ **No Critical Failures**: All basic emotions working | |
| - ✅ **Malay Fixes Verified**: Birthday/positive contexts → happy | |
| - ✅ **Reasonable Speed**: >5 predictions/second | |
| - ✅ **High Confidence**: Most predictions >90% | |
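
The measurable criteria above can be checked in code, for example in a CI job. This is a sketch only: `failed_criteria` is a hypothetical helper, and it interprets "most predictions" as more than half, which is an assumption.

```python
from typing import List


def failed_criteria(quick_accuracy: float, predictions_per_second: float,
                    confidences: List[float]) -> List[str]:
    """Return a description of each unmet criterion (empty list means success)."""
    failures = []
    if quick_accuracy <= 0.90:
        failures.append("quick test accuracy is not above 90%")
    if predictions_per_second <= 5:
        failures.append("speed is not above 5 predictions/second")
    # Assumption: "most" means strictly more than half of the predictions.
    high = sum(1 for c in confidences if c > 0.90)
    if high * 2 <= len(confidences):
        failures.append("fewer than half of the predictions exceed 90% confidence")
    return failures


problems = failed_criteria(0.92, 8.0, [0.99, 0.95, 0.60])
print("All criteria met" if not problems else problems)
```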
| --- | |
| **Model Repository**: https://huggingface.co/rmtariq/multilingual-emotion-classifier | |
| **Author**: rmtariq | |
| **Last Updated**: June 2024 | |