# 🧪 Testing Guide for Multilingual Emotion Classifier

This guide explains how to test the `rmtariq/multilingual-emotion-classifier` model with the `test_model.py` script. It covers a quick check, a comprehensive suite, interactive testing, and a speed benchmark.

## 🚀 Quick Start

### Installation
```bash
# Install requirements
pip install -r requirements_testing.txt

# Or install manually
pip install torch transformers numpy pandas scikit-learn
```

### Basic Usage
```bash
# Quick test (recommended for first-time users)
python test_model.py --test-type quick

# Comprehensive test
python test_model.py --test-type comprehensive

# Interactive testing
python test_model.py --test-type interactive

# Performance benchmark
python test_model.py --test-type benchmark

# Run all tests
python test_model.py --test-type all
```

## 📋 Test Types

### 1. 🚀 Quick Test
**Purpose**: Fast validation of core functionality  
**Duration**: ~30 seconds  
**Coverage**: 13 essential test cases (English + Malay)

```bash
python test_model.py --test-type quick
```

**What it tests**:
- ✅ Basic English emotions (6 cases)
- ✅ Basic Malay emotions (4 cases)  
- ✅ Previously problematic cases (3 cases)

**Expected Results**: >90% accuracy

### 2. 🔬 Comprehensive Test
**Purpose**: Thorough validation across all categories  
**Duration**: ~2 minutes  
**Coverage**: 24 test cases across multiple categories

```bash
python test_model.py --test-type comprehensive
```

**Test Categories**:
- **English Basic**: Core English emotion expressions
- **Malay Basic**: Core Malay emotion expressions
- **Malay Fixed Issues**: Previously problematic cases (now fixed)
- **Edge Cases**: Boundary and special cases

**Expected Results**: >85% overall accuracy
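
To illustrate how per-category and overall accuracy can be derived from labelled cases, here is a minimal sketch. It is not taken from `test_model.py`: `SAMPLE_CASES`, `accuracy_by_category`, and `stub_predict` are hypothetical names, the four cases are a small sample rather than the real 24-case suite, and the stub stands in for the actual model.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# (category, text, expected_label) triples. A tiny hypothetical sample,
# not the script's real test suite.
SAMPLE_CASES: List[Tuple[str, str, str]] = [
    ("English Basic", "I am so happy today!", "happy"),
    ("English Basic", "This makes me really angry!", "anger"),
    ("Malay Basic", "Saya sangat gembira!", "happy"),
    ("Malay Fixed Issues", "Terbaik!", "happy"),
]


def accuracy_by_category(
    cases: List[Tuple[str, str, str]],
    predict: Callable[[str], str],
) -> Dict[str, float]:
    """Return the fraction of correct predictions per category and overall."""
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for category, text, expected in cases:
        hit = predict(text) == expected
        for key in (category, "overall"):
            total[key] += 1
            correct[key] += int(hit)
    return {key: correct[key] / total[key] for key in total}


def stub_predict(text: str) -> str:
    """Placeholder classifier (assumption): always answers 'happy'."""
    return "happy"


report = accuracy_by_category(SAMPLE_CASES, stub_predict)
for name, acc in report.items():
    print(f"{name}: {acc:.1%}")
```

Replacing `stub_predict` with a function that calls the real model would let the same loop compare each category against the targets in this guide.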

### 3. 🎮 Interactive Test
**Purpose**: Manual testing with custom inputs  
**Duration**: User-controlled  
**Coverage**: Unlimited custom test cases

```bash
python test_model.py --test-type interactive
```

**Features**:
- Real-time emotion classification
- Confidence scoring
- Emoji visualization
- Easy exit (type 'quit')

**Example Session**:
```
💬 Your text: I am so excited!
🎭 Result: 😊 happy
📊 Confidence: 99.8%
💪 High confidence!

💬 Your text: Saya gembira!
🎭 Result: 😊 happy
📊 Confidence: 99.9%
💪 High confidence!
```

### 4. ⚡ Benchmark Test
**Purpose**: Performance and speed evaluation  
**Duration**: ~1 minute  
**Coverage**: 100 predictions for timing analysis

```bash
python test_model.py --test-type benchmark
```

**Metrics Measured**:
- Total processing time
- Average time per prediction
- Predictions per second
- Performance classification

**Expected Results**: >5 predictions/second
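
The first three metrics above follow directly from one timed loop. The sketch below shows one way to compute them. It is an illustration under assumptions, not the script's implementation: `benchmark` and `stub_predict` are hypothetical names, and the stub sleeps briefly in place of a real model call.

```python
import time
from typing import Callable, Dict


def benchmark(predict: Callable[[str], str], text: str, n: int = 100) -> Dict[str, float]:
    """Time n sequential predictions and derive the summary metrics."""
    start = time.perf_counter()
    for _ in range(n):
        predict(text)
    total = time.perf_counter() - start
    return {
        "total_seconds": total,
        "seconds_per_prediction": total / n,
        "predictions_per_second": n / total if total > 0 else float("inf"),
    }


def stub_predict(text: str) -> str:
    """Placeholder classifier (assumption): simulates a short inference delay."""
    time.sleep(0.001)
    return "happy"


stats = benchmark(stub_predict, "I am so happy today!", n=20)
print(f"Total time: {stats['total_seconds']:.3f} s")
print(f"Average per prediction: {stats['seconds_per_prediction'] * 1000:.2f} ms")
print(f"Predictions per second: {stats['predictions_per_second']:.1f}")
```

`time.perf_counter()` is used because it is a monotonic, high-resolution clock intended for measuring short durations.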

## 🎯 Supported Emotions

The model classifies text into 6 emotion categories:

| Emotion | Emoji | Description | Example (English) | Example (Malay) |
|---------|-------|-------------|-------------------|-----------------|
| **anger** | 😠 | Frustration, rage | "I'm so angry!" | "Marah betul!" |
| **fear** | 😨 | Anxiety, worry | "I'm scared!" | "Takut sangat!" |
| **happy** | 😊 | Joy, excitement | "I'm so happy!" | "Gembira sangat!" |
| **love** | ❤️ | Affection, care | "I love you!" | "Sayang kamu!" |
| **sadness** | 😢 | Sorrow, grief | "I'm so sad" | "Sedih betul" |
| **surprise** | 😲 | Amazement, shock | "What a surprise!" | "Terkejut betul!" |
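
For custom code that wants to display results the way the interactive session does, the table above can be expressed as a lookup. This is a sketch only: `EMOJI` and `with_emoji` are hypothetical names, and the `❓` fallback for unknown labels is an assumption rather than documented behavior.

```python
# Label-to-emoji mapping copied from the table above.
EMOJI = {
    "anger": "😠",
    "fear": "😨",
    "happy": "😊",
    "love": "❤️",
    "sadness": "😢",
    "surprise": "😲",
}


def with_emoji(label: str) -> str:
    """Prefix a predicted label with its emoji; unknown labels get a placeholder."""
    return f"{EMOJI.get(label, '❓')} {label}"


print(with_emoji("happy"))
```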

## 🔧 Advanced Usage

### Custom Model Testing
```bash
# Test a different model
python test_model.py --model "your-model-name" --test-type quick

# Test local model
python test_model.py --model "./path/to/local/model" --test-type comprehensive
```

### Programmatic Usage
```python
from test_model import EmotionModelTester

# Initialize tester
tester = EmotionModelTester("rmtariq/multilingual-emotion-classifier")

# Run specific tests
quick_accuracy = tester.quick_test()
comprehensive_accuracy = tester.comprehensive_test()
speed = tester.benchmark_test()

print(f"Quick test accuracy: {quick_accuracy:.1%}")
print(f"Comprehensive accuracy: {comprehensive_accuracy:.1%}")
print(f"Speed: {speed:.1f} predictions/second")
```

## 📊 Expected Performance

### Accuracy Targets
- **Quick Test**: >90% accuracy
- **Comprehensive Test**: >85% accuracy
- **English Performance**: >95% accuracy
- **Malay Performance**: >85% accuracy

### Speed Targets
- **CPU Performance**: >5 predictions/second
- **GPU Performance**: >20 predictions/second

### Confidence Levels
- **High Confidence**: >90% (💪)
- **Good Confidence**: 70-90% (👍)
- **Low Confidence**: <70% (⚠️)
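
These bands can be applied to a model score with a small helper. The sketch below assumes scores are probabilities in the range 0 to 1 and that the boundary values 0.70 and 0.90 both fall into the "Good" band; `confidence_band` is a hypothetical name, not a function from `test_model.py`.

```python
def confidence_band(score: float) -> str:
    """Map a probability in [0, 1] to the confidence bands listed above."""
    if score > 0.90:
        return "💪 High confidence"
    if score >= 0.70:  # assumption: 0.70 and 0.90 count as "Good"
        return "👍 Good confidence"
    return "⚠️ Low confidence"


for score in (0.998, 0.82, 0.41):
    print(f"{score:.1%}: {confidence_band(score)}")
```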

## 🐛 Troubleshooting

### Common Issues

#### 1. Model Loading Errors
```
❌ Error loading model: ...
```
**Solutions**:
- Check internet connection
- Verify model name spelling
- Try: `pip install --upgrade transformers`

#### 2. CUDA/GPU Issues
```
CUDA out of memory
```
**Solutions**:
- The model automatically falls back to CPU
- Reduce batch size if using custom code
- Pass `--device cpu` if your version of the script supports that flag
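
In custom code, device selection with a CPU fallback can be sketched as follows. This shows a common pattern, not the logic inside `test_model.py`; `pick_device` is a hypothetical helper. It only chooses a device up front and does not recover from an out-of-memory error raised during inference.

```python
def pick_device(prefer_gpu: bool = True) -> str:
    """Return 'cuda' when a usable GPU is present, otherwise 'cpu'."""
    if not prefer_gpu:
        return "cpu"
    try:
        import torch  # imported lazily so the helper also works without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Using device: {device}")
```

Calling `pick_device(prefer_gpu=False)` forces CPU, which is a simple workaround when the GPU runs out of memory.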

#### 3. Slow Performance
```
⚠️ SLOW. Consider optimization.
```
**Solutions**:
- Use GPU if available
- Close other applications
- Consider model quantization for production

### Getting Help

If you encounter issues:

1. **Check Requirements**: Ensure all dependencies are installed
2. **Update Libraries**: `pip install --upgrade transformers torch`
3. **Check Model Status**: Visit the [model page](https://huggingface.co/rmtariq/multilingual-emotion-classifier)
4. **Report Issues**: Create an issue on the repository

## 🎯 Test Case Examples

### English Test Cases
```python
# Basic emotions
"I am so happy today!"          # → happy
"This makes me really angry!"   # → anger
"I love you so much!"           # → love
"I'm scared of spiders"         # → fear
"This news makes me sad"        # → sadness
"What a surprise!"              # → surprise
```

### Malay Test Cases
```python
# Basic emotions
"Saya sangat gembira!"          # → happy
"Aku marah dengan keadaan ini"  # → anger
"Aku sayang kamu"               # → love
"Saya takut dengan ini"         # → fear
"Sedih betul dengan berita"     # → sadness
"Terkejut dengan kejadian"      # → surprise

# Fixed issues (previously problematic)
"Ini adalah hari jadi terbaik"  # → happy (was: anger)
"Terbaik!"                      # → happy (was: surprise)
"Ini adalah hari yang baik"     # → happy (was: anger)
```

## 📈 Performance History

### Version 2.1 (Current)
- ✅ **Overall Accuracy**: 85.0%
- ✅ **English Performance**: 100%
- ✅ **Malay Performance**: 100% (fixed issues)
- ✅ **Speed**: 5-20 predictions/second

### Key Improvements
- 🔧 Fixed Malay birthday context classification
- 🔧 Fixed "baik/terbaik" positive expression recognition
- 🔧 Improved confidence scores
- 🔧 Enhanced robustness

## 🏆 Success Criteria

A successful test run should show:

- ✅ **Quick Test**: >90% accuracy
- ✅ **No Critical Failures**: All basic emotions working
- ✅ **Malay Fixes Verified**: Birthday/positive contexts → happy
- ✅ **Reasonable Speed**: >5 predictions/second
- ✅ **High Confidence**: Most predictions >90%
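
The numeric criteria above can be checked automatically once a run has produced its metrics. The sketch below is one possible encoding: `check_success` is a hypothetical function, accuracy and confidence values are assumed to be fractions between 0 and 1, and "most predictions" is interpreted as more than half, which the guide does not specify.

```python
from typing import Dict, List


def check_success(
    quick_accuracy: float,
    predictions_per_second: float,
    confidences: List[float],
) -> Dict[str, bool]:
    """Evaluate the numeric success criteria for one test run."""
    high = sum(1 for c in confidences if c > 0.90)
    return {
        "quick_accuracy": quick_accuracy > 0.90,
        "reasonable_speed": predictions_per_second > 5,
        # Assumption: "most" means strictly more than half of the predictions.
        "high_confidence": high > len(confidences) / 2,
    }


# Example values are made up for illustration, not measured results.
results = check_success(
    quick_accuracy=12 / 13,
    predictions_per_second=8.0,
    confidences=[0.99, 0.95, 0.60],
)
for name, passed in results.items():
    print(f"{'✅' if passed else '❌'} {name}")
```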

---

**Model Repository**: https://huggingface.co/rmtariq/multilingual-emotion-classifier  
**Author**: rmtariq  
**Last Updated**: June 2024