Add evaluation metrics for bonus3-lora-moe
README.md
CHANGED
@@ -1,51 +1,13 @@
-# Bonus 3: LoRA for MoE Experts
 
-#
-
-
-
-
-
--
--
--
--
-
-
-
-- **Total Parameters**: 55,228,676
-- **Trainable (LoRA)**: 21,625,092 (39.16%)
-- **Frozen (Base)**: 33,603,584 (60.84%)
-- **Reduction**: 2.6x fewer trainable parameters
-
-## Performance
-
-- **Validation Accuracy**: 0.6400
-- **Dataset**: XSum (topic classification)
-- **Training Samples**: 5,000
-
-## LoRA Benefits
-
-1. **Memory Efficient**: Only store small adapter matrices
-2. **Fast Training**: Fewer parameters to update
-3. **Task Switching**: Swap LoRA adapters for different tasks
-4. **Merge Friendly**: Can merge adapters back into base weights
-
-## Files
-
-- `model.pt`: Full model checkpoint
-- `lora_adapters.pt`: Only LoRA parameters (smaller file)
-- `metrics.json`: Training metrics and config
-- `history.csv`: Training history
-
-## Usage
-
-```python
-# Load full model
-checkpoint = torch.load('model.pt')
-model.load_state_dict(checkpoint['model_state_dict'])
-
-# Or load only LoRA adapters (requires base model)
-lora_checkpoint = torch.load('lora_adapters.pt')
-model.load_state_dict(lora_checkpoint['lora_state_dict'], strict=False)
-```
+# Bonus 3: LoRA MoE (XSum)
+
+## Metrics
+- ROUGE-1: 0.0000
+- ROUGE-2: 0.0000
+- ROUGE-L: 0.0000
+- ROUGE-Lsum: 0.0000
+- SacreBLEU: 0.0000
+- BERTScore (P/R/F1): 0.0000 / 0.0000 / 0.0000
+- Compression ratio: 0.0000
+- Extractiveness: 0.0000
+- NLI factual consistency: 0.0000
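
For readers unfamiliar with the ROUGE-1 figure listed under Metrics, the following is a minimal sketch of a unigram-overlap F1 score. It is not the evaluation code behind this commit, which the diff does not show. The function name `rouge1_f1`, the lowercase whitespace tokenization, and the absence of stemming are all assumptions made for illustration; established scorers tokenize and normalize differently, so their numbers will not match this sketch exactly.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap F1 between a predicted and a reference summary.

    Assumption: lowercase + whitespace tokenization, no stemming. The
    tokenizer and scoring library used for the README numbers are unknown.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # An empty prediction (or reference) scores 0 by convention here.
        return 0.0

    # Clipped overlap: a token counts at most as often as it occurs in both texts.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    print(f"{rouge1_f1('the cat sat', reference):.4f}")
    print(f"{rouge1_f1('', reference):.4f}")
```

In this sketch, a prediction that shares no tokens with the reference, or that is empty, scores exactly 0, while any shared token gives a positive score.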