Deepu1965 committed on
Commit 99d623b · verified · 1 Parent(s): fd546d3

Add evaluation metrics for bonus3-lora-moe

Files changed (1)
  1. README.md +12 -50
README.md CHANGED
@@ -1,51 +1,13 @@
- # Bonus 3: LoRA for MoE Experts
 
- ## Model
-
- Parameter-efficient fine-tuning of Mixture-of-Experts using **LoRA (Low-Rank Adaptation)**.
-
- ## Architecture
-
- - 4 transformer layers with MoE
- - 8 experts per layer
- - Top-2 routing
- - LoRA rank: 16, alpha: 32
-
- ## Parameter Efficiency
-
- - **Total Parameters**: 55,228,676
- - **Trainable (LoRA)**: 21,625,092 (39.16%)
- - **Frozen (Base)**: 33,603,584 (60.84%)
- - **Reduction**: 2.6x fewer trainable parameters
-
- ## Performance
-
- - **Validation Accuracy**: 0.6400
- - **Dataset**: XSum (topic classification)
- - **Training Samples**: 5,000
-
- ## LoRA Benefits
-
- 1. **Memory Efficient**: Only store small adapter matrices
- 2. **Fast Training**: Fewer parameters to update
- 3. **Task Switching**: Swap LoRA adapters for different tasks
- 4. **Merge Friendly**: Can merge adapters back into base weights
-
- ## Files
-
- - `model.pt`: Full model checkpoint
- - `lora_adapters.pt`: Only LoRA parameters (smaller file)
- - `metrics.json`: Training metrics and config
- - `history.csv`: Training history
-
- ## Usage
-
- ```python
- # Load full model
- checkpoint = torch.load('model.pt')
- model.load_state_dict(checkpoint['model_state_dict'])
-
- # Or load only LoRA adapters (requires base model)
- lora_checkpoint = torch.load('lora_adapters.pt')
- model.load_state_dict(lora_checkpoint['lora_state_dict'], strict=False)
- ```
 
 
+ # Bonus 3: LoRA MoE (XSum)
+
+ ## Metrics
+ - ROUGE-1: 0.0000
+ - ROUGE-2: 0.0000
+ - ROUGE-L: 0.0000
+ - ROUGE-Lsum: 0.0000
+ - SacreBLEU: 0.0000
+ - BERTScore (P/R/F1): 0.0000 / 0.0000 / 0.0000
+ - Compression ratio: 0.0000
+ - Extractiveness: 0.0000
+ - NLI factual consistency: 0.0000
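
To make the first metric concrete, here is a minimal sketch of ROUGE-1 F1 as clipped unigram overlap. It assumes lowercasing and whitespace tokenization, and the function name `rouge1_f1` is hypothetical. It is not the evaluation script behind the values in this commit, which may use stemming or a different tokenizer. Under this definition, a score of 0.0000 means the candidate shares no unigrams with the reference, as happens with an empty output.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F1 (assumption: lowercase + whitespace tokens)."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Clipped overlap: each unigram counts at most as often as it
    # appears in both the reference and the candidate.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(rouge1_f1("the cat sat on the mat", "the cat"))
    print(rouge1_f1("the cat sat on the mat", ""))
```

The early return on zero overlap also avoids division by zero when either string is empty.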