Prithvik-1 committed on
Commit
062da74
·
verified ·
1 Parent(s): a2b3989

Upload SOLUTION_DATASET_REFORMAT.md with huggingface_hub

  1. SOLUTION_DATASET_REFORMAT.md +64 -0
SOLUTION_DATASET_REFORMAT.md ADDED
@@ -0,0 +1,64 @@
+ # 🔧 Solution: Reformat Dataset and Retrain
+
+ ## ❌ Problem
+
+ The model generates **completely unrelated code** (Kotlin/Android) instead of Verilog because of a prompt-format mismatch:
+
+ 1. **Format Mismatch**: CodeLlama-Instruct expects its chat template format (`<s>[INST] ... [/INST]`).
+ 2. **Training Used a Simple Format**: examples were built as `instruction + EOS + response + EOS`, with no `[INST]` markers.
+ 3. **Model Confusion**: because the fine-tuning data did not match the template the model expects, it did not learn the task properly.
+
+ ## ✅ Solution: Use CodeLlama Chat Template Format
+
+ We need to:
+ 1. Reformat the dataset to use CodeLlama's chat template
+ 2. Update the training script to use the chat template format
+ 3. Retrain with the proper format
+
+ ---
+
+ ## 📋 Steps to Fix
+
+ ### Step 1: Reformat Dataset
+
+ Run:
+ ```bash
+ cd /workspace/ftt/codellama-migration
+ source /venv/main/bin/activate
+ python3 reformat_dataset_for_codellama.py
+ ```
+
+ This creates: `datasets/processed/elinnos_fifo_codellama_chat_format.jsonl`
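The contents of `reformat_dataset_for_codellama.py` are not shown here, so the following is only a minimal sketch of what such a conversion might look like. The field names `instruction` and `response`, the `{"text": ...}` output layout, and the system prompt are all assumptions for illustration, not taken from the actual script.

```python
import json

# Assumed system prompt; the real one is not given in this document.
SYSTEM_PROMPT = "You are an expert Verilog RTL designer."


def to_chat_text(instruction: str, response: str, system: str = SYSTEM_PROMPT) -> str:
    """Wrap one example in the CodeLlama-Instruct chat layout used for training."""
    return (
        f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n"
        f"{instruction.strip()} [/INST] {response.strip()} </s>"
    )


def reformat_jsonl(src_path: str, dst_path: str) -> int:
    """Convert each JSONL record and write one {"text": ...} object per line.

    Returns the number of records written.
    """
    count = 0
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            if not line.strip():
                continue  # skip blank lines
            record = json.loads(line)
            # "instruction" and "response" are assumed field names.
            text = to_chat_text(record["instruction"], record["response"])
            dst.write(json.dumps({"text": text}) + "\n")
            count += 1
    return count
```

One caveat worth checking in the real pipeline: if the tokenizer is configured to add BOS/EOS tokens itself, the literal `<s>` and `</s>` in the text may need to be dropped to avoid duplicated special tokens.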
+
+ ### Step 2: Update Training Script
+
+ The training script must build every example in CodeLlama's chat template format (see "Expected Chat Template Format" below) instead of `instruction + EOS + response + EOS`.
+
+ ### Step 3: Split and Retrain
+
+ Split the reformatted dataset, then retrain on it.
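The split itself is not specified in this document. As one possible approach, the sketch below does a seeded shuffle followed by a train/validation split; the 10% validation fraction, the seed, and the function name `split_records` are illustrative assumptions.

```python
import random


def split_records(records, val_fraction=0.1, seed=42):
    """Shuffle deterministically and return (train, val) lists."""
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be strictly between 0 and 1")
    shuffled = list(records)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)
    # Keep at least one validation example when there is more than one record.
    n_val = max(1, int(len(shuffled) * val_fraction)) if len(shuffled) > 1 else 0
    return shuffled[n_val:], shuffled[:n_val]


# Example with placeholder records standing in for the reformatted dataset.
records = [{"text": f"example {i}"} for i in range(20)]
train, val = split_records(records)
print(len(train), len(val))
```

Using a fixed seed keeps the split reproducible across retraining runs, which makes before/after comparisons easier.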
+
+ ---
+
+ ## 🎯 Expected Chat Template Format
+
+ **For Training:**
+ ```
+ <s>[INST] <<SYS>>
+ System prompt
+ <</SYS>>
+
+ User task [/INST] Response </s>
+ ```
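Before retraining, it may help to sanity-check that every reformatted record follows the training layout above. The check below is a hedged sketch: the regular expression and the helper name `is_valid_training_text` are assumptions, and it only covers single-turn examples with an optional system block.

```python
import re

# Single-turn training layout; DOTALL lets prompts and responses span lines.
TRAINING_PATTERN = re.compile(
    r"<s>\[INST\] (?:<<SYS>>\n.*?\n<</SYS>>\n\n)?"
    r"(?P<user>.+?) \[/INST\] (?P<response>.+?) </s>",
    re.DOTALL,
)


def is_valid_training_text(text: str) -> bool:
    """Return True if text matches the single-turn training layout."""
    if TRAINING_PATTERN.fullmatch(text) is None:
        return False
    # Exactly one instruction block: multi-turn or stray markers are rejected.
    return text.count("[INST]") == 1 and text.count("[/INST]") == 1


good = "<s>[INST] <<SYS>>\nSystem prompt\n<</SYS>>\n\nUser task [/INST] Response </s>"
bad = "User task</s>Response</s>"  # the old instruction + EOS + response + EOS style
print(is_valid_training_text(good), is_valid_training_text(bad))
```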
+
+ **For Inference:**
+ ```
+ <s>[INST] <<SYS>>
+ System prompt
+ <</SYS>>
+
+ User task [/INST]
+ ```
+
+ At inference time the prompt stops at `[/INST]`, and the model generates the response from there.
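As an illustration of the inference side, the sketch below builds a prompt in the layout above and pulls the generated response out of a decoded completion. Both helper names are hypothetical, and the decoded string is hand-written here rather than produced by a model.

```python
def build_inference_prompt(user_task: str, system: str) -> str:
    """Build a prompt that ends at [/INST] so the model continues with the response."""
    return f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user_task.strip()} [/INST]"


def extract_response(decoded: str) -> str:
    """Return the text after the last [/INST], with any </s> marker removed."""
    _, _, tail = decoded.rpartition("[/INST]")
    return tail.replace("</s>", "").strip()


prompt = build_inference_prompt("User task", "System prompt")
# Stand-in for decoded model output: the prompt followed by a generated reply.
decoded = prompt + " module fifo; endmodule </s>"
print(extract_response(decoded))
```

The system prompt used at inference should match the one used when building the training data, since the point of the fix is to keep the two formats identical.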
+