# 🎯 Hyperparameter Tuning Guide for Better Code Generation

**Issue:** Model generating repetitive text notes instead of Verilog code
**Solution:** Adjust inference hyperparameters and fix prompt format

---

## 🔧 Key Issues Identified

1. **Prompt Format Mismatch**: Inference format didn't match training format exactly
2. **Repetition Penalty Too Low**: Model was repeating "Note:" statements
3. **Temperature May Be Too High**: Causing non-deterministic outputs
4. **Response Extraction**: Need to properly extract only newly generated tokens

---

## ✅ Fixes Applied

### 1. **Prompt Format Fixed**
- **Training Format:** `instruction + EOS + response + EOS`
- **Inference Format (Now):** `instruction + EOS` (model continues from here)
- **Change:** Added EOS token at end of prompt to match training
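
As a minimal sketch of the fix (the EOS string and instruction text here are illustrative assumptions; real code should take the EOS string from `tokenizer.eos_token`):

```python
# Sketch of the fixed prompt format. "</s>" is the usual Llama/CodeLlama
# EOS string; in practice, read it from tokenizer.eos_token instead.
EOS = "</s>"

def build_prompt(instruction: str) -> str:
    # Training saw `instruction + EOS + response + EOS`, so at inference
    # we stop after `instruction + EOS` and let the model continue.
    return instruction + EOS

prompt = build_prompt("Write a Verilog module for a synchronous FIFO.")
```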
### 2. **Repetition Penalty Increased**
- **Before:** `repetition_penalty=1.1`
- **After:** `repetition_penalty=1.2`
- **Reason:** Prevents repetitive "Note:" statements

### 3. **Response Decoding Fixed**
- **Before:** Decoding entire output including prompt
- **After:** Decoding only newly generated tokens (after prompt)
- **Benefit:** Cleaner output, no prompt contamination
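
A minimal sketch of the extraction fix, with plain Python lists standing in for token-ID tensors (the IDs are made up):

```python
# generate() returns the prompt tokens followed by the newly generated
# tokens, so slice the prompt off before decoding.
prompt_ids = [1, 3087, 29871, 2]            # tokenized prompt (illustrative IDs)
output_ids = prompt_ids + [734, 99, 12, 2]  # what generate() hands back

new_token_ids = output_ids[len(prompt_ids):]
# response = tokenizer.decode(new_token_ids, skip_special_tokens=True)
```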
---

## 🎛️ Recommended Hyperparameter Changes

### For Better Code Generation:

| Parameter | Current | Recommended | Reason |
|-----------|---------|-------------|--------|
| **Temperature** | 0.3 | **0.1-0.2** | Lower = more deterministic, better for code |
| **Repetition Penalty** | 1.1 | **1.2-1.3** | Prevents repetitive text generation |
| **Max New Tokens** | 800 | **1000-1200** | Ensures complete code generation |
| **Top-p** | 0.9 | **0.95** | Slightly more diverse (if temperature > 0) |

### Optimal Settings for Code Generation:

```python
temperature = 0.1        # Very deterministic (best for exact code match)
repetition_penalty = 1.2 # Prevent repetition
max_new_tokens = 1000    # Ensure complete code
top_p = 0.95             # If using sampling
```
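
Bundled as keyword arguments for a Hugging Face-style `generate()` call (a sketch, not this repo's exact script; note that `temperature` and `top_p` only take effect when `do_sample=True`):

```python
# Recommended settings collected as generate() kwargs; with a loaded model,
# pass them as model.generate(**inputs, **GEN_KWARGS).
GEN_KWARGS = dict(
    max_new_tokens=1000,
    temperature=0.1,
    top_p=0.95,
    repetition_penalty=1.2,
    do_sample=True,  # sampling must be on for temperature/top_p to apply
)
```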
---

## 🚀 Test Command (Updated)

```bash
cd /workspace/ftt/codellama-migration
source /venv/main/bin/activate
python3 test_single_training_sample.py
```

This script tests with multiple temperatures (0.1, 0.2, 0.3) so you can see which works best.

---

## 🚀 Quick Test Command (Single Sample)

```bash
cd /workspace/ftt/codellama-migration
source /venv/main/bin/activate

# Extract first training sample and test
INSTRUCTION=$(sed -n '1p' datasets/processed/split/train.jsonl | python3 -c "import sys, json; print(json.load(sys.stdin)['instruction'])")

python3 scripts/inference/inference_codellama.py \
    --mode local \
    --model-path training-outputs/codellama-fifo-v1 \
    --prompt "$INSTRUCTION" \
    --max-new-tokens 1000 \
    --temperature 0.1
```
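
The `sed`/`python3 -c` extraction step can also be written as a small pure-Python helper (a sketch; the default path is the same one used above):

```python
import json

def first_instruction(path="datasets/processed/split/train.jsonl"):
    # JSONL stores one JSON object per line, so the first line is the
    # first training sample.
    with open(path, encoding="utf-8") as f:
        return json.loads(f.readline())["instruction"]
```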
---

## 📊 Why Temperature 0.1 Instead of 0.3?

- **0.1**: Very deterministic, picks most likely token → Better code accuracy
- **0.3**: More variation, creative → May generate text instead of code
- **0.5+**: High variation → Not suitable for code generation

**For exact code matching with training data: Use 0.1**
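
The effect is easy to see numerically: logits are divided by the temperature before the softmax, so low temperatures concentrate nearly all probability on the top token (a toy example with made-up logits):

```python
import math

def softmax_t(logits, temperature):
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]        # toy next-token scores
p_cold = softmax_t(logits, 0.1)  # top token gets ~99.99% of the mass
p_warm = softmax_t(logits, 0.3)  # top token gets noticeably less (~96%)
```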
---

## 📊 Why Repetition Penalty 1.2?

- **1.0**: No penalty → Model repeats patterns
- **1.1**: Low penalty → Still gets repetitive
- **1.2-1.3**: Good balance → Prevents repetition without hurting quality
- **1.5+**: Too high → May suppress valid repetitions in code
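
For intuition, the common implementation (e.g. Hugging Face's `RepetitionPenaltyLogitsProcessor`) rescales the logits of tokens that already appeared; this toy sketch mimics that behaviour with made-up scores:

```python
def apply_repetition_penalty(logits, seen_ids, penalty=1.2):
    # Already-seen tokens get their scores rescaled: positive logits are
    # divided by the penalty, negative ones multiplied, so both move down.
    out = list(logits)
    for tok in seen_ids:
        out[tok] = out[tok] / penalty if out[tok] > 0 else out[tok] * penalty
    return out

logits = [3.0, 1.0, -0.5]  # toy vocabulary of 3 tokens
penalized = apply_repetition_penalty(logits, seen_ids={0, 2})
# token 0: 3.0 -> ~2.5, token 2: -0.5 -> ~-0.6, token 1 untouched
```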
---

## ✅ Summary of Changes

1. ✅ **Fixed prompt format** - Matches training format (instruction + EOS)
2. ✅ **Increased repetition_penalty** - 1.1 → 1.2
3. ✅ **Fixed response extraction** - Only decode newly generated tokens
4. ✅ **Lower temperature recommended** - 0.3 → 0.1 for exact matches

---

## 🧪 Testing

Run the test script to see the improvements:

```bash
python3 test_single_training_sample.py
```

This will test with temperatures 0.1, 0.2, and 0.3 so you can compare outputs.

---

**Next Steps:**
1. Test with the updated inference script
2. Compare outputs at different temperatures
3. Choose the optimal temperature for your use case
4. If issues persist, you may need to retrain with a better dataset format