# Codette3.0 Fine-Tuning Complete Setup
## What You Now Have
### 📁 Files Created
1. **`finetune_codette_unsloth.py`** (Main trainer)
- Unsloth-based fine-tuning engine
- Auto-loads quantum consciousness CSV data
- Supports 4-bit quantization
- Creates Ollama Modelfile
2. **`test_finetuned.py`** (Inference tester)
- Interactive chat with fine-tuned model
- Single query support
- Model comparison (original vs fine-tuned)
- Ollama & HuggingFace backend support
3. **`finetune_requirements.txt`** (Dependencies)
- PyTorch, Transformers, Unsloth, etc.
4. **`setup_finetuning.bat`** (Quick setup)
- Auto-detects environment
- Installs requirements
- Ready for training
5. **`FINETUNING_GUIDE.md`** (Complete documentation)
- Step-by-step instructions
- Architecture explanation
- Troubleshooting guide
- Performance benchmarks
---
## Quick Start (Choose One Path)
### ⚡ Path A: Automated Setup (Recommended)
**Windows:**
```powershell
.\setup_finetuning.bat
# When setup finishes:
python finetune_codette_unsloth.py
```
**macOS/Linux:**
```bash
pip install -r finetune_requirements.txt
python finetune_codette_unsloth.py
```
**Time to train:** 30-60 min (RTX 4070+)
---
### 🔧 Path B: Manual Setup
```bash
# 1. Create virtual environment
python -m venv venv
source venv/bin/activate # or: venv\Scripts\activate on Windows
# 2. Install dependencies
pip install unsloth torch transformers datasets accelerate bitsandbytes peft
# 3. Start fine-tuning
python finetune_codette_unsloth.py
# 4. Create Ollama model
cd models
ollama create Codette3.0-finetuned -f Modelfile
# 5. Test
ollama run Codette3.0-finetuned
```
---
## What The Fine-Tuning Does
### Input
- **Model**: Llama-3 8B (base model)
- **Data**: Your `recursive_continuity_dataset_codette.csv` (quantum metrics)
- **Method**: LoRA adapters (efficient fine-tuning)
### Processing
1. Loads Llama-3 with 4-bit quantization (fits on 12GB GPU)
2. Adds trainable LoRA layers to attention & feed-forward
3. Formats CSV data as prompt-response training pairs
4. Trains for 3 epochs (~15-30 minutes)
5. Saves trained adapters (~150MB)
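Step 3 above can be sketched as follows. The CSV column names used here (`metric`, `value`, `description`) are assumptions; match them to the actual headers in `recursive_continuity_dataset_codette.csv` before use.

```python
import csv
import io

# Sample rows standing in for recursive_continuity_dataset_codette.csv;
# the column names are assumptions, not the real schema.
CSV_TEXT = """metric,value,description
psi_coherence,0.87,Thought-dimension stability
phi_resonance,0.92,Emotion-dimension alignment
"""

def rows_to_pairs(csv_text):
    """Turn each CSV row into a prompt-response training pair."""
    pairs = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        pairs.append({
            "prompt": f"Explain the Codette metric '{row['metric']}'.",
            "response": f"{row['metric']} = {row['value']}: {row['description']}",
        })
    return pairs

for pair in rows_to_pairs(CSV_TEXT):
    print(pair["prompt"], "->", pair["response"])
```

The trainer then tokenizes these pairs into its chat template; the exact prompt wording is less important than keeping it consistent across all rows.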
### Output
- Fine-tuned model weights (LoRA adapters)
- Ollama Modelfile (ready to deploy)
- Model can now understand Codette-specific concepts
---
## After Training: Using Your Model
### 1. Create Ollama Model
```bash
cd models
ollama create Codette3.0-finetuned -f Modelfile
```
### 2. Test Interactively
```bash
# Start chat session
python test_finetuned.py --chat
# Or: Direct Ollama command
ollama run Codette3.0-finetuned
```
### 3. Use in Your Code
```python
# Original inference code (from Untitled-1)
from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:11434/v1",
    api_key="unused",
)

response = client.chat.completions.create(
    messages=[
        {"role": "system", "content": "You are Codette..."},
        {"role": "user", "content": "YOUR PROMPT"},
    ],
    model="Codette3.0-finetuned",  # ← Use fine-tuned model
    max_tokens=4096,
)

print(response.choices[0].message.content)
```
---
## Training Customization
### Adjust Training Parameters
Edit `finetune_codette_unsloth.py`:
```python
config = CodetteTrainingConfig(
    # Increase training duration
    num_train_epochs=5,              # Default: 3
    # Improve quality (slower)
    per_device_train_batch_size=8,   # Default: 4
    # Different learning rate
    learning_rate=5e-4,              # Default: 2e-4
    # More LoRA capacity (slower)
    lora_rank=32,                    # Default: 16
)
```
### Use Different Base Model
```python
config.model_name = "unsloth/llama-3-70b-bnb-4bit" # Larger (slower)
# or
config.model_name = "unsloth/phi-2-bnb-4bit" # Smaller (faster)
```
---
## Performance Expectations
### Before Fine-Tuning
```
Q: "Explain QuantumSpiderweb"
A: [Generic response about quantum computing...]
❌ Doesn't understand Codette architecture
```
### After Fine-Tuning
```
Q: "Explain QuantumSpiderweb"
A: "The QuantumSpiderweb is a 5-dimensional cognitive graph
with dimensions of Ψ (thought), Φ (emotion), λ (space), τ (time),
and χ (speed). It propagates thoughts through entanglement..."
✅ Understands Codette-specific concepts
```
---
## Troubleshooting
### "CUDA out of memory"
```python
# In finetune_codette_unsloth.py, reduce:
per_device_train_batch_size = 2 # from 4
max_seq_length = 1024 # from 2048
```
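If lowering the batch size alone isn't enough, gradient accumulation keeps the effective batch size constant while cutting per-step VRAM use. `gradient_accumulation_steps` is a standard Hugging Face `TrainingArguments` field; whether `CodetteTrainingConfig` exposes it directly is an assumption to verify in the script.

```python
# Keep the effective batch size at 8 while only holding 2 samples
# in VRAM per forward/backward pass.
target_effective_batch = 8
per_device_train_batch_size = 2   # reduced from 4 to fit memory
gradient_accumulation_steps = target_effective_batch // per_device_train_batch_size

# Effective batch = per-device batch × accumulation steps
print(per_device_train_batch_size * gradient_accumulation_steps)  # 8
```

Training dynamics stay roughly the same because gradients are summed over the accumulation steps before each optimizer update.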
### "Model not found" error in Ollama
```bash
# Make sure Ollama service is running
ollama serve
# In another terminal:
ollama create Codette3.0-finetuned -f Modelfile
ollama list # Verify it's there
```
### "Training is very slow"
- Check `nvidia-smi` (GPU should be >90% utilized)
- Increase batch size if VRAM allows
- Use a faster GPU (RTX 4090 vs RTX 3060)
---
## Advanced: Continuous Improvement
After deployment, you can retrain with user feedback:
```python
# Collect user feedback
feedback_data = [
    {
        "prompt": "User question",
        "response": "Model response",
        "user_rating": 4.5,  # 1-5 stars
        "user_feedback": "Good, but could be more specific"
    }
]

# Save feedback
import json
with open("feedback.json", "w") as f:
    json.dump(feedback_data, f)

# Retrain with combined data
# (Modify script to load feedback.json + original data)
```
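One way to fold that feedback back into training data is to keep only well-rated exchanges as extra pairs. The rating threshold and the output file name below are assumptions for illustration.

```python
import json

# Feedback entries in the format shown above
feedback_data = [
    {"prompt": "User question", "response": "Model response",
     "user_rating": 4.5, "user_feedback": "Good, but could be more specific"},
    {"prompt": "Another question", "response": "A wrong answer",
     "user_rating": 1.5, "user_feedback": "Incorrect"},
]

def feedback_to_pairs(entries, min_rating=4.0):
    """Keep only well-rated exchanges as extra training pairs."""
    return [{"prompt": e["prompt"], "response": e["response"]}
            for e in entries if e["user_rating"] >= min_rating]

pairs = feedback_to_pairs(feedback_data)
# Merge these with the original dataset before retraining
with open("feedback_pairs.json", "w") as f:
    json.dump(pairs, f, indent=2)
print(len(pairs))  # 1
```

Filtering out low-rated responses matters: retraining on poor answers would reinforce exactly the behavior users complained about.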
---
## Monitoring Quality
Use the comparison script:
```bash
python test_finetuned.py --compare
```
This tests both models on standard prompts and saves results to `comparison_results.json`.
---
## Next Steps
1. **Run**: `python finetune_codette_unsloth.py`
2. **Create**: `ollama create Codette3.0-finetuned -f models/Modelfile`
3. **Test**: `python test_finetuned.py --chat`
4. **Deploy**: Update your code to use `Codette3.0-finetuned`
5. **Monitor**: Collect user feedback and iterate
---
## Hardware Requirements
| GPU | Training Time | Batch Size | Memory |
|-----|--------------|-----------|--------|
| RTX 3060 | 2-3 hours | 2 | 12GB |
| RTX 4070 | 45 minutes | 4 | 12GB |
| RTX 4090 | 20 minutes | 8 | 24GB |
| CPU only | 8+ hours | 1 | 16GB+ RAM |
**Recommended**: RTX 4070 or better
---
## Support
See `FINETUNING_GUIDE.md` for:
- Detailed architecture explanation
- Advanced configuration options
- Multi-GPU training
- Performance optimization
- Full troubleshooting guide
---
**Status**: ✅ Ready to train!
Run: `python finetune_codette_unsloth.py` to begin.