
Codette3.0 Fine-Tuning Complete Setup

What You Now Have

πŸ“ Files Created

  1. finetune_codette_unsloth.py (Main trainer)

    • Unsloth-based fine-tuning engine
    • Auto-loads quantum consciousness CSV data
    • Supports 4-bit quantization
    • Creates Ollama Modelfile
  2. test_finetuned.py (Inference tester)

    • Interactive chat with fine-tuned model
    • Single query support
    • Model comparison (original vs fine-tuned)
    • Ollama & HuggingFace backend support
  3. finetune_requirements.txt (Dependencies)

    • PyTorch, Transformers, Unsloth, etc.
  4. setup_finetuning.bat (Quick setup)

    • Auto-detects environment
    • Installs requirements
    • Ready for training
  5. FINETUNING_GUIDE.md (Complete documentation)

    • Step-by-step instructions
    • Architecture explanation
    • Troubleshooting guide
    • Performance benchmarks

Quick Start (Choose One Path)

⚡ Path A: Automated Setup (Recommended)

Windows:

.\setup_finetuning.bat
# Then when finished:
python finetune_codette_unsloth.py

macOS/Linux:

pip install -r finetune_requirements.txt
python finetune_codette_unsloth.py

Time to train: about 20-45 minutes on an RTX 4070 or better (see Hardware Requirements)


🔧 Path B: Manual Setup

# 1. Create virtual environment
python -m venv venv
source venv/bin/activate  # or: venv\Scripts\activate on Windows

# 2. Install dependencies
pip install unsloth torch transformers datasets accelerate bitsandbytes peft

# 3. Start fine-tuning
python finetune_codette_unsloth.py

# 4. Create Ollama model
cd models
ollama create Codette3.0-finetuned -f Modelfile

# 5. Test
ollama run Codette3.0-finetuned

What The Fine-Tuning Does

Input

  • Model: Llama-3 8B (base model)
  • Data: Your recursive_continuity_dataset_codette.csv (quantum metrics)
  • Method: LoRA adapters (efficient fine-tuning)

Processing

  1. Loads Llama-3 with 4-bit quantization (fits on 12GB GPU)
  2. Adds trainable LoRA layers to attention & feed-forward
  3. Formats CSV data as prompt-response training pairs
  4. Trains for 3 epochs (about 20-45 minutes on an RTX 4070 or better; see Hardware Requirements)
  5. Saves trained adapters (~150MB)
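
Step 3 above (CSV rows to prompt-response pairs) can be sketched as follows. This is a minimal illustration, not the code in finetune_codette_unsloth.py: the column names `prompt` and `response` and the instruction template are assumptions, so adjust them to the actual headers in recursive_continuity_dataset_codette.csv.

```python
import csv
import io

def rows_to_pairs(csv_text, prompt_col="prompt", response_col="response"):
    """Turn CSV rows into prompt/response dicts, skipping incomplete rows.

    The column names are hypothetical; the real dataset may use others.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    pairs = []
    for row in reader:
        prompt = (row.get(prompt_col) or "").strip()
        response = (row.get(response_col) or "").strip()
        if prompt and response:
            pairs.append({"prompt": prompt, "response": response})
    return pairs

def to_training_text(pair):
    """Render one pair with a simple instruction template (an assumed format)."""
    return f"### Instruction:\n{pair['prompt']}\n\n### Response:\n{pair['response']}"

# Tiny in-memory sample; the second data row has no prompt and is dropped.
sample = "prompt,response\nexample question,example answer\n,orphan row\n"
pairs = rows_to_pairs(sample)
print(len(pairs))
print(to_training_text(pairs[0]))
```

To run it against the real file, read the CSV with `open(path, newline="", encoding="utf-8").read()` and pass the text to `rows_to_pairs`.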

Output

  • Fine-tuned model weights (LoRA adapters)
  • Ollama Modelfile (ready to deploy)
  • Model can now understand Codette-specific concepts
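
To see why the saved adapters are small, the arithmetic below estimates the LoRA parameter count for Llama-3 8B at the default rank of 16. It is a rough sketch under stated assumptions: the adapted modules (all attention and feed-forward projections) and fp32 storage are guesses about what the script does, while the layer dimensions are those of the published Llama-3 8B architecture.

```python
def lora_params(d_in, d_out, rank):
    """Parameters added by one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)

# Llama-3 8B dimensions: hidden size, MLP size, key/value width (8 KV heads x 128), layers
HIDDEN, INTERMEDIATE, KV_DIM, LAYERS = 4096, 14336, 1024, 32
RANK = 16  # the guide's default lora_rank

# (d_in, d_out) for each projection assumed to carry an adapter
shapes = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

per_layer = sum(lora_params(d_in, d_out, RANK) for d_in, d_out in shapes.values())
total = per_layer * LAYERS
print(total)                     # 41943040
print(round(total * 4 / 1e6, 1)) # 167.8 (MB at 4 bytes per parameter)
```

About 42 million trainable parameters is roughly 0.5% of the 8B base model, and the estimated file size is in the same range as the ~150MB quoted above. Doubling the rank to 32 doubles both numbers.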

After Training: Using Your Model

1. Create Ollama Model

cd models
ollama create Codette3.0-finetuned -f Modelfile

2. Test Interactively

# Start chat session
python test_finetuned.py --chat

# Or: Direct Ollama command
ollama run Codette3.0-finetuned

3. Use in Your Code

# Inference through Ollama's OpenAI-compatible endpoint (requires: pip install openai)
from openai import OpenAI

client = OpenAI(
    base_url = "http://127.0.0.1:11434/v1",
    api_key = "unused",
)

response = client.chat.completions.create(
    messages = [
        {
            "role": "system",
            "content": "You are Codette..."
        },
        {
            "role": "user",
            "content": "YOUR PROMPT"
        }
    ],
    model = "Codette3.0-finetuned",  # ← Use fine-tuned model
    max_tokens = 4096,
)

print(response.choices[0].message.content)

Training Customization

Adjust Training Parameters

Edit finetune_codette_unsloth.py:

config = CodetteTrainingConfig(
    # Increase training duration
    num_train_epochs = 5,  # Default: 3
    
    # Larger batches (smoother gradients, but needs more VRAM)
    per_device_train_batch_size = 8,  # Default: 4
    
    # Different learning rate
    learning_rate = 5e-4,  # Default: 2e-4
    
    # More LoRA capacity (slower)
    lora_rank = 32,  # Default: 16
)
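
As a rough guide to how these settings interact, the sketch below estimates the number of optimizer steps for a run. The dataset size of 1000 examples and the `grad_accum` parameter are hypothetical; the trainer's own step count may differ slightly in how it rounds.

```python
import math

def optimizer_steps(num_examples, batch_size, epochs, grad_accum=1):
    """Approximate optimizer steps: batches per epoch (rounded up) times epochs."""
    steps_per_epoch = math.ceil(num_examples / (batch_size * grad_accum))
    return steps_per_epoch * epochs

# Defaults from this guide (batch 4, 3 epochs) vs. the customized values above
print(optimizer_steps(1000, batch_size=4, epochs=3))  # 750
print(optimizer_steps(1000, batch_size=8, epochs=5))  # 625
```

Note that raising the batch size while adding epochs can leave the model with fewer weight updates overall, which is one reason to change a single parameter at a time and compare results.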

Use Different Base Model

config.model_name = "unsloth/llama-3-70b-bnb-4bit"  # Larger (slower; will not fit in 12GB of VRAM)
# or
config.model_name = "unsloth/phi-2-bnb-4bit"      # Smaller (faster)

Performance Expectations

Before Fine-Tuning

Q: "Explain QuantumSpiderweb"
A: [Generic response about quantum computing...]
❌ Doesn't understand Codette architecture

After Fine-Tuning

Q: "Explain QuantumSpiderweb"
A: "The QuantumSpiderweb is a 5-dimensional cognitive graph 
with dimensions of Ψ (thought), Φ (emotion), λ (space), τ (time), 
and χ (speed). It propagates thoughts through entanglement..."
✅ Understands Codette-specific concepts

Troubleshooting

"CUDA out of memory"

# In finetune_codette_unsloth.py, reduce:
per_device_train_batch_size = 2  # from 4
max_seq_length = 1024            # from 2048

"Model not found" error in Ollama

# Make sure Ollama service is running
ollama serve

# In another terminal:
ollama create Codette3.0-finetuned -f Modelfile
ollama list  # Verify it's there
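
The same check can be done from Python through Ollama's `/api/tags` endpoint, which lists locally installed models. This is a minimal sketch: the helper names are made up for illustration, and matching names case-insensitively with the tag stripped is an assumption about how you want to compare them.

```python
import json
import urllib.request

def installed_models(payload):
    """Extract model names from a /api/tags response body."""
    return [m["name"] for m in payload.get("models", [])]

def has_model(payload, wanted):
    """True if `wanted` matches an installed model, ignoring case and the :tag suffix."""
    wanted = wanted.lower()
    return any(name.lower().split(":")[0] == wanted for name in installed_models(payload))

def fetch_tags(base_url="http://127.0.0.1:11434"):
    """Query a running Ollama server; raises URLError if `ollama serve` is not up."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return json.load(resp)

# Offline example with a hand-written payload in the same shape as the API response
sample = {"models": [{"name": "Codette3.0-finetuned:latest"}, {"name": "llama3:8b"}]}
print(has_model(sample, "Codette3.0-finetuned"))  # True
```

Against a live server, call `has_model(fetch_tags(), "Codette3.0-finetuned")`. A connection error points to the service not running, while `False` means the `ollama create` step still needs to be done.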

"Training is very slow"

  • Check nvidia-smi (GPU should be >90% utilized)
  • Increase batch size if VRAM allows
  • Use a faster GPU (RTX 4090 vs RTX 3060)

Advanced: Continuous Improvement

After deployment, you can retrain with user feedback:

# Collect user feedback
feedback_data = [
    {
        "prompt": "User question",
        "response": "Model response",
        "user_rating": 4.5,  # 1-5 stars
        "user_feedback": "Good, but could be more specific"
    }
]

# Save feedback
import json
with open("feedback.json", "w") as f:
    json.dump(feedback_data, f)

# Retrain with combined data
# (Modify script to load feedback.json + original data)
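
One way to do that last step is sketched below: keep only well-rated feedback and merge it with the original pairs, dropping exact duplicates. The 4.0 rating threshold and the helper names are assumptions for illustration, and the script's real data-loading code may be structured differently.

```python
import json
import os
import tempfile

def load_feedback_pairs(path, min_rating=4.0):
    """Load feedback records and keep prompt/response pairs rated at or above min_rating."""
    with open(path, encoding="utf-8") as f:
        records = json.load(f)
    return [
        {"prompt": r["prompt"], "response": r["response"]}
        for r in records
        if r.get("user_rating", 0) >= min_rating
    ]

def merge_pairs(original, extra):
    """Concatenate two pair lists, keeping the first copy of each exact duplicate."""
    seen = set()
    merged = []
    for pair in original + extra:
        key = (pair["prompt"], pair["response"])
        if key not in seen:
            seen.add(key)
            merged.append(pair)
    return merged

# Demo with a temporary file standing in for feedback.json
feedback = [
    {"prompt": "q1", "response": "a1", "user_rating": 4.5},
    {"prompt": "q2", "response": "a2", "user_rating": 2.0},
]
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "feedback.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(feedback, f)
    kept = load_feedback_pairs(path)

original = [{"prompt": "q0", "response": "a0"}, {"prompt": "q1", "response": "a1"}]
merged = merge_pairs(original, kept)
print(len(kept), len(merged))  # 1 2
```

Filtering by rating keeps poorly received responses out of the next training run; low-rated records could instead be corrected by hand before they are added.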

Monitoring Quality

Use the comparison script:

python test_finetuned.py --compare

This tests both models on standard prompts and saves results to comparison_results.json.


Next Steps

  1. ✅ Run: python finetune_codette_unsloth.py
  2. ✅ Create: ollama create Codette3.0-finetuned -f models/Modelfile
  3. ✅ Test: python test_finetuned.py --chat
  4. ✅ Deploy: Update your code to use Codette3.0-finetuned
  5. ✅ Monitor: Collect user feedback and iterate

Hardware Requirements

| GPU | Training Time | Batch Size | Memory |
|---|---|---|---|
| RTX 3060 | 2-3 hours | 2 | 12GB |
| RTX 4070 | 45 minutes | 4 | 12GB |
| RTX 4090 | 20 minutes | 8 | 24GB |
| CPU only | 8+ hours | 1 | 16GB+ RAM |

Recommended: RTX 4070 or better


Support

See FINETUNING_GUIDE.md for:

  • Detailed architecture explanation
  • Advanced configuration options
  • Multi-GPU training
  • Performance optimization
  • Full troubleshooting guide

Status: ✅ Ready to train!

Run: python finetune_codette_unsloth.py to begin.