# gemma-2-2b-lean-expert-optimized

## Optimized Gemma Model for 94%+ Success Rate

This repository contains the training configuration for an optimized Gemma-2-2B model targeting a 94%+ success rate on Lean trading-algorithm optimization tasks.

### Training Configuration

- **Base Model**: google/gemma-2-2b
- **Dataset**: Kronu/lean-expert-optimized-2000 (loading sketch below)
- **Target Success Rate**: 94%+
- **Expected Performance**: 96% (94-98% range)
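
As a hedged illustration, the dataset named above can be loaded with the `datasets` library. The `train` split name is an assumption not confirmed by this card; check the dataset page before relying on it:

```python
from datasets import load_dataset

# Load the fine-tuning dataset referenced above.
# The "train" split name is an assumption; verify it on the dataset card.
dataset = load_dataset("Kronu/lean-expert-optimized-2000", split="train")
print(dataset)  # inspect the columns and example count
```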

### Key Optimizations

- **JSON Parsing Focus**: 1,333 examples aimed at lifting JSON-parsing success from 0% to a 95% target
- **Enhanced LoRA**: rank=64, alpha=128
- **Optimized Training**: 12 epochs, 2e-4 learning rate
- **Advanced Configuration**: gradient checkpointing, FP16 (see the configuration sketch below)
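
A minimal sketch of how these settings map onto `peft` and `transformers` configuration objects. The target modules, batch size, and output directory are illustrative assumptions, not values taken from this card:

```python
from peft import LoraConfig
from transformers import TrainingArguments

# LoRA settings from the list above; target_modules is an assumption
# (a common choice for Gemma attention projections).
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# Training settings from the list above; output_dir and batch size
# are placeholders, not taken from this card.
training_args = TrainingArguments(
    output_dir="gemma-2-2b-lean-expert-optimized",
    num_train_epochs=12,
    learning_rate=2e-4,
    per_device_train_batch_size=4,  # assumption
    gradient_checkpointing=True,
    fp16=True,
)
```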

### Training Instructions

To train this model with Hugging Face Jobs:

1. Set your Hugging Face token as an environment variable (see the sketch below)
2. Run the training script: `python train.py`
3. Monitor training progress in the Hugging Face dashboard
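
One way to handle step 1 in Python, assuming the token lives in an `HF_TOKEN` environment variable. The variable name is an assumption; `train.py` may expect a different one:

```python
import os
from huggingface_hub import login

# Authenticate with the token from the environment before launching training.
# HF_TOKEN is the conventional variable name; adjust if train.py differs.
login(token=os.environ["HF_TOKEN"])
```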

### Expected Results

- **Training Time**: 25-35 minutes
- **Cost**: $3-5
- **Final Model**: Kronu/gemma-2-2b-lean-expert-optimized
- **Success Rate**: 96% (94-98% range)

### Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model, then attach the fine-tuned LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b")
model = PeftModel.from_pretrained(base_model, "Kronu/gemma-2-2b-lean-expert-optimized")

# The tokenizer is unchanged from the base model
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b")
```
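
A short generation example continuing from the snippet above. The prompt and decoding settings are illustrative only, not taken from this card:

```python
# Example inference; the prompt below is illustrative, not from the card.
prompt = "Optimize the following Lean trading algorithm parameters:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```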