# gemma-2-2b-lean-expert-optimized
## Optimized Gemma Model for 94%+ Success Rate
This repository contains the training configuration for an optimized Gemma-2-2B model targeting a 94%+ success rate on Lean trading algorithm optimization tasks.
### Training Configuration
- **Base Model**: google/gemma-2-2b
- **Dataset**: Kronu/lean-expert-optimized-2000
- **Target Success Rate**: 94%+
- **Expected Performance**: 96% (94-98% range)
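The dataset can be pulled directly from the Hub. A minimal sketch; the `train` split name is an assumption:

```python
from datasets import load_dataset

# Pull the optimized training set from the Hub ("train" split assumed)
dataset = load_dataset("Kronu/lean-expert-optimized-2000", split="train")
print(dataset)
```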
### Key Optimizations
- **JSON Parsing Focus**: 1,333 examples (0% → 95% success target)
- **Enhanced LoRA**: rank=64, alpha=128
- **Optimized Training**: 12 epochs, 2e-4 learning rate
- **Advanced Configuration**: Gradient checkpointing, FP16 (see the configuration sketch below)
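A sketch of how these settings map onto `peft` and `transformers` configuration objects; the output directory name and anything not listed above are illustrative:

```python
from peft import LoraConfig
from transformers import TrainingArguments

# Enhanced LoRA adapter settings from the list above
lora_config = LoraConfig(
    r=64,              # LoRA rank
    lora_alpha=128,    # alpha = 2 * rank
    task_type="CAUSAL_LM",
)

# Optimized training hyperparameters from the list above
training_args = TrainingArguments(
    output_dir="gemma-2-2b-lean-expert-optimized",  # illustrative
    num_train_epochs=12,
    learning_rate=2e-4,
    gradient_checkpointing=True,
    fp16=True,
)
```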
### Training Instructions
To train this model using HuggingFace Jobs:
1. Set up your HuggingFace token as an environment variable (see the snippet below)
2. Run the training script: `python train.py`
3. Monitor training progress in the HuggingFace dashboard
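A minimal sketch of step 1, assuming the token is exported as `HF_TOKEN`:

```python
import os
from huggingface_hub import login

# Authenticate with the HuggingFace Hub before launching the job;
# assumes the token was exported as HF_TOKEN
login(token=os.environ["HF_TOKEN"])
```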
### Expected Results
- **Training Time**: 25-35 minutes
- **Cost**: $3-5
- **Final Model**: Kronu/gemma-2-2b-lean-expert-optimized
- **Success Rate**: 96% (94-98% range)
### Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model, then attach the trained LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b")
model = PeftModel.from_pretrained(base_model, "Kronu/gemma-2-2b-lean-expert-optimized")

# The tokenizer is shared with the base model
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b")
```
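With the adapter loaded, generation follows the standard `transformers` API. The prompt below is illustrative, not taken from the training data:

```python
# Illustrative prompt; replace with a real Lean optimization task
prompt = "Optimize the following Lean trading algorithm configuration:"
inputs = tokenizer(prompt, return_tensors="pt")

# Generate a completion from the fine-tuned model
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```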