# gemma-2-2b-lean-expert-optimized

## Optimized Gemma Model for a 94%+ Success Rate

This repository contains the training configuration for an optimized Gemma-2-2B model targeting a 94%+ success rate on Lean trading algorithm optimization tasks.
### Training Configuration

- **Base Model**: google/gemma-2-2b
- **Dataset**: Kronu/lean-expert-optimized-2000
- **Target Success Rate**: 94%+
- **Expected Performance**: 96% (94-98% range)
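For a quick sanity check before training, the dataset can be pulled directly from the Hub. Its column schema is not documented in this README, so the sketch below inspects it rather than assuming one:

```python
from datasets import load_dataset

# Download the training dataset from the HuggingFace Hub
dataset = load_dataset("Kronu/lean-expert-optimized-2000")

# Inspect splits, columns, and a sample row (the "train" split name
# and schema are assumptions; verify them before training)
print(dataset)
print(dataset["train"][0])
```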
### Key Optimizations

- **JSON Parsing Focus**: 1,333 examples dedicated to JSON parsing (0% → 95% success target)
- **Enhanced LoRA**: rank=64, alpha=128
- **Optimized Training**: 12 epochs, 2e-4 learning rate
- **Advanced Configuration**: gradient checkpointing, FP16 (these hyperparameters are sketched in code after this list)
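As a rough illustration, here is how the settings above map onto `peft` and `trl` configuration objects. The `target_modules` list, batch size, and output directory are assumptions, not values stated in this README:

```python
from peft import LoraConfig
from trl import SFTConfig

# LoRA adapter matching "rank=64, alpha=128" (target_modules is an
# assumption covering Gemma's attention projections)
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# Training settings matching "12 epochs, 2e-4 learning rate,
# gradient checkpointing, FP16" (batch size is an assumption)
training_args = SFTConfig(
    output_dir="gemma-2-2b-lean-expert-optimized",
    num_train_epochs=12,
    learning_rate=2e-4,
    per_device_train_batch_size=4,
    gradient_checkpointing=True,
    fp16=True,
)
```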
### Training Instructions

To train this model using HuggingFace Jobs:

1. Set your HuggingFace token as an environment variable (e.g., `export HF_TOKEN=...`)
2. Run the training script: `python train.py` (a sketch of such a script follows this list)
3. Monitor training progress in the HuggingFace dashboard
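The repository's actual `train.py` is not reproduced in this README. As a hypothetical outline, a script wiring the `lora_config` and `training_args` above into TRL's `SFTTrainer` could look like the following; the split name is an assumption, and the `trl` API surface varies across versions:

```python
import os

from datasets import load_dataset
from trl import SFTTrainer

# huggingface_hub reads the HF_TOKEN environment variable automatically,
# so step 1 above is sufficient for Hub authentication
assert "HF_TOKEN" in os.environ, "export HF_TOKEN=<your token> first"

dataset = load_dataset("Kronu/lean-expert-optimized-2000", split="train")

trainer = SFTTrainer(
    model="google/gemma-2-2b",   # base model, loaded by name
    train_dataset=dataset,
    peft_config=lora_config,     # rank=64 / alpha=128 adapter from above
    args=training_args,          # 12 epochs, 2e-4 LR, FP16, checkpointing
)
trainer.train()
trainer.push_to_hub()            # publish the trained adapter to the Hub
```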
### Expected Results

- **Training Time**: 25-35 minutes
- **Cost**: $3-5
- **Final Model**: Kronu/gemma-2-2b-lean-expert-optimized
- **Success Rate**: 96% expected (94-98% range)
### Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model, then attach the trained LoRA adapter on top of it
base_model = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b")
model = PeftModel.from_pretrained(base_model, "Kronu/gemma-2-2b-lean-expert-optimized")
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b")
```
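Once loaded, the adapter-wrapped model behaves like any causal LM. A minimal generation example (the prompt below is a placeholder, not taken from the training data):

```python
prompt = "Optimize the parameters of this Lean algorithm: ..."  # hypothetical prompt
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```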