Commit c336dbe (verified) by Kronu · Parent(s): d0c97d7

Add training documentation

Files changed (1): README.md (+46 -0)

# gemma-2-2b-lean-expert-optimized

## Optimized Gemma Model for 94%+ Success Rate

This repository contains the training configuration for an optimized Gemma-2-2B model targeting a 94%+ success rate on Lean trading algorithm optimization tasks.

### Training Configuration

- **Base Model**: google/gemma-2-2b
- **Dataset**: Kronu/lean-expert-optimized-2000
- **Target Success Rate**: 94%+
- **Expected Performance**: 96% (94-98% range)

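The base model and dataset listed above can be pulled in with `datasets` and `transformers`. The following is a minimal sketch, not the project's `train.py`; it assumes the dataset is accessible with your token:

```python
# Minimal sketch: load the base model and dataset listed above.
# Assumes network access and a valid HF token for any gated/private assets.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

dataset = load_dataset("Kronu/lean-expert-optimized-2000")
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b")
model = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b")
```
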
### Key Optimizations

- **JSON Parsing Focus**: 1,333 examples (0% → 95% success target)
- **Enhanced LoRA**: rank=64, alpha=128
- **Optimized Training**: 12 epochs, 2e-4 learning rate
- **Advanced Configuration**: Gradient checkpointing, FP16

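These settings map naturally onto `peft` and `transformers` objects. Below is a hedged sketch of how they might be expressed; `target_modules` and `output_dir` are illustrative assumptions, since `train.py` itself is not shown in this README:

```python
# Sketch of the optimizations above; hyperparameters are taken from this README,
# while target_modules and output_dir are illustrative assumptions.
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=64,                    # Enhanced LoRA: rank=64
    lora_alpha=128,          # alpha=128
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="gemma-2-2b-lean-expert-optimized",  # assumed
    num_train_epochs=12,          # 12 epochs
    learning_rate=2e-4,           # 2e-4 learning rate
    fp16=True,                    # FP16 mixed precision
    gradient_checkpointing=True,  # gradient checkpointing
)
```
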
### Training Instructions

To train this model using HuggingFace Jobs:

1. Set up your HuggingFace token as an environment variable
2. Run the training script: `python train.py`
3. Monitor training progress in the HuggingFace dashboard

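For steps 1 and 2, one way to set the token and launch the script from Python looks like this; `HF_TOKEN` is the environment variable `huggingface_hub` reads by default, and the token value is a placeholder:

```python
# Sketch of steps 1-2: export the token for this process, then run train.py.
# The token string is a placeholder; never commit a real token.
import os
import subprocess

os.environ["HF_TOKEN"] = "hf_..."  # replace with your own token
subprocess.run(["python", "train.py"], check=True)
```
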
### Expected Results

- **Training Time**: 25-35 minutes
- **Cost**: $3-5
- **Final Model**: Kronu/gemma-2-2b-lean-expert-optimized
- **Success Rate**: 96% (94-98% range)

### Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model, then attach the fine-tuned LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b")
model = PeftModel.from_pretrained(base_model, "Kronu/gemma-2-2b-lean-expert-optimized")
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b")
```

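A short inference call might then look like the following; the prompt and generation settings are illustrative only, not the evaluation setup behind the success-rate numbers above:

```python
# Illustrative generation call; prompt and settings are examples only.
prompt = "Optimize the JSON parsing in this Lean algorithm: ..."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```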