# gemma-2-2b-lean-expert-optimized

## Optimized Gemma Model for a 94%+ Success Rate

This repository contains the training configuration for an optimized Gemma-2-2B model targeting a 94%+ success rate on Lean trading algorithm optimization tasks.
### Training Configuration

- **Base Model**: google/gemma-2-2b
- **Dataset**: Kronu/lean-expert-optimized-2000
- **Target Success Rate**: 94%+
- **Expected Performance**: 96% (94-98% range)
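For a quick sanity check before training, the dataset can be pulled directly from the Hub. Its column schema is not documented in this README, so the sketch below inspects it rather than assuming one:

```python
from datasets import load_dataset

# Download the training dataset from the HuggingFace Hub
dataset = load_dataset("Kronu/lean-expert-optimized-2000")

# Inspect splits, columns, and a sample row (the "train" split name
# and schema are assumptions; verify them before training)
print(dataset)
print(dataset["train"][0])
```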
### Key Optimizations

- **JSON Parsing Focus**: 1,333 examples dedicated to JSON parsing (0% → 95% success target)
- **Enhanced LoRA**: rank=64, alpha=128
- **Optimized Training**: 12 epochs, 2e-4 learning rate
- **Advanced Configuration**: gradient checkpointing, FP16 (these hyperparameters are sketched in code after this list)
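As a rough illustration, here is how the settings above map onto `peft` and `trl` configuration objects. The `target_modules` list, batch size, and output directory are assumptions, not values stated in this README:

```python
from peft import LoraConfig
from trl import SFTConfig

# LoRA adapter matching "rank=64, alpha=128" (target_modules is an
# assumption covering Gemma's attention projections)
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# Training settings matching "12 epochs, 2e-4 learning rate,
# gradient checkpointing, FP16" (batch size is an assumption)
training_args = SFTConfig(
    output_dir="gemma-2-2b-lean-expert-optimized",
    num_train_epochs=12,
    learning_rate=2e-4,
    per_device_train_batch_size=4,
    gradient_checkpointing=True,
    fp16=True,
)
```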
### Training Instructions

To train this model using HuggingFace Jobs:

1. Set your HuggingFace token as an environment variable (e.g., `export HF_TOKEN=...`)
2. Run the training script: `python train.py` (a sketch of such a script follows this list)
3. Monitor training progress in the HuggingFace dashboard
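The repository's actual `train.py` is not reproduced in this README. As a hypothetical outline, a script wiring the `lora_config` and `training_args` above into TRL's `SFTTrainer` could look like the following; the split name is an assumption, and the `trl` API surface varies across versions:

```python
import os

from datasets import load_dataset
from trl import SFTTrainer

# huggingface_hub reads the HF_TOKEN environment variable automatically,
# so step 1 above is sufficient for Hub authentication
assert "HF_TOKEN" in os.environ, "export HF_TOKEN=<your token> first"

dataset = load_dataset("Kronu/lean-expert-optimized-2000", split="train")

trainer = SFTTrainer(
    model="google/gemma-2-2b",   # base model, loaded by name
    train_dataset=dataset,
    peft_config=lora_config,     # rank=64 / alpha=128 adapter from above
    args=training_args,          # 12 epochs, 2e-4 LR, FP16, checkpointing
)
trainer.train()
trainer.push_to_hub()            # publish the trained adapter to the Hub
```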
### Expected Results

- **Training Time**: 25-35 minutes
- **Cost**: $3-5
- **Final Model**: Kronu/gemma-2-2b-lean-expert-optimized
- **Success Rate**: 96% expected (94-98% range)
### Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model, then attach the trained LoRA adapter on top of it
base_model = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b")
model = PeftModel.from_pretrained(base_model, "Kronu/gemma-2-2b-lean-expert-optimized")
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b")
```
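Once loaded, the adapter-wrapped model behaves like any causal LM. A minimal generation example (the prompt below is a placeholder, not taken from the training data):

```python
prompt = "Optimize the parameters of this Lean algorithm: ..."  # hypothetical prompt
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```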