This is a finetune of Gemma-3-1b-it trained on the Erudite-V2 dataset.

The dataset is designed to improve Gemma-3-1b's performance on benchmarks such as MMLU and HumanEval.


  • Model training loss:

    Run history: the loss fell sharply early in training and then plateaued; the learning rate warmed up briefly and decayed to zero by the final step.

    Run summary:

    - total_flos: 1.092006641664e+18
    - train/epoch: 1
    - train/global_step: 3907
    - train/grad_norm: 0.21461
    - train/learning_rate: 0.0
    - train/loss: 0.7482
    - train_loss: 0.81085
    - train_runtime: 15509.2958 s
    - train_samples_per_second: 16.119
    - train_steps_per_second: 0.252
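As a quick sanity check, the throughput numbers in the run summary are mutually consistent. Note the effective batch size below is an inference from the reported rates, not a value stated anywhere in this card:

```python
# Cross-check the run summary numbers (all values copied from above).
steps = 3907                 # train/global_step
runtime_s = 15509.2958       # train_runtime (seconds)
samples_per_s = 16.119       # train_samples_per_second
steps_per_s = 0.252          # train_steps_per_second

# steps / runtime should reproduce the reported steps-per-second
assert abs(steps / runtime_s - steps_per_s) < 1e-3

# samples-per-second / steps-per-second gives the effective batch size
# (inferred -- the batch size itself is not reported in the summary)
effective_batch = round(samples_per_s / steps_per_s)
print(effective_batch)  # → 64

# approximate number of samples seen in the single epoch (~250k)
print(round(samples_per_s * runtime_s))
```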

Model size: 1.0B params (Safetensors, BF16)
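As a Gemma-3-1b-it finetune, the model should inherit the standard Gemma chat format, and with the `transformers` library it can be loaded by model id (`Stormtrooperaim/Erudite-V2-1b`) via the usual text-generation pipeline. The sketch below shows the Gemma turn layout as a plain-Python formatter, assuming the standard Gemma conventions; in practice you would call `tokenizer.apply_chat_template(messages, add_generation_prompt=True)` instead:

```python
# Sketch of the Gemma-style chat turn format this finetune is assumed to
# inherit from Gemma-3-1b-it.  This pure-Python version only illustrates
# the layout; use tokenizer.apply_chat_template for real inference.
def format_gemma_chat(messages):
    """Render a list of {role, content} dicts as a Gemma-style prompt."""
    parts = []
    for m in messages:
        # Gemma uses "model" rather than "assistant" for the reply role.
        role = "model" if m["role"] == "assistant" else m["role"]
        parts.append(f"<start_of_turn>{role}\n{m['content']}<end_of_turn>\n")
    parts.append("<start_of_turn>model\n")  # cue the model to respond
    return "".join(parts)

prompt = format_gemma_chat(
    [{"role": "user", "content": "Briefly explain what MMLU measures."}]
)
print(prompt)
```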
