collapse_gemma-2-2b_hs2_replace_iter3_sftsd0

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. Its evaluation-set results are reported in the table of training results below.

Model description

More information needed

The following hyperparameters were used during training:

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3911	0
1.4866	0.0522	5	1.2772	268824
1.0724	0.1044	10	1.2542	534336
0.7505	0.1567	15	1.3935	806512
0.4711	0.2089	20	1.5034	1072320
0.3516	0.2611	25	1.6782	1347248
0.2276	0.3133	30	1.7960	1629728
0.0914	0.3655	35	1.9368	1898144
0.0729	0.4178	40	1.9806	2160104
0.0788	0.4700	45	2.0355	2431552
0.047	0.5222	50	2.0447	2702376
0.0378	0.5744	55	2.0477	2977136
0.0651	0.6266	60	2.0250	3244952
0.0406	0.6789	65	2.0630	3517952
0.0353	0.7311	70	2.0337	3785248
0.0367	0.7833	75	2.1002	4058328
0.0314	0.8355	80	2.1372	4327344
0.0297	0.8877	85	2.1233	4590592
0.0375	0.9399	90	2.1327	4856992
0.0392	0.9922	95	2.1938	5128680
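
In the table above, validation loss reaches its minimum early and then rises while training loss keeps falling. The following is a minimal sketch of how one might locate the lowest-validation-loss step, assuming the rows are transcribed by hand into a Python list. The names `ROWS` and `best_validation_step` are illustrative and are not part of the original training code.

```python
# Rows transcribed from the table above: (step, training_loss, validation_loss).
# The step-0 row has no logged training loss, so it is stored as None.
ROWS = [
    (0, None, 1.3911),
    (5, 1.4866, 1.2772),
    (10, 1.0724, 1.2542),
    (15, 0.7505, 1.3935),
    (20, 0.4711, 1.5034),
    (25, 0.3516, 1.6782),
    (30, 0.2276, 1.7960),
    (35, 0.0914, 1.9368),
    (40, 0.0729, 1.9806),
    (45, 0.0788, 2.0355),
    (50, 0.047, 2.0447),
    (55, 0.0378, 2.0477),
    (60, 0.0651, 2.0250),
    (65, 0.0406, 2.0630),
    (70, 0.0353, 2.0337),
    (75, 0.0367, 2.1002),
    (80, 0.0314, 2.1372),
    (85, 0.0297, 2.1233),
    (90, 0.0375, 2.1327),
    (95, 0.0392, 2.1938),
]


def best_validation_step(rows):
    """Return (step, validation_loss) for the row with the lowest validation loss."""
    step, _train_loss, val_loss = min(rows, key=lambda row: row[2])
    return step, val_loss


best_step, best_val = best_validation_step(ROWS)
print(f"Lowest validation loss {best_val} at step {best_step}")
```

With the rows above, the minimum is the step-10 entry (validation loss 1.2542). Every later evaluation is higher, ending at 2.1938 at step 95.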

Safetensors

Model size: 3B params

Tensor type: BF16
