collapse_gemma-2-2b_hs2_replace_iter5_sftsd0

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. At the last logged evaluation (step 95, epoch 0.97) it reaches a validation loss of 2.3465 after 5,007,824 input tokens seen.

Model description

More information needed

The following hyperparameters were used during training:

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3911	0
1.3455	0.0511	5	1.2857	262304
0.9763	0.1021	10	1.3002	523352
0.5173	0.1532	15	1.5332	776912
0.2812	0.2042	20	1.6926	1042808
0.1537	0.2553	25	1.8953	1300952
0.11	0.3063	30	2.0821	1569352
0.0495	0.3574	35	2.2281	1836080
0.0399	0.4084	40	2.3107	2098672
0.0381	0.4595	45	2.3340	2365808
0.0406	0.5105	50	2.3116	2628776
0.0306	0.5616	55	2.2697	2892136
0.0255	0.6126	60	2.2541	3148400
0.0352	0.6637	65	2.2663	3421504
0.0288	0.7147	70	2.2875	3686808
0.0261	0.7658	75	2.3035	3950768
0.0235	0.8168	80	2.3276	4211904
0.0254	0.8679	85	2.3407	4481312
0.0228	0.9190	90	2.3446	4748856
0.0264	0.9700	95	2.3465	5007824
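
The table can be read programmatically to see where validation loss bottoms out. The following is a minimal sketch, not part of the original training code: the rows are copied from the table above, and the helper names `best_step` and `generalization_gap` are hypothetical, introduced only for illustration.

```python
# Rows copied from the training-results table:
# (training_loss, epoch, step, validation_loss, input_tokens_seen).
# training_loss is None where the table says "No log".
LOG = [
    (None,   0.0,    0,  1.3911, 0),
    (1.3455, 0.0511, 5,  1.2857, 262304),
    (0.9763, 0.1021, 10, 1.3002, 523352),
    (0.5173, 0.1532, 15, 1.5332, 776912),
    (0.2812, 0.2042, 20, 1.6926, 1042808),
    (0.1537, 0.2553, 25, 1.8953, 1300952),
    (0.11,   0.3063, 30, 2.0821, 1569352),
    (0.0495, 0.3574, 35, 2.2281, 1836080),
    (0.0399, 0.4084, 40, 2.3107, 2098672),
    (0.0381, 0.4595, 45, 2.3340, 2365808),
    (0.0406, 0.5105, 50, 2.3116, 2628776),
    (0.0306, 0.5616, 55, 2.2697, 2892136),
    (0.0255, 0.6126, 60, 2.2541, 3148400),
    (0.0352, 0.6637, 65, 2.2663, 3421504),
    (0.0288, 0.7147, 70, 2.2875, 3686808),
    (0.0261, 0.7658, 75, 2.3035, 3950768),
    (0.0235, 0.8168, 80, 2.3276, 4211904),
    (0.0254, 0.8679, 85, 2.3407, 4481312),
    (0.0228, 0.9190, 90, 2.3446, 4748856),
    (0.0264, 0.9700, 95, 2.3465, 5007824),
]


def best_step(log):
    """Return the logged row with the lowest validation loss."""
    return min(log, key=lambda row: row[3])


def generalization_gap(row):
    """Validation loss minus training loss; None if no training loss was logged."""
    train_loss, _, _, val_loss, _ = row
    return None if train_loss is None else val_loss - train_loss


best = best_step(LOG)
last = LOG[-1]
print(f"lowest validation loss: {best[3]} at step {best[2]}")
print(f"gap at last logged step: {generalization_gap(last):.4f}")
```

On the logged values, the lowest validation loss is at step 5. From step 10 onward validation loss climbs while training loss keeps falling, which is the usual signature of overfitting.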

Safetensors: model size 3B params, tensor type BF16
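
As a rough guide to what these figures imply, the memory needed for the weights alone can be estimated from the parameter count and the tensor type. This is a back-of-the-envelope sketch: it assumes the rounded "3B params" figure listed here (not an exact count), 2 bytes per BF16 parameter, and ignores activations, KV cache, and framework overhead. The helper `weight_bytes` is hypothetical.

```python
def weight_bytes(num_params: float, bytes_per_param: int = 2) -> float:
    """Bytes needed to hold the weights alone (BF16 stores 2 bytes per parameter)."""
    return num_params * bytes_per_param


approx_params = 3e9  # assumption: rounded "3B params" figure, not an exact count
gib = weight_bytes(approx_params) / 2**30
print(f"~{gib:.1f} GiB for weights in BF16")
```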
