collapse_gemma-2-2b_hs2_replace_iter4_sftsd0

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. Its validation loss on the evaluation set is logged in the training results table below.

Model description

More information needed

The following hyperparameters were used during training:

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3911	0
1.3639	0.0513	5	1.2816	266328
0.9026	0.1026	10	1.2842	538736
0.5281	0.1539	15	1.4763	802240
0.3933	0.2053	20	1.6553	1073368
0.1558	0.2566	25	1.8315	1343640
0.1047	0.3079	30	2.0076	1610040
0.0801	0.3592	35	2.0993	1880904
0.0427	0.4105	40	2.1617	2153168
0.0456	0.4618	45	2.2009	2424248
0.0346	0.5131	50	2.2028	2685352
0.0381	0.5645	55	2.1740	2958872
0.0319	0.6158	60	2.1077	3226672
0.0335	0.6671	65	2.1135	3489056
0.0269	0.7184	70	2.1519	3745968
0.049	0.7697	75	2.1504	4019584
0.0293	0.8210	80	2.1273	4287384
0.0288	0.8724	85	2.1374	4549544
0.0344	0.9237	90	2.1453	4809152
0.0241	0.9750	95	2.1850	5074608
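
The table above can be summarized with a short script. The following is a minimal sketch, not part of the original training code: it copies the logged rows into a list and reports the step with the lowest validation loss and the average number of input tokens consumed per optimizer step. The names `ROWS`, `best_checkpoint`, and `tokens_per_step` are hypothetical helpers introduced for illustration.

```python
# Rows copied from the training-results table:
# (training_loss, epoch, step, validation_loss, input_tokens_seen)
# The first row has no training loss ("No log"), represented here as None.
ROWS = [
    (None,   0.0,    0,  1.3911, 0),
    (1.3639, 0.0513, 5,  1.2816, 266328),
    (0.9026, 0.1026, 10, 1.2842, 538736),
    (0.5281, 0.1539, 15, 1.4763, 802240),
    (0.3933, 0.2053, 20, 1.6553, 1073368),
    (0.1558, 0.2566, 25, 1.8315, 1343640),
    (0.1047, 0.3079, 30, 2.0076, 1610040),
    (0.0801, 0.3592, 35, 2.0993, 1880904),
    (0.0427, 0.4105, 40, 2.1617, 2153168),
    (0.0456, 0.4618, 45, 2.2009, 2424248),
    (0.0346, 0.5131, 50, 2.2028, 2685352),
    (0.0381, 0.5645, 55, 2.1740, 2958872),
    (0.0319, 0.6158, 60, 2.1077, 3226672),
    (0.0335, 0.6671, 65, 2.1135, 3489056),
    (0.0269, 0.7184, 70, 2.1519, 3745968),
    (0.049,  0.7697, 75, 2.1504, 4019584),
    (0.0293, 0.8210, 80, 2.1273, 4287384),
    (0.0288, 0.8724, 85, 2.1374, 4549544),
    (0.0344, 0.9237, 90, 2.1453, 4809152),
    (0.0241, 0.9750, 95, 2.1850, 5074608),
]


def best_checkpoint(rows):
    """Return the logged row with the lowest validation loss."""
    return min(rows, key=lambda row: row[3])


def tokens_per_step(rows):
    """Average input tokens per step, computed from the last logged row."""
    _, _, step, _, tokens = rows[-1]
    return tokens / step


best = best_checkpoint(ROWS)
print(f"lowest validation loss {best[3]} at step {best[2]}")
print(f"about {tokens_per_step(ROWS):.0f} input tokens per step")
```

In these logs the validation loss is lowest at the first logged step after the start of training and rises afterwards while the training loss keeps falling, which is the usual signature of overfitting.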

Safetensors

Model size: 3B params

Tensor type: BF16
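
As a rough guide, the model size and tensor type listed above give a lower bound on the memory needed to hold the weights. The sketch below assumes the rounded figure of 3B parameters shown on the card and 2 bytes per BF16 value; it counts weights only and ignores activations, the KV cache, and optimizer state. `BYTES_PER_PARAM` and `weight_bytes` are hypothetical names used for illustration.

```python
# Bytes used by one parameter for a few common tensor types.
BYTES_PER_PARAM = {"bf16": 2, "fp16": 2, "fp32": 4}


def weight_bytes(num_params, dtype="bf16"):
    """Estimate the bytes needed to store the weights alone."""
    return num_params * BYTES_PER_PARAM[dtype]


# Assumption: the card's rounded "3B params" figure, not an exact count.
approx_params = 3_000_000_000
print(f"{weight_bytes(approx_params) / 2**30:.1f} GiB for BF16 weights")
```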
