ARC-Challenge_Llama-3.2-1B-qarmbuc0

This model is a fine-tuned version of meta-llama/Llama-3.2-1B on an unknown dataset. It achieves the following results on the evaluation set (the figures match the epoch 3.0, step 48 row of the training results table):

Loss: 1.2949
Model Preparation Time: 0.0057
Mdl: 558.5579
Accumulated Loss: 387.1628
Correct Preds: 154.0
Total Preds: 299.0
Accuracy: 0.5151
Correct Gen Preds: 154.0
Gen Accuracy: 0.5151
Correct Gen Preds 32: 28.0
Correct Preds 32: 28.0
Total Labels 32: 64.0
Accuracy 32: 0.4375
Gen Accuracy 32: 0.4375
Correct Gen Preds 33: 42.0
Correct Preds 33: 42.0
Total Labels 33: 73.0
Accuracy 33: 0.5753
Gen Accuracy 33: 0.5753
Correct Gen Preds 34: 32.0
Correct Preds 34: 32.0
Total Labels 34: 78.0
Accuracy 34: 0.4103
Gen Accuracy 34: 0.4103
Correct Gen Preds 35: 52.0
Correct Preds 35: 52.0
Total Labels 35: 83.0
Accuracy 35: 0.6265
Gen Accuracy 35: 0.6265
Correct Gen Preds 36: 0.0
Correct Preds 36: 0.0
Total Labels 36: 1.0
Accuracy 36: 0.0
Gen Accuracy 36: 0.0
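
The card does not define its metrics, so the following is a minimal sketch that cross-checks the reported numbers against each other. Two relationships in it are assumptions inferred from the figures, not stated in the card: that Accumulated Loss is the summed per-example loss in nats (Loss times Total Preds), and that Mdl is the same quantity converted to bits. The numeric suffixes 32 to 36 are treated simply as label identifiers; what they denote (possibly label token IDs) is not documented.

```python
import math

# Values copied from the evaluation summary above.
loss = 1.2949
accumulated_loss = 387.1628
mdl = 558.5579
correct_preds = 154
total_preds = 299

# Per-label counts: label suffix -> (correct preds, total labels).
per_label = {
    32: (28, 64),
    33: (42, 73),
    34: (32, 78),
    35: (52, 83),
    36: (0, 1),
}

# Overall accuracy and per-label accuracy follow directly from the counts.
accuracy = correct_preds / total_preds
per_label_accuracy = {k: c / t for k, (c, t) in per_label.items()}

# The per-label counts should add up to the overall counts.
label_correct = sum(c for c, _ in per_label.values())
label_total = sum(t for _, t in per_label.values())

# Assumed relationships (inferred from the numbers, not documented):
mean_loss = accumulated_loss / total_preds          # nats per example
mdl_from_loss = accumulated_loss / math.log(2)      # nats converted to bits

print(f"accuracy            = {accuracy:.4f}")
print(f"label totals        = {label_correct}/{label_total}")
print(f"mean loss           = {mean_loss:.4f} (reported {loss})")
print(f"accumulated in bits = {mdl_from_loss:.4f} (reported {mdl})")
for label, acc in per_label_accuracy.items():
    print(f"accuracy {label}         = {acc:.4f}")
```

Under these assumptions every derived value agrees with the reported one to four decimal places, which suggests the per-label rows are a breakdown of the same 299 evaluation examples.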

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 64
eval_batch_size: 112
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.01
num_epochs: 100
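
To make the learning-rate settings above concrete, here is a minimal pure-Python sketch of a linear-warmup, cosine-decay schedule. It is an illustration under stated assumptions, not the training code: the 16 steps per epoch are read off the Step column of the training results table, the warmup length is taken as the ceiling of total steps times the warmup ratio, and the decay is assumed to be a single half cosine down to zero. `lr_at` is a hypothetical helper name.

```python
import math

# Values from the hyperparameter list above.
learning_rate = 2e-05
warmup_ratio = 0.01
num_epochs = 100

# Assumption: 16 optimizer steps per epoch (step 16 is logged at epoch 1.0).
steps_per_epoch = 16
total_steps = steps_per_epoch * num_epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)


def lr_at(step: int) -> float:
    """Linear warmup to the peak rate, then half-cosine decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return learning_rate * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


# Print the rate at a few points, including step 224 (the last logged step).
for step in (0, 8, 16, 224, 800, 1600):
    print(f"step {step:4d}: lr = {lr_at(step):.3e}")
```

If these assumptions hold, warmup covers roughly the first epoch, and by step 224 (the last row logged in the results table) the rate would still be close to its peak, since the schedule is spread over the full 100 configured epochs.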

Training results

Training Loss	Epoch	Step	Validation Loss	Model Preparation Time	Mdl	Accumulated Loss	Correct Preds	Total Preds	Accuracy	Correct Gen Preds	Gen Accuracy	Correct Gen Preds 32	Correct Preds 32	Total Labels 32	Accuracy 32	Gen Accuracy 32	Correct Gen Preds 33	Correct Preds 33	Total Labels 33	Accuracy 33	Gen Accuracy 33	Correct Gen Preds 34	Correct Preds 34	Total Labels 34	Accuracy 34	Gen Accuracy 34	Correct Gen Preds 35	Correct Preds 35	Total Labels 35	Accuracy 35	Gen Accuracy 35	Total Labels 36
No log	0	0	1.6389	0.0057	706.9523	490.0220	66.0	299.0	0.2207	66.0	0.2207	62.0	62.0	64.0	0.9688	0.9688	0.0	0.0	73.0	0.0	0.0	4.0	4.0	78.0	0.0513	0.0513	0.0	0.0	83.0	0.0	0.0	1.0
1.2184	1.0	16	1.2940	0.0057	558.1790	386.9002	119.0	299.0	0.3980	119.0	0.3980	27.0	27.0	64.0	0.4219	0.4219	28.0	28.0	73.0	0.3836	0.3836	50.0	50.0	78.0	0.6410	0.6410	14.0	14.0	83.0	0.1687	0.1687	1.0
1.1189	2.0	32	1.2804	0.0057	552.3351	382.8495	132.0	299.0	0.4415	127.0	0.4247	30.0	33.0	64.0	0.5156	0.4688	36.0	37.0	73.0	0.5068	0.4932	37.0	37.0	78.0	0.4744	0.4744	24.0	25.0	83.0	0.3012	0.2892	1.0
0.501	3.0	48	1.2949	0.0057	558.5579	387.1628	154.0	299.0	0.5151	154.0	0.5151	28.0	28.0	64.0	0.4375	0.4375	42.0	42.0	73.0	0.5753	0.5753	32.0	32.0	78.0	0.4103	0.4103	52.0	52.0	83.0	0.6265	0.6265	1.0
0.1061	4.0	64	1.9788	0.0057	853.5686	591.6487	144.0	299.0	0.4816	143.0	0.4783	31.0	31.0	64.0	0.4844	0.4844	43.0	43.0	73.0	0.5890	0.5890	34.0	34.0	78.0	0.4359	0.4359	35.0	36.0	83.0	0.4337	0.4217	1.0
0.0272	5.0	80	4.8284	0.0057	2082.8260	1443.7050	145.0	299.0	0.4849	142.0	0.4749	25.0	27.0	64.0	0.4219	0.3906	37.0	37.0	73.0	0.5068	0.5068	37.0	37.0	78.0	0.4744	0.4744	43.0	44.0	83.0	0.5301	0.5181	1.0
0.3673	6.0	96	5.1422	0.0057	2218.1877	1537.5306	154.0	299.0	0.5151	152.0	0.5084	21.0	21.0	64.0	0.3281	0.3281	35.0	36.0	73.0	0.4932	0.4795	49.0	50.0	78.0	0.6410	0.6282	47.0	47.0	83.0	0.5663	0.5663	1.0
0.0005	7.0	112	4.6677	0.0057	2013.5025	1395.6536	148.0	299.0	0.4950	148.0	0.4950	37.0	37.0	64.0	0.5781	0.5781	38.0	38.0	73.0	0.5205	0.5205	35.0	35.0	78.0	0.4487	0.4487	38.0	38.0	83.0	0.4578	0.4578	1.0
0.0002	8.0	128	4.0369	0.0057	1741.3601	1207.0189	153.0	299.0	0.5117	6.0	0.0201	1.0	36.0	64.0	0.5625	0.0156	2.0	35.0	73.0	0.4795	0.0274	3.0	39.0	78.0	0.5	0.0385	0.0	43.0	83.0	0.5181	0.0	1.0
0.0002	9.0	144	4.5777	0.0057	1974.6583	1368.7289	148.0	299.0	0.4950	134.0	0.4482	23.0	35.0	64.0	0.5469	0.3594	38.0	39.0	73.0	0.5342	0.5205	39.0	39.0	78.0	0.5	0.5	34.0	35.0	83.0	0.4217	0.4096	1.0
0.0002	10.0	160	4.7118	0.0057	2032.5144	1408.8316	145.0	299.0	0.4849	144.0	0.4816	31.0	32.0	64.0	0.5	0.4844	36.0	36.0	73.0	0.4932	0.4932	39.0	39.0	78.0	0.5	0.5	38.0	38.0	83.0	0.4578	0.4578	1.0
0.0	11.0	176	5.1373	0.0057	2216.0534	1536.0512	146.0	299.0	0.4883	144.0	0.4816	30.0	31.0	64.0	0.4844	0.4688	35.0	35.0	73.0	0.4795	0.4795	40.0	40.0	78.0	0.5128	0.5128	39.0	40.0	83.0	0.4819	0.4699	1.0
0.0001	12.0	192	5.2735	0.0057	2274.8032	1576.7734	144.0	299.0	0.4816	143.0	0.4783	29.0	30.0	64.0	0.4688	0.4531	36.0	36.0	73.0	0.4932	0.4932	39.0	39.0	78.0	0.5	0.5	39.0	39.0	83.0	0.4699	0.4699	1.0
0.0	13.0	208	5.2806	0.0057	2277.8757	1578.9031	144.0	299.0	0.4816	143.0	0.4783	29.0	30.0	64.0	0.4688	0.4531	36.0	36.0	73.0	0.4932	0.4932	39.0	39.0	78.0	0.5	0.5	39.0	39.0	83.0	0.4699	0.4699	1.0
0.0	14.0	224	5.3066	0.0057	2289.0982	1586.6820	146.0	299.0	0.4883	145.0	0.4849	29.0	30.0	64.0	0.4688	0.4531	36.0	36.0	73.0	0.4932	0.4932	40.0	40.0	78.0	0.5128	0.5128	40.0	40.0	83.0	0.4819	0.4819	1.0
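
The table above can be scanned programmatically to compare checkpoints. The sketch below copies a few columns from each row (epoch, validation loss, correct predictions, correct generated predictions) and picks out the epoch with the lowest validation loss, the first epoch with the most correct predictions, and the epoch where the two prediction counts diverge most. The selection criteria are illustrative assumptions; the card does not say how the reported checkpoint was chosen.

```python
# (epoch, validation loss, correct preds, correct gen preds), copied from the table.
rows = [
    (0, 1.6389, 66, 66),
    (1, 1.2940, 119, 119),
    (2, 1.2804, 132, 127),
    (3, 1.2949, 154, 154),
    (4, 1.9788, 144, 143),
    (5, 4.8284, 145, 142),
    (6, 5.1422, 154, 152),
    (7, 4.6677, 148, 148),
    (8, 4.0369, 153, 6),
    (9, 4.5777, 148, 134),
    (10, 4.7118, 145, 144),
    (11, 5.1373, 146, 144),
    (12, 5.2735, 144, 143),
    (13, 5.2806, 144, 143),
    (14, 5.3066, 146, 145),
]
total_preds = 299

# Lowest validation loss.
best_loss_row = min(rows, key=lambda r: r[1])

# Most correct predictions; max() keeps the first row on ties.
best_acc_row = max(rows, key=lambda r: r[2])

# Largest gap between scored predictions and generated predictions.
largest_gap_row = max(rows, key=lambda r: r[2] - r[3])

print(f"lowest validation loss: epoch {best_loss_row[0]} ({best_loss_row[1]})")
print(f"highest accuracy: epoch {best_acc_row[0]} "
      f"({best_acc_row[2] / total_preds:.4f})")
print(f"largest preds/gen gap: epoch {largest_gap_row[0]} "
      f"({largest_gap_row[2] - largest_gap_row[3]} examples)")
```

On this data the lowest validation loss falls at epoch 2 while the highest accuracy is first reached at epoch 3 (and matched at epoch 6), and validation loss rises sharply from epoch 4 onward as training loss approaches zero. Epoch 8 stands out because only 6 generated predictions are correct against 153 correct scored predictions.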

Framework versions

Transformers 4.51.3
PyTorch 2.6.0+cu124
Datasets 3.5.0
Tokenizers 0.21.1

Safetensors

Model size: 1B params
Tensor type: BF16

Model tree for donoway/ARC-Challenge_Llama-3.2-1B-qarmbuc0

Base model: meta-llama/Llama-3.2-1B