# BoolQ_Llama-3.2-1B-9pmfiw8i
This model is a fine-tuned version of [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) on the BoolQ dataset. It achieves the following results on the evaluation set (a short note on how these metrics relate to each other follows the list):
- Loss: 0.4059
- Model Preparation Time: 0.006 s
- Mdl: 1914.7894
- Accumulated Loss: 1327.2309
- Correct Preds: 2774.0
- Total Preds: 3270.0
- Accuracy: 0.8483
- Correct Gen Preds: 2621.0
- Gen Accuracy: 0.8015
- Correct Gen Preds 9642: 1695.0
- Correct Preds 9642: 1778.0
- Total Labels 9642: 2026.0
- Accuracy 9642: 0.8776
- Gen Accuracy 9642: 0.8366
- Correct Gen Preds 2822: 916.0
- Correct Preds 2822: 996.0
- Total Labels 2822: 1231.0
- Accuracy 2822: 0.8091
- Gen Accuracy 2822: 0.7441
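The metrics suffixed with 9642 and 2822 break the totals down by answer class; the suffixes are most likely the token IDs of the two answer labels, though the card does not document this. The summary metrics are internally consistent: accuracy is correct predictions over total predictions, the accumulated loss is the mean per-example loss summed over the evaluation set, and the MDL value is that sum converted from nats to bits. A minimal sketch verifying these relationships (variable names are illustrative, not taken from the training script):

```python
import math

# Headline evaluation metrics copied from the card.
loss = 0.4059                 # mean per-example cross-entropy, in nats
accumulated_loss = 1327.2309  # summed cross-entropy over the eval set, in nats
mdl = 1914.7894               # minimum description length, in bits
correct_preds, total_preds = 2774, 3270

# Accuracy = correct predictions / total predictions.
print(correct_preds / total_preds)     # ~0.8483

# Accumulated loss ~= mean loss * number of evaluation examples.
print(loss * total_preds)              # ~1327.3

# MDL (bits) = accumulated loss (nats) / ln 2.
print(accumulated_loss / math.log(2))  # ~1914.79
```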
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a sketch mapping them to `TrainingArguments` follows the list):
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 120
- seed: 42
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.01
- num_epochs: 100
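These values follow the naming conventions of the Hugging Face `transformers` Trainer. A minimal sketch of how they might map onto `TrainingArguments` (the actual training script is not part of this card, so treat this reconstruction as an assumption):

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the configuration listed above.
training_args = TrainingArguments(
    output_dir="BoolQ_Llama-3.2-1B-9pmfiw8i",
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=120,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.01,
    num_train_epochs=100,
)
```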
### Training results
| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 9642 | Correct Preds 9642 | Total Labels 9642 | Accuracy 9642 | Gen Accuracy 9642 | Correct Gen Preds 2822 | Correct Preds 2822 | Total Labels 2822 | Accuracy 2822 | Gen Accuracy 2822 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 0.7080 | 0.006 | 3339.8933 | 2315.0376 | 2032.0 | 3270.0 | 0.6214 | 2040.0 | 0.6239 | 2007.0 | 2008.0 | 2026.0 | 0.9911 | 0.9906 | 24.0 | 24.0 | 1231.0 | 0.0195 | 0.0195 |
| 0.3127 | 1.0 | 295 | 0.4749 | 0.006 | 2240.5980 | 1553.0642 | 2632.0 | 3270.0 | 0.8049 | 2386.0 | 0.7297 | 1628.0 | 1712.0 | 2026.0 | 0.8450 | 0.8036 | 749.0 | 920.0 | 1231.0 | 0.7474 | 0.6084 |
| 0.3406 | 2.0 | 590 | 0.4059 | 0.006 | 1914.7894 | 1327.2309 | 2774.0 | 3270.0 | 0.8483 | 2621.0 | 0.8015 | 1695.0 | 1778.0 | 2026.0 | 0.8776 | 0.8366 | 916.0 | 996.0 | 1231.0 | 0.8091 | 0.7441 |
| 0.0461 | 3.0 | 885 | 0.7790 | 0.006 | 3675.0574 | 2547.3556 | 2700.0 | 3270.0 | 0.8257 | 2591.0 | 0.7924 | 1682.0 | 1709.0 | 2026.0 | 0.8435 | 0.8302 | 901.0 | 991.0 | 1231.0 | 0.8050 | 0.7319 |
| 0.2643 | 4.0 | 1180 | 1.1030 | 0.006 | 5203.4628 | 3606.7656 | 2706.0 | 3270.0 | 0.8275 | 2694.0 | 0.8239 | 1868.0 | 1876.0 | 2026.0 | 0.9260 | 0.9220 | 820.0 | 830.0 | 1231.0 | 0.6742 | 0.6661 |
| 0.0001 | 5.0 | 1475 | 1.0459 | 0.006 | 4934.3817 | 3420.2528 | 2767.0 | 3270.0 | 0.8462 | 2751.0 | 0.8413 | 1807.0 | 1814.0 | 2026.0 | 0.8954 | 0.8919 | 937.0 | 953.0 | 1231.0 | 0.7742 | 0.7612 |
| 0.0003 | 6.0 | 1770 | 1.0837 | 0.006 | 5112.5263 | 3543.7332 | 2769.0 | 3270.0 | 0.8468 | 2767.0 | 0.8462 | 1790.0 | 1798.0 | 2026.0 | 0.8875 | 0.8835 | 969.0 | 971.0 | 1231.0 | 0.7888 | 0.7872 |
| 0.0 | 7.0 | 2065 | 1.3123 | 0.006 | 6190.9951 | 4291.2708 | 2763.0 | 3270.0 | 0.8450 | 2768.0 | 0.8465 | 1799.0 | 1799.0 | 2026.0 | 0.8880 | 0.8880 | 963.0 | 964.0 | 1231.0 | 0.7831 | 0.7823 |
| 0.0 | 8.0 | 2360 | 1.2884 | 0.006 | 6078.2692 | 4213.1351 | 2764.0 | 3270.0 | 0.8453 | 2770.0 | 0.8471 | 1820.0 | 1820.0 | 2026.0 | 0.8983 | 0.8983 | 944.0 | 944.0 | 1231.0 | 0.7669 | 0.7669 |
| 0.6191 | 9.0 | 2655 | 1.3592 | 0.006 | 6411.9574 | 4444.4302 | 2766.0 | 3270.0 | 0.8459 | 2771.0 | 0.8474 | 1821.0 | 1821.0 | 2026.0 | 0.8988 | 0.8988 | 944.0 | 945.0 | 1231.0 | 0.7677 | 0.7669 |
| 0.0 | 10.0 | 2950 | 1.3659 | 0.006 | 6443.7488 | 4466.4663 | 2768.0 | 3270.0 | 0.8465 | 2772.0 | 0.8477 | 1824.0 | 1824.0 | 2026.0 | 0.9003 | 0.9003 | 942.0 | 944.0 | 1231.0 | 0.7669 | 0.7652 |
| 0.0 | 11.0 | 3245 | 1.3656 | 0.006 | 6442.3540 | 4465.4995 | 2763.0 | 3270.0 | 0.8450 | 2768.0 | 0.8465 | 1817.0 | 1817.0 | 2026.0 | 0.8968 | 0.8968 | 945.0 | 946.0 | 1231.0 | 0.7685 | 0.7677 |
| 0.0 | 12.0 | 3540 | 1.3850 | 0.006 | 6533.8980 | 4528.9530 | 2761.0 | 3270.0 | 0.8443 | 2766.0 | 0.8459 | 1810.0 | 1810.0 | 2026.0 | 0.8934 | 0.8934 | 950.0 | 951.0 | 1231.0 | 0.7725 | 0.7717 |
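The headline metrics at the top of this card match the epoch-2 row, which has the lowest validation loss (0.4059); although num_epochs was set to 100, logging stops after epoch 12, so training appears to have been stopped early. A minimal inference sketch for the released checkpoint (the prompt template used during fine-tuning is not documented, so the format below is an assumption):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "donoway/BoolQ_Llama-3.2-1B-9pmfiw8i"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Hypothetical BoolQ-style prompt; the actual training template is unknown.
prompt = (
    "Passage: The Llama models are open-weight language models released by Meta.\n"
    "Question: are llama models released by meta?\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=3)

# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```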
### Framework versions
- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1