outputs

This model is a fine-tuned version of unsloth/llama-3.1-8b-instruct-unsloth-bnb-4bit on an unknown dataset. The last logged result on the evaluation set is a validation loss of 1.3917 at step 290 (epoch 2.9531).

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 2
eval_batch_size: 8
seed: 3407
gradient_accumulation_steps: 16
total_train_batch_size: 32
optimizer: adamw_8bit with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.05
num_epochs: 3
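
The hyperparameters above relate to each other through simple arithmetic, which the following minimal sketch spells out. It assumes a single training device (the card does not state the device count) and assumes the warmup length is the total step count times the warmup ratio, rounded up. The function names, the `num_devices` parameter, and the `total_steps=100` value in the example call are illustrative and not taken from the training run.

```python
import math


def effective_batch_size(per_device_batch, grad_accum_steps, num_devices=1):
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device_batch * grad_accum_steps * num_devices


def linear_lr(step, total_steps, base_lr=2e-5, warmup_ratio=0.05):
    """Learning rate under linear warmup followed by linear decay to zero."""
    # Assumption: warmup length is the rounded-up fraction of total steps.
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = total_steps - step
    return base_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


# train_batch_size 2 with 16 accumulation steps gives the listed total of 32.
print(effective_batch_size(2, 16))

# Hypothetical 100-step run, used only to show the shape of the schedule.
for step in (0, 5, 50, 100):
    print(step, linear_lr(step, total_steps=100))
```

With these settings, each optimizer update averages gradients over 32 examples even though only 2 fit in a single forward pass.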

Training Loss	Epoch	Step	Validation Loss
1.8249	0.1014	10	1.7483
1.7082	0.2028	20	1.6279
1.6314	0.3042	30	1.5369
1.5673	0.4056	40	1.4923
1.5182	0.5070	50	1.4698
1.4977	0.6084	60	1.4576
1.4762	0.7098	70	1.4489
1.4641	0.8112	80	1.4424
1.4763	0.9125	90	1.4365
1.5972	1.0203	100	1.4317
1.4753	1.1217	110	1.4267
1.4513	1.2231	120	1.4224
1.4775	1.3245	130	1.4188
1.4559	1.4259	140	1.4156
1.4214	1.5272	150	1.4126
1.4534	1.6286	160	1.4098
1.4176	1.7300	170	1.4073
1.4204	1.8314	180	1.4046
1.4398	1.9328	190	1.4027
1.5789	2.0406	200	1.4010
1.4298	2.1420	210	1.3990
1.4299	2.2433	220	1.3974
1.4136	2.3447	230	1.3959
1.4535	2.4461	240	1.3949
1.4244	2.5475	250	1.3938
1.4079	2.6489	260	1.3929
1.4013	2.7503	270	1.3922
1.4204	2.8517	280	1.3918
1.4224	2.9531	290	1.3917
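
For a more intuitive reading of the validation losses in the table, the sketch below converts a few of them to perplexity. It assumes the reported loss is the mean per-token cross-entropy in nats, which the card does not state explicitly. The `validation_log` list holds a hand-copied subset of rows, and the `perplexity` helper is illustrative.

```python
import math

# (step, validation_loss) pairs copied from the table above; only a subset.
validation_log = [(10, 1.7483), (100, 1.4317), (200, 1.4010), (290, 1.3917)]


def perplexity(loss):
    """exp(loss), assuming loss is mean per-token cross-entropy in nats."""
    return math.exp(loss)


for step, loss in validation_log:
    print(f"step {step:>3}: val loss {loss:.4f} -> perplexity {perplexity(loss):.2f}")
```

Under that assumption, the decline from 1.7483 to 1.3917 corresponds to perplexity falling from roughly 5.7 to roughly 4.0, with most of the gain arriving in the first epoch.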
