english-tamil-colloquial

This model is a fine-tuned version of unsloth/tinyllama-chat-bnb-4bit. The training dataset is not specified (the card records it as None). On the evaluation set, the final validation loss is 9.8178 (step 90, epoch 10.0).

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

The following hyperparameters were used during training:

learning_rate: 0.0003
train_batch_size: 4
eval_batch_size: 4
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 8
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 2
num_epochs: 10
mixed_precision_training: Native AMP
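
The hyperparameters above imply two small pieces of arithmetic: the effective batch size, and the learning rate at any given step under a linear schedule with warmup. The sketch below works through both in plain Python. It assumes a single training device and a total of 90 optimizer steps (taken from the last row of the results table). The helper names are illustrative and are not part of any library. The schedule follows the usual linear-warmup, linear-decay shape, and the exact values used in the original run may differ.

```python
# Values copied from the hyperparameter list; TOTAL_STEPS is an assumption
# taken from the final step reported in the training results.
LEARNING_RATE = 3e-4
TRAIN_BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 2
WARMUP_STEPS = 2
TOTAL_STEPS = 90


def effective_batch_size(per_device, accum_steps, num_devices=1):
    """Examples contributing to one optimizer step (single device assumed)."""
    return per_device * accum_steps * num_devices


def linear_lr(step, base_lr=LEARNING_RATE, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Learning rate at `step`: linear ramp during warmup, then linear decay to 0."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    for s in (0, 1, 2, 46, 90):
        print(f"step {s:>2}: lr = {linear_lr(s):.6f}")
```

With these settings the peak learning rate is reached at step 2, and the rate falls to half its peak at step 46, the midpoint of the decay phase.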

Training Loss	Epoch	Step	Validation Loss
14.299	0.2222	2	11.7646
13.6933	0.4444	4	11.7646
13.8945	0.6667	6	11.7646
12.1557	0.8889	8	11.7646
13.8565	1.1111	10	11.7646
13.7871	1.3333	12	11.7646
14.1331	1.5556	14	11.7646
14.4065	1.7778	16	11.7982
10.2512	2.0	18	11.8291
10.0358	2.2222	20	11.8134
9.0898	2.4444	22	11.8176
10.2127	2.6667	24	11.8183
8.0483	2.8889	26	11.8417
6.8675	3.1111	28	11.8702
7.0285	3.3333	30	11.8467
6.0854	3.5556	32	11.8356
5.319	3.7778	34	11.7554
5.0992	4.0	36	11.5143
4.5511	4.2222	38	11.3712
4.441	4.4444	40	11.2696
4.2888	4.6667	42	11.3623
4.1408	4.8889	44	11.4373
4.0087	5.1111	46	11.4750
3.8967	5.3333	48	11.4602
3.8878	5.5556	50	11.5468
3.8132	5.7778	52	11.5235
3.9118	6.0	54	11.4412
3.6188	6.2222	56	11.4171
3.8574	6.4444	58	11.4673
3.7125	6.6667	60	11.3127
3.6763	6.8889	62	11.2380
3.7624	7.1111	64	11.1963
3.5123	7.3333	66	11.0156
3.2095	7.5556	68	10.6559
3.6179	7.7778	70	10.3367
3.6225	8.0	72	10.1478
3.428	8.2222	74	10.0318
3.6417	8.4444	76	9.9552
3.1189	8.6667	78	9.9026
3.1417	8.8889	80	9.8953
3.3634	9.1111	82	9.8973
3.5204	9.3333	84	9.8504
3.5029	9.5556	86	9.8414
2.6066	9.7778	88	9.8431
3.5625	10.0	90	9.8178
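
Two quantities can be read off the table with a little arithmetic: the number of optimizer steps per epoch, and the gap between training and validation loss at the end of the run. The sketch below uses only rows copied from the table. The estimate of examples per epoch assumes every optimizer step consumed a full batch of 8, so it is an upper-bound approximation and not a reported dataset size.

```python
# (training_loss, epoch, step, validation_loss) rows copied from the table.
FIRST_ROW = (14.299, 0.2222, 2, 11.7646)
LAST_ROW = (3.5625, 10.0, 90, 9.8178)

TOTAL_TRAIN_BATCH_SIZE = 8  # from the hyperparameter list


def steps_per_epoch(row):
    """Optimizer steps per epoch, inferred from one (epoch, step) pair."""
    _, epoch, step, _ = row
    return round(step / epoch)


def loss_gap(row):
    """Validation loss minus training loss for one logged row."""
    train_loss, _, _, val_loss = row
    return val_loss - train_loss


if __name__ == "__main__":
    spe = steps_per_epoch(LAST_ROW)
    print("steps per epoch:", spe)
    # Assumes full batches; the real training set may be slightly smaller.
    print("approx. examples per epoch:", spe * TOTAL_TRAIN_BATCH_SIZE)
    print(f"final validation - training loss: {loss_gap(LAST_ROW):.4f}")
```

The final gap of roughly 6.3 between validation and training loss suggests the model fits the training examples far better than the held-out ones, which is worth keeping in mind when judging output quality.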
