english-tamil-colloquial

This model is a fine-tuned version of unsloth/tinyllama-chat-bnb-4bit (the training dataset is not specified in this card). It achieves the following results on the evaluation set:

  • Loss: 9.8178
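
Because this repository ships a PEFT adapter on top of a 4-bit quantized base model (see Framework versions below), a minimal loading and generation sketch might look like the following. It assumes bitsandbytes and accelerate are installed for the quantized base; the prompt shown is illustrative only, since the expected prompt format is not documented here.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_ID = "unsloth/tinyllama-chat-bnb-4bit"           # 4-bit quantized base model
ADAPTER_ID = "VishaliSekar/english-tamil-colloquial"  # this adapter

tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
base = AutoModelForCausalLM.from_pretrained(BASE_ID, device_map="auto")
model = PeftModel.from_pretrained(base, ADAPTER_ID)   # attach the adapter weights
model.eval()

# Illustrative prompt only: the card does not document the expected format.
prompt = "Translate to colloquial Tamil: How are you?"
inputs = tokenizer(prompt, return_tensors="pt").to(base.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```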

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 2
  • num_epochs: 10
  • mixed_precision_training: Native AMP
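
For reference, a rough reconstruction of this configuration with the standard transformers Trainer API might look like the sketch below. The output_dir and the fp16 flag are assumptions, and the model/dataset wiring is omitted.

```python
from transformers import TrainingArguments

# A sketch of the reported hyperparameters (assumption: the standard
# transformers Trainer was used; model and dataset setup omitted).
args = TrainingArguments(
    output_dir="english-tamil-colloquial",  # assumed name
    learning_rate=3e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=2,  # effective train batch size: 4 * 2 = 8
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=2,
    num_train_epochs=10,
    fp16=True,  # "Native AMP" mixed precision
)
```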

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 14.299        | 0.2222 | 2    | 11.7646         |
| 13.6933       | 0.4444 | 4    | 11.7646         |
| 13.8945       | 0.6667 | 6    | 11.7646         |
| 12.1557       | 0.8889 | 8    | 11.7646         |
| 13.8565       | 1.1111 | 10   | 11.7646         |
| 13.7871       | 1.3333 | 12   | 11.7646         |
| 14.1331       | 1.5556 | 14   | 11.7646         |
| 14.4065       | 1.7778 | 16   | 11.7982         |
| 10.2512       | 2.0    | 18   | 11.8291         |
| 10.0358       | 2.2222 | 20   | 11.8134         |
| 9.0898        | 2.4444 | 22   | 11.8176         |
| 10.2127       | 2.6667 | 24   | 11.8183         |
| 8.0483        | 2.8889 | 26   | 11.8417         |
| 6.8675        | 3.1111 | 28   | 11.8702         |
| 7.0285        | 3.3333 | 30   | 11.8467         |
| 6.0854        | 3.5556 | 32   | 11.8356         |
| 5.319         | 3.7778 | 34   | 11.7554         |
| 5.0992        | 4.0    | 36   | 11.5143         |
| 4.5511        | 4.2222 | 38   | 11.3712         |
| 4.441         | 4.4444 | 40   | 11.2696         |
| 4.2888        | 4.6667 | 42   | 11.3623         |
| 4.1408        | 4.8889 | 44   | 11.4373         |
| 4.0087        | 5.1111 | 46   | 11.4750         |
| 3.8967        | 5.3333 | 48   | 11.4602         |
| 3.8878        | 5.5556 | 50   | 11.5468         |
| 3.8132        | 5.7778 | 52   | 11.5235         |
| 3.9118        | 6.0    | 54   | 11.4412         |
| 3.6188        | 6.2222 | 56   | 11.4171         |
| 3.8574        | 6.4444 | 58   | 11.4673         |
| 3.7125        | 6.6667 | 60   | 11.3127         |
| 3.6763        | 6.8889 | 62   | 11.2380         |
| 3.7624        | 7.1111 | 64   | 11.1963         |
| 3.5123        | 7.3333 | 66   | 11.0156         |
| 3.2095        | 7.5556 | 68   | 10.6559         |
| 3.6179        | 7.7778 | 70   | 10.3367         |
| 3.6225        | 8.0    | 72   | 10.1478         |
| 3.428         | 8.2222 | 74   | 10.0318         |
| 3.6417        | 8.4444 | 76   | 9.9552          |
| 3.1189        | 8.6667 | 78   | 9.9026          |
| 3.1417        | 8.8889 | 80   | 9.8953          |
| 3.3634        | 9.1111 | 82   | 9.8973          |
| 3.5204        | 9.3333 | 84   | 9.8504          |
| 3.5029        | 9.5556 | 86   | 9.8414          |
| 2.6066        | 9.7778 | 88   | 9.8431          |
| 3.5625        | 10.0   | 90   | 9.8178          |
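
If the reported losses are mean token-level cross-entropy (the transformers default for causal language modeling, assumed here), the final evaluation loss corresponds to a perplexity of exp(9.8178):

```python
import math

# Assumption: the reported loss is mean token-level cross-entropy,
# in which case perplexity is its exponential.
final_eval_loss = 9.8178
perplexity = math.exp(final_eval_loss)
print(f"{perplexity:,.0f}")  # ~18,358
```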

Framework versions

  • PEFT 0.14.0
  • Transformers 4.48.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.3.2
  • Tokenizers 0.21.0
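
A quick way to check that a local environment matches these pins (assuming the package names equal their PyPI distribution names):

```python
import importlib.metadata as md

# Print the installed version of each pinned dependency.
for pkg in ["peft", "transformers", "torch", "datasets", "tokenizers"]:
    print(pkg, md.version(pkg))
```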