Built with Axolotl

See axolotl config

axolotl version: 0.13.2

# Continued pretraining config for TranslateGemma 12B (LoRA)
#
# TranslateGemma is Gemma 3 fine-tuned for translation via SFT + RL.
# Same tokenizer and architecture as Gemma 3, so the pretraining data
# (plain text, no chat tokens) works directly.
#
# Uses LoRA for parameter-efficient continued pretraining, following
# the MaLA-500 approach (LoRA CPT for multilingual language adaptation).
# lora_target_linear applies LoRA to all linear layers (Axolotl excludes the output embedding, lm_head).
#
# Requires:
#   - Axolotl v0.13.2 (pip install axolotl==0.13.2)
#   - 2x A100 80GB (or any single GPU with >= 48GB)
#
# Usage:
#   axolotl train configs/pretrain/translategemma-12b.yml

base_model: google/translategemma-12b-it
model_type: Gemma3ForCausalLM
cls_model_config: Gemma3TextConfig

load_in_8bit: false
load_in_4bit: false
strict: false

# LoRA configuration
adapter: lora
lora_r: 64
lora_alpha: 128
lora_dropout: 0.05
lora_target_linear: true

datasets:
  - path: Sunbird/ug40-instructions
    name: pretraining_text
    split: train
    text_column: text
    type: completion

test_datasets:
  - path: Sunbird/ug40-instructions
    name: pretraining_text
    split: dev[:256]
    text_column: text
    type: completion

dataset_prepared_path: last_run_prepared
output_dir: ./outputs/translategemma-12b-pretrain

sequence_len: 1024
sample_packing: true
eval_sample_packing: false
pad_to_sequence_len: true

gradient_accumulation_steps: 4
micro_batch_size: 4
num_epochs: 2
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 2e-4

train_on_inputs:
group_by_length: false
bf16: auto
fp16: false
tf32: false

gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
xformers_attention:
flash_attention: false
sdp_attention: true
eager_attention: false

loss_watchdog_threshold: 10.0
loss_watchdog_patience: 3

warmup_ratio: 0.01
eval_steps: 50
logging_steps: 1

save_steps: 100
save_total_limit: 2
debug:
deepspeed:
weight_decay: 0.01

mlflow_tracking_uri: https://mlflow.sunbird.ai
mlflow_experiment_name: ug40-pretraining
mlflow_run_name: translategemma-12b-lora-cpt
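
The LoRA settings in the config (`lora_r: 64`, `lora_alpha: 128`) can be read through a little arithmetic. The sketch below assumes standard LoRA scaling (`alpha / r`, not rsLoRA) and uses a hypothetical 4096 x 4096 linear layer purely for illustration; it is not the actual shape of any TranslateGemma layer.

```python
# Values taken from the config above.
LORA_R = 64
LORA_ALPHA = 128


def lora_scaling(alpha: int, r: int) -> float:
    """Factor applied to the low-rank update B @ A in standard LoRA."""
    return alpha / r


def lora_extra_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds to one linear layer.

    A has shape (r, d_in) and B has shape (d_out, r).
    """
    return r * (d_in + d_out)


# Hypothetical layer shape, chosen only to make the numbers concrete.
D_IN, D_OUT = 4096, 4096

scaling = lora_scaling(LORA_ALPHA, LORA_R)
extra = lora_extra_params(D_IN, D_OUT, LORA_R)
fraction = extra / (D_IN * D_OUT)

print(f"scaling={scaling}, extra_params={extra}, fraction_of_full={fraction:.4f}")
```

With these settings the adapter update is scaled by 2, and for the hypothetical layer the adapter trains about 3% of the parameters a full update would.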

outputs/translategemma-12b-pretrain

This model is a fine-tuned version of google/translategemma-12b-it on the Sunbird/ug40-instructions dataset. It achieves the following results on the evaluation set:

  • Loss: 3.9488
  • Perplexity: 51.8742
  • Max active memory (GiB): 35.15
  • Max allocated memory (GiB): 35.15
  • Device reserved memory (GiB): 42.11
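
The reported perplexity is consistent with the exponential of the evaluation loss. A minimal check, assuming the loss is the mean per-token cross-entropy in nats:

```python
import math


def perplexity(loss: float) -> float:
    """Perplexity as exp(mean cross-entropy loss in nats)."""
    return math.exp(loss)


final_ppl = perplexity(3.9488)    # final evaluation loss
initial_ppl = perplexity(5.5503)  # evaluation loss at step 0

print(f"final={final_ppl:.2f}, initial={initial_ppl:.2f}")
```

Small differences from the reported values are expected because the losses shown here are rounded to four decimals.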

Model description

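A LoRA adapter (rank 64, alpha 128, dropout 0.05) for google/translategemma-12b-it, trained with Axolotl by continued pretraining on plain text. TranslateGemma is Gemma 3 fine-tuned for translation via SFT + RL and shares Gemma 3's tokenizer and architecture. The adapter follows the MaLA-500 approach of LoRA continued pretraining for multilingual language adaptation.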

Intended uses & limitations

More information needed

Training and evaluation data

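Training used the train split of the pretraining_text configuration of Sunbird/ug40-instructions (plain text, no chat tokens), with sample packing into sequences of 1024 tokens. Evaluation used the first 256 examples of the dev split (dev[:256]) of the same configuration, without sample packing.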

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: adamw_bnb_8bit (OptimizerNames.ADAMW_BNB) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 221
  • training_steps: 22148
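
The derived values in this list follow from the config. The sketch below reproduces them under two assumptions that the card does not state: a single training device (inferred from 16 = 4 x 4) and warmup steps computed as the floor of `warmup_ratio * training_steps`.

```python
# Values from the config and the hyperparameter list.
micro_batch_size = 4
gradient_accumulation_steps = 4
sequence_len = 1024
training_steps = 22148
warmup_ratio = 0.01

# Assumption: one device, inferred from the reported total batch size of 16.
num_devices = 1

total_train_batch_size = (
    micro_batch_size * gradient_accumulation_steps * num_devices
)

# With packing and padding to sequence_len, each sequence holds at most
# sequence_len tokens, so this is an upper bound per optimizer step.
max_tokens_per_step = total_train_batch_size * sequence_len

# Assumption: floor rounding, which matches the reported 221 warmup steps.
warmup_steps = int(warmup_ratio * training_steps)

print(total_train_batch_size, max_tokens_per_step, warmup_steps)
```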

Training results

Columns: Training Loss, Epoch, Step, Validation Loss, Perplexity, Max Active (GiB), Max Allocated (GiB), Device Reserved (GiB)
No log 0 0 5.5503 257.3245 34.62 34.62 34.78
3.5617 0.0045 50 5.2159 184.1822 35.15 35.15 42.11
2.8027 0.0090 100 5.1067 165.118 35.15 35.15 42.07
2.7428 0.0135 150 5.0293 152.8233 35.15 35.15 42.15
2.4764 0.0181 200 4.9374 139.4033 35.15 35.15 42.07
2.3213 0.0226 250 4.8409 126.5882 35.15 35.15 42.07
2.3671 0.0271 300 4.8758 131.0786 35.15 35.15 42.07
2.226 0.0316 350 4.8295 125.1541 35.15 35.15 42.07
2.3946 0.0361 400 4.7830 119.4631 35.15 35.15 42.07
2.1495 0.0406 450 4.7811 119.2388 35.15 35.15 42.11
2.4975 0.0451 500 4.7700 117.9233 35.15 35.15 42.07
2.2017 0.0497 550 4.7763 118.6698 35.15 35.15 42.11
2.0825 0.0542 600 4.6973 109.6462 35.15 35.15 42.07
1.9977 0.0587 650 4.7074 110.7600 35.15 35.15 42.07
2.1722 0.0632 700 4.7610 116.8610 35.15 35.15 42.07
2.3084 0.0677 750 4.6393 103.4714 35.15 35.15 42.07
2.0632 0.0722 800 4.6398 103.5250 35.15 35.15 42.07
2.1737 0.0768 850 4.6588 105.5093 35.15 35.15 42.07
2.0681 0.0813 900 4.5835 97.8553 35.15 35.15 42.11
1.992 0.0858 950 4.6107 100.5575 35.15 35.15 42.11
2.0548 0.0903 1000 4.6340 102.9232 35.15 35.15 42.07
2.3881 0.0948 1050 4.6488 104.4590 35.15 35.15 42.07
1.9704 0.0993 1100 4.5786 97.3799 35.15 35.15 42.07
2.0768 0.1038 1150 4.5709 96.6344 35.15 35.15 42.07
1.9719 0.1084 1200 4.5705 96.5953 35.15 35.15 42.07
1.9921 0.1129 1250 4.5447 94.1352 35.15 35.15 42.07
1.9635 0.1174 1300 4.5413 93.8150 35.15 35.15 42.11
1.975 0.1219 1350 4.5382 93.5262 35.15 35.15 42.07
1.9082 0.1264 1400 4.5549 95.0963 35.15 35.15 42.07
1.7673 0.1309 1450 4.5462 94.2719 35.15 35.15 42.07
1.8757 0.1354 1500 4.5623 95.802 35.15 35.15 42.11
1.8257 0.1400 1550 4.5245 92.2520 35.15 35.15 42.07
1.8126 0.1445 1600 4.5253 92.3258 35.15 35.15 42.07
2.0003 0.1490 1650 4.5310 92.8508 35.15 35.15 42.11
1.8057 0.1535 1700 4.4580 86.316 35.15 35.15 42.11
1.9994 0.1580 1750 4.4838 88.5735 35.15 35.15 42.07
1.8345 0.1625 1800 4.4839 88.5828 35.15 35.15 42.07
1.7373 0.1671 1850 4.4445 85.1539 35.15 35.15 42.07
1.9392 0.1716 1900 4.4388 84.6723 35.15 35.15 42.11
1.8091 0.1761 1950 4.4509 85.7056 35.15 35.15 42.15
1.7339 0.1806 2000 4.4340 84.2647 35.15 35.15 42.15
1.8743 0.1851 2050 4.4453 85.2237 35.15 35.15 42.07
1.8912 0.1896 2100 4.4739 87.6959 35.15 35.15 42.07
1.889 0.1941 2150 4.4717 87.5089 35.15 35.15 42.07
2.0337 0.1987 2200 4.4471 85.3785 35.15 35.15 42.07
1.6816 0.2032 2250 4.4544 86.0036 35.15 35.15 42.11
1.923 0.2077 2300 4.4172 82.8649 35.15 35.15 42.15
1.7749 0.2122 2350 4.4560 86.1398 35.15 35.15 42.07
1.8438 0.2167 2400 4.4180 82.9270 35.15 35.15 42.11
2.0014 0.2212 2450 4.4393 84.7160 35.15 35.15 42.07
1.9562 0.2257 2500 4.3881 80.4910 35.15 35.15 42.07
1.9811 0.2303 2550 4.4073 82.0502 35.15 35.15 42.07
1.8729 0.2348 2600 4.4168 82.8295 35.15 35.15 42.15
1.8112 0.2393 2650 4.4101 82.2791 35.15 35.15 42.15
1.7273 0.2438 2700 4.3772 79.6130 35.15 35.15 42.11
1.8272 0.2483 2750 4.4123 82.4618 35.15 35.15 42.07
1.8715 0.2528 2800 4.3688 78.9497 35.15 35.15 42.07
1.8979 0.2574 2850 4.3480 77.3275 35.15 35.15 42.07
1.9699 0.2619 2900 4.3805 79.8741 35.15 35.15 42.07
1.7035 0.2664 2950 4.3904 80.6722 35.15 35.15 42.07
1.8755 0.2709 3000 4.3900 80.6385 35.15 35.15 42.11
1.8221 0.2754 3050 4.4040 81.7804 35.15 35.15 42.11
1.6378 0.2799 3100 4.3795 79.7987 35.15 35.15 42.07
1.7471 0.2844 3150 4.3939 80.9570 35.15 35.15 42.07
1.6421 0.2890 3200 4.3586 78.1466 35.15 35.15 42.07
2.0323 0.2935 3250 4.3589 78.1746 35.15 35.15 42.07
1.806 0.2980 3300 4.3466 77.2186 35.15 35.15 42.07
1.8093 0.3025 3350 4.3486 77.3700 35.15 35.15 42.07
1.7007 0.3070 3400 4.3539 77.7836 35.15 35.15 42.07
1.6612 0.3115 3450 4.3225 75.3736 35.15 35.15 42.07
1.704 0.3160 3500 4.4098 82.2564 35.15 35.15 42.15
1.9149 0.3206 3550 4.3582 78.1147 35.15 35.15 42.15
1.8832 0.3251 3600 4.3321 76.1063 35.15 35.15 42.07
1.6929 0.3296 3650 4.3753 79.4670 35.15 35.15 42.11
1.7206 0.3341 3700 4.3748 79.4211 35.15 35.15 42.07
1.7692 0.3386 3750 4.3318 76.0833 35.15 35.15 42.07
1.8555 0.3431 3800 4.3264 75.6704 35.15 35.15 42.07
1.7308 0.3477 3850 4.3236 75.4562 35.15 35.15 42.07
1.6671 0.3522 3900 4.3734 79.3123 35.15 35.15 42.07
1.6962 0.3567 3950 4.3289 75.8619 35.15 35.15 42.07
1.7914 0.3612 4000 4.3130 74.6638 35.15 35.15 42.07
1.6952 0.3657 4050 4.2967 73.4541 35.15 35.15 42.07
1.7865 0.3702 4100 4.3210 75.2606 35.15 35.15 42.07
1.8493 0.3747 4150 4.3133 74.6853 35.15 35.15 42.07
1.8915 0.3793 4200 4.2759 71.9439 35.15 35.15 42.11
1.7681 0.3838 4250 4.2656 71.2060 35.15 35.15 42.11
1.7355 0.3883 4300 4.2797 72.2167 35.15 35.15 42.07
1.7176 0.3928 4350 4.2707 71.5703 35.15 35.15 42.11
1.9453 0.3973 4400 4.2834 72.4839 35.15 35.15 42.15
1.7712 0.4018 4450 4.3229 75.4056 35.15 35.15 42.07
1.7943 0.4063 4500 4.2832 72.4730 35.15 35.15 42.07
1.8265 0.4109 4550 4.2599 70.8049 35.15 35.15 42.07
1.7078 0.4154 4600 4.2714 71.6194 35.15 35.15 42.07
1.5589 0.4199 4650 4.2934 73.2171 35.15 35.15 42.07
1.8999 0.4244 4700 4.2524 70.2764 35.15 35.15 42.15
1.7833 0.4289 4750 4.2989 73.6155 35.15 35.15 42.07
1.864 0.4334 4800 4.2464 69.8527 35.15 35.15 42.11
1.9181 0.4380 4850 4.2512 70.1919 35.15 35.15 42.07
1.6998 0.4425 4900 4.2453 69.7790 35.15 35.15 42.07
1.8053 0.4470 4950 4.2306 68.7601 35.15 35.15 42.07
1.5442 0.4515 5000 4.2631 71.0285 35.15 35.15 42.07
1.6695 0.4560 5050 4.2873 72.7704 35.15 35.15 42.07
1.7393 0.4605 5100 4.2528 70.3031 35.15 35.15 42.07
1.7966 0.4650 5150 4.2956 73.3767 35.15 35.15 42.15
1.7114 0.4696 5200 4.2643 71.1155 35.15 35.15 42.15
1.605 0.4741 5250 4.2800 72.2378 35.15 35.15 42.07
1.7624 0.4786 5300 4.2317 68.8333 35.15 35.15 42.07
1.7785 0.4831 5350 4.2417 69.5294 35.15 35.15 42.07
1.9738 0.4876 5400 4.2001 66.6944 35.15 35.15 42.15
1.7262 0.4921 5450 4.2609 70.8757 35.15 35.15 42.15
1.7276 0.4966 5500 4.2474 69.9211 35.15 35.15 42.07
1.7112 0.5012 5550 4.2593 70.7571 35.15 35.15 42.07
1.5335 0.5057 5600 4.1984 66.5794 35.15 35.15 42.11
1.7193 0.5102 5650 4.1976 66.5252 35.15 35.15 42.07
1.7488 0.5147 5700 4.2195 67.9998 35.15 35.15 42.07
1.77 0.5192 5750 4.2112 67.4353 35.15 35.15 42.07
1.6304 0.5237 5800 4.2183 67.9149 35.15 35.15 42.07
1.6974 0.5283 5850 4.1991 66.6277 35.15 35.15 42.07
1.7687 0.5328 5900 4.2169 67.8198 35.15 35.15 42.07
1.7971 0.5373 5950 4.2593 70.7569 35.15 35.15 42.11
1.7028 0.5418 6000 4.2476 69.9381 35.15 35.15 42.07
1.9187 0.5463 6050 4.2287 68.6279 35.15 35.15 42.07
1.7527 0.5508 6100 4.1918 66.1385 35.15 35.15 42.07
1.8243 0.5553 6150 4.2143 67.6502 35.15 35.15 42.07
1.7181 0.5599 6200 4.1941 66.2963 35.15 35.15 42.11
1.6457 0.5644 6250 4.2360 69.1313 35.15 35.15 42.07
1.6266 0.5689 6300 4.2126 67.5344 35.15 35.15 42.07
1.6598 0.5734 6350 4.2153 67.7119 35.15 35.15 42.07
1.583 0.5779 6400 4.2134 67.5884 35.15 35.15 42.07
1.7049 0.5824 6450 4.2224 68.1990 35.15 35.15 42.07
1.6521 0.5869 6500 4.2387 69.3199 35.15 35.15 42.07
1.5827 0.5915 6550 4.2321 68.8607 35.15 35.15 42.07
1.5619 0.5960 6600 4.2019 66.8159 35.15 35.15 42.07
1.7104 0.6005 6650 4.1938 66.2714 35.15 35.15 42.07
1.7658 0.6050 6700 4.2092 67.2993 35.15 35.15 42.07
1.8447 0.6095 6750 4.2211 68.1105 35.15 35.15 42.07
1.744 0.6140 6800 4.1555 63.7847 35.15 35.15 42.07
1.6369 0.6186 6850 4.1887 65.9344 35.15 35.15 42.07
1.7109 0.6231 6900 4.1898 66.0095 35.15 35.15 42.15
1.6325 0.6276 6950 4.1737 64.9567 35.15 35.15 42.15
1.5851 0.6321 7000 4.1960 66.4173 35.15 35.15 42.07
1.9882 0.6366 7050 4.1785 65.2666 35.15 35.15 42.07
1.799 0.6411 7100 4.1753 65.0565 35.15 35.15 42.11
1.5448 0.6456 7150 4.1724 64.8685 35.15 35.15 42.07
1.6043 0.6502 7200 4.1694 64.6771 35.15 35.15 42.11
1.5993 0.6547 7250 4.1521 63.5663 35.15 35.15 42.07
1.8326 0.6592 7300 4.1558 63.8019 35.15 35.15 42.11
1.8941 0.6637 7350 4.1366 62.5898 35.15 35.15 42.07
1.785 0.6682 7400 4.1481 63.3128 35.15 35.15 42.11
1.5667 0.6727 7450 4.1372 62.6249 35.15 35.15 42.07
1.7361 0.6772 7500 4.1225 61.7108 35.15 35.15 42.11
1.7767 0.6818 7550 4.1465 63.2131 35.15 35.15 42.07
1.5784 0.6863 7600 4.1216 61.6569 35.15 35.15 42.15
1.6569 0.6908 7650 4.1563 63.8364 35.15 35.15 42.07
1.5921 0.6953 7700 4.1171 61.3802 35.15 35.15 42.07
1.6445 0.6998 7750 4.1352 62.5008 35.15 35.15 42.15
1.6138 0.7043 7800 4.1137 61.1729 35.15 35.15 42.15
1.605 0.7089 7850 4.1332 62.3790 35.15 35.15 42.07
1.6128 0.7134 7900 4.1248 61.8541 35.15 35.15 42.11
1.6385 0.7179 7950 4.1269 61.9876 35.15 35.15 42.11
1.5835 0.7224 8000 4.1423 62.9501 35.15 35.15 42.07
1.7108 0.7269 8050 4.1338 62.4157 35.15 35.15 42.07
1.5592 0.7314 8100 4.1599 64.0661 35.15 35.15 42.07
1.9167 0.7359 8150 4.1394 62.7654 35.15 35.15 42.11
1.6579 0.7405 8200 4.1292 62.1265 35.15 35.15 42.07
1.6644 0.7450 8250 4.0960 60.0967 35.15 35.15 42.07
1.5885 0.7495 8300 4.1132 61.1398 35.15 35.15 42.07
1.6984 0.7540 8350 4.1201 61.5626 35.15 35.15 42.07
1.7481 0.7585 8400 4.1268 61.9766 35.15 35.15 42.07
1.8169 0.7630 8450 4.0883 59.6379 35.15 35.15 42.11
1.7106 0.7675 8500 4.1119 61.0596 35.15 35.15 42.07
1.9153 0.7721 8550 4.1056 60.6790 35.15 35.15 42.07
1.7755 0.7766 8600 4.1171 61.3807 35.15 35.15 42.07
1.6108 0.7811 8650 4.1030 60.5222 35.15 35.15 42.11
1.7907 0.7856 8700 4.0991 60.2884 35.15 35.15 42.15
1.7245 0.7901 8750 4.1035 60.5548 35.15 35.15 42.11
1.5403 0.7946 8800 4.0767 58.9509 35.15 35.15 42.07
1.662 0.7992 8850 4.1030 60.5196 35.15 35.15 42.07
1.6132 0.8037 8900 4.0842 59.3932 35.15 35.15 42.07
1.6935 0.8082 8950 4.1241 61.8107 35.15 35.15 42.07
1.6511 0.8127 9000 4.1021 60.4681 35.15 35.15 42.07
1.5814 0.8172 9050 4.0913 59.8160 35.15 35.15 42.07
1.7432 0.8217 9100 4.1011 60.4042 35.15 35.15 42.07
1.527 0.8262 9150 4.1020 60.4591 35.15 35.15 42.07
1.7083 0.8308 9200 4.0957 60.0835 35.15 35.15 42.07
1.7396 0.8353 9250 4.0999 60.3338 35.15 35.15 42.11
1.7378 0.8398 9300 4.1051 60.6459 35.15 35.15 42.07
1.5489 0.8443 9350 4.0860 59.4994 35.15 35.15 42.15
1.4558 0.8488 9400 4.0927 59.9014 35.15 35.15 42.07
1.5967 0.8533 9450 4.0714 58.6390 35.15 35.15 42.07
1.8308 0.8578 9500 4.0659 58.3174 35.15 35.15 42.07
1.5062 0.8624 9550 4.0616 58.0649 35.15 35.15 42.07
1.7656 0.8669 9600 4.0659 58.3196 35.15 35.15 42.07
1.6961 0.8714 9650 4.0867 59.5419 35.15 35.15 42.07
1.4949 0.8759 9700 4.0923 59.8770 35.15 35.15 42.07
1.5763 0.8804 9750 4.0684 58.4660 35.15 35.15 42.07
1.4879 0.8849 9800 4.0759 58.9008 35.15 35.15 42.07
1.8592 0.8895 9850 4.0398 56.8140 35.15 35.15 42.07
1.5697 0.8940 9900 4.0555 57.7112 35.15 35.15 42.07
1.4858 0.8985 9950 4.0592 57.9252 35.15 35.15 42.07
1.6462 0.9030 10000 4.0717 58.6538 35.15 35.15 42.11
1.513 0.9075 10050 4.0793 59.1038 35.15 35.15 42.07
1.6506 0.9120 10100 4.0679 58.4357 35.15 35.15 42.11
1.4328 0.9165 10150 4.0585 57.8890 35.15 35.15 42.07
1.4302 0.9211 10200 4.0539 57.6226 35.15 35.15 42.11
1.7099 0.9256 10250 4.0422 56.9490 35.15 35.15 42.07
1.7561 0.9301 10300 4.0753 58.8673 35.15 35.15 42.07
1.8218 0.9346 10350 4.0834 59.3498 35.15 35.15 42.11
1.6288 0.9391 10400 4.0534 57.5937 35.15 35.15 42.07
1.8145 0.9436 10450 4.0356 56.5782 35.15 35.15 42.07
1.6398 0.9481 10500 4.0394 56.7936 35.15 35.15 42.07
1.3962 0.9527 10550 4.0571 57.8060 35.15 35.15 42.11
1.5282 0.9572 10600 4.0283 56.1679 35.15 35.15 42.07
1.7443 0.9617 10650 4.0552 57.6977 35.15 35.15 42.07
1.3788 0.9662 10700 4.0398 56.8162 35.15 35.15 42.07
1.517 0.9707 10750 4.0498 57.3840 35.15 35.15 42.07
1.6647 0.9752 10800 4.0516 57.4877 35.15 35.15 42.07
1.6697 0.9798 10850 4.0291 56.2109 35.15 35.15 42.07
1.667 0.9843 10900 4.0297 56.2417 35.15 35.15 42.07
1.3668 0.9888 10950 4.0331 56.4340 35.15 35.15 42.07
1.6721 0.9933 11000 4.0097 55.1314 35.15 35.15 42.07
1.5818 0.9978 11050 4.0119 55.2537 35.15 35.15 42.07
1.349 1.0023 11100 4.0678 58.4281 35.15 35.15 42.07
1.3433 1.0068 11150 4.0366 56.6311 35.15 35.15 42.07
1.3953 1.0113 11200 4.0480 57.284 35.15 35.15 42.07
1.5887 1.0158 11250 4.0410 56.8853 35.15 35.15 42.15
1.4165 1.0203 11300 4.0654 58.2866 35.15 35.15 42.07
1.4719 1.0248 11350 4.0688 58.4845 35.15 35.15 42.07
1.4382 1.0293 11400 4.0621 58.0978 35.15 35.15 42.11
1.2521 1.0339 11450 4.0616 58.0677 35.15 35.15 42.07
1.4348 1.0384 11500 4.0596 57.9534 35.15 35.15 42.07
1.379 1.0429 11550 4.0341 56.4928 35.15 35.15 42.07
1.4736 1.0474 11600 4.0530 57.5679 35.15 35.15 42.07
1.4057 1.0519 11650 4.0411 56.8870 35.15 35.15 42.07
1.545 1.0564 11700 4.0350 56.5419 35.15 35.15 42.07
1.3988 1.0610 11750 4.0324 56.3941 35.15 35.15 42.11
1.4037 1.0655 11800 4.0318 56.3595 35.15 35.15 42.07
1.9072 1.0700 11850 4.0490 57.3412 35.15 35.15 42.07
1.5299 1.0745 11900 4.0262 56.0449 35.15 35.15 42.07
1.5471 1.0790 11950 4.0361 56.6057 35.15 35.15 42.07
1.3823 1.0835 12000 4.0456 57.1428 35.15 35.15 42.11
1.399 1.0880 12050 4.0417 56.9214 35.15 35.15 42.07
1.4342 1.0926 12100 4.0300 56.2592 35.15 35.15 42.15
1.5188 1.0971 12150 4.0422 56.9527 35.15 35.15 42.07
1.4799 1.1016 12200 4.0376 56.6925 35.15 35.15 42.07
1.3584 1.1061 12250 4.0409 56.8772 35.15 35.15 42.11
1.5428 1.1106 12300 4.0485 57.3130 35.15 35.15 42.11
1.3804 1.1151 12350 4.0290 56.2036 35.15 35.15 42.11
1.5079 1.1196 12400 4.0549 57.6791 35.15 35.15 42.07
1.3807 1.1242 12450 4.0616 58.0688 35.15 35.15 42.11
1.566 1.1287 12500 4.0498 57.3877 35.15 35.15 42.07
1.421 1.1332 12550 4.0414 56.9043 35.15 35.15 42.07
1.648 1.1377 12600 4.0485 57.3120 35.15 35.15 42.11
1.466 1.1422 12650 4.0222 55.8264 35.15 35.15 42.15
1.5311 1.1467 12700 4.0170 55.5348 35.15 35.15 42.07
1.393 1.1513 12750 4.0320 56.3749 35.15 35.15 42.07
1.5639 1.1558 12800 4.0351 56.5480 35.15 35.15 42.15
1.4669 1.1603 12850 4.0346 56.5224 35.15 35.15 42.07
1.5151 1.1648 12900 4.0114 55.2237 35.15 35.15 42.07
1.6203 1.1693 12950 4.0194 55.6700 35.15 35.15 42.07
1.4549 1.1738 13000 4.0199 55.6972 35.15 35.15 42.07
1.5562 1.1783 13050 4.0439 57.0512 35.15 35.15 42.07
1.5211 1.1829 13100 4.0205 55.729 35.15 35.15 42.15
1.4577 1.1874 13150 4.0298 56.2521 35.15 35.15 42.07
1.5098 1.1919 13200 4.0137 55.3520 35.15 35.15 42.07
1.2118 1.1964 13250 4.0235 55.8955 35.15 35.15 42.07
1.5155 1.2009 13300 4.0187 55.6277 35.15 35.15 42.15
1.3797 1.2054 13350 4.0027 54.7440 35.15 35.15 42.07
1.7536 1.2099 13400 3.9981 54.4923 35.15 35.15 42.07
1.5732 1.2145 13450 3.9904 54.0784 35.15 35.15 42.07
1.4279 1.2190 13500 3.9925 54.1900 35.15 35.15 42.07
1.5099 1.2235 13550 4.0151 55.4299 35.15 35.15 42.07
1.4414 1.2280 13600 4.0003 54.6130 35.15 35.15 42.07
1.7437 1.2325 13650 4.0091 55.0989 35.15 35.15 42.07
1.4015 1.2370 13700 4.0118 55.2488 35.15 35.15 42.11
1.3707 1.2416 13750 4.0102 55.1563 35.15 35.15 42.07
1.4402 1.2461 13800 4.0093 55.1107 35.15 35.15 42.07
1.3619 1.2506 13850 4.0088 55.0834 35.15 35.15 42.07
1.5979 1.2551 13900 4.0030 54.7595 35.15 35.15 42.07
1.6445 1.2596 13950 4.0037 54.7983 35.15 35.15 42.15
1.5069 1.2641 14000 4.0074 55.0027 35.15 35.15 42.11
1.5562 1.2686 14050 4.0056 54.9021 35.15 35.15 42.07
1.3316 1.2732 14100 4.0035 54.7880 35.15 35.15 42.15
1.4906 1.2777 14150 3.9969 54.4286 35.15 35.15 42.07
1.6446 1.2822 14200 3.9894 54.0208 35.15 35.15 42.07
1.5575 1.2867 14250 4.0007 54.6359 35.15 35.15 42.11
1.3729 1.2912 14300 3.9940 54.2702 35.15 35.15 42.11
1.4736 1.2957 14350 3.9855 53.8106 35.15 35.15 42.11
1.6285 1.3002 14400 4.0050 54.8720 35.15 35.15 42.07
1.5027 1.3048 14450 4.0157 55.4634 35.15 35.15 42.07
1.5291 1.3093 14500 3.9944 54.2934 35.15 35.15 42.11
1.3464 1.3138 14550 3.9901 54.0590 35.15 35.15 42.07
1.4708 1.3183 14600 3.9826 53.6586 35.15 35.15 42.07
1.5467 1.3228 14650 3.9854 53.8063 35.15 35.15 42.07
1.5048 1.3273 14700 3.9948 54.3135 35.15 35.15 42.11
1.3408 1.3319 14750 3.9955 54.3520 35.15 35.15 42.07
1.5785 1.3364 14800 3.9756 53.2834 35.15 35.15 42.07
1.4843 1.3409 14850 3.9856 53.8154 35.15 35.15 42.07
1.4024 1.3454 14900 3.9809 53.5634 35.15 35.15 42.07
1.5508 1.3499 14950 3.9843 53.7450 35.15 35.15 42.07
1.3853 1.3544 15000 3.9889 53.9964 35.15 35.15 42.07
1.3754 1.3589 15050 3.9754 53.2702 35.15 35.15 42.07
1.3548 1.3635 15100 3.9857 53.8242 35.15 35.15 42.07
1.6952 1.3680 15150 3.9847 53.7717 35.15 35.15 42.07
1.345 1.3725 15200 3.9875 53.9198 35.15 35.15 42.07
1.4546 1.3770 15250 3.9866 53.8712 35.15 35.15 42.07
1.4567 1.3815 15300 3.9855 53.8111 35.15 35.15 42.07
1.3382 1.3860 15350 3.9943 54.2861 35.15 35.15 42.07
1.5385 1.3905 15400 3.9868 53.8801 35.15 35.15 42.07
1.5039 1.3951 15450 3.9967 54.4194 35.15 35.15 42.11
1.3992 1.3996 15500 3.9964 54.4029 35.15 35.15 42.07
1.4015 1.4041 15550 3.9826 53.6557 35.15 35.15 42.07
1.4977 1.4086 15600 3.9760 53.3046 35.15 35.15 42.11
1.4591 1.4131 15650 3.9823 53.6396 35.15 35.15 42.07
1.4299 1.4176 15700 3.9868 53.8822 35.15 35.15 42.07
1.4006 1.4222 15750 3.9844 53.7534 35.15 35.15 42.11
1.2978 1.4267 15800 3.9912 54.1188 35.15 35.15 42.07
1.4031 1.4312 15850 3.9966 54.4145 35.15 35.15 42.07
1.4527 1.4357 15900 3.9826 53.6590 35.15 35.15 42.07
1.4845 1.4402 15950 3.9820 53.6251 35.15 35.15 42.07
1.4397 1.4447 16000 3.9765 53.3317 35.15 35.15 42.07
1.5271 1.4492 16050 3.9822 53.6370 35.15 35.15 42.07
1.4201 1.4538 16100 3.9810 53.5692 35.15 35.15 42.07
1.3952 1.4583 16150 3.9815 53.5980 35.15 35.15 42.07
1.3138 1.4628 16200 3.9612 52.5218 35.15 35.15 42.11
1.4477 1.4673 16250 3.9847 53.7710 35.15 35.15 42.07
1.4031 1.4718 16300 3.9758 53.2901 35.15 35.15 42.07
1.3318 1.4763 16350 3.9874 53.9135 35.15 35.15 42.11
1.397 1.4808 16400 3.9862 53.8526 35.15 35.15 42.07
1.3368 1.4854 16450 3.9716 53.0667 35.15 35.15 42.07
1.5265 1.4899 16500 3.9740 53.1954 35.15 35.15 42.11
1.4988 1.4944 16550 3.9791 53.4707 35.15 35.15 42.11
1.4081 1.4989 16600 3.9696 52.9635 35.15 35.15 42.11
1.4909 1.5034 16650 3.9711 53.0444 35.15 35.15 42.11
1.5869 1.5079 16700 3.9639 52.6604 35.15 35.15 42.11
1.2324 1.5125 16750 3.9721 53.0982 35.15 35.15 42.07
1.6514 1.5170 16800 3.9710 53.0361 35.15 35.15 42.07
1.5016 1.5215 16850 3.9661 52.7767 35.15 35.15 42.07
1.433 1.5260 16900 3.9706 53.0179 35.15 35.15 42.11
1.2589 1.5305 16950 3.9681 52.8825 35.15 35.15 42.07
1.4003 1.5350 17000 3.9634 52.6354 35.15 35.15 42.07
1.4044 1.5395 17050 3.9597 52.4420 35.15 35.15 42.07
1.4473 1.5441 17100 3.9570 52.2993 35.15 35.15 42.11
1.5024 1.5486 17150 3.9660 52.7708 35.15 35.15 42.07
1.6956 1.5531 17200 3.9624 52.581 35.15 35.15 42.07
1.2975 1.5576 17250 3.9660 52.7739 35.15 35.15 42.11
1.4223 1.5621 17300 3.9704 53.0059 35.15 35.15 42.15
1.5549 1.5666 17350 3.9725 53.1163 35.15 35.15 42.11
1.3389 1.5711 17400 3.9722 53.1036 35.15 35.15 42.07
1.2365 1.5757 17450 3.9714 53.0590 35.15 35.15 42.11
1.3006 1.5802 17500 3.9666 52.8055 35.15 35.15 42.07
1.4813 1.5847 17550 3.9647 52.7048 35.15 35.15 42.07
1.4085 1.5892 17600 3.9647 52.7052 35.15 35.15 42.07
1.3395 1.5937 17650 3.9599 52.4539 35.15 35.15 42.07
1.4475 1.5982 17700 3.9512 51.9975 35.15 35.15 42.07
1.251 1.6027 17750 3.9590 52.4047 35.15 35.15 42.07
1.2422 1.6073 17800 3.9513 52.0036 35.15 35.15 42.11
1.3598 1.6118 17850 3.9536 52.1244 35.15 35.15 42.15
1.4818 1.6163 17900 3.9517 52.0252 35.15 35.15 42.11
1.6584 1.6208 17950 3.9621 52.5653 35.15 35.15 42.15
1.5615 1.6253 18000 3.9563 52.2614 35.15 35.15 42.11
1.3877 1.6298 18050 3.9521 52.0445 35.15 35.15 42.07
1.4424 1.6344 18100 3.9582 52.3607 35.15 35.15 42.07
1.3904 1.6389 18150 3.9547 52.1794 35.15 35.15 42.07
1.5236 1.6434 18200 3.9539 52.1409 35.15 35.15 42.07
1.3757 1.6479 18250 3.9527 52.0737 35.15 35.15 42.15
1.5384 1.6524 18300 3.9492 51.8959 35.15 35.15 42.07
1.5591 1.6569 18350 3.9500 51.9375 35.15 35.15 42.07
1.4566 1.6614 18400 3.9544 52.1625 35.15 35.15 42.07
1.5902 1.6660 18450 3.9598 52.4458 35.15 35.15 42.15
1.4769 1.6705 18500 3.9523 52.0556 35.15 35.15 42.07
1.3465 1.6750 18550 3.9549 52.1891 35.15 35.15 42.07
1.4284 1.6795 18600 3.9502 51.9439 35.15 35.15 42.07
1.2386 1.6840 18650 3.9500 51.9339 35.15 35.15 42.07
1.3907 1.6885 18700 3.9517 52.0214 35.15 35.15 42.07
1.6026 1.6930 18750 3.9565 52.2738 35.15 35.15 42.15
1.3784 1.6976 18800 3.9542 52.1529 35.15 35.15 42.07
1.2238 1.7021 18850 3.9531 52.0987 35.15 35.15 42.11
1.346 1.7066 18900 3.9520 52.0418 35.15 35.15 42.07
1.2769 1.7111 18950 3.9533 52.1085 35.15 35.15 42.15
1.3996 1.7156 19000 3.9499 51.9281 35.15 35.15 42.07
1.6653 1.7201 19050 3.9542 52.1557 35.15 35.15 42.07
1.5287 1.7247 19100 3.9547 52.1816 35.15 35.15 42.07
1.4687 1.7292 19150 3.9570 52.2992 35.15 35.15 42.07
1.3733 1.7337 19200 3.9551 52.2028 35.15 35.15 42.07
1.2637 1.7382 19250 3.9545 52.1670 35.15 35.15 42.07
1.4101 1.7427 19300 3.9504 51.9577 35.15 35.15 42.07
1.619 1.7472 19350 3.9448 51.6680 35.15 35.15 42.07
1.5084 1.7517 19400 3.9406 51.4486 35.15 35.15 42.07
1.4807 1.7563 19450 3.9450 51.6745 35.15 35.15 42.07
1.364 1.7608 19500 3.9472 51.7904 35.15 35.15 42.11
1.5272 1.7653 19550 3.9484 51.8503 35.15 35.15 42.07
1.2968 1.7698 19600 3.9483 51.8495 35.15 35.15 42.11
1.4084 1.7743 19650 3.9488 51.8752 35.15 35.15 42.07
1.3793 1.7788 19700 3.9478 51.8220 35.15 35.15 42.11
1.2198 1.7833 19750 3.9511 51.9950 35.15 35.15 42.07
1.6341 1.7879 19800 3.9495 51.9120 35.15 35.15 42.07
1.4008 1.7924 19850 3.9499 51.9284 35.15 35.15 42.15
1.2783 1.7969 19900 3.9481 51.8349 35.15 35.15 42.07
1.4135 1.8014 19950 3.9524 52.0613 35.15 35.15 42.07
1.2385 1.8059 20000 3.9555 52.2211 35.15 35.15 42.07
1.4618 1.8104 20050 3.9536 52.1206 35.15 35.15 42.07
1.4534 1.8150 20100 3.9553 52.2096 35.15 35.15 42.07
1.3215 1.8195 20150 3.9529 52.0847 35.15 35.15 42.07
1.2763 1.8240 20200 3.9502 51.9456 35.15 35.15 42.07
1.416 1.8285 20250 3.9519 52.0320 35.15 35.15 42.07
1.4918 1.8330 20300 3.9512 51.9977 35.15 35.15 42.07
1.5151 1.8375 20350 3.9509 51.9827 35.15 35.15 42.15
1.4899 1.8420 20400 3.9485 51.8587 35.15 35.15 42.07
1.4232 1.8466 20450 3.9504 51.9543 35.15 35.15 42.15
1.3304 1.8511 20500 3.9521 52.0431 35.15 35.15 42.11
1.4278 1.8556 20550 3.9505 51.9616 35.15 35.15 42.07
1.3533 1.8601 20600 3.9492 51.8929 35.15 35.15 42.07
1.3533 1.8646 20650 3.9508 51.9778 35.15 35.15 42.11
1.4097 1.8691 20700 3.9513 52.0051 35.15 35.15 42.07
1.2833 1.8736 20750 3.9512 51.9958 35.15 35.15 42.11
1.4683 1.8782 20800 3.9488 51.8729 35.15 35.15 42.07
1.5952 1.8827 20850 3.9481 51.8373 35.15 35.15 42.07
1.3602 1.8872 20900 3.9475 51.8043 35.15 35.15 42.07
1.4215 1.8917 20950 3.9486 51.8611 35.15 35.15 42.15
1.307 1.8962 21000 3.9492 51.8951 35.15 35.15 42.07
1.779 1.9007 21050 3.9487 51.8680 35.15 35.15 42.07
1.3704 1.9053 21100 3.9481 51.8357 35.15 35.15 42.07
1.4318 1.9098 21150 3.9490 51.8826 35.15 35.15 42.07
1.5043 1.9143 21200 3.9488 51.8713 35.15 35.15 42.15
1.3212 1.9188 21250 3.9487 51.8674 35.15 35.15 42.07
1.6014 1.9233 21300 3.9481 51.8386 35.15 35.15 42.07
1.4291 1.9278 21350 3.9490 51.8850 35.15 35.15 42.07
1.381 1.9323 21400 3.9487 51.8678 35.15 35.15 42.07
1.4124 1.9369 21450 3.9492 51.8921 35.15 35.15 42.15
1.4569 1.9414 21500 3.9491 51.8871 35.15 35.15 42.07
1.4362 1.9459 21550 3.9487 51.8701 35.15 35.15 42.11
1.3464 1.9504 21600 3.9491 51.8863 35.15 35.15 42.07
1.3904 1.9549 21650 3.9486 51.8635 35.15 35.15 42.07
1.2324 1.9594 21700 3.9488 51.8715 35.15 35.15 42.07
1.4577 1.9639 21750 3.9484 51.8519 35.15 35.15 42.07
1.2031 1.9685 21800 3.9487 51.8671 35.15 35.15 42.07
1.3759 1.9730 21850 3.9488 51.8737 35.15 35.15 42.07
1.4811 1.9775 21900 3.9487 51.8690 35.15 35.15 42.07
1.421 1.9820 21950 3.9492 51.8962 35.15 35.15 42.07
1.3799 1.9865 22000 3.9485 51.8580 35.15 35.15 42.07
1.4001 1.9910 22050 3.9490 51.8816 35.15 35.15 42.07
1.3246 1.9956 22100 3.9488 51.8742 35.15 35.15 42.11
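
The rows above are whitespace-separated, so they can be parsed directly. A minimal sketch, using a hypothetical `parse_row` helper and four rows copied from the table, that picks the row with the lowest validation loss among those samples:

```python
from typing import NamedTuple, Optional


class EvalRow(NamedTuple):
    train_loss: Optional[float]
    epoch: float
    step: int
    val_loss: float
    ppl: float


def parse_row(line: str) -> EvalRow:
    """Parse one whitespace-separated row of the results table."""
    parts = line.split()
    # The step-0 row logs "No log" (two tokens) instead of a training loss.
    if parts[:2] == ["No", "log"]:
        train_loss, rest = None, parts[2:]
    else:
        train_loss, rest = float(parts[0]), parts[1:]
    epoch, step, val_loss, ppl = rest[:4]
    return EvalRow(train_loss, float(epoch), int(step), float(val_loss), float(ppl))


# Four rows copied verbatim from the table above.
sample = [
    "No log 0 0 5.5503 257.3245 34.62 34.62 34.78",
    "3.5617 0.0045 50 5.2159 184.1822 35.15 35.15 42.11",
    "1.5084 1.7517 19400 3.9406 51.4486 35.15 35.15 42.07",
    "1.3246 1.9956 22100 3.9488 51.8742 35.15 35.15 42.11",
]

rows = [parse_row(line) for line in sample]
best = min(rows, key=lambda r: r.val_loss)
print(best.step, best.val_loss)  # prints "19400 3.9406"
```

Feeding every row of the table to `parse_row` would give the lowest validation loss over the full run.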

Framework versions

  • PEFT 0.18.1
  • Transformers 4.57.6
  • Pytorch 2.10.0+cu128
  • Datasets 4.5.0
  • Tokenizers 0.22.2

Model tree for Sunbird/translategemma-12b-ug40-lora: this model is an adapter of google/translategemma-12b-it.