flan-t5laa2-small

This model is a fine-tuned version of hrezaei/flan-t5laa2-small, trained on the sample-350BT subset of the HuggingFaceFW/fineweb dataset. It achieves the following results on the evaluation set:

  • Perplexity: 1.2301
  • Loss: 0.2071
  • Accuracy: 0.0032
  • Lookahead Perplexity: 521.5613
  • Lookahead Loss: 6.2568
  • Base Perplexity: 1.2138
  • Base Loss: 0.1937
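
The perplexity figures above appear to be the exponential of the corresponding loss values. The sketch below checks that relationship, assuming each loss is a mean cross-entropy in nats. The `perplexity` helper and the `reported` dictionary are illustrative names and are not part of the training code.

```python
import math

# (loss, perplexity) pairs copied from the evaluation summary above.
reported = {
    "eval": (0.2071, 1.2301),
    "lookahead": (6.2568, 521.5613),
    "base": (0.1937, 1.2138),
}


def perplexity(loss: float) -> float:
    """Perplexity as exp(loss), assuming a mean cross-entropy loss in nats."""
    return math.exp(loss)


for name, (loss, ppl) in reported.items():
    # Small differences are expected because the card rounds to four decimals.
    print(f"{name}: exp({loss}) = {perplexity(loss):.4f} (reported {ppl})")
```

Under this assumption the computed values match the reported ones up to rounding, which also shows why a modest change in lookahead loss corresponds to a large change in lookahead perplexity.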

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 2
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch_fused with betas=(0.9, 0.999) and epsilon=1e-08, no additional optimizer arguments
  • lr_scheduler_type: linear
  • training_steps: 524288
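
As a rough illustration of how these values fit together, the sketch below recomputes the total batch size and evaluates a linear learning-rate decay. It assumes no warmup steps and no gradient accumulation, since neither is listed above. The function names are hypothetical and are not taken from the training script.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 5e-05
PER_DEVICE_TRAIN_BATCH_SIZE = 16
NUM_DEVICES = 2
TRAINING_STEPS = 524288

# Assumption: the card lists no warmup, so none is applied here.
WARMUP_STEPS = 0


def total_batch_size(per_device: int, num_devices: int, grad_accum: int = 1) -> int:
    """Examples consumed per optimizer step across all devices."""
    return per_device * num_devices * grad_accum


def linear_lr(step: int,
              base_lr: float = LEARNING_RATE,
              total_steps: int = TRAINING_STEPS,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Linear warmup followed by linear decay to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


batch = total_batch_size(PER_DEVICE_TRAIN_BATCH_SIZE, NUM_DEVICES)
print("total_train_batch_size:", batch)
print("examples seen over training:", batch * TRAINING_STEPS)
for step in (0, TRAINING_STEPS // 2, TRAINING_STEPS):
    print(f"step {step}: lr = {linear_lr(step):.2e}")
```

With these assumptions the learning rate starts at 5e-05, reaches half that value at the midpoint, and ends at zero on the final step.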

Training results

| Training Loss | Epoch | Step | Perplexity | Validation Loss | Accuracy | Lookahead Perplexity | Lookahead Loss | Base Perplexity | Base Loss |
|---|---|---|---|---|---|---|---|---|---|
| 0.6126 | 0.0095 | 5000 | 1.2367 | 0.2124 | 0.0032 | 8068.5739 | 8.9957 | 1.2138 | 0.1937 |
| 0.6026 | 0.0191 | 10000 | 1.2345 | 0.2107 | 0.0032 | 3266.5187 | 8.0915 | 1.2138 | 0.1937 |
| 0.6282 | 0.0286 | 15000 | 1.2334 | 0.2098 | 0.0032 | 2089.9703 | 7.6449 | 1.2138 | 0.1937 |
| 0.6053 | 0.0381 | 20000 | 1.2329 | 0.2094 | 0.0032 | 1657.4587 | 7.4130 | 1.2138 | 0.1937 |
| 0.6244 | 0.0477 | 25000 | 1.2326 | 0.2091 | 0.0032 | 1433.9858 | 7.2682 | 1.2138 | 0.1937 |
| 0.6301 | 0.0572 | 30000 | 1.2323 | 0.2089 | 0.0032 | 1296.7085 | 7.1676 | 1.2138 | 0.1937 |
| 0.6135 | 0.0668 | 35000 | 1.2321 | 0.2087 | 0.0032 | 1204.5739 | 7.0939 | 1.2138 | 0.1937 |
| 0.5998 | 0.0763 | 40000 | 1.2320 | 0.2086 | 0.0032 | 1134.1594 | 7.0336 | 1.2138 | 0.1937 |
| 0.5972 | 0.0858 | 45000 | 1.2319 | 0.2085 | 0.0032 | 1079.1354 | 6.9839 | 1.2138 | 0.1937 |
| 0.5947 | 0.0954 | 50000 | 1.2318 | 0.2085 | 0.0032 | 1038.1346 | 6.9452 | 1.2138 | 0.1937 |
| 0.6257 | 0.1049 | 55000 | 1.2317 | 0.2084 | 0.0032 | 1001.9005 | 6.9097 | 1.2138 | 0.1937 |
| 0.6075 | 0.1144 | 60000 | 1.2316 | 0.2083 | 0.0032 | 971.3019 | 6.8786 | 1.2138 | 0.1937 |
| 0.6094 | 1.0048 | 65000 | 1.2315 | 0.2083 | 0.0032 | 944.4363 | 6.8506 | 1.2138 | 0.1937 |
| 0.6082 | 1.0143 | 70000 | 1.2315 | 0.2082 | 0.0032 | 920.2534 | 6.8246 | 1.2138 | 0.1937 |
| 0.6047 | 1.0238 | 75000 | 1.2314 | 0.2082 | 0.0032 | 896.8586 | 6.7989 | 1.2138 | 0.1937 |
| 0.6081 | 1.0334 | 80000 | 1.2314 | 0.2081 | 0.0032 | 877.9252 | 6.7776 | 1.2138 | 0.1937 |
| 0.6175 | 1.0429 | 85000 | 1.2313 | 0.2081 | 0.0032 | 860.2578 | 6.7572 | 1.2138 | 0.1937 |
| 0.606 | 1.0525 | 90000 | 1.2313 | 0.2080 | 0.0032 | 844.6464 | 6.7389 | 1.2138 | 0.1937 |
| 0.6149 | 1.0620 | 95000 | 1.2312 | 0.2080 | 0.0032 | 828.3173 | 6.7194 | 1.2138 | 0.1937 |
| 0.6122 | 1.0715 | 100000 | 1.2312 | 0.2080 | 0.0032 | 813.9770 | 6.7019 | 1.2138 | 0.1937 |
| 0.6199 | 1.0811 | 105000 | 1.2311 | 0.2079 | 0.0032 | 799.4303 | 6.6839 | 1.2138 | 0.1937 |
| 0.5994 | 1.0906 | 110000 | 1.2311 | 0.2079 | 0.0032 | 786.4358 | 6.6675 | 1.2138 | 0.1937 |
| 0.6021 | 1.1001 | 115000 | 1.2311 | 0.2079 | 0.0032 | 775.5390 | 6.6536 | 1.2138 | 0.1937 |
| 0.6186 | 1.1097 | 120000 | 1.2310 | 0.2078 | 0.0032 | 764.0533 | 6.6386 | 1.2138 | 0.1937 |
| 0.596 | 1.1192 | 125000 | 1.2310 | 0.2078 | 0.0032 | 753.5015 | 6.6247 | 1.2138 | 0.1937 |
| 0.6063 | 2.0095 | 130000 | 1.2310 | 0.2078 | 0.0032 | 743.7327 | 6.6117 | 1.2138 | 0.1937 |
| 0.5996 | 2.0191 | 135000 | 1.2309 | 0.2078 | 0.0032 | 733.5159 | 6.5978 | 1.2138 | 0.1937 |
| 0.6263 | 2.0286 | 140000 | 1.2309 | 0.2077 | 0.0032 | 724.4130 | 6.5854 | 1.2138 | 0.1937 |
| 0.6015 | 2.0381 | 145000 | 1.2309 | 0.2077 | 0.0032 | 715.8316 | 6.5734 | 1.2138 | 0.1937 |
| 0.6236 | 2.0477 | 150000 | 1.2308 | 0.2077 | 0.0032 | 708.4340 | 6.5631 | 1.2138 | 0.1937 |
| 0.6288 | 2.0572 | 155000 | 1.2308 | 0.2077 | 0.0032 | 700.8450 | 6.5523 | 1.2138 | 0.1937 |
| 0.6123 | 2.0668 | 160000 | 1.2308 | 0.2077 | 0.0032 | 693.3540 | 6.5415 | 1.2138 | 0.1937 |
| 0.5964 | 2.0763 | 165000 | 1.2308 | 0.2076 | 0.0032 | 685.6423 | 6.5304 | 1.2138 | 0.1937 |
| 0.5974 | 2.0858 | 170000 | 1.2307 | 0.2076 | 0.0032 | 678.1766 | 6.5194 | 1.2138 | 0.1937 |
| 0.5933 | 2.0954 | 175000 | 1.2307 | 0.2076 | 0.0032 | 672.4544 | 6.5109 | 1.2138 | 0.1937 |
| 0.6239 | 2.1049 | 180000 | 1.2307 | 0.2076 | 0.0032 | 666.2288 | 6.5016 | 1.2138 | 0.1937 |
| 0.6071 | 2.1144 | 185000 | 1.2307 | 0.2076 | 0.0032 | 660.4713 | 6.4930 | 1.2138 | 0.1937 |
| 0.6087 | 3.0048 | 190000 | 1.2306 | 0.2075 | 0.0032 | 654.7854 | 6.4843 | 1.2138 | 0.1937 |
| 0.6074 | 3.0143 | 195000 | 1.2306 | 0.2075 | 0.0032 | 649.3958 | 6.4760 | 1.2138 | 0.1937 |
| 0.6046 | 3.0238 | 200000 | 1.2306 | 0.2075 | 0.0032 | 643.2597 | 6.4665 | 1.2138 | 0.1937 |
| 0.6084 | 3.0334 | 205000 | 1.2306 | 0.2075 | 0.0032 | 638.5549 | 6.4592 | 1.2138 | 0.1937 |
| 0.614 | 3.0429 | 210000 | 1.2306 | 0.2075 | 0.0032 | 633.8094 | 6.4517 | 1.2138 | 0.1937 |
| 0.605 | 3.0525 | 215000 | 1.2305 | 0.2075 | 0.0032 | 629.8567 | 6.4455 | 1.2138 | 0.1937 |
| 0.6158 | 3.0620 | 220000 | 1.2305 | 0.2074 | 0.0032 | 624.9755 | 6.4377 | 1.2138 | 0.1937 |
| 0.6103 | 3.0715 | 225000 | 1.2305 | 0.2074 | 0.0032 | 620.7078 | 6.4309 | 1.2138 | 0.1937 |
| 0.6174 | 3.0811 | 230000 | 1.2305 | 0.2074 | 0.0032 | 615.9860 | 6.4232 | 1.2138 | 0.1937 |
| 0.5991 | 3.0906 | 235000 | 1.2305 | 0.2074 | 0.0032 | 611.7069 | 6.4163 | 1.2138 | 0.1937 |
| 0.6033 | 3.1001 | 240000 | 1.2305 | 0.2074 | 0.0032 | 608.3681 | 6.4108 | 1.2138 | 0.1937 |
| 0.6166 | 3.1097 | 245000 | 1.2305 | 0.2074 | 0.0032 | 604.4235 | 6.4043 | 1.2138 | 0.1937 |
| 0.5941 | 3.1192 | 250000 | 1.2304 | 0.2074 | 0.0032 | 600.8208 | 6.3983 | 1.2138 | 0.1937 |
| 0.6054 | 4.0095 | 255000 | 1.2304 | 0.2074 | 0.0032 | 597.4248 | 6.3926 | 1.2138 | 0.1937 |
| 0.5991 | 4.0191 | 260000 | 1.2304 | 0.2073 | 0.0032 | 593.5735 | 6.3862 | 1.2138 | 0.1937 |
| 0.6243 | 4.0286 | 265000 | 1.2304 | 0.2073 | 0.0032 | 590.2097 | 6.3805 | 1.2138 | 0.1937 |
| 0.6023 | 4.0381 | 270000 | 1.2304 | 0.2073 | 0.0032 | 586.9999 | 6.3750 | 1.2138 | 0.1937 |
| 0.6215 | 4.0477 | 275000 | 1.2304 | 0.2073 | 0.0032 | 584.3271 | 6.3705 | 1.2138 | 0.1937 |
| 0.6266 | 4.0572 | 280000 | 1.2304 | 0.2073 | 0.0032 | 581.6015 | 6.3658 | 1.2138 | 0.1937 |
| 0.6115 | 4.0668 | 285000 | 1.2303 | 0.2073 | 0.0032 | 578.7785 | 6.3609 | 1.2138 | 0.1937 |
| 0.5971 | 4.0763 | 290000 | 1.2303 | 0.2073 | 0.0032 | 575.7152 | 6.3556 | 1.2138 | 0.1937 |
| 0.5976 | 4.0858 | 295000 | 1.2303 | 0.2073 | 0.0032 | 572.6544 | 6.3503 | 1.2138 | 0.1937 |
| 0.594 | 4.0954 | 300000 | 1.2303 | 0.2073 | 0.0032 | 570.5147 | 6.3465 | 1.2138 | 0.1937 |
| 0.6216 | 4.1049 | 305000 | 1.2303 | 0.2073 | 0.0032 | 568.0579 | 6.3422 | 1.2138 | 0.1937 |
| 0.6069 | 4.1144 | 310000 | 1.2303 | 0.2073 | 0.0032 | 565.8226 | 6.3383 | 1.2138 | 0.1937 |
| 0.6083 | 5.0048 | 315000 | 1.2303 | 0.2072 | 0.0032 | 563.5451 | 6.3342 | 1.2138 | 0.1937 |
| 0.6077 | 5.0143 | 320000 | 1.2303 | 0.2072 | 0.0032 | 561.3008 | 6.3303 | 1.2138 | 0.1937 |
| 0.6048 | 5.0238 | 325000 | 1.2303 | 0.2072 | 0.0032 | 558.7242 | 6.3257 | 1.2138 | 0.1937 |
| 0.6075 | 5.0334 | 330000 | 1.2303 | 0.2072 | 0.0032 | 556.8461 | 6.3223 | 1.2138 | 0.1937 |
| 0.6143 | 5.0429 | 335000 | 1.2302 | 0.2072 | 0.0032 | 554.9196 | 6.3188 | 1.2138 | 0.1937 |
| 0.6055 | 5.0525 | 340000 | 1.2302 | 0.2072 | 0.0032 | 553.4305 | 6.3161 | 1.2138 | 0.1937 |
| 0.6154 | 5.0620 | 345000 | 1.2302 | 0.2072 | 0.0032 | 551.4133 | 6.3125 | 1.2138 | 0.1937 |
| 0.6104 | 5.0715 | 350000 | 1.2302 | 0.2072 | 0.0032 | 549.7351 | 6.3094 | 1.2138 | 0.1937 |
| 0.6214 | 5.0811 | 355000 | 1.2302 | 0.2072 | 0.0032 | 547.6832 | 6.3057 | 1.2138 | 0.1937 |
| 0.6011 | 5.0906 | 360000 | 1.2302 | 0.2072 | 0.0032 | 545.9130 | 6.3025 | 1.2138 | 0.1937 |
| 0.6025 | 5.1001 | 365000 | 1.2302 | 0.2072 | 0.0032 | 544.6638 | 6.3002 | 1.2138 | 0.1937 |
| 0.616 | 5.1097 | 370000 | 1.2302 | 0.2072 | 0.0032 | 543.0777 | 6.2973 | 1.2138 | 0.1937 |
| 0.5951 | 5.1192 | 375000 | 1.2302 | 0.2072 | 0.0032 | 541.6890 | 6.2947 | 1.2138 | 0.1937 |
| 0.6068 | 6.0095 | 380000 | 1.2302 | 0.2072 | 0.0032 | 540.3254 | 6.2922 | 1.2138 | 0.1937 |
| 0.5974 | 6.0191 | 385000 | 1.2302 | 0.2072 | 0.0032 | 538.8241 | 6.2894 | 1.2138 | 0.1937 |
| 0.6267 | 6.0286 | 390000 | 1.2302 | 0.2072 | 0.0032 | 537.5214 | 6.2870 | 1.2138 | 0.1937 |
| 0.5994 | 6.0381 | 395000 | 1.2302 | 0.2071 | 0.0032 | 536.2929 | 6.2847 | 1.2138 | 0.1937 |
| 0.6225 | 6.0477 | 400000 | 1.2302 | 0.2071 | 0.0032 | 535.2460 | 6.2827 | 1.2138 | 0.1937 |
| 0.629 | 6.0572 | 405000 | 1.2302 | 0.2071 | 0.0032 | 534.2133 | 6.2808 | 1.2138 | 0.1937 |
| 0.6107 | 6.0668 | 410000 | 1.2301 | 0.2071 | 0.0032 | 533.1691 | 6.2788 | 1.2138 | 0.1937 |
| 0.5961 | 6.0763 | 415000 | 1.2301 | 0.2071 | 0.0032 | 532.0580 | 6.2768 | 1.2138 | 0.1937 |
| 0.5962 | 6.0858 | 420000 | 1.2301 | 0.2071 | 0.0032 | 530.9045 | 6.2746 | 1.2138 | 0.1937 |
| 0.5928 | 6.0954 | 425000 | 1.2301 | 0.2071 | 0.0032 | 530.1710 | 6.2732 | 1.2138 | 0.1937 |
| 0.6228 | 6.1049 | 430000 | 1.2301 | 0.2071 | 0.0032 | 529.3049 | 6.2716 | 1.2138 | 0.1937 |
| 0.6064 | 6.1144 | 435000 | 1.2301 | 0.2071 | 0.0032 | 528.5336 | 6.2701 | 1.2138 | 0.1937 |
| 0.608 | 7.0048 | 440000 | 1.2301 | 0.2071 | 0.0032 | 527.7939 | 6.2687 | 1.2138 | 0.1937 |
| 0.6074 | 7.0143 | 445000 | 1.2301 | 0.2071 | 0.0032 | 527.0764 | 6.2673 | 1.2138 | 0.1937 |
| 0.6062 | 7.0238 | 450000 | 1.2301 | 0.2071 | 0.0032 | 526.2217 | 6.2657 | 1.2138 | 0.1937 |
| 0.6076 | 7.0334 | 455000 | 1.2301 | 0.2071 | 0.0032 | 525.6428 | 6.2646 | 1.2138 | 0.1937 |
| 0.6147 | 7.0429 | 460000 | 1.2301 | 0.2071 | 0.0032 | 525.0607 | 6.2635 | 1.2138 | 0.1937 |
| 0.6053 | 7.0525 | 465000 | 1.2301 | 0.2071 | 0.0032 | 524.6511 | 6.2627 | 1.2138 | 0.1937 |
| 0.6144 | 7.0620 | 470000 | 1.2301 | 0.2071 | 0.0032 | 524.1066 | 6.2617 | 1.2138 | 0.1937 |
| 0.6112 | 7.0715 | 475000 | 1.2301 | 0.2071 | 0.0032 | 523.6863 | 6.2609 | 1.2138 | 0.1937 |
| 0.6218 | 7.0811 | 480000 | 1.2301 | 0.2071 | 0.0032 | 523.2272 | 6.2600 | 1.2138 | 0.1937 |
| 0.5981 | 7.0906 | 485000 | 1.2301 | 0.2071 | 0.0032 | 522.8427 | 6.2593 | 1.2138 | 0.1937 |
| 0.6033 | 7.1001 | 490000 | 1.2301 | 0.2071 | 0.0032 | 522.6008 | 6.2588 | 1.2138 | 0.1937 |
| 0.6172 | 7.1097 | 495000 | 1.2301 | 0.2071 | 0.0032 | 522.3052 | 6.2583 | 1.2138 | 0.1937 |
| 0.5926 | 7.1192 | 500000 | 1.2301 | 0.2071 | 0.0032 | 522.0964 | 6.2579 | 1.2138 | 0.1937 |
| 0.608 | 8.0095 | 505000 | 1.2301 | 0.2071 | 0.0032 | 521.9141 | 6.2575 | 1.2138 | 0.1937 |
| 0.5975 | 8.0191 | 510000 | 1.2301 | 0.2071 | 0.0032 | 521.7486 | 6.2572 | 1.2138 | 0.1937 |
| 0.6248 | 8.0286 | 515000 | 1.2301 | 0.2071 | 0.0032 | 521.6438 | 6.2570 | 1.2138 | 0.1937 |
| 0.6013 | 8.0381 | 520000 | 1.2301 | 0.2071 | 0.0032 | 521.5793 | 6.2569 | 1.2138 | 0.1937 |

Framework versions

  • Transformers 4.57.0.dev0
  • PyTorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1

Model size: 93.4M params (Safetensors, tensor type F32)