---
library_name: transformers
language:
  - mt
license: cc-by-nc-sa-4.0
base_model: google/mt5-small
datasets:
  - webnlg/challenge-2023
model-index:
  - name: mt5-small_webnlg-mlt
    results:
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          type: webnlg_mt
          name: webnlg/challenge-2023
          config: mt
        metrics:
          - type: chrf
            value: 47.86
            name: ChrF
          - type: rougel
            value: 48.35
            name: Rouge-L
        source:
          name: MELABench Leaderboard
          url: https://huggingface.co/spaces/MLRS/MELABench
extra_gated_fields:
  Name: text
  Surname: text
  Date of Birth: date_picker
  Organisation: text
  Country: country
  I agree to use this model in accordance to the license and for non-commercial use ONLY: checkbox
---

# mT5-Small (WebNLG Maltese)

This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on the [webnlg/challenge-2023](https://huggingface.co/datasets/webnlg/challenge-2023) `mt` dataset. It achieves the following results on the test set:

- Loss: 4.0028
- Chrf:
  - Score: 31.6417
  - Char Order: 6
  - Word Order: 0
  - Beta: 2
- Rouge:
  - Rouge1: 0.3464
  - Rouge2: 0.1552
  - Rougel: 0.2797
  - Rougelsum: 0.2797
- Gen Len: 41.3142
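
The two headline metrics are chrF (a character n-gram F-score, here with character order 6, word order 0, and β = 2, so recall is weighted twice as heavily as precision) and Rouge-L (an F-measure over the longest common subsequence of tokens). The following is a simplified sketch of both, ignoring sacreBLEU's whitespace and effective-order handling, so values can differ slightly from the official implementations:

```python
from collections import Counter


def char_ngrams(text, n):
    """Count character n-grams, ignoring spaces (as chrF does)."""
    s = text.replace(" ", "")
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))


def chrf(hypothesis, reference, char_order=6, beta=2):
    """Average n-gram precision/recall over orders 1..char_order, then F-beta."""
    precisions, recalls = [], []
    for n in range(1, char_order + 1):
        hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / max(sum(hyp.values()), 1))
        recalls.append(overlap / max(sum(ref.values()), 1))
    p = sum(precisions) / char_order
    r = sum(recalls) / char_order
    if p + r == 0:
        return 0.0
    return 100 * (1 + beta**2) * p * r / (beta**2 * p + r)


def lcs_len(a, b):
    """Longest common subsequence length via dynamic programming."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[-1][-1]


def rouge_l(hypothesis, reference):
    """LCS-based F1 over whitespace tokens."""
    hyp, ref = hypothesis.split(), reference.split()
    lcs = lcs_len(hyp, ref)
    if lcs == 0:
        return 0.0
    p, r = lcs / len(hyp), lcs / len(ref)
    return 2 * p * r / (p + r)
```

Note that the table above reports Rouge scores in the 0–1 range while chrF is on a 0–100 scale; the MELABench leaderboard values in the metadata report both on 0–100.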

## Intended uses & limitations

The model is fine-tuned on a specific task and should be used for the same or a similar task. Any limitations present in the base model are inherited.
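
For WebNLG-style data-to-text generation, the model takes a linearised set of RDF triples as input and generates Maltese text. The exact input format used by the authors' fine-tuning script is not documented in this card, so the flat `subject | predicate | object` linearisation below (and the repository id in the comments) are assumptions for illustration only:

```python
def linearize_triples(triples):
    """Join (subject, predicate, object) triples into one source string.

    Hypothetical format: fields separated by " | ", triples by " && ".
    """
    return " && ".join(" | ".join(triple) for triple in triples)


source = linearize_triples([
    ("Valletta", "capitalOf", "Malta"),
    ("Malta", "language", "Maltese"),
])

# Generation with the fine-tuned checkpoint (repository id assumed;
# requires the gated-access form on the model page to be completed):
#
# from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
# tokenizer = AutoTokenizer.from_pretrained("MLRS/mt5-small_webnlg-mlt")
# model = AutoModelForSeq2SeqLM.from_pretrained("MLRS/mt5-small_webnlg-mlt")
# inputs = tokenizer(source, return_tensors="pt")
# outputs = model.generate(**inputs, max_new_tokens=64)
# print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```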

## Training procedure

The model was fine-tuned using a customised script.

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.001
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adafactor (no additional optimizer arguments)
- lr_scheduler_type: linear
- num_epochs: 200.0
- early_stopping_patience: 20
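
Since the customised training script is not reproduced here, the following is a hedged reconstruction of how these hyperparameters map onto the Hugging Face `Trainer` API; anything not listed in the card (output directory, evaluation and save strategies, best-model selection) is an assumption:

```python
from transformers import EarlyStoppingCallback, Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="mt5-small_webnlg-mlt",   # assumed
    learning_rate=1e-3,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    optim="adafactor",
    lr_scheduler_type="linear",
    num_train_epochs=200,
    eval_strategy="epoch",               # assumed: required for early stopping
    save_strategy="epoch",               # assumed
    load_best_model_at_end=True,         # assumed: pairs with early stopping
    predict_with_generate=True,
)

# Passed to Seq2SeqTrainer via callbacks=[...] to stop after 20 epochs
# without improvement (matching early_stopping_patience above):
early_stopping = EarlyStoppingCallback(early_stopping_patience=20)
```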

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Chrf Score | Chrf Char Order | Chrf Word Order | Chrf Beta | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:----------:|:---------------:|:---------------:|:---------:|:------:|:------:|:------:|:---------:|:-------:|
| No log        | 1.0   | 413   | 1.9425          | 36.6472    | 6               | 0               | 2         | 0.4422 | 0.2384 | 0.3786 | 0.3786    | 41.7718 |
| 2.9022        | 2.0   | 826   | 1.7744          | 39.3892    | 6               | 0               | 2         | 0.4914 | 0.2853 | 0.4246 | 0.4246    | 32.6222 |
| 1.1948        | 3.0   | 1239  | 1.7101          | 40.9010    | 6               | 0               | 2         | 0.5116 | 0.3069 | 0.4444 | 0.4445    | 30.5916 |
| 0.9411        | 4.0   | 1652  | 1.6656          | 41.7312    | 6               | 0               | 2         | 0.5228 | 0.3138 | 0.4521 | 0.4522    | 29.4324 |
| 0.8059        | 5.0   | 2065  | 1.7050          | 43.6392    | 6               | 0               | 2         | 0.5394 | 0.3266 | 0.4638 | 0.4638    | 31.0360 |
| 0.8059        | 6.0   | 2478  | 1.7013          | 45.7818    | 6               | 0               | 2         | 0.5490 | 0.3303 | 0.4685 | 0.4689    | 34.6811 |
| 0.7092        | 7.0   | 2891  | 1.7480          | 45.4992    | 6               | 0               | 2         | 0.5507 | 0.3378 | 0.4716 | 0.4716    | 32.6366 |
| 0.6343        | 8.0   | 3304  | 1.7694          | 46.6990    | 6               | 0               | 2         | 0.5574 | 0.3406 | 0.4767 | 0.4769    | 32.9538 |
| 0.5849        | 9.0   | 3717  | 1.8058          | 46.1749    | 6               | 0               | 2         | 0.5548 | 0.3394 | 0.4747 | 0.4751    | 32.9459 |
| 0.5417        | 10.0  | 4130  | 1.8047          | 45.7135    | 6               | 0               | 2         | 0.5525 | 0.3340 | 0.4731 | 0.4734    | 32.3598 |
| 0.506         | 11.0  | 4543  | 1.8555          | 45.2631    | 6               | 0               | 2         | 0.5511 | 0.3357 | 0.4740 | 0.4745    | 30.5940 |
| 0.506         | 12.0  | 4956  | 1.9072          | 48.1670    | 6               | 0               | 2         | 0.5647 | 0.3436 | 0.4779 | 0.4779    | 35.5598 |
| 0.4679        | 13.0  | 5369  | 1.8842          | 46.5682    | 6               | 0               | 2         | 0.5601 | 0.3440 | 0.4786 | 0.4786    | 32.7610 |
| 0.4355        | 14.0  | 5782  | 1.9549          | 45.8614    | 6               | 0               | 2         | 0.5570 | 0.3418 | 0.4765 | 0.4766    | 31.9219 |
| 0.4132        | 15.0  | 6195  | 2.0120          | 46.3608    | 6               | 0               | 2         | 0.5589 | 0.3433 | 0.4785 | 0.4785    | 31.5231 |
| 0.3921        | 16.0  | 6608  | 1.9967          | 47.3205    | 6               | 0               | 2         | 0.5629 | 0.3460 | 0.4799 | 0.4800    | 33.4625 |
| 0.3702        | 17.0  | 7021  | 2.0298          | 46.2312    | 6               | 0               | 2         | 0.5558 | 0.3375 | 0.4715 | 0.4717    | 32.0348 |
| 0.3702        | 18.0  | 7434  | 2.0882          | 47.4461    | 6               | 0               | 2         | 0.5645 | 0.3450 | 0.4780 | 0.4780    | 33.7477 |
| 0.3447        | 19.0  | 7847  | 2.0836          | 48.3709    | 6               | 0               | 2         | 0.5683 | 0.3471 | 0.4774 | 0.4774    | 34.9514 |
| 0.3259        | 20.0  | 8260  | 2.1483          | 47.2591    | 6               | 0               | 2         | 0.5662 | 0.3468 | 0.4788 | 0.4790    | 32.8258 |
| 0.314         | 21.0  | 8673  | 2.1717          | 47.1720    | 6               | 0               | 2         | 0.5619 | 0.3424 | 0.4774 | 0.4775    | 32.9495 |
| 0.296         | 22.0  | 9086  | 2.1921          | 47.8603    | 6               | 0               | 2         | 0.5706 | 0.3494 | 0.4835 | 0.4838    | 33.9309 |
| 0.296         | 23.0  | 9499  | 2.2782          | 47.4664    | 6               | 0               | 2         | 0.5647 | 0.3449 | 0.4774 | 0.4776    | 33.2060 |
| 0.2845        | 24.0  | 9912  | 2.2365          | 47.7147    | 6               | 0               | 2         | 0.5633 | 0.3448 | 0.4767 | 0.4767    | 33.8763 |
| 0.264         | 25.0  | 10325 | 2.3044          | 46.6542    | 6               | 0               | 2         | 0.5577 | 0.3387 | 0.4706 | 0.4706    | 32.8595 |
| 0.2523        | 26.0  | 10738 | 2.2961          | 48.6373    | 6               | 0               | 2         | 0.5696 | 0.3476 | 0.4796 | 0.4797    | 34.8505 |
| 0.2432        | 27.0  | 11151 | 2.3465          | 48.0798    | 6               | 0               | 2         | 0.5639 | 0.3417 | 0.4765 | 0.4767    | 34.2979 |
| 0.2342        | 28.0  | 11564 | 2.3723          | 46.5735    | 6               | 0               | 2         | 0.5581 | 0.3394 | 0.4755 | 0.4755    | 32.2901 |
| 0.2342        | 29.0  | 11977 | 2.4377          | 47.8037    | 6               | 0               | 2         | 0.5661 | 0.3445 | 0.4767 | 0.4770    | 33.9459 |
| 0.2213        | 30.0  | 12390 | 2.4408          | 47.6035    | 6               | 0               | 2         | 0.5604 | 0.3390 | 0.4738 | 0.4735    | 33.9045 |
| 0.209         | 31.0  | 12803 | 2.4824          | 47.9566    | 6               | 0               | 2         | 0.5636 | 0.3438 | 0.4752 | 0.4753    | 33.9045 |
| 0.2009        | 32.0  | 13216 | 2.5603          | 48.2374    | 6               | 0               | 2         | 0.5661 | 0.3438 | 0.4750 | 0.4750    | 34.2378 |
| 0.1928        | 33.0  | 13629 | 2.5011          | 47.6750    | 6               | 0               | 2         | 0.5630 | 0.3417 | 0.4749 | 0.4753    | 34.1279 |
| 0.1876        | 34.0  | 14042 | 2.5800          | 48.1924    | 6               | 0               | 2         | 0.5617 | 0.3373 | 0.4712 | 0.4710    | 34.8667 |
| 0.1876        | 35.0  | 14455 | 2.6025          | 49.7077    | 6               | 0               | 2         | 0.5739 | 0.3489 | 0.4783 | 0.4786    | 36.3231 |
| 0.1756        | 36.0  | 14868 | 2.6041          | 48.9179    | 6               | 0               | 2         | 0.5656 | 0.3397 | 0.4726 | 0.4726    | 35.8432 |
| 0.1683        | 37.0  | 15281 | 2.6548          | 48.8265    | 6               | 0               | 2         | 0.5680 | 0.3416 | 0.4776 | 0.4777    | 34.9946 |
| 0.1622        | 38.0  | 15694 | 2.6819          | 49.3948    | 6               | 0               | 2         | 0.5709 | 0.3458 | 0.4795 | 0.4794    | 36.3520 |
| 0.1573        | 39.0  | 16107 | 2.7615          | 48.7379    | 6               | 0               | 2         | 0.5662 | 0.3400 | 0.4721 | 0.4723    | 35.6745 |
| 0.1516        | 40.0  | 16520 | 2.7286          | 49.0554    | 6               | 0               | 2         | 0.5679 | 0.3446 | 0.4757 | 0.4758    | 36.1453 |
| 0.1516        | 41.0  | 16933 | 2.7290          | 49.3973    | 6               | 0               | 2         | 0.5677 | 0.3424 | 0.4740 | 0.4739    | 37.0631 |
| 0.1437        | 42.0  | 17346 | 2.8045          | 47.3914    | 6               | 0               | 2         | 0.5601 | 0.3371 | 0.4692 | 0.4690    | 33.9021 |

### Framework versions

- Transformers 4.48.2
- Pytorch 2.4.1+cu121
- Datasets 3.2.0
- Tokenizers 0.21.0

## License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. Permissions beyond the scope of this license may be available at https://mlrs.research.um.edu.mt/.

[CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/)

## Citation

This work was first presented in [MELABenchv1: Benchmarking Large Language Models against Smaller Fine-Tuned Models for Low-Resource Maltese NLP](https://aclanthology.org/2025.findings-acl.1053/). Cite it as follows:

```bibtex
@inproceedings{micallef-borg-2025-melabenchv1,
    title = "{MELAB}enchv1: Benchmarking Large Language Models against Smaller Fine-Tuned Models for Low-Resource {M}altese {NLP}",
    author = "Micallef, Kurt  and
      Borg, Claudia",
    editor = "Che, Wanxiang  and
      Nabende, Joyce  and
      Shutova, Ekaterina  and
      Pilehvar, Mohammad Taher",
    booktitle = "Findings of the Association for Computational Linguistics: ACL 2025",
    month = jul,
    year = "2025",
    address = "Vienna, Austria",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.findings-acl.1053/",
    doi = "10.18653/v1/2025.findings-acl.1053",
    pages = "20505--20527",
    ISBN = "979-8-89176-256-5",
}
```