---
library_name: transformers
language:
- mt
license: cc-by-nc-sa-4.0
base_model: MLRS/BERTu
model-index:
- name: BERTu_sentiment-mlt
  results:
  - task:
      type: sentiment-analysis
      name: Sentiment Analysis
    dataset:
      type: mt-sentiment-analysis
      name: Maltese Sentiment Analysis
    metrics:
    - type: f1
      args: macro
      value: 85.11
      name: Macro-averaged F1
    source:
      name: MELABench Leaderboard
      url: https://huggingface.co/spaces/MLRS/MELABench
extra_gated_fields:
  Name: text
  Surname: text
  Date of Birth: date_picker
  Organisation: text
  Country: country
  I agree to use this model in accordance with the license and for non-commercial use ONLY: checkbox
---

# BERTu (Maltese Sentiment Analysis)

<img src="https://raw.githubusercontent.com/MLRS/BERTu/master/logo.png" width="200" style="margin-right: 1em" align="left" />

This model is a fine-tuned version of [MLRS/BERTu](https://huggingface.co/MLRS/BERTu) on [Sentiment Analysis](https://github.com/jerbarnes/typology_of_crosslingual/tree/master/data/sentiment/mt).
It achieves the following results on the test set:
- Loss: 0.5176
- F1: 0.8511

## Intended uses & limitations

The model is fine-tuned on a specific task and should only be used for the same or a similar task.
Any limitations present in the base model are inherited.

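As a minimal usage sketch (assuming this model is published as `MLRS/BERTu_sentiment-mlt`, matching the model-index name above; substitute the actual repository ID), inference can be run with the standard `transformers` pipeline:

```python
from transformers import pipeline

# Hypothetical repository ID; replace with the actual one.
classifier = pipeline("sentiment-analysis", model="MLRS/BERTu_sentiment-mlt")

# "This film is very beautiful!" in Maltese.
print(classifier("Dan il-film sabiħ ħafna!"))
# e.g. [{'label': '...', 'score': ...}]; label names depend on the fine-tuning config.
```

Since access to the model is gated, you may need to authenticate first (e.g. with `huggingface-cli login`).
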
## Training procedure

The model was fine-tuned using a customised [script](https://github.com/MLRS/MELABench/blob/main/finetuning/run_classification.py).

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 32
- seed: 2
- optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: inverse_sqrt
- lr_scheduler_warmup_ratio: 0.005
- num_epochs: 200.0
- early_stopping_patience: 20

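For reference, here is a minimal sketch of how these hyperparameters map onto `TrainingArguments` in `transformers`. This is an illustration, not the actual MELABench script: the label count, metric implementation (scikit-learn), and dataset preparation are assumptions, and the tokenised datasets are left as placeholders.

```python
import numpy as np
from sklearn.metrics import f1_score
from transformers import (
    AutoModelForSequenceClassification,
    EarlyStoppingCallback,
    Trainer,
    TrainingArguments,
)

def compute_metrics(eval_pred):
    # Macro-averaged F1, matching the reported metric.
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {"f1": f1_score(labels, preds, average="macro")}

model = AutoModelForSequenceClassification.from_pretrained(
    "MLRS/BERTu",
    num_labels=2,  # assumption: binary positive/negative labels; adjust to the dataset
)

training_args = TrainingArguments(
    output_dir="bertu-sentiment-mlt",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,
    seed=2,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="inverse_sqrt",
    warmup_ratio=0.005,
    num_train_epochs=200.0,
    eval_strategy="epoch",        # evaluate once per epoch, as in the results table
    save_strategy="epoch",
    load_best_model_at_end=True,  # needed for early stopping
    metric_for_best_model="f1",
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=None,  # TODO: replace with the tokenised train split
    eval_dataset=None,   # TODO: replace with the tokenised validation split
    compute_metrics=compute_metrics,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=20)],
)
# trainer.train()  # run once the datasets are supplied
```
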
### Training results

| Training Loss | Epoch | Step | Validation Loss | F1     |
|:-------------:|:-----:|:----:|:---------------:|:------:|
| No log        | 1.0   | 38   | 0.4389          | 0.7914 |
| No log        | 2.0   | 76   | 0.2928          | 0.9020 |
| No log        | 3.0   | 114  | 0.2375          | 0.8766 |
| No log        | 4.0   | 152  | 0.2501          | 0.9076 |
| No log        | 5.0   | 190  | 0.2855          | 0.9215 |
| No log        | 6.0   | 228  | 0.3583          | 0.8970 |
| No log        | 7.0   | 266  | 0.4191          | 0.8731 |
| No log        | 8.0   | 304  | 0.4540          | 0.8865 |
| No log        | 9.0   | 342  | 0.4227          | 0.8970 |
| No log        | 10.0  | 380  | 0.4526          | 0.8970 |
| No log        | 11.0  | 418  | 0.4572          | 0.8970 |
| No log        | 12.0  | 456  | 0.4483          | 0.8970 |
| No log        | 13.0  | 494  | 0.4574          | 0.8970 |
| 0.1024        | 14.0  | 532  | 0.4587          | 0.8970 |
| 0.1024        | 15.0  | 570  | 0.4676          | 0.8970 |
| 0.1024        | 16.0  | 608  | 0.4732          | 0.8970 |
| 0.1024        | 17.0  | 646  | 0.4772          | 0.8970 |
| 0.1024        | 18.0  | 684  | 0.4897          | 0.8849 |
| 0.1024        | 19.0  | 722  | 0.4938          | 0.8849 |
| 0.1024        | 20.0  | 760  | 0.4950          | 0.8849 |
| 0.1024        | 21.0  | 798  | 0.4947          | 0.8970 |
| 0.1024        | 22.0  | 836  | 0.4963          | 0.8970 |
| 0.1024        | 23.0  | 874  | 0.4993          | 0.8970 |
| 0.1024        | 24.0  | 912  | 0.5010          | 0.8970 |
| 0.1024        | 25.0  | 950  | 0.5030          | 0.8970 |

### Framework versions

- Transformers 4.51.1
- PyTorch 2.7.0+cu126
- Datasets 3.2.0
- Tokenizers 0.21.1

## License

This work is licensed under a
[Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License][cc-by-nc-sa].
Permissions beyond the scope of this license may be available at [https://mlrs.research.um.edu.mt/](https://mlrs.research.um.edu.mt/).

[![CC BY-NC-SA 4.0][cc-by-nc-sa-image]][cc-by-nc-sa]

[cc-by-nc-sa]: http://creativecommons.org/licenses/by-nc-sa/4.0/
[cc-by-nc-sa-image]: https://licensebuttons.net/l/by-nc-sa/4.0/88x31.png

## Citation

This work was first presented in [MELABenchv1: Benchmarking Large Language Models against Smaller Fine-Tuned Models for Low-Resource Maltese NLP](https://arxiv.org/abs/2506.04385).
Cite it as follows:

```bibtex
@inproceedings{micallef-borg-2025-melabenchv1,
    title = "{MELAB}enchv1: Benchmarking Large Language Models against Smaller Fine-Tuned Models for Low-Resource {M}altese {NLP}",
    author = "Micallef, Kurt and
      Borg, Claudia",
    editor = "Che, Wanxiang and
      Nabende, Joyce and
      Shutova, Ekaterina and
      Pilehvar, Mohammad Taher",
    booktitle = "Findings of the Association for Computational Linguistics: ACL 2025",
    month = jul,
    year = "2025",
    address = "Vienna, Austria",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.findings-acl.1053/",
    doi = "10.18653/v1/2025.findings-acl.1053",
    pages = "20505--20527",
    ISBN = "979-8-89176-256-5",
}
```