---
library_name: transformers
license: cc-by-nc-4.0
base_model: facebook/nllb-200-distilled-600M
tags:
- generated_from_trainer
metrics:
- bleu
model-index:
- name: Bamalingua-4
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/bamalingua-bamalingua/Bamalingua-4/runs/cwv3gudb)

# Bamalingua-4

This model is a fine-tuned version of [facebook/nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.2322
- Bleu: 26.1291
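
As a quick reference, here is a minimal inference sketch for an NLLB-style checkpoint with the `transformers` API. The Hub repository id and the language codes (`fra_Latn` as source, `bam_Latn`, Bambara, as target) are assumptions, since the card does not state the language pair; substitute the actual values.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Hypothetical repository id; replace with the actual Hub path of Bamalingua-4.
model_id = "bamalingua/Bamalingua-4"

# src_lang="fra_Latn" is an assumed source language.
tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang="fra_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("Bonjour, comment allez-vous ?", return_tensors="pt")

# NLLB models select the target language by forcing its language token as the
# first generated token; "bam_Latn" (Bambara) is assumed here.
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("bam_Latn"),
    max_length=128,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```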

## Model description

The base checkpoint, NLLB-200 distilled 600M, is a 600M-parameter distilled variant of Meta's No Language Left Behind multilingual translation model covering 200 languages. More information about this particular fine-tune is needed.

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 16
- optimizer: AdamW (torch fused, `adamw_torch_fused`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 12
- mixed_precision_training: Native AMP
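
A minimal sketch of these settings as `Seq2SeqTrainingArguments`, assuming the standard `transformers` Seq2SeqTrainer workflow; `output_dir` and `predict_with_generate` are placeholders/assumptions not stated in the card.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="Bamalingua-4",      # placeholder output directory
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,  # effective train batch size of 16
    num_train_epochs=12,
    lr_scheduler_type="linear",
    optim="adamw_torch_fused",      # fused AdamW; betas/epsilon at their defaults
    seed=42,
    fp16=True,                      # Native AMP mixed precision
    predict_with_generate=True,     # assumed; needed to compute BLEU at eval time
)
```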

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu    |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|
| 0.332         | 1.0   | 3160  | 0.3005          | 17.7092 |
| 0.2854        | 2.0   | 6320  | 0.2758          | 20.3331 |
| 0.2492        | 3.0   | 9480  | 0.2624          | 21.5687 |
| 0.2248        | 4.0   | 12640 | 0.2531          | 23.2174 |
| 0.2122        | 5.0   | 15800 | 0.2464          | 23.7911 |
| 0.1978        | 6.0   | 18960 | 0.2414          | 24.5054 |
| 0.182         | 7.0   | 22120 | 0.2376          | 24.7621 |
| 0.1804        | 8.0   | 25280 | 0.2353          | 25.5139 |
| 0.168         | 9.0   | 28440 | 0.2343          | 25.7965 |
| 0.1609        | 10.0  | 31600 | 0.2328          | 25.8732 |
| 0.1583        | 11.0  | 34760 | 0.2323          | 26.1820 |
| 0.1603        | 12.0  | 37920 | 0.2322          | 26.1291 |
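
For reference, a minimal sketch of how the Bleu column is presumably computed, assuming the sacreBLEU implementation from the `evaluate` library (the card does not state the exact metric setup):

```python
import evaluate

# Assumed metric setup: sacreBLEU via the `evaluate` library.
bleu = evaluate.load("sacrebleu")

# Toy example; in practice, predictions are the model's decoded outputs
# and references are the ground-truth translations.
predictions = ["An example model output."]
references = [["An example reference translation."]]

result = bleu.compute(predictions=predictions, references=references)
print(round(result["score"], 4))  # BLEU on the 0-100 scale used in the table
```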

### Framework versions

- Transformers 4.57.1
- Pytorch 2.8.0+cu126
- Datasets 4.4.2
- Tokenizers 0.22.1