---
library_name: transformers
license: cc-by-nc-4.0
base_model: facebook/mms-1b-all
tags:
  - generated_from_trainer
metrics:
  - wer
model-index:
  - name: mms-trilingual-dv-ar-en
    results: []
---

# mms-trilingual-dv-ar-en

This model is a fine-tuned version of [facebook/mms-1b-all](https://huggingface.co/facebook/mms-1b-all) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 0.1676
- Wer: 0.2509
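The Wer figure above is the word error rate: the word-level edit distance between the reference transcript and the hypothesis, divided by the number of reference words. A minimal sketch of the computation (illustrative only, not the exact metric implementation used during this training run):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # One-row dynamic-programming table over hypothesis words.
    d = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, d[0] = d[0], i
        for j, h in enumerate(hyp, 1):
            cur = d[j]
            # deletion, insertion, substitution (or match)
            d[j] = min(d[j] + 1, d[j - 1] + 1, prev + (r != h))
            prev = cur
    return d[-1] / len(ref)
```

A Wer of 0.2509 therefore means roughly one word-level error per four reference words.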

## Model description

More information needed

## Intended uses & limitations

More information needed
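As an illustration only: the model id below is inferred from the card name and uploader, the audio path is a placeholder, and the checkpoint may need language-specific handling (e.g. per-language processing for Dhivehi, Arabic, and English). The model should be loadable through the standard `transformers` ASR pipeline:

```python
from transformers import pipeline


def transcribe(audio_path: str,
               model_id: str = "Serialtechlab/mms-trilingual-dv-ar-en") -> str:
    """Transcribe an audio file with the fine-tuned checkpoint.

    Downloads the model weights on first call; `model_id` is assumed
    from the card name and may differ from the actual repository id.
    """
    asr = pipeline("automatic-speech-recognition", model=model_id)
    return asr(audio_path)["text"]
```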

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.0001
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- optimizer: fused AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 10
- mixed_precision_training: Native AMP
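The effective batch size is `train_batch_size × gradient_accumulation_steps = 4 × 4 = 16`, matching `total_train_batch_size`. The cosine schedule with a 0.1 warmup ratio can be sketched as follows (an illustrative approximation of the schedule shape, not Transformers' exact `get_cosine_schedule_with_warmup` implementation):

```python
import math


def lr_at(step: int, total_steps: int,
          base_lr: float = 1e-4, warmup_ratio: float = 0.1) -> float:
    """Linear warmup for the first warmup_ratio of steps, then cosine decay to 0."""
    warmup = int(total_steps * warmup_ratio)
    if step < warmup:
        return base_lr * step / max(1, warmup)
    progress = (step - warmup) / max(1, total_steps - warmup)
    return base_lr * 0.5 * (1 + math.cos(math.pi * progress))
```

The learning rate peaks at 0.0001 once warmup ends and decays smoothly to zero by the final step.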

### Training results

| Training Loss | Epoch  | Step | Validation Loss | Model Preparation Time | Wer    |
|:-------------:|:------:|:----:|:---------------:|:----------------------:|:------:|
| 1.2154        | 0.2581 | 250  | 0.9458          | 0.0137                 | 0.5295 |
| 0.9794        | 0.5163 | 500  | 0.8440          | 0.0137                 | 0.5125 |
| 1.0258        | 0.7744 | 750  | 0.8450          | 0.0137                 | 0.5020 |
| 0.9701        | 1.0320 | 1000 | 0.8394          | 0.0137                 | 0.5188 |
| 1.0218        | 1.2901 | 1250 | 0.7713          | 0.0137                 | 0.5261 |
| 0.8837        | 1.5483 | 1500 | 0.6487          | 0.0137                 | 0.4753 |
| 0.6842        | 1.8064 | 1750 | 0.4759          | 0.0137                 | 0.4750 |
| 0.5637        | 2.0640 | 2000 | 0.4537          | 0.0137                 | 0.4721 |
| 0.5311        | 2.3221 | 2250 | 0.4081          | 0.0137                 | 0.4645 |
| 0.5178        | 2.5803 | 2500 | 0.3942          | 0.0137                 | 0.4582 |
| 0.5217        | 2.8384 | 2750 | 0.3773          | 0.0137                 | 0.4499 |
| 0.4585        | 3.0960 | 3000 | 0.3777          | 0.0137                 | 0.4349 |
| 0.4436        | 3.3542 | 3250 | 0.3533          | 0.0137                 | 0.4144 |
| 0.4485        | 3.6123 | 3500 | 0.3508          | 0.0137                 | 0.4231 |
| 0.4181        | 3.8704 | 3750 | 0.3480          | 0.0137                 | 0.4328 |
| 0.389         | 4.1280 | 4000 | 0.3239          | 0.0137                 | 0.3931 |
| 0.4048        | 4.3862 | 4250 | 0.3356          | 0.0137                 | 0.4217 |
| 0.3756        | 4.6443 | 4500 | 0.3084          | 0.0137                 | 0.3796 |
| 0.3721        | 4.9024 | 4750 | 0.3000          | 0.0137                 | 0.3788 |
| 0.334         | 5.1600 | 5000 | 0.2935          | 0.0137                 | 0.3553 |
| 0.3029        | 5.4182 | 5250 | 0.2864          | 0.0137                 | 0.3482 |
| 0.3185        | 5.6763 | 5500 | 0.2754          | 0.0137                 | 0.3418 |
| 0.2919        | 5.9344 | 5750 | 0.2651          | 0.0137                 | 0.3330 |
| 0.2781        | 6.1920 | 6000 | 0.1975          |                        | 0.2901 |
| 0.2662        | 6.4502 | 6250 | 0.1923          |                        | 0.2871 |
| 0.2698        | 6.7083 | 6500 | 0.1861          |                        | 0.2841 |
| 0.282         | 6.9664 | 6750 | 0.1867          |                        | 0.2805 |
| 0.2528        | 7.2241 | 7000 | 0.1809          |                        | 0.2762 |
| 0.2579        | 7.4822 | 7250 | 0.1779          |                        | 0.2668 |
| 0.22          | 7.7403 | 7500 | 0.1782          |                        | 0.2642 |
| 0.2177        | 7.9985 | 7750 | 0.1740          |                        | 0.2604 |
| 0.2096        | 8.2561 | 8000 | 0.1728          |                        | 0.2609 |
| 0.1942        | 8.5142 | 8250 | 0.1697          |                        | 0.2562 |
| 0.2121        | 8.7723 | 8500 | 0.1677          |                        | 0.2536 |
| 0.1835        | 9.0299 | 8750 | 0.1683          |                        | 0.2536 |
| 0.2002        | 9.2881 | 9000 | 0.1678          |                        | 0.2522 |
| 0.2144        | 9.5462 | 9250 | 0.1676          |                        | 0.2519 |
| 0.1918        | 9.8043 | 9500 | 0.1676          |                        | 0.2509 |

### Framework versions

- Transformers 4.57.6
- Pytorch 2.9.0+cu126
- Datasets 4.0.0
- Tokenizers 0.22.2