	---
	library_name: transformers
	language:
	- ta
	license: apache-2.0
	base_model: openai/whisper-tiny
	tags:
	- generated_from_trainer
	datasets:
	- fixie-ai/common_voice_17_0
	- google/fleurs
	- ai4bharat/Kathbath
	- deepdml/iisc-mile-tamil-asr
	- deepdml/microsoft-speech-corpus-indian
	- deepdml/openslr65-tamil
	metrics:
	- wer
	model-index:
	- name: Whisper Tiny ta
	  results:
	  - task:
	      name: Automatic Speech Recognition
	      type: automatic-speech-recognition
	    dataset:
	      name: Common Voice 17.0
	      type: fixie-ai/common_voice_17_0
	    metrics:
	    - name: Wer
	      type: wer
	      value: 51.66621581584676
	---

	# Whisper Tiny ta

	This model is a fine-tuned version of [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny) for Tamil (`ta`) automatic speech recognition. The card metadata lists six datasets, with evaluation reported on Common Voice 17.0.
	It achieves the following results on the evaluation set:
	- Loss: 0.2641
	- Wer: 51.6662
	- Cer: 11.8757
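
For readers unfamiliar with the metrics, the sketch below shows how word error rate (WER) and character error rate (CER) are commonly computed: total edit distance divided by total reference length, expressed here as a percentage. It is a minimal illustration, not the evaluation code used for this model. The function names `edit_distance` and `error_rate` are hypothetical, and any text normalization applied before scoring is not specified in this card, so none is applied here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def error_rate(references, hypotheses, unit="word"):
    """Corpus-level error rate in percent: total edits / total reference units.

    unit="word" gives WER, unit="char" gives CER (spaces count as characters,
    which is an assumption of this sketch).
    """
    edits = 0
    total = 0
    for ref, hyp in zip(references, hypotheses):
        ref_units = ref.split() if unit == "word" else list(ref)
        hyp_units = hyp.split() if unit == "word" else list(hyp)
        edits += edit_distance(ref_units, hyp_units)
        total += len(ref_units)
    return 100.0 * edits / total


if __name__ == "__main__":
    refs = ["the cat sat"]
    hyps = ["the cat sit"]
    print(f"WER: {error_rate(refs, hyps, 'word'):.2f}")
    print(f"CER: {error_rate(refs, hyps, 'char'):.2f}")
```

Because a single wrong character makes the whole word count as an error, WER is typically much higher than CER, which is consistent with the gap between the two figures reported above.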

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 1e-05
	- train_batch_size: 64
	- eval_batch_size: 64
	- seed: 42
	- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_ratio: 0.04
	- training_steps: 8000

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Wer     | Cer     |
	|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|
	| 0.2508        | 0.125 | 1000 | 0.3392          | 62.4235 | 15.5864 |
	| 0.1747        | 0.25  | 2000 | 0.3003          | 57.3701 | 13.4963 |
	| 0.1586        | 0.375 | 3000 | 0.2905          | 55.5023 | 13.1754 |
	| 0.1244        | 0.5   | 4000 | 0.2812          | 53.6500 | 12.6062 |
	| 0.1361        | 0.625 | 5000 | 0.2687          | 52.9080 | 12.3268 |
	| 0.1093        | 0.75  | 6000 | 0.2685          | 52.2523 | 12.0787 |
	| 0.1141        | 0.875 | 7000 | 0.2647          | 51.9844 | 11.9065 |
	| 0.1274        | 1.0   | 8000 | 0.2641          | 51.6662 | 11.8757 |
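
The short script below summarizes the evaluation rows above: it picks the checkpoint with the lowest WER, computes the relative WER and CER reduction between the first and best evaluations, and derives the number of steps per epoch from the Step and Epoch columns. The values are copied from the table; the `Row` type and variable names are illustrative only.

```python
from collections import namedtuple

Row = namedtuple("Row", "step epoch val_loss wer cer")

# Evaluation rows copied from the training results table.
rows = [
    Row(1000, 0.125, 0.3392, 62.4235, 15.5864),
    Row(2000, 0.25, 0.3003, 57.3701, 13.4963),
    Row(3000, 0.375, 0.2905, 55.5023, 13.1754),
    Row(4000, 0.5, 0.2812, 53.6500, 12.6062),
    Row(5000, 0.625, 0.2687, 52.9080, 12.3268),
    Row(6000, 0.75, 0.2685, 52.2523, 12.0787),
    Row(7000, 0.875, 0.2647, 51.9844, 11.9065),
    Row(8000, 1.0, 0.2641, 51.6662, 11.8757),
]

first = rows[0]
best = min(rows, key=lambda r: r.wer)  # checkpoint with the lowest WER

# Relative reduction, in percent, from the first evaluation to the best one.
rel_wer = 100.0 * (first.wer - best.wer) / first.wer
rel_cer = 100.0 * (first.cer - best.cer) / first.cer

# Steps per epoch implied by the Step and Epoch columns.
steps_per_epoch = round(first.step / first.epoch)

if __name__ == "__main__":
    print(f"best step: {best.step}")
    print(f"relative WER reduction: {rel_wer:.2f}%")
    print(f"relative CER reduction: {rel_cer:.2f}%")
    print(f"steps per epoch: {steps_per_epoch}")
```

Both metrics improve at every evaluation, so the final checkpoint at step 8000 is also the best one in this run.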


	### Framework versions

	- Transformers 4.48.0.dev0
	- PyTorch 2.5.1+cu121
	- Datasets 3.6.0
	- Tokenizers 0.21.0

	## Citation

	Please cite the model using the following BibTeX entry:

	```bibtex
	@misc{deepdml/whisper-tiny-ta-mix-norm,
	title={Fine-tuned Whisper tiny ASR model for speech recognition in Tamil},
	author={Jimenez, David},
	howpublished={\url{https://huggingface.co/deepdml/whisper-tiny-ta-mix-norm}},
	year={2026}
	}
	```