Whisper Marathi Small – Fine-tuned ASR Model

This model is a fine-tuned version of Whisper Small for Marathi automatic speech recognition (ASR).
It recognizes Marathi speech more accurately than the base whisper-small model.
It is optimized for conversations, YouTube speech, interviews, calls, and general-purpose Marathi audio.


Model Details

Model Description

whisper-Marathi-small-finetuned is trained on curated Marathi audio datasets to improve transcription quality while retaining the efficiency of Whisper Small.

  • Developed by: Varun, Sumedh
  • Model type: Encoder–Decoder Transformer (Speech-to-Text)
  • Language: Marathi
  • License: MIT (same as Whisper)
  • Base Model: openai/whisper-small
  • Framework: transformers

Model Sources


Uses

Direct Use

This model can be used for:

  • General Marathi ASR
  • Subtitling Marathi videos and media
  • Transcribing conversations, calls, interviews
  • Speech recognition for chatbots / voice assistants
  • Marathi podcast or lecture transcription

Downstream Use

  • Fine-tuning on domain-specific audio (medical, education, customer support)
  • Building ASR-based AI tools in Marathi
  • Large-scale subtitle and caption generation

Out-of-Scope Use

  • Non-Marathi speech
  • Heavy background noise
  • Multi-speaker overlapping conversations
  • Legal/medical transcription without human verification

Bias, Risks, and Limitations

  • Whisper can hallucinate text on very noisy audio
  • Accuracy drops on strong accents or dialects not seen in training
  • Not suitable for very long audio in a single pass; use chunking
  • Not a translation model (for translation to English, use a base multilingual Whisper checkpoint with the translate task)
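
One way to quantify how much accuracy drops on a particular accent, dialect, or noise condition is to compute the word error rate (WER) of the model output against a human reference transcript. The following is a minimal sketch in plain Python; the function name `word_error_rate` is illustrative and is not part of this model or of any library, and it assumes words are separated by whitespace.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Identical transcripts give a WER of 0.0.
print(word_error_rate("मी घरी जातो", "मी घरी जातो"))
```

A lower value is better. This sketch applies no text normalization (punctuation, numerals), so scores may differ from those reported by evaluation toolkits.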

Recommendations

  • Prefer 16 kHz WAV audio
  • Use chunking for audio longer than 30 seconds
  • Avoid overlapping speakers
  • Always verify the output in critical applications
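
The first two recommendations can be checked before transcription. The sketch below uses only Python's standard `wave` module to report a WAV file's sampling rate, channel count, and duration. The helper name `inspect_wav` and the two constants are assumptions made for illustration, and the function only handles uncompressed PCM WAV files.

```python
import wave

TARGET_RATE_HZ = 16_000  # sampling rate recommended above
CHUNK_LIMIT_S = 30.0     # audio longer than this should be chunked


def inspect_wav(path):
    """Report basic properties of a PCM WAV file before transcription."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        duration_s = wav.getnframes() / rate
    return {
        "sample_rate_hz": rate,
        "channels": channels,
        "duration_s": duration_s,
        "needs_resampling": rate != TARGET_RATE_HZ,
        "needs_chunking": duration_s > CHUNK_LIMIT_S,
    }
```

If `needs_resampling` is true, the file can be converted to 16 kHz with an audio tool of your choice before it is passed to the model.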

⭐ How to Use

The code below is the recommended way to run this model.

🔥 Recommended Inference Code (supports long audio)

from transformers import pipeline

model_name = "Prasad12344321/whisper-Marathi-small-finetuned"

# Build the ASR pipeline; it loads the model and its processor from the Hub.
pipe = pipeline(
    "automatic-speech-recognition",
    model=model_name,
    chunk_length_s=30,       # split long audio into 30-second chunks
    stride_length_s=(4, 2),  # (left, right) overlap in seconds between chunks
)

# Replace with the path to your own audio file.
result = pipe("/content/test100.mp3")
print(result["text"])
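
To make the effect of `chunk_length_s` and `stride_length_s` more concrete, here is a simplified sketch of how overlapping windows might be laid out over a long recording. It is an illustration only: the helper `chunk_windows` is hypothetical, and the pipeline's actual implementation may differ in its details.

```python
def chunk_windows(total_s, chunk_s=30.0, stride_s=(4.0, 2.0)):
    """Return (start, end) times in seconds for overlapping chunks."""
    left, right = stride_s
    # Each new chunk advances by the chunk length minus both overlaps.
    step = chunk_s - left - right
    if step <= 0:
        raise ValueError("chunk length must exceed the combined stride")

    windows = []
    start = 0.0
    while True:
        end = min(start + chunk_s, total_s)
        windows.append((start, end))
        if end >= total_s:
            break
        start += step
    return windows


for start, end in chunk_windows(70.0):
    print(f"{start:.0f} s to {end:.0f} s")
```

For a 70-second file with the settings above, this produces three windows: 0 to 30 s, 24 to 54 s, and 48 to 70 s. Consecutive windows overlap by 6 seconds, which gives the model context on both sides of each boundary.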