Whisper Large V3 Turbo - Nepali to English Translation

Fine-tuned Whisper Large V3 Turbo model for Nepali audio → English text translation using LoRA adapters.

Model Details

  • Base Model: openai/whisper-large-v3-turbo
  • Task: Translation (Nepali → English)
  • Source Language: Nepali (audio)
  • Target Language: English (text)
  • Training Data: 776 samples

Usage

from transformers import WhisperForConditionalGeneration, WhisperProcessor, pipeline
from peft import PeftModel

# Load base model
base_model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-large-v3-turbo")

# Load LoRA adapters
model = PeftModel.from_pretrained(base_model, "Anryul/whisper-nepali-lora")
processor = WhisperProcessor.from_pretrained("Anryul/whisper-nepali-lora")

# Create translation pipeline
pipe = pipeline(
    "automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
)

# Translate Nepali audio to English
result = pipe("nepali_audio.wav", generate_kwargs={"task": "translate", "language": "nepali"})
print(result["text"])  # English translation
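To translate several recordings with the same settings, the single-file call above can be wrapped in a small loop. The following is a minimal sketch, not part of the released model: `translate_files` is a hypothetical helper name, and it assumes `pipe` is the pipeline object built in the snippet above and that it returns a dict with a `"text"` key, as in the single-file example.

```python
from typing import Callable, Dict, Iterable

# Same generation options as the single-file call in the usage snippet.
GENERATE_KWARGS = {"task": "translate", "language": "nepali"}


def translate_files(pipe: Callable, paths: Iterable[str]) -> Dict[str, str]:
    """Hypothetical helper: run `pipe` on each audio path and collect the English text.

    Returns a dict mapping each input path to its stripped translation.
    """
    translations: Dict[str, str] = {}
    for path in paths:
        result = pipe(str(path), generate_kwargs=GENERATE_KWARGS)
        translations[str(path)] = result["text"].strip()
    return translations
```

Calling `translate_files(pipe, ["clip1.wav", "clip2.wav"])` (file names are placeholders) would return one translation per path. Keying the result by full path rather than file name avoids collisions when two directories contain files with the same name.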

Training

  • Epochs: 10
  • Task: Translation
  • Batch Size: 8
  • Learning Rate: 1e-3
  • LoRA Rank: 32
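
As a rough sanity check on these settings, the arithmetic below estimates the number of optimizer steps and the size of one LoRA update. It is a sketch under stated assumptions: that all 776 samples are used for training with no gradient accumulation, and that LoRA is applied to a square 1280 × 1280 projection (1280 is the hidden size of the Whisper large family; the card does not list the target modules).

```python
import math

# Values taken from the model card.
NUM_SAMPLES = 776
BATCH_SIZE = 8
EPOCHS = 10
LORA_RANK = 32

# Assumption: every sample is used for training and there is no gradient accumulation.
steps_per_epoch = math.ceil(NUM_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS

# Assumption: one adapted weight matrix of shape D_MODEL x D_MODEL.
D_MODEL = 1280
# LoRA adds two low-rank factors: A (rank x d_in) and B (d_out x rank).
lora_params = LORA_RANK * (D_MODEL + D_MODEL)
full_params = D_MODEL * D_MODEL

print(f"steps per epoch: {steps_per_epoch}, total steps: {total_steps}")
print(f"LoRA params per matrix: {lora_params} of {full_params} "
      f"({100 * lora_params / full_params:.1f}%)")
```

Under these assumptions the run is 97 steps per epoch and 970 steps in total, and each adapted matrix trains 81,920 parameters instead of 1,638,400.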

Citation

If you use this model, please cite the original Whisper paper and the dataset used for fine-tuning.
