metadata
language:
  - ne
  - en
license: apache-2.0
tags:
  - whisper
  - nepali
  - english
  - translation
  - speech-to-text
datasets:
  - prashantrajbista/nepali-english-parallel-audio-text-dataset

Whisper Large V3 Turbo - Nepali to English Translation

A LoRA fine-tune of Whisper Large V3 Turbo that translates Nepali speech (audio) directly into English text.

Model Details

  • Base Model: openai/whisper-large-v3-turbo
  • Task: Translation (Nepali → English)
  • Source Language: Nepali (audio)
  • Target Language: English (text)
  • Training Data: 776 samples

Usage

from transformers import WhisperForConditionalGeneration, WhisperProcessor, pipeline
from peft import PeftModel

# Load base model
base_model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-large-v3-turbo")

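# Load LoRA adapters on top of the base model
model = PeftModel.from_pretrained(base_model, "Anryul/whisper-nepali-lora")
# Optional: fold the LoRA weights into the base weights so the pipeline
# receives a plain Whisper model instead of a PeftModel wrapper
model = model.merge_and_unload()
processor = WhisperProcessor.from_pretrained("Anryul/whisper-nepali-lora")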

# Create translation pipeline
pipe = pipeline(
    "automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
)

# Translate Nepali audio to English
result = pipe("nepali_audio.wav", generate_kwargs={"task": "translate", "language": "nepali"})
print(result["text"])  # English translation
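
To translate several recordings in one go, the single-file call above can be wrapped in a small helper. This is only a sketch: `translate_files` and `GENERATE_KWARGS` are hypothetical names, not part of this repository, and the helper assumes `pipe` behaves like the pipeline built above (it accepts a file path plus `generate_kwargs` and returns a dict with a `"text"` key).

```python
from typing import Callable, Dict, Iterable

# Same generation settings as the single-file example above.
GENERATE_KWARGS = {"task": "translate", "language": "nepali"}


def translate_files(pipe: Callable, paths: Iterable[str]) -> Dict[str, str]:
    """Run the pipeline on each audio path and map path -> English text.

    `pipe` is assumed to be the ASR pipeline created in the usage example.
    """
    translations = {}
    for path in paths:
        result = pipe(path, generate_kwargs=GENERATE_KWARGS)
        # Whisper output often starts with a space; strip it for clean text.
        translations[path] = result["text"].strip()
    return translations
```

With the pipeline from the usage example, `translate_files(pipe, ["clip1.wav", "clip2.wav"])` would return a dictionary keyed by file name (the file names here are placeholders).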

Training

  • Epochs: 10
  • Task: Translation
  • Batch Size: 8
  • Learning Rate: 1e-3
  • LoRA Rank: 32
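
For a rough sense of scale, the hyperparameters above can be turned into two back-of-the-envelope numbers. The sketch below rests on assumptions this card does not state: that all 776 samples are used for training, that there is no gradient accumulation or multi-GPU sharding, and that LoRA is applied to square 1280×1280 projection matrices (1280 is the hidden size of the Whisper large models). Which modules were actually adapted is not documented here, so the per-matrix count is illustrative only. The function names are hypothetical.

```python
import math


def optimizer_steps(num_samples: int, batch_size: int, epochs: int,
                    grad_accum: int = 1) -> int:
    """Total optimizer steps, assuming the last partial batch is kept."""
    steps_per_epoch = math.ceil(num_samples / (batch_size * grad_accum))
    return steps_per_epoch * epochs


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix.

    LoRA learns two low-rank factors, A (rank x d_in) and B (d_out x rank).
    """
    return rank * (d_in + d_out)


# 776 samples / batch size 8 = 97 steps per epoch, times 10 epochs.
total_steps = optimizer_steps(num_samples=776, batch_size=8, epochs=10)

# Assumed 1280 x 1280 attention projection with the listed rank of 32.
per_matrix = lora_params(d_in=1280, d_out=1280, rank=32)

print(total_steps)  # 970
print(per_matrix)   # 81920
```

Under these assumptions the run amounts to fewer than a thousand optimizer steps, which is small for a learning rate of 1e-3; treat the numbers as orientation rather than a record of the actual training run.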

Citation

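If you use this model, please cite the original Whisper paper (Radford et al., "Robust Speech Recognition via Large-Scale Weak Supervision") and the training dataset, prashantrajbista/nepali-english-parallel-audio-text-dataset.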