Nepali–English Bidirectional Speech Translation

EN ⇄ NE | ASR & Speech Translation

This repository contains a bidirectional English–Nepali speech recognition and translation system. It transcribes speech to text and translates the result in both directions (EN ⇄ NE).


Model Description

This project combines automatic speech recognition (ASR) and machine translation (MT) to support speech translation for a low-resource language pair.

  • ASR is performed using Wav2Vec2, converting raw audio into text.
  • Translation is handled by a Transformer-based encoder–decoder model.
  • The system supports:
    • English → Nepali speech recognition & translation
    • Nepali → English speech recognition & translation

The model was trained using the Hugging Face Transformers library with TensorFlow.
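
The two-stage design described above (ASR first, then translation) can be sketched as below. This is a minimal illustration, not the project's actual inference code: `translate_speech` and `load_cascade` are hypothetical helper names, and the model ids are placeholders supplied by the caller, since this card does not name the checkpoints. The sketch assumes the stages behave like Transformers pipelines, where the ASR stage returns a dict with a `"text"` key and the translation stage returns a list of dicts with a `"translation_text"` key.

```python
from typing import Any, Callable


def translate_speech(audio: Any,
                     asr: Callable[[Any], dict],
                     mt: Callable[[str], list]) -> dict:
    """Run the ASR -> MT cascade on a single utterance."""
    # Stage 1: speech recognition turns audio into source-language text.
    transcript = asr(audio)["text"].strip()
    # Stage 2: machine translation turns that text into the target language.
    translation = mt(transcript)[0]["translation_text"]
    return {"transcript": transcript, "translation": translation}


def load_cascade(asr_model_id: str, mt_model_id: str):
    """Build both stages with Hugging Face pipelines.

    Both ids are placeholders (assumption): substitute the checkpoints
    for the direction you need (EN -> NE or NE -> EN).
    """
    from transformers import pipeline  # imported lazily: heavy dependency

    asr = pipeline("automatic-speech-recognition", model=asr_model_id)
    mt = pipeline("translation", model=mt_model_id)
    return asr, mt


if __name__ == "__main__":
    # Stand-in stages so the sketch runs without downloading any model.
    def fake_asr(audio):
        return {"text": " hello world "}

    def fake_mt(text):
        return [{"translation_text": text.upper()}]

    print(translate_speech(b"", fake_asr, fake_mt))
```

Passing the stages in as callables keeps the cascade logic independent of any particular checkpoint, so either direction can reuse the same function with a different pair of models.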


Intended Uses & Limitations

Intended Uses

  • Speech-to-text applications for English and Nepali
  • Speech translation systems
  • Research on low-resource language processing
  • Educational and academic projects

Limitations

  • Performance may degrade on noisy or accented speech
  • Limited by the size and quality of available Nepali datasets
  • Not optimized for real-time deployment on low-end devices

Training and Evaluation Data

The model was trained on curated English–Nepali speech and parallel text datasets. Due to dataset licensing constraints, exact dataset details are not publicly listed.


Training Procedure

The model was trained from scratch using a Transformer-based architecture.

Training Hyperparameters

  • Optimizer: AdamWeightDecay
  • Learning Rate: 2e-05
  • Weight Decay: 0.01
  • Precision: float32
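
As a rough sketch, the hyperparameters listed above map onto the `AdamWeightDecay` optimizer that ships with the TensorFlow side of Transformers. Only the learning rate and weight decay come from this card; `HPARAMS` and `build_optimizer` are hypothetical names, and every other setting (betas, epsilon, any learning-rate schedule) is left at the library default as an assumption.

```python
# Values taken from the hyperparameter list above.
HPARAMS = {
    "learning_rate": 2e-5,
    "weight_decay_rate": 0.01,
}


def build_optimizer(hparams: dict = HPARAMS):
    """Create the optimizer; requires TensorFlow and Transformers."""
    # Imported lazily so the configuration can be inspected without TensorFlow.
    from transformers import AdamWeightDecay

    # float32 is the Keras default precision, so no mixed-precision
    # policy is configured here.
    return AdamWeightDecay(
        learning_rate=hparams["learning_rate"],
        weight_decay_rate=hparams["weight_decay_rate"],
    )
```

The returned optimizer can be passed to `model.compile(optimizer=...)` on a Keras model.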

Training Results

| Train Loss | Validation Loss | Epoch |
|------------|-----------------|-------|
| 1.2241     | 1.2162          | 5     |
| 1.1792     | 1.1920          | 6     |
| 1.1424     | 1.1731          | 7     |
| 1.1101     | 1.1592          | 8     |
| 1.0812     | 1.1455          | 9     |
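
To read the trend in the results above, the short sketch below computes, for each epoch, the gap between validation and training loss and the change in validation loss from the previous epoch. The numbers are copied from the table; `summarize` is a hypothetical helper written for illustration only.

```python
# (epoch, train_loss, validation_loss) rows copied from the results above.
RESULTS = [
    (5, 1.2241, 1.2162),
    (6, 1.1792, 1.1920),
    (7, 1.1424, 1.1731),
    (8, 1.1101, 1.1592),
    (9, 1.0812, 1.1455),
]


def summarize(results):
    """Return per-epoch generalization gap and validation-loss change."""
    rows = []
    prev_val = None
    for epoch, train, val in results:
        gap = round(val - train, 4)  # validation minus training loss
        # Change in validation loss since the previous epoch (None for the first row).
        delta = None if prev_val is None else round(val - prev_val, 4)
        rows.append({"epoch": epoch, "gap": gap, "val_delta": delta})
        prev_val = val
    return rows


for row in summarize(RESULTS):
    print(row)
```

On these numbers, validation loss falls at every listed epoch, while the gap to training loss widens from about -0.008 at epoch 5 to about 0.064 at epoch 9.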

Framework Versions

  • Transformers: 4.48.3
  • TensorFlow: 2.18.0
  • Datasets: 3.3.2
  • Tokenizers: 0.21.0
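
When reproducing the environment, a small check like the one below can flag packages whose installed version differs from the list above. It is a sketch using only the standard library; `version_mismatches` is a hypothetical helper, and it assumes each package is installed under the distribution name shown (TensorFlow in particular is sometimes installed under a variant name).

```python
from importlib import metadata

# Versions taken from the list above.
EXPECTED = {
    "transformers": "4.48.3",
    "tensorflow": "2.18.0",
    "datasets": "3.3.2",
    "tokenizers": "0.21.0",
}


def version_mismatches(expected=EXPECTED, get_version=metadata.version):
    """Map each mismatching package to (expected, installed or None)."""
    problems = {}
    for package, wanted in expected.items():
        try:
            found = get_version(package)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        if found != wanted:
            problems[package] = (wanted, found)
    return problems


if __name__ == "__main__":
    for package, (wanted, found) in version_mismatches().items():
        print(f"{package}: expected {wanted}, found {found}")
```

An empty result means the installed versions match the ones used for training.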

Author

Developed as part of a speech AI project focused on English–Nepali bidirectional translation.
