metadata
title: Multilingual Transliteration
emoji: 🌐
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.8.0
app_file: app.py
pinned: false
Multilingual Transliteration Model
This project implements multilingual transliteration (English -> Hindi, Bengali, Tamil) with a fine-tuned mT5 model. The trained model is optimized with CTranslate2 for fast inference and served through a Gradio-based web interface.
Project Structure
- src/: Source code for training, optimization, and deployment.
- data/: Directory for storing datasets (train/test/val).
- models/: Directory for saving trained and optimized models.
- requirements.txt: Python dependencies.
Setup
Clone the repository:
git clone <repo_url>
cd <repo_name>
Create a virtual environment (optional but recommended):
python -m venv venv
.\venv\Scripts\activate # Windows
# source venv/bin/activate # Linux/Mac
Install dependencies:
pip install -r requirements.txt
Usage
1. Data Preparation
Generate dummy data for training:
python src/prepare_data.py
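The README does not show what src/prepare_data.py produces, so the following is only a minimal sketch of how dummy train/val/test files could be generated. The seed word pairs, the record fields (`lang`, `source`, `target`), the JSON Lines format, and the 80/10/10 split are all assumptions made for illustration, not the project's actual format.

```python
import json
import random
from pathlib import Path

# Hypothetical seed pairs: romanized input and native-script target per language.
SEED_PAIRS = {
    "hi": [("namaste", "नमस्ते"), ("bharat", "भारत")],
    "bn": [("bharat", "ভারত"), ("dhonnobad", "ধন্যবাদ")],
    "ta": [("tamil", "தமிழ்"), ("nandri", "நன்றி")],
}


def build_examples(seed_pairs, repeats=10):
    """Expand the seed pairs into a flat list of records tagged with a language code."""
    examples = []
    for lang, pairs in seed_pairs.items():
        for source, target in pairs:
            for _ in range(repeats):
                examples.append({"lang": lang, "source": source, "target": target})
    return examples


def split_examples(examples, seed=0, train_frac=0.8, val_frac=0.1):
    """Shuffle deterministically, then slice into train/val/test."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    return {
        "train": shuffled[:n_train],
        "val": shuffled[n_train:n_train + n_val],
        "test": shuffled[n_train + n_val:],
    }


def write_splits(splits, out_dir="data"):
    """Write each split as UTF-8 JSON Lines, one record per line."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for name, rows in splits.items():
        with open(out / f"{name}.jsonl", "w", encoding="utf-8") as f:
            for row in rows:
                # ensure_ascii=False keeps Devanagari, Bengali, and Tamil readable on disk.
                f.write(json.dumps(row, ensure_ascii=False) + "\n")
```

Under these assumptions, `write_splits(split_examples(build_examples(SEED_PAIRS)))` would create data/train.jsonl, data/val.jsonl, and data/test.jsonl.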
2. Training
Train the mT5 model:
python src/train.py
3. Optimization
Optimize the trained model using CTranslate2 and benchmark:
python src/optimize.py
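How src/optimize.py benchmarks the model is not described here. As a rough sketch, a latency comparison between the original and the optimized model could use a small timing helper like the one below. The `benchmark` function and its parameters are hypothetical; it accepts any callable, so the same helper could wrap either a PyTorch or a CTranslate2 inference function.

```python
import statistics
import time


def benchmark(fn, inputs, warmup=2, runs=5):
    """Time fn over all inputs; report mean/stdev seconds per pass and throughput."""
    # Warm-up passes are discarded so one-time setup costs do not skew the timings.
    for _ in range(warmup):
        for x in inputs:
            fn(x)

    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        for x in inputs:
            fn(x)
        timings.append(time.perf_counter() - start)

    mean = statistics.mean(timings)
    return {
        "mean_s": mean,
        "stdev_s": statistics.stdev(timings) if runs > 1 else 0.0,
        "items_per_s": len(inputs) / mean if mean > 0 else float("inf"),
    }
```

Running the helper once per backend on the same input list gives directly comparable `items_per_s` figures.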
4. Run Demo
Launch the Gradio app:
python src/app.py
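The contents of src/app.py are not shown, so the following is only a sketch of what a Gradio front end for this task might look like. The `LANGUAGES` mapping, `make_handler`, `build_demo`, and the backend signature `(text, lang_code) -> str` are assumptions for illustration. Gradio is imported inside `build_demo` so the input-handling logic can be tried without it installed.

```python
# Display names mapped to hypothetical language codes passed to the backend.
LANGUAGES = {"Hindi": "hi", "Bengali": "bn", "Tamil": "ta"}


def make_handler(transliterate_fn):
    """Wrap a backend callable (text, lang_code) -> str with basic input cleanup."""
    def handler(text, language):
        text = (text or "").strip()
        if not text:
            return ""  # Nothing to transliterate; avoid calling the model.
        return transliterate_fn(text, LANGUAGES[language])
    return handler


def build_demo(transliterate_fn):
    """Build a Gradio Interface with a text box and a target-language dropdown."""
    import gradio as gr  # Imported lazily; requires the gradio package.

    return gr.Interface(
        fn=make_handler(transliterate_fn),
        inputs=[
            gr.Textbox(label="English text"),
            gr.Dropdown(choices=list(LANGUAGES), value="Hindi", label="Target language"),
        ],
        outputs=gr.Textbox(label="Transliteration"),
        title="Multilingual Transliteration",
    )
```

With a real backend function in place, `build_demo(backend).launch()` would start the local web server.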
Approach
- Model: google/mt5-small is used as the base model due to its multilingual capabilities and efficiency.
- Optimization: CTranslate2 is used to quantize and optimize the model for faster CPU/GPU inference.
- Deployment: Gradio provides a simple and interactive UI for the model.
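To make the optimization step concrete, here is a hedged sketch of converting a fine-tuned checkpoint to CTranslate2 with quantization and running inference on it. It follows the token-level pattern CTranslate2 documents for T5-family models. The function names, the int8 quantization choice, and the model paths are assumptions and may differ from what src/optimize.py does. The library imports are kept inside the functions that need them.

```python
def convert(model_path, output_dir, quantization="int8"):
    """Convert a Hugging Face checkpoint to CTranslate2 format (int8 is an assumed choice)."""
    from ctranslate2.converters import TransformersConverter

    TransformersConverter(model_path).convert(output_dir, quantization=quantization)


def load(model_dir, tokenizer_name="google/mt5-small", device="cpu"):
    """Load the converted model and the matching Hugging Face tokenizer."""
    import ctranslate2
    from transformers import AutoTokenizer

    translator = ctranslate2.Translator(model_dir, device=device)
    tokenizer = AutoTokenizer.from_pretrained(tokenizer_name)
    return translator, tokenizer


def transliterate(text, translator, tokenizer, beam_size=4):
    """Run one string through the translator and decode the best hypothesis."""
    # CTranslate2 works on token strings, so encode to ids and map back to tokens.
    tokens = tokenizer.convert_ids_to_tokens(tokenizer.encode(text))
    results = translator.translate_batch([tokens], beam_size=beam_size)
    output_tokens = results[0].hypotheses[0]
    return tokenizer.decode(
        tokenizer.convert_tokens_to_ids(output_tokens), skip_special_tokens=True
    )
```

Passing the translator and tokenizer in as arguments keeps `transliterate` independent of how the model is loaded, which also makes it easy to time or to plug into a web handler.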