---
license: apache-2.0
datasets:
- ai4bharat/BPCC
language:
- te
- en
metrics:
- bleu
- chrf
library_name: transformers
base_model:
- google/mt5-small
tags:
- translation
- text2text-generation
- indic-nlp
- telugu
- mt5
- hybrid-training
- full-finetune
model-index:
- name: mT5-English-to-Telugu-Translator
  results:
  - task:
      type: translation
      name: Translation English to Telugu
    metrics:
    - type: bleu
      value: 55.34
      name: SacreBLEU
    - type: chrf
      value: 75.87
      name: ChrF++
---

# 🌟 mT5 English-to-Telugu Hybrid Translator

This model delivers high-quality English-to-Telugu translation at a small parameter count. It was developed with a **two-phase training strategy** that combines the depth of full fine-tuning with the precision of LoRA (Low-Rank Adaptation).

## 🚀 The "Two-Phase" Advantage
Unlike standard fine-tuned models, this version underwent a rigorous 30-epoch journey:

1. **Phase I: Deep Language Grounding (Full Fine-Tuning, 15 Epochs)** The entire mT5-small architecture was unlocked to re-align its internal "mental map" from general multilingual space to a specialized English-Telugu domain. This allowed for deep syntactic and morphological adaptation.

2. **Phase II: Precision Refinement (LoRA, 15 Epochs)** After the base weights were grounded, LoRA ($r=16$) was applied to the specialized checkpoint. This phase acted as a regularizer, sharpening the translation logic and reducing the hallucinations common in smaller models.

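The two phases above can be sketched in plain PyTorch. This is an illustrative toy, not the actual training code: the real run operated on the full mt5-small weights, and the hidden size `d`, the scaling `alpha`, and the initialization scheme below are assumptions. It shows why Phase II trains far fewer parameters than Phase I, and how the adapter folds back into the base weights at the end:

```python
import torch

d, r = 64, 16          # toy hidden size; r=16 matches the Phase II rank
alpha = 32             # LoRA scaling factor (hypothetical; not stated in this card)

# Phase I stand-in: a fully fine-tuned weight matrix (all d*d entries trained).
W = torch.randn(d, d)

# Phase II: freeze W and train only the low-rank factors A and B.
A = torch.randn(r, d) * 0.01          # small random init
B = torch.zeros(d, r)                 # zero init, so the update starts at 0
delta = (alpha / r) * (B @ A)         # low-rank weight update

x = torch.randn(1, d)
y = x @ (W + delta).T                 # adapted forward pass

# "Merged and unloaded": fold the adapter into the base weights, so inference
# needs only a standard checkpoint with the original shape.
W_merged = W + delta

full_ft_params = W.numel()            # 4096 parameters trained in Phase I
lora_params = A.numel() + B.numel()   # 2048 parameters trained in Phase II
print(full_ft_params, lora_params)
```

The low-rank factors cost `2*r*d` parameters per adapted matrix instead of `d*d`, which is why the LoRA phase can refine the checkpoint cheaply without disturbing the grounding from Phase I.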
## 📖 Key Model Details
- **Fine-tuned by:** Adapala Mani Kumar
- **Model type:** Encoder-decoder (Transformer)
- **Architecture:** T5ForConditionalGeneration
- **Language(s):** English to Telugu
- **Fine-tuning technique:** Full fine-tuning, then PEFT/LoRA
- **Max sequence length:** 128 tokens

## 📈 Performance (Evaluation Results)
The model was evaluated on a held-out test set and achieved the following scores:

| Metric | Score |
| :--- | :--- |
| **SacreBLEU** | 55.34 |
| **ChrF++** | 75.87 |
| **Validation Loss** | 0.3373 |

These scores indicate high translation quality, outperforming many baseline multilingual models on the English-Telugu pair.

## 🛠 Usage
Since the Phase II LoRA adapters have been **merged and unloaded** into the base weights, this model functions as a standalone mT5 model; no `peft` dependency is needed at inference time.

```python
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

model_path = "ManiKumarAdapala/mt5-telugu"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = T5Tokenizer.from_pretrained(model_path)
model = T5ForConditionalGeneration.from_pretrained(model_path).to(device)

# Switch to evaluation mode (disables dropout)
model.eval()

def translate_to_telugu(text):
    input_text = "translate English to Telugu: " + text

    # Tokenize input, truncating to the model's 128-token limit
    inputs = tokenizer(
        input_text, return_tensors="pt", truncation=True, max_length=128
    ).to(device)

    # Generate
    with torch.no_grad():
        output_tokens = model.generate(
            **inputs,
            max_length=128,
            num_beams=5,  # beam search for better quality
            early_stopping=True,
            repetition_penalty=1.2,
        )

    # Decode
    return tokenizer.decode(output_tokens[0], skip_special_tokens=True)

english_sentence = "Pain from appendicitis may begin as dull pain around the navel."
print(f"English: {english_sentence}")
print(f"Telugu: {translate_to_telugu(english_sentence)}")

# Result:
# English: Pain from appendicitis may begin as dull pain around the navel.
# Telugu: అపెండిసైటిస్ వలన వచ్చే నొప్పి నాభి చుట్టూ సన్నటి నొప్పిగా ప్రారంభమవుతుంది.
```

Alternatively, the model can be used with the Transformers `pipeline` API:

```python
import torch
from transformers import pipeline, T5ForConditionalGeneration, T5Tokenizer

model_path = "ManiKumarAdapala/mt5-telugu"
device = 0 if torch.cuda.is_available() else -1  # GPU index, or CPU fallback

tokenizer = T5Tokenizer.from_pretrained(model_path)
model = T5ForConditionalGeneration.from_pretrained(model_path)

telugu_translator = pipeline(
    "text2text-generation",
    model=model,
    tokenizer=tokenizer,
    device=device
)

def translate(text):
    prefix = "translate English to Telugu: "
    output = telugu_translator(
        f"{prefix}{text}",
        max_length=128,
        num_beams=5,
        early_stopping=True,
        clean_up_tokenization_spaces=True
    )
    return output[0]["generated_text"]

print(translate("It is invariant and is always included in all ragams."))

# Result: ఇది నిరంతరం ఉంటుంది మరియు ఎల్లప్పుడూ అన్ని రాగాలలో చేర్చబడుతుంది.
```

### 📝 Limitations
- **Prefix required:** always prepend `translate English to Telugu: ` to the input for optimal results.
- **Context:** best suited for single sentences or short paragraphs.

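Given the single-sentence limitation, one workaround for longer passages is to split the input and translate sentence by sentence. A minimal standard-library splitter is sketched below; the `split_sentences` helper is hypothetical, not part of the model, and the naive regex is only adequate for simple English prose:

```python
import re

def split_sentences(text):
    # Naive split on sentence-ending punctuation followed by whitespace;
    # not a full sentence tokenizer (abbreviations like "Dr." will mis-split).
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

passage = ("Pain from appendicitis may begin as dull pain around the navel. "
           "It often sharpens over time.")
for sentence in split_sentences(passage):
    print(sentence)  # each sentence would then be passed to the translator
```

Each resulting sentence stays well under the 128-token limit, so translating them individually and rejoining the outputs avoids truncation.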
### 🤝 Acknowledgments
This project is built upon the mT5 (Multilingual T5) architecture developed by Google. Their foundational research into massively multilingual models laid the groundwork that made this specialized Telugu-language tool possible.