---
license: apache-2.0
datasets:
- ai4bharat/BPCC
language:
- te
- en
metrics:
- bleu
- chrf
library_name: transformers
base_model:
- google/mt5-small
tags:
- translation
- text2text-generation
- indic-nlp
- telugu
- mt5
- hybrid-training
- full-finetune
model-index:
- name: mT5-English-to-Telugu-Translator
results:
- task:
type: translation
name: Translation English to Telugu
metrics:
- type: bleu
value: 55.34
name: SacreBLEU
- type: chrf
value: 75.87
name: ChrF++
---
# 🌟 mT5 English-to-Telugu Hybrid Translator
This model is a high-performance, small-parameter translator for the English-Telugu pair. It was developed using a **unique two-phase training strategy** that combines the depth of full fine-tuning with the precision of LoRA (Low-Rank Adaptation).
## 🚀 The "Two-Phase" Advantage
Unlike standard fine-tuned models, this version underwent a rigorous 30-epoch journey:
1. **Phase I: Deep Language Grounding (Full Fine-Tuning, 15 Epochs)** The entire mT5-small architecture was unlocked to re-align its internal "mental map" from general multilingual space to a specialized English-Telugu domain. This allowed for deep syntactic and morphological adaptation.
2. **Phase II: Precision Refinement (LoRA, 15 Epochs)** After the base weights were grounded, LoRA ($r=16$) was applied to the specialized checkpoint. This phase acted as a regularizer, sharpening the translation logic and reducing the hallucinations common in smaller models.
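For readers who want to reproduce a setup like Phase II, the sketch below shows how such a LoRA stage could be configured with the `peft` library and then merged into a standalone checkpoint. Only $r=16$ comes from the description above; the checkpoint path, `lora_alpha`, `lora_dropout`, and `target_modules` are illustrative assumptions, not the actual training configuration.

```python
# Hypothetical Phase II configuration sketch -- hyperparameters other than
# r=16 are assumptions, not values taken from the actual training run.
from peft import LoraConfig, get_peft_model, TaskType
from transformers import T5ForConditionalGeneration

base = T5ForConditionalGeneration.from_pretrained("path/to/phase1-checkpoint")
lora_cfg = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=16,                      # rank stated in the description above
    lora_alpha=32,             # assumed value
    lora_dropout=0.05,         # assumed value
    target_modules=["q", "v"], # typical T5 attention projections (assumed)
)
model = get_peft_model(base, lora_cfg)
# ... Phase II training loop runs here ...
merged = model.merge_and_unload()  # folds adapters into the base weights,
                                   # yielding the standalone model shipped here
```

Merging and unloading at the end is what lets the published checkpoint be loaded as a plain mT5 model, with no `peft` dependency at inference time.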
## 📖 Model Description
- **Finetuned by:** Adapala Mani Kumar
- **Model Type:** Encoder-Decoder (Transformer)
- **Architecture:** T5ForConditionalGeneration
- **Language(s):** English to Telugu
- **Fine-tuning Technique:** Full Finetuning, PEFT/LoRA
- **Max Sequence Length:** 128 tokens
## 📈 Performance (Evaluation Results)
The model was evaluated on a held-out test set and achieved the following scores:
| Metric | Score |
| :--- | :--- |
| **SacreBLEU** | 55.34 |
| **ChrF++** | 75.87 |
| **Validation Loss** | 0.3373 |
These scores indicate a very high level of translation quality, outperforming many baseline multilingual models for the English-Telugu pair.
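For intuition about the ChrF++ metric reported above: it is an F-score over character n-grams (plus word n-grams), which rewards partial word matches and suits morphologically rich languages like Telugu. The toy function below illustrates the character n-gram overlap idea only; it is not the official ChrF++ implementation (real ChrF++ averages character n-grams for n=1..6 plus word 1- and 2-grams — use `sacrebleu` for actual scoring).

```python
from collections import Counter

def char_ngrams(text, n):
    """Character n-grams of a string (whitespace kept, as in chrF)."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def char_ngram_f1(hypothesis, reference, n=3, beta=2.0):
    """Toy chrF-style F-score for a single n (illustration only)."""
    hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
    if not hyp or not ref:
        return 0.0
    overlap = sum((hyp & ref).values())   # clipped n-gram matches
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    if precision + recall == 0:
        return 0.0
    # beta=2 weights recall twice as heavily, as in chrF
    return (1 + beta**2) * precision * recall / (beta**2 * precision + recall)

print(char_ngram_f1("the cat sat", "the cat sat"))  # identical -> 1.0
print(char_ngram_f1("a dog ran", "the cat sat"))    # no shared trigrams -> 0.0
```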
## 🛠 Usage
Since the Phase II LoRA adapters have been **merged and unloaded**, this model functions as a standalone mT5 model.
```python
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

model_path = "ManiKumarAdapala/mt5-telugu"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = T5Tokenizer.from_pretrained(model_path)
model = T5ForConditionalGeneration.from_pretrained(model_path).to(device)

# Move to evaluation mode
model.eval()

def translate_to_telugu(text):
    input_text = "translate English to Telugu: " + text

    # Tokenize input
    inputs = tokenizer(input_text, return_tensors="pt").to(device)

    # Generate
    with torch.no_grad():
        output_tokens = model.generate(
            **inputs,
            max_length=128,
            num_beams=5,  # Beam search for better quality
            early_stopping=True,
            repetition_penalty=1.2
        )

    # Decode
    return tokenizer.decode(output_tokens[0], skip_special_tokens=True)

english_sentence = 'Pain from appendicitis may begin as dull pain around the navel.'
print(f"English: {english_sentence}")
print(f"Telugu: {translate_to_telugu(english_sentence)}")

# Result:
# English: Pain from appendicitis may begin as dull pain around the navel.
# Telugu: అపెండిసైటిస్ వలన వచ్చే నొప్పి నాభి చుట్టూ సన్నటి నొప్పిగా ప్రారంభమవుతుంది.
```
Alternatively, the model can be used with the `pipeline` API:
```python
import torch
from transformers import pipeline, T5ForConditionalGeneration, T5Tokenizer

model_path = "ManiKumarAdapala/mt5-telugu"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = T5Tokenizer.from_pretrained(model_path)
model = T5ForConditionalGeneration.from_pretrained(model_path).to(device)

# Move to evaluation mode
model.eval()

telugu_translator = pipeline(
    "text2text-generation",
    model=model,
    tokenizer=tokenizer
)

def translate(text):
    prefix = "translate English to Telugu: "
    output = telugu_translator(
        f"{prefix}{text}",
        max_length=128,
        num_beams=5,
        early_stopping=True,
        clean_up_tokenization_spaces=True
    )
    return output[0]['generated_text']

print(translate("It is invariant and is always included in all ragams."))
# Result: ఇది నిరంతరం ఉంటుంది మరియు ఎల్లప్పుడూ అన్ని రాగాలలో చేర్చబడుతుంది.
```
### 📝 Limitations
- **Prefix Required:** Always use the prefix `translate English to Telugu: ` for optimal results.
- **Context:** Best suited for single sentences or short paragraphs.
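Because the model was trained with a 128-token maximum sequence length, longer passages are best split into sentences and translated one at a time. The helper below is a minimal sketch of that approach; the naive regex splitter is an assumption (a proper sentence segmenter would be more robust), and `translate_fn` stands in for any single-sentence translator such as the `translate_to_telugu()` function shown earlier.

```python
import re

def translate_long_text(text, translate_fn, joiner=" "):
    """Split `text` into sentences on terminal punctuation, translate each
    piece independently with `translate_fn`, then rejoin the results.
    The regex splitter here is a deliberately simple illustration."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return joiner.join(translate_fn(s) for s in sentences)

# Example with an identity "translator" just to show the sentence splitting:
print(translate_long_text("First sentence. Second one! Third?", lambda s: s))
# -> First sentence. Second one! Third?
```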
### 🤝 Acknowledgments
This project is built upon the mT5 (Multilingual T5) architecture developed by Google. Their foundational research into massively multilingual models provided the raw material that made this specialized Telugu-language tool possible. |