---
license: apache-2.0
datasets:
- ai4bharat/BPCC
language:
- te
- en
metrics:
- bleu
- chrf
library_name: transformers
base_model:
- google/mt5-small
tags:
- translation
- text2text-generation
- indic-nlp
- telugu
- mt5
- hybrid-training
- full-finetune
model-index:
- name: mT5-English-to-Telugu-Translator
results:
- task:
type: translation
name: Translation English to Telugu
metrics:
- type: bleu
value: 55.34
name: SacreBLEU
- type: chrf
value: 75.87
name: ChrF++
---
# 🌟 mT5 English-to-Telugu Hybrid Translator
This model is a high-performance, small-parameter translator for the English-Telugu pair. It was developed using a **unique two-phase training strategy** that combines the depth of full fine-tuning with the precision of LoRA (Low-Rank Adaptation).
## 🚀 The "Two-Phase" Advantage
Unlike standard fine-tuned models, this version underwent a rigorous 30-epoch journey:
1. **Phase I: Deep Language Grounding (Full Fine-Tuning, 15 Epochs)** The entire mT5-small architecture was unlocked to re-align its internal "mental map" from general multilingual space to a specialized English-Telugu domain. This allowed for deep syntactic and morphological adaptation.
2. **Phase II: Precision Refinement (LoRA, 15 Epochs)** After the base weights were grounded, LoRA ($r=16$) was applied to the specialized checkpoint. This phase acted as a regularizer, sharpening the translation logic and reducing the hallucinations common in smaller models.
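For readers who want to reproduce a setup like Phase II, the sketch below shows how such a LoRA stage could be configured with the `peft` library and then merged into a standalone checkpoint. Only $r=16$ comes from the description above; the checkpoint path, `lora_alpha`, `lora_dropout`, and `target_modules` are illustrative assumptions, not the actual training configuration.

```python
# Hypothetical Phase II configuration sketch -- hyperparameters other than
# r=16 are assumptions, not values taken from the actual training run.
from peft import LoraConfig, get_peft_model, TaskType
from transformers import T5ForConditionalGeneration

base = T5ForConditionalGeneration.from_pretrained("path/to/phase1-checkpoint")
lora_cfg = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=16,                      # rank stated in the description above
    lora_alpha=32,             # assumed value
    lora_dropout=0.05,         # assumed value
    target_modules=["q", "v"], # typical T5 attention projections (assumed)
)
model = get_peft_model(base, lora_cfg)
# ... Phase II training loop runs here ...
merged = model.merge_and_unload()  # folds adapters into the base weights,
                                   # yielding the standalone model shipped here
```

Merging and unloading at the end is what lets the published checkpoint be loaded as a plain mT5 model, with no `peft` dependency at inference time.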
## 📖 Model Description
- **Finetuned by:** Adapala Mani Kumar
- **Model Type:** Encoder-Decoder (Transformer)
- **Architecture:** T5ForConditionalGeneration
- **Language(s):** English to Telugu
- **Fine-tuning Technique:** Full Finetuning, PEFT/LoRA
- **Max Sequence Length:** 128 tokens
## 📈 Performance (Evaluation Results)
The model was evaluated on a held-out test set and achieved the following scores:
| Metric | Score |
| :--- | :--- |
| **SacreBLEU** | 55.34 |
| **ChrF++** | 75.87 |
| **Validation Loss** | 0.3373 |
These scores indicate a very high level of translation quality, outperforming many baseline multilingual models for the English-Telugu pair.
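For intuition about the ChrF++ metric reported above: it is an F-score over character n-grams (plus word n-grams), which rewards partial word matches and suits morphologically rich languages like Telugu. The toy function below illustrates the character n-gram overlap idea only; it is not the official ChrF++ implementation (real ChrF++ averages character n-grams for n=1..6 plus word 1- and 2-grams — use `sacrebleu` for actual scoring).

```python
from collections import Counter

def char_ngrams(text, n):
    """Character n-grams of a string (whitespace kept, as in chrF)."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def char_ngram_f1(hypothesis, reference, n=3, beta=2.0):
    """Toy chrF-style F-score for a single n (illustration only)."""
    hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
    if not hyp or not ref:
        return 0.0
    overlap = sum((hyp & ref).values())   # clipped n-gram matches
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    if precision + recall == 0:
        return 0.0
    # beta=2 weights recall twice as heavily, as in chrF
    return (1 + beta**2) * precision * recall / (beta**2 * precision + recall)

print(char_ngram_f1("the cat sat", "the cat sat"))  # identical -> 1.0
print(char_ngram_f1("a dog ran", "the cat sat"))    # no shared trigrams -> 0.0
```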
## 🛠 Usage
Since the Phase II LoRA adapters have been **merged and unloaded**, this model functions as a standalone mT5 model.
```python
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

model_path = "ManiKumarAdapala/mt5-telugu"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = T5Tokenizer.from_pretrained(model_path)
model = T5ForConditionalGeneration.from_pretrained(model_path).to(device)

# Move to evaluation mode
model.eval()

def translate_to_telugu(text):
    input_text = "translate English to Telugu: " + text

    # Tokenize input
    inputs = tokenizer(input_text, return_tensors="pt").to(device)

    # Generate
    with torch.no_grad():
        output_tokens = model.generate(
            **inputs,
            max_length=128,
            num_beams=5,  # Beam search for better quality
            early_stopping=True,
            repetition_penalty=1.2
        )

    # Decode
    return tokenizer.decode(output_tokens[0], skip_special_tokens=True)

english_sentence = 'Pain from appendicitis may begin as dull pain around the navel.'
print(f"English: {english_sentence}")
print(f"Telugu: {translate_to_telugu(english_sentence)}")

# Result:
# English: Pain from appendicitis may begin as dull pain around the navel.
# Telugu: అపెండిసైటిస్ వలన వచ్చే నొప్పి నాభి చుట్టూ సన్నటి నొప్పిగా ప్రారంభమవుతుంది.
```
Alternatively, the model can be used with the `pipeline` API:
```python
import torch
from transformers import pipeline, T5ForConditionalGeneration, T5Tokenizer

model_path = "ManiKumarAdapala/mt5-telugu"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = T5Tokenizer.from_pretrained(model_path)
model = T5ForConditionalGeneration.from_pretrained(model_path).to(device)

# Move to evaluation mode
model.eval()

telugu_translator = pipeline(
    "text2text-generation",
    model=model,
    tokenizer=tokenizer
)

def translate(text):
    prefix = "translate English to Telugu: "
    output = telugu_translator(
        f"{prefix}{text}",
        max_length=128,
        num_beams=5,
        early_stopping=True,
        clean_up_tokenization_spaces=True
    )
    return output[0]['generated_text']

print(translate("It is invariant and is always included in all ragams."))
# Result: ఇది నిరంతరం ఉంటుంది మరియు ఎల్లప్పుడూ అన్ని రాగాలలో చేర్చబడుతుంది.
```
### 📝 Limitations
- **Prefix Required:** Always use the prefix `translate English to Telugu: ` for optimal results.
- **Context:** Best suited for single sentences or short paragraphs.
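Because the model was trained with a 128-token maximum sequence length, longer passages are best split into sentences and translated one at a time. The helper below is a minimal sketch of that approach; the naive regex splitter is an assumption (a proper sentence segmenter would be more robust), and `translate_fn` stands in for any single-sentence translator such as the `translate_to_telugu()` function shown earlier.

```python
import re

def translate_long_text(text, translate_fn, joiner=" "):
    """Split `text` into sentences on terminal punctuation, translate each
    piece independently with `translate_fn`, then rejoin the results.
    The regex splitter here is a deliberately simple illustration."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return joiner.join(translate_fn(s) for s in sentences)

# Example with an identity "translator" just to show the sentence splitting:
print(translate_long_text("First sentence. Second one! Third?", lambda s: s))
# -> First sentence. Second one! Third?
```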
### 🤝 Acknowledgments
This project is built upon the mT5 (Multilingual T5) architecture developed by Google. Their foundational research into massively multilingual models provided the raw material that made this specialized Telugu-language tool possible. |