🌟 T5 English-to-Telugu Hybrid Translator

This model delivers strong small-parameter English-to-Telugu translation. It was developed using a two-phase training strategy that combines the depth of full fine-tuning with the precision of LoRA (Low-Rank Adaptation).

🚀 The "Two-Phase" Advantage

Unlike standard fine-tuned models, this version underwent a rigorous 30-epoch journey:

  1. Phase I: Deep Language Grounding (Full Fine-Tuning, 15 Epochs). The entire mT5-small architecture was unlocked to re-align its internal "mental map" from general multilingual space to a specialized English-Telugu domain, allowing deep syntactic and morphological adaptation.

  2. Phase II: Precision Refinement (LoRA, 15 Epochs). With the base weights grounded, LoRA (r=16) was applied to the specialized checkpoint. This phase acted as a regularizer, sharpening the translation logic and reducing the hallucinations common in smaller models.
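The arithmetic behind Phase II is worth spelling out: LoRA freezes the base weight matrix W and learns two small factors, B (d×r) and A (r×k), so the effective weight becomes W + (α/r)·BA. The toy sketch below (an illustration only; it uses r=1 and 2×2 matrices instead of the card's r=16 on full projection matrices) shows why merging the adapter back into W yields a standalone checkpoint:

```python
def matmul(X, Y):
    """Naive matrix multiply for small nested-list matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

# Frozen base weight W: untouched while the adapter trains.
W = [[1.0, 0.0],
     [0.0, 1.0]]

# LoRA factors: B is d x r, A is r x k. Here r=1 for illustration.
B = [[0.5], [0.25]]
A = [[2.0, 4.0]]
alpha, r = 2.0, 1

# Merging the adapter: W' = W + (alpha / r) * B @ A
delta = matmul(B, A)
scale = alpha / r
W_merged = [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]
print(W_merged)  # [[3.0, 4.0], [1.0, 3.0]]
```

Because only B and A (2·d·r parameters instead of d·k) are trained, the update is low-rank by construction, and after the merge no adapter weights need to ship with the model.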

📖 Model Description

  • Finetuned by: Adapala Mani Kumar
  • Model Type: Encoder-Decoder (Transformer)
  • Architecture: T5ForConditionalGeneration
  • Language(s): English to Telugu
  • Fine-tuning Technique: Full fine-tuning + PEFT/LoRA
  • Max Sequence Length: 128 tokens

📈 Performance (Evaluation Results)

The model was evaluated on a held-out test set and achieved the following scores:

Metric            Score
SacreBLEU         55.34
ChrF++            75.87
Validation Loss   0.3373

These scores indicate a very high level of translation quality, outperforming many baseline multilingual models for the English-Telugu pair.
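For intuition on what ChrF-style scores measure, the sketch below implements a simplified character n-gram F-score in pure Python (an illustration of the idea only; the reported 75.87 was produced by a standard scorer such as sacreBLEU's chrF++, which additionally handles word n-grams and corpus-level aggregation):

```python
from collections import Counter

def char_ngrams(text, n):
    """All character n-grams of a string (whitespace collapsed)."""
    text = " ".join(text.split())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def chrf_simplified(hypothesis, reference, max_n=6, beta=2.0):
    """Simplified chrF: average character n-gram precision and recall,
    combined into an F-beta score (beta=2 weights recall over precision)."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
        if sum(hyp.values()) == 0 or sum(ref.values()) == 0:
            continue
        overlap = sum((hyp & ref).values())
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return (1 + beta**2) * p * r / (beta**2 * p + r) * 100

print(chrf_simplified("the cat sat", "the cat sat"))  # 100.0
```

Character-level matching is why chrF is a popular metric for morphologically rich languages like Telugu: it rewards partially correct word forms that word-level BLEU would score as complete misses.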

🛠 Usage

Since the Phase II LoRA adapter has been merged into the base weights and unloaded, this model functions as a standalone mT5 checkpoint; no PEFT dependency is required at inference time.

import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

model_path = "ManiKumarAdapala/mt5-telugu"
device = "cuda" if torch.cuda.is_available() else "cpu"  # fall back to CPU if no GPU
tokenizer = T5Tokenizer.from_pretrained(model_path)
model = T5ForConditionalGeneration.from_pretrained(model_path).to(device)

# Switch to evaluation mode (disables dropout)
model.eval()

def translate_to_telugu(text):
    # The model was fine-tuned with this task prefix; keep it verbatim
    input_text = "translate English to Telugu: " + text

    # Tokenize input and move tensors to the model's device
    inputs = tokenizer(input_text, return_tensors="pt").to(device)
    
    # Generate
    with torch.no_grad():
        output_tokens = model.generate(
            **inputs, 
            max_length=128, 
            num_beams=5,          # Beam search for better quality
            early_stopping=True,
            repetition_penalty=1.2
        )
    
    # Decode
    return tokenizer.decode(output_tokens[0], skip_special_tokens=True)

english_sentence = 'Pain from appendicitis may begin as dull pain around the navel.'
print(f"English: {english_sentence}")
print(f"Telugu:  {translate_to_telugu(english_sentence)}")

# Result :
# English: Pain from appendicitis may begin as dull pain around the navel.
# Telugu:  అపెండిసైటిస్ వలన వచ్చే నొప్పి నాభి చుట్టూ సన్నటి నొప్పిగా ప్రారంభమవుతుంది.

Alternatively, the model can be used through the transformers pipeline API.

from transformers import pipeline, T5ForConditionalGeneration, T5Tokenizer

model_path = "ManiKumarAdapala/mt5-telugu"
device = "cuda" if torch.cuda.is_available() else "cpu"  # fall back to CPU if no GPU
tokenizer = T5Tokenizer.from_pretrained(model_path)
model = T5ForConditionalGeneration.from_pretrained(model_path).to(device)

# Switch to evaluation mode (disables dropout)
model.eval()

telugu_translator = pipeline(
    "text2text-generation",
    model=model,
    tokenizer=tokenizer
)

def translate(text):
    prefix = "translate English to Telugu: "
    output = telugu_translator(
        f"{prefix}{text}",
        max_length=128,
        num_beams=5,
        early_stopping=True,
        clean_up_tokenization_spaces=True
    )
    return output[0]['generated_text']

print(translate("It is invariant and is always included in all ragams."))

# Result : ఇది నిరంతరం ఉంటుంది మరియు ఎల్లప్పుడూ అన్ని రాగాలలో చేర్చబడుతుంది.

📝 Limitations

  • Prefix Required: Always prepend the exact prefix "translate English to Telugu: " (including the trailing space) for optimal results.
  • Context: Best suited for single sentences or short paragraphs.
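Given the 128-token limit and the sentence-level sweet spot, longer passages translate best one sentence at a time. A minimal pre-processing sketch (the naive regex splitter below is an assumption for illustration, not part of the model):

```python
import re

def split_sentences(text):
    """Naive sentence splitter: break on ., !, or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

paragraph = ("Appendicitis pain often starts near the navel. "
             "It may later shift to the lower right abdomen. "
             "See a doctor if it worsens!")

for sentence in split_sentences(paragraph):
    # Each sentence would then be passed individually to
    # translate_to_telugu(sentence) from the usage section above,
    # keeping every input comfortably under the 128-token limit.
    print(sentence)
```

A real pipeline would want a more robust splitter (abbreviations, decimals, and quotes all break this regex), but the principle of per-sentence translation holds.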

🤝 Acknowledgments

This project is built upon the mT5 (Multilingual T5) architecture developed by Google. Their foundational research into massively multilingual models provided the raw material that made this specialized Telugu-language tool possible.
