🌟 T5 English-to-Telugu Hybrid Translator

This model delivers strong small-parameter English-to-Telugu translation. It was developed using a two-phase training strategy that combines the depth of full fine-tuning with the precision of LoRA (Low-Rank Adaptation).

🚀 The "Two-Phase" Advantage

Unlike standard fine-tuned models, this version underwent a rigorous 30-epoch journey:

  1. Phase I: Deep Language Grounding (Full Fine-Tuning, 15 Epochs). The entire mT5-small architecture was unlocked to re-align its internal "mental map" from general multilingual space to a specialized English-Telugu domain, allowing deep syntactic and morphological adaptation.

  2. Phase II: Precision Refinement (LoRA, 15 Epochs). With the base weights grounded, LoRA (r=16) was applied to the specialized checkpoint. This phase acted as a regularizer, sharpening the translation logic and reducing the hallucinations common in smaller models.
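The arithmetic behind Phase II is worth spelling out: LoRA freezes the base weight matrix W and learns two small factors, B (d×r) and A (r×k), so the effective weight becomes W + (α/r)·BA. The toy sketch below (an illustration only; it uses r=1 and 2×2 matrices instead of the card's r=16 on full projection matrices) shows why merging the adapter back into W yields a standalone checkpoint:

```python
def matmul(X, Y):
    """Naive matrix multiply for small nested-list matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

# Frozen base weight W: untouched while the adapter trains.
W = [[1.0, 0.0],
     [0.0, 1.0]]

# LoRA factors: B is d x r, A is r x k. Here r=1 for illustration.
B = [[0.5], [0.25]]
A = [[2.0, 4.0]]
alpha, r = 2.0, 1

# Merging the adapter: W' = W + (alpha / r) * B @ A
delta = matmul(B, A)
scale = alpha / r
W_merged = [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]
print(W_merged)  # [[3.0, 4.0], [1.0, 3.0]]
```

Because only B and A (2·d·r parameters instead of d·k) are trained, the update is low-rank by construction, and after the merge no adapter weights need to ship with the model.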

📖 Model Description

  • Finetuned by: Adapala Mani Kumar
  • Model Type: Encoder-Decoder (Transformer)
  • Architecture: T5ForConditionalGeneration
  • Language(s): English to Telugu
  • Fine-tuning Technique: Full fine-tuning + PEFT/LoRA
  • Max Sequence Length: 128 tokens

📈 Performance (Evaluation Results)

The model was evaluated on a held-out test set and achieved the following scores:

Metric            Score
SacreBLEU         55.34
ChrF++            75.87
Validation Loss   0.3373

These scores indicate a very high level of translation quality, outperforming many baseline multilingual models for the English-Telugu pair.
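For intuition on what ChrF-style scores measure, the sketch below implements a simplified character n-gram F-score in pure Python (an illustration of the idea only; the reported 75.87 was produced by a standard scorer such as sacreBLEU's chrF++, which additionally handles word n-grams and corpus-level aggregation):

```python
from collections import Counter

def char_ngrams(text, n):
    """All character n-grams of a string (whitespace collapsed)."""
    text = " ".join(text.split())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def chrf_simplified(hypothesis, reference, max_n=6, beta=2.0):
    """Simplified chrF: average character n-gram precision and recall,
    combined into an F-beta score (beta=2 weights recall over precision)."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
        if sum(hyp.values()) == 0 or sum(ref.values()) == 0:
            continue
        overlap = sum((hyp & ref).values())
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return (1 + beta**2) * p * r / (beta**2 * p + r) * 100

print(chrf_simplified("the cat sat", "the cat sat"))  # 100.0
```

Character-level matching is why chrF is a popular metric for morphologically rich languages like Telugu: it rewards partially correct word forms that word-level BLEU would score as complete misses.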

🛠 Usage

Since the Phase II LoRA adapter has been merged into the base weights and unloaded, this model functions as a standalone mT5 checkpoint; no PEFT dependency is required at inference time.

import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

model_path = "ManiKumarAdapala/mt5-telugu"
device = "cuda" if torch.cuda.is_available() else "cpu"  # fall back to CPU if no GPU
tokenizer = T5Tokenizer.from_pretrained(model_path)
model = T5ForConditionalGeneration.from_pretrained(model_path).to(device)

# Switch to evaluation mode (disables dropout)
model.eval()

def translate_to_telugu(text):
    # The model was fine-tuned with this task prefix; keep it verbatim
    input_text = "translate English to Telugu: " + text

    # Tokenize input and move tensors to the model's device
    inputs = tokenizer(input_text, return_tensors="pt").to(device)
    
    # Generate
    with torch.no_grad():
        output_tokens = model.generate(
            **inputs, 
            max_length=128, 
            num_beams=5,          # Beam search for better quality
            early_stopping=True,
            repetition_penalty=1.2
        )
    
    # Decode
    return tokenizer.decode(output_tokens[0], skip_special_tokens=True)

english_sentence = 'Pain from appendicitis may begin as dull pain around the navel.'
print(f"English: {english_sentence}")
print(f"Telugu:  {translate_to_telugu(english_sentence)}")

# Result :
# English: Pain from appendicitis may begin as dull pain around the navel.
# Telugu:  అపెండిసైటిస్ వలన వచ్చే నొప్పి నాభి చుట్టూ సన్నటి నొప్పిగా ప్రారంభమవుతుంది.

Alternatively, the model can be used through the transformers pipeline API.

from transformers import pipeline, T5ForConditionalGeneration, T5Tokenizer

model_path = "ManiKumarAdapala/mt5-telugu"
device = "cuda" if torch.cuda.is_available() else "cpu"  # fall back to CPU if no GPU
tokenizer = T5Tokenizer.from_pretrained(model_path)
model = T5ForConditionalGeneration.from_pretrained(model_path).to(device)

# Switch to evaluation mode (disables dropout)
model.eval()

telugu_translator = pipeline(
    "text2text-generation",
    model=model,
    tokenizer=tokenizer
)

def translate(text):
    prefix = "translate English to Telugu: "
    output = telugu_translator(
        f"{prefix}{text}",
        max_length=128,
        num_beams=5,
        early_stopping=True,
        clean_up_tokenization_spaces=True
    )
    return output[0]['generated_text']

print(translate("It is invariant and is always included in all ragams."))

# Result : ఇది నిరంతరం ఉంటుంది మరియు ఎల్లప్పుడూ అన్ని రాగాలలో చేర్చబడుతుంది.

📝 Limitations

  • Prefix Required: Always prepend the exact prefix "translate English to Telugu: " (including the trailing space) for optimal results.
  • Context: Best suited for single sentences or short paragraphs.
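Given the 128-token limit and the sentence-level sweet spot, longer passages translate best one sentence at a time. A minimal pre-processing sketch (the naive regex splitter below is an assumption for illustration, not part of the model):

```python
import re

def split_sentences(text):
    """Naive sentence splitter: break on ., !, or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

paragraph = ("Appendicitis pain often starts near the navel. "
             "It may later shift to the lower right abdomen. "
             "See a doctor if it worsens!")

for sentence in split_sentences(paragraph):
    # Each sentence would then be passed individually to
    # translate_to_telugu(sentence) from the usage section above,
    # keeping every input comfortably under the 128-token limit.
    print(sentence)
```

A real pipeline would want a more robust splitter (abbreviations, decimals, and quotes all break this regex), but the principle of per-sentence translation holds.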

🤝 Acknowledgments

This project is built upon the mT5 (Multilingual T5) architecture developed by Google. Their foundational research into massively multilingual models provided the raw material that made this specialized Telugu-language tool possible.
