---
license: apache-2.0
datasets:
- ai4bharat/BPCC
language:
- te
- en
metrics:
- bleu
- chrf
library_name: transformers
base_model:
- google/mt5-small
tags:
- translation
- text2text-generation
- indic-nlp
- telugu
- mt5
- hybrid-training
- full-finetune
model-index:
- name: mT5-English-to-Telugu-Translator
  results:
  - task:
      type: translation
      name: Translation English to Telugu
    metrics:
    - type: bleu
      value: 55.34
      name: SacreBLEU
    - type: chrf
      value: 75.87
      name: ChrF++
---

# 🌟 mT5 English-to-Telugu Hybrid Translator

This model is a high-performance, small-parameter translator for the Telugu language. It was developed using a **unique two-phase training strategy** that combines the depth of full fine-tuning with the precision of LoRA (Low-Rank Adaptation).

## 🚀 The "Two-Phase" Advantage

Unlike standard fine-tuned models, this version underwent a rigorous 30-epoch journey:

1. **Phase I: Deep Language Grounding (Full Fine-Tuning, 15 Epochs)**
   The entire mT5-small architecture was unlocked to re-align its internal "mental map" from the general multilingual space to a specialized English-Telugu domain. This allowed for deep syntactic and morphological adaptation.

2. **Phase II: Precision Refinement (LoRA, 15 Epochs)**
   After the base weights were grounded, LoRA ($r = 16$) was applied to the specialized checkpoint. This phase acted as a regularizer, sharpening the translation logic and reducing the "hallucinations" common in smaller models.
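The effect of merging the Phase II adapters back into the base model can be illustrated on a single weight matrix: the trained low-rank update $B A$ is folded into the frozen weight, so inference afterwards is a single plain matmul with no adapter code. A minimal sketch in plain PyTorch; only $r = 16$ comes from the training setup above, while the dimensions and `alpha` are illustrative assumptions:

```python
import torch

torch.manual_seed(0)

# One weight matrix stands in for each adapted projection inside mT5.
d, r, alpha = 64, 16, 32          # r = 16 matches Phase II; d and alpha are illustrative
W = torch.randn(d, d)             # frozen Phase-I weight
A = torch.randn(r, d) * 0.01      # LoRA down-projection (trained)
B = torch.randn(d, r) * 0.01      # LoRA up-projection (trained)

scale = alpha / r
W_merged = W + scale * (B @ A)    # fold the adapter into the base weight

x = torch.randn(d)
y_adapter = W @ x + scale * (B @ (A @ x))   # base path + adapter path
y_merged = W_merged @ x                     # single matmul after merging
print(torch.allclose(y_adapter, y_merged, atol=1e-4))  # True
```

Because the merged weight reproduces the adapter-augmented output exactly (up to float rounding), the published checkpoint needs no PEFT dependency at inference time.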
## 📖 Model Description

- **Fine-tuned by:** Adapala Mani Kumar
- **Model type:** Encoder-decoder (Transformer)
- **Architecture:** `T5ForConditionalGeneration`
- **Language(s):** English to Telugu
- **Fine-tuning technique:** Full fine-tuning, then PEFT/LoRA
- **Max sequence length:** 128 tokens

## 📈 Performance (Evaluation Results)

The model was evaluated on a held-out test set and achieved the following scores:

| Metric | Score |
| :--- | :--- |
| **SacreBLEU** | 55.34 |
| **ChrF++** | 75.87 |
| **Validation Loss** | 0.3373 |

These scores indicate a very high level of translation quality, outperforming many baseline multilingual models on the English-Telugu pair.

## 🛠 Usage

Since the Phase II adapters have been **merged and unloaded**, this model functions as a standalone mT5 model.

```python
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

model_path = "ManiKumarAdapala/mt5-telugu"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = T5Tokenizer.from_pretrained(model_path)
model = T5ForConditionalGeneration.from_pretrained(model_path).to(device)

# Switch to evaluation mode (disables dropout)
model.eval()

def translate_to_telugu(text):
    input_text = "translate English to Telugu: " + text

    # Tokenize the input
    inputs = tokenizer(input_text, return_tensors="pt").to(device)

    # Generate without tracking gradients
    with torch.no_grad():
        output_tokens = model.generate(
            **inputs,
            max_length=128,
            num_beams=5,              # beam search for better quality
            early_stopping=True,
            repetition_penalty=1.2,
        )

    # Decode, skipping special tokens
    return tokenizer.decode(output_tokens[0], skip_special_tokens=True)

english_sentence = "Pain from appendicitis may begin as dull pain around the navel."
print(f"English: {english_sentence}")
print(f"Telugu: {translate_to_telugu(english_sentence)}")

# Result:
# English: Pain from appendicitis may begin as dull pain around the navel.
# Telugu: అపెండిసైటిస్ వలన వచ్చే నొప్పి నాభి చుట్టూ సన్నటి నొప్పిగా ప్రారంభమవుతుంది.
```

Alternatively, the model can be used with the `pipeline` API.
```python
import torch
from transformers import pipeline, T5ForConditionalGeneration, T5Tokenizer

model_path = "ManiKumarAdapala/mt5-telugu"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = T5Tokenizer.from_pretrained(model_path)
model = T5ForConditionalGeneration.from_pretrained(model_path).to(device)

# Switch to evaluation mode (disables dropout)
model.eval()

telugu_translator = pipeline(
    "text2text-generation",
    model=model,
    tokenizer=tokenizer,
)

def translate(text):
    prefix = "translate English to Telugu: "
    output = telugu_translator(
        f"{prefix}{text}",
        max_length=128,
        num_beams=5,
        early_stopping=True,
        clean_up_tokenization_spaces=True,
    )
    return output[0]["generated_text"]

print(translate("It is invariant and is always included in all ragams."))
# Result: ఇది నిరంతరం ఉంటుంది మరియు ఎల్లప్పుడూ అన్ని రాగాలలో చేర్చబడుతుంది.
```

### 📝 Limitations

- **Prefix required:** always use the prefix `translate English to Telugu: ` for optimal results.
- **Context:** best suited for single sentences or short paragraphs.

### 🤝 Acknowledgments

This project is built upon the mT5 (Multilingual T5) architecture developed by Google. Their foundational research into massively multilingual models provided the raw material that made this specialized Telugu-language tool possible.