---
license: apache-2.0
datasets:
- ai4bharat/BPCC
language:
- te
- en
metrics:
- bleu
- chrf
library_name: transformers
base_model:
- google/mt5-small
tags:
- translation
- text2text-generation
- indic-nlp
- telugu
- mt5
- hybrid-training
- full-finetune
model-index:
- name: mT5-English-to-Telugu-Translator
  results:
  - task:
      type: translation
      name: Translation English to Telugu
    metrics:
    - type: bleu
      value: 55.34
      name: SacreBLEU
    - type: chrf
      value: 75.87
      name: ChrF++
---

# 🌟 mT5 English-to-Telugu Hybrid Translator

This model delivers high-quality English-to-Telugu translation at a small parameter count. It was developed with a **two-phase training strategy** that combines the depth of full fine-tuning with the precision of LoRA (Low-Rank Adaptation).

## 🚀 The "Two-Phase" Advantage
Unlike standard fine-tuned models, this version underwent a rigorous 30-epoch journey:

1. **Phase I: Deep Language Grounding (Full Fine-Tuning, 15 Epochs)** The entire mT5-small architecture was unlocked to re-align its internal "mental map" from general multilingual space to a specialized English-Telugu domain. This allowed for deep syntactic and morphological adaptation.

2. **Phase II: Precision Refinement (LoRA, 15 Epochs)** After the base weights were grounded, LoRA ($r=16$) was applied to the specialized checkpoint. This phase acted as a regularizer, sharpening the translation logic and reducing the hallucinations common in smaller models.

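The two phases above can be sketched in plain PyTorch. This is an illustrative toy, not the actual training code: the real run operated on the full mt5-small weights, and the hidden size `d`, the scaling `alpha`, and the initialization scheme below are assumptions. It shows why Phase II trains far fewer parameters than Phase I, and how the adapter folds back into the base weights at the end:

```python
import torch

d, r = 64, 16          # toy hidden size; r=16 matches the Phase II rank
alpha = 32             # LoRA scaling factor (hypothetical; not stated in this card)

# Phase I stand-in: a fully fine-tuned weight matrix (all d*d entries trained).
W = torch.randn(d, d)

# Phase II: freeze W and train only the low-rank factors A and B.
A = torch.randn(r, d) * 0.01          # small random init
B = torch.zeros(d, r)                 # zero init, so the update starts at 0
delta = (alpha / r) * (B @ A)         # low-rank weight update

x = torch.randn(1, d)
y = x @ (W + delta).T                 # adapted forward pass

# "Merged and unloaded": fold the adapter into the base weights, so inference
# needs only a standard checkpoint with the original shape.
W_merged = W + delta

full_ft_params = W.numel()            # 4096 parameters trained in Phase I
lora_params = A.numel() + B.numel()   # 2048 parameters trained in Phase II
print(full_ft_params, lora_params)
```

The low-rank factors cost `2*r*d` parameters per adapted matrix instead of `d*d`, which is why the LoRA phase can refine the checkpoint cheaply without disturbing the grounding from Phase I.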
## 📖 Key Model Details
- **Fine-tuned by:** Adapala Mani Kumar
- **Model type:** Encoder-decoder (Transformer)
- **Architecture:** T5ForConditionalGeneration
- **Language(s):** English to Telugu
- **Fine-tuning technique:** Full fine-tuning, then PEFT/LoRA
- **Max sequence length:** 128 tokens

## 📈 Performance (Evaluation Results)
The model was evaluated on a held-out test set and achieved the following scores:

| Metric | Score |
| :--- | :--- |
| **SacreBLEU** | 55.34 |
| **ChrF++** | 75.87 |
| **Validation Loss** | 0.3373 |

These scores indicate high translation quality, outperforming many baseline multilingual models on the English-Telugu pair.

## 🛠 Usage
Since the Phase II LoRA adapters have been **merged and unloaded** into the base weights, this model functions as a standalone mT5 model; no `peft` dependency is needed at inference time.

```python
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

model_path = "ManiKumarAdapala/mt5-telugu"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = T5Tokenizer.from_pretrained(model_path)
model = T5ForConditionalGeneration.from_pretrained(model_path).to(device)

# Switch to evaluation mode (disables dropout)
model.eval()

def translate_to_telugu(text):
    input_text = "translate English to Telugu: " + text

    # Tokenize input, truncating to the model's 128-token limit
    inputs = tokenizer(
        input_text, return_tensors="pt", truncation=True, max_length=128
    ).to(device)

    # Generate
    with torch.no_grad():
        output_tokens = model.generate(
            **inputs,
            max_length=128,
            num_beams=5,  # beam search for better quality
            early_stopping=True,
            repetition_penalty=1.2,
        )

    # Decode
    return tokenizer.decode(output_tokens[0], skip_special_tokens=True)

english_sentence = "Pain from appendicitis may begin as dull pain around the navel."
print(f"English: {english_sentence}")
print(f"Telugu: {translate_to_telugu(english_sentence)}")

# Result:
# English: Pain from appendicitis may begin as dull pain around the navel.
# Telugu: అపెండిసైటిస్ వలన వచ్చే నొప్పి నాభి చుట్టూ సన్నటి నొప్పిగా ప్రారంభమవుతుంది.
```

Alternatively, the model can be used with the Transformers `pipeline` API:

```python
import torch
from transformers import pipeline, T5ForConditionalGeneration, T5Tokenizer

model_path = "ManiKumarAdapala/mt5-telugu"
device = 0 if torch.cuda.is_available() else -1  # GPU index, or CPU fallback

tokenizer = T5Tokenizer.from_pretrained(model_path)
model = T5ForConditionalGeneration.from_pretrained(model_path)

telugu_translator = pipeline(
    "text2text-generation",
    model=model,
    tokenizer=tokenizer,
    device=device
)

def translate(text):
    prefix = "translate English to Telugu: "
    output = telugu_translator(
        f"{prefix}{text}",
        max_length=128,
        num_beams=5,
        early_stopping=True,
        clean_up_tokenization_spaces=True
    )
    return output[0]["generated_text"]

print(translate("It is invariant and is always included in all ragams."))

# Result: ఇది నిరంతరం ఉంటుంది మరియు ఎల్లప్పుడూ అన్ని రాగాలలో చేర్చబడుతుంది.
```

### 📝 Limitations
- **Prefix required:** always prepend `translate English to Telugu: ` to the input for optimal results.
- **Context:** best suited for single sentences or short paragraphs.

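Given the single-sentence limitation, one workaround for longer passages is to split the input and translate sentence by sentence. A minimal standard-library splitter is sketched below; the `split_sentences` helper is hypothetical, not part of the model, and the naive regex is only adequate for simple English prose:

```python
import re

def split_sentences(text):
    # Naive split on sentence-ending punctuation followed by whitespace;
    # not a full sentence tokenizer (abbreviations like "Dr." will mis-split).
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

passage = ("Pain from appendicitis may begin as dull pain around the navel. "
           "It often sharpens over time.")
for sentence in split_sentences(passage):
    print(sentence)  # each sentence would then be passed to the translator
```

Each resulting sentence stays well under the 128-token limit, so translating them individually and rejoining the outputs avoids truncation.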
### 🤝 Acknowledgments
This project is built upon the mT5 (Multilingual T5) architecture developed by Google. Their foundational research into massively multilingual models laid the groundwork that made this specialized Telugu-language tool possible.