results

This model is a fine-tuned version of t5-small on an English–Telugu conversational dataset.

Model Description

This model is a Telugu colloquial-language translator designed to convert English text into spoken (colloquial) Telugu. It is built on a transformer-based architecture and fine-tuned on translation tasks to produce natural, conversational output.

Key Features:

  • Conversational style: generates spoken Telugu instead of formal Telugu.
  • Context-aware translation: preserves the meaning and tone of English sentences.
  • Efficient inference: uses sampling and top-p filtering for diverse translations.
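Top-p (nucleus) filtering, mentioned above, keeps only the smallest set of tokens whose cumulative probability reaches p, then samples from that reduced set. A minimal plain-Python sketch (the token names and probabilities below are illustrative, not taken from the model):

```python
import random

def top_p_filter(probs, p=0.9):
    """Keep the smallest set of tokens whose cumulative probability >= p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, total = [], 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        total += prob
        if total >= p:
            break
    # Renormalize the surviving probability mass so it sums to 1 again.
    norm = sum(pr for _, pr in kept)
    return {tok: pr / norm for tok, pr in kept}

# Illustrative next-token distribution for a Telugu output position.
probs = {"veluthunnaru": 0.55, "veltunnava": 0.30, "formal_form": 0.10, "rare_tok": 0.05}
filtered = top_p_filter(probs, p=0.9)
# Sample one token from the filtered, renormalized distribution.
token = random.choices(list(filtered), weights=list(filtered.values()))[0]
```

With p=0.9 the low-probability tail (`rare_tok`) is cut, which is what keeps sampled translations diverse but still plausible.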

Intended Uses & Limitations

Intended Uses:

  • Language translation: converts English text into spoken Telugu.
  • Conversational AI: can be integrated into chatbots, voice assistants, or language-learning apps.
  • Educational tool: helps learners understand spoken Telugu in real-world contexts.

Limitations:

  • Limited vocabulary: may struggle with highly technical or domain-specific terms.
  • Context dependency: lacks deep contextual understanding for ambiguous sentences.
  • Dataset bias: biases present in the training data may appear in translations.
  • Grammar inconsistencies: spoken Telugu translations may not always be grammatically perfect.

Training and Evaluation Data

Training Data: The model was fine-tuned on a parallel corpus of English–Telugu conversational text. Source: ChatGPT.

Evaluation Data: The model was evaluated on a test set containing everyday English sentences.

Example categories:

  • Common phrases (e.g., "Where are you going?" → "Ekadiki veluthunnaru?")
  • Technical queries (e.g., "What is data structure?" → "Data structure ante emiti?")
  • General questions (e.g., "Can you explain this?" → "Idhi cheppagalava?")

Metrics Used:

  • BLEU score: measures translation accuracy against human reference translations.
  • Perplexity: evaluates how well the model predicts the next token in a sequence.
  • Human evaluation: Telugu speakers reviewed translations for fluency and accuracy.
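To illustrate the idea behind BLEU, here is a deliberately simplified unigram variant (clipped word precision times a brevity penalty); real evaluation should use a full n-gram BLEU implementation such as sacreBLEU, and the example sentences are illustrative:

```python
import math
from collections import Counter

def unigram_bleu(candidate, reference):
    """Simplified BLEU: clipped unigram precision times a brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    cand_counts, ref_counts = Counter(cand), Counter(ref)
    # Clip each candidate word count by its count in the reference,
    # so repeating a correct word cannot inflate the score.
    overlap = sum(min(n, ref_counts[w]) for w, n in cand_counts.items())
    precision = overlap / len(cand)
    # Penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * precision

score = unigram_bleu("ekadiki veluthunnaru", "ekadiki veluthunnaru")
```

An exact match scores 1.0; a truncated candidate is pulled down by the brevity penalty even if every word it contains is correct.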

Training Procedure

  1. Data Collection & Preprocessing

Data Sources:

  • Parallel corpus of English–Telugu conversations with a focus on colloquial spoken Telugu.
  • Crowdsourced translations and datasets from existing NLP corpora.
  • Manually curated Telugu phrases for informal, everyday speech.

Preprocessing Steps:

  • Text tokenization: SentencePiece/BPE (Byte Pair Encoding) for handling subwords.
  • Data cleaning: removed extra punctuation and normalized informal Telugu spellings.
  • Sentence alignment: mapped English phrases → spoken Telugu translations for training.
  2. Model Architecture & Training Configuration

  • Base model: transformer-based sequence-to-sequence (seq2seq) architecture. Options: T5, mT5, MarianMT, BART, or a custom LSTM-based model.
  • Embedding layer: converts words into vector representations.
  • Encoder-decoder: processes English input and generates colloquial Telugu output.

Hyperparameters:

  • Batch size: 16–64 (optimized for GPU memory).
  • Optimizer: Adam with learning-rate scheduling.
  • Loss function: cross-entropy loss for sequence prediction.
  • Dropout & regularization: applied to prevent overfitting.
  • Beam search & top-k sampling: used for natural-sounding output generation.
  3. Training Configuration

  • Hardware: NVIDIA A100/V100 GPUs or TPUs for faster training.
  • Training duration: several hours to days, depending on dataset size.
  • Dataset split: 80% training, 10% validation, 10% testing.
  • Evaluation during training: BLEU score, perplexity (PPL), and human evaluation of spoken fluency.
  • Fine-tuning: adjusted beam search and temperature scaling for more contextually relevant translations.
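The data-cleaning step described in the preprocessing section might look like the following sketch; the exact normalization rules (lowercasing, collapsing repeated punctuation and whitespace) are illustrative assumptions, not the ones actually used:

```python
import re

def clean_pair(english, telugu):
    """Normalize one English–Telugu training pair before tokenization."""
    def clean(text):
        text = text.strip().lower()
        # Collapse runs of repeated punctuation ("??" -> "?").
        text = re.sub(r"[!?.]{2,}", lambda m: m.group()[0], text)
        # Collapse runs of whitespace into a single space.
        text = re.sub(r"\s+", " ", text)
        return text
    return clean(english), clean(telugu)

pair = clean_pair("Where are you   going??", "Ekadiki veluthunnaru??")
```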
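The 80/10/10 dataset split above can be produced with a seeded shuffle, for example:

```python
import random

def split_dataset(pairs, seed=42):
    """Shuffle and split into 80% train / 10% validation / 10% test."""
    pairs = list(pairs)
    random.Random(seed).shuffle(pairs)  # seeded for reproducibility
    n = len(pairs)
    train_end, val_end = int(0.8 * n), int(0.9 * n)
    return pairs[:train_end], pairs[train_end:val_end], pairs[val_end:]

train, val, test = split_dataset(range(100))
```

Seeding the shuffle keeps the split reproducible across runs, so validation and test examples never leak into training.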

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • num_epochs: 10
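The hyperparameters listed above map onto Hugging Face `Seq2SeqTrainingArguments` roughly as follows; the `output_dir` and the omitted trainer wiring are illustrative, not confirmed from the training script:

```python
from transformers import Seq2SeqTrainingArguments

# Mirrors the hyperparameters listed above; output_dir is an assumption.
training_args = Seq2SeqTrainingArguments(
    output_dir="results",
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",          # AdamW with betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="linear",
    num_train_epochs=10,
)
```

These arguments would then be passed to a `Seq2SeqTrainer` together with the model, tokenizer, and tokenized datasets.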
Qualitative Analysis

✅ Strengths:

  • Produces natural and fluent spoken Telugu translations.
  • Handles short, conversational phrases accurately.
  • Preserves the context and informal nuances of Telugu speech.

❌ Challenges:

  • Long sentences may lose colloquial tone or sound too formal.
  • Domain-specific phrases (e.g., tech terms) may need further fine-tuning.
  • Context switching in complex sentences sometimes leads to literal translations instead of natural Telugu speech.

Framework versions

  • Transformers 4.49.0
  • PyTorch 2.6.0+cu124
  • Datasets 3.3.1
  • Tokenizers 0.21.0
Model size: 60.5M parameters (F32, Safetensors format)

Model tree for SujathaL/results

Base model: google-t5/t5-small