# 🧠 Bangla T5 Fine-Tuned Model

This repository contains a fine-tuned version of the T5 model for Bangla question answering, built with Hugging Face Transformers.

## 📝 Model Description

- **Base Model:** csebuetnlp/banglat5
- **Task:** Question Answering
- **Language:** Bengali (Bangla)
- **Framework:** PyTorch + Hugging Face Transformers

## 📚 Training Configuration

- **Epochs:** 15
- **Batch Size:** 4
- **Learning Rate:** 0.0001
- **Optimizer:** Adam
- **Loss Function:** CrossEntropyLoss
- **Hardware:** 1× NVIDIA RTX 4090
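The original training script is not published, but the configuration above maps naturally onto Hugging Face's `Seq2SeqTrainingArguments` (CrossEntropyLoss is the default loss for `T5ForConditionalGeneration`, so it needs no explicit setting). This is a sketch, not the author's actual script; `output_dir` and the AdamW variant of the optimizer are assumptions:

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the listed configuration; output_dir is a placeholder, and
# "adamw_torch" is assumed for the "Adam" optimizer named in the card.
training_args = Seq2SeqTrainingArguments(
    output_dir="./banglat5-qa",         # placeholder path
    num_train_epochs=15,
    per_device_train_batch_size=4,
    learning_rate=1e-4,
    optim="adamw_torch",
)
```

These arguments would then be passed to a `Seq2SeqTrainer` together with the model, tokenizer, and tokenized datasets.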

## 📉 Training and Validation Loss per Epoch

| Epoch | Training Loss | Validation Loss |
|------:|--------------:|----------------:|
| 1 | 3.7985 | 1.3028 |
| 2 | 1.5408 | 0.7553 |
| 3 | 1.0926 | 0.4264 |
| 4 | 0.8402 | 0.4072 |
| 5 | 0.6662 | 0.3555 |
| 6 | 0.5223 | 0.2869 |
| 7 | 0.4514 | 0.2869 |
| 8 | 0.3983 | 0.2172 |
| 9 | 0.3581 | 0.1853 |
| 10 | 0.3067 | 0.1402 |
| 11 | 0.2754 | 0.1678 |
| 12 | 0.2639 | 0.1041 |
| 13 | 0.2587 | 0.1537 |
| 14 | 0.2415 | 0.0902 |
| 15 | 0.2043 | 0.1247 |
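Note that validation loss fluctuates in the later epochs and reaches its minimum at epoch 14, not 15, so if you select a checkpoint by validation loss, epoch 14 is the natural pick. A quick check over the table's values:

```python
# Validation losses from the table above, indexed by epoch 1-15.
val_loss = [1.3028, 0.7553, 0.4264, 0.4072, 0.3555, 0.2869, 0.2869,
            0.2172, 0.1853, 0.1402, 0.1678, 0.1041, 0.1537, 0.0902, 0.1247]

# Epoch (1-based) with the lowest validation loss.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1
print(best_epoch, val_loss[best_epoch - 1])  # → 14 0.0902
```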

## 🔧 How to Use

Note that `Q_LEN` (the maximum input length) was not specified in the original snippet; 256 is an assumed value — set it to whatever length was used during fine-tuning.

```python
import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration

MODEL = T5ForConditionalGeneration.from_pretrained("shaanzeeeee/banglaT5forQnAfinetuned")
TOKENIZER = T5Tokenizer.from_pretrained("shaanzeeeee/banglaT5forQnAfinetuned")
DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")
MODEL.to(DEVICE)

Q_LEN = 256  # assumed max input length; match the value used during fine-tuning


def predict_answer(context, question, ref_answer=None):
    # Tokenize the question/context pair into a single padded input sequence.
    inputs = TOKENIZER(
        question,
        context,
        max_length=Q_LEN,
        padding="max_length",
        truncation=True,
        add_special_tokens=True,
    )

    input_ids = torch.tensor(inputs["input_ids"], dtype=torch.long).unsqueeze(0).to(DEVICE)
    attention_mask = torch.tensor(inputs["attention_mask"], dtype=torch.long).unsqueeze(0).to(DEVICE)

    outputs = MODEL.generate(input_ids=input_ids, attention_mask=attention_mask)

    predicted_answer = TOKENIZER.decode(outputs.flatten(), skip_special_tokens=True)

    if ref_answer:
        # Optionally score the prediction against the reference with Google BLEU:
        # bleu = evaluate.load("google_bleu")
        # score = bleu.compute(predictions=[predicted_answer], references=[ref_answer])

        print("Context:\n", context)
        print("\nQuestion:\n", question)
        return {
            "Reference Answer": ref_answer,
            "Predicted Answer": predicted_answer,
            # "BLEU Score": score,
        }
    return predicted_answer


context = ""
question = ""
ref_answer = ""
predict_answer(context, question, ref_answer)
```