# T5-Base Fine-Tuned Model for Question Answering

This repository hosts a fine-tuned version of the **T5-Base** model optimized for question-answering tasks using the SQuAD dataset. The model is designed to perform question answering efficiently while maintaining high accuracy.
## Model Details

- **Model Architecture**: T5-Base
- **Task**: Question Answering (QA Chatbot)
- **Dataset**: SQuAD
- **Quantization**: FP16
- **Fine-Tuning Framework**: Hugging Face Transformers
## 🚀 Usage

### Installation

```bash
pip install transformers torch
```
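To verify the setup, you can print the installed versions and check GPU availability (a quick sanity check; the exact versions will vary):

```python
import torch
import transformers

# Confirm both libraries import correctly and report their versions
print(transformers.__version__)
print(torch.__version__)
print(torch.cuda.is_available())  # True if a CUDA GPU can be used for inference
```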
### Loading the Model

```python
from transformers import T5Tokenizer, T5ForConditionalGeneration
import torch

# Select GPU if available, otherwise fall back to CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

model_name = "AventIQ-AI/t5-qa-chatbot"
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name).to(device)
```
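Since the published weights are FP16, they can optionally be loaded in half precision to keep the memory footprint low. A minimal variant, assuming a CUDA device (`torch_dtype` is a standard `from_pretrained` argument; FP16 inference on CPU is generally slow or unsupported):

```python
# Optional: keep the weights in half precision on the GPU
model_fp16 = T5ForConditionalGeneration.from_pretrained(
    model_name, torch_dtype=torch.float16
).to(device)
```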
### Chatbot Inference

```python
def answer_question(question, context):
    # T5 QA models expect the "question: ... context: ..." input format
    input_text = f"question: {question} context: {context}"
    inputs = tokenizer(input_text, return_tensors="pt", truncation=True, padding="max_length", max_length=512)

    # Move input tensors to the same device as the model
    inputs = {key: value.to(device) for key, value in inputs.items()}

    # Generate answer
    with torch.no_grad():
        output = model.generate(**inputs, max_length=150)

    # Decode and return answer
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Test case
question = "What is overfitting in machine learning?"
context = "Overfitting occurs when a model learns the training data too well, capturing noise instead of actual patterns."
predicted_answer = answer_question(question, context)
print(f"Predicted Answer: {predicted_answer}")
```
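Generation quality can sometimes be improved by tuning standard `generate` arguments. A hedged example using beam search (the values shown are illustrative, not tuned for this checkpoint):

```python
# Re-encode the input and decode with beam search instead of greedy decoding
inputs = tokenizer(
    f"question: {question} context: {context}",
    return_tensors="pt", truncation=True, max_length=512
).to(device)

with torch.no_grad():
    output = model.generate(
        **inputs,
        max_length=150,
        num_beams=4,          # explore 4 candidate sequences in parallel
        early_stopping=True,  # stop once all beams have finished
    )
print(tokenizer.decode(output[0], skip_special_tokens=True))
```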
## ⚡ Quantization Details

Post-training quantization was applied using PyTorch's native half-precision support. The model weights were converted to **Float16 (FP16)**, roughly halving the model size and improving inference efficiency with only a minor impact on accuracy.
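For reference, a minimal sketch of how such an FP16 conversion can be reproduced with stock PyTorch and Transformers calls (this illustrates the general technique; it is not necessarily the exact script used to produce this checkpoint, and the output path is illustrative):

```python
from transformers import T5ForConditionalGeneration

# Load the fine-tuned full-precision model, cast its weights to FP16, and save
model = T5ForConditionalGeneration.from_pretrained("t5-base")
model = model.half()  # casts all floating-point parameters from FP32 to FP16
model.save_pretrained("t5-base-qa-fp16")  # hypothetical output directory
```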
## 📂 Repository Structure

```
.
├── model/               # Contains the quantized model files
├── tokenizer_config/    # Tokenizer configuration and vocabulary files
├── model.safetensors    # Quantized model weights
└── README.md            # Model documentation
```
## ⚠️ Limitations

- The model may struggle with highly ambiguous questions or contexts.
- Quantization may lead to a slight degradation in accuracy compared to the full-precision model.
- Performance may vary across different writing styles and sentence structures.
## 🤝 Contributing

Contributions are welcome! Feel free to open an issue or submit a pull request if you have suggestions or improvements.