# RoBERTa Fine-Tuned Model for Question Answering
This repository hosts a fine-tuned version of the RoBERTa model optimized for question-answering tasks using the [SQuAD](w) dataset. The model is stored in FP16 to reduce its size and speed up inference, and it reaches an F1 score of 89.5 and an Exact Match of 82.4 on the SQuAD validation set.
## Model Details
- **Model Architecture**: RoBERTa
- **Task**: Question Answering
- **Dataset**: [SQuAD](w) (Stanford Question Answering Dataset)
- **Quantization**: FP16
- **Fine-tuning Framework**: Hugging Face Transformers

## 🚀 Usage

### Installation

```bash
pip install transformers torch
```

### Loading the Model

```python
from transformers import RobertaTokenizer, RobertaForQuestionAnswering
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

model_name = "AventIQ-AI/roberta-chatbot"
model = RobertaForQuestionAnswering.from_pretrained(model_name).to(device)
tokenizer = RobertaTokenizer.from_pretrained(model_name)
```

### Chatbot Inference

```python
from transformers import pipeline

# Build the QA pipeline from the model and tokenizer loaded above.
# Use GPU 0 only when CUDA is available; -1 selects the CPU.
pipeline_device = 0 if torch.cuda.is_available() else -1
qa_pipeline = pipeline("question-answering", model=model, tokenizer=tokenizer, device=pipeline_device)

# Sample context and question (flight information example)
context = "Flight AI101 departs from New York at 10:00 AM and arrives in San Francisco at 1:30 PM. The flight duration is 5 hours and 30 minutes."
question = "What is the duration of Flight AI101?"

# Get the answer; the result is a dict with "score", "start", "end", and "answer"
result = qa_pipeline(question=question, context=context)
print(result)
```
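
The pipeline returns a dictionary with the keys `score`, `start`, `end`, and `answer`, where `start` and `end` are character offsets into the context. The sketch below shows one way to use these fields. The helper `pick_answer`, its `min_score` threshold, and the hand-written result (including its score) are illustrative assumptions, not output produced by this model.

```python
from typing import Optional


def pick_answer(result: dict, context: str, min_score: float = 0.5) -> Optional[str]:
    """Return the answer span if the score clears the threshold, else None.

    `result` follows the question-answering pipeline output format:
    {"score": float, "start": int, "end": int, "answer": str}.
    `min_score` is a hypothetical threshold; tune it for your own data.
    """
    if result["score"] < min_score:
        return None
    # "start" and "end" are character offsets into the context string
    return context[result["start"]:result["end"]]


context = (
    "Flight AI101 departs from New York at 10:00 AM and arrives in "
    "San Francisco at 1:30 PM. The flight duration is 5 hours and 30 minutes."
)
answer_text = "5 hours and 30 minutes"
start = context.find(answer_text)

# Hand-written stand-in for a pipeline result; the score is made up.
example_result = {
    "score": 0.97,
    "start": start,
    "end": start + len(answer_text),
    "answer": answer_text,
}

print(pick_answer(example_result, context))
print(pick_answer({**example_result, "score": 0.1}, context))
```

Filtering on `score` is a simple way to avoid showing low-confidence answers in a chatbot; a suitable threshold depends on the application.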
 
## 📊 Evaluation Results

After fine-tuning the RoBERTa-base model for question answering, we evaluated the model's performance on the validation set from the SQuAD dataset. The following results were obtained:

| Metric | Score | Meaning |
|--------|-------|---------|
| F1 Score | 89.5 | Measures the balance between precision and recall for answer extraction. |
| Exact Match | 82.4 | Percentage of questions where the predicted answer matches the ground truth exactly. |

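
To make the two metrics concrete, here is a minimal sketch of how Exact Match and token-level F1 are typically computed for SQuAD-style answers. It is modeled on the usual SQuAD normalization (lowercasing, stripping punctuation and articles) and compares against a single reference answer; it is not the evaluation script used to produce the scores above. The function names are illustrative, and the reported scores correspond to these per-question values averaged and scaled to 0-100.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and articles, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, truth: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(truth))


def f1_score(prediction: str, truth: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction).split()
    truth_tokens = normalize(truth).split()
    if not pred_tokens or not truth_tokens:
        # Both empty counts as a match; only one empty counts as a miss
        return float(pred_tokens == truth_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(truth_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)


truth = "5 hours and 30 minutes"
print(exact_match("The 5 hours and 30 minutes.", truth))  # articles and punctuation are ignored
print(exact_match("5 hours", truth))                      # partial answers get no EM credit
print(round(f1_score("5 hours", truth), 3))               # but they do get partial F1 credit
```

This is why F1 is usually higher than Exact Match: a prediction that overlaps the reference without matching it exactly still earns partial credit.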
## Fine-Tuning Details

### Dataset

The SQuAD dataset, containing over 100,000 question-answer pairs based on Wikipedia articles, was used for fine-tuning the model.

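
For readers unfamiliar with the data, the sketch below shows the record layout used by the Hugging Face `datasets` version of SQuAD (`id`, `title`, `context`, `question`, and `answers` with parallel `text` and `answer_start` lists). The record itself is made up for illustration, and `answers_are_aligned` is a hypothetical helper, not part of any library.

```python
context = (
    "Flight AI101 departs from New York at 10:00 AM and arrives in "
    "San Francisco at 1:30 PM. The flight duration is 5 hours and 30 minutes."
)
answer = "5 hours and 30 minutes"

# A made-up record in the SQuAD layout; the id and title are hypothetical.
record = {
    "id": "example-0001",
    "title": "Flight_AI101",
    "context": context,
    "question": "What is the duration of Flight AI101?",
    "answers": {
        "text": [answer],
        "answer_start": [context.find(answer)],  # character offset into context
    },
}


def answers_are_aligned(rec: dict) -> bool:
    """Check that every answer text appears at its stated character offset."""
    ctx = rec["context"]
    pairs = zip(rec["answers"]["text"], rec["answers"]["answer_start"])
    return all(ctx[start:start + len(text)] == text for text, start in pairs)


print(answers_are_aligned(record))
```

Because answers are stored as character spans of the context, the model is trained to predict start and end positions rather than to generate free text.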
### Training

- Number of epochs: 3
- Batch size: 8
- Evaluation strategy: steps
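
As a rough guide to what these settings imply, the sketch below estimates the number of optimizer steps. It assumes a single device, no gradient accumulation, and a final partial batch that is kept. The example count and the evaluation interval are hypothetical placeholders, since the exact number of training features and the step interval are not stated here.

```python
import math


def training_steps(num_examples: int, batch_size: int = 8, epochs: int = 3) -> tuple:
    """Return (steps per epoch, total steps) for the given settings."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs


num_examples = 88_000  # hypothetical number of training features
eval_every = 500       # hypothetical evaluation interval, in steps

per_epoch, total = training_steps(num_examples)
print(f"steps per epoch: {per_epoch}")
print(f"total steps: {total}")
print(f"evaluations during training: {total // eval_every}")
```

With a step-based evaluation strategy, validation runs at a fixed step interval rather than once per epoch, so the interval determines how often metrics are reported.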

## ⚡ Quantization Details

Post-training quantization was applied using PyTorch's built-in quantization framework. The model was quantized to Float16 (FP16), which stores each weight in 16 bits instead of 32, to reduce model size and improve inference efficiency while keeping the loss in accuracy small.
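
The storage trade-off can be illustrated without any deep learning framework. The sketch below uses Python's standard `struct` module to pack a few made-up weight values as 32-bit and 16-bit floats, then compares the byte counts and the rounding error introduced by FP16. It is an illustration of the number format only, not the conversion procedure used for this model.

```python
import struct

# Made-up weight values, for illustration only
weights = [0.1, -1.2345678, 3.14159265, 1e-3]
n = len(weights)

# "f" is IEEE 754 single precision (4 bytes), "e" is half precision (2 bytes)
fp32_bytes = struct.pack(f"<{n}f", *weights)
fp16_bytes = struct.pack(f"<{n}e", *weights)

# Unpack the FP16 values to see what was lost to rounding
fp16_values = struct.unpack(f"<{n}e", fp16_bytes)
max_error = max(abs(w - r) for w, r in zip(weights, fp16_values))

print(f"FP32 size: {len(fp32_bytes)} bytes")
print(f"FP16 size: {len(fp16_bytes)} bytes")
print(f"largest FP16 rounding error: {max_error:.6f}")
```

The halved storage applies to every weight, so the saved model is roughly half the size of its FP32 counterpart, while each weight carries a small rounding error.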

## 📂 Repository Structure

```
.
├── model/               # Contains the quantized model files
├── tokenizer_config/    # Tokenizer configuration and vocabulary files
├── model.safetensors    # Quantized model weights
└── README.md            # Model documentation
```

## ⚠️ Limitations

- The model may struggle with highly ambiguous sentences.
- Quantization may lead to slight degradation in accuracy compared to full-precision models.
- Performance may vary across different writing styles and sentence structures.

## 🤝 Contributing

Contributions are welcome! Feel free to open an issue or submit a pull request if you have suggestions or improvements.