deberta-v3-nq-classification / README.md

mohamedsa1

metadata

language:
  - en
license: mit
library_name: transformers
tags:
  - text-classification
  - question-answering
  - deberta
  - deberta-v3
  - natural-questions
  - pytorch
  - transformers
  - kaggle
  - tensorflow2-qa
  - nq
datasets:
  - google/natural_questions
metrics:
  - accuracy
  - f1
  - precision
  - recall
pipeline_tag: text-classification
base_model: microsoft/deberta-v3-small
model-index:
  - name: deberta-v3-nq-classification
    results:
      - task:
          type: text-classification
          name: Question Answering Classification
        dataset:
          name: Natural Questions (Simplified)
          type: natural_questions
          config: simplified
          split: validation
        metrics:
          - type: accuracy
            value: 85.42
            name: Accuracy
          - type: f1
            value: 82.34
            name: Macro F1
          - type: precision
            value: 84.21
            name: Macro Precision
          - type: recall
            value: 83.67
            name: Macro Recall
widget:
  - text: >-
      Question: What is the capital of France? Context: Paris is the capital and
      most populous city of France, with an estimated population of 2,102,650
      residents as of 1 January 2023.
    example_title: Factual Question
  - text: >-
      Question: Is Paris the capital of France? Context: Paris is the capital
      and most populous city of France.
    example_title: Yes/No Question
  - text: >-
      Question: What is the population of Mars? Context: Earth is the third
      planet from the Sun and the only astronomical object known to harbor life.
    example_title: No Answer

DeBERTa-v3-Small for Natural Questions Classification

This model is a fine-tuned version of microsoft/deberta-v3-small on the Natural Questions dataset. It classifies question-context pairs into three categories (No Answer, Has Answer, or Yes/No) and reaches 85.42% accuracy and 82.34% macro F1 on the validation split of the simplified Natural Questions configuration.
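
The reported macro F1 (82.34) is lower than both the macro precision (84.21) and the macro recall (83.67) listed in the metadata. That is possible when macro F1 is computed as the mean of per-class F1 scores, which is how scikit-learn's `f1_score(average="macro")` defines it. Whether this model's evaluation used that definition is an assumption. A minimal sketch with made-up per-class values (not this model's results):

```python
# Illustrative per-class (precision, recall) pairs. These are made-up numbers,
# not measurements from this model.
per_class = {
    "No Answer": (1.0, 0.5),
    "Has Answer": (0.5, 1.0),
}

def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

n = len(per_class)
macro_precision = sum(p for p, _ in per_class.values()) / n
macro_recall = sum(r for _, r in per_class.values()) / n
# Macro F1 averages the per-class F1 scores, so it can fall below both
# the macro precision and the macro recall.
macro_f1 = sum(f1(p, r) for p, r in per_class.values()) / n

print(f"macro P={macro_precision:.3f} R={macro_recall:.3f} F1={macro_f1:.3f}")
```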

Model Details

Model Description

This is a DeBERTa-v3-Small model fine-tuned for question-answering classification. Given a question and a context passage, it predicts one of three labels:

🔴 No Answer (Label 0): The context doesn't contain an answer
🟢 Has Answer (Label 1): The context contains a specific answer
🔵 Yes/No (Label 2): The question requires a YES/NO response
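
The label ids above can be kept in one place so that downstream code does not depend on list positions. A minimal sketch: the `NQLabel` enum and the `decode_label` helper are illustrative names, not part of the released model or of `transformers`.

```python
from enum import IntEnum

class NQLabel(IntEnum):
    """Label ids as listed in this model card."""
    NO_ANSWER = 0
    HAS_ANSWER = 1
    YES_NO = 2

# Human-readable names, matching the wording used in this card.
LABEL_NAMES = {
    NQLabel.NO_ANSWER: "No Answer",
    NQLabel.HAS_ANSWER: "Has Answer",
    NQLabel.YES_NO: "Yes/No",
}

def decode_label(class_id: int) -> str:
    """Map a predicted class id to its name; raises ValueError for unknown ids."""
    return LABEL_NAMES[NQLabel(class_id)]

print(decode_label(2))
```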

The model was trained on the Natural Questions dataset as part of the TensorFlow 2.0 Question Answering Kaggle competition.

Developed by: [Your Name]
Funded by: Self-funded / Academic Project
Shared by: [Your Organization/University]
Model type: Transformer-based Sequence Classification (DeBERTa-v3)
Language(s) (NLP): English (en)
License: MIT
Finetuned from model: microsoft/deberta-v3-small

Model Sources

Repository: GitHub
Paper: DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training
Demo: Gradio Space

Uses

Direct Use

The model can be used directly for:

Question Answering System Pre-filtering: Filter out unanswerable questions before expensive processing
Search Result Classification: Determine if search results contain relevant answers
Customer Support Routing: Route questions based on answer availability
Educational Assessment: Evaluate if reading passages can answer questions
Information Retrieval: Assess document relevance for QA tasks
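
As a rough sketch of the pre-filtering use case, the function below keeps only the question-context pairs that a classifier does not label as No Answer (label 0). The `classify` callable stands in for a wrapper around this model and is an assumption of the example, as is `keyword_stub`, a toy classifier included only so the snippet runs without downloading weights.

```python
from typing import Callable, Iterable, List, Tuple

Pair = Tuple[str, str]  # (question, context)
NO_ANSWER = 0  # label id from this model card

def prefilter(pairs: Iterable[Pair], classify: Callable[[str, str], int]) -> List[Pair]:
    """Keep pairs whose predicted label is not No Answer."""
    return [(q, c) for q, c in pairs if classify(q, c) != NO_ANSWER]

def keyword_stub(question: str, context: str) -> int:
    """Toy stand-in for the model: label 1 if the last question word occurs in the context."""
    words = question.rstrip("?").split()
    return 1 if words and words[-1].lower() in context.lower() else NO_ANSWER

pairs = [
    ("What is the capital of France?", "Paris is the capital and most populous city of France."),
    ("What is the population of Mars?", "Earth is the third planet from the Sun."),
]
kept = prefilter(pairs, keyword_stub)
print(len(kept))
```

In practice `classify` would wrap the tokenizer and model calls and return the argmax class id.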

Downstream Use

The model serves as a foundation for:

Multi-stage QA Pipelines: First stage before extractive/generative QA models
Hybrid QA Systems: Combine with span extraction for end-to-end QA
Dialog Systems: Determine whether a chatbot has sufficient context to answer
Domain Adaptation: Fine-tune on domain-specific datasets

Out-of-Scope Use

❌ Not suitable for:

Extractive answer span prediction (only classifies, doesn't extract)
Generative question answering
Non-English languages
Long documents (inputs beyond 256 tokens are truncated)
Medical/legal decision-making
Fact verification

Bias, Risks, and Limitations

Limitations:

Input (question plus context) limited to 256 tokens; longer inputs are truncated
Wikipedia-biased training data
Trained on 10,000 examples (a subset of the full dataset)
May struggle with complex reasoning questions
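
One common workaround for the 256-token limit is to split a long context into overlapping windows and classify each window separately. The sketch below splits on whitespace words as a rough proxy for tokens. The window size, the overlap, and the `window_context` helper are assumptions for illustration, and real token counts should be checked with the model's tokenizer.

```python
from typing import List

def window_context(context: str, size: int = 150, overlap: int = 30) -> List[str]:
    """Split a context into overlapping word windows.

    Words only approximate subword tokens, so `size` is kept well below
    256 to leave room for the question and special tokens.
    """
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = context.split()
    step = size - overlap
    windows = []
    for start in range(0, max(len(words), 1), step):
        windows.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the current window already reaches the end of the context
    return windows

long_context = " ".join(f"w{i}" for i in range(400))
chunks = window_context(long_context)
# Each window is formatted the same way as the single-input example in this card.
inputs = [f"Question: What is w399? Context: {chunk}" for chunk in chunks]
print(len(inputs))
```

How to combine the per-window predictions (for example, treating the document as answerable if any window is) is a design choice that this card does not specify.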

Biases:

Better on factual "what/when/where" questions
Inherits biases from Wikipedia and base model
Performance varies across domains

Risks:

May be overconfident on ambiguous inputs
False negatives on complex phrasings
Vulnerable to adversarial examples

Recommendations

Users should:

✅ Implement human review for critical applications
✅ Monitor performance across different domains
✅ Calibrate confidence thresholds for their use case
✅ Test on representative samples
✅ Use as one component in multi-model systems
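
A simple way to act on the threshold recommendation is to abstain when the top class probability falls below a cutoff and send those inputs to human review. The sketch below is illustrative: the 0.7 threshold is an arbitrary placeholder to be tuned on held-out data, and `decide` is not part of any library.

```python
from typing import Optional, Sequence

LABELS = ["No Answer", "Has Answer", "Yes/No"]  # label order used in this model card

def decide(probs: Sequence[float], threshold: float = 0.7) -> Optional[str]:
    """Return the top label, or None to flag the input for human review."""
    if len(probs) != len(LABELS):
        raise ValueError("expected one probability per label")
    best = max(range(len(probs)), key=lambda i: probs[i])
    return LABELS[best] if probs[best] >= threshold else None

print(decide([0.05, 0.90, 0.05]))  # confident prediction
print(decide([0.40, 0.35, 0.25]))  # ambiguous input, returns None
```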

How to Get Started with the Model

Use the code below to get started with the model.

from transformers import DebertaV2Tokenizer, DebertaV2ForSequenceClassification
import torch

# Load model
model_name = "mohamedsa1/deberta-v3-nq-classification"
tokenizer = DebertaV2Tokenizer.from_pretrained(model_name)
model = DebertaV2ForSequenceClassification.from_pretrained(model_name)

# Prepare input
question = "What is the capital of France?"
context = "Paris is the capital and most populous city of France."
text = f"Question: {question} Context: {context}"

# Inference
inputs = tokenizer(text, return_tensors="pt", max_length=256, truncation=True, padding=True)
with torch.no_grad():
    outputs = model(**inputs)
    probs = torch.nn.functional.softmax(outputs.logits, dim=-1)[0]
    prediction = torch.argmax(probs).item()

# Results
labels = ["No Answer", "Has Answer", "Yes/No"]
print(f"Prediction: {labels[prediction]}")
print(f"Confidence: {probs[prediction]:.2%}")