---
language: en
license: apache-2.0
base_model: bert-base-uncased
tags:
- question-answering
- bert
- squad
- extractive-qa
datasets:
- rajpurkar/squad
metrics:
- exact_match
- f1
---

# BERT SQuAD Question Answering Model

`bert-base-uncased` fine-tuned on [SQuAD v1.1](https://huggingface.co/datasets/rajpurkar/squad)
for **extractive question answering**.

The model selects an answer span directly from the context paragraph you provide.
It does not generate new text: the answer must appear in the context.

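As a minimal sketch of what "extractive" means in practice, the check below confirms that an answer is exactly the slice of the context that its character offsets point to. The helper name `is_extractive` and the hand-written `result` dictionary are hypothetical illustrations, not part of this model or of `transformers`. The dictionary only mimics the `answer`/`start`/`end` fields of a question-answering result.

```python
def is_extractive(result: dict, context: str) -> bool:
    """Return True if the answer equals the context slice given by start/end."""
    return context[result["start"]:result["end"]] == result["answer"]


context = "France is a country in Western Europe. Its capital city is Paris."

# Hand-written stand-in for a model result (illustrative only).
start = context.find("Paris")
result = {"answer": "Paris", "start": start, "end": start + len("Paris")}

print(is_extractive(result, context))
```

A generated or paraphrased answer such as "The French capital" would fail this check, because that string does not occur anywhere in the context.
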
## Model Performance

Evaluated on 1,000 examples from the SQuAD v1.1 validation set:

| Metric | Score |
|---|---|
| Exact Match (EM) | 61.20 |
| F1 Score | 76.25 |

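For readers unfamiliar with these two metrics, here is a minimal sketch of how SQuAD-style EM and F1 are computed for a single prediction and reference pair. It follows the usual normalization (lowercase, strip punctuation and English articles, collapse whitespace). It is not the evaluation code used for this model card, and the function names are illustrative. The standard evaluation also takes the maximum over all reference answers for a question and averages over examples, which this sketch leaves out.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("Paris.", "paris"))
print(f1_score("the city of Paris", "Paris"))
```

The gap between EM and F1 in the table reflects predictions that overlap the reference span without matching it exactly: such answers earn partial F1 credit but no EM credit.
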
## How to Use

```python
from transformers import pipeline

# Load the fine-tuned checkpoint from the Hugging Face Hub
qa = pipeline("question-answering", model="argha9177/bert-squad-qa")

result = qa(
    question="What is the capital of France?",
    context="France is a country in Western Europe. Its capital city is Paris.",
)
print(result)
# Example output (the score is illustrative; start/end are character offsets):
# {'answer': 'Paris', 'score': 0.98, 'start': 59, 'end': 64}
```

## Input Format

- **question**: The question to answer (string)
- **context**: The paragraph containing the answer (string)
- The answer must appear verbatim in the context
- Maximum combined input length (question plus context): 384 tokens per window
- Longer contexts are split automatically into overlapping windows (stride of 128 tokens)

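The windowing described above can be sketched in plain Python. This is a simplified illustration rather than the tokenizer's actual implementation: `sliding_windows` is a hypothetical helper, integers stand in for tokens, and `num_special=3` assumes BERT's usual `[CLS] question [SEP] context [SEP]` layout. The stride is treated as the number of tokens shared by consecutive windows.

```python
def sliding_windows(context_tokens, question_len, max_len=384, stride=128, num_special=3):
    """Split context tokens into overlapping windows that fit beside the question.

    `stride` is the number of tokens shared by two consecutive windows.
    """
    budget = max_len - question_len - num_special  # room left for context tokens
    if budget <= stride:
        raise ValueError("question is too long for the chosen max_len and stride")
    windows = []
    start = 0
    while True:
        windows.append(context_tokens[start:start + budget])
        if start + budget >= len(context_tokens):
            break  # this window reached the end of the context
        start += budget - stride  # step forward, keeping `stride` tokens of overlap
    return windows


# Stand-in for a tokenized context of 1000 tokens and a 10-token question.
tokens = list(range(1000))
windows = sliding_windows(tokens, question_len=10)
print(len(windows), [len(w) for w in windows])
```

Because consecutive windows overlap, an answer that straddles one window boundary is still fully contained in a neighbouring window, provided it is no longer than the overlap.
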
## Training Details

| Parameter | Value |
|---|---|
| Base model | bert-base-uncased |
| Dataset | rajpurkar/squad (v1.1) |
| Training samples | 8,000 |
| Epochs | 2 |
| Batch size | 16 |
| Learning rate | 3e-5 |
| Max length | 384 |
| Doc stride | 128 |
| Warmup ratio | 0.1 |
| Optimizer | AdamW with linear LR decay |

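To make the schedule implied by these hyperparameters concrete, the sketch below derives approximate step counts and the learning rate at a given step. It assumes one training feature per sample, a single device, and no gradient accumulation. None of these is stated in the table, and sliding-window tokenization usually yields somewhat more features than samples, so the real step count was probably a little higher. The function `lr_at` is illustrative, not the training code.

```python
import math

NUM_SAMPLES = 8000
BATCH_SIZE = 16
EPOCHS = 2
BASE_LR = 3e-5
WARMUP_RATIO = 0.1

# Assumption: one feature per sample, no gradient accumulation.
steps_per_epoch = math.ceil(NUM_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Linear warmup from 0 to BASE_LR, then linear decay back to 0."""
    if step < warmup_steps:
        return BASE_LR * step / warmup_steps
    return BASE_LR * max(0, total_steps - step) / (total_steps - warmup_steps)


print(steps_per_epoch, total_steps, warmup_steps)
print(lr_at(50), lr_at(warmup_steps), lr_at(total_steps))
```

Under these assumptions the learning rate peaks at 3e-5 after the first 10% of optimizer steps and falls linearly to zero by the end of the second epoch.
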