hf-tuner
/

bert-mini-squadv2

Question Answering

Generated from Trainer

Model card Files Files and versions

Metrics Training metrics Community

hf-tuner commited on Nov 21, 2025

Commit

e51374b

·

verified ·

1 Parent(s): 5c73593

Update README.md

Files changed (1) hide show

README.md +31 -0

README.md CHANGED Viewed

@@ -24,13 +24,44 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [microsoft/MiniLM-L12-H384-uncased](https://huggingface.co/microsoft/MiniLM-L12-H384-uncased) on
 [hf-tuner/squad_v2.0.1](https://huggingface.co/datasets/hf-tuner/squad_v2.0.1) dataset.
 It achieves the following results on the evaluation set:
 - Loss: 1.4653
 ## Model description
 MiniLMv1-L12-H384-uncased: 12-layer, 384-hidden, 12-heads, 33M parameters, 2.7x faster than BERT-Base
 ## How to use
 ```python

 This model is a fine-tuned version of [microsoft/MiniLM-L12-H384-uncased](https://huggingface.co/microsoft/MiniLM-L12-H384-uncased) on
 [hf-tuner/squad_v2.0.1](https://huggingface.co/datasets/hf-tuner/squad_v2.0.1) dataset.
 It achieves the following results on the evaluation set:
 - Loss: 1.4653
+- Exact Match Accuracy: 60.94%
+## Evaluation Notes
+#### Issues with Exact Match Evaluation
+Several correct predictions were incorrectly marked as false negatives due to strict exact-match criteria being sensitive to minor differences in tokenization, formatting, or span boundaries:
+- Predicted: `isaac bashevis` → Rejected (expected: `isaac bashevis singer`)
+- Predicted: `newtonian equations` → Rejected (expected: `newtonian`)
+- Predicted: `80,000` → Rejected (expected: `80, 000`)
+#### Overall Performance
+- Exact-match accuracy: **>60%**
+- The model frequently generates high-quality and semantically correct answer spans even when exact-match evaluation penalizes them.
+- Primary limitation: performance drops on questions requiring deep domain-specific knowledge, largely attributable to the model's relatively small size and limited parameter capacity.
+#### Recommendations for Best Results
+- Use clear, straightforward phrasing in queries to maximize extraction accuracy.
 ## Model description
 MiniLMv1-L12-H384-uncased: 12-layer, 384-hidden, 12-heads, 33M parameters, 2.7x faster than BERT-Base
+#### Direct Use
+- Extractive Question Answering: Given a passage and a question, the model extracts the most likely span of text that answers the question.
+- Handles unanswerable questions by predicting "no answer" when appropriate.
+#### Downstream Use
+Can be integrated into chatbots, virtual assistants, or search systems that require question answering over text.
+#### Out-of-Scope Use
+- Generative question answering (the model cannot generate new answers).
+- Non-English tasks (the model was trained only on English data).
+- Open-Domain QA across large corpora — works best when the context passage is provided.
 ## How to use
 ```python