hf-tuner committed · Commit e51374b · verified · 1 Parent(s): 5c73593

Update README.md

Files changed (1): README.md (+31 -0)
This model is a fine-tuned version of [microsoft/MiniLM-L12-H384-uncased](https://huggingface.co/microsoft/MiniLM-L12-H384-uncased) on the
[hf-tuner/squad_v2.0.1](https://huggingface.co/datasets/hf-tuner/squad_v2.0.1) dataset.

It achieves the following results on the evaluation set:
- Loss: 1.4653
- Exact Match Accuracy: 60.94%

## Evaluation Notes

#### Issues with Exact Match Evaluation
Several correct predictions were counted as errors because the strict exact-match criterion is sensitive to minor differences in tokenization, formatting, and span boundaries:

- Predicted: `isaac bashevis` → Rejected (expected: `isaac bashevis singer`)
- Predicted: `newtonian equations` → Rejected (expected: `newtonian`)
- Predicted: `80,000` → Rejected (expected: `80, 000`)

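
The sensitivity described above comes from comparing normalized answer strings. Below is a minimal sketch of SQuAD-style normalization and exact match, assuming the standard lowercase / strip-punctuation / drop-articles / collapse-whitespace recipe; it is illustrative, not necessarily the exact evaluation script used for the numbers reported here:

```python
import re
import string


def normalize_answer(s: str) -> str:
    """SQuAD-style normalization: lowercase, strip punctuation,
    drop the articles a/an/the, and collapse whitespace."""
    s = s.lower()
    s = "".join(ch for ch in s if ch not in set(string.punctuation))
    s = re.sub(r"\b(a|an|the)\b", " ", s)
    return " ".join(s.split())


def exact_match(prediction: str, gold: str) -> int:
    """1 if the normalized strings are identical, else 0."""
    return int(normalize_answer(prediction) == normalize_answer(gold))


print(exact_match("isaac bashevis", "isaac bashevis singer"))  # 0
```

Note that even with normalization, `80,000` becomes `80000` while `80, 000` becomes `80 000`, so whitespace introduced by tokenization can still cause a mismatch, which matches the third rejected example above.
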
#### Overall Performance
- Exact-match accuracy: **>60%**
- The model frequently produces high-quality, semantically correct answer spans even when exact-match evaluation penalizes them.
- Primary limitation: performance drops on questions requiring deep domain-specific knowledge, largely attributable to the model's relatively small parameter count.

#### Recommendations for Best Results
- Use clear, straightforward phrasing in queries to maximize extraction accuracy.

## Model description

MiniLMv1-L12-H384-uncased: 12 layers, 384 hidden size, 12 attention heads, 33M parameters, 2.7x faster than BERT-Base.

#### Direct Use
- Extractive question answering: given a passage and a question, the model extracts the most likely span of text that answers the question.
- Handles unanswerable questions by predicting "no answer" when appropriate.

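
The "no answer" behavior can be illustrated with the usual SQuAD v2-style decoding rule: compare the best non-null span score against the null score. A self-contained sketch, assuming start/end logits are already available and that index 0 is the [CLS]/null position (the function name and threshold are illustrative, not this model's actual post-processing code):

```python
def best_span(start_logits, end_logits, max_answer_len=30, null_threshold=0.0):
    """Return (start, end) for the best answer span, or (0, 0) for "no answer".

    Index 0 is assumed to be the [CLS] position, whose score stands in for
    the null ("no answer") prediction, as in SQuAD v2-style decoding.
    """
    null_score = start_logits[0] + end_logits[0]
    best, best_score = (0, 0), float("-inf")
    n = len(start_logits)
    for i in range(1, n):
        # Only consider spans that start at or after i and stay within max_answer_len.
        for j in range(i, min(i + max_answer_len, n)):
            score = start_logits[i] + end_logits[j]
            if score > best_score:
                best_score, best = score, (i, j)
    # Predict a span only if it beats the null score by the threshold.
    if best_score - null_score > null_threshold:
        return best
    return (0, 0)  # "no answer"
```

With strong span logits the function returns that span; when the null score dominates it falls back to `(0, 0)`, i.e. "no answer".
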
#### Downstream Use
Can be integrated into chatbots, virtual assistants, or search systems that require question answering over text.

#### Out-of-Scope Use
- Generative question answering (the model extracts spans; it cannot generate new answers).
- Non-English tasks (the model was trained only on English data).
- Open-domain QA across large corpora: works best when a context passage is provided.

## How to use

```python