---
language:
- en
tags:
- question-answering
- qa
license: "apache-2.0"
datasets:
- squad
metrics:
- squad
---
# Description
This model was trained on the SQuAD v1.1 dataset from the MRQA Shared Task. The public dev set was split into two parts: one used as the dev set and one as the test set.

# Dev results:
- `eval_exact_match`: 88.15914715400723
- `eval_f1`: 93.91715796563734
- `eval_samples`: 5291

# Test results:
- `test_exact_match`: 86.52455272173582
- `test_f1`: 92.92134442432088
- `predict_samples`: 5294

More info in the paper:
**MetaQA: Combining Expert Agents for Multi-Skill Question Answering**
https://arxiv.org/abs/2112.01922
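# Metric definitions:
The exact-match and F1 scores reported in this card follow the SQuAD convention: answers are normalized (lowercased, punctuation and English articles removed, whitespace collapsed), exact match checks string equality, and F1 measures token overlap. The block below is a minimal sketch of that definition, not the evaluation code used to produce the numbers in this card. The function names (`normalize_answer`, `exact_match`, `token_f1`, `squad_scores`) and the sample answers are hypothetical and chosen only for illustration.

```python
import re
import string
from collections import Counter

_PUNCTUATION = set(string.punctuation)


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in _PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(gold))


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token-level precision and recall after normalization."""
    pred_tokens = normalize_answer(prediction).split()
    gold_tokens = normalize_answer(gold).split()
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def squad_scores(predictions, references):
    """Average the best per-question score over all questions, as percentages.

    `predictions` is a list of strings; `references` is a list of lists of
    gold answers, one inner list per question.
    """
    em_total = 0.0
    f1_total = 0.0
    for prediction, golds in zip(predictions, references):
        # Each question is scored against its best-matching gold answer.
        em_total += max(exact_match(prediction, gold) for gold in golds)
        f1_total += max(token_f1(prediction, gold) for gold in golds)
    count = len(predictions)
    return {
        "exact_match": 100.0 * em_total / count,
        "f1": 100.0 * f1_total / count,
    }


if __name__ == "__main__":
    # Made-up answers, purely illustrative (not taken from the dataset).
    predictions = ["the Eiffel Tower", "in 1889"]
    references = [["Eiffel Tower"], ["1889", "March 1889"]]
    print(squad_scores(predictions, references))
```

Under this definition, a prediction that differs from the gold answer only by an article or punctuation still counts as an exact match, while a prediction with extra or missing words receives partial F1 credit. This is why the F1 values in the results are higher than the exact-match values.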