---
language:
- en
license: apache-2.0
tags:
- question-answering
datasets:
- adversarial_qa
- mbartolo/synQA
- squad
metrics:
- exact_match
- f1
model-index:
- name: mbartolo/roberta-large-synqa
  results:
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squad
      type: squad
      config: plain_text
      split: validation
    metrics:
    - type: exact_match
      value: 89.6529
      name: Exact Match
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMjM2NTJjZDk0ZWEwM2Q5Njk5NmY5Mzk4ODk4OTViMjZlODlkMTM4M2ZlM2Q0YjgwMWY4OGUzM2QwYTk0YTBhMSIsInZlcnNpb24iOjF9.ZafZxhyJS2xpjYDMhyTO8wVmeZJrwbeJmyvZypMbhUJORR194GJwgttUp150XG3MUFVFqPYQh8tuzpm_QQ6sAA
    - type: f1
      value: 94.8172
      name: F1
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiYWE3OGFlYWZmZThhNDMwYzU4OTY3NmU4NmNlYTcwODVkZmQ3N2FlZmE0NGM2Mzk3Nzc2ZmZmNzhkM2NiNzNiMCIsInZlcnNpb24iOjF9.LF4-uxpGMMr7oP_C_SAYHgKMw6I9Sz8FiRnofaD9WFkQZrGPaPR1HjvC6sWo2Nyy5uuD76bowY278Qf8kWwLBw
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: adversarial_qa
      type: adversarial_qa
      config: adversarialQA
      split: validation
    metrics:
    - type: exact_match
      value: 55.3333
      name: Exact Match
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMmIwZTdmNzg4MmE5YjM2MzkxOWFmM2JmODMzZDhhZGY5YWE0Njc2MmY0YzIyNzEwMGU0MDIwOTZjZTEyZjk5YSIsInZlcnNpb24iOjF9.dNd-MElaXPRrYSgvzxcMyN87ts0iyON4mdQChv68AIspmQKAUKRVzdm7w0mhRyvzG8a7aDl7dgUFCZVxd7-FAQ
    - type: f1
      value: 66.7464
      name: F1
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMDU4ZTE0MTliNmJjYmFjZmI4MjEwMWRiMjJmZjhjYzBkY2Q0ZGUzMzZlMTZkNmFlZThmYzMyMThjN2IwMjI3NSIsInZlcnNpb24iOjF9.A4AxMaEXNDRZaR_ZazFH3PUhi-jn0JniWv7xEXGM3oidhR6hsWNi5twqegAAuZe56YDPxCUhuoGahovcWmoaBQ
---
# Model Overview
This is a RoBERTa-Large question answering (QA) model fine-tuned from https://huggingface.co/roberta-large in two stages. In the first stage, it is trained on synthetic adversarial data, generated by a BART-Large question generator from the Wikipedia passages in SQuAD. In the second stage, it is fine-tuned on SQuAD and AdversarialQA (https://arxiv.org/abs/2002.00293).
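
As a minimal usage sketch, the model can likely be loaded through the `transformers` question-answering pipeline. The model identifier is taken from this card's metadata; the helper name `answer_question` is hypothetical and introduced only for illustration, and this is not an officially documented usage snippet for this model.

```python
MODEL_ID = "mbartolo/roberta-large-synqa"  # model name as listed in this card's metadata


def answer_question(question: str, context: str) -> dict:
    """Run extractive QA and return the pipeline's result dictionary.

    Hypothetical helper: the result is expected to contain the keys
    'answer', 'score', 'start' and 'end'.
    """
    # Imported lazily so the module can be loaded without transformers installed.
    from transformers import pipeline

    # Downloads the model weights on first use.
    qa = pipeline("question-answering", model=MODEL_ID)
    return qa(question=question, context=context)
```

Calling `answer_question` with a question and a passage that contains the answer returns the extracted answer span together with a confidence score. For repeated queries it would be cheaper to build the pipeline once and reuse it.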
# Data
Training data: SQuAD + AdversarialQA
Evaluation data: SQuAD + AdversarialQA
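
The exact match and F1 scores in this card's metadata are the standard extractive QA metrics. The sketch below shows how they are conventionally computed in the SQuAD v1.1 style (lowercasing, stripping punctuation and English articles, then comparing tokens). It is an illustration under that assumption, not necessarily the exact code run by the automatic evaluator, and the function names are chosen here for readability.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    if not pred_tokens or not ref_tokens:
        # If either side is empty, score 1.0 only when both are empty.
        return float(pred_tokens == ref_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def best_over_references(metric, prediction: str, references: list) -> float:
    """Each question may have several gold answers; keep the best score."""
    return max(metric(prediction, ref) for ref in references)
```

Dataset-level scores are the mean of the per-question values, reported as percentages.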
# Training Process
Approximately 1 training epoch on the synthetic data, followed by 2 training epochs on the manually curated data (SQuAD and AdversarialQA).
# Additional Information
Please refer to https://arxiv.org/abs/2104.08678 for full details.