Commit a8bbb79
1 Parent(s): 5978ee2

Add verifyToken field to verify evaluation results are produced by Hugging Face's automatic model evaluator

Beep boop, I am a bot from Hugging Face's automatic model evaluator 👋! We've added a new `verifyToken` field to your evaluation results to verify that they are produced by the model evaluator. Accept this PR to ensure that your results remain listed as **verified** on the [Hub leaderboard](https://huggingface.co/spaces/autoevaluate/leaderboards).
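
The `verifyToken` values in this commit have the three dot-separated segments of a JWT, so their header and payload can be inspected with a plain base64url decode. The snippet below is a minimal sketch of that inspection, using the first token from the diff. The helper names `b64url_decode` and `inspect_verify_token` are hypothetical, and the meaning of the decoded `hash` and `version` fields is not documented in this commit. The sketch does not check the signature, because the evaluator's public key is not part of this page.

```python
import base64
import json

# First verifyToken from this commit (squad_v2 config, exact_match metric),
# copied verbatim and split at the dots for readability.
TOKEN = (
    "eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9."
    "eyJoYXNoIjoiZTk0ZGQ1YjU1YmQ5NTc2M2RmNjg2OGViYjcyODZkOTc1MDBkNmI5MDc0MzEyMzZmNDg3Yzc4ZTA3ZjAwM2M5ZiIsInZlcnNpb24iOjF9."
    "ewKF-UetUoxKDeXgnM6vqy8nBC9c3qh7dLZhdQlgSxPut3LjAhpCh2fJGir-OVcfzWzxsPhcZQEpdnxR8oZnAA"
)


def b64url_decode(segment: str) -> bytes:
    """Decode an unpadded base64url segment by restoring the '=' padding."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def inspect_verify_token(token: str) -> dict:
    """Decode the header and payload of a JWT-shaped token.

    This only reads the token; it does NOT verify the signature.
    """
    header_b64, payload_b64, signature_b64 = token.split(".")
    return {
        "header": json.loads(b64url_decode(header_b64)),
        "payload": json.loads(b64url_decode(payload_b64)),
        "signature": b64url_decode(signature_b64),
    }


info = inspect_verify_token(TOKEN)
print(info["header"])   # signing algorithm and token type
print(info["payload"])  # a hash string and a version number
```

Decoding is useful for seeing what the token carries, but it says nothing about authenticity: anyone can produce a well-formed payload, so only a signature check against the evaluator's key would confirm that a result was produced by the model evaluator.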
README.md CHANGED
@@ -1,5 +1,13 @@
 ---
+language: en
 license: cc-by-4.0
+tags:
+- question-answering
+datasets:
+- squad_v2
+metrics:
+- f1
+- exact
 widget:
 - context: DeBERTa improves the BERT and RoBERTa models using disentangled attention
     and enhanced mask decoder. With those two improvements, DeBERTa out perform RoBERTa
@@ -21,30 +29,22 @@ widget:
     for more implementation details and updates.
   example_title: DeBERTa v3 Q2
   text: Where do I go to see new info about DeBERTa?
-datasets:
-- squad_v2
-metrics:
-- f1
-- exact
-tags:
-- question-answering
-language: en
 model-index:
 - name: DeBERTa v3 xsmall squad2
   results:
   - task:
-      name: Question Answering
       type: question-answering
+      name: Question Answering
     dataset:
       name: SQuAD2.0
       type: question-answering
     metrics:
-    - name: f1
-      type: f1
+    - type: f1
       value: 81.5
-    - name: exact
-      type: exact
+      name: f1
+    - type: exact
       value: 78.3
+      name: exact
   - task:
       type: question-answering
       name: Question Answering
@@ -54,18 +54,21 @@ model-index:
       config: squad_v2
       split: validation
     metrics:
-    - name: Exact Match
-      type: exact_match
+    - type: exact_match
       value: 78.5341
+      name: Exact Match
       verified: true
-    - name: F1
-      type: f1
+      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZTk0ZGQ1YjU1YmQ5NTc2M2RmNjg2OGViYjcyODZkOTc1MDBkNmI5MDc0MzEyMzZmNDg3Yzc4ZTA3ZjAwM2M5ZiIsInZlcnNpb24iOjF9.ewKF-UetUoxKDeXgnM6vqy8nBC9c3qh7dLZhdQlgSxPut3LjAhpCh2fJGir-OVcfzWzxsPhcZQEpdnxR8oZnAA
+    - type: f1
       value: 81.6408
+      name: F1
       verified: true
-    - name: total
-      type: total
+      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiOTQwZDdjY2ZlOGVhM2E5NGM3OGNkNTk2NWFkYTg1Y2Q0YWFlYWJmMGIyZWM5ZjMyYTYyODUzMDA0NWU0ZGVkZCIsInZlcnNpb24iOjF9.BHJNhS1YisUIkjcpIMdwXurTewak9dkkpGXC2vHvUB4qUEuk_p3V-orhmeFyTxzLaWRwrZVGVz-NSfqFr4n1Ag
+    - type: total
       value: 11870
+      name: total
       verified: true
+      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNzNiZDQ3MDAyNzljMDI4NTRlYzZiZjE4ODJhZDhmZWE2ZjcwNjg2ZWJmNjUyMTUzZDk4ODNjNDExYTk1YWNlOCIsInZlcnNpb24iOjF9.3BlfmMvbV86Ua39ToqnMmgpGS0ZTew0UFFYWGyTkS3u7jaAXCfYkFkNJXw806f2uFFkKr1hqlzzKfivV0wUjCg
   - task:
       type: question-answering
       name: Question Answering
@@ -75,14 +78,16 @@ model-index:
       config: plain_text
       split: validation
     metrics:
-    - name: Exact Match
-      type: exact_match
+    - type: exact_match
       value: 84.1741
+      name: Exact Match
       verified: true
-    - name: F1
-      type: f1
+      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiYTA0MDVlYWI5NzdiNjllM2NmZTYwYmQ5YzE0ODgwOTA3MWZjZDkxNDFmZDM1OTQzMzgwNWI4NDc5NThhM2VhZSIsInZlcnNpb24iOjF9.lc2nUBxSu2_0_a5lyVsV51UAmkE8WHDTwGHvt3n9zvCbcJ1ylOg2xovF0_j0hZS16lv1DEw5XV8EW_ZS7mfvBg
+    - type: f1
       value: 91.0771
+      name: F1
       verified: true
+      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiODQxMjkxOWJlZTc2MmE5YzVmMjNhOTkwNDdiMDBhNWUwMDU3MDI1MmJiNDY4MjczYjIwM2U1NDhlYmZlZWQwMSIsInZlcnNpb24iOjF9.x_axHiBX5d3UIi1UbJT3kVbdX4kX9XFLQSg-l16-AAK9tiyutT-yaYJOi8LSb2lR4677tJpf3itu4eriJRU2Cg
 ---