---
license: apache-2.0
pipeline_tag: token-classification
---
An example of using an ensemble of models is shown in the `main.py` file.

Code for this project: https://github.com/Misha24-10/semeval_ner/tree/main
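
For readers unfamiliar with ensembling token-classification models, the following is a minimal sketch of one common approach, majority voting over per-token labels. It is an illustration only and is not taken from `main.py`, which may combine models differently; the function name `majority_vote` and the tie-breaking rule are assumptions made for this example.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-token label sequences from several models by majority vote.

    `predictions` holds one label sequence per model; all sequences must have
    the same length. Ties go to the label predicted by the earliest model.
    (Hypothetical helper for illustration, not part of the project code.)
    """
    if not predictions:
        raise ValueError("at least one prediction sequence is required")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all prediction sequences must have the same length")

    combined = []
    for token_labels in zip(*predictions):
        counts = Counter(token_labels)
        best = max(counts.values())
        # Pick the first label (in model order) that reaches the top count.
        combined.append(next(l for l in token_labels if counts[l] == best))
    return combined


# Three hypothetical models labelling a two-token sentence.
votes = [
    ["B-PER", "O"],
    ["B-PER", "B-LOC"],
    ["O", "B-LOC"],
]
print(majority_vote(votes))
```

Voting on each token independently can produce label sequences that are not valid under a BIO scheme, so a real pipeline would usually vote on whole entity spans or repair the sequence afterwards.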

Low-level (fine-grained) classification results on the MultiCoNER II test set:

| Class                     | Precision | Recall | F1     |
|---------------------------|-----------|--------|--------|
| Facility                  | 0.7464    | 0.7321 | 0.7392 |
| OtherLOC                  | 0.7932    | 0.7068 | 0.7475 |
| HumanSettlement           | 0.899     | 0.8948 | 0.8969 |
| Station                   | 0.8318    | 0.8125 | 0.8221 |
| VisualWork                | 0.8528    | 0.8319 | 0.8422 |
| MusicalWork               | 0.8025    | 0.7813 | 0.7917 |
| WrittenWork               | 0.7766    | 0.728  | 0.7515 |
| ArtWork                   | 0.6374    | 0.5528 | 0.5921 |
| Software                  | 0.8476    | 0.8201 | 0.8336 |
| MusicalGRP                | 0.8185    | 0.8207 | 0.8196 |
| PublicCorp                | 0.7853    | 0.7572 | 0.771  |
| PrivateCorp               | 0.7362    | 0.6896 | 0.7121 |
| AerospaceManufacturer     | 0.6774    | 0.7541 | 0.7137 |
| SportsGRP                 | 0.8715    | 0.8938 | 0.8825 |
| CarManufacturer           | 0.7617    | 0.7902 | 0.7757 |
| ORG                       | 0.7617    | 0.7371 | 0.7492 |
| Scientist                 | 0.5338    | 0.4886 | 0.5102 |
| Artist                    | 0.7971    | 0.8369 | 0.8165 |
| Athlete                   | 0.8094    | 0.802  | 0.8057 |
| Politician                | 0.7115    | 0.6194 | 0.6622 |
| Cleric                    | 0.7349    | 0.6239 | 0.6748 |
| SportsManager             | 0.678     | 0.6097 | 0.6421 |
| OtherPER                  | 0.5354    | 0.5915 | 0.562  |
| Clothing                  | 0.6326    | 0.6876 | 0.659  |
| Vehicle                   | 0.6699    | 0.6608 | 0.6653 |
| Food                      | 0.6814    | 0.6634 | 0.6723 |
| Drink                     | 0.6859    | 0.7203 | 0.7027 |
| OtherPROD                 | 0.7033    | 0.6638 | 0.683  |
| Medication/Vaccine        | 0.7943    | 0.816  | 0.805  |
| MedicalProcedure          | 0.7481    | 0.7375 | 0.7428 |
| AnatomicalStructure       | 0.7765    | 0.7567 | 0.7664 |
| Symptom                   | 0.6086    | 0.7178 | 0.6587 |
| Disease                   | 0.7977    | 0.7719 | 0.7846 |
| Macro Average Performance | 0.7423    | 0.7294 | 0.7349 |



High-level (coarse-grained) classification results on the MultiCoNER II test set:

| Class                     | Precision | Recall | F1     |
|---------------------------|-----------|--------|--------|
| LOC                       | 0.8866    | 0.8732 | 0.8798 |
| Medicine                  | 0.794     | 0.7927 | 0.7934 |
| GRP                       | 0.8489    | 0.8419 | 0.8454 |
| PROD                      | 0.7449    | 0.7247 | 0.7347 |
| PER                       | 0.9346    | 0.939  | 0.9368 |
| CW                        | 0.8507    | 0.8162 | 0.8331 |
| Macro Average Performance | 0.8433    | 0.8313 | 0.8372 |
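
The "Macro Average Performance" row is the unweighted mean of the per-class scores, so every class counts equally regardless of how many entities it has. The short sketch below recomputes that row from the six high-level classes in the table above; the dictionary layout and the `macro_average` helper are assumptions made for this illustration and are not part of the project code.

```python
# Per-class (precision, recall, F1) copied from the high-level table,
# written with decimal points.
high_level_scores = {
    "LOC": (0.8866, 0.8732, 0.8798),
    "Medicine": (0.794, 0.7927, 0.7934),
    "GRP": (0.8489, 0.8419, 0.8454),
    "PROD": (0.7449, 0.7247, 0.7347),
    "PER": (0.9346, 0.939, 0.9368),
    "CW": (0.8507, 0.8162, 0.8331),
}


def macro_average(scores):
    """Return the unweighted mean of each metric across classes."""
    n = len(scores)
    columns = zip(*scores.values())  # one tuple per metric
    return tuple(round(sum(col) / n, 4) for col in columns)


macro_precision, macro_recall, macro_f1 = macro_average(high_level_scores)
print(macro_precision, macro_recall, macro_f1)  # 0.8433 0.8313 0.8372
```

Note that the macro F1 here is the mean of the per-class F1 values, not the harmonic mean of macro precision and macro recall.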


MultiCoNER II features complex NER in these languages:
1. English
2. Spanish
3. Hindi
4. Bangla
5. Chinese
6. Swedish
7. Farsi
8. French
9. Italian
10. Portuguese
11. Ukrainian
12. German

Overall macro F1 score for low-level entity classification, by language:

| Language | F1     |
|----------|--------|
| PT       | 0.6872 |
| IT       | 0.7441 |
| UK       | 0.7199 |
| BN       | 0.7320 |
| FA       | 0.6404 |
| ES       | 0.7230 |
| FR       | 0.7289 |
| DE       | 0.7164 |
| EN       | 0.7069 |
| HI       | 0.7544 |
| ZH       | 0.5899 |
| SV       | 0.7385 |