manu02 committed on
Commit 068a6cb · verified · 1 Parent(s): 86a5481

Refresh LAnA collection comparison tables

Files changed (1)
  1. README.md +25 -25
README.md CHANGED
@@ -102,39 +102,39 @@ These comparison tables are refreshed across the full LAnA collection whenever a
 
  ### Cross-Model Comparison: All Frontal Test Studies
 
- | Metric | LAnA-MIMIC-CHEXPERT | LAnA-MIMIC | LAnA | LAnA-v2 | LAnA-v3 | LAnA-v4 (Model still training) |
+ | Metric | LAnA-MIMIC-CHEXPERT | LAnA-MIMIC | LAnA | LAnA-v2 | LAnA-v3 | LAnA-v4 |
  | --- | --- | --- | --- | --- | --- | --- |
- | Run status | `Completed` | `Completed` | `Completed` | `Completed` | `Completed` | `Model still training` |
+ | Run status | `Completed` | `Completed` | `Completed` | `Completed` | `Completed` | `Completed` |
  | Number of studies | `3041` | `3041` | `3041` | `3041` | `3041` | `3041` |
- | ROUGE-L | `0.1513` | `0.1653` | `0.1686` | `0.1670` | `0.1745` | `0.1676` |
- | BLEU-1 | `0.1707` | `0.1916` | `0.2091` | `0.2174` | `0.2346` | `0.2247` |
- | BLEU-4 | `0.0357` | `0.0386` | `0.0417` | `0.0417` | `0.0484` | `0.0439` |
- | METEOR | `0.2079` | `0.2202` | `0.2298` | `0.2063` | `0.2129` | `0.2005` |
- | RadGraph F1 | `0.0918` | `0.0921` | `0.1024` | `0.1057` | `0.0939` | `0.0792` |
- | RadGraph entity F1 | `0.1399` | `0.1459` | `0.1587` | `0.1569` | `0.1441` | `0.1443` |
- | RadGraph relation F1 | `0.1246` | `0.1322` | `0.1443` | `0.1474` | `0.1280` | `0.1299` |
- | CheXpert F1 14-micro | `0.1829` | `0.1565` | `0.2116` | `0.1401` | `0.3116` | `0.2228` |
- | CheXpert F1 5-micro | `0.2183` | `0.1530` | `0.2512` | `0.2506` | `0.2486` | `0.0549` |
- | CheXpert F1 14-macro | `0.1095` | `0.0713` | `0.1095` | `0.0401` | `0.1363` | `0.0736` |
- | CheXpert F1 5-macro | `0.1634` | `0.1007` | `0.1644` | `0.1004` | `0.1686` | `0.0342` |
+ | ROUGE-L | `0.1513` | `0.1653` | `0.1686` | `0.1670` | `0.1745` | `0.1675` |
+ | BLEU-1 | `0.1707` | `0.1916` | `0.2091` | `0.2174` | `0.2346` | `0.2244` |
+ | BLEU-4 | `0.0357` | `0.0386` | `0.0417` | `0.0417` | `0.0484` | `0.0441` |
+ | METEOR | `0.2079` | `0.2202` | `0.2298` | `0.2063` | `0.2129` | `0.2002` |
+ | RadGraph F1 | `0.0918` | `0.0921` | `0.1024` | `0.1057` | `0.0939` | `0.0794` |
+ | RadGraph entity F1 | `0.1399` | `0.1459` | `0.1587` | `0.1569` | `0.1441` | `0.1437` |
+ | RadGraph relation F1 | `0.1246` | `0.1322` | `0.1443` | `0.1474` | `0.1280` | `0.1293` |
+ | CheXpert F1 14-micro | `0.1829` | `0.1565` | `0.2116` | `0.1401` | `0.3116` | `0.2196` |
+ | CheXpert F1 5-micro | `0.2183` | `0.1530` | `0.2512` | `0.2506` | `0.2486` | `0.0538` |
+ | CheXpert F1 14-macro | `0.1095` | `0.0713` | `0.1095` | `0.0401` | `0.1363` | `0.0724` |
+ | CheXpert F1 5-macro | `0.1634` | `0.1007` | `0.1644` | `0.1004` | `0.1686` | `0.0333` |
 
  ### Cross-Model Comparison: Findings-Only Frontal Test Studies
 
- | Metric | LAnA-MIMIC-CHEXPERT | LAnA-MIMIC | LAnA | LAnA-v2 | LAnA-v3 | LAnA-v4 (Model still training) |
+ | Metric | LAnA-MIMIC-CHEXPERT | LAnA-MIMIC | LAnA | LAnA-v2 | LAnA-v3 | LAnA-v4 |
  | --- | --- | --- | --- | --- | --- | --- |
- | Run status | `Completed` | `Completed` | `Completed` | `Completed` | `Completed` | `Model still training` |
+ | Run status | `Completed` | `Completed` | `Completed` | `Completed` | `Completed` | `Completed` |
  | Number of studies | `2210` | `2210` | `2210` | `2210` | `2210` | `2210` |
- | ROUGE-L | `0.1576` | `0.1720` | `0.1771` | `0.1771` | `0.1848` | `0.1752` |
- | BLEU-1 | `0.1754` | `0.2003` | `0.2177` | `0.2263` | `0.2480` | `0.2343` |
- | BLEU-4 | `0.0405` | `0.0449` | `0.0484` | `0.0487` | `0.0573` | `0.0508` |
- | METEOR | `0.2207` | `0.2347` | `0.2466` | `0.2240` | `0.2310` | `0.2138` |
- | RadGraph F1 | `0.1010` | `0.1000` | `0.1119` | `0.1181` | `0.1046` | `0.0900` |
- | RadGraph entity F1 | `0.1517` | `0.1577` | `0.1713` | `0.1739` | `0.1584` | `0.1567` |
+ | ROUGE-L | `0.1576` | `0.1720` | `0.1771` | `0.1771` | `0.1848` | `0.1753` |
+ | BLEU-1 | `0.1754` | `0.2003` | `0.2177` | `0.2263` | `0.2480` | `0.2337` |
+ | BLEU-4 | `0.0405` | `0.0449` | `0.0484` | `0.0487` | `0.0573` | `0.0509` |
+ | METEOR | `0.2207` | `0.2347` | `0.2466` | `0.2240` | `0.2310` | `0.2137` |
+ | RadGraph F1 | `0.1010` | `0.1000` | `0.1119` | `0.1181` | `0.1046` | `0.0906` |
+ | RadGraph entity F1 | `0.1517` | `0.1577` | `0.1713` | `0.1739` | `0.1584` | `0.1566` |
  | RadGraph relation F1 | `0.1347` | `0.1413` | `0.1549` | `0.1628` | `0.1405` | `0.1410` |
- | CheXpert F1 14-micro | `0.1651` | `0.1442` | `0.1907` | `0.1365` | `0.2921` | `0.2229` |
- | CheXpert F1 5-micro | `0.2152` | `0.1716` | `0.2415` | `0.2455` | `0.2394` | `0.0566` |
- | CheXpert F1 14-macro | `0.1047` | `0.0700` | `0.1039` | `0.0381` | `0.1326` | `0.0724` |
- | CheXpert F1 5-macro | `0.1611` | `0.1112` | `0.1578` | `0.0952` | `0.1636` | `0.0351` |
+ | CheXpert F1 14-micro | `0.1651` | `0.1442` | `0.1907` | `0.1365` | `0.2921` | `0.2205` |
+ | CheXpert F1 5-micro | `0.2152` | `0.1716` | `0.2415` | `0.2455` | `0.2394` | `0.0555` |
+ | CheXpert F1 14-macro | `0.1047` | `0.0700` | `0.1039` | `0.0381` | `0.1326` | `0.0714` |
+ | CheXpert F1 5-macro | `0.1611` | `0.1112` | `0.1578` | `0.0952` | `0.1636` | `0.0342` |
 
  ## Data
 