manu02 committed on
Commit
cff46be
·
verified ·
1 Parent(s): 14ac457

Upload benchmarked LANA model

Files changed (3)
  1. README.md +35 -53
  2. model.safetensors +1 -1
  3. run_summary.json +15 -132
README.md CHANGED
@@ -98,43 +98,34 @@ print(report)
98
 
99
  Frontal-only evaluation using `PA/AP` studies only.
100
 
101
- These comparison tables are refreshed across the full LAnA collection whenever any collection model is evaluated.
102
-
103
- ### Cross-Model Comparison: All Frontal Test Studies
104
-
105
- | Metric | LAnA-MIMIC-CHEXPERT | LAnA-MIMIC | LAnA | LAnA-v2 | LAnA-v3 | LAnA-v4 (Model still training) |
106
- | --- | --- | --- | --- | --- | --- | --- |
107
- | Run status | `Completed` | `Completed` | `Completed` | `Completed` | `Completed` | `Model still training` |
108
- | Number of studies | `3041` | `3041` | `3041` | `3041` | `3041` | `3041` |
109
- | ROUGE-L | `0.1513` | `0.1653` | `0.1686` | `0.1670` | `0.1745` | `0.1676` |
110
- | BLEU-1 | `0.1707` | `0.1916` | `0.2091` | `0.2174` | `0.2346` | `0.2247` |
111
- | BLEU-4 | `0.0357` | `0.0386` | `0.0417` | `0.0417` | `0.0484` | `0.0439` |
112
- | METEOR | `0.2079` | `0.2202` | `0.2298` | `0.2063` | `0.2129` | `0.2005` |
113
- | RadGraph F1 | `0.0918` | `0.0921` | `0.1024` | `0.1057` | `0.0939` | `0.0792` |
114
- | RadGraph entity F1 | `0.1399` | `0.1459` | `0.1587` | `0.1569` | `0.1441` | `0.1443` |
115
- | RadGraph relation F1 | `0.1246` | `0.1322` | `0.1443` | `0.1474` | `0.1280` | `0.1299` |
116
- | CheXpert F1 14-micro | `0.1829` | `0.1565` | `0.2116` | `0.1401` | `0.3116` | `0.2228` |
117
- | CheXpert F1 5-micro | `0.2183` | `0.1530` | `0.2512` | `0.2506` | `0.2486` | `0.0549` |
118
- | CheXpert F1 14-macro | `0.1095` | `0.0713` | `0.1095` | `0.0401` | `0.1363` | `0.0736` |
119
- | CheXpert F1 5-macro | `0.1634` | `0.1007` | `0.1644` | `0.1004` | `0.1686` | `0.0342` |
120
-
121
- ### Cross-Model Comparison: Findings-Only Frontal Test Studies
122
-
123
- | Metric | LAnA-MIMIC-CHEXPERT | LAnA-MIMIC | LAnA | LAnA-v2 | LAnA-v3 | LAnA-v4 (Model still training) |
124
- | --- | --- | --- | --- | --- | --- | --- |
125
- | Run status | `Completed` | `Completed` | `Completed` | `Completed` | `Completed` | `Model still training` |
126
- | Number of studies | `2210` | `2210` | `2210` | `2210` | `2210` | `2210` |
127
- | ROUGE-L | `0.1576` | `0.1720` | `0.1771` | `0.1771` | `0.1848` | `0.1752` |
128
- | BLEU-1 | `0.1754` | `0.2003` | `0.2177` | `0.2263` | `0.2480` | `0.2343` |
129
- | BLEU-4 | `0.0405` | `0.0449` | `0.0484` | `0.0487` | `0.0573` | `0.0508` |
130
- | METEOR | `0.2207` | `0.2347` | `0.2466` | `0.2240` | `0.2310` | `0.2138` |
131
- | RadGraph F1 | `0.1010` | `0.1000` | `0.1119` | `0.1181` | `0.1046` | `0.0900` |
132
- | RadGraph entity F1 | `0.1517` | `0.1577` | `0.1713` | `0.1739` | `0.1584` | `0.1567` |
133
- | RadGraph relation F1 | `0.1347` | `0.1413` | `0.1549` | `0.1628` | `0.1405` | `0.1410` |
134
- | CheXpert F1 14-micro | `0.1651` | `0.1442` | `0.1907` | `0.1365` | `0.2921` | `0.2229` |
135
- | CheXpert F1 5-micro | `0.2152` | `0.1716` | `0.2415` | `0.2455` | `0.2394` | `0.0566` |
136
- | CheXpert F1 14-macro | `0.1047` | `0.0700` | `0.1039` | `0.0381` | `0.1326` | `0.0724` |
137
- | CheXpert F1 5-macro | `0.1611` | `0.1112` | `0.1578` | `0.0952` | `0.1636` | `0.0351` |
138
 
139
  ## Data
140
 
@@ -147,15 +138,6 @@ These comparison tables are refreshed across the full LAnA collection whenever a
147
 
148
  - Medical report metrics implemented in the repository include RadGraph F1 and CheXpert F1 (`14-micro`, `5-micro`, `14-macro`, `5-macro`).
149
 
150
- ## Experiment Model Descriptions
151
-
152
- - `LAnA-MIMIC-CHEXPERT`: This variant was trained on a combined dataset of `CheXpert` and `MIMIC-CXR` using LoRA fine-tuning with the `AdamW` optimizer.
153
- - `LAnA-MIMIC`: This model was trained on the `MIMIC-CXR (findings-only)` dataset using LoRA fine-tuning with the `AdamW` optimizer.
154
- - `LAnA`: This model was trained on the `MIMIC-CXR (findings-only)` dataset using full-model optimization with `AdamW` instead of LoRA.
155
- - `LAnA-v2`: This version keeps the same training setup as `LAnA`, but increases the effective global batch size from `16` to `128`.
156
- - `LAnA-v3`: This version keeps the same training setup as `LAnA`, including the effective global batch size of `16`, but changes how EOS is handled so training and generation follow the same behavior. The model no longer uses the EOS token during training, and generation remained greedy without stopping when an EOS token was produced. In the previous setup, decoding was also greedy, stopped at EOS, and used a maximum of `128` new tokens.
157
- - `LAnA-v4`: This version keeps the same decoding behavior as `LAnA-v3`, but increases the effective global batch size from `16` to `128`.
158
-
159
  ## Training Snapshot
160
 
161
  - Run: `LAnA-v4`
@@ -171,24 +153,24 @@ These comparison tables are refreshed across the full LAnA collection whenever a
171
  - Scheduler: `cosine`
172
  - Warmup steps: `165`
173
  - Weight decay: `0.01`
174
- - Steps completed: `3075`
175
  - Planned total steps: `3297`
176
- - Images seen: `394249`
177
- - Total training time: `7.5001` hours
178
  - Hardware: `NVIDIA GeForce RTX 5070`
179
- - Final train loss: `1.1786`
180
- - Validation loss: `1.6553`
181
 
182
  ## Status
183
 
184
  - Project status: `Training in progress`
185
  - Release status: `Research preview checkpoint`
186
  - Current checkpoint status: `Not final`
187
- - Training completion toward planned run: `93.49%` (`3` / `3` epochs)
188
  - Current published metrics are intermediate and will change as training continues.
189
 
190
  ## Notes
191
 
192
  - Set `HF_TOKEN` with permission to access the DINOv3 repositories required by this model before downloading or running inference.
193
  - `segmenters/` contains the lung and heart segmentation checkpoints used to build anatomical attention masks.
194
- - `evaluations/mimic_test_metrics.json` contains the latest saved MIMIC test metrics.
 
98
 
99
  Frontal-only evaluation using `PA/AP` studies only.
100
 
101
+ ### Current Checkpoint Results
102
+
103
+ | Metric | Value |
104
+ | --- | --- |
105
+ | Number of studies | TBD |
106
+ | RadGraph F1 | TBD |
107
+ | RadGraph entity F1 | TBD |
108
+ | RadGraph relation F1 | TBD |
109
+ | CheXpert F1 14-micro | TBD |
110
+ | CheXpert F1 5-micro | TBD |
111
+ | CheXpert F1 14-macro | TBD |
112
+ | CheXpert F1 5-macro | TBD |
113
+
114
+ ### Final Completed Training Results
115
+
116
+ The final table will be populated when the planned training run is completed. Until then, final-report metrics remain `TBD`.
117
+
118
+ | Metric | Value |
119
+ | --- | --- |
120
+ | Number of studies | TBD |
121
+ | RadGraph F1 | TBD |
122
+ | RadGraph entity F1 | TBD |
123
+ | RadGraph relation F1 | TBD |
124
+ | CheXpert F1 14-micro | TBD |
125
+ | CheXpert F1 5-micro | TBD |
126
+ | CheXpert F1 14-macro | TBD |
127
+ | CheXpert F1 5-macro | TBD |
128
+
129
 
130
  ## Data
131
 
 
138
 
139
  - Medical report metrics implemented in the repository include RadGraph F1 and CheXpert F1 (`14-micro`, `5-micro`, `14-macro`, `5-macro`).
140
 
141
  ## Training Snapshot
142
 
143
  - Run: `LAnA-v4`
 
153
  - Scheduler: `cosine`
154
  - Warmup steps: `165`
155
  - Weight decay: `0.01`
156
+ - Steps completed: `3289`
157
  - Planned total steps: `3297`
158
+ - Images seen: `421707`
159
+ - Total training time: `8.0982` hours
160
  - Hardware: `NVIDIA GeForce RTX 5070`
161
+ - Final train loss: `1.9641`
162
+ - Validation loss: `1.6446`
163
 
164
  ## Status
165
 
166
  - Project status: `Training in progress`
167
  - Release status: `Research preview checkpoint`
168
  - Current checkpoint status: `Not final`
169
+ - Training completion toward planned run: `100.00%` (`3` / `3` epochs)
170
  - Current published metrics are intermediate and will change as training continues.
171
 
172
  ## Notes
173
 
174
  - Set `HF_TOKEN` with permission to access the DINOv3 repositories required by this model before downloading or running inference.
175
  - `segmenters/` contains the lung and heart segmentation checkpoints used to build anatomical attention masks.
176
+ - `evaluations/mimic_test_metrics.json` contains the latest saved MIMIC test metrics.
model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:ed2b21d21d27a25ef8f3d4e7cc0145fc7266a768bb4397366166f39007f3e563
3
  size 1152546464
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:cd099dd6604efe4ed12b2aea6b1a0e80cf53a4e0c139bc4a13d77e4a19d915ae
3
  size 1152546464
run_summary.json CHANGED
@@ -1,18 +1,18 @@
1
  {
2
  "method": "full_adamw",
3
  "run_name": "LAnA-v4",
4
- "steps": 3075,
5
- "epochs_completed": 2,
6
- "epoch_index": 2,
7
  "target_epochs": 3,
8
- "progress_epochs": 2.8046653245025572,
9
- "training_completion_percent": 93.48884415008524,
10
- "elapsed_seconds": 27000.266715600002,
11
- "images_seen": 394249,
12
- "train_loss_last": 1.1785972118377686,
13
- "train_loss_mean": 2.037826863495392,
14
- "val_loss": 1.6553147077560424,
15
- "images_per_second": 14.601670574321176,
16
  "trainable_params": 125522688,
17
  "vision_model_name": "facebook/dinov3-vits16-pretrain-lvd1689m",
18
  "text_model_name": "gpt2",
@@ -37,129 +37,12 @@
37
  "seed": 42,
38
  "resume_supported": true,
39
  "checkpoint_every_n_steps": 1000,
40
- "cumulative_loss_sum": 803411.2031061947,
41
- "cumulative_loss_count": 394249,
42
- "completed": false,
43
  "target_duration_seconds": 3600,
44
  "target_duration_mode": "per_invocation",
45
  "repo_id": "manu02/LAnA-v4",
46
  "train_datasets": "MIMIC-CXR (findings-only)",
47
- "validation_datasets": "MIMIC-CXR (findings-only)",
48
- "repo_url": "https://huggingface.co/manu02/LAnA-v4",
49
- "latest_evaluation": {
50
- "split": "test",
51
- "subset": "all frontal studies",
52
- "dataset": "mimic-cxr",
53
- "view_filter": "frontal-only (PA/AP)",
54
- "num_examples": 3041,
55
- "bleu_1": 0.22466909304042493,
56
- "bleu_4": 0.043919975602752334,
57
- "meteor": 0.20049977335710334,
58
- "rouge_l": 0.16756058992939854,
59
- "chexpert_f1_14_micro": 0.22279554040357505,
60
- "chexpert_f1_5_micro": 0.05494209534069486,
61
- "chexpert_f1_14_macro": 0.07355641991775376,
62
- "chexpert_f1_5_macro": 0.034170854271356785,
63
- "chexpert_f1_micro": 0.22279554040357505,
64
- "chexpert_f1_macro": 0.07355641991775376,
65
- "chexpert_per_label_f1": {
66
- "Enlarged Cardiomediastinum": 0.0,
67
- "Cardiomegaly": 0.1708542713567839,
68
- "Lung Opacity": 0.0,
69
- "Lung Lesion": 0.0,
70
- "Edema": 0.0,
71
- "Consolidation": 0.0,
72
- "Pneumonia": 0.0,
73
- "Atelectasis": 0.0,
74
- "Pneumothorax": 0.0,
75
- "Pleural Effusion": 0.0,
76
- "Pleural Other": 0.0,
77
- "Fracture": 0.0,
78
- "Support Devices": 0.5644329896907216,
79
- "No Finding": 0.29450261780104714
80
- },
81
- "radgraph_f1": 0.0791523357254355,
82
- "radgraph_f1_entity": 0.1443115199444943,
83
- "radgraph_f1_relation": 0.12993022073120553,
84
- "radgraph_available": true,
85
- "radgraph_error": null
86
- },
87
- "latest_evaluations": {
88
- "all_test": {
89
- "split": "test",
90
- "subset": "all frontal studies",
91
- "dataset": "mimic-cxr",
92
- "view_filter": "frontal-only (PA/AP)",
93
- "num_examples": 3041,
94
- "bleu_1": 0.22466909304042493,
95
- "bleu_4": 0.043919975602752334,
96
- "meteor": 0.20049977335710334,
97
- "rouge_l": 0.16756058992939854,
98
- "chexpert_f1_14_micro": 0.22279554040357505,
99
- "chexpert_f1_5_micro": 0.05494209534069486,
100
- "chexpert_f1_14_macro": 0.07355641991775376,
101
- "chexpert_f1_5_macro": 0.034170854271356785,
102
- "chexpert_f1_micro": 0.22279554040357505,
103
- "chexpert_f1_macro": 0.07355641991775376,
104
- "chexpert_per_label_f1": {
105
- "Enlarged Cardiomediastinum": 0.0,
106
- "Cardiomegaly": 0.1708542713567839,
107
- "Lung Opacity": 0.0,
108
- "Lung Lesion": 0.0,
109
- "Edema": 0.0,
110
- "Consolidation": 0.0,
111
- "Pneumonia": 0.0,
112
- "Atelectasis": 0.0,
113
- "Pneumothorax": 0.0,
114
- "Pleural Effusion": 0.0,
115
- "Pleural Other": 0.0,
116
- "Fracture": 0.0,
117
- "Support Devices": 0.5644329896907216,
118
- "No Finding": 0.29450261780104714
119
- },
120
- "radgraph_f1": 0.0791523357254355,
121
- "radgraph_f1_entity": 0.1443115199444943,
122
- "radgraph_f1_relation": 0.12993022073120553,
123
- "radgraph_available": true,
124
- "radgraph_error": null
125
- },
126
- "findings_only_test": {
127
- "split": "test",
128
- "subset": "findings-only frontal studies",
129
- "dataset": "mimic-cxr",
130
- "view_filter": "frontal-only (PA/AP), structured Findings section only",
131
- "num_examples": 2210,
132
- "bleu_1": 0.23428333207003713,
133
- "bleu_4": 0.05076939437931996,
134
- "meteor": 0.21379406362615114,
135
- "rouge_l": 0.17515008816614538,
136
- "chexpert_f1_14_micro": 0.22289738986327856,
137
- "chexpert_f1_5_micro": 0.056563951034191644,
138
- "chexpert_f1_14_macro": 0.07235490647135043,
139
- "chexpert_f1_5_macro": 0.03507853403141361,
140
- "chexpert_f1_micro": 0.22289738986327856,
141
- "chexpert_f1_macro": 0.07235490647135043,
142
- "chexpert_per_label_f1": {
143
- "Enlarged Cardiomediastinum": 0.0,
144
- "Cardiomegaly": 0.17539267015706805,
145
- "Lung Opacity": 0.0,
146
- "Lung Lesion": 0.0,
147
- "Edema": 0.0,
148
- "Consolidation": 0.0,
149
- "Pneumonia": 0.0,
150
- "Atelectasis": 0.0,
151
- "Pneumothorax": 0.0,
152
- "Pleural Effusion": 0.0,
153
- "Pleural Other": 0.0,
154
- "Fracture": 0.0,
155
- "Support Devices": 0.48633093525179855,
156
- "No Finding": 0.35124508519003933
157
- },
158
- "radgraph_f1": 0.09000123209087225,
159
- "radgraph_f1_entity": 0.15665513076129836,
160
- "radgraph_f1_relation": 0.14101742529549965,
161
- "radgraph_available": true,
162
- "radgraph_error": null
163
- }
164
- }
165
  }
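The removed `all_test` evaluation block lists per-label CheXpert F1 scores alongside the macro averages. As a minimal sketch, the macro figures can be reproduced as an unweighted mean over labels. The helper name `macro_f1` and the choice of the five-label subset (the usual CheXpert competition labels) are assumptions made for illustration, not something the repository confirms; the subset happens to agree with the reported `chexpert_f1_5_macro`.

```python
from statistics import fmean

# Per-label CheXpert F1 values copied from the removed `all_test` block.
per_label_f1 = {
    "Enlarged Cardiomediastinum": 0.0,
    "Cardiomegaly": 0.1708542713567839,
    "Lung Opacity": 0.0,
    "Lung Lesion": 0.0,
    "Edema": 0.0,
    "Consolidation": 0.0,
    "Pneumonia": 0.0,
    "Atelectasis": 0.0,
    "Pneumothorax": 0.0,
    "Pleural Effusion": 0.0,
    "Pleural Other": 0.0,
    "Fracture": 0.0,
    "Support Devices": 0.5644329896907216,
    "No Finding": 0.29450261780104714,
}

# Assumption: the "5" subset is the standard CheXpert competition label set.
FIVE_LABELS = (
    "Atelectasis",
    "Cardiomegaly",
    "Consolidation",
    "Edema",
    "Pleural Effusion",
)


def macro_f1(scores, labels=None):
    """Unweighted mean of per-label F1 over `labels` (all labels if None)."""
    names = list(scores) if labels is None else list(labels)
    return fmean(scores[name] for name in names)


macro_14 = macro_f1(per_label_f1)
macro_5 = macro_f1(per_label_f1, FIVE_LABELS)
print(f"14-macro: {macro_14:.4f}, 5-macro: {macro_5:.4f}")
```

Because eleven of the fourteen labels score `0.0`, the macro averages are pulled well below the micro averages, which weight frequent labels such as `Support Devices` more heavily.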
 
1
  {
2
  "method": "full_adamw",
3
  "run_name": "LAnA-v4",
4
+ "steps": 3289,
5
+ "epochs_completed": 3,
6
+ "epoch_index": 3,
7
  "target_epochs": 3,
8
+ "progress_epochs": 4.0,
9
+ "training_completion_percent": 100.0,
10
+ "elapsed_seconds": 29153.5889147,
11
+ "images_seen": 421707,
12
+ "train_loss_last": 1.9640858173370361,
13
+ "train_loss_mean": 2.007043061655561,
14
+ "val_loss": 1.6445714235305786,
15
+ "images_per_second": 14.465011537134089,
16
  "trainable_params": 125522688,
17
  "vision_model_name": "facebook/dinov3-vits16-pretrain-lvd1689m",
18
  "text_model_name": "gpt2",
 
37
  "seed": 42,
38
  "resume_supported": true,
39
  "checkpoint_every_n_steps": 1000,
40
+ "cumulative_loss_sum": 846384.1084015816,
41
+ "cumulative_loss_count": 421707,
42
+ "completed": true,
43
  "target_duration_seconds": 3600,
44
  "target_duration_mode": "per_invocation",
45
  "repo_id": "manu02/LAnA-v4",
46
  "train_datasets": "MIMIC-CXR (findings-only)",
47
+ "validation_datasets": "MIMIC-CXR (findings-only)"
48
  }
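Several fields in the updated `run_summary.json` appear to be derived from others. The sketch below recomputes them from the values in this commit as a consistency check. The function name `derive_fields` is hypothetical, and the formulas are inferred from the numbers rather than taken from the training code.

```python
def derive_fields(summary):
    """Recompute derived run-summary fields from their raw counterparts.

    Assumed relationships (inferred from the committed values):
      train_loss_mean   = cumulative_loss_sum / cumulative_loss_count
      images_per_second = images_seen / elapsed_seconds
      training_hours    = elapsed_seconds / 3600, rounded to 4 decimals
    """
    return {
        "train_loss_mean": summary["cumulative_loss_sum"]
        / summary["cumulative_loss_count"],
        "images_per_second": summary["images_seen"] / summary["elapsed_seconds"],
        "training_hours": round(summary["elapsed_seconds"] / 3600, 4),
    }


# Values copied from the new side of the run_summary.json diff.
summary = {
    "elapsed_seconds": 29153.5889147,
    "images_seen": 421707,
    "cumulative_loss_sum": 846384.1084015816,
    "cumulative_loss_count": 421707,
}

derived = derive_fields(summary)
for name, value in derived.items():
    print(f"{name}: {value}")
```

Under these assumptions the recomputed values agree with the committed `train_loss_mean`, `images_per_second`, and the `8.0982` hours shown in the README's Training Snapshot.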