manu02 committed · Commit d836478 · verified · Parent: 564301c

Upload MIMIC test evaluation results

README.md CHANGED
@@ -34,7 +34,7 @@ metrics:
34
  - Project status: `Training in progress`
35
  - Release status: `Research preview checkpoint`
36
  - Current checkpoint status: `Not final`
37
- - Training completion toward planned run: `35.77%` (`1.073` / `3` epochs)
38
  - Current published metrics are intermediate and will change as training continues.
39
 
40
  ## Overview
@@ -43,6 +43,10 @@ LAnA is a medical report-generation project for chest X-ray images. The complete
43
 
44
  The architecture combines a DINOv3 vision encoder, lung and heart segmentation heads, and a GPT-2 decoder modified so each transformer layer receives a different anatomical attention bias derived from the segmentation mask.
45
 
46
  ## Intended Use
47
 
48
  - Input: a chest X-ray image resized to `512x512` and normalized with ImageNet mean/std.
@@ -59,7 +63,7 @@ The architecture combines a DINOv3 vision encoder, lung and heart segmentation h
59
  ## Evaluation
60
 
61
  - Text-generation metrics used in this project include BLEU, METEOR, ROUGE, and CIDEr.
62
- - Medical report metrics implemented in the repository include RadGraph F1 and CheXpert F1.
63
 
64
  ## Training Snapshot
65
 
@@ -75,78 +79,95 @@ The architecture combines a DINOv3 vision encoder, lung and heart segmentation h
75
  - Scheduler: `cosine`
76
  - Warmup steps: `5114`
77
  - Weight decay: `0.01`
78
- - Steps completed: `36576`
79
  - Planned total steps: `102276`
80
- - Images seen: `292640`
81
- - Total training time: `8.6892` hours
82
  - Hardware: `NVIDIA GeForce RTX 5070`
83
- - Final train loss: `1.4422`
84
- - Validation loss: `1.4824`
85
 
86
  ## MIMIC Test Results
87
 
88
  Evaluation restricted to frontal (`PA/AP`) studies.
89
 
90
  | Metric | Value |
91
  | --- | --- |
92
  | Number of studies | TBD |
93
  | RadGraph F1 | TBD |
94
  | CheXpert F1 micro | TBD |
95
  | CheXpert F1 macro | TBD |
96
 
97
- ## Inference
98
-
99
- Standard `AutoModel.from_pretrained(..., trust_remote_code=True)` loading is currently blocked for this repo because the custom model constructor performs nested pretrained submodel loads.
100
- Use the verified manual load path below instead: download the HF repo snapshot, import the downloaded package, and load the exported `model.safetensors` directly.
101
-
102
- ```python
103
- from pathlib import Path
104
- import sys
105
-
106
- import numpy as np
107
- import torch
108
- from PIL import Image
109
- from huggingface_hub import snapshot_download
110
- from safetensors.torch import load_file
111
- from transformers import AutoTokenizer
112
-
113
- repo_dir = Path(snapshot_download("manu02/LAnA"))
114
- sys.path.insert(0, str(repo_dir))
115
-
116
- from lana_radgen import LanaConfig, LanaForConditionalGeneration
117
-
118
- config = LanaConfig.from_pretrained(repo_dir)
119
- config.lung_segmenter_checkpoint = str(repo_dir / "segmenters" / "lung_segmenter_dinounet_finetuned.pth")
120
- config.heart_segmenter_checkpoint = str(repo_dir / "segmenters" / "heart_segmenter_dinounet_best.pth")
121
-
122
- device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
123
-
124
- model = LanaForConditionalGeneration(config)
125
- state_dict = load_file(str(repo_dir / "model.safetensors"))
126
- missing, unexpected = model.load_state_dict(state_dict, strict=True)
127
- assert not missing and not unexpected
128
-
129
- model.tokenizer = AutoTokenizer.from_pretrained(repo_dir, trust_remote_code=True)
130
- model.move_non_quantized_modules(device)
131
- model.eval()
132
-
133
- image_path = Path("example.png")
134
- image = Image.open(image_path).convert("RGB")
135
- image = image.resize((512, 512), resample=Image.BICUBIC)
136
  array = np.asarray(image, dtype=np.float32) / 255.0
137
  pixel_values = torch.from_numpy(array).permute(2, 0, 1)
138
  mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
139
  std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)
140
  pixel_values = ((pixel_values - mean) / std).unsqueeze(0).to(device)
141
 
142
- with torch.no_grad():
143
- generated = model.generate(pixel_values=pixel_values, max_new_tokens=128)
144
-
145
- report = model.tokenizer.batch_decode(generated, skip_special_tokens=True)[0]
146
- print(report)
147
- ```
148
-
149
- ## Notes
150
 
151
  - `segmenters/` contains the lung and heart segmentation checkpoints used to build anatomical attention masks.
152
  - `evaluations/mimic_test_metrics.json` contains the latest saved MIMIC test metrics.
@@ -157,26 +178,14 @@ print(report)
157
  - Dataset: `MIMIC-CXR test`
158
  - View filter: `frontal-only (PA/AP)`
159
  - Number of examples: `3041`
160
- - CheXpert F1 micro: `0.1610`
161
- - CheXpert F1 macro: `0.1124`
162
- - RadGraph F1: `0.0956`
163
- - RadGraph entity F1: `0.1582`
164
- - RadGraph relation F1: `0.1381`
165
  - RadGraph available: `True`
166
  - RadGraph error: `None`
167
 
168
  - Evaluation file: `evaluations/mimic_test_metrics.json`
169
  - Predictions file: `evaluations/mimic_test_predictions.csv`
170
  <!-- EVAL_RESULTS_END -->
171
-
172
- <!-- MIMIC_TEST_RESULTS_START -->
173
- ## MIMIC Test Results
174
-
175
- Frontal-only evaluation using `PA/AP` studies only. Number of evaluated studies: `3041`.
176
-
177
- | Metric | Value |
178
- | --- | --- |
179
- | RadGraph F1 | `0.0956` |
180
- | CheXpert F1 micro | `0.1610` |
181
- | CheXpert F1 macro | `0.1124` |
182
- <!-- MIMIC_TEST_RESULTS_END -->
 
34
  - Project status: `Training in progress`
35
  - Release status: `Research preview checkpoint`
36
  - Current checkpoint status: `Not final`
37
+ - Training completion toward planned run: `39.96%` (`1.199` / `3` epochs)
38
  - Current published metrics are intermediate and will change as training continues.
39
 
40
  ## Overview
 
43
 
44
  The architecture combines a DINOv3 vision encoder, lung and heart segmentation heads, and a GPT-2 decoder modified so each transformer layer receives a different anatomical attention bias derived from the segmentation mask.
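The per-layer bias idea can be pictured with a small stand-alone sketch. Everything below (the function name, the linear per-layer scaling rule, the toy mask) is an illustrative assumption, not the repository's implementation: the segmentation mask flags anatomy patches, and each decoder layer adds a differently scaled copy of it to its attention scores before the softmax.

```python
import math

def anatomical_attention_bias(mask, num_layers, strength=1.0):
    """One additive attention-bias vector per decoder layer, derived from a
    patch-level anatomy mask (1.0 = patch inside lung/heart, 0.0 = background).
    Hypothetical rule: deeper layers receive a stronger pull toward anatomy."""
    return [[m * strength * (layer + 1) / num_layers for m in mask]
            for layer in range(num_layers)]

def softmax(xs):
    mx = max(xs)  # subtract max for numerical stability
    es = [math.exp(x - mx) for x in xs]
    total = sum(es)
    return [e / total for e in es]

mask = [0.0, 1.0, 1.0, 0.0]      # four image patches; the middle two are anatomy
biases = anatomical_attention_bias(mask, num_layers=12)
scores = [0.0, 0.0, 0.0, 0.0]    # uniform pre-softmax scores for one query token
attn = softmax([s + b for s, b in zip(scores, biases[-1])])  # last layer's attention
```

Because each layer gets its own bias, shallow layers stay close to unbiased attention while deep layers concentrate on the segmented regions.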
45
 
46
+ ## How to Run
47
+
48
+ See the [Inference](#inference) section for local inference instructions.
49
+
50
  ## Intended Use
51
 
52
  - Input: a chest X-ray image resized to `512x512` and normalized with ImageNet mean/std.
 
63
  ## Evaluation
64
 
65
  - Text-generation metrics used in this project include BLEU, METEOR, ROUGE, and CIDEr.
66
+ - Medical report metrics implemented in the repository include RadGraph F1 and CheXpert F1 (`14-micro`, `5-micro`, `14-macro`, `5-macro`).
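For readers unfamiliar with the averaging variants: micro-F1 pools true/false positives and false negatives across labels before computing F1, while macro-F1 averages the per-label F1 scores; the `14`/`5` prefixes presumably denote scoring over all 14 CheXpert labels versus the commonly used 5-label subset. A minimal sketch with invented counts:

```python
def f1(tp, fp, fn):
    """F1 from counts; defined as 0.0 when there are no true positives."""
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def micro_macro_f1(counts):
    """counts maps label -> (tp, fp, fn). Micro pools counts; macro averages F1s."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    micro = f1(tp, fp, fn)
    macro = sum(f1(*c) for c in counts.values()) / len(counts)
    return micro, macro

# toy counts for three labels (illustrative, not real results)
counts = {
    "Edema": (30, 20, 40),
    "Pleural Effusion": (25, 15, 35),
    "No Finding": (0, 5, 10),
}
micro, macro = micro_macro_f1(counts)
```

Labels that are never predicted correctly drag macro down hard, which is why micro and macro can diverge on imbalanced label sets.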
67
 
68
  ## Training Snapshot
69
 
 
79
  - Scheduler: `cosine`
80
  - Warmup steps: `5114`
81
  - Weight decay: `0.01`
82
+ - Steps completed: `40864`
83
  - Planned total steps: `102276`
84
+ - Images seen: `326946`
85
+ - Total training time: `9.6893` hours
86
  - Hardware: `NVIDIA GeForce RTX 5070`
87
+ - Final train loss: `2.2784`
88
+ - Validation loss: `1.4888`
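The scheduler settings above imply a learning-rate curve like the following sketch; the base learning rate `5e-5` is an assumption (it is not listed in this snapshot), while the warmup and planned total step counts are the run's reported values.

```python
import math

def lr_at(step, base_lr=5e-5, warmup_steps=5114, total_steps=102276):
    """Linear warmup to base_lr, then cosine decay to zero."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

peak = lr_at(5114)    # end of warmup
mid = lr_at(40864)    # roughly where this checkpoint stopped
end = lr_at(102276)   # planned final step
```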
89
 
90
  ## MIMIC Test Results
91
 
92
  Evaluation restricted to frontal (`PA/AP`) studies.
93
 
94
+ ### Current Checkpoint Results
95
+
96
+ | Metric | Value |
97
+ | --- | --- |
98
+ | Number of studies | `3041` |
99
+ | RadGraph F1 | `0.0964` |
100
+ | RadGraph entity F1 | `0.1603` |
101
+ | RadGraph relation F1 | `0.1412` |
102
+ | CheXpert F1 micro | `0.1898` |
103
+ | CheXpert F1 macro | `0.1006` |
104
+
105
+ ### Final Completed Training Results
106
+
107
+ The table below will be populated once the planned training run completes; until then, its metrics remain `TBD`.
108
+
109
  | Metric | Value |
110
  | --- | --- |
111
  | Number of studies | TBD |
112
  | RadGraph F1 | TBD |
113
+ | RadGraph entity F1 | TBD |
114
+ | RadGraph relation F1 | TBD |
115
  | CheXpert F1 micro | TBD |
116
  | CheXpert F1 macro | TBD |
117
 
118
+ ## Inference
119
+
120
+ Standard `AutoModel.from_pretrained(..., trust_remote_code=True)` loading is currently blocked for this repo because the custom model constructor performs nested pretrained submodel loads.
121
+ Use the verified manual load path below instead: download the HF repo snapshot, import the downloaded package, and load the exported `model.safetensors` directly.
122
+
123
+ ```python
124
+ from pathlib import Path
125
+ import sys
126
+
127
+ import numpy as np
128
+ import torch
129
+ from PIL import Image
130
+ from huggingface_hub import snapshot_download
131
+ from safetensors.torch import load_file
132
+ from transformers import AutoTokenizer
133
+
134
+ repo_dir = Path(snapshot_download("manu02/LAnA"))
135
+ sys.path.insert(0, str(repo_dir))
136
+
137
+ from lana_radgen import LanaConfig, LanaForConditionalGeneration
138
+
139
+ config = LanaConfig.from_pretrained(repo_dir)
140
+ config.lung_segmenter_checkpoint = str(repo_dir / "segmenters" / "lung_segmenter_dinounet_finetuned.pth")
141
+ config.heart_segmenter_checkpoint = str(repo_dir / "segmenters" / "heart_segmenter_dinounet_best.pth")
142
+
143
+ device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
144
+
145
+ model = LanaForConditionalGeneration(config)
146
+ state_dict = load_file(str(repo_dir / "model.safetensors"))
147
+ # strict=True raises on any key mismatch, so the assert below is a belt-and-braces check
+ missing, unexpected = model.load_state_dict(state_dict, strict=True)
148
+ assert not missing and not unexpected
149
+
150
+ model.tokenizer = AutoTokenizer.from_pretrained(repo_dir, trust_remote_code=True)
151
+ model.move_non_quantized_modules(device)
152
+ model.eval()
153
+
154
+ image_path = Path("example.png")
155
+ image = Image.open(image_path).convert("RGB")
156
+ image = image.resize((512, 512), resample=Image.BICUBIC)
157
  array = np.asarray(image, dtype=np.float32) / 255.0
158
  pixel_values = torch.from_numpy(array).permute(2, 0, 1)
159
  mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
160
  std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)
161
  pixel_values = ((pixel_values - mean) / std).unsqueeze(0).to(device)
162
 
163
+ with torch.no_grad():
164
+ generated = model.generate(pixel_values=pixel_values, max_new_tokens=128)
165
+
166
+ report = model.tokenizer.batch_decode(generated, skip_special_tokens=True)[0]
167
+ print(report)
168
+ ```
169
+
170
+ ## Notes
171
 
172
  - `segmenters/` contains the lung and heart segmentation checkpoints used to build anatomical attention masks.
173
  - `evaluations/mimic_test_metrics.json` contains the latest saved MIMIC test metrics.
 
178
  - Dataset: `MIMIC-CXR test`
179
  - View filter: `frontal-only (PA/AP)`
180
  - Number of examples: `3041`
181
+ - CheXpert F1 micro: `0.1898`
182
+ - CheXpert F1 macro: `0.1006`
183
+ - RadGraph F1: `0.0964`
184
+ - RadGraph entity F1: `0.1603`
185
+ - RadGraph relation F1: `0.1412`
186
  - RadGraph available: `True`
187
  - RadGraph error: `None`
188
 
189
  - Evaluation file: `evaluations/mimic_test_metrics.json`
190
  - Predictions file: `evaluations/mimic_test_predictions.csv`
191
  <!-- EVAL_RESULTS_END -->
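As a sanity check, the macro score in `evaluations/mimic_test_metrics.json` is the unweighted mean of the published per-label F1 scores:

```python
# per-label CheXpert F1 values copied from evaluations/mimic_test_metrics.json
per_label_f1 = {
    "Enlarged Cardiomediastinum": 0.0,
    "Cardiomegaly": 0.0,
    "Lung Opacity": 0.0,
    "Lung Lesion": 0.0,
    "Edema": 0.37237569060773473,
    "Consolidation": 0.08431952662721894,
    "Pneumonia": 0.12570145903479238,
    "Atelectasis": 0.0456989247311828,
    "Pneumothorax": 0.045261669024045256,
    "Pleural Effusion": 0.3687114573190523,
    "Pleural Other": 0.0,
    "Fracture": 0.0,
    "Support Devices": 0.36568213783403664,
    "No Finding": 0.0,
}
# matches the published chexpert_f1_macro up to floating-point precision
macro = sum(per_label_f1.values()) / len(per_label_f1)
```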
 
benchmark_results.json CHANGED
@@ -1,3 +1,391 @@
1
  {
2
- "results": []
3
  }
 
1
  {
2
+ "results": [
3
+ {
4
+ "method": "qlora_paged_adamw8bit",
5
+ "local_batch_size": 1,
6
+ "global_batch_size_requested": 1,
7
+ "status": "failed",
8
+ "error": "element 0 of tensors does not require grad and does not have a grad_fn"
9
+ },
10
+ {
11
+ "method": "qlora_paged_adamw8bit",
12
+ "local_batch_size": 1,
13
+ "global_batch_size_requested": 8,
14
+ "status": "failed",
15
+ "error": "element 0 of tensors does not require grad and does not have a grad_fn"
16
+ },
17
+ {
18
+ "method": "qlora_paged_adamw8bit",
19
+ "local_batch_size": 1,
20
+ "global_batch_size_requested": 16,
21
+ "status": "failed",
22
+ "error": "element 0 of tensors does not require grad and does not have a grad_fn"
23
+ },
24
+ {
25
+ "method": "qlora_paged_adamw8bit",
26
+ "local_batch_size": 2,
27
+ "global_batch_size_requested": 2,
28
+ "status": "failed",
29
+ "error": "element 0 of tensors does not require grad and does not have a grad_fn"
30
+ },
31
+ {
32
+ "method": "qlora_paged_adamw8bit",
33
+ "local_batch_size": 2,
34
+ "global_batch_size_requested": 8,
35
+ "status": "failed",
36
+ "error": "element 0 of tensors does not require grad and does not have a grad_fn"
37
+ },
38
+ {
39
+ "method": "qlora_paged_adamw8bit",
40
+ "local_batch_size": 2,
41
+ "global_batch_size_requested": 16,
42
+ "status": "failed",
43
+ "error": "element 0 of tensors does not require grad and does not have a grad_fn"
44
+ },
45
+ {
46
+ "method": "qlora_paged_adamw8bit",
47
+ "local_batch_size": 4,
48
+ "global_batch_size_requested": 4,
49
+ "status": "failed",
50
+ "error": "element 0 of tensors does not require grad and does not have a grad_fn"
51
+ },
52
+ {
53
+ "method": "qlora_paged_adamw8bit",
54
+ "local_batch_size": 4,
55
+ "global_batch_size_requested": 8,
56
+ "status": "failed",
57
+ "error": "element 0 of tensors does not require grad and does not have a grad_fn"
58
+ },
59
+ {
60
+ "method": "qlora_paged_adamw8bit",
61
+ "local_batch_size": 4,
62
+ "global_batch_size_requested": 16,
63
+ "status": "failed",
64
+ "error": "element 0 of tensors does not require grad and does not have a grad_fn"
65
+ },
66
+ {
67
+ "method": "lora_adamw",
68
+ "local_batch_size": 1,
69
+ "global_batch_size_requested": 1,
70
+ "status": "ok",
71
+ "effective_global_batch_size": 1,
72
+ "gradient_accumulation_steps": 1,
73
+ "optimizer_step_time_sec": 0.12944729999981064,
74
+ "images_per_sec": 7.7251514709187665,
75
+ "mean_loss": 9.920842170715332,
76
+ "trainable_params": 1106688
77
+ },
78
+ {
79
+ "method": "lora_adamw",
80
+ "local_batch_size": 1,
81
+ "global_batch_size_requested": 8,
82
+ "status": "ok",
83
+ "effective_global_batch_size": 8,
84
+ "gradient_accumulation_steps": 8,
85
+ "optimizer_step_time_sec": 0.792737899999338,
86
+ "images_per_sec": 10.091607831550228,
87
+ "mean_loss": 8.131502032279968,
88
+ "trainable_params": 1106688
89
+ },
90
+ {
91
+ "method": "lora_adamw",
92
+ "local_batch_size": 1,
93
+ "global_batch_size_requested": 16,
94
+ "status": "ok",
95
+ "effective_global_batch_size": 16,
96
+ "gradient_accumulation_steps": 16,
97
+ "optimizer_step_time_sec": 1.6773667999987083,
98
+ "images_per_sec": 9.538760395169572,
99
+ "mean_loss": 8.80642619729042,
100
+ "trainable_params": 1106688
101
+ },
102
+ {
103
+ "method": "lora_adamw",
104
+ "local_batch_size": 2,
105
+ "global_batch_size_requested": 2,
106
+ "status": "ok",
107
+ "effective_global_batch_size": 2,
108
+ "gradient_accumulation_steps": 1,
109
+ "optimizer_step_time_sec": 0.20009290000052715,
110
+ "images_per_sec": 9.995357156574427,
111
+ "mean_loss": 9.088608741760254,
112
+ "trainable_params": 1106688
113
+ },
114
+ {
115
+ "method": "lora_adamw",
116
+ "local_batch_size": 2,
117
+ "global_batch_size_requested": 8,
118
+ "status": "ok",
119
+ "effective_global_batch_size": 8,
120
+ "gradient_accumulation_steps": 4,
121
+ "optimizer_step_time_sec": 0.8304937000011705,
122
+ "images_per_sec": 9.63282442719159,
123
+ "mean_loss": 8.245712995529175,
124
+ "trainable_params": 1106688
125
+ },
126
+ {
127
+ "method": "lora_adamw",
128
+ "local_batch_size": 2,
129
+ "global_batch_size_requested": 16,
130
+ "status": "ok",
131
+ "effective_global_batch_size": 16,
132
+ "gradient_accumulation_steps": 8,
133
+ "optimizer_step_time_sec": 1.6668036999981268,
134
+ "images_per_sec": 9.599210752902685,
135
+ "mean_loss": 9.106984257698059,
136
+ "trainable_params": 1106688
137
+ },
138
+ {
139
+ "method": "lora_adamw",
140
+ "local_batch_size": 4,
141
+ "global_batch_size_requested": 4,
142
+ "status": "ok",
143
+ "effective_global_batch_size": 4,
144
+ "gradient_accumulation_steps": 1,
145
+ "optimizer_step_time_sec": 0.4656030999994982,
146
+ "images_per_sec": 8.591008092524106,
147
+ "mean_loss": 8.862140655517578,
148
+ "trainable_params": 1106688
149
+ },
150
+ {
151
+ "method": "lora_adamw",
152
+ "local_batch_size": 4,
153
+ "global_batch_size_requested": 8,
154
+ "status": "ok",
155
+ "effective_global_batch_size": 8,
156
+ "gradient_accumulation_steps": 2,
157
+ "optimizer_step_time_sec": 2.6093234999989363,
158
+ "images_per_sec": 3.0659287742601715,
159
+ "mean_loss": 8.241507053375244,
160
+ "trainable_params": 1106688
161
+ },
162
+ {
163
+ "method": "lora_adamw",
164
+ "local_batch_size": 4,
165
+ "global_batch_size_requested": 16,
166
+ "status": "ok",
167
+ "effective_global_batch_size": 16,
168
+ "gradient_accumulation_steps": 4,
169
+ "optimizer_step_time_sec": 18.058491499999946,
170
+ "images_per_sec": 0.8860097755119827,
171
+ "mean_loss": 8.916554927825928,
172
+ "trainable_params": 1106688
173
+ },
174
+ {
175
+ "method": "full_adam",
176
+ "local_batch_size": 1,
177
+ "global_batch_size_requested": 1,
178
+ "status": "ok",
179
+ "effective_global_batch_size": 1,
180
+ "gradient_accumulation_steps": 1,
181
+ "optimizer_step_time_sec": 1.4309436000003188,
182
+ "images_per_sec": 0.6988395629288094,
183
+ "mean_loss": 8.042855262756348,
184
+ "trainable_params": 125521920
185
+ },
186
+ {
187
+ "method": "full_adam",
188
+ "local_batch_size": 1,
189
+ "global_batch_size_requested": 8,
190
+ "status": "ok",
191
+ "effective_global_batch_size": 8,
192
+ "gradient_accumulation_steps": 8,
193
+ "optimizer_step_time_sec": 2.7121656999988772,
194
+ "images_per_sec": 2.9496722858796245,
195
+ "mean_loss": 7.829526960849762,
196
+ "trainable_params": 125521920
197
+ },
198
+ {
199
+ "method": "full_adam",
200
+ "local_batch_size": 1,
201
+ "global_batch_size_requested": 16,
202
+ "status": "ok",
203
+ "effective_global_batch_size": 16,
204
+ "gradient_accumulation_steps": 16,
205
+ "optimizer_step_time_sec": 1.8378386999993381,
206
+ "images_per_sec": 8.705878268863183,
207
+ "mean_loss": 9.189274996519089,
208
+ "trainable_params": 125521920
209
+ },
210
+ {
211
+ "method": "full_adam",
212
+ "local_batch_size": 2,
213
+ "global_batch_size_requested": 2,
214
+ "status": "ok",
215
+ "effective_global_batch_size": 2,
216
+ "gradient_accumulation_steps": 1,
217
+ "optimizer_step_time_sec": 0.23647629999868514,
218
+ "images_per_sec": 8.457507158269646,
219
+ "mean_loss": 9.128178596496582,
220
+ "trainable_params": 125521920
221
+ },
222
+ {
223
+ "method": "full_adam",
224
+ "local_batch_size": 2,
225
+ "global_batch_size_requested": 8,
226
+ "status": "ok",
227
+ "effective_global_batch_size": 8,
228
+ "gradient_accumulation_steps": 4,
229
+ "optimizer_step_time_sec": 0.8083188999989943,
230
+ "images_per_sec": 9.897083935572896,
231
+ "mean_loss": 8.64337944984436,
232
+ "trainable_params": 125521920
233
+ },
234
+ {
235
+ "method": "full_adam",
236
+ "local_batch_size": 2,
237
+ "global_batch_size_requested": 16,
238
+ "status": "ok",
239
+ "effective_global_batch_size": 16,
240
+ "gradient_accumulation_steps": 8,
241
+ "optimizer_step_time_sec": 1.8274533999974665,
242
+ "images_per_sec": 8.755353214490823,
243
+ "mean_loss": 8.331470370292664,
244
+ "trainable_params": 125521920
245
+ },
246
+ {
247
+ "method": "full_adam",
248
+ "local_batch_size": 4,
249
+ "global_batch_size_requested": 4,
250
+ "status": "ok",
251
+ "effective_global_batch_size": 4,
252
+ "gradient_accumulation_steps": 1,
253
+ "optimizer_step_time_sec": 0.511095199999545,
254
+ "images_per_sec": 7.826330593602838,
255
+ "mean_loss": 8.954268455505371,
256
+ "trainable_params": 125521920
257
+ },
258
+ {
259
+ "method": "full_adam",
260
+ "local_batch_size": 4,
261
+ "global_batch_size_requested": 8,
262
+ "status": "ok",
263
+ "effective_global_batch_size": 8,
264
+ "gradient_accumulation_steps": 2,
265
+ "optimizer_step_time_sec": 2.2738564999981463,
266
+ "images_per_sec": 3.518251921353226,
267
+ "mean_loss": 9.192809581756592,
268
+ "trainable_params": 125521920
269
+ },
270
+ {
271
+ "method": "full_adam",
272
+ "local_batch_size": 4,
273
+ "global_batch_size_requested": 16,
274
+ "status": "ok",
275
+ "effective_global_batch_size": 16,
276
+ "gradient_accumulation_steps": 4,
277
+ "optimizer_step_time_sec": 18.631701800000883,
278
+ "images_per_sec": 0.8587513997244869,
279
+ "mean_loss": 8.159156560897827,
280
+ "trainable_params": 125521920
281
+ },
282
+ {
283
+ "method": "full_adam8bit",
284
+ "local_batch_size": 1,
285
+ "global_batch_size_requested": 1,
286
+ "status": "ok",
287
+ "effective_global_batch_size": 1,
288
+ "gradient_accumulation_steps": 1,
289
+ "optimizer_step_time_sec": 0.13992360000156623,
290
+ "images_per_sec": 7.146757230294293,
291
+ "mean_loss": 9.259998321533203,
292
+ "trainable_params": 125521920
293
+ },
294
+ {
295
+ "method": "full_adam8bit",
296
+ "local_batch_size": 1,
297
+ "global_batch_size_requested": 8,
298
+ "status": "ok",
299
+ "effective_global_batch_size": 8,
300
+ "gradient_accumulation_steps": 8,
301
+ "optimizer_step_time_sec": 0.8451360999988538,
302
+ "images_per_sec": 9.465930990299492,
303
+ "mean_loss": 8.10985803604126,
304
+ "trainable_params": 125521920
305
+ },
306
+ {
307
+ "method": "full_adam8bit",
308
+ "local_batch_size": 1,
309
+ "global_batch_size_requested": 16,
310
+ "status": "ok",
311
+ "effective_global_batch_size": 16,
312
+ "gradient_accumulation_steps": 16,
313
+ "optimizer_step_time_sec": 1.8945816999930685,
314
+ "images_per_sec": 8.445135936897595,
315
+ "mean_loss": 8.591163873672485,
316
+ "trainable_params": 125521920
317
+ },
318
+ {
319
+ "method": "full_adam8bit",
320
+ "local_batch_size": 2,
321
+ "global_batch_size_requested": 2,
322
+ "status": "ok",
323
+ "effective_global_batch_size": 2,
324
+ "gradient_accumulation_steps": 1,
325
+ "optimizer_step_time_sec": 0.23971350000101666,
326
+ "images_per_sec": 8.343293139483249,
327
+ "mean_loss": 9.75894832611084,
328
+ "trainable_params": 125521920
329
+ },
330
+ {
331
+ "method": "full_adam8bit",
332
+ "local_batch_size": 2,
333
+ "global_batch_size_requested": 8,
334
+ "status": "ok",
335
+ "effective_global_batch_size": 8,
336
+ "gradient_accumulation_steps": 4,
337
+ "optimizer_step_time_sec": 0.9259438999997656,
338
+ "images_per_sec": 8.6398322835779,
339
+ "mean_loss": 8.462790489196777,
340
+ "trainable_params": 125521920
341
+ },
342
+ {
343
+ "method": "full_adam8bit",
344
+ "local_batch_size": 2,
345
+ "global_batch_size_requested": 16,
346
+ "status": "ok",
347
+ "effective_global_batch_size": 16,
348
+ "gradient_accumulation_steps": 8,
349
+ "optimizer_step_time_sec": 1.8237968999983423,
350
+ "images_per_sec": 8.772906676184471,
351
+ "mean_loss": 10.191668510437012,
352
+ "trainable_params": 125521920
353
+ },
354
+ {
355
+ "method": "full_adam8bit",
356
+ "local_batch_size": 4,
357
+ "global_batch_size_requested": 4,
358
+ "status": "ok",
359
+ "effective_global_batch_size": 4,
360
+ "gradient_accumulation_steps": 1,
361
+ "optimizer_step_time_sec": 0.5224713000006886,
362
+ "images_per_sec": 7.655922918626779,
363
+ "mean_loss": 8.14057445526123,
364
+ "trainable_params": 125521920
365
+ },
366
+ {
367
+ "method": "full_adam8bit",
368
+ "local_batch_size": 4,
369
+ "global_batch_size_requested": 8,
370
+ "status": "ok",
371
+ "effective_global_batch_size": 8,
372
+ "gradient_accumulation_steps": 2,
373
+ "optimizer_step_time_sec": 3.7809107000011863,
374
+ "images_per_sec": 2.1158923430795364,
375
+ "mean_loss": 8.521550178527832,
376
+ "trainable_params": 125521920
377
+ },
378
+ {
379
+ "method": "full_adam8bit",
380
+ "local_batch_size": 4,
381
+ "global_batch_size_requested": 16,
382
+ "status": "ok",
383
+ "effective_global_batch_size": 16,
384
+ "gradient_accumulation_steps": 4,
385
+ "optimizer_step_time_sec": 27.688971800002037,
386
+ "images_per_sec": 0.5778473868790903,
387
+ "mean_loss": 9.247632026672363,
388
+ "trainable_params": 125521920
389
+ }
390
+ ]
391
  }
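The populated `results` array is easiest to digest by keeping the successful runs and ranking them by throughput; a minimal sketch over a hand-copied subset of the entries (loading the full file with `json.load` is the obvious variant). As an aside, the repeated QLoRA error usually indicates inputs that do not require grad when gradient checkpointing is active; in PEFT setups this is often resolved by calling `model.enable_input_require_grads()` before training, though whether that is the fix here is untested.

```python
# a hand-copied subset of entries from benchmark_results.json
results = [
    {"method": "qlora_paged_adamw8bit", "status": "failed", "images_per_sec": None},
    {"method": "lora_adamw", "status": "ok", "images_per_sec": 10.091607831550228},
    {"method": "full_adam", "status": "ok", "images_per_sec": 9.897083935572896},
    {"method": "full_adam8bit", "status": "ok", "images_per_sec": 9.465930990299492},
]
ok = [r for r in results if r["status"] == "ok"]
best = max(ok, key=lambda r: r["images_per_sec"])  # fastest successful configuration
```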
evaluations/mimic_test_metrics.json CHANGED
@@ -3,21 +3,27 @@
3
  "dataset": "mimic-cxr",
4
  "view_filter": "frontal-only (PA/AP)",
5
  "num_examples": 3041,
6
- "chexpert_f1_micro": 0.16103319543941078,
7
- "chexpert_f1_macro": 0.1124493965203456,
8
  "chexpert_per_label_f1": {
9
- "Atelectasis": 0.0825,
10
- "Cardiomegaly": 0.00737100737100737,
11
- "Consolidation": 0.04561824729891957,
12
- "Edema": 0.2830349531116795,
13
- "Pleural Effusion": 0.33264462809917356,
14
- "Pneumonia": 0.08787346221441125,
15
- "Pneumothorax": 0.0605528740675735,
16
  "No Finding": 0.0
17
  },
18
- "radgraph_f1": 0.09555911275393814,
19
- "radgraph_f1_entity": 0.15815788437570172,
20
- "radgraph_f1_relation": 0.13806964215526496,
21
  "radgraph_available": true,
22
  "radgraph_error": null
23
  }
 
3
  "dataset": "mimic-cxr",
4
  "view_filter": "frontal-only (PA/AP)",
5
  "num_examples": 3041,
6
+ "chexpert_f1_micro": 0.1897964951950254,
7
+ "chexpert_f1_macro": 0.1005536332270045,
8
  "chexpert_per_label_f1": {
9
+ "Enlarged Cardiomediastinum": 0.0,
10
+ "Cardiomegaly": 0.0,
11
+ "Lung Opacity": 0.0,
12
+ "Lung Lesion": 0.0,
13
+ "Edema": 0.37237569060773473,
14
+ "Consolidation": 0.08431952662721894,
15
+ "Pneumonia": 0.12570145903479238,
16
+ "Atelectasis": 0.0456989247311828,
17
+ "Pneumothorax": 0.045261669024045256,
18
+ "Pleural Effusion": 0.3687114573190523,
19
+ "Pleural Other": 0.0,
20
+ "Fracture": 0.0,
21
+ "Support Devices": 0.36568213783403664,
22
  "No Finding": 0.0
23
  },
24
+ "radgraph_f1": 0.09636633496704333,
25
+ "radgraph_f1_entity": 0.16033252587414393,
26
+ "radgraph_f1_relation": 0.14117679892881935,
27
  "radgraph_available": true,
28
  "radgraph_error": null
29
  }
evaluations/mimic_test_predictions.csv CHANGED
The diff for this file is too large to render. See raw diff
 
model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:b30676d31ff967d3ff33af89e65dcee6d59ca0c8fe2f348abdb6f9d9546c88e1
3
  size 1152540320
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:85d0600d4706b6aae446467d50481d2bfde0cae44576e88a994b6f574c33bb63
3
  size 1152540320
run_summary.json CHANGED
@@ -1,18 +1,18 @@
1
  {
2
  "method": "lora_adamw",
3
  "run_name": "full_3_epoch_mask_run",
4
- "steps": 36576,
5
  "epochs_completed": 1,
6
  "epoch_index": 1,
7
  "target_epochs": 3,
8
- "progress_epochs": 1.072994734757931,
9
- "training_completion_percent": 35.766491158597695,
10
- "elapsed_seconds": 31281.024234599987,
11
- "images_seen": 292640,
12
- "train_loss_last": 1.442158818244934,
13
- "train_loss_mean": 2.1974657088950895,
14
- "val_loss": 1.482397198677063,
15
- "images_per_second": 9.355192394125972,
16
  "trainable_params": 1106688,
17
  "vision_model_name": "facebook/dinov3-vits16-pretrain-lvd1689m",
18
  "text_model_name": "gpt2",
@@ -33,9 +33,40 @@
33
  "seed": 42,
34
  "resume_supported": true,
35
  "checkpoint_every_n_steps": 1000,
36
- "cumulative_loss_sum": 643066.365051059,
37
- "cumulative_loss_count": 292640,
38
  "completed": false,
39
  "target_duration_seconds": 3600,
40
- "target_duration_mode": "per_invocation"
41
  }
 
1
  {
2
  "method": "lora_adamw",
3
  "run_name": "full_3_epoch_mask_run",
4
+ "steps": 40864,
5
  "epochs_completed": 1,
6
  "epoch_index": 1,
7
  "target_epochs": 3,
8
+ "progress_epochs": 1.1987812211255005,
9
+ "training_completion_percent": 39.95937403751668,
10
+ "elapsed_seconds": 34881.30608979998,
11
+ "images_seen": 326946,
12
+ "train_loss_last": 2.278425455093384,
13
+ "train_loss_mean": 2.149527354583105,
14
+ "val_loss": 1.4887660503387452,
15
+ "images_per_second": 9.373100856897265,
16
  "trainable_params": 1106688,
17
  "vision_model_name": "facebook/dinov3-vits16-pretrain-lvd1689m",
18
  "text_model_name": "gpt2",
 
33
  "seed": 42,
34
  "resume_supported": true,
35
  "checkpoint_every_n_steps": 1000,
36
+ "cumulative_loss_sum": 702779.3704715278,
37
+ "cumulative_loss_count": 326946,
38
  "completed": false,
39
  "target_duration_seconds": 3600,
40
+ "target_duration_mode": "per_invocation",
41
+ "train_datasets": "CheXpert, MIMIC-CXR",
42
+ "validation_datasets": "CheXpert, MIMIC-CXR",
43
+ "latest_evaluation": {
44
+ "split": "test",
45
+ "dataset": "mimic-cxr",
46
+ "view_filter": "frontal-only (PA/AP)",
47
+ "num_examples": 3041,
48
+ "chexpert_f1_micro": 0.1897964951950254,
49
+ "chexpert_f1_macro": 0.1005536332270045,
50
+ "chexpert_per_label_f1": {
51
+ "Enlarged Cardiomediastinum": 0.0,
52
+ "Cardiomegaly": 0.0,
53
+ "Lung Opacity": 0.0,
54
+ "Lung Lesion": 0.0,
55
+ "Edema": 0.37237569060773473,
56
+ "Consolidation": 0.08431952662721894,
57
+ "Pneumonia": 0.12570145903479238,
58
+ "Atelectasis": 0.0456989247311828,
59
+ "Pneumothorax": 0.045261669024045256,
60
+ "Pleural Effusion": 0.3687114573190523,
61
+ "Pleural Other": 0.0,
62
+ "Fracture": 0.0,
63
+ "Support Devices": 0.36568213783403664,
64
+ "No Finding": 0.0
65
+ },
66
+ "radgraph_f1": 0.09636633496704333,
67
+ "radgraph_f1_entity": 0.16033252587414393,
68
+ "radgraph_f1_relation": 0.14117679892881935,
69
+ "radgraph_available": true,
70
+ "radgraph_error": null
71
+ }
72
  }
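The derived fields in the updated summary are internally consistent, which is easy to verify from the raw values:

```python
summary = {  # fields copied from run_summary.json
    "progress_epochs": 1.1987812211255005,
    "target_epochs": 3,
    "training_completion_percent": 39.95937403751668,
    "elapsed_seconds": 34881.30608979998,
    "images_seen": 326946,
    "images_per_second": 9.373100856897265,
    "cumulative_loss_sum": 702779.3704715278,
    "cumulative_loss_count": 326946,
    "train_loss_mean": 2.149527354583105,
}
# completion percent = progress epochs over target epochs
pct = 100 * summary["progress_epochs"] / summary["target_epochs"]
# throughput = images seen over wall-clock seconds
ips = summary["images_seen"] / summary["elapsed_seconds"]
# mean train loss = cumulative loss sum over sample count
mean_loss = summary["cumulative_loss_sum"] / summary["cumulative_loss_count"]
```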
tokenizer_config.json CHANGED
@@ -4,13 +4,9 @@
4
  "bos_token": "<|endoftext|>",
5
  "eos_token": "<|endoftext|>",
6
  "errors": "replace",
7
- "is_local": true,
8
- "max_length": 1022,
9
  "model_max_length": 1024,
10
  "pad_token": "<|endoftext|>",
11
- "stride": 0,
12
  "tokenizer_class": "GPT2Tokenizer",
13
- "truncation_side": "right",
14
- "truncation_strategy": "longest_first",
15
  "unk_token": "<|endoftext|>"
16
  }
 
4
  "bos_token": "<|endoftext|>",
5
  "eos_token": "<|endoftext|>",
6
  "errors": "replace",
7
+ "is_local": false,
 
8
  "model_max_length": 1024,
9
  "pad_token": "<|endoftext|>",
 
10
  "tokenizer_class": "GPT2Tokenizer",
 
 
11
  "unk_token": "<|endoftext|>"
12
  }
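After this change the tokenizer config keeps only the fields visible in the hunk (any keys above line 4 of the file are not shown here); a quick check that the per-call truncation settings were removed:

```python
import json

# the cleaned tokenizer_config.json after this commit, reconstructed from the hunk
new_config = json.loads("""{
  "bos_token": "<|endoftext|>",
  "eos_token": "<|endoftext|>",
  "errors": "replace",
  "is_local": false,
  "model_max_length": 1024,
  "pad_token": "<|endoftext|>",
  "tokenizer_class": "GPT2Tokenizer",
  "unk_token": "<|endoftext|>"
}""")
removed_keys = {"max_length", "stride", "truncation_side", "truncation_strategy"}
```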