Image-to-Text
Transformers
Safetensors
lana_radgen
feature-extraction
medical-ai
radiology
chest-xray
report-generation
segmentation
anatomical-attention
custom_code
Instructions to use manu02/LAnA-v4 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use manu02/LAnA-v4 with Transformers:
    # Use a pipeline as a high-level helper
    # Warning: Pipeline type "image-to-text" is no longer supported in transformers v5.
    # You must load the model directly (see below) or downgrade to v4.x with:
    # pip install "transformers<5.0.0"
    from transformers import pipeline

    pipe = pipeline("image-to-text", model="manu02/LAnA-v4", trust_remote_code=True)

    # Load model directly
    from transformers import AutoModel

    model = AutoModel.from_pretrained("manu02/LAnA-v4", trust_remote_code=True, dtype="auto")
- Notebooks
- Google Colab
- Kaggle
Upload benchmarked LANA model
- README.md +35 -53
- model.safetensors +1 -1
- run_summary.json +15 -132
README.md
CHANGED
@@ -98,43 +98,34 @@ print(report)
 
 Frontal-only evaluation using `PA/AP` studies only.
 
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-| BLEU-4 | `0.0405` | `0.0449` | `0.0484` | `0.0487` | `0.0573` | `0.0508` |
-| METEOR | `0.2207` | `0.2347` | `0.2466` | `0.2240` | `0.2310` | `0.2138` |
-| RadGraph F1 | `0.1010` | `0.1000` | `0.1119` | `0.1181` | `0.1046` | `0.0900` |
-| RadGraph entity F1 | `0.1517` | `0.1577` | `0.1713` | `0.1739` | `0.1584` | `0.1567` |
-| RadGraph relation F1 | `0.1347` | `0.1413` | `0.1549` | `0.1628` | `0.1405` | `0.1410` |
-| CheXpert F1 14-micro | `0.1651` | `0.1442` | `0.1907` | `0.1365` | `0.2921` | `0.2229` |
-| CheXpert F1 5-micro | `0.2152` | `0.1716` | `0.2415` | `0.2455` | `0.2394` | `0.0566` |
-| CheXpert F1 14-macro | `0.1047` | `0.0700` | `0.1039` | `0.0381` | `0.1326` | `0.0724` |
-| CheXpert F1 5-macro | `0.1611` | `0.1112` | `0.1578` | `0.0952` | `0.1636` | `0.0351` |
+### Current Checkpoint Results
+
+| Metric | Value |
+| --- | --- |
+| Number of studies | TBD |
+| RadGraph F1 | TBD |
+| RadGraph entity F1 | TBD |
+| RadGraph relation F1 | TBD |
+| CheXpert F1 14-micro | TBD |
+| CheXpert F1 5-micro | TBD |
+| CheXpert F1 14-macro | TBD |
+| CheXpert F1 5-macro | TBD |
+
+### Final Completed Training Results
+
+The final table will be populated when the planned training run is completed. Until then, final-report metrics remain `TBD`.
+
+| Metric | Value |
+| --- | --- |
+| Number of studies | TBD |
+| RadGraph F1 | TBD |
+| RadGraph entity F1 | TBD |
+| RadGraph relation F1 | TBD |
+| CheXpert F1 14-micro | TBD |
+| CheXpert F1 5-micro | TBD |
+| CheXpert F1 14-macro | TBD |
+| CheXpert F1 5-macro | TBD |
+
 
 ## Data
 
@@ -147,15 +138,6 @@ These comparison tables are refreshed across the full LAnA collection whenever a
 
 - Medical report metrics implemented in the repository include RadGraph F1 and CheXpert F1 (`14-micro`, `5-micro`, `14-macro`, `5-macro`).
 
-## Experiment Model Descriptions
-
-- `LAnA-MIMIC-CHEXPERT`: This variant was trained on a combined dataset of `CheXpert` and `MIMIC-CXR` using LoRA fine-tuning with the `AdamW` optimizer.
-- `LAnA-MIMIC`: This model was trained on the `MIMIC-CXR (findings-only)` dataset using LoRA fine-tuning with the `AdamW` optimizer.
-- `LAnA`: This model was trained on the `MIMIC-CXR (findings-only)` dataset using full-model optimization with `AdamW` instead of LoRA.
-- `LAnA-v2`: This version keeps the same training setup as `LAnA`, but increases the effective global batch size from `16` to `128`.
-- `LAnA-v3`: This version keeps the same training setup as `LAnA`, including the effective global batch size of `16`, but changes how EOS is handled so training and generation follow the same behavior. The model no longer uses the EOS token during training, and generation remained greedy without stopping when an EOS token was produced. In the previous setup, decoding was also greedy, stopped at EOS, and used a maximum of `128` new tokens.
-- `LAnA-v4`: This version keeps the same decoding behavior as `LAnA-v3`, but increases the effective global batch size from `16` to `128`.
-
 ## Training Snapshot
 
 - Run: `LAnA-v4`
@@ -171,24 +153,24 @@ These comparison tables are refreshed across the full LAnA collection whenever a
 - Scheduler: `cosine`
 - Warmup steps: `165`
 - Weight decay: `0.01`
-- Steps completed: `
+- Steps completed: `3289`
 - Planned total steps: `3297`
-- Images seen: `
-- Total training time: `
+- Images seen: `421707`
+- Total training time: `8.0982` hours
 - Hardware: `NVIDIA GeForce RTX 5070`
-- Final train loss: `1.
-- Validation loss: `1.
+- Final train loss: `1.9641`
+- Validation loss: `1.6446`
 
 ## Status
 
 - Project status: `Training in progress`
 - Release status: `Research preview checkpoint`
 - Current checkpoint status: `Not final`
-- Training completion toward planned run: `
+- Training completion toward planned run: `100.00%` (`3` / `3` epochs)
 - Current published metrics are intermediate and will change as training continues.
 
 ## Notes
 
 - Set `HF_TOKEN` with permission to access the DINOv3 repositories required by this model before downloading or running inference.
 - `segmenters/` contains the lung and heart segmentation checkpoints used to build anatomical attention masks.
-- `evaluations/mimic_test_metrics.json` contains the latest saved MIMIC test metrics.
+- `evaluations/mimic_test_metrics.json` contains the latest saved MIMIC test metrics.
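The updated Training Snapshot reports steps, images seen, and wall-clock time, but not the rates they imply. The following is a minimal sketch that derives them from the README values only; the variable names are illustrative and do not come from the repository's code.

```python
# Values copied from the updated Training Snapshot in README.md.
steps_completed = 3289
planned_total_steps = 3297
images_seen = 421_707
training_hours = 8.0982

# Derived quantities (not reported directly in the README).
images_per_second = images_seen / (training_hours * 3600)
images_per_step = images_seen / steps_completed
step_completion = steps_completed / planned_total_steps

print(f"images/second: {images_per_second:.2f}")
print(f"images/step:   {images_per_step:.1f}")
print(f"steps done:    {step_completion:.2%}")
```

The images-per-step figure lands close to the effective global batch size of `128` mentioned in the removed model descriptions. The step ratio comes out slightly below 100% even though the README reports `3` / `3` epochs, so the two completion measures are probably computed differently.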
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:cd099dd6604efe4ed12b2aea6b1a0e80cf53a4e0c139bc4a13d77e4a19d915ae
 size 1152546464
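The `model.safetensors` entry in this diff is a Git LFS pointer, not the weights themselves: three `key value` lines giving the spec version, the object hash, and the size in bytes. As a rough sketch, such a pointer can be read with plain string handling; `parse_lfs_pointer` is a hypothetical helper written for this illustration, not part of any library.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a Git LFS pointer into a dict entry."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


# Pointer contents copied from the new side of the diff.
pointer_text = """\
version https://git-lfs.github.com/spec/v1
oid sha256:cd099dd6604efe4ed12b2aea6b1a0e80cf53a4e0c139bc4a13d77e4a19d915ae
size 1152546464
"""

pointer = parse_lfs_pointer(pointer_text)
algorithm, _, digest = pointer["oid"].partition(":")
size_gib = int(pointer["size"]) / 2**30

print(f"hash algorithm: {algorithm}")
print(f"digest length:  {len(digest)} hex characters")
print(f"file size:      {size_gib:.2f} GiB")
```

The size is the same on both sides of the diff, so only the hash shows that the weights changed.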
run_summary.json
CHANGED
@@ -1,18 +1,18 @@
 {
   "method": "full_adamw",
   "run_name": "LAnA-v4",
-  "steps":
-  "epochs_completed":
-  "epoch_index":
+  "steps": 3289,
+  "epochs_completed": 3,
+  "epoch_index": 3,
   "target_epochs": 3,
-  "progress_epochs":
-  "training_completion_percent":
-  "elapsed_seconds":
-  "images_seen":
-  "train_loss_last": 1.
-  "train_loss_mean": 2.
-  "val_loss": 1.
-  "images_per_second": 14.
+  "progress_epochs": 4.0,
+  "training_completion_percent": 100.0,
+  "elapsed_seconds": 29153.5889147,
+  "images_seen": 421707,
+  "train_loss_last": 1.9640858173370361,
+  "train_loss_mean": 2.007043061655561,
+  "val_loss": 1.6445714235305786,
+  "images_per_second": 14.465011537134089,
   "trainable_params": 125522688,
   "vision_model_name": "facebook/dinov3-vits16-pretrain-lvd1689m",
   "text_model_name": "gpt2",
@@ -37,129 +37,12 @@
   "seed": 42,
   "resume_supported": true,
   "checkpoint_every_n_steps": 1000,
-  "cumulative_loss_sum":
-  "cumulative_loss_count":
-  "completed":
+  "cumulative_loss_sum": 846384.1084015816,
+  "cumulative_loss_count": 421707,
+  "completed": true,
   "target_duration_seconds": 3600,
   "target_duration_mode": "per_invocation",
   "repo_id": "manu02/LAnA-v4",
   "train_datasets": "MIMIC-CXR (findings-only)",
-  "validation_datasets": "MIMIC-CXR (findings-only)"
-  "repo_url": "https://huggingface.co/manu02/LAnA-v4",
-  "latest_evaluation": {
-    "split": "test",
-    "subset": "all frontal studies",
-    "dataset": "mimic-cxr",
-    "view_filter": "frontal-only (PA/AP)",
-    "num_examples": 3041,
-    "bleu_1": 0.22466909304042493,
-    "bleu_4": 0.043919975602752334,
-    "meteor": 0.20049977335710334,
-    "rouge_l": 0.16756058992939854,
-    "chexpert_f1_14_micro": 0.22279554040357505,
-    "chexpert_f1_5_micro": 0.05494209534069486,
-    "chexpert_f1_14_macro": 0.07355641991775376,
-    "chexpert_f1_5_macro": 0.034170854271356785,
-    "chexpert_f1_micro": 0.22279554040357505,
-    "chexpert_f1_macro": 0.07355641991775376,
-    "chexpert_per_label_f1": {
-      "Enlarged Cardiomediastinum": 0.0,
-      "Cardiomegaly": 0.1708542713567839,
-      "Lung Opacity": 0.0,
-      "Lung Lesion": 0.0,
-      "Edema": 0.0,
-      "Consolidation": 0.0,
-      "Pneumonia": 0.0,
-      "Atelectasis": 0.0,
-      "Pneumothorax": 0.0,
-      "Pleural Effusion": 0.0,
-      "Pleural Other": 0.0,
-      "Fracture": 0.0,
-      "Support Devices": 0.5644329896907216,
-      "No Finding": 0.29450261780104714
-    },
-    "radgraph_f1": 0.0791523357254355,
-    "radgraph_f1_entity": 0.1443115199444943,
-    "radgraph_f1_relation": 0.12993022073120553,
-    "radgraph_available": true,
-    "radgraph_error": null
-  },
-  "latest_evaluations": {
-    "all_test": {
-      "split": "test",
-      "subset": "all frontal studies",
-      "dataset": "mimic-cxr",
-      "view_filter": "frontal-only (PA/AP)",
-      "num_examples": 3041,
-      "bleu_1": 0.22466909304042493,
-      "bleu_4": 0.043919975602752334,
-      "meteor": 0.20049977335710334,
-      "rouge_l": 0.16756058992939854,
-      "chexpert_f1_14_micro": 0.22279554040357505,
-      "chexpert_f1_5_micro": 0.05494209534069486,
-      "chexpert_f1_14_macro": 0.07355641991775376,
-      "chexpert_f1_5_macro": 0.034170854271356785,
-      "chexpert_f1_micro": 0.22279554040357505,
-      "chexpert_f1_macro": 0.07355641991775376,
-      "chexpert_per_label_f1": {
-        "Enlarged Cardiomediastinum": 0.0,
-        "Cardiomegaly": 0.1708542713567839,
-        "Lung Opacity": 0.0,
-        "Lung Lesion": 0.0,
-        "Edema": 0.0,
-        "Consolidation": 0.0,
-        "Pneumonia": 0.0,
-        "Atelectasis": 0.0,
-        "Pneumothorax": 0.0,
-        "Pleural Effusion": 0.0,
-        "Pleural Other": 0.0,
-        "Fracture": 0.0,
-        "Support Devices": 0.5644329896907216,
-        "No Finding": 0.29450261780104714
-      },
-      "radgraph_f1": 0.0791523357254355,
-      "radgraph_f1_entity": 0.1443115199444943,
-      "radgraph_f1_relation": 0.12993022073120553,
-      "radgraph_available": true,
-      "radgraph_error": null
-    },
-    "findings_only_test": {
-      "split": "test",
-      "subset": "findings-only frontal studies",
-      "dataset": "mimic-cxr",
-      "view_filter": "frontal-only (PA/AP), structured Findings section only",
-      "num_examples": 2210,
-      "bleu_1": 0.23428333207003713,
-      "bleu_4": 0.05076939437931996,
-      "meteor": 0.21379406362615114,
-      "rouge_l": 0.17515008816614538,
-      "chexpert_f1_14_micro": 0.22289738986327856,
-      "chexpert_f1_5_micro": 0.056563951034191644,
-      "chexpert_f1_14_macro": 0.07235490647135043,
-      "chexpert_f1_5_macro": 0.03507853403141361,
-      "chexpert_f1_micro": 0.22289738986327856,
-      "chexpert_f1_macro": 0.07235490647135043,
-      "chexpert_per_label_f1": {
-        "Enlarged Cardiomediastinum": 0.0,
-        "Cardiomegaly": 0.17539267015706805,
-        "Lung Opacity": 0.0,
-        "Lung Lesion": 0.0,
-        "Edema": 0.0,
-        "Consolidation": 0.0,
-        "Pneumonia": 0.0,
-        "Atelectasis": 0.0,
-        "Pneumothorax": 0.0,
-        "Pleural Effusion": 0.0,
-        "Pleural Other": 0.0,
-        "Fracture": 0.0,
-        "Support Devices": 0.48633093525179855,
-        "No Finding": 0.35124508519003933
-      },
-      "radgraph_f1": 0.09000123209087225,
-      "radgraph_f1_entity": 0.15665513076129836,
-      "radgraph_f1_relation": 0.14101742529549965,
-      "radgraph_available": true,
-      "radgraph_error": null
-    }
-  }
+  "validation_datasets": "MIMIC-CXR (findings-only)"
 }
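The `latest_evaluation` block removed from `run_summary.json` lists both per-label CheXpert F1 scores and the `14-macro` / `5-macro` aggregates. The sketch below shows how the macro figures relate to the per-label ones: a macro F1 is the unweighted mean over labels, so labels scoring `0.0` pull it down sharply. The five-label subset used here (Atelectasis, Cardiomegaly, Consolidation, Edema, Pleural Effusion) is an assumption based on the common CheXpert convention. The diff does not name the labels, although the numbers are consistent with that choice.

```python
from statistics import mean

# Per-label CheXpert F1 copied from the removed "latest_evaluation" block
# (all frontal test studies).
per_label_f1 = {
    "Enlarged Cardiomediastinum": 0.0,
    "Cardiomegaly": 0.1708542713567839,
    "Lung Opacity": 0.0,
    "Lung Lesion": 0.0,
    "Edema": 0.0,
    "Consolidation": 0.0,
    "Pneumonia": 0.0,
    "Atelectasis": 0.0,
    "Pneumothorax": 0.0,
    "Pleural Effusion": 0.0,
    "Pleural Other": 0.0,
    "Fracture": 0.0,
    "Support Devices": 0.5644329896907216,
    "No Finding": 0.29450261780104714,
}

# Assumption: the 5-label subset follows the usual CheXpert convention.
FIVE_LABELS = [
    "Atelectasis",
    "Cardiomegaly",
    "Consolidation",
    "Edema",
    "Pleural Effusion",
]

# Macro F1 is the unweighted mean of the per-label scores.
macro_14 = mean(per_label_f1.values())
macro_5 = mean(per_label_f1[label] for label in FIVE_LABELS)

print(f"CheXpert F1 14-macro: {macro_14:.4f}")
print(f"CheXpert F1 5-macro:  {macro_5:.4f}")
```

Only three of the fourteen labels have a non-zero score in this block, which is why the macro averages sit well below the `14-micro` value reported alongside them.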