Upload README.md
README.md
CHANGED
@@ -6,9 +6,9 @@
 
 ## Model Overview
 
-**CMRCLIP** encodes CMR images and clinical reports into a shared embedding space for retrieval, similarity scoring, and downstream tasks. It uses:
+**CMRCLIP** encodes CMR(Cardiac Magnetic Resonance) images and clinical reports into a shared embedding space for retrieval, similarity scoring, and downstream tasks. It uses:
 
-* A pretrained text encoder (`
+* A pretrained text encoder (`Bio_ClinicalBERT`)
 * A video encoder built on Vision Transformers (`SpaceTimeTransformer`)
 * A lightweight projection head to map both modalities into a common vector space
 
@@ -64,26 +64,21 @@ model.eval()
 
 ```json
 {
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-      "projection": "minimal",
-      "projection_dim": 512,
-      "load_checkpoint": ""
-    }
-  }
+  "video_params": {
+    "model": "SpaceTimeTransformer",
+    "arch_config": "base_patch16_224",
+    "num_frames": 64,
+    "pretrained": true,
+    "time_init": "zeros"
+  },
+  "text_params": {
+    "model": "emilyalsentzer/Bio_ClinicalBERT",
+    "pretrained": true,
+    "input": "text"
+  },
+  "projection": "minimal",
+  "projection_dim": 512,
+  "load_checkpoint": ""
 }
 
 ```
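Review note: the updated config in this diff is valid JSON with `projection_dim` set to 512, and the overview describes mapping both modalities into one vector space for retrieval. The sketch below is one way to sanity-check both points together. It parses the new config and ranks stand-in report vectors against a stand-in video vector by cosine similarity. The helper names (`load_config`, `cosine_similarity`, `rank_reports`), the validation rules, and the toy vectors are assumptions made for illustration. They are not part of the CMRCLIP code, and no encoder is actually run.

```python
import json
import math

# The post-change config from this diff, reproduced as a string.
CONFIG_JSON = """
{
  "video_params": {
    "model": "SpaceTimeTransformer",
    "arch_config": "base_patch16_224",
    "num_frames": 64,
    "pretrained": true,
    "time_init": "zeros"
  },
  "text_params": {
    "model": "emilyalsentzer/Bio_ClinicalBERT",
    "pretrained": true,
    "input": "text"
  },
  "projection": "minimal",
  "projection_dim": 512,
  "load_checkpoint": ""
}
"""

# Key names come from the diff; treating all of them as required is an assumption.
REQUIRED_KEYS = {
    "video_params",
    "text_params",
    "projection",
    "projection_dim",
    "load_checkpoint",
}


def load_config(text):
    """Parse the JSON text and check the expected top-level keys (hypothetical helper)."""
    config = json.loads(text)
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise KeyError(f"missing config keys: {sorted(missing)}")
    dim = config["projection_dim"]
    if not isinstance(dim, int) or dim <= 0:
        raise ValueError("projection_dim must be a positive integer")
    return config


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def rank_reports(video_embedding, report_embeddings):
    """Return report indices ordered from most to least similar to the video."""
    scores = [cosine_similarity(video_embedding, r) for r in report_embeddings]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


config = load_config(CONFIG_JSON)
dim = config["projection_dim"]

# Stand-in vectors only; real embeddings would come from the two encoders.
video = [1.0] + [0.0] * (dim - 1)
reports = [
    [0.0, 1.0] + [0.0] * (dim - 2),  # orthogonal to the video vector
    [2.0] + [0.0] * (dim - 1),       # same direction, different length
    [-1.0] + [0.0] * (dim - 1),      # opposite direction
]
ranking = rank_reports(video, reports)
```

Cosine similarity ignores vector length, so the report pointing in the same direction as the video vector ranks first even though it is twice as long.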