---
license: mit
---

# CMRCLIP

> A CMR-report contrastive model combining a Vision Transformer video encoder with a pretrained clinical text encoder.

---

## Model Overview

**CMRCLIP** encodes cardiac magnetic resonance (CMR) images and their clinical reports into a shared embedding space for retrieval, similarity scoring, and downstream tasks. It combines:

* A pretrained clinical text encoder (`emilyalsentzer/Bio_ClinicalBERT`)
* A video encoder built on Vision Transformers (`SpaceTimeTransformer`)
* A lightweight projection head that maps both modalities into a common 512-dimensional vector space

This repository contains only the trained weights and the minimal configuration needed to load and run the model.

---
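
Conceptually, the shared space works the standard CLIP way: each modality is projected to a fixed-size vector (512-dimensional here, per `projection_dim`), L2-normalized, and compared by cosine similarity. A minimal sketch using random stand-in tensors in place of the real encoder outputs:

```python
import torch
import torch.nn.functional as F

# Stand-ins for encoder outputs; the real vectors would come from
# CMRCLIP's video and text branches after the projection head.
video_emb = torch.randn(4, 512)   # batch of 4 CMR clip embeddings
text_emb = torch.randn(4, 512)    # batch of 4 report embeddings

# L2-normalize so dot products become cosine similarities
video_emb = F.normalize(video_emb, dim=-1)
text_emb = F.normalize(text_emb, dim=-1)

# Pairwise similarity matrix: sim[i, j] = cos(video_i, text_j)
sim = video_emb @ text_emb.T      # shape (4, 4)

# Retrieval: index of the best-matching report for each clip
best_report = sim.argmax(dim=1)
```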

## Files

* `config.json` — model hyperparameters and architecture settings
* `pytorch_model.bin` — saved PyTorch `state_dict` of the trained model

---
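
`pytorch_model.bin` is a plain PyTorch `state_dict`: an ordered mapping from parameter names to tensors, which `load_state_dict` consumes. A quick sketch of that round-trip with a small stand-in module (the real file holds CMRCLIP's parameters, not this toy's):

```python
import io
import torch
import torch.nn as nn

# Any nn.Module's state_dict serializes the same way pytorch_model.bin does
module = nn.Linear(8, 4)
buf = io.BytesIO()
torch.save(module.state_dict(), buf)
buf.seek(0)

# Loading gives back name -> tensor pairs
state_dict = torch.load(buf, map_location="cpu")
print(list(state_dict.keys()))   # ['weight', 'bias']
```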

## Usage Example

Below is a minimal example of downloading and loading the model via the Hugging Face Hub:

```python
import json

import torch
from huggingface_hub import hf_hub_download

from model.cmrclip import CMRCLIP

# 1. Download artifacts
def _download_file(filename):
    return hf_hub_download(
        repo_id="makiyeah/CMRCLIP",
        filename=filename,
    )

config_file = _download_file("config.json")
weights_file = _download_file("pytorch_model.bin")

# 2. Load config & build model
with open(config_file, "r") as f:
    cfg = json.load(f)

# The constructor arguments are nested under "arch" -> "args" in config.json
args = cfg["arch"]["args"]

model = CMRCLIP(
    video_params=args["video_params"],
    text_params=args["text_params"],
    projection_dim=args.get("projection_dim", 512),
    load_checkpoint=args.get("load_checkpoint"),
    projection=args.get("projection", "minimal"),
)

# 3. Load weights
state_dict = torch.load(weights_file, map_location="cpu")
model.load_state_dict(state_dict)
model.eval()
```

---

## Configuration (`config.json`)

```json
{
  "arch": {
    "type": "CMRCLIP",
    "args": {
      "video_params": {
        "model": "SpaceTimeTransformer",
        "arch_config": "base_patch16_224",
        "num_frames": 64,
        "pretrained": true,
        "time_init": "zeros"
      },
      "text_params": {
        "model": "emilyalsentzer/Bio_ClinicalBERT",
        "pretrained": true,
        "input": "text"
      },
      "projection": "minimal",
      "projection_dim": 512,
      "load_checkpoint": ""
    }
  }
}
```

---
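
Note that the constructor arguments are nested under `arch` → `args`, so a loader has to index through that nesting rather than read keys like `video_params` off the top level. A small self-contained sketch of reading them out (config reproduced inline for illustration):

```python
import json

# Inline copy of config.json, for illustration only
config_text = """
{
  "arch": {
    "type": "CMRCLIP",
    "args": {
      "video_params": {
        "model": "SpaceTimeTransformer",
        "arch_config": "base_patch16_224",
        "num_frames": 64,
        "pretrained": true,
        "time_init": "zeros"
      },
      "text_params": {
        "model": "emilyalsentzer/Bio_ClinicalBERT",
        "pretrained": true,
        "input": "text"
      },
      "projection": "minimal",
      "projection_dim": 512,
      "load_checkpoint": ""
    }
  }
}
"""

cfg = json.loads(config_text)
args = cfg["arch"]["args"]        # constructor kwargs live here

video_params = args["video_params"]
text_params = args["text_params"]
print(video_params["model"])      # SpaceTimeTransformer
print(args["projection_dim"])     # 512
```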

## License

This model is released under the **MIT** license. See [LICENSE](LICENSE) for details.

---

## Citation

If you use this model in your work, please cite:

```bibtex
@misc{cmrclip2025,
  title={CMR-CLIP: Contrastive Language Image Pretraining for a Cardiac Magnetic Resonance Image Embedding with Zero-shot Capabilities},
  year={2025},
}
```

---