cwangrun committed · Commit 9e4b45d · verified · Parent: e8e1834

Update README.md

Files changed (1): README.md (+100, -1)
# CheXficient

CheXficient is a vision-language foundation model for efficient and robust chest X-ray understanding. It enables joint image-text representation learning and supports prompt-based zero-shot classification.

This repository provides a Hugging Face-compatible implementation for seamless integration into research workflows.

------------------------------------------------------------------------
## Model Overview

- Architecture: Vision-Language dual encoder
- Input: Chest X-ray image + text prompts
- Output: Image-text similarity logits and embeddings
- Framework: PyTorch + Hugging Face Transformers
- Intended Use: Research in medical AI and multimodal learning
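A dual encoder scores an image against each text prompt by comparing their embeddings. As an illustration only (toy vectors, with cosine similarity standing in for the model's learned similarity; none of these names are the repository's API), the scoring step looks like:

``` python
import math

def cosine_similarity(a, b):
    # Normalized dot product between two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy embeddings: one image vector, one text vector per prompt.
image_embed = [0.9, 0.1, 0.3]
text_embeds = {
    "Pneumonia": [0.8, 0.2, 0.4],
    "no Pneumonia": [-0.5, 0.7, 0.1],
}

# Image-text similarity logits, one per prompt.
logits = {label: cosine_similarity(image_embed, vec)
          for label, vec in text_embeds.items()}
prediction = max(logits, key=logits.get)
print(prediction)  # "Pneumonia": its toy vector is closest to the image vector
```

In the real model both embeddings come from the trained encoders, and the similarities would typically be computed in a single batched matrix product rather than prompt by prompt.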
------------------------------------------------------------------------
## Installation

``` bash
pip install torch torchvision transformers pillow
```

------------------------------------------------------------------------
## Load the Model

``` python
import torch
from PIL import Image
from transformers import AutoModel, AutoTokenizer, AutoImageProcessor

...

tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
image_processor = AutoImageProcessor.from_pretrained(repo_id, trust_remote_code=True)

model.eval()
```

------------------------------------------------------------------------
## Zero-Shot Classification Example

``` python
image = Image.open("./CXR/images/5AF3BB6C1BCC83C.png").convert("RGB")
text = ["Pneumonia", "no Pneumonia"]

...

with torch.no_grad():
    outputs = model(
        pixel_values=image_inputs["pixel_values"],
        text_tokens=text_inputs,
    )

print(outputs)
```
Optional probability conversion:

``` python
import torch.nn.functional as F

logits = outputs["logits"]
probs = F.softmax(logits, dim=-1)
print(probs)
```
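To turn those probabilities into a single predicted label, pair them with the prompt list and take the argmax. A small pure-Python sketch (the `prob_list` values are made up; in practice they would come from `probs.squeeze().tolist()`):

``` python
# Hypothetical probabilities for ["Pneumonia", "no Pneumonia"],
# e.g. obtained via probs.squeeze().tolist() after the softmax above.
text = ["Pneumonia", "no Pneumonia"]
prob_list = [0.87, 0.13]

# Index of the highest-probability prompt.
best_idx = max(range(len(prob_list)), key=prob_list.__getitem__)
print(f"{text[best_idx]}: {prob_list[best_idx]:.2f}")  # Pneumonia: 0.87
```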

------------------------------------------------------------------------

## Model Interface

``` python
model(
    pixel_values=Tensor,  # preprocessed image tensor from the image processor
    text_tokens=dict,     # tokenizer output for the text prompts
)
```

Returns:

- logits
- image_embeds
- text_embeds
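Because `text_embeds` is exposed, CLIP-style prompt ensembling is one possible use: encode several phrasings of the same class and average their embeddings before scoring. A sketch with made-up vectors (the real embeddings would come from the model, and the prompt wordings here are illustrative):

``` python
def average_embeddings(vectors):
    # Element-wise mean of equal-length vectors.
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

# Made-up text embeddings for several phrasings of the same class.
prompt_embeds = {
    "Pneumonia": [0.8, 0.2, 0.4],
    "chest X-ray showing pneumonia": [0.7, 0.3, 0.5],
    "findings consistent with pneumonia": [0.9, 0.1, 0.3],
}
class_embed = average_embeddings(list(prompt_embeds.values()))
print([round(x, 6) for x in class_embed])  # [0.8, 0.2, 0.4]
```

The averaged vector can then be compared against `image_embeds` the same way a single prompt embedding would be.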
------------------------------------------------------------------------

## Intended Use

- Zero-shot chest X-ray classification
- Vision-language representation learning
- Prompt-based disease detection
- Medical AI research

------------------------------------------------------------------------
## Limitations

- Research use only
- Not approved for clinical deployment
- Performance may vary across institutions and demographics
- `trust_remote_code=True` is required

------------------------------------------------------------------------
## Citation

``` bibtex
@article{chexficient2024,
  title={CheXficient: Efficient Vision-Language Learning for Chest X-ray Understanding},
  author={...},
  journal={...},
  year={2024}
}
```