File size: 2,135 Bytes
69f3d91
 
 
 
 
 
 
 
 
 
 
 
 
266e106
 
69f3d91
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
e268ed0
 
 
 
 
69f3d91
 
 
 
266e106
 
 
 
 
 
 
 
 
69f3d91
266e106
 
69f3d91
266e106
 
69f3d91
266e106
42a4e86
 
69f3d91
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
---
language: en
tags:
- vision
- vit
- xray
- chest-xray
- classification
license: mit
pipeline_tag: image-classification
author: Om Kumar (@itsomk)
---

# ViT X-ray Multi-label (vit-xray-v1)

## Model Description
This model is a fine-tuned **Vision Transformer** ([google/vit-base-patch16-224-in21k](https://huggingface.co/google/vit-base-patch16-224-in21k)) for **multi-label classification of chest X-rays**.  
It predicts the presence of multiple findings such as:
- **Nodule**
- **Infiltration**
- **Effusion**
- **Atelectasis**

**Author:** Om Kumar (Hugging Face: [@itsomk](https://huggingface.co/itsomk))  

The model is designed for **research and educational purposes only** and should not be used as a substitute for clinical diagnosis.

---

## Intended Use
- **Research** in medical imaging and computer vision  
- **Educational purposes** for understanding X-ray image classification  
- **Baseline model** for further fine-tuning or domain adaptation  

⚠️ **Not intended for clinical use**. Predictions should not guide medical decisions.

---

## Training Data
- Dataset: Chest X-ray images (publicly available datasets, e.g., NIH ChestX-ray14, etc.)  
- Images were preprocessed (resized to 224x224, normalized).  
- Labels are **multi-label**, meaning an X-ray can contain more than one finding.

---

## Model Performance
- Optimized for detecting **common thoracic abnormalities**.  
- Evaluation metrics: AUC .
- Nodule AUC: 0.696
- Infiltration AUC: 0.684
- Effusion AUC: 0.843
- Atelectasis AUC: 0.762

---

## Quick Usage

```python
from transformers import AutoImageProcessor, AutoModelForImageClassification
import torch
from PIL import Image

MODEL = "itsomk/vit-xray-v1"
processor = AutoImageProcessor.from_pretrained(MODEL)
model = AutoModelForImageClassification.from_pretrained(MODEL)

img = Image.open("path/to/xray.jpg").convert("RGB")
inputs = processor(images=img, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

probs = torch.sigmoid(logits).squeeze().tolist()
results = {model.config.id2label[i]: float(probs[i]) for i in range(len(probs))}
print(results)