---
license: cc-by-nc-4.0
base_model:
- microsoft/beit-base-patch16-384
pipeline_tag: zero-shot-image-classification
tags:
- medical
---
# ECG‑CLIP‑BEiT‑Base‑384
A CLIP‑based model fine‑tuned on paired ECG images and echo reports to generate 512‑dim embeddings and perform zero‑shot classification of echocardiographic phenotypes.
Read more here: https://github.com/CarDS-Yale/target-ai-shared
## Description
This model uses:
- **Vision encoder**: `microsoft/beit-base-patch16-384`
- **Text encoder**: custom ByteLevel BPE tokenizer (vocab_size=16 030)
- **Projection dimension**: 512
- **Trained on**: paired ECG images and corresponding echo report text
- **Tasks**:
  - Produce image embeddings for ECG images
  - Zero‑shot classification of phenotypes (e.g., left ventricular (LV) dysfunction, aortic stenosis (AS)) via cosine similarity to precomputed centroids
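The centroid-based scoring described above can be sketched roughly as follows. This is a minimal illustration and not the repository's implementation: the helper names `cosine_similarity` and `zero_shot_scores` are hypothetical, the toy vectors are 3-dimensional instead of 512-dimensional, and the assumption is that each phenotype has one case centroid and one control centroid. The dictionary keys mirror the column names of the demo script's output CSV.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def zero_shot_scores(embedding, case_centroid, control_centroid):
    """Score one image embedding against a phenotype's case/control centroids.

    Hypothetical helper: assumes one case and one control centroid per label.
    """
    sim_case = cosine_similarity(embedding, case_centroid)
    sim_control = cosine_similarity(embedding, control_centroid)
    return {
        "cosine_similarity_to_case": sim_case,
        "delta_cosine_similarity_case_minus_control": sim_case - sim_control,
    }


# Toy 3-dim vectors for illustration only; real embeddings are 512-dim.
case_centroid = [1.0, 0.0, 0.0]
control_centroid = [0.0, 1.0, 0.0]
embedding = [0.9, 0.1, 0.0]  # lies closer to the case centroid

scores = zero_shot_scores(embedding, case_centroid, control_centroid)
print(scores)
```

A positive delta means the embedding sits closer to the case centroid than to the control centroid. Any decision threshold on that delta is left to the user and is not specified here.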
## Setup & Usage
**Clone the repository and enter it**:
```bash
git clone https://github.com/CarDS-Yale/target-ai-shared.git
cd target-ai-shared
```
**Create the inference environment**:
```bash
conda env create -f ./environment_files/ecg_image_vit_light.yml
```
**Run inference on a set of images**:
You can run quick inference on a folder of ECG images using the model weights hosted on Hugging Face.
The following arguments can be adjusted in the command below:
- `hf_repo`: the Hugging Face repo of the published model and weights
- `image_dir`: a folder containing ECG images in one of the 4 standard layouts
- `centroid_csv`: the CSV containing reference embeddings/centroids for cases and controls. We provide examples from our training set and from EchoNext (Elias, P., & Finer, J. (2025). EchoNext: A Dataset for Detecting Echocardiogram-Confirmed Structural Heart Disease from ECGs (version 1.1.0). PhysioNet. RRID:SCR_007345. https://doi.org/10.13026/3ykd-bf14)
  - e.g. `./demo/reference_embeddings/echonext_reference_centroids.csv` or `./demo/reference_embeddings/ynhhs_reference_centroids.csv`
- `output_csv`: the path where the output will be stored
```bash
conda activate ecg_image_vit_light
python ./demo/zero_shot_from_reference_embedding.py \
--hf_repo "CarDSLab/ecg-clip-beit-base-384" \
--image_dir "./demo/online_ecg_images_credit_to_liftl" \
--centroid_csv "./demo/reference_embeddings/ynhhs_reference_centroids.csv" \
--output_csv "./demo/output/output.csv" \
--batch_size 32 \
--device cuda
```
**Output** (`output_csv`): one row per image×label with columns:
```
image_id,label,cosine_similarity_to_case,delta_cosine_similarity_case_minus_control,embedding
```
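As a rough sketch of how the output might be consumed, the snippet below ranks images within each label by `delta_cosine_similarity_case_minus_control`. The helper `rank_by_delta` is hypothetical, and the sample rows (image IDs, label names, and scores) are made up for illustration. The `embedding` column is left out of the sample because its serialization is not described here; `csv.DictReader` ignores columns the code does not read.

```python
import csv
import io
from collections import defaultdict

# Made-up rows for illustration only; these are not real model outputs.
sample = """image_id,label,cosine_similarity_to_case,delta_cosine_similarity_case_minus_control
ecg_001,label_a,0.41,0.12
ecg_002,label_a,0.18,-0.07
ecg_001,label_b,0.22,-0.03
ecg_002,label_b,0.35,0.09
"""


def rank_by_delta(handle):
    """Group rows by label and sort images by case-minus-control delta, highest first."""
    ranked = defaultdict(list)
    for row in csv.DictReader(handle):
        delta = float(row["delta_cosine_similarity_case_minus_control"])
        ranked[row["label"]].append((row["image_id"], delta))
    for label in ranked:
        ranked[label].sort(key=lambda pair: pair[1], reverse=True)
    return dict(ranked)


ranking = rank_by_delta(io.StringIO(sample))
for label, images in ranking.items():
    print(label, images)
```

To apply this to a real run, pass an open file handle instead of the in-memory sample, e.g. `with open("./demo/output/output.csv", newline="") as f: rank_by_delta(f)`.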
## Citation
```
@article{oikonomou2025targetecg,
title = {TARGET-AI: a foundational approach for the targeted deployment of artificial intelligence electrocardiography in the electronic health record},
author = {Oikonomou, Evangelos K. and Khera, Rohan and others},
journal = {medRxiv},
year = {2025},
doi = {10.1101/2025.08.25.25334266},
note = {Preprint}
}
```