halituyanik committed on
Commit 44c580c · verified · 1 Parent(s): 6331f36

Update README.md

Files changed (1): README.md (+5 −5)

README.md CHANGED
@@ -22,19 +22,19 @@ model_name: "RetinaGen-VLM"
 # 👁️ RetinaGen-VLM
 **Vision-Language Alignment for Automated Retinopathy Grading**
 
-### 📝 Project Overview
+### Project Overview
 RetinaGen-VLM is a multimodal deep learning framework designed to bridge the gap between fundus imaging and clinical reporting. By leveraging a **VQ-VAE** based discrete latent space and an autoregressive **Transformer**, the model identifies diabetic retinopathy stages while generating descriptive medical narratives.
 
 ![RetinaGen-VLM Architecture](architecture.png)
 
-### 🔬 Key Features
+### Key Features
 - **Multimodal Reasoning:** Aligns visual features directly with medical terminology.
 - **Synthetic Data Augmentation:** Utilizes generative modeling to balance rare pathological cases such as PDR.
 - **Automated Grading:** Provides a standardized 5-point scale diagnostic output (Stages 0-4).
 
-### 🛠️ Methodology
+### Methodology
 The core architecture focuses on mapping high-resolution fundus images into a quantized codebook (Zq), followed by a Transformer-based decoder that predicts the likelihood of specific clinical biomarkers.
-#### 🧠 Clinical Reasoning Chain
+#### Clinical Reasoning Chain
 The model simulates clinical logic by identifying specific visual biomarkers before generating the final diagnostic output:
 
 **Process Flow:**
@@ -43,7 +43,7 @@ The model simulates clinical logic by identifying specific visual biomarkers bef
 **Example Output:**
 > "Optic disc shows increased cup-to-disc ratio consistent with glaucoma symptoms."
 
-### 💻 Implementation Preview
+### Implementation Preview
 ```python
 import torch
 from retinagen_vlm import VQVAE, MedicalTransformer