amrn commited on
Commit
056bd03
·
verified ·
1 Parent(s): 9ebdf4b

Upload folder using huggingface_hub

Browse files
Files changed (1) hide show
  1. README.md +12 -37
README.md CHANGED
@@ -22,7 +22,8 @@ NV-Reason-CXR-3B is a specialized vision-language model designed for medical rea
22
 
23
  This model is for research and development only.
24
 
25
- Try a Demo [here](https://huggingface.co/spaces/nvidia/nv-reason-cxr)
 
26
 
27
  ## Quick start
28
 
@@ -111,18 +112,18 @@ This model is designed for research and educational purposes only and should not
111
  Huggingface: 10/27/2025 via https://huggingface.co/NVIDIA
112
 
113
  ## Model Architecture:
114
- **Architecture Type:** Transformer
115
- **Network Architecture:** Vision-Language Model based on Qwen2.5-VL architecture with medical reasoning capabilities
116
 
117
  This model was developed by fine-tuning Qwen2.5-VL-3B using Supervised Fine-Tuning (SFT) and Group Relative Policy Optimization (GRPO) for enhanced medical reasoning.
118
  **Number of model parameters:** 3B
119
 
120
 
121
  ## Input:
122
- **Input Type(s):** Image, Text
123
- **Input Format(s):** Medical images (JPEG, PNG), Text prompts (string)
124
- **Input Parameters:** Two-Dimensional (2D) images with accompanying text queries (1D)
125
- **Other Properties Related to Input:** Supports frontal chest X-ray images with flexible scaling. Accepts natural language prompts for medical queries, follow-up questions, and reasoning requests. Input images are automatically processed without specific size constraints.
126
 
127
  ### Input Specifications:
128
  - **Medical Images:** Chest X-ray images in standard medical imaging formats
@@ -130,10 +131,10 @@ This model was developed by fine-tuning Qwen2.5-VL-3B using Supervised Fine-Tuni
130
  - **Interactive Dialogue:** Support for follow-up questions and clarification requests
131
 
132
  ## Output:
133
- **Output Type(s):** Text
134
- **Output Format:** Structured reasoning with XML-like tags
135
- **Output Parameters:** One-Dimensional (1D) Natural language reasoning and analysis
136
- **Other Properties Related to Output:** Outputs contain structured thinking processes enclosed in `<thinking>` tags showing step-by-step medical reasoning, followed by concise answers in `<answer>` tags. This format enables transparency in the model's diagnostic reasoning process and supports educational use cases.
137
 
138
  Our AI models are designed and/or optimized to run on NVIDIA GPU-accelerated systems. By leveraging NVIDIA's hardware (GPU cores) and software frameworks (CUDA libraries), the model achieves faster training and inference times compared to CPU-only solutions.
139
 
@@ -165,32 +166,6 @@ Large-scale chest X-ray datasets including MIMIC-CXR, ChestXRay14, and CheXpert.
165
  * Image
166
  * Text
167
 
168
- **Image Training Data Size:**
169
- * Less than a Million Images
170
-
171
- **Text Training Data Size:**
172
- * Less than a Billion Tokens
173
-
174
- **Data Collection Method by dataset:**
175
- * Hybrid: Human, Automatic/Sensors
176
-
177
- **Labeling Method by dataset:**
178
- * Hybrid: Human, Synthetic
179
-
180
- ## Testing Dataset:
181
- **Data Collection Method by dataset:**
182
- * Hybrid: Human, Automatic/Sensors
183
-
184
- **Labeling Method by dataset:**
185
- * Hybrid: Human, Synthetic
186
-
187
- ## Evaluation Dataset:
188
- **Data Collection Method by dataset:**
189
- * Hybrid: Human, Automatic/Sensors
190
-
191
- **Labeling Method by dataset:**
192
- * Hybrid: Human, Synthetic
193
-
194
  ## Inference:
195
  **Acceleration Engine:** PyTorch, Transformers
196
  **Test Hardware:**
 
22
 
23
  This model is for research and development only.
24
 
25
+ 💻 [\[Github code\]](https://github.com/NVIDIA-Medtech/NV-Reason-CXR)
26
+ 🩻 [\[Web Demo\]](https://huggingface.co/spaces/nvidia/nv-reason-cxr)
27
 
28
  ## Quick start
29
 
 
112
  Huggingface: 10/27/2025 via https://huggingface.co/NVIDIA
113
 
114
  ## Model Architecture:
115
+ - **Architecture Type:** Transformer
116
+ - **Network Architecture:** Vision-Language Model based on Qwen2.5-VL architecture with medical reasoning capabilities
117
 
118
  This model was developed by fine-tuning Qwen2.5-VL-3B using Supervised Fine-Tuning (SFT) and Group Relative Policy Optimization (GRPO) for enhanced medical reasoning.
119
  **Number of model parameters:** 3B
120
 
121
 
122
  ## Input:
123
+ - **Input Type(s):** Image, Text
124
+ - **Input Format(s):** Medical images (JPEG, PNG), Text prompts (string)
125
+ - **Input Parameters:** Two-Dimensional (2D) images with accompanying text queries (1D)
126
+ - **Other Properties Related to Input:** Supports frontal chest X-ray images with flexible scaling. Accepts natural language prompts for medical queries, follow-up questions, and reasoning requests. Input images are automatically processed without specific size constraints.
127
 
128
  ### Input Specifications:
129
  - **Medical Images:** Chest X-ray images in standard medical imaging formats
 
131
  - **Interactive Dialogue:** Support for follow-up questions and clarification requests
132
 
133
  ## Output:
134
+ - **Output Type(s):** Text
135
+ - **Output Format:** Structured reasoning with XML-like tags
136
+ - **Output Parameters:** One-Dimensional (1D) Natural language reasoning and analysis
137
+ - **Other Properties Related to Output:** Outputs contain structured thinking processes enclosed in `<thinking>` tags showing step-by-step medical reasoning, followed by concise answers in `<answer>` tags. This format enables transparency in the model's diagnostic reasoning process and supports educational use cases.
138
 
139
  Our AI models are designed and/or optimized to run on NVIDIA GPU-accelerated systems. By leveraging NVIDIA's hardware (GPU cores) and software frameworks (CUDA libraries), the model achieves faster training and inference times compared to CPU-only solutions.
140
 
 
166
  * Image
167
  * Text
168
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
169
  ## Inference:
170
  **Acceleration Engine:** PyTorch, Transformers
171
  **Test Hardware:**