nvidia
/

NV-Reason-CXR-3B

@@ -22,7 +22,8 @@ NV-Reason-CXR-3B is a specialized vision-language model designed for medical rea
 This model is for research and development only.
-Try a Demo [here](https://huggingface.co/spaces/nvidia/nv-reason-cxr)
 ## Quick start
@@ -111,18 +112,18 @@ This model is designed for research and educational purposes only and should not
 Huggingface: 10/27/2025 via https://huggingface.co/NVIDIA
 ## Model Architecture:
-**Architecture Type:** Transformer
-**Network Architecture:** Vision-Language Model based on Qwen2.5-VL architecture with medical reasoning capabilities
 This model was developed by fine-tuning Qwen2.5-VL-3B using Supervised Fine-Tuning (SFT) and Group Relative Policy Optimization (GRPO) for enhanced medical reasoning.
 **Number of model parameters:** 3B
 ## Input:
-**Input Type(s):** Image, Text
-**Input Format(s):** Medical images (JPEG, PNG), Text prompts (string)
-**Input Parameters:** Two-Dimensional (2D) images with accompanying text queries (1D)
-**Other Properties Related to Input:** Supports frontal chest X-ray images with flexible scaling. Accepts natural language prompts for medical queries, follow-up questions, and reasoning requests. Input images are automatically processed without specific size constraints.
 ### Input Specifications:
 - **Medical Images:** Chest X-ray images in standard medical imaging formats
@@ -130,10 +131,10 @@ This model was developed by fine-tuning Qwen2.5-VL-3B using Supervised Fine-Tuni
 - **Interactive Dialogue:** Support for follow-up questions and clarification requests
 ## Output:
-**Output Type(s):** Text
-**Output Format:** Structured reasoning with XML-like tags
-**Output Parameters:** One-Dimensional (1D) Natural language reasoning and analysis
-**Other Properties Related to Output:** Outputs contain structured thinking processes enclosed in `<thinking>` tags showing step-by-step medical reasoning, followed by concise answers in `<answer>` tags. This format enables transparency in the model's diagnostic reasoning process and supports educational use cases.
 Our AI models are designed and/or optimized to run on NVIDIA GPU-accelerated systems. By leveraging NVIDIA's hardware (GPU cores) and software frameworks (CUDA libraries), the model achieves faster training and inference times compared to CPU-only solutions.
@@ -165,32 +166,6 @@ Large-scale chest X-ray datasets including MIMIC-CXR, ChestXRay14, and CheXpert.
 * Image
 * Text
-**Image Training Data Size:**
-* Less than a Million Images
-**Text Training Data Size:**
-* Less than a Billion Tokens
-**Data Collection Method by dataset:**
-* Hybrid: Human, Automatic/Sensors
-**Labeling Method by dataset:**
-* Hybrid: Human, Synthetic
-## Testing Dataset:
-**Data Collection Method by dataset:**
-* Hybrid: Human, Automatic/Sensors
-**Labeling Method by dataset:**
-* Hybrid: Human, Synthetic
-## Evaluation Dataset:
-**Data Collection Method by dataset:**
-* Hybrid: Human, Automatic/Sensors
-**Labeling Method by dataset:**
-* Hybrid: Human, Synthetic
 ## Inference:
 **Acceleration Engine:** PyTorch, Transformers
 **Test Hardware:**

 This model is for research and development only.
+💻 [\[Github code\]](https://github.com/NVIDIA-Medtech/NV-Reason-CXR)
+🩻 [\[Web Demo\]](https://huggingface.co/spaces/nvidia/nv-reason-cxr)
 ## Quick start
 Huggingface: 10/27/2025 via https://huggingface.co/NVIDIA
 ## Model Architecture:
+- **Architecture Type:** Transformer
+- **Network Architecture:** Vision-Language Model based on Qwen2.5-VL architecture with medical reasoning capabilities
 This model was developed by fine-tuning Qwen2.5-VL-3B using Supervised Fine-Tuning (SFT) and Group Relative Policy Optimization (GRPO) for enhanced medical reasoning.
 **Number of model parameters:** 3B
 ## Input:
+- **Input Type(s):** Image, Text
+- **Input Format(s):** Medical images (JPEG, PNG), Text prompts (string)
+- **Input Parameters:** Two-Dimensional (2D) images with accompanying text queries (1D)
+- **Other Properties Related to Input:** Supports frontal chest X-ray images with flexible scaling. Accepts natural language prompts for medical queries, follow-up questions, and reasoning requests. Input images are automatically processed without specific size constraints.
 ### Input Specifications:
 - **Medical Images:** Chest X-ray images in standard medical imaging formats
 - **Interactive Dialogue:** Support for follow-up questions and clarification requests
 ## Output:
+- **Output Type(s):** Text
+- **Output Format:** Structured reasoning with XML-like tags
+- **Output Parameters:** One-Dimensional (1D) Natural language reasoning and analysis
+- **Other Properties Related to Output:** Outputs contain structured thinking processes enclosed in `<thinking>` tags showing step-by-step medical reasoning, followed by concise answers in `<answer>` tags. This format enables transparency in the model's diagnostic reasoning process and supports educational use cases.
 Our AI models are designed and/or optimized to run on NVIDIA GPU-accelerated systems. By leveraging NVIDIA's hardware (GPU cores) and software frameworks (CUDA libraries), the model achieves faster training and inference times compared to CPU-only solutions.
 * Image
 * Text
 ## Inference:
 **Acceleration Engine:** PyTorch, Transformers
 **Test Hardware:**