Add model card and Apache-2.0 license

#4
Files changed (1)
  1. README.md +1 -37
README.md CHANGED
@@ -32,6 +32,7 @@ injection under a common grounded reasoning policy.
 
  - **Base model:** [Qwen/Qwen2.5-VL-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct)
  - **Training data:** [IQuestLab/UniReason-Med-Data](https://huggingface.co/datasets/IQuestLab/UniReason-Med-Data)
+ - **Code:** [github.com/IQuestLab/unireason-med](https://github.com/IQuestLab/unireason-med)
  - **Modalities:** image + text → text
  - **License:** Apache-2.0
 
@@ -63,36 +64,6 @@ This checkpoint is the merged Hugging Face model exported from the GRPO stage.
  Training code (LLaMA-Factory for SFT, verl for GRPO) and configs are released at:
  <https://github.com/IQuestLab/unireason-med>.
 
- ## Usage
-
- ```python
- from transformers import AutoModelForImageTextToText, AutoProcessor
- from PIL import Image
-
- model_id = "IQuestLab/UniReason-Med"
- processor = AutoProcessor.from_pretrained(model_id)
- model = AutoModelForImageTextToText.from_pretrained(model_id, device_map="auto")
-
- image = Image.open("medical_image.png")
- messages = [
-     {
-         "role": "user",
-         "content": [
-             {"type": "image"},
-             {"type": "text", "text": "What is the most likely diagnosis? Reason step by step."},
-         ],
-     }
- ]
- prompt = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
- inputs = processor(text=[prompt], images=[image], return_tensors="pt").to(model.device)
- output = model.generate(**inputs, max_new_tokens=1024)
- print(processor.batch_decode(output, skip_special_tokens=True)[0])
- ```
-
- The model produces interleaved reasoning with bounding boxes over the input image. Reproducing
- the full grounded crop-and-continue loop (crop the predicted region and feed it back as visual
- input) follows the agent/rollout logic in the released training code.
-
  ## Intended Use and Limitations
 
  - **Intended use:** research on medical multimodal reasoning, visual grounding, and 2D-to-3D
@@ -104,13 +75,6 @@ input) follows the agent/rollout logic in the released training code.
  boxes are reasoning aids, not validated localization. Always involve qualified medical
  professionals for any health-related decision.
 
- ## Data Notice
-
- The public training-data release keeps 3D examples text-only and does not redistribute 3D image
- data, because those samples are derived from M3D whose underlying image sources include
- Radiopaedia and may require separate authorization. See the
- [dataset card](https://huggingface.co/datasets/IQuestLab/UniReason-Med-Data) for details.
-
  ## License
 
  Released under the [Apache License 2.0](./LICENSE), consistent with the base model
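
The Usage text removed in this diff mentions a crop-and-continue loop: crop the predicted region and feed it back as visual input. Below is a minimal sketch of the cropping step alone, not the released rollout logic. It assumes the predicted box has already been parsed into pixel coordinates `(x1, y1, x2, y2)`, which may not match the model's actual output format. The helper names `clamp_box` and `crop_region` are hypothetical. The training repository linked above remains the reference for the full loop.

```python
import math


def clamp_box(box, width, height):
    """Clamp an (x1, y1, x2, y2) pixel box to the image bounds.

    Assumption: coordinates are absolute pixels, not normalized values.
    Returns integer (left, top, right, bottom); raises ValueError if the
    clamped box has no area.
    """
    x1, y1, x2, y2 = box
    # Tolerate boxes whose corners arrive in the wrong order.
    left, right = sorted((x1, x2))
    top, bottom = sorted((y1, y2))
    # Round outward so the crop never cuts into the predicted region.
    left = max(0, min(width, math.floor(left)))
    top = max(0, min(height, math.floor(top)))
    right = max(0, min(width, math.ceil(right)))
    bottom = max(0, min(height, math.ceil(bottom)))
    if right <= left or bottom <= top:
        raise ValueError(f"box {box} has no area inside a {width}x{height} image")
    return (left, top, right, bottom)


def crop_region(image, box):
    """Crop `image` to `box` after clamping.

    `image` is expected to behave like a PIL.Image.Image: it exposes
    `.size` as (width, height) and `.crop((left, top, right, bottom))`.
    """
    width, height = image.size
    return image.crop(clamp_box(box, width, height))
```

The cropped image could then be appended to the conversation as a further `{"type": "image"}` entry for the next generation step. Whether that matches the prompt layout used during training is not confirmed by this diff.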