How to use IQuestLab/UniReason-Med with libraries and local apps.

Libraries

How to use IQuestLab/UniReason-Med with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="IQuestLab/UniReason-Med")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
# Run the pipeline and print the generated reply
outputs = pipe(text=messages)
print(outputs)

# Load model directly
from transformers import AutoProcessor, AutoModelForMultimodalLM

processor = AutoProcessor.from_pretrained("IQuestLab/UniReason-Med")
model = AutoModelForMultimodalLM.from_pretrained("IQuestLab/UniReason-Med")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use IQuestLab/UniReason-Med with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "IQuestLab/UniReason-Med"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "IQuestLab/UniReason-Med",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
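
For local images that are not reachable by a public URL, the same request can be built in Python with the image embedded as a base64 data URL. The sketch below uses only the standard library. The helper names (`image_to_data_url`, `build_chat_payload`, `post_chat`) are illustrative, and it assumes the server accepts `data:` URLs in the `image_url` field, as OpenAI-compatible servers commonly do.

```python
import base64
import json
import urllib.request


def image_to_data_url(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_chat_payload(model: str, prompt: str, image_url: str) -> dict:
    """Build the same request body as the curl example above."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


def post_chat(payload: dict, base_url: str = "http://localhost:8000") -> dict:
    """POST the payload to the chat completions endpoint and parse the JSON reply."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a server running, passing the result of `build_chat_payload(...)` to `post_chat` returns the completion as a dictionary. Only `post_chat` touches the network, so the payload can be inspected or logged before it is sent.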

Use Docker

docker model run hf.co/IQuestLab/UniReason-Med

SGLang

How to use IQuestLab/UniReason-Med with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "IQuestLab/UniReason-Med" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "IQuestLab/UniReason-Med",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "IQuestLab/UniReason-Med" \
        --host 0.0.0.0 \
        --port 30000

Docker Model Runner
How to use IQuestLab/UniReason-Med with Docker Model Runner:
```
docker model run hf.co/IQuestLab/UniReason-Med
```

UniReason-Med / README.md

yunyin007

Add model card and Apache-2.0 license (#4)

68d7c25 8 days ago


	---
	license: apache-2.0
	base_model:
	- Qwen/Qwen2.5-VL-7B-Instruct
	pipeline_tag: image-text-to-text
	library_name: transformers
	tags:
	- medical
	- multimodal
	- vqa
	- visual-grounding
	- chain-of-thought
	- reinforcement-learning
	- grpo
	- qwen2_5_vl
	language:
	- en
	datasets:
	- IQuestLab/UniReason-Med-Data
	---

	# UniReason-Med

	UniReason-Med is a medical multimodal model that accompanies the paper
	"UniReason-Med: A Shared Grounded Reasoning Interface for 2D-to-3D Transfer in Medical VQA".

	It studies whether grounded reasoning supervision from abundant 2D medical images can improve
	3D medical VQA when both modalities share a common reasoning interface. A single checkpoint
	processes either a 2D image or a slice-serialized 3D volume, generating interleaved textual
	reasoning and localized visual evidence through shared bounding-box syntax and region-token
	injection under a common grounded reasoning policy.
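
One way to picture a "slice-serialized" input is a single chat message that carries an ordered list of slice images followed by the question. The sketch below is an assumption, not the authors' preprocessing: the uniform subsampling, the `max_slices` budget, and both helper names are hypothetical, and the message layout simply mirrors the Transformers chat format used for 2D images.

```python
def sample_slice_indices(num_slices: int, max_slices: int) -> list[int]:
    """Pick up to max_slices indices, evenly spaced and in ascending order."""
    if num_slices <= 0 or max_slices <= 0:
        return []
    if num_slices <= max_slices:
        return list(range(num_slices))
    if max_slices == 1:
        return [num_slices // 2]  # a single slice: take the middle one
    step = (num_slices - 1) / (max_slices - 1)
    return [round(i * step) for i in range(max_slices)]


def build_volume_messages(
    slice_urls: list[str], question: str, max_slices: int = 16
) -> list[dict]:
    """Serialize a 3D volume as an ordered sequence of slice images plus a question."""
    keep = sample_slice_indices(len(slice_urls), max_slices)
    # Slice order is preserved so the model sees the volume front to back.
    content = [{"type": "image", "url": slice_urls[i]} for i in keep]
    content.append({"type": "text", "text": question})
    return [{"role": "user", "content": content}]
```

The returned list has the same shape as the `messages` argument to `processor.apply_chat_template`, so a 2D query is just the special case of one image entry.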

	- Base model: [Qwen/Qwen2.5-VL-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct)
	- Training data: [IQuestLab/UniReason-Med-Data](https://huggingface.co/datasets/IQuestLab/UniReason-Med-Data)
	- Code: [github.com/IQuestLab/unireason-med](https://github.com/IQuestLab/unireason-med)
	- Modalities: image + text → text
	- License: Apache-2.0

	## Model Description

	UniReason-Med is trained to interleave free-form reasoning with localized visual evidence.
	During reasoning, the model emits bounding boxes over the input image; the referenced region is
	cropped and re-injected as additional visual context for the next reasoning step (a
	grounded chain-of-thought, GCoT, interface). The same shared interface is applied to 2D images
	and to 3D volumes serialized as ordered slice sequences, which allows grounded supervision
	collected on plentiful 2D data to transfer to 3D reasoning.
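
The crop-and-re-inject step described above needs the emitted boxes to be parsed and validated before any region is cropped. The following is a minimal sketch under a stated assumption: boxes are taken to appear as `[x1, y1, x2, y2]` in pixel coordinates, which is a hypothetical syntax chosen for illustration, since the model's actual bounding-box format is not specified here. The function name `extract_boxes` is likewise illustrative.

```python
import re

# Hypothetical syntax: four integers in square brackets, e.g. "[12, 30, 96, 140]".
BOX_PATTERN = re.compile(r"\[\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\]")


def extract_boxes(text: str, width: int, height: int) -> list[tuple[int, int, int, int]]:
    """Return (left, upper, right, lower) boxes clamped to the image bounds."""
    boxes = []
    for match in BOX_PATTERN.finditer(text):
        x1, y1, x2, y2 = (int(value) for value in match.groups())
        # Clamp to the image size and put the corners in ascending order.
        left, right = sorted((min(x1, width), min(x2, width)))
        upper, lower = sorted((min(y1, height), min(y2, height)))
        if right > left and lower > upper:  # skip empty regions
            boxes.append((left, upper, right, lower))
    return boxes
```

The tuples use the same corner order that Pillow's `Image.crop` expects, so each one could be cropped and appended as an extra image entry for the next reasoning step.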

	A central result of the paper is that joint 2D+3D grounded supervision improves 3D reasoning
	compared with 3D-only training under matched schedules, while the shared grounding interface
	also benefits 2D tasks.

	## Training

	The model is built with a two-stage recipe:

	1. Supervised fine-tuning (SFT) on the UniMed-CoT dataset: 220K grounded chain-of-thought
	samples (170K 2D + 50K 3D) with interleaved textual reasoning and grounded visual evidence.
	The vision tower and the multimodal projector are frozen; the language model is fully fine-tuned.
	2. Reinforcement learning (GRPO) with outcome-level rewards. RL uses answer-correctness and
	format rewards rather than ground-truth localization-overlap rewards such as IoU or Dice.
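
To make the outcome-level reward in stage 2 concrete, the sketch below combines a format check with an answer-correctness check, then standardizes rewards within a group of sampled responses as GRPO does in general. It is not the released verl configuration: the `<think>`/`<answer>` tag layout, the `format_weight` value, the case-insensitive exact match, and the function names are all assumptions made for illustration.

```python
import re
from statistics import mean, pstdev

# Hypothetical response layout; the tags actually used in training may differ.
RESPONSE_PATTERN = re.compile(
    r"^<think>.*?</think>\s*<answer>(.*?)</answer>\s*$", re.DOTALL
)


def outcome_reward(response: str, gold_answer: str, format_weight: float = 0.5) -> float:
    """Format reward plus answer-correctness reward; no IoU or Dice term."""
    match = RESPONSE_PATTERN.match(response.strip())
    if match is None:
        return 0.0  # malformed output earns nothing
    correct = match.group(1).strip().lower() == gold_answer.strip().lower()
    return format_weight + (1.0 if correct else 0.0)


def group_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """GRPO-style advantages: standardize rewards within one prompt's sample group."""
    center = mean(rewards)
    spread = pstdev(rewards)
    return [(r - center) / (spread + eps) for r in rewards]
```

Because the reward never inspects the predicted boxes, grounding quality is shaped only indirectly, through whether the grounded reasoning leads to a correct answer.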

	This checkpoint is the merged Hugging Face model exported from the GRPO stage.

	Training code (LLaMA-Factory for SFT, verl for GRPO) and configs are released at:
	<https://github.com/IQuestLab/unireason-med>.

	## Intended Use and Limitations

	- Intended use: research on medical multimodal reasoning, visual grounding, and 2D-to-3D
	transfer. Suitable for academic benchmarking and method development.
	- Out of scope: UniReason-Med is a research artifact and is not a medical device. It must
	not be used for clinical diagnosis, treatment decisions, or any real patient care.
	- Limitations: outputs may be incorrect, incomplete, or biased; performance depends on
	imaging modality, anatomy, and distribution shift from the training data. Predicted bounding
	boxes are reasoning aids, not validated localization. Always involve qualified medical
	professionals for any health-related decision.

	## License

	Released under the [Apache License 2.0](./LICENSE), consistent with the base model
	Qwen2.5-VL-7B-Instruct. Note the research-only intended use and the medical-use limitations above.

	## Citation

	If you use this model, please cite the UniReason-Med paper:

	```bibtex
	@article{unireasonmed,
	title = {UniReason-Med: A Shared Grounded Reasoning Interface for 2D-to-3D Transfer in Medical VQA},
	author = {UniReason-Med Team},
	year = {2025}
	}
	```