Upload README.md with huggingface_hub

README.md

Ministral-3B fine-tuned on ~200K medical VQA records from the SynthVision pipeline.

| **Fine-tuned** | **0.4789** | **0.3669** | **0.5664** | **0.4708** |
| Delta | +1.9% | +13.2% | +14.5% | +9.6% |

## Usage

### Transformers
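
The snippet below loads the processor and model with automatic dtype and device placement, builds a single-image chat message, and decodes only the newly generated tokens: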

```python
from transformers import AutoProcessor, AutoModelForImageTextToText

model_id = "OpenMed/Ministral-3B-MedVL"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/xray.jpg"},
            {"type": "text", "text": "What are the key findings in this chest X-ray?"},
        ],
    }
]

inputs = processor.apply_chat_template(messages, add_generation_prompt=True, tokenize=True, return_dict=True, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=512)
print(processor.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```
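
If the image lives on disk rather than behind a URL, a minimal variant reuses the same `processor` and `model` from the block above. This is a sketch, assuming a recent `transformers` release whose multimodal chat template accepts local `path` entries; `xray.jpg` is a hypothetical file:

```python
# Same pipeline as above, but with a local file instead of a URL.
# Assumes the chat template supports {"type": "image", "path": ...}.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "path": "xray.jpg"},  # hypothetical local file
            {"type": "text", "text": "What are the key findings in this chest X-ray?"},
        ],
    }
]

inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True, return_dict=True, return_tensors="pt"
).to(model.device)
output = model.generate(**inputs, max_new_tokens=512)
print(processor.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```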

### vLLM
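
For offline batch inference, load the checkpoint in Mistral's native tokenizer, config, and weight format and call `llm.chat`: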

```python
from vllm import LLM, SamplingParams

llm = LLM(
    model="OpenMed/Ministral-3B-MedVL",
    tokenizer_mode="mistral",
    config_format="mistral",
    load_format="mistral",
    max_model_len=4096,
    limit_mm_per_prompt={"image": 1},
)

messages = [{"role": "user", "content": [
    {"type": "image_url", "image_url": {"url": "https://example.com/xray.jpg"}},
    {"type": "text", "text": "What are the key findings in this chest X-ray?"},
]}]

output = llm.chat(messages, SamplingParams(temperature=0, max_tokens=512))
print(output[0].outputs[0].text)
```
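
vLLM can also serve the model behind an OpenAI-compatible API (e.g. `vllm serve OpenMed/Ministral-3B-MedVL --tokenizer-mode mistral --config-format mistral --load-format mistral`); in that mode the client snippet shown for SGLang below should work unchanged against vLLM's default port 8000.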

### SGLang
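
SGLang exposes an OpenAI-compatible endpoint, so the standard `openai` client can query the model once the server is running: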

```bash
# Launch server
python -m sglang.launch_server --model-path OpenMed/Ministral-3B-MedVL --port 8000
```

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
response = client.chat.completions.create(
    model="OpenMed/Ministral-3B-MedVL",
    messages=[{"role": "user", "content": [
        {"type": "image_url", "image_url": {"url": "https://example.com/xray.jpg"}},
        {"type": "text", "text": "What are the key findings in this chest X-ray?"},
    ]}],
    max_tokens=512,
)
print(response.choices[0].message.content)
```
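
The `api_key="EMPTY"` value is a placeholder: the locally launched server does not validate keys by default, but the `openai` client requires a non-empty string.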

## Training Details