Instructions for using kedarcv/clair-health with libraries and local apps.

Libraries

How to use kedarcv/clair-health with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="kedarcv/clair-health")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForMultimodalLM

processor = AutoProcessor.from_pretrained("kedarcv/clair-health")
model = AutoModelForMultimodalLM.from_pretrained("kedarcv/clair-health")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
# Apply the chat template and move the tensors to the model's device
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
# Decode only the newly generated tokens, skipping the prompt
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
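
Both snippets above build the same `messages` structure by hand. A minimal sketch of a helper that keeps that layout in one place is shown here; `build_image_question` is a hypothetical name, not part of Transformers, and it only mirrors the dictionary layout already used above.

```python
from typing import Any


def build_image_question(image_url: str, question: str) -> list[dict[str, Any]]:
    """Return a single-turn chat in the layout used by the snippets above."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "url": image_url},
                {"type": "text", "text": question},
            ],
        }
    ]


messages = build_image_question(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG",
    "What animal is on the candy?",
)
```

The resulting `messages` list can be passed to `pipe(text=messages)` or to `processor.apply_chat_template(...)` exactly as in the snippets above.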

Local Apps

vLLM

How to use kedarcv/clair-health with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "kedarcv/clair-health"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "kedarcv/clair-health",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
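
The same request can be sent from Python. The sketch below uses only the standard library and assumes the vLLM server started above is listening on `localhost:8000`; the helper names `build_chat_payload` and `chat` are illustrative, not part of vLLM.

```python
import json
import urllib.request

# Assumed defaults: the vLLM server started above, on its default port.
BASE_URL = "http://localhost:8000/v1"
MODEL_ID = "kedarcv/clair-health"


def build_chat_payload(prompt: str, image_url: str, model: str = MODEL_ID) -> dict:
    """Build the same JSON body that the curl command sends."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


def chat(prompt: str, image_url: str, base_url: str = BASE_URL) -> str:
    """POST the payload and return the assistant's reply text."""
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_chat_payload(prompt, image_url)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # OpenAI-style responses carry the reply in choices[0].message.content.
    return body["choices"][0]["message"]["content"]


# Example call (requires the server to be running):
# print(chat("Describe this image in one sentence.",
#            "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"))
```

Passing `base_url="http://localhost:30000/v1"` should target a server on another port in the same way, provided it exposes the same OpenAI-compatible route.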

Use Docker

docker model run hf.co/kedarcv/clair-health

SGLang

How to use kedarcv/clair-health with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "kedarcv/clair-health" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "kedarcv/clair-health",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
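
The curl request above points at a remote image. OpenAI-style chat APIs commonly also accept a base64 `data:` URL in the `image_url` field; whether your SGLang version does is an assumption to verify before relying on it. The sketch below shows how such a `content` list could be built from a local file. The helper names are hypothetical.

```python
import base64
import mimetypes
from pathlib import Path


def image_to_data_url(path: str) -> str:
    """Encode a local image file as a base64 data URL."""
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Cannot infer an image MIME type from {path!r}")
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_local_image_content(path: str, prompt: str) -> list[dict]:
    """Build the `content` list for one user message with a local image."""
    return [
        {"type": "text", "text": prompt},
        {"type": "image_url", "image_url": {"url": image_to_data_url(path)}},
    ]
```

The returned list takes the place of the `content` array in the JSON body shown above.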

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "kedarcv/clair-health" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "kedarcv/clair-health",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Docker Model Runner
How to use kedarcv/clair-health with Docker Model Runner:
```
docker model run hf.co/kedarcv/clair-health
```

clair-health

Clair is a multimodal medical AI assistant developed by Michael Nkomo. It has a consistent identity and a natural conversational tone (greetings, small talk, empathetic responses), and is grounded in:

Zimbabwe's public health system (Ministry of Health and Child Care, the six central hospitals, key regulators and indicators)
Zimbabwean heritage and culture (Great Zimbabwe, Victoria Falls, languages, national symbols)

Multimodal / vision capability

Clair is fine-tuned with text-only LoRA applied to the language-model projections. The vision tower and projector from google/medgemma-4b-it are left untouched and carried through the merge unchanged, so Clair can still process images alongside text. This repository includes:

model-Q4_K_M.gguf — quantized language model for CPU inference
mmproj-model-f16.gguf — the matching vision projector, converted directly from this fine-tuned model (not borrowed from an unrelated model), so image understanding works correctly
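
To illustrate what restricting LoRA to the language-model projections means, the sketch below filters a list of module names so that only language-model projection layers are selected and the vision tower and projector are skipped. The module names and the set of projection suffixes are assumptions chosen for illustration; they are not taken from this model's training code.

```python
import re

# Hypothetical module names, loosely modeled on how a multimodal
# checkpoint might name its submodules. Real names may differ.
EXAMPLE_MODULES = [
    "model.vision_tower.encoder.layers.0.self_attn.q_proj",
    "model.multi_modal_projector.mm_input_projection",
    "model.language_model.layers.0.self_attn.q_proj",
    "model.language_model.layers.0.self_attn.v_proj",
    "model.language_model.layers.0.mlp.down_proj",
    "model.language_model.embed_tokens",
]

# Assumed set of projection layers that would receive LoRA adapters.
PROJECTION_PATTERN = re.compile(
    r"(^|\.)language_model\..*\."
    r"(q_proj|k_proj|v_proj|o_proj|gate_proj|up_proj|down_proj)$"
)


def select_lora_targets(module_names: list[str]) -> list[str]:
    """Keep only projection layers that live under the language model."""
    return [name for name in module_names if PROJECTION_PATTERN.search(name)]


targets = select_lora_targets(EXAMPLE_MODULES)
```

With the example names, `targets` holds the three `language_model` projection layers; the vision tower, the multimodal projector, and the embedding layer are left out, so their weights would stay identical to the base model after merging.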

Running with Ollama

Save the following as a file named Modelfile:

FROM ./model-Q4_K_M.gguf
CLIP_MODEL ./mmproj-model-f16.gguf
SYSTEM You are Clair, a warm and knowledgeable AI health assistant developed by Michael Nkomo, an AI engineer based in Zimbabwe. You are grounded in Zimbabwe's public health system, Zimbabwean heritage and culture, and Cimas Health Group. You are not a substitute for professional medical advice, diagnosis, or treatment.
PARAMETER temperature 0.7
PARAMETER top_k 40
PARAMETER top_p 0.9

Then create and run the model:

ollama create clair-health -f Modelfile
ollama run clair-health
>>> What's in this image? /path/to/image.jpg
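
Once the model has been created, it can also be queried programmatically through Ollama's local REST API. The sketch below assumes Ollama is running on its default address (`localhost:11434`) and that the model was created under the name `clair-health` as shown above; the helper names are illustrative.

```python
import base64
import json
import urllib.request
from pathlib import Path

# Ollama's default local address; adjust if your server listens elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_payload(prompt: str, image_path: str, model: str = "clair-health") -> dict:
    """Build a non-streaming generate request with one base64-encoded image."""
    image_b64 = base64.b64encode(Path(image_path).read_bytes()).decode("ascii")
    return {
        "model": model,
        "prompt": prompt,
        "images": [image_b64],
        "stream": False,  # ask for a single JSON object instead of a stream
    }


def generate(prompt: str, image_path: str, url: str = OLLAMA_URL) -> str:
    """Send the request and return the generated text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(build_generate_payload(prompt, image_path)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["response"]


# Example call (requires a running Ollama server):
# print(generate("What's in this image?", "/path/to/image.jpg"))
```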

License and disclaimer

This model is provided for informational and research purposes only. It is an AI assistant, not a substitute for professional medical advice, diagnosis, or treatment. The base model, google/medgemma-4b-it, is gated under Google's Health AI Developer Foundations (HAI-DEF) license, so make sure your use and redistribution of this fine-tune comply with those terms.

Safetensors

Model size: 4B params
Tensor type: BF16