🩺 Rapha – Clinical AI Physician Assistant (GGUF)

Rapha is a clinical AI assistant fine-tuned from Mistral-Nemo-12B using Unsloth.
It performs forward-chaining medical reasoning: gathering symptoms conversationally,
reasoning step by step, and escalating structured findings to a physician.

⚠️ Rapha is a research prototype. It does not diagnose. All outputs must be reviewed by a qualified medical professional.


🚀 Quickstart

Ollama

ollama run hf.co/Phora68/rapha
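Once pulled, Ollama also serves the model over its local REST API (default port 11434). A minimal sketch using only the standard library — the endpoint and response shape follow Ollama's /api/chat API, and the request is simply skipped if no local server is listening:

```python
import json
import urllib.error
import urllib.request

# Payload for Ollama's /api/chat endpoint; the model name matches the
# hf.co pull above, and stream=False returns one complete reply.
payload = {
    "model": "hf.co/Phora68/rapha",
    "messages": [
        {"role": "user",
         "content": "I've been having chest pain and shortness of breath for two days."},
    ],
    "stream": False,
}

req = urllib.request.Request(
    "http://localhost:11434/api/chat",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
try:
    with urllib.request.urlopen(req, timeout=5) as resp:
        print(json.loads(resp.read())["message"]["content"])
except (urllib.error.URLError, OSError):
    pass  # no local Ollama server running
```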

llama.cpp

./llama-cli -m rapha-q4_k_m.gguf \
  --chat-template mistral \
  -p "I've been having chest pain and shortness of breath for two days." \
  -n 512

Python (llama-cpp-python)

from llama_cpp import Llama

llm = Llama(
    model_path = "rapha-q4_k_m.gguf",
    n_ctx      = 2048,
    n_gpu_layers = -1,   # use all GPU layers
)

response = llm.create_chat_completion(messages=[
    {"role": "user", "content": "I've had a persistent headache for three days and I'm really worried."}
])
print(response["choices"][0]["message"]["content"])
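Because Rapha gathers symptoms over several turns before escalating, each assistant reply needs to be appended back into the history before the next user message. A minimal multi-turn sketch (the model path and the follow-up user message are placeholders, and inference is skipped when the GGUF file is not present locally):

```python
import os

MODEL_PATH = "rapha-q4_k_m.gguf"  # placeholder local path

# Conversation history; Rapha asks follow-up questions across turns,
# so every exchange is appended before the next completion call.
messages = [
    {"role": "user",
     "content": "I've had a persistent headache for three days."},
]

def append_exchange(history, assistant_reply, next_user_msg):
    """Append one assistant turn and the following user turn."""
    history.append({"role": "assistant", "content": assistant_reply})
    history.append({"role": "user", "content": next_user_msg})
    return history

if os.path.exists(MODEL_PATH):  # skip inference when the model is absent
    from llama_cpp import Llama
    llm = Llama(model_path=MODEL_PATH, n_ctx=2048, n_gpu_layers=-1)
    reply = llm.create_chat_completion(messages=messages)
    append_exchange(
        messages,
        reply["choices"][0]["message"]["content"],
        "It's behind my right eye and gets worse at night.",  # placeholder
    )
```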

📊 Model Details

Base model:      mistralai/Mistral-Nemo-Base-2407
Fine-tuning:     QLoRA (r=64, α=16) via Unsloth
Quantisation:    Q4_K_M
Context length:  2048 tokens
Training format: ShareGPT
Chat template:   Mistral [INST]
Domain:          Clinical / medical triage
Dataset:         200,000 samples (170k train / 20k val / 10k test)
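The Mistral [INST] template wraps each user turn in [INST] … [/INST] and closes each assistant turn with </s>. A rough rendering sketch — exact BOS-token and whitespace handling varies between Mistral tokenizer revisions, and the backends above apply the real template for you, so treat this as illustrative:

```python
def format_mistral_chat(messages):
    """Render a user/assistant message list into the Mistral [INST]
    prompt format (illustrative sketch; exact BOS/whitespace handling
    differs across Mistral tokenizer versions)."""
    prompt = "<s>"
    for msg in messages:
        if msg["role"] == "user":
            prompt += f"[INST] {msg['content']} [/INST]"
        elif msg["role"] == "assistant":
            prompt += f" {msg['content']}</s>"
    return prompt

print(format_mistral_chat([
    {"role": "user", "content": "I have a headache."},
    {"role": "assistant", "content": "How long has it lasted?"},
    {"role": "user", "content": "Three days."},
]))
```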

⚠️ Limitations

  • Not a medical device
  • Does not provide diagnoses
  • Must be reviewed by a qualified clinician
  • Not validated for clinical deployment