---
library_name: transformers
base_model: mistralai/Mistral-7B-Instruct-v0.2
language:
- en
license: apache-2.0
tags:
- peft
- lora
- qlora
- medical
- clinical-nlp
- text-simplification
- fine-tuned
pipeline_tag: text-generation
datasets:
- armanc/pubmed-rct20k
metrics:
- rouge
- bertscore
---
# mistral-clinical-simplifier

A fine-tuned version of Mistral-7B-Instruct-v0.2 trained to convert complex clinical and biomedical text into plain language that patients can understand.

Demo: https://huggingface.co/spaces/prabhal/mistral-clinical-simplifier
## Model Description

This model was trained with Supervised Fine-Tuning (SFT) and QLoRA on a custom dataset derived from PubMed RCT abstracts. The task is clinical text simplification: given a sentence or paragraph written for clinicians, the model produces a rewritten version that a patient with no medical background can read and understand.
- Base model: mistralai/Mistral-7B-Instruct-v0.2
- Fine-tuning method: QLoRA (4-bit NF4 quantization, LoRA rank 16, alpha 32)
- Target modules: q_proj, k_proj, v_proj, o_proj
- Training hardware: T4 GPU (Google Colab)
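The LoRA hyperparameters above correspond to a `peft` configuration along these lines. This is a sketch, not the exact training script: `lora_dropout` and `task_type` are illustrative assumptions not taken from the original setup.

```python
from peft import LoraConfig

# Adapter configuration implied by the hyperparameters above.
# lora_dropout is an assumed value, not from the original training script.
lora_config = LoraConfig(
    r=16,                    # LoRA rank
    lora_alpha=32,           # scaling factor
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,       # assumption
    bias="none",
    task_type="CAUSAL_LM",
)
```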
## Training Data

Sourced from the PubMed RCT 20k dataset (armanc/pubmed-rct20k). Sentences longer than 80 characters were extracted and cleaned to remove annotation artifacts. Sentence-level and paragraph-level inputs were combined to give the model exposure to both short and multi-sentence clinical contexts. The final dataset contained approximately 400 training examples with a 90/10 train-validation split.
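The length filter and 90/10 split described above can be sketched as follows. The function name, threshold parameter, and seed are illustrative; the actual artifact-cleaning rules used for this model are not published.

```python
import random

def prepare_examples(sentences, min_len=80, val_frac=0.1, seed=42):
    """Keep sentences above a length threshold and split 90/10.

    Minimal sketch of the preparation described above; the real
    artifact-cleaning rules for this model are not published.
    """
    # Drop short sentences and strip surrounding whitespace.
    kept = [s.strip() for s in sentences if len(s.strip()) > min_len]
    random.Random(seed).shuffle(kept)
    n_val = max(1, int(len(kept) * val_frac))
    return kept[n_val:], kept[:n_val]  # (train, validation)
```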
## How to Use

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch

base_model_id = "mistralai/Mistral-7B-Instruct-v0.2"
lora_model_id = "prabhal/mistral-clinical-simplifier"

# 4-bit NF4 quantization, matching the QLoRA training setup.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
)

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach the LoRA adapter to the quantized base model.
model = PeftModel.from_pretrained(base_model, lora_model_id)
model.eval()

def simplify(text):
    prompt = f"""### Instruction:
Simplify the following clinical text into patient-friendly explanation.

### Input:
{text}

### Response:
"""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output = model.generate(
            **inputs,
            max_new_tokens=200,
            do_sample=True,
            temperature=0.3,
            top_p=0.9,
        )
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Example:
print(simplify("Patients exhibited statistically significant reductions in HbA1c."))
```
## Evaluation Results

Evaluated on 20 held-out samples, comparing the fine-tuned model against the base Mistral-7B-Instruct-v0.2 without any fine-tuning.
**Readability (Flesch-Kincaid Grade Level):** The original clinical text averaged a grade level of 15.58, i.e., upper-college reading level. The base model brought this down to 12.51. The fine-tuned model reduced it further to 7.47, roughly middle-school level, which aligns with the broadly recommended target for patient-facing health communication.
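For reference, the Flesch-Kincaid grade level is 0.39 × (words/sentences) + 11.8 × (syllables/words) − 15.59. The evaluation here presumably used a standard library such as textstat; the sketch below uses a crude vowel-group syllable heuristic and is only illustrative.

```python
import re

def count_syllables(word):
    """Rough vowel-group heuristic; real tools use pronunciation dictionaries."""
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and n > 1:  # discount a silent final 'e'
        n -= 1
    return max(1, n)

def fk_grade(text):
    """Flesch-Kincaid grade: 0.39*(words/sents) + 11.8*(syllables/words) - 15.59."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * (len(words) / len(sentences))
            + 11.8 * (syllables / len(words))
            - 15.59)
```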
**ROUGE:** ROUGE-1 improved from 0.3773 (base) to 0.5274 (fine-tuned), and ROUGE-L from 0.2520 to 0.3872. This indicates the fine-tuned model produces outputs substantially closer in word overlap and sequence structure to the reference simplifications.
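ROUGE-1 measures unigram overlap between a candidate and a reference. The reported scores were presumably computed with a standard package such as rouge_score; the minimal F1 variant below is only an illustration of what the metric captures.

```python
from collections import Counter

def rouge1_f1(candidate, reference):
    """Unigram-overlap F1 between candidate and reference (illustrative sketch)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```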
**BERTScore F1:** The base model scored 0.8878 and the fine-tuned model 0.9034. The gap is smaller here because BERTScore measures meaning-level alignment rather than surface overlap, and the base model is already a strong language model. The improvement confirms the fine-tuned outputs are semantically closer to the references without introducing drift.
**LLM-as-Judge (GPT, 1-10 scale):** The fine-tuned model scored 8.60 on simplicity, confirming it reliably produces patient-friendly language. Accuracy scored 6.55 and faithfulness 6.45, reflecting the inherent difficulty of preserving exact medical meaning while simplifying vocabulary. These mid-range accuracy and faithfulness scores point to hallucination and meaning drift as the primary areas for future improvement.
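An LLM-as-judge setup of this kind typically sends each (original, simplification) pair to the judge with a scoring rubric. The exact prompt used for this card is not published; the template and function below are purely illustrative.

```python
# Illustrative rubric template; NOT the prompt used for this model card.
JUDGE_TEMPLATE = """You are evaluating a simplified version of a clinical text.

Original: {original}
Simplification: {simplified}

Rate the simplification from 1 to 10 on each criterion:
- simplicity: is it readable by a patient with no medical background?
- accuracy: are the medical facts stated correctly?
- faithfulness: is the original meaning preserved without additions?

Answer as JSON: {{"simplicity": n, "accuracy": n, "faithfulness": n}}"""

def build_judge_prompt(original, simplified):
    """Fill the rubric template for one evaluation pair (illustrative only)."""
    return JUDGE_TEMPLATE.format(original=original, simplified=simplified)
```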
## Limitations

This model is intended for educational and research purposes only. It should not be used to provide medical advice or replace clinical communication. Faithfulness scores indicate that meaning drift and hallucination remain real risks, and outputs should always be reviewed by a qualified professional before reaching patients.