Glüten · Gemma 4 E4B Marsh classifier (merged fp16)
Gemma 4 E4B fine-tuned via Unsloth QLoRA on the IBDColEpi colon-biopsy patch dataset (Pettersen et al. 2022, CC0). The fine-tuned LoRA adapter is merged into the base in fp16 and re-saved as a single standalone multimodal checkpoint, so deployment doesn't need PEFT or bitsandbytes at runtime.
This is the structural-layer classifier used by the Glüten coeliac disease digital twin: https://gluten--gluten-gemma4.europe-west4.hosted.app
Quick start
import torch
from transformers import AutoModelForImageTextToText, AutoProcessor
from PIL import Image

REPO = "faith-ogun/gluten-gemma4-marsh-merged"

processor = AutoProcessor.from_pretrained(REPO)
model = AutoModelForImageTextToText.from_pretrained(
    REPO,
    dtype=torch.bfloat16,
    device_map="auto",
)
model.eval()

# The model expects a 224x224 RGB patch.
img = Image.open("patch.tif").convert("RGB").resize((224, 224))

messages = [{
    "role": "user",
    "content": [
        {"type": "image", "image": img},
        {"type": "text", "text": (
            "You are a histopathology assistant. Classify the Marsh "
            "grade of this HE-stained intestinal biopsy patch. Respond "
            "with exactly one of: Marsh-0, Marsh-1, Marsh-3a, Marsh-3b."
        )},
    ],
}]

inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
).to(model.device)  # follow device_map="auto" instead of hard-coding "cuda"

with torch.inference_mode():
    out = model.generate(**inputs, max_new_tokens=8, do_sample=False)

# Decode only the newly generated tokens, not the prompt.
print(processor.decode(out[0, inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
# -> "Marsh-3b"
Single-GPU inference fits on a 24 GB card (L4/A10) at fp16. T4 (14.5 GB)
will OOM unless you cpu-offload via device_map="auto".
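The quick-start snippet prints the raw decoded text. Generated text can carry stray whitespace or casing differences, so downstream code may want to normalise it to one of the four labels before use. The helper below is a minimal sketch of that step; `parse_marsh_label` and its regular expression are illustrative names and choices, not part of this repository.

```python
import re

# Hypothetical helper, not shipped with the model: maps free-form generated
# text onto one of the four labels the prompt asks for.
_PATTERN = re.compile(r"marsh[-\s_]?(3a|3b|0|1)\b", re.IGNORECASE)


def parse_marsh_label(text):
    """Return the first recognised label in `text`, or None if there is none."""
    match = _PATTERN.search(text)
    if match is None:
        return None
    return "Marsh-" + match.group(1).lower()


print(parse_marsh_label(" marsh 3A\n"))
print(parse_marsh_label("Marsh-2"))
```

Returning `None` for anything outside the four trained classes (for example `Marsh-2`) keeps unexpected generations visible instead of silently coercing them to a valid grade.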
What this model does
Given a 224×224 RGB patch from an HE-stained intestinal mucosa biopsy, the model returns one of four Marsh-equivalent labels:
| Label | Meaning |
|---|---|
| Marsh-0 | normal mucosa |
| Marsh-1 | infiltrative (raised intraepithelial lymphocytes, intact villi) |
| Marsh-3a | partial villous atrophy |
| Marsh-3b | subtotal villous atrophy |
Marsh-2 is omitted (under-represented in the literature and in the training data; the molecular reference dataset GSE164883 also skips it).
Provenance
This checkpoint is the deployment artefact of the v2 training run
(see notebooks/gluten-gemma4-marsh-qlora-v2.ipynb in the project repo):
- Started from google/gemma-4-E4B-it, loaded in 4-bit via Unsloth FastVisionModel.from_pretrained(load_in_4bit=True).
- Vision tower frozen (finetune_vision_layers=False); LoRA applied to the language head only (r=16, lora_alpha=32, dropout=0.05, target attention + MLP modules).
- SFT for 250 steps on a 2,000-patch sample from IBDColEpi Trainset with per_device_train_batch_size=2, gradient_accumulation_steps=4, learning_rate=2e-4, cosine schedule, fp16 (T4 doesn't support bf16).
- model.save_pretrained_merged(save_method='merged_16bit') then pushed here.
Training time: ~30 min on a free Kaggle T4. The notebook is public
and reproducible end-to-end given a HF_TOKEN with the Gemma 4 license
accepted.
Pseudo-label methodology (read this first)
IBDColEpi ships pixel-level epithelium-segmentation masks, NOT Marsh grades. The labels in this fine-tune are derived from epithelium fraction per patch:
epi_frac = (mask > 0).mean()
# quantile-binned over the full dataset into four classes:
# Q1 (lowest coverage) → Marsh-3b
# Q2 → Marsh-3a
# Q3 → Marsh-1
# Q4 (highest coverage) → Marsh-0
The motivation is published in Pettersen et al. 2022 (DOI 10.18710/TLA01U): epithelium quantification is the same primitive a pathologist counts when staging villous atrophy, and the authors explicitly name coeliac disease as an applicable use case for the same segmentation pipeline.
These are weak-supervision proxy labels, not pathologist-validated Marsh scores. Anyone retraining with pathologist-validated grades should expect substantially higher accuracy on Marsh-1 / Marsh-3a specifically; the labels in this dataset are most accurate at the extremes (no epithelium / mostly epithelium) and noisiest in the middle.
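The quantile-binning scheme above can be sketched in plain Python. This is an illustration under stated assumptions, not the notebook's code: the function names are hypothetical, masks are taken to be nested sequences rather than arrays, and the quartile cut points use the default method of `statistics.quantiles`, which may differ slightly from whatever the training notebook used.

```python
from bisect import bisect_right
from statistics import quantiles

# Lowest epithelium coverage maps to the most severe label, as described above.
LABELS_BY_QUARTILE = ("Marsh-3b", "Marsh-3a", "Marsh-1", "Marsh-0")


def epithelium_fraction(mask):
    """Fraction of non-zero pixels in a 2-D mask given as nested sequences."""
    pixels = [value for row in mask for value in row]
    return sum(1 for value in pixels if value > 0) / len(pixels)


def quantile_labels(fractions):
    """Assign each fraction a pseudo-label from its quartile over `fractions`."""
    cuts = quantiles(fractions, n=4)  # three cut points separating Q1..Q4
    return [LABELS_BY_QUARTILE[bisect_right(cuts, f)] for f in fractions]


fractions = [0.05, 0.10, 0.30, 0.35, 0.60, 0.65, 0.90, 0.95]
for fraction, label in zip(fractions, quantile_labels(fractions)):
    print(fraction, label)
```

Because the cut points are computed from the data itself, the four classes come out roughly balanced by construction, which is also why patches near a cut point receive the noisiest labels.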
Performance
Aggregate (v3 stratified audit, 400 held-out test patches)
| Metric | Value |
|---|---|
| Accuracy (training-time eval, in v2 kernel) | 0.70 |
| Accuracy (deployed merged-fp16, v3 audit) | 0.645 |
| Macro-F1 (deployed) | 0.624 |
Per-class (deployed model, v3 audit)
| Class | Precision | Recall | F1 | n |
|---|---|---|---|---|
| Marsh-0 | 0.68 | 0.73 | 0.70 | 81 |
| Marsh-1 | 0.49 | 0.57 | 0.53 | 96 |
| Marsh-3a | 0.58 | 0.35 | 0.44 | 99 |
| Marsh-3b | 0.78 | 0.88 | 0.83 | 124 |
Marsh-3b, the most clinically actionable severe-atrophy class, is the strongest. Errors are confined to adjacent grades (3a↔3b, 0↔1), mirroring the 73–80% inter-pathologist agreement reported in the literature.
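The aggregate figures can be cross-checked against the per-class table: accuracy is the support-weighted mean of per-class recall, and macro-F1 is the unweighted mean of per-class F1. The sketch below recomputes both from the rounded table values, so agreement is only expected to within rounding.

```python
# (precision, recall, n) per class, copied from the per-class table.
PER_CLASS = {
    "Marsh-0": (0.68, 0.73, 81),
    "Marsh-1": (0.49, 0.57, 96),
    "Marsh-3a": (0.58, 0.35, 99),
    "Marsh-3b": (0.78, 0.88, 124),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


total = sum(n for _, _, n in PER_CLASS.values())
# Each class contributes recall * n correct predictions.
accuracy = sum(recall * n for _, recall, n in PER_CLASS.values()) / total
macro_f1 = sum(f1(p, r) for p, r, _ in PER_CLASS.values()) / len(PER_CLASS)

print(f"n={total} accuracy={accuracy:.3f} macro_f1={macro_f1:.3f}")
```

The supports sum to the 400 held-out patches, and the recomputed accuracy and macro-F1 land within rounding distance of the reported 0.645 and 0.624.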
Stratified bias audit
The full audit (figure + CSV + per-stratum summary) is in the project
repo under results/marsh_stratified*. Key finding:
- Per-WSI accuracy spans 0.00–1.00 (σ=0.21) across 35 source slides drawn from a single hospital and scanner.
In other words, on the same dataset, same scanner, same centre, accuracy varies by 100 percentage points slide-to-slide. Cross-site generalisation is the load-bearing unknown the field has no answer for, because the SOTA Cambridge benchmark (NEJM AI 2025, 3,383 duodenal WSIs) sits behind a UK NHS data-sharing agreement (IRAS 162057) and has not published per-demographic metrics. Performance for African, South Asian, East Asian, Hispanic, or Middle Eastern patients cannot be quantified for any publicly available CD biopsy model, including this one.
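A per-slide breakdown of this kind can be computed from patch-level predictions grouped by source slide. The following is a minimal sketch, assuming each record is a hypothetical `(slide_id, true_label, predicted_label)` tuple; the audit's actual file layout is in the project repo and is not reproduced here, and whether the reported σ is a population or sample standard deviation is not stated (population is assumed below).

```python
from collections import defaultdict
from statistics import pstdev


def per_slide_accuracy(records):
    """Map slide id -> accuracy from (slide_id, true_label, predicted_label) tuples."""
    hits = defaultdict(int)
    counts = defaultdict(int)
    for slide_id, true_label, predicted_label in records:
        counts[slide_id] += 1
        hits[slide_id] += int(true_label == predicted_label)
    return {slide: hits[slide] / counts[slide] for slide in counts}


def spread(accuracies):
    """Return (min, max, population standard deviation) of per-slide accuracy."""
    values = list(accuracies.values())
    return min(values), max(values), pstdev(values)


# Toy records for illustration only; these are not audit data.
records = [
    ("slide_a", "Marsh-0", "Marsh-0"),
    ("slide_a", "Marsh-1", "Marsh-1"),
    ("slide_b", "Marsh-3a", "Marsh-3b"),
    ("slide_c", "Marsh-3b", "Marsh-3b"),
    ("slide_c", "Marsh-1", "Marsh-0"),
]
print(spread(per_slide_accuracy(records)))
```

Slides with very few patches can hit 0.00 or 1.00 by chance alone, so the per-slide patch count is worth reporting alongside the range.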
Limitations
- Pseudo-labels, not pathologist-validated. See "Pseudo-label methodology" above. Re-training on pathologist grades is a 30-minute job on a T4 if those labels become available.
- Trained on colon, not duodenum. Coeliac disease is duodenal; the IBDColEpi dataset is colon. Pettersen et al. argue the segmentation primitive transfers, but the canonical structural reference would be the Cambridge duodenal dataset, which is access-restricted.
- Single-site, single-scanner training data. No demographic metadata accompanies IBDColEpi — no patient ancestry, sex, age, hospital, or scanner. Generalisation across centres and populations is not measurable from the available data.
- Marsh-2 absent. Quantile binning produces four classes, not five.
- Patch-level, not slide-level. The model classifies 224×224 patches individually. Whole-slide inference would require tiling + aggregation logic not provided here.
- Not a diagnostic tool. This is a research prototype. Marsh grading in practice depends on context the model cannot see (IEL counts, clinical history, serology). The output is one input among many to a pathologist's judgement, not a substitute for it.
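For the patch-level limitation above, the missing tiling and aggregation step could look roughly like the sketch below. Everything here is an assumption made for illustration: the function names are hypothetical, tiles are non-overlapping by default, and a plain majority vote stands in for an aggregation rule that has not been designed or validated for this model.

```python
from collections import Counter


def tile_origins(width, height, tile=224, stride=224):
    """Yield (x, y) top-left corners of full tiles that fit inside the image."""
    for y in range(0, height - tile + 1, stride):
        for x in range(0, width - tile + 1, stride):
            yield x, y


def majority_label(patch_labels):
    """Most frequent patch label; tie-breaking is left to Counter."""
    if not patch_labels:
        raise ValueError("no patch labels to aggregate")
    return Counter(patch_labels).most_common(1)[0][0]


print(list(tile_origins(448, 300)))
print(majority_label(["Marsh-3b", "Marsh-0", "Marsh-3b"]))
```

A real pipeline would also need tissue detection to skip background tiles, and a majority vote may understate focal atrophy, so this should be read as a placeholder rather than a recommended slide-level rule.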
Intended use
- Research and reproducibility of the Glüten coeliac disease digital twin project.
- Baseline for groups with pathologist-validated Marsh datasets who want a starting point for proper supervised fine-tuning.
- Educational demonstration of multimodal LoRA fine-tuning of Gemma 4 E4B on histopathology, including the dequantize-and-merge deployment pattern.
Out-of-scope
- Clinical decision-making
- Whole-slide analysis without external tiling and aggregation
- Validation for any non-European patient population (no demographic metadata in training data)
- Drug, treatment, or therapy decisions of any kind
Citation
If this model is useful in your work, please cite:
@misc{ogundimu2026gluten,
title = {Glüten: a coeliac disease digital twin with per-layer
ancestry-aware confidence},
author = {Ogundimu, Faith},
year = {2026},
note = {Gemma 4 Good Hackathon submission, Kaggle/Google DeepMind},
url = {https://github.com/faith-ogun/Gluten},
}
And the underlying dataset:
@dataset{pettersen2022ibdcolepi,
title = {IBDColEpi: Annotated WSIs of colon epithelium and pixel-level
U-Net segmentation models},
author = {Pettersen, Henrik S. and others},
year = {2022},
doi = {10.18710/TLA01U},
url = {https://dataverse.no/dataset.xhtml?persistentId=doi:10.18710/TLA01U},
}
And the base model:
@misc{gemma4,
title = {Gemma 4},
author = {Google DeepMind},
year = {2026},
url = {https://ai.google.dev/gemma/docs/core/model_card_4},
}
License
This fine-tuned model inherits Gemma's licence (https://ai.google.dev/gemma/terms). The LoRA delta and the merged weights themselves are released under the same terms. The IBDColEpi training data is CC0 1.0 (no attribution required, public domain).
Acknowledgements
- Google DeepMind for Gemma 4 and the Gemma 4 Good Hackathon
- Unsloth for the QLoRA training framework
- Pettersen et al. and NTNU / St. Olavs Hospital, Trondheim for IBDColEpi
- Modal for hosting the merged-fp16 inference at the live demo URL