FADA-SKD 0.8B (GGUF)

Fetal Anatomy Delineation and Analysis - Selective Knowledge Distillation

On-device vision-language model for fetal ultrasound image analysis, optimized for mobile deployment via llama.cpp.

Model Overview

| Property | Value |
|---|---|
| Base Model | Qwen3.5-VL 0.8B (via Unsloth) |
| Fine-tuning | LoRA (rank 16, alpha 32) via Unsloth SKD |
| Training Strategy | Offline Selective Knowledge Distillation |
| Architecture | Qwen3.5 (text) + Qwen3-VL Merger (vision) |
| Context Length | 4096 tokens |
| Text Model | Q4_K_M quantized (~517 MB) |
| Vision Encoder | FP16 (~196 MB) |
| Format | GGUF v3 |
| Total Size | ~713 MB |

Teacher Models (SKD)

| Teacher | Weight | Role |
|---|---|---|
| FetalCLIP | 0.40 | Fetal anatomy representation alignment |
| UltraSAM | 0.25 | Ultrasound segmentation knowledge |
| USF-MAE | 0.20 | Self-supervised ultrasound features |
| UltraFedFM | 0.15 | Federated foundation model knowledge |

Training: 21,144 steps over 3 epochs with a fusion MSE loss, on 19,000+ images spanning 32+ anatomical classes.
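The exact form of the fusion MSE loss is not given here. As a minimal sketch of one plausible reading, the teacher features could be blended with the weights from the table and the student penalized by its mean squared error against that blend. The function names, the plain-list feature vectors, and the assumption that all teachers are already projected to a common dimension are illustrative, not taken from the training code.

```python
# Teacher weights as listed in the Teacher Models table.
TEACHER_WEIGHTS = {
    "FetalCLIP": 0.40,
    "UltraSAM": 0.25,
    "USF-MAE": 0.20,
    "UltraFedFM": 0.15,
}


def fuse_teachers(teacher_features, weights=TEACHER_WEIGHTS):
    """Weighted sum of teacher feature vectors.

    Assumption: every teacher vector has the same length, i.e. the
    features were already projected into a shared space.
    """
    dim = len(next(iter(teacher_features.values())))
    fused = [0.0] * dim
    for name, feats in teacher_features.items():
        if len(feats) != dim:
            raise ValueError(f"{name}: expected {dim} values, got {len(feats)}")
        for i, value in enumerate(feats):
            fused[i] += weights[name] * value
    return fused


def fusion_mse_loss(student, teacher_features, weights=TEACHER_WEIGHTS):
    """MSE between the student vector and the fused teacher target.

    This is a hypothetical form of the loss, not the confirmed one.
    """
    target = fuse_teachers(teacher_features, weights)
    if len(student) != len(target):
        raise ValueError("student and teacher feature sizes differ")
    return sum((s - t) ** 2 for s, t in zip(student, target)) / len(target)
```

Because the four weights sum to 1, a student that matches every teacher exactly gets a loss of zero under this reading.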

Files

| File | Description | Size |
|---|---|---|
| gguf/fada-skd-0.8b-Q4_K_M.gguf | Text model (Q4_K_M quantized) | 517 MB |
| gguf/fada-skd-0.8b-mmproj-f16.gguf | Vision encoder (FP16 mmproj) | 196 MB |
| tokenizer.json | Tokenizer vocabulary | - |
| tokenizer_config.json | Tokenizer configuration | - |
| chat_template.jinja | Chat template for inference | - |

Usage with llama.cpp

Text + Vision Inference

# Using llama-mtmd-cli from llama.cpp
./llama-mtmd-cli \
  -m fada-skd-0.8b-Q4_K_M.gguf \
  --mmproj fada-skd-0.8b-mmproj-f16.gguf \
  -p "Analyze this fetal ultrasound image" \
  --image ultrasound.jpg
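For scripted use, the same command can be assembled and launched from Python. This is a minimal sketch that only reuses the flags shown above (`-m`, `--mmproj`, `-p`, `--image`). The helper names are hypothetical, the default paths assume the `gguf/` layout from the Files table, and it assumes the generated text is printed to standard output.

```python
import subprocess


def build_mtmd_command(
    image_path,
    prompt="Analyze this fetal ultrasound image",
    model_path="gguf/fada-skd-0.8b-Q4_K_M.gguf",
    mmproj_path="gguf/fada-skd-0.8b-mmproj-f16.gguf",
    binary="./llama-mtmd-cli",
):
    """Return the argument list for one llama-mtmd-cli invocation."""
    return [
        binary,
        "-m", model_path,
        "--mmproj", mmproj_path,
        "-p", prompt,
        "--image", image_path,
    ]


def run_analysis(image_path, **kwargs):
    """Run the CLI and return whatever it wrote to stdout.

    Raises subprocess.CalledProcessError if the process exits non-zero.
    """
    result = subprocess.run(
        build_mtmd_command(image_path, **kwargs),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `run_analysis("ultrasound.jpg")` would then mirror the shell command above. Passing the arguments as a list avoids shell quoting problems in the prompt.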

Mobile Deployment

This model targets on-device mobile deployment: the FADA Android app runs it natively through llama.cpp's multimodal (mtmd) library, with no cloud connectivity required.

Capabilities

FADA-SKD performs 5-phase fetal ultrasound analysis:

  1. Interpretation: Structured clinical report (imaging plane, anatomical structures, gestational age, image quality, normality assessment)
  2. Classification: Anatomical view identification (BPD plane, four-chamber, Doppler, etc.)
  3. Mapping: Intelligent routing to relevant detection/segmentation classes
  4. Detection: Bounding box localization of anatomical structures (normalized 0-1000 coordinates)
  5. Segmentation: Polygon mask delineation of anatomical regions
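Since detection boxes are reported on a normalized 0-1000 grid, they need rescaling before being drawn on an image. The sketch below shows that conversion. The `(x1, y1, x2, y2)` box layout is an assumption, as is applying the same 0-1000 scale to segmentation polygon points; the model's actual output format should be checked first.

```python
def box_to_pixels(box, image_width, image_height, scale=1000):
    """Convert a normalized (x1, y1, x2, y2) box to pixel coordinates.

    Assumption: corner order is left, top, right, bottom.
    """
    x1, y1, x2, y2 = box
    return (
        round(x1 * image_width / scale),
        round(y1 * image_height / scale),
        round(x2 * image_width / scale),
        round(y2 * image_height / scale),
    )


def polygon_to_pixels(points, image_width, image_height, scale=1000):
    """Convert normalized (x, y) polygon vertices to pixel coordinates.

    Assumption: polygons use the same 0-1000 grid as detection boxes.
    """
    return [
        (round(x * image_width / scale), round(y * image_height / scale))
        for x, y in points
    ]
```

For a 640x480 image, `box_to_pixels((250, 500, 750, 1000), 640, 480)` returns `(160, 240, 480, 480)`.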

Supported Anatomy (32+ classes)

  • Brain: BPD, CSP, lateral ventricle, brain parenchyma
  • Cardiac: Heart chambers, thorax
  • First Trimester: CRL, NT, nasal bone/skin/tip
  • Doppler: Arteries, veins, liver, stomach
  • Pelvimetry: Fetal head, pubic symphysis
  • Body/Pose: Abdomen, limbs, head
  • Keypoints: CRL endpoints, NT caliper points, scale bar

Citation

If you use this model in your research, please cite:

@article{fada2026,
  title={FADA: Knowledge-Distilled Vision-Language Models for Accessible Fetal
         Ultrasound Interpretation in Low-Resource Obstetric Settings},
  author={Alzubaidi, Mahmood and Agus, Marco},
  journal={arXiv},
  year={2026},
  note={Submitted to the "Digital Health in Low-Resource Settings" Collection}
}

License

Apache 2.0

Disclaimer

This is a research prototype. Not for clinical diagnostic use. All outputs should be reviewed by qualified medical professionals.
