Osaurus

Ornith-1.0-9B · MXFP4

Official OsaurusAI MXFP4 build of deepreinforce-ai/Ornith-1.0-9B (MIT) — a vision-language model on a Qwen3.5 hybrid backbone. Near-lossless 8-bit microscaled FP; runs on Apple Silicon via Osaurus / mlx.

  • ~5.3 GB (from ~18.8 GB bf16) bundle.
  • MXFP8: microscaled FP4 (group-size 32, 4-bit) on the language-model linear weights; the vision tower is preserved at fp16, short-conv kernels and norms kept fp16.
  • Vision-language (image + text → text).

Architecture

Family qwen3_5 (dense, hybrid)
Text layers 32 — 24 Gated-DeltaNet (linear-attention) + 8 full-attention
Hidden 4096 · untied lm_head
Vision ViT tower (model.visual) preserved fp16
Cache hybrid (GDN state + KV for attention layers)

Usage

# text
python -m mlx_lm generate --model OsaurusAI/Ornith-1.0-9B-MXFP4 --prompt "Explain a hash map in two sentences."

For image+text, load in Osaurus or an MLX-VLM runtime that supports qwen3_5 vision.

Provenance

Downloads last month
62
Safetensors
Model size
2B params
Tensor type
U32
·
U8
·
F16
·
MLX
Hardware compatibility
Log In to add your hardware

Quantized

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for OsaurusAI/Ornith-1.0-9B-MXFP4

Finetuned
(12)
this model