# Gemma 4 E4B-it — 4-bit (MLX)

Properly converted, with all vision and audio tower weights verified intact.

**Why this exists:** The mlx-community 8-bit conversion of Gemma 4 E4B has broken (zeroed-out) vision tower weights, producing a model that appears functional for text but silently fails on image and audio inputs. This is a clean conversion from the original `google/gemma-4-E4B-it`, with every multimodal weight tensor verified non-zero.
## Model Details
| Property | Value |
|---|---|
| Base Model | google/gemma-4-E4B-it |
| Parameters | 4.5B effective (8B total with Per-Layer Embeddings) |
| Quantization | 4-bit affine, mixed-precision (MLP layers kept at 8-bit) |
| Avg Bits/Weight | 6.900 |
| Model Size | 6.4 GB |
| Architecture | Gemma 4 (text + vision + audio) |
| Context Length | 128K tokens |
| Vocabulary | 262K tokens |
## Multimodal Weight Verification

Every tensor in every multimodal component was loaded and checked for `max(abs(tensor)) > 0`. Zero broken weights were found.
| Component | Tensor Count | Status |
|---|---|---|
| Vision Tower (SigLIP) | 658 | All non-zero |
| Audio Tower (Conformer) | 751 | All non-zero |
| Language Model | 1,485 | All non-zero |
| Total | 2,894 | All verified |
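The check described above amounts to scanning every weight tensor for an all-zero payload. A minimal sketch of that scan (the real check would iterate over the model's `.safetensors` shards; the tensor names below are hypothetical, chosen only to mirror the three components in the table):

```python
import numpy as np

def find_zeroed_tensors(tensors):
    """Return the names of tensors whose max absolute value is 0,
    i.e. tensors that were silently zeroed out during conversion."""
    return [name for name, t in tensors.items() if np.max(np.abs(t)) == 0]

# Illustration with fake tensors (names are hypothetical):
tensors = {
    "vision_tower.patch_embed.weight": np.random.randn(16, 16),
    "audio_tower.conv.weight": np.random.randn(8, 8),
    "language_model.embed_tokens.weight": np.zeros((4, 4)),  # simulated broken weight
}
print(find_zeroed_tensors(tensors))  # only the zeroed tensor is flagged
```

A healthy conversion is one where this scan returns an empty list across all 2,894 tensors.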
## Mixed-Precision Quantization
mlx-vlm's default quantization predicate automatically keeps MLP gate/up/down projections at 8-bit across all 42 language model layers while quantizing attention and other weights to 4-bit. This improves quality over naive uniform 4-bit quantization.
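The bit-width decision is driven by a predicate over each parameter's path. A toy sketch of such a rule is below; the function name and path patterns are assumptions that mirror common Gemma-style layer names, not mlx-vlm's exact internals:

```python
def bits_for(param_path, default_bits=4, mlp_bits=8):
    """Toy bit-width rule: keep MLP gate/up/down projections at 8-bit,
    quantize everything else (attention, embeddings, etc.) to 4-bit.
    This sketches the idea of a quantization predicate, not mlx-vlm's code."""
    mlp_keys = ("gate_proj", "up_proj", "down_proj")
    return mlp_bits if any(k in param_path for k in mlp_keys) else default_bits

print(bits_for("language_model.layers.0.mlp.gate_proj.weight"))   # 8
print(bits_for("language_model.layers.0.self_attn.q_proj.weight"))  # 4
```

Because the large MLP projections stay at 8-bit while the rest drops to 4-bit, the blended average lands between the two extremes, consistent with the 6.900 bits/weight reported above.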
## Usage

```bash
# Requires Osaurus (https://osaurus.ai)
osaurus serve OsaurusAI/gemma-4-E4B-it-4bit
```

```python
from mlx_vlm import load, generate

model, processor = load("OsaurusAI/gemma-4-E4B-it-4bit")

# Text-only
output = generate(model, processor, "Explain quantum computing", max_tokens=500)

# With image
output = generate(model, processor, "Describe this image", ["path/to/image.jpg"], max_tokens=500)
```
## Conversion Details
| Detail | Value |
|---|---|
| Tool | mlx-vlm v0.4.4 |
| Source dtype | bfloat16 |
| Quantization mode | affine |
| Group size | 64 |
| Source | google/gemma-4-E4B-it (original Google release) |
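Affine quantization with group size 64 means each run of 64 weights shares one scale and one offset. A minimal numpy sketch of the scheme under those settings (an illustration of the general technique, not MLX's implementation):

```python
import numpy as np

def affine_quantize(w, bits=4, group_size=64):
    """Quantize a 1-D weight vector in groups of `group_size`; each group
    gets its own scale and minimum (affine / asymmetric quantization)."""
    levels = 2**bits - 1  # 15 representable steps at 4-bit
    w = w.reshape(-1, group_size)
    lo = w.min(axis=1, keepdims=True)
    hi = w.max(axis=1, keepdims=True)
    scale = np.maximum((hi - lo) / levels, 1e-12)  # guard against flat groups
    q = np.clip(np.round((w - lo) / scale), 0, levels).astype(np.uint8)
    return q, scale, lo

def affine_dequantize(q, scale, lo):
    """Reconstruct approximate weights from codes, scales, and offsets."""
    return q * scale + lo

w = np.random.randn(256).astype(np.float32)
q, scale, lo = affine_quantize(w)
w_hat = affine_dequantize(q, scale, lo).reshape(-1)
# Rounding error is bounded by half a quantization step per group
print(np.max(np.abs(w - w_hat)))
```

Smaller group sizes track local weight statistics more tightly at the cost of storing more scales and offsets; 64 is a common middle ground.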
Converted by Osaurus AI