# Gemma 4 E2B-it — 4-bit (MLX)
Properly converted with all vision and audio tower weights verified intact
Why this exists: Some mlx-community conversions of Gemma 4 have broken or zeroed-out vision/audio tower weights, producing models that appear functional for text but silently fail on image and audio inputs. This is a clean conversion from the original google/gemma-4-E2B-it, with every multimodal weight tensor verified non-zero.
## Model Details
| Property | Value |
|---|---|
| Base Model | google/gemma-4-E2B-it |
| Parameters | 2.3B effective (5.1B total with Per-Layer Embeddings) |
| Quantization | 4-bit affine, mixed-precision (MLP layers kept at 8-bit) |
| Avg Bits/Weight | 6.851 |
| Model Size | 4.1 GB |
| Architecture | Gemma 4 (text + vision + audio) |
| Context Length | 128K tokens |
| Vocabulary | 262K tokens |
## Multimodal Weight Verification

Every tensor in every multimodal component was loaded and checked for `max(abs(tensor)) > 0`. No broken weights were found.
| Component | Tensor Count | Status |
|---|---|---|
| Vision Tower (SigLIP) | 658 | All non-zero |
| Audio Tower (Conformer) | 751 | All non-zero |
| Language Model | 1,240 | All non-zero |
| Total | 2,649 | All verified |
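The verification pass above can be sketched as follows. This is an illustrative reconstruction, not the actual script used for this release: the tensor names are hypothetical, and in the real check the arrays would be loaded from the converted `.safetensors` shards rather than built in memory.

```python
import numpy as np

def find_zeroed_tensors(tensors):
    """Return the names of tensors whose max(abs(t)) == 0, i.e. broken weights."""
    return [name for name, t in tensors.items() if np.max(np.abs(t)) == 0]

# Toy stand-ins for real weight shards: one healthy tensor, one zeroed-out one.
weights = {
    "vision_tower.blocks.0.attn.q_proj.weight": np.full((4, 4), 0.5),
    "audio_tower.conformer.0.ffn.weight": np.zeros((4, 4)),
}
print(find_zeroed_tensors(weights))  # ['audio_tower.conformer.0.ffn.weight']
```

A conversion passes only when this list is empty for every component.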
## Mixed-Precision Quantization
mlx-vlm's default quantization predicate automatically keeps MLP gate/up/down projections at 8-bit across all 35 language model layers while quantizing attention and other weights to 4-bit. This improves quality over naive uniform 4-bit quantization.
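The selection logic can be pictured as a predicate that maps each parameter path to a bit-width. This is a hedged sketch of the idea only; mlx-vlm's actual predicate signature and matching rules may differ from this toy version.

```python
def bits_for(path: str, default_bits: int = 4, mlp_bits: int = 8) -> int:
    """Pick a quantization bit-width from a parameter's path: MLP gate/up/down
    projections stay at 8-bit, everything else (attention, etc.) gets 4-bit."""
    mlp_names = ("gate_proj", "up_proj", "down_proj")
    if any(name in path for name in mlp_names):
        return mlp_bits
    return default_bits

print(bits_for("language_model.layers.12.mlp.gate_proj.weight"))   # 8
print(bits_for("language_model.layers.12.self_attn.q_proj.weight"))  # 4
```

Applied across all 35 language-model layers, this mix is what yields the ~6.85 average bits per weight reported above rather than a flat 4.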
## Usage

With Osaurus:

```shell
# Requires Osaurus (https://osaurus.ai)
osaurus serve OsaurusAI/gemma-4-E2B-it-4bit
```

With the mlx-vlm Python API:

```python
from mlx_vlm import load, generate

model, processor = load("OsaurusAI/gemma-4-E2B-it-4bit")

# Text-only
output = generate(model, processor, "Explain quantum computing", max_tokens=500)

# With image
output = generate(model, processor, "Describe this image", ["path/to/image.jpg"], max_tokens=500)
```
## Conversion Details
| Detail | Value |
|---|---|
| Tool | mlx-vlm v0.4.4 |
| Source dtype | bfloat16 |
| Quantization mode | affine |
| Group size | 64 |
| Source | google/gemma-4-E2B-it (original Google release) |
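Putting the table together, the conversion was presumably produced with a command along these lines. This is a hypothetical reconstruction, not the exact command used: the flag names follow mlx-vlm's converter CLI and may differ between versions, so check `python -m mlx_vlm.convert --help` for your installed release.

```shell
# Hypothetical reconstruction of the conversion invocation (flags assumed,
# not confirmed by the release notes): 4-bit affine quantization, group size 64.
python -m mlx_vlm.convert \
  --hf-path google/gemma-4-E2B-it \
  -q --q-bits 4 --q-group-size 64
```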
Converted by Osaurus AI