metadata
language:
  - en
library_name: mlx
license: gemma
license_link: https://ai.google.dev/gemma/docs/gemma_4_license
pipeline_tag: any-to-any
base_model: google/gemma-4-E2B-it
tags:
  - quantized
  - apple-silicon
  - mlx
  - gemma4
  - vision
  - audio
  - multimodal
  - 8bit

Osaurus AI

Gemma 4 E2B-it — 8-bit (MLX)

Properly converted with all vision and audio tower weights verified intact



Why this exists: Some mlx-community conversions of Gemma 4 have broken or zeroed-out vision/audio tower weights, producing models that appear functional for text but silently fail on image and audio inputs. This is a clean conversion from the original google/gemma-4-E2B-it with every multimodal weight tensor verified non-zero.


Model Details

| Property | Value |
|---|---|
| Base Model | google/gemma-4-E2B-it |
| Parameters | 2.3B effective (5.1B total with Per-Layer Embeddings) |
| Quantization | 8-bit affine |
| Avg Bits/Weight | 9.257 |
| Model Size | 5.5 GB |
| Architecture | Gemma 4 (text + vision + audio) |
| Context Length | 128K tokens |
| Vocabulary | 262K tokens |

Multimodal Weight Verification

Every tensor in each component, including the language model, was loaded and checked for max(abs(tensor)) > 0. No all-zero tensors were found.

| Component | Tensor Count | Status |
|---|---|---|
| Vision Tower (SigLIP) | 658 | All non-zero |
| Audio Tower (Conformer) | 751 | All non-zero |
| Language Model | 1,240 | All non-zero |
| Total | 2,649 | All verified |
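
The check described above can be sketched roughly as follows. This is a minimal illustration, not the script used for this conversion. It assumes the weights are already loaded as a mapping from tensor name to array, and the component prefixes (`vision_tower`, `audio_tower`, `language_model`) are hypothetical placeholders that may not match the real checkpoint's key names.

```python
import numpy as np

# Hypothetical key fragments; the real checkpoint's tensor names may differ.
COMPONENT_PREFIXES = ("vision_tower", "audio_tower", "language_model")


def component_of(name):
    """Map a tensor name to a component label by substring match."""
    for prefix in COMPONENT_PREFIXES:
        if prefix in name:
            return prefix
    return "other"


def verify_nonzero(weights):
    """Return ({component: tensor count}, [names failing max(abs(t)) > 0])."""
    counts = {}
    broken = []
    for name, tensor in weights.items():
        comp = component_of(name)
        counts[comp] = counts.get(comp, 0) + 1
        arr = np.asarray(tensor, dtype=np.float32)
        # Empty, all-zero, and all-NaN tensors all fail the "> 0" test.
        if arr.size == 0 or not (np.max(np.abs(arr)) > 0):
            broken.append(name)
    return counts, broken
```

Note that this test only catches tensors that are entirely zero (or NaN). It does not prove that a tensor's values match the source checkpoint.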

Usage

# Command line: requires Osaurus (https://osaurus.ai)
osaurus serve OsaurusAI/gemma-4-E2B-it-8bit

# Python API: requires the mlx-vlm package
from mlx_vlm import load, generate

model, processor = load("OsaurusAI/gemma-4-E2B-it-8bit")

# Text-only
output = generate(model, processor, "Explain quantum computing", max_tokens=500)

# With image
output = generate(model, processor, "Describe this image", ["path/to/image.jpg"], max_tokens=500)
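
The snippet above covers text and image prompts. Since the model also has an audio tower, an audio prompt might look like the sketch below. It relies on the `apply_chat_template` and `load_config` helpers and the `audio=` argument of `generate` from mlx-vlm. Whether this exact call pattern works for this Gemma 4 conversion is an assumption, and the `run_multimodal` wrapper and the file path are hypothetical.

```python
def run_multimodal(model_id, prompt, images=None, audio=None, max_tokens=500):
    """Hypothetical wrapper: format a chat prompt and generate with optional media."""
    # Imported lazily so the module can be imported on machines without MLX.
    from mlx_vlm import load, generate
    from mlx_vlm.prompt_utils import apply_chat_template
    from mlx_vlm.utils import load_config

    images = list(images or [])
    audio = list(audio or [])

    model, processor = load(model_id)
    config = load_config(model_id)

    # Insert the model's chat markers plus one placeholder per media item.
    formatted = apply_chat_template(
        processor, config, prompt,
        num_images=len(images), num_audios=len(audio),
    )
    return generate(
        model, processor, formatted,
        image=images or None, audio=audio or None,
        max_tokens=max_tokens, verbose=False,
    )


# Example call (assumed path; needs Apple Silicon and a model download):
# print(run_multimodal("OsaurusAI/gemma-4-E2B-it-8bit",
#                      "Transcribe this clip", audio=["path/to/audio.wav"]))
```

Applying the chat template first is the pattern shown in the mlx-vlm documentation for instruction-tuned models. Passing a raw string, as in the shorter example, skips that formatting.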

Conversion Details

| Detail | Value |
|---|---|
| Tool | mlx-vlm v0.4.4 |
| Source dtype | bfloat16 |
| Quantization mode | affine |
| Group size | 64 |
| Source | google/gemma-4-E2B-it (original Google release) |
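
To make "affine, group size 64" concrete, here is a minimal NumPy sketch of group-wise affine quantization. It illustrates the general technique and is not MLX's actual kernel: each group of 64 weights stores its own scale and bias, and every weight becomes an 8-bit integer. The function names and the 16-bit storage size assumed for scale and bias are illustrative assumptions.

```python
import numpy as np


def affine_quantize(w, bits=8, group_size=64):
    """Quantize w in groups; returns (codes, scale, bias), one scale/bias per group.

    Codes are stored as uint8, so this sketch supports bits <= 8 only.
    """
    groups = np.asarray(w, dtype=np.float32).reshape(-1, group_size)
    w_min = groups.min(axis=1, keepdims=True)
    w_max = groups.max(axis=1, keepdims=True)
    levels = 2**bits - 1
    scale = (w_max - w_min) / levels
    scale = np.where(scale == 0, 1.0, scale)  # avoid 0/0 for constant groups
    q = np.clip(np.round((groups - w_min) / scale), 0, levels).astype(np.uint8)
    return q, scale, w_min


def affine_dequantize(q, scale, bias):
    """Reconstruct approximate weights: w ~= q * scale + bias."""
    return q.astype(np.float32) * scale + bias


def bits_per_weight(bits=8, group_size=64, scale_bits=16, bias_bits=16):
    """Storage cost per weight, counting the per-group scale and bias."""
    return bits + (scale_bits + bias_bits) / group_size
```

Under these assumptions a fully quantized layer costs 8 + 32/64 = 8.5 bits per weight. The reported average of 9.257 is higher, which would be consistent with some tensors being kept at higher precision, though the conversion details above do not say which.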

Converted by Osaurus AI