Gemma 4 E2B IT Text-Only 4-bit MLX Repack

Text-only repack of mlx-community/gemma-4-e2b-it-4bit for Overshow local inference via MLX Swift.

This artefact keeps the Gemma 4 language model tensors and tokenizer files, strips the "language_model." prefix from tensor names to match what the pinned MLX Swift text loader expects, and drops the unused audio and vision tower tensors.
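
The keep/rename/drop rule described above can be sketched in a few lines of Python. This is an illustration only, not the repack script itself: the function name plan_tensors and the example tensor names are hypothetical, and the rule "everything without the language_model. prefix is dropped" is an assumption about how the audio and vision tower tensors are identified.

```python
# Prefix carried by language model tensors in the source checkpoint (from the text above).
TEXT_PREFIX = "language_model."


def plan_tensors(names):
    """Split tensor names into a rename map (kept) and a drop list.

    Assumption: any tensor without TEXT_PREFIX belongs to an unused tower
    and is dropped. The real script may classify names differently.
    """
    renames = {}
    dropped = []
    for name in names:
        if name.startswith(TEXT_PREFIX):
            # Kept tensors lose the prefix so the text loader finds them.
            renames[name] = name[len(TEXT_PREFIX):]
        else:
            dropped.append(name)
    return renames, dropped


if __name__ == "__main__":
    # Hypothetical tensor names, for illustration only.
    example = [
        "language_model.model.embed_tokens.weight",
        "vision_tower.encoder.weight",
        "audio_tower.conv.weight",
    ]
    renames, dropped = plan_tensors(example)
    print(renames)
    print(dropped)
```

Building the plan separately from writing the weights makes it possible to report kept and dropped counts before any data is copied.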

Source

  • Source repository: mlx-community/gemma-4-e2b-it-4bit
  • Source revision: 99d9a53ff828d365a8ecae538e45f80a08d612cd
  • Repack script: scripts/repack-gemma4-text-only.py in over-show/app
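
To reproduce the repack from the same inputs, the source checkpoint can be fetched at the pinned revision. The sketch below assumes the huggingface_hub package is installed and uses its snapshot_download function. The helper name fetch_source and the local directory are hypothetical, and the repack script may obtain the source differently.

```python
# Values copied from the Source list above.
SOURCE_REPO = "mlx-community/gemma-4-e2b-it-4bit"
SOURCE_REVISION = "99d9a53ff828d365a8ecae538e45f80a08d612cd"


def fetch_source(local_dir):
    """Download the source checkpoint at the pinned commit and return its path."""
    # Imported lazily so the constants are usable without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=SOURCE_REPO,
        revision=SOURCE_REVISION,  # pin to the exact commit, not a branch name
        local_dir=local_dir,
    )


if __name__ == "__main__":
    # Hypothetical destination directory.
    print(fetch_source("gemma-4-e2b-it-4bit-source"))
```

Pinning to the full commit hash rather than a branch keeps the tensor plan reproducible even if the source repository is updated later.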

Bundle Shape

Required files:

  • config.json
  • model.safetensors
  • model.safetensors.index.json
  • tokenizer.json
  • tokenizer_config.json
  • generation_config.json
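
A downloaded bundle can be checked against this list before loading. The following is a minimal sketch using only the Python standard library. The function names are hypothetical, and the second check assumes the index file follows the usual safetensors index layout with a top-level weight_map from tensor name to shard file name.

```python
import json
from pathlib import Path

# The required files listed above.
REQUIRED_FILES = (
    "config.json",
    "model.safetensors",
    "model.safetensors.index.json",
    "tokenizer.json",
    "tokenizer_config.json",
    "generation_config.json",
)


def missing_bundle_files(bundle_dir):
    """Return the required file names that are absent from bundle_dir."""
    root = Path(bundle_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


def unresolved_shards(bundle_dir):
    """Return shard files named in the index's weight_map that are not on disk.

    Assumes the index JSON has a "weight_map" object (tensor name -> file name).
    """
    root = Path(bundle_dir)
    index = json.loads((root / "model.safetensors.index.json").read_text())
    shards = sorted(set(index.get("weight_map", {}).values()))
    return [shard for shard in shards if not (root / shard).is_file()]


if __name__ == "__main__":
    bundle = "."  # hypothetical bundle location
    print("missing files:", missing_bundle_files(bundle))
```

An empty list from both functions means the bundle has the expected shape. It does not verify tensor contents.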

Validation

Local validation passed before publishing:

  • Tensor plan: 1,234 tensors kept, 1,415 dropped
  • Weights: 2,512.5 MiB, saving 902.7 MiB against the source MLX checkpoint
  • scripts/validate-mlx-helper.py: 5/5 commands passed
  • swift test --filter Gemma4LoadSmokeTest: 2/2 tests passed
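
The figures above imply the size of the source checkpoint. The short calculation below derives the totals, assuming that "saving" means source weight size minus repacked weight size and that kept plus dropped covers every source tensor. The function name summarize is hypothetical.

```python
def summarize(kept, dropped, kept_mib, saved_mib):
    """Derive source totals from the reported repack figures."""
    total_tensors = kept + dropped
    source_mib = kept_mib + saved_mib  # assumes saving = source size - repacked size
    return {
        "total_tensors": total_tensors,
        "source_mib": source_mib,
        "kept_tensor_pct": 100.0 * kept / total_tensors,
        "saved_size_pct": 100.0 * saved_mib / source_mib,
    }


if __name__ == "__main__":
    stats = summarize(kept=1234, dropped=1415, kept_mib=2512.5, saved_mib=902.7)
    print(f"source tensors: {stats['total_tensors']}")           # 2649
    print(f"source weights: {stats['source_mib']:.1f} MiB")      # 3415.2 MiB
    print(f"tensors kept:   {stats['kept_tensor_pct']:.1f}%")    # 46.6%
    print(f"size saved:     {stats['saved_size_pct']:.1f}%")     # 26.4%
```

Under these assumptions, dropping just over half of the tensors removes roughly a quarter of the weight bytes, so the language model tensors account for most of the checkpoint's size.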