GGUF available — Cerebellum v1 & v2 (ablation-guided mixed-precision)
#29, opened by deucebucket
Ablation-guided mixed-precision GGUF quants for running this model in llama.cpp / ollama:
- Cerebellum v2 — ablation-informed with PLE protection, latest version
- Cerebellum v1 — initial ablation-informed quant
Instead of quantizing every tensor at the same precision, we ran per-tensor ablation experiments to measure which tensors are sensitive to reduced precision and which tolerate it, then assigned precision accordingly. Details and benchmarks are in the model cards.
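To make the idea concrete, here is a minimal sketch of how per-tensor ablation results could be turned into a mixed-precision plan. It is not the actual Cerebellum pipeline: the tensor names, sensitivity scores, thresholds, and the `assign_precision` helper are all hypothetical, and it assumes "PLE protection" means keeping per-layer embedding tensors at the highest precision. Only the quant type names (`Q4_K`, `Q6_K`, `Q8_0`) are standard GGUF types.

```python
# Hypothetical sketch: map per-tensor ablation results to GGUF quant types.
# Names, scores, and thresholds are illustrative, not taken from the real quants.

def assign_precision(sensitivity, protected=(), high=0.05, low=0.01):
    """Return {tensor_name: quant_type}.

    sensitivity: quality loss measured when only that tensor is quantized
                 aggressively (for example, an increase in perplexity).
    protected:   name substrings that are always kept at the highest precision.
    high, low:   assumed cut-offs separating sensitive from tolerant tensors.
    """
    plan = {}
    for name, loss in sensitivity.items():
        if any(tag in name for tag in protected):
            plan[name] = "Q8_0"   # protected regardless of measured loss
        elif loss >= high:
            plan[name] = "Q8_0"   # sensitive: keep near-lossless
        elif loss >= low:
            plan[name] = "Q6_K"   # moderately sensitive
        else:
            plan[name] = "Q4_K"   # tolerant: compress hardest
    return plan


# Made-up ablation scores for four made-up tensor names.
scores = {
    "blk.0.attn_q.weight": 0.002,
    "blk.0.ffn_down.weight": 0.030,
    "blk.1.attn_v.weight": 0.080,
    "per_layer_token_embd.weight": 0.001,
}

plan = assign_precision(scores, protected=("per_layer",))
for name, qtype in plan.items():
    print(f"{name}: {qtype}")
```

The protected list overrides the measured score, so a tensor that looks tolerant in isolation can still be kept at high precision if it is known to matter structurally.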