How to use with the MLX library
# Make sure mlx-lm is installed
# pip install --upgrade mlx-lm

# Generate text with mlx-lm
from mlx_lm import load, generate

model, tokenizer = load("Roxas13/gemma3")

prompt = "Write a story about Einstein"
messages = [{"role": "user", "content": prompt}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True
)

text = generate(model, tokenizer, prompt=prompt, verbose=True)
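
The snippet above handles a single user turn. For multi-turn use, the same steps can be wrapped in a small helper. The following is a minimal sketch, not part of mlx-lm: `build_messages`, `chat`, and the `history` argument are hypothetical names introduced here for illustration. It assumes an mlx-lm version in which `generate` accepts a `max_tokens` keyword and the Gemma chat template accepts `user` and `assistant` roles.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def build_messages(
    user_prompt: str,
    history: Optional[Sequence[Tuple[str, str]]] = None,
) -> List[Dict[str, str]]:
    """Build chat messages in the role/content format used above.

    `history` is a hypothetical list of (user_text, assistant_text) pairs
    from earlier turns; it is not an mlx-lm concept.
    """
    messages: List[Dict[str, str]] = []
    for user_text, assistant_text in history or []:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    # The newest user turn always goes last so the model answers it.
    messages.append({"role": "user", "content": user_prompt})
    return messages


def chat(model, tokenizer, user_prompt, history=None, max_tokens=256):
    """Generate one reply with an already loaded model and tokenizer."""
    # Imported lazily so build_messages works without mlx-lm installed.
    from mlx_lm import generate

    prompt = tokenizer.apply_chat_template(
        build_messages(user_prompt, history), add_generation_prompt=True
    )
    # max_tokens is assumed to be forwarded to the generation loop.
    return generate(model, tokenizer, prompt=prompt, max_tokens=max_tokens)
```

With `model, tokenizer = load("Roxas13/gemma3")` already executed, a call such as `chat(model, tokenizer, "Continue the story", history=[("Write a story about Einstein", text)])` would reuse the loaded weights instead of reloading them for every turn.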

AEGIS Gemma 3 4B IT (text-only, 4-bit MLX)

Text-only, 4-bit MLX-quantized build of Gemma 3 4B IT, used on-device by the AEGIS app. This is a modified (vision tower removed, 4-bit quantized) derivative of Google's Gemma 3 4B IT.
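
To give a rough idea of what "4-bit quantized" means, the following pure-Python sketch shows group-wise affine quantization, where each group of weights is stored as small integer codes plus a scale and a bias, and eight 4-bit codes are packed into one 32-bit word. It is an illustration under stated assumptions, not the code used to produce this model: the function names are hypothetical, and the rounding rule and bit order are assumptions rather than a description of MLX internals.

```python
def quantize_group(values, bits=4):
    """Affine-quantize one group of floats to unsigned `bits`-bit codes."""
    levels = (1 << bits) - 1            # 15 distinct steps for 4-bit
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels or 1.0   # fall back to 1.0 for a constant group
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo             # lo acts as the per-group bias


def dequantize_group(codes, scale, bias):
    """Recover approximate floats: value ~= code * scale + bias."""
    return [c * scale + bias for c in codes]


def pack_uint32(codes, bits=4):
    """Pack up to 32 // bits codes into one 32-bit word, first code lowest.

    The bit order is an assumption made for this sketch.
    """
    mask = (1 << bits) - 1
    word = 0
    for i, code in enumerate(codes):
        word |= (code & mask) << (i * bits)
    return word
```

Each reconstructed value differs from the original by at most about half a quantization step (`scale / 2`), which is the precision traded for the smaller on-device footprint.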

License & notices

Gemma is provided under and subject to the Gemma Terms of Use: https://ai.google.dev/gemma/terms

Use of this model must comply with the Gemma Prohibited Use Policy: https://ai.google.dev/gemma/prohibited_use_policy

This repository redistributes a modified Gemma derivative. Under the Gemma Terms of Use, the Terms of Use and Prohibited Use Policy linked above apply to this model and are passed along to all recipients. Modifications relative to the base model: vision encoder removed (text-only) and weights quantized to 4-bit for MLX.

Safetensors (MLX). Model size: 0.7B params. Tensor types: F16, U32.