---
license: apache-2.0
base_model: google/functiongemma-270m-it
library_name: mlx
language:
  - en
tags:
  - quantllm
  - mlx
  - mlx-lm
  - apple-silicon
  - transformers
  - q4_k_m
---

# 🍎 functiongemma-270m-it-4bit-mlx

google/functiongemma-270m-it converted to MLX format.

QuantLLM Format Quantization

⭐ Star QuantLLM on GitHub


## 📖 About This Model

This model is google/functiongemma-270m-it converted to MLX format and optimized for Apple Silicon (M1/M2/M3/M4) Macs with native acceleration.

| Property | Value |
|---|---|
| Base Model | google/functiongemma-270m-it |
| Format | MLX |
| Quantization | Q4_K_M |
| License | apache-2.0 |
| Created With | QuantLLM |

## 🚀 Quick Start

### Generate Text with mlx-lm

```python
from mlx_lm import load, generate

# Load the model
model, tokenizer = load("QuantLLM/functiongemma-270m-it-4bit-mlx")

# Format the prompt with the chat template
prompt = "Explain quantum computing in simple terms"
messages = [{"role": "user", "content": prompt}]
prompt_formatted = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
)

# Generate a response
text = generate(model, tokenizer, prompt=prompt_formatted, verbose=True)
print(text)
```

### Streaming Generation

```python
from mlx_lm import load, stream_generate

model, tokenizer = load("QuantLLM/functiongemma-270m-it-4bit-mlx")

prompt = "Write a haiku about coding"
messages = [{"role": "user", "content": prompt}]
prompt_formatted = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
)

# Stream tokens as they're generated; stream_generate yields response
# objects whose .text field holds the newly decoded text
for response in stream_generate(model, tokenizer, prompt=prompt_formatted, max_tokens=200):
    print(response.text, end="", flush=True)
```
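
### Function Calling

The base model is tuned for function calling, and mlx-lm's tokenizer wraps the underlying transformers tokenizer, so a tool schema can be passed through `apply_chat_template`'s `tools` argument. This is a minimal sketch, assuming the chat template accepts OpenAI-style JSON-Schema tool definitions; the `get_weather` tool is a hypothetical example, not part of the model:

```python
import json

# Hypothetical tool definition in the common OpenAI-style JSON-Schema format
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
            },
        },
    }
]

messages = [{"role": "user", "content": "What's the weather in Paris?"}]

# With a loaded tokenizer this would become the formatted prompt:
# prompt_formatted = tokenizer.apply_chat_template(
#     messages, tools=tools, add_generation_prompt=True
# )

# The schema is plain JSON, so it can be serialized for logging or reuse
print(json.dumps(tools[0]["function"]["name"]))
```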

### Command Line Interface

```bash
# Install mlx-lm
pip install mlx-lm

# Generate text
python -m mlx_lm.generate --model QuantLLM/functiongemma-270m-it-4bit-mlx --prompt "Hello!"

# Interactive chat
python -m mlx_lm.chat --model QuantLLM/functiongemma-270m-it-4bit-mlx
```

## System Requirements

| Requirement | Minimum |
|---|---|
| Chip | Apple Silicon (M1/M2/M3/M4) |
| macOS | 13.0 (Ventura) or later |
| Python | 3.10+ |
| RAM | 8 GB+ (16 GB recommended) |

```bash
# Install dependencies
pip install mlx-lm
```
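
The minimums in the table above can be pre-checked from the standard library alone. `meets_requirements` below is a hypothetical helper for illustration, not part of mlx-lm:

```python
import platform
import sys

def meets_requirements(system: str, machine: str, python_version: tuple) -> bool:
    """Check the table's minimums: an Apple Silicon Mac and Python 3.10+."""
    return system == "Darwin" and machine == "arm64" and python_version >= (3, 10)

# Check the current host and interpreter
ok = meets_requirements(platform.system(), platform.machine(), sys.version_info[:3])
print("MLX-ready" if ok else "This model needs an Apple Silicon Mac with Python 3.10+")
```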

## 📊 Model Details

| Property | Value |
|---|---|
| Original Model | google/functiongemma-270m-it |
| Format | MLX |
| Quantization | Q4_K_M |
| License | apache-2.0 |
| Export Date | 2025-12-21 |
| Exported By | QuantLLM v2.0 |

## 🚀 Created with QuantLLM

Convert any model to GGUF, ONNX, or MLX in one line!

```python
from quantllm import turbo

# Load any HuggingFace model
model = turbo("google/functiongemma-270m-it")

# Export to any format
model.export("mlx", quantization="Q4_K_M")

# Push to HuggingFace
model.push("your-repo", format="mlx")
```

📚 Documentation · 🐛 Report Issue · 💡 Request Feature