---
license: apache-2.0
base_model: google/functiongemma-270m-it
library_name: mlx
language:
  - en
tags:
  - quantllm
  - mlx
  - mlx-lm
  - apple-silicon
  - transformers
  - q4_k_m
---

# 🍎 functiongemma-270m-it-4bit-mlx

google/functiongemma-270m-it converted to MLX format.

QuantLLM Format Quantization

⭐ Star QuantLLM on GitHub


## 📖 About This Model

This model is google/functiongemma-270m-it converted to MLX format and optimized for Apple Silicon (M1/M2/M3/M4) Macs with native acceleration.

| Property | Value |
|---|---|
| Base Model | google/functiongemma-270m-it |
| Format | MLX |
| Quantization | Q4_K_M |
| License | apache-2.0 |
| Created With | QuantLLM |

## 🚀 Quick Start

### Generate Text with mlx-lm

```python
from mlx_lm import load, generate

# Load the model
model, tokenizer = load("QuantLLM/functiongemma-270m-it-4bit-mlx")

# Format the prompt with the chat template
prompt = "Explain quantum computing in simple terms"
messages = [{"role": "user", "content": prompt}]
prompt_formatted = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
)

# Generate a response
text = generate(model, tokenizer, prompt=prompt_formatted, verbose=True)
print(text)
```

### Streaming Generation

```python
from mlx_lm import load, stream_generate

model, tokenizer = load("QuantLLM/functiongemma-270m-it-4bit-mlx")

prompt = "Write a haiku about coding"
messages = [{"role": "user", "content": prompt}]
prompt_formatted = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
)

# Stream tokens as they're generated; stream_generate yields response
# objects whose .text field holds the newly decoded text
for response in stream_generate(model, tokenizer, prompt=prompt_formatted, max_tokens=200):
    print(response.text, end="", flush=True)
```
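
### Function Calling

The base model is tuned for function calling, and mlx-lm's tokenizer wraps the underlying transformers tokenizer, so a tool schema can be passed through `apply_chat_template`'s `tools` argument. This is a minimal sketch, assuming the chat template accepts OpenAI-style JSON-Schema tool definitions; the `get_weather` tool is a hypothetical example, not part of the model:

```python
import json

# Hypothetical tool definition in the common OpenAI-style JSON-Schema format
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
            },
        },
    }
]

messages = [{"role": "user", "content": "What's the weather in Paris?"}]

# With a loaded tokenizer this would become the formatted prompt:
# prompt_formatted = tokenizer.apply_chat_template(
#     messages, tools=tools, add_generation_prompt=True
# )

# The schema is plain JSON, so it can be serialized for logging or reuse
print(json.dumps(tools[0]["function"]["name"]))
```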

### Command Line Interface

```bash
# Install mlx-lm
pip install mlx-lm

# Generate text
python -m mlx_lm.generate --model QuantLLM/functiongemma-270m-it-4bit-mlx --prompt "Hello!"

# Interactive chat
python -m mlx_lm.chat --model QuantLLM/functiongemma-270m-it-4bit-mlx
```

## System Requirements

| Requirement | Minimum |
|---|---|
| Chip | Apple Silicon (M1/M2/M3/M4) |
| macOS | 13.0 (Ventura) or later |
| Python | 3.10+ |
| RAM | 8 GB+ (16 GB recommended) |

```bash
# Install dependencies
pip install mlx-lm
```
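
The minimums in the table above can be pre-checked from the standard library alone. `meets_requirements` below is a hypothetical helper for illustration, not part of mlx-lm:

```python
import platform
import sys

def meets_requirements(system: str, machine: str, python_version: tuple) -> bool:
    """Check the table's minimums: an Apple Silicon Mac and Python 3.10+."""
    return system == "Darwin" and machine == "arm64" and python_version >= (3, 10)

# Check the current host and interpreter
ok = meets_requirements(platform.system(), platform.machine(), sys.version_info[:3])
print("MLX-ready" if ok else "This model needs an Apple Silicon Mac with Python 3.10+")
```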

## 📊 Model Details

| Property | Value |
|---|---|
| Original Model | google/functiongemma-270m-it |
| Format | MLX |
| Quantization | Q4_K_M |
| License | apache-2.0 |
| Export Date | 2025-12-21 |
| Exported By | QuantLLM v2.0 |

## 🚀 Created with QuantLLM

Convert any model to GGUF, ONNX, or MLX in one line!

```python
from quantllm import turbo

# Load any HuggingFace model
model = turbo("google/functiongemma-270m-it")

# Export to any format
model.export("mlx", quantization="Q4_K_M")

# Push to HuggingFace
model.push("your-repo", format="mlx")
```

📚 Documentation · 🐛 Report Issue · 💡 Request Feature