How to use from llama.cpp
Install with Homebrew
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf vanpelt/summarizer:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf vanpelt/summarizer:Q4_K_M
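Once the server is running, it exposes an OpenAI-compatible HTTP API, by default on port 8080. A minimal sketch of a chat request with curl (the port and the example prompt are assumptions):
# Query the local server's OpenAI-compatible endpoint (default port 8080 assumed):
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Summarize this task: fix the flaky login test in CI"}]}'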
Install with WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf vanpelt/summarizer:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf vanpelt/summarizer:Q4_K_M
Use a pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf vanpelt/summarizer:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf vanpelt/summarizer:Q4_K_M
Build from source
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf vanpelt/summarizer:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf vanpelt/summarizer:Q4_K_M
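For GPU acceleration, llama.cpp exposes optional backends at configure time. A sketch of a CUDA build, assuming the CUDA toolkit is installed:
# Configure with the CUDA backend enabled, then build as above:
cmake -B build -DGGML_CUDA=ON
cmake --build build -j --target llama-server llama-cli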
Use Docker
docker model run hf.co/vanpelt/summarizer:Q4_K_M
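docker model run can also take a one-shot prompt as a trailing argument. A sketch, assuming the Docker Model Runner plugin is enabled in your Docker installation (the task text is illustrative):
docker model run hf.co/vanpelt/summarizer:Q4_K_M "Summarize this task: refactor the payments retry logic"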

summarizer

Fine-tuned Gemma-3-270M for task summarization and branch naming

Model Details

  • Base Model: google/gemma-3-270m-it
  • Format: GGUF (quantized for efficient inference)
  • Quantization: Q4_K_M
  • Use Case: Generating concise task titles and git branch names

Training

Usage

With Ollama

ollama pull hf.co/vanpelt/summarizer
ollama run hf.co/vanpelt/summarizer
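
ollama run also accepts a prompt inline, which suits this model's one-shot use case; the task text below is illustrative:

ollama run hf.co/vanpelt/summarizer "Summarize this task: add rate limiting to the public API"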

With llama.cpp

# Download the GGUF file into the current directory
huggingface-cli download vanpelt/summarizer gemma3-270m-summarizer-Q4_K_M.gguf --local-dir .

# Run inference in the terminal
llama-cli -m gemma3-270m-summarizer-Q4_K_M.gguf -p "Your prompt here"
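
The same local file can be served over an OpenAI-compatible API with llama-server (a sketch; default port 8080, queried as in the curl example near the top):

# Serve the downloaded GGUF file locally
llama-server -m gemma3-270m-summarizer-Q4_K_M.gguf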

Files

  • tokenizer.json (31.8 MB)
  • tokenizer_config.json (1.1 MB)
  • added_tokens.json (< 0.1 MB)
  • chat_template.jinja (< 0.1 MB)
  • Modelfile (< 0.1 MB)
  • template (< 0.1 MB)
  • system (< 0.1 MB)
  • model.safetensors (511.4 MB)
  • gemma3-270m-summarizer-Q4_K_M.gguf (241.4 MB)
  • special_tokens_map.json (< 0.1 MB)
  • config.json (< 0.1 MB)
  • params (< 0.1 MB)
  • tokenizer.model (4.5 MB)