---
language:
- en
license: mit
tags:
- text-generation
- compression
- coding
- ollama
- squeezr
base_model: Qwen/Qwen2.5-0.5B-Instruct
model_type: qwen2
pipeline_tag: text-generation
---
# Zest — Local AI Compression Model for Squeezr
Zest is a fine-tuned 0.8B model that compresses coding tool outputs (bash, git, test runners, file reads) to save context window tokens. It is designed to run locally via Ollama as the AI backend for [Squeezr](https://github.com/sergioramosv/Squeezr).
## Quick install
```bash
# Install via Squeezr wizard (recommended)
squeezr zest
```
Or manually:
```bash
ollama pull ramosvs/zest  # coming soon
# Or use the GGUF directly:
ollama create zest -f Modelfile.zest
```
## What it does
- **Input**: raw coding tool output (git diff, npm install, test failure, file read...)
- **Output**: compressed version preserving errors, paths, function names, key values
- **Typical savings**: 52–72% on real Claude Code tool outputs (>5K chars)
- **Minimum input**: 1500 chars (smaller inputs may expand — handled by Squeezr's safety net)
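The size gate and safety net described in the list above can be sketched in a few lines of shell. This is only an illustration under stated assumptions, not Squeezr's actual implementation: the function name `compress_or_passthrough` is hypothetical, and the "safety net" is assumed to mean "keep the original whenever the compressor fails or its output is not shorter".

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a size gate plus fallback; not Squeezr's actual code.

MIN_CHARS=1500  # minimum input size quoted in the list above

# compress_or_passthrough COMPRESSOR [ARGS...]
# Reads tool output on stdin. Inputs shorter than MIN_CHARS are printed
# unchanged. Larger inputs are piped through COMPRESSOR, and the original is
# kept if the compressor fails or its output is empty or not shorter.
compress_or_passthrough() {
  local input output
  input=$(cat)   # note: command substitution drops trailing newlines
  if [ "${#input}" -lt "$MIN_CHARS" ]; then
    printf '%s\n' "$input"
    return 0
  fi
  if ! output=$(printf '%s\n' "$input" | "$@"); then
    printf '%s\n' "$input"   # compressor failed: fall back to the original
    return 0
  fi
  if [ -z "$output" ] || [ "${#output}" -ge "${#input}" ]; then
    printf '%s\n' "$input"   # no savings: fall back to the original
  else
    printf '%s\n' "$output"
  fi
}
```

With a local model installed, a call could look like `npm test 2>&1 | compress_or_passthrough ollama run zest`.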
## Performance
| Metric | Value |
| --- | --- |
| eval_loss | 0.4422 |
| eval_accuracy | 89.12% |
| Input size sweet spot | ≥5K chars |
| Compression on large inputs | 52–72% |
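To check the compression figure against your own tool outputs, a small helper like the one below may be useful. It assumes savings are measured as the share of characters removed, which is a guess at how the numbers above were computed; the function name `savings_pct` is hypothetical.

```bash
# savings_pct ORIGINAL_FILE COMPRESSED_FILE
# Prints the percentage of characters removed, truncated to an integer.
# Assumption: "savings" means (original - compressed) / original, in characters.
savings_pct() {
  local before after
  before=$(( $(wc -m < "$1") ))   # arithmetic expansion strips padding from wc
  after=$(( $(wc -m < "$2") ))
  if [ "$before" -eq 0 ]; then
    echo 0
    return 0
  fi
  echo $(( (before - after) * 100 / before ))
}
```

For example, a 10,000-character log compressed to 4,000 characters gives `(10000 - 4000) * 100 / 10000 = 60`, which falls inside the range reported above.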
## Training
Zest was fine-tuned from Qwen3.5-0.8B using LoRA (r=16, α=32) on a distillation dataset of 1,111 training pairs generated by Claude Opus 4.7. The dataset covers 50+ categories: git, test runners, build tools, docker, kubectl, npm, stack traces, MCP responses, etc.
## Usage with Ollama
The following Modelfile wraps the Q4_K_M GGUF with the compression system prompt and greedy (deterministic) sampling settings:
```
FROM zest-Q4_K_M.gguf
SYSTEM """You are compressing a coding tool output to save tokens. Extract ONLY what is essential: errors, file paths, function names, test failures, key values, warnings. Be extremely concise, target under 150 tokens. Output only the compressed content, nothing else."""
PARAMETER temperature 0
PARAMETER top_p 1
PARAMETER top_k 1
PARAMETER num_predict 300
PARAMETER num_ctx 2048
```
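Once a model has been created from this Modelfile, it can be tried by hand outside Squeezr. The sketch below assumes the model was registered under the name `zest` (as in `ollama create zest -f Modelfile.zest`) and relies on `ollama run` reading piped standard input as the prompt. The helper name `zest_compress` and the `ZEST_MODEL` override variable are hypothetical.

```bash
# zest_compress: hypothetical helper, not part of Squeezr or Ollama.
# Reads raw tool output on stdin and prints the model's compressed version.
# Set ZEST_MODEL to use a different model name (defaults to "zest").
zest_compress() {
  ollama run "${ZEST_MODEL:-zest}"
}

# Example (requires a running Ollama with the model installed):
#   git diff | zest_compress
```

Because the Modelfile caps `num_ctx` at 2048 tokens, very large inputs may be truncated by the model, so it is worth spot-checking the output on long logs.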
## Integration with Squeezr
After `squeezr zest` finishes its setup, add the following to `~/.squeezr/squeezr.toml`:
```toml
[compression]
ai_compression = true
ai_min_chars = 1500
[local]
enabled = true
upstream_url = "http://localhost:11434"
compression_model = "zest"
```
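Before enabling the local backend, it may help to confirm that Ollama actually has a model matching `compression_model`. The check below is a minimal sketch: the function name `zest_ready` is hypothetical, and it assumes `ollama list` prints a header row followed by one model per line with the name (for example `zest:latest`) in the first column.

```bash
# zest_ready [MODEL]
# Succeeds if `ollama list` shows a model named MODEL or MODEL:<tag>.
# MODEL defaults to "zest", matching compression_model in the config above.
zest_ready() {
  local model="${1:-zest}"
  ollama list | awk -v m="$model" '
    NR > 1 && ($1 == m || index($1, m ":") == 1) { found = 1 }
    END { exit !found }
  '
}

# Example:
#   zest_ready && echo "zest is installed" || echo "run: squeezr zest"
```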