---
language:
- en
license: mit
tags:
- text-generation
- compression
- coding
- ollama
- squeezr
base_model: Qwen/Qwen2.5-0.5B-Instruct
model_type: qwen2
pipeline_tag: text-generation
---

# Zest: Local AI Compression Model for Squeezr

Zest is a fine-tuned 0.8B model that compresses coding tool outputs (bash, git, test runners, file reads) to save context window tokens. Designed to run locally via Ollama as the AI backend for [Squeezr](https://github.com/sergioramosv/Squeezr).

## Quick install

```bash
# Install via Squeezr wizard (recommended)
squeezr zest
```

Or manually:

```bash
ollama pull ramosvs/zest  # coming soon
# Or use the GGUF directly:
ollama create zest -f Modelfile.zest
```

## What it does

- **Input**: raw coding tool output (git diff, npm install, test failure, file read...)
- **Output**: compressed version preserving errors, paths, function names, key values
- **Typical savings**: 52–72% on real Claude Code tool outputs (>5K chars)
- **Minimum input**: 1500 chars (smaller inputs may expand, which Squeezr's safety net handles)

## Performance

| Metric | Value |
|---|---|
| eval_loss | 0.4422 |
| eval_accuracy | 89.12% |
| Input size sweet spot | ≥5K chars |
| Compression on large inputs | 52–72% |

## Training

Fine-tuned from Qwen3.5-0.8B using LoRA (r=16, α=32) on a distillation dataset of 1,111 training pairs generated by Claude Opus 4.7. Dataset covers 50+ categories: git, test runners, build tools, docker, kubectl, npm, stack traces, MCP responses, etc.

## Usage with Ollama

```
FROM zest-Q4_K_M.gguf
SYSTEM """You are compressing a coding tool output to save tokens. Extract ONLY what is essential: errors, file paths, function names, test failures, key values, warnings. Be extremely concise, target under 150 tokens.
Output only the compressed content, nothing else."""
PARAMETER temperature 0
PARAMETER top_p 1
PARAMETER top_k 1
PARAMETER num_predict 300
PARAMETER num_ctx 2048
```

## Integration with Squeezr

After `squeezr zest` configures everything, add to `~/.squeezr/squeezr.toml`:

```toml
[compression]
ai_compression = true
ai_min_chars = 1500

[local]
enabled = true
upstream_url = "http://localhost:11434"
compression_model = "zest"
```
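
To illustrate how the `ai_min_chars` threshold and the expansion safety net might fit together, here is a minimal Python sketch. It is not Squeezr's actual implementation: the `compress` and `ollama_generate` helpers are hypothetical names, and the fallback rule (return the original text when the output is empty or no shorter) is an assumption about what a safety net could look like. The only external call is Ollama's `/api/generate` endpoint with `stream` set to `false`, using the URL and model name from the config above.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # upstream_url from squeezr.toml
MODEL = "zest"                         # compression_model from squeezr.toml
MIN_CHARS = 1500                       # ai_min_chars from squeezr.toml


def ollama_generate(prompt: str) -> str:
    """Send one non-streaming request to Ollama's /api/generate endpoint."""
    payload = json.dumps(
        {"model": MODEL, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    request = urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read())["response"]


def compress(text: str, generate=ollama_generate, min_chars: int = MIN_CHARS) -> str:
    """Return a compressed version of text, or text itself when compression does not help.

    Hypothetical helper: the fallback rules below are assumptions, not Squeezr's code.
    """
    if len(text) < min_chars:
        # Below the minimum input size: skip the model entirely.
        return text
    compressed = generate(text).strip()
    if not compressed or len(compressed) >= len(text):
        # Assumed safety net: never return an empty or expanded result.
        return text
    return compressed
```

With Ollama running and the `zest` model created, `compress(tool_output)` would return the shortened text for large inputs and pass small inputs through untouched. The `generate` parameter exists only so the gating logic can be exercised without a running server.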