| | --- |
| | license: mit |
| | base_model: LocoreMind/LocoOperator-4B |
| | tags: |
| | - code |
| | - agent |
| | - tool-calling |
| | - gguf |
| | - llama-cpp |
| | - qwen |
| | --- |
| | |
| | # LocoOperator-4B-GGUF |
| |
|
| | This repository contains the **official GGUF quantized versions** of [LocoOperator-4B](https://huggingface.co/LocoreMind/LocoOperator-4B). |
| |
|
| | **LocoOperator-4B** is a 4B-parameter code exploration agent distilled from **Qwen3-Coder-Next**. It is specifically optimized for local agent loops (like Claude Code style), providing high-speed codebase navigation with **100% JSON tool-calling validity**. |
| |
|
| | ## π Which file should I choose? |
| |
|
| | We provide several quantization levels to balance performance and memory usage: |
| |
|
| | | File Name | Size | Recommendation | |
| | |-----------|------|----------------| |
| | | **LocoOperator-4B.Q8_0.gguf** | 4.28 GB | **Best Accuracy.** Recommended for local agent loops to ensure perfect JSON output. | |
| | | **LocoOperator-4B.Q6_K.gguf** | 3.31 GB | **Great Balance.** Near-lossless logic with a smaller footprint. | |
| | | **LocoOperator-4B.Q4_K_M.gguf**| 2.50 GB | **Standard.** Compatible with almost all local LLM runners (LM Studio, Ollama, etc.). | |
| | | **LocoOperator-4B.IQ4_XS.gguf**| 2.29 GB | **Advanced.** Uses Importance Quantization for better performance at smaller sizes. | |
| | |
| | ## π Usage (llama.cpp) |
| | |
| | To run this model using `llama-cli` or `llama-server`, we recommend a **context size of at least 50K** to handle multi-turn codebase exploration: |
| | |
| | ### Simple CLI Chat: |
| | ```bash |
| | ./llama-cli \ |
| | -m LocoOperator-4B.Q8_0.gguf \ |
| | -c 51200 \ |
| | -p "You are a helpful codebase explorer. Use tools to help the user." |
| | ``` |
| | |
| | ### Serve as an OpenAI-compatible API: |
| | ```bash |
| | ./llama-server \ |
| | -m LocoOperator-4B.Q8_0.gguf \ |
| | --ctx-size 51200 \ |
| | --port 8080 |
| | ``` |
| | |
| | ## π Model Details |
| | - **Base Model:** Qwen3-4B-Instruct-2507 |
| | - **Teacher Model:** Qwen3-Coder-Next |
| | - **Training Method:** Full-parameter SFT (Knowledge Distillation) |
| | - **Primary Use Case:** Codebase exploration (Read, Grep, Glob, Bash, Task) |
| | |
| | ## π Links |
| | - **Main Repository:** [LocoreMind/LocoOperator-4B](https://huggingface.co/LocoreMind/LocoOperator-4B) |
| | - **GitHub:** [LocoreMind/LocoOperator](https://github.com/LocoreMind/LocoOperator) |
| | - **Blog:** [locoremind.com/blog/loco-operator](https://locoremind.com/blog/loco-operator) |
| | |
| | ## π Acknowledgments |
| | Special thanks to `mradermacher` for the initial quantization work and the `llama.cpp` community. |