---
license: mit
base_model: LocoreMind/LocoOperator-4B
tags:
- code
- agent
- tool-calling
- gguf
- llama-cpp
- qwen
---

# LocoOperator-4B-GGUF

This repository contains the **official GGUF quantized versions** of [LocoOperator-4B](https://huggingface.co/LocoreMind/LocoOperator-4B).

**LocoOperator-4B** is a 4B-parameter code exploration agent distilled from **Qwen3-Coder-Next**. It is optimized specifically for local agent loops (Claude Code-style workflows), providing high-speed codebase navigation with **100% JSON tool-calling validity**.

## πŸš€ Which file should I choose?

We provide several quantization levels to balance performance and memory usage:

| File Name | Size | Recommendation |
|-----------|------|----------------|
| **LocoOperator-4B.Q8_0.gguf** | 4.28 GB | **Best Accuracy.** Recommended for local agent loops to ensure perfect JSON output. |
| **LocoOperator-4B.Q6_K.gguf** | 3.31 GB | **Great Balance.** Near-lossless logic with a smaller footprint. |
| **LocoOperator-4B.Q4_K_M.gguf**| 2.50 GB | **Standard.** Compatible with almost all local LLM runners (LM Studio, Ollama, etc.). |
| **LocoOperator-4B.IQ4_XS.gguf**| 2.29 GB | **Advanced.** Uses importance-based quantization for better quality at smaller file sizes. |

## πŸ›  Usage (llama.cpp)

To run this model using `llama-cli` or `llama-server`, we recommend a **context size of at least 50K tokens** to handle multi-turn codebase exploration:

### Simple CLI Chat:
```bash
./llama-cli \
    -m LocoOperator-4B.Q8_0.gguf \
    -c 51200 \
    -p "You are a helpful codebase explorer. Use tools to help the user."
```

### Serve as an OpenAI-compatible API:
```bash
./llama-server \
    -m LocoOperator-4B.Q8_0.gguf \
    --ctx-size 51200 \
    --port 8080
```
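Once the server is running, an agent loop can talk to it over the standard OpenAI-compatible `/v1/chat/completions` endpoint. As a minimal sketch, the snippet below builds an example request body; the `grep` tool schema is purely illustrative and not the model's official tool definition (note that some llama.cpp builds also require the `--jinja` flag for tool-call parsing):

```python
import json

# Illustrative request body for llama-server's OpenAI-compatible
# /v1/chat/completions endpoint (port 8080 as configured above).
# The "grep" tool schema is a hypothetical example, not the model's
# official tool definition.
payload = {
    "model": "LocoOperator-4B.Q8_0.gguf",
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful codebase explorer. Use tools to help the user.",
        },
        {"role": "user", "content": "Where is the HTTP router defined?"},
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "grep",
                "description": "Search file contents with a regex pattern.",
                "parameters": {
                    "type": "object",
                    "properties": {"pattern": {"type": "string"}},
                    "required": ["pattern"],
                },
            },
        }
    ],
}

# POST this body to http://localhost:8080/v1/chat/completions,
# e.g. with requests.post(url, json=payload); tool calls come back
# in the assistant message's "tool_calls" field.
body = json.dumps(payload)
```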

## πŸ“‹ Model Details
- **Base Model:** Qwen3-4B-Instruct-2507
- **Teacher Model:** Qwen3-Coder-Next
- **Training Method:** Full-parameter SFT (Knowledge Distillation)
- **Primary Use Case:** Codebase exploration (Read, Grep, Glob, Bash, Task)

## πŸ”— Links
- **Main Repository:** [LocoreMind/LocoOperator-4B](https://huggingface.co/LocoreMind/LocoOperator-4B)
- **GitHub:** [LocoreMind/LocoOperator](https://github.com/LocoreMind/LocoOperator)
- **Blog:** [locoremind.com/blog/loco-operator](https://locoremind.com/blog/loco-operator)

## πŸ™ Acknowledgments
Special thanks to `mradermacher` for the initial quantization work and the `llama.cpp` community.