---
license: mit
base_model: LocoreMind/LocoOperator-4B
tags:
- code
- agent
- tool-calling
- gguf
- llama-cpp
- qwen
---

# LocoOperator-4B-GGUF

This repository contains the **official GGUF quantized versions** of [LocoOperator-4B](https://huggingface.co/LocoreMind/LocoOperator-4B).

**LocoOperator-4B** is a 4B-parameter code-exploration agent distilled from **Qwen3-Coder-Next**. It is optimized for local agent loops (in the style of Claude Code), delivering high-speed codebase navigation with **100% JSON tool-calling validity**.

## 🚀 Which file should I choose?

We provide several quantization levels to balance accuracy and memory usage:

| File Name | Size | Recommendation |
|-----------|------|----------------|
| **LocoOperator-4B.Q8_0.gguf** | 4.28 GB | **Best accuracy.** Recommended for local agent loops to ensure valid JSON output. |
| **LocoOperator-4B.Q6_K.gguf** | 3.31 GB | **Great balance.** Near-lossless quality with a smaller footprint. |
| **LocoOperator-4B.Q4_K_M.gguf** | 2.50 GB | **Standard.** Compatible with almost all local LLM runners (LM Studio, Ollama, etc.). |
| **LocoOperator-4B.IQ4_XS.gguf** | 2.29 GB | **Advanced.** Uses importance quantization for better quality at smaller sizes. |

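To fetch a single quantization without cloning the whole repository, the Hugging Face CLI (from the `huggingface_hub` package) can download one file at a time. A minimal sketch, assuming this repository's id is `LocoreMind/LocoOperator-4B-GGUF` (adjust the id and filename to match what you actually want):

```shell
# Install the Hugging Face CLI if you don't have it yet
pip install -U "huggingface_hub[cli]"

# Download only the Q4_K_M quantization into the current directory
# (repo id assumed for illustration; pick any file from the table above)
huggingface-cli download LocoreMind/LocoOperator-4B-GGUF \
  LocoOperator-4B.Q4_K_M.gguf --local-dir .
```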
## 🛠 Usage (llama.cpp)

To run this model with `llama-cli` or `llama-server`, we recommend a **context size of at least 50K tokens** to handle multi-turn codebase exploration.

### Simple CLI chat

```bash
./llama-cli \
  -m LocoOperator-4B.Q8_0.gguf \
  -c 51200 \
  -p "You are a helpful codebase explorer. Use tools to help the user."
```

### Serve an OpenAI-compatible API

```bash
./llama-server \
  -m LocoOperator-4B.Q8_0.gguf \
  --ctx-size 51200 \
  --port 8080
```
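
Once `llama-server` is running, any OpenAI-compatible client can talk to it at `/v1/chat/completions`. Below is a minimal sketch of a tool-calling request payload, assuming the server above on port 8080; the `read_file` tool name and schema are illustrative examples, not part of this model card. Depending on your llama.cpp version, native tool-call parsing may also require launching the server with `--jinja`.

```python
import json

# OpenAI-style chat-completions payload with one illustrative tool.
# The "read_file" tool is a hypothetical example, not a built-in.
payload = {
    "model": "LocoOperator-4B",
    "messages": [
        {"role": "system",
         "content": "You are a helpful codebase explorer. Use tools to help the user."},
        {"role": "user", "content": "What does src/main.py do?"},
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "read_file",
                "description": "Read a file from the workspace and return its contents.",
                "parameters": {
                    "type": "object",
                    "properties": {"path": {"type": "string"}},
                    "required": ["path"],
                },
            },
        }
    ],
}

# Serialize the body for a POST to http://localhost:8080/v1/chat/completions
body = json.dumps(payload)
print(body[:60])
```

You can send the serialized body with any HTTP client, e.g. `curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d @payload.json`.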

## 📋 Model Details
- **Base Model:** Qwen3-4B-Instruct-2507
- **Teacher Model:** Qwen3-Coder-Next
- **Training Method:** Full-parameter SFT (knowledge distillation)
- **Primary Use Case:** Codebase exploration (Read, Grep, Glob, Bash, Task)

## 🔗 Links
- **Main Repository:** [LocoreMind/LocoOperator-4B](https://huggingface.co/LocoreMind/LocoOperator-4B)
- **GitHub:** [LocoreMind/LocoOperator](https://github.com/LocoreMind/LocoOperator)
- **Blog:** [locoremind.com/blog/loco-operator](https://locoremind.com/blog/loco-operator)

## 🙏 Acknowledgments
Special thanks to `mradermacher` for the initial quantization work, and to the `llama.cpp` community.