LocoreMind
/

LocoOperator-4B-GGUF

Model card Files Files and versions

LocoOperator-4B-GGUF / README.md

FutureMa's picture

Create README.md

a84d7d0 verified 4 days ago

|

history blame contribute delete

2.39 kB

	---
	license: mit
	base_model: LocoreMind/LocoOperator-4B
	tags:
	- code
	- agent
	- tool-calling
	- gguf
	- llama-cpp
	- qwen
	---

	# LocoOperator-4B-GGUF

	This repository contains the official GGUF quantized versions of [LocoOperator-4B](https://huggingface.co/LocoreMind/LocoOperator-4B).

	LocoOperator-4B is a 4B-parameter code exploration agent distilled from Qwen3-Coder-Next. It is specifically optimized for local agent loops (like Claude Code style), providing high-speed codebase navigation with 100% JSON tool-calling validity.

	## 🚀 Which file should I choose?

	We provide several quantization levels to balance performance and memory usage:

	\| File Name \| Size \| Recommendation \|
	\|-----------\|------\|----------------\|
	\| LocoOperator-4B.Q8_0.gguf \| 4.28 GB \| Best Accuracy. Recommended for local agent loops to ensure perfect JSON output. \|
	\| LocoOperator-4B.Q6_K.gguf \| 3.31 GB \| Great Balance. Near-lossless logic with a smaller footprint. \|
	\| LocoOperator-4B.Q4_K_M.gguf\| 2.50 GB \| Standard. Compatible with almost all local LLM runners (LM Studio, Ollama, etc.). \|
	\| LocoOperator-4B.IQ4_XS.gguf\| 2.29 GB \| Advanced. Uses Importance Quantization for better performance at smaller sizes. \|

	## 🛠 Usage (llama.cpp)

	To run this model using `llama-cli` or `llama-server`, we recommend a context size of at least 50K to handle multi-turn codebase exploration:

	### Simple CLI Chat:
	```bash
	./llama-cli \
	-m LocoOperator-4B.Q8_0.gguf \
	-c 51200 \
	-p "You are a helpful codebase explorer. Use tools to help the user."
	```

	### Serve as an OpenAI-compatible API:
	```bash
	./llama-server \
	-m LocoOperator-4B.Q8_0.gguf \
	--ctx-size 51200 \
	--port 8080
	```

	## 📋 Model Details
	- Base Model: Qwen3-4B-Instruct-2507
	- Teacher Model: Qwen3-Coder-Next
	- Training Method: Full-parameter SFT (Knowledge Distillation)
	- Primary Use Case: Codebase exploration (Read, Grep, Glob, Bash, Task)

	## 🔗 Links
	- Main Repository: [LocoreMind/LocoOperator-4B](https://huggingface.co/LocoreMind/LocoOperator-4B)
	- GitHub: [LocoreMind/LocoOperator](https://github.com/LocoreMind/LocoOperator)
	- Blog: [locoremind.com/blog/loco-operator](https://locoremind.com/blog/loco-operator)

	## 🙏 Acknowledgments
	Special thanks to `mradermacher` for the initial quantization work and the `llama.cpp` community.