---
language:
- en
license: mit
tags:
- mlx
- qwen3
- agent
- tool-calling
- code
- 4-bit
- quantized
base_model: LocoreMind/LocoOperator-4B
pipeline_tag: text-generation
library_name: mlx
---
# LocoOperator-4B — MLX 4-bit Quantized
This is a **4-bit quantized MLX** version of [LocoreMind/LocoOperator-4B](https://huggingface.co/LocoreMind/LocoOperator-4B), converted for efficient inference on Apple Silicon using [MLX](https://github.com/ml-explore/mlx).
## Model Overview
| Attribute | Value |
|---|---|
| **Original Model** | [LocoreMind/LocoOperator-4B](https://huggingface.co/LocoreMind/LocoOperator-4B) |
| **Architecture** | Qwen3 (4B parameters) |
| **Quantization** | 4-bit (MLX) |
| **Base Model** | Qwen3-4B-Instruct-2507 |
| **Teacher Model** | Qwen3-Coder-Next |
| **Training Method** | Full-parameter SFT (distillation from 170K samples) |
| **Max Sequence Length** | 16,384 tokens |
| **License** | MIT |
## About LocoOperator-4B
LocoOperator-4B is a 4B-parameter tool-calling agent model trained via knowledge distillation from Qwen3-Coder-Next inference traces. It specializes in multi-turn codebase exploration — reading files, searching code, and navigating project structures within a Claude Code-style agent loop.
### Key Features
- **Tool-Calling Agent**: Generates structured `<tool_call>` JSON for Read, Grep, Glob, Bash, Write, Edit, and Task (subagent delegation)
- **100% JSON Validity**: Every tool call is valid JSON with all required arguments — outperforming the teacher model (87.6%)
- **Multi-Turn**: Handles conversation depths of 3–33 messages with consistent tool-calling behavior
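For reference, a tool call in the Qwen3 convention wraps a JSON object (with `name` and `arguments` keys) in `<tool_call>` tags. A hypothetical example — the tool name comes from the list above, but the exact argument schema is an assumption:

```
<tool_call>
{"name": "Grep", "arguments": {"pattern": "def main", "path": "/workspace/myproject"}}
</tool_call>
```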
### Performance
| Metric | Score |
|---|---|
| Tool Call Presence Alignment | **100%** (65/65) |
| First Tool Type Match | **65.6%** (40/61) |
| JSON Validity | **100%** (76/76) |
| Argument Syntax Correctness | **100%** (76/76) |
## Usage with MLX
```bash
pip install mlx-lm
```
```python
from mlx_lm import load, generate

# Download and load the 4-bit quantized weights and tokenizer
model, tokenizer = load("DJLougen/LocoOperator-4B-MLX-4bit")

messages = [
    {
        "role": "system",
        "content": "You are a read-only codebase search specialist."
    },
    {
        "role": "user",
        "content": "Analyze the project structure at /workspace/myproject and explain the architecture."
    }
]

# Render the conversation with the model's chat template before generating
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
response = generate(model, tokenizer, prompt=prompt, max_tokens=512)
print(response)
```
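In an agent loop, the caller must recover the structured call from the raw completion — `mlx-lm` does not parse tool calls for you. A minimal parsing sketch, assuming the Qwen3-style `<tool_call>` JSON wrapping:

```python
import json
import re

def extract_tool_calls(text: str) -> list[dict]:
    """Parse <tool_call> JSON blocks out of a raw model completion.

    Assumes the Qwen3-style convention of a JSON object with "name"
    and "arguments" keys wrapped in <tool_call> tags.
    """
    calls = []
    for payload in re.findall(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", text, re.DOTALL):
        try:
            calls.append(json.loads(payload))
        except json.JSONDecodeError:
            # Skip malformed payloads rather than crash the agent loop
            pass
    return calls

example = (
    "<tool_call>\n"
    '{"name": "Read", "arguments": {"file_path": "/workspace/myproject/README.md"}}\n'
    "</tool_call>"
)
print(extract_tool_calls(example))
```

Each parsed call can then be dispatched to the matching tool implementation and the result appended to `messages` as the next turn.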
## Other Quantizations
| Variant | Link |
|---|---|
| MLX 4-bit | **This repo** |
| MLX 6-bit | [DJLougen/LocoOperator-4B-MLX-6bit](https://huggingface.co/DJLougen/LocoOperator-4B-MLX-6bit) |
| MLX 8-bit | [DJLougen/LocoOperator-4B-MLX-8bit](https://huggingface.co/DJLougen/LocoOperator-4B-MLX-8bit) |
| GGUF | [LocoreMind/LocoOperator-4B-GGUF](https://huggingface.co/LocoreMind/LocoOperator-4B-GGUF) |
| Full Weights | [LocoreMind/LocoOperator-4B](https://huggingface.co/LocoreMind/LocoOperator-4B) |
## Acknowledgments
- [LocoreMind](https://huggingface.co/LocoreMind) for the original LocoOperator-4B model
- [Qwen Team](https://huggingface.co/Qwen) for the Qwen3-4B-Instruct-2507 base model
- [Apple MLX Team](https://github.com/ml-explore/mlx) for the MLX framework