---
language:
- en
license: mit
tags:
- mlx
- qwen3
- agent
- tool-calling
- code
- 8-bit
- quantized
base_model: LocoreMind/LocoOperator-4B
pipeline_tag: text-generation
library_name: mlx
---

# LocoOperator-4B — MLX 8-bit Quantized

This is an **8-bit quantized MLX** version of [LocoreMind/LocoOperator-4B](https://huggingface.co/LocoreMind/LocoOperator-4B), converted for efficient inference on Apple Silicon using [MLX](https://github.com/ml-explore/mlx).

## Model Overview

| Attribute | Value |
|---|---|
| **Original Model** | [LocoreMind/LocoOperator-4B](https://huggingface.co/LocoreMind/LocoOperator-4B) |
| **Architecture** | Qwen3 (4B parameters) |
| **Quantization** | 8-bit (MLX) |
| **Base Model** | Qwen3-4B-Instruct-2507 |
| **Teacher Model** | Qwen3-Coder-Next |
| **Training Method** | Full-parameter SFT (distillation from 170K samples) |
| **Max Sequence Length** | 16,384 tokens |
| **License** | MIT |


## About LocoOperator-4B

LocoOperator-4B is a 4B-parameter tool-calling agent model trained via knowledge distillation from Qwen3-Coder-Next inference traces. It specializes in multi-turn codebase exploration — reading files, searching code, and navigating project structures within a Claude Code-style agent loop.

### Key Features

- **Tool-Calling Agent**: Generates structured `<tool_call>` JSON for Read, Grep, Glob, Bash, Write, Edit, and Task (subagent delegation)
- **100% JSON Validity**: Every tool call is valid JSON with all required arguments, versus 87.6% for the teacher model
- **Multi-Turn**: Handles conversation depths of 3–33 messages with consistent tool-calling behavior

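The tool-call format above can be made concrete. Below is a minimal sketch of how a harness might extract those payloads, assuming the Qwen3-style `<tool_call>…</tool_call>` tag convention; `extract_tool_calls` is an illustrative helper, not part of the model or of mlx-lm:

```python
import json
import re

# A sample assistant turn in the Qwen3-style tool-call format
# (the exact tag layout is assumed from the Qwen3 chat template).
raw = (
    "I'll start from the entry point.\n"
    "<tool_call>\n"
    '{"name": "Read", "arguments": {"file_path": "/workspace/myproject/main.py"}}\n'
    "</tool_call>"
)

def extract_tool_calls(text: str) -> list[dict]:
    """Pull every <tool_call> JSON payload out of a model response."""
    pattern = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)
    return [json.loads(m) for m in pattern.findall(text)]

calls = extract_tool_calls(raw)
print(calls[0]["name"], calls[0]["arguments"]["file_path"])
```

Because the payloads are plain JSON, a `json.loads` failure in this helper is also a convenient place to detect a malformed call.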
### Performance

| Metric | Score |
|---|---|
| Tool Call Presence Alignment | **100%** (65/65) |
| First Tool Type Match | **65.6%** (40/61) |
| JSON Validity | **100%** (76/76) |
| Argument Syntax Correctness | **100%** (76/76) |

## Usage with MLX

```bash
pip install mlx-lm
```

```python
from mlx_lm import load, generate

model, tokenizer = load("DJLougen/LocoOperator-4B-MLX-8bit")

messages = [
    {
        "role": "system",
        "content": "You are a read-only codebase search specialist."
    },
    {
        "role": "user",
        "content": "Analyze the project structure at /workspace/myproject and explain the architecture."
    }
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
response = generate(model, tokenizer, prompt=prompt, max_tokens=512)
print(response)
```

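To show where such generations fit in the surrounding agent loop, here is a minimal sketch of dispatching one parsed tool call. `tool_read` and `tool_glob` are hypothetical stand-ins for the Read and Glob tools named earlier, not the actual Claude Code implementations:

```python
import json
from pathlib import Path

# Hypothetical stand-in implementations for two of the tools listed above;
# the real tools are richer (offsets, truncation, ignore rules, etc.).
def tool_read(file_path: str) -> str:
    return Path(file_path).read_text()

def tool_glob(pattern: str) -> str:
    return "\n".join(str(p) for p in sorted(Path(".").glob(pattern)))

TOOLS = {"Read": tool_read, "Glob": tool_glob}

def run_tool_call(payload: str) -> str:
    """Execute one <tool_call> JSON payload and return its result text."""
    call = json.loads(payload)
    return TOOLS[call["name"]](**call["arguments"])

# Simulate one loop step: the model asks to read a file, the harness runs
# the tool, and the result is fed back as the next conversation turn.
Path("demo.txt").write_text("hello from the workspace")
result = run_tool_call('{"name": "Read", "arguments": {"file_path": "demo.txt"}}')
print(result)
```

In a real loop, the returned text would be appended to `messages` as a new turn and the model called again until it stops emitting tool calls.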
## Other Quantizations

| Variant | Link |
|---|---|
| MLX 4-bit | [DJLougen/LocoOperator-4B-MLX-4bit](https://huggingface.co/DJLougen/LocoOperator-4B-MLX-4bit) |
| MLX 6-bit | [DJLougen/LocoOperator-4B-MLX-6bit](https://huggingface.co/DJLougen/LocoOperator-4B-MLX-6bit) |
| MLX 8-bit | **This repo** |
| GGUF | [LocoreMind/LocoOperator-4B-GGUF](https://huggingface.co/LocoreMind/LocoOperator-4B-GGUF) |
| Full Weights | [LocoreMind/LocoOperator-4B](https://huggingface.co/LocoreMind/LocoOperator-4B) |

## Acknowledgments

- [LocoreMind](https://huggingface.co/LocoreMind) for the original LocoOperator-4B model
- [Qwen Team](https://huggingface.co/Qwen) for the Qwen3-4B-Instruct-2507 base model
- [Apple MLX Team](https://github.com/ml-explore/mlx) for the MLX framework