---
language:
- en
license: mit
tags:
- mlx
- qwen3
- agent
- tool-calling
- code
- 4-bit
- quantized
base_model: LocoreMind/LocoOperator-4B
pipeline_tag: text-generation
library_name: mlx
---

# LocoOperator-4B — MLX 4-bit Quantized

This is a **4-bit quantized MLX** version of [LocoreMind/LocoOperator-4B](https://huggingface.co/LocoreMind/LocoOperator-4B), converted for efficient inference on Apple Silicon using [MLX](https://github.com/ml-explore/mlx).

## Model Overview

| Attribute | Value |
|---|---|
| **Original Model** | [LocoreMind/LocoOperator-4B](https://huggingface.co/LocoreMind/LocoOperator-4B) |
| **Architecture** | Qwen3 (4B parameters) |
| **Quantization** | 4-bit (MLX) |
| **Base Model** | Qwen3-4B-Instruct-2507 |
| **Teacher Model** | Qwen3-Coder-Next |
| **Training Method** | Full-parameter SFT (distillation from 170K samples) |
| **Max Sequence Length** | 16,384 tokens |
| **License** | MIT |

## About LocoOperator-4B

LocoOperator-4B is a 4B-parameter tool-calling agent model trained via knowledge distillation from Qwen3-Coder-Next inference traces. It specializes in multi-turn codebase exploration — reading files, searching code, and navigating project structures within a Claude Code-style agent loop.
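Inside such an agent loop, the model's output is parsed as a JSON tool call and routed to an executor. A minimal sketch of that step, assuming a `{"name": ..., "arguments": ...}` payload shape; the `dispatch_tool_call` helper and the exact payload fields are illustrative assumptions, not part of this model card:

```python
import json

# Hypothetical example of a structured tool call as emitted inside the
# agent loop. The wrapper tokens around the JSON depend on the chat
# template; this sketch only covers parsing and dispatching the payload.
raw_call = '{"name": "Grep", "arguments": {"pattern": "def main", "path": "/workspace/myproject"}}'

def dispatch_tool_call(raw: str) -> str:
    """Parse a tool-call JSON payload and format it for a tool executor."""
    call = json.loads(raw)  # raises json.JSONDecodeError on invalid JSON
    name = call["name"]
    args = call["arguments"]
    # In a real agent loop, each tool name (Read, Grep, Glob, ...) maps to
    # an executor; here we just render the call for illustration.
    return f"{name}({', '.join(f'{k}={v!r}' for k, v in args.items())})"

print(dispatch_tool_call(raw_call))
```

A real harness would execute the named tool, append its result to the conversation, and re-prompt the model until it produces a final answer.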
### Key Features

- **Tool-Calling Agent**: Generates structured tool-call JSON for Read, Grep, Glob, Bash, Write, Edit, and Task (subagent delegation)
- **100% JSON Validity**: Every tool call is valid JSON with all required arguments — outperforming the teacher model (87.6%)
- **Multi-Turn**: Handles conversation depths of 3–33 messages with consistent tool-calling behavior

### Performance

| Metric | Score |
|---|---|
| Tool Call Presence Alignment | **100%** (65/65) |
| First Tool Type Match | **65.6%** (40/61) |
| JSON Validity | **100%** (76/76) |
| Argument Syntax Correctness | **100%** (76/76) |

## Usage with MLX

```bash
pip install mlx-lm
```

```python
from mlx_lm import load, generate

model, tokenizer = load("DJLougen/LocoOperator-4B-MLX-4bit")

messages = [
    {
        "role": "system",
        "content": "You are a read-only codebase search specialist."
    },
    {
        "role": "user",
        "content": "Analyze the project structure at /workspace/myproject and explain the architecture."
    }
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
response = generate(model, tokenizer, prompt=prompt, max_tokens=512)
print(response)
```

## Other Quantizations

| Variant | Link |
|---|---|
| MLX 4-bit | **This repo** |
| MLX 6-bit | [DJLougen/LocoOperator-4B-MLX-6bit](https://huggingface.co/DJLougen/LocoOperator-4B-MLX-6bit) |
| MLX 8-bit | [DJLougen/LocoOperator-4B-MLX-8bit](https://huggingface.co/DJLougen/LocoOperator-4B-MLX-8bit) |
| GGUF | [LocoreMind/LocoOperator-4B-GGUF](https://huggingface.co/LocoreMind/LocoOperator-4B-GGUF) |
| Full Weights | [LocoreMind/LocoOperator-4B](https://huggingface.co/LocoreMind/LocoOperator-4B) |

## Acknowledgments

- [LocoreMind](https://huggingface.co/LocoreMind) for the original LocoOperator-4B model
- [Qwen Team](https://huggingface.co/Qwen) for the Qwen3-4B-Instruct-2507 base model
- [Apple MLX Team](https://github.com/ml-explore/mlx) for the MLX framework