OpenClaw Question Classifier: Qwen3.5-2B (MLX, 4-bit)

A fine-tuned version of mlx-community/Qwen3.5-2B-4bit trained to classify incoming questions as simple, tricky, or complex for intelligent routing in the OpenClaw Telegram bot system.

What it does

Given any question, the model outputs a structured JSON classification:

```json
{"class": "simple", "confidence": 0.99, "reason": "Direct factual question with a single, short answer."}
```
| Class | Meaning | Routing action |
| --- | --- | --- |
| `simple` | Direct factual lookup, single definitive answer | Fast/cheap model (GPT-4o-mini, Gemini Flash) |
| `tricky` | Nuanced, opinion-based, or multi-faceted | Mid-tier model (GPT-4o, Claude Sonnet) |
| `complex` | Requires deep expertise, multi-domain analysis | Best model (Claude Opus, o1) |
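The routing decision above can be sketched as a small lookup table. The downstream model names and the confidence threshold below are illustrative placeholders, not part of the trained model or the OpenClaw system:

```python
# Hypothetical routing sketch: maps the classifier's output class to a
# downstream model tier. Model IDs and the 0.6 threshold are illustrative.
ROUTES = {
    "simple": "gpt-4o-mini",
    "tricky": "gpt-4o",
    "complex": "claude-opus",
}

def route(classification: dict, default: str = "gpt-4o") -> str:
    """Pick a downstream model from the classifier's JSON output.

    Falls back to the mid-tier default when the class is unknown or the
    classifier reports low confidence.
    """
    if classification.get("confidence", 0.0) < 0.6:
        return default
    return ROUTES.get(classification.get("class"), default)

print(route({"class": "simple", "confidence": 0.99}))   # gpt-4o-mini
print(route({"class": "complex", "confidence": 0.95}))  # claude-opus
print(route({"class": "simple", "confidence": 0.30}))   # gpt-4o (low confidence)
```

Routing on a confidence floor means a shaky classification degrades gracefully to the mid-tier model rather than to the cheapest one.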

Training

  • Base model: mlx-community/Qwen3.5-2B-4bit (4-bit quantized, Apple Silicon native)
  • Method: LoRA fine-tuning via MLX-LM on Apple M2 (16GB)
  • Dataset: ~9,500 labeled question classification examples (simple/tricky/complex)
  • Training: 2000 iterations, batch size 2, LR 1e-4, 8 LoRA layers
  • Final train loss: 0.23 | Best val loss: 0.34
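With those hyperparameters, a LoRA run using the mlx-lm CLI would look roughly like the sketch below. The dataset path is a placeholder, and flag names/defaults can vary across mlx-lm versions:

```shell
# Sketch of the LoRA fine-tuning invocation (hyperparameters from above;
# --data points at a placeholder directory, not the actual training set).
mlx_lm.lora \
  --model mlx-community/Qwen3.5-2B-4bit \
  --train \
  --data ./classification_data \
  --iters 2000 \
  --batch-size 2 \
  --learning-rate 1e-4 \
  --num-layers 8
```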

Usage (MLX, Apple Silicon)

```python
from mlx_lm import load, generate

model, tokenizer = load("NeelkanthSingh/openclaw-qwen3.5-2b-classifier")

SYSTEM_PROMPT = (
    "You are an expert question classifier for the OpenClaw system. "
    "Your job is to analyze incoming questions and classify them as "
    "'simple', 'tricky', or 'complex' based on the cognitive effort "
    "and expertise required to answer them. "
    "Always respond with valid JSON only."
)

question = "What is the boiling point of water?"
messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": f"Classify this question: {question}"},
]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
response = generate(model, tokenizer, prompt=prompt, max_tokens=100, verbose=False)
print(response)
# {"class": "simple", "confidence": 0.99, "reason": "Direct factual question with a specific numerical answer."}
```
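Because even a fine-tuned model can occasionally emit malformed output, it is worth parsing the reply defensively before routing on it. A minimal sketch; the fallback shape and helper name are assumptions, not part of the model's contract:

```python
import json

VALID_CLASSES = {"simple", "tricky", "complex"}

def parse_classification(raw: str) -> dict:
    """Parse the model's JSON reply; fall back to 'tricky' on bad output.

    The fallback dict is a hypothetical convention: unparseable replies
    route to the mid-tier model rather than crashing the bot.
    """
    fallback = {"class": "tricky", "confidence": 0.0, "reason": "unparseable output"}
    try:
        # Strip stray text around the JSON object, e.g. chat-template tokens.
        start, end = raw.index("{"), raw.rindex("}") + 1
        data = json.loads(raw[start:end])
    except ValueError:  # covers both missing braces and json.JSONDecodeError
        return fallback
    if data.get("class") not in VALID_CLASSES:
        return fallback
    return data

result = parse_classification(
    '{"class": "simple", "confidence": 0.99, "reason": "Basic arithmetic."}'
)
print(result["class"])  # simple
```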

Example outputs

| Question | Output |
| --- | --- |
| "What is 2 + 2?" | `{"class": "simple", "confidence": 0.99, "reason": "Basic arithmetic."}` |
| "Should I invest in crypto or stocks?" | `{"class": "tricky", "confidence": 0.88, "reason": "Requires comparing investment vehicles and macroeconomic factors."}` |
| "Explain the geopolitical consequences of the 2008 financial crisis." | `{"class": "complex", "confidence": 0.95, "reason": "Requires deep economic analysis of complex global systems."}` |

Hardware

Optimized for Apple Silicon (M1/M2/M3) via the MLX framework. Classifies a question in roughly one second on an M2 MacBook Air.
