OpenClaw Question Classifier: Qwen3.5-2B (MLX, 4-bit)

A fine-tuned version of mlx-community/Qwen3.5-2B-4bit trained to classify incoming questions as simple, tricky, or complex for intelligent routing in the OpenClaw Telegram bot system.

What it does

Given any question, the model outputs a structured JSON classification:

```json
{"class": "simple", "confidence": 0.99, "reason": "Direct factual question with a single, short answer."}
```
| Class | Meaning | Routing action |
| --- | --- | --- |
| `simple` | Direct factual lookup, single definitive answer | Fast/cheap model (GPT-4o-mini, Gemini Flash) |
| `tricky` | Nuanced, opinion-based, or multi-faceted | Mid-tier model (GPT-4o, Claude Sonnet) |
| `complex` | Requires deep expertise, multi-domain analysis | Best model (Claude Opus, o1) |
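The routing decision above can be sketched as a small lookup table. The downstream model names and the confidence threshold below are illustrative placeholders, not part of the trained model or the OpenClaw system:

```python
# Hypothetical routing sketch: maps the classifier's output class to a
# downstream model tier. Model IDs and the 0.6 threshold are illustrative.
ROUTES = {
    "simple": "gpt-4o-mini",
    "tricky": "gpt-4o",
    "complex": "claude-opus",
}

def route(classification: dict, default: str = "gpt-4o") -> str:
    """Pick a downstream model from the classifier's JSON output.

    Falls back to the mid-tier default when the class is unknown or the
    classifier reports low confidence.
    """
    if classification.get("confidence", 0.0) < 0.6:
        return default
    return ROUTES.get(classification.get("class"), default)

print(route({"class": "simple", "confidence": 0.99}))   # gpt-4o-mini
print(route({"class": "complex", "confidence": 0.95}))  # claude-opus
print(route({"class": "simple", "confidence": 0.30}))   # gpt-4o (low confidence)
```

Routing on a confidence floor means a shaky classification degrades gracefully to the mid-tier model rather than to the cheapest one.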

Training

  • Base model: mlx-community/Qwen3.5-2B-4bit (4-bit quantized, Apple Silicon native)
  • Method: LoRA fine-tuning via MLX-LM on Apple M2 (16GB)
  • Dataset: ~9,500 labeled question classification examples (simple/tricky/complex)
  • Training: 2000 iterations, batch size 2, LR 1e-4, 8 LoRA layers
  • Final train loss: 0.23 | Best val loss: 0.34
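With those hyperparameters, a LoRA run using the mlx-lm CLI would look roughly like the sketch below. The dataset path is a placeholder, and flag names/defaults can vary across mlx-lm versions:

```shell
# Sketch of the LoRA fine-tuning invocation (hyperparameters from above;
# --data points at a placeholder directory, not the actual training set).
mlx_lm.lora \
  --model mlx-community/Qwen3.5-2B-4bit \
  --train \
  --data ./classification_data \
  --iters 2000 \
  --batch-size 2 \
  --learning-rate 1e-4 \
  --num-layers 8
```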

Usage (MLX, Apple Silicon)

```python
from mlx_lm import load, generate

model, tokenizer = load("NeelkanthSingh/openclaw-qwen3.5-2b-classifier")

SYSTEM_PROMPT = (
    "You are an expert question classifier for the OpenClaw system. "
    "Your job is to analyze incoming questions and classify them as "
    "'simple', 'tricky', or 'complex' based on the cognitive effort "
    "and expertise required to answer them. "
    "Always respond with valid JSON only."
)

question = "What is the boiling point of water?"
messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": f"Classify this question: {question}"},
]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
response = generate(model, tokenizer, prompt=prompt, max_tokens=100, verbose=False)
print(response)
# {"class": "simple", "confidence": 0.99, "reason": "Direct factual question with a specific numerical answer."}
```
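Because even a fine-tuned model can occasionally emit malformed output, it is worth parsing the reply defensively before routing on it. A minimal sketch; the fallback shape and helper name are assumptions, not part of the model's contract:

```python
import json

VALID_CLASSES = {"simple", "tricky", "complex"}

def parse_classification(raw: str) -> dict:
    """Parse the model's JSON reply; fall back to 'tricky' on bad output.

    The fallback dict is a hypothetical convention: unparseable replies
    route to the mid-tier model rather than crashing the bot.
    """
    fallback = {"class": "tricky", "confidence": 0.0, "reason": "unparseable output"}
    try:
        # Strip stray text around the JSON object, e.g. chat-template tokens.
        start, end = raw.index("{"), raw.rindex("}") + 1
        data = json.loads(raw[start:end])
    except ValueError:  # covers both missing braces and json.JSONDecodeError
        return fallback
    if data.get("class") not in VALID_CLASSES:
        return fallback
    return data

result = parse_classification(
    '{"class": "simple", "confidence": 0.99, "reason": "Basic arithmetic."}'
)
print(result["class"])  # simple
```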

Example outputs

| Question | Output |
| --- | --- |
| "What is 2 + 2?" | `{"class": "simple", "confidence": 0.99, "reason": "Basic arithmetic."}` |
| "Should I invest in crypto or stocks?" | `{"class": "tricky", "confidence": 0.88, "reason": "Requires comparing investment vehicles and macroeconomic factors."}` |
| "Explain the geopolitical consequences of the 2008 financial crisis." | `{"class": "complex", "confidence": 0.95, "reason": "Requires deep economic analysis of complex global systems."}` |

Hardware

Optimized for Apple Silicon (M1/M2/M3) via the MLX framework. Classifies a question in roughly one second on an M2 MacBook Air.
