# OpenClaw Question Classifier – Qwen3.5-2B (MLX, 4-bit)
A fine-tuned version of `mlx-community/Qwen3.5-2B-4bit`, trained to classify incoming questions as **simple**, **tricky**, or **complex** for intelligent routing in the OpenClaw Telegram bot system.
## What it does
Given any question, the model outputs a structured JSON classification:
```json
{"class": "simple", "confidence": 0.99, "reason": "Direct factual question with a single, short answer."}
```
| Class | Meaning | Routing action |
|---|---|---|
| `simple` | Direct factual lookup, single definitive answer | Fast/cheap model (GPT-4o-mini, Gemini Flash) |
| `tricky` | Nuanced, opinion-based, or multi-faceted | Mid-tier model (GPT-4o, Claude Sonnet) |
| `complex` | Requires deep expertise, multi-domain analysis | Best model (Claude Opus, o1) |
## Training
- Base model: `mlx-community/Qwen3.5-2B-4bit` (4-bit quantized, Apple Silicon native)
- Method: LoRA fine-tuning via MLX-LM on an Apple M2 (16 GB)
- Dataset: ~9,500 labeled question classification examples (simple/tricky/complex)
- Training: 2000 iterations, batch size 2, LR 1e-4, 8 LoRA layers
- Final train loss: 0.23 | Best val loss: 0.34
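A run with the hyperparameters above can be reproduced with MLX-LM's LoRA CLI. This is a sketch only: flag names vary across `mlx-lm` versions (e.g. `--num-layers` vs. `--lora-layers`), and `data/` is a placeholder directory expected to contain `train.jsonl` and `valid.jsonl`:

```shell
# Sketch of the LoRA fine-tuning invocation (check your mlx-lm version's flags).
mlx_lm.lora \
  --model mlx-community/Qwen3.5-2B-4bit \
  --train \
  --data data/ \
  --iters 2000 \
  --batch-size 2 \
  --learning-rate 1e-4 \
  --num-layers 8
```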
## Usage (MLX – Apple Silicon)
```python
from mlx_lm import load, generate

model, tokenizer = load("NeelkanthSingh/openclaw-qwen3.5-2b-classifier")

SYSTEM_PROMPT = (
    "You are an expert question classifier for the OpenClaw system. "
    "Your job is to analyze incoming questions and classify them as "
    "'simple', 'tricky', or 'complex' based on the cognitive effort "
    "and expertise required to answer them. "
    "Always respond with valid JSON only."
)

question = "What is the boiling point of water?"
messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": f"Classify this question: {question}"},
]

prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
response = generate(model, tokenizer, prompt=prompt, max_tokens=100, verbose=False)
print(response)
# {"class": "simple", "confidence": 0.99, "reason": "Direct factual question with a specific numerical answer."}
```
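Small models can occasionally emit malformed JSON, so it is worth parsing the reply defensively before routing on it. A minimal sketch (standard-library `json` only; falling back to `tricky` on parse failure is an arbitrary choice, not part of the model):

```python
import json

VALID_CLASSES = {"simple", "tricky", "complex"}

def parse_classification(raw: str, fallback_class: str = "tricky") -> dict:
    """Parse the model's JSON reply; fall back to a mid-tier class on failure."""
    try:
        result = json.loads(raw.strip())
        if isinstance(result, dict) and result.get("class") in VALID_CLASSES:
            return result
    except json.JSONDecodeError:
        pass
    return {"class": fallback_class, "confidence": 0.0, "reason": "unparseable output"}

print(parse_classification('{"class": "simple", "confidence": 0.99, "reason": "ok"}'))
print(parse_classification("not json"))  # falls back to the tricky tier
```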
## Example outputs
| Question | Output |
|---|---|
| "What is 2 + 2?" | {"class": "simple", "confidence": 0.99, "reason": "Basic arithmetic."} |
| "Should I invest in crypto or stocks?" | {"class": "tricky", "confidence": 0.88, "reason": "Requires comparing investment vehicles and macroeconomic factors."} |
| "Explain the geopolitical consequences of the 2008 financial crisis." | {"class": "complex", "confidence": 0.95, "reason": "Requires deep economic analysis of complex global systems."} |
## Hardware
Optimized for Apple Silicon (M1/M2/M3) via the MLX framework. Runs at roughly one second per classification on an M2 MacBook Air.