Qwen3.5-9B-metro-v24

QLoRA fine-tune of Qwen3.5-9B for the MetroLLM-Bench transit-kiosk task: multi-step tool calling, structured terminal-state synthesis, and disruption/advisory handling across six metro systems.

v24 is the leakage-free retraining used in the MetroLLM-Bench paper. Teacher distillation traces are drawn only from the 717-case training partition of a stratified 75/25 split (seed=42); the 238-case held-out partition is strictly reserved for reporting. It supersedes continker/Qwen3.5-9B-metro-v23, which was trained on the full case set.

Held-out results (n=238, mean of 2 training seeds)

Metric 9B base 9B + v24 ฮ”
Tier-1 89.38 91.03 +1.65
Composite 88.05 88.85 +0.80

Contents

  • adapter/ โ€” LoRA adapter (rank 16, ฮฑ 32; QLoRA 4-bit NF4 training) with tokenizer + chat template
  • Qwen3.5-9B-metro-v24-Q4_K_M.gguf โ€” merged + quantised GGUF (5.3 GB), self-contained
  • training_summary.json โ€” training metadata

Usage

GGUF (recommended, self-contained):

llama-server --hf-repo continker/Qwen3.5-9B-metro-v24 --hf-file Qwen3.5-9B-metro-v24-Q4_K_M.gguf

The LoRA adapter is provided for merging. Its keys use the .language_model. module prefix; to apply it to a text-only Qwen3.5-9B, strip that prefix from the adapter keys first (see the MetroLLM-Bench demo _localise_adapter).

Training

QLoRA, rank 16, 3 epochs, single RTX 5090. Distilled from Qwen3.5-27B-dense and 35B-A3B teachers (tier-1 โ‰ฅ 90% filter, 600 examples). Recipe and full evaluation: MetroLLM-Bench. License Apache 2.0 (inherits Qwen3.5-9B).

Downloads last month
9
GGUF
Model size
9B params
Architecture
qwen35
Hardware compatibility
Log In to add your hardware

4-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support

Model tree for continker/Qwen3.5-9B-metro-v24

Finetuned
Qwen/Qwen3.5-9B
Adapter
(226)
this model

Space using continker/Qwen3.5-9B-metro-v24 1