CAAL Qwen3.5 2B โ€” Fine-Tuned for Tool Calling

A fine-tuned Qwen3.5 2B model optimized for tool calling in voice assistant workflows. Built for CAAL (CoreWorxLab Ambient Assistant for Linux).

Performance

82/85 tests passed (96%) on the CAAL 85-test evaluation suite:

Category Score
Single tool calls 27/27
Conversational (no tool) 10/10
Multi-turn chains 28/29
Argument formatting 17/19

Model Details

  • Base model: Qwen3.5 2B
  • Training method: SFT with BF16 LoRA (last-turn-only โ€” previous turns as context, only final response trained)
  • LoRA config: r=32, alpha=32
  • Quantization: Q4_K_M (GGUF)
  • File size: ~1.2 GB
  • VRAM usage: ~2.6 GB at 16384 context

Usage with Ollama

# Download the GGUF and create a Modelfile:
# Modelfile contents:
# FROM caal-qwen3.5-2b-q4.gguf
# RENDERER qwen3.5
# PARSER qwen3.5
# PARAMETER temperature 0.1
# PARAMETER num_ctx 16384

ollama create caal-qwen35-2b -f Modelfile

Designed For

  • Edge deployment on consumer GPUs (fits on 5GB+ VRAM alongside TTS)
  • Local voice assistants with tool calling
  • Smart home control, email, calendar, and service management
  • Multi-step tool chains (e.g., search โ†’ lookup contact โ†’ send email)

License

See LICENSE for the CAAL Model License v1.0. This model is free for personal, non-commercial use with attribution to CoreWorxLab. Commercial use requires written permission.

The base model (Qwen3.5) is licensed under Apache 2.0. Users must comply with both licenses.

Downloads last month
373
GGUF
Model size
2B params
Architecture
qwen35
Hardware compatibility
Log In to add your hardware

We're not able to determine the quantization variants.

Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support

Model tree for CoreWorxLab/caal-qwen3.5-2b

Finetuned
Qwen/Qwen3.5-2B
Quantized
(76)
this model