CAAL Qwen3.5 2B โ Fine-Tuned for Tool Calling
A fine-tuned Qwen3.5 2B model optimized for tool calling in voice assistant workflows. Built for CAAL (CoreWorxLab Ambient Assistant for Linux).
Performance
82/85 tests passed (96%) on the CAAL 85-test evaluation suite:
| Category | Score |
|---|---|
| Single tool calls | 27/27 |
| Conversational (no tool) | 10/10 |
| Multi-turn chains | 28/29 |
| Argument formatting | 17/19 |
Model Details
- Base model: Qwen3.5 2B
- Training method: SFT with BF16 LoRA (last-turn-only โ previous turns as context, only final response trained)
- LoRA config: r=32, alpha=32
- Quantization: Q4_K_M (GGUF)
- File size: ~1.2 GB
- VRAM usage: ~2.6 GB at 16384 context
Usage with Ollama
# Download the GGUF and create a Modelfile:
# Modelfile contents:
# FROM caal-qwen3.5-2b-q4.gguf
# RENDERER qwen3.5
# PARSER qwen3.5
# PARAMETER temperature 0.1
# PARAMETER num_ctx 16384
ollama create caal-qwen35-2b -f Modelfile
Designed For
- Edge deployment on consumer GPUs (fits on 5GB+ VRAM alongside TTS)
- Local voice assistants with tool calling
- Smart home control, email, calendar, and service management
- Multi-step tool chains (e.g., search โ lookup contact โ send email)
License
See LICENSE for the CAAL Model License v1.0. This model is free for personal, non-commercial use with attribution to CoreWorxLab. Commercial use requires written permission.
The base model (Qwen3.5) is licensed under Apache 2.0. Users must comply with both licenses.
- Downloads last month
- 373
Hardware compatibility
Log In to add your hardware
We're not able to determine the quantization variants.
Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐ Ask for provider support