---
license: apache-2.0
base_model: Qwen/Qwen3-1.7B
tags:
- text-generation
- qwen
- qwen3
- lora
- home-assistant
- home-automation
- smart-home
- tool-use
language:
- en
library_name: transformers
pipeline_tag: text-generation
---

# Selora AI

Qwen3 1.7B fine-tuned for Home Assistant with four specialist LoRA adapters. The `answer` adapter additionally emits a `query_state` tool envelope for live device-state queries against the Home Assistant REST API.

Used by the [Selora AI Home Assistant integration](https://gitlab.com/selorahomes/products/selora-ai/ha-integration); also runnable directly via Ollama, llama.cpp, or vLLM.

## Specialists

| Adapter | Intent | Output shape |
| --- | --- | --- |
| `command` | "Turn off the kitchen lights" | `{intent:"command",response,calls:[…]}` |
| `automation` | "Wake up lights at 6:30 AM" | `{intent:"automation",automation:{triggers,actions,…}}` |
| `answer` | Q&A / small talk | `{intent:"answer",response}` |
| `clarification` | Ask the user a follow-up | `{intent:"clarification",response}` |

The HA integration's `selora_local` provider classifies each request to one of the four specialists before the call (cheap regex pre-classifier), then sends the request with `model: selora-v1-{specialist}`. Backends that support multi-LoRA (llama-server's `/lora-adapters`, vLLM `--enable-lora`) activate the matching adapter.

## Quick start

### Ollama

```bash
ollama pull selora/commands
ollama run selora/commands
```

Modelfiles for all four specialists live in [`ollama/`](ollama/) and are also published as separate Ollama models.

### llama.cpp

```bash
llama-server \
  --model qwen3_17b_base.Q4_K_M.gguf \
  --lora-init-without-apply \
  --lora qwen3_17b_command.lora.gguf \
  --lora qwen3_17b_automation.lora.gguf \
  --lora qwen3_17b_answer.lora.gguf \
  --lora qwen3_17b_clarification.lora.gguf \
  --ctx-size 8192
```

POST to `/lora-adapters` to switch the active LoRA before each `/v1/chat/completions` call.

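The switch-then-call flow for llama-server can be sketched in Python as follows. This is a minimal sketch, not the integration's own code: the adapter ids assume llama-server numbers adapters in the order of the `--lora` flags shown above (a `GET /lora-adapters` on a running server should confirm this), and `ADAPTER_IDS`, `lora_payload`, `post_json`, and `chat` are hypothetical names introduced for illustration.

```python
import json
import urllib.request
from typing import Any

# Assumption: ids follow the order of the --lora flags in the command above.
ADAPTER_IDS = {"command": 0, "automation": 1, "answer": 2, "clarification": 3}


def lora_payload(specialist: str) -> list:
    """Build the /lora-adapters body: scale 1.0 for one adapter, 0.0 for the rest."""
    if specialist not in ADAPTER_IDS:
        raise ValueError(f"unknown specialist: {specialist!r}")
    return [
        {"id": adapter_id, "scale": 1.0 if name == specialist else 0.0}
        for name, adapter_id in ADAPTER_IDS.items()
    ]


def post_json(url: str, body: Any) -> Any:
    """POST a JSON body and decode the JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def chat(base_url: str, specialist: str, system_prompt: str, user_text: str) -> Any:
    """Activate one specialist, then send a single chat completion request."""
    post_json(f"{base_url}/lora-adapters", lora_payload(specialist))
    return post_json(
        f"{base_url}/v1/chat/completions",
        {
            "messages": [
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_text},
            ],
            "temperature": 0.0,
            "max_tokens": 384,
        },
    )
```

A call might look like `chat("http://localhost:8080", "command", system_prompt, "Turn off the kitchen lights")`, where the URL is an assumed local address and `system_prompt` is the matching specialist prompt. Because `/lora-adapters` sets the scales for the whole server, requests that target different specialists would need to be serialized around the switch.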
### vLLM (cloud)

```bash
python -m vllm.entrypoints.openai.api_server \
  --model ./qwen3_17b_hf \
  --enable-lora --max-loras 4 --max-lora-rank 32 \
  --lora-modules \
    selora-v1-commands=/path/to/peft/command \
    selora-v1-automations=/path/to/peft/automation \
    selora-v1-answers=/path/to/peft/answer \
    selora-v1-clarifications=/path/to/peft/clarification
```

vLLM activates the matching LoRA based on the request's `model` field; no extra routing layer is needed.

## Generation parameters

```json
{
  "temperature": 0.0,
  "repeat_penalty": 1.15,
  "repeat_last_n": 256,
  "max_tokens": 384,
  "stop": ["<|im_end|>", "<|endoftext|>"]
}
```

Bump `max_tokens` to 1536 for automation requests (longer JSON output).

## Training

Base: [Qwen3 1.7B](https://huggingface.co/Qwen/Qwen3-1.7B) fine-tuned with [Apple mlx-lm](https://github.com/ml-explore/mlx-examples). Each specialist has its own LoRA (rank 8–28, scale 20) trained on a curated HA-domain corpus (forum threads, HA docs, synthetic command / automation pairs). System prompts are trained per specialist; see [`prompts/`](prompts/).

The `answer` adapter went through a sequential continuation pass that added a `query_state` tool envelope on top of the original answer-only training distribution; that's preserved in the augmented `prompts/answers.txt` and the `Modelfile.answers` SYSTEM block.

## Evaluation

10/10 parity pass rate on the four-intent suite (command, automation, answer, and clarification, plus screenshot regressions). Validator and scenarios live in [`parity/`](parity/).

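For a sense of what a shape check on the four output envelopes involves, here is a minimal sketch. It is not the validator in `parity/`: it assumes the model output is a JSON object and checks only the keys named in the Specialists table. Fields elided there with `…` and the `query_state` tool envelope are not covered, and `REQUIRED_KEYS` and `check_envelope` are hypothetical names.

```python
import json

# Keys named in the Specialists table; elided fields are deliberately not checked.
REQUIRED_KEYS = {
    "command": {"response", "calls"},
    "automation": {"automation"},
    "answer": {"response"},
    "clarification": {"response"},
}


def check_envelope(raw: str) -> list:
    """Return a list of problems; an empty list means the shape looks right."""
    try:
        envelope = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(envelope, dict):
        return ["top level is not a JSON object"]

    intent = envelope.get("intent")
    if not isinstance(intent, str) or intent not in REQUIRED_KEYS:
        return [f"unknown intent: {intent!r}"]

    missing = sorted(REQUIRED_KEYS[intent] - set(envelope))
    problems = [f"missing key: {key}" for key in missing]

    # "calls" is shown as a list in the table.
    if intent == "command" and not isinstance(envelope.get("calls", []), list):
        problems.append("calls is not a list")

    # "automation" is shown as an object with at least triggers and actions.
    if intent == "automation" and "automation" in envelope:
        automation = envelope["automation"]
        if isinstance(automation, dict):
            for key in ("triggers", "actions"):
                if key not in automation:
                    problems.append(f"missing automation key: {key}")
        else:
            problems.append("automation is not an object")
    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to report every shape issue for a scenario in one pass.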
## Files in this bundle

| Artifact | Purpose | Distribution |
| --- | --- | --- |
| `qwen3_17b_base.IQ4_XS.gguf` | Quantized base for Ollama / llama.cpp | Hugging Face, ollama.com |
| `qwen3_17b_{intent}.lora.gguf` (×4) | Specialist LoRA adapters | Hugging Face, ollama.com |
| `Modelfile.{intent}` (×4) | Ollama recipes (base + LoRA + system prompt) | this repo, ollama.com |
| `prompts/{intent}.txt` (×4) | Plain-text trained prompts (reference / testing) | this repo |

The full-precision (f16) base and HF safetensors set used by vLLM / TGI / SageMaker live separately in the cloud bundle and are not yet mirrored to Hugging Face.

## Citation

```bibtex
@misc{selora-ai-2026,
  title = {Selora AI: Qwen3 1.7B + LoRA Specialists for Home Assistant},
  author = {{Selora Homes}},
  year = {2026},
  url = {https://huggingface.co/selora-homes/selora-ai}
}
```

Base model citation: Qwen Team, *Qwen3 Technical Report* (2025).

## License

Apache-2.0 (matches the Qwen3 base license).