---
license: apache-2.0
base_model: Qwen/Qwen3-1.7B
tags:
- text-generation
- qwen
- qwen3
- lora
- home-assistant
- home-automation
- smart-home
- tool-use
language:
- en
library_name: transformers
pipeline_tag: text-generation
---
# Selora AI
Qwen3 1.7B fine-tuned for Home Assistant with four specialist LoRA
adapters. The `answer` adapter additionally emits a `query_state` tool
envelope for live device-state queries against the Home Assistant REST
API. Used by the [Selora AI Home Assistant
integration](https://gitlab.com/selorahomes/products/selora-ai/ha-integration);
also runnable directly via Ollama, llama.cpp, or vLLM.
## Specialists
| Adapter | Intent | Output shape |
| --- | --- | --- |
| `command` | "Turn off the kitchen lights" | `{intent:"command",response,calls:[…]}` |
| `automation` | "Wake up lights at 6:30 AM" | `{intent:"automation",automation:{triggers,actions,…}}` |
| `answer` | Q&A / small talk | `{intent:"answer",response}` |
| `clarification` | Ask the user a follow-up | `{intent:"clarification",response}` |
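A client will usually want to check a reply against the shapes in the table before acting on it. The sketch below is one way to do that, assuming the adapters emit strict JSON (the table uses shorthand with unquoted keys) and checking only the top-level keys. `REQUIRED_KEYS` and `parse_specialist_output` are hypothetical names, and the `query_state` envelope of the `answer` adapter is not covered.

```python
import json

# Required top-level keys per intent, read off the "Output shape" column.
# Nested fields (calls, automation.triggers, ...) are deliberately not checked.
REQUIRED_KEYS = {
    "command": {"intent", "response", "calls"},
    "automation": {"intent", "automation"},
    "answer": {"intent", "response"},
    "clarification": {"intent", "response"},
}


def parse_specialist_output(raw: str) -> dict:
    """Parse a specialist reply and verify its top-level shape."""
    payload = json.loads(raw)  # assumption: the model emitted strict JSON
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object")
    intent = payload.get("intent")
    if intent not in REQUIRED_KEYS:
        raise ValueError(f"unknown intent: {intent!r}")
    missing = REQUIRED_KEYS[intent] - payload.keys()
    if missing:
        raise ValueError(f"{intent} reply is missing keys: {sorted(missing)}")
    return payload
```

Rejecting malformed replies early keeps a truncated generation from being executed as a partial command.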
The HA integration's `selora_local` provider runs a cheap regex
pre-classifier that routes each request to one of the four specialists
before the call, then sends the request with `model:
selora-v1-{specialist}`. Backends that support multi-LoRA
(llama-server's `/lora-adapters` endpoint, vLLM's `--enable-lora` flag)
activate the matching adapter.
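To make the routing step concrete, here is a minimal sketch of a regex pre-classifier. The patterns are hypothetical and chosen only for illustration; the integration's real rules are not documented here and may differ. The sketch covers three intents and falls back to `answer`, since this README does not say what triggers `clarification`.

```python
import re

# Hypothetical rules, checked in order. Automation comes first so that
# "turn on the lights when I get home" is not mistaken for a one-off command.
_RULES = [
    (
        "automation",
        re.compile(
            r"\b(?:every|whenever|when|sunrise|sunset)\b"
            r"|\bat \d{1,2}(?::\d{2})?\s*(?:am|pm)?\b",
            re.IGNORECASE,
        ),
    ),
    (
        "command",
        re.compile(r"^\s*(?:turn|switch|set|dim|open|close|lock|unlock)\b", re.IGNORECASE),
    ),
]


def classify(text: str) -> str:
    """Return the first matching specialist, defaulting to 'answer'."""
    for intent, pattern in _RULES:
        if pattern.search(text):
            return intent
    return "answer"


def model_name(text: str) -> str:
    """Build the `model` field in the selora-v1-{specialist} form."""
    return f"selora-v1-{classify(text)}"
```

Rule order matters in a first-match classifier like this one, which is why the schedule-like patterns are tested before the imperative verbs.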
## Quick start
### Ollama
```bash
ollama pull selora/commands
ollama run selora/commands
```
Modelfiles for all four specialists live in [`ollama/`](ollama/) and
are also published as separate Ollama models.
### llama.cpp
```bash
llama-server \
  --model qwen3_17b_base.Q4_K_M.gguf \
  --lora-init-without-apply \
  --lora qwen3_17b_command.lora.gguf \
  --lora qwen3_17b_automation.lora.gguf \
  --lora qwen3_17b_answer.lora.gguf \
  --lora qwen3_17b_clarification.lora.gguf \
  --ctx-size 8192
```
POST to `/lora-adapters` to switch the active LoRA before each
`/v1/chat/completions` call.
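The adapter switch can be sketched as follows. It assumes llama-server assigns adapter ids in the order of the `--lora` flags above (a `GET /lora-adapters` call should confirm the ids on a running server) and that the server listens on `http://127.0.0.1:8080`. `ADAPTER_IDS`, `lora_payload`, and `activate` are hypothetical helper names.

```python
import json
import urllib.request

# Assumed ids: one per --lora flag, in command-line order.
ADAPTER_IDS = {"command": 0, "automation": 1, "answer": 2, "clarification": 3}


def lora_payload(active: str) -> list[dict]:
    """Scale 1.0 for the active adapter and 0.0 for every other one."""
    if active not in ADAPTER_IDS:
        raise KeyError(f"unknown adapter: {active!r}")
    return [
        {"id": adapter_id, "scale": 1.0 if name == active else 0.0}
        for name, adapter_id in ADAPTER_IDS.items()
    ]


def activate(active: str, base_url: str = "http://127.0.0.1:8080") -> None:
    """POST the scale list to /lora-adapters (base_url is an assumption)."""
    request = urllib.request.Request(
        f"{base_url}/lora-adapters",
        data=json.dumps(lora_payload(active)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        response.read()  # drain the body; HTTP errors raise before this point
```

Because the active adapter is server-wide state in this setup, concurrent clients would need to serialize the switch and the completion call.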
### vLLM (cloud)
```bash
python -m vllm.entrypoints.openai.api_server \
  --model ./qwen3_17b_hf \
  --enable-lora --max-loras 4 --max-lora-rank 32 \
  --lora-modules \
    selora-v1-commands=/path/to/peft/command \
    selora-v1-automations=/path/to/peft/automation \
    selora-v1-answers=/path/to/peft/answer \
    selora-v1-clarifications=/path/to/peft/clarification
```
vLLM activates the matching LoRA based on the request's `model` field;
no extra routing layer needed.
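Since the `--lora-modules` names in the command above are plural while the adapter names are singular, a client needs a small lookup when it fills in the `model` field. A minimal sketch of the request body for `/v1/chat/completions`, assuming the module names are registered exactly as shown; `VLLM_MODELS` and `chat_request` are hypothetical names, and the system prompt is expected to come from `prompts/`.

```python
# Adapter name -> module name as registered via --lora-modules above.
VLLM_MODELS = {
    "command": "selora-v1-commands",
    "automation": "selora-v1-automations",
    "answer": "selora-v1-answers",
    "clarification": "selora-v1-clarifications",
}


def chat_request(specialist: str, system_prompt: str, user_text: str) -> dict:
    """Build an OpenAI-style chat body whose `model` selects the LoRA."""
    return {
        "model": VLLM_MODELS[specialist],  # KeyError on an unknown specialist
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_text},
        ],
    }
```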
## Generation parameters
```json
{
  "temperature": 0.0,
  "repeat_penalty": 1.15,
  "repeat_last_n": 256,
  "max_tokens": 384,
  "stop": ["<|im_end|>", "<|endoftext|>"]
}
```
Bump `max_tokens` to 1536 for automation requests (longer JSON output).
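The per-intent override can be expressed as a small helper. This is a sketch: `BASE_PARAMS` copies the values above unchanged, `generation_params` is a hypothetical name, and the key names are passed through as written, so a backend that spells them differently would need a translation step.

```python
# Defaults copied from the JSON block above.
BASE_PARAMS = {
    "temperature": 0.0,
    "repeat_penalty": 1.15,
    "repeat_last_n": 256,
    "max_tokens": 384,
    "stop": ["<|im_end|>", "<|endoftext|>"],
}


def generation_params(intent: str) -> dict:
    """Return a fresh parameter dict, raising max_tokens for automations."""
    params = dict(BASE_PARAMS)
    params["stop"] = list(BASE_PARAMS["stop"])  # avoid sharing the list
    if intent == "automation":
        params["max_tokens"] = 1536  # automation JSON is longer
    return params
```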
## Training
Base: [Qwen3 1.7B](https://huggingface.co/Qwen/Qwen3-1.7B) fine-tuned
with [Apple mlx-lm](https://github.com/ml-explore/mlx-examples). Each
specialist has its own LoRA (rank 8–28, scale 20) trained on a curated
HA-domain corpus (forum threads, HA docs, synthetic command /
automation pairs). Each specialist is trained with its own system
prompt; see [`prompts/`](prompts/). The `answer` adapter went through a sequential
continuation pass that added a `query_state` tool envelope on top of
the original answer-only training distribution; that's preserved in
the augmented `prompts/answers.txt` and the `Modelfile.answers` SYSTEM
block.
## Evaluation
10/10 parity pass rate on the four-intent suite (command, automation,
answer, clarification), plus screenshot regressions. The validator and
scenarios live in [`parity/`](parity/).
## Files in this bundle
| Artifact | Purpose | Distribution |
| --- | --- | --- |
| `qwen3_17b_base.IQ4_XS.gguf` | Quantized base for Ollama / llama.cpp | Hugging Face, ollama.com |
| `qwen3_17b_{intent}.lora.gguf` (×4) | Specialist LoRA adapters | Hugging Face, ollama.com |
| `Modelfile.{intent}` (×4) | Ollama recipes (base + LoRA + system prompt) | this repo, ollama.com |
| `prompts/{intent}.txt` (×4) | Plain-text trained prompts (reference / testing) | this repo |
The full-precision (f16) base and HF safetensors set used by vLLM /
TGI / SageMaker live separately in the cloud bundle and are not yet
mirrored to Hugging Face.
## Citation
```bibtex
@misc{selora-ai-2026,
  title = {Selora AI: Qwen3 1.7B + LoRA Specialists for Home Assistant},
  author = {{Selora Homes}},
  year = {2026},
  url = {https://huggingface.co/selora-homes/selora-ai}
}
```
Base model citation: Qwen Team, *Qwen3 Technical Report* (2025).
## License
Apache-2.0 (matches the Qwen3 base license).