---
license: apache-2.0
base_model: Qwen/Qwen3-1.7B
tags:
  - text-generation
  - qwen
  - qwen3
  - lora
  - home-assistant
  - home-automation
  - smart-home
  - tool-use
language:
  - en
library_name: transformers
pipeline_tag: text-generation
---

# Selora AI

Qwen3 1.7B fine-tuned for Home Assistant with four specialist LoRA
adapters. The `answer` adapter additionally emits a `query_state` tool
envelope for live device-state queries against the Home Assistant REST
API. Used by the [Selora AI Home Assistant
integration](https://gitlab.com/selorahomes/products/selora-ai/ha-integration);
also runnable directly via Ollama, llama.cpp, or vLLM.

## Specialists

| Adapter | Intent | Output shape |
| --- | --- | --- |
| `command` | "Turn off the kitchen lights" | `{intent:"command",response,calls:[…]}` |
| `automation` | "Wake up lights at 6:30 AM" | `{intent:"automation",automation:{triggers,actions,…}}` |
| `answer` | Q&A / small talk | `{intent:"answer",response}` |
| `clarification` | Ask the user a follow-up | `{intent:"clarification",response}` |
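
The shapes in the table can be checked before a reply is acted on. The sketch below is one way to do that, not the integration's actual validator. It assumes replies are strict JSON objects, checks only the top-level keys the table names (fields elided with `…` are ignored), and does not cover the `answer` adapter's `query_state` envelope. `REQUIRED_KEYS` and `parse_envelope` are hypothetical names.

```python
import json

# Top-level keys per intent, copied from the "Output shape" column.
# Fields the table elides with "…" are deliberately not checked.
REQUIRED_KEYS = {
    "command": {"intent", "response", "calls"},
    "automation": {"intent", "automation"},
    "answer": {"intent", "response"},
    "clarification": {"intent", "response"},
}


def parse_envelope(raw: str) -> dict:
    """Parse a model reply and check the top-level keys for its intent."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("reply is not a JSON object")
    intent = data.get("intent")
    if intent not in REQUIRED_KEYS:
        raise ValueError(f"unknown intent: {intent!r}")
    missing = REQUIRED_KEYS[intent] - set(data)
    if missing:
        raise ValueError(f"{intent} reply is missing keys: {sorted(missing)}")
    return data
```

A caller would typically catch `ValueError` (which also covers `json.JSONDecodeError`) and fall back to a retry or a clarification prompt.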

Before each call, the HA integration's `selora_local` provider runs a
cheap regex pre-classifier that assigns the request to one of the four
specialists, then sends the request with `model:
selora-v1-{specialist}`. Backends that support multi-LoRA
(llama-server's `/lora-adapters` endpoint, vLLM's `--enable-lora`)
activate the matching adapter.
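
To make the routing step concrete, here is a minimal sketch of a regex pre-classifier. The patterns, the short-input rule, and the fallback to `answer` are all hypothetical and chosen only for illustration; the integration's real rules are not reproduced here. Only the `selora-v1-{specialist}` naming comes from the text above.

```python
import re

# Hypothetical rules for illustration; checked in order, first match wins.
RULES = [
    ("automation", re.compile(
        r"\b(every|whenever|at \d{1,2}(:\d{2})?\s*(am|pm)?|sunrise|sunset)\b",
        re.IGNORECASE)),
    ("command", re.compile(
        r"^\s*(turn|switch|set|dim|open|close|lock|unlock)\b",
        re.IGNORECASE)),
]


def classify(text: str) -> str:
    """Return one of: command, automation, answer, clarification."""
    # Assumption: inputs of fewer than two words are too vague to act on.
    if len(text.split()) < 2:
        return "clarification"
    for specialist, pattern in RULES:
        if pattern.search(text):
            return specialist
    return "answer"  # assumed default for Q&A and small talk


def model_name(specialist: str) -> str:
    """Build the value sent in the request's model field."""
    return f"selora-v1-{specialist}"
```

For example, `model_name(classify("Turn off the kitchen lights"))` yields `selora-v1-command` under these rules.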

## Quick start

### Ollama

```bash
ollama pull selora/commands
ollama run selora/commands
```

Modelfiles for all four specialists live in [`ollama/`](ollama/) and
are also published as separate Ollama models.

### llama.cpp

```bash
llama-server \
  --model qwen3_17b_base.IQ4_XS.gguf \
  --lora-init-without-apply \
  --lora qwen3_17b_command.lora.gguf \
  --lora qwen3_17b_automation.lora.gguf \
  --lora qwen3_17b_answer.lora.gguf \
  --lora qwen3_17b_clarification.lora.gguf \
  --ctx-size 8192
```

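`--lora-init-without-apply` loads every adapter without applying it, so
no specialist is active at startup. POST to `/lora-adapters` to set the
scale of the adapter you want before each `/v1/chat/completions` call.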
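
A sketch of that switch in Python, using only the standard library. It assumes adapter ids are 0-based and follow the order of the `--lora` flags in the command above (a GET on `/lora-adapters` lists the loaded adapters and can confirm this), and that `base_url` points at your llama-server instance. `ADAPTER_IDS`, `lora_payload`, and `activate` are hypothetical names.

```python
import json
import urllib.request

# Assumed to mirror the order of the --lora flags passed to llama-server.
ADAPTER_IDS = {"command": 0, "automation": 1, "answer": 2, "clarification": 3}


def lora_payload(active: str) -> list:
    """Scale 1.0 for the chosen adapter and 0.0 for all the others."""
    if active not in ADAPTER_IDS:
        raise KeyError(f"unknown specialist: {active}")
    return [
        {"id": adapter_id, "scale": 1.0 if name == active else 0.0}
        for name, adapter_id in ADAPTER_IDS.items()
    ]


def activate(base_url: str, active: str) -> None:
    """POST the scales to llama-server before sending the chat request."""
    request = urllib.request.Request(
        f"{base_url}/lora-adapters",
        data=json.dumps(lora_payload(active)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        response.read()
```

Because the adapter setting is server-wide, concurrent callers that need different specialists should serialize the switch and the completion call.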

### vLLM (cloud)

```bash
python -m vllm.entrypoints.openai.api_server \
  --model ./qwen3_17b_hf \
  --enable-lora --max-loras 4 --max-lora-rank 32 \
  --lora-modules \
    selora-v1-commands=/path/to/peft/command \
    selora-v1-automations=/path/to/peft/automation \
    selora-v1-answers=/path/to/peft/answer \
    selora-v1-clarifications=/path/to/peft/clarification
```

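vLLM activates the matching LoRA based on the request's `model` field,
so no extra routing layer is needed. The value must match a name
registered with `--lora-modules` (here `selora-v1-commands`,
`selora-v1-automations`, and so on).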
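
As an illustration, a request body for vLLM's OpenAI-compatible `/v1/chat/completions` endpoint could be built like this. The model names are the ones registered in the command above; `VLLM_MODELS`, `chat_request`, and the idea of passing the specialist's system prompt explicitly are assumptions of this sketch.

```python
# Names registered via --lora-modules in the launch command.
VLLM_MODELS = {
    "command": "selora-v1-commands",
    "automation": "selora-v1-automations",
    "answer": "selora-v1-answers",
    "clarification": "selora-v1-clarifications",
}


def chat_request(specialist: str, system_prompt: str, user_text: str) -> dict:
    """Build a chat-completions body that selects the specialist's LoRA."""
    return {
        "model": VLLM_MODELS[specialist],  # vLLM picks the adapter from this
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_text},
        ],
    }
```

The resulting dict can be serialized with `json.dumps` and posted with any HTTP client.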

## Generation parameters

```json
{
  "temperature": 0.0,
  "repeat_penalty": 1.15,
  "repeat_last_n": 256,
  "max_tokens": 384,
  "stop": ["<|im_end|>", "<|endoftext|>"]
}
```

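Raise `max_tokens` to 1536 for automation requests, whose JSON output
is longer. `repeat_penalty` and `repeat_last_n` are llama.cpp / Ollama
parameter names; other backends may name them differently (vLLM, for
example, uses `repetition_penalty`).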
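
One way to apply the per-specialist token budget is a small helper that copies the defaults and overrides `max_tokens` for automation. The values come from this section; `BASE_PARAMS` and `generation_params` are hypothetical names.

```python
# Defaults from the JSON block above.
BASE_PARAMS = {
    "temperature": 0.0,
    "repeat_penalty": 1.15,
    "repeat_last_n": 256,
    "max_tokens": 384,
    "stop": ["<|im_end|>", "<|endoftext|>"],
}


def generation_params(specialist: str) -> dict:
    """Return a fresh parameter dict for the given specialist."""
    params = dict(BASE_PARAMS)
    params["stop"] = list(BASE_PARAMS["stop"])  # avoid sharing the list
    if specialist == "automation":
        params["max_tokens"] = 1536  # automation JSON is longer
    return params
```

Returning a copy keeps callers from mutating the shared defaults.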

## Training

Base: [Qwen3 1.7B](https://huggingface.co/Qwen/Qwen3-1.7B), fine-tuned
with [Apple mlx-lm](https://github.com/ml-explore/mlx-examples). Each
specialist has its own LoRA (rank 8–28, scale 20) trained on a curated
HA-domain corpus (forum threads, HA docs, synthetic command /
automation pairs). Each specialist was trained with its own system
prompt; see [`prompts/`](prompts/). The `answer` adapter went through a
sequential continuation pass that added a `query_state` tool envelope
on top of the original answer-only training distribution. That
envelope is reflected in the augmented `prompts/answers.txt` and in the
SYSTEM block of `Modelfile.answers`.

## Evaluation

10/10 pass rate on the parity suite, which covers the four intents
(command, automation, answer, clarification) plus screenshot
regressions. The validator and scenarios live in [`parity/`](parity/).

## Files in this bundle

| Artifact | Purpose | Distribution |
| --- | --- | --- |
| `qwen3_17b_base.IQ4_XS.gguf` | Quantized base for Ollama / llama.cpp | Hugging Face, ollama.com |
| `qwen3_17b_{intent}.lora.gguf` (×4) | Specialist LoRA adapters | Hugging Face, ollama.com |
| `Modelfile.{intent}` (×4) | Ollama recipes (base + LoRA + system prompt) | this repo, ollama.com |
| `prompts/{intent}.txt` (×4) | Plain-text trained prompts (reference / testing) | this repo |

The full-precision (f16) base and HF safetensors set used by vLLM /
TGI / SageMaker live separately in the cloud bundle and are not yet
mirrored to Hugging Face.

## Citation

```bibtex
@misc{selora-ai-2026,
  title  = {Selora AI: Qwen3 1.7B + LoRA Specialists for Home Assistant},
  author = {{Selora Homes}},
  year   = {2026},
  url    = {https://huggingface.co/selora-homes/selora-ai}
}
```

Base model citation: Qwen Team, *Qwen3 Technical Report* (2025).

## License

Apache-2.0 (matches the Qwen3 base license).