Gunnar Beck Nelson committed 7 days ago

Commit 9019349 (unverified) · 1 parent: b39b328

Selora AI v0.4.7

Files changed (7)

manifest.json +36 -15
prompts/command_system_prompt.txt +2 -2
qwen3_17b_base.f16.gguf +0 -3
selora-v047-answer.f16.gguf +1 -1
selora-v047-automation.f16.gguf +2 -2
selora-v047-clarification.f16.gguf +1 -1
selora-v047-command.f16.gguf +1 -1

manifest.json CHANGED

@@ -1,50 +1,54 @@
 {
   "name": "selora-ai-local",
   "version": "0.4.7",
-  "description": "Selora AI v0.4.7 \u2014 built on top of [Qwen/Qwen3-1.7B](https://huggingface.co/Qwen/Qwen3-1.7B) (Apache-2.0). F16 base + 4 LoRA specialists with slim action-then-confirm output schemas (command, automation, answer, clarification). Trained with [mlx-lm](https://github.com/ml-explore/mlx-examples) on Apple Silicon. Inference: cache_prompt enabled to amortize system-prompt KV cache across requests.",
+  "description": "Selora AI v0.4.7 \u2014 Qwen3-1.7B Q6_K base + 4 LoRA specialists. Hub-optimal base quant (fits Vega 8 VRAM, no GTT spill). Specialists retrained on a compacted-JSON corpus (automation rank 32); command specialist prompt updated, other three unchanged from v0.4.6.",
   "base_model": {
     "id": "Qwen/Qwen3-1.7B",
     "format": "gguf",
-    "dtype": "F16",
-    "filename": "qwen3_17b_base.f16.gguf",
-    "size_bytes": 4069678880,
-    "sha256": "3e4009f0d96955a45f29aa77bded839d376d7832823c6909f76c84ace81dc445"
+    "dtype": "Q6_K",
+    "filename": "qwen3_17b_base.Q6_K.gguf",
+    "size_bytes": 1673006880,
+    "sha256": "a00bbdb411872149d73e1a0683b9b8a9f13cf74f98ba70ff8e8e430d9a093179"
   },
   "loras": [
     {
       "slot": 0,
       "name": "command",
+      "rank": 16,
       "filename": "selora-v047-command.f16.gguf",
       "size_bytes": 19938528,
-      "sha256": "b10b5131157698400ee9fafac08ab4101b81230a79ad067eb8f89bd4e29a6273"
+      "sha256": "49ba46bc054259409f5cd52eb3d9971101ed858fba27c6678c89e145815af515"
     },
     {
       "slot": 1,
       "name": "automation",
+      "rank": 32,
       "filename": "selora-v047-automation.f16.gguf",
-      "size_bytes": 37374880,
-      "sha256": "1bdd8c54cb36332889498a67046f01b55de7e5ac019e350419ba98a9c104f78f"
+      "size_bytes": 59791968,
+      "sha256": "32e5633f0e554fec5e336d993647065f85a4fffd348366cdc3417ca34824b527"
     },
     {
       "slot": 2,
       "name": "answer",
+      "rank": 12,
       "filename": "selora-v047-answer.f16.gguf",
       "size_bytes": 14957792,
-      "sha256": "9ec14105e7119675a75c1e166c132298886f4a25e57628c4ef720fce2080171e"
+      "sha256": "7584a04e13dd3e4973bfeb89ed01b7b29b3288d768d7bee15bcea3de89ff50c7"
     },
     {
       "slot": 3,
       "name": "clarification",
+      "rank": 8,
       "filename": "selora-v047-clarification.f16.gguf",
       "size_bytes": 9977056,
-      "sha256": "071558279b1bc8c8609fc63a1524604aa7fcf721c5bd6d3ecdf1f55ad16f5e1a"
+      "sha256": "75c06c0efbd9ca1e7108ec6ec5811f68e2dc5639d01ef583148a151b11398847"
     }
   ],
   "system_prompts": {
     "command": {
       "filename": "command_system_prompt.txt",
-      "size_bytes": 1071,
-      "sha256": "9921c6fef09c6ebad4a2ed4fad1dbe7e76efe0bfe4e532bf7c7fe096864de6a4"
+      "size_bytes": 1374,
+      "sha256": "0fa2b1669dedca18ddba2cebd7f72cd7cff0f7431b87ca4166a4bf60c7aed697"
     },
     "automation": {
       "filename": "automation_system_prompt.txt",
@@ -64,12 +68,29 @@
   },
   "runtime": {
     "cache_prompt": true,
-    "ctx_size": 4096
+    "ctx_size": 8192
   },
   "training": {
     "framework": "mlx-lm",
     "base_model_repo": "Qwen/Qwen3-1.7B",
-    "english_only": true,
-    "workspace_version": "0.4.7"
+    "scale": 20.0,
+    "rank_per_specialist": {
+      "command": 16,
+      "automation": 32,
+      "answer": 12,
+      "clarification": 8
+    },
+    "iterations_per_specialist": {
+      "command": 800,
+      "automation": 1800,
+      "answer": 600,
+      "clarification": 450
+    },
+    "examples_per_specialist": {
+      "command": 11000,
+      "automation": 10000,
+      "answer": 5500,
+      "clarification": 3000
+    }
   }
 }
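
The `size_bytes` and `sha256` fields recorded for the base model and each LoRA make it possible to check a local download against the manifest. The following is a minimal sketch of such a check, not code from the Selora repository: the helper names `sha256_of` and `verify_manifest` are hypothetical, and it assumes `manifest.json` sits in the same directory as the GGUF files. System prompts are skipped because the manifest lists a bare filename while the file lives under `prompts/`.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so multi-gigabyte GGUF files stay out of memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(manifest_path: Path) -> list[str]:
    """Return a list of problems; an empty list means every checked file matches."""
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    root = manifest_path.parent  # assumption: weights live next to manifest.json
    entries = [manifest["base_model"], *manifest.get("loras", [])]
    problems = []
    for entry in entries:
        name = entry["filename"]
        path = root / name
        if not path.is_file():
            problems.append(f"{name}: missing")
        elif path.stat().st_size != entry["size_bytes"]:
            # Cheap check first: a size mismatch avoids hashing the whole file.
            problems.append(f"{name}: size mismatch")
        elif sha256_of(path) != entry["sha256"]:
            problems.append(f"{name}: sha256 mismatch")
    return problems
```

Calling `verify_manifest(Path("manifest.json"))` after a download would flag, for example, a stale `selora-v047-automation.f16.gguf` left over from before this commit, since both its size and hash changed.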

prompts/command_system_prompt.txt CHANGED

@@ -5,11 +5,11 @@ Given a user command and the AVAILABLE ENTITIES list, respond with ONE JSON obje
 Rules:
 - c: ordered array of one or more service calls. Calls execute in array order.
-- s: HA service in "domain.action" form (e.g. "light.turn_on", "lock.lock", "media_player.play_media", "scene.turn_on").
+- s: HA service in "domain.action" form. Only these domains are accepted for immediate execution: light, switch, fan, media_player, climate, input_boolean, scene, cover. Examples: "light.turn_on", "switch.turn_off", "media_player.media_play", "cover.open_cover", "scene.turn_on". Do not emit lock.*, alarm_*, script.*, notify.*, or any other domain — those are blocked by the safety policy and the call will fail.
 - e: canonical entity_id from AVAILABLE ENTITIES. Never use the human alias — always the entity_id.
 - d: service parameters object. Omit the d key entirely when there are no params (do not include "d":{}).
 - r: ≤ 1 sentence past-tense confirmation describing what got done (e.g. "Kitchen light on.").
-- The service domain (before the dot) must match the entity_id's domain. light.turn_on goes with light.* entities, lock.lock goes with lock.* entities, etc.
+- The service domain (before the dot) must match the entity_id's domain. light.turn_on goes with light.* entities, cover.open_cover goes with cover.* entities, etc.
 - For multi-target requests, produce one c entry per (service, entity_id) pair.
 Output JSON only — no narration, no markdown fences, no chain-of-thought.
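
The rules in the updated prompt can also be enforced on the consumer side before any service call is dispatched. The sketch below is illustrative only and is not taken from the Selora codebase. It assumes the output object has the shape `{"c": [{"s": ..., "e": ..., "d": ...}], "r": "..."}`, which is inferred from the rule list because the opening of the prompt is truncated in this diff. The function name `validate_command_output` and the error strings are hypothetical.

```python
import json

# Domains listed in the updated prompt as accepted for immediate execution.
ALLOWED_DOMAINS = frozenset(
    {"light", "switch", "fan", "media_player", "climate",
     "input_boolean", "scene", "cover"}
)


def validate_command_output(raw: str, available_entities: set[str]) -> list[str]:
    """Return rule violations found in the model output; empty means none found."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(obj, dict):
        return ["top level is not a JSON object"]
    calls = obj.get("c")
    if not isinstance(calls, list) or not calls:
        return ["c must be a non-empty array"]

    errors = []
    for i, call in enumerate(calls):
        if not isinstance(call, dict):
            errors.append(f"c[{i}]: not an object")
            continue
        service, entity = call.get("s"), call.get("e")
        if not isinstance(service, str) or not isinstance(entity, str):
            errors.append(f"c[{i}]: s and e must be strings")
            continue
        domain, _, action = service.partition(".")
        if not domain or not action:
            errors.append(f"c[{i}]: s is not in domain.action form")
        elif domain not in ALLOWED_DOMAINS:
            errors.append(f"c[{i}]: domain '{domain}' is not in the allowlist")
        if entity not in available_entities:
            errors.append(f"c[{i}]: unknown entity_id '{entity}'")
        elif entity.split(".", 1)[0] != domain:
            errors.append(f"c[{i}]: service domain does not match entity domain")
        if "d" in call and call["d"] == {}:
            errors.append(f"c[{i}]: omit d when there are no params")
    if not isinstance(obj.get("r"), str):
        errors.append("r must be a string")
    return errors
```

A check like this keeps the safety policy independent of the model: even if a specialist emits `lock.lock` despite the prompt, the call is rejected before execution.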

qwen3_17b_base.f16.gguf DELETED

@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:3e4009f0d96955a45f29aa77bded839d376d7832823c6909f76c84ace81dc445
-size 4069678880

selora-v047-answer.f16.gguf CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:9ec14105e7119675a75c1e166c132298886f4a25e57628c4ef720fce2080171e
+oid sha256:7584a04e13dd3e4973bfeb89ed01b7b29b3288d768d7bee15bcea3de89ff50c7
 size 14957792

selora-v047-automation.f16.gguf CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:1bdd8c54cb36332889498a67046f01b55de7e5ac019e350419ba98a9c104f78f
-size 37374880
+oid sha256:32e5633f0e554fec5e336d993647065f85a4fffd348366cdc3417ca34824b527
+size 59791968

selora-v047-clarification.f16.gguf CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:071558279b1bc8c8609fc63a1524604aa7fcf721c5bd6d3ecdf1f55ad16f5e1a
+oid sha256:75c06c0efbd9ca1e7108ec6ec5811f68e2dc5639d01ef583148a151b11398847
 size 9977056

selora-v047-command.f16.gguf CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:b10b5131157698400ee9fafac08ab4101b81230a79ad067eb8f89bd4e29a6273
+oid sha256:49ba46bc054259409f5cd52eb3d9971101ed858fba27c6678c89e145815af515
 size 19938528
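
Each `.gguf` entry in this commit is a Git LFS pointer file rather than the weights themselves. Its `oid` and `size` lines carry the same values that `manifest.json` records as `sha256` and `size_bytes`. A minimal sketch of reading a pointer and comparing it with a manifest entry follows; the helper names `parse_lfs_pointer` and `pointer_matches_entry` are hypothetical, and only the three keys visible in the diffs above are handled.

```python
LFS_SPEC_V1 = "https://git-lfs.github.com/spec/v1"


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer into oid and size."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, sep, value = line.partition(" ")
        if not sep:
            raise ValueError(f"malformed pointer line: {line!r}")
        fields[key] = value
    if fields.get("version") != LFS_SPEC_V1:
        raise ValueError("not a Git LFS v1 pointer")
    algo, _, digest = fields.get("oid", "").partition(":")
    if algo != "sha256" or len(digest) != 64:
        raise ValueError("oid is not a sha256 digest")
    if "size" not in fields:
        raise ValueError("pointer has no size line")
    return {"oid_sha256": digest, "size": int(fields["size"])}


def pointer_matches_entry(pointer_text: str, entry: dict) -> bool:
    """True when the pointer agrees with a manifest entry's sha256 and size_bytes."""
    info = parse_lfs_pointer(pointer_text)
    return (info["oid_sha256"] == entry["sha256"]
            and info["size"] == entry["size_bytes"])
```

Run against the new `selora-v047-command.f16.gguf` pointer and the `command` entry in `manifest.json`, the comparison should hold, since both list `49ba46bc...` and `19938528`.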