Getting started
1. Add the package and let models provision
Add com.sky.ondeviceagent to your project (Package Manager → Add package from git URL, or reference
a local clone under Packages/). The six com.sky.sentis.* model packages are declared as hard
dependencies and are pulled in with it.
On-device models are not vendored in this repo — they are fetched on demand:
- Sentis models (wake-word, VAD, Whisper STT, E5 text embeddings, Supertonic TTS, YOLOX vision) download from Hugging Face (Sky-Kim/com.sky.sentis.*) into each model package's Models~/ folder on first Editor load (and again before a player build). An Editor step then copies them into StreamingAssets/Model/ so the player ships them. No manual download step is needed.
- On-device LLM (Android) streams from Hugging Face on first launch; see android-llm.md.
See ../THIRD_PARTY_NOTICES.md for each model's source and license.
YOLO detector model (YOLOX, Apache-2.0)
The vision detector uses YOLOX (Apache-2.0). Do not use Ultralytics weights (yolo26n, AGPL-3.0).
The weights belong to the com.sky.sentis.yolox package (yolox_fp16.sentis, provisioned from
Hugging Face like the other Sentis models); CocoYoloDetector loads them at runtime.
The decoder expects a single output tensor of shape [1, N, 5+C] (or its transpose), where each of
the N rows is [cx, cy, w, h, obj, classes...] in input-pixel coordinates. The model input is RGB in
NCHW layout with values in [0,1]. NMS runs on the C# side.
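As an illustration of the output layout described above, the following Python sketch decodes rows of [cx, cy, w, h, obj, classes...] into scored corner boxes and applies greedy per-class NMS. It is not the package's C# decoder: the function names, the thresholds (0.25 score, 0.45 IoU), and the score formula (objectness times best class probability, the usual YOLOX convention) are assumptions made for the example.

```python
def decode(rows, score_thr=0.25):
    """Turn rows of [cx, cy, w, h, obj, cls_0, ...] into detections.

    Returns (x1, y1, x2, y2, score, class_id) tuples in input-pixel coordinates.
    A transposed output of shape [5+C, N] can be converted first with
    rows = list(zip(*channels)).
    """
    detections = []
    for row in rows:
        cx, cy, w, h, obj = row[:5]
        class_probs = row[5:]
        class_id = max(range(len(class_probs)), key=class_probs.__getitem__)
        score = obj * class_probs[class_id]  # assumed scoring: objectness x class prob
        if score < score_thr:
            continue
        detections.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2, score, class_id))
    return detections


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2, ...) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(detections, iou_thr=0.45):
    """Greedy NMS: keep the highest-scoring box, drop same-class boxes that overlap it."""
    kept = []
    for det in sorted(detections, key=lambda d: d[4], reverse=True):
        if all(det[5] != k[5] or iou(det, k) <= iou_thr for k in kept):
            kept.append(det)
    return kept


# Hypothetical output rows for a 2-class model (C = 2).
rows = [
    [100, 100, 50, 50, 0.9, 0.8, 0.1],  # class 0
    [102, 101, 50, 50, 0.8, 0.7, 0.2],  # near-duplicate of the row above
    [300, 200, 40, 80, 0.9, 0.1, 0.9],  # class 1
    [10, 10, 5, 5, 0.1, 0.5, 0.5],      # below the score threshold
]
boxes = nms(decode(rows))
```

With these sample rows, the near-duplicate and the low-score row are discarded, leaving one box per class. Because the rows are already in input-pixel coordinates, any mapping back to the source image (for example undoing a resize) would happen after this step.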
Sample knowledge index (optional, for RAG)
The Voice Assistant sample ships a pre-built LightRAG index in its StreamingAssets/VoiceAgent/DB/, so
retrieval works out of the box. To rebuild it from the synthetic corpus in
StreamingAssets/Knowledge/, use the KnowledgeIngest Editor tool (menu added by
Runtime/AgentCore/Editor/KnowledgeIngestMenu.cs). Rebuilding requires a running Ollama for the
ingest model. Without an index the agent still runs; only knowledge retrieval is unavailable.
2. Install the desktop LLM (Ollama)
The Editor/desktop path talks to an Ollama server.
# install Ollama (see ollama.com), then:
ollama pull gemma4:e2b # match the model configured on the agent
ollama serve # default endpoint http://localhost:11434
Note: gemma4:e2b is the tag configured in code (AgentBuilder.Model / the agent's m_Model), not a public Ollama registry tag; the real Gemma tags are gemma / gemma2 / gemma3 / gemma3n. Running ollama pull gemma4:e2b as-is will fail with "model not found" unless you have a matching local model tagged that way. Either tag a local model as gemma4:e2b, or change m_Model on the agent component in the Inspector to a tag you have pulled.
The default endpoint and model are configurable on the agent component in the Inspector.
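One way to check up front that the server is reachable and that the configured tag exists locally is to query Ollama's GET /api/tags endpoint, which lists local models. The sketch below is a standalone Python helper, not part of the package; the function names are hypothetical, and the endpoint and tag values simply mirror the defaults mentioned above.

```python
import json
import urllib.error
import urllib.request


def local_model_names(tags_payload):
    """Extract model names from the JSON body returned by GET /api/tags."""
    return [m.get("name", "") for m in tags_payload.get("models", [])]


def is_model_available(tags_payload, model):
    """True if `model` matches a local tag (a bare name matches its ':latest' tag)."""
    wanted = model if ":" in model else model + ":latest"
    return wanted in local_model_names(tags_payload)


def check_ollama(endpoint="http://localhost:11434", model="gemma4:e2b", timeout=3.0):
    """Return a short status string; connection problems are reported, not raised."""
    try:
        with urllib.request.urlopen(endpoint + "/api/tags", timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError) as exc:
        return f"server not reachable at {endpoint}: {exc}"
    if is_model_available(payload, model):
        return f"ok: {model} is available"
    return f"{model} not found; local models: {local_model_names(payload)}"


if __name__ == "__main__":
    print(check_ollama())
```

If the tag is reported as not found, the listed local models show which value m_Model could be changed to.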
3. Run the sample
- Use Unity 6000.4.8f1 (Unity 6.x). Other versions are untested.
- Open the ondeviceagent-sample project, which references this package and wires the full pipeline into a scene.
- Open the sample scene, press Play, say the wake word, then ask a question (e.g. a URP/rendering question to exercise the bundled knowledge base, or any general question to exercise web search).
See the ondeviceagent-sample repository for sample details.
4. Android on-device LLM (optional)
To run the LLM fully on-device on Android instead of via Ollama, see android-llm.md.
In short: side-load (or download) a .litertlm model; the llm-release.aar bridge ships in this
package.
Troubleshooting
- No response from the agent (desktop): confirm ollama serve is running and the configured model is pulled. Check the endpoint in the Inspector.
- Models missing at runtime: confirm the Sentis model provisioning step ran (check the Editor console on load / before build) so StreamingAssets/Model/ is populated; provisioning needs network access to Hugging Face.
- Wake word not triggering: the voice pipeline needs microphone permission; on desktop, confirm the OS granted Unity microphone access.
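For the "Models missing at runtime" case, a quick file listing can confirm whether provisioning populated the folder. This is a sketch in Python rather than an Editor tool from the package; the helper name is hypothetical, and the Assets/StreamingAssets/Model location relative to the project root is an assumption based on Unity's standard project layout.

```python
from pathlib import Path


def list_model_files(model_dir):
    """Return (relative path, size in bytes) for every non-.meta file under model_dir."""
    root = Path(model_dir)
    if not root.is_dir():
        return []
    return sorted(
        (p.relative_to(root).as_posix(), p.stat().st_size)
        for p in root.rglob("*")
        if p.is_file() and p.suffix != ".meta"  # skip Unity .meta sidecar files
    )


if __name__ == "__main__":
    # Assumed location; adjust to your project root.
    files = list_model_files("Assets/StreamingAssets/Model")
    if not files:
        print("No model files found; provisioning may not have run.")
    for name, size in files:
        print(f"{size:>12}  {name}")
```

An empty result, or files of zero bytes, would suggest re-running provisioning with network access to Hugging Face.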