OnDeviceAgent Framework
A Unity package for building on-device voice and vision agents: wake-word detection, speech-to-text, a tool-calling LLM, retrieval-augmented generation, on-device vision, and text-to-speech wired into a single local agent runtime.
Install
This package depends on six com.sky.sentis.* model packages. They are not published to a UPM
registry, so Unity does not resolve them transitively from a git URL. Add all seven git URLs to your
project's Packages/manifest.json (this package plus the six Sentis packages):
"dependencies": {
"com.sky.ondeviceagent": "https://huggingface.co/Sky-Kim/com.sky.ondeviceagent.git",
"com.sky.sentis.e5": "https://huggingface.co/Sky-Kim/com.sky.sentis.e5.git",
"com.sky.sentis.whisper": "https://huggingface.co/Sky-Kim/com.sky.sentis.whisper.git",
"com.sky.sentis.openwakeword": "https://huggingface.co/Sky-Kim/com.sky.sentis.openwakeword.git",
"com.sky.sentis.silero-vad": "https://huggingface.co/Sky-Kim/com.sky.sentis.silero-vad.git",
"com.sky.sentis.supertonic": "https://huggingface.co/Sky-Kim/com.sky.sentis.supertonic.git",
"com.sky.sentis.yolox": "https://huggingface.co/Sky-Kim/com.sky.sentis.yolox.git"
}
The package binaries and model weights are stored in Git LFS, so git-lfs must be installed on
your machine for the Package Manager to fetch them.
This package also depends on the External Dependency Manager, served from the OpenUPM scoped registry.
Add it to Packages/manifest.json if it is not already present:
"scopedRegistries": [
{ "name": "package.openupm.com", "url": "https://package.openupm.com",
"scopes": ["com.google.external-dependency-manager"] }
]
Models
This framework uses two kinds of on-device models:
- Sentis models (wake-word, VAD, speech-to-text, text-to-speech, vision, and the RAG text
embedder) ship as separate embedded UPM packages โ
com.sky.sentis.*(e5,whisper,openwakeword,silero-vad,supertonic,yolox) โ with the FP16 weights under each package'sModels~/folder. The framework loads them straight from those packages in the Editor; before a player build an Editor step copies them intoStreamingAssets/Model/so the player ships them. All six are declared as hard dependencies of this package but must be added to your project yourself (see Install) โ they do not resolve automatically from a git URL. - On-device LLM (Android): the tool-calling LLM weights (LiteRT-LM
.litertlm) are streamed from Hugging Face on first launch โ the model is gated, so provide a Hugging Face access token โ and cached in the app's persistent data path.
Retrieval-augmented generation (LightRAG.NET)
The RAG pipeline is powered by LightRAG.NET, a C# port of
LightRAG (graph + vector retrieval). It ships here as a prebuilt managed plugin
(LightRAG.NET.dll, netstandard2.1) under Runtime/Plugins/LightRAG/, built from the
v0.2.0 release. The framework drives it
with the on-device E5 text embedder (com.sky.sentis.e5) and the tool-calling LLM, and talks to
Ollama over raw HTTP (the LightRAG.Providers.Ollama provider is intentionally not bundled). To
update it, drop a newer LightRAG.NET.dll from the release page into that folder.
Android on-device LLM
On Android the tool-calling LLM runs on-device through a thin Kotlin/JNI bridge over the LiteRT-LM
runtime (AndroidLlmTransport, called via JNI at runtime). The bridge ships entirely within this
package:
- AAR:
Runtime/Plugins/Android/llm-release.aar(classcom.ondeviceagent.llm.LlmBridge) - EDM4U deps:
Runtime/Plugins/Android/Editor/LiteRtDependencies.xml(LiteRT-LM + Qualcomm QNN for NPU) - Kotlin source + rebuild tooling:
AndroidBridge~/(hidden from the importer by~; rebuild with JDK 17 via./gradlew assembleReleaseand drop the output atRuntime/Plugins/Android/)
Run Assets โธ External Dependency Manager โธ Android Resolver โธ Resolve to fetch the Maven deps.
โ ๏ธ Redistribution note. The AAR embeds
libLiteRtDispatch_Qualcomm.so, and the EDM4U manifest pulls Qualcomm QNN Maven artifacts โ both proprietary. They are included here for convenience so the NPU backend works out of the box, but redistribution rights are unconfirmed: if you redistribute this package (or a player build that includes it), confirm Qualcomm's redistribution terms for those binaries first, or strip the NPU.soand QNN deps and ship GPU/CPU only.
Sample
For a runnable project that wires the full pipeline into a scene, see the ondeviceagent-sample repository.
License
Apache-2.0. Bundled third-party libraries and downloaded model weights carry their own licenses; see
THIRD_PARTY_NOTICES.md in the repository root.