docs: claim best-agent with an explicit perception-action architecture section
README.md
CHANGED
@@ -15,6 +15,7 @@ tags:
 - off-brand
 - off-the-grid
 - best-demo
+- best-agent
 - sharing-is-caring
 - community-choice
 ---
@@ -64,6 +65,13 @@ Accessibility shaped the whole interface, because the person it was made for ask
 voice-first frontend is custom, built on **`gr.Server`** (Off-Brand). Inference runs
 in the Space on **ZeroGPU**, with no third-party model APIs.
 
+## Architecture: a small perception-action agent
+Iris is more than one model call. It orchestrates four tools and runs a control loop:
+- **Role prompts** define what each model does: read money and bills, describe a scene for a blind person, report only what is new.
+- **Intent routing** turns a spoken phrase into an action: describe, answer a question, or toggle live mode (forgiving of transcription errors).
+- **Tools it drives:** Whisper to hear, Qwen3-VL to see and read, Piper to speak, and an on-device detector (COCO-SSD) to watch for change.
+- **A live loop** that perceives (camera + detector), decides whether something new is worth saying, acts (calls the vision model and speaks), and remembers what it already said so it doesn't repeat.
+
 ## Safety
 Iris describes surroundings and reads text. Don't use it to get around or avoid
 obstacles. It can't judge distance reliably and isn't safe to walk by.
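The intent routing and live loop described in the new Architecture section could look roughly like the sketch below. This is not code from the Space: the phrase table, `route_intent`, `LiveLoop`, and the `describe`/`speak` callbacks are hypothetical stand-ins. Fuzzy string matching is one plausible way to be forgiving of transcription errors, and tracking detector labels is one plausible form of memory. The commit itself does not say how either is implemented.

```python
import difflib
from dataclasses import dataclass, field
from typing import Callable, FrozenSet, Iterable, Optional, Set

# Hypothetical phrase table; the README does not list the real phrases.
INTENT_PHRASES = {
    "describe": ["describe", "what do you see", "look around"],
    "live_on": ["start live mode", "live mode on"],
    "live_off": ["stop live mode", "live mode off"],
}


def route_intent(transcript: str, cutoff: float = 0.75) -> str:
    """Map a transcript to an intent, tolerating small transcription errors.

    Anything that is not close to a known phrase is treated as a question.
    """
    text = transcript.lower().strip(" .?!")
    best_intent, best_score = "question", 0.0
    for intent, phrases in INTENT_PHRASES.items():
        for phrase in phrases:
            score = difflib.SequenceMatcher(None, text, phrase).ratio()
            if score > best_score:
                best_intent, best_score = intent, score
    return best_intent if best_score >= cutoff else "question"


@dataclass
class LiveLoop:
    """Perceive, decide, act, remember (assumed structure, not the Space's code)."""

    describe: Callable[[FrozenSet[str]], str]  # stand-in for the vision model call
    speak: Callable[[str], None]               # stand-in for text-to-speech
    seen: Set[str] = field(default_factory=set)

    def step(self, labels: Iterable[str]) -> Optional[str]:
        """Run one cycle on the detector labels for the current frame."""
        new = set(labels) - self.seen          # decide: anything not yet mentioned?
        if not new:
            return None                        # nothing new, stay silent
        text = self.describe(frozenset(new))   # act: describe only what is new
        self.speak(text)
        self.seen |= new                       # remember what was said
        return text


if __name__ == "__main__":
    spoken = []
    loop = LiveLoop(
        describe=lambda new: "New: " + ", ".join(sorted(new)),
        speak=spoken.append,
    )
    print(route_intent("Start life mode."))  # a misheard "live" still routes
    for frame in (["person", "chair"], ["chair"], ["person", "dog"]):
        loop.step(frame)
    print(spoken)
```

A real loop would likely also forget objects after some time so that something that leaves and returns is announced again. That is left out here to keep the sketch short.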