| # Plan: talk_with_perplexity | |
| ## What I understand you want | |
| A Hugging Face Space (static JS app) where a user can have a spoken conversation with their Reachy Mini robot, powered by the Perplexity AI API. The robot responds expressively, moving its head and antennas, while listening, thinking, and speaking. | |
| --- | |
| ## App Flow | |
| ``` | |
| 1. Open Space URL | |
| 2. Sign in with HuggingFace (OAuth) | |
| 3. On first use: enter your Perplexity API key → saved to localStorage | |
| 4. Connect to signaling server → select robot → start session | |
| 5. Press TALK button | |
|    ├─ Listening → Web SpeechRecognition transcribes voice → robot listens (antennas up) | |
|    ├─ Thinking → text sent to Perplexity API → robot "thinks" (antennas wiggle) | |
|    └─ Speaking → Perplexity response read via SpeechSynthesis → robot nods head | |
| 6. Conversation history panel shows exchanges | |
| 7. Repeat | |
| ``` | |
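
The listen/think/speak cycle in step 5 can be read as a small state machine. The sketch below is one way to model it, not the app's actual code: the state names, event names, and motion labels (including `"rest"` for the idle state) are hypothetical, and the `reachy-mini.js` calls that would actually drive the head and antennas are deliberately left out.

```javascript
// Hypothetical states and events for the TALK cycle; names are illustrative only.
const TRANSITIONS = {
  idle:      { talk: "listening" },
  listening: { transcript: "thinking", cancel: "idle" },
  thinking:  { reply: "speaking", error: "idle" },
  speaking:  { done: "idle" },
};

// Motion label per state, following the flow above. Mapping these labels
// to real robot commands is left to the app.
const MOTION_FOR_STATE = {
  idle: "rest", // assumption: the flow does not name an idle pose
  listening: "antennas-up",
  thinking: "antennas-wiggle",
  speaking: "head-nod",
};

// Return the next state, or the current state when the event does not apply.
function nextState(state, event) {
  const row = TRANSITIONS[state] || {};
  return row[event] || state;
}

// Replay a list of events from "idle" and collect the motion for each state visited.
function motionsFor(events) {
  let state = "idle";
  const motions = [MOTION_FOR_STATE[state]];
  for (const event of events) {
    state = nextState(state, event);
    motions.push(MOTION_FOR_STATE[state]);
  }
  return motions;
}
```

Keeping the transitions in a plain table means an out-of-order event (for example a late `reply` after the user cancelled) is simply ignored instead of moving the robot unexpectedly.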
| --- | |
| ## Technical Approach | |
| | Layer | Implementation | | |
| |---------------|----------------| | |
| | Hosting | Static HF Space (no backend) | | |
| | Auth | HF OAuth (built into the Space) | | |
| | Robot control | `reachy-mini.js` v1.7.1 via jsDelivr | | |
| | STT | Web Speech API (`SpeechRecognition`): browser-native, free | | |
| | LLM | Perplexity `/chat/completions` with the `llama-3.1-sonar-small-128k-online` model | | |
| | TTS | Web Speech API (`speechSynthesis`): browser-native, free | | |
| | API key | User-entered in Settings drawer, stored in `localStorage` | | |
| | Robot motion | Head/antenna animations during listen/think/speak states | | |
| **Files:** `README.md` (Space YAML), `index.html`, `style.css` | |
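
To make the API key and LLM rows of the table concrete, here is a minimal sketch of how the key might be stored and how a `/chat/completions` request could be assembled. It assumes the endpoint lives at `https://api.perplexity.ai/chat/completions` and accepts an OpenAI-style `messages` array with a Bearer token. The storage key name, the system prompt, and all function names are hypothetical. The storage object is passed in so the same code works with `window.localStorage` in the browser and with a stand-in elsewhere.

```javascript
const PERPLEXITY_URL = "https://api.perplexity.ai/chat/completions";
const KEY_NAME = "perplexity_api_key"; // hypothetical localStorage key name

// Persist the user-entered key; `storage` is window.localStorage in the browser.
function saveApiKey(storage, key) {
  storage.setItem(KEY_NAME, key.trim());
}

// Read the key back, or return "" when none has been saved yet.
function loadApiKey(storage) {
  return storage.getItem(KEY_NAME) || "";
}

// Build the URL and fetch options for one conversation turn.
// `history` is an array of { role, content } messages from earlier turns.
function buildChatRequest(apiKey, history, userText,
                          model = "llama-3.1-sonar-small-128k-online") {
  const messages = [
    // Hypothetical system prompt: short answers suit text-to-speech.
    { role: "system", content: "You are a helpful assistant. Keep answers brief." },
    ...history,
    { role: "user", content: userText },
  ];
  return {
    url: PERPLEXITY_URL,
    options: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}

// Pull the assistant text out of a parsed response, or "" if it is missing.
function extractReply(json) {
  return json?.choices?.[0]?.message?.content ?? "";
}
```

In the app this would be used roughly as `const { url, options } = buildChatRequest(loadApiKey(localStorage), history, text)` followed by `fetch(url, options)`. Because the Space has no backend, the key is sent straight from the browser, so it is visible to anything running on the page.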
| --- | |
| ## Clarifying Questions | |
| Answer these before I code: | |
| **Q1: Talk button style?** | |
| - [ ] Push-to-hold (PTT): press and hold to speak, release to send | |
| - [ ] Click-to-toggle: click once to start listening, click again to stop | |
| - [ ] Auto voice-activity detection (browser auto-stops when you stop speaking) | |
| **Q2: Robot camera visible?** | |
| - [ ] Yes: show the robot's camera feed (16:9 panel at top) | |
| - [ ] No: conversation-only UI (simpler, faster) | |
| **Q3: Robot persona / system prompt?** | |
| - [ ] Generic assistant (no special persona) | |
| - [ ] Specific persona: _______ | |
| **Q4: Perplexity search depth?** | |
| - [ ] `llama-3.1-sonar-small-128k-online`: fast, search-enhanced (default) | |
| - [ ] `llama-3.1-sonar-large-128k-online`: slower, more thorough | |
| - [ ] `llama-3.1-sonar-huge-128k-online`: most powerful | |
| **Q5: Robot expressive motion?** | |
| - [ ] Yes: head/antenna animations for listening / thinking / speaking states | |
| - [ ] No: keep robot still (pure conversation, no motion) | |
| --- | |
| ## Default Answers (I'll use these if you don't reply) | |
| - Q1: Auto voice-activity detection | |
| - Q2: Yes, show camera | |
| - Q3: Generic assistant | |
| - Q4: sonar-small (fast) | |
| - Q5: Yes, expressive motion | |
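
The defaults above could be captured as a single settings object that user answers override. The sketch below is illustrative only: the property names and the `"hold"` / `"toggle"` / `"auto"` mode labels are hypothetical. `continuous` and `interimResults` are standard `SpeechRecognition` properties. The assumption here is that leaving `continuous` off lets the browser end recognition once the speaker pauses, which is what the auto-detection option relies on, while the hold and toggle modes keep listening until the app calls `stop()`.

```javascript
// Defaults from the list above; property names are hypothetical.
const DEFAULT_SETTINGS = {
  talkMode: "auto",                           // Q1: "hold" | "toggle" | "auto"
  showCamera: true,                           // Q2
  persona: null,                              // Q3: null means generic assistant
  model: "llama-3.1-sonar-small-128k-online", // Q4
  expressiveMotion: true,                     // Q5
};

// Overlay the answers that were actually given on top of the defaults.
function mergeSettings(answers) {
  return { ...DEFAULT_SETTINGS, ...answers };
}

// Flags to copy onto a SpeechRecognition instance for a given talk mode.
function recognitionFlags(talkMode) {
  return {
    // "auto": single utterance, the browser stops after a pause.
    // "hold" / "toggle": keep listening until the app calls stop().
    continuous: talkMode !== "auto",
    interimResults: false, // only final transcripts are sent to the LLM
  };
}
```

In the browser the flags would be applied with something like `Object.assign(recognition, recognitionFlags(settings.talkMode))` before calling `recognition.start()`.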
| --- | |
| ## Deployment Steps (for later) | |
| 1. `hf auth login` (one-time) | |
| 2. `hf repos create talk-with-perplexity --repo-type space --space-sdk static` | |
| 3. `git clone https://huggingface.co/spaces/<username>/talk-with-perplexity` | |
| 4. Copy files → commit → push | |
| 5. App live at `https://<username>-talk-with-perplexity.static.hf.space/` | |
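
If the app ever needs to link to itself or to a sibling Space, the URL pattern in step 5 can be derived from the username and Space name. This is a small sketch under an assumption: the subdomain is taken to be `<username>-<space-name>` in lowercase with any character outside `a-z`, `0-9`, and `-` replaced by `-`. The function name is hypothetical, and the normalisation rule should be checked against a real deployed Space.

```javascript
// Assumption: subdomain = "<username>-<space-name>", lowercased, with
// characters other than a-z, 0-9 and "-" replaced by "-".
function staticSpaceUrl(username, spaceName) {
  const slug = `${username}-${spaceName}`
    .toLowerCase()
    .replace(/[^a-z0-9-]/g, "-");
  return `https://${slug}.static.hf.space/`;
}
```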