Plan: talk_with_perplexity
What I understand you want
A Hugging Face Space (static JS app) where a user can have a spoken conversation with their Reachy Mini robot, powered by the Perplexity AI API. The robot responds expressively, moving its head and antennas, while listening, thinking, and speaking.
App Flow
1. Open Space URL
2. Sign in with Hugging Face (OAuth)
3. On first use: enter your Perplexity API key → saved to localStorage
4. Connect to signaling server → select robot → start session
5. Press TALK button
   - Listening: Web SpeechRecognition transcribes voice → robot listens (antennas up)
   - Thinking: text sent to Perplexity API → robot "thinks" (antennas wiggle)
   - Speaking: Perplexity response read via SpeechSynthesis → robot nods head
6. Conversation history panel shows exchanges
7. Repeat
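The listen → think → speak loop in step 5 can be sketched as a small state machine. This is a minimal sketch, not the app's actual code: `createConversation`, the state names, and the `onEnter` callback are hypothetical. Robot motion is left to the callback because the reachy-mini.js API is not described in this plan.

```javascript
// Conversation states taken from the App Flow above.
const STATES = Object.freeze({
  IDLE: "idle",
  LISTENING: "listening",
  THINKING: "thinking",
  SPEAKING: "speaking",
});

// Allowed transitions: idle -> listening -> thinking -> speaking -> idle.
// Listening and thinking may also fall back to idle (for example on an error).
const TRANSITIONS = Object.freeze({
  idle: ["listening"],
  listening: ["thinking", "idle"],
  thinking: ["speaking", "idle"],
  speaking: ["idle"],
});

// Hypothetical helper: onEnter receives the new state name so the caller
// can update the UI or trigger a robot animation there.
function createConversation(onEnter = () => {}) {
  let state = STATES.IDLE;
  return {
    get state() {
      return state;
    },
    go(next) {
      if (!TRANSITIONS[state].includes(next)) {
        throw new Error(`Invalid transition: ${state} -> ${next}`);
      }
      state = next;
      onEnter(next);
      return state;
    },
  };
}
```

A caller would create one conversation per session, call `go("listening")` when TALK is pressed, and advance the state from the speech-recognition, API, and speech-synthesis callbacks. Rejecting invalid transitions keeps a late callback from moving the robot into the wrong pose.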
Technical Approach
| Layer | Implementation |
|---|---|
| Hosting | Static HF Space (no backend) |
| Auth | HF OAuth (built into the Space) |
| Robot control | reachy-mini.js v1.7.1 via jsDelivr |
| STT | Web Speech API (SpeechRecognition), browser-native and free |
| LLM | Perplexity /chat/completions with the llama-3.1-sonar-small-128k-online model |
| TTS | Web Speech API (speechSynthesis), browser-native and free |
| API key | User-entered in Settings drawer, stored in localStorage |
| Robot motion | Head/antenna animations during listen/think/speak states |
Files: README.md (Space YAML), index.html, style.css
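The API key and LLM rows above could be wired together roughly as follows. This is a sketch under assumptions: the storage key name `perplexity_api_key` and the helper names are hypothetical, and the endpoint is assumed to be `https://api.perplexity.ai/chat/completions` with an OpenAI-style request and response shape. The model name is the one given in this plan and should be checked against Perplexity's current model list.

```javascript
const PERPLEXITY_URL = "https://api.perplexity.ai/chat/completions";
const KEY_NAME = "perplexity_api_key"; // hypothetical localStorage key name

// storage defaults to the browser's localStorage; any object with
// getItem/setItem can be passed instead (useful outside a browser).
function saveApiKey(key, storage = globalThis.localStorage) {
  storage.setItem(KEY_NAME, key.trim());
}

function loadApiKey(storage = globalThis.localStorage) {
  return storage.getItem(KEY_NAME);
}

// Build the fetch() arguments for one chat turn.
// history is an array of { role, content } messages from earlier turns.
function buildChatRequest(
  apiKey,
  history,
  userText,
  model = "llama-3.1-sonar-small-128k-online"
) {
  return {
    url: PERPLEXITY_URL,
    options: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        model,
        messages: [...history, { role: "user", content: userText }],
      }),
    },
  };
}

// Send the turn and return the assistant's reply text.
// Assumes the response carries choices[0].message.content.
async function askPerplexity(apiKey, history, userText, fetchImpl = globalThis.fetch) {
  const { url, options } = buildChatRequest(apiKey, history, userText);
  const res = await fetchImpl(url, options);
  if (!res.ok) {
    throw new Error(`Perplexity request failed: HTTP ${res.status}`);
  }
  const data = await res.json();
  return data.choices[0].message.content;
}
```

Keeping request construction separate from the network call makes the payload easy to inspect. Because there is no backend, the key is sent straight from the browser, so anyone with access to that browser profile can read it from localStorage.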
Clarifying Questions
Answer these before I code:
Q1: Talk button style?
- Push-to-hold (PTT): press and hold to speak, release to send
- Click-to-toggle: click once to start listening, click again to stop
- Auto voice-activity detection (browser auto-stops when you stop speaking)
Q2: Robot camera visible?
- Yes: show the robot's camera feed (16:9 panel at top)
- No: conversation-only UI (simpler, faster)
Q3: Robot persona / system prompt?
- Generic assistant (no special persona)
- Specific persona: _______
Q4: Perplexity search depth?
- llama-3.1-sonar-small-128k-online: fast, search-enhanced (default)
- llama-3.1-sonar-large-128k-online: slower, more thorough
- llama-3.1-sonar-huge-128k-online: most powerful
Q5: Robot expressive motion?
- Yes: head/antenna animations for listening / thinking / speaking states
- No: keep robot still (pure conversation, no motion)
Default Answers (I'll use these if you don't reply)
- Q1: Auto voice-activity detection
- Q2: Yes, show camera
- Q3: Generic assistant
- Q4: sonar-small (fast)
- Q5: Yes, expressive motion
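One way to carry these defaults into the app is a frozen config object that any explicit answers override. This is a sketch only: the property names (`talkMode`, `showCamera`, `persona`, `model`, `expressiveMotion`), the mode strings, and `resolveConfig` are hypothetical. The model names are copied from Q4.

```javascript
// Model options copied from Q4.
const MODELS = Object.freeze([
  "llama-3.1-sonar-small-128k-online",
  "llama-3.1-sonar-large-128k-online",
  "llama-3.1-sonar-huge-128k-online",
]);

// Defaults for Q1 to Q5; property names and mode strings are hypothetical.
const DEFAULT_ANSWERS = Object.freeze({
  talkMode: "auto-vad", // Q1: auto voice-activity detection
  showCamera: true, // Q2: show the camera feed
  persona: null, // Q3: generic assistant, no system prompt
  model: MODELS[0], // Q4: sonar-small (fast)
  expressiveMotion: true, // Q5: animate head and antennas
});

// Merge user answers over the defaults, rejecting unknown options
// and model names that are not in the Q4 list.
function resolveConfig(answers = {}) {
  const config = { ...DEFAULT_ANSWERS };
  for (const [key, value] of Object.entries(answers)) {
    if (!(key in DEFAULT_ANSWERS)) {
      throw new Error(`Unknown option: ${key}`);
    }
    if (value !== undefined) {
      config[key] = value;
    }
  }
  if (!MODELS.includes(config.model)) {
    throw new Error(`Unknown model: ${config.model}`);
  }
  return config;
}
```

Calling `resolveConfig()` with no arguments yields the defaults listed above, while `resolveConfig({ showCamera: false })` changes only the camera setting.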
Deployment Steps (for later)
- hf auth login (one-time)
- hf repos create talk-with-perplexity --repo-type space --space-sdk static
- git clone https://huggingface.co/spaces/<username>/talk-with-perplexity
- Copy files → commit → push
- App live at https://<username>-talk-with-perplexity.static.hf.space/