Oliver Nitsche

Initial commit: talk_with_perplexity Reachy Mini JS app

aae9801 27 days ago

Plan: talk_with_perplexity

What I understand you want

A Hugging Face Space (static JS app) where a user can have a spoken conversation with their Reachy Mini robot, powered by the Perplexity AI API. The robot will respond expressively — moving its head and antennas — while listening, thinking, and speaking.

App Flow

1. Open Space URL
2. Sign in with Hugging Face (OAuth)
3. On first use: enter your Perplexity API key → saved to localStorage
4. Connect to signaling server → select robot → start session
5. Press TALK button
   └─ Listening  → Web Speech API (`SpeechRecognition`) transcribes voice → robot listens (antennas up)
   └─ Thinking   → text sent to Perplexity API → robot "thinks" (antennas wiggle)
   └─ Speaking   → Perplexity response read via SpeechSynthesis → robot nods head
6. Conversation history panel shows exchanges
7. Repeat
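
The listen / think / speak cycle in step 5 can be sketched as a small state machine. This is a minimal sketch, not the app's actual code: the `idle` state, the event names, the cue strings and the `onEnter` callback are all assumptions made for illustration, and no `reachy-mini.js` call is shown because its API is not described in this plan.

```javascript
// Allowed transitions. States follow the app flow; "idle" and the
// event names (talk, transcript, reply, ...) are assumed for this sketch.
const TRANSITIONS = {
  idle: { talk: "listening" },
  listening: { transcript: "thinking", cancel: "idle" },
  thinking: { reply: "speaking", error: "idle" },
  speaking: { done: "idle" },
};

// Motion cue per state, taken from the flow above; "rest" is assumed.
const MOTION_CUES = {
  idle: "rest",
  listening: "antennas-up",
  thinking: "antennas-wiggle",
  speaking: "head-nod",
};

// onEnter is a hypothetical hook: a real app would update the UI and
// drive the robot from it.
function createConversation(onEnter = () => {}) {
  let state = "idle";
  return {
    get state() {
      return state;
    },
    // Apply an event. Events that are not valid in the current state
    // are ignored, so a stray "reply" while listening changes nothing.
    send(event) {
      const next = TRANSITIONS[state][event];
      if (next) {
        state = next;
        onEnter(state, MOTION_CUES[state]);
      }
      return state;
    },
  };
}
```

Keeping the transitions in one table makes it harder for the TALK button, the Perplexity call and the speech synthesizer to get out of step with the robot's motion, since each of them only sends events and never sets the state directly.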

Technical Approach

| Layer | Implementation |
| --- | --- |
| Hosting | Static HF Space (no backend) |
| Auth | HF OAuth (built into the Space) |
| Robot control | `reachy-mini.js` v1.7.1 via jsDelivr |
| STT | Web Speech API (`SpeechRecognition`), browser-native and free |
| LLM | Perplexity `/chat/completions` with the `llama-3.1-sonar-small-128k-online` model |
| TTS | Web Speech API (`speechSynthesis`), browser-native and free |
| API key | User-entered in Settings drawer, stored in `localStorage` |
| Robot motion | Head/antenna animations during listen/think/speak states |
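
The LLM and API key rows could fit together roughly as follows. This is a sketch under assumptions: the storage key name `perplexity_api_key`, the system prompt text and the helper names are made up for illustration. The request shape is the OpenAI-style chat format that Perplexity's `/chat/completions` endpoint accepts; the model name is the one from this plan, and the list of available models changes over time, so it is worth checking against Perplexity's current documentation.

```javascript
const PERPLEXITY_URL = "https://api.perplexity.ai/chat/completions";
const STORAGE_KEY = "perplexity_api_key"; // hypothetical key name

// Read the saved key. `storage` is window.localStorage in the browser;
// any object with getItem() works, which keeps this testable.
function loadApiKey(storage) {
  const key = storage.getItem(STORAGE_KEY);
  return key && key.trim() ? key.trim() : null;
}

// Build the fetch() arguments for one conversation turn.
// `history` is assumed to alternate user / assistant messages.
function buildChatRequest(
  apiKey,
  history,
  userText,
  model = "llama-3.1-sonar-small-128k-online"
) {
  const messages = [
    // Assumed system prompt: short answers suit text-to-speech.
    { role: "system", content: "You are a helpful assistant. Keep answers brief." },
    ...history,
    { role: "user", content: userText },
  ];
  return {
    url: PERPLEXITY_URL,
    options: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}

// Send the request and return the assistant's reply text.
async function askPerplexity(apiKey, history, userText) {
  const { url, options } = buildChatRequest(apiKey, history, userText);
  const res = await fetch(url, options);
  if (!res.ok) throw new Error(`Perplexity request failed: HTTP ${res.status}`);
  const data = await res.json();
  return data.choices[0].message.content;
}
```

Because the Space has no backend, the key is sent straight from the browser, which is why the plan has each user supply their own key rather than shipping one with the app.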

Files: README.md (with the Space's YAML configuration), index.html, style.css

Clarifying Questions

Answer these before I code:

Q1: Talk button style?

Push-to-hold (PTT) — press and hold to speak, release to send
Click-to-toggle — click once to start listening, click again to stop
Auto voice-activity detection (browser auto-stops when you stop speaking)
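
To make the difference between the three options concrete, here is a rough sketch of how each could map button events onto a recognizer. The mode names and the `createTalkControl` helper are assumptions for illustration; the only thing required of `recognizer` is `start()` and `stop()`, which `SpeechRecognition` provides.

```javascript
// Return the button event handlers for a talk mode.
// Mode names ("push-to-talk", "toggle", "auto") are assumed labels.
function createTalkControl(mode, recognizer) {
  let listening = false;

  // Guarded, because calling start() on a SpeechRecognition instance
  // that is already running throws an InvalidStateError.
  const start = () => {
    if (!listening) {
      listening = true;
      recognizer.start();
    }
  };
  const stop = () => {
    if (listening) {
      listening = false;
      recognizer.stop();
    }
  };

  const buttonEvents = {
    "push-to-talk": { pointerdown: start, pointerup: stop },
    toggle: { click: () => (listening ? stop() : start()) },
    // In auto mode the button only starts; the browser ends recognition
    // after a pause when `continuous` is left at false.
    auto: { click: start },
  }[mode];
  if (!buttonEvents) throw new Error(`Unknown talk mode: ${mode}`);

  return {
    buttonEvents,
    // Call this from the recognizer's "end" event so the flag resets
    // when recognition stops on its own.
    onRecognitionEnd() {
      listening = false;
    },
  };
}

// Browser wiring, shown as comments because it needs a DOM:
// const Recognition = window.SpeechRecognition || window.webkitSpeechRecognition;
// const recognizer = new Recognition();
// const control = createTalkControl("auto", recognizer);
// for (const [name, fn] of Object.entries(control.buttonEvents)) {
//   button.addEventListener(name, fn);
// }
// recognizer.addEventListener("end", control.onRecognitionEnd);
```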

Q2: Robot camera visible?

Yes — show the robot's camera feed (16:9 panel at top)
No — conversation-only UI (simpler, faster)

Q3: Robot persona / system prompt?

Generic assistant (no special persona)
Specific persona: _______

Q4: Perplexity search depth?

llama-3.1-sonar-small-128k-online — fast, search-enhanced (default)
llama-3.1-sonar-large-128k-online — slower, more thorough
llama-3.1-sonar-huge-128k-online — most powerful

Q5: Robot expressive motion?

Yes — head/antenna animations for listening / thinking / speaking states
No — keep robot still (pure conversation, no motion)

Default Answers (I'll use these if you don't reply)

Q1: Auto voice-activity detection
Q2: Yes, show camera
Q3: Generic assistant
Q4: `llama-3.1-sonar-small-128k-online` (fast)
Q5: Yes, expressive motion
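
One way these defaults could be carried into the code is a frozen settings object that any replies are merged over. The property names and the `resolveAnswers` helper are hypothetical; only the default values come from the list above.

```javascript
// Defaults for Q1..Q5. Property names are assumed for this sketch.
const DEFAULT_ANSWERS = Object.freeze({
  talkMode: "auto", // Q1
  showCamera: true, // Q2
  persona: "generic assistant", // Q3
  model: "llama-3.1-sonar-small-128k-online", // Q4
  expressiveMotion: true, // Q5
});

// Merge replies over the defaults. Empty replies keep the default,
// and unknown keys are rejected so typos do not pass silently.
function resolveAnswers(replies = {}) {
  const resolved = { ...DEFAULT_ANSWERS };
  for (const [key, value] of Object.entries(replies)) {
    if (!(key in DEFAULT_ANSWERS)) throw new Error(`Unknown setting: ${key}`);
    if (value !== undefined && value !== null && value !== "") {
      resolved[key] = value;
    }
  }
  return resolved;
}
```

For example, `resolveAnswers({ showCamera: false })` keeps every default except the camera panel.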

Deployment Steps (for later)

1. `hf auth login` (one-time)
2. `hf repos create talk-with-perplexity --repo-type space --space-sdk static`
3. `git clone https://huggingface.co/spaces/<username>/talk-with-perplexity`
4. Copy files → commit → push
5. App live at `https://<username>-talk-with-perplexity.static.hf.space/`