talk-with-perplexity / plan.md

Oliver Nitsche

Initial commit: talk_with_perplexity Reachy Mini JS app

aae9801 27 days ago


	# Plan: talk_with_perplexity

	## What I understand you want

	A Hugging Face Space (static JS app) where a user can have a spoken conversation with their Reachy Mini robot, powered by the Perplexity AI API. The robot will respond expressively — moving its head and antennas — while listening, thinking, and speaking.

	---

	## App Flow

	```
	1. Open Space URL
	2. Sign in with HuggingFace (OAuth)
	3. On first use: enter your Perplexity API key → saved to localStorage
	4. Connect to signaling server → select robot → start session
	5. Press TALK button
	   ├─ Listening → Web SpeechRecognition transcribes voice → robot listens (antennas up)
	   ├─ Thinking → text sent to Perplexity API → robot "thinks" (antennas wiggle)
	   └─ Speaking → Perplexity response read via SpeechSynthesis → robot nods head
	6. Conversation history panel shows exchanges
	7. Repeat
	```
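
The listen → think → speak cycle in step 5 could be modeled as a small state machine. This is only a sketch of one possible shape: the state names, event names, and `MOTION_CUES` labels are assumptions made for illustration, and the actual `reachy-mini.js` motion calls are deliberately left out.

```javascript
// Hypothetical states and events for the TALK cycle in step 5.
const TRANSITIONS = {
  idle:      { talkPressed: "listening" },
  listening: { transcriptReady: "thinking", recognitionFailed: "idle" },
  thinking:  { replyReceived: "speaking", requestFailed: "idle" },
  speaking:  { utteranceEnded: "idle" },
};

// Motion cue per state, following the flow above. These are placeholder
// labels, not reachy-mini.js function names.
const MOTION_CUES = {
  idle: "rest",
  listening: "antennas-up",
  thinking: "antennas-wiggle",
  speaking: "head-nod",
};

// Returns the next state, or the current state if the event does not apply.
function nextState(state, event) {
  const next = TRANSITIONS[state] && TRANSITIONS[state][event];
  return next || state;
}

// Holds the current state and reports each state change to `onEnter`.
function createConversation(onEnter = () => {}) {
  let state = "idle";
  return {
    get state() {
      return state;
    },
    dispatch(event) {
      const next = nextState(state, event);
      if (next !== state) {
        state = next;
        onEnter(state, MOTION_CUES[state]);
      }
      return state;
    },
  };
}
```

Events that do not belong to the current state are ignored, so a stray `replyReceived` while the robot is still listening cannot skip the thinking step.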

	---

	## Technical Approach

	| Layer | Implementation |
	|---------------|----------------|
	| Hosting | Static HF Space (no backend) |
	| Auth | HF OAuth (built into the Space) |
	| Robot control | `reachy-mini.js` v1.7.1 via jsDelivr |
	| STT | Web Speech API (`SpeechRecognition`) — browser-native, free |
	| LLM | Perplexity `/chat/completions` — `llama-3.1-sonar-small-128k-online` model |
	| TTS | Web Speech API (`speechSynthesis`) — browser-native, free |
	| API key | User-entered in Settings drawer, stored in `localStorage` |
	| Robot motion | Head/antenna animations during listen/think/speak states |
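
For the LLM and API key rows, a minimal sketch of how the key could be stored and the request assembled. The helper names, the `STORAGE_KEY` value, and the system prompt are assumptions made for illustration. The model name is the one from this plan and should be checked against Perplexity's current model list before use. The history is assumed to already alternate between `user` and `assistant` messages.

```javascript
const PERPLEXITY_URL = "https://api.perplexity.ai/chat/completions";
const STORAGE_KEY = "perplexity_api_key"; // hypothetical localStorage key name

// `storage` is anything with getItem/setItem, e.g. window.localStorage.
function saveApiKey(storage, key) {
  storage.setItem(STORAGE_KEY, key.trim());
}

function loadApiKey(storage) {
  return storage.getItem(STORAGE_KEY) || null;
}

// Builds the fetch arguments without sending anything, so it is easy to inspect.
function buildChatRequest(apiKey, history, userText,
                          model = "llama-3.1-sonar-small-128k-online") {
  if (!apiKey) throw new Error("Missing Perplexity API key");
  const messages = [
    // Assumed prompt for the "generic assistant" option.
    { role: "system", content: "You are a helpful assistant. Keep answers short enough to be spoken aloud." },
    ...history,
    { role: "user", content: userText },
  ];
  return {
    url: PERPLEXITY_URL,
    options: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}

// Sends the request and returns the assistant's text.
// `fetchImpl` can be swapped for a stub in tests.
async function askPerplexity(apiKey, history, userText, fetchImpl = fetch) {
  const { url, options } = buildChatRequest(apiKey, history, userText);
  const res = await fetchImpl(url, options);
  if (!res.ok) throw new Error(`Perplexity request failed: ${res.status}`);
  const data = await res.json();
  return data.choices[0].message.content;
}
```

In the browser this would be called as `askPerplexity(loadApiKey(localStorage), history, transcript)`. Because there is no backend, the key is sent directly from the user's browser.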

	Files: `README.md` (Space configuration as YAML front matter), `index.html`, `style.css`
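
For the TTS row in the table above, the reply probably needs light cleanup before it reaches `speechSynthesis`. Search-backed answers may contain citation markers such as `[1]` and Markdown emphasis, which a speech engine would read aloud. The sketch below is one way to handle that; the function names are hypothetical and the sentence splitter is deliberately naive (it will also split on decimal points).

```javascript
// Removes citation markers and basic Markdown so the text reads naturally aloud.
function toSpeakableText(reply) {
  return reply
    .replace(/\[\d+\]/g, "")          // citation markers like [1] or [2][3]
    .replace(/[*`#]+/g, "")           // emphasis, code ticks, heading marks
    .replace(/\s+([.,!?;:])/g, "$1")  // space left behind before punctuation
    .replace(/\s+/g, " ")             // collapse runs of whitespace
    .trim();
}

// Splits text into sentence-sized chunks so each utterance stays short.
function splitSentences(text) {
  const parts = text.match(/[^.!?]+[.!?]*/g) ?? [];
  return parts.map((s) => s.trim()).filter(Boolean);
}

// Browser only: queues one utterance per sentence on the given synthesizer.
function speak(reply, synth = window.speechSynthesis) {
  for (const sentence of splitSentences(toSpeakableText(reply))) {
    synth.speak(new SpeechSynthesisUtterance(sentence));
  }
}
```

Queuing one utterance per sentence also gives a natural hook for motion: the `end` event of the last utterance is where the robot could stop nodding.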

	---

	## Clarifying Questions

	Answer these before I code:

	Q1: Talk button style?
	- [ ] Push-to-talk (PTT) — press and hold to speak, release to send
	- [ ] Click-to-toggle — click once to start listening, click again to stop
	- [ ] Auto voice-activity detection (browser auto-stops when you stop speaking)

	Q2: Robot camera visible?
	- [ ] Yes — show the robot's camera feed (16:9 panel at top)
	- [ ] No — conversation-only UI (simpler, faster)

	Q3: Robot persona / system prompt?
	- [ ] Generic assistant (no special persona)
	- [ ] Specific persona: _______

	Q4: Perplexity search depth?
	- [ ] `llama-3.1-sonar-small-128k-online` — fast, search-enhanced (default)
	- [ ] `llama-3.1-sonar-large-128k-online` — slower, more thorough
	- [ ] `llama-3.1-sonar-huge-128k-online` — most powerful

	Q5: Robot expressive motion?
	- [ ] Yes — head/antenna animations for listening / thinking / speaking states
	- [ ] No — keep robot still (pure conversation, no motion)
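
To make the Q1 choice concrete: the three talk-button styles mostly differ in who ends the recognition session. Below is a rough sketch, assuming a `SpeechRecognition`-like object is passed in (Chrome exposes the constructor as `webkitSpeechRecognition`). The mode names `"ptt"`, `"toggle"`, and `"vad"` and the `bindTalkButton` helper are hypothetical.

```javascript
// mode: "ptt" | "toggle" | "vad" (hypothetical names for the three Q1 options)
function bindTalkButton(recognition, button, mode) {
  let active = false;

  // With continuous = false the browser ends the session after one utterance,
  // which gives the "auto" behaviour. PTT and toggle keep listening until
  // the user stops the session.
  recognition.continuous = mode !== "vad";
  recognition.interimResults = false;

  const start = () => {
    if (!active) {
      active = true;
      recognition.start();
    }
  };
  const stop = () => {
    if (active) {
      active = false;
      recognition.stop();
    }
  };

  // The browser may end the session on its own (silence, error),
  // so keep the flag in sync with the "end" event.
  recognition.addEventListener("end", () => {
    active = false;
  });

  if (mode === "ptt") {
    button.addEventListener("pointerdown", start);
    button.addEventListener("pointerup", stop);
  } else if (mode === "toggle") {
    button.addEventListener("click", () => (active ? stop() : start()));
  } else {
    button.addEventListener("click", start);
  }
}
```

In the page this might be wired up as `bindTalkButton(new webkitSpeechRecognition(), talkButton, "vad")`, with the transcript read from the recognition object's `result` event.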

	---

	## Default Answers (I'll use these if you don't reply)

	- Q1: Auto voice-activity detection
	- Q2: Yes, show camera
	- Q3: Generic assistant
	- Q4: `llama-3.1-sonar-small-128k-online` (fast)
	- Q5: Yes, expressive motion
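
One way these defaults could be carried into the app is a single options object that any answers override. The option keys (`talkMode`, `showCamera`, and so on) and the `resolveOptions` helper are hypothetical names, not part of any existing code.

```javascript
// Defaults mirror the list above; keys and values are hypothetical names.
const DEFAULT_OPTIONS = Object.freeze({
  talkMode: "vad",                            // Q1: auto voice-activity detection
  showCamera: true,                           // Q2
  persona: null,                              // Q3: null means generic assistant
  model: "llama-3.1-sonar-small-128k-online", // Q4
  expressiveMotion: true,                     // Q5
});

// Merges answers over the defaults. Unanswered questions (undefined or null)
// keep their default; unknown keys are rejected to catch typos early.
function resolveOptions(answers = {}) {
  const resolved = { ...DEFAULT_OPTIONS };
  for (const [key, value] of Object.entries(answers)) {
    if (!(key in DEFAULT_OPTIONS)) throw new Error(`Unknown option: ${key}`);
    if (value !== undefined && value !== null) resolved[key] = value;
  }
  return resolved;
}
```

For example, `resolveOptions({ showCamera: false })` would keep every default except the camera panel.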

	---

	## Deployment Steps (for later)

	1. `hf auth login` (one-time)
	2. `hf repos create talk-with-perplexity --repo-type space --space-sdk static`
	3. `git clone https://huggingface.co/spaces/<username>/talk-with-perplexity`
	4. Copy files → commit → push
	5. App live at `https://<username>-talk-with-perplexity.static.hf.space/`