Commit 9939b9d · committed by MSG
Parent(s): 196a48f
Feat/sunday sprint 1 (#14)
* multilingual lessons
* language page wip
* language page wip lesson
* test teacher
* test teacher lessons language model
- .cursor/plans/multilingual_coach_cohere_eed97371.plan.md +274 -0
- .env.example +9 -3
- README.md +3 -3
- USAGE.md +37 -6
- apps/gradio-space/README.md +18 -9
- apps/gradio-space/src/gradio_space/api/studio.py +223 -10
- apps/gradio-space/static/studio/index.html +67 -88
- apps/gradio-space/static/studio/studio.css +116 -47
- apps/gradio-space/static/studio/studio.js +299 -218
- libs/echocoach/src/echocoach/config.py +22 -0
- libs/echocoach/src/echocoach/pipeline.py +7 -2
- libs/echocoach/src/echocoach/prompts.py +63 -5
- libs/echocoach/src/echocoach/teacher_voice.py +31 -12
- libs/echocoach/tests/test_teacher_voice.py +37 -2
- models.yaml +24 -0
- voice_models.yaml +5 -3
.cursor/plans/multilingual_coach_cohere_eed97371.plan.md
ADDED
|
@@ -0,0 +1,274 @@
|
|
| 1 |
+
---
|
| 2 |
+
name: Multilingual Coach Cohere
|
| 3 |
+
overview: Add a dedicated Studio tab — Language lessons — that unifies multilingual text chat, audio upload, and realtime-style voice in/out (Cohere Transcribe + Tiny Aya + streaming TTS) on one page, replacing the split Voice / pitch-analysis UX for the hackathon demo.
|
| 4 |
+
todos:
|
| 5 |
+
- id: aya-presets
|
| 6 |
+
content: Add tiny-aya-global/water/fire/earth to models.yaml; set voice_models.yaml coach_model default; verify TransformersBackend.chat()
|
| 7 |
+
status: completed
|
| 8 |
+
- id: locale-prompts
|
| 9 |
+
content: Add language-lesson system prompt + language_instruction() for lesson/explain modes; wire language into build_teacher_messages() and RAG path
|
| 10 |
+
status: completed
|
| 11 |
+
- id: language-lessons-page
|
| 12 |
+
content: "New Studio nav tab Language lessons: language selector, unified composer (text + mic + upload), chat with inline audio, auto VoiceOut via realtime TTS"
|
| 13 |
+
status: completed
|
| 14 |
+
- id: language-lessons-api
|
| 15 |
+
content: Extend teacher_voice_* API with auto_voiceout flag; reuse existing turn pipeline; optional speak-on-reply default for Language lessons view
|
| 16 |
+
status: completed
|
| 17 |
+
- id: cohere-space-defaults
|
| 18 |
+
content: "Document and set Space secrets: ECHOCOACH_ASR_PRESET=cohere-transcribe, ECHOCOACH_COACH_MODEL=tiny-aya-global, ECHOCOACH_REALTIME_TTS_PRESET=vibevoice-realtime-0.5b"
|
| 19 |
+
status: completed
|
| 20 |
+
- id: echocoach-i18n-polish
|
| 21 |
+
content: Move Deep pitch analysis to collapsed Advanced or Classic-only; gate English-only filler metrics; fix el Piper voice mapping
|
| 22 |
+
status: completed
|
| 23 |
+
- id: demo-docs
|
| 24 |
+
content: "Update README judge script: single Language lessons tab demo (14-lang voice + 70-lang text); Cohere Labs partner narrative"
|
| 25 |
+
status: completed
|
| 26 |
+
isProject: false
|
| 27 |
+
---
|
| 28 |
+
|
| 29 |
+
# Language lessons — one tab, text + audio + realtime voice (Cohere stack)
|
| 30 |
+
|
| 31 |
+
## Goal
|
| 32 |
+
|
| 33 |
+
Replace the current split **Voice** experience (TeacherVoice chat + buried EchoCoach pitch panel) with **one primary Studio page: Language lessons** — a multilingual learning coach where the user can interact the same way throughout:
|
| 34 |
+
|
| 35 |
+
| Input | Output |
|
| 36 |
+
|-------|--------|
|
| 37 |
+
| **Text** — type a question or lesson prompt | **Text** — chat bubbles in target language |
|
| 38 |
+
| **Mic** — hold / push-to-talk recording | **Audio** — auto-play teacher reply (realtime TTS when available) |
|
| 39 |
+
| **Upload** — `.wav` / `.mp3` clip | **Optional** — replay last reply, toggle auto-speak |
|
| 40 |
+
|
| 41 |
+
Backend stays **turn-based** (speak → wait → hear reply), but the page should *feel* realtime: mic stops → transcript appears → first audio chunk plays quickly via VibeVoice Realtime, with Piper fallback.
|
| 42 |
+
|
| 43 |
+
Partner stack ([Cohere Labs guide](https://build-small-hackathon-field-guide.hf.space/partners/cohere)): **Cohere Transcribe** (speech in) + **Tiny Aya** (coach brain, 70 langs) + **Piper / VibeVoice** (speech out).
|
| 44 |
+
|
| 45 |
+
---
|
| 46 |
+
|
| 47 |
+
## What you already have (reuse, don’t rewrite)
|
| 48 |
+
|
| 49 |
+
| Building block | Location | Reuse for Language lessons |
|
| 50 |
+
|---|---|---|
|
| 51 |
+
| Multi-turn coach pipeline | [`libs/echocoach/src/echocoach/teacher_voice.py`](libs/echocoach/src/echocoach/teacher_voice.py) | Same `run_teacher_voice_turn` / `run_teacher_voice_text_turn` |
|
| 52 |
+
| Lesson + explain prompts | [`libs/echocoach/src/echocoach/prompts.py`](libs/echocoach/src/echocoach/prompts.py) | `lesson` + `explain` modes (drop pitch from this page) |
|
| 53 |
+
| 14-language ASR/TTS config | [`voice_models.yaml`](voice_models.yaml) | Language dropdown + Cohere ASR + Piper voices |
|
| 54 |
+
| Cohere Transcribe backend | [`libs/echocoach/src/echocoach/asr/cohere.py`](libs/echocoach/src/echocoach/asr/cohere.py) | Default ASR on Space |
|
| 55 |
+
| Streaming TTS | [`libs/echocoach/src/echocoach/tts/vibevoice.py`](libs/echocoach/src/echocoach/tts/vibevoice.py) + `voiceout.py` | `chunk_first=True` already used for TeacherVoice |
|
| 56 |
+
| Studio API | [`apps/gradio-space/src/gradio_space/api/studio.py`](apps/gradio-space/src/gradio_space/api/studio.py) | `teacher_voice_turn`, `teacher_voice_audio_turn`, `voice_presets` |
|
| 57 |
+
| RAG grounding | ResearchMind via `teacher_voice.py` | Optional “Answer from my sources” toggle |
|
| 58 |
+
| Recording helpers | [`studio.js`](apps/gradio-space/static/studio/studio.js) `recordingTarget`, mic start/stop | Extend for hold-to-talk on Language lessons page |
|
| 59 |
+
|
| 60 |
+
**Not on this page:** EchoCoach one-shot pitch JSON report → move to **Classic** `/classic` EchoCoach tab only, or a collapsed “Pitch analysis (advanced)” link so Language lessons stays focused on learning.
|
| 61 |
+
|
| 62 |
+
---
|
| 63 |
+
|
| 64 |
+
## Page design — Language lessons tab
|
| 65 |
+
|
| 66 |
+
### Navigation
|
| 67 |
+
|
| 68 |
+
In [`apps/gradio-space/static/studio/index.html`](apps/gradio-space/static/studio/index.html):
|
| 69 |
+
|
| 70 |
+
- Add/rename sidebar item: **`Language lessons`** (`data-view="language-lessons"`) with icon `translate` or `school`.
|
| 71 |
+
- Demote current **Voice** nav (pitch + mixed modes) → remove from primary nav, or keep **Voice** as alias redirecting to Language lessons for one release cycle.
|
| 72 |
+
- Classic `/classic` keeps full TeacherVoice + EchoCoach tabs unchanged.
|
| 73 |
+
|
| 74 |
+
### Layout (single page)
|
| 75 |
+
|
| 76 |
+
```text
|
| 77 |
+
┌─────────────────────────────────────────────────────────────┐
|
| 78 |
+
│ Language lessons │
|
| 79 |
+
│ Learn in your language — text, voice, or upload audio │
|
| 80 |
+
├──────────────┬──────────────────────────────────────────────┤
|
| 81 |
+
│ LEFT RAIL │ MAIN — conversation │
|
| 82 |
+
│ │ │
|
| 83 |
+
│ Target lang ▼│ [User bubble — text or transcript] │
|
| 84 |
+
│ Coach model │ [Teacher bubble — text + inline ▶ audio] │
|
| 85 |
+
│ (Aya Global)│ ... │
|
| 86 |
+
│ │ │
|
| 87 |
+
│ Lesson topic │ ── UNIFIED COMPOSER ────────────────────── │
|
| 88 |
+
│ │ [ Text area — always visible ] │
|
| 89 |
+
│ ☑ Use sources│ [ 🎤 Hold to speak ] [ 📎 Upload audio ] │
|
| 90 |
+
│ │ [ Send ] ☑ Auto-speak replies │
|
| 91 |
+
│ Add sources │ Status: Listening… / Transcribing… / … │
|
| 92 |
+
│ (details) │ │
|
| 93 |
+
└──────────────┴──────────────────────────────────────────────┘
|
| 94 |
+
```
|
| 95 |
+
|
| 96 |
+
**Left rail controls**
|
| 97 |
+
|
| 98 |
+
- **Target language** — required; populated from `voice_presets.languages` (14 voice langs).
|
| 99 |
+
- **Coach variant** (optional Advanced): Auto regional → Tiny Aya Global / Water / Fire / Earth.
|
| 100 |
+
- **Lesson topic** — defaults to workspace topic; grounds lesson mode.
|
| 101 |
+
- **Use indexed sources** — same as current `#use-rag`; applies to explain + lesson.
|
| 102 |
+
- **Add sources** — reuse voice-rail ingest (discover, URLs, PDF) or link to Research view.
|
| 103 |
+
|
| 104 |
+
**Main conversation**
|
| 105 |
+
|
| 106 |
+
- Message format: user messages show typed text or a “🎤 transcript”; assistant messages show the reply text plus an embedded `<audio controls autoplay>` element when the VoiceOut path is returned.
|
| 107 |
+
- Empty state copy: “Choose a language, then type, speak, or upload audio to start your lesson.”
|
| 108 |
+
|
| 109 |
+
**Unified composer (one place for all input modes)**
|
| 110 |
+
|
| 111 |
+
1. **Text** — textarea + **Send** → `teacher_voice_turn` with `mode=lesson` (default) or toggle **Explain** vs **Lesson coach** (two small pills, not three modes).
|
| 112 |
+
2. **Mic** — **Hold to speak** (mousedown/touchstart → record, release → stop → auto `teacher_voice_audio_turn`). Reuse existing `recordingTarget` pattern; set `state.recordingTarget = "language-lessons"`.
|
| 113 |
+
3. **Upload** — file input → preview waveform/name → **Send audio** or auto-send on select.
|
| 114 |
+
4. **Auto-speak replies** — checkbox default **on**; passes through to API so server always synthesizes TTS (already default in pipeline when `synthesize_voice_reply` runs).
|
| 115 |
+
|
| 116 |
+
**Realtime voice output behavior**
|
| 117 |
+
|
| 118 |
+
- Use `ECHOCOACH_REALTIME_TTS_PRESET=vibevoice-realtime-0.5b` for Language lessons page (14 langs experimental on VibeVoice; fallback to Piper per lang).
|
| 119 |
+
- Frontend: on response, `autoplay` first audio element; show “Speaking…” while playing.
|
| 120 |
+
- Honest scope: **not** full-duplex WebSocket; latency target is “release mic → hear teacher within ~1–3s on GPU” via chunked TTS already in `voiceout.py`.
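
The realtime-first, Piper-second behavior described above can be sketched as an ordered fallback chain. This is a minimal illustration, not the code in `voiceout.py`: the `Synthesizer` callable type, the engine names, and the returned tuple shape are all assumptions made for the example.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical type: (text, language) -> path of the synthesized audio file.
Synthesizer = Callable[[str, str], str]


def synthesize_with_fallback(
    text: str,
    language: str,
    engines: Sequence[Tuple[str, Synthesizer]],
) -> Tuple[Optional[str], Optional[str]]:
    """Try each engine in order and return (engine_name, audio_path).

    Returns (None, None) when every engine fails, so the caller can show
    a text-only reply instead of raising.
    """
    for name, synth in engines:
        try:
            return name, synth(text, language)
        except Exception:  # broad on purpose: any engine failure means "try the next one"
            continue
    return None, None
```

A caller would pass the realtime engine first and Piper second, and treat `(None, None)` as the signal to show the text-only banner.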
|
| 121 |
+
|
| 122 |
+
**70-language text demo (no voice required)**
|
| 123 |
+
|
| 124 |
+
- Language dropdown includes **“Other (text only)”** free-text ISO/code field OR a second “LLM language” field for codes outside Piper set (e.g. `hi`, `sw`).
|
| 125 |
+
- Helper: “Voice in/out: 14 languages · Coach understands 70+ with Tiny Aya.”
|
| 126 |
+
- When language has no Piper voice, show text reply only + banner “VoiceOut not available for this language.”
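
One way to express the voice versus text-only decision is a small helper that the API could call before synthesis. The sketch below is an assumption-laden illustration: `VOICE_LANGUAGES` is a placeholder subset (the real list lives in `voice_models.yaml`), and `plan_reply_output` is a hypothetical name.

```python
from typing import Any, Dict, FrozenSet

# Placeholder subset for illustration only; the real 14 codes come from voice_models.yaml.
VOICE_LANGUAGES: FrozenSet[str] = frozenset({"en", "fr", "es", "el"})

TEXT_ONLY_BANNER = "VoiceOut not available for this language."


def plan_reply_output(
    language: str,
    auto_speak: bool,
    voice_languages: FrozenSet[str] = VOICE_LANGUAGES,
) -> Dict[str, Any]:
    """Decide whether to synthesize audio and which banner (if any) to show."""
    code = language.strip().lower()
    has_voice = code in voice_languages
    return {
        "synthesize": auto_speak and has_voice,
        "banner": None if has_voice else TEXT_ONLY_BANNER,
    }
```

With this shape, a code such as `hi` that is outside the voice set yields a text reply plus the banner, regardless of the auto-speak checkbox.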
|
| 127 |
+
|
| 128 |
+
---
|
| 129 |
+
|
| 130 |
+
## Target architecture
|
| 131 |
+
|
| 132 |
+
```mermaid
|
| 133 |
+
flowchart TB
|
| 134 |
+
subgraph page [Language lessons page]
|
| 135 |
+
TextIn[Text composer]
|
| 136 |
+
MicIn[Hold-to-talk mic]
|
| 137 |
+
FileIn[Audio upload]
|
| 138 |
+
end
|
| 139 |
+
|
| 140 |
+
TextIn --> Turn[teacher_voice turn]
|
| 141 |
+
MicIn --> ASR[Cohere Transcribe 2B]
|
| 142 |
+
FileIn --> ASR
|
| 143 |
+
ASR --> Turn
|
| 144 |
+
Turn --> Aya[Tiny Aya Global or regional]
|
| 145 |
+
RAG[ResearchMind RAG] --> Aya
|
| 146 |
+
Aya --> Reply[Lesson reply text]
|
| 147 |
+
Reply --> TTS[VibeVoice Realtime or Piper]
|
| 148 |
+
TTS --> AutoPlay[Inline autoplay audio]
|
| 149 |
+
Reply --> Chat[Chat bubbles]
|
| 150 |
+
```
|
| 151 |
+
|
| 152 |
+
---
|
| 153 |
+
|
| 154 |
+
## Gaps to close (updated)
|
| 155 |
+
|
| 156 |
+
1. **No dedicated Language lessons view** — today everything lives under generic **Voice** with pitch mode + EchoCoach panel ([`index.html` L303–419](apps/gradio-space/static/studio/index.html)).
|
| 157 |
+
2. **Language not wired in Studio JS** — hardcoded `default_language` in [`studio.js`](apps/gradio-space/static/studio/studio.js) (~L1187).
|
| 158 |
+
3. **Split send paths** — “Send text” vs “Send voice turn” should become one flow with auto-routing by input type.
|
| 159 |
+
4. **Manual replay buttons** — “Speak full reply” should be default-on for Language lessons; keep replay as secondary.
|
| 160 |
+
5. **Coach LLM** — still MiniCPM5 1B; need Tiny Aya presets for multilingual quality.
|
| 161 |
+
6. **Default ASR** — Whisper tiny, not Cohere Transcribe.
|
| 162 |
+
7. **Pitch/EchoCoach clutter** — remove from primary Language lessons UX.
|
| 163 |
+
|
| 164 |
+
---
|
| 165 |
+
|
| 166 |
+
## Implementation plan
|
| 167 |
+
|
| 168 |
+
### 1. Backend — Tiny Aya + locale prompts (unchanged core)
|
| 169 |
+
|
| 170 |
+
Add to [`models.yaml`](models.yaml):
|
| 171 |
+
|
| 172 |
+
| Preset | HF model_id |
|
| 173 |
+
|--------|-------------|
|
| 174 |
+
| `tiny-aya-global` | `CohereLabs/tiny-aya-global` |
|
| 175 |
+
| `tiny-aya-water` | `CohereLabs/tiny-aya-water` |
|
| 176 |
+
| `tiny-aya-fire` | `CohereLabs/tiny-aya-fire` |
|
| 177 |
+
| `tiny-aya-earth` | `CohereLabs/tiny-aya-earth` |
|
| 178 |
+
|
| 179 |
+
Set `voice_models.yaml` → `defaults.coach_model: tiny-aya-global`.
|
| 180 |
+
|
| 181 |
+
In [`prompts.py`](libs/echocoach/src/echocoach/prompts.py):
|
| 182 |
+
|
| 183 |
+
- Add `LANGUAGE_LESSON_SYSTEM` (or extend `LESSON_SYSTEM` / `EXPLAIN_SYSTEM`) with explicit target-language instruction.
|
| 184 |
+
- Add `language_instruction(language: str) -> str` injected in `build_teacher_messages()`.
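
A minimal sketch of how `language_instruction()` might be injected, assuming a small code-to-name table. The table contents, the instruction wording, and the simplified `build_teacher_messages()` signature here are illustrative assumptions, not the actual contents of `prompts.py`.

```python
from typing import Dict, List

# Hypothetical lookup table; the real language list is configured elsewhere.
LANGUAGE_NAMES: Dict[str, str] = {"en": "English", "fr": "French", "es": "Spanish", "el": "Greek"}


def language_instruction(language: str) -> str:
    """Return a one-line instruction telling the coach which language to reply in."""
    code = language.strip().lower()
    name = LANGUAGE_NAMES.get(code)
    if name is None:
        # Unknown code: pass it through so text-only languages still get an instruction.
        return f"Reply in the language with code '{code}'."
    return f"Reply in {name} ({code})."


def build_teacher_messages(user_text: str, *, system: str, language: str = "en") -> List[Dict[str, str]]:
    """Simplified stand-in: append the language instruction to the system prompt."""
    return [
        {"role": "system", "content": f"{system}\n{language_instruction(language)}"},
        {"role": "user", "content": user_text},
    ]
```

Keeping the instruction in one function means the lesson, explain, and RAG paths can share it instead of each embedding its own wording.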
|
| 185 |
+
|
| 186 |
+
Optional `resolve_aya_preset(language)` for Water/Fire/Earth when user picks “Auto regional”.
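
A possible shape for `resolve_aya_preset()`, offered as a sketch only. The regional table is an assumption: the single Hindi to Fire entry mirrors the demo script in this plan, and any further entries would need to be checked against the Tiny Aya model cards. The `variant` parameter is hypothetical.

```python
from typing import Dict

# Assumed mapping for illustration; only "hi" -> Fire is suggested by this plan's demo script.
REGIONAL_PRESETS: Dict[str, str] = {"hi": "tiny-aya-fire"}
DEFAULT_PRESET = "tiny-aya-global"


def resolve_aya_preset(language: str, variant: str = "auto") -> str:
    """Pick a models.yaml preset key for the coach.

    variant="auto" consults the regional table; any other value is treated as
    an explicit preset key chosen in the Advanced control.
    """
    if variant != "auto":
        return variant
    return REGIONAL_PRESETS.get(language.strip().lower(), DEFAULT_PRESET)
```

Falling back to `tiny-aya-global` keeps the behavior predictable for any language the table does not list.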
|
| 187 |
+
|
| 188 |
+
### 2. Backend — Language lessons API surface
|
| 189 |
+
|
| 190 |
+
In [`studio.py`](apps/gradio-space/src/gradio_space/api/studio.py):
|
| 191 |
+
|
| 192 |
+
- Add a thin wrapper `api_language_lesson_turn(...)`, or alias the existing endpoints with `mode` defaulting to `lesson`.
|
| 193 |
+
- Parameters: `message`, `audio_path`, `language`, `topic`, `use_rag`, `history`, `mode` (`lesson`|`explain`), `auto_voiceout=True`, `coach_model` optional override.
|
| 194 |
+
- Ensure `language` is always passed through to ASR + TTS + prompts (no default-only path from frontend).
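
The wrapper described in the bullets above could route by input type roughly as follows. This is a sketch under stated assumptions: `LessonTurnRequest`, the handler signatures, and the error dictionary shape are hypothetical and do not reproduce the real `studio.py` helpers.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional

# Hypothetical handler type; the real turn functions take more arguments.
TurnHandler = Callable[..., Dict[str, Any]]


@dataclass
class LessonTurnRequest:
    language: str                       # required: no default-only path from the frontend
    message: str = ""
    audio_path: Optional[str] = None
    topic: str = ""
    use_rag: bool = False
    mode: str = "lesson"                # "lesson" or "explain"
    auto_voiceout: bool = True
    history: List[Dict[str, str]] = field(default_factory=list)


def language_lesson_turn(req: LessonTurnRequest, text_turn: TurnHandler, audio_turn: TurnHandler) -> Dict[str, Any]:
    """Validate the request, then route to the text or audio handler."""
    if req.mode not in ("lesson", "explain"):
        return {"ok": False, "error": f"unsupported mode: {req.mode}"}
    if not req.language.strip():
        return {"ok": False, "error": "language is required"}
    common = dict(language=req.language, topic=req.topic, use_rag=req.use_rag,
                  mode=req.mode, auto_voiceout=req.auto_voiceout, history=req.history)
    if req.message.strip():             # typed text wins when both inputs are present
        return text_turn(message=req.message.strip(), **common)
    if req.audio_path:
        return audio_turn(audio_path=req.audio_path, **common)
    return {"ok": False, "error": "provide a message or an audio clip"}
```

Passing the handlers in as arguments keeps the routing testable without loading any model.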
|
| 195 |
+
|
| 196 |
+
Register the view in the Studio HTML boot sequence (`initLanguageLessons()`, parallel to `initVoicePresets()`).
|
| 197 |
+
|
| 198 |
+
### 3. Frontend — new Language lessons page
|
| 199 |
+
|
| 200 |
+
Files: [`studio_html.py`](apps/gradio-space/src/gradio_space/ui/studio_html.py) (fragment), [`index.html`](apps/gradio-space/static/studio/index.html), [`studio.js`](apps/gradio-space/static/studio/studio.js), [`studio.css`](apps/gradio-space/static/studio/studio.css).
|
| 201 |
+
|
| 202 |
+
- New `<section class="col col-studio" data-view-panel="language-lessons">` with layout above.
|
| 203 |
+
- JS module: `state.languageLesson = { language, mode, autoSpeak, history }`.
|
| 204 |
+
- Wire nav `data-view="language-lessons"` in existing view switcher.
|
| 205 |
+
- **Hold-to-talk**: pointerdown on `#btn-lesson-hold-mic` → start recording; pointerup → stop → `sendLanguageLessonAudioTurn(path)`.
|
| 206 |
+
- **Unified send**: if textarea non-empty → text turn; else if pending audio → audio turn.
|
| 207 |
+
- **Render**: extend chat renderer to show inline audio on assistant messages (reuse `renderVoiceReply` patterns).
|
| 208 |
+
- Remove pitch mode cards and `#voice-pitch-analysis` from this view (Classic EchoCoach tab remains).
|
| 209 |
+
|
| 210 |
+
### 4. Space defaults (Cohere partner demo)
|
| 211 |
+
|
| 212 |
+
```bash
|
| 213 |
+
ECHOCOACH_ASR_PRESET=cohere-transcribe
|
| 214 |
+
ECHOCOACH_COACH_MODEL=tiny-aya-global
|
| 215 |
+
ECHOCOACH_TTS_PRESET=piper-multilingual
|
| 216 |
+
ECHOCOACH_REALTIME_TTS_PRESET=vibevoice-realtime-0.5b
|
| 217 |
+
```
|
| 218 |
+
|
| 219 |
+
Document these settings in [`USAGE.md`](USAGE.md). A GPU Space is recommended.
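
For reference, the four variables above could be read with their documented defaults as sketched below. `VoiceStackSettings` and `load_voice_stack_settings` are hypothetical names for this example; the repository's own loader is `get_echo_coach_config`, whose internals are not shown in this diff.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class VoiceStackSettings:
    asr_preset: str
    coach_model: str
    tts_preset: str
    realtime_tts_preset: str


def load_voice_stack_settings(env: Optional[Mapping[str, str]] = None) -> VoiceStackSettings:
    """Read the ECHOCOACH_* variables, falling back to the Space demo defaults."""
    source = os.environ if env is None else env
    return VoiceStackSettings(
        asr_preset=source.get("ECHOCOACH_ASR_PRESET", "cohere-transcribe"),
        coach_model=source.get("ECHOCOACH_COACH_MODEL", "tiny-aya-global"),
        tts_preset=source.get("ECHOCOACH_TTS_PRESET", "piper-multilingual"),
        realtime_tts_preset=source.get("ECHOCOACH_REALTIME_TTS_PRESET", "vibevoice-realtime-0.5b"),
    )
```

Accepting an explicit mapping makes it easy to test the CPU dev fallback (`whisper-cpp-tiny`) without touching the process environment.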
|
| 220 |
+
|
| 221 |
+
### 5. Polish & demote pitch analysis
|
| 222 |
+
|
| 223 |
+
- Gate English-only filler metrics in EchoCoach when `language != "en"`.
|
| 224 |
+
- Fix Greek Piper mapping (`el`) in `voice_models.yaml`.
|
| 225 |
+
- EchoCoach deep analysis: Classic tab only, or footer link “Practice a monologue (pitch metrics)” opening Classic.
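
Gating the English-only filler metrics could be as simple as filtering the report before it is rendered. In this sketch the metric key names in `ENGLISH_ONLY_METRICS` are invented for illustration; the real EchoCoach report keys are not shown in this diff.

```python
from typing import Any, Dict, Tuple

# Hypothetical key names; substitute the real filler-metric keys from the report.
ENGLISH_ONLY_METRICS: Tuple[str, ...] = ("filler_count", "filler_rate")


def gate_english_only_metrics(metrics: Dict[str, Any], language: str) -> Dict[str, Any]:
    """Drop filler-word metrics unless the language is English (en, en-US, ...)."""
    primary = language.strip().lower().split("-")[0]
    if primary == "en":
        return dict(metrics)
    return {key: value for key, value in metrics.items() if key not in ENGLISH_ONLY_METRICS}
```

Returning a copy in both branches avoids mutating the caller's report.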
|
| 226 |
+
|
| 227 |
+
### 6. Demo script (single tab)
|
| 228 |
+
|
| 229 |
+
Update [`README.md`](README.md) / [`apps/gradio-space/README.md`](apps/gradio-space/README.md):
|
| 230 |
+
|
| 231 |
+
1. Open **Language lessons**.
|
| 232 |
+
2. Select **French** → hold mic → ask “Explique le fine-tuning en termes simples.” → hear Piper/VibeVoice reply.
|
| 233 |
+
3. Switch to **Spanish**, type a follow-up question (text in, text + audio out).
|
| 234 |
+
4. Select **Hindi** (text-only) → show Tiny Aya Fire-quality written lesson snippet.
|
| 235 |
+
5. Toggle **Use sources** after ingesting one PDF in Research.
|
| 236 |
+
|
| 237 |
+
Badge line: **Cohere Labs** — Transcribe + Tiny Aya on one local Language lessons page.
|
| 238 |
+
|
| 239 |
+
### 7. Tests
|
| 240 |
+
|
| 241 |
+
[`libs/echocoach/tests/test_teacher_voice.py`](libs/echocoach/tests/test_teacher_voice.py):
|
| 242 |
+
|
| 243 |
+
- `build_teacher_messages(..., language="fr")` contains French instruction.
|
| 244 |
+
- Optional: API contract test that `language` propagates to mock ASR call.
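
The optional contract test might look like the sketch below, which uses `unittest.mock.Mock` in place of the ASR and coach backends. `run_audio_turn` is a simplified stand-in written for this example, and the `transcribe` and `reply` method names are assumptions rather than the project's real interfaces.

```python
from unittest.mock import Mock


def run_audio_turn(audio_path: str, language: str, asr, coach) -> str:
    """Stand-in for the audio turn: transcribe the clip, then ask the coach."""
    transcript = asr.transcribe(audio_path, language=language)
    return coach.reply(transcript, language=language)


def test_language_propagates_to_asr_and_coach() -> None:
    asr = Mock()
    asr.transcribe.return_value = "Explique le fine-tuning."
    coach = Mock()
    coach.reply.return_value = "Bien sûr."

    reply = run_audio_turn("clip.wav", "fr", asr, coach)

    # The selected language must reach both the ASR call and the coach call.
    asr.transcribe.assert_called_once_with("clip.wav", language="fr")
    coach.reply.assert_called_once_with("Explique le fine-tuning.", language="fr")
    assert reply == "Bien sûr."
```

Because the backends are mocks, the test runs without a GPU or model downloads.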
|
| 245 |
+
|
| 246 |
+
---
|
| 247 |
+
|
| 248 |
+
## What you do **not** need for hackathon MVP
|
| 249 |
+
|
| 250 |
+
- Full duplex / interruptible WebSocket conversation
|
| 251 |
+
- TTS for all 70 Tiny Aya languages
|
| 252 |
+
- Replacing ResearchMind embeddings with multilingual models
|
| 253 |
+
- Keeping pitch practice on the same page as Language lessons
|
| 254 |
+
|
| 255 |
+
---
|
| 256 |
+
|
| 257 |
+
## Risk notes
|
| 258 |
+
|
| 259 |
+
| Risk | Mitigation |
|
| 260 |
+
|------|------------|
|
| 261 |
+
| GPU RAM (Transcribe 2B + Aya 3.3B) | Sequential load on ZeroGPU; dev fallback whisper + Aya |
|
| 262 |
+
| VibeVoice lang coverage gaps | Piper fallback per `voice_models.yaml`; text-only banner |
|
| 263 |
+
| Hold-to-talk on mobile browsers | Push-to-talk fallback buttons (start/stop) |
|
| 264 |
+
| Scope creep from 3-mode Voice tab | Language lessons = **lesson + explain only** |
|
| 265 |
+
|
| 266 |
+
---
|
| 267 |
+
|
| 268 |
+
## Suggested execution order
|
| 269 |
+
|
| 270 |
+
1. Tiny Aya presets + locale prompts (quality foundation)
|
| 271 |
+
2. **Language lessons page** HTML/JS/CSS + unified composer
|
| 272 |
+
3. Wire language + auto_voiceout through API
|
| 273 |
+
4. Space env defaults (Cohere ASR + realtime TTS)
|
| 274 |
+
5. Demote EchoCoach pitch from Studio; docs + demo script
|
.env.example
CHANGED
|
@@ -52,11 +52,17 @@ ALLOW_MODEL_SWITCH=false
|
|
| 52 |
# After training, point Gradio at the adapter preset:
|
| 53 |
# ACTIVE_MODEL=minicpm5-1b-lesson-lora
|
| 54 |
|
| 55 |
-
# --- EchoCoach (voice
|
| 56 |
# VOICE_PRESETS_PATH=./voice_models.yaml
|
| 57 |
-
#
|
| 58 |
# ECHOCOACH_TTS_PRESET=piper-multilingual
|
| 59 |
-
# ECHOCOACH_REALTIME_TTS_PRESET=vibevoice-realtime-0.5b
|
|
|
|
|
|
|
| 60 |
# ECHOCOACH_COACH_MODEL=minicpm5-1b
|
| 61 |
# ECHOCOACH_MAX_SECONDS=30
|
| 62 |
# ECHOCOACH_CAPTURE_DEVICE= # optional ALSA/PipeWire device (e.g. pipewire, alsa_input.pci-...)
|
|
|
|
| 52 |
# After training, point Gradio at the adapter preset:
|
| 53 |
# ACTIVE_MODEL=minicpm5-1b-lesson-lora
|
| 54 |
|
| 55 |
+
# --- EchoCoach / Language lessons (voice stack) ---
|
| 56 |
# VOICE_PRESETS_PATH=./voice_models.yaml
|
| 57 |
+
# Recommended for Cohere Labs partner demo (GPU Space):
|
| 58 |
+
# ECHOCOACH_ASR_PRESET=cohere-transcribe
|
| 59 |
+
# ECHOCOACH_COACH_MODEL=tiny-aya-global
|
| 60 |
+
# Comma-separated preset keys from models.yaml if primary coach fails to load:
|
| 61 |
+
# ECHOCOACH_COACH_FALLBACK=minicpm5-1b
|
| 62 |
# ECHOCOACH_TTS_PRESET=piper-multilingual
|
| 63 |
+
# ECHOCOACH_REALTIME_TTS_PRESET=vibevoice-realtime-0.5b
|
| 64 |
+
# Dev fallback (CPU):
|
| 65 |
+
# ECHOCOACH_ASR_PRESET=whisper-cpp-tiny
|
| 66 |
# ECHOCOACH_COACH_MODEL=minicpm5-1b
|
| 67 |
# ECHOCOACH_MAX_SECONDS=30
|
| 68 |
# ECHOCOACH_CAPTURE_DEVICE= # optional ALSA/PipeWire device (e.g. pipewire, alsa_input.pci-...)
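
The comma-separated `ECHOCOACH_COACH_FALLBACK` value could be turned into an ordered list of presets to try, as in this sketch. `coach_candidates` is a hypothetical helper; how the real loader consumes the fallback list is not shown in this diff.

```python
from typing import List


def coach_candidates(primary: str, fallback_csv: str = "") -> List[str]:
    """Return preset keys to try in order: primary first, then fallbacks, without duplicates."""
    ordered = [primary] + [key.strip() for key in fallback_csv.split(",")]
    seen = set()
    result: List[str] = []
    for key in ordered:
        if key and key not in seen:   # skip blanks from stray commas and repeated keys
            seen.add(key)
            result.append(key)
    return result
```

A loader can then iterate over the result and stop at the first preset that loads successfully.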
|
README.md
CHANGED
|
@@ -38,10 +38,10 @@ Open [http://localhost:7860](http://localhost:7860).
|
|
| 38 |
|
| 39 |
### Studio UI (Off Brand track)
|
| 40 |
|
| 41 |
-
The default landing page is a **custom AI Studio workspace** at `/` — not default Gradio chrome. It uses **Gradio 6 Server mode** (`gradio.Server`): Material 3 layout, sidebar +
|
| 42 |
|
| 43 |
-
- **`/`** — Studio UI (ingest sources, generate slides,
|
| 44 |
-
- **`/classic`** — full Gradio Blocks app (
|
| 45 |
|
| 46 |
See [apps/gradio-space/README.md](apps/gradio-space/README.md) for API names and a 2-minute judge demo script.
|
| 47 |
|
|
|
|
| 38 |
|
| 39 |
### Studio UI (Off Brand track)
|
| 40 |
|
| 41 |
+
The default landing page is a **custom AI Studio workspace** at `/` — not default Gradio chrome. It uses **Gradio 6 Server mode** (`gradio.Server`): Material 3 layout, sidebar + workspace (Research → Slides → Language lessons), and `@server.api` endpoints wired to the same Python backends as Classic.
|
| 42 |
|
| 43 |
+
- **`/`** — Studio UI (ingest sources, generate slides, **Language lessons** multilingual coach)
|
| 44 |
+
- **`/classic`** — full Gradio Blocks app (TeacherVoice, EchoCoach pitch analysis, settings, Chat debug)
|
| 45 |
|
| 46 |
See [apps/gradio-space/README.md](apps/gradio-space/README.md) for API names and a 2-minute judge demo script.
|
| 47 |
|
USAGE.md
CHANGED
|
@@ -2,7 +2,7 @@
|
|
| 2 |
|
| 3 |
How to run the **Lesson Agent** Gradio app locally, deploy to a Hugging Face Space (Gradio SDK + ZeroGPU), and optionally test with Docker later for the [Build Small Hackathon](https://huggingface.co/build-small-hackathon).
|
| 4 |
|
| 5 |
-
The primary UI is the **Lesson slides** tab (topic → local model outline → downloadable `.pptx`). Use **ResearchMind** for corpus Q&A, **
|
| 6 |
|
| 7 |
## Prerequisites
|
| 8 |
|
|
@@ -115,10 +115,11 @@ Configure presets in [`voice_models.yaml`](voice_models.yaml) or via `.env`:
|
|
| 115 |
|
| 116 |
| Variable | Default | Description |
|
| 117 |
| -------- | ------- | ----------- |
|
| 118 |
-
| `ECHOCOACH_ASR_PRESET` | `
|
| 119 |
| `ECHOCOACH_TTS_PRESET` | `piper-multilingual` | TTS preset key (EchoCoach, default VoiceOut) |
|
| 120 |
-
| `ECHOCOACH_REALTIME_TTS_PRESET` | `vibevoice-realtime-0.5b` |
|
| 121 |
-
| `ECHOCOACH_COACH_MODEL` | `
|
|
|
|
| 122 |
| `ECHOCOACH_MAX_SECONDS` | `30` | Max recording length |
|
| 123 |
|
| 124 |
**Cohere Transcribe** (`cohere-transcribe`) is gated on Hugging Face — run `huggingface-cli login`, accept the model terms, then set `ECHOCOACH_ASR_PRESET=cohere-transcribe`. GPU recommended for ASR + coach together.
|
|
@@ -129,9 +130,39 @@ Smoke tests (analysis only, no GPU):
|
|
| 129 |
bash scripts/echo_coach_smoke.sh
|
| 130 |
```
|
| 131 |
|
| 132 |
-
###
|
| 133 |
|
| 134 |
-
The **
|
| 135 |
|
| 136 |
| Mode | Purpose |
|
| 137 |
| ---- | ------- |
|
|
|
|
| 2 |
|
| 3 |
How to run the **Lesson Agent** Gradio app locally, deploy to a Hugging Face Space (Gradio SDK + ZeroGPU), and optionally test with Docker later for the [Build Small Hackathon](https://huggingface.co/build-small-hackathon).
|
| 4 |
|
| 5 |
+
The primary UI is the **Lesson slides** tab (topic → local model outline → downloadable `.pptx`). Use **ResearchMind** for corpus Q&A, **Language lessons** for multilingual text + voice tutoring (Cohere Transcribe + Tiny Aya), **EchoCoach** for one-shot pitch analysis in Classic UI, or ground lessons directly from the Lesson tab. The **Chat (debug)** tab tests the underlying model.
|
| 6 |
|
| 7 |
## Prerequisites
|
| 8 |
|
|
|
|
| 115 |
|
| 116 |
| Variable | Default | Description |
|
| 117 |
| -------- | ------- | ----------- |
|
| 118 |
+
| `ECHOCOACH_ASR_PRESET` | `cohere-transcribe` | ASR preset key (Space demo); use `whisper-cpp-tiny` on CPU dev |
|
| 119 |
| `ECHOCOACH_TTS_PRESET` | `piper-multilingual` | TTS preset key (EchoCoach, default VoiceOut) |
|
| 120 |
+
| `ECHOCOACH_REALTIME_TTS_PRESET` | `vibevoice-realtime-0.5b` | Language lessons streaming TTS (see below) |
|
| 121 |
+
| `ECHOCOACH_COACH_MODEL` | `tiny-aya-global` | Text coach preset (Tiny Aya; from `models.yaml`) |
|
| 122 |
+
| `ECHOCOACH_COACH_FALLBACK` | `minicpm5-1b` | Comma-separated fallback presets if primary coach fails to load |
|
| 123 |
| `ECHOCOACH_MAX_SECONDS` | `30` | Max recording length |
|
| 124 |
|
| 125 |
**Cohere Transcribe** (`cohere-transcribe`) is gated on Hugging Face — run `huggingface-cli login`, accept the model terms, then set `ECHOCOACH_ASR_PRESET=cohere-transcribe`. GPU recommended for ASR + coach together.
|
|
|
|
| 130 |
bash scripts/echo_coach_smoke.sh
|
| 131 |
```
|
| 132 |
|
| 133 |
+
### Language lessons — multilingual coach (Studio tab)
|
| 134 |
|
| 135 |
+
The **Language lessons** tab is the primary voice learning experience: one page for **text**, **hold-to-talk mic**, and **audio upload**, with optional auto VoiceOut on every reply.
|
| 136 |
+
|
| 137 |
+
| Input | Output |
|
| 138 |
+
| ----- | ------ |
|
| 139 |
+
| Type a question | Chat bubble in target language |
|
| 140 |
+
| Hold mic / upload audio | Transcript + teacher reply; auto-play TTS when enabled |
|
| 141 |
+
| **Other (text only)** language code | Tiny Aya written lesson (no Piper voice for unsupported codes) |
|
| 142 |
+
|
| 143 |
+
**Stack (Cohere Labs partner demo):** [Cohere Transcribe](https://huggingface.co/CohereLabs/c4ai-transcribe-v2) (14 voice langs) → [Tiny Aya Global / regional](https://huggingface.co/CohereLabs/tiny-aya-global) (70+ text langs) → Piper or VibeVoice Realtime for speech out.
|
| 144 |
+
|
| 145 |
+
Set Space secrets (GPU recommended):
|
| 146 |
+
|
| 147 |
+
```bash
|
| 148 |
+
ECHOCOACH_ASR_PRESET=cohere-transcribe
|
| 149 |
+
ECHOCOACH_COACH_MODEL=tiny-aya-global
|
| 150 |
+
ECHOCOACH_TTS_PRESET=piper-multilingual
|
| 151 |
+
ECHOCOACH_REALTIME_TTS_PRESET=vibevoice-realtime-0.5b
|
| 152 |
+
```
|
| 153 |
+
|
| 154 |
+
| Mode | Purpose |
|
| 155 |
+
| ---- | ------- |
|
| 156 |
+
| **Explain** | Tutor any topic in plain language |
|
| 157 |
+
| **Lesson coach** | Discuss and outline lesson content |
|
| 158 |
+
|
| 159 |
+
Turn-based (not full duplex): speak → wait → hear reply. **Auto-speak replies** synthesizes TTS each turn when the language has a Piper voice.
|
| 160 |
+
|
| 161 |
+
Pitch metrics and monologue analysis live in **Classic UI → EchoCoach** (`/classic`).
|
| 162 |
+
|
| 163 |
+
### TeacherVoice — Classic UI (turn-based)
|
| 164 |
+
|
| 165 |
+
The **TeacherVoice** tab in `/classic` is the legacy multi-turn voice teacher — same pipeline as Language lessons, plus **Pitch practice** mode.
|
| 166 |
|
| 167 |
| Mode | Purpose |
|
| 168 |
| ---- | ------- |
|
apps/gradio-space/README.md
CHANGED
|
@@ -33,8 +33,9 @@ This package uses **Gradio 6 Server mode** (`gradio.Server`):
|
|
| 33 |
|
| 34 |
**Voice & coach**
|
| 35 |
|
|
|
|
| 36 |
- `teacher_voice_turn`, `teacher_voice_audio_turn`, `teacher_voice_clear`, `teacher_voice_speak`
|
| 37 |
-
- `load_sample_pitch`, `analyze_pitch` (language, ASR preset, `speak_rewrite`)
|
| 38 |
- `recording_status`, `recording_start`, `recording_stop`
|
| 39 |
- `voice_presets`
|
| 40 |
|
|
@@ -44,15 +45,23 @@ This package uses **Gradio 6 Server mode** (`gradio.Server`):
|
|
| 44 |
- `debug_chat`
|
| 45 |
- `save_upload`
|
| 46 |
|
| 47 |
-
## Demo script (judges)
|
|
|
|
|
|
|
| 48 |
|
| 49 |
1. Open `/` — **Small Model Finetuning** project workspace
|
| 50 |
-
2.
|
| 51 |
-
3.
|
| 52 |
-
4.
|
| 53 |
-
5.
|
| 54 |
-
6.
|
| 55 |
-
|
| 56 |
-
|
| 57 |
|
| 58 |
Space card metadata lives in the [repository root README.md](../../README.md).
|
|
|
|
| 33 |
|
| 34 |
**Voice & coach**
|
| 35 |
|
| 36 |
+
- `language_lesson_turn` — unified text/audio turn for Language lessons (mode, language, `auto_voiceout`, coach variant)
|
| 37 |
- `teacher_voice_turn`, `teacher_voice_audio_turn`, `teacher_voice_clear`, `teacher_voice_speak`
|
| 38 |
+
- `load_sample_pitch`, `analyze_pitch` (Classic EchoCoach; language, ASR preset, `speak_rewrite`)
|
| 39 |
- `recording_status`, `recording_start`, `recording_stop`
|
| 40 |
- `voice_presets`
|
| 41 |
|
|
|
|
| 45 |
- `debug_chat`
|
| 46 |
- `save_upload`
|
| 47 |
|
| 48 |
+
## Demo script (judges) — Language lessons + Cohere stack
|
| 49 |
+
|
| 50 |
+
**Badge line:** Cohere Labs — Transcribe + Tiny Aya on one local Language lessons page.
|
| 51 |
|
| 52 |
1. Open `/` — **Small Model Finetuning** project workspace
|
| 53 |
+
2. **Language lessons** tab → select **French** → hold mic → ask *« Explique le fine-tuning en termes simples. »* → hear Piper/VibeVoice reply
|
| 54 |
+
3. Switch to **Spanish**, type a follow-up (text in, text + audio out with **Auto-speak replies** on)
|
| 55 |
+
4. Select **Other (text only)** → enter `hi` → show Tiny Aya Fire-quality written lesson (text-only banner shown)
|
| 56 |
+
5. Toggle **Use indexed sources** after ingesting one PDF in **Research**
|
| 57 |
+
6. Optional: **Generate Slides** from the Slides tab; **Classic UI** (`/classic`) for EchoCoach pitch metrics
|
| 58 |
+
|
| 59 |
+
Space secrets for GPU demo:
|
| 60 |
+
|
| 61 |
+
```bash
|
| 62 |
+
ECHOCOACH_ASR_PRESET=cohere-transcribe
|
| 63 |
+
ECHOCOACH_COACH_MODEL=tiny-aya-global
|
| 64 |
+
ECHOCOACH_REALTIME_TTS_PRESET=vibevoice-realtime-0.5b
|
| 65 |
+
```
|
| 66 |
|
| 67 |
Space card metadata lives in the [repository root README.md](../../README.md).
|
apps/gradio-space/src/gradio_space/api/studio.py
CHANGED
|
@@ -10,7 +10,7 @@ import gradio as gr
|
|
| 10 |
|
| 11 |
from echocoach.config import get_echo_coach_config
|
| 12 |
from echocoach.pipeline import run_echo_coach
|
| 13 |
-
from echocoach.prompts import TeacherVoiceMode
|
| 14 |
from echocoach.recording import (
|
| 15 |
ServerRecordingError,
|
| 16 |
recording_backend_status,
|
|
@@ -51,7 +51,7 @@ from gradio_space.ui.studio_html import (
|
|
| 51 |
render_trace_details,
|
| 52 |
)
|
| 53 |
from gradio_space.voice_helpers import speak_last_assistant_reply
|
| 54 |
-
from inference.config import get_app_config
|
| 55 |
from inference.factory import get_backend
|
| 56 |
from researchmind.config import get_config as get_research_config
|
| 57 |
from researchmind.ingest import IngestPipeline
|
|
@@ -167,11 +167,93 @@ def _voice_stack_summary() -> str:
|
|
| 167 |
f"ASR: {asr.label} ({_echo_config.asr_preset})",
|
| 168 |
f"TTS: {tts.label} ({_echo_config.tts_preset})",
|
| 169 |
f"Coach model: {_echo_config.coach_model}",
|
|
|
|
| 170 |
f"Max recording: {_echo_config.max_seconds}s",
|
| 171 |
]
|
| 172 |
return "\n".join(lines)
|
| 173 |
|
| 174 |
|
|
| 175 |
def _paths_summary() -> str:
|
| 176 |
rm = get_research_config()
|
| 177 |
lines = []
|
|
@@ -549,9 +631,15 @@ def api_teacher_voice_turn(
|
|
| 549 |
doc_ids: list[str] | None = None,
|
| 550 |
language: str = "en",
|
| 551 |
asr_preset: str | None = None,
|
|
|
|
|
|
|
|
|
|
| 552 |
) -> dict[str, Any]:
|
| 553 |
-
model_key =
|
| 554 |
-
|
|
|
|
|
|
|
|
|
|
| 555 |
if load_error:
|
| 556 |
return err(load_error)
|
| 557 |
|
|
@@ -567,9 +655,11 @@ def api_teacher_voice_turn(
|
|
| 567 |
language=language,
|
| 568 |
topic=topic.strip() or None,
|
| 569 |
backend=get_backend(model_key),
|
|
|
|
| 570 |
use_rag=use_rag and mode in RAG_MODES,
|
| 571 |
session_id=session_id or None,
|
| 572 |
doc_ids=doc_ids or None,
|
|
|
|
| 573 |
)
|
| 574 |
except Exception as exc: # noqa: BLE001
|
| 575 |
return err(str(exc))
|
|
@@ -577,9 +667,12 @@ def api_teacher_voice_turn(
|
|
| 577 |
return ok(
|
| 578 |
history=result.history,
|
| 579 |
assistant=result.assistant_text,
|
| 580 |
-
status=result.rag_status
|
| 581 |
voiceout_path=result.voiceout_path,
|
|
|
|
| 582 |
rag_references=result.rag_references,
|
|
|
|
|
|
|
| 583 |
)
|
| 584 |
|
| 585 |
def api_teacher_voice_audio_turn(
|
|
@@ -592,9 +685,15 @@ def api_teacher_voice_audio_turn(
|
|
| 592 |
doc_ids: list[str] | None = None,
|
| 593 |
language: str = "en",
|
| 594 |
asr_preset: str | None = None,
|
|
|
|
|
|
|
|
|
|
| 595 |
) -> dict[str, Any]:
|
| 596 |
-
model_key =
|
| 597 |
-
|
|
|
|
|
|
|
|
|
|
| 598 |
if load_error:
|
| 599 |
return err(load_error)
|
| 600 |
|
|
@@ -613,10 +712,12 @@ def api_teacher_voice_audio_turn(
|
|
| 613 |
asr_preset=preset,
|
| 614 |
topic=topic.strip() or None,
|
| 615 |
backend=get_backend(model_key),
|
|
|
|
| 616 |
use_rag=use_rag and mode in RAG_MODES,
|
| 617 |
session_id=session_id or None,
|
| 618 |
doc_ids=doc_ids or None,
|
| 619 |
max_turn_seconds=max_turn,
|
|
|
|
| 620 |
)
|
| 621 |
except Exception as exc: # noqa: BLE001
|
| 622 |
return err(str(exc))
|
|
@@ -624,10 +725,60 @@ def api_teacher_voice_audio_turn(
|
|
| 624 |
return ok(
|
| 625 |
history=result.history,
|
| 626 |
assistant=result.assistant_text,
|
| 627 |
-
status=result.rag_status
|
| 628 |
voiceout_path=result.voiceout_path,
|
|
|
|
| 629 |
user_text=result.user_text,
|
| 630 |
rag_references=result.rag_references,
|
|
| 631 |
)
|
| 632 |
|
| 633 |
|
|
@@ -672,8 +823,7 @@ def api_analyze_pitch(
|
|
| 672 |
asr_preset: str | None = None,
|
| 673 |
speak_rewrite: bool = False,
|
| 674 |
) -> dict[str, Any]:
|
| 675 |
-
model_key =
|
| 676 |
-
load_error = ensure_model_loaded(model_key)
|
| 677 |
if load_error:
|
| 678 |
return err(load_error)
|
| 679 |
|
|
@@ -686,6 +836,7 @@ def api_analyze_pitch(
|
|
| 686 |
audio_path,
|
| 687 |
language=language,
|
| 688 |
asr_preset=preset,
|
|
|
|
| 689 |
backend=get_backend(model_key),
|
| 690 |
speak_rewrite=speak_rewrite,
|
| 691 |
)
|
|
@@ -786,12 +937,30 @@ def api_recording_stop() -> dict[str, Any]:
|
|
| 786 |
|
| 787 |
|
| 788 |
def api_voice_presets() -> dict[str, Any]:
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 789 |
return ok(
|
| 790 |
languages=[{"label": label, "value": value} for label, value in _echo_config.language_choices()],
|
| 791 |
asr_presets=[{"label": label, "value": value} for label, value in _echo_config.asr_choices()],
|
|
|
|
|
|
|
|
|
|
| 792 |
default_language=_echo_config.language_choices()[0][1] if _echo_config.language_choices() else "en",
|
| 793 |
default_asr=_echo_config.asr_preset,
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 794 |
max_seconds=_echo_config.max_seconds,
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 795 |
)
|
| 796 |
|
| 797 |
|
|
@@ -917,6 +1086,38 @@ def register_studio_apis(server: gr.Server) -> None:
|
|
| 917 |
file_paths,
|
| 918 |
)
|
| 919 |
|
|
| 920 |
@server.api(name="teacher_voice_turn")
|
| 921 |
def _teacher_voice_turn(
|
| 922 |
message: str,
|
|
@@ -928,6 +1129,9 @@ def register_studio_apis(server: gr.Server) -> None:
|
|
| 928 |
doc_ids: list[str] | None = None,
|
| 929 |
language: str = "en",
|
| 930 |
asr_preset: str | None = None,
|
|
|
|
|
|
|
|
|
|
| 931 |
) -> dict[str, Any]:
|
| 932 |
return api_teacher_voice_turn(
|
| 933 |
message,
|
|
@@ -939,6 +1143,9 @@ def register_studio_apis(server: gr.Server) -> None:
|
|
| 939 |
doc_ids,
|
| 940 |
language,
|
| 941 |
asr_preset,
|
|
|
|
|
|
|
|
|
|
| 942 |
)
|
| 943 |
|
| 944 |
@server.api(name="teacher_voice_audio_turn")
|
|
@@ -952,6 +1159,9 @@ def register_studio_apis(server: gr.Server) -> None:
|
|
| 952 |
doc_ids: list[str] | None = None,
|
| 953 |
language: str = "en",
|
| 954 |
asr_preset: str | None = None,
|
|
|
|
|
|
|
|
|
|
| 955 |
) -> dict[str, Any]:
|
| 956 |
return api_teacher_voice_audio_turn(
|
| 957 |
audio_path,
|
|
@@ -963,6 +1173,9 @@ def register_studio_apis(server: gr.Server) -> None:
|
|
| 963 |
doc_ids,
|
| 964 |
language,
|
| 965 |
asr_preset,
|
|
|
|
|
|
|
|
|
|
| 966 |
)
|
| 967 |
|
| 968 |
@server.api(name="teacher_voice_clear")
|
|
|
|
| 10 |
|
| 11 |
from echocoach.config import get_echo_coach_config
|
| 12 |
from echocoach.pipeline import run_echo_coach
|
| 13 |
+
from echocoach.prompts import TeacherVoiceMode, resolve_aya_preset
|
| 14 |
from echocoach.recording import (
|
| 15 |
ServerRecordingError,
|
| 16 |
recording_backend_status,
|
|
|
|
| 51 |
render_trace_details,
|
| 52 |
)
|
| 53 |
from gradio_space.voice_helpers import speak_last_assistant_reply
|
| 54 |
+
from inference.config import get_app_config, get_model_config
|
| 55 |
from inference.factory import get_backend
|
| 56 |
from researchmind.config import get_config as get_research_config
|
| 57 |
from researchmind.ingest import IngestPipeline
|
|
|
|
| 167 |
f"ASR: {asr.label} ({_echo_config.asr_preset})",
|
| 168 |
f"TTS: {tts.label} ({_echo_config.tts_preset})",
|
| 169 |
f"Coach model: {_echo_config.coach_model}",
|
| 170 |
+
f"Coach fallbacks: {', '.join(_echo_config.coach_fallbacks) or 'none'}",
|
| 171 |
f"Max recording: {_echo_config.max_seconds}s",
|
| 172 |
]
|
| 173 |
return "\n".join(lines)
|
| 174 |
|
| 175 |
|
| 176 |
+
def _coach_model_key(
|
| 177 |
+
coach_model: str | None = None,
|
| 178 |
+
*,
|
| 179 |
+
language: str = "en",
|
| 180 |
+
coach_variant: str = "auto",
|
| 181 |
+
) -> str:
|
| 182 |
+
if coach_model and coach_model.strip():
|
| 183 |
+
key = coach_model.strip()
|
| 184 |
+
elif coach_variant and coach_variant not in ("auto", ""):
|
| 185 |
+
key = coach_variant.strip()
|
| 186 |
+
else:
|
| 187 |
+
key = resolve_aya_preset(language, coach_variant)
|
| 188 |
+
if key in ("tiny-aya-water", "tiny-aya-fire", "tiny-aya-earth", "auto"):
|
| 189 |
+
key = "tiny-aya-global"
|
| 190 |
+
return key
|
| 191 |
+
|
| 192 |
+
|
| 193 |
+
def _coach_model_label(model_key: str) -> str:
|
| 194 |
+
try:
|
| 195 |
+
return get_model_config(model_key).label
|
| 196 |
+
except Exception:
|
| 197 |
+
return model_key
|
| 198 |
+
|
| 199 |
+
|
| 200 |
+
def _coach_model_candidates(
|
| 201 |
+
coach_model: str | None = None,
|
| 202 |
+
*,
|
| 203 |
+
language: str = "en",
|
| 204 |
+
coach_variant: str = "auto",
|
| 205 |
+
) -> list[str]:
|
| 206 |
+
if coach_model and coach_model.strip():
|
| 207 |
+
return [coach_model.strip()]
|
| 208 |
+
primary = _coach_model_key(None, language=language, coach_variant=coach_variant)
|
| 209 |
+
chain: list[str] = []
|
| 210 |
+
seen: set[str] = set()
|
| 211 |
+
for key in (primary, *_echo_config.coach_fallbacks):
|
| 212 |
+
if key and key not in seen:
|
| 213 |
+
seen.add(key)
|
| 214 |
+
chain.append(key)
|
| 215 |
+
return chain or [primary]
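The candidate chain built in `_coach_model_candidates` is an order-preserving de-duplication of the primary key followed by the configured fallbacks. A standalone sketch of the same idea, where `build_fallback_chain` is a hypothetical helper and the preset keys in the usage lines are placeholders:

```python
from typing import Iterable, List


def build_fallback_chain(primary: str, fallbacks: Iterable[str]) -> List[str]:
    """Return primary plus fallbacks, skipping empty keys and repeats."""
    chain: List[str] = []
    seen = set()
    for key in (primary, *fallbacks):
        if key and key not in seen:
            seen.add(key)
            chain.append(key)
    # Mirror the diff: never return an empty list, even for an empty primary.
    return chain or [primary]


if __name__ == "__main__":
    # "fallback-a" is a placeholder key, not a preset from models.yaml.
    print(build_fallback_chain("tiny-aya-global", ["tiny-aya-global", "fallback-a"]))
```

Keeping a separate `seen` set preserves the configured priority order, which a plain `set()` conversion would not guarantee.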
|
| 216 |
+
|
| 217 |
+
|
| 218 |
+
def _ensure_coach_loaded(
|
| 219 |
+
coach_model: str | None = None,
|
| 220 |
+
*,
|
| 221 |
+
language: str = "en",
|
| 222 |
+
coach_variant: str = "auto",
|
| 223 |
+
) -> tuple[str, str | None, str | None]:
|
| 224 |
+
"""Load the first coach preset that succeeds. Returns (key, error, fallback_note)."""
|
| 225 |
+
candidates = _coach_model_candidates(
|
| 226 |
+
coach_model,
|
| 227 |
+
language=language,
|
| 228 |
+
coach_variant=coach_variant,
|
| 229 |
+
)
|
| 230 |
+
errors: list[str] = []
|
| 231 |
+
for index, key in enumerate(candidates):
|
| 232 |
+
load_error = ensure_model_loaded(key)
|
| 233 |
+
if not load_error:
|
| 234 |
+
if index == 0:
|
| 235 |
+
return key, None, None
|
| 236 |
+
label = _coach_model_label(key)
|
| 237 |
+
note = (
|
| 238 |
+
f"Primary coach unavailable — using fallback **{label}** (`{key}`). "
|
| 239 |
+
"Replies still follow your target language via prompts."
|
| 240 |
+
)
|
| 241 |
+
return key, None, note
|
| 242 |
+
errors.append(load_error)
|
| 243 |
+
return candidates[-1], errors[-1], None
|
| 244 |
+
|
| 245 |
+
|
| 246 |
+
def _coach_turn_status(base: str | None, fallback_note: str | None) -> str:
|
| 247 |
+
status = (base or "Turn complete.").strip()
|
| 248 |
+
if fallback_note:
|
| 249 |
+
return f"{fallback_note} {status}".strip()
|
| 250 |
+
return status
|
| 251 |
+
|
| 252 |
+
|
| 253 |
+
def _voice_language_codes() -> list[str]:
|
| 254 |
+
return [code for _, code in _echo_config.language_choices()]
|
| 255 |
+
|
| 256 |
+
|
| 257 |
def _paths_summary() -> str:
|
| 258 |
rm = get_research_config()
|
| 259 |
lines = []
|
|
|
|
| 631 |
doc_ids: list[str] | None = None,
|
| 632 |
language: str = "en",
|
| 633 |
asr_preset: str | None = None,
|
| 634 |
+
auto_voiceout: bool = True,
|
| 635 |
+
coach_model: str = "",
|
| 636 |
+
coach_variant: str = "auto",
|
| 637 |
) -> dict[str, Any]:
|
| 638 |
+
model_key, load_error, fallback_note = _ensure_coach_loaded(
|
| 639 |
+
coach_model or None,
|
| 640 |
+
language=language,
|
| 641 |
+
coach_variant=coach_variant,
|
| 642 |
+
)
|
| 643 |
if load_error:
|
| 644 |
return err(load_error)
|
| 645 |
|
|
|
|
| 655 |
language=language,
|
| 656 |
topic=topic.strip() or None,
|
| 657 |
backend=get_backend(model_key),
|
| 658 |
+
coach_model=model_key,
|
| 659 |
use_rag=use_rag and mode in RAG_MODES,
|
| 660 |
session_id=session_id or None,
|
| 661 |
doc_ids=doc_ids or None,
|
| 662 |
+
auto_voiceout=auto_voiceout,
|
| 663 |
)
|
| 664 |
except Exception as exc: # noqa: BLE001
|
| 665 |
return err(str(exc))
|
|
|
|
| 667 |
return ok(
|
| 668 |
history=result.history,
|
| 669 |
assistant=result.assistant_text,
|
| 670 |
+
status=_coach_turn_status(result.rag_status, fallback_note),
|
| 671 |
voiceout_path=result.voiceout_path,
|
| 672 |
+
voiceout_warning=result.voiceout_warning,
|
| 673 |
rag_references=result.rag_references,
|
| 674 |
+
coach_model=model_key,
|
| 675 |
+
coach_fallback=bool(fallback_note),
|
| 676 |
)
|
| 677 |
|
| 678 |
def api_teacher_voice_audio_turn(
|
|
|
|
| 685 |
doc_ids: list[str] | None = None,
|
| 686 |
language: str = "en",
|
| 687 |
asr_preset: str | None = None,
|
| 688 |
+
auto_voiceout: bool = True,
|
| 689 |
+
coach_model: str = "",
|
| 690 |
+
coach_variant: str = "auto",
|
| 691 |
) -> dict[str, Any]:
|
| 692 |
+
model_key, load_error, fallback_note = _ensure_coach_loaded(
|
| 693 |
+
coach_model or None,
|
| 694 |
+
language=language,
|
| 695 |
+
coach_variant=coach_variant,
|
| 696 |
+
)
|
| 697 |
if load_error:
|
| 698 |
return err(load_error)
|
| 699 |
|
|
|
|
| 712 |
asr_preset=preset,
|
| 713 |
topic=topic.strip() or None,
|
| 714 |
backend=get_backend(model_key),
|
| 715 |
+
coach_model=model_key,
|
| 716 |
use_rag=use_rag and mode in RAG_MODES,
|
| 717 |
session_id=session_id or None,
|
| 718 |
doc_ids=doc_ids or None,
|
| 719 |
max_turn_seconds=max_turn,
|
| 720 |
+
auto_voiceout=auto_voiceout,
|
| 721 |
)
|
| 722 |
except Exception as exc: # noqa: BLE001
|
| 723 |
return err(str(exc))
|
|
|
|
| 725 |
return ok(
|
| 726 |
history=result.history,
|
| 727 |
assistant=result.assistant_text,
|
| 728 |
+
status=_coach_turn_status(result.rag_status, fallback_note),
|
| 729 |
voiceout_path=result.voiceout_path,
|
| 730 |
+
voiceout_warning=result.voiceout_warning,
|
| 731 |
user_text=result.user_text,
|
| 732 |
rag_references=result.rag_references,
|
| 733 |
+
coach_model=model_key,
|
| 734 |
+
coach_fallback=bool(fallback_note),
|
| 735 |
+
)
|
| 736 |
+
|
| 737 |
+
|
| 738 |
+
def api_language_lesson_turn(
|
| 739 |
+
message: str = "",
|
| 740 |
+
audio_path: str = "",
|
| 741 |
+
mode: TeacherVoiceMode = "lesson",
|
| 742 |
+
topic: str = "",
|
| 743 |
+
session_id: str = "",
|
| 744 |
+
use_rag: bool = True,
|
| 745 |
+
history: list | None = None,
|
| 746 |
+
doc_ids: list[str] | None = None,
|
| 747 |
+
language: str = "en",
|
| 748 |
+
asr_preset: str | None = None,
|
| 749 |
+
auto_voiceout: bool = True,
|
| 750 |
+
coach_model: str = "",
|
| 751 |
+
coach_variant: str = "auto",
|
| 752 |
+
) -> dict[str, Any]:
|
| 753 |
+
"""Unified Language lessons turn — routes to text or audio pipeline."""
|
| 754 |
+
if audio_path and audio_path.strip():
|
| 755 |
+
return api_teacher_voice_audio_turn(
|
| 756 |
+
audio_path.strip(),
|
| 757 |
+
mode=mode,
|
| 758 |
+
topic=topic,
|
| 759 |
+
session_id=session_id,
|
| 760 |
+
use_rag=use_rag,
|
| 761 |
+
history=history,
|
| 762 |
+
doc_ids=doc_ids,
|
| 763 |
+
language=language,
|
| 764 |
+
asr_preset=asr_preset,
|
| 765 |
+
auto_voiceout=auto_voiceout,
|
| 766 |
+
coach_model=coach_model,
|
| 767 |
+
coach_variant=coach_variant,
|
| 768 |
+
)
|
| 769 |
+
return api_teacher_voice_turn(
|
| 770 |
+
message,
|
| 771 |
+
mode=mode,
|
| 772 |
+
topic=topic,
|
| 773 |
+
session_id=session_id,
|
| 774 |
+
use_rag=use_rag,
|
| 775 |
+
history=history,
|
| 776 |
+
doc_ids=doc_ids,
|
| 777 |
+
language=language,
|
| 778 |
+
asr_preset=asr_preset,
|
| 779 |
+
auto_voiceout=auto_voiceout,
|
| 780 |
+
coach_model=coach_model,
|
| 781 |
+
coach_variant=coach_variant,
|
| 782 |
)
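The unified endpoint above is a thin dispatcher: a non-blank `audio_path` selects the audio pipeline, anything else goes to the text pipeline. A reduced sketch of that routing, with the two pipelines passed in as callables so the example runs on its own. `route_lesson_turn` and its keyword-only parameters are hypothetical, not part of the Space's API.

```python
from typing import Any, Callable, Dict


def route_lesson_turn(
    message: str = "",
    audio_path: str = "",
    *,
    text_turn: Callable[..., Dict[str, Any]],
    audio_turn: Callable[..., Dict[str, Any]],
    **options: Any,
) -> Dict[str, Any]:
    """Send the turn to the audio handler when an audio path is present."""
    if audio_path and audio_path.strip():
        return audio_turn(audio_path.strip(), **options)
    return text_turn(message, **options)


if __name__ == "__main__":
    result = route_lesson_turn(
        "Bonjour",
        text_turn=lambda msg, **kw: {"kind": "text", "input": msg, **kw},
        audio_turn=lambda path, **kw: {"kind": "audio", "input": path, **kw},
        language="fr",
    )
    print(result)
```

Note that when both a message and an audio path are supplied, audio wins and the typed message is not forwarded, which matches the branch order in the diff.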
|
| 783 |
|
| 784 |
|
|
|
|
| 823 |
asr_preset: str | None = None,
|
| 824 |
speak_rewrite: bool = False,
|
| 825 |
) -> dict[str, Any]:
|
| 826 |
+
model_key, load_error, _fallback_note = _ensure_coach_loaded(None, language=language)
|
|
|
|
| 827 |
if load_error:
|
| 828 |
return err(load_error)
|
| 829 |
|
|
|
|
| 836 |
audio_path,
|
| 837 |
language=language,
|
| 838 |
asr_preset=preset,
|
| 839 |
+
coach_model=model_key,
|
| 840 |
backend=get_backend(model_key),
|
| 841 |
speak_rewrite=speak_rewrite,
|
| 842 |
)
|
|
|
|
| 937 |
|
| 938 |
|
| 939 |
def api_voice_presets() -> dict[str, Any]:
|
| 940 |
+
tts = _echo_config.get_tts()
|
| 941 |
+
voice_langs = _voice_language_codes()
|
| 942 |
+
coach_chain = _echo_config.coach_model_chain()
|
| 943 |
+
coach_chain_labels = [_coach_model_label(key) for key in coach_chain]
|
| 944 |
+
fallback_label = coach_chain_labels[1] if len(coach_chain_labels) > 1 else None
|
| 945 |
return ok(
|
| 946 |
languages=[{"label": label, "value": value} for label, value in _echo_config.language_choices()],
|
| 947 |
asr_presets=[{"label": label, "value": value} for label, value in _echo_config.asr_choices()],
|
| 948 |
+
coach_variants=[
|
| 949 |
+
{"label": "Tiny Aya Global (70+ languages)", "value": "tiny-aya-global"},
|
| 950 |
+
],
|
| 951 |
default_language=_echo_config.language_choices()[0][1] if _echo_config.language_choices() else "en",
|
| 952 |
default_asr=_echo_config.asr_preset,
|
| 953 |
+
default_coach=_echo_config.coach_model,
|
| 954 |
+
coach_fallbacks=list(_echo_config.coach_fallbacks),
|
| 955 |
+
coach_chain=coach_chain,
|
| 956 |
+
coach_chain_labels=coach_chain_labels,
|
| 957 |
+
voice_languages=voice_langs,
|
| 958 |
max_seconds=_echo_config.max_seconds,
|
| 959 |
+
voiceout_note=(
|
| 960 |
+
f"Voice in/out: {len(voice_langs)} languages via Piper · "
|
| 961 |
+
f"Coach: {coach_chain_labels[0]}"
|
| 962 |
+
+ (f" (fallback: {fallback_label})" if fallback_label else "")
|
| 963 |
+
),
|
| 964 |
)
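The `voiceout_note` string returned above is assembled from the voice language count and the coach chain labels. A small standalone sketch of that formatting, assuming the label list is non-empty. `format_voiceout_note` is a hypothetical helper and the labels in the usage lines are illustrative.

```python
from typing import List


def format_voiceout_note(voice_langs: List[str], coach_chain_labels: List[str]) -> str:
    """Build the one-line voice summary; expects at least one coach label."""
    fallback_label = coach_chain_labels[1] if len(coach_chain_labels) > 1 else None
    note = (
        f"Voice in/out: {len(voice_langs)} languages via Piper · "
        f"Coach: {coach_chain_labels[0]}"
    )
    if fallback_label:
        # Only the first fallback is shown, as in the diff.
        note += f" (fallback: {fallback_label})"
    return note


if __name__ == "__main__":
    print(format_voiceout_note(["en", "fr", "es"], ["Tiny Aya Global", "Backup coach"]))
```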
|
| 965 |
|
| 966 |
|
|
|
|
| 1086 |
file_paths,
|
| 1087 |
)
|
| 1088 |
|
| 1089 |
+
@server.api(name="language_lesson_turn")
|
| 1090 |
+
def _language_lesson_turn(
|
| 1091 |
+
message: str = "",
|
| 1092 |
+
audio_path: str = "",
|
| 1093 |
+
mode: Literal["explain", "lesson"] = "lesson",
|
| 1094 |
+
topic: str = "",
|
| 1095 |
+
session_id: str = "",
|
| 1096 |
+
use_rag: bool = True,
|
| 1097 |
+
history: list | None = None,
|
| 1098 |
+
doc_ids: list[str] | None = None,
|
| 1099 |
+
language: str = "en",
|
| 1100 |
+
asr_preset: str | None = None,
|
| 1101 |
+
auto_voiceout: bool = True,
|
| 1102 |
+
coach_model: str = "",
|
| 1103 |
+
coach_variant: str = "auto",
|
| 1104 |
+
) -> dict[str, Any]:
|
| 1105 |
+
return api_language_lesson_turn(
|
| 1106 |
+
message,
|
| 1107 |
+
audio_path,
|
| 1108 |
+
mode,
|
| 1109 |
+
topic,
|
| 1110 |
+
session_id,
|
| 1111 |
+
use_rag,
|
| 1112 |
+
history,
|
| 1113 |
+
doc_ids,
|
| 1114 |
+
language,
|
| 1115 |
+
asr_preset,
|
| 1116 |
+
auto_voiceout,
|
| 1117 |
+
coach_model,
|
| 1118 |
+
coach_variant,
|
| 1119 |
+
)
|
| 1120 |
+
|
| 1121 |
@server.api(name="teacher_voice_turn")
|
| 1122 |
def _teacher_voice_turn(
|
| 1123 |
message: str,
|
|
|
|
| 1129 |
doc_ids: list[str] | None = None,
|
| 1130 |
language: str = "en",
|
| 1131 |
asr_preset: str | None = None,
|
| 1132 |
+
auto_voiceout: bool = True,
|
| 1133 |
+
coach_model: str = "",
|
| 1134 |
+
coach_variant: str = "auto",
|
| 1135 |
) -> dict[str, Any]:
|
| 1136 |
return api_teacher_voice_turn(
|
| 1137 |
message,
|
|
|
|
| 1143 |
doc_ids,
|
| 1144 |
language,
|
| 1145 |
asr_preset,
|
| 1146 |
+
auto_voiceout,
|
| 1147 |
+
coach_model,
|
| 1148 |
+
coach_variant,
|
| 1149 |
)
|
| 1150 |
|
| 1151 |
@server.api(name="teacher_voice_audio_turn")
|
|
|
|
| 1159 |
doc_ids: list[str] | None = None,
|
| 1160 |
language: str = "en",
|
| 1161 |
asr_preset: str | None = None,
|
| 1162 |
+
auto_voiceout: bool = True,
|
| 1163 |
+
coach_model: str = "",
|
| 1164 |
+
coach_variant: str = "auto",
|
| 1165 |
) -> dict[str, Any]:
|
| 1166 |
return api_teacher_voice_audio_turn(
|
| 1167 |
audio_path,
|
|
|
|
| 1173 |
doc_ids,
|
| 1174 |
language,
|
| 1175 |
asr_preset,
|
| 1176 |
+
auto_voiceout,
|
| 1177 |
+
coach_model,
|
| 1178 |
+
coach_variant,
|
| 1179 |
)
|
| 1180 |
|
| 1181 |
@server.api(name="teacher_voice_clear")
|
apps/gradio-space/static/studio/index.html
CHANGED
|
@@ -34,7 +34,7 @@
|
|
| 34 |
<nav class="sidebar-nav">
|
| 35 |
<button type="button" class="nav-item" data-view="research"><span class="material-symbols-outlined">search</span>Research</button>
|
| 36 |
<button type="button" class="nav-item active" data-view="slides"><span class="material-symbols-outlined">present_to_all</span>Slides</button>
|
| 37 |
-
<button type="button" class="nav-item" data-view="
|
| 38 |
<button type="button" class="nav-item" data-view="debug"><span class="material-symbols-outlined">bug_report</span>Debug</button>
|
| 39 |
<button type="button" id="btn-open-settings" class="nav-item"><span class="material-symbols-outlined">settings</span>Settings</button>
|
| 40 |
<a href="/classic" class="nav-item nav-link"><span class="material-symbols-outlined">open_in_new</span>Classic UI</a>
|
|
@@ -300,125 +300,104 @@
|
|
| 300 |
</section>
|
| 301 |
|
| 302 |
<section class="col col-studio">
|
| 303 |
-
<div class="
|
| 304 |
-
<aside class="
|
| 305 |
-
<div class="card
|
| 306 |
-
<p class="card-title">
|
|
| 307 |
<label class="toggle-row">
|
| 308 |
<span>Answer from my indexed sources</span>
|
| 309 |
-
<input id="use-rag" type="checkbox" checked />
|
| 310 |
</label>
|
| 311 |
-
<p class="status-text">Ground
|
| 312 |
</div>
|
| 313 |
-
<div class="card
|
| 314 |
<p class="card-title">Mode</p>
|
| 315 |
-
<div class="mode-cards
|
| 316 |
<button type="button" class="mode-card" data-mode="explain">Explain</button>
|
| 317 |
-
<button type="button" class="mode-card active" data-mode="lesson">Lesson</button>
|
| 318 |
-
<button type="button" class="mode-card" data-mode="pitch">Practice</button>
|
| 319 |
</div>
|
| 320 |
-
<label class="field
|
| 321 |
-
<span>
|
| 322 |
-
<input id="
|
| 323 |
</label>
|
| 324 |
-
<details class="
|
| 325 |
<summary>Add sources (optional)</summary>
|
| 326 |
<p class="status-text">Discover or ingest sources to ground answers in your library.</p>
|
| 327 |
<div class="ingest-action-row">
|
| 328 |
-
<button type="button" id="btn-
|
| 329 |
-
<button type="button" id="btn-
|
| 330 |
</div>
|
| 331 |
-
<div id="
|
| 332 |
-
<div id="
|
| 333 |
</div>
|
| 334 |
<label class="field">
|
| 335 |
<span>Paste URLs (one per line)</span>
|
| 336 |
-
<textarea id="
|
| 337 |
</label>
|
| 338 |
<label class="upload-zone upload-zone-compact">
|
| 339 |
-
<input id="
|
| 340 |
<span class="material-symbols-outlined">upload_file</span>
|
| 341 |
<span>Upload PDF or Doc</span>
|
| 342 |
</label>
|
| 343 |
-
<button type="button" id="btn-
|
| 344 |
-
<p id="
|
| 345 |
</details>
|
| 346 |
</div>
|
| 347 |
</aside>
|
| 348 |
-
<div class="
|
| 349 |
-
<div class="card
|
| 350 |
-
<div class="
|
| 351 |
-
<h2 class="section-label">
|
| 352 |
-
<p class="
|
| 353 |
</div>
|
| 354 |
-
<div id="
|
| 355 |
-
<p class="research-chat-empty">
|
| 356 |
</div>
|
| 357 |
-
<div class="
|
| 358 |
<label class="field">
|
| 359 |
-
<span>
|
| 360 |
-
<textarea id="
|
| 361 |
</label>
|
| 362 |
-
<div class="
|
| 363 |
-
<
|
| 364 |
-
|
| 365 |
-
|
| 366 |
-
|
| 367 |
-
|
| 368 |
-
|
| 369 |
-
|
| 370 |
-
<div class="voice-send-row">
|
| 371 |
-
<button type="button" id="btn-voice-send" class="btn btn-secondary">Send text</button>
|
| 372 |
-
<button type="button" id="btn-voice-audio-send" class="btn btn-primary">Send voice turn</button>
|
| 373 |
-
</div>
|
| 374 |
-
<p id="voice-turn-status" class="status-text"></p>
|
| 375 |
-
<div class="voice-replay-row">
|
| 376 |
-
<button type="button" id="btn-voice-speak-full" class="btn btn-secondary">Speak full reply</button>
|
| 377 |
-
<button type="button" id="btn-voice-speak-quick" class="btn btn-secondary">Speak first sentence</button>
|
| 378 |
-
<button type="button" id="btn-voice-clear" class="btn btn-ghost">Clear conversation</button>
|
| 379 |
-
</div>
|
| 380 |
-
<div id="voice-audio-out" class="voice-audio-out"></div>
|
| 381 |
-
</div>
|
| 382 |
-
</div>
|
| 383 |
-
<details class="card voice-pitch-analysis hidden" id="voice-pitch-analysis" open>
|
| 384 |
-
<summary class="voice-pitch-summary">
|
| 385 |
-
<span class="section-label">Deep pitch analysis</span>
|
| 386 |
-
<span class="voice-pitch-summary-hint">Pace, fillers, charts, and spoken rewrite</span>
|
| 387 |
-
</summary>
|
| 388 |
-
<div class="coach-panel-wrap">
|
| 389 |
-
<p class="coach-card-desc">Record or upload a short monologue (up to 30s), then analyze for metrics and feedback.</p>
|
| 390 |
-
<div class="coach-capture-row">
|
| 391 |
-
<div class="coach-capture-controls">
|
| 392 |
-
<div class="recording-row coach-recording-row">
|
| 393 |
-
<button type="button" id="btn-coach-record-start" class="btn btn-secondary">Start mic</button>
|
| 394 |
-
<button type="button" id="btn-coach-record-stop" class="btn btn-secondary" disabled>Stop mic</button>
|
| 395 |
-
<button type="button" id="btn-coach-sample" class="btn btn-ghost">Load sample</button>
|
| 396 |
-
</div>
|
| 397 |
-
<p id="coach-record-status" class="status-text coach-record-status"></p>
|
| 398 |
-
</div>
|
| 399 |
-
<label class="field coach-upload-field">
|
| 400 |
-
<span>Upload pitch (WAV)</span>
|
| 401 |
-
<input id="coach-audio" type="file" accept="audio/*" />
|
| 402 |
</label>
|
| 403 |
</div>
|
| 404 |
-
<
|
| 405 |
-
|
| 406 |
-
|
| 407 |
-
|
| 408 |
-
|
| 409 |
-
|
| 410 |
-
<span>ASR preset</span>
|
| 411 |
-
<select id="coach-asr" class="input"></select>
|
| 412 |
</label>
|
|
|
|
| 413 |
</div>
|
| 414 |
-
<
|
| 415 |
-
<span>Speak full rewrite (VoiceOut)</span>
|
| 416 |
-
<input id="coach-speak-rewrite" type="checkbox" />
|
| 417 |
-
</label>
|
| 418 |
-
<button type="button" id="btn-analyze" class="btn btn-primary btn-block coach-analyze-btn">Analyze pitch</button>
|
| 419 |
-
<div id="coach-panel" class="coach-results-panel"></div>
|
| 420 |
</div>
|
| 421 |
-
</
|
|
|
|
|
|
|
|
|
|
|
|
|
| 422 |
</div>
|
| 423 |
</div>
|
| 424 |
</section>
|
|
|
|
| 34 |
<nav class="sidebar-nav">
|
| 35 |
<button type="button" class="nav-item" data-view="research"><span class="material-symbols-outlined">search</span>Research</button>
|
| 36 |
<button type="button" class="nav-item active" data-view="slides"><span class="material-symbols-outlined">present_to_all</span>Slides</button>
|
| 37 |
+
<button type="button" class="nav-item" data-view="language-lessons"><span class="material-symbols-outlined">translate</span>Language lessons</button>
|
| 38 |
<button type="button" class="nav-item" data-view="debug"><span class="material-symbols-outlined">bug_report</span>Debug</button>
|
| 39 |
<button type="button" id="btn-open-settings" class="nav-item"><span class="material-symbols-outlined">settings</span>Settings</button>
|
| 40 |
<a href="/classic" class="nav-item nav-link"><span class="material-symbols-outlined">open_in_new</span>Classic UI</a>
|
|
|
|
| 300 |
</section>
|
| 301 |
|
| 302 |
<section class="col col-studio">
|
| 303 |
+
<div class="lessons-layout view-lessons-only">
|
| 304 |
+
<aside class="lessons-rail">
|
| 305 |
+
<div class="card lessons-rail-card">
|
| 306 |
+
<p class="card-title">Target language</p>
|
| 307 |
+
<label class="field">
|
| 308 |
+
<span>Lesson language</span>
|
| 309 |
+
<select id="lessons-language" class="input"></select>
|
| 310 |
+
</label>
|
| 311 |
+
<label class="field lessons-other-lang hidden" id="lessons-other-lang-wrap">
|
| 312 |
+
<span>Text-only language code</span>
|
| 313 |
+
<input id="lessons-other-lang" type="text" class="input" placeholder="e.g. hi, sw" maxlength="8" />
|
| 314 |
+
</label>
|
| 315 |
+
<p id="lessons-voiceout-note" class="status-text"></p>
|
| 316 |
+
<p class="status-text lessons-coach-model">Coach: Tiny Aya Global (70+ languages)</p>
|
| 317 |
+
<input type="hidden" id="lessons-coach-variant" value="tiny-aya-global" />
|
| 318 |
+
</div>
|
| 319 |
+
<div class="card lessons-rag-card">
|
| 320 |
+
<p class="card-title">RAG scope</p>
|
| 321 |
<label class="toggle-row">
|
| 322 |
<span>Answer from my indexed sources</span>
|
| 323 |
+
<input id="lessons-use-rag" type="checkbox" checked />
|
| 324 |
</label>
|
| 325 |
+
<p class="status-text">Ground lesson replies in your workspace documents when enabled.</p>
|
| 326 |
</div>
|
| 327 |
+
<div class="card lessons-rail-controls">
|
| 328 |
<p class="card-title">Mode</p>
|
| 329 |
+
<div class="mode-cards lessons-mode-cards" id="lessons-modes">
|
| 330 |
<button type="button" class="mode-card" data-mode="explain">Explain</button>
|
| 331 |
+
<button type="button" class="mode-card active" data-mode="lesson">Lesson coach</button>
|
|
|
|
| 332 |
</div>
|
| 333 |
+
<label class="field lessons-topic-wrap">
|
| 334 |
+
<span>Lesson topic</span>
|
| 335 |
+
<input id="lessons-topic" type="text" class="input" placeholder="Uses workspace topic when empty" />
|
| 336 |
</label>
|
| 337 |
+
<details class="lessons-rag-sources" id="lessons-rag-sources">
|
| 338 |
<summary>Add sources (optional)</summary>
|
| 339 |
<p class="status-text">Discover or ingest sources to ground answers in your library.</p>
|
| 340 |
<div class="ingest-action-row">
|
| 341 |
+
<button type="button" id="btn-lessons-discover" class="btn btn-secondary">Discover on web</button>
|
| 342 |
+
<button type="button" id="btn-lessons-auto-ingest" class="btn btn-secondary">Auto-ingest</button>
|
| 343 |
</div>
|
| 344 |
+
<div id="lessons-url-choices-panel" class="url-choices-panel hidden">
|
| 345 |
+
<div id="lessons-url-choices-list" class="url-choices-list"></div>
|
| 346 |
</div>
|
| 347 |
<label class="field">
|
| 348 |
<span>Paste URLs (one per line)</span>
|
| 349 |
+
<textarea id="lessons-urls-text" class="input" rows="2" placeholder="https://…"></textarea>
|
| 350 |
</label>
|
| 351 |
<label class="upload-zone upload-zone-compact">
|
| 352 |
+
<input id="lessons-ingest-file" type="file" accept=".pdf,.docx" multiple hidden />
|
| 353 |
<span class="material-symbols-outlined">upload_file</span>
|
| 354 |
<span>Upload PDF or Doc</span>
|
| 355 |
</label>
|
| 356 |
+
<button type="button" id="btn-lessons-ingest" class="btn btn-secondary btn-block">Ingest sources</button>
|
| 357 |
+
<p id="lessons-ingest-status" class="status-text"></p>
|
| 358 |
</details>
|
| 359 |
</div>
|
| 360 |
</aside>
|
| 361 |
+
<div class="lessons-main">
|
| 362 |
+
<div class="card lessons-main-card">
|
| 363 |
+
<div class="lessons-card-head">
|
| 364 |
+
<h2 class="section-label">Language lessons</h2>
|
| 365 |
+
<p class="lessons-card-desc">Learn in your language — type, hold the mic, or upload audio. Replies can speak back automatically.</p>
|
| 366 |
</div>
|
| 367 |
+
<div id="lessons-chat-messages" class="research-chat-messages lessons-chat-messages">
|
| 368 |
+
<p class="research-chat-empty">Choose a language, then type, speak, or upload audio to start your lesson.</p>
|
| 369 |
</div>
|
| 370 |
+
<div class="lessons-compose" id="lessons-panel">
|
| 371 |
<label class="field">
|
| 372 |
+
<span>Your message</span>
|
| 373 |
+
<textarea id="lessons-message" class="input" rows="2" placeholder="What is the difference between pretraining and finetuning a small model?"></textarea>
|
| 374 |
</label>
|
| 375 |
+
<div class="lessons-input-toolbar">
|
| 376 |
+
<button type="button" id="btn-lessons-hold-mic" class="btn btn-secondary lessons-hold-mic">Hold to speak</button>
|
| 377 |
+
<button type="button" id="btn-lessons-record-start" class="btn btn-secondary btn-compact">Start mic</button>
|
| 378 |
+
<button type="button" id="btn-lessons-record-stop" class="btn btn-secondary btn-compact" disabled>Stop mic</button>
|
| 379 |
+
<label class="lessons-upload-btn btn btn-secondary">
|
| 380 |
+
<span class="material-symbols-outlined">upload_file</span>
|
| 381 |
+
Upload audio
|
| 382 |
+
<input id="lessons-audio-upload" type="file" accept="audio/*" hidden />
|
|
| 383 |
</label>
|
| 384 |
</div>
|
| 385 |
+
<p id="lessons-record-status" class="status-text lessons-record-status"></p>
|
| 386 |
+
<div class="lessons-send-row">
|
| 387 |
+
<button type="button" id="btn-lessons-send" class="btn btn-primary">Send</button>
|
| 388 |
+
<label class="toggle-row lessons-auto-speak">
|
| 389 |
+
<span>Auto-speak replies</span>
|
| 390 |
+
<input id="lessons-auto-speak" type="checkbox" checked />
|
|
|
|
|
|
|
| 391 |
</label>
|
| 392 |
+
<button type="button" id="btn-lessons-clear" class="btn btn-ghost">Clear</button>
|
| 393 |
</div>
|
| 394 |
+
<p id="lessons-turn-status" class="status-text"></p>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 395 |
</div>
|
| 396 |
+
</div>
|
| 397 |
+
<p class="lessons-classic-link status-text">
|
| 398 |
+
Pitch metrics and monologue analysis live in
|
| 399 |
+
<a href="/classic">Classic UI → EchoCoach</a>.
|
| 400 |
+
</p>
|
| 401 |
</div>
|
| 402 |
</div>
|
| 403 |
</section>
|
apps/gradio-space/static/studio/studio.css
CHANGED
|
@@ -387,11 +387,11 @@ body {
|
|
| 387 |
.region-loading-host,
|
| 388 |
.card-ingest,
|
| 389 |
.card-chat,
|
| 390 |
-
.
|
| 391 |
.coach-panel-wrap,
|
| 392 |
.coach-debug-card,
|
| 393 |
.controls-panel,
|
| 394 |
-
.
|
| 395 |
position: relative;
|
| 396 |
}
|
| 397 |
|
|
@@ -1017,26 +1017,26 @@ body {
|
|
| 1017 |
.research-layout { grid-template-columns: 1fr; }
|
| 1018 |
}
|
| 1019 |
|
| 1020 |
-
.workspace[data-view="
|
| 1021 |
-
.workspace[data-view="
|
| 1022 |
|
| 1023 |
-
.workspace[data-view="
|
| 1024 |
|
| 1025 |
-
.view-
|
| 1026 |
|
| 1027 |
-
.workspace[data-view="
|
| 1028 |
grid-template-columns: minmax(0, 1fr);
|
| 1029 |
max-width: 1280px;
|
| 1030 |
gap: 1.25rem;
|
| 1031 |
}
|
| 1032 |
|
| 1033 |
-
.workspace[data-view="
|
| 1034 |
grid-column: 1 / -1;
|
| 1035 |
width: 100%;
|
| 1036 |
min-width: 0;
|
| 1037 |
}
|
| 1038 |
|
| 1039 |
-
.workspace[data-view="
|
| 1040 |
display: grid;
|
| 1041 |
grid-template-columns: minmax(260px, 0.78fr) minmax(0, 1.22fr);
|
| 1042 |
gap: 1.25rem;
|
|
@@ -1044,29 +1044,29 @@ body {
|
|
| 1044 |
width: 100%;
|
| 1045 |
}
|
| 1046 |
|
| 1047 |
-
.workspace[data-view="
|
| 1048 |
display: flex;
|
| 1049 |
flex-direction: column;
|
| 1050 |
gap: 1rem;
|
| 1051 |
min-width: 0;
|
| 1052 |
}
|
| 1053 |
|
| 1054 |
-
.workspace[data-view="
|
| 1055 |
min-width: 0;
|
| 1056 |
display: flex;
|
| 1057 |
flex-direction: column;
|
| 1058 |
gap: 1rem;
|
| 1059 |
}
|
| 1060 |
|
| 1061 |
-
.workspace[data-view="
|
| 1062 |
margin: 0;
|
| 1063 |
}
|
| 1064 |
|
| 1065 |
-
.workspace[data-view="
|
| 1066 |
margin-bottom: 0.75rem;
|
| 1067 |
}
|
| 1068 |
|
| 1069 |
-
.workspace[data-view="
|
| 1070 |
cursor: pointer;
|
| 1071 |
list-style: none;
|
| 1072 |
display: flex;
|
|
@@ -1074,63 +1074,63 @@ body {
|
|
| 1074 |
gap: 0.2rem;
|
| 1075 |
}
|
| 1076 |
|
| 1077 |
-
.workspace[data-view="
|
| 1078 |
display: none;
|
| 1079 |
}
|
| 1080 |
|
| 1081 |
-
.workspace[data-view="
|
| 1082 |
font-size: 0.84rem;
|
| 1083 |
color: var(--secondary);
|
| 1084 |
font-weight: 400;
|
| 1085 |
}
|
| 1086 |
|
| 1087 |
-
.workspace[data-view="
|
| 1088 |
padding-top: 0.25rem;
|
| 1089 |
}
|
| 1090 |
|
| 1091 |
-
.workspace[data-view="
|
| 1092 |
margin-top: 0.75rem;
|
| 1093 |
}
|
| 1094 |
|
| 1095 |
-
.workspace[data-view="
|
| 1096 |
min-height: 80px;
|
| 1097 |
margin-top: 0.75rem;
|
| 1098 |
overflow-y: auto;
|
| 1099 |
}
|
| 1100 |
|
| 1101 |
-
.workspace[data-view="
|
| 1102 |
border-top: 1px solid var(--outline-variant);
|
| 1103 |
padding-top: 0.75rem;
|
| 1104 |
}
|
| 1105 |
|
| 1106 |
-
.workspace[data-view="
|
| 1107 |
display: flex;
|
| 1108 |
flex-direction: column;
|
| 1109 |
}
|
| 1110 |
|
| 1111 |
-
.workspace[data-view="
|
| 1112 |
display: flex;
|
| 1113 |
flex-direction: column;
|
| 1114 |
gap: 0.5rem;
|
| 1115 |
}
|
| 1116 |
|
| 1117 |
-
.workspace[data-view="
|
| 1118 |
margin: 0;
|
| 1119 |
}
|
| 1120 |
|
| 1121 |
-
.workspace[data-view="
|
| 1122 |
min-height: 3.25rem;
|
| 1123 |
resize: vertical;
|
| 1124 |
}
|
| 1125 |
|
| 1126 |
-
.workspace[data-view="
|
| 1127 |
flex-direction: row;
|
| 1128 |
flex-wrap: wrap;
|
| 1129 |
gap: 0.35rem;
|
| 1130 |
margin-bottom: 0.75rem;
|
| 1131 |
}
|
| 1132 |
|
| 1133 |
-
.workspace[data-view="
|
| 1134 |
flex: 1 1 calc(33.333% - 0.35rem);
|
| 1135 |
text-align: center;
|
| 1136 |
justify-content: center;
|
|
@@ -1139,27 +1139,27 @@ body {
|
|
| 1139 |
padding-right: 0.5rem;
|
| 1140 |
}
|
| 1141 |
|
| 1142 |
-
.workspace[data-view="
|
| 1143 |
margin: 0 0 0.75rem;
|
| 1144 |
}
|
| 1145 |
|
| 1146 |
-
.workspace[data-view="
|
| 1147 |
margin: 0;
|
| 1148 |
}
|
| 1149 |
|
| 1150 |
-
.workspace[data-view="
|
| 1151 |
cursor: pointer;
|
| 1152 |
font-weight: 600;
|
| 1153 |
font-size: 0.82rem;
|
| 1154 |
}
|
| 1155 |
|
| 1156 |
-
.workspace[data-view="
|
| 1157 |
min-height: 160px;
|
| 1158 |
max-height: min(260px, 32vh);
|
| 1159 |
margin: 0 0 0.75rem;
|
| 1160 |
}
|
| 1161 |
|
| 1162 |
-
.workspace[data-view="
|
| 1163 |
padding: 0.65rem 0.75rem;
|
| 1164 |
border: 1px solid var(--outline-variant);
|
| 1165 |
border-radius: var(--radius-lg);
|
|
@@ -1167,31 +1167,31 @@ body {
|
|
| 1167 |
margin-bottom: 0.65rem;
|
| 1168 |
}
|
| 1169 |
|
| 1170 |
-
.workspace[data-view="
|
| 1171 |
margin: 0;
|
| 1172 |
}
|
| 1173 |
|
| 1174 |
-
.workspace[data-view="
|
| 1175 |
margin: 0.35rem 0 0;
|
| 1176 |
min-height: 1.1rem;
|
| 1177 |
}
|
| 1178 |
|
| 1179 |
-
.workspace[data-view="
|
| 1180 |
display: grid;
|
| 1181 |
grid-template-columns: 1fr 1fr;
|
| 1182 |
gap: 0.5rem;
|
| 1183 |
margin-bottom: 0.35rem;
|
| 1184 |
}
|
| 1185 |
|
| 1186 |
-
.workspace[data-view="
|
| 1187 |
margin-bottom: 0.85rem;
|
| 1188 |
}
|
| 1189 |
|
| 1190 |
-
.workspace[data-view="
|
| 1191 |
margin-bottom: 0.35rem;
|
| 1192 |
}
|
| 1193 |
|
| 1194 |
-
.
|
| 1195 |
margin: 0;
|
| 1196 |
font-size: 0.84rem;
|
| 1197 |
line-height: 1.45;
|
|
@@ -1199,24 +1199,24 @@ body {
|
|
| 1199 |
}
|
| 1200 |
|
| 1201 |
@media (max-width: 960px) {
|
| 1202 |
-
.workspace[data-view="
|
| 1203 |
grid-template-columns: 1fr;
|
| 1204 |
max-width: 640px;
|
| 1205 |
margin-left: auto;
|
| 1206 |
margin-right: auto;
|
| 1207 |
}
|
| 1208 |
|
| 1209 |
-
.workspace[data-view="
|
| 1210 |
flex-direction: column;
|
| 1211 |
}
|
| 1212 |
|
| 1213 |
-
.workspace[data-view="
|
| 1214 |
flex: 1 1 auto;
|
| 1215 |
text-align: left;
|
| 1216 |
justify-content: space-between;
|
| 1217 |
}
|
| 1218 |
|
| 1219 |
-
.workspace[data-view="
|
| 1220 |
grid-template-columns: 1fr;
|
| 1221 |
}
|
| 1222 |
}
|
|
@@ -1421,22 +1421,22 @@ body {
|
|
| 1421 |
max-width: 160px;
|
| 1422 |
}
|
| 1423 |
|
| 1424 |
-
.
|
| 1425 |
.studio-coach-voiceout audio {
|
| 1426 |
width: 100%;
|
| 1427 |
margin-top: 0.5rem;
|
| 1428 |
}
|
| 1429 |
|
| 1430 |
-
.
|
| 1431 |
max-height: 220px;
|
| 1432 |
margin: 0.75rem 0;
|
| 1433 |
}
|
| 1434 |
|
| 1435 |
-
.
|
| 1436 |
margin: 0.75rem 0;
|
| 1437 |
}
|
| 1438 |
|
| 1439 |
-
.
|
| 1440 |
cursor: pointer;
|
| 1441 |
font-weight: 600;
|
| 1442 |
font-size: 0.875rem;
|
|
@@ -1450,14 +1450,14 @@ body {
|
|
| 1450 |
color: var(--on-surface-variant);
|
| 1451 |
}
|
| 1452 |
|
| 1453 |
-
.
|
| 1454 |
display: flex;
|
| 1455 |
flex-wrap: wrap;
|
| 1456 |
gap: 0.5rem;
|
| 1457 |
margin-top: 0.5rem;
|
| 1458 |
}
|
| 1459 |
|
| 1460 |
-
.
|
| 1461 |
margin-left: auto;
|
| 1462 |
}
|
| 1463 |
|
|
@@ -1578,3 +1578,72 @@ body {
|
|
| 1578 |
max-height: 320px;
|
| 1579 |
}
|
| 1580 |
|
| 387 |
.region-loading-host,
|
| 388 |
.card-ingest,
|
| 389 |
.card-chat,
|
| 390 |
+
.lessons-main-card,
|
| 391 |
.coach-panel-wrap,
|
| 392 |
.coach-debug-card,
|
| 393 |
.controls-panel,
|
| 394 |
+
.lessons-rail-controls {
|
| 395 |
position: relative;
|
| 396 |
}
|
| 397 |
|
|
|
|
| 1017 |
.research-layout { grid-template-columns: 1fr; }
|
| 1018 |
}
|
| 1019 |
|
| 1020 |
+
.workspace[data-view="language-lessons"] .col-research,
|
| 1021 |
+
.workspace[data-view="language-lessons"] .col-slides { display: none; }
|
| 1022 |
|
| 1023 |
+
.workspace[data-view="language-lessons"] .col-debug { display: none; }
|
| 1024 |
|
| 1025 |
+
.view-lessons-only { display: none; }
|
| 1026 |
|
| 1027 |
+
.workspace[data-view="language-lessons"] {
|
| 1028 |
grid-template-columns: minmax(0, 1fr);
|
| 1029 |
max-width: 1280px;
|
| 1030 |
gap: 1.25rem;
|
| 1031 |
}
|
| 1032 |
|
| 1033 |
+
.workspace[data-view="language-lessons"] .col-studio {
|
| 1034 |
grid-column: 1 / -1;
|
| 1035 |
width: 100%;
|
| 1036 |
min-width: 0;
|
| 1037 |
}
|
| 1038 |
|
| 1039 |
+
.workspace[data-view="language-lessons"] .lessons-layout {
|
| 1040 |
display: grid;
|
| 1041 |
grid-template-columns: minmax(260px, 0.78fr) minmax(0, 1.22fr);
|
| 1042 |
gap: 1.25rem;
|
|
|
|
| 1044 |
width: 100%;
|
| 1045 |
}
|
| 1046 |
|
| 1047 |
+
.workspace[data-view="language-lessons"] .lessons-rail {
|
| 1048 |
display: flex;
|
| 1049 |
flex-direction: column;
|
| 1050 |
gap: 1rem;
|
| 1051 |
min-width: 0;
|
| 1052 |
}
|
| 1053 |
|
| 1054 |
+
.workspace[data-view="language-lessons"] .lessons-main {
|
| 1055 |
min-width: 0;
|
| 1056 |
display: flex;
|
| 1057 |
flex-direction: column;
|
| 1058 |
gap: 1rem;
|
| 1059 |
}
|
| 1060 |
|
| 1061 |
+
.workspace[data-view="language-lessons"] .lessons-pitch-analysis {
|
| 1062 |
margin: 0;
|
| 1063 |
}
|
| 1064 |
|
| 1065 |
+
.workspace[data-view="language-lessons"] .lessons-pitch-analysis[open] .lessons-pitch-summary {
|
| 1066 |
margin-bottom: 0.75rem;
|
| 1067 |
}
|
| 1068 |
|
| 1069 |
+
.workspace[data-view="language-lessons"] .lessons-pitch-summary {
|
| 1070 |
cursor: pointer;
|
| 1071 |
list-style: none;
|
| 1072 |
display: flex;
|
|
|
|
| 1074 |
gap: 0.2rem;
|
| 1075 |
}
|
| 1076 |
|
| 1077 |
+
.workspace[data-view="language-lessons"] .lessons-pitch-summary::-webkit-details-marker {
|
| 1078 |
display: none;
|
| 1079 |
}
|
| 1080 |
|
| 1081 |
+
.workspace[data-view="language-lessons"] .lessons-pitch-summary-hint {
|
| 1082 |
font-size: 0.84rem;
|
| 1083 |
color: var(--secondary);
|
| 1084 |
font-weight: 400;
|
| 1085 |
}
|
| 1086 |
|
| 1087 |
+
.workspace[data-view="language-lessons"] .lessons-pitch-analysis .coach-panel-wrap {
|
| 1088 |
padding-top: 0.25rem;
|
| 1089 |
}
|
| 1090 |
|
| 1091 |
+
.workspace[data-view="language-lessons"] .lessons-discuss-btn {
|
| 1092 |
margin-top: 0.75rem;
|
| 1093 |
}
|
| 1094 |
|
| 1095 |
+
.workspace[data-view="language-lessons"] .coach-results-panel {
|
| 1096 |
min-height: 80px;
|
| 1097 |
margin-top: 0.75rem;
|
| 1098 |
overflow-y: auto;
|
| 1099 |
}
|
| 1100 |
|
| 1101 |
+
.workspace[data-view="language-lessons"] .coach-results-panel:not(:empty) {
|
| 1102 |
border-top: 1px solid var(--outline-variant);
|
| 1103 |
padding-top: 0.75rem;
|
| 1104 |
}
|
| 1105 |
|
| 1106 |
+
.workspace[data-view="language-lessons"] .lessons-main-card {
|
| 1107 |
display: flex;
|
| 1108 |
flex-direction: column;
|
| 1109 |
}
|
| 1110 |
|
| 1111 |
+
.workspace[data-view="language-lessons"] .lessons-compose {
|
| 1112 |
display: flex;
|
| 1113 |
flex-direction: column;
|
| 1114 |
gap: 0.5rem;
|
| 1115 |
}
|
| 1116 |
|
| 1117 |
+
.workspace[data-view="language-lessons"] .lessons-compose .field {
|
| 1118 |
margin: 0;
|
| 1119 |
}
|
| 1120 |
|
| 1121 |
+
.workspace[data-view="language-lessons"] .lessons-compose textarea {
|
| 1122 |
min-height: 3.25rem;
|
| 1123 |
resize: vertical;
|
| 1124 |
}
|
| 1125 |
|
| 1126 |
+
.workspace[data-view="language-lessons"] .lessons-rail .lessons-mode-cards {
|
| 1127 |
flex-direction: row;
|
| 1128 |
flex-wrap: wrap;
|
| 1129 |
gap: 0.35rem;
|
| 1130 |
margin-bottom: 0.75rem;
|
| 1131 |
}
|
| 1132 |
|
| 1133 |
+
.workspace[data-view="language-lessons"] .lessons-rail .lessons-mode-cards .mode-card {
|
| 1134 |
flex: 1 1 calc(33.333% - 0.35rem);
|
| 1135 |
text-align: center;
|
| 1136 |
justify-content: center;
|
|
|
|
| 1139 |
padding-right: 0.5rem;
|
| 1140 |
}
|
| 1141 |
|
| 1142 |
+
.workspace[data-view="language-lessons"] .lessons-rail-controls .lessons-topic-wrap {
|
| 1143 |
margin: 0 0 0.75rem;
|
| 1144 |
}
|
| 1145 |
|
| 1146 |
+
.workspace[data-view="language-lessons"] .lessons-rag-sources {
|
| 1147 |
margin: 0;
|
| 1148 |
}
|
| 1149 |
|
| 1150 |
+
.workspace[data-view="language-lessons"] .lessons-rag-sources summary {
|
| 1151 |
cursor: pointer;
|
| 1152 |
font-weight: 600;
|
| 1153 |
font-size: 0.82rem;
|
| 1154 |
}
|
| 1155 |
|
| 1156 |
+
.workspace[data-view="language-lessons"] .lessons-chat-messages {
|
| 1157 |
min-height: 160px;
|
| 1158 |
max-height: min(260px, 32vh);
|
| 1159 |
margin: 0 0 0.75rem;
|
| 1160 |
}
|
| 1161 |
|
| 1162 |
+
.workspace[data-view="language-lessons"] .lessons-input-toolbar {
|
| 1163 |
padding: 0.65rem 0.75rem;
|
| 1164 |
border: 1px solid var(--outline-variant);
|
| 1165 |
border-radius: var(--radius-lg);
|
|
|
|
| 1167 |
margin-bottom: 0.65rem;
|
| 1168 |
}
|
| 1169 |
|
| 1170 |
+
.workspace[data-view="language-lessons"] .lessons-recording-row {
|
| 1171 |
margin: 0;
|
| 1172 |
}
|
| 1173 |
|
| 1174 |
+
.workspace[data-view="language-lessons"] .lessons-record-status {
|
| 1175 |
margin: 0.35rem 0 0;
|
| 1176 |
min-height: 1.1rem;
|
| 1177 |
}
|
| 1178 |
|
| 1179 |
+
.workspace[data-view="language-lessons"] .lessons-send-row {
|
| 1180 |
display: grid;
|
| 1181 |
grid-template-columns: 1fr 1fr;
|
| 1182 |
gap: 0.5rem;
|
| 1183 |
margin-bottom: 0.35rem;
|
| 1184 |
}
|
| 1185 |
|
| 1186 |
+
.workspace[data-view="language-lessons"] .lessons-card-head {
|
| 1187 |
margin-bottom: 0.85rem;
|
| 1188 |
}
|
| 1189 |
|
| 1190 |
+
.workspace[data-view="language-lessons"] .lessons-card-head .section-label {
|
| 1191 |
margin-bottom: 0.35rem;
|
| 1192 |
}
|
| 1193 |
|
| 1194 |
+
.lessons-card-desc {
|
| 1195 |
margin: 0;
|
| 1196 |
font-size: 0.84rem;
|
| 1197 |
line-height: 1.45;
|
|
|
|
| 1199 |
}
|
| 1200 |
|
| 1201 |
@media (max-width: 960px) {
|
| 1202 |
+
.workspace[data-view="language-lessons"] .lessons-layout {
|
| 1203 |
grid-template-columns: 1fr;
|
| 1204 |
max-width: 640px;
|
| 1205 |
margin-left: auto;
|
| 1206 |
margin-right: auto;
|
| 1207 |
}
|
| 1208 |
|
| 1209 |
+
.workspace[data-view="language-lessons"] .lessons-rail .lessons-mode-cards {
|
| 1210 |
flex-direction: column;
|
| 1211 |
}
|
| 1212 |
|
| 1213 |
+
.workspace[data-view="language-lessons"] .lessons-rail .lessons-mode-cards .mode-card {
|
| 1214 |
flex: 1 1 auto;
|
| 1215 |
text-align: left;
|
| 1216 |
justify-content: space-between;
|
| 1217 |
}
|
| 1218 |
|
| 1219 |
+
.workspace[data-view="language-lessons"] .lessons-send-row {
|
| 1220 |
grid-template-columns: 1fr;
|
| 1221 |
}
|
| 1222 |
}
|
|
|
|
| 1421 |
max-width: 160px;
|
| 1422 |
}
|
| 1423 |
|
| 1424 |
+
.lessons-audio-out audio,
|
| 1425 |
.studio-coach-voiceout audio {
|
| 1426 |
width: 100%;
|
| 1427 |
margin-top: 0.5rem;
|
| 1428 |
}
|
| 1429 |
|
| 1430 |
+
.lessons-chat-messages {
|
| 1431 |
max-height: 220px;
|
| 1432 |
margin: 0.75rem 0;
|
| 1433 |
}
|
| 1434 |
|
| 1435 |
+
.lessons-rag-sources {
|
| 1436 |
margin: 0.75rem 0;
|
| 1437 |
}
|
| 1438 |
|
| 1439 |
+
.lessons-rag-sources summary {
|
| 1440 |
cursor: pointer;
|
| 1441 |
font-weight: 600;
|
| 1442 |
font-size: 0.875rem;
|
|
|
|
| 1450 |
color: var(--on-surface-variant);
|
| 1451 |
}
|
| 1452 |
|
| 1453 |
+
.lessons-replay-row {
|
| 1454 |
display: flex;
|
| 1455 |
flex-wrap: wrap;
|
| 1456 |
gap: 0.5rem;
|
| 1457 |
margin-top: 0.5rem;
|
| 1458 |
}
|
| 1459 |
|
| 1460 |
+
.lessons-replay-row .btn-ghost {
|
| 1461 |
margin-left: auto;
|
| 1462 |
}
|
| 1463 |
|
|
|
|
| 1578 |
max-height: 320px;
|
| 1579 |
}
|
| 1580 |
|
| 1581 |
+
.lessons-rail-card .field + .field {
|
| 1582 |
+
margin-top: 0.65rem;
|
| 1583 |
+
}
|
| 1584 |
+
|
| 1585 |
+
.lessons-input-toolbar {
|
| 1586 |
+
display: flex;
|
| 1587 |
+
flex-wrap: wrap;
|
| 1588 |
+
gap: 0.5rem;
|
| 1589 |
+
align-items: center;
|
| 1590 |
+
margin-top: 0.5rem;
|
| 1591 |
+
}
|
| 1592 |
+
|
| 1593 |
+
.lessons-hold-mic.is-recording {
|
| 1594 |
+
background: var(--primary-container);
|
| 1595 |
+
color: var(--on-primary-container);
|
| 1596 |
+
}
|
| 1597 |
+
|
| 1598 |
+
.lessons-upload-btn {
|
| 1599 |
+
cursor: pointer;
|
| 1600 |
+
display: inline-flex;
|
| 1601 |
+
align-items: center;
|
| 1602 |
+
gap: 0.35rem;
|
| 1603 |
+
}
|
| 1604 |
+
|
| 1605 |
+
.lessons-auto-speak {
|
| 1606 |
+
margin: 0;
|
| 1607 |
+
flex: 1 1 auto;
|
| 1608 |
+
justify-content: flex-end;
|
| 1609 |
+
}
|
| 1610 |
+
|
| 1611 |
+
.lessons-send-row {
|
| 1612 |
+
display: flex;
|
| 1613 |
+
flex-wrap: wrap;
|
| 1614 |
+
gap: 0.65rem;
|
| 1615 |
+
align-items: center;
|
| 1616 |
+
margin-top: 0.65rem;
|
| 1617 |
+
}
|
| 1618 |
+
|
| 1619 |
+
.lessons-chat-messages .chat-audio-inline {
|
| 1620 |
+
margin-top: 0.5rem;
|
| 1621 |
+
width: 100%;
|
| 1622 |
+
}
|
| 1623 |
+
|
| 1624 |
+
.lessons-classic-link {
|
| 1625 |
+
margin-top: 0.75rem;
|
| 1626 |
+
text-align: center;
|
| 1627 |
+
}
|
| 1628 |
+
|
| 1629 |
+
.lessons-classic-link a {
|
| 1630 |
+
color: var(--primary);
|
| 1631 |
+
}
|
| 1632 |
+
|
| 1633 |
+
.lessons-message-user::before {
|
| 1634 |
+
content: "You · ";
|
| 1635 |
+
font-weight: 600;
|
| 1636 |
+
opacity: 0.75;
|
| 1637 |
+
}
|
| 1638 |
+
|
| 1639 |
+
.lessons-message-assistant::before {
|
| 1640 |
+
content: "Teacher · ";
|
| 1641 |
+
font-weight: 600;
|
| 1642 |
+
opacity: 0.75;
|
| 1643 |
+
}
|
| 1644 |
+
|
| 1645 |
+
.btn-compact {
|
| 1646 |
+
padding-inline: 0.65rem;
|
| 1647 |
+
font-size: 0.82rem;
|
| 1648 |
+
}
|
| 1649 |
+
|
apps/gradio-space/static/studio/studio.js
CHANGED
|
@@ -40,11 +40,11 @@ const state = {
|
|
| 40 |
selectedUrls: [],
|
| 41 |
slideDiscoveredUrls: [],
|
| 42 |
slideSelectedUrls: [],
|
| 43 |
-
|
| 44 |
-
|
| 45 |
researchChatHistory: [],
|
| 46 |
debugChatHistory: [],
|
| 47 |
-
|
| 48 |
history: [],
|
| 49 |
downloads: null,
|
| 50 |
client: null,
|
|
@@ -55,9 +55,8 @@ const state = {
|
|
| 55 |
recordingTarget: null,
|
| 56 |
browserRecorder: null,
|
| 57 |
browserRecordChunks: [],
|
| 58 |
-
|
| 59 |
-
|
| 60 |
-
lastPitchAnalysis: null,
|
| 61 |
useBrowserMic: true,
|
| 62 |
};
|
| 63 |
|
|
@@ -223,16 +222,37 @@ function renderResearchUrlChoices(urls, selected) {
|
|
| 223 |
if (getIngestWorkflow() === "select") panel?.classList.remove("hidden");
|
| 224 |
}
|
| 225 |
|
| 226 |
-
function
|
| 227 |
-
|
| 228 |
-
return effectiveTopic($("#voice-topic")?.value || "");
|
| 229 |
}
|
| 230 |
|
| 231 |
-
function
|
| 232 |
-
return $("#use-rag").checked
|
| 233 |
}
|
| 234 |
|
| 235 |
-
function
|
| 236 |
if (content == null) return "";
|
| 237 |
if (typeof content === "string") return content;
|
| 238 |
if (Array.isArray(content)) {
|
|
@@ -253,8 +273,14 @@ function ingestSucceeded(status) {
|
|
| 253 |
);
|
| 254 |
}
|
| 255 |
|
| 256 |
-
function
|
| 257 |
-
|
| 258 |
state.workspaceSessionId = data.session_id || state.workspaceSessionId;
|
| 259 |
$("#workspace-session").value = state.workspaceSessionId;
|
| 260 |
if (data.documents_html) {
|
|
@@ -264,20 +290,21 @@ function applyVoiceIngestResult(data) {
|
|
| 264 |
updateResearchRagBadge();
|
| 265 |
updateResearchDocCount((data.documents || []).length);
|
| 266 |
if (ingestSucceeded(data.status)) {
|
| 267 |
-
$("#use-rag")
|
| 268 |
}
|
| 269 |
}
|
| 270 |
|
| 271 |
-
async function
|
| 272 |
-
const topic =
|
| 273 |
if (!topic) {
|
| 274 |
-
showError("Set a
|
| 275 |
return;
|
| 276 |
}
|
| 277 |
-
await withRegionLoading($(".
|
| 278 |
const data = await callApi("discover_sources", [topic, state.workspaceSessionId]);
|
| 279 |
-
$("#
|
| 280 |
-
|
| 281 |
if (data.session_id) {
|
| 282 |
state.workspaceSessionId = data.session_id;
|
| 283 |
$("#workspace-session").value = data.session_id;
|
|
@@ -286,32 +313,32 @@ async function discoverVoiceSources() {
|
|
| 286 |
});
|
| 287 |
}
|
| 288 |
|
| 289 |
-
async function
|
| 290 |
-
const topic =
|
| 291 |
if (!topic) {
|
| 292 |
-
showError("Set a
|
| 293 |
return;
|
| 294 |
}
|
| 295 |
-
await withRegionLoading($(".
|
| 296 |
const data = await callApi("auto_search_ingest", [topic, state.workspaceSessionId]);
|
| 297 |
-
|
| 298 |
-
state.
|
| 299 |
-
state.
|
| 300 |
-
|
| 301 |
await refreshWorkspaceSessions(state.workspaceSessionId);
|
| 302 |
});
|
| 303 |
}
|
| 304 |
|
| 305 |
-
async function
|
| 306 |
-
const topic =
|
| 307 |
-
const pasted = $("#
|
| 308 |
-
const selected = getSelectedDiscoveredUrls("#
|
| 309 |
-
const files = $("#
|
| 310 |
if (!pasted && !selected.length && !files?.length) {
|
| 311 |
showError("Add URLs, select suggested sources, or upload a file — then ingest.");
|
| 312 |
return;
|
| 313 |
}
|
| 314 |
-
await withRegionLoading($(".
|
| 315 |
const paths = [];
|
| 316 |
if (files?.length) {
|
| 317 |
for (const file of files) {
|
|
@@ -325,35 +352,40 @@ async function ingestVoiceSources() {
|
|
| 325 |
selected,
|
| 326 |
paths,
|
| 327 |
]);
|
| 328 |
-
|
| 329 |
-
if (pasted) $("#
|
| 330 |
-
if (files?.length) $("#
|
| 331 |
await refreshWorkspaceSessions(state.workspaceSessionId);
|
| 332 |
});
|
| 333 |
}
|
| 334 |
|
| 335 |
-
function
|
| 336 |
-
const ragMode = state.voiceMode === "explain" || state.voiceMode === "lesson";
|
| 337 |
-
const practiceMode = state.voiceMode === "pitch";
|
| 338 |
-
$("#voice-topic-wrap")?.classList.toggle("hidden", !ragMode);
|
| 339 |
-
$("#voice-rag-sources")?.classList.toggle("hidden", !ragMode);
|
| 340 |
-
$(".voice-rag-card")?.classList.toggle("hidden", practiceMode);
|
| 341 |
-
$("#voice-pitch-analysis")?.classList.toggle("hidden", !practiceMode);
|
| 342 |
const placeholders = {
|
| 343 |
explain: "e.g. How does finetuning differ from pretraining?",
|
| 344 |
lesson: "What is the difference between pretraining and finetuning a small model?",
|
| 345 |
-
pitch: "e.g. Here is my opening line — how can I improve it?",
|
| 346 |
};
|
| 347 |
-
const messageEl = $("#
|
| 348 |
-
if (messageEl) messageEl.placeholder = placeholders[state.
|
| 349 |
}
|
| 350 |
|
| 351 |
-
function
|
| 352 |
-
const
|
| 353 |
if (!container) return;
|
| 354 |
if (!state.history.length) {
|
| 355 |
container.innerHTML =
|
| 356 |
-
'<p class="research-chat-empty">
|
| 357 |
return;
|
| 358 |
}
|
| 359 |
const parts = [];
|
|
@@ -361,9 +393,13 @@ function renderVoiceChat() {
|
|
| 361 |
if (item && typeof item === "object" && item.role) {
|
| 362 |
const role = item.role === "user" ? "user" : "assistant";
|
| 363 |
const label = role === "user" ? "You" : "Teacher";
|
| 364 |
-
let body = renderMarkdownLite(
|
| 365 |
if (role === "assistant" && item.rag_references) {
|
| 366 |
-
body += `<div class="
|
| 367 |
}
|
| 368 |
parts.push(
|
| 369 |
`<div class="research-chat-bubble research-chat-${role}"><div class="research-chat-role">${label}</div><div class="research-chat-body">${body}</div></div>`
|
|
@@ -380,18 +416,50 @@ function renderVoiceChat() {
|
|
| 380 |
container.scrollTop = container.scrollHeight;
|
| 381 |
}
|
| 382 |
|
| 383 |
-
function
|
| 384 |
-
state.
|
| 385 |
-
state.
|
| 386 |
renderUrlChoices(
|
| 387 |
urls,
|
| 388 |
selected,
|
| 389 |
-
"#
|
| 390 |
-
"#
|
| 391 |
-
{ discovered: state.
|
| 392 |
);
|
| 393 |
}
|
| 394 |
|
| 395 |
function renderSlideUrlChoices(urls, selected) {
|
| 396 |
state.slideDiscoveredUrls = urls || [];
|
| 397 |
state.slideSelectedUrls = selected?.length ? selected : [...state.slideDiscoveredUrls];
|
|
@@ -961,23 +1029,30 @@ async function refreshDocuments() {
|
|
| 961 |
}
|
| 962 |
}
|
| 963 |
|
| 964 |
-
async function
|
| 965 |
const data = await callApi("voice_presets", []);
|
| 966 |
state.voicePresets = data;
|
| 967 |
-
const langSelect = $("#
|
| 968 |
-
const asrSelect = $("#coach-asr");
|
| 969 |
if (langSelect) {
|
| 970 |
-
|
| 971 |
.map((o) => `<option value="${o.value}">${o.label}</option>`)
|
| 972 |
.join("");
|
| 973 |
langSelect.value = data.default_language || "en";
|
| 974 |
}
|
| 975 |
-
|
| 976 |
-
|
| 977 |
-
|
| 978 |
-
|
| 979 |
-
|
| 980 |
}
|
| 981 |
}
|
| 982 |
|
| 983 |
async function initSettings() {
|
|
@@ -1033,10 +1108,10 @@ async function initWorkspace() {
|
|
| 1033 |
updateResearchRagBadge();
|
| 1034 |
await refreshWorkspaceSessions();
|
| 1035 |
await refreshDocuments();
|
| 1036 |
-
await
|
| 1037 |
await initSettings();
|
| 1038 |
-
|
| 1039 |
-
|
| 1040 |
await refreshDebugDocuments();
|
| 1041 |
const recStatus = await callApi("recording_status", []);
|
| 1042 |
state.useBrowserMic = !recStatus.backend || /unavailable|no capture/i.test(recStatus.message || "");
|
|
@@ -1055,7 +1130,7 @@ async function generateSlides() {
|
|
| 1055 |
const topic = effectiveTopic($("#lesson-topic").value);
|
| 1056 |
const grade = $("#lesson-grade").value;
|
| 1057 |
const slideCount = Number($("#slide-count").value);
|
| 1058 |
-
const useRag = $("#use-rag").checked;
|
| 1059 |
const docIds = effectiveDocIds([]);
|
| 1060 |
const sourceMode = $("#slide-source-mode")?.value || "";
|
| 1061 |
const searchWorkflow = $("#slide-search-workflow")?.value || "two_step";
|
|
@@ -1155,63 +1230,41 @@ async function generateSlides() {
|
|
| 1155 |
);
|
| 1156 |
}
|
| 1157 |
|
| 1158 |
-
function
|
| 1159 |
state.history = data.history ?? state.history;
|
| 1160 |
-
if (
|
| 1161 |
const last = state.history[state.history.length - 1];
|
| 1162 |
if (last && typeof last === "object" && last.role === "assistant") {
|
| 1163 |
-
last.rag_references = data.rag_references;
|
| 1164 |
}
|
| 1165 |
}
|
| 1166 |
-
|
| 1167 |
if (data.status) {
|
| 1168 |
-
$("#
|
| 1169 |
-
|
| 1170 |
-
const out = $("#voice-audio-out");
|
| 1171 |
-
if (data.voiceout_path) {
|
| 1172 |
-
out.innerHTML = `<audio controls src="${fileUrl(data.voiceout_path)}"></audio>`;
|
| 1173 |
-
} else if (!keepAudio) {
|
| 1174 |
-
out.innerHTML = "";
|
| 1175 |
}
|
| 1176 |
}
|
| 1177 |
|
| 1178 |
-
|
| 1179 |
-
|
| 1180 |
-
if (!message) {
|
| 1181 |
-
showError("Enter a message first.");
|
| 1182 |
-
return;
|
| 1183 |
-
}
|
| 1184 |
-
const topic = voiceEffectiveTopic();
|
| 1185 |
-
const useRag = voiceUseRag();
|
| 1186 |
-
const docIds = effectiveDocIds([]);
|
| 1187 |
-
const language = state.voicePresets?.default_language || "en";
|
| 1188 |
-
await withRegionLoading($(".voice-main-card"), "Teacher is thinking…", async () => {
|
| 1189 |
-
const data = await callApi("teacher_voice_turn", [
|
| 1190 |
-
message,
|
| 1191 |
-
state.voiceMode,
|
| 1192 |
-
topic,
|
| 1193 |
-
state.workspaceSessionId,
|
| 1194 |
-
useRag,
|
| 1195 |
-
state.history,
|
| 1196 |
-
docIds,
|
| 1197 |
-
language,
|
| 1198 |
-
null,
|
| 1199 |
-
]);
|
| 1200 |
-
$("#voice-message").value = "";
|
| 1201 |
-
renderVoiceReply(data);
|
| 1202 |
-
});
|
| 1203 |
}
|
| 1204 |
|
| 1205 |
-
async function
|
| 1206 |
-
const topic =
|
| 1207 |
-
const useRag =
|
| 1208 |
const docIds = effectiveDocIds([]);
|
| 1209 |
-
const language =
|
| 1210 |
const asr = state.voicePresets?.default_asr || null;
|
| 1211 |
-
|
| 1212 |
-
|
| 1213 |
-
|
| 1214 |
-
|
| 1215 |
topic,
|
| 1216 |
state.workspaceSessionId,
|
| 1217 |
useRag,
|
|
@@ -1219,81 +1272,100 @@ async function sendVoiceAudioTurn(audioPath) {
|
|
| 1219 |
docIds,
|
| 1220 |
language,
|
| 1221 |
asr,
|
| 1222 |
]);
|
| 1223 |
-
if (data.user_text)
|
| 1224 |
-
|
| 1225 |
});
|
| 1226 |
}
|
| 1227 |
|
| 1228 |
-
async function
|
| 1229 |
-
const
|
| 1230 |
-
|
| 1231 |
-
$("#
|
| 1232 |
-
if (
|
| 1233 |
-
|
| 1234 |
}
|
| 1235 |
}
|
| 1236 |
|
| 1237 |
-
async function
|
| 1238 |
const data = await callApi("teacher_voice_clear", []);
|
| 1239 |
state.history = [];
|
| 1240 |
-
|
| 1241 |
-
$("#
|
| 1242 |
-
$("#
|
| 1243 |
-
|
| 1244 |
-
}
|
| 1245 |
-
|
| 1246 |
-
async function loadSamplePitch() {
|
| 1247 |
-
const data = await callApi("load_sample_pitch", []);
|
| 1248 |
-
state.pendingCoachAudioPath = data.audio_path;
|
| 1249 |
-
$("#coach-record-status").textContent = stripMd(data.status || "Sample clip loaded.");
|
| 1250 |
-
}
|
| 1251 |
-
|
| 1252 |
-
async function analyzePitchWithPath(audioPath) {
|
| 1253 |
-
const language = $("#coach-language")?.value || "en";
|
| 1254 |
-
const asr = $("#coach-asr")?.value || null;
|
| 1255 |
-
const speakRewrite = $("#coach-speak-rewrite")?.checked || false;
|
| 1256 |
-
await withRegionLoading($("#voice-pitch-analysis"), "Analyzing pitch…", async () => {
|
| 1257 |
-
const data = await callApi("analyze_pitch", [audioPath, language, asr, speakRewrite]);
|
| 1258 |
-
state.lastPitchAnalysis = data;
|
| 1259 |
-
const panel = $("#coach-panel");
|
| 1260 |
-
panel.innerHTML = data.coach_panel_html || "";
|
| 1261 |
-
const discussBtn = document.createElement("button");
|
| 1262 |
-
discussBtn.type = "button";
|
| 1263 |
-
discussBtn.className = "btn btn-secondary voice-discuss-btn";
|
| 1264 |
-
discussBtn.textContent = "Discuss in chat";
|
| 1265 |
-
discussBtn.addEventListener("click", () => discussPitchInChat().catch(() => {}));
|
| 1266 |
-
if (data.transcript_html || data.report_md || data.tip) {
|
| 1267 |
-
panel.appendChild(discussBtn);
|
| 1268 |
-
}
|
| 1269 |
-
});
|
| 1270 |
}
|
| 1271 |
|
| 1272 |
-
function
|
| 1273 |
-
|
| 1274 |
-
|
| 1275 |
-
|
| 1276 |
-
|
| 1277 |
-
if (
|
| 1278 |
-
|
| 1279 |
-
|
| 1280 |
-
|
| 1281 |
-
|
| 1282 |
-
|
| 1283 |
-
|
| 1284 |
-
|
| 1285 |
-
|
| 1286 |
-
|
| 1287 |
-
|
| 1288 |
-
|
| 1289 |
-
|
| 1290 |
-
|
| 1291 |
if (file) path = await uploadFile(file);
|
| 1292 |
if (!path) {
|
| 1293 |
-
showError("Record or upload audio
|
| 1294 |
return;
|
| 1295 |
}
|
| 1296 |
-
await
|
| 1297 |
}
|
| 1298 |
|
| 1299 |
async function startBrowserRecording(statusEl) {
|
|
@@ -1367,23 +1439,11 @@ async function stopRecording(statusEl, startBtn, stopBtn) {
|
|
| 1367 |
path = data.path;
|
| 1368 |
if (statusEl) statusEl.textContent = stripMd(data.status || "Recording saved.");
|
| 1369 |
}
|
| 1370 |
-
if (state.recordingTarget === "
|
| 1371 |
-
if (state.recordingTarget === "coach") state.pendingCoachAudioPath = path;
|
| 1372 |
state.recordingTarget = null;
|
| 1373 |
return path;
|
| 1374 |
}
|
| 1375 |
|
| 1376 |
-
async function sendVoiceFromRecording() {
|
| 1377 |
-
let path = state.pendingVoiceAudioPath;
|
| 1378 |
-
const file = $("#voice-audio-upload").files?.[0];
|
| 1379 |
-
if (file) path = await uploadFile(file);
|
| 1380 |
-
if (!path) {
|
| 1381 |
-
showError("Record or upload audio first.");
|
| 1382 |
-
return;
|
| 1383 |
-
}
|
| 1384 |
-
await sendVoiceAudioTurn(path);
|
| 1385 |
-
}
|
| 1386 |
-
|
| 1387 |
function bindUi() {
|
| 1388 |
$("#slide-count").addEventListener("input", (e) => {
|
| 1389 |
$("#slide-count-val").textContent = e.target.value;
|
|
@@ -1450,18 +1510,52 @@ function bindUi() {
|
|
| 1450 |
});
|
| 1451 |
|
| 1452 |
$("#btn-generate").addEventListener("click", () => generateSlides().catch(() => {}));
|
| 1453 |
-
|
| 1454 |
-
$("#btn-
|
| 1455 |
-
$("#
|
| 1456 |
-
|
| 1457 |
-
|
| 1458 |
-
|
| 1459 |
-
|
| 1460 |
-
|
| 1461 |
-
$("#btn-
|
| 1462 |
-
$("#btn-
|
| 1463 |
-
$("#btn-
|
| 1464 |
$("#btn-debug-send").addEventListener("click", () => sendDebugMessage().catch(() => {}));
|
| 1465 |
$("#debug-session")?.addEventListener("change", () => refreshDebugDocuments().catch(() => {}));
|
| 1466 |
$("#debug-refresh-sessions")?.addEventListener("click", () => {
|
| 1467 |
refreshDebugSessions().catch(() => {});
|
|
@@ -1475,19 +1569,6 @@ function bindUi() {
|
|
| 1475 |
}
|
| 1476 |
});
|
| 1477 |
|
| 1478 |
-
$("#btn-voice-record-start")?.addEventListener("click", () =>
|
| 1479 |
-
startRecording("voice", $("#voice-record-status"), $("#btn-voice-record-start"), $("#btn-voice-record-stop")).catch(() => {})
|
| 1480 |
-
);
|
| 1481 |
-
$("#btn-voice-record-stop")?.addEventListener("click", () =>
|
| 1482 |
-
stopRecording($("#voice-record-status"), $("#btn-voice-record-start"), $("#btn-voice-record-stop")).catch(() => {})
|
| 1483 |
-
);
|
| 1484 |
-
$("#btn-coach-record-start")?.addEventListener("click", () =>
|
| 1485 |
-
startRecording("coach", $("#coach-record-status"), $("#btn-coach-record-start"), $("#btn-coach-record-stop")).catch(() => {})
|
| 1486 |
-
);
|
| 1487 |
-
$("#btn-coach-record-stop")?.addEventListener("click", () =>
|
| 1488 |
-
stopRecording($("#coach-record-status"), $("#btn-coach-record-start"), $("#btn-coach-record-stop")).catch(() => {})
|
| 1489 |
-
);
|
| 1490 |
-
|
| 1491 |
$("#btn-export").addEventListener("click", () => {
|
| 1492 |
const p = state.downloads?.pptx;
|
| 1493 |
if (p) window.open(fileUrl(p), "_blank");
|
|
@@ -1506,16 +1587,16 @@ function bindUi() {
|
|
| 1506 |
refreshDocuments().catch(() => {});
|
| 1507 |
});
|
| 1508 |
|
| 1509 |
-
document.querySelectorAll(".mode-card").forEach((btn) => {
|
| 1510 |
btn.addEventListener("click", () => {
|
| 1511 |
-
document.querySelectorAll(".mode-card").forEach((b) => b.classList.remove("active"));
|
| 1512 |
btn.classList.add("active");
|
| 1513 |
-
state.
|
| 1514 |
-
|
| 1515 |
});
|
| 1516 |
});
|
| 1517 |
|
| 1518 |
-
|
| 1519 |
}
|
| 1520 |
|
| 1521 |
bindUi();
|
|
|
|
| 40 |
selectedUrls: [],
|
| 41 |
slideDiscoveredUrls: [],
|
| 42 |
slideSelectedUrls: [],
|
| 43 |
+
lessonsDiscoveredUrls: [],
|
| 44 |
+
lessonsSelectedUrls: [],
|
| 45 |
researchChatHistory: [],
|
| 46 |
debugChatHistory: [],
|
| 47 |
+
lessonsMode: "lesson",
|
| 48 |
history: [],
|
| 49 |
downloads: null,
|
| 50 |
client: null,
|
|
|
|
| 55 |
recordingTarget: null,
|
| 56 |
browserRecorder: null,
|
| 57 |
browserRecordChunks: [],
|
| 58 |
+
pendingLessonsAudioPath: null,
|
| 59 |
+
holdMicActive: false,
|
|
|
|
| 60 |
useBrowserMic: true,
|
| 61 |
};
|
| 62 |
|
|
|
|
| 222 |
if (getIngestWorkflow() === "select") panel?.classList.remove("hidden");
|
| 223 |
}
|
| 224 |
|
| 225 |
+
function lessonsEffectiveTopic() {
|
| 226 |
+
return effectiveTopic($("#lessons-topic")?.value || "");
|
|
|
|
| 227 |
}
|
| 228 |
|
| 229 |
+
function lessonsUseRag() {
|
| 230 |
+
return Boolean($("#lessons-use-rag")?.checked);
|
| 231 |
}
|
| 232 |
|
| 233 |
+
function lessonsLanguage() {
|
| 234 |
+
const select = $("#lessons-language");
|
| 235 |
+
if (!select) return "en";
|
| 236 |
+
if (select.value === "other") {
|
| 237 |
+
return ($("#lessons-other-lang")?.value.trim() || "en").toLowerCase();
|
| 238 |
+
}
|
| 239 |
+
return select.value || "en";
|
| 240 |
+
}
|
| 241 |
+
|
| 242 |
+
function lessonsCoachVariant() {
|
| 243 |
+
return $("#lessons-coach-variant")?.value || "tiny-aya-global";
|
| 244 |
+
}
|
| 245 |
+
|
| 246 |
+
function lessonsAutoSpeak() {
|
| 247 |
+
return Boolean($("#lessons-auto-speak")?.checked);
|
| 248 |
+
}
|
| 249 |
+
|
| 250 |
+
function lessonsHasVoiceOut(language) {
|
| 251 |
+
const code = (language || "en").split("-")[0];
|
| 252 |
+
return (state.voicePresets?.voice_languages || []).includes(code);
|
| 253 |
+
}
|
| 254 |
+
|
| 255 |
+
function chatMessageText(content) {
|
| 256 |
if (content == null) return "";
|
| 257 |
if (typeof content === "string") return content;
|
| 258 |
if (Array.isArray(content)) {
|
|
|
|
| 273 |
);
|
| 274 |
}
|
| 275 |
|
| 276 |
+
function chatMessageAudio(content) {
|
| 277 |
+
if (!Array.isArray(content)) return null;
|
| 278 |
+
const filePart = content.find((part) => part && typeof part === "object" && part.path);
|
| 279 |
+
return filePart?.path || null;
|
| 280 |
+
}
|
| 281 |
+
|
| 282 |
+
function applyLessonsIngestResult(data) {
|
| 283 |
+
$("#lessons-ingest-status").textContent = stripMd(data.status || "Ingest complete.");
|
| 284 |
state.workspaceSessionId = data.session_id || state.workspaceSessionId;
|
| 285 |
$("#workspace-session").value = state.workspaceSessionId;
|
| 286 |
if (data.documents_html) {
|
|
|
|
| 290 |
updateResearchRagBadge();
|
| 291 |
updateResearchDocCount((data.documents || []).length);
|
| 292 |
if (ingestSucceeded(data.status)) {
|
| 293 |
+
const rag = $("#lessons-use-rag");
|
| 294 |
+
if (rag) rag.checked = true;
|
| 295 |
}
|
| 296 |
}
|
| 297 |
|
| 298 |
+
async function discoverLessonsSources() {
|
| 299 |
+
const topic = lessonsEffectiveTopic();
|
| 300 |
if (!topic) {
|
| 301 |
+
showError("Set a lesson or workspace topic before discovering sources.");
|
| 302 |
return;
|
| 303 |
}
|
| 304 |
+
await withRegionLoading($(".lessons-rail-controls"), "Discovering sources…", async () => {
|
| 305 |
const data = await callApi("discover_sources", [topic, state.workspaceSessionId]);
|
| 306 |
+
$("#lessons-ingest-status").textContent = stripMd(data.status || "Discovery complete.");
|
| 307 |
+
renderLessonsUrlChoices(data.urls || [], data.selected_urls || data.urls || []);
|
| 308 |
if (data.session_id) {
|
| 309 |
state.workspaceSessionId = data.session_id;
|
| 310 |
$("#workspace-session").value = data.session_id;
|
|
|
|
| 313 |
});
|
| 314 |
}
|
| 315 |
|
| 316 |
+
async function autoLessonsIngest() {
|
| 317 |
+
const topic = lessonsEffectiveTopic();
|
| 318 |
if (!topic) {
|
| 319 |
+
showError("Set a lesson or workspace topic before auto-ingest.");
|
| 320 |
return;
|
| 321 |
}
|
| 322 |
+
await withRegionLoading($(".lessons-rail-controls"), "Auto-ingesting sources…", async () => {
|
| 323 |
const data = await callApi("auto_search_ingest", [topic, state.workspaceSessionId]);
|
| 324 |
+
applyLessonsIngestResult(data);
|
| 325 |
+
state.lessonsDiscoveredUrls = [];
|
| 326 |
+
state.lessonsSelectedUrls = [];
|
| 327 |
+
renderLessonsUrlChoices([], []);
|
| 328 |
await refreshWorkspaceSessions(state.workspaceSessionId);
|
| 329 |
});
|
| 330 |
}
|
| 331 |
|
| 332 |
+
async function ingestLessonsSources() {
|
| 333 |
+
const topic = lessonsEffectiveTopic();
|
| 334 |
+
const pasted = $("#lessons-urls-text")?.value.trim() || "";
|
| 335 |
+
const selected = getSelectedDiscoveredUrls("#lessons-url-choices-list");
|
| 336 |
+
const files = $("#lessons-ingest-file")?.files;
|
| 337 |
if (!pasted && !selected.length && !files?.length) {
|
| 338 |
showError("Add URLs, select suggested sources, or upload a file — then ingest.");
|
| 339 |
return;
|
| 340 |
}
|
| 341 |
+
await withRegionLoading($(".lessons-rail-controls"), "Ingesting sources…", async () => {
|
| 342 |
const paths = [];
|
| 343 |
if (files?.length) {
|
| 344 |
for (const file of files) {
|
|
|
|
| 352 |
selected,
|
| 353 |
paths,
|
| 354 |
]);
|
| 355 |
+
applyLessonsIngestResult(data);
|
| 356 |
+
if (pasted) $("#lessons-urls-text").value = "";
|
| 357 |
+
if (files?.length) $("#lessons-ingest-file").value = "";
|
| 358 |
await refreshWorkspaceSessions(state.workspaceSessionId);
|
| 359 |
});
|
| 360 |
}
|
| 361 |
|
| 362 |
+
function syncLessonsModeUi() {
|
| 363 |
const placeholders = {
|
| 364 |
explain: "e.g. How does finetuning differ from pretraining?",
|
| 365 |
lesson: "What is the difference between pretraining and finetuning a small model?",
|
|
|
|
| 366 |
};
|
| 367 |
+
const messageEl = $("#lessons-message");
|
| 368 |
+
if (messageEl) messageEl.placeholder = placeholders[state.lessonsMode] || placeholders.lesson;
|
| 369 |
}
|
| 370 |
|
| 371 |
+
function syncLessonsLanguageUi() {
|
| 372 |
+
const isOther = $("#lessons-language")?.value === "other";
|
| 373 |
+
$("#lessons-other-lang-wrap")?.classList.toggle("hidden", !isOther);
|
| 374 |
+
const lang = lessonsLanguage();
|
| 375 |
+
const note = state.voicePresets?.voiceout_note || "";
|
| 376 |
+
const voiceHint = lessonsHasVoiceOut(lang)
|
| 377 |
+
? note
|
| 378 |
+
: "VoiceOut not available for this language — text replies only.";
|
| 379 |
+
const noteEl = $("#lessons-voiceout-note");
|
| 380 |
+
if (noteEl) noteEl.textContent = voiceHint;
|
| 381 |
+
}
|
| 382 |
+
|
| 383 |
+
function renderLessonsChat() {
|
| 384 |
+
const container = $("#lessons-chat-messages");
|
| 385 |
if (!container) return;
|
| 386 |
if (!state.history.length) {
|
| 387 |
container.innerHTML =
|
| 388 |
+
'<p class="research-chat-empty">Choose a language, then type, speak, or upload audio to start your lesson.</p>';
|
| 389 |
return;
|
| 390 |
}
|
| 391 |
const parts = [];
|
|
|
|
| 393 |
if (item && typeof item === "object" && item.role) {
|
| 394 |
const role = item.role === "user" ? "user" : "assistant";
|
| 395 |
const label = role === "user" ? "You" : "Teacher";
|
| 396 |
+
let body = renderMarkdownLite(chatMessageText(item.content));
|
| 397 |
+
const audioPath = chatMessageAudio(item.content) || item.voiceout_path || null;
|
| 398 |
+
if (audioPath) {
|
| 399 |
+
body += `<audio class="chat-audio-inline" controls autoplay src="${fileUrl(audioPath)}"></audio>`;
|
| 400 |
+
}
|
| 401 |
if (role === "assistant" && item.rag_references) {
|
| 402 |
+
body += `<div class="lessons-rag-refs">${renderMarkdownLite(item.rag_references)}</div>`;
|
| 403 |
}
|
| 404 |
parts.push(
|
| 405 |
`<div class="research-chat-bubble research-chat-${role}"><div class="research-chat-role">${label}</div><div class="research-chat-body">${body}</div></div>`
|
|
|
|
| 416 |
container.scrollTop = container.scrollHeight;
|
| 417 |
}
|
| 418 |
|
| 419 |
+
function renderLessonsUrlChoices(urls, selected) {
|
| 420 |
+
state.lessonsDiscoveredUrls = urls || [];
|
| 421 |
+
state.lessonsSelectedUrls = selected?.length ? selected : [...state.lessonsDiscoveredUrls];
|
| 422 |
renderUrlChoices(
|
| 423 |
urls,
|
| 424 |
selected,
|
| 425 |
+
"#lessons-url-choices-list",
|
| 426 |
+
"#lessons-url-choices-panel",
|
| 427 |
+
{ discovered: state.lessonsDiscoveredUrls, selected: state.lessonsSelectedUrls }
|
| 428 |
);
|
| 429 |
}
|
| 430 |
|
| 431 |
+
function applyVoiceIngestResult(data) {
|
| 432 |
+
applyLessonsIngestResult(data);
|
| 433 |
+
}
|
| 434 |
+
|
| 435 |
+
async function discoverVoiceSources() {
|
| 436 |
+
return discoverLessonsSources();
|
| 437 |
+
}
|
| 438 |
+
|
| 439 |
+
async function autoVoiceIngest() {
|
| 440 |
+
return autoLessonsIngest();
|
| 441 |
+
}
|
| 442 |
+
|
| 443 |
+
async function ingestVoiceSources() {
|
| 444 |
+
return ingestLessonsSources();
|
| 445 |
+
}
|
| 446 |
+
|
| 447 |
+
function syncVoiceModeUi() {
|
| 448 |
+
syncLessonsModeUi();
|
| 449 |
+
}
|
| 450 |
+
|
| 451 |
+
function renderVoiceChat() {
|
| 452 |
+
renderLessonsChat();
|
| 453 |
+
}
|
| 454 |
+
|
| 455 |
+
function renderVoiceUrlChoices(urls, selected) {
|
| 456 |
+
renderLessonsUrlChoices(urls, selected);
|
| 457 |
+
}
|
| 458 |
+
|
| 459 |
+
function voiceMessageText(content) {
|
| 460 |
+
return chatMessageText(content);
|
| 461 |
+
}
|
| 462 |
+
|
| 463 |
function renderSlideUrlChoices(urls, selected) {
|
| 464 |
state.slideDiscoveredUrls = urls || [];
|
| 465 |
state.slideSelectedUrls = selected?.length ? selected : [...state.slideDiscoveredUrls];
|
|
|
|
| 1029 |
}
|
| 1030 |
}
|
| 1031 |
|
| 1032 |
+
async function initLanguageLessons() {
|
| 1033 |
const data = await callApi("voice_presets", []);
|
| 1034 |
state.voicePresets = data;
|
| 1035 |
+
const langSelect = $("#lessons-language");
|
|
|
|
| 1036 |
if (langSelect) {
|
| 1037 |
+
const opts = (data.languages || [])
|
| 1038 |
.map((o) => `<option value="${o.value}">${o.label}</option>`)
|
| 1039 |
.join("");
|
| 1040 |
+
langSelect.innerHTML = `${opts}<option value="other">Other (text only)</option>`;
|
| 1041 |
langSelect.value = data.default_language || "en";
|
| 1042 |
}
|
| 1043 |
+
const coachEl = document.querySelector(".lessons-coach-model");
|
| 1044 |
+
if (coachEl && data.coach_chain_labels?.length) {
|
| 1045 |
+
const primary = data.coach_chain_labels[0];
|
| 1046 |
+
const fallback = data.coach_chain_labels[1];
|
| 1047 |
+
coachEl.textContent = fallback
|
| 1048 |
+
? `Coach: ${primary} (auto-fallback: ${fallback})`
|
| 1049 |
+
: `Coach: ${primary}`;
|
| 1050 |
}
|
| 1051 |
+
syncLessonsLanguageUi();
|
| 1052 |
+
}
|
| 1053 |
+
|
| 1054 |
+
async function initVoicePresets() {
|
| 1055 |
+
return initLanguageLessons();
|
| 1056 |
}
|
| 1057 |
|
| 1058 |
async function initSettings() {
|
|
|
|
| 1108 |
updateResearchRagBadge();
|
| 1109 |
await refreshWorkspaceSessions();
|
| 1110 |
await refreshDocuments();
|
| 1111 |
+
await initLanguageLessons();
|
| 1112 |
await initSettings();
|
| 1113 |
+
syncLessonsModeUi();
|
| 1114 |
+
renderLessonsChat();
|
| 1115 |
await refreshDebugDocuments();
|
| 1116 |
const recStatus = await callApi("recording_status", []);
|
| 1117 |
state.useBrowserMic = !recStatus.backend || /unavailable|no capture/i.test(recStatus.message || "");
|
|
|
|
| 1130 |
const topic = effectiveTopic($("#lesson-topic").value);
|
| 1131 |
const grade = $("#lesson-grade").value;
|
| 1132 |
const slideCount = Number($("#slide-count").value);
|
| 1133 |
+
const useRag = Boolean($("#lessons-use-rag")?.checked);
|
| 1134 |
const docIds = effectiveDocIds([]);
|
| 1135 |
const sourceMode = $("#slide-source-mode")?.value || "";
|
| 1136 |
const searchWorkflow = $("#slide-search-workflow")?.value || "two_step";
|
|
|
|
| 1230 |
);
|
| 1231 |
}
|
| 1232 |
|
| 1233 |
+
function renderLessonsReply(data) {
|
| 1234 |
state.history = data.history ?? state.history;
|
| 1235 |
+
if (state.history.length) {
|
| 1236 |
const last = state.history[state.history.length - 1];
|
| 1237 |
if (last && typeof last === "object" && last.role === "assistant") {
|
| 1238 |
+
if (data.rag_references) last.rag_references = data.rag_references;
|
| 1239 |
+
if (data.voiceout_path && lessonsAutoSpeak()) last.voiceout_path = data.voiceout_path;
|
| 1240 |
}
|
| 1241 |
}
|
| 1242 |
+
renderLessonsChat();
|
| 1243 |
if (data.status) {
|
| 1244 |
+
const statusEl = $("#lessons-turn-status");
|
| 1245 |
+
if (statusEl) statusEl.textContent = stripMd(data.status);
|
| 1246 |
}
|
| 1247 |
}
|
| 1248 |
|
| 1249 |
+
function renderVoiceReply(data, options) {
|
| 1250 |
+
renderLessonsReply(data, options);
|
| 1251 |
}
|
| 1252 |
|
| 1253 |
+
async function sendLanguageLessonTurn({ message = "", audioPath = "" } = {}) {
|
| 1254 |
+
const topic = lessonsEffectiveTopic();
|
| 1255 |
+
const useRag = lessonsUseRag();
|
| 1256 |
const docIds = effectiveDocIds([]);
|
| 1257 |
+
const language = lessonsLanguage();
|
| 1258 |
const asr = state.voicePresets?.default_asr || null;
|
| 1259 |
+
const autoVoiceout = lessonsAutoSpeak() && lessonsHasVoiceOut(language);
|
| 1260 |
+
const coachVariant = lessonsCoachVariant();
|
| 1261 |
+
const loadingLabel = message || audioPath ? (message ? "Teacher is thinking…" : "Processing audio…") : "Sending…";
|
| 1262 |
+
|
| 1263 |
+
await withRegionLoading($(".lessons-main-card"), loadingLabel, async () => {
|
| 1264 |
+
const data = await callApi("language_lesson_turn", [
|
| 1265 |
+
message,
|
| 1266 |
+
audioPath || "",
|
| 1267 |
+
state.lessonsMode,
|
| 1268 |
topic,
|
| 1269 |
state.workspaceSessionId,
|
| 1270 |
useRag,
|
|
|
|
| 1272 |
docIds,
|
| 1273 |
language,
|
| 1274 |
asr,
|
| 1275 |
+
autoVoiceout,
|
| 1276 |
+
"",
|
| 1277 |
+
coachVariant,
|
| 1278 |
]);
|
| 1279 |
+
if (data.user_text) {
|
| 1280 |
+
$("#lessons-message").value = data.user_text;
|
| 1281 |
+
} else if (message) {
|
| 1282 |
+
$("#lessons-message").value = "";
|
| 1283 |
+
}
|
| 1284 |
+
renderLessonsReply(data);
|
| 1285 |
});
|
| 1286 |
}
|
| 1287 |
|
| 1288 |
+
async function sendLessonsTurn() {
|
| 1289 |
+
const message = $("#lessons-message")?.value.trim() || "";
|
| 1290 |
+
let audioPath = state.pendingLessonsAudioPath;
|
| 1291 |
+
const file = $("#lessons-audio-upload")?.files?.[0];
|
| 1292 |
+
if (file) audioPath = await uploadFile(file);
|
| 1293 |
+
if (message) {
|
| 1294 |
+
await sendLanguageLessonTurn({ message });
|
| 1295 |
+
state.pendingLessonsAudioPath = null;
|
| 1296 |
+
return;
|
| 1297 |
}
|
| 1298 |
+
if (audioPath) {
|
| 1299 |
+
await sendLanguageLessonTurn({ audioPath });
|
| 1300 |
+
state.pendingLessonsAudioPath = null;
|
| 1301 |
+
if ($("#lessons-audio-upload")) $("#lessons-audio-upload").value = "";
|
| 1302 |
+
return;
|
| 1303 |
+
}
|
| 1304 |
+
showError("Type a message, hold the mic, or upload audio.");
|
| 1305 |
}
|
| 1306 |
|
| 1307 |
+
async function sendVoiceTurn() {
|
| 1308 |
+
return sendLessonsTurn();
|
| 1309 |
+
}
|
| 1310 |
+
|
| 1311 |
+
async function sendVoiceAudioTurn(audioPath) {
|
| 1312 |
+
return sendLanguageLessonTurn({ audioPath });
|
| 1313 |
+
}
|
| 1314 |
+
|
| 1315 |
+
async function clearLessonsConversation() {
|
| 1316 |
const data = await callApi("teacher_voice_clear", []);
|
| 1317 |
state.history = [];
|
| 1318 |
+
renderLessonsChat();
|
| 1319 |
+
if ($("#lessons-message")) $("#lessons-message").value = "";
|
| 1320 |
+
const statusEl = $("#lessons-turn-status");
|
| 1321 |
+
if (statusEl) statusEl.textContent = stripMd(data.status || "Conversation cleared.");
|
| 1322 |
}
|
| 1323 |
|
| 1324 |
+
async function clearVoiceConversation() {
|
| 1325 |
+
return clearLessonsConversation();
|
| 1326 |
+
}
|
| 1327 |
+
|
| 1328 |
+
async function startLessonsHoldMic(e) {
|
| 1329 |
+
if (state.holdMicActive) return;
|
| 1330 |
+
state.holdMicActive = true;
|
| 1331 |
+
e?.preventDefault();
|
| 1332 |
+
const holdBtn = $("#btn-lessons-hold-mic");
|
| 1333 |
+
holdBtn?.classList.add("recording");
|
| 1334 |
+
await startRecording(
|
| 1335 |
+
"lessons",
|
| 1336 |
+
$("#lessons-record-status"),
|
| 1337 |
+
$("#btn-lessons-record-start"),
|
| 1338 |
+
$("#btn-lessons-record-stop")
|
| 1339 |
+
);
|
| 1340 |
+
}
|
| 1341 |
+
|
| 1342 |
+
async function stopLessonsHoldMic(e) {
|
| 1343 |
+
if (!state.holdMicActive) return;
|
| 1344 |
+
state.holdMicActive = false;
|
| 1345 |
+
e?.preventDefault();
|
| 1346 |
+
$("#btn-lessons-hold-mic")?.classList.remove("recording");
|
| 1347 |
+
const path = await stopRecording(
|
| 1348 |
+
$("#lessons-record-status"),
|
| 1349 |
+
$("#btn-lessons-record-start"),
|
| 1350 |
+
$("#btn-lessons-record-stop")
|
| 1351 |
+
);
|
| 1352 |
+
if (path) await sendLanguageLessonTurn({ audioPath: path });
|
| 1353 |
+
}
|
| 1354 |
+
|
| 1355 |
+
async function sendLessonsFromRecording() {
|
| 1356 |
+
let path = state.pendingLessonsAudioPath;
|
| 1357 |
+
const file = $("#lessons-audio-upload")?.files?.[0];
|
| 1358 |
if (file) path = await uploadFile(file);
|
| 1359 |
if (!path) {
|
| 1360 |
+
showError("Record or upload audio first.");
|
| 1361 |
return;
|
| 1362 |
}
|
| 1363 |
+
await sendLanguageLessonTurn({ audioPath: path });
|
| 1364 |
+
state.pendingLessonsAudioPath = null;
|
| 1365 |
+
}
|
| 1366 |
+
|
| 1367 |
+
async function sendVoiceFromRecording() {
|
| 1368 |
+
return sendLessonsFromRecording();
|
| 1369 |
}
|
| 1370 |
|
| 1371 |
async function startBrowserRecording(statusEl) {
|
|
|
|
| 1439 |
path = data.path;
|
| 1440 |
if (statusEl) statusEl.textContent = stripMd(data.status || "Recording saved.");
|
| 1441 |
}
|
| 1442 |
+
if (state.recordingTarget === "lessons") state.pendingLessonsAudioPath = path;
|
|
|
|
| 1443 |
state.recordingTarget = null;
|
| 1444 |
return path;
|
| 1445 |
}
|
| 1446 |
|
| 1447 |
function bindUi() {
|
| 1448 |
$("#slide-count").addEventListener("input", (e) => {
|
| 1449 |
$("#slide-count-val").textContent = e.target.value;
|
|
|
|
| 1510 |
});
|
| 1511 |
|
| 1512 |
$("#btn-generate").addEventListener("click", () => generateSlides().catch(() => {}));
|
| 1513 |
+
|
| 1514 |
+
$("#btn-lessons-send")?.addEventListener("click", () => sendLessonsTurn().catch(() => {}));
|
| 1515 |
+
$("#lessons-message")?.addEventListener("keydown", (e) => {
|
| 1516 |
+
if (e.key === "Enter" && !e.shiftKey) {
|
| 1517 |
+
e.preventDefault();
|
| 1518 |
+
sendLessonsTurn().catch(() => {});
|
| 1519 |
+
}
|
| 1520 |
+
});
|
| 1521 |
+
$("#btn-lessons-discover")?.addEventListener("click", () => discoverLessonsSources().catch(() => {}));
|
| 1522 |
+
$("#btn-lessons-auto-ingest")?.addEventListener("click", () => autoLessonsIngest().catch(() => {}));
|
| 1523 |
+
$("#btn-lessons-ingest")?.addEventListener("click", () => ingestLessonsSources().catch(() => {}));
|
| 1524 |
+
$("#lessons-ingest-file")?.addEventListener("change", () => ingestLessonsSources().catch(() => {}));
|
| 1525 |
+
$("#btn-lessons-clear")?.addEventListener("click", () => clearLessonsConversation().catch(() => {}));
|
| 1526 |
+
$("#lessons-language")?.addEventListener("change", syncLessonsLanguageUi);
|
| 1527 |
+
$("#lessons-other-lang")?.addEventListener("input", syncLessonsLanguageUi);
|
| 1528 |
+
$("#lessons-audio-upload")?.addEventListener("change", () => sendLessonsTurn().catch(() => {}));
|
| 1529 |
+
|
| 1530 |
+
const holdMic = $("#btn-lessons-hold-mic");
|
| 1531 |
+
if (holdMic) {
|
| 1532 |
+
holdMic.addEventListener("mousedown", (e) => startLessonsHoldMic(e).catch(() => {}));
|
| 1533 |
+
holdMic.addEventListener("mouseup", (e) => stopLessonsHoldMic(e).catch(() => {}));
|
| 1534 |
+
holdMic.addEventListener("mouseleave", (e) => {
|
| 1535 |
+
if (state.holdMicActive) stopLessonsHoldMic(e).catch(() => {});
|
| 1536 |
+
});
|
| 1537 |
+
holdMic.addEventListener("touchstart", (e) => startLessonsHoldMic(e).catch(() => {}), { passive: false });
|
| 1538 |
+
holdMic.addEventListener("touchend", (e) => stopLessonsHoldMic(e).catch(() => {}));
|
| 1539 |
+
}
|
| 1540 |
+
|
| 1541 |
+
$("#btn-lessons-record-start")?.addEventListener("click", () =>
|
| 1542 |
+
startRecording(
|
| 1543 |
+
"lessons",
|
| 1544 |
+
$("#lessons-record-status"),
|
| 1545 |
+
$("#btn-lessons-record-start"),
|
| 1546 |
+
$("#btn-lessons-record-stop")
|
| 1547 |
+
).catch(() => {})
|
| 1548 |
+
);
|
| 1549 |
+
$("#btn-lessons-record-stop")?.addEventListener("click", () =>
|
| 1550 |
+
stopRecording(
|
| 1551 |
+
$("#lessons-record-status"),
|
| 1552 |
+
$("#btn-lessons-record-start"),
|
| 1553 |
+
$("#btn-lessons-record-stop")
|
| 1554 |
+
).catch(() => {})
|
| 1555 |
+
);
|
| 1556 |
+
|
| 1557 |
$("#btn-debug-send").addEventListener("click", () => sendDebugMessage().catch(() => {}));
|
| 1558 |
+
|
| 1559 |
$("#debug-session")?.addEventListener("change", () => refreshDebugDocuments().catch(() => {}));
|
| 1560 |
$("#debug-refresh-sessions")?.addEventListener("click", () => {
|
| 1561 |
refreshDebugSessions().catch(() => {});
|
|
|
|
| 1569 |
}
|
| 1570 |
});
|
| 1571 |
|
| 1572 |
$("#btn-export").addEventListener("click", () => {
|
| 1573 |
const p = state.downloads?.pptx;
|
| 1574 |
if (p) window.open(fileUrl(p), "_blank");
|
|
|
|
| 1587 |
refreshDocuments().catch(() => {});
|
| 1588 |
});
|
| 1589 |
|
| 1590 |
+
document.querySelectorAll("#lessons-modes .mode-card").forEach((btn) => {
|
| 1591 |
btn.addEventListener("click", () => {
|
| 1592 |
+
document.querySelectorAll("#lessons-modes .mode-card").forEach((b) => b.classList.remove("active"));
|
| 1593 |
btn.classList.add("active");
|
| 1594 |
+
state.lessonsMode = btn.dataset.mode;
|
| 1595 |
+
syncLessonsModeUi();
|
| 1596 |
});
|
| 1597 |
});
|
| 1598 |
|
| 1599 |
+
syncLessonsModeUi();
|
| 1600 |
}
|
| 1601 |
|
| 1602 |
bindUi();
|
libs/echocoach/src/echocoach/config.py
CHANGED
|
@@ -45,12 +45,23 @@ class EchoCoachConfig:
|
|
| 45 |
tts_preset: str
|
| 46 |
realtime_tts_preset: str | None
|
| 47 |
coach_model: str
|
|
|
|
| 48 |
max_seconds: int
|
| 49 |
languages: list[LanguageOption]
|
| 50 |
asr_presets: dict[str, AsrPreset]
|
| 51 |
tts_presets: dict[str, TtsPreset]
|
| 52 |
presets_path: Path | None = None
|
| 53 |
|
|
| 54 |
def get_asr(self, key: str | None = None) -> AsrPreset:
|
| 55 |
preset_key = key or self.asr_preset
|
| 56 |
if preset_key not in self.asr_presets:
|
|
@@ -114,6 +125,7 @@ def _builtin_config() -> EchoCoachConfig:
|
|
| 114 |
tts_preset="piper-multilingual",
|
| 115 |
realtime_tts_preset=None,
|
| 116 |
coach_model="minicpm5-1b",
|
|
|
|
| 117 |
max_seconds=30,
|
| 118 |
languages=langs,
|
| 119 |
asr_presets=asr,
|
|
@@ -201,11 +213,15 @@ def load_echo_coach_config() -> EchoCoachConfig:
|
|
| 201 |
if tts_default not in tts_presets:
|
| 202 |
tts_default = next(iter(tts_presets))
|
| 203 |
|
| 204 |
config = EchoCoachConfig(
|
| 205 |
asr_preset=asr_default,
|
| 206 |
tts_preset=tts_default,
|
| 207 |
realtime_tts_preset=defaults.get("realtime_tts_preset"),
|
| 208 |
coach_model=str(defaults.get("coach_model", "minicpm5-1b")),
|
|
|
|
| 209 |
max_seconds=int(defaults.get("max_seconds", 30)),
|
| 210 |
languages=languages,
|
| 211 |
asr_presets=asr_presets,
|
|
@@ -222,6 +238,12 @@ def load_echo_coach_config() -> EchoCoachConfig:
|
|
| 222 |
updates["realtime_tts_preset"] = os.environ["ECHOCOACH_REALTIME_TTS_PRESET"]
|
| 223 |
if os.environ.get("ECHOCOACH_COACH_MODEL"):
|
| 224 |
updates["coach_model"] = os.environ["ECHOCOACH_COACH_MODEL"]
|
| 225 |
if os.environ.get("ECHOCOACH_MAX_SECONDS"):
|
| 226 |
updates["max_seconds"] = int(os.environ["ECHOCOACH_MAX_SECONDS"])
|
| 227 |
|
|
|
|
| 45 |
tts_preset: str
|
| 46 |
realtime_tts_preset: str | None
|
| 47 |
coach_model: str
|
| 48 |
+
coach_fallbacks: tuple[str, ...]
|
| 49 |
max_seconds: int
|
| 50 |
languages: list[LanguageOption]
|
| 51 |
asr_presets: dict[str, AsrPreset]
|
| 52 |
tts_presets: dict[str, TtsPreset]
|
| 53 |
presets_path: Path | None = None
|
| 54 |
|
| 55 |
+
def coach_model_chain(self) -> list[str]:
|
| 56 |
+
"""Primary coach preset followed by fallbacks (deduped, order preserved)."""
|
| 57 |
+
chain: list[str] = []
|
| 58 |
+
seen: set[str] = set()
|
| 59 |
+
for key in (self.coach_model, *self.coach_fallbacks):
|
| 60 |
+
if key and key not in seen:
|
| 61 |
+
seen.add(key)
|
| 62 |
+
chain.append(key)
|
| 63 |
+
return chain
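
Review note: the dedup behaviour of `coach_model_chain` can be checked in isolation. The sketch below is a standalone function that mirrors the method body shown in this hunk. It takes the two config fields as plain arguments, which is an assumption made for illustration and not how the method is called in the repository; the preset names in the usage line are placeholders.

```python
from __future__ import annotations


def coach_model_chain(coach_model: str, coach_fallbacks: tuple[str, ...]) -> list[str]:
    """Primary preset first, then fallbacks, skipping empties and repeats."""
    chain: list[str] = []
    seen: set[str] = set()
    for key in (coach_model, *coach_fallbacks):
        # Empty strings are dropped; a repeated key keeps its first position.
        if key and key not in seen:
            seen.add(key)
            chain.append(key)
    return chain


# Placeholder preset names: the duplicate and the empty entry are dropped.
example_chain = coach_model_chain("primary", ("backup", "primary", ""))
```

Because order is preserved, a fallback that repeats the primary preset never pushes the primary out of first place.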
|
| 64 |
+
|
| 65 |
def get_asr(self, key: str | None = None) -> AsrPreset:
|
| 66 |
preset_key = key or self.asr_preset
|
| 67 |
if preset_key not in self.asr_presets:
|
|
|
|
| 125 |
tts_preset="piper-multilingual",
|
| 126 |
realtime_tts_preset=None,
|
| 127 |
coach_model="minicpm5-1b",
|
| 128 |
+
coach_fallbacks=(),
|
| 129 |
max_seconds=30,
|
| 130 |
languages=langs,
|
| 131 |
asr_presets=asr,
|
|
|
|
| 213 |
if tts_default not in tts_presets:
|
| 214 |
tts_default = next(iter(tts_presets))
|
| 215 |
|
| 216 |
+
raw_fallbacks = defaults.get("coach_fallbacks") or []
|
| 217 |
+
coach_fallbacks = tuple(str(item) for item in raw_fallbacks)
|
| 218 |
+
|
| 219 |
config = EchoCoachConfig(
|
| 220 |
asr_preset=asr_default,
|
| 221 |
tts_preset=tts_default,
|
| 222 |
realtime_tts_preset=defaults.get("realtime_tts_preset"),
|
| 223 |
coach_model=str(defaults.get("coach_model", "minicpm5-1b")),
|
| 224 |
+
coach_fallbacks=coach_fallbacks,
|
| 225 |
max_seconds=int(defaults.get("max_seconds", 30)),
|
| 226 |
languages=languages,
|
| 227 |
asr_presets=asr_presets,
|
|
|
|
| 238 |
updates["realtime_tts_preset"] = os.environ["ECHOCOACH_REALTIME_TTS_PRESET"]
|
| 239 |
if os.environ.get("ECHOCOACH_COACH_MODEL"):
|
| 240 |
updates["coach_model"] = os.environ["ECHOCOACH_COACH_MODEL"]
|
| 241 |
+
if os.environ.get("ECHOCOACH_COACH_FALLBACK"):
|
| 242 |
+
updates["coach_fallbacks"] = tuple(
|
| 243 |
+
part.strip()
|
| 244 |
+
for part in os.environ["ECHOCOACH_COACH_FALLBACK"].split(",")
|
| 245 |
+
if part.strip()
|
| 246 |
+
)
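
Review note: the `ECHOCOACH_COACH_FALLBACK` override is a comma-separated list that is trimmed and stripped of empty entries. A minimal sketch of that parsing follows; the helper name `parse_coach_fallbacks` is hypothetical and exists only for this illustration.

```python
from __future__ import annotations


def parse_coach_fallbacks(raw: str) -> tuple[str, ...]:
    """Split a comma-separated value, trimming whitespace and dropping blanks."""
    return tuple(part.strip() for part in raw.split(",") if part.strip())


# Stray spaces and doubled commas are tolerated.
example_fallbacks = parse_coach_fallbacks(" model-a, model-b,, ")
```

Note that the hunk only applies the override when the variable is set and non-empty, so an empty value leaves the YAML-derived fallbacks in place.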
|
| 247 |
if os.environ.get("ECHOCOACH_MAX_SECONDS"):
|
| 248 |
updates["max_seconds"] = int(os.environ["ECHOCOACH_MAX_SECONDS"])
|
| 249 |
|
libs/echocoach/src/echocoach/pipeline.py
CHANGED
|
@@ -64,9 +64,14 @@ def run_echo_coach(
|
|
| 64 |
transcript = asr.transcribe(str(clipped_path), language=language)
|
| 65 |
trace.log_note("asr_complete", preset=asr_key, chars=len(transcript))
|
| 66 |
|
| 67 |
-
fillers = analyze_fillers(transcript)
|
| 68 |
pace = analyze_pace(transcript, duration)
|
| 69 |
-
|
| 70 |
|
| 71 |
filler_chart, pace_chart = build_charts(
|
| 72 |
transcript,
|
|
|
|
| 64 |
transcript = asr.transcribe(str(clipped_path), language=language)
|
| 65 |
trace.log_note("asr_complete", preset=asr_key, chars=len(transcript))
|
| 66 |
|
| 67 |
+
fillers = analyze_fillers(transcript) if language == "en" else FillerAnalysis(counts={}, spans=[], total=0)
|
| 68 |
pace = analyze_pace(transcript, duration)
|
| 69 |
+
if language == "en":
|
| 70 |
+
transcript_html = highlight_fillers_html(transcript, fillers)
|
| 71 |
+
else:
|
| 72 |
+
import html
|
| 73 |
+
|
| 74 |
+
transcript_html = html.escape(transcript).replace("\n", "<br>")
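
Review note: for non-English transcripts the hunk skips filler highlighting and falls back to escaped plain text. The sketch below isolates that fallback; the function name `plain_transcript_html` is hypothetical, and only the standard-library `html.escape` call is taken from the diff.

```python
import html


def plain_transcript_html(transcript: str) -> str:
    """Escape HTML-sensitive characters, then turn newlines into <br> tags."""
    # Escaping comes first so the inserted <br> tags are not escaped themselves.
    return html.escape(transcript).replace("\n", "<br>")


example_html = plain_transcript_html("a < b\nc & d")
```

Escaping before inserting `<br>` matters: reversing the two steps would render the line breaks as literal text.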
|
| 75 |
|
| 76 |
filler_chart, pace_chart = build_charts(
|
| 77 |
transcript,
|
libs/echocoach/src/echocoach/prompts.py
CHANGED
|
@@ -12,22 +12,49 @@ MODE_LABELS: dict[TeacherVoiceMode, str] = {
|
|
| 12 |
"pitch": "Pitch practice",
|
| 13 |
}
|
| 14 |
|
| 15 |
EXPLAIN_SYSTEM = """You are TeacherVoice, a friendly tutor who explains ideas in plain language.
|
| 16 |
Reply with ONLY the spoken answer (2-5 short sentences). Do not include planning, drafting,
|
| 17 |
numbered outlines, or phrases like "let me think" or "first I need to".
|
| 18 |
-
Use simple examples when helpful.
|
| 19 |
When source excerpts are provided, ground your answer in them and cite with [1], [2], etc."""
|
| 20 |
|
| 21 |
LESSON_SYSTEM = """You are TeacherVoice, a lesson-planning coach for teachers and students.
|
| 22 |
Reply with ONLY the spoken answer (2-5 short sentences). Do not include planning, drafting,
|
| 23 |
or meta commentary about how you will answer.
|
| 24 |
Help outline and explain lesson content verbally: learning goals, key points, and a simple flow.
|
| 25 |
-
If a lesson topic is set, stay focused on it.
|
|
|
|
| 26 |
|
| 27 |
PITCH_SYSTEM = """You are TeacherVoice, a supportive public-speaking coach in a live conversation.
|
| 28 |
Give brief, actionable feedback on what the student just said (opening, clarity, energy, structure).
|
| 29 |
Do not produce JSON or long reports — speak naturally in 2-4 sentences.
|
| 30 |
-
Suggest one concrete improvement for their next attempt. For charts and pace analysis,
|
| 31 |
|
| 32 |
_MODE_SYSTEM: dict[TeacherVoiceMode, str] = {
|
| 33 |
"explain": EXPLAIN_SYSTEM,
|
|
@@ -36,8 +63,39 @@ _MODE_SYSTEM: dict[TeacherVoiceMode, str] = {
|
|
| 36 |
}
|
| 37 |
|
| 38 |
|
| 39 |
-
def
|
| 40 |
-
|
| 41 |
|
| 42 |
|
| 43 |
def topic_context_block(topic: str | None, mode: TeacherVoiceMode) -> str | None:
|
|
|
|
| 12 |
"pitch": "Pitch practice",
|
| 13 |
}
|
| 14 |
|
| 15 |
+
LANGUAGE_LESSON_MODES: frozenset[TeacherVoiceMode] = frozenset({"explain", "lesson"})
|
| 16 |
+
|
| 17 |
+
# ISO 639-1 codes mapped to Tiny Aya regional presets (see Cohere Labs field guide).
|
| 18 |
+
_AYA_FIRE_LANGS = frozenset({"hi", "bn", "ta", "te", "mr", "gu", "kn", "ml", "pa", "ur", "ne", "si"})
|
| 19 |
+
_AYA_EARTH_LANGS = frozenset({"ar", "sw", "am", "ha", "fa", "he", "so", "yo", "ig", "zu", "af"})
|
| 20 |
+
_AYA_WATER_LANGS = frozenset(
|
| 21 |
+
{"fr", "de", "es", "it", "pt", "nl", "pl", "el", "ja", "zh", "ko", "vi", "ru", "uk", "cs", "sv", "da", "fi", "no"}
|
| 22 |
+
)
|
| 23 |
+
|
| 24 |
+
_LANGUAGE_LABELS: dict[str, str] = {
|
| 25 |
+
"en": "English",
|
| 26 |
+
"fr": "French",
|
| 27 |
+
"de": "German",
|
| 28 |
+
"es": "Spanish",
|
| 29 |
+
"it": "Italian",
|
| 30 |
+
"pt": "Portuguese",
|
| 31 |
+
"nl": "Dutch",
|
| 32 |
+
"pl": "Polish",
|
| 33 |
+
"el": "Greek",
|
| 34 |
+
"ar": "Arabic",
|
| 35 |
+
"ja": "Japanese",
|
| 36 |
+
"zh": "Chinese",
|
| 37 |
+
"vi": "Vietnamese",
|
| 38 |
+
"ko": "Korean",
|
| 39 |
+
}
|
| 40 |
+
|
| 41 |
EXPLAIN_SYSTEM = """You are TeacherVoice, a friendly tutor who explains ideas in plain language.
|
| 42 |
Reply with ONLY the spoken answer (2-5 short sentences). Do not include planning, drafting,
|
| 43 |
numbered outlines, or phrases like "let me think" or "first I need to".
|
| 44 |
+
Use simple examples when helpful.
|
| 45 |
When source excerpts are provided, ground your answer in them and cite with [1], [2], etc."""
|
| 46 |
|
| 47 |
LESSON_SYSTEM = """You are TeacherVoice, a lesson-planning coach for teachers and students.
|
| 48 |
Reply with ONLY the spoken answer (2-5 short sentences). Do not include planning, drafting,
|
| 49 |
or meta commentary about how you will answer.
|
| 50 |
Help outline and explain lesson content verbally: learning goals, key points, and a simple flow.
|
| 51 |
+
If a lesson topic is set, stay focused on it.
|
| 52 |
+
When source excerpts are provided, use them and cite [1], [2], etc."""
|
| 53 |
|
| 54 |
PITCH_SYSTEM = """You are TeacherVoice, a supportive public-speaking coach in a live conversation.
|
| 55 |
Give brief, actionable feedback on what the student just said (opening, clarity, energy, structure).
|
| 56 |
Do not produce JSON or long reports — speak naturally in 2-4 sentences.
|
| 57 |
+
Suggest one concrete improvement for their next attempt. For charts and pace analysis, use Classic EchoCoach."""
|
| 58 |
|
| 59 |
_MODE_SYSTEM: dict[TeacherVoiceMode, str] = {
|
| 60 |
"explain": EXPLAIN_SYSTEM,
|
|
|
|
| 63 |
}
|
| 64 |
|
| 65 |
|
| 66 |
+
def language_label(language: str) -> str:
|
| 67 |
+
code = (language or "en").strip().lower().split("-")[0]
|
| 68 |
+
return _LANGUAGE_LABELS.get(code, code or "English")
|
| 69 |
+
|
| 70 |
+
|
| 71 |
+
def language_instruction(language: str) -> str:
|
| 72 |
+
label = language_label(language)
|
| 73 |
+
return (
|
| 74 |
+
f"Target language: {label} ({language}). "
|
| 75 |
+
f"Reply ONLY in {label}. "
|
| 76 |
+
"If the student writes or speaks in another language, match their language instead."
|
| 77 |
+
)
|
| 78 |
+
|
| 79 |
+
|
| 80 |
+
def resolve_aya_preset(language: str, variant: str = "auto") -> str:
|
| 81 |
+
"""Return a models.yaml preset key for the Tiny Aya coach.
|
| 82 |
+
|
| 83 |
+
Regional Water/Fire/Earth presets remain in models.yaml for future use but
|
| 84 |
+
default to Global so Spaces only load one gated model.
|
| 85 |
+
"""
|
| 86 |
+
_ = language # language kept for API compatibility; Global handles 70+ langs
|
| 87 |
+
if variant and variant not in ("auto", ""):
|
| 88 |
+
if variant in ("tiny-aya-water", "tiny-aya-fire", "tiny-aya-earth"):
|
| 89 |
+
return "tiny-aya-global"
|
| 90 |
+
return variant
|
| 91 |
+
return "tiny-aya-global"
|
| 92 |
+
|
| 93 |
+
|
| 94 |
+
def system_prompt_for_mode(mode: TeacherVoiceMode, *, language: str | None = None) -> str:
|
| 95 |
+
base = _MODE_SYSTEM[mode]
|
| 96 |
+
if language:
|
| 97 |
+
return f"{base}\n\n{language_instruction(language)}"
|
| 98 |
+
return base
|
| 99 |
|
| 100 |
|
| 101 |
def topic_context_block(topic: str | None, mode: TeacherVoiceMode) -> str | None:
|
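For orientation, the language helpers added to `prompts.py` in the hunk above can be exercised on their own. The sketch below copies `language_label`, `language_instruction`, and `resolve_aya_preset` from the diff; the label table is trimmed to three entries and the `_REGIONAL_PRESETS` name is introduced here for readability, so treat both as assumptions rather than the module's actual contents.

```python
# Standalone sketch of the language helpers from the prompts.py hunk.
# Assumption: the label table is trimmed to three entries for brevity.
_LANGUAGE_LABELS: dict[str, str] = {"en": "English", "fr": "French", "el": "Greek"}

# Assumption: this tuple name is illustrative; the diff inlines the values.
_REGIONAL_PRESETS = ("tiny-aya-water", "tiny-aya-fire", "tiny-aya-earth")


def language_label(language: str) -> str:
    # Normalise inputs such as "fr-CA" or " FR " to a bare two-letter code.
    code = (language or "en").strip().lower().split("-")[0]
    # Unknown codes fall back to the code itself instead of raising.
    return _LANGUAGE_LABELS.get(code, code or "English")


def language_instruction(language: str) -> str:
    label = language_label(language)
    return (
        f"Target language: {label} ({language}). "
        f"Reply ONLY in {label}. "
        "If the student writes or speaks in another language, match their language instead."
    )


def resolve_aya_preset(language: str, variant: str = "auto") -> str:
    _ = language  # kept for API compatibility, as in the diff
    if variant and variant not in ("auto", ""):
        # Regional presets collapse to Global so only one model is loaded.
        if variant in _REGIONAL_PRESETS:
            return "tiny-aya-global"
        # Any other explicit preset key is passed through unchanged.
        return variant
    return "tiny-aya-global"


if __name__ == "__main__":
    print(language_label("fr-CA"))
    print(language_instruction("el"))
    print(resolve_aya_preset("hi", "tiny-aya-fire"))
```

One consequence worth noting: an explicit non-Aya `variant` (for example a fallback model key) is returned as-is, so only the three regional keys are rewritten to `tiny-aya-global`.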
libs/echocoach/src/echocoach/teacher_voice.py
CHANGED
|
@@ -168,6 +168,7 @@ def _rag_turn_via_agent(
|
|
| 168 |
model_key: str,
|
| 169 |
backend: InferenceBackend,
|
| 170 |
trace: TraceRecorder,
|
|
|
|
| 171 |
) -> tuple[str, str | None, str | None, str]:
|
| 172 |
"""Grounded answer via ResearchMind harness. Returns text, refs, status, display."""
|
| 173 |
query = retrieval_query(user_text, topic=topic)
|
|
@@ -205,6 +206,7 @@ def _rag_turn_via_agent(
|
|
| 205 |
mode=mode,
|
| 206 |
backend=backend,
|
| 207 |
trace=trace,
|
|
|
|
| 208 |
)
|
| 209 |
rag_refs = result.references_markdown or None
|
| 210 |
return assistant_text, rag_refs, rag_status, display_reply
|
|
@@ -237,13 +239,14 @@ def _compact_teacher_reply(
|
|
| 237 |
mode: TeacherVoiceMode,
|
| 238 |
backend: InferenceBackend,
|
| 239 |
trace: TraceRecorder,
|
|
|
|
| 240 |
) -> str:
|
| 241 |
seed = strip_reasoning_output(raw_reply).strip() or raw_reply.strip()[:1200]
|
| 242 |
messages = [
|
| 243 |
{
|
| 244 |
"role": "system",
|
| 245 |
"content": (
|
| 246 |
-
f"{system_prompt_for_mode(mode)}\n\n"
|
| 247 |
"Rewrite the draft below into ONLY 2-4 spoken sentences for voice playback. "
|
| 248 |
"Keep any [n] citations. No planning or labels."
|
| 249 |
),
|
|
@@ -263,6 +266,7 @@ def _finalize_voice_reply(
|
|
| 263 |
mode: TeacherVoiceMode,
|
| 264 |
backend: InferenceBackend,
|
| 265 |
trace: TraceRecorder,
|
|
|
|
| 266 |
) -> tuple[str, str]:
|
| 267 |
"""Normalize model output into a complete spoken reply and chat display text."""
|
| 268 |
assistant_text = strip_reasoning_output(raw_reply).strip()
|
|
@@ -278,6 +282,7 @@ def _finalize_voice_reply(
|
|
| 278 |
mode=mode,
|
| 279 |
backend=backend,
|
| 280 |
trace=trace,
|
|
|
|
| 281 |
)
|
| 282 |
if not reply_ends_complete_sentence(assistant_text):
|
| 283 |
assistant_text = _compact_teacher_reply(
|
|
@@ -285,6 +290,7 @@ def _finalize_voice_reply(
|
|
| 285 |
mode=mode,
|
| 286 |
backend=backend,
|
| 287 |
trace=trace,
|
|
|
|
| 288 |
)
|
| 289 |
return assistant_text, assistant_text
|
| 290 |
|
|
@@ -296,8 +302,9 @@ def build_teacher_messages(
|
|
| 296 |
user_text: str,
|
| 297 |
topic: str | None = None,
|
| 298 |
rag: RagContext | None = None,
|
|
|
|
| 299 |
) -> list[dict[str, str]]:
|
| 300 |
-
system = system_prompt_for_mode(mode)
|
| 301 |
topic_line = topic_context_block(topic, mode)
|
| 302 |
if topic_line:
|
| 303 |
system = f"{system}\n\n{topic_line}"
|
|
@@ -330,6 +337,7 @@ def _generate_teacher_reply(
|
|
| 330 |
session_id: str,
|
| 331 |
doc_ids: list[str] | None,
|
| 332 |
tts_key: str,
|
|
|
|
| 333 |
) -> TeacherVoiceTurnResult:
|
| 334 |
rag_refs: str | None = None
|
| 335 |
rag_status: str | None = None
|
|
@@ -344,6 +352,7 @@ def _generate_teacher_reply(
|
|
| 344 |
model_key=model_key,
|
| 345 |
backend=backend,
|
| 346 |
trace=trace,
|
|
|
|
| 347 |
)
|
| 348 |
else:
|
| 349 |
messages = build_teacher_messages(
|
|
@@ -351,6 +360,7 @@ def _generate_teacher_reply(
|
|
| 351 |
history=history,
|
| 352 |
user_text=user_text,
|
| 353 |
topic=topic,
|
|
|
|
| 354 |
)
|
| 355 |
raw_reply = backend.chat(messages, max_tokens=512, temperature=0.2)
|
| 356 |
assistant_text, display_reply = _finalize_voice_reply(
|
|
@@ -358,20 +368,25 @@ def _generate_teacher_reply(
|
|
| 358 |
mode=mode,
|
| 359 |
backend=backend,
|
| 360 |
trace=trace,
|
|
|
|
| 361 |
)
|
| 362 |
trace.log_llm(messages[-1]["content"], raw_reply)
|
| 363 |
if mode in RAG_MODES:
|
| 364 |
rag_status = _rag_off_status(session_id, doc_ids)
|
| 365 |
|
| 366 |
-
voiceout_path
|
| 367 |
-
|
| 368 |
-
|
| 369 |
-
|
| 370 |
-
|
| 371 |
-
|
| 372 |
-
|
| 373 |
-
|
| 374 |
-
|
| 375 |
|
| 376 |
new_history = append_chat_turn(
|
| 377 |
history,
|
|
@@ -409,6 +424,7 @@ def run_teacher_voice_text_turn(
|
|
| 409 |
use_rag: bool = False,
|
| 410 |
session_id: str = "",
|
| 411 |
doc_ids: list[str] | None = None,
|
|
|
|
| 412 |
) -> TeacherVoiceTurnResult:
|
| 413 |
"""Process a typed user message (skips ASR)."""
|
| 414 |
user_text = user_text.strip()
|
|
@@ -451,6 +467,7 @@ def run_teacher_voice_text_turn(
|
|
| 451 |
session_id=session_id,
|
| 452 |
doc_ids=doc_ids,
|
| 453 |
tts_key=tts_key,
|
|
|
|
| 454 |
)
|
| 455 |
|
| 456 |
|
|
@@ -469,6 +486,7 @@ def run_teacher_voice_turn(
|
|
| 469 |
session_id: str = "",
|
| 470 |
doc_ids: list[str] | None = None,
|
| 471 |
max_turn_seconds: int | None = None,
|
|
|
|
| 472 |
) -> TeacherVoiceTurnResult:
|
| 473 |
if not audio_path:
|
| 474 |
raise ValueError("No audio recording provided.")
|
|
@@ -512,7 +530,7 @@ def run_teacher_voice_turn(
|
|
| 512 |
from echocoach.omni import is_omni_profile, try_omni_turn
|
| 513 |
|
| 514 |
if is_omni_profile():
|
| 515 |
-
system = system_prompt_for_mode(mode)
|
| 516 |
topic_line = topic_context_block(topic, mode)
|
| 517 |
if topic_line:
|
| 518 |
system = f"{system}\n\n{topic_line}"
|
|
@@ -559,4 +577,5 @@ def run_teacher_voice_turn(
|
|
| 559 |
session_id=session_id,
|
| 560 |
doc_ids=doc_ids,
|
| 561 |
tts_key=tts_key,
|
|
|
|
| 562 |
)
|
|
|
|
| 168 |
model_key: str,
|
| 169 |
backend: InferenceBackend,
|
| 170 |
trace: TraceRecorder,
|
| 171 |
+
language: str = "en",
|
| 172 |
) -> tuple[str, str | None, str | None, str]:
|
| 173 |
"""Grounded answer via ResearchMind harness. Returns text, refs, status, display."""
|
| 174 |
query = retrieval_query(user_text, topic=topic)
|
|
|
|
| 206 |
mode=mode,
|
| 207 |
backend=backend,
|
| 208 |
trace=trace,
|
| 209 |
+
language=language,
|
| 210 |
)
|
| 211 |
rag_refs = result.references_markdown or None
|
| 212 |
return assistant_text, rag_refs, rag_status, display_reply
|
|
|
|
| 239 |
mode: TeacherVoiceMode,
|
| 240 |
backend: InferenceBackend,
|
| 241 |
trace: TraceRecorder,
|
| 242 |
+
language: str = "en",
|
| 243 |
) -> str:
|
| 244 |
seed = strip_reasoning_output(raw_reply).strip() or raw_reply.strip()[:1200]
|
| 245 |
messages = [
|
| 246 |
{
|
| 247 |
"role": "system",
|
| 248 |
"content": (
|
| 249 |
+
f"{system_prompt_for_mode(mode, language=language)}\n\n"
|
| 250 |
"Rewrite the draft below into ONLY 2-4 spoken sentences for voice playback. "
|
| 251 |
"Keep any [n] citations. No planning or labels."
|
| 252 |
),
|
|
|
|
| 266 |
mode: TeacherVoiceMode,
|
| 267 |
backend: InferenceBackend,
|
| 268 |
trace: TraceRecorder,
|
| 269 |
+
language: str = "en",
|
| 270 |
) -> tuple[str, str]:
|
| 271 |
"""Normalize model output into a complete spoken reply and chat display text."""
|
| 272 |
assistant_text = strip_reasoning_output(raw_reply).strip()
|
|
|
|
| 282 |
mode=mode,
|
| 283 |
backend=backend,
|
| 284 |
trace=trace,
|
| 285 |
+
language=language,
|
| 286 |
)
|
| 287 |
if not reply_ends_complete_sentence(assistant_text):
|
| 288 |
assistant_text = _compact_teacher_reply(
|
|
|
|
| 290 |
mode=mode,
|
| 291 |
backend=backend,
|
| 292 |
trace=trace,
|
| 293 |
+
language=language,
|
| 294 |
)
|
| 295 |
return assistant_text, assistant_text
|
| 296 |
|
|
|
|
| 302 |
user_text: str,
|
| 303 |
topic: str | None = None,
|
| 304 |
rag: RagContext | None = None,
|
| 305 |
+
language: str = "en",
|
| 306 |
) -> list[dict[str, str]]:
|
| 307 |
+
system = system_prompt_for_mode(mode, language=language)
|
| 308 |
topic_line = topic_context_block(topic, mode)
|
| 309 |
if topic_line:
|
| 310 |
system = f"{system}\n\n{topic_line}"
|
|
|
|
| 337 |
session_id: str,
|
| 338 |
doc_ids: list[str] | None,
|
| 339 |
tts_key: str,
|
| 340 |
+
auto_voiceout: bool = True,
|
| 341 |
) -> TeacherVoiceTurnResult:
|
| 342 |
rag_refs: str | None = None
|
| 343 |
rag_status: str | None = None
|
|
|
|
| 352 |
model_key=model_key,
|
| 353 |
backend=backend,
|
| 354 |
trace=trace,
|
| 355 |
+
language=language,
|
| 356 |
)
|
| 357 |
else:
|
| 358 |
messages = build_teacher_messages(
|
|
|
|
| 360 |
history=history,
|
| 361 |
user_text=user_text,
|
| 362 |
topic=topic,
|
| 363 |
+
language=language,
|
| 364 |
)
|
| 365 |
raw_reply = backend.chat(messages, max_tokens=512, temperature=0.2)
|
| 366 |
assistant_text, display_reply = _finalize_voice_reply(
|
|
|
|
| 368 |
mode=mode,
|
| 369 |
backend=backend,
|
| 370 |
trace=trace,
|
| 371 |
+
language=language,
|
| 372 |
)
|
| 373 |
trace.log_llm(messages[-1]["content"], raw_reply)
|
| 374 |
if mode in RAG_MODES:
|
| 375 |
rag_status = _rag_off_status(session_id, doc_ids)
|
| 376 |
|
| 377 |
+
voiceout_path: str | None = None
|
| 378 |
+
voiceout_first: str | None = None
|
| 379 |
+
voiceout_warning: str | None = None
|
| 380 |
+
if auto_voiceout:
|
| 381 |
+
voiceout_path, voiceout_first, voiceout_warning = synthesize_voice_reply(
|
| 382 |
+
strip_references_for_tts(assistant_text),
|
| 383 |
+
language=language,
|
| 384 |
+
tts_preset=tts_key,
|
| 385 |
+
chunk_first=True,
|
| 386 |
+
out_subdir="teacher_voice",
|
| 387 |
+
)
|
| 388 |
+
if voiceout_path:
|
| 389 |
+
trace.set_artifact(voiceout_path)
|
| 390 |
|
| 391 |
new_history = append_chat_turn(
|
| 392 |
history,
|
|
|
|
| 424 |
use_rag: bool = False,
|
| 425 |
session_id: str = "",
|
| 426 |
doc_ids: list[str] | None = None,
|
| 427 |
+
auto_voiceout: bool = True,
|
| 428 |
) -> TeacherVoiceTurnResult:
|
| 429 |
"""Process a typed user message (skips ASR)."""
|
| 430 |
user_text = user_text.strip()
|
|
|
|
| 467 |
session_id=session_id,
|
| 468 |
doc_ids=doc_ids,
|
| 469 |
tts_key=tts_key,
|
| 470 |
+
auto_voiceout=auto_voiceout,
|
| 471 |
)
|
| 472 |
|
| 473 |
|
|
|
|
| 486 |
session_id: str = "",
|
| 487 |
doc_ids: list[str] | None = None,
|
| 488 |
max_turn_seconds: int | None = None,
|
| 489 |
+
auto_voiceout: bool = True,
|
| 490 |
) -> TeacherVoiceTurnResult:
|
| 491 |
if not audio_path:
|
| 492 |
raise ValueError("No audio recording provided.")
|
|
|
|
| 530 |
from echocoach.omni import is_omni_profile, try_omni_turn
|
| 531 |
|
| 532 |
if is_omni_profile():
|
| 533 |
+
system = system_prompt_for_mode(mode, language=language)
|
| 534 |
topic_line = topic_context_block(topic, mode)
|
| 535 |
if topic_line:
|
| 536 |
system = f"{system}\n\n{topic_line}"
|
|
|
|
| 577 |
session_id=session_id,
|
| 578 |
doc_ids=doc_ids,
|
| 579 |
tts_key=tts_key,
|
| 580 |
+
auto_voiceout=auto_voiceout,
|
| 581 |
)
|
libs/echocoach/tests/test_teacher_voice.py
CHANGED
|
@@ -7,7 +7,7 @@ import pytest
|
|
| 7 |
import soundfile as sf
|
| 8 |
|
| 9 |
from inference.response_clean import reply_ends_complete_sentence
|
| 10 |
-
from echocoach.prompts import PITCH_SYSTEM, system_prompt_for_mode
|
| 11 |
from echocoach.teacher_voice import (
|
| 12 |
RagContext,
|
| 13 |
append_chat_turn,
|
|
@@ -131,8 +131,43 @@ def test_build_teacher_messages_includes_topic_and_rag():
|
|
| 131 |
assert "Reply now in 2-4 complete spoken sentences only" in messages[-1]["content"]
|
| 132 |
|
| 133 |
|
| 134 |
def test_pitch_mode_system_prompt():
|
| 135 |
-
assert "
|
| 136 |
assert PITCH_SYSTEM == system_prompt_for_mode("pitch")
|
| 137 |
|
| 138 |
|
|
|
|
| 7 |
import soundfile as sf
|
| 8 |
|
| 9 |
from inference.response_clean import reply_ends_complete_sentence
|
| 10 |
+
from echocoach.prompts import PITCH_SYSTEM, resolve_aya_preset, system_prompt_for_mode
|
| 11 |
from echocoach.teacher_voice import (
|
| 12 |
RagContext,
|
| 13 |
append_chat_turn,
|
|
|
|
| 131 |
assert "Reply now in 2-4 complete spoken sentences only" in messages[-1]["content"]
|
| 132 |
|
| 133 |
|
| 134 |
+
def test_coach_model_chain_dedupes():
|
| 135 |
+
from echocoach.config import EchoCoachConfig, LanguageOption
|
| 136 |
+
|
| 137 |
+
cfg = EchoCoachConfig(
|
| 138 |
+
asr_preset="whisper-cpp-tiny",
|
| 139 |
+
tts_preset="piper-multilingual",
|
| 140 |
+
realtime_tts_preset=None,
|
| 141 |
+
coach_model="tiny-aya-global",
|
| 142 |
+
coach_fallbacks=("minicpm5-1b", "tiny-aya-global"),
|
| 143 |
+
max_seconds=30,
|
| 144 |
+
languages=[LanguageOption("en", "English")],
|
| 145 |
+
asr_presets={},
|
| 146 |
+
tts_presets={},
|
| 147 |
+
)
|
| 148 |
+
assert cfg.coach_model_chain() == ["tiny-aya-global", "minicpm5-1b"]
|
| 149 |
+
|
| 150 |
+
|
| 151 |
+
def test_resolve_aya_preset_uses_global_only():
|
| 152 |
+
assert resolve_aya_preset("fr", "auto") == "tiny-aya-global"
|
| 153 |
+
assert resolve_aya_preset("hi", "auto") == "tiny-aya-global"
|
| 154 |
+
assert resolve_aya_preset("en", "tiny-aya-water") == "tiny-aya-global"
|
| 155 |
+
|
| 156 |
+
|
| 157 |
+
def test_build_teacher_messages_includes_language_instruction():
|
| 158 |
+
messages = build_teacher_messages(
|
| 159 |
+
mode="lesson",
|
| 160 |
+
history=[],
|
| 161 |
+
user_text="Explique le fine-tuning.",
|
| 162 |
+
topic="ML",
|
| 163 |
+
language="fr",
|
| 164 |
+
)
|
| 165 |
+
assert "Target language: French" in messages[0]["content"]
|
| 166 |
+
assert "Reply ONLY in French" in messages[0]["content"]
|
| 167 |
+
|
| 168 |
+
|
| 169 |
def test_pitch_mode_system_prompt():
|
| 170 |
+
assert "public-speaking coach" in system_prompt_for_mode("pitch")
|
| 171 |
assert PITCH_SYSTEM == system_prompt_for_mode("pitch")
|
| 172 |
|
| 173 |
|
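`test_coach_model_chain_dedupes` above expects the primary coach model first, followed by fallbacks with repeats dropped. The `config.py` hunk is not shown in this section, so the following is only one way such a method might behave, written as a hypothetical free function rather than the actual `EchoCoachConfig.coach_model_chain`.

```python
from typing import Iterable, List


def coach_model_chain(coach_model: str, coach_fallbacks: Iterable[str]) -> List[str]:
    # Put the primary model first, then the fallbacks in their given order.
    ordered = [coach_model, *coach_fallbacks]
    # dict.fromkeys keeps first-seen order, so a fallback that repeats the
    # primary model is dropped; empty keys are skipped.
    return list(dict.fromkeys(key for key in ordered if key))


if __name__ == "__main__":
    print(coach_model_chain("tiny-aya-global", ("minicpm5-1b", "tiny-aya-global")))
```

Preserving order matters here because the chain presumably encodes preference: the first entry that loads is the one used.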
models.yaml
CHANGED
|
@@ -67,3 +67,27 @@ models:
|
|
| 67 |
backend: transformers
|
| 68 |
model_id: ./models/finetuned/minicpm5-1b-lora-merged
|
| 69 |
trust_remote_code: true
|
| 67 |
backend: transformers
|
| 68 |
model_id: ./models/finetuned/minicpm5-1b-lora-merged
|
| 69 |
trust_remote_code: true
|
| 70 |
+
|
| 71 |
+
tiny-aya-global:
|
| 72 |
+
label: Tiny Aya Global 3.3B (multilingual coach)
|
| 73 |
+
backend: transformers
|
| 74 |
+
model_id: CohereLabs/tiny-aya-global
|
| 75 |
+
trust_remote_code: true
|
| 76 |
+
|
| 77 |
+
tiny-aya-water:
|
| 78 |
+
label: Tiny Aya Water 3.3B (European / Asia-Pacific)
|
| 79 |
+
backend: transformers
|
| 80 |
+
model_id: CohereLabs/tiny-aya-water
|
| 81 |
+
trust_remote_code: true
|
| 82 |
+
|
| 83 |
+
tiny-aya-fire:
|
| 84 |
+
label: Tiny Aya Fire 3.3B (South Asian)
|
| 85 |
+
backend: transformers
|
| 86 |
+
model_id: CohereLabs/tiny-aya-fire
|
| 87 |
+
trust_remote_code: true
|
| 88 |
+
|
| 89 |
+
tiny-aya-earth:
|
| 90 |
+
label: Tiny Aya Earth 3.3B (West Asian / African)
|
| 91 |
+
backend: transformers
|
| 92 |
+
model_id: CohereLabs/tiny-aya-earth
|
| 93 |
+
trust_remote_code: true
|
voice_models.yaml
CHANGED
|
@@ -2,11 +2,13 @@
|
|
| 2 |
# Override defaults via ECHOCOACH_ASR_PRESET / ECHOCOACH_TTS_PRESET in .env
|
| 3 |
|
| 4 |
defaults:
|
| 5 |
-
asr_preset:
|
| 6 |
tts_preset: piper-multilingual
|
| 7 |
# Realtime streaming TTS for TeacherVoice VoiceOut (set ECHOCOACH_TTS_PRESET to match)
|
| 8 |
realtime_tts_preset: vibevoice-realtime-0.5b
|
| 9 |
-
coach_model:
|
|
|
|
|
|
|
| 10 |
max_seconds: 30
|
| 11 |
|
| 12 |
languages:
|
|
@@ -75,7 +77,7 @@ tts:
|
|
| 75 |
pt: pt_BR-faber-medium
|
| 76 |
nl: nl_NL-mls-medium
|
| 77 |
pl: pl_PL-darkman-medium
|
| 78 |
-
el:
|
| 79 |
ar: ar_JO-kareem-medium
|
| 80 |
ja: ja_JP-natsuki-medium
|
| 81 |
zh: zh_CN-huayan-medium
|
|
|
|
| 2 |
# Override defaults via ECHOCOACH_ASR_PRESET / ECHOCOACH_TTS_PRESET in .env
|
| 3 |
|
| 4 |
defaults:
|
| 5 |
+
asr_preset: cohere-transcribe
|
| 6 |
tts_preset: piper-multilingual
|
| 7 |
# Realtime streaming TTS for TeacherVoice VoiceOut (set ECHOCOACH_TTS_PRESET to match)
|
| 8 |
realtime_tts_preset: vibevoice-realtime-0.5b
|
| 9 |
+
coach_model: tiny-aya-global
|
| 10 |
+
coach_fallbacks:
|
| 11 |
+
- minicpm5-1b
|
| 12 |
max_seconds: 30
|
| 13 |
|
| 14 |
languages:
|
|
|
|
| 77 |
pt: pt_BR-faber-medium
|
| 78 |
nl: nl_NL-mls-medium
|
| 79 |
pl: pl_PL-darkman-medium
|
| 80 |
+
el: el_GR-rapunzelina-low
|
| 81 |
ar: ar_JO-kareem-medium
|
| 82 |
ja: ja_JP-natsuki-medium
|
| 83 |
zh: zh_CN-huayan-medium
|