Update: Professional React landing page

Changed files:
- .gitignore +1 -0
- README.md +11 -11
- assets/index-BGP_uT2P.css +1 -0
- assets/index-BdlkdtJU.js +0 -0
- index.html +2 -2
- scripts/install.sh +3 -3
- scripts/start_robot_mode.sh +3 -3
- scripts/uninstall.sh +3 -3
- src/bot.py +606 -0
- src/pipecat_service.py +274 -0
- src/tars_bot.py +457 -0
- ui/README.md +4 -4
- ui/app.py +2 -2
.gitignore
CHANGED

@@ -18,6 +18,7 @@ __pycache__/
 
 # production
 /build
+/dist
 
 # misc
 .DS_Store
README.md
CHANGED

@@ -15,8 +15,8 @@ Real-time voice AI with transcription, vision, and intelligent conversation usin
 ## Features
 
 - **Dual Operation Modes**
-  - **WebRTC Mode** (`bot.py`) - Browser-based voice AI with real-time metrics dashboard
-  - **Robot Mode** (`tars_bot.py`) - Connect to Raspberry Pi TARS robot via WebRTC and gRPC
+  - **WebRTC Mode** (`src/bot.py`) - Browser-based voice AI with real-time metrics dashboard
+  - **Robot Mode** (`src/tars_bot.py`) - Connect to Raspberry Pi TARS robot via WebRTC and gRPC
 - **Real-time Transcription** - Speechmatics or Deepgram with smart turn detection
 - **Dual TTS Options** - Qwen3-TTS (local, free, voice cloning) or ElevenLabs (cloud)
 - **LLM Integration** - Any model via DeepInfra

@@ -32,9 +32,9 @@ Real-time voice AI with transcription, vision, and intelligent conversation usin
 
 ```
 tars-conversation-app/
-├── bot.py                 # WebRTC mode - Browser voice AI
-├── tars_bot.py            # Robot mode - Raspberry Pi hardware
-├── pipecat_service.py     # FastAPI backend (WebRTC signaling)
+├── src/bot.py             # WebRTC mode - Browser voice AI
+├── src/tars_bot.py        # Robot mode - Raspberry Pi hardware
+├── src/pipecat_service.py # FastAPI backend (WebRTC signaling)
 ├── config.py              # Configuration management
 ├── config.ini             # User configuration file
 ├── requirements.txt       # Python dependencies

@@ -62,14 +62,14 @@ tars-conversation-app/
 
 ## Operation Modes
 
-### WebRTC Mode (`bot.py`)
+### WebRTC Mode (`src/bot.py`)
 - **Use case**: Browser-based voice AI conversations
 - **Transport**: SmallWebRTC (browser → Pipecat)
 - **Features**: Full pipeline with STT, LLM, TTS, Memory
 - **UI**: Gradio dashboard for metrics and transcription
 - **Best for**: Development, testing, remote conversations
 
-### Robot Mode (`tars_bot.py`)
+### Robot Mode (`src/tars_bot.py`)
 - **Use case**: Physical TARS robot on Raspberry Pi
 - **Transport**: aiortc (RPi → Pipecat) + gRPC (commands)
 - **Features**: Same pipeline + robot control (eyes, gestures, movement)

@@ -159,7 +159,7 @@ type = hybrid # SQLite-based hybrid search (vector + BM25)
 
 **Terminal 1: Python backend**
 ```bash
-python pipecat_service.py
+python src/pipecat_service.py
 ```
 
 **Terminal 2: Gradio UI (optional)**

@@ -197,7 +197,7 @@ Deployment detection:
 
 Run:
 ```bash
-python tars_bot.py
+python src/tars_bot.py
 ```
 
 ## Gradio Dashboard

@@ -268,7 +268,7 @@ See [docs/DEVELOPING_APPS.md](docs/DEVELOPING_APPS.md) for comprehensive guide o
 ### Adding Tools
 1. Create function in `src/tools/`
 2. Create schema with `create_*_schema()`
-3. Register in `bot.py` or `tars_bot.py`
+3. Register in `src/bot.py` or `src/tars_bot.py`
 4. LLM can now call your tool
 
 ### Modifying UI

@@ -287,7 +287,7 @@ Removes virtual environment and optionally data/config files.
 ## Troubleshooting
 
 ### No metrics in Gradio UI
-- Ensure bot is running (`bot.py` or `tars_bot.py`)
+- Ensure bot is running (`src/bot.py` or `src/tars_bot.py`)
 - Check WebRTC client is connected
 - Verify at least one conversation turn completed
 
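The README's "Adding Tools" steps boil down to a function plus a schema-factory pair that gets registered with the bot. A minimal sketch of that shape, with hypothetical names (`get_weather`, `create_weather_schema`) and a plain OpenAI-style function-calling schema standing in for the project's own schema types:

```python
# Hypothetical tool, following the README's steps 1-2 (e.g. src/tools/weather.py).
def get_weather(city: str) -> str:
    # A real tool would call a weather API; this stub only illustrates the shape.
    return f"Sunny in {city}"


def create_weather_schema() -> dict:
    # OpenAI-style function-calling schema; the project wraps schemas like this
    # in its own types before registering them in src/bot.py or src/tars_bot.py.
    return {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
```

Once registered (step 3), the LLM can select the tool by its schema `name` and call it with JSON arguments matching `parameters` (step 4).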
assets/index-BGP_uT2P.css
ADDED

@@ -0,0 +1 @@
+[one minified line of generated Tailwind CSS: preflight reset, theme CSS variables (--background, --foreground, --radius, ...), and the utility classes used by the landing page]
assets/index-BdlkdtJU.js
ADDED

The diff for this file is too large to render. See raw diff.
index.html
CHANGED

@@ -6,8 +6,8 @@
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
     <meta name="description" content="Real-time conversational AI with transcription, vision, and intelligent conversation for TARS robot" />
     <title>TARS Conversation App - Real-time AI Voice Assistant</title>
-    <script type="module" crossorigin src="/assets/index-
-    <link rel="stylesheet" crossorigin href="/assets/index-
+    <script type="module" crossorigin src="/assets/index-BdlkdtJU.js"></script>
+    <link rel="stylesheet" crossorigin href="/assets/index-BGP_uT2P.css">
   </head>
   <body>
     <div id="root"></div>
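The new asset references embed a content hash in the filename (e.g. `index-BdlkdtJU.js`), so browsers can cache bundles indefinitely while still fetching fresh bytes whenever the build output changes. A sketch of the idea; `hashed_asset_name` is a hypothetical helper, and Vite's actual hash encoding differs from the plain hex digest used here:

```python
import hashlib

def hashed_asset_name(stem: str, ext: str, content: bytes, length: int = 8) -> str:
    # Embed a short digest of the file contents in the filename, so any change
    # to the bundle produces a new URL and invalidates cached copies.
    digest = hashlib.sha256(content).hexdigest()[:length]
    return f"{stem}-{digest}.{ext}"
```

Because `index.html` itself is served uncached and rewritten on every build, it always points at the current hashed bundle names.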
scripts/install.sh
CHANGED

@@ -89,11 +89,11 @@ if [ "$CONFIG_CREATED" = true ] || [ "$ENV_CREATED" = true ]; then
     [ "$ENV_CREATED" = true ] && echo "  - Add API keys to: $APP_DIR/.env.local"
     [ "$CONFIG_CREATED" = true ] && echo "  - Configure settings: $APP_DIR/config.ini"
     echo "2. Activate environment: source $APP_DIR/venv/bin/activate"
-    echo "3. Run the app: python $APP_DIR/tars_bot.py"
+    echo "3. Run the app: python $APP_DIR/src/tars_bot.py"
 else
     echo "1. Activate environment: source $APP_DIR/venv/bin/activate"
-    echo "2. Run the app: python $APP_DIR/tars_bot.py"
+    echo "2. Run the app: python $APP_DIR/src/tars_bot.py"
 fi
 echo
-echo "For browser mode: python $APP_DIR/bot.py"
+echo "For browser mode: python $APP_DIR/src/bot.py"
 echo "For dashboard: python $APP_DIR/ui/app.py"
scripts/start_robot_mode.sh
CHANGED

@@ -58,8 +58,8 @@ if [ -d ".venv" ]; then
 fi
 
 # Check if tars_bot.py exists
-if [ ! -f "tars_bot.py" ]; then
-    echo "❌ Error: tars_bot.py not found"
+if [ ! -f "src/tars_bot.py" ]; then
+    echo "❌ Error: src/tars_bot.py not found"
     exit 1
 fi
 

@@ -84,5 +84,5 @@ else
     echo "⚠️  Note: Audio bridge integration is in progress"
     echo "   See IMPLEMENTATION_SUMMARY.md for current status"
     echo ""
-    python tars_bot.py
+    python src/tars_bot.py
 fi
scripts/uninstall.sh
CHANGED

@@ -10,9 +10,9 @@ echo
 
 # Stop running processes
 echo "Stopping running processes..."
-pkill -f "python.*tars_bot.py" || true
-pkill -f "python.*bot.py" || true
-pkill -f "python.*pipecat_service.py" || true
+pkill -f "python.*src/tars_bot.py" || true
+pkill -f "python.*src/bot.py" || true
+pkill -f "python.*src/pipecat_service.py" || true
 pkill -f "python.*ui/app.py" || true
 sleep 1
 echo "Processes stopped"
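The updated `pkill -f` calls match each process's full command line against a regex, so path-qualified invocations (e.g. `/usr/bin/python3 /app/src/bot.py`) are still caught. A small sketch of that matching logic, with the dots escaped (the shell patterns above leave them unescaped, where `.` matches any character):

```python
import re

# Regexes mirroring the updated pkill -f patterns in scripts/uninstall.sh.
KILL_PATTERNS = [
    r"python.*src/tars_bot\.py",
    r"python.*src/bot\.py",
    r"python.*src/pipecat_service\.py",
]

def would_be_killed(cmdline: str) -> bool:
    # pkill -f does an unanchored regex search over the full command line.
    return any(re.search(p, cmdline) for p in KILL_PATTERNS)
```

Note that after moving the scripts into `src/`, the old patterns (`python.*tars_bot.py`) would no longer match the new command lines, which is why all three patterns were updated together.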
src/bot.py
ADDED

@@ -0,0 +1,606 @@
| 1 |
+
"""Bot pipeline setup and execution."""
|
| 2 |
+
|
| 3 |
+
import sys
|
| 4 |
+
from pathlib import Path
|
| 5 |
+
|
| 6 |
+
# Add src directory to Python path for imports
|
| 7 |
+
src_dir = Path(__file__).parent
|
| 8 |
+
sys.path.insert(0, str(src_dir))
|
| 9 |
+
|
| 10 |
+
import asyncio
|
| 11 |
+
import json
|
| 12 |
+
import os
|
| 13 |
+
import logging
|
| 14 |
+
import uuid
|
| 15 |
+
import httpx
|
| 16 |
+
|
| 17 |
+
from pipecat.adapters.schemas.tools_schema import ToolsSchema
|
| 18 |
+
from pipecat.frames.frames import (
|
| 19 |
+
LLMRunFrame,
|
| 20 |
+
TranscriptionFrame,
|
| 21 |
+
InterimTranscriptionFrame,
|
| 22 |
+
Frame,
|
| 23 |
+
TranscriptionMessage,
|
| 24 |
+
TranslationFrame,
|
| 25 |
+
UserImageRawFrame,
|
| 26 |
+
UserAudioRawFrame,
|
| 27 |
+
UserImageRequestFrame,
|
| 28 |
+
)
|
| 29 |
+
from pipecat.processors.frame_processor import FrameProcessor, FrameDirection
|
| 30 |
+
from pipecat.pipeline.pipeline import Pipeline
|
| 31 |
+
from pipecat.pipeline.runner import PipelineRunner
|
| 32 |
+
from pipecat.pipeline.task import PipelineTask, PipelineParams
|
| 33 |
+
from pipecat.processors.aggregators.llm_context import LLMContext
|
| 34 |
+
from pipecat.processors.aggregators.llm_response_universal import (
|
| 35 |
+
LLMContextAggregatorPair,
|
| 36 |
+
LLMUserAggregatorParams
|
| 37 |
+
)
|
| 38 |
+
from pipecat.observers.turn_tracking_observer import TurnTrackingObserver
|
| 39 |
+
from pipecat.observers.loggers.user_bot_latency_log_observer import UserBotLatencyLogObserver
|
| 40 |
+
from pipecat.services.moondream.vision import MoondreamService
|
| 41 |
+
from pipecat.services.openai.llm import OpenAILLMService
|
| 42 |
+
from pipecat.services.llm_service import FunctionCallParams
|
| 43 |
+
from services.memory_hybrid import HybridMemoryService
|
| 44 |
+
from pipecat.transcriptions.language import Language
|
| 45 |
+
from pipecat.transports.base_transport import TransportParams
|
| 46 |
+
from pipecat.transports.smallwebrtc.transport import SmallWebRTCTransport
|
| 47 |
+
|
| 48 |
+
from loguru import logger
|
| 49 |
+
|
| 50 |
+
from config import (
|
| 51 |
+
SPEECHMATICS_API_KEY,
|
| 52 |
+
DEEPGRAM_API_KEY,
|
| 53 |
+
ELEVENLABS_API_KEY,
|
| 54 |
+
ELEVENLABS_VOICE_ID,
|
| 55 |
+
DEEPINFRA_API_KEY,
|
| 56 |
+
DEEPINFRA_BASE_URL,
|
| 57 |
+
MEM0_API_KEY,
|
| 58 |
+
get_fresh_config,
|
| 59 |
+
)
|
| 60 |
+
from services.factories import create_stt_service, create_tts_service
|
| 61 |
+
from processors import (
|
| 62 |
+
SilenceFilter,
|
| 63 |
+
InputAudioFilter,
|
| 64 |
+
InterventionGating,
|
| 65 |
+
VisualObserver,
|
| 66 |
+
EmotionalStateMonitor,
|
| 67 |
+
)
|
| 68 |
+
from observers import (
|
| 69 |
+
MetricsObserver,
|
| 70 |
+
TranscriptionObserver,
|
| 71 |
+
AssistantResponseObserver,
|
| 72 |
+
TTSStateObserver,
|
| 73 |
+
VisionObserver,
|
| 74 |
+
DebugObserver,
|
| 75 |
+
    DisplayEventsObserver,
)
from character.prompts import (
    load_persona_ini,
    load_tars_json,
    build_tars_system_prompt,
    get_introduction_instruction,
)
from tools import (
    fetch_user_image,
    adjust_persona_parameter,
    execute_movement,
    capture_camera_view,
    create_fetch_image_schema,
    create_adjust_persona_schema,
    create_identity_schema,
    create_movement_schema,
    create_camera_capture_schema,
    get_persona_storage,
    get_crossword_hint,
    create_crossword_hint_schema,
)
from shared_state import metrics_store


# ============================================================================
# CUSTOM FRAME PROCESSORS
# ============================================================================

class IdentityUnifier(FrameProcessor):
    """
    Applies the guest ID ONLY to specific user-input frames.
    Leaves all other frames untouched.
    """

    # Frame types that should have user_id set
    TARGET_FRAME_TYPES = (
        TranscriptionFrame,
        TranscriptionMessage,
        TranslationFrame,
        InterimTranscriptionFrame,
        UserImageRawFrame,
        UserAudioRawFrame,
        UserImageRequestFrame,
    )

    def __init__(self, target_user_id):
        super().__init__()
        self.target_user_id = target_user_id

    async def process_frame(self, frame: Frame, direction: FrameDirection):
        # 1. Handle internal state
        await super().process_frame(frame, direction)

        # 2. Only modify the targeted frame types
        if isinstance(frame, self.TARGET_FRAME_TYPES):
            try:
                frame.user_id = self.target_user_id
            except Exception:
                pass

        # 3. Push downstream
        await self.push_frame(frame, direction)

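The tagging pattern in `IdentityUnifier` can be exercised in isolation. A minimal sketch with stand-in frame classes (illustrative only; the real frame types come from Pipecat and are imported above):

```python
from dataclasses import dataclass

# Stand-in frame types (hypothetical; the real ones are Pipecat's frame classes)
@dataclass
class FakeTranscriptionFrame:
    text: str
    user_id: str = ""

@dataclass
class FakeBotSpeakingFrame:
    pass

TARGET_TYPES = (FakeTranscriptionFrame,)

def tag_frame(frame, target_user_id):
    """Set user_id only on targeted user-input frame types; leave others alone."""
    if isinstance(frame, TARGET_TYPES):
        frame.user_id = target_user_id
    return frame

user_frame = tag_frame(FakeTranscriptionFrame(text="hello"), "guest_ab12cd34")
bot_frame = tag_frame(FakeBotSpeakingFrame(), "guest_ab12cd34")
print(user_frame.user_id)             # guest_ab12cd34
print(hasattr(bot_frame, "user_id"))  # False
```

The isinstance check against a tuple of types is what keeps bot-side frames from being mislabeled as user input.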
# ============================================================================
# HELPER FUNCTIONS
# ============================================================================

async def _cleanup_services(service_refs: dict):
    if service_refs.get("stt"):
        try:
            await service_refs["stt"].close()
            logger.info("✓ STT service cleaned up")
        except Exception:
            pass
    if service_refs.get("tts"):
        try:
            await service_refs["tts"].close()
            logger.info("✓ TTS service cleaned up")
        except Exception:
            pass


# ============================================================================
# MAIN BOT PIPELINE
# ============================================================================

async def run_bot(webrtc_connection):
    """Initialize and run the TARS bot pipeline."""
    logger.info("Starting bot pipeline for WebRTC connection...")

    # Load fresh configuration for this connection (allows runtime config updates)
    runtime_config = get_fresh_config()
    DEEPINFRA_MODEL = runtime_config['DEEPINFRA_MODEL']
    DEEPINFRA_GATING_MODEL = runtime_config['DEEPINFRA_GATING_MODEL']
    STT_PROVIDER = runtime_config['STT_PROVIDER']
    TTS_PROVIDER = runtime_config['TTS_PROVIDER']
    QWEN3_TTS_MODEL = runtime_config['QWEN3_TTS_MODEL']
    QWEN3_TTS_DEVICE = runtime_config['QWEN3_TTS_DEVICE']
    QWEN3_TTS_REF_AUDIO = runtime_config['QWEN3_TTS_REF_AUDIO']
    EMOTIONAL_MONITORING_ENABLED = runtime_config['EMOTIONAL_MONITORING_ENABLED']
    EMOTIONAL_SAMPLING_INTERVAL = runtime_config['EMOTIONAL_SAMPLING_INTERVAL']
    EMOTIONAL_INTERVENTION_THRESHOLD = runtime_config['EMOTIONAL_INTERVENTION_THRESHOLD']
    TARS_DISPLAY_URL = runtime_config['TARS_DISPLAY_URL']
    TARS_DISPLAY_ENABLED = runtime_config['TARS_DISPLAY_ENABLED']

    logger.info(f"Runtime config loaded - STT: {STT_PROVIDER}, LLM: {DEEPINFRA_MODEL}, TTS: {TTS_PROVIDER}, Emotional: {EMOTIONAL_MONITORING_ENABLED}")

    # Session initialization
    session_id = str(uuid.uuid4())[:8]
    client_id = f"guest_{session_id}"
    client_state = {"client_id": client_id}
    logger.info(f"Session started: {client_id}")

    service_refs = {"stt": None, "tts": None}

    try:
        # ====================================================================
        # TRANSPORT INITIALIZATION
        # ====================================================================
        # Note: STT providers handle their own turn detection:
        # - Speechmatics: SMART_TURN mode
        # - Deepgram: endpointing parameter (300ms silence detection)
        # - Deepgram Flux: built-in turn detection with ExternalUserTurnStrategies (deprecated)

        logger.info(f"Initializing transport with {STT_PROVIDER} turn detection...")

        transport_params = TransportParams(
            audio_in_enabled=True,
            audio_out_enabled=True,
            video_in_enabled=False,
            video_out_enabled=False,
            video_out_is_live=False,
        )

        pipecat_transport = SmallWebRTCTransport(
            webrtc_connection=webrtc_connection,
            params=transport_params,
        )

        logger.info("✓ Transport initialized")

        # ====================================================================
        # SPEECH-TO-TEXT SERVICE
        # ====================================================================

        logger.info(f"Initializing {STT_PROVIDER} STT...")
        stt = None
        try:
            stt = create_stt_service(
                provider=STT_PROVIDER,
                speechmatics_api_key=SPEECHMATICS_API_KEY,
                deepgram_api_key=DEEPGRAM_API_KEY,
                language=Language.EN,
                enable_diarization=False,
            )
            service_refs["stt"] = stt

            # Log additional info for Deepgram
            if STT_PROVIDER == "deepgram":
                logger.info("✓ Deepgram: 300ms endpointing for turn detection")
                logger.info("✓ Deepgram: VAD events enabled for speech detection")

        except Exception as e:
            logger.error(f"Failed to initialize {STT_PROVIDER} STT: {e}", exc_info=True)
            return

        # ====================================================================
        # TEXT-TO-SPEECH SERVICE
        # ====================================================================

        try:
            tts = create_tts_service(
                provider=TTS_PROVIDER,
                elevenlabs_api_key=ELEVENLABS_API_KEY,
                elevenlabs_voice_id=ELEVENLABS_VOICE_ID,
                qwen_model=QWEN3_TTS_MODEL,
                qwen_device=QWEN3_TTS_DEVICE,
                qwen_ref_audio=QWEN3_TTS_REF_AUDIO,
            )
            service_refs["tts"] = tts
        except Exception as e:
            logger.error(f"Failed to initialize TTS service: {e}", exc_info=True)
            return

        # ====================================================================
        # LLM SERVICE & TOOLS
        # ====================================================================

        logger.info("Initializing LLM via DeepInfra...")
        llm = None
        try:
            llm = OpenAILLMService(
                api_key=DEEPINFRA_API_KEY,
                base_url=DEEPINFRA_BASE_URL,
                model=DEEPINFRA_MODEL
            )

            character_dir = os.path.join(os.path.dirname(__file__), "character")
            persona_params = load_persona_ini(os.path.join(character_dir, "persona.ini"))
            tars_data = load_tars_json(os.path.join(character_dir, "TARS.json"))
            system_prompt = build_tars_system_prompt(persona_params, tars_data)

            # Create tool schemas (these return FunctionSchema objects)
            fetch_image_tool = create_fetch_image_schema()
            persona_tool = create_adjust_persona_schema()
            identity_tool = create_identity_schema()
            crossword_hint_tool = create_crossword_hint_schema()
            movement_tool = create_movement_schema()
            camera_capture_tool = create_camera_capture_schema()

            # Pass FunctionSchema objects directly to standard_tools
            tools = ToolsSchema(
                standard_tools=[
                    fetch_image_tool,
                    persona_tool,
                    identity_tool,
                    crossword_hint_tool,
                    movement_tool,
                    camera_capture_tool,
                ]
            )
            messages = [system_prompt]
            context = LLMContext(messages, tools)

            llm.register_function("fetch_user_image", fetch_user_image)
            llm.register_function("adjust_persona_parameter", adjust_persona_parameter)
            llm.register_function("get_crossword_hint", get_crossword_hint)
            llm.register_function("execute_movement", execute_movement)
            llm.register_function("capture_camera_view", capture_camera_view)

            pipeline_unifier = IdentityUnifier(client_id)

            async def wrapped_set_identity(params: FunctionCallParams):
                name = params.arguments["name"]
                logger.info(f"Identity discovered: {name}")

                old_id = client_state["client_id"]
                new_id = f"user_{name.lower().replace(' ', '_')}"

                if old_id != new_id:
                    logger.info(f"Switching user ID: {old_id} -> {new_id}")
                    client_state["client_id"] = new_id

                    # Update the pipeline unifier to use the new identity
                    pipeline_unifier.target_user_id = new_id
                    logger.info(f"✓ Updated pipeline unifier with new ID: {new_id}")

                    # Update memory service with the new user_id
                    if memory_service:
                        memory_service.user_id = new_id
                        logger.info(f"✓ Updated memory service user_id to: {new_id}")

                    # Notify frontend of the identity change
                    try:
                        if webrtc_connection and webrtc_connection.is_connected():
                            webrtc_connection.send_app_message({
                                "type": "identity_update",
                                "old_id": old_id,
                                "new_id": new_id,
                                "name": name
                            })
                            logger.info(f"Sent identity update to frontend: {new_id}")
                    except Exception as e:
                        logger.warning(f"Failed to send identity update to frontend: {e}")

                await params.result_callback(f"Identity updated to {name}.")

            llm.register_function("set_user_identity", wrapped_set_identity)
            logger.info(f"✓ LLM initialized with model: {DEEPINFRA_MODEL}")

        except Exception as e:
            logger.error(f"Failed to initialize LLM: {e}", exc_info=True)
            return

        # ====================================================================
        # VISION & GATING SERVICES
        # ====================================================================

        logger.info("Initializing Moondream vision service...")
        moondream = None
        try:
            moondream = MoondreamService(model="vikhyatk/moondream2", revision="2025-01-09")
            logger.info("✓ Moondream vision service initialized")
        except Exception as e:
            logger.error(f"Failed to initialize Moondream: {e}")
            return

        # ====================================================================
        # TARS DISPLAY - Note: Display control via gRPC in robot mode only
        # ====================================================================

        logger.info("TARS Display features available in robot mode (tars_bot.py)")
        tars_client = None

        logger.info("Initializing Visual Observer...")
        visual_observer = VisualObserver(
            vision_client=moondream,
            enable_face_detection=True,
            tars_client=tars_client
        )
        logger.info("✓ Visual Observer initialized")

        logger.info("Initializing Emotional State Monitor...")
        emotional_monitor = EmotionalStateMonitor(
            vision_client=moondream,
            model="vikhyatk/moondream2",
            sampling_interval=EMOTIONAL_SAMPLING_INTERVAL,
            intervention_threshold=EMOTIONAL_INTERVENTION_THRESHOLD,
            enabled=EMOTIONAL_MONITORING_ENABLED,
            auto_intervene=False,  # Let the gating layer handle intervention decisions
        )
        logger.info(f"✓ Emotional State Monitor initialized (enabled: {EMOTIONAL_MONITORING_ENABLED})")
        logger.info("  Mode: Integrated with gating layer for smarter decisions")

        logger.info("Initializing Gating Layer...")
        gating_layer = InterventionGating(
            api_key=DEEPINFRA_API_KEY,
            base_url=DEEPINFRA_BASE_URL,
            model=DEEPINFRA_GATING_MODEL,
            visual_observer=visual_observer,
            emotional_monitor=emotional_monitor
        )
        logger.info("✓ Gating Layer initialized with emotional state integration")

        # ====================================================================
        # MEMORY SERVICE
        # ====================================================================

        # Memory service: hybrid search combining vector similarity (70%) and BM25 keyword matching (30%)
        # Optimized for voice AI with a <50ms latency target
        logger.info("Initializing hybrid memory service...")
        memory_service = None
        try:
            memory_service = HybridMemoryService(
                user_id=client_id,
                db_path="./memory_data/memory.sqlite",
                search_limit=3,
                search_timeout_ms=100,  # Hybrid search needs ~60-80ms, allow buffer
                vector_weight=0.7,      # 70% semantic similarity
                bm25_weight=0.3,        # 30% keyword matching
                system_prompt_prefix="From our conversations:\n",
            )
            logger.info(f"✓ Hybrid memory service initialized for {client_id}")
        except Exception as e:
            logger.error(f"Failed to initialize hybrid memory service: {e}")
            logger.info("  Continuing without memory service...")
            memory_service = None  # Continue without memory if it fails

        # ====================================================================
        # CONTEXT AGGREGATOR & PERSONA STORAGE
        # ====================================================================

        # Configure user turn aggregation.
        # STT services (Speechmatics, Deepgram) handle turn detection internally.
        user_params = LLMUserAggregatorParams(
            user_turn_stop_timeout=1.5
        )

        context_aggregator = LLMContextAggregatorPair(
            context,
            user_params=user_params
        )

        persona_storage = get_persona_storage()
        persona_storage["persona_params"] = persona_params
        persona_storage["tars_data"] = tars_data
        persona_storage["context_aggregator"] = context_aggregator

        # ====================================================================
        # LOGGING PROCESSORS
        # ====================================================================

        transcription_observer = TranscriptionObserver(
            webrtc_connection=webrtc_connection,
            client_state=client_state
        )
        assistant_observer = AssistantResponseObserver(webrtc_connection=webrtc_connection)
        tts_state_observer = TTSStateObserver(webrtc_connection=webrtc_connection)
        vision_observer = VisionObserver(webrtc_connection=webrtc_connection)
        display_events_observer = DisplayEventsObserver(tars_client=tars_client)

        # Create MetricsObserver (non-intrusive monitoring outside the pipeline)
        metrics_observer = MetricsObserver(
            webrtc_connection=webrtc_connection,
            stt_service=stt
        )

        # Turn tracking observer (for debugging turn detection)
        turn_observer = TurnTrackingObserver()

        @turn_observer.event_handler("on_turn_started")
        async def on_turn_started(*args, **kwargs):
            turn_number = args[1] if len(args) > 1 else kwargs.get('turn_number', 0)
            logger.info(f"[TurnObserver] Turn STARTED: {turn_number}")
            # Notify the metrics observer of the new turn
            metrics_observer.start_turn(turn_number)

        @turn_observer.event_handler("on_turn_ended")
        async def on_turn_ended(*args, **kwargs):
            turn_number = args[1] if len(args) > 1 else kwargs.get('turn_number', 0)
            logger.info(f"[TurnObserver] Turn ENDED: {turn_number}")

        # ====================================================================
        # PIPELINE ASSEMBLY
        # ====================================================================

        logger.info("Creating audio/video pipeline...")

        pipeline = Pipeline([
            pipecat_transport.input(),
            # emotional_monitor,  # Real-time emotional state monitoring
            stt,
            pipeline_unifier,
            context_aggregator.user(),
            memory_service,  # Hybrid memory (70% vector + 30% BM25) for automatic recall/storage
            # gating_layer,  # AI decision system (with emotional state integration)
            llm,
            SilenceFilter(),
            tts,
            pipecat_transport.output(),
            context_aggregator.assistant(),
        ])

        # ====================================================================
        # EVENT HANDLERS
        # ====================================================================

        task_ref = {"task": None}

        @pipecat_transport.event_handler("on_client_connected")
        async def on_client_connected(transport, client):
            logger.info("Pipecat client connected")
            try:
                if webrtc_connection.is_connected():
                    webrtc_connection.send_app_message({"type": "system", "message": "Connection established"})

                    # Send service configuration info with provider and model details
                    llm_display = DEEPINFRA_MODEL.split('/')[-1] if '/' in DEEPINFRA_MODEL else DEEPINFRA_MODEL

                    if TTS_PROVIDER == "elevenlabs":
                        tts_display = "ElevenLabs: eleven_flash_v2_5"
                    else:
                        tts_model = QWEN3_TTS_MODEL.split('/')[-1] if '/' in QWEN3_TTS_MODEL else QWEN3_TTS_MODEL
                        tts_display = f"Qwen3-TTS: {tts_model}"

                    # Format the STT provider name for display
                    stt_display = {
                        "speechmatics": "Speechmatics",
                        "deepgram": "Deepgram Nova-2"
                    }.get(STT_PROVIDER, STT_PROVIDER.capitalize())

                    service_info = {
                        "stt": stt_display,
                        "memory": "Hybrid Search (SQLite)",
                        "llm": f"DeepInfra: {llm_display}",
                        "tts": tts_display
                    }

                    # Store in shared state for the Gradio UI
                    metrics_store.set_service_info(service_info)

                    # Send via WebRTC
                    webrtc_connection.send_app_message({
                        "type": "service_info",
                        **service_info
                    })
                    logger.info(f"Sent service info to frontend: STT={stt_display}, LLM={llm_display}, TTS={tts_display}")
            except Exception as e:
                logger.error(f"Error sending service info: {e}")

            if task_ref["task"]:
                verbosity = persona_params.get("verbosity", 10) if persona_params else 10
                intro_instruction = get_introduction_instruction(client_state['client_id'], verbosity)

                if context and hasattr(context, "messages"):
                    context.messages.append(intro_instruction)

                logger.info("Waiting for pipeline to warm up...")
                await asyncio.sleep(2.0)

                logger.info("Queueing initial LLM greeting...")
                await task_ref["task"].queue_frames([LLMRunFrame()])

        @pipecat_transport.event_handler("on_client_disconnected")
        async def on_client_disconnected(transport, client):
            logger.info("Pipecat client disconnected")
            if task_ref["task"]:
                await task_ref["task"].cancel()
            await _cleanup_services(service_refs)

        # ====================================================================
        # PIPELINE EXECUTION
        # ====================================================================

        # Enable built-in Pipecat metrics for latency tracking
        user_bot_latency_observer = UserBotLatencyLogObserver()

        task = PipelineTask(
            pipeline,
            params=PipelineParams(
                enable_metrics=True,             # Enable performance metrics (TTFB, latency)
                enable_usage_metrics=True,       # Enable LLM/TTS usage metrics
                report_only_initial_ttfb=False,  # Report all TTFB measurements
            ),
            observers=[
                turn_observer,
                metrics_observer,
                transcription_observer,
                assistant_observer,
                tts_state_observer,
                vision_observer,
                display_events_observer,    # Send events to TARS display
                user_bot_latency_observer,  # Measures total user-to-bot response time
            ],  # Non-intrusive monitoring
        )
        task_ref["task"] = task
        runner = PipelineRunner(handle_sigint=False)

        logger.info("Starting pipeline runner...")

        try:
            await runner.run(task)
        finally:
            await _cleanup_services(service_refs)

    except Exception as e:
        logger.error(f"Error in bot pipeline: {e}", exc_info=True)
    finally:
        await _cleanup_services(service_refs)
src/pipecat_service.py
ADDED
@@ -0,0 +1,274 @@
#!/usr/bin/env python3
"""
Pipecat.ai service for real-time transcription and TTS using SmallWebRTC.
Communicates directly with the browser via WebRTC.
"""

# Fix SSL certificate issues FIRST - before any SSL-using imports
import os
import sys
from pathlib import Path

# Add the src directory to the Python path for imports
src_dir = Path(__file__).parent
sys.path.insert(0, str(src_dir))

try:
    import certifi
    cert_file = certifi.where()
    os.environ['SSL_CERT_FILE'] = cert_file
    os.environ['REQUESTS_CA_BUNDLE'] = cert_file
    os.environ['CURL_CA_BUNDLE'] = cert_file
except ImportError:
    pass  # certifi not available, will use system certs

import ssl
from contextlib import asynccontextmanager

# Configure SSL to use certifi certificates for Python's ssl module.
# For development: disable SSL verification completely to avoid certificate issues.
# This MUST happen before any libraries that use SSL are imported.
try:
    import certifi
    cert_file = certifi.where()
    # Set environment variables for libraries that respect them
    os.environ['SSL_CERT_FILE'] = cert_file
    os.environ['REQUESTS_CA_BUNDLE'] = cert_file
    os.environ['CURL_CA_BUNDLE'] = cert_file

    # For Python's ssl module: use an unverified context for development.
    # This bypasses SSL certificate verification to avoid connection issues.
    ssl._create_default_https_context = ssl._create_unverified_context
except ImportError:
    # certifi not available: use unverified context (development only)
    ssl._create_default_https_context = ssl._create_unverified_context
except Exception:
    # If anything else fails, fall back to the unverified context
    ssl._create_default_https_context = ssl._create_unverified_context

import argparse
import logging

from fastapi import BackgroundTasks, FastAPI
from fastapi.middleware.cors import CORSMiddleware
from loguru import logger
from pipecat.transports.smallwebrtc.request_handler import (
    SmallWebRTCPatchRequest,
    SmallWebRTCRequest,
    SmallWebRTCRequestHandler,
)

from bot import run_bot
from config import (
    PIPECAT_HOST,
    PIPECAT_PORT,
    SPEECHMATICS_API_KEY,
    DEEPGRAM_API_KEY,
    ELEVENLABS_API_KEY,
    DEEPINFRA_API_KEY,
    STT_PROVIDER,
    TTS_PROVIDER,  # Only used for startup validation
    get_fresh_config,
)

# Remove the default loguru handler and set up custom logging
logger.remove(0)

# Configure standard logging
logging.basicConfig(level=logging.INFO)
standard_logger = logging.getLogger(__name__)

# Reduce noise from the websockets library - only log warnings and above
websockets_logger = logging.getLogger('websockets')
websockets_logger.setLevel(logging.WARNING)

# Log the SSL certificate configuration
try:
    import certifi
    logger.info(f"SSL Configuration: Using certificates from {certifi.where()}")
    logger.info(f"SSL_CERT_FILE env: {os.environ.get('SSL_CERT_FILE', 'not set')}")
except ImportError:
    logger.warning("certifi not available - SSL verification disabled for development")


@asynccontextmanager
async def lifespan(app: FastAPI):
    """Handle app lifespan events."""
    logger.info(f"Starting Pipecat service on http://{PIPECAT_HOST}:{PIPECAT_PORT}...")
    logger.info(f"STT Provider: {STT_PROVIDER}")
    logger.info(f"TTS Provider: {TTS_PROVIDER}")

    # Check required API keys based on the configured STT and TTS providers
    missing_keys = []
    if STT_PROVIDER == "speechmatics" and not SPEECHMATICS_API_KEY:
        missing_keys.append("SPEECHMATICS_API_KEY")
    if STT_PROVIDER == "deepgram" and not DEEPGRAM_API_KEY:
        missing_keys.append("DEEPGRAM_API_KEY")
    if not DEEPINFRA_API_KEY:
        missing_keys.append("DEEPINFRA_API_KEY")
    if TTS_PROVIDER == "elevenlabs" and not ELEVENLABS_API_KEY:
        missing_keys.append("ELEVENLABS_API_KEY")

    if missing_keys:
        logger.error(f"ERROR: Missing required API keys: {', '.join(missing_keys)}")
        sys.exit(1)

    yield  # Run app

    # Cleanup
    await small_webrtc_handler.close()
    logger.info("Shutting down...")

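The provider-conditional key check in `lifespan` reduces to a small pure function, shown here as an illustrative sketch (the key names match the config imports above):

```python
def find_missing_keys(stt_provider: str, tts_provider: str, keys: dict) -> list:
    """Return the names of required API keys that are unset for the chosen providers."""
    missing = []
    if stt_provider == "speechmatics" and not keys.get("SPEECHMATICS_API_KEY"):
        missing.append("SPEECHMATICS_API_KEY")
    if stt_provider == "deepgram" and not keys.get("DEEPGRAM_API_KEY"):
        missing.append("DEEPGRAM_API_KEY")
    # DeepInfra powers the LLM regardless of the STT/TTS choice
    if not keys.get("DEEPINFRA_API_KEY"):
        missing.append("DEEPINFRA_API_KEY")
    if tts_provider == "elevenlabs" and not keys.get("ELEVENLABS_API_KEY"):
        missing.append("ELEVENLABS_API_KEY")
    return missing

# Qwen3-TTS runs locally, so no ElevenLabs key is required in that mode:
print(find_missing_keys("deepgram", "qwen3", {"DEEPINFRA_API_KEY": "x"}))
# ['DEEPGRAM_API_KEY']
```

Failing fast at startup with the full list of missing keys beats discovering each one mid-call.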
app = FastAPI(lifespan=lifespan)

# Add CORS middleware
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],  # In production, replace with specific origins
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

# Initialize the SmallWebRTC request handler
small_webrtc_handler: SmallWebRTCRequestHandler = SmallWebRTCRequestHandler()


@app.post("/api/offer")
async def offer(request: SmallWebRTCRequest, background_tasks: BackgroundTasks):
    """Handle WebRTC offer requests via SmallWebRTCRequestHandler."""
    logger.debug("Received WebRTC offer request")

    # Callback that runs the bot in the background once the connection is ready
    async def webrtc_connection_callback(connection):
        background_tasks.add_task(run_bot, connection)

    # Delegate handling to SmallWebRTCRequestHandler
    answer = await small_webrtc_handler.handle_web_request(
        request=request,
        webrtc_connection_callback=webrtc_connection_callback,
    )
    return answer


@app.patch("/api/offer")
async def ice_candidate(request: SmallWebRTCPatchRequest):
    """Handle ICE candidate patch requests."""
    logger.debug("Received ICE candidate patch request")
    await small_webrtc_handler.handle_patch_request(request)
    return {"status": "success"}


@app.get("/api/status")
async def status():
    """Health check endpoint with fresh config values."""
    # Get the current config from config.ini
    current_config = get_fresh_config()
    current_stt = current_config['STT_PROVIDER']
    current_tts = current_config['TTS_PROVIDER']
    current_model = current_config['DEEPINFRA_MODEL']

    return {
        "status": "ok",
        "stt_provider": current_stt,
        "tts_provider": current_tts,
        "llm_model": current_model,
        "speechmatics_configured": bool(SPEECHMATICS_API_KEY) if current_stt == "speechmatics" else None,
        "deepgram_configured": bool(DEEPGRAM_API_KEY) if current_stt == "deepgram" else None,
        "elevenlabs_configured": bool(ELEVENLABS_API_KEY) if current_tts == "elevenlabs" else None,
        "deepinfra_configured": bool(DEEPINFRA_API_KEY),
        "qwen3_tts_configured": True if current_tts == "qwen3" else None,
    }


@app.get("/api/config")
async def get_config():
    """Get the current configuration from config.ini."""
    import configparser

    config = configparser.ConfigParser()
    config_path = Path("config.ini")

    if not config_path.exists():
        return {"error": "config.ini not found"}

    config.read(config_path)

    return {
        "llm": {
            "model": config.get("LLM", "model", fallback="Qwen/Qwen3-235B-A22B-Instruct-2507")
        },
        "stt": {
            "provider": config.get("STT", "provider", fallback="speechmatics")
        },
        "tts": {
            "provider": config.get("TTS", "provider", fallback="qwen3"),
            "qwen3_model": config.get("TTS", "qwen3_model", fallback="Qwen/Qwen3-TTS-12Hz-0.6B-Base"),
            "qwen3_device": config.get("TTS", "qwen3_device", fallback="mps"),
            "qwen3_ref_audio": config.get("TTS", "qwen3_ref_audio", fallback="tars-clean-compressed.mp3"),
        }
    }


@app.post("/api/config")
async def update_config(request: dict):
    """Update the configuration in config.ini."""
    import configparser

    config = configparser.ConfigParser()
    config_path = Path("config.ini")

    if not config_path.exists():
        return {"error": "config.ini not found"}

    config.read(config_path)

    # Update LLM config
    if "llm_model" in request:
        if not config.has_section("LLM"):
            config.add_section("LLM")
        config.set("LLM", "model", request["llm_model"])

    # Update STT config
    if "stt_provider" in request:
        if not config.has_section("STT"):
            config.add_section("STT")
        config.set("STT", "provider", request["stt_provider"])

    # Update TTS config
    if "tts_provider" in request:
        if not config.has_section("TTS"):
            config.add_section("TTS")
        config.set("TTS", "provider", request["tts_provider"])

    # Write back to file
    with open(config_path, "w") as f:
        config.write(f)

    return {
        "success": True,
        "message": "Configuration updated. Please restart the service for changes to take effect.",
        "restart_required": True
    }

| 257 |
+
if __name__ == "__main__":
|
| 258 |
+
parser = argparse.ArgumentParser(description="WebRTC Pipecat service")
|
| 259 |
+
parser.add_argument(
|
| 260 |
+
"--host", default=PIPECAT_HOST, help=f"Host for HTTP server (default: {PIPECAT_HOST})"
|
| 261 |
+
)
|
| 262 |
+
parser.add_argument(
|
| 263 |
+
"--port", type=int, default=PIPECAT_PORT, help=f"Port for HTTP server (default: {PIPECAT_PORT})"
|
| 264 |
+
)
|
| 265 |
+
parser.add_argument("--verbose", "-v", action="count")
|
| 266 |
+
args = parser.parse_args()
|
| 267 |
+
|
| 268 |
+
if args.verbose:
|
| 269 |
+
logger.add(sys.stderr, level="TRACE")
|
| 270 |
+
else:
|
| 271 |
+
logger.add(sys.stderr, level="INFO")
|
| 272 |
+
|
| 273 |
+
import uvicorn
|
| 274 |
+
uvicorn.run(app, host=args.host, port=args.port)
|
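The `POST /api/config` handler above is a plain configparser read-modify-write. A minimal standalone sketch of that round-trip (the `update_ini` helper is hypothetical; the section/key names mirror the ones the endpoint touches):

```python
import configparser
import tempfile
from pathlib import Path

def update_ini(path: Path, section: str, key: str, value: str) -> None:
    """Read an INI file, set one key, and write it back, as the endpoint does."""
    config = configparser.ConfigParser()
    config.read(path)
    if not config.has_section(section):
        config.add_section(section)
    config.set(section, key, value)
    with open(path, "w") as f:
        config.write(f)

# Round-trip demo against a throwaway config.ini
with tempfile.TemporaryDirectory() as d:
    path = Path(d) / "config.ini"
    path.write_text("[TTS]\nprovider = qwen3\n")
    update_ini(path, "TTS", "provider", "elevenlabs")
    update_ini(path, "LLM", "model", "Qwen/Qwen3-235B-A22B-Instruct-2507")

    config = configparser.ConfigParser()
    config.read(path)
    print(config.get("TTS", "provider"))  # elevenlabs
    print(config.get("LLM", "model"))
```

Note that `config.write()` rewrites the whole file, which is why the endpoint reads the existing config first and reports `restart_required` rather than hot-reloading.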
src/tars_bot.py
ADDED
|
@@ -0,0 +1,457 @@
"""
TARS Bot - Robot Mode

Pipecat pipeline that connects to Raspberry Pi TARS robot via WebRTC.
Uses aiortc client for bidirectional audio and DataChannel for state sync.

Architecture:
- RPi WebRTC Server (aiortc) <-> MacBook WebRTC Client (aiortc)
- Audio: RPi mic -> Pipeline -> RPi speaker
- State: DataChannel for real-time sync
- Commands: gRPC for robot control
"""

import sys
from pathlib import Path

# Add src directory to Python path for imports
src_dir = Path(__file__).parent
sys.path.insert(0, str(src_dir))

import asyncio
import os
import uuid

from loguru import logger

from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask, PipelineParams
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import (
    LLMContextAggregatorPair,
    LLMUserAggregatorParams,
)
from pipecat.services.openai.llm import OpenAILLMService
from pipecat.adapters.schemas.tools_schema import ToolsSchema
from pipecat.transcriptions.language import Language
from pipecat.frames.frames import LLMRunFrame

from config import (
    DEEPGRAM_API_KEY,
    SPEECHMATICS_API_KEY,
    ELEVENLABS_API_KEY,
    ELEVENLABS_VOICE_ID,
    DEEPINFRA_API_KEY,
    DEEPINFRA_BASE_URL,
    RPI_URL,
    RPI_GRPC,
    AUTO_CONNECT,
    RECONNECT_DELAY,
    MAX_RECONNECT_ATTEMPTS,
    get_fresh_config,
    detect_deployment_mode,
    get_robot_grpc_address,
)

from transport import AiortcRPiClient, AudioBridge, StateSync
from transport.audio_bridge import RPiAudioInputTrack, RPiAudioOutputTrack
from services.factories import create_stt_service, create_tts_service
from services import tars_robot
from services.update_checker import TarsUpdateChecker, CLIENT_VERSION
from processors import SilenceFilter
from observers import StateObserver
from character.prompts import (
    load_persona_ini,
    load_tars_json,
    build_tars_system_prompt,
    get_introduction_instruction,
)
from tools import (
    fetch_user_image,
    adjust_persona_parameter,
    execute_movement,
    capture_camera_view,
    create_fetch_image_schema,
    create_adjust_persona_schema,
    create_identity_schema,
    create_movement_schema,
    create_camera_capture_schema,
    get_persona_storage,
    set_emotion,
    do_gesture,
    create_emotion_schema,
    create_gesture_schema,
    set_rate_limiter,
    ExpressionRateLimiter,
)


async def run_robot_bot():
    """Run TARS bot in robot mode (connected to RPi via aiortc)."""
    logger.info("=" * 80)
    logger.info("Starting TARS in Robot Mode")
    logger.info("=" * 80)

    # Load fresh configuration
    runtime_config = get_fresh_config()
    DEEPINFRA_MODEL = runtime_config['DEEPINFRA_MODEL']
    STT_PROVIDER = runtime_config['STT_PROVIDER']
    TTS_PROVIDER = runtime_config['TTS_PROVIDER']
    QWEN3_TTS_MODEL = runtime_config['QWEN3_TTS_MODEL']
    QWEN3_TTS_DEVICE = runtime_config['QWEN3_TTS_DEVICE']
    QWEN3_TTS_REF_AUDIO = runtime_config['QWEN3_TTS_REF_AUDIO']
    TARS_DISPLAY_URL = runtime_config['TARS_DISPLAY_URL']
    TARS_DISPLAY_ENABLED = runtime_config['TARS_DISPLAY_ENABLED']

    # Detect deployment mode
    deployment_mode = detect_deployment_mode()
    robot_grpc_address = get_robot_grpc_address()

    logger.info("Configuration:")
    logger.info(f"  Client: v{CLIENT_VERSION}")
    logger.info(f"  Deployment: {deployment_mode}")
    logger.info(f"  STT: {STT_PROVIDER}")
    logger.info(f"  LLM: {DEEPINFRA_MODEL}")
    logger.info(f"  TTS: {TTS_PROVIDER}")
    logger.info(f"  RPi HTTP: {RPI_URL}")
    logger.info(f"  RPi gRPC: {robot_grpc_address}")
    logger.info(f"  Display: {TARS_DISPLAY_URL} ({'enabled' if TARS_DISPLAY_ENABLED else 'disabled'})")

    # Session initialization
    session_id = str(uuid.uuid4())[:8]
    client_id = f"guest_{session_id}"
    client_state = {"client_id": client_id}
    logger.info(f"Session: {client_id}")

    service_refs = {"stt": None, "tts": None, "robot_client": None, "aiortc_client": None}

    try:
        # ====================================================================
        # WEBRTC CONNECTION TO RPI
        # ====================================================================

        logger.info("Initializing WebRTC client...")
        aiortc_client = AiortcRPiClient(
            rpi_url=RPI_URL,
            auto_reconnect=True,
            reconnect_delay=RECONNECT_DELAY,
            max_reconnect_attempts=MAX_RECONNECT_ATTEMPTS,
        )
        service_refs["aiortc_client"] = aiortc_client

        # State sync via DataChannel
        state_sync = StateSync()

        # Set up callbacks
        @aiortc_client.on_connected
        async def on_connected():
            logger.info("WebRTC connected to RPi")
            state_sync.set_send_callback(aiortc_client.send_data_channel_message)

        @aiortc_client.on_disconnected
        async def on_disconnected():
            logger.warning("WebRTC disconnected from RPi")

        @aiortc_client.on_data_channel_message
        def on_data_message(message: str):
            state_sync.handle_message(message)

        # Register DataChannel message handlers
        state_sync.on_battery_update(lambda level, charging:
            logger.debug(f"Battery: {level}% ({'charging' if charging else 'discharging'})"))

        state_sync.on_movement_status(lambda moving, movement:
            logger.debug(f"Movement: {movement} ({'active' if moving else 'idle'})"))

        # Connect to RPi
        if AUTO_CONNECT:
            logger.info("Connecting to RPi...")
            connected = await aiortc_client.connect()
            if not connected:
                logger.error("Failed to connect to RPi. Exiting.")
                return
        else:
            logger.info("Auto-connect disabled. Waiting for manual connection.")
            return

        # Wait for audio track from RPi
        logger.info("Waiting for audio track from RPi...")
        timeout = 10
        start_time = asyncio.get_event_loop().time()
        while not aiortc_client.get_audio_track() and (asyncio.get_event_loop().time() - start_time) < timeout:
            await asyncio.sleep(0.1)

        audio_track_from_rpi = aiortc_client.get_audio_track()
        if not audio_track_from_rpi:
            logger.error("No audio track received from RPi. Exiting.")
            return

        logger.info("Received audio track from RPi")

        # ====================================================================
        # AUDIO BRIDGE SETUP
        # ====================================================================

        logger.info("Setting up audio bridge...")

        # Create audio input track (RPi mic -> Pipecat)
        rpi_input = RPiAudioInputTrack(
            aiortc_track=audio_track_from_rpi,
            sample_rate=16000,  # RPi mic sample rate
        )

        # Create audio output track (Pipecat TTS -> RPi speaker)
        rpi_output = RPiAudioOutputTrack(
            sample_rate=24000  # TTS output sample rate
        )

        # Add output track to WebRTC connection
        aiortc_client.add_audio_track(rpi_output)

        # Create audio bridge processor
        audio_bridge = AudioBridge(
            rpi_input_track=rpi_input,
            rpi_output_track=rpi_output
        )

        logger.info("Audio bridge ready")

        # ====================================================================
        # SPEECH-TO-TEXT SERVICE
        # ====================================================================

        logger.info(f"Initializing {STT_PROVIDER} STT...")
        stt = create_stt_service(
            provider=STT_PROVIDER,
            speechmatics_api_key=SPEECHMATICS_API_KEY,
            deepgram_api_key=DEEPGRAM_API_KEY,
            language=Language.EN,
            enable_diarization=False,
        )
        service_refs["stt"] = stt
        logger.info("STT initialized")

        # ====================================================================
        # TEXT-TO-SPEECH SERVICE
        # ====================================================================

        logger.info(f"Initializing {TTS_PROVIDER} TTS...")
        tts = create_tts_service(
            provider=TTS_PROVIDER,
            elevenlabs_api_key=ELEVENLABS_API_KEY,
            elevenlabs_voice_id=ELEVENLABS_VOICE_ID,
            qwen_model=QWEN3_TTS_MODEL,
            qwen_device=QWEN3_TTS_DEVICE,
            qwen_ref_audio=QWEN3_TTS_REF_AUDIO,
        )
        service_refs["tts"] = tts
        logger.info("TTS initialized")

        # ====================================================================
        # LLM SERVICE & TOOLS
        # ====================================================================

        logger.info("Initializing LLM...")
        llm = OpenAILLMService(
            api_key=DEEPINFRA_API_KEY,
            base_url=DEEPINFRA_BASE_URL,
            model=DEEPINFRA_MODEL
        )

        # Load character
        character_dir = os.path.join(os.path.dirname(__file__), "character")
        persona_params = load_persona_ini(os.path.join(character_dir, "persona.ini"))
        tars_data = load_tars_json(os.path.join(character_dir, "TARS.json"))
        system_prompt = build_tars_system_prompt(persona_params, tars_data)

        # Initialize expression rate limiter
        rate_limiter = ExpressionRateLimiter(
            min_emotion_interval=5.0,
            min_gesture_interval=30.0,
            max_gestures_per_session=3
        )
        set_rate_limiter(rate_limiter)

        # Create tool schemas
        tools = ToolsSchema(
            standard_tools=[
                create_fetch_image_schema(),
                create_adjust_persona_schema(),
                create_identity_schema(),
                create_movement_schema(),
                create_camera_capture_schema(),
                create_emotion_schema(),
                create_gesture_schema(),
            ]
        )

        messages = [system_prompt]
        context = LLMContext(messages, tools)

        # Register tool functions
        llm.register_function("fetch_user_image", fetch_user_image)
        llm.register_function("adjust_persona_parameter", adjust_persona_parameter)
        llm.register_function("execute_movement", execute_movement)
        llm.register_function("capture_camera_view", capture_camera_view)
        llm.register_function("set_emotion", set_emotion)
        llm.register_function("do_gesture", do_gesture)

        logger.info(f"LLM initialized with {DEEPINFRA_MODEL}")

        # ====================================================================
        # TARS ROBOT CLIENT (gRPC commands)
        # ====================================================================

        logger.info("Initializing TARS Robot Client (gRPC)...")
        robot_client = None
        if TARS_DISPLAY_ENABLED:
            try:
                robot_client = tars_robot.get_robot_client(address=robot_grpc_address)
                service_refs["robot_client"] = robot_client
                if robot_client and tars_robot.is_robot_available():
                    logger.info(f"TARS Robot Client connected via gRPC at {robot_grpc_address}")
                    tars_robot.set_eye_state("idle")

                    # Check daemon version
                    logger.info("Checking TARS daemon version...")
                    update_checker = TarsUpdateChecker(robot_client)
                    await update_checker.check_on_connect()
                else:
                    logger.warning("TARS Robot not available")
            except Exception as e:
                logger.warning(f"Could not initialize TARS Robot: {e}")
        else:
            logger.info("TARS Robot control disabled")

        # ====================================================================
        # CONTEXT AGGREGATOR
        # ====================================================================

        user_params = LLMUserAggregatorParams(
            user_turn_stop_timeout=1.5
        )

        context_aggregator = LLMContextAggregatorPair(
            context,
            user_params=user_params
        )

        persona_storage = get_persona_storage()
        persona_storage["persona_params"] = persona_params
        persona_storage["tars_data"] = tars_data
        persona_storage["context_aggregator"] = context_aggregator

        # ====================================================================
        # OBSERVERS
        # ====================================================================

        state_observer = StateObserver(state_sync=state_sync)

        # ====================================================================
        # PIPELINE ASSEMBLY
        # ====================================================================

        logger.info("Building pipeline...")

        pipeline = Pipeline([
            stt,
            context_aggregator.user(),
            llm,
            SilenceFilter(),
            tts,
            audio_bridge,  # Captures TTS output and sends to RPi speaker
            context_aggregator.assistant(),
        ])

        # ====================================================================
        # AUDIO INPUT FEEDING
        # ====================================================================

        # Task reference for audio feeding
        task_ref = {"task": None, "audio_task": None}

        async def feed_rpi_audio():
            """Feed audio frames from RPi mic into the pipeline."""
            logger.info("Starting audio input from RPi...")
            try:
                async for audio_frame in rpi_input.start():
                    if task_ref.get("task"):
                        await task_ref["task"].queue_frames([audio_frame])
            except Exception as e:
                logger.error(f"Audio input error: {e}", exc_info=True)
            finally:
                logger.info("Audio input stopped")

        # ====================================================================
        # PIPELINE EXECUTION
        # ====================================================================

        task = PipelineTask(
            pipeline,
            params=PipelineParams(
                enable_metrics=True,
                enable_usage_metrics=True,
                report_only_initial_ttfb=False,
            ),
            observers=[state_observer],
        )

        task_ref["task"] = task
        runner = PipelineRunner(handle_sigint=True)

        logger.info("Starting pipeline...")
        logger.info("=" * 80)

        # Start audio input feeding task
        audio_task = asyncio.create_task(feed_rpi_audio())
        task_ref["audio_task"] = audio_task

        # Send initial greeting
        await asyncio.sleep(2.0)
        intro_instruction = get_introduction_instruction(client_id, persona_params.get("verbosity", 10))
        if context and hasattr(context, "messages"):
            context.messages.append(intro_instruction)
        await task.queue_frames([LLMRunFrame()])

        # Run pipeline
        try:
            await runner.run(task)
        finally:
            # Cancel audio feeding task
            if task_ref.get("audio_task"):
                task_ref["audio_task"].cancel()
                try:
                    await task_ref["audio_task"]
                except asyncio.CancelledError:
                    pass

    except KeyboardInterrupt:
        logger.info("Interrupted by user")
    except Exception as e:
        logger.error(f"Error in robot bot: {e}", exc_info=True)
    finally:
        # Cleanup
        logger.info("Cleaning up...")
        if service_refs.get("aiortc_client"):
            await service_refs["aiortc_client"].disconnect()
        if service_refs.get("stt"):
            try:
                await service_refs["stt"].close()
            except Exception:
                pass
        if service_refs.get("tts"):
            try:
                await service_refs["tts"].close()
            except Exception:
                pass
        if service_refs.get("robot_client"):
            try:
                tars_robot.close_robot_client()
            except Exception:
                pass
        logger.info("Cleanup complete")


if __name__ == "__main__":
    asyncio.run(run_robot_bot())

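The `feed_rpi_audio` coroutine in `tars_bot.py` follows a standard asyncio pattern: a long-lived task drains an async generator into a consumer, and shutdown is a `cancel()` followed by awaiting the `CancelledError`. A toy sketch of just that pattern, with stand-in names (no Pipecat or aiortc involved):

```python
import asyncio

async def fake_mic():
    """Stand-in for the RPi audio track: yields 'frames' forever."""
    i = 0
    while True:
        await asyncio.sleep(0)  # simulate waiting on the transport
        yield f"frame-{i}"
        i += 1

async def main() -> list[str]:
    received: list[str] = []

    async def feeder():
        # Mirrors feed_rpi_audio: drain the generator into the consumer
        async for frame in fake_mic():
            received.append(frame)  # stand-in for task.queue_frames([...])

    task = asyncio.create_task(feeder())
    await asyncio.sleep(0.01)  # the "pipeline" runs for a while
    # Shutdown: cancel the feeder and swallow its CancelledError,
    # exactly as the finally-block in run_robot_bot does
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return received

frames = asyncio.run(main())
print(len(frames), frames[0])
```

Awaiting the cancelled task before returning matters: it guarantees the feeder's cleanup (the `finally:` log line in the real code) has run before the surrounding teardown continues.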
ui/README.md
CHANGED

@@ -39,7 +39,7 @@ Then open http://localhost:7861
 
 Terminal 1:
 ```bash
-python bot.py
+python src/bot.py
 ```
 
 Terminal 2:

@@ -52,7 +52,7 @@ python ui/app.py
 The UI reads from `src/shared_state.py`, which is populated by observers in the Pipecat pipeline:
 
 ```
-bot.py (Pipecat Pipeline)
+src/bot.py (Pipecat Pipeline)
     ↓
 src/observers/ (metrics, transcription, assistant)
     ↓

@@ -123,7 +123,7 @@ python tests/gradio/test_gradio.py
 ## Troubleshooting
 
 ### No data showing
-- Ensure bot.py is running
+- Ensure src/bot.py is running
 - Check that WebRTC client is connected
 - Verify at least one conversation turn has completed
 

@@ -133,7 +133,7 @@ pip install gradio plotly
 ```
 
 ### Charts not updating
-- Check that observers are enabled in bot.py
+- Check that observers are enabled in src/bot.py
 - Verify shared_state.py is being imported correctly
 - Check console for errors
 
ui/app.py
CHANGED

@@ -337,13 +337,13 @@ with gr.Blocks(
 gr.Markdown("""
 **To connect to TARS:**
 
-1. Ensure bot pipeline is running: `python bot.py`
+1. Ensure bot pipeline is running: `python src/bot.py`
 2. Open WebRTC client in browser
 3. Pipeline will connect automatically
 
 **Endpoints:**
 - WebRTC Signaling: Handled by SmallWebRTC transport
-- Health Check: Check bot.py logs for status
+- Health Check: Check src/bot.py logs for status
 
 **Architecture:**
 - Pipecat pipeline with STT, LLM, TTS