latishab committed · Commit 7fb83e4 · verified · Parent: 23453ac

Update: Professional React landing page
.gitignore CHANGED
@@ -18,6 +18,7 @@ __pycache__/
 
 # production
 /build
+/dist
 
 # misc
 .DS_Store
README.md CHANGED
@@ -15,8 +15,8 @@ Real-time voice AI with transcription, vision, and intelligent conversation usin
 ## Features
 
 - **Dual Operation Modes**
-  - **WebRTC Mode** (`bot.py`) - Browser-based voice AI with real-time metrics dashboard
-  - **Robot Mode** (`tars_bot.py`) - Connect to Raspberry Pi TARS robot via WebRTC and gRPC
+  - **WebRTC Mode** (`src/bot.py`) - Browser-based voice AI with real-time metrics dashboard
+  - **Robot Mode** (`src/tars_bot.py`) - Connect to Raspberry Pi TARS robot via WebRTC and gRPC
 - **Real-time Transcription** - Speechmatics or Deepgram with smart turn detection
 - **Dual TTS Options** - Qwen3-TTS (local, free, voice cloning) or ElevenLabs (cloud)
 - **LLM Integration** - Any model via DeepInfra
@@ -32,9 +32,9 @@ Real-time voice AI with transcription, vision, and intelligent conversation usin
 
 ```
 tars-conversation-app/
-├── bot.py                  # WebRTC mode - Browser voice AI
-├── tars_bot.py             # Robot mode - Raspberry Pi hardware
-├── pipecat_service.py      # FastAPI backend (WebRTC signaling)
+├── src/bot.py              # WebRTC mode - Browser voice AI
+├── src/tars_bot.py         # Robot mode - Raspberry Pi hardware
+├── src/pipecat_service.py  # FastAPI backend (WebRTC signaling)
 ├── config.py               # Configuration management
 ├── config.ini              # User configuration file
 ├── requirements.txt        # Python dependencies
@@ -62,14 +62,14 @@ tars-conversation-app/
 
 ## Operation Modes
 
-### WebRTC Mode (`bot.py`)
+### WebRTC Mode (`src/bot.py`)
 - **Use case**: Browser-based voice AI conversations
 - **Transport**: SmallWebRTC (browser ↔ Pipecat)
 - **Features**: Full pipeline with STT, LLM, TTS, Memory
 - **UI**: Gradio dashboard for metrics and transcription
 - **Best for**: Development, testing, remote conversations
 
-### Robot Mode (`tars_bot.py`)
+### Robot Mode (`src/tars_bot.py`)
 - **Use case**: Physical TARS robot on Raspberry Pi
 - **Transport**: aiortc (RPi ↔ Pipecat) + gRPC (commands)
 - **Features**: Same pipeline + robot control (eyes, gestures, movement)
@@ -159,7 +159,7 @@ type = hybrid # SQLite-based hybrid search (vector + BM25)
 
 **Terminal 1: Python backend**
 ```bash
-python pipecat_service.py
+python src/pipecat_service.py
 ```
 
 **Terminal 2: Gradio UI (optional)**
@@ -197,7 +197,7 @@ Deployment detection:
 
 Run:
 ```bash
-python tars_bot.py
+python src/tars_bot.py
 ```
 
 ## Gradio Dashboard
@@ -268,7 +268,7 @@ See [docs/DEVELOPING_APPS.md](docs/DEVELOPING_APPS.md) for comprehensive guide o
 ### Adding Tools
 1. Create function in `src/tools/`
 2. Create schema with `create_*_schema()`
-3. Register in `bot.py` or `tars_bot.py`
+3. Register in `src/bot.py` or `src/tars_bot.py`
 4. LLM can now call your tool
 
 ### Modifying UI
@@ -287,7 +287,7 @@ Removes virtual environment and optionally data/config files.
 ## Troubleshooting
 
 ### No metrics in Gradio UI
-- Ensure bot is running (`bot.py` or `tars_bot.py`)
+- Ensure bot is running (`src/bot.py` or `src/tars_bot.py`)
 - Check WebRTC client is connected
 - Verify at least one conversation turn completed
 
assets/index-BGP_uT2P.css ADDED
@@ -0,0 +1 @@
+ *,:before,:after{box-sizing:border-box;border-width:0;border-style:solid;border-color:#e5e7eb}:before,:after{--tw-content: ""}html,:host{line-height:1.5;-webkit-text-size-adjust:100%;-moz-tab-size:4;-o-tab-size:4;tab-size:4;font-family:ui-sans-serif,system-ui,sans-serif,"Apple Color Emoji","Segoe UI Emoji",Segoe UI Symbol,"Noto Color Emoji";font-feature-settings:normal;font-variation-settings:normal;-webkit-tap-highlight-color:transparent}body{margin:0;line-height:inherit}hr{height:0;color:inherit;border-top-width:1px}abbr:where([title]){-webkit-text-decoration:underline dotted;text-decoration:underline dotted}h1,h2,h3,h4,h5,h6{font-size:inherit;font-weight:inherit}a{color:inherit;text-decoration:inherit}b,strong{font-weight:bolder}code,kbd,samp,pre{font-family:ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,Liberation Mono,Courier New,monospace;font-feature-settings:normal;font-variation-settings:normal;font-size:1em}small{font-size:80%}sub,sup{font-size:75%;line-height:0;position:relative;vertical-align:baseline}sub{bottom:-.25em}sup{top:-.5em}table{text-indent:0;border-color:inherit;border-collapse:collapse}button,input,optgroup,select,textarea{font-family:inherit;font-feature-settings:inherit;font-variation-settings:inherit;font-size:100%;font-weight:inherit;line-height:inherit;color:inherit;margin:0;padding:0}button,select{text-transform:none}button,[type=button],[type=reset],[type=submit]{-webkit-appearance:button;background-color:transparent;background-image:none}:-moz-focusring{outline:auto}:-moz-ui-invalid{box-shadow:none}progress{vertical-align:baseline}::-webkit-inner-spin-button,::-webkit-outer-spin-button{height:auto}[type=search]{-webkit-appearance:textfield;outline-offset:-2px}::-webkit-search-decoration{-webkit-appearance:none}::-webkit-file-upload-button{-webkit-appearance:button;font:inherit}summary{display:list-item}blockquote,dl,dd,h1,h2,h3,h4,h5,h6,hr,figure,p,pre{margin:0}fieldset{margin:0;padding:0}legend{padding:0}ol,ul,menu{list-style:no
ne;margin:0;padding:0}dialog{padding:0}textarea{resize:vertical}input::-moz-placeholder,textarea::-moz-placeholder{opacity:1;color:#9ca3af}input::placeholder,textarea::placeholder{opacity:1;color:#9ca3af}button,[role=button]{cursor:pointer}:disabled{cursor:default}img,svg,video,canvas,audio,iframe,embed,object{display:block;vertical-align:middle}img,video{max-width:100%;height:auto}[hidden]{display:none}:root{--background: 0 0% 100%;--foreground: 0 0% 9%;--card: 0 0% 100%;--card-foreground: 0 0% 9%;--popover: 0 0% 100%;--popover-foreground: 0 0% 9%;--primary: 0 0% 9%;--primary-foreground: 0 0% 98%;--secondary: 0 0% 96.1%;--secondary-foreground: 0 0% 9%;--muted: 0 0% 96.1%;--muted-foreground: 0 0% 45.1%;--accent: 0 0% 96.1%;--accent-foreground: 0 0% 9%;--destructive: 0 84.2% 60.2%;--destructive-foreground: 0 0% 98%;--border: 0 0% 89.8%;--input: 0 0% 89.8%;--ring: 0 0% 9%;--radius: .5rem}*{border-color:hsl(var(--border))}body{background-color:hsl(var(--background));color:hsl(var(--foreground))}*,:before,:after{--tw-border-spacing-x: 0;--tw-border-spacing-y: 0;--tw-translate-x: 0;--tw-translate-y: 0;--tw-rotate: 0;--tw-skew-x: 0;--tw-skew-y: 0;--tw-scale-x: 1;--tw-scale-y: 1;--tw-pan-x: ;--tw-pan-y: ;--tw-pinch-zoom: ;--tw-scroll-snap-strictness: proximity;--tw-gradient-from-position: ;--tw-gradient-via-position: ;--tw-gradient-to-position: ;--tw-ordinal: ;--tw-slashed-zero: ;--tw-numeric-figure: ;--tw-numeric-spacing: ;--tw-numeric-fraction: ;--tw-ring-inset: ;--tw-ring-offset-width: 0px;--tw-ring-offset-color: #fff;--tw-ring-color: rgb(59 130 246 / .5);--tw-ring-offset-shadow: 0 0 #0000;--tw-ring-shadow: 0 0 #0000;--tw-shadow: 0 0 #0000;--tw-shadow-colored: 0 0 #0000;--tw-blur: ;--tw-brightness: ;--tw-contrast: ;--tw-grayscale: ;--tw-hue-rotate: ;--tw-invert: ;--tw-saturate: ;--tw-sepia: ;--tw-drop-shadow: ;--tw-backdrop-blur: ;--tw-backdrop-brightness: ;--tw-backdrop-contrast: ;--tw-backdrop-grayscale: ;--tw-backdrop-hue-rotate: ;--tw-backdrop-invert: 
;--tw-backdrop-opacity: ;--tw-backdrop-saturate: ;--tw-backdrop-sepia: }::backdrop{--tw-border-spacing-x: 0;--tw-border-spacing-y: 0;--tw-translate-x: 0;--tw-translate-y: 0;--tw-rotate: 0;--tw-skew-x: 0;--tw-skew-y: 0;--tw-scale-x: 1;--tw-scale-y: 1;--tw-pan-x: ;--tw-pan-y: ;--tw-pinch-zoom: ;--tw-scroll-snap-strictness: proximity;--tw-gradient-from-position: ;--tw-gradient-via-position: ;--tw-gradient-to-position: ;--tw-ordinal: ;--tw-slashed-zero: ;--tw-numeric-figure: ;--tw-numeric-spacing: ;--tw-numeric-fraction: ;--tw-ring-inset: ;--tw-ring-offset-width: 0px;--tw-ring-offset-color: #fff;--tw-ring-color: rgb(59 130 246 / .5);--tw-ring-offset-shadow: 0 0 #0000;--tw-ring-shadow: 0 0 #0000;--tw-shadow: 0 0 #0000;--tw-shadow-colored: 0 0 #0000;--tw-blur: ;--tw-brightness: ;--tw-contrast: ;--tw-grayscale: ;--tw-hue-rotate: ;--tw-invert: ;--tw-saturate: ;--tw-sepia: ;--tw-drop-shadow: ;--tw-backdrop-blur: ;--tw-backdrop-brightness: ;--tw-backdrop-contrast: ;--tw-backdrop-grayscale: ;--tw-backdrop-hue-rotate: ;--tw-backdrop-invert: ;--tw-backdrop-opacity: ;--tw-backdrop-saturate: ;--tw-backdrop-sepia: }.container{width:100%;margin-right:auto;margin-left:auto;padding-right:2rem;padding-left:2rem}@media(min-width:1400px){.container{max-width:1400px}}.sticky{position:sticky}.top-0{top:0}.z-50{z-index:50}.mx-auto{margin-left:auto;margin-right:auto}.mb-12{margin-bottom:3rem}.mb-2{margin-bottom:.5rem}.mb-4{margin-bottom:1rem}.flex{display:flex}.inline-flex{display:inline-flex}.grid{display:grid}.h-10{height:2.5rem}.h-11{height:2.75rem}.h-12{height:3rem}.h-3{height:.75rem}.h-4{height:1rem}.h-6{height:1.5rem}.h-8{height:2rem}.h-9{height:2.25rem}.min-h-screen{min-height:100vh}.w-10{width:2.5rem}.w-12{width:3rem}.w-3{width:.75rem}.w-4{width:1rem}.w-6{width:1.5rem}.w-8{width:2rem}.w-full{width:100%}.max-w-3xl{max-width:48rem}.max-w-5xl{max-width:64rem}.max-w-6xl{max-width:72rem}.max-w-md{max-width:28rem}.flex-1{flex:1 1 
0%}.flex-shrink-0{flex-shrink:0}.flex-col{flex-direction:column}.flex-wrap{flex-wrap:wrap}.items-center{align-items:center}.justify-center{justify-content:center}.justify-between{justify-content:space-between}.gap-1{gap:.25rem}.gap-12{gap:3rem}.gap-2{gap:.5rem}.gap-3{gap:.75rem}.gap-4{gap:1rem}.gap-6{gap:1.5rem}.gap-8{gap:2rem}.space-y-1>:not([hidden])~:not([hidden]){--tw-space-y-reverse: 0;margin-top:calc(.25rem * calc(1 - var(--tw-space-y-reverse)));margin-bottom:calc(.25rem * var(--tw-space-y-reverse))}.space-y-1\.5>:not([hidden])~:not([hidden]){--tw-space-y-reverse: 0;margin-top:calc(.375rem * calc(1 - var(--tw-space-y-reverse)));margin-bottom:calc(.375rem * var(--tw-space-y-reverse))}.space-y-4>:not([hidden])~:not([hidden]){--tw-space-y-reverse: 0;margin-top:calc(1rem * calc(1 - var(--tw-space-y-reverse)));margin-bottom:calc(1rem * var(--tw-space-y-reverse))}.space-y-6>:not([hidden])~:not([hidden]){--tw-space-y-reverse: 0;margin-top:calc(1.5rem * calc(1 - var(--tw-space-y-reverse)));margin-bottom:calc(1.5rem * var(--tw-space-y-reverse))}.whitespace-nowrap{white-space:nowrap}.rounded{border-radius:.25rem}.rounded-full{border-radius:9999px}.rounded-lg{border-radius:var(--radius)}.rounded-md{border-radius:calc(var(--radius) - 2px)}.border{border-width:1px}.border-b{border-bottom-width:1px}.border-t{border-top-width:1px}.border-input{border-color:hsl(var(--input))}.border-transparent{border-color:transparent}.bg-background{background-color:hsl(var(--background))}.bg-black{--tw-bg-opacity: 1;background-color:rgb(0 0 0 / var(--tw-bg-opacity))}.bg-card{background-color:hsl(var(--card))}.bg-destructive{background-color:hsl(var(--destructive))}.bg-gray-50{--tw-bg-opacity: 1;background-color:rgb(249 250 251 / var(--tw-bg-opacity))}.bg-primary{background-color:0 0% 9%}.bg-secondary{background-color:hsl(var(--secondary))}.bg-white{--tw-bg-opacity: 1;background-color:rgb(255 255 255 / 
var(--tw-bg-opacity))}.p-4{padding:1rem}.p-6{padding:1.5rem}.px-2{padding-left:.5rem;padding-right:.5rem}.px-2\.5{padding-left:.625rem;padding-right:.625rem}.px-3{padding-left:.75rem;padding-right:.75rem}.px-4{padding-left:1rem;padding-right:1rem}.px-8{padding-left:2rem;padding-right:2rem}.py-0{padding-top:0;padding-bottom:0}.py-0\.5{padding-top:.125rem;padding-bottom:.125rem}.py-1{padding-top:.25rem;padding-bottom:.25rem}.py-16{padding-top:4rem;padding-bottom:4rem}.py-2{padding-top:.5rem;padding-bottom:.5rem}.py-4{padding-top:1rem;padding-bottom:1rem}.py-8{padding-top:2rem;padding-bottom:2rem}.pt-0{padding-top:0}.text-center{text-align:center}.text-2xl{font-size:1.5rem;line-height:2rem}.text-3xl{font-size:1.875rem;line-height:2.25rem}.text-5xl{font-size:3rem;line-height:1}.text-lg{font-size:1.125rem;line-height:1.75rem}.text-sm{font-size:.875rem;line-height:1.25rem}.text-xl{font-size:1.25rem;line-height:1.75rem}.text-xs{font-size:.75rem;line-height:1rem}.font-bold{font-weight:700}.font-medium{font-weight:500}.font-semibold{font-weight:600}.leading-none{line-height:1}.tracking-tight{letter-spacing:-.025em}.text-black{--tw-text-opacity: 1;color:rgb(0 0 0 / var(--tw-text-opacity))}.text-card-foreground{color:hsl(var(--card-foreground))}.text-destructive-foreground{color:hsl(var(--destructive-foreground))}.text-foreground{color:hsl(var(--foreground))}.text-muted-foreground{color:hsl(var(--muted-foreground))}.text-primary{color:0 0% 9%}.text-primary-foreground{color:0 0% 98%}.text-secondary-foreground{color:hsl(var(--secondary-foreground))}.text-white{--tw-text-opacity: 1;color:rgb(255 255 255 / var(--tw-text-opacity))}.underline-offset-4{text-underline-offset:4px}.shadow{--tw-shadow: 0 1px 3px 0 rgb(0 0 0 / .1), 0 1px 2px -1px rgb(0 0 0 / .1);--tw-shadow-colored: 0 1px 3px 0 var(--tw-shadow-color), 0 1px 2px -1px var(--tw-shadow-color);box-shadow:var(--tw-ring-offset-shadow, 0 0 #0000),var(--tw-ring-shadow, 0 0 #0000),var(--tw-shadow)}.shadow-lg{--tw-shadow: 0 10px 
15px -3px rgb(0 0 0 / .1), 0 4px 6px -4px rgb(0 0 0 / .1);--tw-shadow-colored: 0 10px 15px -3px var(--tw-shadow-color), 0 4px 6px -4px var(--tw-shadow-color);box-shadow:var(--tw-ring-offset-shadow, 0 0 #0000),var(--tw-ring-shadow, 0 0 #0000),var(--tw-shadow)}.shadow-sm{--tw-shadow: 0 1px 2px 0 rgb(0 0 0 / .05);--tw-shadow-colored: 0 1px 2px 0 var(--tw-shadow-color);box-shadow:var(--tw-ring-offset-shadow, 0 0 #0000),var(--tw-ring-shadow, 0 0 #0000),var(--tw-shadow)}.outline{outline-style:solid}.ring-offset-background{--tw-ring-offset-color: hsl(var(--background))}.transition-colors{transition-property:color,background-color,border-color,text-decoration-color,fill,stroke;transition-timing-function:cubic-bezier(.4,0,.2,1);transition-duration:.15s}.transition-shadow{transition-property:box-shadow;transition-timing-function:cubic-bezier(.4,0,.2,1);transition-duration:.15s}.hover\:bg-accent:hover{background-color:hsl(var(--accent))}.hover\:bg-destructive\/80:hover{background-color:hsl(var(--destructive) / .8)}.hover\:bg-destructive\/90:hover{background-color:hsl(var(--destructive) / .9)}.hover\:bg-secondary\/80:hover{background-color:hsl(var(--secondary) / .8)}.hover\:text-accent-foreground:hover{color:hsl(var(--accent-foreground))}.hover\:underline:hover{text-decoration-line:underline}.hover\:shadow-lg:hover{--tw-shadow: 0 10px 15px -3px rgb(0 0 0 / .1), 0 4px 6px -4px rgb(0 0 0 / .1);--tw-shadow-colored: 0 10px 15px -3px var(--tw-shadow-color), 0 4px 6px -4px var(--tw-shadow-color);box-shadow:var(--tw-ring-offset-shadow, 0 0 #0000),var(--tw-ring-shadow, 0 0 #0000),var(--tw-shadow)}.focus\:outline-none:focus{outline:2px solid transparent;outline-offset:2px}.focus\:ring-2:focus{--tw-ring-offset-shadow: var(--tw-ring-inset) 0 0 0 var(--tw-ring-offset-width) var(--tw-ring-offset-color);--tw-ring-shadow: var(--tw-ring-inset) 0 0 0 calc(2px + var(--tw-ring-offset-width)) var(--tw-ring-color);box-shadow:var(--tw-ring-offset-shadow),var(--tw-ring-shadow),var(--tw-shadow, 0 0 
#0000)}.focus\:ring-ring:focus{--tw-ring-color: hsl(var(--ring))}.focus\:ring-offset-2:focus{--tw-ring-offset-width: 2px}.focus-visible\:outline-none:focus-visible{outline:2px solid transparent;outline-offset:2px}.focus-visible\:ring-2:focus-visible{--tw-ring-offset-shadow: var(--tw-ring-inset) 0 0 0 var(--tw-ring-offset-width) var(--tw-ring-offset-color);--tw-ring-shadow: var(--tw-ring-inset) 0 0 0 calc(2px + var(--tw-ring-offset-width)) var(--tw-ring-color);box-shadow:var(--tw-ring-offset-shadow),var(--tw-ring-shadow),var(--tw-shadow, 0 0 #0000)}.focus-visible\:ring-ring:focus-visible{--tw-ring-color: hsl(var(--ring))}.focus-visible\:ring-offset-2:focus-visible{--tw-ring-offset-width: 2px}.disabled\:pointer-events-none:disabled{pointer-events:none}.disabled\:opacity-50:disabled{opacity:.5}@media(min-width:768px){.md\:grid-cols-2{grid-template-columns:repeat(2,minmax(0,1fr))}}@media(min-width:1024px){.lg\:grid-cols-2{grid-template-columns:repeat(2,minmax(0,1fr))}.lg\:grid-cols-3{grid-template-columns:repeat(3,minmax(0,1fr))}}.\[\&_svg\]\:pointer-events-none svg{pointer-events:none}.\[\&_svg\]\:size-4 svg{width:1rem;height:1rem}.\[\&_svg\]\:shrink-0 svg{flex-shrink:0}
assets/index-BdlkdtJU.js ADDED
The diff for this file is too large to render. See raw diff
 
index.html CHANGED
@@ -6,8 +6,8 @@
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
     <meta name="description" content="Real-time conversational AI with transcription, vision, and intelligent conversation for TARS robot" />
     <title>TARS Conversation App - Real-time AI Voice Assistant</title>
-    <script type="module" crossorigin src="/assets/index-Dyqch0TE.js"></script>
-    <link rel="stylesheet" crossorigin href="/assets/index-C9-qqRmx.css">
+    <script type="module" crossorigin src="/assets/index-BdlkdtJU.js"></script>
+    <link rel="stylesheet" crossorigin href="/assets/index-BGP_uT2P.css">
   </head>
   <body>
     <div id="root"></div>
scripts/install.sh CHANGED
@@ -89,11 +89,11 @@ if [ "$CONFIG_CREATED" = true ] || [ "$ENV_CREATED" = true ]; then
     [ "$ENV_CREATED" = true ] && echo "   - Add API keys to: $APP_DIR/.env.local"
     [ "$CONFIG_CREATED" = true ] && echo "   - Configure settings: $APP_DIR/config.ini"
     echo "2. Activate environment: source $APP_DIR/venv/bin/activate"
-    echo "3. Run the app: python $APP_DIR/tars_bot.py"
+    echo "3. Run the app: python $APP_DIR/src/tars_bot.py"
 else
     echo "1. Activate environment: source $APP_DIR/venv/bin/activate"
-    echo "2. Run the app: python $APP_DIR/tars_bot.py"
+    echo "2. Run the app: python $APP_DIR/src/tars_bot.py"
 fi
 echo
-echo "For browser mode: python $APP_DIR/bot.py"
+echo "For browser mode: python $APP_DIR/src/bot.py"
 echo "For dashboard: python $APP_DIR/ui/app.py"
scripts/start_robot_mode.sh CHANGED
@@ -58,8 +58,8 @@ if [ -d ".venv" ]; then
 fi
 
 # Check if tars_bot.py exists
-if [ ! -f "tars_bot.py" ]; then
-    echo "❌ Error: tars_bot.py not found"
+if [ ! -f "src/tars_bot.py" ]; then
+    echo "❌ Error: src/tars_bot.py not found"
     exit 1
 fi
 
@@ -84,5 +84,5 @@ else
     echo "⚠️  Note: Audio bridge integration is in progress"
     echo "   See IMPLEMENTATION_SUMMARY.md for current status"
     echo ""
-    python tars_bot.py
+    python src/tars_bot.py
 fi
scripts/uninstall.sh CHANGED
@@ -10,9 +10,9 @@ echo
 
 # Stop running processes
 echo "Stopping running processes..."
-pkill -f "python.*tars_bot.py" || true
-pkill -f "python.*bot.py" || true
-pkill -f "python.*pipecat_service.py" || true
+pkill -f "python.*src/tars_bot.py" || true
+pkill -f "python.*src/bot.py" || true
+pkill -f "python.*src/pipecat_service.py" || true
pkill -f "python.*ui/app.py" || true
 sleep 1
 echo "Processes stopped"
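`pkill -f` applies its pattern as a regular expression to each process's full command line, which is why the patterns above gain the `src/` path component once the entry points move. A small sketch of that matching logic (the helper name is hypothetical):

```python
import re


def would_pkill_match(pattern, cmdline):
    # pkill -f tests its pattern (a regex) against the full command line,
    # matching anywhere in the string, like re.search
    return re.search(pattern, cmdline) is not None


# The updated pattern matches the relocated entry point...
assert would_pkill_match(r"python.*src/tars_bot.py", "python src/tars_bot.py")
# ...and does not match a process started from the old top-level path
assert not would_pkill_match(r"python.*src/tars_bot.py", "python tars_bot.py")
```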
src/bot.py ADDED
@@ -0,0 +1,606 @@
+"""Bot pipeline setup and execution."""
+
+import sys
+from pathlib import Path
+
+# Add src directory to Python path for imports
+src_dir = Path(__file__).parent
+sys.path.insert(0, str(src_dir))
+
+import asyncio
+import json
+import os
+import logging
+import uuid
+import httpx
+
+from pipecat.adapters.schemas.tools_schema import ToolsSchema
+from pipecat.frames.frames import (
+    LLMRunFrame,
+    TranscriptionFrame,
+    InterimTranscriptionFrame,
+    Frame,
+    TranscriptionMessage,
+    TranslationFrame,
+    UserImageRawFrame,
+    UserAudioRawFrame,
+    UserImageRequestFrame,
+)
+from pipecat.processors.frame_processor import FrameProcessor, FrameDirection
+from pipecat.pipeline.pipeline import Pipeline
+from pipecat.pipeline.runner import PipelineRunner
+from pipecat.pipeline.task import PipelineTask, PipelineParams
+from pipecat.processors.aggregators.llm_context import LLMContext
+from pipecat.processors.aggregators.llm_response_universal import (
+    LLMContextAggregatorPair,
+    LLMUserAggregatorParams
+)
+from pipecat.observers.turn_tracking_observer import TurnTrackingObserver
+from pipecat.observers.loggers.user_bot_latency_log_observer import UserBotLatencyLogObserver
+from pipecat.services.moondream.vision import MoondreamService
+from pipecat.services.openai.llm import OpenAILLMService
+from pipecat.services.llm_service import FunctionCallParams
+from services.memory_hybrid import HybridMemoryService
+from pipecat.transcriptions.language import Language
+from pipecat.transports.base_transport import TransportParams
+from pipecat.transports.smallwebrtc.transport import SmallWebRTCTransport
+
+from loguru import logger
+
+from config import (
+    SPEECHMATICS_API_KEY,
+    DEEPGRAM_API_KEY,
+    ELEVENLABS_API_KEY,
+    ELEVENLABS_VOICE_ID,
+    DEEPINFRA_API_KEY,
+    DEEPINFRA_BASE_URL,
+    MEM0_API_KEY,
+    get_fresh_config,
+)
+from services.factories import create_stt_service, create_tts_service
+from processors import (
+    SilenceFilter,
+    InputAudioFilter,
+    InterventionGating,
+    VisualObserver,
+    EmotionalStateMonitor,
+)
+from observers import (
+    MetricsObserver,
+    TranscriptionObserver,
+    AssistantResponseObserver,
+    TTSStateObserver,
+    VisionObserver,
+    DebugObserver,
+    DisplayEventsObserver,
+)
+from character.prompts import (
+    load_persona_ini,
+    load_tars_json,
+    build_tars_system_prompt,
+    get_introduction_instruction,
+)
+from tools import (
+    fetch_user_image,
+    adjust_persona_parameter,
+    execute_movement,
+    capture_camera_view,
+    create_fetch_image_schema,
+    create_adjust_persona_schema,
+    create_identity_schema,
+    create_movement_schema,
+    create_camera_capture_schema,
+    get_persona_storage,
+    get_crossword_hint,
+    create_crossword_hint_schema,
+)
+from shared_state import metrics_store
+
+
+# ============================================================================
+# CUSTOM FRAME PROCESSORS
+# ============================================================================
+
+class IdentityUnifier(FrameProcessor):
+    """
+    Applies 'guest_ID' ONLY to specific user input frames.
+    Leaves other frames untouched.
+    """
+    # Define the frame types that should have user_id set
+    TARGET_FRAME_TYPES = (
+        TranscriptionFrame,
+        TranscriptionMessage,
+        TranslationFrame,
+        InterimTranscriptionFrame,
+        UserImageRawFrame,
+        UserAudioRawFrame,
+        UserImageRequestFrame,
+    )
+
+    def __init__(self, target_user_id):
+        super().__init__()
+        self.target_user_id = target_user_id
+
+    async def process_frame(self, frame: Frame, direction: FrameDirection):
+        # 1. Handle internal state
+        await super().process_frame(frame, direction)
+
+        # 2. Only modify specific frame types
+        if isinstance(frame, self.TARGET_FRAME_TYPES):
+            try:
+                frame.user_id = self.target_user_id
+            except Exception:
+                pass
+
+        # 3. Push downstream
+        await self.push_frame(frame, direction)
+
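The tagging behaviour of `IdentityUnifier` can be illustrated without Pipecat installed. In this sketch the frame classes and the processor are simplified stand-ins (no async, no real `FrameProcessor` base, which are assumptions of the sketch), showing only which frames receive the session `user_id`:

```python
# Hypothetical stand-ins for the pipecat frame types used above.
from dataclasses import dataclass


@dataclass
class TranscriptionFrame:
    text: str
    user_id: str = ""


@dataclass
class AudioOutFrame:
    # Not in TARGET_FRAME_TYPES, so it must pass through untouched
    samples: bytes = b""


class StubIdentityUnifier:
    """Simplified, synchronous version of the tagging logic."""

    TARGET_FRAME_TYPES = (TranscriptionFrame,)

    def __init__(self, target_user_id):
        self.target_user_id = target_user_id
        self.pushed = []  # stands in for push_frame() downstream delivery

    def process_frame(self, frame):
        # Only user-input frame types get the session user_id applied
        if isinstance(frame, self.TARGET_FRAME_TYPES):
            frame.user_id = self.target_user_id
        self.pushed.append(frame)


unifier = StubIdentityUnifier("guest_ab12cd34")
t = TranscriptionFrame(text="hello")
a = AudioOutFrame()
unifier.process_frame(t)
unifier.process_frame(a)
assert t.user_id == "guest_ab12cd34"   # input frame tagged
assert not hasattr(a, "user_id")       # other frames untouched
```

The real processor does the same filtering, but asynchronously and with the full set of user-input frame types listed in `TARGET_FRAME_TYPES`.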
+# ============================================================================
+# HELPER FUNCTIONS
+# ============================================================================
+
+async def _cleanup_services(service_refs: dict):
+    if service_refs.get("stt"):
+        try:
+            await service_refs["stt"].close()
+            logger.info("✓ STT service cleaned up")
+        except Exception:
+            pass
+    if service_refs.get("tts"):
+        try:
+            await service_refs["tts"].close()
+            logger.info("✓ TTS service cleaned up")
+        except Exception:
+            pass
+
+
+# ============================================================================
+# MAIN BOT PIPELINE
+# ============================================================================
+
+async def run_bot(webrtc_connection):
+    """Initialize and run the TARS bot pipeline."""
+    logger.info("Starting bot pipeline for WebRTC connection...")
+
+    # Load fresh configuration for this connection (allows runtime config updates)
+    runtime_config = get_fresh_config()
+    DEEPINFRA_MODEL = runtime_config['DEEPINFRA_MODEL']
+    DEEPINFRA_GATING_MODEL = runtime_config['DEEPINFRA_GATING_MODEL']
+    STT_PROVIDER = runtime_config['STT_PROVIDER']
+    TTS_PROVIDER = runtime_config['TTS_PROVIDER']
+    QWEN3_TTS_MODEL = runtime_config['QWEN3_TTS_MODEL']
+    QWEN3_TTS_DEVICE = runtime_config['QWEN3_TTS_DEVICE']
+    QWEN3_TTS_REF_AUDIO = runtime_config['QWEN3_TTS_REF_AUDIO']
+    EMOTIONAL_MONITORING_ENABLED = runtime_config['EMOTIONAL_MONITORING_ENABLED']
+    EMOTIONAL_SAMPLING_INTERVAL = runtime_config['EMOTIONAL_SAMPLING_INTERVAL']
+    EMOTIONAL_INTERVENTION_THRESHOLD = runtime_config['EMOTIONAL_INTERVENTION_THRESHOLD']
+    TARS_DISPLAY_URL = runtime_config['TARS_DISPLAY_URL']
+    TARS_DISPLAY_ENABLED = runtime_config['TARS_DISPLAY_ENABLED']
+
+    logger.info(f"📋 Runtime config loaded - STT: {STT_PROVIDER}, LLM: {DEEPINFRA_MODEL}, TTS: {TTS_PROVIDER}, Emotional: {EMOTIONAL_MONITORING_ENABLED}")
+
+    # Session initialization
+    session_id = str(uuid.uuid4())[:8]
+    client_id = f"guest_{session_id}"
+    client_state = {"client_id": client_id}
+    logger.info(f"Session started: {client_id}")
+
+    service_refs = {"stt": None, "tts": None}
+
+    try:
+        # ====================================================================
+        # TRANSPORT INITIALIZATION
+        # ====================================================================
+        # Note: STT providers handle their own turn detection:
+        # - Speechmatics: SMART_TURN mode
+        # - Deepgram: endpointing parameter (300ms silence detection)
+        # - Deepgram Flux: built-in turn detection with ExternalUserTurnStrategies (deprecated)
+
+        logger.info(f"Initializing transport with {STT_PROVIDER} turn detection...")
+
+        transport_params = TransportParams(
+            audio_in_enabled=True,
+            audio_out_enabled=True,
+            video_in_enabled=False,
+            video_out_enabled=False,
+            video_out_is_live=False,
+        )
+
+        pipecat_transport = SmallWebRTCTransport(
+            webrtc_connection=webrtc_connection,
+            params=transport_params,
+        )
+
+        logger.info("✓ Transport initialized")
+
+        # ====================================================================
+        # SPEECH-TO-TEXT SERVICE
+        # ====================================================================
+
+        logger.info(f"Initializing {STT_PROVIDER} STT...")
+        stt = None
+        try:
+            stt = create_stt_service(
+                provider=STT_PROVIDER,
+                speechmatics_api_key=SPEECHMATICS_API_KEY,
+                deepgram_api_key=DEEPGRAM_API_KEY,
+                language=Language.EN,
+                enable_diarization=False,
+            )
+            service_refs["stt"] = stt
+
+            # Log additional info for Deepgram
+            if STT_PROVIDER == "deepgram":
+                logger.info("✓ Deepgram: 300ms endpointing for turn detection")
+                logger.info("✓ Deepgram: VAD events enabled for speech detection")
+
+        except Exception as e:
+            logger.error(f"Failed to initialize {STT_PROVIDER} STT: {e}", exc_info=True)
+            return
+
+        # ====================================================================
+        # TEXT-TO-SPEECH SERVICE
+        # ====================================================================
+
+        try:
+            tts = create_tts_service(
+                provider=TTS_PROVIDER,
+                elevenlabs_api_key=ELEVENLABS_API_KEY,
+                elevenlabs_voice_id=ELEVENLABS_VOICE_ID,
+                qwen_model=QWEN3_TTS_MODEL,
+                qwen_device=QWEN3_TTS_DEVICE,
+                qwen_ref_audio=QWEN3_TTS_REF_AUDIO,
+            )
+            service_refs["tts"] = tts
+        except Exception as e:
+            logger.error(f"Failed to initialize TTS service: {e}", exc_info=True)
+            return
+
+        # ====================================================================
+        # LLM SERVICE & TOOLS
+        # ====================================================================
+
+        logger.info("Initializing LLM via DeepInfra...")
+        llm = None
+        try:
+            llm = OpenAILLMService(
+                api_key=DEEPINFRA_API_KEY,
+                base_url=DEEPINFRA_BASE_URL,
+                model=DEEPINFRA_MODEL
+            )
+
+            character_dir = os.path.join(os.path.dirname(__file__), "character")
+            persona_params = load_persona_ini(os.path.join(character_dir, "persona.ini"))
+            tars_data = load_tars_json(os.path.join(character_dir, "TARS.json"))
+            system_prompt = build_tars_system_prompt(persona_params, tars_data)
+
+            # Create tool schemas (these return FunctionSchema objects)
+            fetch_image_tool = create_fetch_image_schema()
+            persona_tool = create_adjust_persona_schema()
+            identity_tool = create_identity_schema()
+            crossword_hint_tool = create_crossword_hint_schema()
+            movement_tool = create_movement_schema()
+            camera_capture_tool = create_camera_capture_schema()
+
+            # Pass FunctionSchema objects directly to standard_tools
+            tools = ToolsSchema(
+                standard_tools=[
+                    fetch_image_tool,
+                    persona_tool,
+                    identity_tool,
+                    crossword_hint_tool,
+                    movement_tool,
+                    camera_capture_tool,
+                ]
+            )
+            messages = [system_prompt]
+            context = LLMContext(messages, tools)
+
+            llm.register_function("fetch_user_image", fetch_user_image)
+            llm.register_function("adjust_persona_parameter", adjust_persona_parameter)
302
+ llm.register_function("get_crossword_hint", get_crossword_hint)
303
+ llm.register_function("execute_movement", execute_movement)
304
+ llm.register_function("capture_camera_view", capture_camera_view)
305
+
306
+ pipeline_unifier = IdentityUnifier(client_id)
307
+ async def wrapped_set_identity(params: FunctionCallParams):
308
+ name = params.arguments["name"]
309
+ logger.info(f"πŸ‘€ Identity discovered: {name}")
310
+
311
+ old_id = client_state["client_id"]
312
+ new_id = f"user_{name.lower().replace(' ', '_')}"
313
+
314
+ if old_id != new_id:
315
+ logger.info(f"πŸ”„ Switching User ID: {old_id} -> {new_id}")
316
+ client_state["client_id"] = new_id
317
+
318
+ # Update the pipeline unifier to use new identity
319
+ pipeline_unifier.target_user_id = new_id
320
+ logger.info(f"βœ“ Updated pipeline unifier with new ID: {new_id}")
321
+
322
+ # Update memory service with new user_id
323
+ if memory_service:
324
+ memory_service.user_id = new_id
325
+ logger.info(f"βœ“ Updated memory service user_id to: {new_id}")
326
+
327
+ # Notify frontend of identity change
328
+ try:
329
+ if webrtc_connection and webrtc_connection.is_connected():
330
+ webrtc_connection.send_app_message({
331
+ "type": "identity_update",
332
+ "old_id": old_id,
333
+ "new_id": new_id,
334
+ "name": name
335
+ })
336
+ logger.info(f"πŸ“€ Sent identity update to frontend: {new_id}")
337
+ except Exception as e:
338
+ logger.warning(f"Failed to send identity update to frontend: {e}")
339
+
340
+ await params.result_callback(f"Identity updated to {name}.")
341
+
342
+ llm.register_function("set_user_identity", wrapped_set_identity)
343
+ logger.info(f"βœ“ LLM initialized with model: {DEEPINFRA_MODEL}")
344
+
345
+ except Exception as e:
346
+ logger.error(f"Failed to initialize LLM: {e}", exc_info=True)
347
+ return
348
+
349
+ # ====================================================================
350
+ # VISION & GATING SERVICES
351
+ # ====================================================================
352
+
353
+ logger.info("Initializing Moondream vision service...")
354
+ moondream = None
355
+ try:
356
+ moondream = MoondreamService(model="vikhyatk/moondream2", revision="2025-01-09")
357
+ logger.info("βœ“ Moondream vision service initialized")
358
+ except Exception as e:
359
+ logger.error(f"Failed to initialize Moondream: {e}")
360
+ return
361
+
362
+ # ====================================================================
363
+ # TARS DISPLAY - Note: Display control via gRPC in robot mode only
364
+ # ====================================================================
365
+
366
+ logger.info("TARS Display features available in robot mode (tars_bot.py)")
367
+ tars_client = None
368
+
369
+ logger.info("Initializing Visual Observer...")
370
+ visual_observer = VisualObserver(
371
+ vision_client=moondream,
372
+ enable_face_detection=True,
373
+ tars_client=tars_client
374
+ )
375
+ logger.info("βœ“ Visual Observer initialized")
376
+
377
+ logger.info("Initializing Emotional State Monitor...")
378
+ emotional_monitor = EmotionalStateMonitor(
379
+ vision_client=moondream,
380
+ model="vikhyatk/moondream2",
381
+ sampling_interval=EMOTIONAL_SAMPLING_INTERVAL,
382
+ intervention_threshold=EMOTIONAL_INTERVENTION_THRESHOLD,
383
+ enabled=EMOTIONAL_MONITORING_ENABLED,
384
+ auto_intervene=False, # Let gating layer handle intervention decisions
385
+ )
386
+ logger.info(f"βœ“ Emotional State Monitor initialized (enabled: {EMOTIONAL_MONITORING_ENABLED})")
387
+ logger.info(f" Mode: Integrated with gating layer for smarter decisions")
388
+
389
+ logger.info("Initializing Gating Layer...")
390
+ gating_layer = InterventionGating(
391
+ api_key=DEEPINFRA_API_KEY,
392
+ base_url=DEEPINFRA_BASE_URL,
393
+ model=DEEPINFRA_GATING_MODEL,
394
+ visual_observer=visual_observer,
395
+ emotional_monitor=emotional_monitor
396
+ )
397
+ logger.info(f"βœ“ Gating Layer initialized with emotional state integration")
398
+
399
+ # ====================================================================
400
+ # MEMORY SERVICE
401
+ # ====================================================================
402
+
403
+ # Memory service: Hybrid search combining vector similarity (70%) and BM25 keyword matching (30%)
404
+ # Optimized for voice AI with <50ms latency target
405
+ logger.info("Initializing hybrid memory service...")
406
+ memory_service = None
407
+ try:
408
+ memory_service = HybridMemoryService(
409
+ user_id=client_id,
410
+ db_path="./memory_data/memory.sqlite",
411
+ search_limit=3,
412
+ search_timeout_ms=100, # Hybrid search needs ~60-80ms, allow buffer
413
+ vector_weight=0.7, # 70% semantic similarity
414
+ bm25_weight=0.3, # 30% keyword matching
415
+ system_prompt_prefix="From our conversations:\n",
416
+ )
417
+ logger.info(f"βœ“ Hybrid memory service initialized for {client_id}")
418
+ except Exception as e:
419
+ logger.error(f"Failed to initialize hybrid memory service: {e}")
420
+ logger.info(" Continuing without memory service...")
421
+ memory_service = None # Continue without memory if it fails
422
+
423
+ # ====================================================================
424
+ # CONTEXT AGGREGATOR & PERSONA STORAGE
425
+ # ====================================================================
426
+
427
+ # Configure user turn aggregation
428
+ # STT services (Speechmatics, Deepgram) handle turn detection internally
429
+ user_params = LLMUserAggregatorParams(
430
+ user_turn_stop_timeout=1.5
431
+ )
432
+
433
+ context_aggregator = LLMContextAggregatorPair(
434
+ context,
435
+ user_params=user_params
436
+ )
437
+
438
+
439
+ persona_storage = get_persona_storage()
440
+ persona_storage["persona_params"] = persona_params
441
+ persona_storage["tars_data"] = tars_data
442
+ persona_storage["context_aggregator"] = context_aggregator
443
+
444
+ # ====================================================================
445
+ # LOGGING PROCESSORS
446
+ # ====================================================================
447
+
448
+ transcription_observer = TranscriptionObserver(
449
+ webrtc_connection=webrtc_connection,
450
+ client_state=client_state
451
+ )
452
+ assistant_observer = AssistantResponseObserver(webrtc_connection=webrtc_connection)
453
+ tts_state_observer = TTSStateObserver(webrtc_connection=webrtc_connection)
454
+ vision_observer = VisionObserver(webrtc_connection=webrtc_connection)
455
+ display_events_observer = DisplayEventsObserver(tars_client=tars_client)
456
+
457
+ # Create MetricsObserver (non-intrusive monitoring outside pipeline)
458
+ metrics_observer = MetricsObserver(
459
+ webrtc_connection=webrtc_connection,
460
+ stt_service=stt
461
+ )
462
+
463
+ # Turn tracking observer (for debugging turn detection)
464
+ turn_observer = TurnTrackingObserver()
465
+
466
+ @turn_observer.event_handler("on_turn_started")
467
+ async def on_turn_started(*args, **kwargs):
468
+ turn_number = args[1] if len(args) > 1 else kwargs.get('turn_number', 0)
469
+ logger.info(f"πŸ—£οΈ [TurnObserver] Turn STARTED: {turn_number}")
470
+ # Notify metrics observer of new turn
471
+ metrics_observer.start_turn(turn_number)
472
+
473
+ @turn_observer.event_handler("on_turn_ended")
474
+ async def on_turn_ended(*args, **kwargs):
475
+ turn_number = args[1] if len(args) > 1 else kwargs.get('turn_number', 0)
476
+ logger.info(f"πŸ—£οΈ [TurnObserver] Turn ENDED: {turn_number}")
477
+
478
+ # ====================================================================
479
+ # PIPELINE ASSEMBLY
480
+ # ====================================================================
481
+
482
+ logger.info("Creating audio/video pipeline...")
483
+
484
+ pipeline = Pipeline([
485
+ pipecat_transport.input(),
486
+ # emotional_monitor, # Real-time emotional state monitoring
487
+ stt,
488
+ pipeline_unifier,
489
+ context_aggregator.user(),
490
+ memory_service, # Hybrid memory (70% vector + 30% BM25) for automatic recall/storage
491
+ # gating_layer, # AI decision system (with emotional state integration)
492
+ llm,
493
+ SilenceFilter(),
494
+ tts,
495
+ pipecat_transport.output(),
496
+ context_aggregator.assistant(),
497
+ ])
498
+
499
+ # ====================================================================
500
+ # EVENT HANDLERS
501
+ # ====================================================================
502
+
503
+ task_ref = {"task": None}
504
+
505
+ @pipecat_transport.event_handler("on_client_connected")
506
+ async def on_client_connected(transport, client):
507
+ logger.info("Pipecat Client connected")
508
+ try:
509
+ if webrtc_connection.is_connected():
510
+ webrtc_connection.send_app_message({"type": "system", "message": "Connection established"})
511
+
512
+ # Send service configuration info with provider and model details
513
+ llm_display = DEEPINFRA_MODEL.split('/')[-1] if '/' in DEEPINFRA_MODEL else DEEPINFRA_MODEL
514
+
515
+ if TTS_PROVIDER == "elevenlabs":
516
+ tts_display = "ElevenLabs: eleven_flash_v2_5"
517
+ else:
518
+ tts_model = QWEN3_TTS_MODEL.split('/')[-1] if '/' in QWEN3_TTS_MODEL else QWEN3_TTS_MODEL
519
+ tts_display = f"Qwen3-TTS: {tts_model}"
520
+
521
+ # Format STT provider name for display
522
+ stt_display = {
523
+ "speechmatics": "Speechmatics",
524
+ "deepgram": "Deepgram Nova-2"
525
+ }.get(STT_PROVIDER, STT_PROVIDER.capitalize())
526
+
527
+ service_info = {
528
+ "stt": stt_display,
529
+ "memory": "Hybrid Search (SQLite)",
530
+ "llm": f"DeepInfra: {llm_display}",
531
+ "tts": tts_display
532
+ }
533
+
534
+ # Store in shared state for Gradio UI
535
+ metrics_store.set_service_info(service_info)
536
+
537
+ # Send via WebRTC
538
+ webrtc_connection.send_app_message({
539
+ "type": "service_info",
540
+ **service_info
541
+ })
542
+ logger.info(f"πŸ“Š Sent service info to frontend: STT={stt_display}, LLM={llm_display}, TTS={tts_display}")
543
+ except Exception as e:
544
+ logger.error(f"❌ Error sending service info: {e}")
545
+
546
+ if task_ref["task"]:
547
+ verbosity = persona_params.get("verbosity", 10) if persona_params else 10
548
+ intro_instruction = get_introduction_instruction(client_state['client_id'], verbosity)
549
+
550
+ if context and hasattr(context, "messages"):
551
+ context.messages.append(intro_instruction)
552
+
553
+ logger.info("Waiting for pipeline to warm up...")
554
+ await asyncio.sleep(2.0)
555
+
556
+ logger.info("Queueing initial LLM greeting...")
557
+ await task_ref["task"].queue_frames([LLMRunFrame()])
558
+
559
+ @pipecat_transport.event_handler("on_client_disconnected")
560
+ async def on_client_disconnected(transport, client):
561
+ logger.info("Pipecat Client disconnected")
562
+ if task_ref["task"]:
563
+ await task_ref["task"].cancel()
564
+ await _cleanup_services(service_refs)
565
+
566
+ # ====================================================================
567
+ # PIPELINE EXECUTION
568
+ # ====================================================================
569
+
570
+ # Enable built-in Pipecat metrics for latency tracking
571
+ user_bot_latency_observer = UserBotLatencyLogObserver()
572
+
573
+ task = PipelineTask(
574
+ pipeline,
575
+ params=PipelineParams(
576
+ enable_metrics=True, # Enable performance metrics (TTFB, latency)
577
+ enable_usage_metrics=True, # Enable LLM/TTS usage metrics
578
+ report_only_initial_ttfb=False, # Report all TTFB measurements
579
+ ),
580
+ observers=[
581
+ turn_observer,
582
+ metrics_observer,
583
+ transcription_observer,
584
+ assistant_observer,
585
+ tts_state_observer,
586
+ vision_observer,
587
+ display_events_observer, # Send events to TARS display
588
+ user_bot_latency_observer, # Measures total user→bot response time
589
+ ], # Non-intrusive monitoring
590
+ )
591
+ task_ref["task"] = task
592
+ runner = PipelineRunner(handle_sigint=False)
593
+
594
+ logger.info("Starting pipeline runner...")
595
+
596
+ try:
597
+ await runner.run(task)
598
+ except Exception:
599
+ raise
600
+ finally:
601
+ await _cleanup_services(service_refs)
602
+
603
+ except Exception as e:
604
+ logger.error(f"Error in bot pipeline: {e}", exc_info=True)
605
+ finally:
606
+ await _cleanup_services(service_refs)
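The `HybridMemoryService` arguments above describe a 70/30 blend of vector similarity and BM25 keyword scores, with the top `search_limit` results injected into the prompt. A minimal sketch of how such a weighted blend would rank candidate memories (hypothetical helper and sample data; not the service's actual implementation):

```python
def blend_scores(vector_sim: float, bm25_score: float,
                 vector_weight: float = 0.7, bm25_weight: float = 0.3) -> float:
    """Combine normalized vector-similarity and BM25 scores into one rank key."""
    return vector_weight * vector_sim + bm25_weight * bm25_score

# Candidate memories as (text, vector_sim, bm25_score); rank by blended score
candidates = [("user likes chess", 0.9, 0.2), ("user owns a dog", 0.4, 0.8)]
ranked = sorted(candidates, key=lambda c: blend_scores(c[1], c[2]), reverse=True)
```

With these weights, a memory that is semantically close but lexically distant still outranks one with only keyword overlap, which matches the 70% semantic / 30% keyword split in the config.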
src/pipecat_service.py ADDED
@@ -0,0 +1,274 @@
+ #!/usr/bin/env python3
+ """
+ Pipecat.ai service for real-time transcription and TTS using SmallWebRTC.
+ Communicates directly with the browser via WebRTC.
+ """
+
+ # Fix SSL certificate issues FIRST - before any SSL-using imports
+ import os
+ import sys
+ from pathlib import Path
+
+ # Add src/ to the Python path for imports
+ src_dir = Path(__file__).parent
+ sys.path.insert(0, str(src_dir))
+
+ import ssl
+ from contextlib import asynccontextmanager
+
+ # Configure SSL to use certifi certificates for Python's ssl module.
+ # For development: disable SSL verification completely to avoid certificate issues.
+ # This MUST happen before any libraries that use SSL are imported.
+ try:
+     import certifi
+     cert_file = certifi.where()
+     # Set environment variables for libraries that respect them
+     os.environ['SSL_CERT_FILE'] = cert_file
+     os.environ['REQUESTS_CA_BUNDLE'] = cert_file
+     os.environ['CURL_CA_BUNDLE'] = cert_file
+ except ImportError:
+     pass  # certifi not available, fall back to system certs
+
+ # For Python's ssl module: use an unverified context for development.
+ # This bypasses SSL certificate verification to avoid connection issues.
+ ssl._create_default_https_context = ssl._create_unverified_context
+
+ import argparse
+ import logging
+ from fastapi import BackgroundTasks, FastAPI
+ from fastapi.middleware.cors import CORSMiddleware
+ from loguru import logger
+ from pipecat.transports.smallwebrtc.request_handler import (
+     SmallWebRTCPatchRequest,
+     SmallWebRTCRequest,
+     SmallWebRTCRequestHandler,
+ )
+
+ from bot import run_bot
+ from config import (
+     PIPECAT_HOST,
+     PIPECAT_PORT,
+     SPEECHMATICS_API_KEY,
+     DEEPGRAM_API_KEY,
+     ELEVENLABS_API_KEY,
+     DEEPINFRA_API_KEY,
+     STT_PROVIDER,
+     TTS_PROVIDER,  # Only used for startup validation
+     get_fresh_config,
+ )
+
+ # Remove the default loguru handler and set up custom logging
+ logger.remove(0)
+
+ # Configure standard logging
+ logging.basicConfig(level=logging.INFO)
+ standard_logger = logging.getLogger(__name__)
+
+ # Reduce noise from the websockets library - only log warnings and above
+ websockets_logger = logging.getLogger('websockets')
+ websockets_logger.setLevel(logging.WARNING)
+
+ # Log the SSL certificate configuration
+ try:
+     import certifi
+     logger.info(f"SSL Configuration: Using certificates from {certifi.where()}")
+     logger.info(f"SSL_CERT_FILE env: {os.environ.get('SSL_CERT_FILE', 'not set')}")
+ except ImportError:
+     logger.warning("certifi not available - SSL verification disabled for development")
+
+
+ @asynccontextmanager
+ async def lifespan(app: FastAPI):
+     """Handle app lifespan events."""
+     logger.info(f"Starting Pipecat service on http://{PIPECAT_HOST}:{PIPECAT_PORT}...")
+     logger.info(f"STT Provider: {STT_PROVIDER}")
+     logger.info(f"TTS Provider: {TTS_PROVIDER}")
+
+     # Check required API keys based on the STT and TTS providers
+     missing_keys = []
+     if STT_PROVIDER == "speechmatics" and not SPEECHMATICS_API_KEY:
+         missing_keys.append("SPEECHMATICS_API_KEY")
+     if STT_PROVIDER == "deepgram" and not DEEPGRAM_API_KEY:
+         missing_keys.append("DEEPGRAM_API_KEY")
+     if not DEEPINFRA_API_KEY:
+         missing_keys.append("DEEPINFRA_API_KEY")
+     if TTS_PROVIDER == "elevenlabs" and not ELEVENLABS_API_KEY:
+         missing_keys.append("ELEVENLABS_API_KEY")
+
+     if missing_keys:
+         logger.error(f"ERROR: Missing required API keys: {', '.join(missing_keys)}")
+         sys.exit(1)
+
+     yield  # Run the app
+
+     # Cleanup
+     await small_webrtc_handler.close()
+     logger.info("Shutting down...")
+
+
+ app = FastAPI(lifespan=lifespan)
+
+ # Add CORS middleware
+ app.add_middleware(
+     CORSMiddleware,
+     allow_origins=["*"],  # In production, replace with specific origins
+     allow_credentials=True,
+     allow_methods=["*"],
+     allow_headers=["*"],
+ )
+
+ # Initialize the SmallWebRTC request handler
+ small_webrtc_handler: SmallWebRTCRequestHandler = SmallWebRTCRequestHandler()
+
+
+ @app.post("/api/offer")
+ async def offer(request: SmallWebRTCRequest, background_tasks: BackgroundTasks):
+     """Handle WebRTC offer requests via SmallWebRTCRequestHandler."""
+     logger.debug("Received WebRTC offer request")
+
+     # Prepare runner arguments with the callback that runs the bot
+     async def webrtc_connection_callback(connection):
+         background_tasks.add_task(run_bot, connection)
+
+     # Delegate handling to SmallWebRTCRequestHandler
+     answer = await small_webrtc_handler.handle_web_request(
+         request=request,
+         webrtc_connection_callback=webrtc_connection_callback,
+     )
+     return answer
+
+
+ @app.patch("/api/offer")
+ async def ice_candidate(request: SmallWebRTCPatchRequest):
+     """Handle ICE candidate patch requests."""
+     logger.debug("Received ICE candidate patch request")
+     await small_webrtc_handler.handle_patch_request(request)
+     return {"status": "success"}
+
+
+ @app.get("/api/status")
+ async def status():
+     """Health check endpoint with fresh config values."""
+     # Get the current config from config.ini
+     current_config = get_fresh_config()
+     current_stt = current_config['STT_PROVIDER']
+     current_tts = current_config['TTS_PROVIDER']
+     current_model = current_config['DEEPINFRA_MODEL']
+
+     return {
+         "status": "ok",
+         "stt_provider": current_stt,
+         "tts_provider": current_tts,
+         "llm_model": current_model,
+         "speechmatics_configured": bool(SPEECHMATICS_API_KEY) if current_stt == "speechmatics" else None,
+         "deepgram_configured": bool(DEEPGRAM_API_KEY) if current_stt == "deepgram" else None,
+         "elevenlabs_configured": bool(ELEVENLABS_API_KEY) if current_tts == "elevenlabs" else None,
+         "deepinfra_configured": bool(DEEPINFRA_API_KEY),
+         "qwen3_tts_configured": True if current_tts == "qwen3" else None,
+     }
+
+
+ @app.get("/api/config")
+ async def get_config():
+     """Get the current configuration from config.ini."""
+     import configparser
+
+     config = configparser.ConfigParser()
+     config_path = Path("config.ini")
+
+     if not config_path.exists():
+         return {"error": "config.ini not found"}
+
+     config.read(config_path)
+
+     return {
+         "llm": {
+             "model": config.get("LLM", "model", fallback="Qwen/Qwen3-235B-A22B-Instruct-2507"),
+         },
+         "stt": {
+             "provider": config.get("STT", "provider", fallback="speechmatics"),
+         },
+         "tts": {
+             "provider": config.get("TTS", "provider", fallback="qwen3"),
+             "qwen3_model": config.get("TTS", "qwen3_model", fallback="Qwen/Qwen3-TTS-12Hz-0.6B-Base"),
+             "qwen3_device": config.get("TTS", "qwen3_device", fallback="mps"),
+             "qwen3_ref_audio": config.get("TTS", "qwen3_ref_audio", fallback="tars-clean-compressed.mp3"),
+         },
+     }
+
+
+ @app.post("/api/config")
+ async def update_config(request: dict):
+     """Update the configuration in config.ini."""
+     import configparser
+
+     config = configparser.ConfigParser()
+     config_path = Path("config.ini")
+
+     if not config_path.exists():
+         return {"error": "config.ini not found"}
+
+     config.read(config_path)
+
+     # Update LLM config
+     if "llm_model" in request:
+         if not config.has_section("LLM"):
+             config.add_section("LLM")
+         config.set("LLM", "model", request["llm_model"])
+
+     # Update STT config
+     if "stt_provider" in request:
+         if not config.has_section("STT"):
+             config.add_section("STT")
+         config.set("STT", "provider", request["stt_provider"])
+
+     # Update TTS config
+     if "tts_provider" in request:
+         if not config.has_section("TTS"):
+             config.add_section("TTS")
+         config.set("TTS", "provider", request["tts_provider"])
+
+     # Write back to the file
+     with open(config_path, "w") as f:
+         config.write(f)
+
+     return {
+         "success": True,
+         "message": "Configuration updated. Please restart the service for changes to take effect.",
+         "restart_required": True,
+     }
+
+
+ if __name__ == "__main__":
+     parser = argparse.ArgumentParser(description="WebRTC Pipecat service")
+     parser.add_argument(
+         "--host", default=PIPECAT_HOST, help=f"Host for HTTP server (default: {PIPECAT_HOST})"
+     )
+     parser.add_argument(
+         "--port", type=int, default=PIPECAT_PORT, help=f"Port for HTTP server (default: {PIPECAT_PORT})"
+     )
+     parser.add_argument("--verbose", "-v", action="count")
+     args = parser.parse_args()
+
+     if args.verbose:
+         logger.add(sys.stderr, level="TRACE")
+     else:
+         logger.add(sys.stderr, level="INFO")
+
+     import uvicorn
+     uvicorn.run(app, host=args.host, port=args.port)
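The `POST /api/config` handler above mutates `config.ini` with `configparser`. Its core update flow can be exercised in isolation (in-memory here via `read_string`; the endpoint reads and writes the file on disk):

```python
import configparser

# Request payload using the same keys the endpoint accepts
request = {"stt_provider": "deepgram", "tts_provider": "elevenlabs"}

config = configparser.ConfigParser()
config.read_string("[STT]\nprovider = speechmatics\n")

# Mirror the handler: create the section if missing, then set the value
for section, key, req_key in [("STT", "provider", "stt_provider"),
                              ("TTS", "provider", "tts_provider")]:
    if req_key in request:
        if not config.has_section(section):
            config.add_section(section)
        config.set(section, key, request[req_key])
```

As in the service, the new values only take effect after a restart, since the providers are read once at startup.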
src/tars_bot.py ADDED
@@ -0,0 +1,457 @@
1
+ """
2
+ TARS Bot - Robot Mode
3
+
4
+ Pipecat pipeline that connects to Raspberry Pi TARS robot via WebRTC.
5
+ Uses aiortc client for bidirectional audio and DataChannel for state sync.
6
+
7
+ Architecture:
8
+ - RPi WebRTC Server (aiortc) ← MacBook WebRTC Client (aiortc)
9
+ - Audio: RPi mic β†’ Pipeline β†’ RPi speaker
10
+ - State: DataChannel for real-time sync
11
+ - Commands: gRPC for robot control
12
+ """
13
+
14
+ import sys
15
+ from pathlib import Path
16
+
17
+ # Add src/ to Python path
18
+ # Add src directory to Python path for imports
19
+ src_dir = Path(__file__).parent
20
+ sys.path.insert(0, str(src_dir))
21
+
22
+ import asyncio
23
+ import os
24
+ import uuid
25
+ from loguru import logger
26
+
27
+ from pipecat.pipeline.pipeline import Pipeline
28
+ from pipecat.pipeline.runner import PipelineRunner
29
+ from pipecat.pipeline.task import PipelineTask, PipelineParams
30
+ from pipecat.processors.aggregators.llm_context import LLMContext
31
+ from pipecat.processors.aggregators.llm_response_universal import (
32
+ LLMContextAggregatorPair,
33
+ LLMUserAggregatorParams
34
+ )
35
+ from pipecat.services.openai.llm import OpenAILLMService
36
+ from pipecat.adapters.schemas.tools_schema import ToolsSchema
37
+ from pipecat.transcriptions.language import Language
38
+ from pipecat.frames.frames import LLMRunFrame
39
+
40
+ from config import (
41
+ DEEPGRAM_API_KEY,
42
+ SPEECHMATICS_API_KEY,
43
+ ELEVENLABS_API_KEY,
44
+ ELEVENLABS_VOICE_ID,
45
+ DEEPINFRA_API_KEY,
46
+ DEEPINFRA_BASE_URL,
47
+ RPI_URL,
48
+ RPI_GRPC,
49
+ AUTO_CONNECT,
50
+ RECONNECT_DELAY,
51
+ MAX_RECONNECT_ATTEMPTS,
52
+ get_fresh_config,
53
+ detect_deployment_mode,
54
+ get_robot_grpc_address,
55
+ )
56
+
57
+ from transport import AiortcRPiClient, AudioBridge, StateSync
58
+ from transport.audio_bridge import RPiAudioInputTrack, RPiAudioOutputTrack
59
+ from services.factories import create_stt_service, create_tts_service
60
+ from services import tars_robot
61
+ from services.update_checker import TarsUpdateChecker, CLIENT_VERSION
62
+ from processors import SilenceFilter
63
+ from observers import StateObserver
64
+ from character.prompts import (
65
+ load_persona_ini,
66
+ load_tars_json,
67
+ build_tars_system_prompt,
68
+ get_introduction_instruction,
69
+ )
70
+ from tools import (
71
+ fetch_user_image,
72
+ adjust_persona_parameter,
73
+ execute_movement,
74
+ capture_camera_view,
75
+ create_fetch_image_schema,
76
+ create_adjust_persona_schema,
77
+ create_identity_schema,
78
+ create_movement_schema,
79
+ create_camera_capture_schema,
80
+ get_persona_storage,
81
+ set_emotion,
82
+ do_gesture,
83
+ create_emotion_schema,
84
+ create_gesture_schema,
85
+ set_rate_limiter,
86
+ ExpressionRateLimiter,
87
+ )
88
+
89
+
90
+ async def run_robot_bot():
91
+ """Run TARS bot in robot mode (connected to RPi via aiortc)."""
92
+ logger.info("=" * 80)
93
+ logger.info("πŸ€– Starting TARS in Robot Mode")
94
+ logger.info("=" * 80)
95
+
96
+ # Load fresh configuration
97
+ runtime_config = get_fresh_config()
98
+ DEEPINFRA_MODEL = runtime_config['DEEPINFRA_MODEL']
99
+ STT_PROVIDER = runtime_config['STT_PROVIDER']
100
+ TTS_PROVIDER = runtime_config['TTS_PROVIDER']
101
+ QWEN3_TTS_MODEL = runtime_config['QWEN3_TTS_MODEL']
102
+ QWEN3_TTS_DEVICE = runtime_config['QWEN3_TTS_DEVICE']
103
+ QWEN3_TTS_REF_AUDIO = runtime_config['QWEN3_TTS_REF_AUDIO']
104
+ TARS_DISPLAY_URL = runtime_config['TARS_DISPLAY_URL']
105
+ TARS_DISPLAY_ENABLED = runtime_config['TARS_DISPLAY_ENABLED']
106
+
107
+ # Detect deployment mode
108
+ deployment_mode = detect_deployment_mode()
109
+ robot_grpc_address = get_robot_grpc_address()
110
+
111
+ logger.info(f"πŸ“‹ Configuration:")
112
+ logger.info(f" Client: v{CLIENT_VERSION}")
113
+ logger.info(f" Deployment: {deployment_mode}")
114
+ logger.info(f" STT: {STT_PROVIDER}")
115
+ logger.info(f" LLM: {DEEPINFRA_MODEL}")
116
+ logger.info(f" TTS: {TTS_PROVIDER}")
117
+ logger.info(f" RPi HTTP: {RPI_URL}")
118
+ logger.info(f" RPi gRPC: {robot_grpc_address}")
119
+ logger.info(f" Display: {TARS_DISPLAY_URL} ({'enabled' if TARS_DISPLAY_ENABLED else 'disabled'})")
120
+
121
+ # Session initialization
122
+ session_id = str(uuid.uuid4())[:8]
123
+ client_id = f"guest_{session_id}"
124
+ client_state = {"client_id": client_id}
125
+ logger.info(f"πŸ“± Session: {client_id}")
126
+
127
+ service_refs = {"stt": None, "tts": None, "robot_client": None, "aiortc_client": None}
128
+
129
+ try:
130
+ # ====================================================================
131
+ # WEBRTC CONNECTION TO RPI
132
+        # ====================================================================
+
+        logger.info("🔌 Initializing WebRTC client...")
+        aiortc_client = AiortcRPiClient(
+            rpi_url=RPI_URL,
+            auto_reconnect=True,
+            reconnect_delay=RECONNECT_DELAY,
+            max_reconnect_attempts=MAX_RECONNECT_ATTEMPTS,
+        )
+        service_refs["aiortc_client"] = aiortc_client
+
+        # State sync via DataChannel
+        state_sync = StateSync()
+
+        # Set up callbacks
+        @aiortc_client.on_connected
+        async def on_connected():
+            logger.info("✓ WebRTC connected to RPi")
+            state_sync.set_send_callback(aiortc_client.send_data_channel_message)
+
+        @aiortc_client.on_disconnected
+        async def on_disconnected():
+            logger.warning("⚠️ WebRTC disconnected from RPi")
+
+        @aiortc_client.on_data_channel_message
+        def on_data_message(message: str):
+            state_sync.handle_message(message)
+
+        # Register DataChannel message handlers
+        state_sync.on_battery_update(lambda level, charging:
+            logger.debug(f"🔋 Battery: {level}% ({'charging' if charging else 'discharging'})"))
+
+        state_sync.on_movement_status(lambda moving, movement:
+            logger.debug(f"🚶 Movement: {movement} ({'active' if moving else 'idle'})"))
+
+        # Connect to RPi
+        if AUTO_CONNECT:
+            logger.info("🔄 Connecting to RPi...")
+            connected = await aiortc_client.connect()
+            if not connected:
+                logger.error("❌ Failed to connect to RPi. Exiting.")
+                return
+        else:
+            logger.info("⏸️ Auto-connect disabled. Waiting for manual connection.")
+            return
+
+        # Wait for audio track from RPi
+        logger.info("⏳ Waiting for audio track from RPi...")
+        timeout = 10
+        start_time = asyncio.get_event_loop().time()
+        while not aiortc_client.get_audio_track() and (asyncio.get_event_loop().time() - start_time) < timeout:
+            await asyncio.sleep(0.1)
+
+        audio_track_from_rpi = aiortc_client.get_audio_track()
+        if not audio_track_from_rpi:
+            logger.error("❌ No audio track received from RPi. Exiting.")
+            return
+
+        logger.info("✓ Received audio track from RPi")
+
+        # ====================================================================
+        # AUDIO BRIDGE SETUP
+        # ====================================================================
+
+        logger.info("🎧 Setting up audio bridge...")
+
+        # Create audio input track (RPi mic → Pipecat)
+        rpi_input = RPiAudioInputTrack(
+            aiortc_track=audio_track_from_rpi,
+            sample_rate=16000  # RPi mic sample rate
+        )
+
+        # Create audio output track (Pipecat TTS → RPi speaker)
+        rpi_output = RPiAudioOutputTrack(
+            sample_rate=24000  # TTS output sample rate
+        )
+
+        # Add output track to WebRTC connection
+        aiortc_client.add_audio_track(rpi_output)
+
+        # Create audio bridge processor
+        audio_bridge = AudioBridge(
+            rpi_input_track=rpi_input,
+            rpi_output_track=rpi_output
+        )
+
+        logger.info("✓ Audio bridge ready")
+
+        # ====================================================================
+        # SPEECH-TO-TEXT SERVICE
+        # ====================================================================
+
+        logger.info(f"🎤 Initializing {STT_PROVIDER} STT...")
+        stt = create_stt_service(
+            provider=STT_PROVIDER,
+            speechmatics_api_key=SPEECHMATICS_API_KEY,
+            deepgram_api_key=DEEPGRAM_API_KEY,
+            language=Language.EN,
+            enable_diarization=False,
+        )
+        service_refs["stt"] = stt
+        logger.info("✓ STT initialized")
+
+        # ====================================================================
+        # TEXT-TO-SPEECH SERVICE
+        # ====================================================================
+
+        logger.info(f"🔊 Initializing {TTS_PROVIDER} TTS...")
+        tts = create_tts_service(
+            provider=TTS_PROVIDER,
+            elevenlabs_api_key=ELEVENLABS_API_KEY,
+            elevenlabs_voice_id=ELEVENLABS_VOICE_ID,
+            qwen_model=QWEN3_TTS_MODEL,
+            qwen_device=QWEN3_TTS_DEVICE,
+            qwen_ref_audio=QWEN3_TTS_REF_AUDIO,
+        )
+        service_refs["tts"] = tts
+        logger.info("✓ TTS initialized")
+
+        # ====================================================================
+        # LLM SERVICE & TOOLS
+        # ====================================================================
+
+        logger.info("🧠 Initializing LLM...")
+        llm = OpenAILLMService(
+            api_key=DEEPINFRA_API_KEY,
+            base_url=DEEPINFRA_BASE_URL,
+            model=DEEPINFRA_MODEL
+        )
+
+        # Load character
+        character_dir = os.path.join(os.path.dirname(__file__), "character")
+        persona_params = load_persona_ini(os.path.join(character_dir, "persona.ini"))
+        tars_data = load_tars_json(os.path.join(character_dir, "TARS.json"))
+        system_prompt = build_tars_system_prompt(persona_params, tars_data)
+
+        # Initialize expression rate limiter
+        rate_limiter = ExpressionRateLimiter(
+            min_emotion_interval=5.0,
+            min_gesture_interval=30.0,
+            max_gestures_per_session=3
+        )
+        set_rate_limiter(rate_limiter)
+
+        # Create tool schemas
+        tools = ToolsSchema(
+            standard_tools=[
+                create_fetch_image_schema(),
+                create_adjust_persona_schema(),
+                create_identity_schema(),
+                create_movement_schema(),
+                create_camera_capture_schema(),
+                create_emotion_schema(),
+                create_gesture_schema(),
+            ]
+        )
+
+        messages = [system_prompt]
+        context = LLMContext(messages, tools)
+
+        # Register tool functions
+        llm.register_function("fetch_user_image", fetch_user_image)
+        llm.register_function("adjust_persona_parameter", adjust_persona_parameter)
+        llm.register_function("execute_movement", execute_movement)
+        llm.register_function("capture_camera_view", capture_camera_view)
+        llm.register_function("set_emotion", set_emotion)
+        llm.register_function("do_gesture", do_gesture)
+
+        logger.info(f"✓ LLM initialized with {DEEPINFRA_MODEL}")
+
+        # ====================================================================
+        # TARS ROBOT CLIENT (gRPC commands)
+        # ====================================================================
+
+        logger.info("🤖 Initializing TARS Robot Client (gRPC)...")
+        robot_client = None
+        if TARS_DISPLAY_ENABLED:
+            try:
+                robot_client = tars_robot.get_robot_client(address=robot_grpc_address)
+                service_refs["robot_client"] = robot_client
+                if robot_client and tars_robot.is_robot_available():
+                    logger.info(f"✓ TARS Robot Client connected via gRPC at {robot_grpc_address}")
+                    tars_robot.set_eye_state("idle")
+
+                    # Check daemon version
+                    logger.info("Checking TARS daemon version...")
+                    update_checker = TarsUpdateChecker(robot_client)
+                    await update_checker.check_on_connect()
+                else:
+                    logger.warning("⚠️ TARS Robot not available")
+            except Exception as e:
+                logger.warning(f"⚠️ Could not initialize TARS Robot: {e}")
+        else:
+            logger.info("ℹ️ TARS Robot control disabled")
+
+        # ====================================================================
+        # CONTEXT AGGREGATOR
+        # ====================================================================
+
+        user_params = LLMUserAggregatorParams(
+            user_turn_stop_timeout=1.5
+        )
+
+        context_aggregator = LLMContextAggregatorPair(
+            context,
+            user_params=user_params
+        )
+
+        persona_storage = get_persona_storage()
+        persona_storage["persona_params"] = persona_params
+        persona_storage["tars_data"] = tars_data
+        persona_storage["context_aggregator"] = context_aggregator
+
+        # ====================================================================
+        # OBSERVERS
+        # ====================================================================
+
+        state_observer = StateObserver(state_sync=state_sync)
+
+        # ====================================================================
+        # PIPELINE ASSEMBLY
+        # ====================================================================
+
+        logger.info("🔧 Building pipeline...")
+
+        pipeline = Pipeline([
+            stt,
+            context_aggregator.user(),
+            llm,
+            SilenceFilter(),
+            tts,
+            audio_bridge,  # Captures TTS output and sends to RPi speaker
+            context_aggregator.assistant(),
+        ])
+
+        # ====================================================================
+        # AUDIO INPUT FEEDING
+        # ====================================================================
+
+        # Task reference for audio feeding
+        task_ref = {"task": None, "audio_task": None}
+
+        async def feed_rpi_audio():
+            """Feed audio frames from the RPi mic into the pipeline."""
+            logger.info("🎤 Starting audio input from RPi...")
+            try:
+                async for audio_frame in rpi_input.start():
+                    if task_ref.get("task"):
+                        await task_ref["task"].queue_frames([audio_frame])
+            except Exception as e:
+                logger.error(f"❌ Audio input error: {e}", exc_info=True)
+            finally:
+                logger.info("🎤 Audio input stopped")
+
+        # ====================================================================
+        # PIPELINE EXECUTION
+        # ====================================================================
+
+        task = PipelineTask(
+            pipeline,
+            params=PipelineParams(
+                enable_metrics=True,
+                enable_usage_metrics=True,
+                report_only_initial_ttfb=False,
+            ),
+            observers=[state_observer],
+        )
+
+        task_ref["task"] = task
+        runner = PipelineRunner(handle_sigint=True)
+
+        logger.info("▶️ Starting pipeline...")
+        logger.info("=" * 80)
+
+        # Start audio input feeding task
+        audio_task = asyncio.create_task(feed_rpi_audio())
+        task_ref["audio_task"] = audio_task
+
+        # Send initial greeting
+        await asyncio.sleep(2.0)
+        intro_instruction = get_introduction_instruction(client_id, persona_params.get("verbosity", 10))
+        if context and hasattr(context, "messages"):
+            context.messages.append(intro_instruction)
+        await task.queue_frames([LLMRunFrame()])
+
+        # Run pipeline
+        try:
+            await runner.run(task)
+        finally:
+            # Cancel audio feeding task
+            if task_ref.get("audio_task"):
+                task_ref["audio_task"].cancel()
+                try:
+                    await task_ref["audio_task"]
+                except asyncio.CancelledError:
+                    pass
+
+    except KeyboardInterrupt:
+        logger.info("🛑 Interrupted by user")
+    except Exception as e:
+        logger.error(f"❌ Error in robot bot: {e}", exc_info=True)
+    finally:
+        # Cleanup
+        logger.info("🧹 Cleaning up...")
+        if service_refs.get("aiortc_client"):
+            await service_refs["aiortc_client"].disconnect()
+        if service_refs.get("stt"):
+            try:
+                await service_refs["stt"].close()
+            except Exception:
+                pass
+        if service_refs.get("tts"):
+            try:
+                await service_refs["tts"].close()
+            except Exception:
+                pass
+        if service_refs.get("robot_client"):
+            try:
+                tars_robot.close_robot_client()
+            except Exception:
+                pass
+        logger.info("✓ Cleanup complete")
+
+
+if __name__ == "__main__":
+    asyncio.run(run_robot_bot())
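The `feed_rpi_audio()` coroutine in the diff above uses a cancel-safe producer pattern: an async generator of audio frames is pumped into the pipeline by a background task that the `finally:` block cancels on shutdown. Here is a minimal, self-contained sketch of that pattern; the frame source and the queue are hypothetical stand-ins, not the Pipecat or aiortc APIs:

```python
import asyncio


async def fake_mic_frames(n: int = 3):
    """Stand-in for rpi_input.start(): an async generator of dummy frames."""
    for i in range(n):
        await asyncio.sleep(0)  # yield control, as a real audio track would
        yield f"frame-{i}"


async def feed_audio(queue: asyncio.Queue, frames) -> None:
    """Forward frames into the pipeline queue until exhausted or cancelled."""
    async for frame in frames:
        await queue.put(frame)


async def main() -> list:
    queue: asyncio.Queue = asyncio.Queue()
    feeder = asyncio.create_task(feed_audio(queue, fake_mic_frames()))
    received = [await queue.get() for _ in range(3)]
    # Shut the producer down the same way the finally-block above does.
    feeder.cancel()
    try:
        await feeder
    except asyncio.CancelledError:
        pass
    return received


if __name__ == "__main__":
    print(asyncio.run(main()))  # prints ['frame-0', 'frame-1', 'frame-2']
```

Cancelling the feeder and then awaiting it (swallowing only `CancelledError`) guarantees the producer has fully stopped before cleanup continues, which is exactly what the bot's shutdown path relies on.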
ui/README.md CHANGED
@@ -39,7 +39,7 @@ Then open http://localhost:7861
 
 Terminal 1:
 ```bash
- python bot.py
 ```
 
 Terminal 2:
@@ -52,7 +52,7 @@ python ui/app.py
 The UI reads from `src/shared_state.py`, which is populated by observers in the Pipecat pipeline:
 
 ```
- bot.py (Pipecat Pipeline)
   ↓
 src/observers/ (metrics, transcription, assistant)
   ↓
@@ -123,7 +123,7 @@ python tests/gradio/test_gradio.py
 ## Troubleshooting
 
 ### No data showing
- - Ensure bot.py is running
 - Check that WebRTC client is connected
 - Verify at least one conversation turn has completed
 
@@ -133,7 +133,7 @@ pip install gradio plotly
 ```
 
 ### Charts not updating
- - Check that observers are enabled in bot.py
 - Verify shared_state.py is being imported correctly
 - Check console for errors
 
 
 
 Terminal 1:
 ```bash
+ python src/bot.py
 ```
 
 Terminal 2:

 The UI reads from `src/shared_state.py`, which is populated by observers in the Pipecat pipeline:
 
 ```
+ src/bot.py (Pipecat Pipeline)
   ↓
 src/observers/ (metrics, transcription, assistant)
   ↓

 ## Troubleshooting
 
 ### No data showing
+ - Ensure src/bot.py is running
 - Check that WebRTC client is connected
 - Verify at least one conversation turn has completed
 

 ```
 
 ### Charts not updating
+ - Check that observers are enabled in src/bot.py
 - Verify shared_state.py is being imported correctly
 - Check console for errors
 
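The `src/bot.py → observers → shared_state → UI` flow above boils down to a thread-safe state object that pipeline observers write to and the Gradio UI polls. A rough sketch of that pattern; the class and field names here are illustrative, not the actual `src/shared_state.py` interface:

```python
from dataclasses import dataclass, field
from threading import Lock


@dataclass
class SharedState:
    """Observers append metrics; the UI reads lock-guarded snapshots."""
    _lock: Lock = field(default_factory=Lock, repr=False)
    ttfb_ms: list = field(default_factory=list)
    transcript: list = field(default_factory=list)

    def record_ttfb(self, value_ms: float) -> None:
        # Called from observer callbacks in the pipeline.
        with self._lock:
            self.ttfb_ms.append(value_ms)

    def snapshot(self) -> dict:
        # Called from the UI's refresh timer; copies avoid shared mutation.
        with self._lock:
            return {"ttfb_ms": list(self.ttfb_ms),
                    "transcript": list(self.transcript)}


state = SharedState()
state.record_ttfb(230.0)
print(state.snapshot()["ttfb_ms"])  # prints [230.0]
```

Handing the UI copies rather than live lists is what lets the charts refresh safely while the pipeline keeps appending from another thread.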
ui/app.py CHANGED
@@ -337,13 +337,13 @@ with gr.Blocks(
 gr.Markdown("""
 **To connect to TARS:**
 
- 1. Ensure bot pipeline is running: `python bot.py`
 2. Open WebRTC client in browser
 3. Pipeline will connect automatically
 
 **Endpoints:**
 - WebRTC Signaling: Handled by SmallWebRTC transport
- - Health Check: Check bot.py logs for status
 
 **Architecture:**
 - Pipecat pipeline with STT, LLM, TTS
 
 gr.Markdown("""
 **To connect to TARS:**
 
+ 1. Ensure bot pipeline is running: `python src/bot.py`
 2. Open WebRTC client in browser
 3. Pipeline will connect automatically
 
 **Endpoints:**
 - WebRTC Signaling: Handled by SmallWebRTC transport
+ - Health Check: Check src/bot.py logs for status
 
 **Architecture:**
 - Pipecat pipeline with STT, LLM, TTS