Peter Michael Gits

feat: Complete fresh restart with clean Streamlit configuration

3cb7ae9 7 months ago


	---
	title: VoiceCal - Voice-Enabled AI Scheduling Assistant
	emoji: 🎤📅
	colorFrom: blue
	colorTo: purple
	sdk: streamlit
	app_file: app.py
	pinned: false
	license: mit
	short_description: Voice-enabled AI assistant with WebRTC integration
	---

	# 🎤📅 VoiceCal - Voice-Enabled AI Scheduling Assistant

	An intelligent AI scheduling assistant with voice interaction capabilities powered by WebRTC, Whisper STT, and Bark TTS. Book Google Calendar appointments through natural conversation - now with voice input and audio responses!

	> 🚀 New Deployment: VoiceCal is the latest iteration, with improved WebSocket-based voice services and enhanced reliability. The previous ChatCal.ai Space has been superseded by this deployment.

	## 🎯 Features

	### 🗣️ Voice Interaction
	- 🎤 Real-time Speech-to-Text: WebRTC audio capture with Whisper transcription
	- 🔊 Text-to-Speech Responses: AI responses with natural voice synthesis
	- 🎭 Multiple Voice Options: Choose from different voice personalities
	- ⚡ Real-time Processing: Live transcription as you speak

	### 📅 Smart Calendar Integration
	- 🤖 AI-Powered Booking: Natural language appointment scheduling
	- 📅 Google Calendar Sync: Seamless integration with your calendar
	- 🔍 Conflict Detection: Smart availability checking
	- 🎥 Google Meet Integration: Automatic video conference setup
	- 📧 Email Notifications: Booking confirmations and cancellations
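
Conflict detection of this kind usually reduces to an interval-overlap test against existing events. The following is a minimal sketch, assuming events are available as `(start, end)` datetime pairs; `has_conflict` and the sample data are hypothetical illustrations, not VoiceCal's actual code.

```python
from datetime import datetime, timedelta

def has_conflict(start, end, busy):
    """Return True if [start, end) overlaps any (busy_start, busy_end) pair."""
    # Two half-open intervals overlap when each one starts before the other ends.
    return any(start < b_end and b_start < end for b_start, b_end in busy)

# Hypothetical existing events for one day.
busy = [
    (datetime(2025, 1, 14, 10, 0), datetime(2025, 1, 14, 10, 45)),
    (datetime(2025, 1, 14, 14, 0), datetime(2025, 1, 14, 14, 30)),
]

requested_start = datetime(2025, 1, 14, 10, 30)
requested_end = requested_start + timedelta(minutes=30)
print(has_conflict(requested_start, requested_end, busy))
```

Using half-open intervals means a meeting that starts exactly when another ends is not reported as a conflict.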

	### 🧠 Intelligent Conversation
	- 💭 Conversation Memory: Persistent context across interactions
	- 🎯 Smart Extraction: Automatically extract names, emails, times
	- 🌍 Timezone Awareness: Global scheduling support
	- ⏰ Flexible Time Parsing: "tomorrow at 2pm", "next Tuesday", etc.
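
To illustrate what flexible time parsing involves, here is a rough sketch that resolves phrases such as "tomorrow at 2pm" against a reference time. It is an assumption-laden stand-in: `parse_when` is a hypothetical helper, it treats "next Tuesday" as the next occurrence of that weekday, defaults to 09:00 when no time is given, and returns naive datetimes without any timezone handling. The real assistant may well delegate this to the language model.

```python
import re
from datetime import datetime, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday"]

def parse_when(text, now):
    """Resolve phrases like 'tomorrow at 2pm' or 'next Tuesday at 10 AM' against `now`."""
    text = text.lower()
    day = now.date()
    if "tomorrow" in text:
        day += timedelta(days=1)
    else:
        for index, name in enumerate(WEEKDAYS):
            if name in text:
                # Days until the next occurrence of that weekday (1..7, never today).
                ahead = (index - now.weekday() - 1) % 7 + 1
                day += timedelta(days=ahead)
                break
    # Time of day: '2pm', '10 am', '10:30 am'. Falls back to 09:00 when absent.
    hour, minute = 9, 0
    match = re.search(r"\b(\d{1,2})(?::(\d{2}))?\s*(am|pm)\b", text)
    if match:
        hour = int(match.group(1)) % 12
        minute = int(match.group(2) or 0)
        if match.group(3) == "pm":
            hour += 12
    return datetime(day.year, day.month, day.day, hour, minute)

monday_morning = datetime(2025, 1, 13, 9, 0)  # a Monday
print(parse_when("tomorrow at 2pm", monday_morning))
print(parse_when("next Tuesday at 10 AM", monday_morning))
```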

	## 🏗️ Architecture

	- Frontend: Streamlit with WebRTC audio capture
	- AI: Groq Llama-3.1 with Anthropic Claude fallback
	- STT: Whisper via external service integration
	- TTS: Bark text-to-speech synthesis
	- Calendar: Google Calendar API with OAuth2
	- Storage: Google Cloud Secret Manager for persistent auth
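
The primary-plus-fallback arrangement for the AI layer can be sketched as an ordered list of providers tried in turn. This is only an illustration of the pattern: `complete_with_fallback`, `groq_llama`, and `anthropic_claude` are hypothetical names, and the two provider functions are stand-ins that make no network calls.

```python
def complete_with_fallback(prompt, providers):
    """Try each (name, callable) provider in order; return the first successful reply."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real client would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))

# Stand-ins for real SDK calls (hypothetical; no network access here).
def groq_llama(prompt):
    raise TimeoutError("simulated outage")

def anthropic_claude(prompt):
    return f"echo: {prompt}"

provider, reply = complete_with_fallback(
    "Book a meeting tomorrow at 2pm",
    [("groq", groq_llama), ("anthropic", anthropic_claude)],
)
print(provider, reply)
```

Returning the provider name alongside the reply makes it easy to log how often the fallback path is taken.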

	## 🚀 Usage

	### Voice Interaction
	1. Click the microphone button to start recording
	2. Speak naturally: "Hi, I'm John. Book a 30-minute meeting tomorrow at 2pm"
	3. Watch real-time transcription appear in the text box
	4. AI responds with voice confirmation of your booking

	### Text Interaction
	- Type messages as usual; voice and text input work together
	- Edit voice transcriptions before sending
	- Use quick action buttons for common requests

	### Example Conversations

	Voice: "Book a Google Meet with Peter next Tuesday at 10 AM for 45 minutes"
	AI Audio Response: "Perfect! I've scheduled your 45-minute Google Meet with Peter for next Tuesday at 10:00 AM..."
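
A request like the one above carries several booking details in a single sentence. As a rough sketch of that extraction step, the snippet below pulls out a duration, an email address, and a Google Meet flag with regular expressions. `extract_details` is a hypothetical helper written for illustration; the deployed assistant may rely on the language model for this instead.

```python
import re

def extract_details(utterance):
    """Pull a duration in minutes, an email address, and a Meet flag from a request."""
    details = {"duration_minutes": None, "email": None, "wants_meet": False}
    # Matches '45 minutes', '30-minute', '30 min', or '1 hour'.
    duration = re.search(r"(\d+)[\s-]*(minute|min|hour)", utterance, re.IGNORECASE)
    if duration:
        amount = int(duration.group(1))
        unit = duration.group(2).lower()
        details["duration_minutes"] = amount * 60 if unit == "hour" else amount
    email = re.search(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", utterance)
    if email:
        details["email"] = email.group(0)
    details["wants_meet"] = "google meet" in utterance.lower()
    return details

request = "Book a Google Meet with Peter next Tuesday at 10 AM for 45 minutes"
print(extract_details(request))
```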

	## 🛠️ Development

	This space integrates:
	- VoiceCal Core: Calendar booking logic and Google integration
	- STT Service: External Whisper service for speech recognition via WebSocket
	- TTS Service: External Bark service for voice synthesis via WebSocket
	- WebRTC: Browser-based audio capture and streaming

	### 🔧 Service Configuration

	For optimal performance, the external voice services should be configured with:
	- STT Service: `STT_SERVICE_MODE=websocket` (enables WebSocket endpoints)
	- TTS Service: `TTS_SERVICE_MODE=websocket` (enables WebSocket endpoints)
	- Service URLs:
	  - STT: `wss://pgits-stt-gpu-service.hf.space/ws/stt`
	  - TTS: `wss://pgits-tts-gpu-service.hf.space/ws/tts`
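
On the client side, the service URLs listed above could be loaded from the environment with those endpoints as defaults. This is a sketch under assumptions: the variable names `STT_WS_URL` and `TTS_WS_URL`, the `VoiceServiceConfig` class, and `load_voice_config` are hypothetical and are not confirmed by this README.

```python
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class VoiceServiceConfig:
    stt_url: str
    tts_url: str

def load_voice_config(env=None):
    """Read service URLs from the environment, defaulting to the documented endpoints."""
    env = os.environ if env is None else env
    config = VoiceServiceConfig(
        # STT_WS_URL / TTS_WS_URL are assumed names, used here for illustration only.
        stt_url=env.get("STT_WS_URL", "wss://pgits-stt-gpu-service.hf.space/ws/stt"),
        tts_url=env.get("TTS_WS_URL", "wss://pgits-tts-gpu-service.hf.space/ws/tts"),
    )
    # Fail early if an override is not a WebSocket URL.
    for url in (config.stt_url, config.tts_url):
        if not url.startswith(("ws://", "wss://")):
            raise ValueError(f"not a WebSocket URL: {url}")
    return config

print(load_voice_config({}))
```

Passing a plain dictionary as `env` keeps the loader easy to test without touching the real process environment.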

	## 🔐 Privacy & Security

	- Secure OAuth2 authentication with Google
	- Audio is processed in real time and is not stored
	- Persistent token storage via Google Cloud Secret Manager
	- All calendar operations respect your existing permissions

	## 📞 Contact

	For business scheduling needs or technical support: pgits.job@gmail.com