# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Development Commands

### Environment Setup

- `poetry install` - Install dependencies using Poetry
- `poetry shell` - Activate virtual environment
- `pip install -r requirements.txt` - Alternative pip installation
### Development Server

- `uvicorn app.api.main:app --reload` - Run development server with hot reload
- `python app.py` - Run production server (HuggingFace Spaces compatible)
### Testing & Quality

- `poetry run pytest` - Run test suite
- `poetry run pytest unit_tests/test_booking_interception.py` - Run single test file
- `poetry run pytest --cov=app` - Run tests with coverage
- `poetry run black .` - Code formatting
- `poetry run flake8` - Linting
- `poetry run mypy .` - Type checking
### Utility Scripts

- `python scripts/refresh_google_token.py` - Refresh Google OAuth credentials
- `python scripts/debug_chat.py` - Interactive CLI for testing conversations
### Docker

- `docker build -t voicecal-ai:latest .` - Build Docker image
- `docker run -p 7860:7860 voicecal-ai:latest` - Run container
## Architecture Overview

### Core Components

- `app/core/agent.py` - `ChatCalAgent`: ReAct agent with Google Calendar tools, user info tracking, and loop detection
- `app/core/tools.py` - `CalendarTools`: `check_availability` and `create_appointment` implementations
- `app/core/llm.py` + `app/core/llm_anthropic.py` - LLM abstraction; primary is Anthropic `claude-sonnet-4-20250514`; Groq and Gemini are fallbacks
- `app/core/custom_parser.py` - `VerbatimOutputParser`: extracts raw tool responses from ReAct output (critical for preserving HTML formatting)
- `app/core/session_factory.py` - Selects Redis (dev) or JWT (HuggingFace) backend at runtime
- `app/personality/prompts.py` - 400+ line system prompt that drives the entire booking workflow
### API Structure
- app/api/main.py - FastAPI application with ~20 endpoints
- app/api/chat_widget.py - Embeddable chat UI (HTML/JS served inline)
- app/api/models.py - Pydantic models for API requests/responses
### Configuration

- `app/config.py` - Centralized Pydantic BaseSettings; loads from env vars, then `.env`
- Session backend configurable: `SESSION_BACKEND=redis` (default) or `SESSION_BACKEND=jwt`
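A minimal, stdlib-only sketch of this centralized-settings pattern (the real `app/config.py` uses Pydantic BaseSettings; the fields beyond `SESSION_BACKEND` shown here are illustrative, not taken from the repo):

```python
import os
from dataclasses import dataclass, field


@dataclass
class Settings:
    """Stand-in for the Pydantic BaseSettings in app/config.py: each field
    reads its env var and falls back to a default."""
    session_backend: str = field(
        default_factory=lambda: os.getenv("SESSION_BACKEND", "redis")
    )
    business_start_hour: int = field(
        default_factory=lambda: int(os.getenv("BUSINESS_START_HOUR", "9"))
    )
    business_end_hour: int = field(
        default_factory=lambda: int(os.getenv("BUSINESS_END_HOUR", "17"))
    )
```

Pydantic BaseSettings adds `.env` file loading and type validation on top of this basic env-var lookup.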
## Key Architectural Patterns

### Agent Workflow

The agent is a LlamaIndex ReActAgent. On each turn:

- A `CalendarLLMWrapper` dynamically injects a system prompt with missing user info and user context
- The agent reasons through tool calls (`check_availability`, `create_appointment`)
- `VerbatimOutputParser` extracts raw tool output verbatim (never summarized) to preserve HTML confirmation markup
- Booking success is detected by markers like `<div id="booking-success">` and celebration phrases ("thanks", "all set") to return a pre-written farewell without an LLM call
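The booking-success check described above might look roughly like this (the marker string and phrases come from this document; the function itself is a hypothetical sketch, not the repo's code):

```python
# Marker and phrases per the workflow notes above; matching logic is illustrative.
SUCCESS_MARKER = '<div id="booking-success">'
CELEBRATION_PHRASES = ("thanks", "all set")


def booking_succeeded(response: str) -> bool:
    """Return True when a response signals a completed booking, so the
    agent can reply with a pre-written farewell instead of an LLM call."""
    text = response.lower()
    return SUCCESS_MARKER in response or any(p in text for p in CELEBRATION_PHRASES)
```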
### Email Auto-Provision

Email flows from the landing page URL (`/chat-widget?email=...`), is stored in session `user_data`, and is loaded into `agent.user_info` on init. The system prompt strictly forbids asking for email again.
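Distilled to its data flow, the auto-provision chain is just two hand-offs; this sketch uses plain dicts and hypothetical function names, not the repo's actual session or agent APIs:

```python
def provision_session(query_params: dict) -> dict:
    """Seed session user_data from the widget URL's ?email= parameter."""
    session = {"user_data": {}}
    if "email" in query_params:
        session["user_data"]["email"] = query_params["email"]
    return session


def init_agent(session: dict) -> dict:
    """Agent starts with the email already in user_info, so it never asks."""
    return {"user_info": dict(session["user_data"])}
```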
### Meeting Conflict Handling

When `create_appointment` detects a conflict, it stores the full booking details in `conversation_state.pending_operation` so the agent can retry with an alternate time without re-asking for topic/duration.
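The retry pattern can be sketched as follows; the conflict check, state shape, and `retry_with` helper are all illustrative assumptions, not the repo's implementation:

```python
# Minimal sketch: on conflict, stash the full booking so a retry only
# needs a new time, never a re-asked topic or duration.
conversation_state = {"pending_operation": None}


def create_appointment(topic: str, start: str, duration_min: int, calendar: set) -> str:
    if start in calendar:  # naive conflict check for illustration
        conversation_state["pending_operation"] = {
            "topic": topic,
            "start": start,
            "duration_min": duration_min,
        }
        return "conflict"
    calendar.add(start)
    return "booked"


def retry_with(new_start: str, calendar: set) -> str:
    """Retry the stashed booking at an alternate time."""
    op = conversation_state.pop("pending_operation")
    return create_appointment(op["topic"], new_start, op["duration_min"], calendar)
```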
### Custom Meeting ID Format

`MMDD-HHMM-DURm` (e.g., `0731-1400-60m`) encodes date, time, and duration in a human-readable string stored alongside the Google Calendar event ID.
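Generating an ID in this format is a one-liner with `strftime` codes (the function name is hypothetical; the format itself is as documented above):

```python
from datetime import datetime


def make_meeting_id(start: datetime, duration_min: int) -> str:
    """Encode date, time, and duration as MMDD-HHMM-DURm, e.g. 0731-1400-60m."""
    return f"{start:%m%d}-{start:%H%M}-{duration_min}m"
```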
### Loop Detection

The agent tracks its last 5 responses and stops after 2 similar consecutive responses (normalized for missing-info content).
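The idea can be sketched like this; the window size matches the description above, while the normalization (whitespace and case only) is a simplifying assumption, since the real agent also normalizes missing-info content:

```python
import re
from collections import deque


class LoopDetector:
    """Keep the last 5 normalized responses and flag a loop when two
    similar responses arrive consecutively."""

    def __init__(self, window: int = 5):
        self.recent = deque(maxlen=window)

    @staticmethod
    def _normalize(text: str) -> str:
        # Collapse whitespace and case so near-identical prompts compare equal.
        return re.sub(r"\s+", " ", text.strip().lower())

    def should_stop(self, response: str) -> bool:
        norm = self._normalize(response)
        looping = bool(self.recent) and self.recent[-1] == norm
        self.recent.append(norm)
        return looping
```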
## Key Features

### LLM Integration

- Primary: Anthropic Claude Sonnet 4 (`claude-sonnet-4-20250514`)
- Fallbacks: Groq Llama-3.1-8b-instant, Google Gemini
- Mock LLM available for testing via the `USE_MOCK_LLM=true` env var
- HTML-formatted responses pass through verbatim
### Calendar Integration
- Google Calendar OAuth2 authentication
- Meeting booking with conflict detection
- Google Meet video link integration
- Email invitations sent to attendees (SMTP via Gmail)
- Business hours configurable via env vars (default: 9 AM to 5 PM EST)
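A slot check against that env-configurable window could look like this (hypothetical helper; timezone handling and minute granularity are omitted for brevity):

```python
import os
from datetime import datetime


def within_business_hours(when: datetime) -> bool:
    """True when a proposed slot's hour falls inside the business window.
    Defaults mirror the documented 9 AM to 5 PM window."""
    start = int(os.getenv("BUSINESS_START_HOUR", "9"))
    end = int(os.getenv("BUSINESS_END_HOUR", "17"))
    return start <= when.hour < end
```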
### Session Management
- Redis sessions for stateful development
- JWT sessions for stateless HuggingFace deployment (no external dependencies)
- Configurable session timeout (default: 10 minutes)
- Conversation objects are always in-memory regardless of backend
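The runtime switch in `app/core/session_factory.py` presumably amounts to something like the following; the class names and bodies here are placeholders, not the repo's code:

```python
import os


class RedisSessionStore:
    """Placeholder for the stateful Redis backend used in development."""


class JWTSessionStore:
    """Placeholder for the stateless JWT backend used on HuggingFace."""


def make_session_store():
    """Pick the session backend from SESSION_BACKEND at runtime."""
    if os.getenv("SESSION_BACKEND", "redis") == "jwt":
        return JWTSessionStore()
    return RedisSessionStore()
```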
### Speech-to-Text / Text-to-Speech

- STT file upload: `POST /api/stt/transcribe` - Groq Whisper API (`whisper-large-v3-turbo`); auto-converts MP4+Opus to WebM
- TTS: `POST /tts/synthesize` - Groq PlayAI with 27 available voices; audio cached in-memory (10 most recent)
- WebSocket STT: configured for `wss://pgits-stt-gpu-service-v3.hf.space/ws` at 16kHz (feature flag, not active by default)
## Deployment

### HuggingFace Spaces

- Entry point: `app.py` (forces `SESSION_BACKEND=jwt`, port 7860)
- Logs written to stdout and `/tmp/app.log`
- SSH debugging: `ssh -i ~/.ssh/id_ed25519 pgits-voicecal-ai@ssh.hf.space` (via Dev Mode)
- HF deploys from the `main` branch
## Environment Variables

Required:

- `GROQ_API_KEY` - Groq LLM and Whisper STT
- `ANTHROPIC_API_KEY` - Primary LLM
- `GOOGLE_CLIENT_ID`, `GOOGLE_CLIENT_SECRET` - OAuth2 credentials
- `SECRET_KEY` - Session signing
- `MY_PHONE_NUMBER`, `MY_EMAIL_ADDRESS` - Peter's contact info

Optional:

- `GEMINI_API_KEY` - Fallback LLM
- `TESTING_MODE=true` - Bypass email validation for development
- `SESSION_BACKEND=jwt` - Stateless mode for HuggingFace
- `USE_MOCK_LLM=true` - Mock LLM for unit tests
- `BUSINESS_START_HOUR`, `BUSINESS_END_HOUR` - Override default 9-17
## Version Management

- Semantic versioning in `pyproject.toml` and `version.txt`
- Update both files with every commit
- Version displayed in UI footer
## Development Notes

- Tool response preservation is critical: never summarize tool output; use `VerbatimOutputParser` to return it as-is
- The system prompt is the source of truth for agent behavior; edit `app/personality/prompts.py` for workflow changes
- HTML responses require `allow_html=true` in the agent configuration
- `testing_mode` bypasses email validation (allows non-Peter emails)
- Decision: removed the booking summary step; raw HTML confirmations pass through directly to the frontend
- Google credentials are stored in the `credentials/` dir and synced to HF Secrets after the OAuth flow