pgits Claude Sonnet 4.6 committed on Feb 21

Commit

cd8a534

1 Parent(s): 082abed

DOCS: Update CLAUDE.md with accurate architecture and commands

Corrects LLM primary (Anthropic Claude Sonnet 4, not Groq), adds
VerbatimOutputParser pattern, email auto-provision flow, conflict
handling, single test command, and utility scripts section.

Bump version to 1.7.8.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Files changed (1)

CLAUDE.md +69 -45


@@ -15,11 +15,16 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
 ### Testing & Quality
 - `poetry run pytest` - Run test suite
 - `poetry run pytest --cov=app` - Run tests with coverage
 - `poetry run black .` - Code formatting
 - `poetry run flake8` - Linting
 - `poetry run mypy .` - Type checking
 ### Docker
 - `docker build -t voicecal-ai:latest .` - Build Docker image
 - `docker run -p 7860:7860 voicecal-ai:latest` - Run container
@@ -27,83 +32,102 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
 ## Architecture Overview
 ### Core Components
-- **app/core/agent.py** - Main LLM agent with conversation management and Google Calendar tool integration
-- **app/core/tools.py** - Calendar booking, cancellation, and availability tools
-- **app/core/llm.py** - LLM abstraction with Groq (primary), Anthropic and Gemini fallbacks
-- **app/core/session.py** - Session management (Redis or JWT based)
-- **app/calendar/** - Google Calendar API integration and authentication
 ### API Structure
-- **app/api/main.py** - FastAPI application with core endpoints
-- **app/api/chat_widget.py** - Embeddable chat widget with STT integration
-- **app/api/simple_chat.py** - Simple chat interface for testing
 - **app/api/models.py** - Pydantic models for API requests/responses
 ### Configuration
-- **app/config.py** - Centralized settings with Pydantic BaseSettings
-- Environment variables loaded from `.env` file
-- Session backend configurable: Redis (default) or JWT (for HuggingFace Spaces)
 ## Key Features
 ### LLM Integration
-- Primary: Groq Llama-3.1-8b-instant
-- Fallbacks: Anthropic Claude, Google Gemini
-- Conversation memory with configurable history length
-- HTML-formatted responses for rich display
 ### Calendar Integration
 - Google Calendar OAuth2 authentication
 - Meeting booking with conflict detection
-- Custom meeting ID format: MMDD-HHMM-DURm (e.g., 0731-1400-60m)
-- Smart cancellation by matching user/time details
-- Google Meet integration for video calls
-- Email notifications for bookings/cancellations
 ### Session Management
 - Redis sessions for stateful development
-- JWT sessions for stateless HuggingFace deployment
 - Configurable session timeout (default: 10 minutes)
-- Session factory pattern in `app/core/session_factory.py`
-### Speech-to-Text Integration
-- WebSocket connection to external STT service
-- Real-time audio processing at 16kHz
-- Configurable silence detection and auto-submit
-- STT service URL: `wss://pgits-stt-gpu-service-v3.hf.space/ws`
 ## Deployment
 ### HuggingFace Spaces
-- Uses `app.py` as entry point with JWT sessions
-- Dockerfile optimized for HF Spaces (port 7860)
-- Enhanced logging with file output for debugging
-- Dev Mode SSH debugging support
 ### Environment Variables
 Required:
-- `GROQ_API_KEY` - Groq LLM API key
-- `GOOGLE_CLIENT_ID`, `GOOGLE_CLIENT_SECRET` - OAuth2 credentials
-- `SECRET_KEY` - Application secret
-- `MY_PHONE_NUMBER`, `MY_EMAIL_ADDRESS` - Contact information
 Optional:
-- `ANTHROPIC_API_KEY`, `GEMINI_API_KEY` - Fallback LLM APIs
-- `TESTING_MODE=true` - Bypass email validation for development
-- `SESSION_BACKEND=jwt` - Use JWT instead of Redis (for HuggingFace)
 ### Version Management
 - Semantic versioning in `pyproject.toml` and `version.txt`
 - Version displayed in UI footer
-- Update version with each commit per deployment workflow
 ## Development Notes
-- Always update semantic version before committing
-- Use Poetry for dependency management in development
-- Testing mode bypasses Peter's email validation
 - HTML responses require `allow_html=true` in agent configuration
-- Calendar conflicts automatically detected and reported
-- Business hours configurable via environment variables (default: 9 AM - 5 PM)
-- decision to remove the summary when meeting booked, follow a different path, allow raw html through to show.

 ### Testing & Quality
 - `poetry run pytest` - Run test suite
+- `poetry run pytest unit_tests/test_booking_interception.py` - Run single test file
 - `poetry run pytest --cov=app` - Run tests with coverage
 - `poetry run black .` - Code formatting
 - `poetry run flake8` - Linting
 - `poetry run mypy .` - Type checking
+### Utility Scripts
+- `python scripts/refresh_google_token.py` - Refresh Google OAuth credentials
+- `python scripts/debug_chat.py` - Interactive CLI for testing conversations
 ### Docker
 - `docker build -t voicecal-ai:latest .` - Build Docker image
 - `docker run -p 7860:7860 voicecal-ai:latest` - Run container
 ## Architecture Overview
 ### Core Components
+- **app/core/agent.py** - `ChatCalAgent`: ReAct agent with Google Calendar tools, user info tracking, and loop detection
+- **app/core/tools.py** - `CalendarTools`: check_availability and create_appointment implementations
+- **app/core/llm.py** + **app/core/llm_anthropic.py** - LLM abstraction; primary is Anthropic claude-sonnet-4-20250514; Groq and Gemini are fallbacks
+- **app/core/custom_parser.py** - `VerbatimOutputParser`: extracts raw tool responses from ReAct output (critical for preserving HTML formatting)
+- **app/core/session_factory.py** - Selects Redis (dev) or JWT (HuggingFace) backend at runtime
+- **app/personality/prompts.py** - 400+ line system prompt that drives the entire booking workflow
 ### API Structure
+- **app/api/main.py** - FastAPI application with ~20 endpoints
+- **app/api/chat_widget.py** - Embeddable chat UI (HTML/JS served inline)
 - **app/api/models.py** - Pydantic models for API requests/responses
 ### Configuration
+- **app/config.py** - Centralized Pydantic BaseSettings; loads from env vars then `.env`
+- Session backend configurable: `SESSION_BACKEND=redis` (default) or `SESSION_BACKEND=jwt`
+## Key Architectural Patterns
+### Agent Workflow
+The agent is a LlamaIndex ReActAgent. On each turn:
+1. A `CalendarLLMWrapper` dynamically injects a system prompt with missing user info and user context
+2. The agent reasons through tool calls (`check_availability`, `create_appointment`)
+3. `VerbatimOutputParser` extracts raw tool output verbatim — never summarized — to preserve HTML confirmation markup
+4. Booking success is detected by markers like `<div id="booking-success">` and celebration phrases ("thanks", "all set") to return a pre-written farewell without an LLM call
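Step 4 above can be illustrated with a small sketch. It assumes the marker is looked for in the last tool output and the celebration phrases in the user's next message, and that both must match; the function name `maybe_farewell` and the farewell text are hypothetical and not taken from the repository.

```python
from typing import Optional

# Marker and phrases are quoted from CLAUDE.md; everything else is illustrative.
BOOKING_MARKER = '<div id="booking-success">'
CELEBRATION_PHRASES = ("thanks", "all set")
FAREWELL = "You're welcome! Your meeting is booked."  # hypothetical pre-written reply


def maybe_farewell(last_tool_output: str, user_message: str) -> Optional[str]:
    """Return the canned farewell if a booking just succeeded and the user is wrapping up.

    Returns None when the normal agent path (an LLM call) should handle the turn.
    """
    booked = BOOKING_MARKER in last_tool_output
    message = user_message.lower()
    celebrating = any(phrase in message for phrase in CELEBRATION_PHRASES)
    return FAREWELL if booked and celebrating else None
```

Returning `None` rather than raising keeps the shortcut optional: the caller falls through to the regular ReAct turn whenever the check does not apply.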
+### Email Auto-Provision
+Email flows from the landing page URL (`/chat-widget?email=...`) → stored in session `user_data` → loaded into `agent.user_info` on init. The system prompt strictly forbids asking for email again.
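A minimal sketch of the first hop in that flow, reading the email from the landing-page URL with the standard library. The helper names `email_from_url` and `seed_user_info` are hypothetical, and plain dicts stand in for the session `user_data` and `agent.user_info` objects.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def email_from_url(url: str) -> Optional[str]:
    """Extract the `email` query parameter, or None if it is absent."""
    values = parse_qs(urlparse(url).query).get("email")
    return values[0] if values else None


def seed_user_info(url: str, user_data: dict) -> dict:
    """Store the email in session data and return the info an agent would start with."""
    email = email_from_url(url)
    if email is not None:
        user_data["email"] = email
    # Stand-in for loading session data into agent.user_info on init.
    return {"email": user_data.get("email")}
```

For example, `seed_user_info("/chat-widget?email=ada%40example.com", {})` yields a `user_info` whose email is already filled in, so the agent has no reason to ask for it.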
+### Meeting Conflict Handling
+When `create_appointment` detects a conflict, it stores the full booking details in `conversation_state.pending_operation` so the agent can retry with an alternate time without re-asking for topic/duration.
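The retry pattern described above might look roughly like this. It is a simplified sketch: the `ConversationState` dataclass, the `busy_slots` set, and `retry_with_time` are illustrative stand-ins, and the real `create_appointment` talks to Google Calendar rather than a set of strings.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class ConversationState:
    # Holds the details of a booking that failed on a conflict, if any.
    pending_operation: Optional[dict] = None


def create_appointment(state: ConversationState, topic: str, duration_minutes: int,
                       start: str, busy_slots: Set[str]) -> str:
    """Book the slot, or remember the request if the slot is taken."""
    if start in busy_slots:
        state.pending_operation = {
            "topic": topic,
            "duration_minutes": duration_minutes,
            "requested_start": start,
        }
        return "conflict"
    state.pending_operation = None  # Booking succeeded; nothing left to retry.
    return f"booked {topic} at {start} for {duration_minutes}m"


def retry_with_time(state: ConversationState, new_start: str, busy_slots: Set[str]) -> str:
    """Retry the pending booking at a new time, reusing the stored topic and duration."""
    pending = state.pending_operation
    if pending is None:
        raise ValueError("no pending booking to retry")
    return create_appointment(state, pending["topic"], pending["duration_minutes"],
                              new_start, busy_slots)
```

Because the topic and duration live in `pending_operation`, the only new information the agent needs from the user is the alternate time.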
+### Custom Meeting ID Format
+`MMDD-HHMM-DURm` (e.g., `0731-1400-60m`) — encodes date, time, and duration in a human-readable string stored alongside the Google Calendar event ID.
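As an illustration of the format, the sketch below builds and parses such an ID. The function names are hypothetical; only the `MMDD-HHMM-DURm` layout comes from CLAUDE.md. Note that the format carries no year, so the parser returns only month, day, hour, minute, and duration.

```python
import re
from datetime import datetime
from typing import Tuple

_MEETING_ID = re.compile(r"^(\d{2})(\d{2})-(\d{2})(\d{2})-(\d+)m$")


def make_meeting_id(start: datetime, duration_minutes: int) -> str:
    """Encode start date/time and duration as MMDD-HHMM-DURm."""
    return f"{start:%m%d-%H%M}-{duration_minutes}m"


def parse_meeting_id(meeting_id: str) -> Tuple[int, int, int, int, int]:
    """Return (month, day, hour, minute, duration_minutes) from a meeting ID."""
    match = _MEETING_ID.match(meeting_id)
    if match is None:
        raise ValueError(f"not a MMDD-HHMM-DURm meeting ID: {meeting_id!r}")
    month, day, hour, minute, duration = (int(part) for part in match.groups())
    return month, day, hour, minute, duration
```

With these helpers, `make_meeting_id(datetime(2025, 7, 31, 14, 0), 60)` produces the `0731-1400-60m` example shown above.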
+### Loop Detection
+Agent tracks the last 5 responses and stops after 2 similar consecutive responses (normalized for missing-info content).
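One way to picture this check is the sketch below. It is an assumption-heavy illustration: the `LoopDetector` class, the `difflib` similarity measure, and the 0.9 threshold are not from the repository, and normalization here only lowercases and collapses whitespace rather than handling missing-info content.

```python
from collections import deque
from difflib import SequenceMatcher


class LoopDetector:
    """Flags a loop when a response closely matches the one immediately before it."""

    def __init__(self, history_size: int = 5, threshold: float = 0.9) -> None:
        self.history = deque(maxlen=history_size)  # keeps only the last 5 responses
        self.threshold = threshold                 # assumed similarity cutoff

    @staticmethod
    def _normalize(text: str) -> str:
        # Lowercase and collapse whitespace so trivial differences do not hide a repeat.
        return " ".join(text.lower().split())

    def is_looping(self, response: str) -> bool:
        current = self._normalize(response)
        previous = self.history[-1] if self.history else None
        self.history.append(current)
        if previous is None:
            return False
        return SequenceMatcher(None, previous, current).ratio() >= self.threshold
```

The caller would invoke `is_looping` once per agent response and stop the turn when it returns `True`.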
 ## Key Features
 ### LLM Integration
+- Primary: Anthropic Claude Sonnet 4 (`claude-sonnet-4-20250514`)
+- Fallbacks: Groq Llama-3.1-8b-instant, Google Gemini
+- Mock LLM available for testing via `USE_MOCK_LLM=true` env var
+- HTML-formatted responses pass through verbatim
 ### Calendar Integration
 - Google Calendar OAuth2 authentication
 - Meeting booking with conflict detection
+- Google Meet video link integration
+- Email invitations sent to attendees (SMTP via Gmail)
+- Business hours configurable via env vars (default: 9 AM–5 PM EST)
 ### Session Management
 - Redis sessions for stateful development
+- JWT sessions for stateless HuggingFace deployment (no external dependencies)
 - Configurable session timeout (default: 10 minutes)
+- Conversation objects are always in-memory regardless of backend
+### Speech-to-Text / Text-to-Speech
+- **STT file upload**: `POST /api/stt/transcribe` — Groq Whisper API (whisper-large-v3-turbo); auto-converts MP4+Opus → WebM
+- **TTS**: `POST /tts/synthesize` — Groq PlayAI with 27 available voices; audio cached in-memory (10 most recent)
+- **WebSocket STT**: Configured for `wss://pgits-stt-gpu-service-v3.hf.space/ws` at 16kHz (feature flag, not active by default)
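The "10 most recent" TTS cache could be sketched as a small bounded in-memory cache. CLAUDE.md does not say how entries are evicted, so this example assumes least-recently-used eviction; the class name `RecentAudioCache` and the string cache key are hypothetical.

```python
from collections import OrderedDict
from typing import Optional


class RecentAudioCache:
    """Keeps at most `max_items` audio blobs, evicting the least recently used one."""

    def __init__(self, max_items: int = 10) -> None:
        self._items: "OrderedDict[str, bytes]" = OrderedDict()
        self._max_items = max_items

    def get(self, key: str) -> Optional[bytes]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key: str, audio: bytes) -> None:
        self._items[key] = audio
        self._items.move_to_end(key)
        while len(self._items) > self._max_items:
            self._items.popitem(last=False)  # drop the oldest entry
```

A key derived from the text and voice (for example, a hash of both) would let repeated phrases skip a synthesis call.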
 ## Deployment
 ### HuggingFace Spaces
+- Entry point: `app.py` (forces `SESSION_BACKEND=jwt`, port 7860)
+- Logs written to stdout and `/tmp/app.log`
+- SSH debugging: `ssh -i ~/.ssh/id_ed25519 pgits-voicecal-ai@ssh.hf.space` (via Dev Mode)
+- HF deploys from `main` branch
 ### Environment Variables
 Required:
+- `GROQ_API_KEY` — Groq LLM and Whisper STT
+- `ANTHROPIC_API_KEY` — Primary LLM
+- `GOOGLE_CLIENT_ID`, `GOOGLE_CLIENT_SECRET` — OAuth2 credentials
+- `SECRET_KEY` — Session signing
+- `MY_PHONE_NUMBER`, `MY_EMAIL_ADDRESS` — Peter's contact info
 Optional:
+- `GEMINI_API_KEY` — Fallback LLM
+- `TESTING_MODE=true` — Bypass email validation for development
+- `SESSION_BACKEND=jwt` — Stateless mode for HuggingFace
+- `USE_MOCK_LLM=true` — Mock LLM for unit tests
+- `BUSINESS_START_HOUR`, `BUSINESS_END_HOUR` — Override default 9–17
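To show how the optional variables and their defaults fit together, here is a standard-library sketch. The project itself uses Pydantic BaseSettings in `app/config.py`; the `Settings` dataclass and `load_settings` function below are a hypothetical stand-in, and the validation rules are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    session_backend: str
    business_start_hour: int
    business_end_hour: int
    use_mock_llm: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the optional variables, falling back to the documented defaults."""
    backend = env.get("SESSION_BACKEND", "redis").lower()
    if backend not in ("redis", "jwt"):
        raise ValueError(f"unsupported SESSION_BACKEND: {backend!r}")

    start = int(env.get("BUSINESS_START_HOUR", "9"))
    end = int(env.get("BUSINESS_END_HOUR", "17"))
    if not 0 <= start < end <= 24:
        raise ValueError("business hours must satisfy 0 <= start < end <= 24")

    use_mock = env.get("USE_MOCK_LLM", "").lower() == "true"
    return Settings(backend, start, end, use_mock)
```

Passing the environment as a mapping makes the defaults easy to exercise in tests, for example `load_settings({"SESSION_BACKEND": "jwt"})`.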
 ### Version Management
 - Semantic versioning in `pyproject.toml` and `version.txt`
+- Update both files with every commit
 - Version displayed in UI footer
 ## Development Notes
+- **Tool response preservation is critical**: never summarize tool output; use `VerbatimOutputParser` to return as-is
+- **System prompt is the source of truth** for agent behavior — edit `app/personality/prompts.py` for workflow changes
 - HTML responses require `allow_html=true` in agent configuration
+- `testing_mode` bypasses email validation (allows non-Peter emails)
+- Decision: removed booking summary step — allow raw HTML confirmation through directly to frontend
+- Google credentials stored in `credentials/` dir and synced to HF Secrets after OAuth flow