
CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Development Commands

Environment Setup

  • poetry install - Install dependencies using Poetry
  • poetry shell - Activate virtual environment
  • pip install -r requirements.txt - Alternative pip installation

Development Server

  • uvicorn app.api.main:app --reload - Run development server with hot reload
  • python app.py - Run production server (HuggingFace Spaces compatible)

Testing & Quality

  • poetry run pytest - Run test suite
  • poetry run pytest unit_tests/test_booking_interception.py - Run single test file
  • poetry run pytest --cov=app - Run tests with coverage
  • poetry run black . - Code formatting
  • poetry run flake8 - Linting
  • poetry run mypy . - Type checking

Utility Scripts

  • python scripts/refresh_google_token.py - Refresh Google OAuth credentials
  • python scripts/debug_chat.py - Interactive CLI for testing conversations

Docker

  • docker build -t voicecal-ai:latest . - Build Docker image
  • docker run -p 7860:7860 voicecal-ai:latest - Run container

Architecture Overview

Core Components

  • app/core/agent.py - ChatCalAgent: ReAct agent with Google Calendar tools, user info tracking, and loop detection
  • app/core/tools.py - CalendarTools: check_availability and create_appointment implementations
  • app/core/llm.py + app/core/llm_anthropic.py - LLM abstraction; primary is Anthropic claude-sonnet-4-20250514; Groq and Gemini are fallbacks
  • app/core/custom_parser.py - VerbatimOutputParser: extracts raw tool responses from ReAct output (critical for preserving HTML formatting)
  • app/core/session_factory.py - Selects Redis (dev) or JWT (HuggingFace) backend at runtime
  • app/personality/prompts.py - 400+ line system prompt that drives the entire booking workflow

API Structure

  • app/api/main.py - FastAPI application with ~20 endpoints
  • app/api/chat_widget.py - Embeddable chat UI (HTML/JS served inline)
  • app/api/models.py - Pydantic models for API requests/responses

Configuration

  • app/config.py - Centralized Pydantic BaseSettings; loads from env vars then .env
  • Session backend configurable: SESSION_BACKEND=redis (default) or SESSION_BACKEND=jwt
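A minimal sketch of how the backend choice could be read at runtime, using only the standard library. The function name `select_session_backend` is hypothetical and is not taken from app/core/session_factory.py; only the SESSION_BACKEND variable, its two values, and the redis default come from this document.

```python
import os

VALID_BACKENDS = {"redis", "jwt"}


def select_session_backend(env=None):
    """Return the configured session backend name, defaulting to 'redis'."""
    env = os.environ if env is None else env
    backend = env.get("SESSION_BACKEND", "redis").strip().lower()
    if backend not in VALID_BACKENDS:
        # Fail fast on typos instead of silently falling back.
        raise ValueError(f"Unsupported SESSION_BACKEND: {backend!r}")
    return backend
```

Passing a plain dict as `env` keeps the function easy to test without touching the real process environment.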

Key Architectural Patterns

Agent Workflow

The agent is a LlamaIndex ReActAgent. On each turn:

  1. A CalendarLLMWrapper dynamically injects a system prompt with missing user info and user context
  2. The agent reasons through tool calls (check_availability, create_appointment)
  3. VerbatimOutputParser extracts raw tool output verbatim (never summarized) to preserve HTML confirmation markup
  4. Booking success is detected from markers such as <div id="booking-success"> and celebration phrases ("thanks", "all set"); once detected, the agent returns a pre-written farewell without an LLM call
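Step 4 can be sketched roughly as follows. This is an illustration under assumptions, not the code in app/core/agent.py: the helper names and the farewell text are hypothetical, and the sketch assumes the farewell is returned only when a booking marker was already seen and the user then sends a celebration phrase.

```python
BOOKING_MARKER = '<div id="booking-success">'
CELEBRATION_PHRASES = ("thanks", "all set")
# Hypothetical canned reply; the real wording lives in the repository.
FAREWELL = "You're all set. Have a great day!"


def has_booking_marker(tool_output: str) -> bool:
    """True if the raw tool output contains the booking-success marker."""
    return BOOKING_MARKER in tool_output


def is_celebration(user_message: str) -> bool:
    """True if the user message contains one of the celebration phrases."""
    text = user_message.lower()
    return any(phrase in text for phrase in CELEBRATION_PHRASES)


def maybe_farewell(booking_confirmed: bool, user_message: str):
    """Return the canned farewell (skipping the LLM) or None to continue normally."""
    if booking_confirmed and is_celebration(user_message):
        return FAREWELL
    return None
```

A caller would set `booking_confirmed` from `has_booking_marker(...)` on the previous tool response and fall through to the normal agent turn whenever `maybe_farewell` returns None.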

Email Auto-Provision

Email flows from the landing page URL (/chat-widget?email=...) → stored in session user_data → loaded into agent.user_info on init. The system prompt strictly forbids asking for email again.
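The flow above could look roughly like this in isolation. The function names are hypothetical and the session is modeled as a plain dict; only the `email` query parameter, `user_data`, and `user_info` come from this document.

```python
from urllib.parse import parse_qs, urlparse


def email_from_url(url: str):
    """Extract the 'email' query parameter, or None when it is absent."""
    values = parse_qs(urlparse(url).query).get("email")
    return values[0] if values else None


def init_user_info(session_user_data: dict) -> dict:
    """Seed the agent's user_info from session data so email is never re-asked."""
    user_info = {}
    email = session_user_data.get("email")
    if email:
        user_info["email"] = email
    return user_info


# Example address is a placeholder, URL-encoded as a browser would send it.
session_user_data = {"email": email_from_url("/chat-widget?email=pat%40example.com")}
user_info = init_user_info(session_user_data)
```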

Meeting Conflict Handling

When create_appointment detects a conflict, it stores the full booking details in conversation_state.pending_operation so the agent can retry with an alternate time without re-asking for topic/duration.
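A simplified sketch of the retry pattern. `try_book` and `retry_with_new_time` are hypothetical stand-ins for the real tool, the calendar is reduced to a set of busy start times, and the field names inside `pending_operation` are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConversationState:
    # Holds the details of a booking that failed on a conflict, if any.
    pending_operation: Optional[dict] = None


def try_book(state, busy_slots, topic, duration_min, start):
    """Book the slot, or remember the request when the slot is taken."""
    if start in busy_slots:
        state.pending_operation = {
            "topic": topic,
            "duration_min": duration_min,
            "requested_start": start,
        }
        return "conflict"
    state.pending_operation = None
    return "booked"


def retry_with_new_time(state, busy_slots, new_start):
    """Reuse the stored topic and duration; only the start time changes."""
    pending = state.pending_operation
    if pending is None:
        raise RuntimeError("no pending booking to retry")
    return try_book(state, busy_slots, pending["topic"], pending["duration_min"], new_start)
```

Because the topic and duration live in the stored operation, the user only has to supply a new time.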

Custom Meeting ID Format

MMDD-HHMM-DURm (e.g., 0731-1400-60m): encodes date, time, and duration in a human-readable string stored alongside the Google Calendar event ID.
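As an illustration of the format, the following sketch builds and parses such an ID. The function names are hypothetical, not the repository's. Note that the format carries no year, so parsing recovers only month, day, time, and duration.

```python
import re
from datetime import datetime

_MEETING_ID = re.compile(r"^(\d{2})(\d{2})-(\d{2})(\d{2})-(\d+)m$")


def make_meeting_id(start: datetime, duration_minutes: int) -> str:
    """Format a start time and duration as MMDD-HHMM-DURm."""
    return f"{start:%m%d-%H%M}-{duration_minutes}m"


def parse_meeting_id(meeting_id: str) -> dict:
    """Split an MMDD-HHMM-DURm string back into its integer fields."""
    match = _MEETING_ID.match(meeting_id)
    if match is None:
        raise ValueError(f"Not a valid meeting ID: {meeting_id!r}")
    month, day, hour, minute, duration = map(int, match.groups())
    return {
        "month": month,
        "day": day,
        "hour": hour,
        "minute": minute,
        "duration_minutes": duration,
    }


print(make_meeting_id(datetime(2025, 7, 31, 14, 0), 60))  # prints 0731-1400-60m
```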

Loop Detection

The agent tracks its last 5 responses and stops after 2 consecutive similar responses (responses are normalized for missing-info content before comparison).
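One way such a detector could be structured is sketched below. Everything beyond the history size of 5 and the "2 similar consecutive responses" rule is an assumption: the class name, the 0.9 similarity threshold, the use of difflib, and the bracketed `[missing: ...]` hint format that the normalizer strips.

```python
import re
from collections import deque
from difflib import SequenceMatcher


class LoopDetector:
    def __init__(self, history_size: int = 5, threshold: float = 0.9):
        # deque(maxlen=...) silently drops the oldest entry once full.
        self.history = deque(maxlen=history_size)
        self.threshold = threshold

    @staticmethod
    def normalize(response: str) -> str:
        """Drop hypothetical '[missing: ...]' hints and collapse whitespace."""
        text = re.sub(r"\[missing:[^\]]*\]", "", response.lower())
        return " ".join(text.split())

    def is_looping(self, response: str) -> bool:
        """Record the response; True if it is similar to the previous one."""
        current = self.normalize(response)
        previous = self.history[-1] if self.history else None
        self.history.append(current)
        if previous is None:
            return False
        ratio = SequenceMatcher(None, previous, current).ratio()
        return ratio >= self.threshold
```

Normalizing first means two replies that differ only in which fields are still missing count as the same reply.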

Key Features

LLM Integration

  • Primary: Anthropic Claude Sonnet 4 (claude-sonnet-4-20250514)
  • Fallbacks: Groq Llama-3.1-8b-instant, Google Gemini
  • Mock LLM available for testing via USE_MOCK_LLM=true env var
  • HTML-formatted responses pass through verbatim

Calendar Integration

  • Google Calendar OAuth2 authentication
  • Meeting booking with conflict detection
  • Google Meet video link integration
  • Email invitations sent to attendees (SMTP via Gmail)
  • Business hours configurable via env vars (default: 9 AM–5 PM EST)

Session Management

  • Redis sessions for stateful development
  • JWT sessions for stateless HuggingFace deployment (no external dependencies)
  • Configurable session timeout (default: 10 minutes)
  • Conversation objects are always in-memory regardless of backend

Speech-to-Text / Text-to-Speech

  • STT file upload: POST /api/stt/transcribe - Groq Whisper API (whisper-large-v3-turbo); auto-converts MP4+Opus → WebM
  • TTS: POST /tts/synthesize - Groq PlayAI with 27 available voices; audio cached in-memory (10 most recent)
  • WebSocket STT: Configured for wss://pgits-stt-gpu-service-v3.hf.space/ws at 16kHz (feature flag, not active by default)
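The in-memory TTS cache ("10 most recent") could be sketched as a small bounded cache. The class name and the (voice, text) key are hypothetical, and the sketch assumes that reading an entry also counts as using it, which the source does not specify.

```python
from collections import OrderedDict


class RecentAudioCache:
    """Keep only the most recently used audio clips in memory."""

    def __init__(self, max_entries: int = 10):
        self.max_entries = max_entries
        self._items = OrderedDict()

    def __len__(self) -> int:
        return len(self._items)

    def get(self, key):
        """Return cached audio bytes, or None on a miss."""
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, audio: bytes) -> None:
        """Store audio and evict the oldest entries beyond the limit."""
        self._items[key] = audio
        self._items.move_to_end(key)
        while len(self._items) > self.max_entries:
            self._items.popitem(last=False)  # drop the least recent entry
```

Keying on both voice and text avoids returning audio synthesized with a different voice for the same sentence.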

Deployment

HuggingFace Spaces

  • Entry point: app.py (forces SESSION_BACKEND=jwt, port 7860)
  • Logs written to stdout and /tmp/app.log
  • SSH debugging: ssh -i ~/.ssh/id_ed25519 pgits-voicecal-ai@ssh.hf.space (via Dev Mode)
  • HF deploys from main branch

Environment Variables

Required:

  • GROQ_API_KEY - Groq LLM and Whisper STT
  • ANTHROPIC_API_KEY - Primary LLM
  • GOOGLE_CLIENT_ID, GOOGLE_CLIENT_SECRET - OAuth2 credentials
  • SECRET_KEY - Session signing
  • MY_PHONE_NUMBER, MY_EMAIL_ADDRESS - Peter's contact info

Optional:

  • GEMINI_API_KEY - Fallback LLM
  • TESTING_MODE=true - Bypass email validation for development
  • SESSION_BACKEND=jwt - Stateless mode for HuggingFace
  • USE_MOCK_LLM=true - Mock LLM for unit tests
  • BUSINESS_START_HOUR, BUSINESS_END_HOUR - Override default 9–17
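For illustration, the business-hours override might be read as below. The function names are hypothetical and the real project loads settings through app/config.py; only the two variable names and the 9 to 17 default come from this document. Treating the end hour as exclusive is an assumption.

```python
import os


def business_hours(env=None):
    """Return (start_hour, end_hour), defaulting to 9 and 17."""
    env = os.environ if env is None else env
    start = int(env.get("BUSINESS_START_HOUR", "9"))
    end = int(env.get("BUSINESS_END_HOUR", "17"))
    if not (0 <= start < end <= 24):
        raise ValueError(f"Invalid business hours: {start}-{end}")
    return start, end


def within_business_hours(hour: int, env=None) -> bool:
    """True if the given hour falls inside the window (end hour exclusive)."""
    start, end = business_hours(env)
    return start <= hour < end
```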

Version Management

  • Semantic versioning in pyproject.toml and version.txt
  • Update both files with every commit
  • Version displayed in UI footer

Development Notes

  • Tool response preservation is critical: never summarize tool output; use VerbatimOutputParser to return as-is
  • System prompt is the source of truth for agent behavior; edit app/personality/prompts.py for workflow changes
  • HTML responses require allow_html=true in agent configuration
  • testing_mode bypasses email validation (allows email addresses other than Peter's)
  • Decision: removed the booking summary step; the raw HTML confirmation passes directly through to the frontend
  • Google credentials stored in credentials/ dir and synced to HF Secrets after OAuth flow