pgits Claude Sonnet 4.6 committed on
Commit cd8a534 · 1 Parent(s): 082abed

DOCS: Update CLAUDE.md with accurate architecture and commands

Corrects LLM primary (Anthropic Claude Sonnet 4, not Groq), adds
VerbatimOutputParser pattern, email auto-provision flow, conflict
handling, single test command, and utility scripts section.

Bump version to 1.7.8.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Files changed (1)
  1. CLAUDE.md +69 -45
CLAUDE.md CHANGED
@@ -15,11 +15,16 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
  ### Testing & Quality
  - `poetry run pytest` - Run test suite
  - `poetry run pytest --cov=app` - Run tests with coverage
  - `poetry run black .` - Code formatting
  - `poetry run flake8` - Linting
  - `poetry run mypy .` - Type checking

  ### Docker
  - `docker build -t voicecal-ai:latest .` - Build Docker image
  - `docker run -p 7860:7860 voicecal-ai:latest` - Run container
@@ -27,83 +32,102 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
  ## Architecture Overview

  ### Core Components
- - **app/core/agent.py** - Main LLM agent with conversation management and Google Calendar tool integration
- - **app/core/tools.py** - Calendar booking, cancellation, and availability tools
- - **app/core/llm.py** - LLM abstraction with Groq (primary), Anthropic and Gemini fallbacks
- - **app/core/session.py** - Session management (Redis or JWT based)
- - **app/calendar/** - Google Calendar API integration and authentication

  ### API Structure
- - **app/api/main.py** - FastAPI application with core endpoints
- - **app/api/chat_widget.py** - Embeddable chat widget with STT integration
- - **app/api/simple_chat.py** - Simple chat interface for testing
  - **app/api/models.py** - Pydantic models for API requests/responses

  ### Configuration
- - **app/config.py** - Centralized settings with Pydantic BaseSettings
- - Environment variables loaded from `.env` file
- - Session backend configurable: Redis (default) or JWT (for HuggingFace Spaces)
 
  ## Key Features

  ### LLM Integration
- - Primary: Groq Llama-3.1-8b-instant
- - Fallbacks: Anthropic Claude, Google Gemini
- - Conversation memory with configurable history length
- - HTML-formatted responses for rich display

  ### Calendar Integration
  - Google Calendar OAuth2 authentication
  - Meeting booking with conflict detection
- - Custom meeting ID format: MMDD-HHMM-DURm (e.g., 0731-1400-60m)
- - Smart cancellation by matching user/time details
- - Google Meet integration for video calls
- - Email notifications for bookings/cancellations

  ### Session Management
  - Redis sessions for stateful development
- - JWT sessions for stateless HuggingFace deployment
  - Configurable session timeout (default: 10 minutes)
- - Session factory pattern in `app/core/session_factory.py`

- ### Speech-to-Text Integration
- - WebSocket connection to external STT service
- - Real-time audio processing at 16kHz
- - Configurable silence detection and auto-submit
- - STT service URL: `wss://pgits-stt-gpu-service-v3.hf.space/ws`
 
  ## Deployment

  ### HuggingFace Spaces
- - Uses `app.py` as entry point with JWT sessions
- - Dockerfile optimized for HF Spaces (port 7860)
- - Enhanced logging with file output for debugging
- - Dev Mode SSH debugging support

  ### Environment Variables
  Required:
- - `GROQ_API_KEY` - Groq LLM API key
- - `GOOGLE_CLIENT_ID`, `GOOGLE_CLIENT_SECRET` - OAuth2 credentials
- - `SECRET_KEY` - Application secret
- - `MY_PHONE_NUMBER`, `MY_EMAIL_ADDRESS` - Contact information

  Optional:
- - `ANTHROPIC_API_KEY`, `GEMINI_API_KEY` - Fallback LLM APIs
- - `TESTING_MODE=true` - Bypass email validation for development
- - `SESSION_BACKEND=jwt` - Use JWT instead of Redis (for HuggingFace)

  ### Version Management
  - Semantic versioning in `pyproject.toml` and `version.txt`
  - Version displayed in UI footer
- - Update version with each commit per deployment workflow

  ## Development Notes

- - Always update semantic version before committing
- - Use Poetry for dependency management in development
- - Testing mode bypasses Peter's email validation
  - HTML responses require `allow_html=true` in agent configuration
- - Calendar conflicts automatically detected and reported
- - Business hours configurable via environment variables (default: 9 AM - 5 PM)
-
- - decision to remove the summary when meeting booked, follow a different path, allow raw html through to show.
 
  ### Testing & Quality
  - `poetry run pytest` - Run test suite
+ - `poetry run pytest unit_tests/test_booking_interception.py` - Run single test file
  - `poetry run pytest --cov=app` - Run tests with coverage
  - `poetry run black .` - Code formatting
  - `poetry run flake8` - Linting
  - `poetry run mypy .` - Type checking

+ ### Utility Scripts
+ - `python scripts/refresh_google_token.py` - Refresh Google OAuth credentials
+ - `python scripts/debug_chat.py` - Interactive CLI for testing conversations
+
  ### Docker
  - `docker build -t voicecal-ai:latest .` - Build Docker image
  - `docker run -p 7860:7860 voicecal-ai:latest` - Run container
 
  ## Architecture Overview

  ### Core Components
+ - **app/core/agent.py** - `ChatCalAgent`: ReAct agent with Google Calendar tools, user info tracking, and loop detection
+ - **app/core/tools.py** - `CalendarTools`: check_availability and create_appointment implementations
+ - **app/core/llm.py** + **app/core/llm_anthropic.py** - LLM abstraction; primary is Anthropic claude-sonnet-4-20250514; Groq and Gemini are fallbacks
+ - **app/core/custom_parser.py** - `VerbatimOutputParser`: extracts raw tool responses from ReAct output (critical for preserving HTML formatting)
+ - **app/core/session_factory.py** - Selects Redis (dev) or JWT (HuggingFace) backend at runtime
+ - **app/personality/prompts.py** - 400+ line system prompt that drives the entire booking workflow

  ### API Structure
+ - **app/api/main.py** - FastAPI application with ~20 endpoints
+ - **app/api/chat_widget.py** - Embeddable chat UI (HTML/JS served inline)
  - **app/api/models.py** - Pydantic models for API requests/responses

  ### Configuration
+ - **app/config.py** - Centralized Pydantic BaseSettings; loads from env vars then `.env`
+ - Session backend configurable: `SESSION_BACKEND=redis` (default) or `SESSION_BACKEND=jwt`
+
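The backend choice described in the Configuration lines above could be sketched roughly as follows. This is only an illustration: the function name `select_session_backend` and the validation behaviour are assumptions, not the contents of `app/core/session_factory.py`.

```python
import os
from typing import Mapping, Optional

VALID_BACKENDS = ("redis", "jwt")


def select_session_backend(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the backend named by SESSION_BACKEND, defaulting to 'redis'.

    Hypothetical helper: the real factory may differ.
    """
    source = os.environ if env is None else env
    backend = source.get("SESSION_BACKEND", "redis").strip().lower()
    if backend not in VALID_BACKENDS:
        raise ValueError(f"Unsupported SESSION_BACKEND: {backend!r}")
    return backend
```

Passing the environment as a mapping keeps the helper easy to exercise in tests without touching `os.environ`.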
+ ## Key Architectural Patterns
+
+ ### Agent Workflow
+ The agent is a LlamaIndex ReActAgent. On each turn:
+ 1. A `CalendarLLMWrapper` dynamically injects a system prompt with missing user info and user context
+ 2. The agent reasons through tool calls (`check_availability`, `create_appointment`)
+ 3. `VerbatimOutputParser` extracts raw tool output verbatim (never summarized) to preserve HTML confirmation markup
+ 4. Booking success is detected by markers like `<div id="booking-success">` and celebration phrases ("thanks", "all set") to return a pre-written farewell without an LLM call
+
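Step 4 of the workflow above might look something like the sketch below. The function name `maybe_farewell`, the farewell text, and the choice to require both the marker and a celebration phrase are assumptions made for illustration; the commit does not show the actual check.

```python
from typing import Optional

BOOKING_MARKER = '<div id="booking-success">'
CELEBRATION_PHRASES = ("thanks", "all set")
# Hypothetical canned reply; the real farewell text is not shown in the source.
FAREWELL = "You're all set. Have a great day!"


def maybe_farewell(last_agent_output: str, user_message: str) -> Optional[str]:
    """Return a canned farewell when a booking succeeded and the user signs off.

    Returns None when the normal agent (LLM) path should handle the turn.
    """
    booked = BOOKING_MARKER in last_agent_output
    message = user_message.lower()
    celebrating = any(phrase in message for phrase in CELEBRATION_PHRASES)
    return FAREWELL if booked and celebrating else None
```

Returning `None` leaves the decision to call the LLM with the caller, so the shortcut stays a pure function.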
+ ### Email Auto-Provision
+ Email flows from the landing page URL (`/chat-widget?email=...`) → stored in session `user_data` → loaded into `agent.user_info` on init. The system prompt strictly forbids asking for email again.
+
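The email flow above can be illustrated with a small sketch using only the standard library. The helper names and the plain-dict session are hypothetical stand-ins; only the `email` query parameter and the `user_data` / `user_info` names come from the description.

```python
from typing import Dict, Optional
from urllib.parse import parse_qs, urlparse


def email_from_landing_url(url: str) -> Optional[str]:
    """Extract the `email` query parameter, if present."""
    values = parse_qs(urlparse(url).query).get("email")
    return values[0] if values else None


def seed_session(session: Dict, url: str) -> Dict:
    """Store the email from the landing URL in session['user_data']."""
    email = email_from_landing_url(url)
    if email:
        session.setdefault("user_data", {})["email"] = email
    return session


def init_user_info(session: Dict) -> Dict[str, Optional[str]]:
    """Build the agent's initial user_info from the session (hypothetical shape)."""
    return {"email": session.get("user_data", {}).get("email")}
```

With the email already present in `user_info`, the agent has no reason to ask for it again, which is what the system prompt enforces.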
+ ### Meeting Conflict Handling
+ When `create_appointment` detects a conflict, it stores the full booking details in `conversation_state.pending_operation` so the agent can retry with an alternate time without re-asking for topic/duration.
+
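A minimal sketch of the pending-operation idea follows, assuming a simple in-memory busy list in place of the Google Calendar lookup. The `ConversationState` dataclass, the dict layout, and `retry_with_new_time` are illustrative assumptions rather than the project's actual code.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

Interval = Tuple[datetime, datetime]


@dataclass
class ConversationState:
    pending_operation: Optional[dict] = None


def create_appointment(state: ConversationState, busy: List[Interval],
                       topic: str, start: datetime, duration_minutes: int) -> str:
    """Book the slot, or remember the request if it overlaps a busy interval."""
    end = start + timedelta(minutes=duration_minutes)
    if any(start < b_end and b_start < end for b_start, b_end in busy):
        # Keep everything except the time so a retry only needs a new start.
        state.pending_operation = {"topic": topic, "duration_minutes": duration_minutes}
        return "conflict"
    state.pending_operation = None
    return "booked"


def retry_with_new_time(state: ConversationState, busy: List[Interval],
                        new_start: datetime) -> str:
    """Retry the stored booking at a different start time."""
    pending = state.pending_operation
    if pending is None:
        raise ValueError("No pending booking to retry")
    return create_appointment(state, busy, pending["topic"], new_start,
                              pending["duration_minutes"])
```

Because topic and duration live in `pending_operation`, the retry needs only the alternate start time from the user.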
+ ### Custom Meeting ID Format
+ `MMDD-HHMM-DURm` (e.g., `0731-1400-60m`) - encodes date, time, and duration in a human-readable string stored alongside the Google Calendar event ID.
+
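To make the `MMDD-HHMM-DURm` format concrete, here is a small sketch that builds and parses such IDs. The function names are hypothetical, and the parser assumes the duration is always a whole number of minutes.

```python
import re
from datetime import datetime
from typing import Dict

MEETING_ID_PATTERN = re.compile(r"^(\d{2})(\d{2})-(\d{2})(\d{2})-(\d+)m$")


def format_meeting_id(start: datetime, duration_minutes: int) -> str:
    """Encode start date/time and duration as MMDD-HHMM-DURm."""
    return f"{start:%m%d}-{start:%H%M}-{duration_minutes}m"


def parse_meeting_id(meeting_id: str) -> Dict[str, int]:
    """Split an ID back into its numeric parts (the year is not encoded)."""
    match = MEETING_ID_PATTERN.match(meeting_id)
    if match is None:
        raise ValueError(f"Not a valid meeting ID: {meeting_id!r}")
    month, day, hour, minute, duration = (int(g) for g in match.groups())
    return {"month": month, "day": day, "hour": hour,
            "minute": minute, "duration_minutes": duration}
```

For example, a 60-minute meeting starting on July 31 at 14:00 yields `0731-1400-60m`, matching the example in the text.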
+ ### Loop Detection
+ Agent tracks the last 5 responses and stops after 2 similar consecutive responses (normalized for missing-info content).
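The loop check described above might be approximated as follows. Treating "similar" as equality after lowercasing and collapsing whitespace is an assumption of this sketch; the actual normalization for missing-info content is not shown in the commit, and `LoopDetector` is a hypothetical name.

```python
import re
from collections import deque


class LoopDetector:
    """Flag a loop when the most recent responses repeat."""

    def __init__(self, history_size: int = 5, max_repeats: int = 2) -> None:
        self._history = deque(maxlen=history_size)  # keeps only the last N responses
        self._max_repeats = max_repeats

    @staticmethod
    def _normalize(text: str) -> str:
        # Assumed normalization: lowercase and collapse runs of whitespace.
        return re.sub(r"\s+", " ", text).strip().lower()

    def record(self, response: str) -> bool:
        """Store a response; return True if the agent appears to be looping."""
        self._history.append(self._normalize(response))
        if len(self._history) < self._max_repeats:
            return False
        recent = list(self._history)[-self._max_repeats:]
        return len(set(recent)) == 1
```

A caller would stop the agent and fall back to a fixed message as soon as `record` returns `True`.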
 
  ## Key Features

  ### LLM Integration
+ - Primary: Anthropic Claude Sonnet 4 (`claude-sonnet-4-20250514`)
+ - Fallbacks: Groq Llama-3.1-8b-instant, Google Gemini
+ - Mock LLM available for testing via `USE_MOCK_LLM=true` env var
+ - HTML-formatted responses pass through verbatim

  ### Calendar Integration
  - Google Calendar OAuth2 authentication
  - Meeting booking with conflict detection
+ - Google Meet video link integration
+ - Email invitations sent to attendees (SMTP via Gmail)
+ - Business hours configurable via env vars (default: 9 AM–5 PM EST)

  ### Session Management
  - Redis sessions for stateful development
+ - JWT sessions for stateless HuggingFace deployment (no external dependencies)
  - Configurable session timeout (default: 10 minutes)
+ - Conversation objects are always in-memory regardless of backend

+ ### Speech-to-Text / Text-to-Speech
+ - **STT file upload**: `POST /api/stt/transcribe` - Groq Whisper API (whisper-large-v3-turbo); auto-converts MP4+Opus → WebM
+ - **TTS**: `POST /tts/synthesize` - Groq PlayAI with 27 available voices; audio cached in-memory (10 most recent)
+ - **WebSocket STT**: Configured for `wss://pgits-stt-gpu-service-v3.hf.space/ws` at 16kHz (feature flag, not active by default)
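The "10 most recent" in-memory audio cache mentioned in the TTS line could be sketched like this. The class name and the policy of refreshing recency on reads are assumptions; the source only states that the ten most recent clips are kept.

```python
from collections import OrderedDict
from typing import Optional


class RecentAudioCache:
    """Keep at most `max_items` synthesized clips, evicting the oldest first."""

    def __init__(self, max_items: int = 10) -> None:
        self._items: "OrderedDict[str, bytes]" = OrderedDict()
        self._max_items = max_items

    def get(self, key: str) -> Optional[bytes]:
        audio = self._items.get(key)
        if audio is not None:
            self._items.move_to_end(key)  # assumed: a hit counts as recent use
        return audio

    def put(self, key: str, audio: bytes) -> None:
        self._items[key] = audio
        self._items.move_to_end(key)
        while len(self._items) > self._max_items:
            self._items.popitem(last=False)  # drop the least recent entry
```

A key built from the voice name and the text would let repeated requests for the same phrase skip the synthesis call.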
 
  ## Deployment

  ### HuggingFace Spaces
+ - Entry point: `app.py` (forces `SESSION_BACKEND=jwt`, port 7860)
+ - Logs written to stdout and `/tmp/app.log`
+ - SSH debugging: `ssh -i ~/.ssh/id_ed25519 pgits-voicecal-ai@ssh.hf.space` (via Dev Mode)
+ - HF deploys from `main` branch

  ### Environment Variables
  Required:
+ - `GROQ_API_KEY` - Groq LLM and Whisper STT
+ - `ANTHROPIC_API_KEY` - Primary LLM
+ - `GOOGLE_CLIENT_ID`, `GOOGLE_CLIENT_SECRET` - OAuth2 credentials
+ - `SECRET_KEY` - Session signing
+ - `MY_PHONE_NUMBER`, `MY_EMAIL_ADDRESS` - Peter's contact info

  Optional:
+ - `GEMINI_API_KEY` - Fallback LLM
+ - `TESTING_MODE=true` - Bypass email validation for development
+ - `SESSION_BACKEND=jwt` - Stateless mode for HuggingFace
+ - `USE_MOCK_LLM=true` - Mock LLM for unit tests
+ - `BUSINESS_START_HOUR`, `BUSINESS_END_HOUR` - Override default 9–17
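As an illustration of the business-hours override listed above, the sketch below reads the two variables with the documented defaults of 9 and 17. The helper names, the range validation, and treating the end hour as exclusive are assumptions for this example.

```python
import os
from typing import Mapping, Optional, Tuple


def business_hours(env: Optional[Mapping[str, str]] = None) -> Tuple[int, int]:
    """Read BUSINESS_START_HOUR / BUSINESS_END_HOUR, defaulting to 9 and 17."""
    source = os.environ if env is None else env
    start = int(source.get("BUSINESS_START_HOUR", "9"))
    end = int(source.get("BUSINESS_END_HOUR", "17"))
    if not 0 <= start < end <= 24:
        raise ValueError(f"Invalid business hours: {start}-{end}")
    return start, end


def is_within_business_hours(hour: int, env: Optional[Mapping[str, str]] = None) -> bool:
    """True if `hour` falls in [start, end); the end hour is assumed exclusive."""
    start, end = business_hours(env)
    return start <= hour < end
```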
 
  ### Version Management
  - Semantic versioning in `pyproject.toml` and `version.txt`
+ - Update both files with every commit
  - Version displayed in UI footer

  ## Development Notes

+ - **Tool response preservation is critical**: never summarize tool output; use `VerbatimOutputParser` to return as-is
+ - **System prompt is the source of truth** for agent behavior; edit `app/personality/prompts.py` for workflow changes
  - HTML responses require `allow_html=true` in agent configuration
+ - `testing_mode` bypasses email validation (allows non-Peter emails)
+ - Decision: removed booking summary step; allow raw HTML confirmation through directly to frontend
+ - Google credentials stored in `credentials/` dir and synced to HF Secrets after OAuth flow