MikelWL committed on
Commit e3892d4 · 1 Parent(s): 0a865e9

refactor: Simplified and streamlined the core demo and overall codebase
.claude/README.md DELETED
@@ -1,47 +0,0 @@
- # 🤖 .claude Directory - Claude Code Integration
-
- This directory contains Claude Code-specific customizations for AI agent development workflow.
-
- ## Structure
-
- ```
- .claude/
- ├── commands/                 # Custom slash commands
- │   ├── websocket-task.md     # /websocket-task command
- │   ├── llm-task.md           # /llm-task command
- │   └── conversation-task.md  # /conversation-task command
- └── settings.local.json       # Local Claude Code settings
- ```
-
- ## Usage
-
- ### ✨ Custom Slash Commands (Primary Method)
- ```bash
- # Load complete task context with automatic file loading:
- /websocket-task     # WebSocket implementation
- /llm-task           # LLM integration
- /conversation-task  # Conversation orchestration
- ```
-
- ### 📂 Individual File References
- ```bash
- # Load specific files when needed:
- @backend/core/llm_client.py @data/personas/patient_personas.yaml
-
- # Load entire directories:
- @backend/models/ @data/personas/
- ```
-
- ## Workflow for AI Agents
-
- **Streamlined 3-step process:**
- 1. **Context**: `@CLAUDE.md` for project overview
- 2. **Task**: `/websocket-task`, `/llm-task`, or `/conversation-task`
- 3. **Code**: Begin implementation with full context loaded
-
- ## Benefits
-
- - **All-in-one**: Each slash command provides context + guidance + file loading
- - **Native integration**: Leverages Claude Code's built-in features
- - **No maintenance**: Single source of truth per task
- - **Automatic**: Files load instantly with `@` syntax

.claude/commands/conversation-task.md DELETED
@@ -1,48 +0,0 @@
- # Conversation Orchestration Task
-
- Load the necessary context for implementing conversation orchestration logic.
-
- ## Context to Load
-
- @backend/core/conversation_manager.py - Main orchestration logic
- @backend/core/llm_client.py - LLM interface to use
- @data/personas/ - All persona definitions
- @config/personas_config.yaml - Persona structure reference
- @TODO_CONTEXT.md - Full task breakdown
-
- ## Key Implementation Areas
-
- ### Files to Create
- 1. **backend/core/persona_system.py** - Persona prompt builder
- 2. **backend/models/conversation.py** - Conversation data models
- 3. **backend/models/message.py** - Message data models
-
- ### PersonaSystem Class Structure
- ```python
- class PersonaSystem:
-     def build_prompt(self, persona: dict, history: List[dict], context: dict) -> str:
-         """Build complete prompt including persona, history, and context."""
-
-     def extract_persona_traits(self, persona_id: str) -> dict:
-         """Extract behavioral traits from persona definition."""
-
-     def format_conversation_history(self, messages: List[dict]) -> str:
-         """Format message history for inclusion in prompt."""
- ```
-
- ### Conversation Flow Logic
- 1. Surveyor initiates with greeting
- 2. Patient responds based on persona
- 3. Surveyor asks follow-up questions
- 4. Continue until end conditions met
-
- ### Persona Behavior Implementation
- - **cooperative_senior**: Add delays, ask for clarification
- - **anxious_parent**: Question motives, seek reassurance
- - **busy_professional**: Keep responses brief, mention time
-
- ## Success Criteria
- - Two AIs can have coherent conversation
- - Personas behave according to definitions
- - Conversation flows naturally
- - History is maintained correctly
 
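The deleted task file above only sketches method signatures. For context, a minimal illustrative implementation of `build_prompt` might look like the following — note that the persona field names (`name`, `description`) and the `survey_topic` context key are assumptions for the example, not the project's actual persona schema:

```python
from typing import List


class PersonaSystem:
    """Minimal sketch of the prompt builder outlined in conversation-task.md."""

    def build_prompt(self, persona: dict, history: List[dict], context: dict) -> str:
        """Combine persona description, survey context, and history into one prompt."""
        parts = [f"You are {persona['name']}. {persona['description']}"]
        if context.get("survey_topic"):  # hypothetical context key
            parts.append(f"Survey topic: {context['survey_topic']}")
        if history:
            parts.append("Conversation so far:")
            parts.append(self.format_conversation_history(history))
        return "\n\n".join(parts)

    def format_conversation_history(self, messages: List[dict]) -> str:
        """Render each prior message as 'role: content' on its own line."""
        return "\n".join(f"{m['role']}: {m['content']}" for m in messages)


persona = {"name": "Margaret Thompson", "description": "A retired teacher."}
history = [{"role": "surveyor", "content": "How would you rate your health?"}]
prompt = PersonaSystem().build_prompt(persona, history, {"survey_topic": "general health"})
```

Keeping history formatting in its own method, as the outline suggests, makes it easy to later cap or summarize long conversations without touching the prompt assembly.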
.claude/commands/llm-task.md DELETED
@@ -1,53 +0,0 @@
- # LLM Integration Task
-
- Load the necessary context for implementing Ollama LLM integration in the AI Survey Simulator.
-
- ## Context to Load
-
- @backend/core/llm_client.py - Implement OllamaClient.generate()
- @config/personas_config.yaml - System prompts for personas
- @.env.example - LLM configuration examples
- @data/personas/ - All persona definitions
- @TODO_CONTEXT.md - Full task breakdown
-
- ## Implementation Focus
-
- ### Complete OllamaClient.generate() Method
- ```python
- async def generate(self, prompt: str, system_prompt: Optional[str] = None, **kwargs) -> str:
-     messages = []
-     if system_prompt:
-         messages.append({"role": "system", "content": system_prompt})
-     messages.append({"role": "user", "content": prompt})
-
-     payload = {
-         "model": self.model,
-         "messages": messages,
-         "stream": False,
-         **kwargs
-     }
-
-     response = await self.client.post(f"{self.host}/api/chat", json=payload)
-     # Handle response and errors
- ```
-
- ## Test Setup Commands
- ```bash
- # Install Ollama
- curl -fsSL https://ollama.ai/install.sh | sh
-
- # Pull model
- ollama pull llama2:13b
-
- # Test connection
- curl http://localhost:11434/api/tags
- ```
-
- ## Files to Create
- 1. **scripts/test_llm_connection.py** - Test script for LLM connectivity
-
- ## Success Criteria
- - Can call Ollama and get responses
- - System prompts correctly formatted
- - Errors handled gracefully
- - Test script validates connection
 
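The deleted snippet above leaves the request body implicit inside the client method. The payload construction can be factored into a pure, network-free helper; this is an illustrative sketch (the model name and prompt here are placeholders), not the project's actual `OllamaClient` code:

```python
from typing import Optional


def build_chat_payload(model: str, prompt: str,
                       system_prompt: Optional[str] = None,
                       **kwargs) -> dict:
    """Build the JSON body for a non-streaming Ollama /api/chat request."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": prompt})
    # Extra keyword arguments (e.g. options) pass through to the request body.
    return {"model": model, "messages": messages, "stream": False, **kwargs}


payload = build_chat_payload(
    "llama2:13b",
    "How would you rate your health?",
    system_prompt="You are Margaret, a retired teacher.",
)
```

Separating payload construction from the HTTP call keeps the message formatting testable without a running Ollama server, which is most of what the "System prompts correctly formatted" success criterion checks.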
.claude/commands/websocket-task.md DELETED
@@ -1,44 +0,0 @@
- # WebSocket Implementation Task
-
- Load the necessary context for implementing WebSocket functionality in the AI Survey Simulator.
-
- ## Context to Load
-
- Please load these files to understand the task:
-
- @backend/api/main.py - Add WebSocket endpoint here
- @config/default_config.yaml - WebSocket configuration section
- @frontend/streamlit_app.py - Frontend that needs WebSocket client
- @TODO_CONTEXT.md - Full task breakdown
-
- ## Files to Create
-
- 1. **backend/api/websockets/conversation_ws.py** - WebSocket endpoint implementation
- 2. **frontend/utils/websocket_client.py** - Frontend WebSocket client
-
- ## Implementation Notes
-
- ### WebSocket Endpoint Pattern
- ```python
- @app.websocket("/ws/conversation/{conversation_id}")
- async def websocket_endpoint(websocket: WebSocket, conversation_id: str):
-     await websocket.accept()
-     # Handle messages
- ```
-
- ### Message Format
- ```json
- {
-   "type": "conversation_message",
-   "role": "surveyor" | "patient",
-   "content": "message text",
-   "timestamp": "2024-01-01T12:00:00Z",
-   "conversation_id": "uuid"
- }
- ```
-
- ## Success Criteria
- - Frontend connects to backend WebSocket
- - Messages flow bidirectionally
- - Connection recovers from disconnects
- - Status indicator shows connection state
 
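The message format in the deleted task file implies validation on every received frame. A self-contained sketch of that validation (the field set mirrors the format above; the function name is an assumption, not code from the repository):

```python
import json
from datetime import datetime, timezone

REQUIRED_FIELDS = {"type", "role", "content", "timestamp", "conversation_id"}
VALID_ROLES = {"surveyor", "patient"}


def validate_message(raw: str) -> dict:
    """Parse one WebSocket text frame and check it against the message format."""
    msg = json.loads(raw)
    missing = REQUIRED_FIELDS - msg.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if msg["role"] not in VALID_ROLES:
        raise ValueError(f"unknown role: {msg['role']!r}")
    return msg


frame = json.dumps({
    "type": "conversation_message",
    "role": "patient",
    "content": "I'd say my health is fair.",
    "timestamp": datetime.now(timezone.utc).isoformat(),
    "conversation_id": "1234",
})
msg = validate_message(frame)
```

Rejecting malformed frames at the boundary is what lets the "connection recovers from disconnects" criterion hold: a bad message can be answered with an error frame instead of crashing the handler.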
.claude/settings.local.json DELETED
@@ -1,14 +0,0 @@
- {
-   "permissions": {
-     "allow": [
-       "Bash(mkdir:*)",
-       "Bash(ls:*)",
-       "Bash(mv:*)",
-       "Bash(true)",
-       "Bash(cp:*)",
-       "Bash(find:*)",
-       "WebFetch(domain:docs.anthropic.com)"
-     ],
-     "deny": []
-   }
- }

.env.example CHANGED
@@ -1,87 +1,16 @@
- # AI Survey Simulator Environment Variables
- # Copy this file to .env and update with your settings
-
- # Application Settings
- APP_NAME="AI Survey Simulator"
- APP_ENV=development
- DEBUG=true
- LOG_LEVEL=INFO
-
- # API Server Configuration
  API_HOST=0.0.0.0
  API_PORT=8000
- API_WORKERS=1
- API_RELOAD=true
-
- # Frontend Configuration
- STREAMLIT_SERVER_PORT=8501
- STREAMLIT_SERVER_ADDRESS=localhost

- # LLM Backend Configuration
- # Choose one: ollama, vllm, openai
  LLM_BACKEND=ollama
-
- # Ollama Configuration
- OLLAMA_HOST=http://localhost:11434
- OLLAMA_MODEL=llama2:13b
- OLLAMA_TIMEOUT=120
-
- # vLLM Configuration (alternative to Ollama)
- # VLLM_HOST=http://localhost:8000
- # VLLM_MODEL=meta-llama/Llama-2-13b-chat-hf
- # VLLM_TIMEOUT=120
-
- # OpenAI Configuration (for testing/comparison)
- # OPENAI_API_KEY=your-api-key-here
- # OPENAI_MODEL=gpt-3.5-turbo
-
- # Model Parameters
- LLM_TEMPERATURE=0.7
- LLM_MAX_TOKENS=2048
- LLM_TOP_P=0.9
-
- # Database Configuration
- DATABASE_TYPE=sqlite
- DATABASE_URL=sqlite:///./data/conversations.db

- # PostgreSQL Configuration (for production)
- # DATABASE_TYPE=postgresql
- # POSTGRES_HOST=localhost
- # POSTGRES_PORT=5432
- # POSTGRES_DB=ai_survey_simulator
- # POSTGRES_USER=postgres
- # POSTGRES_PASSWORD=your-password
-
- # WebSocket Configuration
- WS_PING_INTERVAL=10
- WS_PING_TIMEOUT=5
- WS_MAX_MESSAGE_SIZE=1048576
-
- # Security Settings
- # Set to true in production
- ENABLE_AUTH=false
- API_KEY=your-secure-api-key-here
- SECRET_KEY=your-secret-key-for-sessions
- ALLOWED_HOSTS=localhost,127.0.0.1
-
- # CORS Settings
- CORS_ORIGINS=http://localhost:8501,http://localhost:3000
-
- # Performance Settings
- CACHE_ENABLED=true
- CACHE_TTL=3600
- MAX_CONCURRENT_CONVERSATIONS=10
- CONVERSATION_QUEUE_SIZE=100
-
- # File Storage
- LOG_FILE_PATH=./logs/app.log
- LOG_FILE_MAX_SIZE=10MB
- LOG_FILE_BACKUP_COUNT=5
-
- # Monitoring (Optional)
- ENABLE_METRICS=false
- METRICS_PORT=9090
-
- # Development Settings
- AUTO_RELOAD_PERSONAS=true
- MOCK_LLM_RESPONSES=false

+ # API configuration
  API_HOST=0.0.0.0
  API_PORT=8000

+ # LLM backend configuration
  LLM_BACKEND=ollama
+ LLM_HOST=http://localhost:11434
+ LLM_MODEL=llama3.2:latest
+ LLM_TIMEOUT=120

+ # Frontend configuration
+ FRONTEND_BACKEND_BASE_URL=http://localhost:8000
+ FRONTEND_WEBSOCKET_URL=ws://localhost:8000/ws/conversation

+ # Logging
+ LOG_LEVEL=INFO
 
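The slimmed-down `.env.example` above keeps only the variables the code actually reads. A minimal sketch of loading them with stdlib `os.getenv`, using the example file's values as fallbacks (whether the project uses this pattern or a settings library is an assumption):

```python
import os

# Defaults mirror .env.example; a real deployment overrides them via .env.
API_HOST = os.getenv("API_HOST", "0.0.0.0")
API_PORT = int(os.getenv("API_PORT", "8000"))

LLM_BACKEND = os.getenv("LLM_BACKEND", "ollama")
LLM_HOST = os.getenv("LLM_HOST", "http://localhost:11434")
LLM_MODEL = os.getenv("LLM_MODEL", "llama3.2:latest")
LLM_TIMEOUT = int(os.getenv("LLM_TIMEOUT", "120"))  # seconds

FRONTEND_BACKEND_BASE_URL = os.getenv("FRONTEND_BACKEND_BASE_URL",
                                      "http://localhost:8000")
FRONTEND_WEBSOCKET_URL = os.getenv("FRONTEND_WEBSOCKET_URL",
                                   "ws://localhost:8000/ws/conversation")

LOG_LEVEL = os.getenv("LOG_LEVEL", "INFO")
```

Note the explicit `int(...)` casts: environment variables are always strings, so numeric settings like `LLM_TIMEOUT` need conversion at load time.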
.gitignore CHANGED
@@ -1 +1,2 @@
  *.pyc
+ .env
AXIOM_ARCHITECTURE.md DELETED
@@ -1,206 +0,0 @@
- # 🏗️ Architecture Decision Records (ADRs)
-
- > **Purpose**: Document significant architectural decisions for AI agents and developers.
- > Each decision includes context, options considered, decision made, and consequences.
-
- ## ADR Template
- ```markdown
- ### ADR-XXX: [Decision Title]
- **Date**: YYYY-MM-DD
- **Status**: Accepted/Deprecated/Superseded
- **Context**: Why this decision was needed
- **Decision**: What we decided
- **Consequences**: What happens as a result
- **Alternatives**: What else was considered
- ```
-
- ---
-
- ## ADR-001: Use Ollama for MVP LLM Backend
- **Date**: 2025-08-22
- **Status**: Accepted
- **Context**: Need local LLM hosting for privacy and cost control in research environment
- **Decision**: Use Ollama as primary LLM backend for MVP
- **Consequences**:
- - ✅ Simple setup with one-line install
- - ✅ No API costs
- - ✅ Full data privacy
- - ❌ Limited to models Ollama supports
- - ❌ Requires local GPU resources
-
- **Alternatives Considered**:
- - vLLM: More performant but complex setup
- - OpenAI API: Easy but costs and privacy concerns
- - Hugging Face local: More setup complexity
-
- ---
-
- ## ADR-002: FastAPI + Gradio Architecture
- **Date**: 2025-09-18 (Updated from original Streamlit decision)
- **Status**: Accepted
- **Context**: Need rapid development with real-time chat features and WebSocket integration
- **Decision**: FastAPI backend with Gradio frontend
- **Consequences**:
- - ✅ Fast development velocity
- - ✅ Built-in WebSocket support
- - ✅ Auto-generated API docs
- - ✅ Native chat components (`gr.Chatbot()`)
- - ✅ Better real-time streaming support
- - ✅ Cleaner async/await integration
- - ✅ No full page reruns on updates
- - ❌ Different component ecosystem than Streamlit
- - ❌ Not ideal for production scaling
-
- **Alternatives Considered**:
- - Streamlit: Good for research but limitations for real-time chat
- - Flask + React: More control but slower development
- - Django + templates: Too heavyweight for MVP
- - Pure FastAPI with Jinja: Limited interactivity
-
- ---
-
- ## ADR-003: SQLite for MVP Database
- **Date**: 2025-08-22
- **Status**: Accepted
- **Context**: Need simple persistence without setup complexity
- **Decision**: Use SQLite for MVP, design for PostgreSQL migration
- **Consequences**:
- - ✅ Zero configuration required
- - ✅ Single file database
- - ✅ Good enough for research scale
- - ✅ Easy to backup/share
- - ❌ Limited concurrent writes
- - ❌ No native JSON queries
-
- **Migration Path**:
- - Use SQLAlchemy ORM for abstraction
- - Keep database logic isolated
- - Document PostgreSQL migration steps
-
- ---
-
- ## ADR-004: YAML for Persona Configuration
- **Date**: 2025-08-22
- **Status**: Accepted
- **Context**: Researchers need to edit personas without coding
- **Decision**: Store personas as YAML files
- **Consequences**:
- - ✅ Human-readable and editable
- - ✅ Git-friendly for tracking changes
- - ✅ Comments for documentation
- - ✅ Hierarchical structure support
- - ❌ No schema validation (without extras)
- - ❌ Potential for syntax errors
-
- **Alternatives Considered**:
- - JSON: Less readable, no comments
- - Python files: Too technical for researchers
- - Database: Harder to version control
-
- ---
-
- ## ADR-005: WebSocket for Real-time Communication
- **Date**: 2025-08-22
- **Status**: Accepted
- **Context**: Need real-time conversation streaming
- **Decision**: Use WebSocket protocol for live updates
- **Consequences**:
- - ✅ True real-time bidirectional communication
- - ✅ Lower latency than polling
- - ✅ Native FastAPI support
- - ❌ More complex than REST
- - ❌ Connection management needed
- - ❌ Reconnection logic required
-
- **Implementation Notes**:
- - JSON message protocol
- - Heartbeat for connection health
- - Automatic reconnection in frontend
-
- ---
-
- ## ADR-006: Monolithic Repository Structure
- **Date**: 2025-08-22
- **Status**: Accepted
- **Context**: MVP development speed and AI agent context
- **Decision**: Keep all code in single repository
- **Consequences**:
- - ✅ Easier for AI agents to understand
- - ✅ Simpler development workflow
- - ✅ All context in one place
- - ❌ Harder to scale teams later
- - ❌ Frontend/backend coupling
-
- **Future Considerations**:
- - Clear module boundaries for future split
- - API-first design enables separation
- - Document splitting strategy
-
- ---
-
- ## ADR-007: AI-First Documentation
- **Date**: 2025-08-22
- **Status**: Accepted
- **Context**: Development primarily by AI agents
- **Decision**: Optimize all documentation for AI consumption
- **Consequences**:
- - ✅ Faster AI agent onboarding
- - ✅ Better context management
- - ✅ Consistent development patterns
- - ✅ Self-documenting approach
-
- **Key Principles**:
- - CLAUDE.md as central entry point
- - Context packages in .claude/
- - Detailed docstrings everywhere
- - Clear file navigation paths
-
- ---
-
- ## ADR-008: Configuration-Driven Design
- **Date**: 2025-08-22
- **Status**: Accepted
- **Context**: Research tool needs flexibility
- **Decision**: Externalize all configuration
- **Consequences**:
- - ✅ No code changes for experiments
- - ✅ Easy A/B testing
- - ✅ Shareable configurations
- - ❌ More files to manage
- - ❌ Configuration validation needed
-
- **Configuration Hierarchy**:
- 1. Environment variables (secrets)
- 2. YAML config files (settings)
- 3. Default values in code
-
- ---
-
- ## Future Decision Points
-
- ### Pending Decisions:
- 1. **Streaming vs Batch LLM Responses**
-    - Consider: UX vs simplicity
-    - Impact: WebSocket message protocol
-
- 2. **Multi-conversation Support**
-    - Consider: Concurrent conversations
-    - Impact: Resource management
-
- 3. **Persona Fine-tuning**
-    - Consider: Learning from conversations
-    - Impact: Model management complexity
-
- 4. **Export Format Standards**
-    - Consider: Research tool compatibility
-    - Impact: Data pipeline design
-
- ### Review Triggers:
- - When moving beyond MVP
- - If performance becomes issue
- - When adding team members
- - For production deployment
-
- ---
-
- **Note for AI Agents**: Always check if decisions here conflict with your implementation approach. If so, document why you're deviating or propose an ADR update.

AXIOM_CLAUDE_INTEGRATION.md DELETED
@@ -1,171 +0,0 @@
- # 🤖 AI Agent Context Guide - AI Survey Simulator
-
- > **Purpose**: Static reference guide for AI agents working on this project.
- > For current project state, see `PROJECT_STATE.md`.
-
- ## 🎯 Project Overview
-
- **What**: AI-to-AI healthcare survey research platform
- **Purpose**: Enable simulated conversations between AI surveyors and patient personas
- **Tech Stack**: FastAPI backend, Streamlit frontend, Ollama/vLLM for LLMs
-
- ## 🗺️ Critical Files Navigation
-
- ### For Understanding Architecture
- ```bash
- # Read these files in order:
- 1. AXIOM_README.md                        # Developer overview and setup
- 2. config/default_config.yaml             # System configuration
- 3. backend/api/main.py                    # API entry point
- 4. backend/core/conversation_manager.py   # Core logic structure
- ```
-
- ### For Persona System
- ```bash
- # Essential persona files:
- 1. config/personas_config.yaml            # Persona template structure
- 2. data/personas/patient_personas.yaml    # Patient definitions
- 3. data/personas/surveyor_personas.yaml   # Surveyor definitions
- ```
-
- ### For Implementation Work
- ```bash
- # Key implementation files:
- 1. backend/core/llm_client.py             # LLM integration point
- 2. frontend/streamlit_app.py              # UI implementation
- 3. requirements.txt                       # Dependencies
- ```
-
- ## 🧩 Native Claude Code Workflow
-
- ### 📦 Custom Slash Commands
-
- Use Claude Code's native slash commands for task-specific context loading:
-
- - `/websocket-task` - Load WebSocket implementation context
- - `/llm-task` - Load LLM integration context
- - `/conversation-task` - Load conversation orchestration context
-
- These commands automatically load the right files and provide implementation guidance.
-
- ### 📂 File References
-
- Use Claude Code's `@` syntax for quick file loading:
- - `@backend/api/main.py` - Main FastAPI app
- - `@config/default_config.yaml` - System configuration
- - `@data/personas/` - All persona definitions
- - `@PROJECT_STATE.md` - Current project status
-
- ### 🔗 Context Loading Examples
- ```
- # Load all WebSocket-related context:
- /websocket-task
-
- # Load specific files:
- @backend/core/llm_client.py @data/personas/patient_personas.yaml
-
- # Load directory:
- @backend/models/
- ```
-
- ## 🔧 AI Agent Commands
-
- ### Quick Context Commands
- ```bash
- # View project structure
- find . -type f -name "*.py" | grep -E "(api|core|frontend)" | head -20
-
- # Check current TODOs in code
- grep -r "TODO" --include="*.py" .
-
- # See configuration structure
- ls -la config/ data/personas/
-
- # Understand dependencies
- head -50 requirements.txt
- ```
-
- ### Development Patterns
- 1. **Before Adding Features**: Always check existing patterns in similar files
- 2. **Configuration First**: Add new settings to config files before hardcoding
- 3. **Docstring Discipline**: Every new function needs a docstring
- 4. **Type Hints**: Use type hints for all function parameters
- 5. **Async by Default**: Backend should be async-first
-
- ## 📝 Architecture Decisions (Reference)
-
- **Major Decisions Made**:
- 1. **Ollama for MVP**: Simpler than vLLM for initial development
- 2. **SQLite for MVP**: Easy setup, upgradeable to PostgreSQL
- 3. **Streamlit UI**: Rapid prototyping, good for research tools
- 4. **YAML for Personas**: Human-readable, easy to modify
-
- ## 🚨 Important Context Rules
-
- ### Always Remember:
- 1. **This is a research tool** - Not for production healthcare
- 2. **Privacy conscious** - No real patient data
- 3. **Modular design** - Each component should be replaceable
- 4. **AI-first documentation** - Write docs for AI agents to understand
-
- ### Never Do:
- 1. Don't hardcode configuration values
- 2. Don't create files without docstrings
- 3. Don't skip error handling in async code
- 4. Don't mix UI logic with business logic
-
- ## 🔄 AI Agent Resumption Workflow
-
- **For new AI agent instances starting work:**
-
- ### **Step 1: Load Project Context**
- ```bash
- @PROJECT_STATE.md              # Current status and next steps
- @AXIOM_CLAUDE_INTEGRATION.md   # This file - static reference
- ```
-
- ### **Step 2: Verify Environment**
- ```bash
- python scripts/check_setup.py  # Quick environment verification
- ```
-
- ### **Step 3: Choose Task Based on Status**
- ```bash
- /conversation-task   # Next priority - conversation orchestration
- /websocket-task      # If WebSocket needs work
- /llm-task            # If LLM needs work
- ```
-
- ### **Step 4: Test Current State**
- ```bash
- python scripts/test_integration.py  # Test all working components
- ```
-
- ### **Step 5: Begin Implementation**
- Use task context to guide implementation.
-
- ## 💡 Meta-Tips for Claude Code AI Agents
-
- 1. **Leverage native features**: Use `@` references and slash commands
- 2. **Use TodoWrite tool** frequently to track progress
- 3. **Use `--continue` or `--resume`** to maintain context across sessions
- 4. **Read before writing** - Always check existing patterns
- 5. **Document decisions** in `PROJECT_STATE.md`
-
- ## 🚀 Quick Start for Next Session
-
- ```bash
- # 1. Load current status
- @PROJECT_STATE.md
-
- # 2. Choose your task and load context
- /websocket-task      # OR
- /llm-task            # OR
- /conversation-task
-
- # 3. Begin implementation
- ```
-
- ---
-
- **Remember**: You're building a tool for other researchers. Keep it simple, well-documented, and extensible.

AXIOM_IMPLEMENTATION_HISTORY.md DELETED
@@ -1,159 +0,0 @@
- # 📚 Implementation History Archive
-
- > **Purpose**: Historical record of completed implementations and test results.
- > For current active development, see `PROJECT_STATE.md`.
-
- ---
-
- ## 🏆 **Foundation Phase Complete** (August-September 2025)
-
- ### **Integration Test Results** (2025-09-16)
- **Test Command**: `python scripts/test_integration.py`
- **Result**: 🎉 **7/7 tests passed**
-
- #### **Components Successfully Verified:**
- - ✅ **Persona System**: 5 personas loaded (cooperative_senior_001, anxious_parent_001, busy_professional_001, + 2 surveyors)
- - ✅ **LLM Client**: Full connectivity to Ollama with llama2:7b model
- - ✅ **Basic Persona Response**: Realistic character responses with personality
- - ✅ **Surveyor Persona**: Dr. Sarah Mitchell professional introduction working
- - ✅ **Patient Persona**: Jennifer Chen health rating responses with character behavior
- - ✅ **Multi-turn Conversation**: History-aware conversation capability
- - ✅ **Configuration Loading**: All YAML and environment config operational
-
- #### **Quality Assessment - Sample Responses:**
- ```
- Margaret Thompson (patient):
- "Oh, goodness me! *adjusts glasses* Well, you see, as a retired teacher with Type..."
-
- Dr. Sarah Mitchell (surveyor):
- "Hello there! My name is Dr. Sarah Mitchell, and I am a senior healthcare survey specialist with over 15 years of experie..."
-
- Jennifer Chen (anxious parent):
- "*looks up from her phone* Umm... let me see... *pauses* I would say my child's health is a..."
- ```
- **Assessment**: Personas demonstrate authentic personality traits, healthcare context, and realistic behavioral patterns suitable for research use.
-
- ---
-
- ## 🔧 **Technical Implementation Details** (Completed)
-
- ### **LLM Integration System** ✅
- - **Ollama Client**: Retry logic, error handling, performance tracking
- - **Persona System**: YAML-based with prompt building and behavior modifiers
- - **Configuration**: File-based + environment variable support
- - **Health Checks**: Connectivity and model availability testing
- - **Models Verified**: llama2:7b, gemma3:4b, gpt-oss:20b available
-
- ### **WebSocket Infrastructure** ✅
- - **Backend Endpoint**: Connection management with heartbeat
- - **Frontend Client**: Reconnection logic for production reliability
- - **Message Validation**: Strict format validation prevents errors
- - **Streamlit Integration**: Session state storage for UI updates
- - **Error Handling**: Graceful disconnection and recovery
-
- ### **Project Foundation** ✅
- - **Directory Structure**: Complete with 11 Python files
- - **Configuration System**: YAML configs + .env support operational
- - **Environment**: Python 3.9.23 in conda environment 'converai'
- - **Hardware**: RTX 3080 GPU verified, adequate for 13B models
- - **Dependencies**: All requirements.txt packages working
-
- ---
-
- ## 🧪 **Environment Verification** (Completed)
-
- ### **System Setup Results:**
- ```bash
- # Working Commands (Historical Record):
- python scripts/test_integration.py      # 7/7 tests pass
- python scripts/test_llm_connection.py   # Full LLM connectivity
- python scripts/test_websocket.py        # WebSocket management verified
- python scripts/check_setup.py           # Environment verified
- ```
-
- ### **Hardware Configuration:**
- - **GPU**: RTX 3080 Lite Hash Rate (ideal for 13B models)
- - **RAM**: 64GB total, 11GB used
- - **OS**: Ubuntu 22.04.5 LTS
- - **Python**: 3.9.23 via conda
-
- ---
-
- ## 📝 **Architecture Decisions Made** (Historical)
-
- ### **Foundation Decisions (August 2025):**
- 1. **Ollama for MVP**: Local hosting, no API costs, full data privacy
- 2. **SQLite for MVP**: Simple setup, upgradeable to PostgreSQL later
- 3. **Streamlit for UI**: Rapid prototyping, suitable for research tools
- 4. **YAML for Personas**: Human-readable format, researchers can modify easily
- 5. **FastAPI Backend**: Built-in async support, automatic API documentation
- 6. **WebSocket Real-time**: Essential for live conversation monitoring
-
- ### **Technical Patterns Established:**
- - **Async Everything**: FastAPI endpoints, LLM calls, WebSocket handling
- - **Configuration-Driven**: No hardcoded values, YAML + environment variables
- - **AI-First Documentation**: Every component designed for AI agent consumption
- - **Modular Design**: Clear boundaries for future component replacement
-
- ---
-
- ## 🎉 **Key Achievements**
-
- ### **What Works Right Now:**
- - **Individual Persona Responses**: Each character has distinct personality
- - **Multi-turn Conversations**: Conversations maintain history and context
- - **Professional Healthcare Context**: Responses include medical terminology and situations
- - **Real-time Capability**: WebSocket infrastructure ready for live streaming
- - **Research-Ready Personas**: 5 detailed characters suitable for survey research
-
- ### **Demo Readiness Assessment:**
- - **Technical Team**: ✅ Ready immediately (working code demonstrated)
- - **Non-Technical Research Team**: 🎯 1-2 weeks to visual interface
- - **Demo Appeal**: HIGH - authentic AI personalities with healthcare context
-
- ---
-
- ## 🔄 **Development Methodology Established**
-
- ### **Testing Approach:**
- - **Integration-First**: Full component interaction testing
- - **Persona Quality**: Response authenticity verification
- - **Environment Verification**: Multi-platform setup validation
-
- ### **Documentation Strategy:**
- - **AXIOM Files**: Static reference documentation
- - **PROJECT_STATE**: Single evolving status file
- - **Historical Archive**: This file for completed implementations
-
- ---
-
- ## 🚀 **Step 1 Implementation Complete** (2025-09-16)
-
- ### **Core Conversation Engine** ✅
- - **Files Created**: `backend/core/conversation_manager.py`, `scripts/run_conversation_demo.py`
- - **Functionality**: Full AI-to-AI conversation orchestration with persona management
- - **Features**: Rich terminal display, conversation termination, error handling
- - **Testing**: Live conversations between Dr. Sarah Mitchell and Margaret Thompson
- - **Performance**: Faster than estimated (1 session vs 2-3 days planned)
-
- ### **Terminal Demo Script**
- ```bash
- # Commands now available:
- python scripts/run_conversation_demo.py
- python scripts/run_conversation_demo.py --model gemma3:4b
- ```
-
- **Demo Features**:
- - Rich formatted terminal output with colored panels
- - Persona selection interface
- - Live conversation streaming
- - Turn-by-turn progress tracking
- - Graceful error handling and cleanup
-
- ---
-
- **Last Updated**: 2025-09-16
- **Status**: Foundation + Step 1 complete, archived for reference
- **Next Phase**: Step 2 - WebSocket Bridge (see PROJECT_STATE.md)

AXIOM_README.md DELETED
@@ -1,316 +0,0 @@
- # AI Survey Simulator
-
- An AI-to-AI healthcare survey research platform that enables simulated conversations between AI survey interviewers and patient personas for healthcare research purposes.
-
- ## 🎯 Project Overview
-
- The AI Survey Simulator is a research tool designed to:
- - Enable AI-to-AI conversations for healthcare survey testing
- - Simulate various patient personas responding to survey questions
- - Provide real-time monitoring of AI conversations
- - Log conversations for research analysis
- - Support self-hosted deployment using local LLMs
-
- ## 🏗️ Architecture
-
- ```
- ┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
- │  Streamlit UI   │────▶│  FastAPI Server │────▶│   LLM Backend   │
- │                 │◀────│   (WebSocket)   │◀────│  (Ollama/vLLM)  │
- └─────────────────┘     └─────────────────┘     └─────────────────┘
-                                  │
-                                  ▼
-                          ┌─────────────────┐
-                          │   SQLite/JSON   │
-                          │   Data Storage  │
-                          └─────────────────┘
- ```
-
- ### Core Components
-
- - **Frontend**: Streamlit-based interface for real-time conversation monitoring
- - **Backend API**: FastAPI server handling conversation orchestration
- - **LLM Integration**: Supports Ollama and vLLM for local model deployment
- - **Data Storage**: SQLite for conversation logs, JSON/YAML for personas
- - **Real-time Communication**: WebSocket for live conversation streaming
-
- ## 🚀 Quick Start
-
- ### Prerequisites
-
- - Python 3.9+
- - NVIDIA GPU with 8GB+ VRAM (for local LLM hosting)
- - Ollama or vLLM installed (see LLM Setup section)
-
- ### Installation
-
- 1. Clone the repository:
- ```bash
- git clone <repository-url>
- cd ai-survey-simulator
- ```
-
- 2. Create a virtual environment:
- ```bash
- python -m venv venv
- source venv/bin/activate  # On Windows: venv\Scripts\activate
- ```
-
- 3. Install dependencies:
- ```bash
- pip install -r requirements.txt
- ```
-
- 4. Copy environment configuration:
- ```bash
- cp .env.example .env
- ```
-
- 5. Configure your LLM backend in `.env`:
- ```bash
- # For Ollama
- LLM_BACKEND=ollama
- OLLAMA_HOST=http://localhost:11434
- OLLAMA_MODEL=llama2:13b
-
- # For vLLM
77
- # LLM_BACKEND=vllm
78
- # VLLM_HOST=http://localhost:8000
79
- # VLLM_MODEL=meta-llama/Llama-2-13b-chat-hf
80
- ```
81
-
82
- ### Running the Application
83
-
84
- 1. Start the backend server:
85
- ```bash
86
- cd backend
87
- uvicorn api.main:app --reload --host 0.0.0.0 --port 8000
88
- ```
89
-
90
- 2. In a new terminal, start the Streamlit frontend:
91
- ```bash
92
- cd frontend
93
- streamlit run streamlit_app.py
94
- ```
95
-
96
- 3. Access the application at `http://localhost:8501`
97
-
98
- ## 🔧 Development Workflow
99
-
100
- ### Project Structure
101
-
102
- ```
103
- ai-survey-simulator/
104
- ├── backend/ # FastAPI backend service
105
- │ ├── api/ # API endpoints and WebSocket handlers
106
- │ ├── core/ # Core business logic
107
- │ ├── models/ # Data models
108
- │ └── storage/ # Database and logging
109
- ├── frontend/ # Streamlit UI
110
- │ ├── components/ # UI components
111
- │ └── utils/ # Frontend utilities
112
- ├── data/ # Persona definitions and database
113
- ├── config/ # Configuration files
114
- ├── scripts/ # Utility scripts
115
- └── tests/ # Test suite
116
- ```
117
-
118
- ### Development Guidelines
119
-
120
- 1. **Code Style**: Follow PEP 8 and use type hints
121
- 2. **Testing**: Write tests for new features in the `tests/` directory
122
- 3. **Documentation**: Update docstrings and README for new features
123
- 4. **Git Workflow**: Create feature branches and use descriptive commit messages
124
-
125
- ### Running Tests
126
-
127
- ```bash
128
- # Run all tests
129
- pytest
130
-
131
- # Run with coverage
132
- pytest --cov=backend --cov=frontend
133
-
134
- # Run specific test file
135
- pytest tests/test_conversation_manager.py
136
- ```
137
-
138
- ## 📦 MVP Features
139
-
140
- The Minimum Viable Product includes:
141
-
142
- 1. **Basic AI-to-AI Conversations**: Two AI agents conducting healthcare surveys
143
- 2. **Patient Personas**: 2-3 pre-configured patient personas
144
- 3. **Real-time Display**: Live conversation monitoring via Streamlit
145
- 4. **Conversation Logging**: SQLite database for conversation storage
146
- 5. **Simple Configuration**: YAML-based configuration system
147
-
148
- ## 🚧 Extending to Full Features
149
-
150
- ### Phase 2: Enhanced Personas
151
- - Dynamic persona creation/editing interface
152
- - Persona behavior parameters (cooperation level, anxiety, etc.)
153
- - Persona template library
154
-
155
- ### Phase 3: Advanced Conversation Control
156
- - Mid-conversation intervention capabilities
157
- - Conversation branching and replay
158
- - Survey script management
159
-
160
- ### Phase 4: Analytics & Export
161
- - Conversation analysis dashboard
162
- - Export to common research formats (CSV, JSON, SPSS)
163
- - Automated report generation
164
-
165
- ### Phase 5: Multi-Model Support
166
- - Support for multiple LLM providers
167
- - Model comparison features
168
- - Fine-tuning integration
169
-
170
- ## ⚙️ Configuration Guide
171
-
172
- ### Environment Variables (.env)
173
- ```bash
174
- # LLM Configuration
175
- LLM_BACKEND=ollama # Options: ollama, vllm
176
- OLLAMA_HOST=http://localhost:11434
177
- OLLAMA_MODEL=llama2:13b
178
-
179
- # API Configuration
180
- API_HOST=0.0.0.0
181
- API_PORT=8000
182
- API_WORKERS=1
183
-
184
- # Database Configuration
185
- DATABASE_URL=sqlite:///./data/conversations.db
186
-
187
- # Logging
188
- LOG_LEVEL=INFO
189
- LOG_FILE=./logs/app.log
190
- ```
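
These variables are typically read once at startup. A minimal sketch using `os.getenv` with the defaults above (the `load_config` helper name is illustrative, not the project's actual loader):

```python
import os

def load_config() -> dict:
    """Read settings from the environment, falling back to the .env defaults."""
    return {
        "llm_backend": os.getenv("LLM_BACKEND", "ollama"),
        "ollama_host": os.getenv("OLLAMA_HOST", "http://localhost:11434"),
        "ollama_model": os.getenv("OLLAMA_MODEL", "llama2:13b"),
        "api_port": int(os.getenv("API_PORT", "8000")),  # ports are numeric
        "database_url": os.getenv("DATABASE_URL", "sqlite:///./data/conversations.db"),
    }

config = load_config()
print(config["llm_backend"], config["api_port"])
```

Casting `API_PORT` to `int` at the boundary keeps the rest of the code free of string/number conversions.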
191
-
192
- ### Persona Configuration (config/personas_config.yaml)
193
- ```yaml
194
- patient_personas:
195
- cooperative_senior:
196
- name: "Margaret Thompson"
197
- age: 72
198
- personality: "Friendly, cooperative, detail-oriented"
199
- health_context: "Managing diabetes and hypertension"
200
- communication_style: "Polite, sometimes needs clarification"
201
-
202
- anxious_parent:
203
- name: "Jennifer Chen"
204
- age: 38
205
- personality: "Worried, protective, questioning"
206
- health_context: "Child with recurring asthma"
207
- communication_style: "Asks many questions, needs reassurance"
208
- ```
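
The YAML above parses (e.g. via `yaml.safe_load`) into nested dicts shaped like the literal below; a small lookup helper makes persona selection explicit. Field names mirror the config, but the `get_persona` helper is illustrative rather than the project's actual API:

```python
# Dict mirroring the patient_personas section of config/personas_config.yaml.
PATIENT_PERSONAS = {
    "cooperative_senior": {
        "name": "Margaret Thompson",
        "age": 72,
        "personality": "Friendly, cooperative, detail-oriented",
        "health_context": "Managing diabetes and hypertension",
        "communication_style": "Polite, sometimes needs clarification",
    },
    "anxious_parent": {
        "name": "Jennifer Chen",
        "age": 38,
        "personality": "Worried, protective, questioning",
        "health_context": "Child with recurring asthma",
        "communication_style": "Asks many questions, needs reassurance",
    },
}

def get_persona(persona_id: str) -> dict:
    """Return a persona definition, failing loudly on unknown ids."""
    try:
        return PATIENT_PERSONAS[persona_id]
    except KeyError:
        raise ValueError(
            f"Unknown persona: {persona_id!r}; available: {sorted(PATIENT_PERSONAS)}"
        )

print(get_persona("cooperative_senior")["name"])
```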
209
-
210
- ## 🔌 LLM Setup
211
-
212
- ### Option 1: Ollama (Recommended for MVP)
213
-
214
- 1. Install Ollama:
215
- ```bash
216
- curl -fsSL https://ollama.ai/install.sh | sh
217
- ```
218
-
219
- 2. Pull a model:
220
- ```bash
221
- ollama pull llama2:13b
222
- ```
223
-
224
- 3. Verify installation:
225
- ```bash
226
- ollama list
227
- ```
228
-
229
- ### Option 2: vLLM (For Production)
230
-
231
- 1. Install vLLM:
232
- ```bash
233
- pip install vllm
234
- ```
235
-
236
- 2. Start vLLM server:
237
- ```bash
238
- python -m vllm.entrypoints.openai.api_server \
239
- --model meta-llama/Llama-2-13b-chat-hf \
240
- --host 0.0.0.0 \
241
- --port 8000
242
- ```
243
-
244
- ## 🧪 Testing Approach
245
-
246
- ### Unit Tests
247
- - Test individual components (persona system, conversation manager)
248
- - Mock LLM responses for predictable testing
249
- - Located in `tests/test_*.py`
250
-
251
- ### Integration Tests
252
- - Test API endpoints with real/mock data
253
- - WebSocket connection testing
254
- - Database operation verification
255
-
256
- ### End-to-End Tests
257
- - Full conversation flow testing
258
- - UI interaction testing with Selenium
259
- - Performance benchmarking
260
-
261
- ## 📊 Data Export
262
-
263
- Conversations can be exported via:
264
-
265
- 1. **API Endpoint**: `GET /api/conversations/export?format=csv`
266
- 2. **CLI Script**: `python scripts/export_conversations.py --format json`
267
- 3. **UI Export**: Available in the Streamlit interface
268
-
269
- ## 🤝 Contributing
270
-
271
- 1. Fork the repository
272
- 2. Create a feature branch: `git checkout -b feature/your-feature`
273
- 3. Commit changes: `git commit -m 'Add your feature'`
274
- 4. Push to branch: `git push origin feature/your-feature`
275
- 5. Submit a Pull Request
276
-
277
- ## 📄 License
278
-
279
- This project is licensed under the MIT License - see LICENSE file for details.
280
-
281
- ## 🆘 Troubleshooting
282
-
283
- ### Common Issues
284
-
285
- 1. **LLM Connection Failed**
286
- - Verify Ollama/vLLM is running
287
- - Check OLLAMA_HOST / VLLM_HOST in the .env file
288
- - Test with: `python scripts/test_llm_connection.py`
289
-
290
- 2. **WebSocket Connection Error**
291
- - Ensure backend is running on correct port
292
- - Check firewall settings
293
- - Verify CORS configuration
294
-
295
- 3. **Database Lock Error**
296
- - Close other connections to SQLite
297
- - Check file permissions
298
- - Consider PostgreSQL for production
299
-
300
- ### Debug Mode
301
-
302
- Enable debug logging:
303
- ```bash
304
- export LOG_LEVEL=DEBUG
305
- export STREAMLIT_SERVER_LOG_LEVEL=debug
306
- ```
307
-
308
- ## 📞 Support
309
-
310
- - Create an issue for bug reports
311
- - Discussion forum for feature requests
312
- - Email: [project-email]
313
-
314
- ---
315
-
316
- **Note**: This is a research tool intended for healthcare survey development and testing. Not for clinical use.
AXIOM_SETUP.md DELETED
@@ -1,208 +0,0 @@
1
- # 🚀 Environment Setup Guide
2
-
3
- > **Quick Setup**: Complete environment configuration for AI Survey Simulator development.
4
-
5
- ## 📋 **Prerequisites**
6
-
7
- - Python 3.9+
8
- - Git
9
- - NVIDIA GPU recommended (but not required)
10
-
11
- ## ⚡ **Quick Setup (5 minutes)**
12
-
13
- ### **1. Create Conda Environment**
14
- ```bash
15
- # Create and activate environment
16
- conda create -n ai-survey-sim python=3.9 -y
17
- conda activate ai-survey-sim
18
-
19
- # Install dependencies
20
- pip install -r requirements.txt
21
- ```
22
-
23
- ### **2. Install Ollama**
24
- ```bash
25
- # Linux/Mac
26
- curl -fsSL https://ollama.ai/install.sh | sh
27
-
28
- # Start Ollama service
29
- ollama serve &
30
-
31
- # Pull a model (choose based on your GPU)
32
- ollama pull llama2:7b # 4GB+ GPU or CPU
33
- # OR
34
- ollama pull llama2:13b # 8GB+ GPU (recommended)
35
- ```
36
-
37
- ### **3. Verify Setup**
38
- ```bash
39
- # Quick verification
40
- python scripts/check_setup.py
41
-
42
- # Detailed component tests
43
- python scripts/test_websocket.py
44
- python scripts/test_llm_connection.py
45
- ```
46
-
47
- ## 🖥️ **Machine-Specific Setup**
48
-
49
- ### **Laptop (RTX A4000 4GB)**
50
- ```bash
51
- ollama pull llama2:7b
52
- export LLM_MODEL=llama2:7b
53
- ```
54
-
55
- ### **Office Desktop (RTX 3080 10GB)**
56
- ```bash
57
- ollama pull llama2:13b
58
- export LLM_MODEL=llama2:13b
59
- ```
60
-
61
- ### **Home PC (AMD GPU)**
62
- ```bash
63
- # CPU-only mode
64
- ollama pull llama2:7b
65
- export LLM_MODEL=llama2:7b
66
- # Performance will be slower but functional
67
- ```
68
-
69
- ### **HPC (Multiple A100s)**
70
- ```bash
71
- ollama pull llama2:70b
72
- export LLM_MODEL=llama2:70b
73
- # Consider vLLM for multi-GPU setups
74
- ```
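
The machine-specific guidance above boils down to matching model size to GPU memory. An illustrative helper encoding those recommendations (the thresholds are this guide's rules of thumb, not hard requirements):

```python
from typing import Optional

def suggest_model(vram_gb: Optional[float]) -> str:
    """Pick an Ollama model tag from available GPU memory (None = CPU-only)."""
    if vram_gb is None or vram_gb < 8:  # small GPU or CPU-only mode
        return "llama2:7b"
    if vram_gb < 40:                    # single consumer/workstation GPU
        return "llama2:13b"
    return "llama2:70b"                 # HPC-class memory (e.g. A100s)

print(suggest_model(4), suggest_model(10), suggest_model(80))
```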
75
-
76
- ## 🔧 **Configuration**
77
-
78
- ### **Environment Variables**
79
- ```bash
80
- # Copy example environment file
81
- cp .env.example .env
82
-
83
- # Edit .env with your settings
84
- export LLM_HOST=http://localhost:11434
85
- export LLM_MODEL=llama2:7b # or llama2:13b
86
- export API_HOST=0.0.0.0
87
- export API_PORT=8000
88
- ```
89
-
90
- ### **Test Configuration**
91
- ```bash
92
- # Test WebSocket
93
- cd backend && uvicorn api.main:app --reload --host 0.0.0.0 --port 8000 &
94
- python scripts/test_websocket.py
95
-
96
- # Test LLM
97
- python scripts/test_llm_connection.py
98
- ```
99
-
100
- ## 🚨 **Troubleshooting**
101
-
102
- ### **Common Issues**
103
-
104
- #### **Ollama Connection Failed**
105
- ```bash
106
- # Check if Ollama is running
107
- curl http://localhost:11434/api/tags
108
-
109
- # If not running:
110
- ollama serve
111
-
112
- # Check models
113
- ollama list
114
- ```
115
-
116
- #### **GPU Not Detected**
117
- ```bash
118
- # Check GPU status
119
- nvidia-smi # For NVIDIA
120
- # AMD GPUs will use CPU automatically
121
- ```
122
-
123
- #### **Package Import Errors**
124
- ```bash
125
- # Reinstall requirements
126
- pip install -r requirements.txt --force-reinstall
127
-
128
- # Check Python environment
129
- which python
130
- python --version
131
- ```
132
-
133
- #### **Port Already in Use**
134
- ```bash
135
- # Kill existing processes
136
- pkill -f uvicorn
137
- pkill -f streamlit
138
-
139
- # Or use different ports
140
- export API_PORT=8001
141
- export STREAMLIT_SERVER_PORT=8502
142
- ```
143
-
144
- ## 🧪 **Development Workflow**
145
-
146
- ### **Start Development Session**
147
- ```bash
148
- # 1. Activate environment
149
- conda activate ai-survey-sim
150
-
151
- # 2. Verify setup
152
- python scripts/check_setup.py
153
-
154
- # 3. Start backend (optional for testing)
155
- cd backend && uvicorn api.main:app --reload &
156
-
157
- # 4. Begin development
158
- # Load context: @CLAUDE.md @STATUS.md
159
- # Choose task: /conversation-task
160
- ```
161
-
162
- ### **Test Changes**
163
- ```bash
164
- # Test individual components
165
- python scripts/test_websocket.py
166
- python scripts/test_llm_connection.py
167
-
168
- # Full integration test (when ready)
169
- python scripts/test_integration.py # Future
170
- ```
171
-
172
- ## 📦 **Docker Alternative (Future)**
173
-
174
- For consistent environments across machines:
175
-
176
- ```bash
177
- # Build container
178
- docker build -t ai-survey-sim .
179
-
180
- # Run with GPU support
181
- docker run --gpus all -p 8000:8000 -p 8501:8501 ai-survey-sim
182
- ```
183
-
184
- ## 🔗 **Next Steps**
185
-
186
- After setup completion:
187
-
188
- 1. **Verify Everything Works**: `python scripts/check_setup.py`
189
- 2. **Load Project Context**: `@CLAUDE.md @STATUS.md`
190
- 3. **Choose Your Task**: `/conversation-task` (current priority)
191
- 4. **Start Coding**: Follow task-specific guidance
192
-
193
- ## 💡 **Performance Tips**
194
-
195
- ### **Memory Optimization**
196
- - Use 7B models on 4GB GPUs
197
- - Use 13B models on 8GB+ GPUs
198
- - Close unused applications
199
- - Monitor with `nvidia-smi` or `htop`
200
-
201
- ### **Development Speed**
202
- - Use `--reload` for FastAPI auto-reload
203
- - Use Streamlit's auto-refresh
204
- - Keep Ollama server running between sessions
205
-
206
- ---
207
-
208
- **Need Help?** Check `scripts/check_setup.py` for automated diagnostics!
AXIOM_WEBSOCKET_ARCHITECTURE.md DELETED
@@ -1,285 +0,0 @@
1
- # 🔗 AXIOM: WebSocket Architecture & Implementation History
2
-
3
- > **Status**: COMPLETE & STABLE
4
- > **Purpose**: Static reference for completed WebSocket foundation (Steps 1-3)
5
- > **Date Completed**: 2025-09-18
6
-
7
- This document captures the complete, finalized WebSocket architecture that enables real-time AI-to-AI conversation streaming. This is stable foundation code that should not require changes.
8
-
9
- ---
10
-
11
- ## 🏗️ **Final Architecture Overview**
12
-
13
- ### **Thread-Safe WebSocket Manager Design**
14
- **Problem Solved**: Async/sync boundary conflicts between Gradio's synchronous environment and WebSocket's asynchronous nature.
15
-
16
- **Solution**: Complete separation of concerns - WebSocket remains fully async in dedicated background thread, Gradio stays synchronous, communication via thread-safe message queues.
17
-
18
- ```
19
- Architecture Flow:
20
- Gradio Frontend (Sync) ←→ Message Queues ←→ Background Thread (Fully Async WebSocket) ←→ FastAPI Backend
21
- ```
22
-
23
- **Critical Design Decision**: We did NOT convert async to sync. Instead, we isolated the async WebSocket in its own thread with dedicated event loop, preserving both paradigms while eliminating conflicts.
24
-
25
- ### **Key Components**
26
-
27
- 1. **WebSocketManager** (`frontend/websocket_manager.py`)
28
- - Runs WebSocket in dedicated background thread
29
- - Thread-safe message queues for sync/async communication
30
- - Automatic reconnection with exponential backoff
31
- - Connection state management
32
-
33
- 2. **ConversationService** (`backend/api/services/conversation_service.py`)
34
- - Manages active conversation instances
35
- - Bridges ConversationManager and WebSocket infrastructure
36
- - Handles conversation lifecycle (start/stop/pause)
37
-
38
- 3. **WebSocket Endpoints** (`backend/api/websockets/conversation_ws.py`)
39
- - Real-time message broadcasting to connected clients
40
- - Message validation and protocol handling
41
- - Connection management with heartbeat
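
The reconnection behavior mentioned above follows the usual exponential-backoff pattern: double the wait after each failed attempt, cap it, and add jitter. A sketch of the delay schedule (the constants are illustrative, not the values used in `websocket_manager.py`):

```python
import random

def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Seconds to wait before reconnect attempt `attempt` (0-based)."""
    delay = min(cap, base * (2 ** attempt))     # exponential growth, capped
    return delay * random.uniform(0.5, 1.0)     # jitter avoids thundering herd

for attempt in range(6):
    print(f"attempt {attempt}: wait up to {min(30.0, 0.5 * 2 ** attempt):.1f}s")
```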
42
-
43
- ---
44
-
45
- ## 📋 **Implementation Steps Completed**
46
-
47
- ### **Step 1: Core Conversation Engine** ✅ (2025-09-16)
48
- **Goal**: Wire working components into conversation loop
49
-
50
- **Key Implementation**:
51
- - `backend/core/conversation_manager.py`: Orchestrates AI-to-AI conversations
52
- - Async generator pattern for real-time message streaming
53
- - Proper conversation flow: surveyor → patient → surveyor
54
- - Termination conditions and error handling
55
-
56
- **Success**: `python scripts/run_conversation_demo.py` shows live conversations
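
The async-generator pattern above can be sketched in miniature: a generator that alternates surveyor and patient turns, yielding each message as it is produced. A stub stands in for the Ollama call, and the function names are illustrative rather than the project's actual API:

```python
import asyncio

async def fake_llm(role: str, turn: int) -> str:
    """Stand-in for a real model call (which would await Ollama)."""
    await asyncio.sleep(0)
    return f"{role} message for turn {turn}"

async def run_conversation(max_turns: int = 3):
    """Yield messages alternating surveyor -> patient until max_turns."""
    for turn in range(1, max_turns + 1):
        for role in ("surveyor", "patient"):
            content = await fake_llm(role, turn)
            yield {"role": role, "turn": turn, "content": content}

async def main():
    messages = [msg async for msg in run_conversation()]
    for msg in messages:
        print(f"[turn {msg['turn']}] {msg['role']}: {msg['content']}")
    return messages

messages = asyncio.run(main())
```

Because each message is yielded as soon as it exists, a consumer (terminal display or WebSocket broadcaster) can stream turns in real time instead of waiting for the whole conversation.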
57
-
58
- ### **Step 2: WebSocket Conversation Bridge** ✅ (2025-09-18)
59
- **Goal**: Stream conversations to web clients in real-time
60
-
61
- **Key Implementation**:
62
- - ConversationService connects ConversationManager to WebSocket system
63
- - REST API endpoints for conversation control
64
- - Message broadcasting to all connected clients
65
- - Start/stop conversation protocol via WebSocket
66
-
67
- **Success**: 3-terminal pipeline (Ollama + FastAPI + WebSocket test) working
68
-
69
- ### **Step 3: Gradio Chat Interface** ✅ (2025-09-18)
70
- **Goal**: Visual chat display with reliable WebSocket connectivity
71
-
72
- **Key Challenge**: Async/sync conflicts caused immediate WebSocket disconnections
73
-
74
- **Solution Evolution**:
75
- 1. **First Attempt**: Direct WebSocket in Gradio → Failed (JSON schema errors)
76
- 2. **Second Attempt**: Simplified approach → Failed (connection drops)
77
- 3. **Final Solution**: Complete architectural redesign with background threads
78
-
79
- **Breakthrough**: WebSocketManager with dedicated event loop in background thread
80
-
81
- **Success**: Real-time AI conversations display in browser with reliable connectivity
82
-
83
- ---
84
-
85
- ## 🔧 **Technical Implementation Details**
86
-
87
- ### **WebSocketManager Architecture**
88
-
89
- ```python
90
- class WebSocketManager:
91
- def __init__(self, url: str, conversation_id: str):
92
- # Thread-safe message queues
93
- self.outbound_queue = queue.Queue() # Messages to send
94
- self.inbound_queue = queue.Queue() # Received messages
95
-
96
- def _run_websocket(self):
97
- """Run WebSocket in background thread with dedicated event loop."""
98
- self.loop = asyncio.new_event_loop()
99
- asyncio.set_event_loop(self.loop)
100
- self.loop.run_until_complete(self._websocket_main())
101
- ```
102
-
103
- **Key Features**:
104
- - Dedicated event loop in background thread
105
- - Thread-safe queues for sync/async boundary
106
- - Automatic reconnection with exponential backoff
107
- - State management (STOPPED, STARTING, CONNECTED, etc.)
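
The core of this design can be shown in a minimal, runnable sketch: an asyncio loop lives in a background thread, and plain `queue.Queue` instances carry messages across the sync/async boundary. An echo coroutine stands in for the real WebSocket traffic:

```python
import asyncio
import queue
import threading

outbound: "queue.Queue[str]" = queue.Queue()  # sync side -> async side
inbound: "queue.Queue[str]" = queue.Queue()   # async side -> sync side

async def echo_worker():
    """Stand-in for the WebSocket coroutine: echo outbound messages back."""
    loop = asyncio.get_running_loop()
    while True:
        # Blocking Queue.get runs in an executor so the loop stays responsive.
        msg = await loop.run_in_executor(None, outbound.get)
        if msg == "__stop__":
            break
        inbound.put(f"echo: {msg}")

def run_background_loop():
    loop = asyncio.new_event_loop()  # dedicated event loop for this thread
    asyncio.set_event_loop(loop)
    loop.run_until_complete(echo_worker())
    loop.close()

thread = threading.Thread(target=run_background_loop, daemon=True)
thread.start()

outbound.put("hello")            # sync side: no await needed
reply = inbound.get(timeout=5)   # sync side: blocks until the async side replies
outbound.put("__stop__")
thread.join(timeout=5)
print(reply)
```

This mirrors the production manager's shape: Gradio callbacks only ever touch the queues, while everything async stays inside the background thread.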
108
-
109
- ### **Message Flow Protocol**
110
-
111
- 1. **Start Conversation**:
112
- ```json
113
- {
114
- "type": "start_conversation",
115
- "content": "start",
116
- "surveyor_persona_id": "friendly_researcher_001",
117
- "patient_persona_id": "cooperative_senior_001"
118
- }
119
- ```
120
-
121
- 2. **Conversation Message**:
122
- ```json
123
- {
124
- "type": "conversation_message",
125
- "role": "surveyor|patient",
126
- "content": "message content",
127
- "persona": "persona name",
128
- "turn": 1
129
- }
130
- ```
131
-
132
- 3. **Status Updates**:
133
- ```json
134
- {
135
- "type": "conversation_status",
136
- "status": "starting|running|completed"
137
- }
138
- ```
139
-
140
- ### **Critical Bug Fixes Implemented**
141
-
142
- 1. **"Set changed size during iteration"** - WebSocket connection manager
143
- - Fixed by creating copy of connections set before iteration
144
-
145
- 2. **Async/Sync Boundary Conflicts** - Gradio + WebSocket
146
- - Solved with background thread architecture
147
-
148
- 3. **Persona ID Mismatches** - Frontend/Backend coordination
149
- - Standardized on: "friendly_researcher_001", "cooperative_senior_001"
150
-
151
- ---
152
-
153
- ## 📁 **Final File Structure**
154
-
155
- ### **Frontend Files**
156
- ```
157
- frontend/
158
- ├── gradio_app.py # Main Gradio application
159
- ├── websocket_manager.py # Thread-safe WebSocket client
160
- └── __pycache__/ # Python cache
161
- ```
162
-
163
- ### **Backend Files**
164
- ```
165
- backend/
166
- ├── api/
167
- │ ├── main.py # FastAPI app with WebSocket endpoint
168
- │ ├── routes/conversations.py # REST API endpoints
169
- │ ├── services/conversation_service.py # Conversation management service
170
- │ └── websockets/conversation_ws.py # WebSocket connection handling
171
- └── core/
172
- ├── conversation_manager.py # AI-to-AI conversation orchestration
173
- ├── llm_client.py # Ollama integration
174
- └── persona_system.py # Persona loading and management
175
- ```
176
-
177
- ### **Test Files**
178
- ```
179
- scripts/
180
- ├── test_websocket.py # Basic WebSocket functionality test
181
- ├── test_integration.py # Foundation component tests (7/7)
182
- └── run_conversation_demo.py # Terminal conversation demo
183
- ```
184
-
185
- ---
186
-
187
- ## 🚀 **Deployment & Usage**
188
-
189
- ### **Current Working Demo**
190
- ```bash
191
- # Terminal 1: Start Ollama
192
- ollama serve
193
-
194
- # Terminal 2: Start FastAPI backend
195
- cd backend && uvicorn api.main:app --reload --host 0.0.0.0 --port 8000
196
-
197
- # Terminal 3: Launch Gradio frontend
198
- python frontend/gradio_app.py
199
- ```
200
-
201
- **Result**:
202
- - Browser opens to `http://localhost:7860`
203
- - Click "Connect to Backend" → "Start Conversation"
204
- - Real-time AI-to-AI conversation streams live
205
- - Click "Refresh Messages" to see new responses
206
-
207
- ### **WebSocket Test**
208
- ```bash
209
- python scripts/test_websocket.py
210
- ```
211
- **Expected**: All WebSocket functionality tests pass
212
-
213
- ---
214
-
215
- ## 🎯 **What This Foundation Enables**
216
-
217
- This completed WebSocket architecture provides the foundation for:
218
-
219
- 1. **Real-time conversation streaming** - Messages appear instantly in browser
220
- 2. **Reliable connectivity** - Automatic reconnection, error handling
221
- 3. **Scalable architecture** - Multiple clients can connect to same conversation
222
- 4. **Future UI development** - Solid backend for advanced frontend features
223
-
224
- ---
225
-
226
- ## 📝 **Key Lessons & Design Decisions**
227
-
228
- ### **Framework Choice: Gradio vs Streamlit**
229
- **Decision**: Gradio
230
- **Reasoning**:
231
- - Native chat components (`gr.Chatbot()`)
232
- - Better WebSocket integration
233
- - More suitable for real-time applications
234
-
235
- ### **WebSocket Architecture: Direct vs Background Thread**
236
- **Decision**: Background thread with message queues
237
- **Reasoning**:
238
- - Eliminates async/sync conflicts completely
239
- - Provides reliable, persistent connections
240
- - Allows Gradio to remain fully synchronous
241
-
242
- ### **Deployment Strategy: Local + ngrok**
243
- **Decision**: Local development with ngrok tunneling for team access
244
- **Reasoning**:
245
- - Leverages full local GPU power
246
- - Zero hosting costs during development
247
- - Instant team access when needed
248
-
249
- ---
250
-
251
- ## 🔍 **Architecture Trade-offs & Implications**
252
-
253
- ### **What We Preserved**
254
- - **Full WebSocket async capabilities**: All async WebSocket features remain available
255
- - **Gradio simplicity**: No async contamination in UI code
256
- - **Real-time performance**: Minimal latency impact (queue operations ~microseconds)
257
-
258
- ### **Limitations Introduced**
259
- 1. **Message Buffering**: Messages pass through queues instead of direct handling
260
- 2. **Thread Overhead**: Additional background thread and event loop (minimal resource impact)
261
- 3. **Complexity**: More complex than direct async integration (but necessary for Gradio compatibility)
262
-
263
- ### **Performance Impact Assessment**
264
- - **Latency**: Negligible for AI conversations (queue ~μs, AI responses ~seconds)
265
- - **Memory**: Bounded by `max_messages = 100` (~1MB maximum)
266
- - **Reliability**: Major improvement (eliminated connection drops)
267
-
268
- ### **User Experience Impact**
269
- - **✅ Positive**: Reliable, persistent connections
270
- - **✅ Neutral**: No perceptible delay in conversation flow
271
- - **❌ None**: No negative UX impacts identified
272
-
273
- ---
274
-
275
- ## ⚠️ **Important Notes for Future Development**
276
-
277
- 1. **Do not modify WebSocketManager**: This architecture solved critical async/sync conflicts
278
- 2. **WebSocket stays fully async**: Never attempt to make WebSocket synchronous
279
- 3. **Background thread is essential**: Direct WebSocket in Gradio main thread will fail
280
- 4. **Message queues must remain thread-safe**: Any modifications must preserve thread safety
281
- 5. **Consider implications**: New features should work within queue-based message flow
282
-
283
- ---
284
-
285
- **This architecture is COMPLETE and STABLE. The trade-offs are acceptable for our use case and no significant limitations were introduced. Use as reference for building additional features on top.**
PROJECT_STATE.md DELETED
@@ -1,246 +0,0 @@
1
- # 🚦 AI Survey Simulator - Current Project State
2
-
3
- > **Single Source of Truth**: This file tracks all current progress, next steps, and recent changes.
4
- > Update THIS file when making progress - no other documentation needs updates.
5
-
6
- **Last Updated**: 2025-09-18
7
- **Current Phase**: Local Development - Web UI Feature Development
8
- **Overall Status**: 🟢 **Step 3 Complete - Ready for Step 4 (Persona Selection)**
9
-
10
- ---
11
-
12
- ## 🚀 **QUICK DEMO** - See Current Capabilities
13
-
14
- **What works RIGHT NOW**: Full web-based AI-to-AI conversation interface with real-time streaming
15
-
16
- **How to test** (3 terminals required):
17
- ```bash
18
- # Terminal 1: Start Ollama
19
- ollama serve
20
-
21
- # Terminal 2: Start backend API (from backend/ directory)
22
- cd backend && uvicorn api.main:app --reload --host 0.0.0.0 --port 8000
23
-
24
- # Terminal 3: Launch web interface
25
- python frontend/gradio_app.py
26
- ```
27
-
28
- **Expected result**: Browser opens to localhost:7860 with working Gradio interface. Click "Connect to Backend" → "Start Conversation" to see live AI-to-AI conversation streaming in real-time.
29
-
30
- ---
31
-
32
- ## 🎯 **Target Goal & Deployment Strategy**
33
-
34
- ### **Final Application Vision:**
35
- **Local-First Web-Based Conversation Simulator** with:
36
- - Persona selection interface (surveyor + patient)
37
- - System prompt editor for surveyors
38
- - "Simulate Conversation" button
39
- - Live chat display (chat bubbles, proper positioning)
40
- - Real-time AI-to-AI conversation visualization
41
-
42
- ### **Deployment Architecture Decision:**
43
- **📍 Local Development → ngrok Tunneling for Team Access**
44
-
45
- **Development Approach:**
46
- - Build complete application locally (`localhost:7860`)
47
- - Test and refine using local GPU resources
48
- - Deploy to research team via **ngrok tunneling** when ready
49
-
50
- **Why This Architecture:**
51
- - ✅ **Full GPU Power**: Leverage local compute resources
52
- - ✅ **Real-time Performance**: No cloud latency limitations
53
- - ✅ **Zero Hosting Costs**: No monthly cloud bills
54
- - ✅ **Team Access**: ngrok provides public URLs when needed
55
- - ✅ **Development Control**: Start/stop access on demand
56
-
57
- **Post-Development Deployment:**
58
- 1. **Research Team Access**: `ngrok http 7860` → share temporary URL
59
- 2. **Future Options**: Upgrade to ngrok Pro for permanent domain if needed
60
- 3. **Alternative**: Can later deploy to HuggingFace Spaces for portfolio if desired
61
-
62
- ---
63
-
64
- ## ✅ **Foundation Status**
65
-
66
- **WebSocket Architecture Complete** ✅ (See `AXIOM_WEBSOCKET_ARCHITECTURE.md` for details)
67
- - Real-time AI-to-AI conversation streaming
68
- - Thread-safe WebSocket manager (async/sync conflicts resolved)
69
- - Working Gradio frontend with live message display
70
- - Complete backend conversation management
71
-
72
- **Ready for Features**: Core conversation system operational, ready for UI enhancements
73
-
74
- ## 🚀 **Web UI Implementation Roadmap**
75
-
76
- ### **Step 1: Core Conversation Engine** ✅ **COMPLETE** (2025-09-16)
77
- **Goal**: Wire working components into conversation loop
78
- - ✅ Working LLM+Persona components (already tested)
79
- - ✅ Implemented `conversation_manager.py` orchestration
80
- - ✅ Created conversation loop (surveyor → patient → surveyor)
81
- - ✅ Added termination conditions
82
- - ✅ Tested terminal AI-to-AI conversations
83
-
84
- **Success Criteria**: ✅ `python scripts/run_conversation_demo.py` shows live back-and-forth
85
-
86
- ### **Step 2: WebSocket Conversation Bridge** ✅ **COMPLETE** (2025-09-18)
87
- **Goal**: Stream conversations to web clients in real-time
88
- - ✅ WebSocket system infrastructure (already built)
89
- - ✅ Connected conversation engine to WebSocket system
90
- - ✅ Implemented message broadcasting to clients
91
- - ✅ Added conversation state management (start/stop/pause)
92
- - ✅ Created ConversationService for managing active conversations
93
- - ✅ Added REST API endpoints for conversation control
94
- - ✅ Updated WebSocket client to be Gradio-compatible
95
-
96
- **Success Criteria**: ✅ Conversation events stream to browser via WebSocket
97
-
98
- ### **Step 3: Gradio Chat Interface** ✅ **COMPLETE** (2025-09-18)
99
- **Goal**: Visual chat display with reliable WebSocket connectivity
100
- - ✅ Replaced Streamlit with working Gradio frontend
101
- - ✅ Solved critical async/sync conflicts through architectural redesign
102
- - ✅ Implemented thread-safe WebSocket manager with background threads
103
- - ✅ Real-time message streaming operational
104
- - ✅ Removed Streamlit entirely and consolidated frontend files
105
-
106
- **Success Criteria**: ✅ Live conversation displays in browser with reliable connectivity
107
-
108
- ### **Step 4: Persona Selection & Management** (1-2 days)
109
- **Goal**: Interactive persona choosing and switching
110
- - ✅ 5 personas already defined and working
111
- - 🎯 Add persona dropdown/selection interface
112
- - 🎯 Implement persona switching mid-conversation
113
- - 🎯 Display current active personas clearly
114
- - 🎯 Add persona preview (name, description)
115
-
116
- **Success Criteria**: Can select and switch personas from UI
117
-
118
- **Current Status**: 🎯 **NEXT PRIORITY**
119
-
120
- ### **Step 5: System Prompt Editor** (1-2 days)
121
- **Goal**: Dynamic prompt customization interface
122
- - 🎯 Build prompt editing interface for surveyors
123
- - 🎯 Add prompt templates and presets
124
- - 🎯 Implement live prompt updates without restart
125
- - 🎯 Add prompt validation and preview
126
-
127
- **Success Criteria**: Can edit surveyor prompts and see immediate effect
128
-
129
- ### **Step 6: Conversation Controls & Polish** (1-2 days)
130
- **Goal**: Complete conversation management and prepare for team access
131
- - 🎯 Add start/stop/pause/reset conversation controls
132
- - 🎯 Implement conversation history/logging
133
- - 🎯 Add export functionality (save conversations)
134
- - 🎯 Polish UI styling and user experience
135
- - 🎯 **Prepare ngrok deployment guide for research team access**
136
-
137
- **Success Criteria**: Full-featured conversation simulator ready for local demo and team deployment via ngrok
138
-
139
- ## ⏱️ **Timeline Estimate**: 3-4 days remaining (ahead of schedule!)
140
- **Original**: 8-12 days total | **Actual Progress**: Steps 1-3 completed (foundation + working web interface)
141
- **Week 1**: ✅ Steps 1-3 COMPLETE (core functionality + working web UI)
142
- **Week 2**: Steps 4-6 (persona selection + prompt editing + polish)
143
- **Major Breakthrough**: Solved WebSocket async/sync conflicts - reliable real-time streaming achieved
144
-
145
- ## 🎯 **Current Priority: Step 4**
146
- **Next Action**: Add persona selection interface to working Gradio frontend
147
-
148
- ---
149
-
150
- ## 🔧 **Quick Start Commands**
151
-
152
- ### **Run Current Functionality**
153
- ```bash
154
- # Terminal 1: Start Ollama
155
- ollama serve
156
-
157
- # Terminal 2: Start FastAPI backend (NEW!)
158
- cd backend && uvicorn api.main:app --reload --host 0.0.0.0 --port 8000
159
-
160
- # Terminal 3: Launch web interface (NEW!)
161
- python frontend/gradio_app.py
162
- # Opens browser to localhost:7860 with full web UI
163
-
164
- # Alternative: Test terminal demo
165
- python scripts/run_conversation_demo.py
166
-
167
- # Verify foundation components still work
168
- python scripts/test_integration.py # Should show 7/7 tests pass
169
- ```
170
-
171
- ### **Development Environment**
172
- ```bash
173
- # Activate environment
174
- conda activate converai
175
- ```
176
-
177
- ### **Future Team Deployment** (After Step 6 Complete)
178
- ```bash
179
- # Install ngrok (one-time setup)
180
- # Download from: https://ngrok.com/download
181
-
182
- # When ready to share with research team:
183
- ngrok http 7860
184
-
185
- # Share the generated URL (e.g., https://abc123.ngrok.io) with team
186
- # They can access the full application remotely using your local GPU
187
- ```
188
-
189
- ---
190
-
191
- ## 📝 **Recent Changes Log**
192
-
193
- ### **2025-09-18 - Step 3 Complete: Gradio Web Interface & Architecture Consolidation**
194
- - ✅ **WebSocket Architecture Breakthrough**: Solved critical async/sync conflicts through complete redesign
195
- - ✅ **Thread-Safe WebSocket Manager**: Created background thread architecture with message queues
196
- - ✅ **Working Gradio Frontend**: `frontend/gradio_app.py` with real-time conversation streaming
197
- - ✅ **Complete Streamlit Divorce**: Removed all Streamlit dependencies and files
198
- - ✅ **File Consolidation**: Cleaned up deprecated files, single canonical implementation
199
- - ✅ **CORS Cleanup**: Removed Streamlit origins from backend configuration
200
- - ✅ **Foundation Documentation**: Moved completed Steps 1-3 to `AXIOM_WEBSOCKET_ARCHITECTURE.md`
201
- - **Key Files Created/Modified**:
202
- - `frontend/websocket_manager.py` (new - thread-safe WebSocket client)
203
- - `frontend/gradio_app.py` (working web interface)
204
- - `backend/api/main.py` (CORS cleanup)
205
- - `AXIOM_WEBSOCKET_ARCHITECTURE.md` (complete foundation documentation)
206
-
207
- **Major Achievement**: Real-time AI-to-AI conversations now work reliably in web browser
208
-
209
- ### **Earlier History**
210
- See `AXIOM_IMPLEMENTATION_HISTORY.md` for foundation implementation details.
211
-
212
- ---
213
-
214
- ## 🐛 **Current Issues & Blockers**
215
-
216
- **None** - Foundation tested and working, ready for web UI development.
217
-
218
- ---
219
-
220
- ## 🔄 **For Next Development Session**
221
-
222
- ### **Start Here**: Step 4 - Persona Selection Interface
223
- ### **Key Files to Work On**:
224
- - `frontend/gradio_app.py` (add persona selection dropdowns)
225
- - `backend/core/persona_system.py` (already working - reference for available personas)
226
- - `backend/api/routes/conversations.py` (may need persona switching endpoints)
227
-
228
- ### **Context Loading**:
229
- ```bash
230
- # Load current roadmap
231
- @PROJECT_STATE.md
232
-
233
- # Load working web interface (foundation)
234
- @frontend/gradio_app.py
235
- @frontend/websocket_manager.py
236
-
237
- # Reference persona system
238
- @backend/core/persona_system.py
239
-
240
- # Reference complete WebSocket architecture
241
- @AXIOM_WEBSOCKET_ARCHITECTURE.md
242
- ```
243
-
244
- ---
245
-
246
- **Remember**: This is the ONLY evolving file. Update progress here as you complete each step.
README.md ADDED
@@ -0,0 +1,156 @@
+ # AI Survey Simulator (Local Guide)
+
+ Welcome! This guide walks you through running the AI Survey Simulator locally so you can evaluate the interviewer/patient conversation flow without digging into the implementation details.
+
+ If you are looking for architecture deep dives or change history, head to `docs/` where all developer-facing material now lives.
+
+ ---
+
+ ## What You Get
+
+ - A Gradio web interface to monitor and control AI-to-AI healthcare survey conversations
+ - A FastAPI backend that orchestrates personas, manages the conversation state, and serves WebSocket updates
+ - Out-of-the-box personas for a surveyor and multiple patient profiles stored in `data/`
+
+ ---
+
+ ## Prerequisites
+
+ 1. **Python 3.9+** installed (`python --version`)
+ 2. **Pip** available (`pip --version`)
+ 3. **Ollama** running locally with an accessible model (e.g., `llama3.2:latest`), since the simulator calls the LLM through Ollama by default
+    - Install instructions: <https://ollama.ai>
+    - Pull a model: `ollama pull llama3.2:latest`
+    - Verify: `ollama list`
+
+ > ℹ️ We are actively planning support for hosted LLM providers. When that lands you will be able to configure the app via environment variables instead of relying on a local Ollama instance.
+
+ ---
+
+ ## 1. Configure Environment Variables
+
+ Duplicate the sample configuration and adjust if necessary:
+
+ ```bash
+ cp .env.example .env
+ ```
+
+ Key values:
+
+ - `LLM_HOST` / `LLM_MODEL` — where the backend reaches your Ollama model
+ - `FRONTEND_BACKEND_BASE_URL` — the FastAPI base URL Gradio should call
+ - `FRONTEND_WEBSOCKET_URL` — the WebSocket endpoint prefix (without conversation id)
+
+ You can accept the defaults for a local run. Update them later if you move the backend or change models.
+
+ ---
+
+ ## 2. Set Up the Python Environment
+
+ ```bash
+ git clone <repository-url>
+ cd ConversationAI
+
+ # (optional but recommended)
+ python -m venv .venv
+ source .venv/bin/activate  # Windows: .venv\Scripts\activate
+
+ pip install -r requirements.txt
+ ```
+
+ ---
+
+ ## 3. Start the Backend (FastAPI)
+
+ ```bash
+ cd backend
+ uvicorn api.main:app --reload --host 0.0.0.0 --port 8000
+ ```
+
+ Keep this terminal running. The backend exposes REST endpoints under `http://localhost:8000` and a WebSocket endpoint the UI listens to.
+
+ ---
+
+ ## 4. Launch the Gradio Frontend
+
+ In a new terminal (activate the virtual environment again if you created one):
+
+ ```bash
+ cd frontend
+ python gradio_app.py
+ ```
+
+ Gradio starts on <http://localhost:7860>. Open that page in your browser.
+
+ ---
+
+ ## 5. Run a Conversation
+
+ 1. Click **“Start Conversation”** — the app will connect to the backend automatically and begin the AI interview flow. Messages appear in the “Live AI Conversation” panel.
+ 2. Watch the conversation update automatically (the UI polls once per second).
+ 3. Click **“Stop Conversation”** when you are done.
+
+ If the backend or Ollama becomes unreachable, the status box will show an error message so you know where to look first.
+
+ ---
+
+ ## Personas
+
+ - Surveyor profiles live in `data/surveyor_personas.yaml`
+ - Patient profiles live in `data/patient_personas.yaml`
+
+ To tweak a persona for experimentation:
+
+ 1. Edit the YAML entry (name, tone, system prompt, etc.)
+ 2. Restart the backend so it reloads the definitions
+
+ We are working on UI controls to swap personas without editing files—stay tuned.
+
+ ---
+
+ ## Advanced Configuration
+
+ - Change the log verbosity by setting `LOG_LEVEL=DEBUG` in `.env`
+ - Point to a different LLM host/model using the `LLM_*` variables
+ - When we introduce hosted-model support, the same `.env` file will control which backend is used without code edits
+
+ ---
+
+ ## Quick Start Script
+
+ Prefer a single command? Run:
+
+ ```bash
+ ./run_local.sh
+ ```
+
+ The script will:
+
+ - Load environment variables from `.env`
+ - Start `ollama serve` (if it is not already running)
+ - Launch the FastAPI backend and Gradio frontend in the background
+
+ Press `Ctrl+C` in that terminal to shut everything down cleanly.
+ If you want to watch live logs, run the backend/frontend commands manually in separate terminals instead of using this helper.
+
+ ---
+
+ ## Helpful Scripts
+
+ - `run_local.sh` — start/stop the full local stack with one command
+ - `dev_setup.sh` — (planned) install dependencies and verify prerequisites
+ - Smoke tests with mocked LLM responses — (planned)
+
+ ---
+
+ ## Need Implementation Details?
+
+ For deeper implementation notes, visit the developer docs:
+
+ - `docs/overview.md`
+ - `docs/development.md`
+ - `docs/roadmap.md`
+
+ ---
+
+ Happy testing! If you run into issues, capture the console output from both backend and frontend terminals—it usually reveals configuration or network problems quickly.
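The environment variables the new README names can be collected into a minimal `.env`; the values below are the illustrative local defaults implied by the guide, not a canonical file:

```bash
# .env — illustrative local defaults (adjust as needed)
LLM_HOST=http://localhost:11434
LLM_MODEL=llama3.2:latest
FRONTEND_BACKEND_BASE_URL=http://localhost:8000
FRONTEND_WEBSOCKET_URL=ws://localhost:8000/ws/conversation
LOG_LEVEL=INFO
```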
backend/api/{services/conversation_service.py → conversation_service.py} RENAMED
@@ -19,18 +19,23 @@ Example:
 import asyncio
 import logging
 from datetime import datetime
-from typing import Dict, Optional, Set
+from typing import Dict, Optional
 from dataclasses import dataclass
 from enum import Enum
 import sys
 from pathlib import Path
 
-# Add backend to path for imports
-sys.path.insert(0, str(Path(__file__).parent.parent.parent))
+# Add backend and project root to path for imports
+BACKEND_DIR = Path(__file__).resolve().parents[2]
+PROJECT_ROOT = Path(__file__).resolve().parents[3]
+for path in (BACKEND_DIR, PROJECT_ROOT):
+    if str(path) not in sys.path:
+        sys.path.insert(0, str(path))
 
-from core.conversation_manager import ConversationManager, ConversationState
-from core.persona_system import PersonaSystem
-from api.websockets.conversation_ws import ConnectionManager
+from config.settings import AppSettings, get_settings  # noqa: E402
+from core.conversation_manager import ConversationManager  # noqa: E402
+from core.persona_system import PersonaSystem  # noqa: E402
+from .conversation_ws import ConnectionManager  # noqa: E402
 
 # Setup logging
 logger = logging.getLogger(__name__)
@@ -68,24 +73,27 @@ class ConversationService:
         websocket_manager: WebSocket connection manager for broadcasting
         persona_system: Persona system for loading personas
         active_conversations: Dict of active conversation instances
+        settings: Shared application settings
     """
 
-    def __init__(self, websocket_manager: ConnectionManager):
+    def __init__(self, websocket_manager: ConnectionManager, settings: Optional[AppSettings] = None):
        """Initialize conversation service.
 
        Args:
            websocket_manager: WebSocket manager for message broadcasting
+           settings: Shared application settings (optional)
        """
        self.websocket_manager = websocket_manager
        self.persona_system = PersonaSystem()
        self.active_conversations: Dict[str, ConversationInfo] = {}
+       self.settings = settings or get_settings()
 
    async def start_conversation(self,
                                 conversation_id: str,
                                 surveyor_persona_id: str,
                                 patient_persona_id: str,
-                                host: str = "http://localhost:11434",
-                                model: str = "llama2:7b") -> bool:
+                                host: Optional[str] = None,
+                                model: Optional[str] = None) -> bool:
        """Start a new AI-to-AI conversation.
 
        Args:
@@ -114,6 +122,10 @@ class ConversationService:
            await self._send_error(conversation_id, "Invalid persona IDs")
            return False
 
+       # Resolve LLM configuration
+       resolved_host = host or self.settings.llm.host
+       resolved_model = model or self.settings.llm.model
+
        # Create conversation info
        conv_info = ConversationInfo(
            conversation_id=conversation_id,
@@ -132,8 +144,8 @@ class ConversationService:
        manager = ConversationManager(
            surveyor_persona=surveyor_persona,
            patient_persona=patient_persona,
-           host=host,
-           model=model
+           host=resolved_host,
+           model=resolved_model
        )
 
        # Start conversation streaming task
@@ -346,12 +358,13 @@ def get_conversation_service() -> ConversationService:
    return conversation_service
 
 
-def initialize_conversation_service(websocket_manager: ConnectionManager):
+def initialize_conversation_service(websocket_manager: ConnectionManager, settings: Optional[AppSettings] = None):
    """Initialize the global conversation service.
 
    Args:
        websocket_manager: WebSocket connection manager
+       settings: Shared application settings (optional)
    """
    global conversation_service
-   conversation_service = ConversationService(websocket_manager)
-   logger.info("ConversationService initialized")
+   conversation_service = ConversationService(websocket_manager, settings=settings)
+   logger.info("ConversationService initialized")
backend/api/{websockets/conversation_ws.py → conversation_ws.py} RENAMED
@@ -238,7 +238,7 @@ async def handle_conversation_control(data: dict, conversation_id: str):
 
    try:
        # Import here to avoid circular imports
-       from ..services.conversation_service import get_conversation_service
+       from .conversation_service import get_conversation_service
        service = get_conversation_service()
 
        if control_action == "stop":
@@ -295,14 +295,14 @@ async def handle_start_conversation(data: dict, conversation_id: str):
    """
    try:
        # Import here to avoid circular imports
-       from ..services.conversation_service import get_conversation_service
+       from .conversation_service import get_conversation_service
        service = get_conversation_service()
 
        # Extract required fields
        surveyor_persona_id = data.get("surveyor_persona_id")
        patient_persona_id = data.get("patient_persona_id")
-       host = data.get("host", "http://localhost:11434")
-       model = data.get("model", "llama2:7b")
+       host = data.get("host")
+       model = data.get("model")
 
        if not surveyor_persona_id or not patient_persona_id:
            await manager.send_to_conversation(conversation_id, {
@@ -340,4 +340,4 @@ async def handle_start_conversation(data: dict, conversation_id: str):
 
 
 # Export the manager for use in other modules
-__all__ = ["websocket_endpoint", "manager"]
+__all__ = ["websocket_endpoint", "manager"]
backend/api/main.py CHANGED
@@ -11,18 +11,32 @@ Typical usage:
    uvicorn api.main:app --reload --host 0.0.0.0 --port 8000
 """
 
+import logging
+import sys
+from pathlib import Path
+
 from fastapi import FastAPI, WebSocket
 from fastapi.middleware.cors import CORSMiddleware
 import uvicorn
-import logging
+
+# Ensure project root is available for shared config imports
+ROOT_DIR = Path(__file__).resolve().parents[2]
+if str(ROOT_DIR) not in sys.path:
+    sys.path.insert(0, str(ROOT_DIR))
+
+from config.settings import get_settings  # noqa: E402
 
 # Import WebSocket endpoint and manager
-from .websockets.conversation_ws import websocket_endpoint, manager
-from .routes.conversations import router as conversations_router
-from .services.conversation_service import initialize_conversation_service
+from .conversation_ws import websocket_endpoint, manager  # noqa: E402
+from .routes import router as conversations_router  # noqa: E402
+from .conversation_service import initialize_conversation_service  # noqa: E402
+
+# Load application settings
+settings = get_settings()
 
-# Setup logging
-logging.basicConfig(level=logging.INFO)
+# Setup logging using configured level
+log_level = getattr(logging, settings.log_level.upper(), logging.INFO)
+logging.basicConfig(level=log_level)
 logger = logging.getLogger(__name__)
 
 # Initialize FastAPI app
@@ -53,8 +67,8 @@ async def startup_event():
    """Initialize services on startup."""
    logger.info("Initializing AI Survey Simulator API...")
 
-   # Initialize conversation service with WebSocket manager
-   initialize_conversation_service(manager)
+   # Initialize conversation service with WebSocket manager and settings
+   initialize_conversation_service(manager, settings)
 
    logger.info("API startup complete")
 
@@ -88,4 +102,4 @@ async def websocket_conversation_endpoint(websocket: WebSocket, conversation_id:
 
 
 if __name__ == "__main__":
-   uvicorn.run(app, host="0.0.0.0", port=8000)
+   uvicorn.run(app, host=settings.api.host, port=settings.api.port)
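The log-level resolution added to `main.py` tolerates misconfigured values by falling back to `INFO` rather than crashing at startup. In isolation the pattern looks like this (function name is illustrative):

```python
import logging

def resolve_level(name: str) -> int:
    # Unknown names (e.g. a typo in .env) fall back to INFO instead of raising
    return getattr(logging, name.upper(), logging.INFO)

print(resolve_level("debug"))     # 10 (DEBUG)
print(resolve_level("nonsense"))  # 20 (INFO fallback)
```

`getattr` with a default is the key: `logging.DEBUG` exists as an attribute, while `logging.NONSENSE` does not, so the third argument is returned instead.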
backend/api/{routes/conversations.py → routes.py} RENAMED
@@ -23,13 +23,7 @@ from fastapi import APIRouter, HTTPException, BackgroundTasks
 from pydantic import BaseModel, Field
 from typing import Dict, List, Optional
 import logging
-import sys
-from pathlib import Path
-
-# Add backend to path for imports
-sys.path.insert(0, str(Path(__file__).parent.parent.parent))
-
-from api.services.conversation_service import get_conversation_service
+from .conversation_service import get_conversation_service
 from core.persona_system import PersonaSystem
 
 # Setup logging
@@ -45,8 +39,8 @@ class StartConversationRequest(BaseModel):
    conversation_id: str = Field(..., description="Unique identifier for the conversation")
    surveyor_persona_id: str = Field(..., description="ID of the surveyor persona")
    patient_persona_id: str = Field(..., description="ID of the patient persona")
-   host: str = Field(default="http://localhost:11434", description="Ollama server host")
-   model: str = Field(default="llama2:7b", description="LLM model to use")
+   host: Optional[str] = Field(default=None, description="Override LLM host to use")
+   model: Optional[str] = Field(default=None, description="Override LLM model to use")
 
 
 class ConversationStatusResponse(BaseModel):
@@ -284,4 +278,4 @@ async def health_check() -> Dict[str, str]:
    return {
        "status": "unhealthy",
        "error": str(e)
-   }
+   }
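Under the revised `StartConversationRequest` schema, a client can simply omit `host` and `model` and let the service resolve them from settings. A sketch of a minimal request body (the persona IDs are illustrative, not actual IDs from `data/`):

```python
import json

# Minimal body for the start-conversation endpoint; field names follow
# StartConversationRequest. "host"/"model" are omitted on purpose: the
# backend now falls back to settings.llm.host / settings.llm.model.
payload = {
    "conversation_id": "demo-001",
    "surveyor_persona_id": "general_surveyor",   # illustrative ID
    "patient_persona_id": "anxious_patient",     # illustrative ID
}

body = json.dumps(payload)
decoded = json.loads(body)
print(decoded.get("host"), decoded.get("model"))  # None None
```

Because both override fields default to `None`, the same request works unchanged whether the deployment targets a local Ollama instance or a future hosted provider.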
backend/core/conversation_manager.py CHANGED
@@ -62,7 +62,7 @@ class ConversationManager:
                 surveyor_persona: dict = None,
                 patient_persona: dict = None,
                 host: str = "http://localhost:11434",
-                model: str = "llama2:7b"):
+                model: str = "llama3.2:latest"):
        """Initialize conversation manager with personas.
 
        Args:
backend/core/llm_client.py CHANGED
@@ -14,7 +14,7 @@ Classes:
    VLLMClient: Client for vLLM backend
 
 Example:
-   client = OllamaClient(host="http://localhost:11434", model="llama2:13b")
+   client = OllamaClient(host="http://localhost:11434", model="llama3.2:latest")
    response = await client.generate(prompt="Hello", system_prompt="You are helpful")
 """
 
@@ -182,7 +182,7 @@ class OllamaClient(LLMClient):
        Returns:
            Generated text response
        """
-       async def _make_request():
+       async def _make_chat_request():
            messages = []
            if system_prompt:
                messages.append({"role": "system", "content": system_prompt})
@@ -220,7 +220,38 @@ class OllamaClient(LLMClient):
 
            return data["message"]["content"]
 
-       return await self._retry_request(_make_request)
+       async def _make_generate_request():
+           payload = {
+               "model": self.model,
+               "prompt": prompt,
+               "stream": False,
+               **kwargs
+           }
+
+           # Preserve system prompt behavior by prepending when using generate endpoint
+           if system_prompt:
+               payload["system"] = system_prompt
+
+           response = await self.client.post(
+               f"{self.host}/api/generate",
+               json=payload
+           )
+           response.raise_for_status()
+           data = response.json()
+
+           # Track token usage if available (some versions omit this field)
+           if "prompt_eval_count" in data:
+               self.total_tokens += data.get("prompt_eval_count", 0) + data.get("eval_count", 0)
+
+           return data.get("response") or data.get("generated_text", "")
+
+       try:
+           return await self._retry_request(_make_chat_request)
+       except httpx.HTTPStatusError as exc:
+           if exc.response is not None and exc.response.status_code == 404:
+               logger.warning("Ollama /api/chat endpoint not found. Falling back to /api/generate")
+               return await self._retry_request(_make_generate_request)
+           raise
 
    async def health_check(self) -> Dict[str, Any]:
        """Check if Ollama server is healthy and list available models.
@@ -336,7 +367,7 @@ def create_llm_client_from_config(config_path: Optional[str] = None,
    config = {
        "backend": "ollama",
        "host": "http://localhost:11434",
-       "model": "llama2:7b",
+       "model": "llama3.2:latest",
        "timeout": 120,
        "max_retries": 3,
        "retry_delay": 1.0
@@ -432,4 +463,4 @@ async def test_llm_connection(client: LLMClient) -> Dict[str, Any]:
    results["error"] = str(e)
    logger.error(f"LLM connection test failed: {e}")
 
-   return results
+   return results
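The 404 fallback added to `OllamaClient.generate` is an instance of a general try-primary-then-legacy pattern; stripped of the HTTP details (names below are illustrative, not from the codebase) it reduces to:

```python
import asyncio

class NotFound(Exception):
    """Stands in for httpx.HTTPStatusError carrying a 404 response."""

async def call_with_fallback(primary, fallback):
    # Try the modern endpoint first; only "not found" triggers the legacy path,
    # so genuine server errors still propagate to the caller.
    try:
        return await primary()
    except NotFound:
        return await fallback()

async def chat():       # simulate a server without the primary endpoint
    raise NotFound()

async def generate():   # legacy endpoint succeeds
    return "ok"

result = asyncio.run(call_with_fallback(chat, generate))
print(result)  # ok
```

Note that the real implementation re-raises any non-404 `HTTPStatusError`, which this sketch mirrors by catching only the one exception type.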
backend/core/persona_system.py CHANGED
@@ -139,8 +139,8 @@ class PersonaSystem:
            personas_dir: Directory containing persona YAML files
        """
        if personas_dir is None:
-           # Default to data/personas directory
-           personas_dir = Path(__file__).parent.parent.parent / "data" / "personas"
+           # Default to data directory
+           personas_dir = Path(__file__).parent.parent.parent / "data"
 
        self.personas_dir = Path(personas_dir)
        self.personas: Dict[str, Dict[str, Any]] = {}
@@ -353,4 +353,4 @@ def get_persona_system() -> PersonaSystem:
    global _persona_system
    if _persona_system is None:
        _persona_system = PersonaSystem()
-   return _persona_system
+   return _persona_system
config/default_config.yaml CHANGED
@@ -29,7 +29,7 @@ llm:
 # Ollama Configuration
 ollama:
   host: "http://localhost:11434"
-  model: "llama2:13b"
+  model: "llama3.2:latest"
   timeout: 120  # seconds
 
 # vLLM Configuration
config/settings.py ADDED
@@ -0,0 +1,74 @@
+"""Centralized application settings loaded from environment variables.
+
+Uses pydantic-settings so both backend and frontend can share defaults and
+override them through a `.env` file or process environment variables.
+"""
+
+from functools import lru_cache
+from pydantic_settings import BaseSettings, SettingsConfigDict
+
+
+class APISettings(BaseSettings):
+    """Configuration for the FastAPI backend."""
+
+    host: str = "0.0.0.0"
+    port: int = 8000
+    log_level: str = "INFO"
+
+    model_config = SettingsConfigDict(
+        env_prefix="API_",
+        env_file=".env",
+        env_file_encoding="utf-8",
+        extra="ignore",
+    )
+
+
+class LLMSettings(BaseSettings):
+    """Configuration for the language model backend."""
+
+    backend: str = "ollama"
+    host: str = "http://localhost:11434"
+    model: str = "llama3.2:latest"
+    timeout: int = 120
+
+    model_config = SettingsConfigDict(
+        env_prefix="LLM_",
+        env_file=".env",
+        env_file_encoding="utf-8",
+        extra="ignore",
+    )
+
+
+class FrontendSettings(BaseSettings):
+    """Configuration for the Gradio frontend."""
+
+    backend_base_url: str = "http://localhost:8000"
+    websocket_url: str = "ws://localhost:8000/ws/conversation"
+
+    model_config = SettingsConfigDict(
+        env_prefix="FRONTEND_",
+        env_file=".env",
+        env_file_encoding="utf-8",
+        extra="ignore",
+    )
+
+
+class AppSettings(BaseSettings):
+    """Aggregate configuration exposed to the application."""
+
+    api: APISettings = APISettings()
+    llm: LLMSettings = LLMSettings()
+    frontend: FrontendSettings = FrontendSettings()
+    log_level: str = "INFO"
+
+    model_config = SettingsConfigDict(
+        env_file=".env",
+        env_file_encoding="utf-8",
+        extra="ignore",
+    )
+
+
+@lru_cache
+def get_settings() -> AppSettings:
+    """Return the singleton settings instance."""
+    return AppSettings()
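For readers unfamiliar with `pydantic-settings`, the `env_prefix` mechanism in the new `config/settings.py` behaves roughly like this stdlib-only sketch (class and function names here are illustrative, not part of the codebase):

```python
import os
from dataclasses import dataclass

@dataclass
class LLMConfig:
    # Defaults mirror config/settings.py; LLM_* env vars override them
    host: str = "http://localhost:11434"
    model: str = "llama3.2:latest"

def load_llm_config() -> LLMConfig:
    """Read LLM_HOST / LLM_MODEL from the environment, falling back to defaults."""
    cfg = LLMConfig()
    cfg.host = os.environ.get("LLM_HOST", cfg.host)
    cfg.model = os.environ.get("LLM_MODEL", cfg.model)
    return cfg

os.environ.pop("LLM_HOST", None)            # ensure the default applies
os.environ["LLM_MODEL"] = "mistral:latest"  # simulate a .env override
cfg = load_llm_config()
print(cfg.model)  # mistral:latest
print(cfg.host)   # http://localhost:11434
```

The real library adds `.env` file parsing, type coercion, and validation on top of this, and `@lru_cache` on `get_settings()` makes the parsed settings a process-wide singleton.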
data/{personas/patient_personas.yaml → patient_personas.yaml} RENAMED
File without changes
data/{personas/surveyor_personas.yaml → surveyor_personas.yaml} RENAMED
File without changes
docs/README.md ADDED
@@ -0,0 +1,9 @@
+## Developer Documentation Index
+
+These short guides are all you need to extend the AI Survey Simulator:
+
+- `overview.md` — architecture summary, major components, and repository map.
+- `development.md` — setup, runtime instructions, and implementation guidelines.
+- `roadmap.md` — current status and prioritized future work.
+
+Keep documentation lean: update the relevant file when behavior changes or priorities shift.
docs/development.md ADDED
@@ -0,0 +1,61 @@
+# Development Guide
+
+This guide captures what future contributors need to know to extend the AI Survey Simulator quickly.
+
+## Environment Essentials
+
+- Python 3.9+
+- Ollama running locally (or another LLM provider wired into `llm_client.py`)
+- Optional GPU for faster inference
+
+```bash
+cp .env.example .env   # adjust values as needed
+pip install -r requirements.txt
+```
+
+Key environment variables (see `.env.example`):
+
+- `LLM_HOST` / `LLM_MODEL` — target model endpoint
+- `FRONTEND_BACKEND_BASE_URL` and `FRONTEND_WEBSOCKET_URL` — how the UI talks to FastAPI
+- `LOG_LEVEL` — INFO by default
+
+## Running the Stack
+
+### One Command
+```bash
+./run_local.sh
+```
+- Starts `ollama serve` (if not already running)
+- Launches FastAPI backend and Gradio frontend in the background
+- Press `Ctrl+C` to stop all three processes
+
+### Manual Terminals (for logs)
+```bash
+# Terminal 1
+ollama serve
+
+# Terminal 2
+cd backend
+uvicorn api.main:app --reload --host 0.0.0.0 --port 8000
+
+# Terminal 3
+cd frontend
+python gradio_app.py
+```
+
+## Making Changes Safely
+
+- Prefer editing personas via YAML (`data/`) and restart the backend to reload.
+- All configuration flows through `config/settings.py`; add new settings there and reference them via `get_settings()`.
+- When adding LLM providers, implement a new client in `backend/core/llm_client.py` and hook it into the existing factory.
+- Keep WebSocket message schemas stable (`backend/api/conversation_ws.py`); update both backend and frontend consumers if you change them.
+
+## Testing & Verification
+
+- No automated test suite yet. Add lightweight `pytest` modules under `tests/` as you extend functionality.
+- Manually verify conversations through the Gradio UI.
+- If you need to debug the conversation loop, instrument `backend/core/conversation_manager.py` or launch a shell and run it directly.
+
+## Roadmap & Next Steps
+
+See `docs/roadmap.md` for current priorities, open questions, and suggested next features (persona selector UI, hosted LLM support, etc.).
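As a concrete illustration of keeping the WebSocket message schema stable, here is a small, hypothetical validator for the `start` message the frontend sends. The field names are taken from `frontend/gradio_app.py`; the validator itself is a sketch, not part of the repo:

```python
import json

# Fields the frontend includes in its "start" message (per frontend/gradio_app.py).
REQUIRED_FIELDS = {"content", "surveyor_persona_id", "patient_persona_id", "host", "model"}


def validate_start_message(raw: str) -> dict:
    """Parse a raw WebSocket frame and reject it if any expected field is missing."""
    message = json.loads(raw)
    missing = REQUIRED_FIELDS - message.keys()
    if missing:
        raise ValueError(f"start message missing fields: {sorted(missing)}")
    return message


frame = json.dumps({
    "content": "start",
    "surveyor_persona_id": "friendly_researcher_001",
    "patient_persona_id": "cooperative_senior_001",
    "host": "http://localhost:11434",
    "model": "llama3.2:latest",
})
message = validate_start_message(frame)
```

A guard like this at the backend edge makes schema drift fail loudly instead of silently breaking the conversation loop.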
docs/overview.md ADDED
@@ -0,0 +1,56 @@
+# System Overview
+
+## Purpose
+
+The AI Survey Simulator orchestrates AI-to-AI healthcare survey conversations so researchers can explore interviewer and patient persona behavior without involving real participants.
+
+## Architecture at a Glance
+
+- **Gradio Frontend (`frontend/`)**
+  Presents the control panel, connects to the backend via WebSocket, and renders streaming messages.
+
+- **FastAPI Backend (`backend/api/`)**
+  Hosts REST endpoints for conversation control, WebSocket endpoints for live streaming, and the conversation service that manages active sessions.
+
+- **Core Logic (`backend/core/`)**
+  Contains reusable building blocks: persona loading (`persona_system.py`), conversation flow management (`conversation_manager.py`), and LLM client adapters (`llm_client.py`).
+
+- **LLM Backend (Ollama by default)**
+  The backend uses `LLM_HOST`/`LLM_MODEL` from `.env` to reach a local Ollama server. Other providers can be integrated by extending `llm_client.py`.
+
+- **Data Assets (`data/`)**
+  Persona definitions live in YAML files (`patient_personas.yaml`, `surveyor_personas.yaml`). Update these to add or refine personas.
+
+## Runtime Flow
+
+1. Frontend requests a new conversation (REST) or emits `start_conversation` over WebSocket.
+2. Backend spawns a `ConversationManager`, which alternates surveyor/patient turns using the configured LLM.
+3. Generated messages stream back to the frontend over the WebSocket connection.
+4. Conversation statuses and errors are broadcast so the UI can show progress and failures.
+
+## Repository Map (Key Paths)
+
+```
+backend/
+  api/
+    main.py                  # FastAPI entry point
+    routes.py                # REST endpoints
+    conversation_service.py
+    conversation_ws.py
+  core/
+    conversation_manager.py
+    persona_system.py
+    llm_client.py
+frontend/
+  gradio_app.py
+  websocket_manager.py
+data/
+  patient_personas.yaml
+  surveyor_personas.yaml
+config/
+  settings.py                # Shared configuration loader
+.env.example
+run_local.sh
+```
+
+Keep this mental model in mind when extending the simulator: it highlights where to plug in new personas, swap LLMs, or modify UI behavior.
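The turn alternation in step 2 of the runtime flow can be sketched independently of the real `ConversationManager` (this is a stand-in for illustration only; `generate` replaces the actual LLM call):

```python
from typing import Callable, Iterator


def conduct_demo_conversation(generate: Callable[[str, str], str],
                              max_turns: int = 4) -> Iterator[dict]:
    """Alternate surveyor/patient turns, feeding each reply to the next speaker."""
    last_utterance = ""
    for turn in range(1, max_turns + 1):
        role = "surveyor" if turn % 2 == 1 else "patient"
        content = generate(role, last_utterance)
        last_utterance = content
        yield {"role": role, "turn": turn, "content": content}


# With a deterministic stub in place of the LLM:
messages = list(conduct_demo_conversation(lambda role, prev: f"{role} reply"))
```

The real manager streams each yielded message over the WebSocket instead of collecting them in a list, but the alternating-generator shape is the same.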
docs/roadmap.md ADDED
@@ -0,0 +1,37 @@
+# Roadmap & Status
+
+_Last updated: 2025-11-05_
+
+## Current Capabilities
+
+- Gradio UI driven by WebSocket streaming
+- FastAPI backend with conversation management service
+- Personas defined via YAML and loaded dynamically
+- Ollama integration with fallback to `/api/generate`
+
+## Near-Term Priorities
+
+1. **Persona Selection in UI**
+   Allow users to choose surveyor/patient personas from dropdowns instead of hard-coded IDs.
+
+2. **Hosted LLM Support**
+   Add an HTTP client implementation for a cloud provider (Hugging Face Inference, OpenRouter, etc.) and expose configuration via `.env`.
+
+3. **Basic Test Coverage**
+   Introduce smoke tests (mocked LLM responses) to prevent regressions in conversation flow.
+
+4. **Export / Logging Enhancements**
+   Persist conversation transcripts and expose a simple export (JSON/CSV) endpoint or UI action.
+
+## Longer-Term Ideas
+
+- Interactive persona editor within the UI
+- Conversation playback and analytics
+- Multi-model comparison mode
+- Cloud-hosted deployment (Hugging Face Spaces or similar)
+
+## How to Contribute
+
+1. Sync with this roadmap and open a planning thread or issue for new work.
+2. Keep docs up to date: update this file when priorities shift.
+3. Follow the patterns in `backend/core/` and `config/settings.py` to keep configuration centralized.
frontend/gradio_app.py CHANGED
@@ -21,17 +21,23 @@ project_root = Path(__file__).parent.parent
 sys.path.insert(0, str(project_root))
 sys.path.insert(0, str(project_root / "frontend"))
 
+from config.settings import get_settings
 from websocket_manager import WebSocketManager, ManagerState
 
-# Setup logging
-logging.basicConfig(level=logging.INFO)
+# Load shared settings
+settings = get_settings()
+
+# Setup logging using configured level
+log_level = getattr(logging, settings.log_level.upper(), logging.INFO)
+logging.basicConfig(level=log_level)
 logger = logging.getLogger(__name__)
 
 # Global state
-backend_url = "http://localhost:8000"
+backend_url = settings.frontend.backend_base_url
 conversation_id = f"gradio_conv_{int(time.time())}"
 ws_manager = None
 conversation_active = False
+ws_base = settings.frontend.websocket_url.rstrip("/")
 
 # Message storage for display
 all_messages = []
@@ -45,7 +51,7 @@ def initialize_websocket() -> str:
         ws_manager.stop()
 
     try:
-        ws_url = f"ws://localhost:8000/ws/conversation/{conversation_id}"
+        ws_url = f"{ws_base}/{conversation_id}"
         ws_manager = WebSocketManager(ws_url, conversation_id)
 
         success = ws_manager.start()
@@ -62,12 +68,27 @@ def initialize_websocket() -> str:
         return f"❌ Connection error: {e}"
 
 
+def ensure_connection() -> tuple[bool, str]:
+    """Ensure there is an active WebSocket connection."""
+    global ws_manager
+
+    if ws_manager and ws_manager.state == ManagerState.CONNECTED:
+        return True, "🟢 Connected to backend"
+
+    status_message = initialize_websocket()
+    if ws_manager and ws_manager.state == ManagerState.CONNECTED:
+        return True, status_message
+
+    return False, status_message
+
+
 def start_conversation() -> tuple:
    """Start a new AI-to-AI conversation."""
     global conversation_active, all_messages
 
-    if not ws_manager or ws_manager.state != ManagerState.CONNECTED:
-        return get_message_display(), "❌ Not connected to backend. Please connect first."
+    connected, connect_message = ensure_connection()
+    if not connected:
+        return get_message_display(), connect_message
 
     if conversation_active:
         return get_message_display(), "⚠️ Conversation already in progress"
@@ -82,8 +103,8 @@ def start_conversation() -> tuple:
         "content": "start",
         "surveyor_persona_id": "friendly_researcher_001",
         "patient_persona_id": "cooperative_senior_001",
-        "host": "http://localhost:11434",
-        "model": "llama2:7b"
+        "host": settings.llm.host,
+        "model": settings.llm.model
     }
 
     success = ws_manager.send_message(message)
@@ -91,7 +112,10 @@ def start_conversation() -> tuple:
     if success:
         conversation_active = True
         logger.info("Conversation start message sent")
-        return get_message_display(), "✅ Conversation started! AI responses will appear below..."
+        status_feedback = "✅ Conversation started! AI responses will appear below..."
+        if connect_message.startswith("✅"):
+            status_feedback = f"{connect_message}\n{status_feedback}"
+        return get_message_display(), status_feedback
     else:
         return get_message_display(), "❌ Failed to send start message"
 
@@ -241,7 +265,7 @@ with gr.Blocks(title="🏥 AI Survey Simulator v2") as app:
     # Main chat interface
     chat_display = gr.Textbox(
         label="Live AI Conversation",
-        value="Click 'Connect to Backend' to begin",
+        value="Click 'Start Conversation' to begin",
         lines=20,
         max_lines=25,
         interactive=False,
@@ -250,15 +274,13 @@ with gr.Blocks(title="🏥 AI Survey Simulator v2") as app:
 
     # Control buttons
     with gr.Row():
-        connect_btn = gr.Button("🔌 Connect to Backend", variant="secondary")
         start_btn = gr.Button("▶️ Start Conversation", variant="primary")
         stop_btn = gr.Button("⏹️ Stop Conversation", variant="stop")
-        refresh_btn = gr.Button("🔄 Refresh Messages", variant="secondary")
 
     # Status message
     status_msg = gr.Textbox(
         label="Status Messages",
-        value="Ready to connect...",
+        value="Ready to start a conversation.",
         interactive=False,
         lines=2
    )
@@ -276,12 +298,11 @@ with gr.Blocks(title="🏥 AI Survey Simulator v2") as app:
     <div style="margin-top: 20px; padding: 15px; background-color: #f0f8ff; border-radius: 8px;">
     <h3>📋 Instructions</h3>
     <ol>
-        <li><strong>Connect</strong> to backend first</li>
-        <li><strong>Start Conversation</strong> to begin AI chat</li>
-        <li><strong>Refresh Messages</strong> to see new responses</li>
+        <li><strong>Start Conversation</strong> to auto-connect and begin</li>
+        <li><strong>Watch the conversation update automatically</strong></li>
         <li><strong>Stop</strong> when finished</li>
     </ol>
-    <p><small>💡 <strong>Tip</strong>: Click refresh regularly to see new AI messages as they arrive!</small></p>
+    <p><small>💡 <strong>Tip</strong>: The panel refreshes once per second while connected.</small></p>
     </div>
     """)
 
@@ -290,16 +311,11 @@ with gr.Blocks(title="🏥 AI Survey Simulator v2") as app:
     <strong>🔧 Requirements:</strong><br>
     • Ollama server running<br>
     • FastAPI backend on port 8000<br>
-    • llama2:7b model available
+    • llama3.2:latest model available
     </div>
     """)
 
     # Event handlers
-    connect_btn.click(
-        fn=initialize_websocket,
-        outputs=[status_msg]
-    )
-
     start_btn.click(
         fn=start_conversation,
         outputs=[chat_display, status_msg]
@@ -310,9 +326,10 @@ with gr.Blocks(title="🏥 AI Survey Simulator v2") as app:
         outputs=[chat_display, status_msg]
     )
 
-    refresh_btn.click(
+    app.load(
         fn=refresh_messages,
-        outputs=[chat_display, status_panel]
+        outputs=[chat_display, status_panel],
+        every=1.0
    )
 
     # Launch configuration
@@ -332,4 +349,4 @@ if __name__ == "__main__":
         inbrowser=True
     )
     finally:
-        cleanup_on_exit()
+        cleanup_on_exit()
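The `ensure_connection()` change above replaces the manual Connect button with lazy, on-demand connection. A stdlib-only sketch of that pattern (with a dummy manager standing in for the real `WebSocketManager`, so the logic is runnable here):

```python
from enum import Enum


class ManagerState(Enum):
    DISCONNECTED = "disconnected"
    CONNECTED = "connected"


class DummyManager:
    """Stand-in for WebSocketManager so the pattern is runnable in isolation."""
    def __init__(self) -> None:
        self.state = ManagerState.DISCONNECTED

    def start(self) -> bool:
        self.state = ManagerState.CONNECTED
        return True


ws_manager = None


def ensure_connection() -> tuple:
    """Reuse a healthy connection; otherwise connect on demand."""
    global ws_manager
    if ws_manager is not None and ws_manager.state is ManagerState.CONNECTED:
        return True, "already connected"
    ws_manager = DummyManager()
    ok = ws_manager.start()
    return ok, "connected" if ok else "connection failed"
```

Callers such as `start_conversation()` simply invoke `ensure_connection()` first, so the first click both connects and starts while later clicks reuse the live connection.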
run_local.sh ADDED
@@ -0,0 +1,105 @@
+#!/usr/bin/env bash
+
+# Convenience launcher for the local AI Survey Simulator stack.
+# Starts (if needed) Ollama, then the FastAPI backend, then the Gradio frontend.
+
+set -Eeuo pipefail
+
+ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+ENV_FILE="${ROOT_DIR}/.env"
+
+# Load environment variables if .env exists
+if [[ -f "${ENV_FILE}" ]]; then
+    echo "🔧 Loading environment from ${ENV_FILE}"
+    set -o allexport
+    source "${ENV_FILE}"
+    set +o allexport
+fi
+
+API_HOST="${API_HOST:-0.0.0.0}"
+API_PORT="${API_PORT:-8000}"
+LOG_LEVEL="${LOG_LEVEL:-INFO}"
+
+OLLAMA_STARTED=0
+BACKEND_PID=0
+FRONTEND_PID=0
+
+cleanup() {
+    echo -e "\n🧹 Shutting down..."
+    if [[ ${FRONTEND_PID} -ne 0 ]]; then
+        echo "   • Stopping frontend (PID ${FRONTEND_PID})"
+        kill "${FRONTEND_PID}" 2>/dev/null || true
+    fi
+    if [[ ${BACKEND_PID} -ne 0 ]]; then
+        echo "   • Stopping backend (PID ${BACKEND_PID})"
+        kill "${BACKEND_PID}" 2>/dev/null || true
+    fi
+    if [[ ${OLLAMA_STARTED} -eq 1 ]]; then
+        echo "   • Stopping ollama serve (PID ${OLLAMA_PID})"
+        kill "${OLLAMA_PID}" 2>/dev/null || true
+    fi
+    wait || true
+    echo "✅ Shutdown complete"
+}
+
+trap cleanup EXIT SIGINT SIGTERM
+
+start_ollama() {
+    if ! command -v ollama >/dev/null 2>&1; then
+        echo "❌ ollama command not found. Install Ollama and ensure it is on your PATH."
+        exit 1
+    fi
+
+    if pgrep -f "ollama serve" >/dev/null 2>&1; then
+        echo "🟢 ollama serve already running"
+    else
+        echo "🚀 Starting ollama serve (background)"
+        ollama serve >/dev/null 2>&1 &
+        OLLAMA_PID=$!
+        OLLAMA_STARTED=1
+        sleep 2
+    fi
+}
+
+start_backend() {
+    echo "🚀 Starting FastAPI backend on ${API_HOST}:${API_PORT} (background)"
+    (
+        cd "${ROOT_DIR}/backend"
+        uvicorn api.main:app --host "${API_HOST}" --port "${API_PORT}" --log-level "${LOG_LEVEL,,}"
+    ) >/dev/null 2>&1 &
+    BACKEND_PID=$!
+    sleep 2
+}
+
+start_frontend() {
+    echo "🚀 Starting Gradio frontend (background)"
+    (
+        cd "${ROOT_DIR}/frontend"
+        python gradio_app.py
+    ) >/dev/null 2>&1 &
+    FRONTEND_PID=$!
+    sleep 2
+}
+
+echo "==============================================="
+echo "  AI Survey Simulator - Local Run"
+echo "  Press Ctrl+C to stop all services"
+echo "==============================================="
+
+start_ollama
+start_backend
+start_frontend
+
+echo "✅ All services started."
+echo "   • FastAPI backend: http://${API_HOST}:${API_PORT}"
+echo "   • Gradio UI:       http://localhost:7860"
+echo "   • Ollama host:     ${LLM_HOST:-http://localhost:11434}"
+echo "==============================================="
+echo "Services are running in the background of this shell."
+echo "To view live logs, run each service manually in its own terminal."
+echo "==============================================="
+
+# Keep the script alive until interrupted so background processes persist.
+while true; do
+    sleep 60
+done
scripts/check_setup.py DELETED
@@ -1,234 +0,0 @@
-#!/usr/bin/env python3
-"""Quick setup verification script.
-
-This script checks if the development environment is properly configured
-and all components are ready for development.
-
-Usage:
-    python scripts/check_setup.py
-"""
-
-import sys
-import subprocess
-import importlib
-import asyncio
-from pathlib import Path
-
-def print_header(title: str):
-    """Print formatted header."""
-    print(f"\n{'='*50}")
-    print(f"🔍 {title}")
-    print('='*50)
-
-def check_python_packages():
-    """Check if required Python packages are installed."""
-    print_header("Python Environment Check")
-
-    required_packages = [
-        'fastapi', 'uvicorn', 'websockets', 'streamlit',
-        'httpx', 'pydantic', 'pyyaml', 'loguru'
-    ]
-
-    missing_packages = []
-
-    for package in required_packages:
-        try:
-            importlib.import_module(package)
-            print(f"✅ {package}")
-        except ImportError:
-            print(f"❌ {package} - Missing")
-            missing_packages.append(package)
-
-    if missing_packages:
-        print(f"\n⚠️ Missing packages: {', '.join(missing_packages)}")
-        print("Install with: pip install -r requirements.txt")
-        return False
-    else:
-        print("\n🎉 All required packages installed!")
-        return True
-
-def check_ollama():
-    """Check if Ollama is installed and available."""
-    print_header("Ollama Check")
-
-    try:
-        # Check if ollama command exists
-        result = subprocess.run(['ollama', '--version'],
-                                capture_output=True, text=True, timeout=5)
-        if result.returncode == 0:
-            print(f"✅ Ollama installed: {result.stdout.strip()}")
-
-            # Check if server is running
-            try:
-                result = subprocess.run(['ollama', 'list'],
-                                        capture_output=True, text=True, timeout=5)
-                if result.returncode == 0:
-                    print("✅ Ollama server is running")
-
-                    # List available models
-                    if result.stdout.strip():
-                        print("📦 Available models:")
-                        for line in result.stdout.strip().split('\n')[1:]:  # Skip header
-                            if line.strip():
-                                print(f"   - {line.split()[0]}")
-                    else:
-                        print("⚠️ No models installed")
-                        print("   Install with: ollama pull llama2:7b")
-                else:
-                    print("❌ Ollama server not responding")
-                    print("   Start with: ollama serve")
-                    return False
-            except subprocess.TimeoutExpired:
-                print("❌ Ollama command timed out")
-                return False
-        else:
-            print("❌ Ollama command failed")
-            return False
-
-    except FileNotFoundError:
-        print("❌ Ollama not installed")
-        print("   Install with: curl -fsSL https://ollama.ai/install.sh | sh")
-        return False
-    except subprocess.TimeoutExpired:
-        print("❌ Ollama command timed out")
-        return False
-
-    return True
-
-def check_project_structure():
-    """Check if project structure is complete."""
-    print_header("Project Structure Check")
-
-    required_dirs = [
-        'backend/api', 'backend/core', 'backend/models', 'backend/storage',
-        'frontend/components', 'frontend/utils',
-        'data/personas', 'config', 'scripts', 'tests', '.claude/commands'
-    ]
-
-    required_files = [
-        'CLAUDE.md', 'STATUS.md', 'TODO_CONTEXT.md', 'DEVELOPMENT_LOG.md',
-        'requirements.txt', '.env.example',
-        'backend/api/main.py', 'backend/core/llm_client.py',
-        'backend/core/persona_system.py',
-        'data/personas/patient_personas.yaml',
-        'config/default_config.yaml'
-    ]
-
-    missing_items = []
-
-    # Check directories
-    for dir_path in required_dirs:
-        if Path(dir_path).exists():
-            print(f"✅ {dir_path}/")
-        else:
-            print(f"❌ {dir_path}/ - Missing")
-            missing_items.append(dir_path)
-
-    # Check files
-    for file_path in required_files:
-        if Path(file_path).exists():
-            print(f"✅ {file_path}")
-        else:
-            print(f"❌ {file_path} - Missing")
-            missing_items.append(file_path)
-
-    if missing_items:
-        print(f"\n⚠️ Missing items: {len(missing_items)}")
-        return False
-    else:
-        print("\n🎉 Project structure complete!")
-        return True
-
-async def run_component_tests():
-    """Run basic component tests."""
-    print_header("Component Tests")
-
-    # Import here to avoid issues if packages aren't installed
-    try:
-        sys.path.insert(0, str(Path('backend')))
-        from core.llm_client import OllamaClient
-        from core.persona_system import PersonaSystem
-
-        # Test persona system
-        try:
-            persona_system = PersonaSystem()
-            personas = persona_system.list_personas()
-            print(f"✅ Persona system: {len(personas)} personas loaded")
-        except Exception as e:
-            print(f"❌ Persona system failed: {e}")
-            return False
-
-        # Test LLM client (without actual connection)
-        try:
-            client = OllamaClient(host="http://localhost:11434", model="llama2:7b")
-            print("✅ LLM client can be created")
-            await client.close()
-        except Exception as e:
-            print(f"❌ LLM client creation failed: {e}")
-            return False
-
-        return True
-
-    except ImportError as e:
-        print(f"❌ Import error: {e}")
-        return False
-
-def print_next_steps(all_checks_passed: bool):
-    """Print recommended next steps."""
-    print_header("Next Steps")
-
-    if all_checks_passed:
-        print("🎉 Environment is ready for development!")
-        print("\nRecommended workflow:")
-        print("1. Load project context:")
-        print("   @CLAUDE.md @STATUS.md")
-        print("\n2. Choose your task:")
-        print("   /conversation-task   # Next priority")
-        print("   /websocket-task      # If WebSocket needs testing")
-        print("   /llm-task            # If LLM needs testing")
-        print("\n3. Test components:")
-        print("   python scripts/test_websocket.py")
-        print("   python scripts/test_llm_connection.py")
-        print("\n4. Begin implementation!")
-
-    else:
-        print("⚠️ Setup incomplete. Please fix the issues above.")
-        print("\nQuick setup commands:")
-        print("1. Install Python packages:")
-        print("   pip install -r requirements.txt")
-        print("\n2. Install Ollama:")
-        print("   curl -fsSL https://ollama.ai/install.sh | sh")
-        print("   ollama pull llama2:7b")
-        print("\n3. Re-run this check:")
-        print("   python scripts/check_setup.py")
-
-async def main():
-    """Main check function."""
-    print("🔍 AI Survey Simulator - Setup Verification")
-
-    checks = []
-
-    # Run all checks
-    checks.append(check_python_packages())
-    checks.append(check_project_structure())
-    checks.append(check_ollama())
-    checks.append(await run_component_tests())
-
-    # Summary
-    passed = sum(checks)
-    total = len(checks)
-
-    print_header(f"Summary: {passed}/{total} checks passed")
-
-    all_passed = passed == total
-    print_next_steps(all_passed)
-
-    return 0 if all_passed else 1
-
-if __name__ == "__main__":
-    try:
-        exit_code = asyncio.run(main())
-        sys.exit(exit_code)
-    except KeyboardInterrupt:
-        print("\n⏹️ Check interrupted by user")
-        sys.exit(1)
scripts/run_conversation_demo.py DELETED
@@ -1,233 +0,0 @@
-#!/usr/bin/env python3
-"""Terminal demo of AI-to-AI conversation.
-
-This script demonstrates the core conversation engine by running
-a live AI survey conversation between a surveyor and patient persona
-in the terminal with rich formatting.
-
-Usage:
-    python scripts/run_conversation_demo.py [--host HOST] [--model MODEL]
-
-Example:
-    python scripts/run_conversation_demo.py --host http://localhost:11434 --model llama2:7b
-"""
-
-import asyncio
-import sys
-import argparse
-from pathlib import Path
-
-# Add backend to path for imports
-project_root = Path(__file__).parent.parent
-sys.path.insert(0, str(project_root / "backend"))
-
-try:
-    from core.conversation_manager import ConversationManager, ConversationState
-    from core.persona_system import PersonaSystem
-    from rich.console import Console
-    from rich.panel import Panel
-    from rich.text import Text
-    from rich.live import Live
-    from rich.layout import Layout
-    from rich.progress import Progress, SpinnerColumn, TextColumn
-    import rich.traceback
-    from rich.prompt import Prompt
-    from rich import print as rprint
-except ImportError as e:
-    print(f"❌ Import error: {e}")
-    print("Make sure you're running from the project root and all dependencies are installed:")
-    print("   pip install rich")
-    print("   conda activate converai")
-    sys.exit(1)
-
-
-def setup_rich():
-    """Configure rich for beautiful terminal output."""
-    rich.traceback.install()
-    return Console()
-
-
-def display_personas(console: Console, persona_system: PersonaSystem):
-    """Display available personas for selection."""
-    console.print("\n[bold blue]📋 Available Personas[/bold blue]")
-
-    # Show surveyors
-    console.print("[bold]Surveyors:[/bold]")
-    surveyors = persona_system.list_personas("surveyor")
-    for i, surveyor in enumerate(surveyors):
-        name = surveyor.get("name", "Unknown")
-        description = surveyor.get("description", "No description")
-        console.print(f"   {i+1}. [cyan]{name}[/cyan] - {description[:60]}...")
-
-    # Show patients
-    console.print("\n[bold]Patients:[/bold]")
-    patients = persona_system.list_personas("patient")
-    for i, patient in enumerate(patients):
-        name = patient.get("name", "Unknown")
-        description = patient.get("description", "No description")
-        console.print(f"   {i+1}. [green]{name}[/green] - {description[:60]}...")
-
-    console.print()
-
-
-def format_message(message: dict, console: Console):
-    """Format a conversation message for display."""
-    role = message["role"]
-    content = message["content"]
-    persona_name = message["persona"]
-    timestamp = message["timestamp"]
-    turn = message.get("turn", 0)
-
-    if role == "surveyor":
-        # Blue panel for surveyor
-        title = f"🔹 Dr. {persona_name} (Turn {turn})"
-        panel = Panel(
-            content,
-            title=title,
-            border_style="blue",
-            padding=(0, 1),
-        )
-    else:
-        # Green panel for patient
-        title = f"💬 {persona_name} (Turn {turn})"
-        panel = Panel(
-            content,
-            title=title,
-            border_style="green",
-            padding=(0, 1),
-        )
-
-    console.print(panel)
-    console.print()  # Add spacing
-
-
-async def run_conversation_demo(host: str = "http://localhost:11434", model: str = "llama2:7b"):
-    """Run the conversation demo."""
-    console = setup_rich()
-
-    try:
-        # Header
-        console.print("[bold green]🤖 AI Survey Conversation Demo[/bold green]")
-        console.print(f"Host: {host}")
-        console.print(f"Model: {model}")
-        console.print("=" * 50)
-
-        # Initialize persona system
-        console.print("[yellow]Loading persona system...[/yellow]")
-        persona_system = PersonaSystem()
-
-        # Show available personas
-        display_personas(console, persona_system)
-
-        # Let user choose personas or use defaults
-        use_defaults = Prompt.ask(
-            "Use default personas (Dr. Sarah Mitchell + Margaret Thompson)?",
-            choices=["y", "n"],
-            default="y"
-        )
-
-        if use_defaults.lower() == "y":
-            # Use first surveyor and patient
-            surveyors = persona_system.list_personas("surveyor")
-            patients = persona_system.list_personas("patient")
-
-            if not surveyors or not patients:
-                console.print("[red]❌ No personas found! Check persona configuration.[/red]")
-                return
-
-            surveyor_persona = surveyors[0]
-            patient_persona = patients[0]
-        else:
-            console.print("[yellow]Manual persona selection not implemented yet. Using defaults.[/yellow]")
-            surveyors = persona_system.list_personas("surveyor")
-            patients = persona_system.list_personas("patient")
-            surveyor_persona = surveyors[0]
-            patient_persona = patients[0]
-
-        console.print(f"[blue]Selected Surveyor:[/blue] {surveyor_persona.get('name', 'Unknown')}")
-        console.print(f"[green]Selected Patient:[/green] {patient_persona.get('name', 'Unknown')}")
-        console.print()
-
-        # Create conversation manager
-        console.print("[yellow]Initializing conversation manager...[/yellow]")
-        manager = ConversationManager(
-            surveyor_persona=surveyor_persona,
-            patient_persona=patient_persona,
-            host=host,
-            model=model
-        )
-
-        console.print("[green]✅ Ready! Starting conversation...[/green]")
-        console.print("=" * 50)
-        console.print()
-
-        # Run conversation
-        message_count = 0
-        try:
-            async for message in manager.conduct_conversation():
-                message_count += 1
-                format_message(message, console)
-
-                # Small delay for readability
-                await asyncio.sleep(1)
-
-                # Check for errors
-                if message.get("error"):
-                    console.print(f"[yellow]⚠️ Technical error detected in message {message_count}[/yellow]")
-
-                # Progress indicator
-                if message_count % 2 == 0:  # Every patient response
-                    console.print(f"[dim]Conversation progress: {message_count} messages...[/dim]")
-
-        except KeyboardInterrupt:
-            console.print("\n[yellow]⏹️ Conversation interrupted by user[/yellow]")
-        except Exception as e:
-            console.print(f"\n[red]❌ Conversation error: {e}[/red]")
-            import traceback
-            console.print(f"[dim]{traceback.format_exc()}[/dim]")
-        finally:
-            # Clean up
-            try:
-                await manager.close()
-            except:
-                pass
-
-        # Summary
-        console.print("=" * 50)
-        if message_count > 0:
-            console.print(f"[bold green]🎉 Conversation Complete![/bold green]")
-            console.print(f"Total messages: {message_count}")
-            console.print(f"Conversation ID: {manager.conversation_id}")
-        else:
-            console.print("[yellow]⚠️ No messages were generated.[/yellow]")
-            console.print("Check that Ollama is running: ollama serve")
-
-    except Exception as e:
-        console.print(f"[red]❌ Demo failed: {e}[/red]")
-        import traceback
-        console.print(f"[dim]{traceback.format_exc()}[/dim]")
-        return 1
-
-
-def main():
-    """Main entry point."""
214
- parser = argparse.ArgumentParser(description="AI Survey Conversation Demo")
215
- parser.add_argument("--host", default="http://localhost:11434",
216
- help="Ollama server host (default: http://localhost:11434)")
217
- parser.add_argument("--model", default="llama2:7b",
218
- help="LLM model to use (default: llama2:7b)")
219
-
220
- args = parser.parse_args()
221
-
222
- try:
223
- return asyncio.run(run_conversation_demo(args.host, args.model))
224
- except KeyboardInterrupt:
225
- print("\n⏹️ Demo interrupted by user")
226
- return 0
227
- except Exception as e:
228
- print(f"❌ Failed to run demo: {e}")
229
- return 1
230
-
231
-
232
- if __name__ == "__main__":
233
- sys.exit(main())
scripts/test_integration.py DELETED
@@ -1,387 +0,0 @@
-#!/usr/bin/env python3
-"""Integration test script for AI Survey Simulator.
-
-This script tests the full integration of all components:
-- LLM client with personas
-- Persona system loading
-- End-to-end conversation flow
-- Configuration loading
-
-Usage:
-    python scripts/test_integration.py
-"""
-
-import asyncio
-import sys
-import yaml
-from pathlib import Path
-
-# Add backend to path for imports
-sys.path.insert(0, str(Path(__file__).parent.parent / "backend"))
-
-try:
-    from core.llm_client import OllamaClient
-    from core.persona_system import PersonaSystem
-except ImportError as e:
-    print(f"❌ Import error: {e}")
-    print("Make sure you're running from the project root directory")
-    sys.exit(1)
-
-
-class IntegrationTester:
-    """Full integration test suite."""
-
-    def __init__(self, host: str = "http://localhost:11434", model: str = "llama2:7b"):
-        """Initialize integration tester.
-
-        Args:
-            host: Ollama server host
-            model: Model to test with
-        """
-        self.host = host
-        self.model = model
-        self.client = None
-        self.persona_system = None
-
-    async def run_all_tests(self) -> bool:
-        """Run all integration tests.
-
-        Returns:
-            True if all tests pass
-        """
-        print("🔗 AI Survey Simulator - Integration Test Suite")
-        print("=" * 50)
-        print(f"Host: {self.host}")
-        print(f"Model: {self.model}")
-        print()
-
-        tests = [
-            ("Persona System Loading", self.test_persona_loading),
-            ("LLM Client Creation", self.test_llm_client_creation),
-            ("Basic Persona Response", self.test_basic_persona_response),
-            ("Surveyor Persona Test", self.test_surveyor_persona),
-            ("Patient Persona Test", self.test_patient_persona),
-            ("Multi-turn Conversation", self.test_multi_turn_conversation),
-            ("Configuration Loading", self.test_config_loading)
-        ]
-
-        results = []
-
-        try:
-            for test_name, test_func in tests:
-                print(f"Running: {test_name}...")
-                try:
-                    result = await test_func()
-                    if result:
-                        print(f"✅ {test_name}")
-                    else:
-                        print(f"❌ {test_name}")
-                    results.append(result)
-                except Exception as e:
-                    print(f"❌ {test_name} - Exception: {e}")
-                    results.append(False)
-                print()
-
-        finally:
-            if self.client:
-                await self.client.close()
-
-        # Summary
-        passed = sum(results)
-        total = len(results)
-        print("=" * 50)
-        print(f"Results: {passed}/{total} tests passed")
-
-        if passed == total:
-            print("🎉 All integration tests passed!")
-            print("✅ Ready for conversation orchestration implementation!")
-        else:
-            print("⚠️ Some integration tests failed. Check setup.")
-
-        return passed == total
-
-    async def test_persona_loading(self) -> bool:
-        """Test loading persona system and personas."""
-        try:
-            self.persona_system = PersonaSystem()
-
-            # Check if personas loaded
-            personas = self.persona_system.list_personas()
-
-            if len(personas) > 0:
-                print(f" Loaded {len(personas)} personas:")
-                for persona in personas[:3]:  # Show first 3
-                    persona_id = persona.get('id', 'unknown')
-                    persona_name = persona.get('name', 'Unknown')
-                    print(f" - {persona_id}: {persona_name}")
-
-                if len(personas) > 3:
-                    print(f" ... and {len(personas) - 3} more")
-
-                return True
-            else:
-                print(" No personas loaded")
-                return False
-
-        except Exception as e:
-            print(f" Persona loading failed: {e}")
-            return False
-
-    async def test_llm_client_creation(self) -> bool:
-        """Test LLM client creation and basic connectivity."""
-        try:
-            self.client = OllamaClient(host=self.host, model=self.model)
-
-            # Test health check
-            health = await self.client.health_check()
-
-            if health["status"] == "healthy" and health["model_available"]:
-                print(f" Client created successfully")
-                print(f" Model available: {health['model_available']}")
-                return True
-            else:
-                print(f" Health check failed: {health}")
-                return False
-
-        except Exception as e:
-            print(f" Client creation failed: {e}")
-            return False
-
-    async def test_basic_persona_response(self) -> bool:
-        """Test basic persona response generation."""
-        try:
-            if not self.persona_system or not self.client:
-                print(" Prerequisites not met")
-                return False
-
-            # Get a patient persona
-            patient_personas = self.persona_system.list_personas("patient")
-
-            if not patient_personas:
-                print(" No patient personas found")
-                return False
-
-            persona = patient_personas[0]
-            persona_id = persona.get('id')
-
-            # Build system prompt using conversation prompt method
-            system_prompt, _ = self.persona_system.build_conversation_prompt(persona_id)
-
-            response = await self.client.generate(
-                prompt="How are you feeling today?",
-                system_prompt=system_prompt
-            )
-
-            if response and len(response.strip()) > 0:
-                print(f" Persona: {persona.get('name', persona_id)}")
-                print(f" Response: {response[:100]}{'...' if len(response) > 100 else ''}")
-                return True
-            else:
-                print(" Empty response received")
-                return False
-
-        except Exception as e:
-            print(f" Basic persona response failed: {e}")
-            return False
-
-    async def test_surveyor_persona(self) -> bool:
-        """Test surveyor persona functionality."""
-        try:
-            if not self.persona_system or not self.client:
-                print(" Prerequisites not met")
-                return False
-
-            # Get a surveyor persona
-            surveyor_personas = self.persona_system.list_personas("surveyor")
-
-            if not surveyor_personas:
-                print(" No surveyor personas found")
-                return False
-
-            persona = surveyor_personas[0]
-            persona_id = persona.get('id')
-
-            # Build system prompt using conversation prompt method
-            system_prompt, _ = self.persona_system.build_conversation_prompt(persona_id)
-
-            response = await self.client.generate(
-                prompt="Please introduce yourself and ask your first survey question.",
-                system_prompt=system_prompt
-            )
-
-            if response and len(response.strip()) > 0:
-                print(f" Surveyor: {persona.get('name', persona_id)}")
-                print(f" Introduction: {response[:120]}{'...' if len(response) > 120 else ''}")
-                return True
-            else:
-                print(" Empty response received")
-                return False
-
-        except Exception as e:
-            print(f" Surveyor persona test failed: {e}")
-            return False
-
-    async def test_patient_persona(self) -> bool:
-        """Test patient persona functionality."""
-        try:
-            if not self.persona_system or not self.client:
-                print(" Prerequisites not met")
-                return False
-
-            # Get a patient persona (different from basic test)
-            patient_personas = self.persona_system.list_personas("patient")
-
-            if len(patient_personas) < 2:
-                print(" Not enough patient personas for varied testing")
-                return len(patient_personas) > 0  # Still pass if we have at least one
-
-            persona = patient_personas[1] if len(patient_personas) > 1 else patient_personas[0]  # Use second patient or first
-            persona_id = persona.get('id')
-
-            # Build system prompt using conversation prompt method
-            system_prompt, _ = self.persona_system.build_conversation_prompt(persona_id)
-
-            # Test with a survey question
-            response = await self.client.generate(
-                prompt="On a scale of 1-10, how would you rate your overall health today?",
-                system_prompt=system_prompt
-            )
-
-            if response and len(response.strip()) > 0:
-                print(f" Patient: {persona.get('name', persona_id)}")
-                print(f" Health rating: {response[:100]}{'...' if len(response) > 100 else ''}")
-                return True
-            else:
-                print(" Empty response received")
-                return False
-
-        except Exception as e:
-            print(f" Patient persona test failed: {e}")
-            return False
-
-    async def test_multi_turn_conversation(self) -> bool:
-        """Test multi-turn conversation capability."""
-        try:
-            if not self.persona_system or not self.client:
-                print(" Prerequisites not met")
-                return False
-
-            # Get personas for conversation
-            patient_personas = self.persona_system.list_personas("patient")
-
-            if not patient_personas:
-                print(" No patient personas for conversation")
-                return False
-
-            persona = patient_personas[0]
-            persona_id = persona.get('id')
-
-            # Simulate conversation history
-            conversation_history = [
-                {"role": "assistant", "content": "Hello! How are you feeling today?"},
-                {"role": "user", "content": "I'm doing okay, thank you for asking."}
-            ]
-
-            # Build prompt with history
-            system_prompt, prompt_with_history = self.persona_system.build_conversation_prompt(
-                persona_id=persona_id,
-                conversation_history=conversation_history,
-                user_prompt="Can you tell me more about any health concerns you might have?"
-            )
-
-            response = await self.client.generate(
-                prompt=prompt_with_history,
-                system_prompt=system_prompt
-            )
-
-            if response and len(response.strip()) > 0:
-                print(f" Multi-turn conversation successful")
-                print(f" Response: {response[:80]}{'...' if len(response) > 80 else ''}")
-                return True
-            else:
-                print(" Empty response received")
-                return False
-
-        except Exception as e:
-            print(f" Multi-turn conversation test failed: {e}")
-            return False
-
-    async def test_config_loading(self) -> bool:
-        """Test configuration loading."""
-        try:
-            config_path = Path(__file__).parent.parent / "config" / "default_config.yaml"
-
-            if not config_path.exists():
-                print(f" Config file not found: {config_path}")
-                return False
-
-            with open(config_path, 'r') as f:
-                config = yaml.safe_load(f)
-
-            # Check for essential config sections
-            required_sections = ['llm', 'api']
-            missing_sections = [section for section in required_sections if section not in config]
-
-            if missing_sections:
-                print(f" Missing config sections: {missing_sections}")
-                return False
-
-            # Personas section is optional since personas are loaded from separate files
-
-            print(f" Config loaded successfully")
-            print(f" LLM backend: {config.get('llm', {}).get('backend', 'unknown')}")
-            print(f" API port: {config.get('api', {}).get('port', 'unknown')}")
-
-            return True
-
-        except Exception as e:
-            print(f" Config loading failed: {e}")
-            return False
-
-
-async def main():
-    """Main integration test function."""
-    print("🔗 AI Survey Simulator - Integration Test")
-    print("=" * 45)
-    print()
-    print("This test verifies that all core components work together:")
-    print("- Persona system")
-    print("- LLM client")
-    print("- Configuration loading")
-    print("- End-to-end conversation flow")
-    print()
-
-    # Parse command line arguments for custom host/model
-    host = "http://localhost:11434"
-    model = "llama2:7b"
-
-    if len(sys.argv) > 1:
-        host = sys.argv[1]
-    if len(sys.argv) > 2:
-        model = sys.argv[2]
-
-    print(f"Testing with host: {host}, model: {model}")
-    print()
-
-    # Run integration tests
-    tester = IntegrationTester(host=host, model=model)
-    success = await tester.run_all_tests()
-
-    if success:
-        print()
-        print("🚀 Next Steps:")
-        print("1. Implement conversation orchestration")
-        print("2. Connect WebSocket + LLM + Personas")
-        print("3. Test AI-to-AI conversations")
-        print("4. Build Streamlit UI integration")
-
-    return 0 if success else 1
-
-
-if __name__ == "__main__":
-    try:
-        exit_code = asyncio.run(main())
-        sys.exit(exit_code)
-    except KeyboardInterrupt:
-        print("\n⏹️ Integration tests interrupted by user")
-        sys.exit(1)
scripts/test_llm_connection.py DELETED
@@ -1,317 +0,0 @@
-#!/usr/bin/env python3
-"""Test script for LLM connectivity and functionality.
-
-This script tests the LLM integration with Ollama to ensure
-language model connections work correctly.
-
-Usage:
-    python scripts/test_llm_connection.py
-"""
-
-import asyncio
-import sys
-import time
-from pathlib import Path
-
-# Add backend to path for imports
-sys.path.insert(0, str(Path(__file__).parent.parent / "backend"))
-
-try:
-    from core.llm_client import OllamaClient, create_llm_client
-except ImportError as e:
-    print(f"❌ Import error: {e}")
-    print("Make sure you're running from the project root directory")
-    sys.exit(1)
-
-class LLMTester:
-    """Test suite for LLM functionality."""
-
-    def __init__(self, host: str = "http://localhost:11434", model: str = "llama2:7b"):
-        """Initialize LLM tester.
-
-        Args:
-            host: Ollama server host
-            model: Model to test with
-        """
-        self.host = host
-        self.model = model
-        self.client = None
-
-    async def run_all_tests(self) -> bool:
-        """Run all LLM tests.
-
-        Returns:
-            True if all tests pass
-        """
-        print("🧠 LLM Connection Test Suite")
-        print("=" * 40)
-        print(f"Host: {self.host}")
-        print(f"Model: {self.model}")
-        print()
-
-        tests = [
-            ("Connection Health Check", self.test_health_check),
-            ("Basic Generation", self.test_basic_generation),
-            ("System Prompt", self.test_system_prompt),
-            ("Parameter Control", self.test_parameter_control),
-            ("Error Handling", self.test_error_handling),
-            ("Performance Stats", self.test_performance_stats),
-            ("Retry Logic", self.test_retry_logic)
-        ]
-
-        results = []
-
-        # Initialize client
-        self.client = OllamaClient(host=self.host, model=self.model)
-
-        try:
-            for test_name, test_func in tests:
-                print(f"Running: {test_name}...")
-                try:
-                    result = await test_func()
-                    if result:
-                        print(f"✅ {test_name}")
-                    else:
-                        print(f"❌ {test_name}")
-                    results.append(result)
-                except Exception as e:
-                    print(f"❌ {test_name} - Exception: {e}")
-                    results.append(False)
-                print()
-
-        finally:
-            if self.client:
-                await self.client.close()
-
-        # Summary
-        passed = sum(results)
-        total = len(results)
-        print("=" * 40)
-        print(f"Results: {passed}/{total} tests passed")
-
-        if passed == total:
-            print("🎉 All LLM tests passed!")
-        else:
-            print("⚠️ Some tests failed. Check Ollama setup.")
-
-        return passed == total
-
-    async def test_health_check(self) -> bool:
-        """Test LLM server health check."""
-        try:
-            health = await self.client.health_check()
-
-            if health["status"] == "healthy":
-                print(f" Server: {health['status']}")
-                print(f" Model available: {health['model_available']}")
-                print(f" Available models: {len(health.get('available_models', []))}")
-
-                if not health['model_available']:
-                    print(f" ⚠️ Model '{self.model}' not found!")
-                    print(f" Available: {health['available_models']}")
-                    return False
-
-                return True
-            else:
-                print(f" Server unhealthy: {health.get('error', 'Unknown error')}")
-                return False
-
-        except Exception as e:
-            print(f" Health check failed: {e}")
-            return False
-
-    async def test_basic_generation(self) -> bool:
-        """Test basic text generation."""
-        try:
-            prompt = "Say 'Hello, World!' in exactly those words."
-            response = await self.client.generate(prompt)
-
-            if response and len(response.strip()) > 0:
-                print(f" Prompt: {prompt}")
-                print(f" Response: {response[:100]}{'...' if len(response) > 100 else ''}")
-                return True
-            else:
-                print(" Empty or invalid response")
-                return False
-
-        except Exception as e:
-            print(f" Generation failed: {e}")
-            return False
-
-    async def test_system_prompt(self) -> bool:
-        """Test system prompt functionality."""
-        try:
-            system_prompt = "You are a helpful assistant. Always respond with exactly 3 words."
-            user_prompt = "What is AI?"
-
-            response = await self.client.generate(
-                prompt=user_prompt,
-                system_prompt=system_prompt
-            )
-
-            if response:
-                word_count = len(response.strip().split())
-                print(f" System: {system_prompt}")
-                print(f" User: {user_prompt}")
-                print(f" Response: '{response}' ({word_count} words)")
-
-                # Check if roughly follows instructions (3 words ± 1)
-                return 2 <= word_count <= 4
-            else:
-                print(" No response received")
-                return False
-
-        except Exception as e:
-            print(f" System prompt test failed: {e}")
-            return False
-
-    async def test_parameter_control(self) -> bool:
-        """Test parameter control (temperature, etc.)."""
-        try:
-            prompt = "Generate a random number between 1 and 10."
-
-            # Test with low temperature (more deterministic)
-            response1 = await self.client.generate(
-                prompt=prompt,
-                temperature=0.1
-            )
-
-            # Test with high temperature (more random)
-            response2 = await self.client.generate(
-                prompt=prompt,
-                temperature=0.9
-            )
-
-            print(f" Low temp (0.1): {response1[:50]}")
-            print(f" High temp (0.9): {response2[:50]}")
-
-            # Both should have responses
-            return bool(response1 and response2)
-
-        except Exception as e:
-            print(f" Parameter test failed: {e}")
-            return False
-
-    async def test_error_handling(self) -> bool:
-        """Test error handling with invalid requests."""
-        try:
-            # Test with invalid model temporarily
-            invalid_client = OllamaClient(
-                host=self.host,
-                model="nonexistent-model-12345"
-            )
-
-            try:
-                await invalid_client.generate("Test prompt")
-                print(" Expected error but got response")
-                return False
-            except Exception as e:
-                print(f" Correctly caught error: {type(e).__name__}")
-                return True
-            finally:
-                await invalid_client.close()
-
-        except Exception as e:
-            print(f" Error handling test failed: {e}")
-            return False
-
-    async def test_performance_stats(self) -> bool:
-        """Test performance statistics tracking."""
-        try:
-            initial_stats = self.client.get_stats()
-            initial_count = initial_stats["request_count"]
-
-            # Make a few requests
-            for i in range(3):
-                await self.client.generate(f"Count to {i+1}")
-
-            final_stats = self.client.get_stats()
-            final_count = final_stats["request_count"]
-
-            print(f" Requests: {initial_count} → {final_count}")
-            print(f" Avg time: {final_stats['average_time']}s")
-            print(f" Total time: {final_stats['total_time']}s")
-
-            return final_count > initial_count
-
-        except Exception as e:
-            print(f" Performance stats test failed: {e}")
-            return False
-
-    async def test_retry_logic(self) -> bool:
-        """Test retry logic with connection issues."""
-        try:
-            # Test with unreachable host
-            retry_client = OllamaClient(
-                host="http://localhost:99999",  # Invalid port
-                model=self.model,
-                max_retries=2,
-                retry_delay=0.1  # Fast retry for testing
-            )
-
-            start_time = time.time()
-            try:
-                await retry_client.generate("Test prompt")
-                print(" Expected connection error but got response")
-                return False
-            except Exception as e:
-                elapsed = time.time() - start_time
-                print(f" Retry logic worked: {type(e).__name__}")
-                print(f" Time elapsed: {elapsed:.1f}s (expected ~0.3s for 2 retries)")
-                return elapsed > 0.2  # Should take time for retries
-            finally:
-                await retry_client.close()
-
-        except Exception as e:
-            print(f" Retry logic test failed: {e}")
-            return False
-
-
-def print_setup_instructions():
-    """Print setup instructions for users."""
-    print("LLM Connection Test")
-    print("=" * 20)
-    print()
-    print("Prerequisites:")
-    print("1. Ollama must be installed and running:")
-    print(" curl -fsSL https://ollama.ai/install.sh | sh")
-    print(" ollama serve")
-    print()
-    print("2. Pull a model (adjust model name in script if needed):")
-    print(" ollama pull llama2:7b")
-    print()
-    print("3. Verify Ollama is running:")
-    print(" curl http://localhost:11434/api/tags")
-    print()
-
-
-async def main():
-    """Main test function."""
-    print_setup_instructions()
-
-    # Parse command line arguments for custom host/model
-    host = "http://localhost:11434"
-    model = "llama2:7b"
-
-    if len(sys.argv) > 1:
-        host = sys.argv[1]
-    if len(sys.argv) > 2:
-        model = sys.argv[2]
-
-    print(f"Testing with host: {host}, model: {model}")
-    print()
-
-    # Run tests
-    tester = LLMTester(host=host, model=model)
-    success = await tester.run_all_tests()
-
-    return 0 if success else 1
-
-
-if __name__ == "__main__":
-    try:
-        exit_code = asyncio.run(main())
-        sys.exit(exit_code)
-    except KeyboardInterrupt:
-        print("\n⏹️ Tests interrupted by user")
-        sys.exit(1)
scripts/test_websocket.py DELETED
@@ -1,250 +0,0 @@
-#!/usr/bin/env python3
-"""Test script for WebSocket functionality.
-
-This script tests the WebSocket connection between frontend and backend
-to ensure real-time messaging works correctly.
-
-Usage:
-    python scripts/test_websocket.py
-"""
-
-import asyncio
-import json
-import sys
-import logging
-from datetime import datetime
-
-try:
-    import websockets
-    from websockets.exceptions import ConnectionClosed
-except ImportError:
-    print("Error: websockets library not installed")
-    print("Install with: pip install websockets")
-    sys.exit(1)
-
-# Configure logging
-logging.basicConfig(level=logging.INFO)
-logger = logging.getLogger(__name__)
-
-async def test_websocket_connection():
-    """Test basic WebSocket connection and messaging."""
-
-    # Test configuration
-    BACKEND_URL = "ws://localhost:8000"
-    CONVERSATION_ID = "test-conversation-123"
-    WS_URL = f"{BACKEND_URL}/ws/conversation/{CONVERSATION_ID}"
-
-    print(f"Testing WebSocket connection to: {WS_URL}")
-
-    try:
-        # Connect to WebSocket
-        async with websockets.connect(WS_URL) as websocket:
-            print("✅ WebSocket connection established")
-
-            # Test 1: Wait for connection confirmation
-            try:
-                response = await asyncio.wait_for(websocket.recv(), timeout=5.0)
-                data = json.loads(response)
-
-                if data.get("type") == "connection_status":
-                    print("✅ Received connection confirmation")
-                    print(f" Status: {data.get('status')}")
-                    print(f" Message: {data.get('message')}")
-                else:
-                    print(f"⚠️ Unexpected first message: {data}")
-
-            except asyncio.TimeoutError:
-                print("❌ No connection confirmation received within 5 seconds")
-                return False
-
-            # Test 2: Send conversation message
-            test_message = {
-                "type": "conversation_message",
-                "role": "surveyor",
-                "content": "Hello, this is a test message",
-                "conversation_id": CONVERSATION_ID,
-                "timestamp": datetime.now().isoformat()
-            }
-
-            await websocket.send(json.dumps(test_message))
-            print("✅ Sent test conversation message")
-
-            # Wait for echo/response
-            try:
-                response = await asyncio.wait_for(websocket.recv(), timeout=5.0)
-                data = json.loads(response)
-
-                if data.get("type") == "conversation_message":
-                    print("✅ Received echoed conversation message")
-                    print(f" Role: {data.get('role')}")
-                    print(f" Content: {data.get('content')}")
-                else:
-                    print(f"⚠️ Unexpected message type: {data.get('type')}")
-
-            except asyncio.TimeoutError:
-                print("❌ No response to conversation message within 5 seconds")
-                return False
-
-            # Test 3: Send heartbeat
-            heartbeat_message = {
-                "type": "heartbeat",
-                "content": "ping",
-                "timestamp": datetime.now().isoformat()
-            }
-
-            await websocket.send(json.dumps(heartbeat_message))
-            print("✅ Sent heartbeat message")
-
-            # Wait for heartbeat response
-            try:
-                response = await asyncio.wait_for(websocket.recv(), timeout=5.0)
-                data = json.loads(response)
-
-                if data.get("type") == "heartbeat_response":
-                    print("✅ Received heartbeat response")
-                else:
-                    print(f"⚠️ Unexpected heartbeat response: {data.get('type')}")
-
-            except asyncio.TimeoutError:
-                print("❌ No heartbeat response within 5 seconds")
-                return False
-
-            # Test 4: Send conversation control
-            control_message = {
-                "type": "conversation_control",
-                "action": "start",
-                "content": "Starting conversation",
-                "timestamp": datetime.now().isoformat()
-            }
-
-            await websocket.send(json.dumps(control_message))
-            print("✅ Sent conversation control message")
-
-            # Wait for control response
-            try:
-                response = await asyncio.wait_for(websocket.recv(), timeout=5.0)
-                data = json.loads(response)
-
-                if data.get("type") == "conversation_control":
-                    print("✅ Received conversation control response")
-                    print(f" Action: {data.get('action')}")
-                    print(f" Message: {data.get('message')}")
-                else:
-                    print(f"⚠️ Unexpected control response: {data.get('type')}")
-
-            except asyncio.TimeoutError:
-                print("❌ No conversation control response within 5 seconds")
-                return False
-
-            print("🎉 All WebSocket tests passed!")
-            return True
-
-    except ConnectionRefusedError:
-        print("❌ Connection refused - is the backend server running?")
-        print(" Start with: cd backend && uvicorn api.main:app --reload")
-        return False
-
-    except Exception as e:
-        print(f"❌ WebSocket test failed: {e}")
-        return False
-
-async def test_invalid_messages():
-    """Test handling of invalid messages."""
-
-    BACKEND_URL = "ws://localhost:8000"
-    CONVERSATION_ID = "test-invalid-messages"
-    WS_URL = f"{BACKEND_URL}/ws/conversation/{CONVERSATION_ID}"
-
-    print(f"\nTesting invalid message handling...")
-
-    try:
-        async with websockets.connect(WS_URL) as websocket:
-            # Skip connection confirmation
-            await websocket.recv()
-
-            # Test invalid JSON
-            await websocket.send("invalid json")
-            print("✅ Sent invalid JSON")
-
-            # Test missing required fields
-            invalid_message = {
-                "type": "conversation_message"
-                # Missing "content" field
-            }
-            await websocket.send(json.dumps(invalid_message))
-            print("✅ Sent message with missing fields")
-
-            # Test invalid message type
-            invalid_type = {
-                "type": "invalid_type",
-                "content": "test"
-            }
-            await websocket.send(json.dumps(invalid_type))
-            print("✅ Sent message with invalid type")
-
-            # Check for error responses
-            error_count = 0
-            for _ in range(3):
-                try:
-                    response = await asyncio.wait_for(websocket.recv(), timeout=2.0)
-                    data = json.loads(response)
-                    if data.get("type") == "error":
-                        error_count += 1
-                        print(f"✅ Received error response: {data.get('error')}")
-                except asyncio.TimeoutError:
-                    break
-
-            if error_count > 0:
-                print(f"✅ Received {error_count} error responses for invalid messages")
-            else:
-                print("⚠️ No error responses received for invalid messages")
-
-            return True
-
-    except Exception as e:
-        print(f"❌ Invalid message test failed: {e}")
-        return False
-
-def print_usage():
-    """Print usage instructions."""
-    print("WebSocket Test Script")
-    print("====================")
-    print()
-    print("This script tests the WebSocket functionality of the AI Survey Simulator.")
-    print()
-    print("Prerequisites:")
-    print("1. Backend server must be running:")
-    print(" cd backend && uvicorn api.main:app --reload")
-    print()
-    print("2. WebSocket library must be installed:")
-    print(" pip install websockets")
-    print()
-    print("Usage:")
-    print(" python scripts/test_websocket.py")
-
-async def main():
-    """Main test function."""
-    print_usage()
-    print("\nStarting WebSocket tests...\n")
-
-    # Test 1: Basic functionality
-    success1 = await test_websocket_connection()
-
-    # Test 2: Invalid message handling
-    success2 = await test_invalid_messages()
-
-    print("\n" + "="*50)
-    if success1 and success2:
-        print("🎉 All WebSocket tests completed successfully!")
-        return 0
-    else:
-        print("❌ Some tests failed. Check the output above.")
-        return 1
-
-if __name__ == "__main__":
245
- try:
246
- exit_code = asyncio.run(main())
247
- sys.exit(exit_code)
248
- except KeyboardInterrupt:
249
- print("\n⏹️ Tests interrupted by user")
250
- sys.exit(1)