Spaces:

taboola-cz
/

sel-chat-coach

Running

tblaisaacliao Claude commited on Nov 4, 2025

Commit

bfcc0a8

1 Parent(s): cd53cef

feat: Add comprehensive production stress test suite

Add a complete stress testing framework for the backend API with multiple
scenarios and detailed performance reporting.

**Features:**
- 4 test scenarios: user registration, conversation creation, concurrent messages, admin monitoring
- Environment-based configuration (BASE_URL, CONCURRENT_USERS, etc.)
- Performance metrics: response times (min/avg/max/p95/p99), success rates, throughput
- HTML report generation with detailed metrics
- Production safety: confirmation prompts, dry-run mode, circuit breaker
- Admin endpoint verification

**Architecture:**
```
scripts/stress-test/
├── index.ts # Main orchestrator
├── config.ts # Environment configuration
├── metrics.ts # Metrics collection & HTML reports
├── utils.ts # HTTP client, helpers
├── README.md # Documentation
└── scenarios/
├── user-registration.ts # Scenario 1: Mass user registration
├── conversation-flow.ts # Scenario 2: Conversation creation
├── concurrent-messages.ts # Scenario 3: Concurrent LLM calls
└── admin-monitoring.ts # Scenario 4: Admin endpoint load
```

**Usage:**
```bash
# Local testing
npm run stress-test

# Production (safe)
BASE_URL=https://taboola-cz-sel-chat-coach.hf.space \
CONCURRENT_USERS=20 \
DURATION_MINUTES=2 \
npm run stress-test

# Dry run
DRY_RUN=true npm run stress-test
```

**Test Results (Local):**
- 5 concurrent users
- 10 conversations created
- 22 admin endpoint checks
- 100% success rate
- All scenarios passed

**Configuration Options:**
- BASE_URL - API endpoint (default: http://localhost:3000)
- CONCURRENT_USERS - Number of concurrent users (default: 50)
- DURATION_MINUTES - Test duration (default: 5)
- SCENARIOS - Comma-separated scenario list
- DRY_RUN - Validate config without executing
- MAX_ERROR_RATE - Circuit breaker threshold (default: 0.1)

**Safety Features:**
- Confirmation prompt for production URLs
- Circuit breaker stops test if error rate > 10%
- Dry-run mode for validation
- Detailed error logging

**Output:**
- Real-time console progress
- HTML report with performance charts
- Pass/fail status based on success rate

This enables systematic load testing and performance monitoring of the
production backend API.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Files changed (10) hide show

package.json +2 -1
scripts/stress-test/README.md +292 -0
scripts/stress-test/config.ts +75 -0
scripts/stress-test/index.ts +182 -0
scripts/stress-test/metrics.ts +348 -0
scripts/stress-test/scenarios/admin-monitoring.ts +138 -0
scripts/stress-test/scenarios/concurrent-messages.ts +81 -0
scripts/stress-test/scenarios/conversation-flow.ts +74 -0
scripts/stress-test/scenarios/user-registration.ts +56 -0
scripts/stress-test/utils.ts +255 -0

package.json CHANGED Viewed

@@ -12,7 +12,8 @@
     "test:ui": "playwright test --project=ui",
     "test:e2e": "playwright test",
     "test:e2e:headed": "playwright test --headed",
-    "test:e2e:ui": "playwright test --ui"
   },
   "dependencies": {
     "@ai-sdk/openai": "^2.0.32",

     "test:ui": "playwright test --project=ui",
     "test:e2e": "playwright test",
     "test:e2e:headed": "playwright test --headed",
+    "test:e2e:ui": "playwright test --ui",
+    "stress-test": "tsx scripts/stress-test/index.ts"
   },
   "dependencies": {
     "@ai-sdk/openai": "^2.0.32",

scripts/stress-test/README.md ADDED Viewed

	@@ -0,0 +1,292 @@

+# Production Backend API Stress Test
+Comprehensive stress testing tool for the SEL Chat Coach backend API. Simulates realistic user load and monitors admin endpoints for performance issues.
+## Features
+- ✅ **Configurable via environment variables**
+- ✅ **Multiple test scenarios**: User registration, conversation creation, concurrent messaging, admin monitoring
+- ✅ **Performance metrics**: Response times (min/avg/max/p95/p99), success rates, throughput
+- ✅ **Circuit breaker**: Auto-stop if error rate exceeds threshold
+- ✅ **HTML report generation**: Detailed performance reports with charts
+- ✅ **Production safety**: Confirmation prompts, dry-run mode, error rate limits
+## Quick Start
+### 1. Run Against Local Development Server
+```bash
+# Start dev server first
+npm run dev
+# In another terminal, run stress test (defaults to localhost:3000)
+npm run stress-test
+```
+### 2. Run Against Production
+```bash
+BASE_URL=https://taboola-cz-sel-chat-coach.hf.space \
+CONCURRENT_USERS=20 \
+DURATION_MINUTES=2 \
+npm run stress-test
+```
+### 3. Dry Run (Validate Configuration)
+```bash
+BASE_URL=https://taboola-cz-sel-chat-coach.hf.space \
+DRY_RUN=true \
+npm run stress-test
+```
+## Configuration
+All configuration is done via environment variables:
+| Variable | Description | Default |
+|----------|-------------|---------|
+| `BASE_URL` | API base URL | `http://localhost:3000` |
+| `BASIC_AUTH_PASSWORD` | Auth password | `cz-2025` |
+| `CONCURRENT_USERS` | Number of concurrent users | `50` |
+| `DURATION_MINUTES` | Test duration in minutes | `5` |
+| `SCENARIOS` | Comma-separated scenario list | All scenarios |
+| `DRY_RUN` | Validate config without executing | `false` |
+| `CLEANUP` | Delete test data after run | `false` |
+| `MAX_ERROR_RATE` | Circuit breaker threshold (0-1) | `0.1` (10%) |
+| `VERBOSE` | Enable verbose logging | `false` |
+## Scenarios
+### 1. User Registration
+- **What it does**: Registers N concurrent users
+- **Metrics**: Registration response time, success rate
+- **Duration**: ~10-30 seconds
+### 2. Conversation Creation
+- **What it does**: Each user creates 2 conversations with random student/coach combinations
+- **Metrics**: Creation response time, database writes
+- **Duration**: ~20-60 seconds
+### 3. Concurrent Messages (WARNING: Slow)
+- **What it does**: Sends 20 concurrent messages with LLM calls
+- **Metrics**: LLM response time (60-90s per message), streaming performance
+- **Duration**: ~60-90 seconds (due to LLM bottleneck)
+### 4. Admin Monitoring
+- **What it does**: Continuously polls admin endpoints (health, stats)
+- **Metrics**: Query performance under load, data consistency
+- **Duration**: Configured duration (default 5 minutes)
+- **Verification**: Checks if database counts match expected values
+## Usage Examples
+### Test Specific Scenarios
+```bash
+# Only run user registration and conversation creation
+SCENARIOS=user-registration,conversation-flow \
+npm run stress-test
+```
+### High Concurrency Test
+```bash
+# Test with 100 concurrent users
+CONCURRENT_USERS=100 \
+DURATION_MINUTES=3 \
+npm run stress-test
+```
+### Extended Monitoring
+```bash
+# Monitor admin endpoints for 10 minutes
+SCENARIOS=admin-monitoring \
+DURATION_MINUTES=10 \
+npm run stress-test
+```
+### Production Test (Conservative)
+```bash
+# Safe production test: 20 users, 2 minutes, skip LLM scenario
+BASE_URL=https://taboola-cz-sel-chat-coach.hf.space \
+CONCURRENT_USERS=20 \
+DURATION_MINUTES=2 \
+SCENARIOS=user-registration,conversation-flow,admin-monitoring \
+npm run stress-test
+```
+## Output
+### Console Output
+```
+🚀 Production Backend API Stress Test
+Configuration:
+  Base URL: https://taboola-cz-sel-chat-coach.hf.space
+  Concurrent Users: 50
+  Duration: 5 minutes
+  Scenarios: user-registration, conversation-flow, concurrent-messages, admin-monitoring
+⚠️  WARNING: Running against PRODUCTION environment!
+Continue? (y/n) y
+[User Registration] Registering 50 users...
+✓ [User Registration] Completed in 15.2s
+  Total Requests: 50
+  Success Rate: 100.0% (50/50)
+  Response Time:
+    min: 120ms
+    avg: 245ms
+    max: 890ms
+    p95: 450ms
+    p99: 720ms
+  Throughput: 3.29 req/s
+[Conversation Creation] Creating 100 conversations (2 per user)...
+✓ [Conversation Creation] Completed in 22.8s
+  Total Requests: 100
+  Success Rate: 100.0% (100/100)
+  ...
+📊 Report saved to: /path/to/stress-test-report.html
+Overall Statistics:
+  Total Requests: 270
+  Successful: 268
+  Failed: 2
+  Success Rate: 99.3%
+✓ Test PASSED
+```
+### HTML Report
+After the test completes, an HTML report is generated at `./stress-test-report.html`. Open it in a browser to view:
+- Detailed metrics for each scenario
+- Response time distributions
+- Error breakdowns
+- Performance charts
+## Interpreting Results
+### Success Criteria
+✅ **Good Performance:**
+- Success rate ≥ 95%
+- p95 response time < 1000ms (except LLM calls)
+- No timeout errors
+- Database counts match expectations
+⚠️ **Warning Signs:**
+- Success rate 80-95%
+- p95 response time 1000-3000ms
+- Occasional timeout errors
+- Database inconsistencies
+❌ **Poor Performance:**
+- Success rate < 80%
+- p95 response time > 3000ms
+- Frequent timeout errors
+- Database connection pool exhaustion
+### Common Issues
+**Connection timeouts:**
+- Check Supabase connection limits
+- Verify network connectivity
+- Reduce concurrent users
+**High error rates:**
+- Check server logs for details
+- Verify authentication credentials
+- Check rate limiting
+**Slow response times:**
+- Database performance issues
+- LLM API rate limits
+- Network latency
+## Safety Features
+### Production Safety
+1. **Confirmation Prompt**: Requires manual confirmation when running against production
+2. **Circuit Breaker**: Stops test if error rate exceeds 10% (configurable)
+3. **Dry Run Mode**: Validate configuration without sending requests
+4. **Max Concurrency Limit**: Hard limit of 1000 concurrent users
+### Cleanup
+By default, test data is NOT cleaned up (for debugging). To enable cleanup:
+```bash
+CLEANUP=true npm run stress-test
+```
+**Warning**: Cleanup is not yet implemented. Test users/conversations will persist in the database.
+## Troubleshooting
+### `MODULE_NOT_FOUND` Error
+Ensure you have `tsx` installed:
+```bash
+npm install
+```
+### Permission Denied
+Make index.ts executable:
+```bash
+chmod +x scripts/stress-test/index.ts
+```
+### Test Timeouts
+Increase the duration for LLM-heavy scenarios:
+```bash
+DURATION_MINUTES=10 npm run stress-test
+```
+## Architecture
+```
+scripts/stress-test/
+├── index.ts                    # Main orchestrator
+├── config.ts                   # Environment variable parsing
+├── metrics.ts                  # Metrics collection & HTML reports
+├── utils.ts                    # HTTP client, auth, helpers
+└── scenarios/
+    ├── user-registration.ts    # Scenario 1
+    ├── conversation-flow.ts    # Scenario 2
+    ├── concurrent-messages.ts  # Scenario 3
+    └── admin-monitoring.ts     # Scenario 4
+```
+## Development
+### Adding a New Scenario
+1. Create scenario file: `scenarios/my-scenario.ts`
+2. Implement scenario function
+3. Add scenario name to config.ts
+4. Import and call in index.ts
+### Customizing Metrics
+Edit `metrics.ts` to add new metrics or modify HTML report template.
+## Limitations
+- **LLM Rate Limits**: OpenAI has rate limits - concurrent message scenario is slow
+- **Database**: SQLite (local) has limited concurrency vs Supabase (production)
+- **Network Latency**: Production tests affected by HuggingFace → Supabase Singapore latency
+- **Cleanup**: Not yet implemented - test data persists
+## License
+Part of SEL Chat Coach project.

scripts/stress-test/config.ts ADDED Viewed

	@@ -0,0 +1,75 @@

+/**
+ * Stress Test Configuration
+ * Reads from environment variables with sensible defaults
+ */
+export interface StressTestConfig {
+  baseURL: string;
+  password: string;
+  concurrentUsers: number;
+  durationMinutes: number;
+  scenarios: string[];
+  dryRun: boolean;
+  cleanup: boolean;
+  maxErrorRate: number; // Circuit breaker threshold (0-1)
+  verbose: boolean;
+}
+export function loadConfig(): StressTestConfig {
+  const baseURL = process.env.BASE_URL || 'http://localhost:3000';
+  const password = process.env.BASIC_AUTH_PASSWORD || 'cz-2025';
+  const concurrentUsers = parseInt(process.env.CONCURRENT_USERS || '50', 10);
+  const durationMinutes = parseInt(process.env.DURATION_MINUTES || '5', 10);
+  const dryRun = process.env.DRY_RUN === 'true';
+  const cleanup = process.env.CLEANUP === 'true';
+  const maxErrorRate = parseFloat(process.env.MAX_ERROR_RATE || '0.1'); // 10% default
+  const verbose = process.env.VERBOSE === 'true';
+  // Parse scenarios from env or use all by default
+  const scenariosEnv = process.env.SCENARIOS || '';
+  const scenarios = scenariosEnv
+    ? scenariosEnv.split(',').map(s => s.trim())
+    : [
+        'user-registration',
+        'conversation-flow',
+        'concurrent-messages',
+        'admin-monitoring',
+      ];
+  // Validation
+  if (concurrentUsers < 1 || concurrentUsers > 1000) {
+    throw new Error('CONCURRENT_USERS must be between 1 and 1000');
+  }
+  if (durationMinutes < 1 || durationMinutes > 60) {
+    throw new Error('DURATION_MINUTES must be between 1 and 60');
+  }
+  if (maxErrorRate < 0 || maxErrorRate > 1) {
+    throw new Error('MAX_ERROR_RATE must be between 0 and 1');
+  }
+  return {
+    baseURL,
+    password,
+    concurrentUsers,
+    durationMinutes,
+    scenarios,
+    dryRun,
+    cleanup,
+    maxErrorRate,
+    verbose,
+  };
+}
+export const STUDENT_PERSONALITIES = [
+  'ruirui',   // 睿睿 - Elementary, Distractor
+  'xiaoxu',   // 小許 - Elementary, Blamer
+  'xiaoen',   // 小恩 - Junior high, Distractor
+  'xiaojie',  // 小婕 - Junior high, Pleaser
+  'ajie',     // 阿杰 - Junior high, Super-rational
+];
+export const COACH_TYPES = [
+  'satir',      // Empathetic
+];

scripts/stress-test/index.ts ADDED Viewed

	@@ -0,0 +1,182 @@

+#!/usr/bin/env tsx
+/**
+ * Production Backend API Stress Test
+ *
+ * Simulates realistic load against the production backend and monitors
+ * admin endpoints for performance issues.
+ *
+ * Usage:
+ *   BASE_URL=https://taboola-cz-sel-chat-coach.hf.space \
+ *   CONCURRENT_USERS=50 \
+ *   DURATION_MINUTES=5 \
+ *   npm run stress-test
+ */
+import * as readline from 'readline';
+import { loadConfig } from './config';
+import { MetricsCollector } from './metrics';
+import { userRegistrationScenario } from './scenarios/user-registration';
+import { conversationFlowScenario } from './scenarios/conversation-flow';
+import { concurrentMessagesScenario } from './scenarios/concurrent-messages';
+import { adminMonitoringScenario } from './scenarios/admin-monitoring';
+/**
+ * Prompt user for confirmation
+ */
+function promptConfirm(question: string): Promise<boolean> {
+  const rl = readline.createInterface({
+    input: process.stdin,
+    output: process.stdout,
+  });
+  return new Promise(resolve => {
+    rl.question(question + ' (y/n) ', answer => {
+      rl.close();
+      resolve(answer.toLowerCase() === 'y' || answer.toLowerCase() === 'yes');
+    });
+  });
+}
+/**
+ * Main stress test orchestrator
+ */
+async function main() {
+  console.log('🚀 Production Backend API Stress Test\n');
+  // Load configuration
+  const config = loadConfig();
+  console.log('Configuration:');
+  console.log(`  Base URL: ${config.baseURL}`);
+  console.log(`  Concurrent Users: ${config.concurrentUsers}`);
+  console.log(`  Duration: ${config.durationMinutes} minutes`);
+  console.log(`  Scenarios: ${config.scenarios.join(', ')}`);
+  console.log(`  Dry Run: ${config.dryRun ? 'Yes' : 'No'}`);
+  console.log(`  Cleanup: ${config.cleanup ? 'Yes' : 'No'}`);
+  console.log(`  Max Error Rate: ${(config.maxErrorRate * 100).toFixed(0)}%`);
+  console.log('');
+  // Confirmation for production
+  if (config.baseURL.includes('hf.space') || config.baseURL.includes('production')) {
+    console.log('⚠️  WARNING: Running against PRODUCTION environment!');
+    const confirmed = await promptConfirm('Continue?');
+    if (!confirmed) {
+      console.log('Aborted.');
+      process.exit(0);
+    }
+  }
+  if (config.dryRun) {
+    console.log('✓ Dry run mode - Configuration validated. No requests will be sent.');
+    process.exit(0);
+  }
+  const metrics = new MetricsCollector();
+  const startTime = Date.now();
+  let usernames: string[] = [];
+  let conversations: Array<{ id: string; username: string }> = [];
+  try {
+    // Scenario 1: User Registration
+    if (config.scenarios.includes('user-registration')) {
+      const result = await userRegistrationScenario(config, metrics);
+      usernames = result.usernames;
+      // Check error rate (circuit breaker)
+      const scenarioMetrics = metrics.getScenarioMetrics('User Registration');
+      if (scenarioMetrics && scenarioMetrics.errorRate > config.maxErrorRate) {
+        console.log(
+          `\n❌ Error rate (${(scenarioMetrics.errorRate * 100).toFixed(1)}%) exceeds threshold (${(config.maxErrorRate * 100).toFixed(0)}%). Stopping.`
+        );
+        process.exit(1);
+      }
+    }
+    // Scenario 2: Conversation Creation
+    if (config.scenarios.includes('conversation-flow') && usernames.length > 0) {
+      const result = await conversationFlowScenario(config, metrics, usernames);
+      conversations = result.conversations;
+      // Check error rate
+      const scenarioMetrics = metrics.getScenarioMetrics('Conversation Creation');
+      if (scenarioMetrics && scenarioMetrics.errorRate > config.maxErrorRate) {
+        console.log(
+          `\n❌ Error rate (${(scenarioMetrics.errorRate * 100).toFixed(1)}%) exceeds threshold (${(config.maxErrorRate * 100).toFixed(0)}%). Stopping.`
+        );
+        process.exit(1);
+      }
+    }
+    // Scenario 3: Concurrent Messages (WARNING: Slow due to LLM)
+    if (config.scenarios.includes('concurrent-messages') && conversations.length > 0) {
+      await concurrentMessagesScenario(config, metrics, conversations);
+      // Check error rate
+      const scenarioMetrics = metrics.getScenarioMetrics('Concurrent Messages');
+      if (scenarioMetrics && scenarioMetrics.errorRate > config.maxErrorRate) {
+        console.log(
+          `\n❌ Error rate (${(scenarioMetrics.errorRate * 100).toFixed(1)}%) exceeds threshold (${(config.maxErrorRate * 100).toFixed(0)}%). Stopping.`
+        );
+        process.exit(1);
+      }
+    }
+    // Scenario 4: Admin Monitoring
+    if (config.scenarios.includes('admin-monitoring')) {
+      const expectedUsers = usernames.length;
+      const expectedConversations = conversations.length;
+      const expectedMessages = 0; // Messages sent in scenario 3
+      await adminMonitoringScenario(
+        config,
+        metrics,
+        expectedUsers,
+        expectedConversations,
+        expectedMessages
+      );
+    }
+    // Summary
+    const totalDuration = Date.now() - startTime;
+    console.log('\n' + '='.repeat(60));
+    console.log(`✓ Stress test completed in ${(totalDuration / 1000 / 60).toFixed(1)} minutes`);
+    console.log('='.repeat(60));
+    // Generate HTML report
+    metrics.generateHTMLReport('./stress-test-report.html');
+    // Overall summary
+    const allMetrics = metrics.getAllMetrics();
+    const totalRequests = allMetrics.reduce((sum, m) => sum + m.totalRequests, 0);
+    const totalSuccessful = allMetrics.reduce((sum, m) => sum + m.successfulRequests, 0);
+    const totalFailed = allMetrics.reduce((sum, m) => sum + m.failedRequests, 0);
+    const overallSuccessRate =
+      totalRequests > 0 ? (totalSuccessful / totalRequests) * 100 : 0;
+    console.log(`\nOverall Statistics:`);
+    console.log(`  Total Requests: ${totalRequests}`);
+    console.log(`  Successful: ${totalSuccessful}`);
+    console.log(`  Failed: ${totalFailed}`);
+    console.log(`  Success Rate: ${overallSuccessRate.toFixed(1)}%`);
+    // Exit with error if overall success rate is too low
+    if (overallSuccessRate < (1 - config.maxErrorRate) * 100) {
+      console.log(`\n❌ Overall success rate too low. Test FAILED.`);
+      process.exit(1);
+    }
+    console.log(`\n✓ Test PASSED`);
+  } catch (error: any) {
+    console.error('\n❌ Stress test failed:', error.message);
+    if (config.verbose && error.stack) {
+      console.error(error.stack);
+    }
+    process.exit(1);
+  }
+}
+main().catch(error => {
+  console.error('Fatal error:', error);
+  process.exit(1);
+});

scripts/stress-test/metrics.ts ADDED Viewed

	@@ -0,0 +1,348 @@

+/**
+ * Metrics Collection and Reporting
+ * Tracks performance metrics and generates reports
+ */
+import { RequestTiming, percentile, formatDuration } from './utils';
+import * as fs from 'fs';
+import * as path from 'path';
+export interface ScenarioMetrics {
+  name: string;
+  startTime: number;
+  endTime: number;
+  duration: number;
+  totalRequests: number;
+  successfulRequests: number;
+  failedRequests: number;
+  successRate: number;
+  errorRate: number;
+  timings: {
+    min: number;
+    max: number;
+    avg: number;
+    p50: number;
+    p95: number;
+    p99: number;
+  };
+  requestsPerSecond: number;
+  errors: Array<{ message: string; count: number }>;
+}
+export class MetricsCollector {
+  private scenarios: Map<string, ScenarioMetrics> = new Map();
+  private currentScenario: string | null = null;
+  private scenarioStartTime: number = 0;
+  private timings: RequestTiming[] = [];
+  /**
+   * Start tracking a scenario
+   */
+  startScenario(name: string): void {
+    this.currentScenario = name;
+    this.scenarioStartTime = Date.now();
+    this.timings = [];
+  }
+  /**
+   * Add timing data
+   */
+  addTiming(timing: RequestTiming): void {
+    this.timings.push(timing);
+  }
+  /**
+   * Add multiple timings
+   */
+  addTimings(timings: RequestTiming[]): void {
+    this.timings.push(...timings);
+  }
+  /**
+   * End tracking current scenario
+   */
+  endScenario(): void {
+    if (!this.currentScenario) return;
+    const endTime = Date.now();
+    const duration = endTime - this.scenarioStartTime;
+    const successfulRequests = this.timings.filter(t => t.success).length;
+    const failedRequests = this.timings.filter(t => !t.success).length;
+    const totalRequests = this.timings.length;
+    const durations = this.timings.map(t => t.duration);
+    const metrics: ScenarioMetrics = {
+      name: this.currentScenario,
+      startTime: this.scenarioStartTime,
+      endTime,
+      duration,
+      totalRequests,
+      successfulRequests,
+      failedRequests,
+      successRate: totalRequests > 0 ? successfulRequests / totalRequests : 0,
+      errorRate: totalRequests > 0 ? failedRequests / totalRequests : 0,
+      timings: {
+        min: durations.length > 0 ? Math.min(...durations) : 0,
+        max: durations.length > 0 ? Math.max(...durations) : 0,
+        avg: durations.length > 0 ? durations.reduce((a, b) => a + b, 0) / durations.length : 0,
+        p50: percentile(durations, 0.5),
+        p95: percentile(durations, 0.95),
+        p99: percentile(durations, 0.99),
+      },
+      requestsPerSecond: duration > 0 ? (totalRequests / duration) * 1000 : 0,
+      errors: this.aggregateErrors(),
+    };
+    this.scenarios.set(this.currentScenario, metrics);
+    this.currentScenario = null;
+  }
+  /**
+   * Aggregate error messages
+   */
+  private aggregateErrors(): Array<{ message: string; count: number }> {
+    const errorCounts = new Map<string, number>();
+    for (const timing of this.timings) {
+      if (timing.error) {
+        const count = errorCounts.get(timing.error) || 0;
+        errorCounts.set(timing.error, count + 1);
+      }
+    }
+    return Array.from(errorCounts.entries())
+      .map(([message, count]) => ({ message, count }))
+      .sort((a, b) => b.count - a.count);
+  }
+  /**
+   * Get metrics for a scenario
+   */
+  getScenarioMetrics(name: string): ScenarioMetrics | undefined {
+    return this.scenarios.get(name);
+  }
+  /**
+   * Get all scenario metrics
+   */
+  getAllMetrics(): ScenarioMetrics[] {
+    return Array.from(this.scenarios.values());
+  }
+  /**
+   * Print scenario summary to console
+   */
+  printScenarioSummary(name: string): void {
+    const metrics = this.scenarios.get(name);
+    if (!metrics) return;
+    console.log(`\n✓ [${name}] Completed in ${formatDuration(metrics.duration)}`);
+    console.log(`  Total Requests: ${metrics.totalRequests}`);
+    console.log(
+      `  Success Rate: ${(metrics.successRate * 100).toFixed(1)}% (${metrics.successfulRequests}/${metrics.totalRequests})`
+    );
+    if (metrics.failedRequests > 0) {
+      console.log(`  ❌ Failed: ${metrics.failedRequests}`);
+      metrics.errors.slice(0, 3).forEach(err => {
+        console.log(`     - ${err.message} (${err.count}x)`);
+      });
+    }
+    console.log(`  Response Time:`);
+    console.log(`    min: ${formatDuration(metrics.timings.min)}`);
+    console.log(`    avg: ${formatDuration(metrics.timings.avg)}`);
+    console.log(`    max: ${formatDuration(metrics.timings.max)}`);
+    console.log(`    p95: ${formatDuration(metrics.timings.p95)}`);
+    console.log(`    p99: ${formatDuration(metrics.timings.p99)}`);
+    console.log(`  Throughput: ${metrics.requestsPerSecond.toFixed(2)} req/s`);
+  }
+  /**
+   * Generate HTML report
+   */
+  generateHTMLReport(outputPath: string): void {
+    const allMetrics = this.getAllMetrics();
+    const html = `
+<!DOCTYPE html>
+<html lang="en">
+<head>
+  <meta charset="UTF-8">
+  <meta name="viewport" content="width=device-width, initial-scale=1.0">
+  <title>Stress Test Report</title>
+  <style>
+    body {
+      font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, 'Helvetica Neue', Arial, sans-serif;
+      max-width: 1200px;
+      margin: 0 auto;
+      padding: 20px;
+      background: #f5f5f5;
+    }
+    h1 { color: #333; }
+    h2 { color: #555; margin-top: 30px; }
+    .summary {
+      background: white;
+      padding: 20px;
+      border-radius: 8px;
+      margin-bottom: 20px;
+      box-shadow: 0 2px 4px rgba(0,0,0,0.1);
+    }
+    .scenario {
+      background: white;
+      padding: 20px;
+      border-radius: 8px;
+      margin-bottom: 20px;
+      box-shadow: 0 2px 4px rgba(0,0,0,0.1);
+    }
+    .metric-row {
+      display: flex;
+      justify-content: space-between;
+      padding: 8px 0;
+      border-bottom: 1px solid #eee;
+    }
+    .metric-label { font-weight: 600; color: #666; }
+    .metric-value { color: #333; }
+    .success { color: #22c55e; }
+    .error { color: #ef4444; }
+    .warning { color: #f59e0b; }
+    table {
+      width: 100%;
+      border-collapse: collapse;
+      margin-top: 15px;
+    }
+    th, td {
+      text-align: left;
+      padding: 12px;
+      border-bottom: 1px solid #eee;
+    }
+    th {
+      background: #f9fafb;
+      font-weight: 600;
+      color: #666;
+    }
+    .timestamp {
+      color: #999;
+      font-size: 0.9em;
+    }
+  </style>
+</head>
+<body>
+  <h1>🚀 Stress Test Report</h1>
+  <div class="timestamp">Generated: ${new Date().toLocaleString()}</div>
+  <div class="summary">
+    <h2>📊 Overall Summary</h2>
+    ${allMetrics
+      .map(
+        m => `
+      <div class="metric-row">
+        <span class="metric-label">${m.name}</span>
+        <span class="metric-value">
+          ${m.successfulRequests}/${m.totalRequests} requests
+          (${(m.successRate * 100).toFixed(1)}% success)
+          in ${formatDuration(m.duration)}
+        </span>
+      </div>
+    `
+      )
+      .join('')}
+  </div>
+  ${allMetrics
+    .map(
+      m => `
+    <div class="scenario">
+      <h2>${m.name}</h2>
+      <table>
+        <tr>
+          <th>Metric</th>
+          <th>Value</th>
+        </tr>
+        <tr>
+          <td>Duration</td>
+          <td>${formatDuration(m.duration)}</td>
+        </tr>
+        <tr>
+          <td>Total Requests</td>
+          <td>${m.totalRequests}</td>
+        </tr>
+        <tr>
+          <td>Successful</td>
+          <td class="success">${m.successfulRequests}</td>
+        </tr>
+        <tr>
+          <td>Failed</td>
+          <td class="${m.failedRequests > 0 ? 'error' : ''}">${m.failedRequests}</td>
+        </tr>
+        <tr>
+          <td>Success Rate</td>
+          <td class="${m.successRate >= 0.95 ? 'success' : m.successRate >= 0.8 ? 'warning' : 'error'}">
+            ${(m.successRate * 100).toFixed(1)}%
+          </td>
+        </tr>
+        <tr>
+          <td>Requests per Second</td>
+          <td>${m.requestsPerSecond.toFixed(2)}</td>
+        </tr>
+        <tr>
+          <td>Response Time (min)</td>
+          <td>${formatDuration(m.timings.min)}</td>
+        </tr>
+        <tr>
+          <td>Response Time (avg)</td>
+          <td>${formatDuration(m.timings.avg)}</td>
+        </tr>
+        <tr>
+          <td>Response Time (max)</td>
+          <td>${formatDuration(m.timings.max)}</td>
+        </tr>
+        <tr>
+          <td>Response Time (p95)</td>
+          <td>${formatDuration(m.timings.p95)}</td>
+        </tr>
+        <tr>
+          <td>Response Time (p99)</td>
+          <td>${formatDuration(m.timings.p99)}</td>
+        </tr>
+      </table>
+      ${
+        m.errors.length > 0
+          ? `
+        <h3>Errors</h3>
+        <table>
+          <tr>
+            <th>Error Message</th>
+            <th>Count</th>
+          </tr>
+          ${m.errors
+            .map(
+              err => `
+            <tr>
+              <td>${err.message}</td>
+              <td>${err.count}</td>
+            </tr>
+          `
+            )
+            .join('')}
+        </table>
+      `
+          : ''
+      }
+    </div>
+  `
+    )
+    .join('')}
+</body>
+</html>
+    `;
+    fs.writeFileSync(outputPath, html, 'utf-8');
+    console.log(`\n📊 Report saved to: ${path.resolve(outputPath)}`);
+  }
+}

scripts/stress-test/scenarios/admin-monitoring.ts ADDED Viewed

	@@ -0,0 +1,138 @@

+/**
+ * Scenario 4: Admin Monitoring
+ * Simulates admin endpoints under load
+ */
+import { StressTestConfig } from '../config';
+import { HTTPClient, sleep } from '../utils';
+import { MetricsCollector } from '../metrics';
+export interface AdminMonitoringResult {
+  healthChecks: number;
+  statsChecks: number;
+  timings: any[];
+  finalHealth: any;
+  finalStats: any;
+}
+export async function adminMonitoringScenario(
+  config: StressTestConfig,
+  metrics: MetricsCollector,
+  expectedUsers: number,
+  expectedConversations: number,
+  expectedMessages: number
+): Promise<AdminMonitoringResult> {
+  const scenarioName = 'Admin Monitoring';
+  const durationSeconds = Math.min(config.durationMinutes * 60, 120); // Max 2 minutes for this scenario
+  const checkIntervalMs = 5000; // Check every 5 seconds
+  console.log(`\n[${scenarioName}] Monitoring admin endpoints for ${durationSeconds}s...`);
+  metrics.startScenario(scenarioName);
+  const client = new HTTPClient(config);
+  let healthChecks = 0;
+  let statsChecks = 0;
+  let finalHealth: any = null;
+  let finalStats: any = null;
+  const startTime = Date.now();
+  while (Date.now() - startTime < durationSeconds * 1000) {
+    const promises: Promise<void>[] = [];
+    // Health check
+    promises.push(
+      client
+        .get('/api/admin/health')
+        .then(data => {
+          healthChecks++;
+          finalHealth = data;
+          if (config.verbose) {
+            console.log(
+              `  Health: users=${data.tables.find((t: any) => t.name === 'users')?.rowCount || 0}, ` +
+                `conversations=${data.tables.find((t: any) => t.name === 'conversations')?.rowCount || 0}, ` +
+                `messages=${data.tables.find((t: any) => t.name === 'messages')?.rowCount || 0}`
+            );
+          }
+        })
+        .catch(error => {
+          console.error(`  ❌ Health check failed:`, error.message);
+        })
+    );
+    // Stats check
+    promises.push(
+      client
+        .get('/api/admin/stats')
+        .then(data => {
+          statsChecks++;
+          finalStats = data;
+          if (config.verbose) {
+            console.log(
+              `  Stats: users=${data.users?.total || 0}, ` +
+                `conversations=${data.conversations?.total || 0}, ` +
+                `messages=${data.messages?.total || 0}`
+            );
+          }
+        })
+        .catch(error => {
+          console.error(`  ❌ Stats check failed:`, error.message);
+        })
+    );
+    await Promise.all(promises);
+    await sleep(checkIntervalMs);
+  }
+  const timings = client.getTimings();
+  metrics.addTimings(timings);
+  metrics.endScenario();
+  metrics.printScenarioSummary(scenarioName);
+  // Verify admin data
+  console.log(`\n[${scenarioName}] Verifying admin endpoints...`);
+  if (finalHealth && finalStats) {
+    const healthUsers = finalHealth.tables.find((t: any) => t.name === 'users')?.rowCount || 0;
+    const healthConvos =
+      finalHealth.tables.find((t: any) => t.name === 'conversations')?.rowCount || 0;
+    const healthMessages = finalHealth.tables.find((t: any) => t.name === 'messages')?.rowCount || 0;
+    const statsUsers = finalStats.users?.total || 0;
+    const statsConvos = finalStats.conversations?.total || 0;
+    const statsMessages = finalStats.messages?.total || 0;
+    console.log(`  Health API:`);
+    console.log(`    Users: ${healthUsers} (expected: ${expectedUsers})`);
+    console.log(`    Conversations: ${healthConvos} (expected: ${expectedConversations})`);
+    console.log(`    Messages: ${healthMessages} (expected: ≥${expectedMessages})`);
+    console.log(`  Stats API:`);
+    console.log(`    Users: ${statsUsers}`);
+    console.log(`    Conversations: ${statsConvos}`);
+    console.log(`    Messages: ${statsMessages}`);
+    // Verify consistency
+    if (healthUsers === statsUsers && healthConvos === statsConvos) {
+      console.log(`  ✓ Health and Stats APIs are consistent`);
+    } else {
+      console.log(`  ⚠️  WARNING: Health and Stats APIs show different values`);
+    }
+    // Verify data matches expectations
+    if (
+      healthUsers >= expectedUsers &&
+      healthConvos >= expectedConversations &&
+      healthMessages >= expectedMessages
+    ) {
+      console.log(`  ✓ Database growth verified`);
+    } else {
+      console.log(`  ⚠️  WARNING: Database counts lower than expected`);
+    }
+  }
+  return { healthChecks, statsChecks, timings, finalHealth, finalStats };
+}

scripts/stress-test/scenarios/concurrent-messages.ts ADDED Viewed

	@@ -0,0 +1,81 @@

+/**
+ * Scenario 3: Concurrent Messages
+ * Simulates concurrent message sending (LLM calls)
+ * WARNING: This will be slow due to LLM response times (60-90s each)
+ */
+import { StressTestConfig } from '../config';
+import { HTTPClient, authHeader, generateMessage } from '../utils';
+import { MetricsCollector } from '../metrics';
+export interface ConcurrentMessagesResult {
+  messagesSent: number;
+  timings: any[];
+}
+export async function concurrentMessagesScenario(
+  config: StressTestConfig,
+  metrics: MetricsCollector,
+  conversations: Array<{ id: string; username: string }>
+): Promise<ConcurrentMessagesResult> {
+  const scenarioName = 'Concurrent Messages';
+  const maxConcurrent = Math.min(20, conversations.length); // Limit to 20 concurrent LLM calls
+  console.log(`\n[${scenarioName}] Sending ${maxConcurrent} concurrent messages...`);
+  console.log(`  ⚠️  WARNING: This will be slow (60-90s per LLM call)`);
+  metrics.startScenario(scenarioName);
+  const client = new HTTPClient(config);
+  const promises: Promise<void>[] = [];
+  let messagesSent = 0;
+  // Select random subset of conversations
+  const selectedConvos = conversations
+    .sort(() => Math.random() - 0.5)
+    .slice(0, maxConcurrent);
+  for (let i = 0; i < selectedConvos.length; i++) {
+    const { id: conversationId, username } = selectedConvos[i];
+    const messageText = generateMessage();
+    const promise = client
+      .post(
+        `/api/conversations/${conversationId}/message`,
+        {
+          messages: [
+            {
+              id: `stress-test-${Date.now()}-${i}`,
+              role: 'user',
+              parts: [{ type: 'text', text: messageText }],
+              metadata: { speaker: 'student' },
+            },
+          ],
+        },
+        {
+          Authorization: authHeader(username, config.password),
+          'Content-Type': 'application/json',
+        },
+        90000 // 90s timeout for LLM
+      )
+      .then(() => {
+        messagesSent++;
+        console.log(`  ✓ Message ${messagesSent}/${maxConcurrent} completed`);
+      })
+      .catch(error => {
+        console.error(`  ❌ Failed to send message ${i + 1}:`, error.message);
+      });
+    promises.push(promise);
+  }
+  await Promise.all(promises);
+  const timings = client.getTimings();
+  metrics.addTimings(timings);
+  metrics.endScenario();
+  metrics.printScenarioSummary(scenarioName);
+  return { messagesSent, timings };
+}

scripts/stress-test/scenarios/conversation-flow.ts ADDED Viewed

	@@ -0,0 +1,74 @@

+/**
+ * Scenario 2: Conversation Creation
+ * Simulates users creating multiple conversations
+ */
+import { StressTestConfig, STUDENT_PERSONALITIES, COACH_TYPES } from '../config';
+import { HTTPClient, authHeader, randomElement } from '../utils';
+import { MetricsCollector } from '../metrics';
+export interface ConversationFlowResult {
+  conversations: Array<{ id: string; username: string }>;
+  timings: any[];
+}
+export async function conversationFlowScenario(
+  config: StressTestConfig,
+  metrics: MetricsCollector,
+  usernames: string[]
+): Promise<ConversationFlowResult> {
+  const scenarioName = 'Conversation Creation';
+  const conversationsPerUser = 2;
+  const totalConversations = usernames.length * conversationsPerUser;
+  console.log(
+    `\n[${scenarioName}] Creating ${totalConversations} conversations (${conversationsPerUser} per user)...`
+  );
+  metrics.startScenario(scenarioName);
+  const client = new HTTPClient(config);
+  const conversations: Array<{ id: string; username: string }> = [];
+  const promises: Promise<void>[] = [];
+  for (const username of usernames) {
+    for (let i = 0; i < conversationsPerUser; i++) {
+      const studentPromptId = randomElement(STUDENT_PERSONALITIES);
+      const coachPromptId = randomElement(COACH_TYPES);
+      const promise = client
+        .post(
+          '/api/conversations/create',
+          {
+            studentPromptId,
+            coachPromptId,
+            include3ConversationSummary: false,
+          },
+          {
+            Authorization: authHeader(username, config.password),
+          }
+        )
+        .then(data => {
+          const conversationId = data.conversation?.id;
+          if (conversationId) {
+            conversations.push({ id: conversationId, username });
+          }
+        })
+        .catch(error => {
+          console.error(`  ❌ Failed to create conversation for ${username}:`, error.message);
+        });
+      promises.push(promise);
+    }
+  }
+  await Promise.all(promises);
+  const timings = client.getTimings();
+  metrics.addTimings(timings);
+  metrics.endScenario();
+  metrics.printScenarioSummary(scenarioName);
+  return { conversations, timings };
+}

scripts/stress-test/scenarios/user-registration.ts ADDED Viewed

	@@ -0,0 +1,56 @@

+/**
+ * Scenario 1: User Registration
+ * Simulates mass user registration
+ */
+import { StressTestConfig } from '../config';
+import { HTTPClient, generateUsername, authHeader } from '../utils';
+import { MetricsCollector } from '../metrics';
+export interface UserRegistrationResult {
+  usernames: string[];
+  timings: any[];
+}
+export async function userRegistrationScenario(
+  config: StressTestConfig,
+  metrics: MetricsCollector
+): Promise<UserRegistrationResult> {
+  const scenarioName = 'User Registration';
+  console.log(`\n[${scenarioName}] Registering ${config.concurrentUsers} users...`);
+  metrics.startScenario(scenarioName);
+  const client = new HTTPClient(config);
+  const usernames: string[] = [];
+  const promises: Promise<void>[] = [];
+  // Register users concurrently
+  for (let i = 0; i < config.concurrentUsers; i++) {
+    const username = generateUsername();
+    usernames.push(username);
+    const promise = client
+      .post('/api/auth/register', { username })
+      .then(() => {
+        if (config.verbose && i % 10 === 0) {
+          console.log(`  Registered ${i + 1}/${config.concurrentUsers} users`);
+        }
+      })
+      .catch(error => {
+        console.error(`  ❌ Failed to register ${username}:`, error.message);
+      });
+    promises.push(promise);
+  }
+  await Promise.all(promises);
+  const timings = client.getTimings();
+  metrics.addTimings(timings);
+  metrics.endScenario();
+  metrics.printScenarioSummary(scenarioName);
+  return { usernames, timings };
+}

scripts/stress-test/utils.ts ADDED Viewed

	@@ -0,0 +1,255 @@

+/**
+ * Utility Functions for Stress Testing
+ * HTTP client, authentication, random data generation
+ */
+import { StressTestConfig } from './config';
+export interface RequestTiming {
+  url: string;
+  method: string;
+  startTime: number;
+  endTime: number;
+  duration: number;
+  status: number;
+  success: boolean;
+  error?: string;
+}
+/**
+ * Generate Basic Auth header
+ */
+export function authHeader(username: string, password: string): string {
+  return 'Basic ' + Buffer.from(`${username}:${password}`).toString('base64');
+}
+/**
+ * Generate unique test username
+ */
+export function generateUsername(prefix: string = 'stress_test'): string {
+  const timestamp = Date.now();
+  const random = Math.random().toString(36).substring(7);
+  return `${prefix}_${timestamp}_${random}`;
+}
+/**
+ * Generate random Chinese message
+ */
+export function generateMessage(): string {
+  const messages = [
+    '你好',
+    '我需要帮助',
+    '这个作业好难',
+    '我不明白',
+    '可以再解释一次吗？',
+    '谢谢老师',
+    '我明白了',
+    '还有问题',
+  ];
+  return messages[Math.floor(Math.random() * messages.length)];
+}
+/**
+ * HTTP client with timing and retry logic
+ */
+export class HTTPClient {
+  private config: StressTestConfig;
+  private timings: RequestTiming[] = [];
+  constructor(config: StressTestConfig) {
+    this.config = config;
+  }
+  /**
+   * Make HTTP request with timing
+   */
+  async request(
+    method: string,
+    path: string,
+    options: {
+      headers?: Record<string, string>;
+      body?: any;
+      timeout?: number;
+      retries?: number;
+    } = {}
+  ): Promise<{ data: any; timing: RequestTiming }> {
+    const url = `${this.config.baseURL}${path}`;
+    const startTime = Date.now();
+    const timeout = options.timeout || 30000; // 30s default
+    const retries = options.retries || 0;
+    let lastError: Error | null = null;
+    for (let attempt = 0; attempt <= retries; attempt++) {
+      try {
+        const controller = new AbortController();
+        const timeoutId = setTimeout(() => controller.abort(), timeout);
+        const response = await fetch(url, {
+          method,
+          headers: {
+            'Content-Type': 'application/json',
+            ...options.headers,
+          },
+          body: options.body ? JSON.stringify(options.body) : undefined,
+          signal: controller.signal,
+        });
+        clearTimeout(timeoutId);
+        const endTime = Date.now();
+        const duration = endTime - startTime;
+        let data: any;
+        const contentType = response.headers.get('content-type');
+        if (contentType?.includes('application/json')) {
+          data = await response.json();
+        } else {
+          data = await response.text();
+        }
+        const timing: RequestTiming = {
+          url,
+          method,
+          startTime,
+          endTime,
+          duration,
+          status: response.status,
+          success: response.ok,
+          error: response.ok ? undefined : `HTTP ${response.status}`,
+        };
+        this.timings.push(timing);
+        if (!response.ok) {
+          throw new Error(`HTTP ${response.status}: ${JSON.stringify(data)}`);
+        }
+        return { data, timing };
+      } catch (error: any) {
+        lastError = error;
+        if (attempt < retries) {
+          // Exponential backoff
+          const delay = Math.min(1000 * Math.pow(2, attempt), 10000);
+          if (this.config.verbose) {
+            console.log(
+              `  Retry ${attempt + 1}/${retries} after ${delay}ms for ${method} ${path}`
+            );
+          }
+          await sleep(delay);
+          continue;
+        }
+        // Final attempt failed
+        const endTime = Date.now();
+        const duration = endTime - startTime;
+        const timing: RequestTiming = {
+          url,
+          method,
+          startTime,
+          endTime,
+          duration,
+          status: 0,
+          success: false,
+          error: error.message,
+        };
+        this.timings.push(timing);
+        throw error;
+      }
+    }
+    throw lastError!;
+  }
+  /**
+   * GET request
+   */
+  async get(path: string, headers?: Record<string, string>): Promise<any> {
+    const { data } = await this.request('GET', path, { headers });
+    return data;
+  }
+  /**
+   * POST request
+   */
+  async post(
+    path: string,
+    body: any,
+    headers?: Record<string, string>,
+    timeout?: number
+  ): Promise<any> {
+    const { data } = await this.request('POST', path, { headers, body, timeout });
+    return data;
+  }
+  /**
+   * PUT request
+   */
+  async put(
+    path: string,
+    body: any,
+    headers?: Record<string, string>
+  ): Promise<any> {
+    const { data } = await this.request('PUT', path, { headers, body });
+    return data;
+  }
+  /**
+   * DELETE request
+   */
+  async delete(path: string, headers?: Record<string, string>): Promise<any> {
+    const { data } = await this.request('DELETE', path, { headers });
+    return data;
+  }
+  /**
+   * Get all request timings
+   */
+  getTimings(): RequestTiming[] {
+    return this.timings;
+  }
+  /**
+   * Clear timings
+   */
+  clearTimings(): void {
+    this.timings = [];
+  }
+}
+/**
+ * Sleep utility
+ */
+export function sleep(ms: number): Promise<void> {
+  return new Promise(resolve => setTimeout(resolve, ms));
+}
+/**
+ * Calculate percentile
+ */
+export function percentile(values: number[], p: number): number {
+  if (values.length === 0) return 0;
+  const sorted = values.slice().sort((a, b) => a - b);
+  const index = Math.ceil(sorted.length * p) - 1;
+  return sorted[Math.max(0, index)];
+}
+/**
+ * Format duration in ms to human readable
+ */
+export function formatDuration(ms: number): string {
+  if (ms < 1000) return `${ms}ms`;
+  if (ms < 60000) return `${(ms / 1000).toFixed(1)}s`;
+  return `${(ms / 60000).toFixed(1)}m`;
+}
+/**
+ * Random element from array
+ */
+export function randomElement<T>(arr: T[]): T {
+  return arr[Math.floor(Math.random() * arr.length)];
+}