# IntegraChat Testing Guide
This guide explains how to test all the new features and improvements in IntegraChat.
## Prerequisites
1. **Install Dependencies**
```bash
pip install -r requirements.txt
```
2. **Environment Setup**
- Create a `.env` file or set environment variables
- Optional: Set up Ollama for LLM testing
- Optional: Set up Supabase for production analytics
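A minimal `.env` sketch for the optional services; the variable names below are assumptions, so check the project's settings code for the exact keys:

```
# Hypothetical variable names -- verify against the project's config module
OLLAMA_BASE_URL=http://localhost:11434
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_KEY=your-service-role-key
```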
## Test Structure
### 1. Unit Tests
Run unit tests for individual components:
```bash
# Run all unit tests
pytest backend/tests/
# Run specific test files
pytest backend/tests/test_analytics_store.py -v
pytest backend/tests/test_enhanced_admin_rules.py -v
pytest backend/tests/test_api_endpoints.py -v
# Run with coverage
pytest backend/tests/ --cov=backend/api --cov-report=html
```
### 2. Integration Tests
Test API endpoints with the FastAPI test client:
```bash
pytest backend/tests/test_api_endpoints.py -v
```
**Note**: Some integration tests may fail if the MCP server or the LLM is not running; that is expected.
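One way to make such failures explicit rather than noisy is to probe for the service before running the test. The helper below is a hypothetical sketch, not part of the project; the port in the comment assumes Ollama's default:

```python
import socket

def service_running(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if something is accepting TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Hypothetical usage with pytest (Ollama's default port is 11434):
# @pytest.mark.skipif(not service_running("localhost", 11434),
#                     reason="Ollama not running")
print(service_running("localhost", 11434))
```

Tests guarded this way report as skipped instead of failed, which keeps the suite's signal clean on machines without the optional services.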
### 3. Manual Testing Scripts
Create test data and verify functionality manually:
#### A. Test Analytics Store
```bash
python -c "
from backend.api.storage.analytics_store import AnalyticsStore
import time
store = AnalyticsStore()
# Log tool usage
store.log_tool_usage('test_tenant', 'rag', latency_ms=150, tokens_used=500, success=True)
store.log_tool_usage('test_tenant', 'web', latency_ms=80, success=True)
# Log red-flag violation
store.log_redflag_violation(
'test_tenant',
'rule1',
'.*password.*',
'high',
'password123',
confidence=0.95
)
# Log RAG search
store.log_rag_search('test_tenant', 'test query', hits_count=5, avg_score=0.85, top_score=0.92)
# Log agent query
store.log_agent_query('test_tenant', 'test message', intent='rag', tools_used=['rag', 'llm'], total_tokens=1000)
# Get stats
print('Tool Usage:', store.get_tool_usage_stats('test_tenant'))
print('Violations:', store.get_redflag_violations('test_tenant'))
print('Activity:', store.get_activity_summary('test_tenant'))
print('RAG Quality:', store.get_rag_quality_metrics('test_tenant'))
"
```
#### B. Test Admin Rules with Regex
```bash
python -c "
from backend.api.storage.rules_store import RulesStore
import re
store = RulesStore()
# Add rule with regex pattern
store.add_rule(
'test_tenant',
'Block password queries',
pattern='.*password.*|.*pwd.*',
severity='high',
description='Blocks password-related queries'
)
# Get detailed rules
rules = store.get_rules_detailed('test_tenant')
print('Rules:', rules)
# Test regex matching
pattern = rules[0]['pattern']
regex = re.compile(pattern, re.IGNORECASE)
test_text = 'What is my password?'
match = regex.search(test_text)
print(f'Match for \"{test_text}\": {match is not None}')
"
```
## API Endpoint Testing
### Using curl
#### 1. Test Analytics Endpoints
```bash
# Overview
curl -X GET "http://localhost:8000/analytics/overview?days=30" \
-H "x-tenant-id: test_tenant"
# Tool Usage
curl -X GET "http://localhost:8000/analytics/tool-usage?days=30" \
-H "x-tenant-id: test_tenant"
# RAG Quality
curl -X GET "http://localhost:8000/analytics/rag-quality?days=30" \
-H "x-tenant-id: test_tenant"
# Red Flags
curl -X GET "http://localhost:8000/analytics/redflags?limit=50&days=30" \
-H "x-tenant-id: test_tenant"
```
#### 2. Test Admin Endpoints
```bash
# Add rule with regex and severity
curl -X POST "http://localhost:8000/admin/rules" \
-H "x-tenant-id: test_tenant" \
-H "Content-Type: application/json" \
-d '{
"rule": "Block password queries",
"pattern": ".*password.*",
"severity": "high",
"description": "Blocks password-related queries"
}'
# Get detailed rules
curl -X GET "http://localhost:8000/admin/rules?detailed=true" \
-H "x-tenant-id: test_tenant"
# Get violations
curl -X GET "http://localhost:8000/admin/violations?limit=50&days=30" \
-H "x-tenant-id: test_tenant"
# Get tool logs
curl -X GET "http://localhost:8000/admin/tools/logs?tool_name=rag&days=7" \
-H "x-tenant-id: test_tenant"
```
#### 3. Test Agent Endpoints
```bash
# Agent chat (normal)
curl -X POST "http://localhost:8000/agent/message" \
-H "Content-Type: application/json" \
-d '{
"tenant_id": "test_tenant",
"message": "What is the company policy?",
"temperature": 0.0
}'
# Agent debug
curl -X POST "http://localhost:8000/agent/debug" \
-H "Content-Type: application/json" \
-d '{
"tenant_id": "test_tenant",
"message": "What is the company policy?",
"temperature": 0.0
}'
# Agent plan
curl -X POST "http://localhost:8000/agent/plan" \
-H "Content-Type: application/json" \
-d '{
"tenant_id": "test_tenant",
"message": "What is the company policy?",
"temperature": 0.0
}'
```
### Using Python requests
Create a test script `test_api_manual.py`:
```python
import requests
import json
BASE_URL = "http://localhost:8000"
TENANT_ID = "test_tenant"
headers = {"x-tenant-id": TENANT_ID}
# Test analytics
print("Testing Analytics Endpoints...")
response = requests.get(f"{BASE_URL}/analytics/overview?days=30", headers=headers)
print(f"Overview: {response.status_code} - {json.dumps(response.json(), indent=2)}")
response = requests.get(f"{BASE_URL}/analytics/tool-usage?days=30", headers=headers)
print(f"Tool Usage: {response.status_code} - {json.dumps(response.json(), indent=2)}")
# Test admin rules
print("\nTesting Admin Rules...")
response = requests.post(
f"{BASE_URL}/admin/rules",
headers=headers,
json={
"rule": "Block password queries",
"pattern": ".*password.*",
"severity": "high"
}
)
print(f"Add Rule: {response.status_code} - {json.dumps(response.json(), indent=2)}")
response = requests.get(
f"{BASE_URL}/admin/rules?detailed=true",
headers=headers
)
print(f"Get Rules: {response.status_code} - {json.dumps(response.json(), indent=2)}")
# Test agent endpoints
print("\nTesting Agent Endpoints...")
response = requests.post(
f"{BASE_URL}/agent/plan",
json={
"tenant_id": TENANT_ID,
"message": "What is the company policy?",
"temperature": 0.0
}
)
print(f"Agent Plan: {response.status_code} - {json.dumps(response.json(), indent=2)}")
```
Run it:
```bash
python test_api_manual.py
```
## End-to-End Testing Workflow
### Step 1: Start Backend Services
```bash
# Terminal 1: Start FastAPI backend
cd backend/api
uvicorn main:app --port 8000 --reload
# Terminal 2: Start unified MCP server (rag/web/admin tools)
python backend/mcp_server/server.py
# Optional: Start Ollama for LLM
ollama serve
```
### Step 2: Generate Test Data
Run the analytics and rules tests to populate the database:
```bash
pytest backend/tests/test_analytics_store.py -v
pytest backend/tests/test_enhanced_admin_rules.py -v
```
### Step 3: Test Agent Flow
1. **Add some admin rules:**
```bash
curl -X POST "http://localhost:8000/admin/rules" \
-H "x-tenant-id: test_tenant" \
-H "Content-Type: application/json" \
-d '{"rule": "Block password queries", "pattern": ".*password.*", "severity": "high"}'
```
2. **Send a query that triggers red-flag:**
```bash
curl -X POST "http://localhost:8000/agent/message" \
-H "Content-Type: application/json" \
-d '{"tenant_id": "test_tenant", "message": "What is my password?"}'
```
3. **Check violations were logged:**
```bash
curl -X GET "http://localhost:8000/admin/violations" \
-H "x-tenant-id: test_tenant"
```
4. **Send normal queries and check analytics:**
```bash
curl -X POST "http://localhost:8000/agent/message" \
-H "Content-Type: application/json" \
-d '{"tenant_id": "test_tenant", "message": "What is the company policy?"}'
curl -X GET "http://localhost:8000/analytics/overview" \
-H "x-tenant-id: test_tenant"
```
5. **Use debug endpoint to see reasoning:**
```bash
curl -X POST "http://localhost:8000/agent/debug" \
-H "Content-Type: application/json" \
-d '{"tenant_id": "test_tenant", "message": "What is the company policy?"}'
```
### Step 4: Verify Database
Check that data is being stored:
```bash
# SQLite databases are in data/ directory
sqlite3 data/analytics.db "SELECT * FROM tool_usage_events LIMIT 10;"
sqlite3 data/analytics.db "SELECT * FROM redflag_violations LIMIT 10;"
sqlite3 data/admin_rules.db "SELECT * FROM admin_rules;"
```
## Testing Checklist
### Analytics Store
- [ ] Tool usage logging works
- [ ] Red-flag violations are logged
- [ ] RAG search events are logged with quality metrics
- [ ] Agent query events are logged
- [ ] Stats can be filtered by time
- [ ] Multiple tenants are isolated
### Admin Rules
- [ ] Rules can be added with regex patterns
- [ ] Severity levels work (low/medium/high/critical)
- [ ] Rules without pattern use rule text
- [ ] Disabled rules are not returned
- [ ] Multiple tenants are isolated
- [ ] Regex patterns actually match correctly
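The regex-matching item above can be sanity-checked standalone, without any project imports. The patterns and severities below mirror the examples used earlier in this guide; the `match_rules` helper is a sketch, not the project's real matcher:

```python
import re

# Patterns and severities mirror the examples used earlier in this guide.
rules = [
    {"pattern": r".*password.*|.*pwd.*", "severity": "high"},
    {"pattern": r".*ssn.*", "severity": "critical"},
]

def match_rules(text, rules):
    """Return the rules whose regex matches the text (case-insensitive)."""
    return [r for r in rules if re.search(r["pattern"], text, re.IGNORECASE)]

hits = match_rules("What is my password?", rules)
print([r["severity"] for r in hits])  # → ['high']
```

If a pattern that should match does not, check for unescaped regex metacharacters and confirm the store compiles patterns with `re.IGNORECASE`, as the earlier manual test does.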
### API Endpoints
- [ ] `/analytics/overview` returns correct data
- [ ] `/analytics/tool-usage` returns stats
- [ ] `/analytics/rag-quality` returns metrics
- [ ] `/admin/rules` accepts regex/severity
- [ ] `/admin/violations` returns violations
- [ ] `/admin/tools/logs` returns tool usage
- [ ] `/agent/debug` returns reasoning trace
- [ ] `/agent/plan` returns tool selection plan
- [ ] Missing tenant_id returns 400
### Integration
- [ ] Agent orchestrator logs to analytics
- [ ] Red-flag detector logs violations
- [ ] Tool calls are tracked
- [ ] Multi-step workflows are logged
- [ ] Errors are logged correctly
## Common Issues
### Database Not Found
- Ensure the `data/` directory exists and is writable
- The analytics store creates it automatically on first use
### Tests Fail Due to Missing Services
- Some tests require the MCP server or the LLM to be running
- Mock these services, or skip the affected tests when they are unavailable
- Unit tests should pass without any external services
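Mocking can be sketched with the standard library; the `answer` helper and the `complete` method below are hypothetical stand-ins, not the project's real LLM interface:

```python
from unittest.mock import MagicMock

# Hypothetical stand-in for code that calls an LLM client; the real
# module path and method names in this project will differ.
def answer(llm_client, question: str) -> str:
    return llm_client.complete(question)

mock_llm = MagicMock()
mock_llm.complete.return_value = "mocked answer"

# The code under test runs without any LLM service up.
assert answer(mock_llm, "What is the policy?") == "mocked answer"
mock_llm.complete.assert_called_once_with("What is the policy?")
```

In real tests, `unittest.mock.patch` can swap the client in at its import site so the orchestrator code is exercised unchanged.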
### Import Errors
- Ensure you are running from the project root
- Check that `backend/` is on the Python path (e.g. `PYTHONPATH=. pytest backend/tests/`)
- Install all dependencies: `pip install -r requirements.txt`
## Performance Testing
For large-scale testing:
```python
# Load test analytics store
from backend.api.storage.analytics_store import AnalyticsStore
import time
store = AnalyticsStore()
tenant_id = "load_test_tenant"
start = time.time()
for i in range(1000):
store.log_tool_usage(tenant_id, "rag", latency_ms=100 + i % 50)
elapsed = time.time() - start
print(f"Logged 1000 events in {elapsed:.2f}s ({1000/elapsed:.0f} events/sec)")
# Query performance
start = time.time()
stats = store.get_tool_usage_stats(tenant_id)
elapsed = time.time() - start
print(f"Query took {elapsed*1000:.2f}ms")
```
## Next Steps
1. **Add more test cases** for edge cases
2. **Set up CI/CD** to run tests automatically
3. **Add performance benchmarks** for analytics queries
4. **Create integration test suite** that spins up all services
5. **Add E2E tests** using Playwright or Selenium for frontend
For questions or issues, check the test files in `backend/tests/` or refer to the main README.md.