Testing Guide
Comprehensive testing documentation for the Real-Time Misinformation Heatmap system.
Overview
This guide covers all aspects of testing the misinformation heatmap system, including unit tests, integration tests, end-to-end tests, and performance testing.
Test Architecture
tests/
├── unit/                 # Unit tests for individual components
│   ├── test_nlp_analyzer.py
│   ├── test_satellite_client.py
│   ├── test_database.py
│   └── test_api.py
├── integration/          # Integration tests for component interactions
│   ├── test_ingestion_pipeline.py
│   ├── test_api_integration.py
│   └── test_deployment.py
├── e2e/                  # End-to-end tests for complete workflows
│   ├── test_end_to_end.py
│   ├── run_e2e_tests.sh
│   └── run_e2e_tests.ps1
├── performance/          # Performance and load tests
│   ├── test_performance.py
│   └── load_test.py
└── fixtures/             # Test data and fixtures
    ├── sample_events.json
    ├── test_states.geojson
    └── mock_responses/
Test Categories
1. Unit Tests
Unit tests verify individual components in isolation.
Running Unit Tests
# Run all unit tests
python -m pytest tests/unit/ -v
# Run specific test file
python -m pytest tests/unit/test_nlp_analyzer.py -v
# Run with coverage (requires the pytest-cov plugin)
python -m pytest tests/unit/ --cov=backend --cov-report=html
Unit Test Coverage
| Component | Test File | Coverage Target | Current Coverage |
|---|---|---|---|
| NLP Analyzer | test_nlp_analyzer.py | 90% | 85% |
| Satellite Client | test_satellite_client.py | 85% | 80% |
| Database Layer | test_database.py | 95% | 90% |
| API Endpoints | test_api.py | 90% | 88% |
| Configuration | test_config.py | 100% | 95% |
Key Unit Test Scenarios
NLP Analyzer Tests:
- Text preprocessing and cleaning
- Language detection accuracy
- Entity extraction for Indian locations
- Claim extraction from various text types
- Virality and reality score calculations
- Error handling for malformed input
Satellite Client Tests:
- API authentication and connection
- Coordinate validation for India boundaries
- Embedding extraction and similarity calculation
- Stub mode functionality
- Error handling and retry logic
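The coordinate-validation scenario lends itself to a table-driven test. The sketch below is only an illustration: `is_within_india_bounds` and the bounding-box limits are assumptions (a coarse rectangle around India), not the project's actual validator.

```python
# Hypothetical validator: a coarse bounding box, not the project's real logic.
# The limits are approximate and assumed for illustration only.
INDIA_LAT_RANGE = (6.0, 38.0)
INDIA_LON_RANGE = (68.0, 98.0)


def is_within_india_bounds(lat: float, lon: float) -> bool:
    """Return True when (lat, lon) falls inside the coarse bounding box."""
    lat_min, lat_max = INDIA_LAT_RANGE
    lon_min, lon_max = INDIA_LON_RANGE
    return lat_min <= lat <= lat_max and lon_min <= lon <= lon_max


# (latitude, longitude, expected result)
CASES = [
    (19.0760, 72.8777, True),    # Mumbai
    (12.9716, 77.5946, True),    # Bengaluru
    (51.5074, -0.1278, False),   # London
    (95.0, 77.0, False),         # latitude outside the valid -90..90 range
]


def test_coordinate_validation():
    # Each case carries its own coordinates in the failure message.
    for lat, lon, expected in CASES:
        assert is_within_india_bounds(lat, lon) is expected, (lat, lon)
```

Keeping the cases in one table makes it cheap to add boundary points as real validation rules become known.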
Database Tests:
- Connection management (SQLite and BigQuery)
- CRUD operations for events
- Query optimization and indexing
- Data validation and constraints
- Migration and schema updates
API Tests:
- Endpoint response formats
- Input validation and sanitization
- Authentication and authorization
- Rate limiting functionality
- Error response handling
2. Integration Tests
Integration tests verify component interactions and data flow.
Running Integration Tests
# Run all integration tests
python -m pytest tests/integration/ -v
# Run with specific markers
python -m pytest tests/integration/ -m "database" -v
python -m pytest tests/integration/ -m "api" -v
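The `-m` selections above only match tests that carry the corresponding markers. A minimal sketch of registering them in a `conftest.py` follows, assuming the marker names `database` and `api` from the commands above; the marker descriptions are illustrative.

```python
# conftest.py (sketch): register the custom markers used with `-m`.
MARKERS = [
    "database: tests that need a database connection",
    "api: tests that exercise the HTTP API",
]


def pytest_configure(config):
    """pytest hook: declare markers so `-m database` and `-m api` are recognized."""
    for marker in MARKERS:
        config.addinivalue_line("markers", marker)
```

Individual tests are then tagged with `@pytest.mark.database` or `@pytest.mark.api`.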
Integration Test Scenarios
Ingestion Pipeline Tests:
- End-to-end event processing flow
- NLP analysis integration with database storage
- Satellite validation integration
- Error propagation and handling
- Data consistency across components
API Integration Tests:
- Database query integration
- Real-time data updates
- Cross-component error handling
- Performance under concurrent requests
Deployment Tests:
- Local environment setup validation
- Cloud deployment verification
- Service connectivity and health checks
- Configuration validation across environments
3. End-to-End Tests
E2E tests verify complete user workflows from frontend to backend.
Running E2E Tests
# Local mode
cd tests/e2e
./run_e2e_tests.sh --mode local --verbose
# Cloud mode
./run_e2e_tests.sh --mode cloud --output results.json
# Windows PowerShell
.\run_e2e_tests.ps1 -Mode local -Verbose
E2E Test Scenarios
Complete User Workflows:
Data Ingestion to Visualization
- Submit test events via API
- Verify NLP processing and satellite validation
- Check heatmap data updates
- Validate frontend display
Interactive Map Usage
- Load frontend application
- Verify map initialization and state boundaries
- Test state click interactions
- Validate modal displays and data accuracy
Real-time Updates
- Inject new events during testing
- Verify real-time data refresh
- Check status indicators and notifications
Error Handling and Recovery
- Test invalid input handling
- Verify graceful degradation
- Check error message display
E2E Test Results Interpretation
Success Criteria:
- All API endpoints respond correctly (< 2s response time)
- Frontend loads and displays map within 10 seconds
- Data ingestion processes events within 30 seconds
- Real-time updates reflect within 60 seconds
- Error scenarios handled gracefully
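The timing criteria above can be checked mechanically once the E2E run has produced measurements. The following is a sketch under assumptions: the metric names and the `find_violations` helper are hypothetical, and only the limits come from the list above.

```python
# Limits in seconds, taken from the success criteria listed above.
LIMITS_SECONDS = {
    "api_response": 2.0,
    "frontend_load": 10.0,
    "ingestion": 30.0,
    "realtime_update": 60.0,
}


def find_violations(measured: dict) -> list:
    """Return one message per metric that is missing or over its limit."""
    problems = []
    for name, limit in LIMITS_SECONDS.items():
        value = measured.get(name)
        if value is None:
            problems.append(f"{name}: no measurement")
        elif value > limit:
            # Simplification: a value exactly at the limit counts as a pass.
            problems.append(f"{name}: {value}s exceeds {limit}s")
    return problems
```

An empty list means every timing criterion was met; a missing measurement is reported rather than silently ignored.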
Common Issues and Solutions:
| Issue | Symptoms | Solution |
|---|---|---|
| Service startup timeout | Tests fail with connection errors | Increase startup wait time, check service logs |
| WebDriver failures | Browser tests skip or fail | Install Chrome/Chromium, check headless mode |
| Data inconsistency | Heatmap shows incorrect data | Verify database state, check processing pipeline |
| Performance degradation | Slow response times | Check system resources, optimize queries |
4. Performance Tests
Performance tests verify system behavior under load and measure response times.
Running Performance Tests
# Basic performance test
python tests/performance/test_performance.py
# Load testing
python tests/performance/load_test.py --users 50 --duration 300
# Benchmark specific components
python scripts/performance_benchmark.py --component nlp
Performance Benchmarks
API Response Times (95th percentile):
- /health: < 100ms
- /heatmap: < 1000ms
- /region/{state}: < 800ms
- /ingest/test: < 2000ms
Processing Times:
- NLP Analysis: < 500ms per event
- Satellite Validation: < 1000ms per event
- Database Query (heatmap): < 200ms
- Frontend Load Time: < 5 seconds
Throughput Targets:
- API Requests: 100 requests/second
- Event Processing: 50 events/second
- Concurrent Users: 100 users
Load Testing Scenarios
Steady Load Test
- 50 concurrent users
- 5-minute duration
- Mixed API endpoints
Spike Test
- Ramp up to 200 users in 30 seconds
- Hold for 2 minutes
- Ramp down to 10 users
Stress Test
- Gradually increase load until failure
- Identify breaking point
- Measure recovery time
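The spike scenario can be described as a function from elapsed time to target user count, which a load generator can poll. This is a sketch only: the starting level of 10 users and the 30-second ramp-down are assumptions, since the scenario above specifies only the ramp-up, the hold, and the final level.

```python
def spike_users(t: float, base_users: int = 10, peak_users: int = 200,
                ramp_up: float = 30.0, hold: float = 120.0,
                ramp_down: float = 30.0) -> int:
    """Target number of concurrent users t seconds into the spike test."""
    if t <= 0:
        return base_users
    if t < ramp_up:
        # Linear ramp from the base level to the peak.
        return round(base_users + (peak_users - base_users) * t / ramp_up)
    if t < ramp_up + hold:
        return peak_users
    if t < ramp_up + hold + ramp_down:
        # Linear ramp from the peak back down to the base level.
        elapsed = t - ramp_up - hold
        return round(peak_users - (peak_users - base_users) * elapsed / ramp_down)
    return base_users
```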
Test Data Management
Test Fixtures
Sample Events (fixtures/sample_events.json):
[
  {
    "text": "Breaking news about infrastructure development in Maharashtra",
    "source": "test_news",
    "location": "Maharashtra",
    "category": "infrastructure",
    "expected_virality": 0.6,
    "expected_reality": 0.8
  },
  {
    "text": "False claims about Karnataka government policies",
    "source": "test_social",
    "location": "Karnataka",
    "category": "politics",
    "expected_virality": 0.8,
    "expected_reality": 0.2
  }
]
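A fixture file like this is worth validating before tests depend on it. The sketch below assumes that every event carries the six keys shown above and that both expected scores lie in the range 0 to 1 (suggested by the sample values, not confirmed by the source); `validate_events` and `load_events` are hypothetical helpers.

```python
import json

REQUIRED_KEYS = {"text", "source", "location", "category",
                 "expected_virality", "expected_reality"}


def validate_events(events: list) -> list:
    """Return a list of problems found in the fixture; empty means valid."""
    problems = []
    for index, event in enumerate(events):
        missing = REQUIRED_KEYS - event.keys()
        if missing:
            problems.append(f"event {index}: missing {sorted(missing)}")
            continue
        for key in ("expected_virality", "expected_reality"):
            score = event[key]
            # Assumption: scores are numbers in the closed interval [0, 1].
            if not isinstance(score, (int, float)) or not 0.0 <= score <= 1.0:
                problems.append(f"event {index}: {key}={score!r} not in [0, 1]")
    return problems


def load_events(path: str) -> list:
    """Read the fixture file from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

A typical use would be `assert validate_events(load_events("tests/fixtures/sample_events.json")) == []` in a sanity test that runs before the rest of the suite.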
Mock Responses:
- Satellite API responses with various similarity scores
- NLP model outputs for different text types
- Database query results for different scenarios
Test Database Setup
Local Testing:
# Initialize test database
python backend/init_db.py --mode local --test-data
# Reset test database
rm -f data/test_heatmap.db
python backend/init_db.py --mode local --test-data
Cloud Testing:
# Setup test dataset in BigQuery
./scripts/setup_bigquery.sh --project test-project --test-data
Continuous Integration
GitHub Actions Workflow
name: Test Suite
on: [push, pull_request]
jobs:
  unit-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Setup Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.8'
      - name: Install dependencies
        run: pip install -r backend/requirements.txt
      - name: Run unit tests
        run: python -m pytest tests/unit/ --cov=backend
  integration-tests:
    runs-on: ubuntu-latest
    needs: unit-tests
    steps:
      - uses: actions/checkout@v3
      - name: Setup services
        run: ./scripts/run_local.sh &
      - name: Run integration tests
        run: python -m pytest tests/integration/
  e2e-tests:
    runs-on: ubuntu-latest
    needs: integration-tests
    steps:
      - uses: actions/checkout@v3
      - name: Setup Chrome
        uses: browser-actions/setup-chrome@latest
      - name: Run E2E tests
        run: cd tests/e2e && ./run_e2e_tests.sh --mode local
Test Quality Gates
Pull Request Requirements:
- All unit tests pass (100%)
- Integration tests pass (100%)
- Code coverage > 80%
- No critical security vulnerabilities
- Performance benchmarks within acceptable range
Release Requirements:
- All test suites pass (100%)
- E2E tests pass in both local and cloud modes
- Performance tests meet SLA requirements
- Load tests demonstrate system stability
Test Environment Setup
Local Development Testing
Prerequisites:
# Install Python dependencies
pip install -r backend/requirements.txt
pip install -r tests/e2e/requirements.txt
# Install Chrome for Selenium tests
# Ubuntu/Debian:
sudo apt-get install google-chrome-stable
# macOS:
brew install --cask google-chrome
Environment Variables:
export MODE=local
export PYTHONPATH=backend:$PYTHONPATH
export LOG_LEVEL=DEBUG
Start Services:
./scripts/run_local.sh
Cloud Testing Environment
Setup GCP Project:
gcloud projects create test-misinformation-heatmap
gcloud config set project test-misinformation-heatmap
Deploy Test Infrastructure:
./scripts/setup_bigquery.sh --project test-misinformation-heatmap
./scripts/pubsub_setup.sh --project test-misinformation-heatmap
Deploy Application:
./scripts/deploy_cloudrun.sh --project test-misinformation-heatmap
Troubleshooting Test Issues
Common Test Failures
1. Database Connection Errors
Error: sqlite3.OperationalError: database is locked
Solution: Ensure no other processes are using the test database, or use a unique database file for each test run.
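One way to give each test its own database file is pytest's built-in `tmp_path` fixture, which supplies a fresh directory per test. The sketch below assumes a minimal, hypothetical `events` table; the real schema comes from `backend/init_db.py` and is not shown in this guide.

```python
import sqlite3
from pathlib import Path


def open_test_db(directory: Path) -> sqlite3.Connection:
    """Create a fresh SQLite file inside `directory` with a minimal events table."""
    connection = sqlite3.connect(directory / "test_heatmap.db")
    # Hypothetical schema, for illustration only.
    connection.execute(
        "CREATE TABLE events (id INTEGER PRIMARY KEY, state TEXT NOT NULL)"
    )
    return connection


def test_insert_event(tmp_path):
    # `tmp_path` is pytest's built-in fixture: a new empty directory per test,
    # so no two tests ever share (or lock) the same database file.
    connection = open_test_db(tmp_path)
    try:
        with connection:  # commits on success, rolls back on error
            connection.execute(
                "INSERT INTO events (state) VALUES (?)", ("Maharashtra",)
            )
        count = connection.execute("SELECT COUNT(*) FROM events").fetchone()[0]
        assert count == 1
    finally:
        connection.close()
```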
2. Selenium WebDriver Issues
Error: selenium.common.exceptions.WebDriverException: chrome not reachable
Solution: Install Chrome/Chromium, update ChromeDriver, or run tests in headless mode.
3. API Timeout Errors
Error: requests.exceptions.ConnectTimeout: HTTPSConnectionPool
Solution: Increase timeout values, check service startup, verify network connectivity.
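Rather than raising fixed timeouts, a test run can poll the service until it reports healthy. The sketch below uses only the standard library; `http_ok` and `wait_until_ready` are hypothetical helpers, and the URL in the usage note (including port 8000) is an assumption.

```python
import time
import urllib.error
import urllib.request


def http_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


def wait_until_ready(probe, max_wait: float = 60.0, interval: float = 1.0,
                     sleep=time.sleep, clock=time.monotonic) -> bool:
    """Call `probe()` until it returns True or `max_wait` seconds pass."""
    deadline = clock() + max_wait
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

For example, `wait_until_ready(lambda: http_ok("http://localhost:8000/health"))` would block until the health endpoint answers or a minute has passed. The `sleep` and `clock` parameters exist so the helper itself can be tested without real delays.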
4. Memory Issues During Testing
Error: MemoryError: Unable to allocate array
Solution: Reduce test data size, increase system memory, or run tests in smaller batches.
Debug Mode Testing
Enable Debug Logging:
export LOG_LEVEL=DEBUG
python -m pytest tests/ -v -s --log-cli-level=DEBUG
Run Single Test with Debugging:
python -m pytest tests/unit/test_nlp_analyzer.py::test_claim_extraction -v -s --pdb
Profile Test Performance (requires the pytest-profiling plugin):
python -m pytest tests/ --profile --profile-svg
Test Metrics and Reporting
Coverage Reports
Generate HTML Coverage Report:
python -m pytest tests/ --cov=backend --cov-report=html
open htmlcov/index.html  # macOS; on Linux use: xdg-open htmlcov/index.html
Coverage Targets:
- Overall: > 80%
- Critical components (NLP, Database): > 90%
- API endpoints: > 85%
- Configuration and utilities: > 75%
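Per-component targets like these can be enforced from coverage.py's JSON report (produced by `coverage json`). The sketch below assumes the report's `files` → `summary` → `covered_lines` / `num_statements` layout and uses hypothetical path prefixes; the project's real directory layout under `backend/` is not shown in this guide.

```python
def coverage_for_prefix(report: dict, prefix: str) -> float:
    """Line coverage (percent) over all files whose path starts with `prefix`."""
    covered = statements = 0
    for path, data in report["files"].items():
        if path.startswith(prefix):
            covered += data["summary"]["covered_lines"]
            statements += data["summary"]["num_statements"]
    # No matching statements: nothing to cover, so report 100%.
    return 100.0 * covered / statements if statements else 100.0


def components_below_target(report: dict, targets: dict) -> dict:
    """Map each prefix that misses its target to its measured coverage."""
    failing = {}
    for prefix, minimum in targets.items():
        measured = coverage_for_prefix(report, prefix)
        if measured <= minimum:  # the targets above are strict ("> 90%")
            failing[prefix] = measured
    return failing
```

A CI step could load the report with `json.load`, call `components_below_target` with the thresholds above, and fail the build when the result is non-empty.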
Test Execution Reports
JUnit XML Report:
python -m pytest tests/ --junitxml=test-results.xml
HTML Test Report (requires the pytest-html plugin):
python -m pytest tests/ --html=test-report.html --self-contained-html
Performance Metrics
Response Time Percentiles:
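- P50 (median): Half of all requests complete within this time; reflects typical usage
- P95: 95% of requests complete within this time; the benchmark targets in this guide are stated at this percentile
- P99: 99% of requests complete within this time; captures near-worst-case behavior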
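For reference, these percentiles can be computed from raw latency samples with the nearest-rank method. This is a sketch of one common definition; the project's performance scripts may use a different interpolation.

```python
import math


def percentile(samples, pct: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def summarize(latencies_ms) -> dict:
    """Return the P50, P95, and P99 of the given latencies."""
    return {f"p{p}": percentile(latencies_ms, p) for p in (50, 95, 99)}
```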
Resource Usage:
- CPU utilization during tests
- Memory consumption patterns
- Database query performance
- Network I/O metrics
Best Practices
Writing Effective Tests
Test Naming Convention:
def test_should_extract_claims_when_given_valid_text():
    # Test implementation
    pass
Arrange-Act-Assert Pattern:
def test_nlp_analysis():
    # Arrange
    analyzer = NLPAnalyzer()
    text = "Sample misinformation text"
    # Act
    result = analyzer.analyze(text)
    # Assert
    assert result.claims is not None
    assert len(result.claims) > 0
Use Fixtures for Common Setup:
import pytest

@pytest.fixture
def sample_event():
    return {
        "text": "Test event text",
        "source": "test",
        "location": "Maharashtra"
    }
Mock External Dependencies:
from unittest.mock import patch

@patch('backend.satellite_client.requests.get')
def test_satellite_validation(mock_get):
    # Replace the real HTTP call with a canned JSON response
    mock_get.return_value.json.return_value = {"similarity": 0.8}
    # Test implementation
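To make the mocking pattern concrete, here is a self-contained sketch that also exercises retry logic. `fetch_similarity` is a hypothetical helper that receives its HTTP function as an argument, so the test needs no network and no patching; the project's actual satellite client may be structured differently.

```python
from unittest.mock import Mock


def fetch_similarity(http_get, url: str, attempts: int = 3) -> float:
    """Hypothetical helper: call `http_get(url)`, retrying on ConnectionError."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return http_get(url).json()["similarity"]
        except ConnectionError as error:
            last_error = error
    raise last_error


def test_retries_then_succeeds():
    # Arrange: the first call fails, the second returns a canned response.
    response = Mock()
    response.json.return_value = {"similarity": 0.8}
    http_get = Mock(side_effect=[ConnectionError("boom"), response])

    # Act
    result = fetch_similarity(http_get, "https://example.invalid/embed")

    # Assert
    assert result == 0.8
    assert http_get.call_count == 2
```

Passing the HTTP function in as a parameter is one design choice; patching `requests.get` as shown above achieves the same isolation when the code under test cannot be changed.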
Test Maintenance
Regular Test Review:
- Remove obsolete tests
- Update test data to reflect current requirements
- Refactor duplicated test code
Test Data Management:
- Keep test data minimal and focused
- Use factories for generating test objects
- Clean up test data after execution
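A factory for test objects can be as small as a function with sensible defaults and keyword overrides. In this sketch, `make_event` is a hypothetical helper and the default field values are borrowed from the sample fixtures in this guide.

```python
import itertools

# Counter that gives every generated event a distinct text.
_event_ids = itertools.count(1)


def make_event(**overrides) -> dict:
    """Build a valid test event; keyword arguments override the defaults."""
    number = next(_event_ids)
    event = {
        "text": f"Test event text {number}",
        "source": "test",
        "location": "Maharashtra",
        "category": "infrastructure",
    }
    # Reject misspelled field names instead of silently adding new keys.
    unknown = overrides.keys() - event.keys()
    if unknown:
        raise KeyError(f"unknown event fields: {sorted(unknown)}")
    event.update(overrides)
    return event
```

A test that only cares about location can then write `make_event(location="Karnataka")` and ignore every other field.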
Performance Monitoring:
- Track test execution times
- Identify and optimize slow tests
- Parallelize independent tests
Conclusion
This guide covers unit, integration, end-to-end, and performance testing for the misinformation heatmap system. Running these suites regularly helps keep the system reliable, performant, and maintainable.
For questions or issues with testing, please refer to the troubleshooting section or contact the development team.