# Test Strategy and Coverage

This document outlines the updated test strategy for the AI Due Diligence application, focusing on end-to-end (e2e) and behavior-driven tests rather than implementation-specific ones.

## Test Philosophy

### Preferred Approach: End-to-End Testing

- **Focus**: User workflows and behavior from the user's perspective
- **Coverage**: Complete user journeys through the application
- **Benefits**: Tests real functionality, catches integration issues, and is easier to maintain

### Minimal Unit Testing

- **Scope**: Only core behavior that can't be tested end-to-end
- **Focus**: Public API behavior, not internal implementation
- **Examples**: Configuration validation, error classification, session management

### No Implementation-Specific Testing

- **Removed**: Tests that mock internal classes and methods
- **Avoided**: Tests of internal implementation details
- **Rationale**: Such tests break easily and don't provide value to users
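
As a concrete illustration of the preferred style, the sketch below drives the real UI and asserts only on what the user can see. It assumes the pytest-playwright `page` fixture; the URL, button label, and success message are hypothetical placeholders:

```python
# Minimal sketch of a behavior-focused e2e test using pytest-playwright.
# The URL, button label, and success text are illustrative assumptions.
from playwright.sync_api import Page, expect


def test_user_can_run_an_analysis(page: Page):
    page.goto("http://localhost:8501")  # assumed local app URL
    page.get_by_role("button", name="Run Analysis").click()
    # Assert on a user-visible outcome, never on internal method calls.
    expect(page.get_by_text("Analysis complete")).to_be_visible()
```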

## Test Structure

### E2E Tests (`tests/e2e/`)

#### Core Application Tests

- `test_app_startup.py`: Basic app loading, navigation, accessibility, responsiveness
- `test_document_processing.py`: Data room setup, document processing workflows
- `test_ai_analysis.py`: AI-powered analysis features, configuration, error handling
- `test_performance.py`: Performance characteristics, load handling, memory usage

#### User Journey Tests

- `test_complete_workflows.py`: Complete end-to-end workflows covering all major features
- `test_user_journeys.py`: Role-based user scenarios (M&A analyst, legal counsel, consultant)
- `test_robustness.py`: Edge cases, error conditions, recovery scenarios

### Integration Tests (`tests/integration/`)

- `test_critical_workflows.py`: Real workflow testing with minimal mocking
- `test_export_and_ui.py`: Export functionality and UI integration testing

### Unit Tests (`tests/unit/`)

- `test_config.py`: Configuration behavior and validation
- `test_session.py`: Session management behavior
- `test_error_handling.py`: Error classification and handling behavior
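
For instance, a behavior-focused unit test exercises only the public contract. The sketch below assumes a hypothetical `load_config` function in `app.config` that rejects an empty API key; the names and error message are illustrative:

```python
# Sketch of a behavior-focused unit test; `load_config` and its error
# contract are hypothetical stand-ins for the app's public config API.
import pytest

from app.config import load_config  # assumed module path


def test_empty_api_key_is_rejected():
    # Assert on the observable contract (an error), not on internals.
    with pytest.raises(ValueError, match="API key"):
        load_config({"api_key": ""})
```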

## Test Coverage by User Workflow

### ✅ Company Analysis Workflow
- Data room configuration and processing
- Comprehensive analysis generation
- Strategic assessment
- Export functionality
- Error handling (missing API key, invalid paths)

### ✅ Checklist Matching Workflow
- Checklist processing and matching
- Results display and navigation
- Export functionality

### ✅ Due Diligence Questions Workflow
- Question processing and analysis
- Answer generation and display
- Question-specific insights

### ✅ Q&A Session Workflow
- Interactive question input
- Document search integration
- Answer generation with citations
- Session persistence

### ✅ Knowledge Graph Workflow
- Graph generation and visualization
- Entity and relationship exploration
- Graph navigation and export

### ✅ Data Room Management
- Path configuration and validation
- Document processing and indexing
- Status reporting and progress tracking

### ✅ Export and Download
- Multiple export formats
- Content validation
- Download workflows

### ✅ Error Recovery and Robustness
- Invalid input handling
- Network interruption recovery
- Session timeout handling
- Concurrent operation management

## Test Execution

### Running E2E Tests

```bash
# All e2e tests
uv run pytest tests/e2e/ -v

# Specific test file
uv run pytest tests/e2e/test_complete_workflows.py -v

# Slow tests (with extended timeouts)
uv run pytest tests/e2e/ -m slow -v

# Skip slow tests
uv run pytest tests/e2e/ -m "not slow" -v
```
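
For `-m slow` / `-m "not slow"` selection to work without `PytestUnknownMarkWarning`, the `slow` marker must be registered. A minimal sketch for `tests/e2e/conftest.py` (the marker could equally be declared in `pytest.ini` or `pyproject.toml`):

```python
# tests/e2e/conftest.py — register the custom `slow` marker so pytest
# recognizes it during `-m` selection.
def pytest_configure(config):
    config.addinivalue_line(
        "markers", "slow: e2e tests with extended timeouts (full AI workflows)"
    )
```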

### Running Integration Tests

```bash
# Integration tests with real data
uv run pytest tests/integration/ -v
```

### Running Unit Tests

```bash
# Behavior-focused unit tests
uv run pytest tests/unit/ -v
```

### Running All Tests

```bash
# Complete test suite
uv run pytest tests/ -v

# With coverage
uv run pytest tests/ --cov=app --cov-report=html
```

## Test Configuration

### Browser Setup (E2E Tests)

- **Primary**: Chromium (headless by default)
- **Viewport**: 1280x720 (desktop), with mobile viewports also tested
- **Timeouts**: 30s default, 2min for slow AI operations
- **Configuration**: `playwright.config.py` and `tests/e2e/conftest.py`
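
With pytest-playwright, the viewport default above can be applied centrally by overriding the plugin's `browser_context_args` fixture, and the default timeout can be set on the `page` fixture. A sketch of what `tests/e2e/conftest.py` might contain, assuming that plugin is in use:

```python
# tests/e2e/conftest.py — sketch of shared browser defaults for e2e runs.
import pytest


@pytest.fixture(scope="session")
def browser_context_args(browser_context_args):
    # 1280x720 desktop viewport; pytest-playwright runs headless by default.
    return {**browser_context_args, "viewport": {"width": 1280, "height": 720}}


@pytest.fixture(autouse=True)
def default_timeout(page):
    # 30s default; slow AI tests raise this toward 2 minutes as needed.
    page.set_default_timeout(30_000)
```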

### Test Data

- **Sample VDR**: `data/vdrs/automated-services-transformation/`
- **Strategy Files**: `data/strategy/`
- **Checklists**: `data/checklist/`
- **Mock Data**: Generated in test fixtures when needed
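
A fixture can resolve these paths and skip cleanly when the sample data is absent from the environment running the tests; a minimal sketch (the fixture name is illustrative):

```python
# Sketch of a test-data fixture; skips rather than fails when the
# sample VDR is not checked out in the test environment.
from pathlib import Path

import pytest

SAMPLE_VDR = Path("data/vdrs/automated-services-transformation")


@pytest.fixture
def sample_vdr() -> Path:
    if not SAMPLE_VDR.exists():
        pytest.skip("sample VDR not available in this environment")
    return SAMPLE_VDR
```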

### Performance Considerations

- **Fast Tests**: Basic UI, navigation, configuration (< 10s)
- **Medium Tests**: Document processing, workflow simulation (< 60s)
- **Slow Tests**: Full AI workflows, comprehensive analysis (< 5min)
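
These tiers can be enforced with markers, with hard limits supplied by the pytest-timeout plugin (an assumption: the plugin is installed); a sketch:

```python
# Sketch of tiered time budgets; assumes the pytest-timeout plugin.
import pytest


@pytest.mark.timeout(10)
def test_basic_navigation(page):
    ...  # fast tier: basic UI, navigation, configuration


@pytest.mark.slow
@pytest.mark.timeout(300)
def test_comprehensive_analysis(page):
    ...  # slow tier: full AI workflow, up to 5 minutes
```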

## Continuous Integration

### Test Stages

- **Fast Tests**: Basic functionality and UI tests
- **Integration Tests**: Workflow testing with real data
- **Slow Tests**: Full e2e scenarios with AI operations

### Failure Handling

- **Screenshot Capture**: Automatic on test failures
- **Video Recording**: Available for debugging
- **Error Recovery**: Tests include recovery scenario validation
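
pytest-playwright exposes both behaviors from the command line (`--screenshot only-on-failure`, `--video retain-on-failure`). Where finer control is needed, a conftest hook can capture the screenshot directly; a sketch, assuming tests use the `page` fixture and a `test-results/` output directory (both names are assumptions):

```python
# tests/e2e/conftest.py — sketch: save a screenshot whenever a test fails.
# Assumes the failing test used the pytest-playwright `page` fixture.
import pytest


@pytest.hookimpl(hookwrapper=True)
def pytest_runtest_makereport(item, call):
    outcome = yield
    report = outcome.get_result()
    if report.when == "call" and report.failed:
        page = item.funcargs.get("page")
        if page is not None:
            page.screenshot(path=f"test-results/{item.name}.png")
```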

## Test Maintenance Guidelines

### Adding New Tests

- **Prefer E2E**: Add user workflow tests to `tests/e2e/`
- **User Perspective**: Write tests from the user's point of view
- **Real Scenarios**: Use realistic data and user interactions
- **Error Cases**: Include error and recovery scenarios

### Updating Tests

- **Behavior Focus**: Test what the feature does, not how it does it
- **User Impact**: Only test changes that affect the user experience
- **Minimal Mocking**: Use real components whenever possible
- **Clear Assertions**: Assert on user-visible outcomes

### Removing Tests

- **Implementation Details**: Remove tests of internal methods
- **Heavy Mocking**: Remove tests with excessive mocking
- **Redundant Coverage**: Remove duplicate coverage of the same user workflow

## Coverage Goals

### Primary Goals (Must Have)

- ✅ All main user workflows covered end-to-end
- ✅ Error conditions and recovery scenarios
- ✅ Cross-browser compatibility basics
- ✅ Performance characteristics within acceptable ranges

### Secondary Goals (Should Have)

- ✅ Accessibility testing basics
- ✅ Mobile/responsive design testing
- ✅ Different user role scenarios
- ✅ Session persistence and state management

### Nice to Have

- Load testing with multiple concurrent users
- Extended browser compatibility testing
- Detailed performance profiling
- Automated visual regression testing

## Monitoring and Metrics

### Key Metrics

- **Test Execution Time**: E2E tests < 10 minutes total
- **Test Reliability**: > 95% pass rate in CI
- **Coverage**: 100% of user workflows covered
- **Performance**: All performance tests pass within thresholds

### Success Criteria

- All user workflows testable without API keys
- Tests catch real user issues before deployment
- Test suite runs reliably in CI/CD pipeline
- New features automatically include e2e test coverage