
Test Strategy and Coverage

This document outlines the updated test strategy for the AI Due Diligence application, which favors end-to-end (e2e), behavior-driven testing over implementation-specific tests.

Test Philosophy

Preferred Approach: End-to-End Testing

  • Focus: User workflows and behavior from the user's perspective
  • Coverage: Complete user journeys through the application
  • Benefits: Tests real functionality, catches integration issues, more maintainable

Minimal Unit Testing

  • Scope: Only for core behavior that can't be tested end-to-end
  • Focus: Public API behavior, not internal implementation
  • Examples: Configuration validation, error classification, session management

No Implementation-Specific Testing

  • Removed: Tests that mock internal classes and methods
  • Avoided: Testing internal implementation details
  • Rationale: Such tests break easily and don't provide value to users
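To make the distinction concrete, here is a minimal sketch. The helper name `classify_error` is hypothetical, not the app's actual API; the point is that a behavior-focused test asserts only on the public contract, so it survives internal refactors where a mock-heavy test of internals would break.

```python
# Hypothetical helper standing in for any public function under test;
# the real app's error-classification API may differ.
def classify_error(exc: Exception) -> str:
    """Map an exception to a user-facing category."""
    if isinstance(exc, FileNotFoundError):
        return "missing_path"
    if isinstance(exc, PermissionError):
        return "auth"
    return "unknown"

# Behavior-focused: asserts on the observable result only. It keeps passing
# no matter how classify_error is restructured internally.
def test_missing_data_room_is_classified():
    assert classify_error(FileNotFoundError("data/vdrs/missing")) == "missing_path"
```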

Test Structure

E2E Tests (tests/e2e/)

Core Application Tests

  • test_app_startup.py: Basic app loading, navigation, accessibility, responsiveness
  • test_document_processing.py: Data room setup, document processing workflows
  • test_ai_analysis.py: AI-powered analysis features, configuration, error handling
  • test_performance.py: Performance characteristics, load handling, memory usage

User Journey Tests

  • test_complete_workflows.py: Complete end-to-end workflows covering all major features
  • test_user_journeys.py: Role-based user scenarios (M&A analyst, legal counsel, consultant)
  • test_robustness.py: Edge cases, error conditions, recovery scenarios

Integration Tests (tests/integration/)

  • test_critical_workflows.py: Real workflow testing with minimal mocking
  • test_export_and_ui.py: Export functionality and UI integration testing

Unit Tests (tests/unit/)

  • test_config.py: Configuration behavior and validation
  • test_session.py: Session management behavior
  • test_error_handling.py: Error classification and handling behavior
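A behavior-focused unit test sketch in the spirit of test_config.py. `AppConfig` here is a hypothetical stand-in for the app's real configuration object; the test exercises only what validation reports, not how it is computed.

```python
from dataclasses import dataclass

# Hypothetical configuration object for illustration only.
@dataclass
class AppConfig:
    api_key: str = ""
    data_room_path: str = ""

    def problems(self) -> list[str]:
        """Return user-facing validation problems (empty list means valid)."""
        found = []
        if not self.api_key:
            found.append("missing API key")
        if not self.data_room_path:
            found.append("missing data room path")
        return found

def test_empty_config_reports_both_problems():
    assert len(AppConfig().problems()) == 2
```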

Test Coverage by User Workflow

✅ Company Analysis Workflow

  • Data room configuration and processing
  • Comprehensive analysis generation
  • Strategic assessment
  • Export functionality
  • Error handling (missing API key, invalid paths)

✅ Checklist Matching Workflow

  • Checklist processing and matching
  • Results display and navigation
  • Export functionality

✅ Due Diligence Questions Workflow

  • Question processing and analysis
  • Answer generation and display
  • Question-specific insights

✅ Q&A Session Workflow

  • Interactive question input
  • Document search integration
  • Answer generation with citations
  • Session persistence

✅ Knowledge Graph Workflow

  • Graph generation and visualization
  • Entity and relationship exploration
  • Graph navigation and export

✅ Data Room Management

  • Path configuration and validation
  • Document processing and indexing
  • Status reporting and progress tracking

✅ Export and Download

  • Multiple export formats
  • Content validation
  • Download workflows

✅ Error Recovery and Robustness

  • Invalid input handling
  • Network interruption recovery
  • Session timeout handling
  • Concurrent operation management

Test Execution

Running E2E Tests

```shell
# All e2e tests
uv run pytest tests/e2e/ -v

# Specific test file
uv run pytest tests/e2e/test_complete_workflows.py -v

# Slow tests (with extended timeouts)
uv run pytest tests/e2e/ -m slow -v

# Skip slow tests
uv run pytest tests/e2e/ -m "not slow" -v
```
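The `-m slow` / `-m "not slow"` selections above rely on the `slow` marker being registered. If it is not already declared in pytest.ini or pyproject.toml, a minimal conftest.py sketch would be:

```python
# tests/conftest.py sketch: register the custom marker so pytest does not
# emit PytestUnknownMarkWarning and `-m slow` selects reliably.
def pytest_configure(config):
    config.addinivalue_line(
        "markers", "slow: long-running AI workflow tests (extended timeouts)"
    )
```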

Running Integration Tests

```shell
# Integration tests with real data
uv run pytest tests/integration/ -v
```

Running Unit Tests

```shell
# Behavior-focused unit tests
uv run pytest tests/unit/ -v
```

Running All Tests

```shell
# Complete test suite
uv run pytest tests/ -v

# With coverage
uv run pytest tests/ --cov=app --cov-report=html
```

Test Configuration

Browser Setup (E2E Tests)

  • Primary: Chromium (headless by default)
  • Viewport: 1280x720 (desktop), with mobile testing
  • Timeouts: 30s default, 2min for slow AI operations
  • Configuration: playwright.config.py and tests/e2e/conftest.py
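The viewport above can be pinned in conftest.py using pytest-playwright's standard override convention; the fixture name `browser_context_args` is pytest-playwright's, and the values are the ones documented here.

```python
# tests/e2e/conftest.py sketch: force the desktop viewport for all contexts.
import pytest

@pytest.fixture(scope="session")
def browser_context_args(browser_context_args):
    # Merge pytest-playwright's defaults with the 1280x720 desktop viewport.
    return {
        **browser_context_args,
        "viewport": {"width": 1280, "height": 720},
    }
```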

Test Data

  • Sample VDR: data/vdrs/automated-services-transformation/
  • Strategy Files: data/strategy/
  • Checklists: data/checklist/
  • Mock Data: Generated in test fixtures when needed
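A sketch of the kind of fixture helper that generates mock data; the directory and file names are illustrative only, not the sample VDR's actual layout.

```python
# Build a throwaway data room for tests that must not touch the real sample VDR.
import tempfile
from pathlib import Path

def make_mock_data_room() -> Path:
    root = Path(tempfile.mkdtemp(prefix="mock-vdr-"))
    contracts = root / "contracts"
    contracts.mkdir()
    (contracts / "msa.txt").write_text("Master Services Agreement (mock)")
    return root
```

In pytest this would typically be wrapped as a fixture built on `tmp_path` so cleanup is automatic.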

Performance Considerations

  • Fast Tests: Basic UI, navigation, configuration (< 10s)
  • Medium Tests: Document processing, workflow simulation (< 60s)
  • Slow Tests: Full AI workflows, comprehensive analysis (< 5min)

Continuous Integration

Test Stages

  1. Fast Tests: Basic functionality and UI tests
  2. Integration Tests: Workflow testing with real data
  3. Slow Tests: Full e2e scenarios with AI operations

Failure Handling

  • Screenshot Capture: Automatic on test failures
  • Video Recording: Available for debugging
  • Error Recovery: Tests include recovery scenario validation
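Screenshot capture on failure can be wired up with a standard pytest hookwrapper. This is a sketch: the `page` fixture name follows pytest-playwright, and the output path is illustrative.

```python
# conftest.py sketch: save a screenshot whenever an e2e test's call phase fails.
import pytest

@pytest.hookimpl(hookwrapper=True)
def pytest_runtest_makereport(item, call):
    outcome = yield
    report = outcome.get_result()
    if report.when == "call" and report.failed:
        page = item.funcargs.get("page")  # present only for e2e tests
        if page is not None:
            page.screenshot(path=f"failures/{item.name}.png")
```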

Test Maintenance Guidelines

Adding New Tests

  1. Prefer E2E: Add user workflow tests to tests/e2e/
  2. User Perspective: Write tests from the user's point of view
  3. Real Scenarios: Use realistic data and user interactions
  4. Error Cases: Include error and recovery scenarios

Updating Tests

  1. Behavior Focus: Test what the feature does, not how it does it
  2. User Impact: Only test changes that affect user experience
  3. Minimal Mocking: Use real components whenever possible
  4. Clear Assertions: Assert on user-visible outcomes

Removing Tests

  1. Implementation Details: Remove tests of internal methods
  2. Heavy Mocking: Remove tests with excessive mocking
  3. Redundant Coverage: Remove duplicate coverage of same user workflow

Coverage Goals

Primary Goals (Must Have)

  • ✅ All main user workflows covered end-to-end
  • ✅ Error conditions and recovery scenarios
  • ✅ Cross-browser compatibility basics
  • ✅ Performance characteristics within acceptable ranges

Secondary Goals (Should Have)

  • ✅ Accessibility testing basics
  • ✅ Mobile/responsive design testing
  • ✅ Different user role scenarios
  • ✅ Session persistence and state management

Nice to Have

  • Load testing with multiple concurrent users
  • Extended browser compatibility testing
  • Detailed performance profiling
  • Automated visual regression testing

Monitoring and Metrics

Key Metrics

  • Test Execution Time: E2E tests < 10 minutes total
  • Test Reliability: > 95% pass rate in CI
  • Coverage: 100% of user workflows covered
  • Performance: All performance tests pass within thresholds

Success Criteria

  • All user workflows testable without API keys
  • Tests catch real user issues before deployment
  • Test suite runs reliably in CI/CD pipeline
  • New features automatically include e2e test coverage