# API Testing and Profiling Guide

This guide explains how to test and profile the YLFF API endpoints using the test script.

## Quick Start

### 1. Start the API Server

```bash
# From project root
python -m uvicorn ylff.api:app --host 0.0.0.0 --port 8000
```

Or if running in Docker/RunPod, the server should already be running.

### 2. Run the Test Script

```bash
# Basic test (auto-detects test data)
python scripts/experiments/test_api_with_profiling.py

# Test with specific data
python scripts/experiments/test_api_with_profiling.py \
    --sequence-dir data/arkit_ba_validation/ba_work/images \
    --arkit-dir data/arkit_ba_validation

# Test against remote server
python scripts/experiments/test_api_with_profiling.py \
    --base-url https://your-pod-id-8000.proxy.runpod.net

# Save results to custom location
python scripts/experiments/test_api_with_profiling.py \
    --output data/test_results/api_test_$(date +%Y%m%d_%H%M%S).json
```

## Test Script Features

The test script (`scripts/experiments/test_api_with_profiling.py`) automatically:

1. **Tests all API endpoints**:
   - Health check (`/health`)
   - API info (`/`)
   - Models list (`/models`)
   - Sequence validation (`/api/v1/validate/sequence`)
   - ARKit validation (`/api/v1/validate/arkit`)
   - Job management (`/api/v1/jobs`, `/api/v1/jobs/{job_id}`)
   - Profiling endpoints (metrics, hot paths, latency, system)

2. **Profiles code execution**:
   - Tracks API request latencies
   - Monitors function execution times
   - Identifies hot paths (most time-consuming operations)
   - Tracks system resources (CPU, memory, GPU)

3. **Auto-detects test data**:
   - Looks for `assets/` folder first
   - Falls back to `data/` folder
   - Uses existing validation data if available

4. **Generates reports**:
   - Saves detailed JSON results
   - Prints profiling summary
   - Shows latency breakdown by stage

## Test Data Structure

The script looks for test data in this order:

1. **`assets/examples/ARKit/`** - ARKit video and metadata
2. **`assets/examples/*/`** - Image sequences
3. **`data/arkit_ba_validation/`** - Existing ARKit validation data
4. **`data/*/ba_work/images/`** - BA work directories with images

### Creating Test Assets

If you want to use a custom `assets/` folder:

```bash
mkdir -p assets/examples/ARKit
# Place your ARKit video and metadata here
# Or place image sequences in assets/examples/your_sequence/
```

## Profiling Results

The test script generates profiling data in two ways:

### 1. Local Profiling (in test script)

The script uses the `Profiler` class to track:

- API request durations
- Function execution times
- Memory usage
- GPU memory usage

### 2. Server-Side Profiling (via API)

The API server also tracks profiling data. Access it via:

```bash
# Get all metrics
curl http://localhost:8000/api/v1/profiling/metrics

# Get hot paths (top time-consuming operations)
curl http://localhost:8000/api/v1/profiling/hot-paths

# Get latency breakdown by stage
curl http://localhost:8000/api/v1/profiling/latency

# Get system metrics (CPU, memory, GPU)
curl http://localhost:8000/api/v1/profiling/system

# Get stats for specific stage
curl http://localhost:8000/api/v1/profiling/stage/api_request

# Reset profiling data
curl -X POST http://localhost:8000/api/v1/profiling/reset
```

## Example Output

```
================================================================================
YLFF API Testing and Profiling
================================================================================
Base URL: http://localhost:8000
Start time: 2024-01-15T10:30:00

[1/11] Testing /health endpoint...
  ✓ Health check passed: {'status': 'healthy'}

[2/11] Testing / endpoint...
  ✓ API info retrieved: YLFF API v1.0.0

[3/11] Testing /models endpoint...
  ✓ Found 5 models

[4/11] Testing /api/v1/validate/sequence endpoint...
  Using sequence: data/arkit_ba_validation/ba_work/images
  ✓ Validation job queued: abc123-def456-...

...

================================================================================
Profiling Summary
================================================================================
Total entries: 45
Stages tracked: 3
Functions tracked: 11

Latency Breakdown:
  api_request                   12.345s ( 45.2%)  avg: 0.123s  calls: 100
  validate_sequence              8.901s ( 32.6%)  avg: 8.901s  calls: 1
  validate_arkit                 6.234s ( 22.2%)  avg: 6.234s  calls: 1
```

## Interpreting Results

### Latency Breakdown

Shows where time is spent:

- **api_request**: Time spent in API layer (network + processing)
- **validate_sequence**: Time spent in sequence validation
- **validate_arkit**: Time spent in ARKit validation
- **gpu**: GPU computation time
- **cpu**: CPU computation time
- **data_loading**: Data I/O time

### Hot Paths

Shows the most time-consuming functions:

- Functions with highest total execution time
- Useful for identifying bottlenecks

### System Metrics

Shows resource utilization:

- CPU usage percentage
- Memory usage percentage
- GPU memory usage (if available)

## Troubleshooting

### Connection Errors

If you get connection errors:

```bash
# Check if server is running
curl http://localhost:8000/health

# Check server logs
# (if running locally, check terminal output)
```

### Missing Test Data

If test data is not found:

```bash
# Specify paths explicitly
python scripts/experiments/test_api_with_profiling.py \
    --sequence-dir /path/to/images \
    --arkit-dir /path/to/arkit
```

### Timeout Errors

If requests timeout:

```bash
# Increase timeout (default: 300s)
python scripts/experiments/test_api_with_profiling.py --timeout 600
```

## Continuous Profiling

For continuous profiling during development:

```bash
# Run tests in a loop
while true; do
    python scripts/experiments/test_api_with_profiling.py --output "data/profiling/run_$(date +%s).json"
    sleep 60
done
```

## Integration with CI/CD

Add to your CI pipeline:

```yaml
- name: Test API Endpoints
  run: |
    python scripts/experiments/test_api_with_profiling.py \
      --base-url http://localhost:8000 \
      --output test_results/api_test.json
```

## Next Steps

- Review profiling results to identify bottlenecks
- Optimize hot paths identified in profiling
- Use system metrics to tune resource allocation
- Compare profiling results across different model sizes/configurations
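
Comparing two profiling runs, as the last item suggests, could be sketched roughly as follows. This is a minimal illustration, not part of the test script: the function names `latency_shares` and `compare_runs` are hypothetical, and the input is assumed to be a plain mapping from stage name to total seconds. The actual schema of the saved JSON results is not documented in this guide, so you would need to extract the per-stage totals from it yourself.

```python
from typing import Dict, List, Tuple


def latency_shares(totals: Dict[str, float]) -> Dict[str, float]:
    """Return each stage's percentage of the summed stage time."""
    grand_total = sum(totals.values())
    if grand_total <= 0:
        # Nothing was measured, so avoid dividing by zero.
        return {stage: 0.0 for stage in totals}
    return {stage: 100.0 * secs / grand_total for stage, secs in totals.items()}


def compare_runs(
    baseline: Dict[str, float], candidate: Dict[str, float]
) -> List[Tuple[str, float, float, float]]:
    """Return (stage, baseline_s, candidate_s, delta_s) rows, largest slowdown first.

    A stage missing from one run is treated as 0.0 seconds in that run.
    """
    rows = []
    for stage in set(baseline) | set(candidate):
        before = baseline.get(stage, 0.0)
        after = candidate.get(stage, 0.0)
        rows.append((stage, before, after, after - before))
    # Sort by descending delta, then by stage name so ties are deterministic.
    return sorted(rows, key=lambda row: (-row[3], row[0]))


if __name__ == "__main__":
    # Stage totals for run_a are taken from the example output in this guide;
    # run_b is invented purely for illustration.
    run_a = {"api_request": 12.345, "validate_sequence": 8.901, "validate_arkit": 6.234}
    run_b = {"api_request": 11.9, "validate_sequence": 10.2, "validate_arkit": 6.0}

    for stage, pct in latency_shares(run_a).items():
        print(f"{stage:<20} {pct:5.1f}%")
    for stage, before, after, delta in compare_runs(run_a, run_b):
        print(f"{stage:<20} {before:8.3f}s -> {after:8.3f}s ({delta:+.3f}s)")
```

Sorting by the change in seconds rather than by percentage keeps the focus on the stages that cost the most wall-clock time, which is usually what matters when deciding which hot path to optimize next.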