| # CLI Test Suite Quickstart | |
| ## Prerequisites | |
| Ensure you have the conda environment activated: | |
| ```bash | |
| conda activate conformal-s | |
| ``` | |
| ## Running Tests | |
| ### Run all CLI tests | |
| ```bash | |
| cd /groups/doudna/projects/ronb/conformal-protein-retrieval | |
| pytest tests/test_cli.py -v | |
| ``` | |
| Expected output (timings will vary): | |
| ``` | |
| tests/test_cli.py::test_main_help PASSED [ 4%] | |
| tests/test_cli.py::test_main_no_command PASSED [ 8%] | |
| tests/test_cli.py::test_embed_help PASSED [ 12%] | |
| tests/test_cli.py::test_search_help PASSED [ 16%] | |
| ... | |
| ======================== 24 passed in 2.34s ======================== | |
| ``` | |
| ### Run a single test | |
| ```bash | |
| pytest tests/test_cli.py::test_search_with_mock_data -v | |
| ``` | |
| ### Run tests with detailed output | |
| ```bash | |
| pytest tests/test_cli.py -v -s | |
| ``` | |
| The `-s` flag disables pytest's output capturing, so `print()` output from the code appears in the terminal. | |
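| ### Run tests with a coverage report | |
| ```bash | |
| pytest tests/test_cli.py --cov=protein_conformal.cli --cov-report=term-missing | |
| ``` | |
| The `--cov` options come from the `pytest-cov` plugin, which must be installed. The `term-missing` report lists the line numbers that were not executed. | |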
| ## What Each Test Does | |
| ### Help Tests (fast, no computation) | |
| ```bash | |
| # These verify help text is correct | |
| pytest tests/test_cli.py -k "help" -v | |
| ``` | |
| Tests: `test_*_help` (7 tests) | |
| - Verifies all commands have proper documentation | |
| - Checks that all options are listed | |
| - Confirms command structure is correct | |
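A help test of this kind usually runs the command with `--help` and checks the exit code and the printed text. The sketch below is a minimal illustration, not the project's actual test: it uses the standard library's `json.tool` as a stand-in CLI so it runs without `protein_conformal` installed, and the helper names `run_cli_help` and `check_help` are hypothetical.

```python
import subprocess
import sys


def run_cli_help(module_name):
    """Run `python -m <module_name> --help` and return the completed process."""
    return subprocess.run(
        [sys.executable, "-m", module_name, "--help"],
        capture_output=True,
        text=True,
    )


def check_help(module_name, expected_words):
    """Return True if help exits with code 0 and mentions every expected word."""
    result = run_cli_help(module_name)
    text = result.stdout.lower()
    return result.returncode == 0 and all(w.lower() in text for w in expected_words)


# Stand-in CLI: the standard library's json.tool, NOT protein_conformal.cli.
help_ok = check_help("json.tool", ["usage", "--sort-keys"])
```

For the real suite, the module name would presumably be `protein_conformal.cli`, with the expected words taken from that command's own options.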
| ### Search Tests (uses mock data) | |
| ```bash | |
| # These test the search functionality (-k matches by substring, so test_search_help runs too) | |
| pytest tests/test_cli.py -k "search" -v | |
| ``` | |
| Tests: `test_search_*` (8 tests) | |
| - Creates small mock embeddings (5x128 and 20x128) | |
| - Tests FAISS similarity search | |
| - Tests threshold filtering | |
| - Tests metadata merging | |
| - Tests edge cases | |
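To make the similarity search and threshold filtering concrete, here is a small NumPy-only sketch. It is not the FAISS code path the CLI uses. It assumes the embeddings are already unit vectors, so the inner product equals cosine similarity, and the function name `top_k_hits` is hypothetical.

```python
import numpy as np


def top_k_hits(query, database, k, threshold=None):
    """Return (query_idx, db_idx, similarity) tuples for the k best matches per query.

    Assumes rows are unit vectors, so the inner product is cosine similarity.
    """
    sims = query @ database.T                    # shape (n_query, n_db)
    order = np.argsort(-sims, axis=1)[:, :k]     # indices of the k largest per row
    hits = []
    for q_idx, db_indices in enumerate(order):
        for db_idx in db_indices:
            sim = float(sims[q_idx, db_idx])
            # Threshold filtering: keep only matches at or above the cutoff.
            if threshold is None or sim >= threshold:
                hits.append((q_idx, int(db_idx), sim))
    return hits


# Four hand-made 2-D unit vectors as a toy database, and one query.
database = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.6, 0.8]])
query = np.array([[1.0, 0.0]])

all_hits = top_k_hits(query, database, k=2)                    # best two neighbors
strong_hits = top_k_hits(query, database, k=2, threshold=0.8)  # only the exact match
```

Without a threshold, each query yields exactly `k` rows, which is why the walkthrough test expects `5 * 3` rows for 5 queries and `--k 3`.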
| ### Probability Tests (uses mock calibration) | |
| ```bash | |
| # These test probability conversion | |
| pytest tests/test_cli.py -k "prob" -v | |
| ``` | |
| Tests: `test_prob_*` (3 tests) | |
| - Creates mock calibration data | |
| - Tests Venn-Abers probability conversion | |
| - Tests CSV input/output | |
| ### Calibration Tests (uses mock data) | |
| ```bash | |
| # These test threshold calibration | |
| pytest tests/test_cli.py -k "calibrate" -v | |
| ``` | |
| Tests: `test_calibrate_*` (2 tests) | |
| - Creates mock similarity/label pairs | |
| - Tests FDR/FNR threshold computation | |
| - Tests multiple calibration trials | |
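As a rough illustration of what an FDR threshold computation over similarity/label pairs involves, the sketch below picks the lowest similarity cutoff whose empirical false discovery rate stays at or below a target `alpha`. This is a simplified stand-in, not the calibration procedure implemented in `protein_conformal`, and both function names are hypothetical.

```python
def empirical_fdr(sims, labels, threshold):
    """Fraction of selected pairs (similarity >= threshold) whose label is 0."""
    selected = [lab for s, lab in zip(sims, labels) if s >= threshold]
    if not selected:
        return 0.0
    return sum(1 for lab in selected if lab == 0) / len(selected)


def lowest_threshold_with_fdr(sims, labels, alpha):
    """Scan observed similarities in ascending order; return the first with FDR <= alpha."""
    for t in sorted(set(sims)):
        if empirical_fdr(sims, labels, t) <= alpha:
            return t
    return None  # no cutoff satisfies the target


# Mock similarity/label pairs (label 1 = true match, 0 = false match).
sims = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 1, 0, 1, 0, 0]

threshold = lowest_threshold_with_fdr(sims, labels, alpha=0.25)
```

On this mock data, the cutoff 0.6 selects four pairs with one false match, which meets the 25% target.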
| ## Example Test Walkthrough | |
| Let's look at `test_search_with_mock_data()` in detail: | |
| ```python | |
| def test_search_with_mock_data(tmp_path): | |
|     """Test search command with small mock embeddings.""" | |
|     # 1. Create mock query embeddings (5 proteins, 128-dim) | |
|     query_embeddings = np.random.randn(5, 128).astype(np.float32) | |
|     # 2. Create mock database embeddings (20 proteins, 128-dim) | |
|     db_embeddings = np.random.randn(20, 128).astype(np.float32) | |
|     # 3. Normalize to unit vectors (for cosine similarity) | |
|     query_embeddings = query_embeddings / np.linalg.norm(...) | |
|     db_embeddings = db_embeddings / np.linalg.norm(...) | |
|     # 4. Save to temporary files | |
|     np.save(tmp_path / "query.npy", query_embeddings) | |
|     np.save(tmp_path / "db.npy", db_embeddings) | |
|     # 5. Run CLI command via subprocess | |
|     subprocess.run([ | |
|         sys.executable, '-m', 'protein_conformal.cli', | |
|         'search', | |
|         '--query', str(tmp_path / "query.npy"), | |
|         '--database', str(tmp_path / "db.npy"), | |
|         '--output', str(tmp_path / "results.csv"), | |
|         '--k', '3' | |
|     ]) | |
|     # 6. Verify output exists and has correct structure | |
|     df = pd.read_csv(tmp_path / "results.csv") | |
|     assert len(df) == 5 * 3  # 5 queries * 3 neighbors | |
|     assert 'similarity' in df.columns | |
| ``` | |
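Step 3 of the walkthrough elides the arguments to `np.linalg.norm`. One common way to normalize each row to unit length is sketched below. The `axis=1, keepdims=True` arguments are an assumption about what the real test does, and `normalize_rows` is a hypothetical helper name.

```python
import numpy as np


def normalize_rows(x):
    """Scale each row of a 2-D array to unit L2 norm."""
    # keepdims=True keeps the norms as shape (n, 1), so the division
    # broadcasts across each row instead of failing on a shape mismatch.
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / norms


rng = np.random.default_rng(0)
embeddings = rng.standard_normal((5, 128)).astype(np.float32)

unit = normalize_rows(embeddings)
row_norms = np.linalg.norm(unit, axis=1)  # each entry should be close to 1.0
```

Once every row has unit norm, an inner-product search returns cosine similarities directly.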
| ## Understanding Test Failures | |
| ### Import Errors | |
| ``` | |
| ModuleNotFoundError: No module named 'faiss' | |
| ``` | |
| **Solution**: Install the missing dependency: | |
| ```bash | |
| conda install -c conda-forge faiss-cpu | |
| ``` | |
| ### File Not Found | |
| ``` | |
| FileNotFoundError: [Errno 2] No such file or directory: '/tmp/...' | |
| ``` | |
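| **Solution**: This is unlikely with the `tmp_path` fixture, which gives each test its own temporary directory. Check that pytest can create temp directories, and that the CLI command actually wrote its output file (a command that fails leaves no file to read). | |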
| ### Assertion Errors | |
| ``` | |
| AssertionError: assert 8 == 15 | |
| ``` | |
| **Solution**: Check whether the test's expectations match the actual behavior. This could indicate: | |
| - A bug in the code | |
| - Wrong test expectations | |
| - A random seed that is missing or not taking effect, so the mock data changes between runs | |
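If unseeded random data is the suspect, one option is to generate mock embeddings from an explicitly seeded generator. The sketch below is a suggestion rather than what the existing tests do (the walkthrough uses `np.random.randn`), and `make_mock_embeddings` is a hypothetical helper.

```python
import numpy as np


def make_mock_embeddings(n, dim, seed):
    """Return an (n, dim) float32 array that is identical for the same seed."""
    rng = np.random.default_rng(seed)  # local generator, no global state involved
    return rng.standard_normal((n, dim)).astype(np.float32)


first = make_mock_embeddings(5, 128, seed=0)
second = make_mock_embeddings(5, 128, seed=0)
other = make_mock_embeddings(5, 128, seed=1)

same_seed_matches = np.array_equal(first, second)
different_seed_differs = not np.array_equal(first, other)
```

A local generator avoids depending on the global NumPy random state, which other tests may have advanced.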
| ### Subprocess Errors | |
| ``` | |
| subprocess.CalledProcessError: Command returned non-zero exit status 1 | |
| ``` | |
| **Solution**: Run the command manually to see the error: | |
| ```bash | |
| python -m protein_conformal.cli search --query test.npy --database db.npy ... | |
| ``` | |
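The error text can also be surfaced from inside a test by capturing the child's stderr. The sketch below uses a short child Python process as a stand-in for a failing CLI call; `run_and_capture` is a hypothetical helper, and the error message is invented for illustration.

```python
import subprocess
import sys


def run_and_capture(cmd):
    """Run a command and return (exit code, stdout, stderr) without raising."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode, result.stdout, result.stderr


# Stand-in for a failing CLI call: writes a message to stderr and exits with 1.
failing_cmd = [
    sys.executable, "-c",
    "import sys; sys.stderr.write('error: bad input\\n'); sys.exit(1)",
]
code, out, err = run_and_capture(failing_cmd)

# With check=True the same failure raises CalledProcessError,
# and the exception object carries the captured stderr.
try:
    subprocess.run(failing_cmd, capture_output=True, text=True, check=True)
    raised = False
    captured = ""
except subprocess.CalledProcessError as exc:
    raised = True
    captured = exc.stderr
```

Including `err` in an assertion message, for example `assert code == 0, err`, puts the CLI's own error into the pytest failure report.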
| ## Adding Your Own Test | |
| Template for a new CLI test: | |
| ```python | |
| import subprocess | |
| import sys | |
|  | |
| import numpy as np | |
| import pandas as pd | |
|  | |
|  | |
| def test_my_new_feature(tmp_path): | |
|     """Test description here.""" | |
|     # 1. Create test data | |
|     test_data = np.array([1, 2, 3]) | |
|     input_file = tmp_path / "input.npy" | |
|     np.save(input_file, test_data) | |
|     # 2. Run CLI command ('my-command' is a placeholder for your subcommand) | |
|     result = subprocess.run( | |
|         [sys.executable, '-m', 'protein_conformal.cli', | |
|          'my-command', | |
|          '--input', str(input_file), | |
|          '--output', str(tmp_path / "output.csv")], | |
|         capture_output=True, | |
|         text=True | |
|     ) | |
|     # 3. Check return code (show stderr if the command failed) | |
|     assert result.returncode == 0, result.stderr | |
|     # 4. Verify output | |
|     output_file = tmp_path / "output.csv" | |
|     assert output_file.exists() | |
|     df = pd.read_csv(output_file) | |
|     assert len(df) > 0 | |
|     assert 'expected_column' in df.columns | |
| ``` | |
| ## Debugging Tests | |
| ### Run test with debugger | |
| ```bash | |
| pytest tests/test_cli.py::test_search_with_mock_data --pdb | |
| ``` | |
| This drops into the Python debugger (`pdb`) when a test fails. | |
| ### Show print statements | |
| ```bash | |
| pytest tests/test_cli.py::test_search_with_mock_data -s | |
| ``` | |
| This shows any `print()` statements from the code. | |
| ### Show warnings | |
| ```bash | |
| pytest tests/test_cli.py -v -W all | |
| ``` | |
| This shows all Python warnings (deprecation warnings, etc.). | |
| ### Keep temporary files | |
| ```bash | |
| pytest tests/test_cli.py::test_search_with_mock_data --basetemp=./test_tmp | |
| ``` | |
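| This keeps temp files in `./test_tmp/` for inspection. Note that pytest clears this directory at the start of each run, so do not point `--basetemp` at a directory containing files you want to keep. | |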
| ## Performance | |
| All 24 CLI tests should complete in **< 30 seconds**: | |
| - Help tests: ~0.1s each (no computation) | |
| - Mock data tests: ~0.5-2s each (small arrays) | |
| - No GPU required | |
| - No large data files | |
| If tests are slow: | |
| 1. Check whether a GPU is being initialized (use the `--cpu` flag) | |
| 2. Check calibration data size (should be < 100 samples in tests) | |
| 3. Check for network calls (shouldn't happen in these tests) | |
| ## Next Steps | |
| After CLI tests pass: | |
| 1. Run full test suite: `pytest tests/ -v` | |
| 2. Run paper verification: `cpr verify --check syn30` | |
| 3. Try the CLI on real data: `cpr search --query ... --database ...` | |
| 4. Read `TEST_SUMMARY.md` for complete test documentation | |