Juan Salas committed on
Commit
5bf181e
·
1 Parent(s): 7c525e7

docs: Update README with complete script documentation

- Add all available uv run scripts from pyproject.toml to Quick Start section
- Include build, build-all, build-sparse, setup-datasets commands
- Add upload scripts for deployment (upload-framework, upload-indexes, upload-vdrs)
- Document testing commands (run-e2e-tests, test-legal-coreference)
- Update project structure to reflect actual scripts directory contents
- Enhance troubleshooting section with new build and test commands

Resolves documentation gap between available scripts and README content

Files changed (1)
  1. README.md +30 -2
README.md CHANGED
@@ -360,10 +360,26 @@ uv run streamlit run app/main.py # Run the app
 # Option 3: Development mode with auto-reload
 uv run streamlit run app/main.py --server.runOnSave true
 
-# Option 4: Additional build commands for advanced features
+# Option 4: Build commands for advanced features
+uv run download-models # Pre-download transformer models locally
 uv run build-indexes # Build search indexes (FAISS, BM25)
 uv run build-graphs # Build knowledge graphs with entity resolution
-uv run download-models # Pre-download transformer models locally
+uv run build-sparse # Build BM25 sparse indexes
+uv run build # General build script
+uv run build-all # Comprehensive build pipeline (all indexes + graphs)
+
+# Option 5: Data management commands
+uv run setup-datasets # Setup initial datasets
+
+# Option 6: Upload commands (for deployment)
+uv run upload-framework # Upload DD framework
+uv run upload-indexes # Upload search indexes
+uv run upload-vdrs # Upload VDR data
+
+# Option 7: Testing commands
+uv run verify-test-coverage # Verify critical test coverage
+uv run run-e2e-tests # Run end-to-end tests
+uv run test-legal-coreference # Test legal coreference resolution
 ```
 
 ### Environment Setup (for AI features)
@@ -755,10 +771,16 @@ dd_poc/
 │   ├── build_sparse_indexes.py # BM25 sparse index construction
 │   ├── build.py # General build script
 │   ├── download_models.py # Download and cache transformer models
+│   ├── run_e2e_tests.py # End-to-end test runner
+│   ├── setup_datasets.py # Initial dataset setup
 │   ├── start.py # 🚀 Launch script (Python)
+│   ├── streamlit_cloud_config.py # Streamlit Cloud configuration
 │   ├── test_entity_resolution.py # Entity resolution testing and validation
 │   ├── test_legal_coreference.py # Legal coreference testing
 │   ├── transformer_extractors.py # Transformer-based extraction utilities
+│   ├── upload_dd_framework.py # Upload DD framework for deployment
+│   ├── upload_dd_indexes.py # Upload search indexes for deployment
+│   ├── upload_dd_vdrs.py # Upload VDR data for deployment
 │   └── verify_test_coverage.py # Test coverage verification
 ├── tests/ # 🧪 Comprehensive test suite
 │   ├── unit/ # Unit tests with entity processing tests
@@ -979,6 +1001,12 @@ uv run build-indexes && echo "✅ Search indexes built successfully"
 # Build knowledge graphs with entity resolution
 uv run build-graphs && echo "✅ Knowledge graphs built with entity resolution"
 
+# Build all indexes and graphs comprehensively
+uv run build-all && echo "✅ All indexes and graphs built successfully"
+
+# Run comprehensive test suite
+uv run run-e2e-tests && echo "✅ E2E tests completed"
+
 # Verify test coverage for critical workflows
 uv run verify-test-coverage
 
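
The troubleshooting commands added in the last hunk each use `&& echo` to confirm success one at a time. As a rough sketch of the same idea in Python, the helper below runs a list of commands in order and stops at the first non-zero exit code. The command names come from the README diff; the `run_steps` helper, its `run` parameter, and the chosen ordering are assumptions made for illustration and are not part of the repository.

```python
import subprocess
from typing import Callable, Optional, Sequence

# Commands taken from the README diff; running them in this order is an assumption.
STEPS: list[list[str]] = [
    ["uv", "run", "build-all"],
    ["uv", "run", "run-e2e-tests"],
    ["uv", "run", "verify-test-coverage"],
]


def _run(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def run_steps(
    steps: Sequence[Sequence[str]],
    run: Callable[[Sequence[str]], int] = _run,
) -> Optional[list[str]]:
    """Run steps in order; return the first failing command, or None if all pass."""
    for cmd in steps:
        if run(cmd) != 0:
            return list(cmd)
    return None


# Usage (not executed here): failed = run_steps(STEPS)
# A None result means every step exited with code 0.
```

Passing the runner as a parameter keeps the helper testable without invoking `uv`; a stub that returns fixed exit codes is enough to exercise the fail-fast behavior.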