
Verification Checklist

Before Running

  • Install dependencies: pip install -r requirements.txt
  • Ensure CUAD dataset is at: dataset/CUAD_v1/CUAD_v1.json
  • Python 3.8+ installed
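The three prerequisites above can be checked automatically before a run. The following is a minimal sketch, not part of the repository: the function name `check_prerequisites` is hypothetical, and it assumes the default dataset location and that `sklearn` is the only import worth probing up front.

```python
import importlib.util
import sys
from pathlib import Path

# Default dataset location named in this checklist.
DATASET_PATH = Path("dataset/CUAD_v1/CUAD_v1.json")


def check_prerequisites(dataset_path=DATASET_PATH, modules=("sklearn",)):
    """Return a list of human-readable problems; an empty list means ready.

    Hypothetical helper for illustration; it is not shipped with the repo.
    """
    problems = []
    if sys.version_info < (3, 8):
        problems.append("Python 3.8+ is required, found " + sys.version.split()[0])
    if not Path(dataset_path).is_file():
        problems.append(f"Dataset not found at {dataset_path}")
    for name in modules:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            problems.append(f"Missing module: {name}")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    for issue in issues:
        print("FAIL:", issue)
    print("Ready to run" if not issues else f"{len(issues)} problem(s) found")
```

Pass `modules=("sklearn", "gensim")` before an `--advanced` run, since Risk-o-meter needs gensim.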

Tests to Run

1. Basic Comparison (4 methods)

python3 compare_risk_discovery.py

Expected:

  • K-Means ✅
  • LDA ✅
  • Hierarchical ✅
  • DBSCAN ✅
  • Output files created
  • No KeyError
  • No TypeError
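One way to confirm the "No KeyError" and "No TypeError" items without reading the whole log is to capture the script's output and search it for those exception names. This is a sketch under assumptions: the helper names `find_exceptions` and `run_and_scan` are hypothetical, and it assumes any such exception would appear by name in stdout or stderr (for example in a traceback).

```python
import subprocess
import sys


def find_exceptions(output, names=("KeyError", "TypeError")):
    """Return the exception names from `names` that appear in the output text."""
    return [name for name in names if name in output]


def run_and_scan(command):
    """Run a command, capture stdout and stderr, and report flagged exceptions.

    Returns (exit_code, list_of_exception_names_seen).
    """
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode, find_exceptions(result.stdout + result.stderr)
```

For the basic comparison, call `run_and_scan([sys.executable, "compare_risk_discovery.py"])`; a passing run should give exit code 0 and an empty list.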

2. Advanced Comparison (9 methods)

python3 compare_risk_discovery.py --advanced

Expected:

  • All 4 basic methods ✅
  • NMF ✅ (no alpha parameter error)
  • Spectral ✅
  • GMM ✅
  • Mini-Batch K-Means ✅
  • Risk-o-meter ✅
  • Output files created

3. Limited Dataset

python3 compare_risk_discovery.py --max-clauses 1000

Expected:

  • Runs faster than a full-dataset run
  • Uses at most 1000 clauses
  • All methods complete

4. Custom Data Path

python3 compare_risk_discovery.py --data-path dataset/CUAD_v1/CUAD_v1.json

Expected:

  • Loads from specified path
  • All methods complete

Output Files to Check

After successful run:

  • risk_discovery_comparison_report.txt exists
  • risk_discovery_comparison_results.json exists
  • Report contains all methods
  • JSON is valid and parseable
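The file checks above are easy to script. The sketch below assumes the two output files are written to the directory you pass in (the current directory by default); the helper name `check_outputs` is hypothetical, and it makes no assumption about the structure inside the JSON.

```python
import json
from pathlib import Path

# Output file names listed in this checklist.
REPORT = "risk_discovery_comparison_report.txt"
RESULTS = "risk_discovery_comparison_results.json"


def check_outputs(directory="."):
    """Return a dict mapping each check name to True or False."""
    base = Path(directory)
    report, results = base / REPORT, base / RESULTS
    checks = {
        "report_exists": report.is_file(),
        "results_exists": results.is_file(),
        "json_parseable": False,
    }
    if checks["results_exists"]:
        try:
            # Parsing is enough to show the file is valid JSON.
            json.loads(results.read_text(encoding="utf-8"))
            checks["json_parseable"] = True
        except json.JSONDecodeError:
            pass
    return checks
```

After a run, `all(check_outputs().values())` should be `True`.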

Key Metrics to Verify

In the report, check for:

  • Each method has Patterns Discovered count
  • Execution times are reasonable (compare against Performance Expectations)
  • Quality metrics are present (silhouette/perplexity)
  • Top patterns are displayed
  • Recommendations section is complete
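A rough text search can flag methods or labels missing from the report. This sketch assumes the report spells method names and the "Patterns Discovered" label the way this checklist does, which may not match the actual report format; `missing_from_report` is a hypothetical helper.

```python
# Method names as written in this checklist (assumed to match the report).
BASIC_METHODS = ["K-Means", "LDA", "Hierarchical", "DBSCAN"]
ADVANCED_METHODS = BASIC_METHODS + [
    "NMF", "Spectral", "GMM", "Mini-Batch K-Means", "Risk-o-meter",
]


def missing_from_report(report_text, expected):
    """Return the expected labels that never appear in the report text.

    The match is a case-insensitive substring search, so "K-Means" is also
    satisfied by "Mini-Batch K-Means"; treat the result as a first pass only.
    """
    lowered = report_text.lower()
    return [label for label in expected if label.lower() not in lowered]
```

For an advanced run, read `risk_discovery_comparison_report.txt` and call `missing_from_report(text, ADVANCED_METHODS + ["Patterns Discovered"])`; an empty list means every label was found.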

Common Issues and Solutions

Issue: No module named 'sklearn'

Solution: pip install "scikit-learn>=1.3.0" (quote the requirement so the shell does not treat > as an output redirect)

Issue: No module named 'gensim' (Risk-o-meter only)

Solution: pip install "gensim>=4.3.0", or skip Risk-o-meter by running in basic mode (without --advanced)

Issue: Dataset not found

Solution: Check path in --data-path argument or use default location

Issue: Out of memory

Solution: Limit the dataset size, for example with --max-clauses 5000

Issue: Slow execution

Solution:

  • Use basic mode (without --advanced)
  • Reduce --max-clauses
  • Skip Spectral/Hierarchical for large datasets

Performance Expectations

For ~13K clauses (full CUAD):

  • K-Means: ~10-30 seconds ⚡
  • LDA: ~30-60 seconds 🟡
  • Hierarchical: ~60-120 seconds 🟡 (memory intensive)
  • DBSCAN: ~20-40 seconds ⚡
  • NMF: ~15-45 seconds ⚡
  • Spectral: ~90-180 seconds 🔴 (slow for large datasets)
  • GMM: ~40-80 seconds 🟡
  • Mini-Batch K-Means: ~5-15 seconds ⚡⚡
  • Risk-o-meter: ~60-120 seconds 🟡

Total time (advanced mode): ~6-12 minutes
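The total follows from summing the per-method ranges above. The short calculation below reproduces it; it assumes the nine methods run one after another and ignores data loading and report writing, which the list does not estimate.

```python
# Per-method estimates in seconds (low, high), copied from the list above.
ESTIMATES = {
    "K-Means": (10, 30),
    "LDA": (30, 60),
    "Hierarchical": (60, 120),
    "DBSCAN": (20, 40),
    "NMF": (15, 45),
    "Spectral": (90, 180),
    "GMM": (40, 80),
    "Mini-Batch K-Means": (5, 15),
    "Risk-o-meter": (60, 120),
}

low = sum(lo for lo, _ in ESTIMATES.values())
high = sum(hi for _, hi in ESTIMATES.values())

print(f"{low} to {high} seconds = {low / 60:.1f} to {high / 60:.1f} minutes")
# prints: 330 to 690 seconds = 5.5 to 11.5 minutes
```

Rounded outward, 5.5 to 11.5 minutes gives the ~6-12 minute figure.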

Success Criteria

✅ All methods complete without errors
✅ Output files generated
✅ Report contains meaningful patterns
✅ Quality metrics are calculated
✅ No KeyError or TypeError exceptions