# Risk-o-meter Framework - Implementation Summary
## ✅ Completed
The **Risk-o-meter framework** (Chakrabarti et al., 2018) is implemented and integrated into the comparison pipeline.
## Paper Reference
**Title**: Risk-o-meter: Automated Risk Detection in Contracts
**Authors**: Chakrabarti, A., & Dholakia, K. (2018)
**Key Achievement**: **91% accuracy on termination clauses**
**Method**: Paragraph vectors (Doc2Vec) + SVM classifiers
## 🎯 Implementation Details
### Core Components
**File**: `risk_o_meter.py` (750+ lines)
#### 1. Doc2Vec (Paragraph Vectors)
- **Purpose**: Learn distributed semantic representations of legal clauses
- **Model**: Distributed Memory (DM) variant
- **Parameters**:
  - Vector size: 100 dimensions (configurable)
  - Window: 5 words of context
  - Epochs: 30-40 (configurable)
  - Algorithm: DM or DBOW supported; DM is used for better semantic capture
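
The parameters above map directly onto gensim's `Doc2Vec` constructor. The following is a minimal sketch, not the code in `risk_o_meter.py`: the `Doc2VecParams` container, the `tokenize` helper, the `train_doc2vec` function, and the `min_count` value are all assumptions made for illustration.

```python
import re
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Doc2VecParams:
    """Hypothetical container mirroring the parameters listed above."""
    vector_size: int = 100  # embedding dimensions
    window: int = 5         # context words on each side
    epochs: int = 40        # upper end of the 30-40 range
    dm: int = 1             # 1 = Distributed Memory, 0 = DBOW
    min_count: int = 2      # assumption: not stated in this summary


def tokenize(clause: str) -> list[str]:
    """Lowercase a clause and keep alphanumeric tokens (a simple assumption)."""
    return re.findall(r"[a-z0-9]+", clause.lower())


def train_doc2vec(clauses: list[str], params: Doc2VecParams = Doc2VecParams()):
    """Train a gensim Doc2Vec model on raw clause strings."""
    # Imported lazily so the helpers above work without gensim installed.
    from gensim.models.doc2vec import Doc2Vec, TaggedDocument

    corpus = [
        TaggedDocument(words=tokenize(text), tags=[i])
        for i, text in enumerate(clauses)
    ]
    model = Doc2Vec(**asdict(params))
    model.build_vocab(corpus)
    model.train(corpus, total_examples=model.corpus_count, epochs=model.epochs)
    return model
```

Once trained, `model.infer_vector(tokenize(new_clause))` should return a 100-dimensional embedding for a clause that was not in the training corpus.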
#### 2. SVM Classifier
- **Purpose**: Multi-class risk categorization
- **Kernel**: RBF (default) or linear
- **Features**: Doc2Vec embeddings + optional TF-IDF augmentation
- **Output**: Risk categories with probability distributions
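
As a rough illustration of how these pieces might fit together, the sketch below concatenates the two feature sources and builds a probability-enabled scikit-learn `SVC`. The helper names (`combine_features`, `build_classifier`, `label_distribution`) are hypothetical and are not taken from `risk_o_meter.py`.

```python
def combine_features(doc_vec, tfidf_vec=None):
    """Concatenate a Doc2Vec embedding with an optional TF-IDF vector."""
    features = list(doc_vec)
    if tfidf_vec is not None:
        features.extend(tfidf_vec)
    return features


def build_classifier(kernel="rbf"):
    """Return an untrained SVC restricted to the two kernels named above."""
    if kernel not in {"rbf", "linear"}:
        raise ValueError(f"unsupported kernel: {kernel!r}")
    # Imported lazily so the pure-Python helpers work without scikit-learn.
    from sklearn.svm import SVC

    # probability=True enables predict_proba on the fitted model.
    return SVC(kernel=kernel, probability=True)


def label_distribution(classes, probabilities):
    """Pair each class label with its probability, most likely first."""
    return sorted(zip(classes, probabilities), key=lambda pair: pair[1], reverse=True)
```

After `clf = build_classifier()` and `clf.fit(X, y)`, a call such as `label_distribution(clf.classes_, clf.predict_proba([x])[0])` would give the ranked risk categories for a single clause.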
#### 3. SVR Regressors (Extension)
- **Purpose**: Predict severity and importance scores
- **Method**: Support Vector Regression
- **Output**: Continuous scores (0-10 scale)
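
Support Vector Regression does not bound its predictions, so some post-processing is needed to keep scores on a 0-10 scale. The sketch below shows one simple option, clipping. Whether `risk_o_meter.py` clips, rescales, or does something else is not stated here, so `to_ten_point_scale` is a hypothetical helper.

```python
def to_ten_point_scale(raw_predictions, low=0.0, high=10.0):
    """Clip raw regression outputs into the [low, high] range."""
    return [min(high, max(low, float(p))) for p in raw_predictions]
```

With scikit-learn, this could be applied as `to_ten_point_scale(SVR(kernel="rbf").fit(X, y).predict(X_new))`, with one regressor trained for severity and another for importance.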
## 🔧 Usage
```bash
# Test Risk-o-meter standalone
python risk_o_meter.py
# Run full comparison (9 methods including Risk-o-meter)
python compare_risk_discovery.py --advanced
```
## Now Available: 9 Methods Total
1. K-Means (baseline)
2. LDA Topic Modeling
3. Hierarchical Clustering
4. DBSCAN
5. NMF
6. Spectral Clustering
7. GMM
8. Mini-Batch K-Means
9. **Risk-o-meter** (NEW; the paper reports 91% accuracy on termination clauses)
## Files Modified
1. ✅ **`risk_o_meter.py`** (NEW, 750+ lines)
2. ✅ **`compare_risk_discovery.py`** (updated for 9 methods)
3. ✅ **`risk_discovery_alternatives.py`** (added Method 9)
4. ✅ **`RISK_DISCOVERY_COMPREHENSIVE.md`** (added Risk-o-meter section)
5. ✅ **`requirements.txt`** (added `gensim>=4.3.0`)
## Ready to Run!
All code is implemented and ready for testing. Risk-o-meter provides a **baseline from the literature** for comparison with the other 8 methods. The 91% accuracy is the figure the paper reports for termination clauses.