# Hierarchical BERT Integration Checklist

## Files That Need Changes

### ✅ Already Updated
1. **model.py** - Added `HierarchicalLegalBERT` class

### 🔧 Files That Need Updates

#### **High Priority** (Training/Inference Core)
1. **trainer.py** - Update to support hierarchical model
   - Line 14: Import statement
   - Line 169: Model initialization
   - Training loop compatibility

2. **train.py** - Add hierarchical model option
   - Line 0: Add command-line argument for model type

3. **evaluate.py** - Support hierarchical model evaluation
   - Line 45: Import statement
   - Line 47: Model initialization

4. **calibrate.py** - Support hierarchical model calibration
   - Line 14: Import statement  
   - Model usage throughout

5. **advanced_analysis.py** - Support hierarchical analysis
   - Line 16: Import statement
   - Line 25: Model loading
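
For the `train.py` item above, a minimal sketch of a model-type switch built on the standard-library `argparse`. The flag name `--model-type` and its values `flat` and `hierarchical` are assumptions chosen for illustration; they are not taken from the existing script.

```python
import argparse


def build_arg_parser() -> argparse.ArgumentParser:
    """Build a parser with a hypothetical --model-type switch."""
    parser = argparse.ArgumentParser(description="Train a legal risk model")
    parser.add_argument(
        "--model-type",
        choices=["flat", "hierarchical"],  # assumed option names
        default="flat",  # keeps FullyLearningBasedLegalBERT as the default
        help="Model architecture to train",
    )
    return parser


def wants_hierarchical(argv=None) -> bool:
    """Return True when the hierarchical model was requested on the CLI."""
    args = build_arg_parser().parse_args(argv)
    return args.model_type == "hierarchical"
```

The boolean result could then be copied onto `config.use_hierarchical_model` before the trainer is constructed, so the rest of the code only ever reads the config flag.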

#### **Medium Priority** (Utilities)
6. **analyze_document.py** - Add hierarchical document analysis
   - Line 274: Import statement
   - Add hierarchical inference option

7. **config.py** - Add hierarchical model config options
   - Add `use_hierarchical_model` flag
   - Add hierarchical-specific parameters
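
A possible shape for the `config.py` additions, written as a standalone dataclass. Only `use_hierarchical_model` is named in this checklist; the class name and the other two fields (`max_clauses_per_document`, `document_encoder_layers`) and their defaults are hypothetical placeholders for whatever hierarchical-specific parameters the model actually needs.

```python
from dataclasses import dataclass


@dataclass
class HierarchicalOptions:
    """Hypothetical config fields for the hierarchical model."""

    use_hierarchical_model: bool = False  # off by default: current model is used
    max_clauses_per_document: int = 64    # assumed name and value
    document_encoder_layers: int = 2      # assumed name and value

    def __post_init__(self) -> None:
        # Fail early on values that could never produce a usable model.
        if self.max_clauses_per_document < 1:
            raise ValueError("max_clauses_per_document must be at least 1")
        if self.document_encoder_layers < 1:
            raise ValueError("document_encoder_layers must be at least 1")
```

Keeping the flag `False` by default means existing training runs behave as before unless the option is switched on explicitly.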

#### **Low Priority** (Testing)
8. **test_setup.py** - Add hierarchical model tests
   - Line 107: Import statement

### 📝 Summary of Changes Needed

**Import Changes:**
```python
# OLD:
from model import FullyLearningBasedLegalBERT

# NEW (support both):
from model import FullyLearningBasedLegalBERT, HierarchicalLegalBERT
```

**Model Selection Logic:**
```python
if config.use_hierarchical_model:
    model = HierarchicalLegalBERT(config, num_discovered_risks)
else:
    model = FullyLearningBasedLegalBERT(config, num_discovered_risks)
```
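
Because the same selection is needed in several files (trainer, evaluation, calibration, analysis), it may be worth wrapping it in one helper. The sketch below uses stand-in stub classes so it runs on its own; the helper name `build_model` is an assumption, and the real classes would be imported from `model.py` instead.

```python
from types import SimpleNamespace


class FullyLearningBasedLegalBERT:
    """Stand-in stub; the real class lives in model.py."""

    def __init__(self, config, num_discovered_risks):
        self.config = config
        self.num_discovered_risks = num_discovered_risks


class HierarchicalLegalBERT:
    """Stand-in stub; the real class lives in model.py."""

    def __init__(self, config, num_discovered_risks):
        self.config = config
        self.num_discovered_risks = num_discovered_risks


def build_model(config, num_discovered_risks):
    """Pick the model class from the config flag (hypothetical helper)."""
    # getattr with a default lets configs that predate the flag keep working.
    use_hierarchical = getattr(config, "use_hierarchical_model", False)
    model_cls = HierarchicalLegalBERT if use_hierarchical else FullyLearningBasedLegalBERT
    return model_cls(config, num_discovered_risks)


# Usage: a config without the flag falls back to the current model.
legacy_model = build_model(SimpleNamespace(), num_discovered_risks=7)
hier_model = build_model(SimpleNamespace(use_hierarchical_model=True), 7)
```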

**Forward Pass Changes:**
```python
# For single-clause training (hierarchical model)
if isinstance(model, HierarchicalLegalBERT):
    outputs = model.forward_single_clause(input_ids, attention_mask)
else:
    outputs = model(input_ids, attention_mask)
```
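
The same dispatch can be centralized so the training loop, evaluation, and calibration code do not each repeat the `isinstance` check. This is a sketch with stub classes that only report which path was taken; the helper name `clause_forward` is hypothetical, and the real models would return their actual outputs.

```python
class FullyLearningBasedLegalBERT:
    """Stub: the real model is called directly on a clause batch."""

    def __call__(self, input_ids, attention_mask):
        return {"path": "flat"}


class HierarchicalLegalBERT:
    """Stub: the real model exposes forward_single_clause for clause batches."""

    def forward_single_clause(self, input_ids, attention_mask):
        return {"path": "single_clause"}


def clause_forward(model, input_ids, attention_mask):
    """Run one clause-level forward pass on either model type."""
    if isinstance(model, HierarchicalLegalBERT):
        return model.forward_single_clause(input_ids, attention_mask)
    return model(input_ids, attention_mask)
```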

**Inference Changes:**
```python
# For document-level inference (hierarchical model)
if isinstance(model, HierarchicalLegalBERT) and analyze_full_doc:
    results = model.predict_document(document_structure)
else:
    # Clause-by-clause inference
    results = model.predict_risk_pattern(input_ids, attention_mask)
```

---

## Implementation Order

1. ✅ **config.py** - Add configuration flags
2. ✅ **trainer.py** - Update model initialization and training loop
3. ✅ **train.py** - Add command-line args
4. ✅ **evaluate.py** - Add hierarchical evaluation
5. ✅ **calibrate.py** - Add hierarchical calibration
6. ✅ **advanced_analysis.py** - Add hierarchical analysis
7. ✅ **analyze_document.py** - Add hierarchical document analysis
8. ✅ **test_setup.py** - Add tests

---

## Backward Compatibility

All changes maintain backward compatibility:
- Default behavior: Uses `FullyLearningBasedLegalBERT` (current model)
- Optional: Use `HierarchicalLegalBERT` via config flag
- Training: Both models train the same way (clause-level)
- Inference: Hierarchical model offers enhanced document-level analysis