# Hierarchical BERT Integration Checklist
## Files That Need Changes
### ✅ Already Updated
1. **model.py** - Added `HierarchicalLegalBERT` class
### 🔧 Files That Need Updates
#### **High Priority** (Training/Inference Core)
1. **trainer.py** - Update to support hierarchical model
   - Line 14: Import statement
   - Line 169: Model initialization
   - Training loop compatibility
2. **train.py** - Add hierarchical model option
   - Line 0: Add command-line argument for model type
3. **evaluate.py** - Support hierarchical model evaluation
   - Line 45: Import statement
   - Line 47: Model initialization
4. **calibrate.py** - Support hierarchical model calibration
   - Line 14: Import statement
   - Model usage throughout
5. **advanced_analysis.py** - Support hierarchical analysis
   - Line 16: Import statement
   - Line 25: Model loading
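The command-line argument for train.py could look roughly like the sketch below. The flag name `--model-type`, its choices, and the helper names are assumptions made for illustration; the checklist only says that a model-type argument is needed.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a CLI parser with a model-type switch (flag name is hypothetical)."""
    parser = argparse.ArgumentParser(description="Train a legal risk model")
    parser.add_argument(
        "--model-type",
        choices=["flat", "hierarchical"],
        default="flat",  # default keeps the current FullyLearningBasedLegalBERT path
        help="Which model class to train",
    )
    return parser


def use_hierarchical(argv=None) -> bool:
    """Translate the CLI choice into the boolean stored in config.use_hierarchical_model."""
    args = build_parser().parse_args(argv)
    return args.model_type == "hierarchical"
```

Keeping `flat` as the default means existing training commands keep working unchanged.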
#### **Medium Priority** (Utilities)
6. **analyze_document.py** - Add hierarchical document analysis
   - Line 274: Import statement
   - Add hierarchical inference option
7. **config.py** - Add hierarchical model config options
   - Add `use_hierarchical_model` flag
   - Add hierarchical-specific parameters
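One possible shape for the config.py additions is sketched below. Only `use_hierarchical_model` comes from this checklist; the class name `LegalBertConfig` and the two hierarchical parameters (names and default values) are placeholders, since the real parameters are not listed here.

```python
from dataclasses import dataclass


@dataclass
class LegalBertConfig:
    """Illustrative config; only use_hierarchical_model is named in the checklist."""

    # False keeps the existing model as the default.
    use_hierarchical_model: bool = False
    # Hypothetical hierarchical-specific parameters (names and values are assumptions).
    max_clauses_per_document: int = 64
    document_encoder_layers: int = 2

    def __post_init__(self) -> None:
        # Fail early on values that could not describe a usable document encoder.
        if self.max_clauses_per_document < 1:
            raise ValueError("max_clauses_per_document must be at least 1")
        if self.document_encoder_layers < 1:
            raise ValueError("document_encoder_layers must be at least 1")
```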
#### **Low Priority** (Testing)
8. **test_setup.py** - Add hierarchical model tests
   - Line 107: Import statement
### Summary of Changes Needed
**Import Changes:**
```python
# OLD:
from model import FullyLearningBasedLegalBERT
# NEW (support both):
from model import FullyLearningBasedLegalBERT, HierarchicalLegalBERT
```
**Model Selection Logic:**
```python
if config.use_hierarchical_model:
    model = HierarchicalLegalBERT(config, num_discovered_risks)
else:
    model = FullyLearningBasedLegalBERT(config, num_discovered_risks)
```
**Forward Pass Changes:**
```python
# For single-clause training (hierarchical model)
if isinstance(model, HierarchicalLegalBERT):
    outputs = model.forward_single_clause(input_ids, attention_mask)
else:
    outputs = model(input_ids, attention_mask)
```
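To avoid repeating this `isinstance` check in trainer.py, evaluate.py, and calibrate.py, the dispatch could live in one helper. The sketch below is self-contained only because it uses stub classes in place of the real ones in model.py; the helper name `clause_forward` and the stub return values are assumptions.

```python
class FullyLearningBasedLegalBERT:
    """Stub standing in for the real class in model.py (illustration only)."""

    def __call__(self, input_ids, attention_mask):
        return {"path": "flat", "batch_size": len(input_ids)}


class HierarchicalLegalBERT:
    """Stub standing in for the real class in model.py (illustration only)."""

    def forward_single_clause(self, input_ids, attention_mask):
        return {"path": "single_clause", "batch_size": len(input_ids)}


def clause_forward(model, input_ids, attention_mask):
    """Run a clause-level forward pass on either model type."""
    if isinstance(model, HierarchicalLegalBERT):
        return model.forward_single_clause(input_ids, attention_mask)
    return model(input_ids, attention_mask)
```

With a helper like this, each call site stays a single line regardless of which model the config selects.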
**Inference Changes:**
```python
# For document-level inference (hierarchical model)
if isinstance(model, HierarchicalLegalBERT) and analyze_full_doc:
    results = model.predict_document(document_structure)
else:
    # Clause-by-clause inference
    results = model.predict_risk_pattern(input_ids, attention_mask)
```
---
## Implementation Order
1. **config.py** - Add configuration flags
2. **trainer.py** - Update model initialization and training loop
3. **train.py** - Add command-line args
4. **evaluate.py** - Add hierarchical evaluation
5. **calibrate.py** - Add hierarchical calibration
6. **advanced_analysis.py** - Add hierarchical analysis
7. **analyze_document.py** - Add hierarchical document analysis
8. **test_setup.py** - Add tests
---
## Backward Compatibility
All changes maintain backward compatibility:
- Default behavior: Uses `FullyLearningBasedLegalBERT` (current model)
- Optional: Use `HierarchicalLegalBERT` via config flag
- Training: Both models train the same way (clause-level)
- Inference: Hierarchical model offers enhanced document-level analysis
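One way to keep the default behavior intact for config objects that predate the flag is to read it with a fallback. This is a minimal sketch, assuming configs are plain attribute containers; `select_model_name` is a hypothetical helper that returns the class name rather than constructing a model.

```python
from types import SimpleNamespace


def select_model_name(config) -> str:
    """Return the class name to instantiate; a missing flag means the current model."""
    if getattr(config, "use_hierarchical_model", False):
        return "HierarchicalLegalBERT"
    return "FullyLearningBasedLegalBERT"


# A config created before the flag existed still selects the current model.
legacy_config = SimpleNamespace(batch_size=16)
# A config that opts in selects the hierarchical model.
new_config = SimpleNamespace(use_hierarchical_model=True)
```

Reading the flag with `getattr` and a `False` default means saved or hand-built configs without the attribute do not raise `AttributeError`.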