# Hierarchical BERT Integration Checklist

## Files That Need Changes

### Already Updated

1. **model.py** - Added `HierarchicalLegalBERT` class

### Files That Need Updates

#### **High Priority** (Training/Inference Core)

1. **trainer.py** - Update to support hierarchical model
   - Line 14: Import statement
   - Line 169: Model initialization
   - Training loop compatibility
2. **train.py** - Add hierarchical model option
   - Line 0: Add command-line argument for model type
3. **evaluate.py** - Support hierarchical model evaluation
   - Line 45: Import statement
   - Line 47: Model initialization
4. **calibrate.py** - Support hierarchical model calibration
   - Line 14: Import statement
   - Model usage throughout
5. **advanced_analysis.py** - Support hierarchical analysis
   - Line 16: Import statement
   - Line 25: Model loading
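
The command-line argument for `train.py` could look like the following. This is a minimal sketch, not the project's actual parser: the flag name `--model-type`, its choices `flat` and `hierarchical`, and the helper names are all assumptions made for illustration.

```python
import argparse


def build_arg_parser() -> argparse.ArgumentParser:
    """Build a parser with a model-type switch (flag name is hypothetical)."""
    parser = argparse.ArgumentParser(description="Train a legal risk model")
    parser.add_argument(
        "--model-type",
        choices=["flat", "hierarchical"],
        default="flat",  # keeps the current model as the default
        help="'flat' selects FullyLearningBasedLegalBERT, "
             "'hierarchical' selects HierarchicalLegalBERT",
    )
    return parser


def wants_hierarchical(argv=None) -> bool:
    """Return True when the hierarchical model was requested on the command line."""
    args = build_arg_parser().parse_args(argv)
    return args.model_type == "hierarchical"
```

The boolean returned by `wants_hierarchical` could then be copied onto the config's `use_hierarchical_model` flag before the trainer is constructed.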

#### **Medium Priority** (Utilities)

6. **analyze_document.py** - Add hierarchical document analysis
   - Line 274: Import statement
   - Add hierarchical inference option
7. **config.py** - Add hierarchical model config options
   - Add `use_hierarchical_model` flag
   - Add hierarchical-specific parameters
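
One way to express these options, assuming the configuration is (or can be) a dataclass. Only `use_hierarchical_model` comes from the checklist; the class name `HierarchicalOptions` and the two extra fields are hypothetical placeholders for whatever hierarchical-specific parameters the model ends up needing.

```python
from dataclasses import dataclass


@dataclass
class HierarchicalOptions:
    """Hypothetical config fragment; only the flag name comes from the checklist."""

    # Off by default so existing runs keep using FullyLearningBasedLegalBERT.
    use_hierarchical_model: bool = False

    # Placeholder hierarchical-specific parameters (names and values assumed).
    max_clauses_per_document: int = 64
    document_encoder_layers: int = 2

    def __post_init__(self) -> None:
        # Fail early on values that cannot describe a usable document encoder.
        if self.max_clauses_per_document < 1:
            raise ValueError("max_clauses_per_document must be at least 1")
        if self.document_encoder_layers < 1:
            raise ValueError("document_encoder_layers must be at least 1")
```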

#### **Low Priority** (Testing)

8. **test_setup.py** - Add hierarchical model tests
   - Line 107: Import statement

### Summary of Changes Needed

**Import Changes:**

```python
# OLD:
from model import FullyLearningBasedLegalBERT

# NEW (support both):
from model import FullyLearningBasedLegalBERT, HierarchicalLegalBERT
```

**Model Selection Logic:**

```python
# Pick the model class from the config flag; both take the same constructor arguments.
if config.use_hierarchical_model:
    model = HierarchicalLegalBERT(config, num_discovered_risks)
else:
    model = FullyLearningBasedLegalBERT(config, num_discovered_risks)
```
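
To avoid repeating this branch in `trainer.py`, `evaluate.py`, `calibrate.py`, and the analysis scripts, the selection could live in one helper. The sketch below is an assumption about how that might look: `build_model` is a hypothetical name, and the two classes are empty stand-ins so the example runs without the real `model.py`.

```python
from types import SimpleNamespace


class _StubModel:
    """Shared stand-in base; the real classes live in model.py."""

    def __init__(self, config, num_discovered_risks):
        self.config = config
        self.num_discovered_risks = num_discovered_risks


class FullyLearningBasedLegalBERT(_StubModel):
    """Stub for the current model (illustration only)."""


class HierarchicalLegalBERT(_StubModel):
    """Stub for the hierarchical model (illustration only)."""


def build_model(config, num_discovered_risks):
    """Return the model selected by config.use_hierarchical_model."""
    # getattr with a False default lets configs created before the flag
    # existed keep selecting the current model.
    use_hierarchical = getattr(config, "use_hierarchical_model", False)
    model_cls = HierarchicalLegalBERT if use_hierarchical else FullyLearningBasedLegalBERT
    return model_cls(config, num_discovered_risks)


legacy = build_model(SimpleNamespace(), num_discovered_risks=7)
hierarchical = build_model(
    SimpleNamespace(use_hierarchical_model=True), num_discovered_risks=7
)
print(type(legacy).__name__)        # FullyLearningBasedLegalBERT
print(type(hierarchical).__name__)  # HierarchicalLegalBERT
```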

**Forward Pass Changes:**

```python
# For single-clause training (hierarchical model)
if isinstance(model, HierarchicalLegalBERT):
    outputs = model.forward_single_clause(input_ids, attention_mask)
else:
    outputs = model(input_ids, attention_mask)
```
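
The same dispatch can be wrapped in a small function so the training loop has a single call site. This is a sketch under assumptions: `clause_forward` is a hypothetical helper, and it checks for the `forward_single_clause` method instead of the class, which is a design alternative to the `isinstance` test and not necessarily what the codebase does. The stub models return strings only so the example runs without PyTorch.

```python
class FlatStub:
    """Stand-in for FullyLearningBasedLegalBERT: one clause-level forward."""

    def __call__(self, input_ids, attention_mask):
        return "flat forward"


class HierarchicalStub:
    """Stand-in for HierarchicalLegalBERT: separate single-clause entry point."""

    def forward_single_clause(self, input_ids, attention_mask):
        return "single-clause forward"

    def __call__(self, input_ids, attention_mask):
        return "document forward"


def clause_forward(model, input_ids, attention_mask):
    """Run a clause-level forward pass on either model type."""
    # Prefer the dedicated single-clause method when the model provides one.
    single_clause = getattr(model, "forward_single_clause", None)
    if callable(single_clause):
        return single_clause(input_ids, attention_mask)
    return model(input_ids, attention_mask)


print(clause_forward(FlatStub(), [101, 102], [1, 1]))          # flat forward
print(clause_forward(HierarchicalStub(), [101, 102], [1, 1]))  # single-clause forward
```

Checking for the method keeps the training loop free of a direct import of `HierarchicalLegalBERT`, at the cost of being less explicit than the `isinstance` check.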

**Inference Changes:**

```python
# For document-level inference (hierarchical model)
if isinstance(model, HierarchicalLegalBERT) and analyze_full_doc:
    results = model.predict_document(document_structure)
else:
    # Clause-by-clause inference
    results = model.predict_risk_pattern(input_ids, attention_mask)
```

---
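
A slightly fuller sketch of this branch makes the fallback explicit: if document-level analysis is requested but the loaded model is not hierarchical, the code falls back to clause-by-clause prediction and emits a warning. `run_inference` and the stub classes are hypothetical; only the method names `predict_document` and `predict_risk_pattern` are taken from the snippet above, and the returned dictionaries are placeholders.

```python
import warnings


class FullyLearningBasedLegalBERT:
    """Stub: clause-level prediction only (illustration, not the real class)."""

    def predict_risk_pattern(self, input_ids, attention_mask):
        return {"level": "clause"}


class HierarchicalLegalBERT:
    """Stub: clause-level and document-level prediction (illustration only)."""

    def predict_risk_pattern(self, input_ids, attention_mask):
        return {"level": "clause"}

    def predict_document(self, document_structure):
        return {"level": "document"}


def run_inference(model, input_ids=None, attention_mask=None,
                  document_structure=None, analyze_full_doc=False):
    """Use document-level inference when possible, else clause-by-clause."""
    if analyze_full_doc:
        if isinstance(model, HierarchicalLegalBERT):
            return model.predict_document(document_structure)
        # Requested mode is unavailable for this model: say so, then fall back.
        warnings.warn(
            "Document-level analysis needs HierarchicalLegalBERT; "
            "falling back to clause-level prediction.",
            stacklevel=2,
        )
    return model.predict_risk_pattern(input_ids, attention_mask)
```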

## Implementation Order

1. **config.py** - Add configuration flags
2. **trainer.py** - Update model initialization and training loop
3. **train.py** - Add command-line args
4. **evaluate.py** - Add hierarchical evaluation
5. **calibrate.py** - Add hierarchical calibration
6. **advanced_analysis.py** - Add hierarchical analysis
7. **analyze_document.py** - Add hierarchical document analysis
8. **test_setup.py** - Add tests

---

## Backward Compatibility

All changes maintain backward compatibility:

- Default behavior: uses `FullyLearningBasedLegalBERT` (the current model)
- Optional: enable `HierarchicalLegalBERT` via the `use_hierarchical_model` config flag
- Training: both models train the same way, at the clause level
- Inference: the hierarchical model additionally supports document-level analysis