# Hierarchical BERT Integration Checklist

## Files That Need Changes

### ✅ Already Updated

1. **model.py** - Added `HierarchicalLegalBERT` class

### 🔧 Files That Need Updates

#### **High Priority** (Training/Inference Core)

1. **trainer.py** - Update to support hierarchical model
   - Line 14: Import statement
   - Line 169: Model initialization
   - Training loop compatibility

2. **train.py** - Add hierarchical model option
   - Line 0: Add command-line argument for model type

3. **evaluate.py** - Support hierarchical model evaluation
   - Line 45: Import statement
   - Line 47: Model initialization

4. **calibrate.py** - Support hierarchical model calibration
   - Line 14: Import statement
   - Model usage throughout

5. **advanced_analysis.py** - Support hierarchical analysis
   - Line 16: Import statement
   - Line 25: Model loading

#### **Medium Priority** (Utilities)

6. **analyze_document.py** - Add hierarchical document analysis
   - Line 274: Import statement
   - Add hierarchical inference option

7. **config.py** - Add hierarchical model config options
   - Add `use_hierarchical_model` flag
   - Add hierarchical-specific parameters

#### **Low Priority** (Testing)

8. **test_setup.py** - Add hierarchical model tests
   - Line 107: Import statement

### 📝 Summary of Changes Needed

**Import Changes:**

```python
# OLD:
from model import FullyLearningBasedLegalBERT

# NEW (support both):
from model import FullyLearningBasedLegalBERT, HierarchicalLegalBERT
```

**Model Selection Logic:**

```python
if config.use_hierarchical_model:
    model = HierarchicalLegalBERT(config, num_discovered_risks)
else:
    model = FullyLearningBasedLegalBERT(config, num_discovered_risks)
```

**Forward Pass Changes:**

```python
# For single-clause training (hierarchical model)
if isinstance(model, HierarchicalLegalBERT):
    outputs = model.forward_single_clause(input_ids, attention_mask)
else:
    outputs = model(input_ids, attention_mask)
```

**Inference Changes:**

```python
# For document-level inference (hierarchical model)
if isinstance(model, HierarchicalLegalBERT) and analyze_full_doc:
    results = model.predict_document(document_structure)
else:
    # Clause-by-clause inference
    results = model.predict_risk_pattern(input_ids, attention_mask)
```

---

## Implementation Order

1. ✅ **config.py** - Add configuration flags
2. ✅ **trainer.py** - Update model initialization and training loop
3. ✅ **train.py** - Add command-line args
4. ✅ **evaluate.py** - Add hierarchical evaluation
5. ✅ **calibrate.py** - Add hierarchical calibration
6. ✅ **advanced_analysis.py** - Add hierarchical analysis
7. ✅ **analyze_document.py** - Add hierarchical document analysis
8. ✅ **test_setup.py** - Add tests

---

## Backward Compatibility

All changes maintain backward compatibility:

- Default behavior: Uses `FullyLearningBasedLegalBERT` (current model)
- Optional: Use `HierarchicalLegalBERT` via config flag
- Training: Both models train the same way (clause-level)
- Inference: Hierarchical model offers enhanced document-level analysis
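
### Example: Centralizing Model Selection

The selection and forward-pass branches summarized above recur in several files, so one option is to keep them in a single helper module. The sketch below is illustrative only. The two model classes are stubs standing in for the real ones in `model.py`, and `Config`, `build_model`, `forward_clause`, `parse_args`, and the `--model-type` flag are hypothetical names that do not come from the codebase. Only `use_hierarchical_model`, `forward_single_clause`, and the two class names are taken from this checklist.

```python
import argparse
from dataclasses import dataclass


@dataclass
class Config:
    # Hypothetical stand-in for the project's config object.
    # Defaulting to False keeps the current model as the default.
    use_hierarchical_model: bool = False


class FullyLearningBasedLegalBERT:
    """Stub for the existing model; the real class lives in model.py."""

    def __init__(self, config, num_discovered_risks):
        self.config = config
        self.num_discovered_risks = num_discovered_risks

    def __call__(self, input_ids, attention_mask):
        return {"path": "flat"}


class HierarchicalLegalBERT:
    """Stub for the hierarchical model; the real class lives in model.py."""

    def __init__(self, config, num_discovered_risks):
        self.config = config
        self.num_discovered_risks = num_discovered_risks

    def forward_single_clause(self, input_ids, attention_mask):
        return {"path": "hierarchical_single_clause"}


def build_model(config, num_discovered_risks):
    """Pick the model class from the config flag (hypothetical helper)."""
    if config.use_hierarchical_model:
        return HierarchicalLegalBERT(config, num_discovered_risks)
    return FullyLearningBasedLegalBERT(config, num_discovered_risks)


def forward_clause(model, input_ids, attention_mask):
    """Clause-level forward pass that works for either model type."""
    if isinstance(model, HierarchicalLegalBERT):
        return model.forward_single_clause(input_ids, attention_mask)
    return model(input_ids, attention_mask)


def parse_args(argv=None):
    """Parse a hypothetical --model-type option for train.py."""
    parser = argparse.ArgumentParser(description="Train a legal risk model")
    parser.add_argument(
        "--model-type",
        choices=["flat", "hierarchical"],
        default="flat",  # preserves the backward-compatible default
    )
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    config = Config(use_hierarchical_model=(args.model_type == "hierarchical"))
    model = build_model(config, num_discovered_risks=7)
    print(type(model).__name__)
```

With helpers like these, `trainer.py`, `evaluate.py`, and `calibrate.py` could call `build_model` rather than each repeating the `if config.use_hierarchical_model` branch, which leaves the default path identical to the current behavior.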