Hierarchical BERT Integration Checklist
Files That Need Changes
Already Updated
- model.py - Added HierarchicalLegalBERT class
Files That Need Updates
High Priority (Training/Inference Core)
trainer.py - Update to support hierarchical model
- Line 14: Import statement
- Line 169: Model initialization
- Training loop compatibility
train.py - Add hierarchical model option
- Line 0: Add command-line argument for model type
evaluate.py - Support hierarchical model evaluation
- Line 45: Import statement
- Line 47: Model initialization
calibrate.py - Support hierarchical model calibration
- Line 14: Import statement
- Model usage throughout
advanced_analysis.py - Support hierarchical analysis
- Line 16: Import statement
- Line 25: Model loading
Medium Priority (Utilities)
analyze_document.py - Add hierarchical document analysis
- Line 274: Import statement
- Add hierarchical inference option
config.py - Add hierarchical model config options
- Add use_hierarchical_model flag
- Add hierarchical-specific parameters
- Add
Low Priority (Testing)
test_setup.py - Add hierarchical model tests
- Line 107: Import statement
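
The config.py and train.py items above could be wired together roughly as follows. This is a minimal sketch: only the use_hierarchical_model flag name comes from this checklist. The ModelConfig dataclass, the --model-type argument, and its choice names are assumptions for illustration, and the real config.py presumably holds many more fields.

```python
import argparse
from dataclasses import dataclass


@dataclass
class ModelConfig:
    """Hypothetical stand-in for the project's config class."""
    # Flag named in the checklist; False keeps the current model as the default.
    use_hierarchical_model: bool = False


def parse_args(argv=None):
    """Parse the (assumed) --model-type command-line option."""
    parser = argparse.ArgumentParser(description="Train a legal risk model")
    parser.add_argument(
        "--model-type",
        choices=["flat", "hierarchical"],  # choice names are illustrative
        default="flat",
        help="which model class to train",
    )
    return parser.parse_args(argv)


def config_from_args(args):
    """Translate the CLI choice into the config flag."""
    return ModelConfig(use_hierarchical_model=(args.model_type == "hierarchical"))
```

Keeping the default at "flat" means existing training commands run unchanged when the new argument is omitted.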
Summary of Changes Needed
Import Changes:
# OLD:
from model import FullyLearningBasedLegalBERT
# NEW (support both):
from model import FullyLearningBasedLegalBERT, HierarchicalLegalBERT
Model Selection Logic:
if config.use_hierarchical_model:
    model = HierarchicalLegalBERT(config, num_discovered_risks)
else:
    model = FullyLearningBasedLegalBERT(config, num_discovered_risks)
Forward Pass Changes:
# For single-clause training (hierarchical model)
if isinstance(model, HierarchicalLegalBERT):
    outputs = model.forward_single_clause(input_ids, attention_mask)
else:
    outputs = model(input_ids, attention_mask)
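
Because trainer.py, evaluate.py, and calibrate.py all need this same branch, one option is to keep it in a single helper rather than repeating it in each file. The sketch below uses stub classes in place of the real models so that it runs on its own. forward_clause is a hypothetical helper name, and the stub return values are placeholders, not the models' actual outputs.

```python
class FullyLearningBasedLegalBERT:
    """Stub standing in for the real class in model.py."""

    def __call__(self, input_ids, attention_mask):
        return {"path": "flat"}  # placeholder output


class HierarchicalLegalBERT:
    """Stub standing in for the real class in model.py."""

    def forward_single_clause(self, input_ids, attention_mask):
        return {"path": "single_clause"}  # placeholder output


def forward_clause(model, input_ids, attention_mask):
    """Run a clause-level forward pass on either model type."""
    if isinstance(model, HierarchicalLegalBERT):
        return model.forward_single_clause(input_ids, attention_mask)
    return model(input_ids, attention_mask)
```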
Inference Changes:
# For document-level inference (hierarchical model)
if isinstance(model, HierarchicalLegalBERT) and analyze_full_doc:
    results = model.predict_document(document_structure)
else:
    # Clause-by-clause inference
    results = model.predict_risk_pattern(input_ids, attention_mask)
Implementation Order
- config.py - Add configuration flags
- trainer.py - Update model initialization and training loop
- train.py - Add command-line args
- evaluate.py - Add hierarchical evaluation
- calibrate.py - Add hierarchical calibration
- advanced_analysis.py - Add hierarchical analysis
- analyze_document.py - Add hierarchical document analysis
- test_setup.py - Add tests
Backward Compatibility
All changes maintain backward compatibility:
- Default behavior: Uses FullyLearningBasedLegalBERT (current model)
- Optional: Use HierarchicalLegalBERT via config flag
- Training: Both models train the same way (clause-level)
- Inference: Hierarchical model adds document-level analysis via predict_document; clause-by-clause inference remains available
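
The default-behavior guarantee above lends itself to a small check in test_setup.py. The following is a sketch only: the stub classes, LegacyConfig, NewConfig, and build_model are assumptions made so the example runs without the real model.py. It mirrors the selection logic from the summary, but reads the flag with getattr and a default, so a config object created before the flag existed still selects the current model.

```python
class FullyLearningBasedLegalBERT:
    """Stub standing in for the current model."""

    def __init__(self, config, num_discovered_risks):
        self.num_discovered_risks = num_discovered_risks


class HierarchicalLegalBERT:
    """Stub standing in for the hierarchical model."""

    def __init__(self, config, num_discovered_risks):
        self.num_discovered_risks = num_discovered_risks


def build_model(config, num_discovered_risks):
    """Select the model class; a missing flag falls back to the current model."""
    if getattr(config, "use_hierarchical_model", False):
        return HierarchicalLegalBERT(config, num_discovered_risks)
    return FullyLearningBasedLegalBERT(config, num_discovered_risks)


class LegacyConfig:
    """Hypothetical config written before the flag was introduced."""


class NewConfig:
    """Hypothetical config that carries the new flag."""

    def __init__(self, use_hierarchical_model=False):
        self.use_hierarchical_model = use_hierarchical_model


def test_default_is_current_model():
    assert isinstance(build_model(LegacyConfig(), 7), FullyLearningBasedLegalBERT)
    assert isinstance(build_model(NewConfig(), 7), FullyLearningBasedLegalBERT)


def test_flag_selects_hierarchical_model():
    model = build_model(NewConfig(use_hierarchical_model=True), 7)
    assert isinstance(model, HierarchicalLegalBERT)
```

Both functions follow the pytest naming convention, so they would be collected automatically if the project uses pytest; they can also be called directly.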