# Quick Start Guide - Legal-BERT

## Prerequisites
```bash
# 1. Install dependencies
pip install -r requirements.txt

# 2. Download CUAD dataset
# Place at: dataset/CUAD_v1/CUAD_v1.json
```
## Verify Setup

```bash
python test_setup.py
```

Expected output:

```
LEGAL-BERT PROJECT - QUICK TEST
Testing imports... PyTorch, Transformers, scikit-learn, Pandas, NumPy ...
ALL TESTS PASSED! Ready to train! Run: python train.py
```
## Training

```bash
python train.py
```
What it does:
- Loads CUAD dataset (19,598 clauses)
- Discovers 7 risk patterns automatically
- Trains Legal-BERT for 5 epochs (~2-4 hours on GPU)
- Saves checkpoints every epoch
- Generates training history plot
Output:

```
checkpoints/
├── legal_bert_epoch_1.pt
├── legal_bert_epoch_2.pt
├── ...
├── training_history.png
└── training_summary.json

models/legal_bert/
└── final_model.pt
```
Expected Results:
- Train Accuracy: >60%
- Val Accuracy: >55%
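The checkpoint-per-epoch flow above can be sketched without torch; the filenames and loss values here are illustrative stand-ins, since the real `train.py` saves model weights as `.pt` files:

```python
# Torch-free sketch of train.py's checkpointing flow. Real checkpoints are
# model weights (.pt), not JSON; paths and loss values are made up here.
import json
import os
import tempfile

def run_training(num_epochs=5, out_dir="checkpoints"):
    os.makedirs(out_dir, exist_ok=True)
    history = []
    for epoch in range(1, num_epochs + 1):
        train_loss = 1.0 / epoch  # stand-in for a real training epoch
        history.append({"epoch": epoch, "train_loss": train_loss})
        # One checkpoint per epoch, mirroring legal_bert_epoch_N.pt
        with open(os.path.join(out_dir, f"epoch_{epoch}.json"), "w") as f:
            json.dump(history[-1], f)
    # Summary written once at the end, mirroring training_summary.json
    with open(os.path.join(out_dir, "training_summary.json"), "w") as f:
        json.dump(history, f)
    return history

history = run_training(out_dir=tempfile.mkdtemp())
print([round(h["train_loss"], 2) for h in history])
```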
## Evaluation

```bash
python evaluate.py
```
What it does:
- Loads trained model
- Evaluates on test set
- Calculates comprehensive metrics
- Generates visualizations
- Saves detailed report
Output:

```
checkpoints/
├── evaluation_results.json
├── confusion_matrix.png
└── risk_distribution.png
evaluation_report.txt
```
Expected Results:
- Accuracy: >70%
- F1-Score: >0.65
- Precision: >0.60
- Recall: >0.60
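For reference, the metrics above are computed as follows. This is a generic stdlib illustration, not the code from `evaluator.py`, and the clause labels are invented:

```python
# Illustration of accuracy and macro-averaged precision/recall/F1.
# Label names are hypothetical; evaluator.py's real implementation may differ.
def macro_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    n = len(labels)
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n

y_true = ["indemnity", "liability", "liability", "termination", "indemnity"]
y_pred = ["indemnity", "liability", "termination", "termination", "indemnity"]
acc, prec, rec, f1 = macro_metrics(y_true, y_pred)
print(f"accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f} f1={f1:.2f}")
```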
## Calibration

```bash
python calibrate.py
```
What it does:
- Loads trained model
- Applies temperature scaling
- Calculates ECE/MCE
- Saves calibrated model
- Exports results
Output:
checkpoints/
βββ calibration_results.json
models/legal_bert/
βββ calibrated_model.pt
Expected Results:
- ECE: 0.15 → <0.08
- MCE: 0.20 → <0.12
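Temperature scaling and ECE can be illustrated in a few lines of stdlib Python. This is only a sketch, not `calibrate.py`'s actual implementation (which may, for example, fit the temperature by gradient descent rather than by hand):

```python
# Sketch of temperature scaling and Expected Calibration Error (ECE).
# Logits and bin counts are illustrative; calibrate.py may differ.
import math

def softmax(logits, T=1.0):
    """Softmax with temperature T; T > 1 softens overconfident outputs."""
    scaled = [z / T for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    s = sum(exps)
    return [e / s for e in exps]

def ece(confidences, correct, n_bins=10):
    """ECE: bin-weighted average of |accuracy - confidence| per bin."""
    total, err = len(confidences), 0.0
    for b in range(n_bins):
        lo, hi = b / n_bins, (b + 1) / n_bins
        idx = [i for i, c in enumerate(confidences) if lo < c <= hi]
        if not idx:
            continue
        bin_acc = sum(correct[i] for i in idx) / len(idx)
        bin_conf = sum(confidences[i] for i in idx) / len(idx)
        err += (len(idx) / total) * abs(bin_acc - bin_conf)
    return err

# Toy example: the same logits give a lower top probability at T=2.
logits = [2.0, 0.5, -1.0]
print(max(softmax(logits, T=1.0)))  # higher confidence
print(max(softmax(logits, T=2.0)))  # lower confidence after scaling
```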
## Complete Pipeline

```bash
# Run everything in sequence
python train.py && python evaluate.py && python calibrate.py
```
## Configuration

Edit `config.py` to customize:

```python
# Model settings
bert_model_name = "bert-base-uncased"
num_risk_categories = 7
max_sequence_length = 512

# Training settings
batch_size = 16        # Reduce if GPU OOM
num_epochs = 5         # Increase for better results
learning_rate = 2e-5   # Adjust for convergence

# Paths
data_path = "dataset/CUAD_v1/CUAD_v1.json"
checkpoint_dir = "checkpoints"
```
## Troubleshooting

### GPU Out of Memory

```python
# In config.py, reduce:
batch_size = 8  # or even 4
```

### Missing Dataset

```
Error: Dataset not found
Solution: Download CUAD and place at:
dataset/CUAD_v1/CUAD_v1.json
```

### Import Errors

```bash
# Reinstall dependencies
pip install -r requirements.txt --upgrade
```

### Visualization Errors

```bash
# If matplotlib errors occur
pip install matplotlib seaborn
```

If matplotlib is unavailable, plots are skipped but everything else still works.
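The "plots will be skipped" behavior suggests a guarded import; a minimal sketch of that pattern (the flag name and function are hypothetical, and the real `utils.py` may do this differently):

```python
# Guarded-import pattern: plotting is optional, everything else still runs.
# HAS_MPL and maybe_plot are hypothetical names, not from the project.
try:
    import matplotlib  # noqa: F401
    HAS_MPL = True
except ImportError:
    HAS_MPL = False

def maybe_plot(history, path="checkpoints/training_history.png"):
    """Plot training history if matplotlib is available; otherwise skip."""
    if not HAS_MPL:
        print("matplotlib not installed - skipping plot")
        return False
    # ... actual plotting code would go here ...
    return True
```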
## Performance Tips

### Speed Up Training

- Use GPU (CUDA): automatic if available
- Increase batch size: `batch_size = 32`
- Use fewer epochs: `num_epochs = 3`

### Improve Accuracy

- Train longer: `num_epochs = 10`
- Adjust learning rate: `learning_rate = 3e-5`
- Use a larger BERT: `bert_model_name = "bert-large-uncased"`

### Better Calibration

- More validation data: adjust splits in `data_loader.py`
- More iterations: increase `max_iter` in `calibrate.py`
## File Structure

```
code2/
├── train.py              ← Run this first
├── evaluate.py           ← Then this
├── calibrate.py          ← Finally this
├── test_setup.py         ← Verify before training
│
├── config.py             ← Edit settings here
├── data_loader.py        ← Loads CUAD dataset
├── risk_discovery.py     ← Discovers patterns
├── model.py              ← Legal-BERT architecture
├── trainer.py            ← Training logic
├── evaluator.py          ← Evaluation logic
├── utils.py              ← Helper functions
│
├── README.md             ← Full documentation
├── IMPLEMENTATION.md     ← Implementation details
├── COMPLETION_SUMMARY.md ← What was done
└── QUICK_START.md        ← This file
```
## Common Commands

```bash
# Check setup
python test_setup.py

# Train model
python train.py

# Evaluate model
python evaluate.py

# Calibrate model
python calibrate.py

# Run all
python train.py && python evaluate.py && python calibrate.py
```

Python interactive (after training):

```python
>>> from evaluator import LegalBertEvaluator
>>> # Load and analyze results
```
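As a sketch of what that interactive analysis might look like: the key names below (`f1_macro`, `per_pattern_f1`) are assumptions about the structure of `evaluation_results.json`, and a sample dict stands in for reading the real file:

```python
# Hypothetical post-evaluation analysis. The JSON keys are assumed, not
# taken from evaluator.py; a sample dict stands in for the real file.
import json

sample = {
    "accuracy": 0.72,
    "f1_macro": 0.67,
    "per_pattern_f1": {"pattern_0": 0.71, "pattern_1": 0.58},
}
# In practice: text = open("checkpoints/evaluation_results.json").read()
text = json.dumps(sample)
results = json.loads(text)

# Flag patterns below the 0.65 F1 target for closer inspection.
weak = [p for p, f1 in results["per_pattern_f1"].items() if f1 < 0.65]
print("overall F1:", results["f1_macro"])
print("patterns needing work:", weak)
```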
## Expected Timeline
| Task | Time (GPU) | Time (CPU) |
|---|---|---|
| Setup verification | 30 seconds | 30 seconds |
| Training (5 epochs) | 2-4 hours | 8-12 hours |
| Evaluation | 10 minutes | 20 minutes |
| Calibration | 5 minutes | 10 minutes |
| Total | ~3 hours | ~10 hours |
## Success Indicators

### After Training

- Checkpoints saved in `checkpoints/`
- Training loss decreasing
- Validation accuracy >55%
- No CUDA errors

### After Evaluation

- Accuracy >70%
- F1-Score >0.65
- Confusion matrix generated
- Report saved

### After Calibration

- ECE <0.10
- Temperature ~1.5-2.5
- Calibrated model saved
## Getting Help

- Check `README.md` for detailed documentation
- Check `IMPLEMENTATION.md` for technical details
- Check `COMPLETION_SUMMARY.md` for what was implemented
- Review error messages carefully
- Verify setup with `python test_setup.py`
## Next Steps

After completing training, evaluation, and calibration:

- Analyze Results: check the evaluation report
- Tune Parameters: adjust `config.py` if needed
- Retrain: run `train.py` again with new settings
- Deploy (optional): create an API or web interface
## Key Metrics to Track

### Training

- Train Loss (should decrease)
- Val Loss (should decrease)
- Train Accuracy (should increase)
- Val Accuracy (should increase)

### Evaluation

- Overall Accuracy (>70%)
- F1-Score (>0.65)
- Per-pattern F1 (check which patterns need work)
- Regression R² (>0.60 for severity/importance)

### Calibration

- ECE (target: <0.08)
- MCE (target: <0.12)
- Temperature (typically 1.5-2.5)
Ready? Start with: `python test_setup.py`

Questions? Check `README.md` for comprehensive documentation.

Good luck with your Legal-BERT training!