---
license: mit
language:
- en
tags:
- legal
- glacier
- distillation
- sequence-classification
pipeline_tag: text-classification
datasets:
- glacier-legal/legal-distillation-data
base_model: nlpaueb/legal-bert-base-uncased
---
# GLACIER glacier-document-classifier
**Distilled legal AI model** for the [GLACIER pipeline](https://github.com/OrionDevPartners/glacier-legal-mcp) (Gated Legal Analysis, Citation Intelligence, Evidence Routing).
## Model Description
This model is distilled from Claude Opus 4.6 (via AWS Bedrock) into a lightweight transformer for fast, local inference. It handles **legal document type classification (complaint, motion, brief, etc.)** as part of the GLACIER 6-stage legal document production pipeline.
- **Base model:** [nlpaueb/legal-bert-base-uncased](https://huggingface.co/nlpaueb/legal-bert-base-uncased)
- **Task:** sequence-classification
- **Labels:** 12 classes
- **Max length:** 512 tokens
## Labels
- `complaint`
- `answer`
- `motion`
- `brief`
- `order`
- `opinion`
- `notice`
- `subpoena`
- `affidavit`
- `demand_letter`
- `bar_complaint`
- `other`
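When post-processing predictions, it can help to have the label set available as an explicit mapping. The sketch below builds `id2label` and `label2id` dictionaries from the list above. It assumes that class ids follow the listed order, which this card does not confirm; the `id2label` field in the model's `config.json` is the authoritative source.

```python
# The 12 document-type labels, in the order listed in this card.
LABELS = [
    "complaint", "answer", "motion", "brief", "order", "opinion",
    "notice", "subpoena", "affidavit", "demand_letter", "bar_complaint", "other",
]

# Assumption: class ids follow the listed order. Verify against config.json.
id2label = dict(enumerate(LABELS))
label2id = {label: idx for idx, label in id2label.items()}

print(len(LABELS), label2id["motion"])
```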
## Usage
```python
from glacier_distill.inference import GlacierPipeline
pipeline = GlacierPipeline()
result = pipeline.classify_document("your legal text here")
print(result)
```
Or use directly with transformers:
```python
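from transformers import pipeline
classifier = pipeline("text-classification", model="glacier-legal/glacier-document-classifier")
# Truncate inputs to the model's 512-token limit
result = classifier("your legal text here", truncation=True, max_length=512)
print(result)  # list of {"label": ..., "score": ...} dicts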
```
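Because the model accepts at most 512 tokens, a long filing has to be either truncated or split before classification. The following is a minimal sketch of one possible approach, not part of the GLACIER package: it splits the text into overlapping word windows, classifies each window, and picks the label with the highest summed score. The helper names (`chunk_words`, `classify_long_document`), the window sizes, and the score-summing rule are all illustrative assumptions, and word counts only approximate token counts.

```python
from collections import defaultdict

def chunk_words(text, max_words=300, stride=250):
    """Split text into overlapping windows of whitespace-separated words."""
    words = text.split()
    chunks = []
    for start in range(0, len(words), stride):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reaches the end of the text
    return chunks

def classify_long_document(text, classify_chunk):
    """Aggregate per-chunk predictions into one document-level label.

    classify_chunk is any callable mapping a string to a list of
    {"label": str, "score": float} dicts (hypothetical interface).
    """
    chunks = chunk_words(text)
    totals = defaultdict(float)
    for chunk in chunks:
        for pred in classify_chunk(chunk):
            totals[pred["label"]] += pred["score"]
    if not totals:
        return None
    best = max(totals, key=totals.get)
    # Report the summed score averaged over all chunks.
    return {"label": best, "score": totals[best] / len(chunks)}
```

With the transformers classifier from the example above, a call could look like `classify_long_document(text, lambda c: classifier(c, truncation=True))`.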
## Training
- **Teacher:** Claude Opus 4.6 (AWS Bedrock)
- **Method:** Knowledge distillation (Hinton et al., 2015) with temperature=4.0, alpha=0.7
- **Data:** CourtListener case law + synthetic labeled examples
- **Framework:** HuggingFace Transformers + custom DistillationLoss
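To make the distillation settings above concrete, here is a minimal pure-Python sketch of a Hinton-style loss for a single example. It is not the repository's custom `DistillationLoss`, which is not shown in this card. It assumes that the teacher's output is available as per-class scores, that `alpha` weights the soft (teacher) term and `1 - alpha` the hard-label term, and that the soft term is scaled by the squared temperature as in Hinton et al. (2015).

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, true_label,
                      temperature=4.0, alpha=0.7):
    """Weighted sum of a soft-target KL term and a hard-label cross-entropy term."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student) on the temperature-softened distributions.
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    soft_loss = kl * temperature ** 2
    # Standard cross-entropy against the ground-truth label (temperature 1).
    hard_loss = -math.log(softmax(student_logits)[true_label])
    # Assumption: alpha weights the soft term.
    return alpha * soft_loss + (1 - alpha) * hard_loss

print(distillation_loss([2.0, 0.5, 0.1], [3.0, 0.2, 0.0], true_label=0))
```

A higher temperature flattens both distributions, so the student also learns from the relative scores the teacher assigns to the non-target classes.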
## GLACIER Pipeline
This model runs in Stage 1 of the GLACIER pipeline, alongside the jurisdiction router. The full set of stages is:
```
Stage 1: QUERY -> jurisdiction-router + document-classifier
Stage 2: RESEARCH -> legal-ner (entity extraction)
Stage 3: WDC #1 -> (full model review)
Stage 4: DRAFT -> legal-ner + citation-classifier
Stage 5: WDC #2 -> hallucination-detector + citation-classifier
Stage 6: FINAL -> (human review)
```
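The stage table above can also be written as a small lookup structure, which is convenient for checking where a given distilled model is used. This is an illustrative sketch only: the `GLACIER_STAGES` dictionary and the `stages_using` helper are hypothetical names and do not come from the GLACIER codebase.

```python
# Stage number -> (stage name, distilled models used in that stage).
# Stages 3 and 6 use no distilled models (full model review / human review).
GLACIER_STAGES = {
    1: ("QUERY", ["jurisdiction-router", "document-classifier"]),
    2: ("RESEARCH", ["legal-ner"]),
    3: ("WDC #1", []),
    4: ("DRAFT", ["legal-ner", "citation-classifier"]),
    5: ("WDC #2", ["hallucination-detector", "citation-classifier"]),
    6: ("FINAL", []),
}

def stages_using(model_name):
    """Return the stage numbers in which the given model appears."""
    return [number for number, (_, models) in GLACIER_STAGES.items()
            if model_name in models]

print(stages_using("document-classifier"))
```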
## Limitations
- Distilled models are optimized for US legal text (federal + state)
- Not a substitute for full model review in GLACIER Stages 3/5
- Citation hallucination detection is a pre-filter, not a replacement for external verification
- Jurisdiction coverage: Florida, Mississippi, Federal (primary); other states (limited)
## License
MIT — Part of the GLACIER Legal AI Framework by Orion Dev Partners, LLC.