---
license: mit
language:
  - en
tags:
  - legal
  - glacier
  - distillation
  - sequence-classification
pipeline_tag: text-classification
datasets:
  - glacier-legal/legal-distillation-data
base_model: nlpaueb/legal-bert-base-uncased
---

# GLACIER glacier-document-classifier

**Distilled legal AI model** for the [GLACIER pipeline](https://github.com/OrionDevPartners/glacier-legal-mcp) (Gated Legal Analysis, Citation Intelligence, Evidence Routing).

## Model Description

This model is distilled from Claude Opus 4.6 (via AWS Bedrock) into a lightweight transformer for fast, local inference. It handles **legal document type classification (complaint, motion, brief, etc.)** as part of the GLACIER 6-stage legal document production pipeline.

- **Base model:** [nlpaueb/legal-bert-base-uncased](https://huggingface.co/nlpaueb/legal-bert-base-uncased)
- **Task:** sequence-classification
- **Labels:** 12 classes
- **Max length:** 512 tokens
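
Legal filings often exceed the 512-token limit. One common workaround, which this card does not confirm GLACIER uses, is to split a long document into overlapping windows and classify each window separately. The sketch below is a minimal illustration under stated assumptions: the `chunk_tokens` helper is hypothetical, whitespace-split words stand in for the tokenizer's subword tokens, and `max_len=510` (leaving room for the `[CLS]` and `[SEP]` tokens) and `overlap=64` are illustrative values.

```python
def chunk_tokens(tokens, max_len=510, overlap=64):
    """Split a token list into overlapping windows of at most max_len tokens.

    Hypothetical helper: the values are illustrative, not taken from GLACIER.
    """
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must be non-negative and smaller than max_len")
    step = max_len - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_len])
        # Stop once the current window already reaches the end of the document.
        if start + max_len >= len(tokens):
            break
    return chunks


# Whitespace splitting is only a stand-in for a real subword tokenizer.
words = "the plaintiff respectfully moves this court for relief".split()
windows = chunk_tokens(words, max_len=4, overlap=1)
```

How the per-window predictions are combined (majority vote, highest score, first window only) is a separate design choice that this sketch leaves open.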

## Labels

- `complaint`
- `answer`
- `motion`
- `brief`
- `order`
- `opinion`
- `notice`
- `subpoena`
- `affidavit`
- `demand_letter`
- `bar_complaint`
- `other`
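
As a rough illustration of how the 12 labels relate to the classifier's output vector, the sketch below builds index-to-label mappings and picks the highest-scoring class. It assumes the class indices follow the order listed above, which the card does not confirm; the authoritative mapping is the `id2label` entry in the model's `config.json`. The `top_label` helper is hypothetical.

```python
# Assumed order: the order in which the labels are listed on this card.
LABELS = [
    "complaint", "answer", "motion", "brief", "order", "opinion",
    "notice", "subpoena", "affidavit", "demand_letter", "bar_complaint", "other",
]

ID2LABEL = dict(enumerate(LABELS))
LABEL2ID = {label: index for index, label in ID2LABEL.items()}


def top_label(scores):
    """Return the label whose score is highest (hypothetical helper)."""
    if len(scores) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} scores, got {len(scores)}")
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return ID2LABEL[best_index]
```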

## Usage

```python
from glacier_distill.inference import GlacierPipeline

pipeline = GlacierPipeline()
result = pipeline.classify_document("your legal text here")
print(result)
```

Or load it directly with `transformers`:

```python
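from transformers import pipeline

classifier = pipeline("text-classification", model="glacier-legal/glacier-document-classifier")
# Truncate inputs that exceed the model's 512-token limit.
result = classifier("your legal text here", truncation=True, max_length=512)
print(result)  # a list of {"label": ..., "score": ...} dicts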
```

## Training

- **Teacher:** Claude Opus 4.6 (AWS Bedrock)
- **Method:** Knowledge distillation (Hinton et al., 2015) with temperature=4.0, alpha=0.7
- **Data:** CourtListener case law + synthetic labeled examples
- **Framework:** HuggingFace Transformers + custom DistillationLoss
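
The custom `DistillationLoss` is not shown in this card. As a minimal sketch of the standard Hinton-style objective with the hyperparameters listed above, the pure-Python example below combines a temperature-softened KL term with ordinary cross-entropy on the hard label. Several things here are assumptions: that `alpha` weights the soft (teacher) term, that the soft term is scaled by the squared temperature, and that the teacher signal is available as a logit vector. The function names are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label_index,
                      temperature=4.0, alpha=0.7):
    """alpha * T^2 * KL(teacher || student) + (1 - alpha) * cross-entropy.

    Assumed formulation; the project's actual loss may differ.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL divergence between the softened teacher and student distributions.
    kl = sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student) if t > 0)
    soft_loss = kl * temperature ** 2
    # Cross-entropy against the hard label uses the unsoftened student output.
    hard_loss = -math.log(softmax(student_logits)[label_index])
    return alpha * soft_loss + (1 - alpha) * hard_loss
```

When the student reproduces the teacher's logits exactly, the KL term vanishes and only the weighted cross-entropy remains, which gives a quick sanity check for any implementation of this form.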

## GLACIER Pipeline

This model runs in Stage 1 (QUERY) of the six-stage GLACIER pipeline, alongside the jurisdiction router:

```
Stage 1: QUERY    -> jurisdiction-router + document-classifier
Stage 2: RESEARCH -> legal-ner (entity extraction)
Stage 3: WDC #1   -> (full model review)
Stage 4: DRAFT    -> legal-ner + citation-classifier
Stage 5: WDC #2   -> hallucination-detector + citation-classifier
Stage 6: FINAL    -> (human review)
```
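
The stage table above can also be written as a small lookup structure, which makes it easy to see where each distilled model is used. This is only an illustrative sketch: the `STAGES` list and `stages_using` helper are hypothetical names, and the model identifiers are copied from the diagram rather than from any GLACIER API.

```python
# (stage name, distilled models used); empty lists mark the full-model
# and human review stages, which use no distilled model.
STAGES = [
    ("QUERY", ["jurisdiction-router", "document-classifier"]),
    ("RESEARCH", ["legal-ner"]),
    ("WDC #1", []),
    ("DRAFT", ["legal-ner", "citation-classifier"]),
    ("WDC #2", ["hallucination-detector", "citation-classifier"]),
    ("FINAL", []),
]


def stages_using(model_name):
    """Return the 1-based stage numbers in which a distilled model appears."""
    return [
        number
        for number, (_, models) in enumerate(STAGES, start=1)
        if model_name in models
    ]
```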

## Limitations

- Distilled models are optimized for US legal text (federal + state)
- Not a substitute for full model review in GLACIER Stages 3/5
- Citation hallucination detection is a pre-filter, not a replacement for external verification
- Jurisdiction coverage: Florida, Mississippi, Federal (primary); other states (limited)

## License

MIT — Part of the GLACIER Legal AI Framework by Orion Dev Partners, LLC.