---
license: mit
language:
- en
tags:
- legal
- glacier
- distillation
- sequence-classification
pipeline_tag: text-classification
datasets:
- glacier-legal/legal-distillation-data
base_model: nlpaueb/legal-bert-base-uncased
---

# GLACIER glacier-document-classifier

**Distilled legal AI model** for the [GLACIER pipeline](https://github.com/OrionDevPartners/glacier-legal-mcp) (Gated Legal Analysis, Citation Intelligence, Evidence Routing).
|
## Model Description

This model is distilled from Claude Opus 4.6 (via AWS Bedrock) into a lightweight transformer for fast, local inference. It handles **legal document type classification (complaint, motion, brief, etc.)** as part of the GLACIER 6-stage legal document production pipeline.

- **Base model:** [nlpaueb/legal-bert-base-uncased](https://huggingface.co/nlpaueb/legal-bert-base-uncased)
- **Task:** sequence-classification
- **Labels:** 12 classes
- **Max length:** 512 tokens
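
Legal filings often run well past 512 tokens, and this card does not say how GLACIER handles them. One option is to split the token sequence into overlapping windows and classify each window separately. The sketch below assumes the input is already a list of token ids, that two positions per window are reserved for BERT's `[CLS]` and `[SEP]` tokens, and that a `stride` of 128 is a reasonable overlap (an arbitrary illustrative value). `chunk_token_ids` is a hypothetical helper, not part of GLACIER.

```python
from typing import List

MAX_LENGTH = 512      # model limit stated in this card
SPECIAL_TOKENS = 2    # assumption: [CLS] and [SEP] occupy two positions


def chunk_token_ids(token_ids: List[int], stride: int = 128) -> List[List[int]]:
    """Split token ids into overlapping windows that fit the model limit.

    Consecutive windows share `stride` tokens, so text cut at a window
    boundary still appears with some context in the next window.
    """
    window = MAX_LENGTH - SPECIAL_TOKENS
    if not 0 <= stride < window:
        raise ValueError("stride must be in [0, window)")
    if len(token_ids) <= window:
        return [list(token_ids)]
    step = window - stride
    chunks = []
    for start in range(0, len(token_ids), step):
        chunks.append(list(token_ids[start:start + window]))
        if start + window >= len(token_ids):
            break  # this window already reaches the end of the document
    return chunks


if __name__ == "__main__":
    chunks = chunk_token_ids(list(range(1200)))
    print([len(c) for c in chunks])
```

Per-window predictions still have to be combined into one document label, for example by averaging scores; which rule suits GLACIER is not stated here.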
|
## Labels

- `complaint`
- `answer`
- `motion`
- `brief`
- `order`
- `opinion`
- `notice`
- `subpoena`
- `affidavit`
- `demand_letter`
- `bar_complaint`
- `other`
|
## Usage

```python
from glacier_distill.inference import GlacierPipeline

# Load the GLACIER inference wrapper and classify a single document.
pipeline = GlacierPipeline()
result = pipeline.classify_document("your legal text here")
print(result)
```

Or use the model directly with transformers:

```python
from transformers import pipeline

classifier = pipeline("text-classification", model="glacier-legal/glacier-document-classifier")
# Truncate so the input does not exceed the model's 512-token limit.
result = classifier("your legal text here", truncation=True, max_length=512)
print(result)  # a list with one {"label": ..., "score": ...} dict per input
```
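
The transformers pipeline returns a list of `{"label": ..., "score": ...}` dicts. A minimal sketch of turning that output into a single decision follows. It assumes the model's config maps class indices to the label names listed in this card (not confirmed here), and it uses a hypothetical confidence threshold of 0.5 below which the document is treated as `other`. `pick_label` is an illustrative helper, not part of GLACIER.

```python
from typing import Dict, List, Union

# The 12 labels listed in this card.
LABELS = {
    "complaint", "answer", "motion", "brief", "order", "opinion",
    "notice", "subpoena", "affidavit", "demand_letter", "bar_complaint", "other",
}


def pick_label(
    predictions: List[Dict[str, Union[str, float]]],
    threshold: float = 0.5,  # hypothetical cutoff, tune on held-out data
) -> str:
    """Return the highest-scoring known label, or "other" when unsure.

    `predictions` has the shape produced by a text-classification pipeline:
    a list of {"label": str, "score": float} dicts.
    """
    if not predictions:
        return "other"
    best = max(predictions, key=lambda p: p["score"])
    if best["label"] not in LABELS or best["score"] < threshold:
        return "other"
    return best["label"]


if __name__ == "__main__":
    print(pick_label([{"label": "motion", "score": 0.91}]))
    print(pick_label([{"label": "brief", "score": 0.32}]))
```

Falling back to `other` keeps low-confidence or unexpected labels from being passed downstream as if they were reliable.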
|
## Training

- **Teacher:** Claude Opus 4.6 (AWS Bedrock)
- **Method:** Knowledge distillation (Hinton et al., 2015) with temperature=4.0, alpha=0.7
- **Data:** CourtListener case law + synthetic labeled examples
- **Framework:** Hugging Face Transformers + custom `DistillationLoss`
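
The custom `DistillationLoss` is not shown in this card. As a rough sketch of a Hinton-style objective for a single example, in plain Python: student and teacher logits are softened with temperature T, the KL divergence between them is scaled by T² (as in Hinton et al.), and the result is mixed with ordinary cross-entropy on the hard label. Two points are assumptions: that `alpha` weights the soft (teacher) term rather than the hard-label term, and that a teacher distribution over the 12 classes is available at all, since the card does not say how one is derived from Claude's output.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax of logits / temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(
    student_logits: Sequence[float],
    teacher_logits: Sequence[float],
    hard_label: int,
    temperature: float = 4.0,  # value from this card
    alpha: float = 0.7,        # value from this card; its role is assumed
) -> float:
    """alpha * T^2 * KL(teacher || student) + (1 - alpha) * cross-entropy."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL divergence between the softened teacher and student distributions.
    soft = sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student) if t > 0)
    # Standard cross-entropy against the hard label, at temperature 1.
    hard = -math.log(softmax(student_logits)[hard_label])
    return alpha * temperature ** 2 * soft + (1 - alpha) * hard
```

When the student matches the teacher exactly, the soft term vanishes and only the weighted cross-entropy remains. A training version would apply the same formula to batched tensors.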
|
## GLACIER Pipeline

This model runs in Stage 1 (QUERY) of the GLACIER pipeline:

```
Stage 1: QUERY    -> jurisdiction-router + document-classifier
Stage 2: RESEARCH -> legal-ner (entity extraction)
Stage 3: WDC #1   -> (full model review)
Stage 4: DRAFT    -> legal-ner + citation-classifier
Stage 5: WDC #2   -> hallucination-detector + citation-classifier
Stage 6: FINAL    -> (human review)
```
|
## Limitations

- Distilled models are optimized for US legal text (federal and state)
- Not a substitute for full model review in GLACIER Stages 3 and 5
- Citation hallucination detection is a pre-filter, not a replacement for external verification
- Jurisdiction coverage: Florida, Mississippi, and Federal (primary); other states (limited)
|
## License

MIT. Part of the GLACIER Legal AI Framework by Orion Dev Partners, LLC.
|