We build open-source NLP models and tools that help people *find and understand* legal documents.

Legal documents are dense and time-consuming. Our goal is to make them more accessible by:

- highlighting clauses that commonly reduce user rights,
- labeling the *type* of risk (e.g., unilateral changes, arbitration),
- enabling downstream apps to display "risk badges" and evidence-backed highlights.

---
Our models perform **multi-label classification** at the sentence/clause level.

This makes the models suitable for:

- clause highlighting in a document viewer,
- ranking "most risky" clauses first,
- powering a lightweight "risk badge" in a UI.
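As a concrete illustration of the last point, a document-level badge can be derived from per-clause scores. The `risk_badge` helper and its tier cutoffs below are hypothetical, not part of the released models:

```python
def risk_badge(clause_scores, high=0.8, medium=0.5):
    """Collapse per-clause risk scores (e.g., each clause's maximum sigmoid
    probability across labels, in [0, 1]) into a coarse document-level badge.
    The tier cutoffs are illustrative defaults, not tuned values."""
    peak = max(clause_scores, default=0.0)
    if peak >= high:
        return "high-risk"
    if peak >= medium:
        return "medium-risk"
    return "low-risk"

print(risk_badge([0.12, 0.91, 0.33]))  # one strongly flagged clause -> high-risk
```

A UI would typically show the badge alongside the individual flagged clauses, so the score stays explainable.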
---
We currently support **8** types of potentially unfair clauses:

- **Limitation of liability** → Limits the provider's legal responsibility
- **Unilateral termination** → Provider may terminate/suspend without clear cause
- **Unilateral change** → Terms can change with minimal notice or constraints
- **Content removal** → Provider may remove user content at discretion
We report the same metric set across models whenever possible.

| Model | Task | Key metric(s) |
|-------|------|---------------|
| **[deberta-unfair-tos-augmented](https://huggingface.co/Agreemind/deberta-unfair-tos-augmented)** | ToS clause risk classification | **F1: 0.96** • Accuracy: 94.12% |
| [deberta-unfair-tos](https://huggingface.co/Agreemind/deberta-unfair-tos) | ToS clause risk classification | F1: 0.87 • Accuracy: 78.8% |
| [electra-large-unfair-tos](https://huggingface.co/Agreemind/electra-large-unfair-tos) | ToS clause risk classification | Accuracy: 77.3% |
| [legalbert-unfair-tos](https://huggingface.co/Agreemind/legalbert-unfair-tos) | ToS clause risk classification | Accuracy: 74.9% |
| [modernbert-unfair-tos](https://huggingface.co/Agreemind/modernbert-unfair-tos) | ToS clause risk classification | Accuracy: 70.6% |
| [legalbert-large-unfair-tos](https://huggingface.co/Agreemind/legalbert-large-unfair-tos) | ToS clause risk classification | Accuracy: 66.3% |

**Notes**

- **Accuracy** = Exact Match (all 8 labels correct per sample)
- **F1** = Micro-F1 across all labels
- For production use, we recommend tuning **per-class thresholds** on your domain.
- The augmented model was trained with 605 additional synthetic examples for weak classes.
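The per-class threshold tuning recommended above can be sketched as a simple grid search over a held-out validation set. This helper is an illustrative sketch, not shipped tooling:

```python
def tune_thresholds(val_probs, val_labels, grid=None):
    """Pick one decision threshold per class by maximizing per-class F1
    on validation data.

    val_probs: list of per-sample probability vectors (one float per class).
    val_labels: matching list of 0/1 gold vectors.
    Returns one threshold per class.
    """
    grid = grid or [i / 20 for i in range(1, 20)]  # 0.05 .. 0.95
    thresholds = []
    for c in range(len(val_probs[0])):
        best_t, best_f1 = 0.5, -1.0
        for t in grid:
            tp = fp = fn = 0
            for probs, gold in zip(val_probs, val_labels):
                pred = probs[c] >= t
                tp += pred and gold[c] == 1
                fp += pred and gold[c] == 0
                fn += (not pred) and gold[c] == 1
            denom = 2 * tp + fp + fn
            f1 = 2 * tp / denom if denom else 0.0
            if f1 > best_f1:
                best_t, best_f1 = t, f1
        thresholds.append(best_t)
    return thresholds
```

Tuning on your own domain matters because the label distribution in real ToS documents can differ sharply from the training data, especially for the rarer clause types.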
---
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_id = "Agreemind/deberta-unfair-tos-augmented"  # Best model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Score one clause; a multi-label head applies an independent sigmoid per label.
text = "We may suspend or terminate your account at any time, for any reason."
inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
probs = torch.sigmoid(logits).squeeze().tolist()

# Show the three most probable risk labels for this clause.
labels = [model.config.id2label[i] for i in range(model.config.num_labels)]
top = sorted(zip(labels, probs), key=lambda x: x[1], reverse=True)[:3]
print(top)
```
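For the "ranking most risky clauses first" use case, the quickstart scoring loop extends naturally to whole documents. `rank_clauses` and `toy_score` below are hypothetical helpers; the toy scorer stands in for a real model call so the sketch runs without downloading weights:

```python
def rank_clauses(sentences, score_clause, top_k=3):
    """Rank sentences by their maximum per-label risk probability.

    score_clause(sentence) -> list of per-label probabilities; in practice
    it would wrap the tokenize -> model -> sigmoid steps from the quickstart.
    """
    scored = [(max(score_clause(s)), s) for s in sentences]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]

# Toy stand-in scorer: flags sentences mentioning termination.
def toy_score(sentence):
    return [0.9 if "terminate" in sentence else 0.1, 0.05]

print(rank_clauses(["We may terminate your account.", "Thanks for visiting!"],
                   toy_score, top_k=1))
```

Sentence splitting is left to the caller; any standard sentence tokenizer works, since the models score each clause independently.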
---
Always present outputs as **informational signals**.

## License

Models and code are released under the **MIT License**, unless otherwise stated in individual repositories/models.
|