canpolatbulbul committed (verified)
Commit ff0784c · 1 Parent(s): ddd04d0

Update README.md

Files changed (1):
  1. README.md +12 -11
README.md CHANGED
@@ -21,7 +21,7 @@ We build open-source NLP models and tools that help people *find and understand
 Legal documents are dense and time-consuming. Our goal is to make them more accessible by:
 - highlighting clauses that commonly reduce user rights,
 - labeling the *type* of risk (e.g., unilateral changes, arbitration),
-- enabling downstream apps to display “risk badges” and evidence-backed highlights.
+- enabling downstream apps to display "risk badges" and evidence-backed highlights.
 
 ---
 
@@ -33,8 +33,8 @@ Our models perform **multi-label classification** at the sentence/clause level:
 
 This makes the models suitable for:
 - clause highlighting in a document viewer,
-- ranking “most risky” clauses first,
-- powering a lightweight “risk badge” in a UI.
+- ranking "most risky" clauses first,
+- powering a lightweight "risk badge" in a UI.
 
 ---
 
@@ -42,7 +42,7 @@ This makes the models suitable for:
 
 We currently support **8** types of potentially unfair clauses:
 
-- **Limitation of liability** — Limits the provider’s legal responsibility
+- **Limitation of liability** — Limits the provider's legal responsibility
 - **Unilateral termination** — Provider may terminate/suspend without clear cause
 - **Unilateral change** — Terms can change with minimal notice or constraints
 - **Content removal** — Provider may remove user content at discretion
@@ -62,16 +62,18 @@ We report the same metric set across models whenever possible.
 
 | Model | Task | Key metric(s) |
 |------|------|---------------|
-| **[deberta-unfair-tos](https://huggingface.co/Agreemind/deberta-unfair-tos)** | ToS clause risk classification | **F1: 0.87** • Accuracy: 78.8% ⭐ |
+| **[deberta-unfair-tos-augmented](https://huggingface.co/Agreemind/deberta-unfair-tos-augmented)** | ToS clause risk classification | **F1: 0.96** • Accuracy: 94.12% ⭐ |
+| [deberta-unfair-tos](https://huggingface.co/Agreemind/deberta-unfair-tos) | ToS clause risk classification | F1: 0.87 • Accuracy: 78.8% |
 | [electra-large-unfair-tos](https://huggingface.co/Agreemind/electra-large-unfair-tos) | ToS clause risk classification | Accuracy: 77.3% |
 | [legalbert-unfair-tos](https://huggingface.co/Agreemind/legalbert-unfair-tos) | ToS clause risk classification | Accuracy: 74.9% |
 | [modernbert-unfair-tos](https://huggingface.co/Agreemind/modernbert-unfair-tos) | ToS clause risk classification | Accuracy: 70.6% |
 | [legalbert-large-unfair-tos](https://huggingface.co/Agreemind/legalbert-large-unfair-tos) | ToS clause risk classification | Accuracy: 66.3% |
 
 **Notes**
-- “Accuracy” can mean different things for multi-label tasks (e.g., exact match vs per-label).
-  Each model card should specify **exactly** how metrics are computed.
-- For real-world use, we recommend tuning **per-class thresholds** on your domain.
+- **Accuracy** = Exact Match (all 8 labels correct per sample)
+- **F1** = Micro-F1 across all labels
+- For production use, we recommend tuning **per-class thresholds** on your domain.
+- The augmented model was trained with 605 additional synthetic examples for weak classes.
 
 ---
 
@@ -81,7 +83,7 @@ We report the same metric set across models whenever possible.
 from transformers import AutoTokenizer, AutoModelForSequenceClassification
 import torch
 
-model_id = "Agreemind/deberta-unfair-tos"
+model_id = "Agreemind/deberta-unfair-tos-augmented"  # Best model
 tokenizer = AutoTokenizer.from_pretrained(model_id)
 model = AutoModelForSequenceClassification.from_pretrained(model_id)
 
@@ -105,7 +107,7 @@ probs = torch.sigmoid(logits).squeeze().tolist()
 
 top = sorted(zip(labels, probs), key=lambda x: x[1], reverse=True)[:3]
 print(top)
-````
+```
 
 ---
 
@@ -135,4 +137,3 @@ Always present outputs as **informational signals**, ideally with:
 ## 📄 License
 
 Models and code are released under the **MIT License**, unless otherwise stated in individual repositories/models.
-
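The updated notes define Accuracy as exact match over all 8 labels, define F1 as micro-F1 across labels, and recommend tuning per-class thresholds for production. A minimal, self-contained sketch of those three ideas, assuming toy 4-label logits and hand-picked thresholds (this is our illustration, not the repository's code):

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def predict(logits, thresholds):
    # One threshold per label: a label fires when its sigmoid
    # probability reaches that label's own threshold.
    return [1 if sigmoid(z) >= t else 0 for z, t in zip(logits, thresholds)]

def exact_match_accuracy(y_true, y_pred):
    # Share of samples whose *entire* label vector is predicted correctly.
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

def micro_f1(y_true, y_pred):
    # Micro-F1: pool true/false positives and false negatives
    # over all labels of all samples before computing F1.
    tp = fp = fn = 0
    for t_row, p_row in zip(y_true, y_pred):
        for t, p in zip(t_row, p_row):
            tp += int(t == 1 and p == 1)
            fp += int(t == 0 and p == 1)
            fn += int(t == 1 and p == 0)
    return 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0

# Toy example: 3 clauses, 4 labels (the real models output 8 logits).
thresholds = [0.5, 0.5, 0.3, 0.6]  # would be tuned per class on a dev set
logits = [[2.0, -1.0, 0.1, -3.0],
          [-2.0, 3.0, -1.5, 0.2],
          [0.5, -0.5, 2.5, -1.0]]
y_true = [[1, 0, 1, 0],
          [0, 1, 1, 0],
          [1, 0, 1, 0]]
y_pred = [predict(row, thresholds) for row in logits]
print(round(exact_match_accuracy(y_true, y_pred), 3),
      round(micro_f1(y_true, y_pred), 3))  # prints: 0.667 0.909
```

Note how the two metrics diverge: one missed label on the second clause costs a full sample under exact match but only one pooled false negative under micro-F1, which is why the table's Accuracy numbers sit well below the F1 scores.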