---
title: Agreemind
emoji: ⚖️
colorFrom: blue
colorTo: purple
sdk: static
pinned: false
---
# Agreemind
**AI-powered legal risk detection for Terms of Service.**
We build models that automatically classify unfair clauses in Terms of Service documents, helping legal teams and consumers identify potentially harmful terms.
## Models
All models are fine-tuned on the [LexGLUE UNFAIR-ToS](https://huggingface.co/datasets/coastalcph/lex_glue) benchmark and evaluated on the official test set (1,607 samples) using the paper's methodology ([Chalkidis et al., 2022](https://arxiv.org/abs/2110.00976)).
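μ-F1 is micro-averaged F1 (pooled over all label decisions) and m-F1 is macro-averaged F1 (averaged per label). As a rough sketch of how these are computed, using scikit-learn on dummy multi-label arrays (the official LexGLUE evaluation includes additional handling, e.g. for samples with no unfair clause, so this is illustrative only):

```python
import numpy as np
from sklearn.metrics import f1_score

# Dummy multi-hot ground truth and predictions: 4 samples, 3 labels
y_true = np.array([[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 0]])
y_pred = np.array([[1, 0, 0], [0, 1, 1], [1, 0, 0], [0, 0, 0]])

# Micro: pool TP/FP/FN across all labels; macro: average per-label F1
micro_f1 = f1_score(y_true, y_pred, average="micro", zero_division=0)
macro_f1 = f1_score(y_true, y_pred, average="macro", zero_division=0)
print(f"micro-F1 = {micro_f1:.4f}, macro-F1 = {macro_f1:.4f}")
# → micro-F1 = 0.7500, macro-F1 = 0.5556
```

Micro-F1 weights frequent labels more heavily, while macro-F1 treats all eight categories equally, which is why m-F1 is the harder number to push up on an imbalanced label set like UNFAIR-ToS.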
| Model | μ-F1 | m-F1 | Speed | Best for |
|-------|------|------|-------|----------|
| [lexglue-roberta-unfair-tos](https://huggingface.co/Agreemind/lexglue-roberta-unfair-tos) | **96.1** | **84.4** | Normal | 🥇 Best accuracy |
| [lexglue-legalbert-unfair-tos](https://huggingface.co/Agreemind/lexglue-legalbert-unfair-tos) | **96.0** | **84.1** | Normal | 🥈 Legal domain |
| [lexglue-deberta-unfair-tos](https://huggingface.co/Agreemind/lexglue-deberta-unfair-tos) | 95.6 | 82.2 | Normal | General purpose |
| [lexglue-legalbert-small-unfair-tos](https://huggingface.co/Agreemind/lexglue-legalbert-small-unfair-tos) | 95.0 | 78.5 | ⚡ ~3x faster | Fast inference |
**LexGLUE Leaderboard comparison:** Legal-BERT (paper) = 96.0 μ-F1 / 83.0 m-F1. Our top models match or exceed this.
## Quick Start
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_id = "Agreemind/lexglue-roberta-unfair-tos"  # best-accuracy model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()  # disable dropout for inference

labels = [
    "Limitation of liability", "Unilateral termination",
    "Unilateral change", "Content removal",
    "Contract by using", "Choice of law",
    "Jurisdiction", "Arbitration",
]

text = "We may terminate your account at any time without notice."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)

# Multi-label task: apply sigmoid per label (not softmax across labels)
with torch.no_grad():
    probs = torch.sigmoid(model(**inputs).logits).squeeze().tolist()

for label, prob in sorted(zip(labels, probs), key=lambda x: x[1], reverse=True):
    if prob > 0.5:
        print(f"{label}: {prob:.3f}")
```
## Risk Categories
| Category | Description |
|----------|-------------|
| Limitation of liability | Limits the provider's legal responsibility |
| Unilateral termination | Provider may terminate without clear cause |
| Unilateral change | Terms can change with minimal notice |
| Content removal | Provider may remove user content |
| Contract by using | Agreement implied by using the service |
| Choice of law | Specifies governing jurisdiction's law |
| Jurisdiction | Specifies where disputes are handled |
| Arbitration | Requires arbitration instead of court |
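The mapping from per-label probabilities to flagged categories is a simple threshold over the eight labels above. A minimal, self-contained sketch (the `flag_risks` helper and the 0.5 threshold are illustrative, not part of the released models):

```python
# Eight UNFAIR-ToS risk categories, in model output order
LABELS = [
    "Limitation of liability", "Unilateral termination",
    "Unilateral change", "Content removal",
    "Contract by using", "Choice of law",
    "Jurisdiction", "Arbitration",
]

def flag_risks(probs, threshold=0.5):
    """Return (label, probability) pairs above the threshold, highest first."""
    flagged = [(label, p) for label, p in zip(LABELS, probs) if p > threshold]
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)

# Example sigmoid outputs for a clause scored mainly as unilateral termination
print(flag_risks([0.02, 0.91, 0.12, 0.04, 0.01, 0.03, 0.02, 0.55]))
# → [('Unilateral termination', 0.91), ('Arbitration', 0.55)]
```

Because the task is multi-label, a single clause can legitimately trigger several categories at once, as in the example above.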
## Training Methodology
- **Dataset:** LexGLUE UNFAIR-ToS (standard split, no augmentation)
- **Loss:** Standard BCEWithLogitsLoss
- **LR:** 3e-5 with linear schedule
- **Batch size:** 8
- **Epochs:** Up to 20 with early stopping (patience=5)
- **Evaluation:** Official LexGLUE test set with paper's metric computation
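The loss in the recipe above treats each of the eight labels as an independent binary decision. A minimal sketch of one such training step on random tensors (illustrative shapes only; batch size 8 as listed above):

```python
import torch
import torch.nn as nn

# BCEWithLogitsLoss applies sigmoid internally, one binary decision per label
criterion = nn.BCEWithLogitsLoss()

batch_size, num_labels = 8, 8
logits = torch.randn(batch_size, num_labels, requires_grad=True)   # model output
targets = torch.randint(0, 2, (batch_size, num_labels)).float()    # multi-hot labels

loss = criterion(logits, targets)  # scalar, averaged over batch and labels
loss.backward()                    # gradients flow back to the logits
print(f"loss = {loss.item():.4f}")
```

Using a sigmoid-per-label loss rather than a softmax over classes is what allows a single clause to be tagged with multiple risk categories simultaneously.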
## Links
- [LexGLUE Paper](https://arxiv.org/abs/2110.00976)
- [LexGLUE Dataset](https://huggingface.co/datasets/coastalcph/lex_glue)
- 🌐 Website: [https://agreemind.vercel.app](https://agreemind.vercel.app)
- 💻 GitHub: [https://github.com/agree-mind](https://github.com/agree-mind)
---
## 📄 License
Models and code are released under the **MIT License**, unless otherwise stated in individual repositories/models.