---
title: Agreemind
emoji: ⚖️
colorFrom: blue
colorTo: purple
sdk: static
pinned: false
---
# Agreemind
**AI-powered legal risk detection for Terms of Service.**
We build models that automatically classify unfair clauses in Terms of Service documents, helping legal teams and consumers identify potentially harmful terms.
## Models
All models are fine-tuned on the [LexGLUE UNFAIR-ToS](https://huggingface.co/datasets/coastalcph/lex_glue) benchmark and evaluated on the official test set (1,607 samples) using the paper's methodology ([Chalkidis et al., 2022](https://arxiv.org/abs/2110.00976)).
| Model | μ-F1 | m-F1 | Speed | Best for |
|-------|------|------|-------|----------|
| [lexglue-roberta-unfair-tos](https://huggingface.co/Agreemind/lexglue-roberta-unfair-tos) | **96.1** | **84.4** | Normal | 🥇 Best accuracy |
| [lexglue-legalbert-unfair-tos](https://huggingface.co/Agreemind/lexglue-legalbert-unfair-tos) | **96.0** | **84.1** | Normal | 🥈 Legal domain |
| [lexglue-deberta-unfair-tos](https://huggingface.co/Agreemind/lexglue-deberta-unfair-tos) | 95.6 | 82.2 | Normal | General purpose |
| [lexglue-legalbert-small-unfair-tos](https://huggingface.co/Agreemind/lexglue-legalbert-small-unfair-tos) | 95.0 | 78.5 | ⚡ ~3x faster | Fast inference |
**LexGLUE Leaderboard comparison:** Legal-BERT (paper) = 96.0 μ-F1 / 83.0 m-F1. Our top models match or exceed this.
## Quick Start
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_id = "Agreemind/lexglue-roberta-unfair-tos"  # best μ-F1 / m-F1 model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Label order matches the UNFAIR-ToS category indices used at training time.
labels = [
    "Limitation of liability", "Unilateral termination",
    "Unilateral change", "Content removal",
    "Contract by using", "Choice of law",
    "Jurisdiction", "Arbitration",
]

text = "We may terminate your account at any time without notice."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)

# Multi-label task: apply a per-label sigmoid rather than a softmax.
with torch.no_grad():
    probs = torch.sigmoid(model(**inputs).logits).squeeze()

# Print every category whose probability clears the 0.5 threshold.
for label, prob in sorted(zip(labels, probs), key=lambda x: x[1], reverse=True):
    if prob > 0.5:
        print(f"  {label}: {prob:.3f}")
```
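The sigmoid-and-threshold step above can be factored into a small standalone helper, shown here with synthetic logits (assumed values, not real model output) so it runs without downloading the model:

```python
import torch

# Same label order as in the Quick Start snippet.
LABELS = [
    "Limitation of liability", "Unilateral termination",
    "Unilateral change", "Content removal",
    "Contract by using", "Choice of law",
    "Jurisdiction", "Arbitration",
]

def decode(logits: torch.Tensor, threshold: float = 0.5):
    """Map raw multi-label logits to (label, probability) pairs above threshold."""
    probs = torch.sigmoid(logits)
    return [(LABELS[i], round(p.item(), 3)) for i, p in enumerate(probs) if p > threshold]

# Synthetic logits for one clause: high scores on indices 1 and 7.
logits = torch.tensor([-4.0, 3.0, -2.0, -5.0, -3.0, -4.0, -2.5, 1.0])
print(decode(logits))  # → [('Unilateral termination', 0.953), ('Arbitration', 0.731)]
```

Because the labels are not mutually exclusive, a single clause can legitimately trigger several categories at once (e.g. an arbitration clause that also fixes jurisdiction).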
## Risk Categories
| Category | Description |
|----------|-------------|
| Limitation of liability | Limits the provider's legal responsibility |
| Unilateral termination | Provider may terminate without clear cause |
| Unilateral change | Terms can change with minimal notice |
| Content removal | Provider may remove user content |
| Contract by using | Agreement implied by using the service |
| Choice of law | Specifies governing jurisdiction's law |
| Jurisdiction | Specifies where disputes are handled |
| Arbitration | Requires arbitration instead of court |
## Training Methodology
- **Dataset:** LexGLUE UNFAIR-ToS (standard split, no augmentation)
- **Loss:** Standard BCEWithLogitsLoss
- **LR:** 3e-5 with linear schedule
- **Batch size:** 8
- **Epochs:** Up to 20 with early stopping (patience=5)
- **Evaluation:** Official LexGLUE test set with paper's metric computation
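A minimal sketch of one training step under the settings above. A plain linear layer stands in for the transformer encoder, and the batch is random, so this only illustrates the loss/optimizer wiring, not the actual fine-tuning run:

```python
import torch
import torch.nn as nn

NUM_LABELS = 8  # the eight UNFAIR-ToS categories

model = nn.Linear(16, NUM_LABELS)            # placeholder for the transformer head
criterion = nn.BCEWithLogitsLoss()           # per-label sigmoid + BCE, as listed above
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)

features = torch.randn(8, 16)                # batch size 8, as in training
targets = torch.randint(0, 2, (8, NUM_LABELS)).float()  # multi-hot label vectors

logits = model(features)
loss = criterion(logits, targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(f"loss: {loss.item():.4f}")
```

`BCEWithLogitsLoss` fuses the sigmoid into the loss for numerical stability, which is why inference (Quick Start above) applies `torch.sigmoid` to the raw logits itself.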
## Links
- [LexGLUE Paper](https://arxiv.org/abs/2110.00976)
- [LexGLUE Dataset](https://huggingface.co/datasets/coastalcph/lex_glue)
- 🌐 Website: [https://agreemind.vercel.app](https://agreemind.vercel.app)
- 💻 GitHub: [https://github.com/agree-mind](https://github.com/agree-mind)
---
## 📄 License
Models and code are released under the **MIT License**, unless otherwise stated in individual repositories/models.