---
title: Agreemind
emoji: ⚖️
colorFrom: blue
colorTo: purple
sdk: static
pinned: false
---

# Agreemind

**AI-powered legal risk detection for Terms of Service.**

We build models that automatically classify unfair clauses in Terms of Service documents, helping legal teams and consumers identify potentially harmful terms.

## Models

All models are fine-tuned on the [LexGLUE UNFAIR-ToS](https://huggingface.co/datasets/coastalcph/lex_glue) benchmark and evaluated on the official test set (1,607 samples) using the paper's methodology ([Chalkidis et al., 2022](https://arxiv.org/abs/2110.00976)).

| Model | μ-F1 | m-F1 | Speed | Best for |
|-------|------|------|-------|----------|
| [lexglue-roberta-unfair-tos](https://huggingface.co/Agreemind/lexglue-roberta-unfair-tos) | **96.1** | **84.4** | Normal | 🥇 Best accuracy |
| [lexglue-legalbert-unfair-tos](https://huggingface.co/Agreemind/lexglue-legalbert-unfair-tos) | **96.0** | **84.1** | Normal | 🥈 Legal domain |
| [lexglue-deberta-unfair-tos](https://huggingface.co/Agreemind/lexglue-deberta-unfair-tos) | 95.6 | 82.2 | Normal | General purpose |
| [lexglue-legalbert-small-unfair-tos](https://huggingface.co/Agreemind/lexglue-legalbert-small-unfair-tos) | 95.0 | 78.5 | ⚡ ~3x faster | Fast inference |

**LexGLUE leaderboard comparison:** Legal-BERT (paper) = 96.0 μ-F1 / 83.0 m-F1. Our top models match or exceed this.

## Quick Start

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_id = "Agreemind/lexglue-roberta-unfair-tos"  # Best model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

labels = [
    "Limitation of liability",
    "Unilateral termination",
    "Unilateral change",
    "Content removal",
    "Contract by using",
    "Choice of law",
    "Jurisdiction",
    "Arbitration",
]

text = "We may terminate your account at any time without notice."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)

# Multi-label task: apply a sigmoid per label rather than a softmax
with torch.no_grad():
    probs = torch.sigmoid(model(**inputs).logits).squeeze()

for label, prob in sorted(zip(labels, probs), key=lambda x: x[1], reverse=True):
    if prob > 0.5:
        print(f"  {label}: {prob:.3f}")
```

## Risk Categories

| Category | Description |
|----------|-------------|
| Limitation of liability | Limits the provider's legal responsibility |
| Unilateral termination | Provider may terminate without clear cause |
| Unilateral change | Terms can change with minimal notice |
| Content removal | Provider may remove user content |
| Contract by using | Agreement implied by using the service |
| Choice of law | Specifies governing jurisdiction's law |
| Jurisdiction | Specifies where disputes are handled |
| Arbitration | Requires arbitration instead of court |

## Training Methodology

- **Dataset:** LexGLUE UNFAIR-ToS (standard split, no augmentation)
- **Loss:** Standard BCEWithLogitsLoss
- **LR:** 3e-5 with linear schedule
- **Batch size:** 8
- **Epochs:** Up to 20 with early stopping (patience=5)
- **Evaluation:** Official LexGLUE test set with paper's metric computation

## Links

- [LexGLUE Paper](https://arxiv.org/abs/2110.00976)
- [LexGLUE Dataset](https://huggingface.co/datasets/coastalcph/lex_glue)
- 🌐 Website: [https://agreemind.vercel.app](https://agreemind.vercel.app)
- 💻 GitHub: [https://github.com/agree-mind](https://github.com/agree-mind)

---

## 📄 License

Models and code are released under the **MIT License**, unless otherwise stated in individual repositories/models.
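
The thresholding step in the Quick Start can be factored into a small reusable helper, which makes it easy to scan many clauses. This is a minimal sketch: `flag_clauses` is a hypothetical name (not part of the released models), and the 0.5 cutoff simply mirrors the example above.

```python
def flag_clauses(labels, probs, threshold=0.5):
    """Return (label, probability) pairs above `threshold`,
    sorted by descending confidence.

    `probs` is any sequence of floats aligned with `labels`,
    e.g. the per-label sigmoid outputs from the Quick Start example.
    """
    flagged = [(lbl, float(p)) for lbl, p in zip(labels, probs) if p > threshold]
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


# Example with made-up probabilities (not real model output):
labels = ["Limitation of liability", "Unilateral termination", "Arbitration"]
probs = [0.12, 0.91, 0.64]
print(flag_clauses(labels, probs))
# → [('Unilateral termination', 0.91), ('Arbitration', 0.64)]
```

Keeping the threshold as a parameter lets stricter or looser screening be applied without touching the model call.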