---
title: Agreemind
emoji: ⚖️
colorFrom: blue
colorTo: purple
sdk: static
pinned: false
---
# Agreemind
**AI-powered legal risk detection for Terms of Service.**
We build models that automatically classify unfair clauses in Terms of Service documents, helping legal teams and consumers identify potentially harmful terms.
## Models
All models are fine-tuned on the [LexGLUE UNFAIR-ToS](https://huggingface.co/datasets/coastalcph/lex_glue) benchmark and evaluated on the official test set (1,607 samples) using the paper's methodology ([Chalkidis et al., 2022](https://arxiv.org/abs/2110.00976)).
| Model | μ-F1 | m-F1 | Speed | Best for |
|-------|------|------|-------|----------|
| [lexglue-roberta-unfair-tos](https://huggingface.co/Agreemind/lexglue-roberta-unfair-tos) | **96.1** | **84.4** | Normal | 🥇 Best accuracy |
| [lexglue-legalbert-unfair-tos](https://huggingface.co/Agreemind/lexglue-legalbert-unfair-tos) | **96.0** | **84.1** | Normal | 🥈 Legal domain |
| [lexglue-deberta-unfair-tos](https://huggingface.co/Agreemind/lexglue-deberta-unfair-tos) | 95.6 | 82.2 | Normal | General purpose |
| [lexglue-legalbert-small-unfair-tos](https://huggingface.co/Agreemind/lexglue-legalbert-small-unfair-tos) | 95.0 | 78.5 | ⚡ ~3x faster | Fast inference |
**LexGLUE Leaderboard comparison:** Legal-BERT (paper) = 96.0 μ-F1 / 83.0 m-F1. Our top models match or exceed this.
## Quick Start
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_id = "Agreemind/lexglue-roberta-unfair-tos"  # best μ-F1 / m-F1 model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Label order matches the UNFAIR-ToS category indices used at training time.
labels = [
    "Limitation of liability", "Unilateral termination",
    "Unilateral change", "Content removal",
    "Contract by using", "Choice of law",
    "Jurisdiction", "Arbitration",
]

text = "We may terminate your account at any time without notice."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)

# Multi-label task: apply a per-label sigmoid rather than a softmax.
with torch.no_grad():
    probs = torch.sigmoid(model(**inputs).logits).squeeze()

# Print every category whose probability clears the 0.5 threshold.
for label, prob in sorted(zip(labels, probs), key=lambda x: x[1], reverse=True):
    if prob > 0.5:
        print(f"  {label}: {prob:.3f}")
```
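The sigmoid-and-threshold step above can be factored into a small standalone helper, shown here with synthetic logits (assumed values, not real model output) so it runs without downloading the model:

```python
import torch

# Same label order as in the Quick Start snippet.
LABELS = [
    "Limitation of liability", "Unilateral termination",
    "Unilateral change", "Content removal",
    "Contract by using", "Choice of law",
    "Jurisdiction", "Arbitration",
]

def decode(logits: torch.Tensor, threshold: float = 0.5):
    """Map raw multi-label logits to (label, probability) pairs above threshold."""
    probs = torch.sigmoid(logits)
    return [(LABELS[i], round(p.item(), 3)) for i, p in enumerate(probs) if p > threshold]

# Synthetic logits for one clause: high scores on indices 1 and 7.
logits = torch.tensor([-4.0, 3.0, -2.0, -5.0, -3.0, -4.0, -2.5, 1.0])
print(decode(logits))  # → [('Unilateral termination', 0.953), ('Arbitration', 0.731)]
```

Because the labels are not mutually exclusive, a single clause can legitimately trigger several categories at once (e.g. an arbitration clause that also fixes jurisdiction).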
## Risk Categories
| Category | Description |
|----------|-------------|
| Limitation of liability | Limits the provider's legal responsibility |
| Unilateral termination | Provider may terminate without clear cause |
| Unilateral change | Terms can change with minimal notice |
| Content removal | Provider may remove user content |
| Contract by using | Agreement implied by using the service |
| Choice of law | Specifies governing jurisdiction's law |
| Jurisdiction | Specifies where disputes are handled |
| Arbitration | Requires arbitration instead of court |
## Training Methodology
- **Dataset:** LexGLUE UNFAIR-ToS (standard split, no augmentation)
- **Loss:** Standard BCEWithLogitsLoss
- **LR:** 3e-5 with linear schedule
- **Batch size:** 8
- **Epochs:** Up to 20 with early stopping (patience=5)
- **Evaluation:** Official LexGLUE test set with paper's metric computation
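A minimal sketch of one training step under the settings above. A plain linear layer stands in for the transformer encoder, and the batch is random, so this only illustrates the loss/optimizer wiring, not the actual fine-tuning run:

```python
import torch
import torch.nn as nn

NUM_LABELS = 8  # the eight UNFAIR-ToS categories

model = nn.Linear(16, NUM_LABELS)            # placeholder for the transformer head
criterion = nn.BCEWithLogitsLoss()           # per-label sigmoid + BCE, as listed above
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)

features = torch.randn(8, 16)                # batch size 8, as in training
targets = torch.randint(0, 2, (8, NUM_LABELS)).float()  # multi-hot label vectors

logits = model(features)
loss = criterion(logits, targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(f"loss: {loss.item():.4f}")
```

`BCEWithLogitsLoss` fuses the sigmoid into the loss for numerical stability, which is why inference (Quick Start above) applies `torch.sigmoid` to the raw logits itself.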
## Links
- [LexGLUE Paper](https://arxiv.org/abs/2110.00976)
- [LexGLUE Dataset](https://huggingface.co/datasets/coastalcph/lex_glue)
- 🌐 Website: [https://agreemind.vercel.app](https://agreemind.vercel.app)
- 💻 GitHub: [https://github.com/agree-mind](https://github.com/agree-mind)
---
## 📄 License
Models and code are released under the **MIT License**, unless otherwise stated in individual repositories/models.