---
license: mit
language:
- en
metrics:
- rouge
base_model:
- allenai/led-base-16384
pipeline_tag: summarization
library_name: transformers
---
|
|
|
|
|
# Longformer fine-tuned to summarize Terms of Service |
|
|
Terms of Service documents are lengthy, complex, and time-consuming to read. Because of their vague language, people often don't understand what they are agreeing to. We have therefore fine-tuned a Longformer model to summarize Terms of Service documents and make them easier to read and understand.
|
|
<br><br> |
|
|
This model is a fine-tuned version of [`allenai/led-base-16384`](https://huggingface.co/allenai/led-base-16384) <br> |
|
|
Datasets used: TL;DRLegal and the TOS;DR website <br>
|
|
It achieves the following results on the validation set:<br> |
|
|
* ROUGE-1: 0.28 |
|
|
* ROUGE-2: 0.13 |
|
|
* ROUGE-L: 0.27 |
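
For intuition, ROUGE-N measures the word n-gram overlap between a generated summary and a reference summary. The `rouge_n` helper below is a simplified, self-contained sketch of the idea (the reported scores above come from the standard evaluation tooling, not from this snippet):

```python
from collections import Counter

def rouge_n(candidate: str, reference: str, n: int = 1) -> float:
    """Simplified F1-style ROUGE-N over word n-grams (illustration only)."""
    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    if not cand or not ref:
        return 0.0
    # Clipped overlap: count each n-gram at most as often as it appears in both.
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

A ROUGE-1 of 0.28 thus roughly means that about a quarter of the summary's unigrams line up with the reference, balanced between precision and recall.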
|
|
|
|
|
## How to Use |
|
|
|
|
|
```python |
|
|
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("aarushi-211/TOS-Longformer")
model = AutoModelForSeq2SeqLM.from_pretrained("aarushi-211/TOS-Longformer")

# Tokenize the Terms of Service text (LED supports inputs up to 16384 tokens)
inputs = tokenizer("Your Terms of Service text here", return_tensors="pt",
                   truncation=True, max_length=16384)

# Generate and decode the summary
summary_ids = model.generate(**inputs, max_length=512, num_beams=4)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print(summary)
```