improve code example and correct link
README.md
CHANGED
@@ -6,7 +6,7 @@ language:
 # GeNTE Evaluator

 The **Gender-Neutral Translation (GeNTE) Evaluator** is a sequence classification model used for evaluating inclusive rewriting and translations into Italian with the [GeNTE corpus](https://huggingface.co/datasets/FBK-MT/GeNTE).
-It is built by fine-tuning the RoBERTa-based [UmBERTo model](https://huggingface.co/Musixmatch/umberto-

 More details on the training process and the reproducibility can be found in the [official repository](https://github.com/hlt-mt/fbk-NEUTR-evAL/blob/main/solutions/GeNTE.md) and the [paper](https://aclanthology.org/2024.eacl-short.23/).

@@ -16,18 +16,19 @@ You can use the GeNTE Evaluator as follows:

 ```
 from transformers import AutoModelForSequenceClassification, AutoTokenizer

 # load the tokenizer of UmBERTo
-tokenizer = AutoTokenizer.from_pretrained("Musixmatch/umberto-

 # load GeNTE Evaluator
 model = AutoModelForSequenceClassification.from_pretrained("FBK-MT/GeNTE-evaluator")

 # neutral example
-sample = "Condividiamo il parere di chi ha presentato la relazione
-
-in particolare nel campo sanitario e della sicurezza."
-input = tokenizer(sample, return_tensors='pt')

 with torch.no_grad():
     probs = model(**input).logits

 # GeNTE Evaluator

 The **Gender-Neutral Translation (GeNTE) Evaluator** is a sequence classification model used for evaluating inclusive rewriting and translations into Italian with the [GeNTE corpus](https://huggingface.co/datasets/FBK-MT/GeNTE).
+It is built by fine-tuning the RoBERTa-based [UmBERTo model](https://huggingface.co/Musixmatch/umberto-commoncrawl-cased-v1).

 More details on the training process and the reproducibility can be found in the [official repository](https://github.com/hlt-mt/fbk-NEUTR-evAL/blob/main/solutions/GeNTE.md) and the [paper](https://aclanthology.org/2024.eacl-short.23/).


 ```
 from transformers import AutoModelForSequenceClassification, AutoTokenizer
+import torch

 # load the tokenizer of UmBERTo
+tokenizer = AutoTokenizer.from_pretrained("Musixmatch/umberto-commoncrawl-cased-v1", do_lower_case=False)

 # load GeNTE Evaluator
 model = AutoModelForSequenceClassification.from_pretrained("FBK-MT/GeNTE-evaluator")

 # neutral example
+sample = ("Condividiamo il parere di chi ha presentato la relazione che ha posto "
+          "notevole enfasi sull'informazione in relazione ai rischi e sulla trasparenza, "
+          "in particolare nel campo sanitario e della sicurezza.")
+input = tokenizer(sample, return_tensors='pt', truncation=True, max_length=64)

 with torch.no_grad():
     probs = model(**input).logits
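
Review note: the hunk ends at `probs = model(**input).logits`, so the variable named `probs` holds raw logits rather than probabilities. The sketch below shows one way to turn logits into probabilities with a softmax. It is written in plain Python so it runs without the model. The two-value `example_logits` list and the assumption of a two-class output are illustrative only and do not come from the README. With tensors, `torch.softmax(probs, dim=-1)` should compute the same thing.

```python
import math


def softmax(logits):
    """Convert a list of raw scores (logits) into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() from overflowing; the result is unchanged.
    highest = max(logits)
    exps = [math.exp(value - highest) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for an assumed two-class output (not real model output).
example_logits = [2.0, -1.0]
class_probs = softmax(example_logits)

# Index of the most probable class; what each index means is model-specific.
predicted_index = max(range(len(class_probs)), key=class_probs.__getitem__)
```

Which label each index corresponds to is not stated in this diff; in Transformers it can usually be read from `model.config.id2label`.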