FBK-MT/GeNTE-evaluator

dfucci committed on Aug 27, 2024

Commit

9cf716e

1 Parent(s): be47ebd

improve code example and correct link

Files changed (1)

README.md CHANGED

@@ -6,7 +6,7 @@ language:
 # GeNTE Evaluator
 The **Gender-Neutral Translation (GeNTE) Evaluator** is a sequence classification model used for evaluating inclusive rewriting and translations into Italian with the [GeNTE corpus](https://huggingface.co/datasets/FBK-MT/GeNTE).
-It is built by fine-tuning the RoBERTa-based [UmBERTo model](https://huggingface.co/Musixmatch/umberto-wikipedia-uncased-v1).
 More details on the training process and the reproducibility can be found in the [official repository](https://github.com/hlt-mt/fbk-NEUTR-evAL/blob/main/solutions/GeNTE.md) and the [paper](https://aclanthology.org/2024.eacl-short.23/).
@@ -16,18 +16,19 @@ You can use the GeNTE Evaluator as follows:
 ```
 from transformers import AutoModelForSequenceClassification, AutoTokenizer
 # load the tokenizer of UmBERTo
-tokenizer = AutoTokenizer.from_pretrained("Musixmatch/umberto-wikipedia-uncased-v1", do_lower_case=False)
 # load GeNTE Evaluator
 model = AutoModelForSequenceClassification.from_pretrained("FBK-MT/GeNTE-evaluator")
 # neutral example
-sample = "Condividiamo il parere di chi ha presentato la relazione
-          che ha posto notevole enfasi sull'informazione in relazione ai rischi e sulla trasparenza,
-          in particolare nel campo sanitario e della sicurezza."
-input = tokenizer(sample, return_tensors='pt')
 with torch.no_grad():
   probs = model(**input).logits

 # GeNTE Evaluator
 The **Gender-Neutral Translation (GeNTE) Evaluator** is a sequence classification model used for evaluating inclusive rewriting and translations into Italian with the [GeNTE corpus](https://huggingface.co/datasets/FBK-MT/GeNTE).
+It is built by fine-tuning the RoBERTa-based [UmBERTo model](https://huggingface.co/Musixmatch/umberto-commoncrawl-cased-v1).
 More details on the training process and the reproducibility can be found in the [official repository](https://github.com/hlt-mt/fbk-NEUTR-evAL/blob/main/solutions/GeNTE.md) and the [paper](https://aclanthology.org/2024.eacl-short.23/).
 ```
 from transformers import AutoModelForSequenceClassification, AutoTokenizer
+import torch
 # load the tokenizer of UmBERTo
+tokenizer = AutoTokenizer.from_pretrained("Musixmatch/umberto-commoncrawl-cased-v1", do_lower_case=False)
 # load GeNTE Evaluator
 model = AutoModelForSequenceClassification.from_pretrained("FBK-MT/GeNTE-evaluator")
 # neutral example
+sample = ("Condividiamo il parere di chi ha presentato la relazione che  ha posto "
+          "notevole enfasi sull'informazione in relazione ai rischi e sulla trasparenza, "
+          "in particolare nel campo sanitario e della sicurezza.")
+input = tokenizer(sample, return_tensors='pt', truncation=True, max_length=64)
 with torch.no_grad():
   probs = model(**input).logits
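
Note that the snippet stores `model(**input).logits` in a variable named `probs`, although logits are unnormalized scores rather than probabilities. The following is a minimal sketch of how such scores could be normalized with a softmax. It is not part of the commit: the logit values are made up for illustration, and the meaning of each class index is not stated in this diff (the model config's `id2label` mapping would be the place to check).

```python
import math


def softmax(logits):
    """Convert a list of raw scores into probabilities that sum to 1."""
    # Subtract the maximum for numerical stability before exponentiating.
    highest = max(logits)
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for a single sentence (NOT real model output).
example_logits = [2.0, -1.0]

probs = softmax(example_logits)
# Index of the most probable class; what it means depends on the model's labels.
predicted = max(range(len(probs)), key=probs.__getitem__)
```

With PyTorch tensors, as in the README snippet, the equivalent call would be `torch.softmax(logits, dim=-1)`.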