melll-uff/bertweetbr

Fernando Carneiro committed on Sep 11, 2022

Commit 07fab57 · 1 Parent(s): 966dd59

Readme

Files changed (1)

README.md +21 -2

@@ -9,13 +9,14 @@ Having the same architecture of [BERTweet](https://huggingface.co/docs/transform
 ## Usage
 ```python
 import torch
 from transformers import AutoModel, AutoTokenizer
 model = AutoModel.from_pretrained('melll-uff/bertweetbr')
-tokenizer = AutoTokenizer.from_pretrained('melll-uff/bertweetbr')
 # INPUT TWEET IS ALREADY NORMALIZED!
 line = "Tem vídeo novo no canal do @USER :rosto_sorridente_com_olhos_de_coração: Passem por lá e confiram : HTTPURL"
@@ -24,4 +25,22 @@ input_ids = tokenizer(line, return_tensors="pt")
 with torch.no_grad():
     features = model(**input_ids)  # Models outputs are now tuples
 ```

 ## Usage
+### Normalized Inputs
 ```python
 import torch
 from transformers import AutoModel, AutoTokenizer
 model = AutoModel.from_pretrained('melll-uff/bertweetbr')
+tokenizer = AutoTokenizer.from_pretrained('melll-uff/bertweetbr', normalization=False)
 # INPUT TWEET IS ALREADY NORMALIZED!
 line = "Tem vídeo novo no canal do @USER :rosto_sorridente_com_olhos_de_coração: Passem por lá e confiram : HTTPURL"
 with torch.no_grad():
     features = model(**input_ids)  # Models outputs are now tuples
+```
+ ### Normalize raw input Tweets
+ ```python
+import torch
+from transformers import AutoModel, AutoTokenizer
+ ```python
+from transformers import pipeline
+model_name = 'melll-uff/bertweetbr'
+tokenizer = AutoTokenizer.from_pretrained('melll-uff/bertweetbr', normalization=False)
+filler_mask = pipeline("fill-mask", model=model_name, tokenizer=tokenizer)
+filler_mask("Rio é a <mask> cidade do Brasil.", top_k=5)
 ```
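
The example tweet in the diff is marked as already normalized: the user mention appears as `@USER` and the link as `HTTPURL`. As a rough illustration of what that preprocessing involves, here is a minimal sketch that covers only mentions and URLs. The helper name `normalize_tweet_sketch` and both regular expressions are assumptions made for this example; they are not the tokenizer's actual implementation, and emoji-to-text conversion is not handled.

```python
import re

# Hypothetical patterns for illustration only; the real tokenizer's rules may differ.
_URL = re.compile(r"(?:https?://|www\.)\S+")
_MENTION = re.compile(r"@\w+")


def normalize_tweet_sketch(text: str) -> str:
    """Replace URLs with HTTPURL and user mentions with @USER (simplified)."""
    # Replace URLs first so an '@' inside a link is not treated as a mention.
    text = _URL.sub("HTTPURL", text)
    text = _MENTION.sub("@USER", text)
    return text


raw = "Tem vídeo novo no canal do @fulano Passem por lá e confiram : https://example.com/video"
print(normalize_tweet_sketch(raw))
# → Tem vídeo novo no canal do @USER Passem por lá e confiram : HTTPURL
```

This sketch is deliberately naive (for instance, it would also rewrite the domain part of an e-mail address), so for real inputs it is safer to rely on the tokenizer's own `normalization` option shown in the diff.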