File size: 1,859 Bytes
a90e83e ee8d642 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 |
# roberta_sentiments_es , a Sentiment Analysis model for Spanish sentences
This is a roBERTa-base model trained on ~58M tweets and finetuned for sentiment analysis. This model currently supports Spanish sentences
## Example of classification
```python
from transformers import AutoModelForSequenceClassification
from transformers import TFAutoModelForSequenceClassification
from transformers import AutoTokenizer
import numpy as np
import pandas as pd
from scipy.special import softmax
MODEL = 'Manauu17/roberta_sentiments_es_en'
tokenizer = AutoTokenizer.from_pretrained(MODEL)
# PyTorch
model = AutoModelForSequenceClassification.from_pretrained(MODEL)
text = ['@usuario siempre es bueno la opinión de un playo',
'Bendito año el que me espera']
encoded_input = tokenizer(text, return_tensors='pt', padding=True, truncation=True)
output = model(**encoded_input)
scores = output[0].detach().numpy()
# TensorFlow
model = TFAutoModelForSequenceClassification.from_pretrained(MODEL)
text = ['La guerra no es buena para nadie.','Espero que mi jefe me de mañana libre']
encoded_input = tokenizer(text, return_tensors='tf', padding=True, truncation=True)
output = model(encoded_input)
scores = output[0].numpy()
# Results
def get_scores(model_output, labels_dict):
scores = softmax(model_output)
frame = pd.DataFrame(scores, columns=labels.values())
frame.style.highlight_max(axis=1,color="green")
return frame
```
Output:
```
# PyTorch
get_scores(scores, labels_dict).style.highlight_max(axis=1, color="green")
Negative Neutral Positive
0 0.000607 0.004851 0.906596
1 0.079812 0.006650 0.001484
# TensorFlow
get_scores(scores, labels_dict).style.highlight_max(axis=1, color="green")
Negative Neutral Positive
0 0.017030 0.008920 0.000667
1 0.000260 0.001695 0.971429
```
|