Commit 36d8ee0 · 1 Parent(s): 4331f24
Update README.md
README.md CHANGED
@@ -4,9 +4,9 @@ datasets:
 language:
 - en
 ---
-# Twitter-roBERTa
+# Twitter-roBERTa for Sentiment Analysis
 
-This is a roBERTa
+This is a roBERTa model trained on ~58M tweets and finetuned for sentiment analysis with the TweetEval benchmark. This model is suitable for English.
 
 - Reference Paper: [_TweetEval_ (Findings of EMNLP 2020)](https://arxiv.org/pdf/2010.12421.pdf).
 - Git Repo: [Tweeteval official repository](https://github.com/cardiffnlp/tweeteval).
@@ -16,8 +16,6 @@ This is a roBERTa-base model trained on ~58M tweets and finetuned for sentiment
 1 -> Neutral;
 2 -> Positive
 
-<b>New!</b> We just released a new sentiment analysis model trained on more recent and a larger quantity of tweets.
-See [twitter-roberta-base-sentiment-latest](https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment-latest) and [TweetNLP](https://tweetnlp.org) for more details.
 
 ## Example of classification
 
@@ -46,13 +44,13 @@ def preprocess(text):
 # stance/abortion, stance/atheism, stance/climate, stance/feminist, stance/hillary
 
 task='sentiment'
-MODEL = f"
+MODEL = f"researchworkai/Sentiment-roBERTa-Twitter-{task}"
 
 tokenizer = AutoTokenizer.from_pretrained(MODEL)
 
 # download label mapping
 labels=[]
-mapping_link = f"https://raw.githubusercontent.com/
+mapping_link = f"https://raw.githubusercontent.com/researchworkai/tweeteval/main/datasets/{task}/mapping.txt"
 with urllib.request.urlopen(mapping_link) as f:
     html = f.read().decode('utf-8').split("\n")
     csvreader = csv.reader(html, delimiter='\t')
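The last hunk ends where the downloaded mapping file is handed to `csv.reader` with a tab delimiter. As a minimal offline sketch of what that parsing step might look like, the snippet below runs the same split-and-read logic on an in-memory string instead of a network download. The names `sample_mapping` and `parse_labels` are hypothetical, and the sample content is an assumption: it supposes one `id<TAB>label` pair per line, and the name for label 0 is a guess, since the diff shows only labels 1 and 2.

```python
import csv

# Hypothetical stand-in for the downloaded mapping.txt. Assumed format:
# one tab-separated "id<TAB>label" pair per line. The label for id 0 is
# an assumption; the diff shows only "1 -> Neutral" and "2 -> Positive".
sample_mapping = "0\tnegative\n1\tneutral\n2\tpositive\n"


def parse_labels(text):
    """Return the label names from tab-separated mapping text, in file order."""
    rows = csv.reader(text.split("\n"), delimiter="\t")
    # Skip blank lines, such as the one produced by a trailing newline.
    return [row[1] for row in rows if len(row) > 1]


labels = parse_labels(sample_mapping)
print(labels)
```

With a real download, the decoded response body would take the place of `sample_mapping`. The `len(row) > 1` filter guards against the empty row that a trailing newline produces.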