---
datasets:
- ElKulako/StockTwits-crypto
---

# CryptoBERT

CryptoBERT is a pre-trained NLP model to analyse the language and sentiment of cryptocurrency-related social media posts and messages. It was built by further training [cardiffnlp's Twitter-roBERTa-base](https://huggingface.co/cardiffnlp/twitter-roberta-base) language model on the cryptocurrency domain, using a corpus of over 3.2M unique cryptocurrency-related social media posts.
## Classification Training

The model was trained on the following labels: "Bearish": 0, "Neutral": 1, "Bullish": 2.
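The label-to-index mapping above can be written down as a small lookup table, which is handy for decoding the classifier's raw output indices (a minimal sketch; the variable names are illustrative, not from the model card):

```python
# Mapping between CryptoBERT's class indices and sentiment labels,
# as stated above: Bearish = 0, Neutral = 1, Bullish = 2.
id2label = {0: "Bearish", 1: "Neutral", 2: "Bullish"}
label2id = {label: idx for idx, label in id2label.items()}

print(label2id["Bullish"])  # → 2
print(id2label[0])          # → Bearish
```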

CryptoBERT was trained with a max sequence length of 128. Technically, it can handle sequences of up to 514 tokens; however, going beyond 128 is not recommended.
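In practice, that 128-token limit is enforced by truncating tokenized inputs before they reach the model. A library-free sketch of the behaviour (the helper name `truncate_to_max_length` is illustrative, mirroring what `tokenizer(..., truncation=True, max_length=128)` does to long posts):

```python
def truncate_to_max_length(token_ids, max_length=128):
    """Keep at most max_length token ids, discarding the tail of long posts."""
    return token_ids[:max_length]

long_post_ids = list(range(300))  # stand-in for a tokenized 300-token post
print(len(truncate_to_max_length(long_post_ids)))  # → 128
```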

# Classification Example
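The example below sketches how the model could be used for sentiment classification with the Hugging Face `transformers` pipeline. The hub id `ElKulako/cryptobert` is an assumption (inferred from the dataset namespace above), and the sample post is illustrative:

```python
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          TextClassificationPipeline)

model_name = "ElKulako/cryptobert"  # assumed hub id; adjust if different
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# truncation keeps inputs within the recommended 128-token window
pipe = TextClassificationPipeline(model=model, tokenizer=tokenizer,
                                  max_length=128, truncation=True)

post = "bitcoin just broke above resistance, looking very strong today"
preds = pipe(post)
print(preds)  # a list of {'label': ..., 'score': ...} dicts
```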

## Training Corpus

CryptoBERT was trained on 3.2M social media posts regarding various cryptocurrencies. Only non-duplicate posts longer than 4 words were considered. The following communities were used as sources for our corpora: