ElKulako/cryptobert

ElKulako committed on Jun 26, 2022

Commit c47aef6 · 1 Parent(s): 31b8743

Update README.md

Files changed (1)

README.md +12 -5

@@ -17,18 +17,25 @@ CryptoBERT was trained with a max sequence length of 128. Technically, it can ha
 # Classification Example
 ```python
 from transformers import TextClassificationPipeline, AutoModelForSequenceClassification, AutoTokenizer
-from datasets import load_dataset
-dataset_name = "ElKulako/stocktwits-crypto"
-dataset = load_dataset(dataset_name)
 model_name = "ElKulako/cryptobert"
-tokenizer_ = AutoTokenizer.from_pretrained(model_name, use_fast=True)
 model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels = 3)
-pipe = TextClassificationPipeline(model=model, tokenizer=tokenizer, batch_size=64, max_length=64, truncation=True, padding = 'max_length')
 preds = pipe(df_posts)
 ```
 ## Training Corpus
 CryptoBERT was trained on 3.2M social media posts regarding various cryptocurrencies. Only non-duplicate posts of length above 4 words were considered. The following communities were used as sources for our corpora:

 # Classification Example
 ```python
 from transformers import TextClassificationPipeline, AutoModelForSequenceClassification, AutoTokenizer
 model_name = "ElKulako/cryptobert"
+tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)
 model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels = 3)
+pipe = TextClassificationPipeline(model=model, tokenizer=tokenizer, max_length=64, truncation=True, padding = 'max_length')
+# post_1 & post_3 = bullish, post_2 = bearish
+post_1 = " see y'all tomorrow and can't wait to see ada in the morning, i wonder what price it is going to be at. 😎🐂🤠💯😴, bitcoin is looking good go for it and flash by that 45k. "
+post_2 = "  alright racers, it’s a race to the bottom! good luck today and remember there are no losers (minus those who invested in currency nobody really uses) take your marks... are you ready? go!!"
+post_3 = " i'm never selling. the whole market can bottom out. i'll continue to hold this dumpster fire until the day i die if i need to."
+df_posts = [post_1, post_2, post_3]
 preds = pipe(df_posts)
+print(preds)
 ```
+```
+[{'label': 'Bullish', 'score': 0.8734585642814636}, {'label': 'Bearish', 'score': 0.9889495372772217}, {'label': 'Bullish', 'score': 0.6595883965492249}]
+```
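The printed list holds one dictionary per input post, in the same order as `df_posts`. As a minimal sketch of how that output might be consumed, the snippet below pairs each prediction with its post and keeps only the labels whose score clears a cut-off. The helper `confident_labels`, the names in `post_names`, and the `MIN_SCORE` value of 0.8 are assumptions made for illustration; they are not part of the README or the model. The `preds` list is copied from the output above so that the snippet runs without downloading the model.

```python
# Predictions copied from the README output; in practice this is pipe(df_posts).
preds = [
    {"label": "Bullish", "score": 0.8734585642814636},
    {"label": "Bearish", "score": 0.9889495372772217},
    {"label": "Bullish", "score": 0.6595883965492249},
]

# Hypothetical identifiers for the three example posts, in input order.
post_names = ["post_1", "post_2", "post_3"]

# Hypothetical confidence cut-off, chosen only for this illustration.
MIN_SCORE = 0.8


def confident_labels(names, predictions, min_score):
    """Map each post name to its label, skipping low-confidence predictions."""
    return {
        name: pred["label"]
        for name, pred in zip(names, predictions)
        if pred["score"] >= min_score
    }


confident = confident_labels(post_names, preds, MIN_SCORE)
print(confident)
# {'post_1': 'Bullish', 'post_2': 'Bearish'}
```

With this cut-off, `post_3` is dropped because its score of roughly 0.66 falls below 0.8; lowering `MIN_SCORE` would keep it.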
 ## Training Corpus
 CryptoBERT was trained on 3.2M social media posts regarding various cryptocurrencies. Only non-duplicate posts of length above 4 words were considered. The following communities were used as sources for our corpora: