AfterRain007/cryptobertRefined

Text Classification

Sentiment Analysis

text-embeddings-inference

AfterRain007 committed on Feb 23, 2024

Commit 3412acb · verified · 1 Parent(s): 9e390d8

Update README.md

Files changed (1)

README.md +4 -3

@@ -44,14 +44,15 @@ Output:
 Total of 3.803 text have been labelled manually to fine tune the model, with consideration of non-duplicate and a minimum of 4 words after cleaning. The following website were used for our training dataset:
 1. Bitcoin tweet dataset from [Kaggle Datasets](https://www.kaggle.com/datasets/kaushiksuresh147/bitcoin-tweets) (Randomly picked).
 2. Labelled crypto sentiment dataset from [SurgeAI](https://www.surgehq.ai/datasets/crypto-sentiment-dataset).
-3. Reddit thread r/Bitcoin with the topic "Daily Discussion" (Randomly picked).
-Data augmentation is done to enrich the dataset, Back-Translation were used with Google Translate API on 10 language ('it', 'fr', "sv", "da", 'pt', 'id', 'pl', 'hr', "bg", "fi").
 # Source Code
 See [Github](https://github.com/AfterRain007/cryptobertRefined) for the source code to finetune cryptoBERT model into cryptoBERTRefined.
 # Credit
-Credit where credit's due, thank you for all!
 1. Muhaza Liebenlito, M.Si and Prof. Dr. Nur Inayah, M.Si. as my academic advisor.
 2. Risky Amalia Marhariyadi for helping labelling the dataset.

 Total of 3.803 text have been labelled manually to fine tune the model, with consideration of non-duplicate and a minimum of 4 words after cleaning. The following website were used for our training dataset:
 1. Bitcoin tweet dataset from [Kaggle Datasets](https://www.kaggle.com/datasets/kaushiksuresh147/bitcoin-tweets) (Randomly picked).
 2. Labelled crypto sentiment dataset from [SurgeAI](https://www.surgehq.ai/datasets/crypto-sentiment-dataset).
+3. Reddit thread r/Bitcoin with the topic "Daily Discussion" (Randomly picked)
+Data augmentation was also performed to enrich the dataset, Back-Translation was used with Google Translate API on 10 language ('it', 'fr', "sv", "da", 'pt', 'id', 'pl', 'hr', "bg", "fi").
 # Source Code
 See [Github](https://github.com/AfterRain007/cryptobertRefined) for the source code to finetune cryptoBERT model into cryptoBERTRefined.
 # Credit
+Credit where credit is due, thank you for all!
 1. Muhaza Liebenlito, M.Si and Prof. Dr. Nur Inayah, M.Si. as my academic advisor.
 2. Risky Amalia Marhariyadi for helping labelling the dataset.
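
The dataset preparation described in the README (dropping duplicates, requiring at least 4 words after cleaning, then back-translating through 10 languages) could be sketched roughly as below. This is an illustrative sketch only, not the author's code, which lives in the linked GitHub repository. The cleaning rules in `clean`, the `Translator` signature, and all function names are assumptions; `translate` stands in for whatever client wraps the Google Translate API.

```python
import re
from typing import Callable, Iterable, List

# The ten pivot languages listed in the README.
PIVOT_LANGS = ["it", "fr", "sv", "da", "pt", "id", "pl", "hr", "bg", "fi"]

# Hypothetical translator signature: (text, source_lang, target_lang) -> text.
Translator = Callable[[str, str, str], str]


def clean(text: str) -> str:
    """Assumed cleaning step: drop URLs and @mentions, collapse whitespace."""
    text = re.sub(r"https?://\S+|@\w+", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def filter_texts(texts: Iterable[str], min_words: int = 4) -> List[str]:
    """Keep cleaned texts with at least min_words words, dropping duplicates."""
    seen = set()
    kept = []
    for raw in texts:
        cleaned = clean(raw)
        key = cleaned.lower()  # assumption: duplicates are case-insensitive
        if len(cleaned.split()) >= min_words and key not in seen:
            seen.add(key)
            kept.append(cleaned)
    return kept


def back_translate(
    text: str,
    translate: Translator,
    langs: Iterable[str] = PIVOT_LANGS,
    source: str = "en",
) -> List[str]:
    """Round-trip text through each pivot language and keep new variants."""
    variants = []
    for lang in langs:
        round_trip = translate(translate(text, source, lang), lang, source)
        # Skip round trips that reproduce the input or an earlier variant.
        if round_trip != text and round_trip not in variants:
            variants.append(round_trip)
    return variants
```

Passing the translator in as a callable keeps the sketch independent of any particular translation client, so it can be exercised offline with a stub before a real API is wired in.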