Update README.md
README.md
CHANGED
|
@@ -12,4 +12,17 @@ tags:
|
|
| 12 |
- RoBERTa
|
| 13 |
- NLP
|
| 14 |
- Cryptocurrency
|
| 15 |
-
---
| 12 |
- RoBERTa
|
| 13 |
- NLP
|
| 14 |
- Cryptocurrency
|
| 15 |
+
---
|
| 16 |
+
|
| 17 |
+
# CryptoBERTRefined
|
| 18 |
+
CryptoBERTRefined is a fine-tuned version of [CryptoBERT by ElKulako](https://huggingface.co/ElKulako/cryptobert) (see the base model's card for its training corpus).
|
| 19 |
+
|
| 20 |
+
# Training Process
|
| 21 |
+
A total of 3,803 texts were labelled manually to fine-tune the model. The data was then augmented by back-translation through the Google Translate API, using 10 languages ('it', 'fr', 'sv', 'da', 'pt', 'id', 'pl', 'hr', 'bg', 'fi').
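
As a rough illustration of the back-translation step, the sketch below round-trips a text through each of the listed languages and keeps only the paraphrases that differ from what has already been collected. It is not the code from the repository: the `translate(text, src, dest)` callable, the `toy_translate` stand-in, the `"positive"` label, and the assumption that the source language is English are all hypothetical. In practice `translate` would wrap whichever Google Translate client is actually used.

```python
from typing import Callable, Iterable, List, Tuple

# The ten pivot languages listed in the README.
PIVOT_LANGS = ["it", "fr", "sv", "da", "pt", "id", "pl", "hr", "bg", "fi"]

# Hypothetical signature: translate(text, src, dest) -> translated text.
Translator = Callable[[str, str, str], str]


def back_translate(text: str, pivot: str, translate: Translator,
                   src: str = "en") -> str:
    """Translate src -> pivot -> src and return the round-tripped text."""
    forward = translate(text, src, pivot)
    return translate(forward, pivot, src)


def augment(text: str, label: str, translate: Translator,
            pivots: Iterable[str] = PIVOT_LANGS) -> List[Tuple[str, str]]:
    """Return the original (text, label) pair plus distinct paraphrases.

    Every paraphrase inherits the label of the original text.
    """
    seen = {text}
    rows = [(text, label)]
    for lang in pivots:
        candidate = back_translate(text, lang, translate)
        if candidate not in seen:  # skip round trips that changed nothing
            seen.add(candidate)
            rows.append((candidate, label))
    return rows


def toy_translate(text: str, src: str, dest: str) -> str:
    """Offline stand-in for a translation client (assumption, not a real API)."""
    if dest != "en":
        # "Translate" into the pivot language by tagging the text.
        return f"[{dest}] {text}"
    # Coming back to English: strip the tag, and simulate a paraphrase
    # for the French round trip only.
    restored = text.split("] ", 1)[1]
    return restored.replace("going up", "rising") if src == "fr" else restored


if __name__ == "__main__":
    for row in augment("Bitcoin is going up", "positive", toy_translate):
        print(row)
```

Dropping round trips that return the text unchanged keeps the augmented set from filling up with exact duplicates, which would otherwise overweight short or simple texts.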
|
| 22 |
+
|
| 23 |
+
# Training Corpus
|
| 24 |
+
Randomly selected texts from a [Kaggle dataset](https://www.kaggle.com/datasets/kaushiksuresh147/bitcoin-tweets)
|
| 25 |
+
Sentiment-labelled texts from [Surge AI](https://www.surgehq.ai/datasets/crypto-sentiment-dataset)
|
| 26 |
+
|
| 27 |
+
# Source Code
|
| 28 |
+
See [GitHub](https://github.com/AfterRain007/cryptobertRefined) for the source code used to fine-tune CryptoBERT into CryptoBERTRefined.
|