library_name: transformers
tags:
  - finance
  - crypto
  - text-classification
  - bert
  - turkish
  - BTC
  - ETH
  - XRP
license: mit
base_model:
  - dbmdz/bert-base-turkish-cased
pipeline_tag: text-classification
language:
  - tr

SkyWalkertT1/crypto_bert_sentiment

📌 Model Details

Model Description

This is a BERT-based sentiment classification model fine-tuned on Turkish-language cryptocurrency comments. It predicts one of three sentiment classes: positive, neutral, or negative. The model was built with the Hugging Face 🤗 Transformers library and is intended for analyzing sentiment in Turkish crypto communities, forums, and financial social media text.

  • Developed by: SkyWalkertT1 (Furkan Fatih Çiftçi)
  • Funded by: Personal / Community Open Source
  • Shared by: SkyWalkertT1
  • Model type: BERT-based Sequence Classification
  • Language(s) (NLP): Turkish
  • License: Apache 2.0
  • Finetuned from model: dbmdz/bert-base-turkish-cased

📚 Training Details

Training Data

The dataset consists of Turkish-language comments about cryptocurrency, each manually tagged with one of 3 sentiment labels.
It is proprietary and was created and labeled by the author.
Its shape is approximately (1171, 2): 1171 samples with 2 columns (text and label).

Model Sources

🔍 Uses

Direct Use

  • Turkish sentiment analysis on crypto/financial text
  • Educational / experimental use for NLP in Turkish

Downstream Use

  • Integration into crypto sentiment bots
  • Turkish language feedback systems
  • Sentiment dashboards for crypto forums
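
For a dashboard-style integration, per-comment predictions usually need to be aggregated. The following is a minimal sketch of one way to do that, assuming the predictions have already been mapped to the label strings used in this card. The function name `sentiment_shares` and the sample predictions are hypothetical.

```python
from collections import Counter

# Label names as listed in this model card.
LABELS = ("negative", "neutral", "positive")


def sentiment_shares(predicted_labels):
    """Return the fraction of comments assigned to each sentiment class."""
    counts = Counter(predicted_labels)
    total = sum(counts[label] for label in LABELS)
    if total == 0:
        # No predictions yet: report zero for every class.
        return {label: 0.0 for label in LABELS}
    return {label: counts[label] / total for label in LABELS}


# Hypothetical predictions for five forum comments.
shares = sentiment_shares(
    ["positive", "negative", "negative", "neutral", "negative"]
)
print(shares)
```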

Out-of-Scope Use

  • Use on non-Turkish text
  • Medical, legal, or other high-risk domain sentiment prediction

⚠️ Bias, Risks, and Limitations

The model was trained on data specific to cryptocurrency sentiment in Turkish. It may not generalize to other domains. Model performance may vary depending on the writing style and slang usage.

Recommendations

  • Do not use this model for critical decision-making.
  • Human validation should accompany any automated output.
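
One possible way to organize human validation is to review the least confident predictions first. The sketch below assumes each prediction carries a list of class probabilities. The `review_queue` helper, the dictionary keys, and the sample values are hypothetical and not part of the model.

```python
def review_queue(predictions):
    """Order predictions so the least confident ones are reviewed first.

    Each prediction is a dict with a "text" and a "probs" list holding
    one probability per class.
    """
    return sorted(predictions, key=lambda p: max(p["probs"]))


# Hypothetical model outputs for two comments.
predictions = [
    {"text": "comment A", "probs": [0.05, 0.05, 0.90]},
    {"text": "comment B", "probs": [0.40, 0.35, 0.25]},
]

queue = review_queue(predictions)
for item in queue:
    print(item["text"], max(item["probs"]))
```

Every output still reaches a reviewer; the ordering only decides which ones are looked at first.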

🚀 How to Get Started with the Model

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_path = "SkyWalkertT1/my_crypto_comment_model"

# Load the fine-tuned tokenizer and classifier from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSequenceClassification.from_pretrained(model_path)

# Run on GPU when available, otherwise on CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
model.eval()  # inference mode (dropout disabled)

# English: "I expect a big drop in the market today."
text = "Bugün piyasada büyük bir düşüş bekliyorum."
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True)
inputs = {k: v.to(device) for k, v in inputs.items()}

with torch.no_grad():
    outputs = model(**inputs)

# Pick the highest-scoring class; its index maps into the labels list below.
logits = outputs.logits
predicted_class = torch.argmax(logits, dim=1).item()
labels = ['negative', 'neutral', 'positive']

print(f"Prediction: {labels[predicted_class]}")
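
The snippet above reports only the top class. To see how confident the model is, the logits can be turned into probabilities with a softmax. The sketch below does this in plain Python on hypothetical logit values; with PyTorch tensors, `torch.softmax(logits, dim=-1)` gives the same result.

```python
import math

# Label order as used in the example above.
LABELS = ["negative", "neutral", "positive"]


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    # Subtract the maximum for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for a single comment.
example_logits = [2.0, 0.5, -1.0]
probs = softmax(example_logits)

for label, p in zip(LABELS, probs):
    print(f"{label}: {p:.3f}")
```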

📚 Training Details

Training Procedure

The model was fine-tuned using Hugging Face's Trainer API.

Training Hyperparameters

  • Epochs: 4
  • Batch size: 16
  • Optimizer: AdamW
  • Learning rate: 2e-5
  • Precision: fp32
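
As a rough sketch, the hyperparameters above could map onto keyword arguments of `transformers.TrainingArguments` as shown below. The mapping, the `adamw_torch` optimizer name, and the step estimate are assumptions, not the author's actual training script. The estimate also assumes the 20% validation split described under Evaluation, a single device, and no gradient accumulation.

```python
import math

# Assumed mapping of the listed hyperparameters to TrainingArguments kwargs.
training_kwargs = {
    "num_train_epochs": 4,
    "per_device_train_batch_size": 16,
    "learning_rate": 2e-5,
    "optim": "adamw_torch",  # AdamW
    "fp16": False,           # fp32 precision
}


def optimizer_steps(num_train_samples, batch_size, epochs):
    """Estimate optimizer steps when the last partial batch is kept."""
    return math.ceil(num_train_samples / batch_size) * epochs


# Assumption: 20% of the ~1171 samples are held out for validation.
total_samples = 1171
train_samples = total_samples - math.ceil(0.2 * total_samples)

total_steps = optimizer_steps(
    train_samples,
    training_kwargs["per_device_train_batch_size"],
    training_kwargs["num_train_epochs"],
)
print(train_samples, total_steps)
```

With these assumptions the dictionary could be passed as `TrainingArguments(output_dir=..., **training_kwargs)`, where the output directory is left to the reader.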

📈 Evaluation

Testing Data, Factors & Metrics

The model was evaluated on a 20% validation split held out from the same dataset.
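
The card does not say how the split was drawn. The following is a minimal sketch of one common approach, a stratified 80/20 split that preserves class proportions. The `stratified_split` helper, the seed, and the toy label counts are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(labels, val_fraction=0.2, seed=42):
    """Return (train_indices, val_indices), splitting each class separately."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, val = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_val = round(len(indices) * val_fraction)
        val.extend(indices[:n_val])
        train.extend(indices[n_val:])
    return sorted(train), sorted(val)


# Toy label distribution, not the real dataset.
labels = ["negative"] * 50 + ["neutral"] * 30 + ["positive"] * 20
train_idx, val_idx = stratified_split(labels)
print(len(train_idx), len(val_idx))
```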

Metrics

  • Accuracy
  • F1-score (macro average)
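
For reference, both metrics can be computed as in the sketch below. It is a plain-Python illustration on made-up labels, not the author's evaluation code. Macro averaging takes the unweighted mean of the per-class F1 scores, so each class counts equally regardless of its size.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted samples scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(classes)


# Made-up labels for illustration only.
classes = ["negative", "neutral", "positive"]
y_true = ["negative", "negative", "neutral", "positive", "positive", "neutral"]
y_pred = ["negative", "neutral", "neutral", "positive", "negative", "neutral"]

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred, classes)
print(f"accuracy={acc:.3f} macro_f1={f1:.3f}")
```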

Results

  • Accuracy: ~85%
  • F1-macro: ~84%

🌍 Environmental Impact

Carbon emissions are low because the model was only fine-tuned (~4 hours on a single NVIDIA T4 GPU).

  • Hardware Type: NVIDIA T4 (Google Colab)
  • Hours used: ~4
  • Cloud Provider: Google Colab
  • Carbon Emitted: ~1 kg CO2eq

🧠 Technical Specifications

Model Architecture and Objective

BERT transformer architecture with a classification head on top for sequence classification into 3 sentiment classes.

Compute Infrastructure

  • Google Colab
  • PyTorch + Transformers

📣 Citation

BibTeX:

@misc{SkyWalkertT1_crypto_bert,
  author = {Furkan Fatih Çiftçi},
  title = {Turkish Crypto Sentiment Model},
  year = {2025},
  note = {Dated 2025.08.03},
  howpublished = {\url{https://huggingface.co/SkyWalkertT1/my_crypto_comment_model}},
}

📬 Contact

For feedback or collaboration: