An NLI-Based Approach to Asset-Specific Stance Detection in Cryptocurrency Tweets

This model classifies the stance of tweets toward Bitcoin (BTC) and Ethereum (ETH) as Bullish, Bearish, or Neutral using a Natural Language Inference (NLI) approach.

It was fine-tuned from MoritzLaurer/DeBERTa-v3-base-mnli as part of a master's thesis on NLI-based cryptocurrency stance detection.

How it works

Instead of standard 3-class classification, this model frames stance detection as an entailment task. For each tweet, three hypotheses are constructed (one per stance), and the model scores which hypothesis is most entailed by the tweet:

| Stance | Hypothesis |
|---|---|
| Bullish | "The author's perspective on {target} in this tweet reflects a bullish sentiment." |
| Bearish | "The author's perspective on {target} in this tweet reflects a bearish sentiment." |
| Neutral | "The author's perspective on {target} in this tweet reflects a neutral sentiment." |

The predicted stance is the one with the highest entailment score.
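The hypothesis construction and the final selection step can be sketched in plain Python, without loading the model. The helper names `build_hypotheses` and `pick_stance` are illustrative only, and the scores in the usage lines are made-up numbers rather than model outputs.

```python
# Template and stance labels as listed in the hypothesis table above.
TEMPLATE = "The author's perspective on {target} in this tweet reflects a {stance} sentiment."
STANCES = ("bullish", "bearish", "neutral")


def build_hypotheses(target: str) -> dict[str, str]:
    """Return one hypothesis per stance for the given asset (e.g. "BTC")."""
    return {s: TEMPLATE.format(target=target, stance=s) for s in STANCES}


def pick_stance(entailment_scores: dict[str, float]) -> str:
    """Return the stance whose hypothesis has the highest entailment score."""
    return max(entailment_scores, key=entailment_scores.get)


hypotheses = build_hypotheses("ETH")

# Hypothetical entailment scores, for illustration only.
example_scores = {"bullish": 0.10, "bearish": 0.05, "neutral": 0.85}
predicted = pick_stance(example_scores)
print(predicted)
```

In real use, the entailment scores would come from the model, one forward pass per hypothesis.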

Usage

from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="syahrezapratama/crypto-stance-nli")

tweet = "Bitcoin is going to the moon! $100k is just the beginning 🚀"

result = classifier(
    tweet,
    candidate_labels=["bullish", "bearish", "neutral"],
    hypothesis_template="The author's perspective on BTC in this tweet reflects a {} sentiment.",
)

print(result["labels"][0])  # "bullish"
print(result["scores"][0])  # ~0.999

Manual inference (more control)

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "syahrezapratama/crypto-stance-nli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

tweet = "I'm not sure where ETH is headed, could go either way"
stances = ["bullish", "bearish", "neutral"]
template = "The author's perspective on ETH in this tweet reflects a {} sentiment."

scores = []
for stance in stances:
    hypothesis = template.format(stance)
    inputs = tokenizer(tweet, hypothesis, return_tensors="pt", truncation=True, max_length=128)
    with torch.no_grad():
        logits = model(**inputs).logits
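    # Look up the entailment index from the model config (index 0 for DeBERTa-MNLI)
    entail_idx = model.config.label2id.get("entailment", 0)
    entailment_score = torch.softmax(logits, dim=-1)[0, entail_idx].item()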
    scores.append(entailment_score)

predicted = stances[scores.index(max(scores))]
print(f"Predicted stance: {predicted}")  # "neutral"

Performance

Evaluated on a held-out test set of 450 tweets (70/15/15 train/val/test split, seed=42).

Overall metrics

| Metric | Value |
|---|---|
| Accuracy | 80.67% |
| Macro F1 | 0.7606 |
| Weighted F1 | 0.8009 |

Per-class metrics

| Class | Precision | Recall | F1 | Support |
|---|---|---|---|---|
| Bearish | 0.8108 | 0.6383 | 0.7143 | 47 |
| Neutral | 0.8137 | 0.9154 | 0.8616 | 272 |
| Bullish | 0.7850 | 0.6412 | 0.7059 | 131 |
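As a consistency check, the overall metrics can be recomputed from the per-class figures. The sketch below uses only the numbers reported above. The count of correct predictions is recovered by rounding recall times support, which is an approximation based on the rounded table values.

```python
# (precision, recall, F1, support) per class, copied from the per-class metrics.
per_class = {
    "bearish": (0.8108, 0.6383, 0.7143, 47),
    "neutral": (0.8137, 0.9154, 0.8616, 272),
    "bullish": (0.7850, 0.6412, 0.7059, 131),
}

total = sum(support for *_, support in per_class.values())

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

# Weighted F1: per-class F1 weighted by class support.
weighted_f1 = sum(f1 * support for _, _, f1, support in per_class.values()) / total

# Accuracy: correctly classified tweets (recall x support, rounded) over all tweets.
correct = sum(round(recall * support) for _, recall, _, support in per_class.values())
accuracy = correct / total

print(f"{macro_f1:.4f} {weighted_f1:.4f} {accuracy:.4f}")  # 0.7606 0.8009 0.8067
```

The recomputed values agree with the reported macro F1, weighted F1, and accuracy.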

Comparison with baselines

| Model | Paradigm | Accuracy | Macro F1 |
|---|---|---|---|
| DeBERTa-MNLI | Zero-Shot (Baseline) | 43.33% | 0.4192 |
| DeBERTa-MNLI | Zero-Shot (OPRO) | 50.44% | 0.4538 |
| DeBERTa-NLI | Fine-Tuned | 80.67% | 0.7606 |
| GPT-4o | Zero-Shot | 76.67% | 0.7275 |

Training details

Dataset

  • Source: 3,000 cryptocurrency tweets about BTC and ETH
  • Labels: Bullish (873), Neutral (1,814), Bearish (313)
  • Split: 2,100 train / 450 val / 450 test (seed=42)
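The card does not say whether the split was stratified by label. The reported test supports (47 / 272 / 131) are consistent with taking 15% of each class, so the sketch below assumes a stratified 70/15/15 split. The function `stratified_split` and its use of Python's `random` module are assumptions for illustration and will not reproduce the author's exact split.

```python
import random

# Label counts from the dataset description above.
LABEL_COUNTS = {"bullish": 873, "neutral": 1814, "bearish": 313}


def stratified_split(labels, val_frac=0.15, test_frac=0.15, seed=42):
    """Split indices into train/val/test while preserving label proportions."""
    rng = random.Random(seed)
    by_label = {}
    for idx, label in enumerate(labels):
        by_label.setdefault(label, []).append(idx)

    train, val, test = [], [], []
    for indices in by_label.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_frac)
        n_val = round(len(indices) * val_frac)
        test.extend(indices[:n_test])
        val.extend(indices[n_test:n_test + n_val])
        train.extend(indices[n_test + n_val:])
    return train, val, test


# Placeholder label list with the reported class counts (no real tweets).
labels = [label for label, n in LABEL_COUNTS.items() for _ in range(n)]
train_idx, val_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(val_idx), len(test_idx))  # 2100 450 450
```

Under this assumption the test split contains 47 bearish, 272 neutral, and 131 bullish items, matching the supports in the per-class metrics.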

NLI training approach

Each tweet is expanded into 3 NLI premise–hypothesis pairs:

  • Correct stance → entailment
  • Incorrect stances → contradiction

This results in 6,300 training pairs from 2,100 tweets.
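A minimal sketch of this expansion is shown here. The function name `expand_to_nli_pairs`, the dictionary keys, and the example tweet are illustrative assumptions, not the thesis code.

```python
TEMPLATE = "The author's perspective on {target} in this tweet reflects a {stance} sentiment."
STANCES = ("bullish", "bearish", "neutral")


def expand_to_nli_pairs(tweet: str, target: str, gold_stance: str) -> list[dict]:
    """Turn one labeled tweet into three premise-hypothesis pairs."""
    pairs = []
    for stance in STANCES:
        pairs.append({
            "premise": tweet,
            "hypothesis": TEMPLATE.format(target=target, stance=stance),
            # Correct stance -> entailment, incorrect stances -> contradiction.
            "label": "entailment" if stance == gold_stance else "contradiction",
        })
    return pairs


# Made-up example tweet, not taken from the training data.
pairs = expand_to_nli_pairs("ETH keeps bleeding, I'm out.", "ETH", "bearish")
for pair in pairs:
    print(pair["label"], "|", pair["hypothesis"])
```

Each tweet yields one entailment pair and two contradiction pairs, so the NLI "neutral" label is never used as a training target in this scheme.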

Hyperparameters

| Parameter | Value |
|---|---|
| Base model | MoritzLaurer/DeBERTa-v3-base-mnli |
| Learning rate | 2e-5 |
| Batch size | 8 (physical) × 2 (accumulation) = 16 (effective) |
| Max epochs | 5 (early stopping patience = 3) |
| Max sequence length | 128 |
| Warmup | 10% linear warmup + linear decay |
| Weight decay | 0.01 |
| Class weights | Bearish=3.19, Neutral=0.55, Bullish=1.15 |
| Optimizer | AdamW |
| Best epoch | Selected by early stopping on validation macro F1 |
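The card does not state how the class weights were derived. The reported values are consistent with the common "balanced" inverse-frequency formula, n_samples / (n_classes × class_count), applied to the full-dataset label counts. The sketch below works under that assumption. The weights may instead have been computed on the training split, which would give nearly identical values if the split preserved label proportions.

```python
# Label counts from the dataset section.
label_counts = {"bearish": 313, "neutral": 1814, "bullish": 873}

n_samples = sum(label_counts.values())
n_classes = len(label_counts)

# "Balanced" inverse-frequency weight: n_samples / (n_classes * class_count).
class_weights = {
    label: n_samples / (n_classes * count) for label, count in label_counts.items()
}

for label, weight in class_weights.items():
    print(f"{label}: {weight:.2f}")
# bearish: 3.19
# neutral: 0.55
# bullish: 1.15
```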

Hypothesis template (OPRO-optimized)

The hypothesis template was optimized using OPRO (Optimization by PROmpting) with GPT-4o-mini:

"The author's perspective on {target} in this tweet reflects a {stance} sentiment."

Limitations

  • Domain-specific: Trained only on cryptocurrency tweets (BTC and ETH). May not generalize to other financial assets or domains.
  • Class imbalance: Bearish tweets are underrepresented (10.4% of data), leading to lower recall on bearish stance despite class weighting.
  • Language: English only.
  • Temporal: Trained on tweets from a specific time period. Cryptocurrency language and sentiment patterns evolve rapidly.
  • Tweet preprocessing: Best results when input text is preprocessed (URL/mention removal, emoji handling) similar to the training data.
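The exact preprocessing used for the training data is not specified here, so the following is only a rough sketch of URL and mention removal. The regular expressions and the function name `preprocess_tweet` are assumptions. Emojis are left untouched because the card does not describe how they were handled.

```python
import re

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
WHITESPACE_RE = re.compile(r"\s+")


def preprocess_tweet(text: str) -> str:
    """Strip URLs and @mentions, then collapse whitespace. Emojis are kept as-is."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    return WHITESPACE_RE.sub(" ", text).strip()


cleaned = preprocess_tweet("@whale_alert $BTC breaking out 🚀 https://example.com/chart")
print(cleaned)  # $BTC breaking out 🚀
```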

Citation

If you use this model, please cite:

@mastersthesis{pratama2026cryptostancenli,
  title={NLI-Based Cryptocurrency Stance Detection},
  author={Pratama, Syahreza},
  year={2026},
  school={Your University}
}

License

MIT
