---
language: en
tags:
- twitch
- roberta
- domain-adaptation
- nlp
- masked-language-modeling
license: mit
base_model: cardiffnlp/twitter-roberta-base-sentiment-latest
---
# Twitch-RoBERTa-Base (Domain Adapted)
This is a domain-adapted RoBERTa-base model, further pre-trained on ~1.1 million real Twitch chat messages.

It addresses the "domain shift" problem, where standard NLP models (trained on Wikipedia or Twitter) fail to understand gaming slang. For example, standard models often classify "cracked" as negative (broken) or "cap" as neutral (a hat). This model understands that, in a gaming context, "cracked" means highly skilled and "cap" means a lie.
## Model Performance
| Metric (lower is better) | Baseline (Twitter-RoBERTa) | Twitch-RoBERTa (this model) |
|---|---|---|
| Perplexity | ~21,375 | ~5.5 |
| Loss | 9.97 | 1.7 |
Result: loss falls from 9.97 to 1.7, a ~83% reduction, which drops perplexity from ~21,375 to ~5.5. In effect, continued pre-training taught the model the specific vocabulary, syntax, and emote usage patterns of the Twitch community.
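The two table rows are mutually consistent: perplexity is the exponential of the mean cross-entropy loss. This can be checked directly from the reported losses (the numbers below come from the table above):

```python
import math

# Perplexity = exp(mean cross-entropy loss)
baseline_loss, adapted_loss = 9.97, 1.7

print(f"baseline perplexity: {math.exp(baseline_loss):,.0f}")          # ≈ 21,375
print(f"adapted perplexity:  {math.exp(adapted_loss):.2f}")            # ≈ 5.47
print(f"loss reduction:      {1 - adapted_loss / baseline_loss:.0%}")  # ≈ 83%
```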
## Architecture & Training
- Base Architecture: `roberta-base` (125M parameters)
- Training Task: Masked Language Modeling (MLM)
- Dataset: ~1.1 million diverse Twitch messages, aggregated from a range of popular Twitch channels to ensure generalization.
- Optimization:
  - Precision: FP16 mixed precision
  - Batch Strategy (on a local GPU): gradient accumulation (per-device batch size 4 × 4 accumulation steps = effective batch size 16)
  - Masking: dynamic masking (probability 0.15), re-sampled on every pass to reduce overfitting
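The dynamic-masking step can be illustrated with a small framework-free sketch. This is not the training code (the card does not show it); it is a hypothetical re-implementation of the standard BERT/RoBERTa recipe on token strings: each token is selected with probability 0.15, and of the selected tokens 80% become `<mask>`, 10% are swapped for a random token, and 10% are left unchanged. Because selection is re-sampled on every call, the same message yields a different training example each epoch.

```python
import random

MASK_PROB = 0.15      # masking probability from the model card
MASK_TOKEN = "<mask>"

def dynamic_mask(tokens, rng):
    """Return (masked_tokens, labels); labels is None at unselected positions."""
    vocab = sorted(set(tokens))
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < MASK_PROB:
            labels.append(tok)            # position contributes to the MLM loss
            r = rng.random()
            if r < 0.8:
                masked.append(MASK_TOKEN)         # 80%: replace with <mask>
            elif r < 0.9:
                masked.append(rng.choice(vocab))  # 10%: random token
            else:
                masked.append(tok)                # 10%: keep unchanged
        else:
            labels.append(None)           # position ignored by the MLM loss
            masked.append(tok)
    return masked, labels

rng = random.Random(0)
tokens = "that play was absolutely cracked no cap".split()
# Re-sampling produces a different masking of the same message each time:
print(dynamic_mask(tokens, rng)[0])
print(dynamic_mask(tokens, rng)[0])
```

In the real pipeline the same idea operates on token IDs inside the data collator, so the masking pattern changes every epoch without duplicating the dataset.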
## Intended Use
Recommended Use Cases:
- Fine-Tuning: Use this model as the base for training a Sentiment Classifier, Toxicity Detector, or Spam Filter for Twitch chat. It will converge significantly faster and with higher accuracy than a generic BERT model.
- Masked Prediction: Auto-completing gaming messages or understanding slang context.
## Usage Example
```python
from transformers import pipeline

fill_mask = pipeline(
    "fill-mask",
    model="osamuyiohenhen/twitch-roberta-base",
    tokenizer="osamuyiohenhen/twitch-roberta-base",
)

# Test the model's understanding of slang
result = fill_mask("That play was absolutely <mask>.")
print(result)
# Likely predictions: "cracked", "insane", "nuts", "crazy"
```
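The pipeline returns a list of candidate dictionaries whose keys include `token_str` and `score`. A small helper (the function name and the hard-coded `sample` scores below are illustrative, not real model output) shows how to rank the candidates for downstream use:

```python
def top_tokens(predictions, k=3):
    """Return the k highest-scoring token strings from a fill-mask result."""
    ranked = sorted(predictions, key=lambda p: p["score"], reverse=True)
    return [p["token_str"].strip() for p in ranked[:k]]

# Same shape as a Hugging Face fill-mask result (scores are made up):
sample = [
    {"token_str": " cracked", "score": 0.31},
    {"token_str": " insane", "score": 0.22},
    {"token_str": " nuts", "score": 0.12},
    {"token_str": " crazy", "score": 0.09},
]
print(top_tokens(sample))  # ['cracked', 'insane', 'nuts']
```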
## Limitations
- Context Window: 128 tokens (optimized for short chat messages).