---
language: en
tags:
  - twitch
  - roberta
  - domain-adaptation
  - nlp
  - masked-language-modeling
license: mit
base_model: cardiffnlp/twitter-roberta-base-sentiment-latest
---

# Twitch-RoBERTa-Base (Domain Adapted)

This is a domain-adapted RoBERTa-base model, produced by continuing masked-language-model pre-training on ~1.1 million real Twitch chat messages.

It addresses the domain-shift problem, where standard NLP models (trained on Wikipedia or Twitter) fail to understand gaming slang. For example, standard models often read "cracked" as negative (broken) or "cap" as neutral (a hat). This model learns that in a gaming context, "cracked" means *skillful* and "cap" means *lie*.

## Model Performance

| Metric     | Baseline (Twitter-RoBERTa) | Twitch-RoBERTa (this model) |
|------------|----------------------------|-----------------------------|
| Perplexity | ~21,375                    | ~5.5                        |
| Loss       | 9.97                       | 1.7                         |

**Result:** loss drops from 9.97 to 1.7 (an ~83% reduction), corresponding to a roughly 3,900× drop in perplexity. Domain adaptation effectively teaches the model the specific vocabulary, syntax, and emote usage patterns of the Twitch community.
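Perplexity here is the exponential of the average masked-token loss (exp(9.97) ≈ 21,400 and exp(1.7) ≈ 5.5, matching the table). The exact evaluation script is not included in this card; the sketch below shows one common way to estimate a comparable figure via pseudo-perplexity (masking one token at a time). The example sentence is illustrative:

```python
import math

import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_id = "osamuyiohenhen/twitch-roberta-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id).eval()

def pseudo_perplexity(text: str) -> float:
    """Mask each token in turn and average the model's negative
    log-likelihood for the original token at that position."""
    ids = tokenizer(text, return_tensors="pt")["input_ids"][0]
    nlls = []
    for i in range(1, len(ids) - 1):  # skip the <s> and </s> specials
        masked = ids.clone()
        masked[i] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(masked.unsqueeze(0)).logits[0, i]
        nlls.append(-torch.log_softmax(logits, dim=-1)[ids[i]].item())
    return math.exp(sum(nlls) / len(nlls))

print(pseudo_perplexity("that clutch play was actually insane"))
```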

## Architecture & Training

- **Base architecture:** roberta-base (125M parameters)
- **Training task:** Masked Language Modeling (MLM)
- **Dataset:** ~1.1 million diverse Twitch messages, aggregated from a range of popular Twitch channels to ensure generalization
- **Optimization** (a configuration sketch follows after this list):
  - **Precision:** FP16 mixed precision
  - **Batch strategy (single local GPU):** gradient accumulation (micro-batch of 4 × 4 accumulation steps = effective batch size 16)
  - **Masking:** dynamic masking (probability 0.15) to prevent overfitting on the relatively small corpus
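For reference, a minimal sketch of an equivalent MLM training setup in Hugging Face `transformers`, assuming the hyperparameters listed above; `tokenized_chat_dataset` is a hypothetical pre-tokenized dataset and the epoch count is a placeholder, since neither is stated in this card:

```python
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base = "cardiffnlp/twitter-roberta-base-sentiment-latest"
tokenizer = AutoTokenizer.from_pretrained(base)
# The base checkpoint carries a classification head, so the MLM head
# is freshly initialized here (transformers warns about this).
model = AutoModelForMaskedLM.from_pretrained(base)

# Dynamic masking: the collator re-masks 15% of tokens on every pass.
collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=True, mlm_probability=0.15
)

args = TrainingArguments(
    output_dir="twitch-roberta-base",
    per_device_train_batch_size=4,   # micro-batch of 4 ...
    gradient_accumulation_steps=4,   # ... x 4 steps = effective batch of 16
    fp16=True,                       # mixed precision
    num_train_epochs=3,              # assumption: epochs not stated in the card
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized_chat_dataset,  # hypothetical pre-tokenized dataset
    data_collator=collator,
)
trainer.train()
```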

## Intended Use

Recommended use cases:

1. **Fine-tuning:** use this model as the base for training a sentiment classifier, toxicity detector, or spam filter for Twitch chat. It should converge faster and reach higher accuracy than a generic BERT model; see the sketch after this list.
2. **Masked prediction:** auto-completing gaming messages or understanding slang in context.
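As a starting point for use case 1, a sentiment classifier could be initialized from this checkpoint as below; the three-label scheme is an illustrative assumption:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "osamuyiohenhen/twitch-roberta-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# A fresh classification head is attached on top of the domain-adapted
# encoder; num_labels=3 (negative/neutral/positive) is an illustrative choice.
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=3)
# ...then fine-tune with Trainer on a labeled Twitch-chat dataset as usual.
```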

## Usage Example

```python
from transformers import pipeline

fill_mask = pipeline(
    "fill-mask",
    model="osamuyiohenhen/twitch-roberta-base",
    tokenizer="osamuyiohenhen/twitch-roberta-base",
)

# Test the model's understanding of slang
result = fill_mask("That play was absolutely <mask>.")
print(result)
# Likely predictions: "cracked", "insane", "nuts", "crazy"
```

## Limitations

- **Context window:** 128 tokens (optimized for short chat messages)
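Longer inputs must be truncated before inference. A minimal sketch, using an artificially long message for illustration:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("osamuyiohenhen/twitch-roberta-base")

# Inputs beyond 128 tokens are cut off; typical chat messages fit easily.
encoded = tokenizer(
    "PogChamp " * 300,  # an artificially long message
    truncation=True,
    max_length=128,
)
print(len(encoded["input_ids"]))  # 128
```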