AIxcellent Vibes at GermEval 2025 Shared Task on Candy Speech Detection: Improving Model Performance by Span-Level Training
Paper: arXiv:2509.07459
```python
# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("cortex359/germeval2025", dtype="auto")
```

This model is a fine-tuned XLM-RoBERTa-Large adapted for the GermEval 2025 Shared Task on Candy Speech Detection. It was trained to identify candy speech at both the comment level (binary detection) and the span level (locating the candy speech expressions themselves).
The span-level model also proved effective for binary detection: a comment is classified as candy speech if at least one positive span is detected in it.
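That aggregation rule is simple to apply on top of span-level predictions. A minimal sketch, assuming the pipeline's usual output format of one dict per detected span (the `entity_group` label shown here is illustrative, not necessarily the model's actual tag set):

```python
def is_candy_speech(entities):
    """Comment-level label from span-level predictions: a comment
    counts as candy speech if at least one positive span was found."""
    return len(entities) > 0

# Hypothetical span predictions in the pipeline's output format:
assert is_candy_speech([{"entity_group": "positive feedback", "start": 12, "end": 24}])
assert not is_candy_speech([])
```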
Dataset: 46k German YouTube comments, annotated with candy speech spans.
Training Data Split: 37,057 comments (train), 9,229 (test).
Shared Task Results:
If you use this model, please cite:
```bibtex
@inproceedings{thelen-etal-2025-aixcellent,
    title = "{AI}xcellent Vibes at {G}erm{E}val 2025 Shared Task on Candy Speech Detection: Improving Model Performance by Span-Level Training",
    author = "Thelen, Christian Rene and
      Blaneck, Patrick Gustav and
      Bornheim, Tobias and
      Grieger, Niklas and
      Bialonski, Stephan",
    editor = "Wartena, Christian and
      Heid, Ulrich",
    booktitle = "Proceedings of the 21st Conference on Natural Language Processing (KONVENS 2025): Workshops",
    month = sep,
    year = "2025",
    address = "Hannover, Germany",
    publisher = "HsH Applied Academics",
    url = "https://aclanthology.org/2025.konvens-2.33/",
    pages = "398--403"
}
```
AIxcellent Vibes at GermEval 2025 Shared Task on Candy Speech Detection: Improving Model Performance by Span-Level Training (Thelen et al., KONVENS 2025)
Base model: FacebookAI/xlm-roberta-base
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("token-classification", model="cortex359/germeval2025")
```
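The pipeline reports character offsets for each detected span, so the flagged text can be sliced back out of the original comment. A minimal sketch with a hypothetical pipeline result (the example comment, label names, and offsets are illustrative, not actual model output):

```python
# Illustrative German YouTube comment: "What a great video, well done!"
comment = "Was für ein tolles Video, gut gemacht!"

# Hypothetical span predictions in the pipeline's output format:
entities = [
    {"entity_group": "positive feedback", "start": 12, "end": 24},
    {"entity_group": "positive feedback", "start": 26, "end": 37},
]

def extract_spans(text, entities):
    """Recover the detected candy-speech spans from the comment
    using the character offsets the pipeline reports."""
    return [text[e["start"]:e["end"]] for e in entities]

print(extract_spans(comment, entities))  # → ['tolles Video', 'gut gemacht']
```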