Overview

This model is XLM-RoBERTa-base fine-tuned on the train partition (346,977 unique entries) of the Kaggle dataset "Augmented data for LLM - Detect AI Generated Text". The training data is English only, but because the base model is multilingual, the fine-tuned model should have cross-lingual capabilities. It has not been tuned to reach state-of-the-art performance; it serves as a proof of concept for low-resource languages.

Task

The model is trained to predict whether a text or essay was written by a human (output label 0) or by an AI (output label 1).
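
The label convention above can be illustrated with a minimal sketch. It assumes the checkpoint loads through the standard `transformers` sequence-classification classes and that logit index 0 corresponds to "human" and index 1 to "AI", as stated above. The helper names `softmax`, `interpret`, and `classify` are hypothetical and not part of the model or any library.

```python
import math

# Label mapping as described in the Task section: 0 = human, 1 = AI.
LABELS = {0: "human", 1: "ai"}


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def interpret(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    idx = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[idx], probs[idx]


def classify(text, model_id="daalft/xlmr-ai-text-detection"):
    """Run the model on one text.

    Assumption: the checkpoint is compatible with
    AutoModelForSequenceClassification and exposes two output logits.
    Requires `torch` and `transformers`, plus network access on first use.
    """
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # Truncate long essays to the tokenizer's maximum input length.
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return interpret(logits)
```

Calling `classify("Some essay text.")` would return a `(label, probability)` tuple. Since the model is described as a proof of concept, the probability is best read as a rough signal rather than a calibrated score.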

Model details

Model: daalft/xlmr-ai-text-detection
Base model: FacebookAI/xlm-roberta-base
Model size: 0.3B params (Safetensors)
Tensor type: F32