The AI probability always extremely high?

#1
by 3unny - opened

Hi! Could you please help me understand why the detection rate always displays 'AI Generated'? I was wondering if there might be something I'm doing wrong. Thank you.

3unny changed discussion title from "Why is the AI probability always extreme high?" to "The AI probability always extremely high?"

Seconding this. It seems to be either bad model weights or fake results in general.

Owner

What dataset are you using? The model was only trained and tested on the RAID dataset. This is the same model I used to generate the results for my RAID submission.

Hello, I'm using some WSJ news articles and other text collections I read daily. I run them in Python following the Usage example, but almost all of them come back as "Prediction: AI-generated".

Owner

The model is trained only on the RAID dataset, so it’s not surprising that it doesn’t generalize well to out-of-distribution (OOD) samples.
If you retrain it according to the methodology in the README, it should work much better.

Yes, same issue. Keeps giving high probability to AI generated content. Maybe the fact that it is trained on a base version of RoBERTa makes it behave like that.

Ensure the text does not contain newlines (\n), as the detection model is highly affected by them. That is, keep all the text on one line, without any \n (newline) breaks.
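
For anyone who wants to try this, here is a minimal sketch of that preprocessing step, assuming embedded line breaks really are what skews the prediction. `to_single_line` is a hypothetical helper name and is not part of the model's Usage example. It also collapses tabs and repeated spaces, which may or may not matter for this model.

```python
import re


def to_single_line(text: str) -> str:
    """Collapse newlines, tabs, and repeated spaces into single spaces."""
    # \s+ matches any run of whitespace, including \n, \r, and \t.
    return re.sub(r"\s+", " ", text).strip()


sample = "First paragraph.\n\nSecond paragraph,\twith a tab.\n"
print(to_single_line(sample))
# prints: First paragraph. Second paragraph, with a tab.
```

Pass the cleaned string to the detector in place of the raw article text, then compare the two predictions to see whether the line breaks were the cause.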

Is your fine-tuning code public?
