arxiv:2601.04633

MAGA-Bench: Machine-Augment-Generated Text via Alignment Detection Benchmark

Published on Jan 8

Abstract

Machine-Augment-Generated Text via Alignment (MAGA) enhances the alignment of generated text through Reinforced Learning from Detectors Feedback (RLDF), improving detector generalization and robustness.

AI-generated summary

The alignment of Large Language Models (LLMs) is constantly evolving, and Machine-Generated Text (MGT) is becoming increasingly difficult to distinguish from Human-Written Text (HWT). This has exacerbated abuses such as fake news and online fraud. The generalization ability of fine-tuned detectors depends heavily on dataset quality, and simply expanding the sources of MGT is insufficient; the generation process itself must be further augmented. According to the theory of HC-Var, enhancing the alignment of generated text not only enables attacks that test the robustness of existing detectors, but also improves the generalization ability of detectors fine-tuned on it. We therefore propose Machine-Augment-Generated Text via Alignment (MAGA). The MAGA pipeline achieves comprehensive alignment from prompt construction to the reasoning process, with Reinforced Learning from Detectors Feedback (RLDF), which we systematically propose, serving as a key component. In our experiments, a RoBERTa detector fine-tuned on the MAGA training set achieved an average improvement of 4.60% in generalization detection AUC, and the MAGA dataset caused an average decrease of 8.13% in the AUC of the selected detectors. We expect these results to be indicative for future research on the generalization ability of detectors.
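
The page gives no implementation details for RLDF, so the following is only a minimal sketch of the general idea as we read it: a REINFORCE-style policy-gradient loop in which a frozen detector's "human-written" probability serves as the reward for the generator. The toy generator, toy detector, and all names and hyperparameters below are illustrative assumptions, not the authors' method.

```python
# Illustrative REINFORCE-style sketch of detector-feedback training.
# Toy stand-ins are used for the generator and detector; the real MAGA
# pipeline presumably uses an LLM generator and a fine-tuned MGT detector.
import torch
import torch.nn as nn

VOCAB, HIDDEN, MAX_LEN = 100, 32, 16

class ToyGenerator(nn.Module):
    """Autoregressive toy LM: samples token sequences and returns log-probs."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, HIDDEN)
        self.rnn = nn.GRU(HIDDEN, HIDDEN, batch_first=True)
        self.head = nn.Linear(HIDDEN, VOCAB)

    def sample(self, batch_size):
        tokens = torch.zeros(batch_size, 1, dtype=torch.long)  # BOS = 0
        log_probs, h = [], None
        for _ in range(MAX_LEN):
            out, h = self.rnn(self.embed(tokens[:, -1:]), h)
            dist = torch.distributions.Categorical(logits=self.head(out[:, -1]))
            tok = dist.sample()
            log_probs.append(dist.log_prob(tok))
            tokens = torch.cat([tokens, tok.unsqueeze(1)], dim=1)
        return tokens[:, 1:], torch.stack(log_probs, dim=1).sum(dim=1)

class ToyDetector(nn.Module):
    """Frozen MGT detector: outputs P(machine-generated) per sequence."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, HIDDEN)
        self.head = nn.Linear(HIDDEN, 1)

    def forward(self, tokens):
        return torch.sigmoid(self.head(self.embed(tokens).mean(dim=1))).squeeze(-1)

generator, detector = ToyGenerator(), ToyDetector()
detector.requires_grad_(False)  # feedback source only; not trained here
opt = torch.optim.Adam(generator.parameters(), lr=1e-3)

for step in range(200):
    tokens, seq_log_prob = generator.sample(batch_size=8)
    with torch.no_grad():
        p_machine = detector(tokens)
    reward = 1.0 - p_machine            # reward text the detector finds human-like
    advantage = reward - reward.mean()  # simple baseline to reduce variance
    loss = -(advantage * seq_log_prob).mean()  # REINFORCE objective
    opt.zero_grad()
    loss.backward()
    opt.step()
```

In a real pipeline the toy modules would be swapped for an LLM generator and a fine-tuned MGT detector, and the detector reward would likely be combined with a fluency or KL constraint so the generator does not degenerate into detector-fooling gibberish.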
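
The reported numbers are changes in detection AUC. As a small illustration of how such a score is computed, assuming the common convention that label 1 means machine-generated and the detector outputs P(machine-generated) (the paper's exact evaluation protocol is not stated on this page):

```python
# Illustrative AUC scoring for an MGT detector (assumed convention:
# label 1 = machine-generated, scores = detector's P(machine-generated)).
from sklearn.metrics import roc_auc_score

labels = [1, 1, 0, 0, 1, 0]               # 1 = MGT, 0 = HWT (toy data)
scores = [0.9, 0.4, 0.2, 0.35, 0.8, 0.6]  # detector confidence per text

auc = roc_auc_score(labels, scores)
print(f"detection AUC: {auc:.4f}")
```

Under this convention, a stronger adversarial dataset pushes the AUC of fixed detectors down (the reported 8.13% average decrease), while fine-tuning a detector on that dataset is reported to push its generalization AUC up (the reported 4.60% average improvement).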
