MoSEs SAR Models: Stylistics-Aware Router
This repository contains the trained Stylistics-Aware Router (SAR) models for the MoSEs framework, an uncertainty-aware AI-generated text detection system. The SAR models are used to route input texts to relevant reference samples based on stylistic features.
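As a rough intuition for what "routing" means here, the toy sketch below picks the reference samples whose feature vectors are most similar to an input's. It is not the trained SAR: the hand-made two-dimensional vectors, the cosine-similarity score, and the `route` helper are all assumptions made purely for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def route(query, references, k=2):
    """Return the indices of the k reference vectors most similar to the query.

    Hypothetical helper: the real SAR learns its routing from data instead of
    using a fixed similarity over hand-picked features.
    """
    ranked = sorted(
        range(len(references)),
        key=lambda i: cosine(query, references[i]),
        reverse=True,
    )
    return ranked[:k]


# Made-up stylistic feature vectors for three reference samples.
references = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
nearest = route([1.0, 0.0], references, k=2)
print(nearest)  # [0, 1]
```

A detector could then calibrate its decision only on the selected reference samples rather than on the whole reference set.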
Model Overview
Two SAR models are provided, trained on different datasets:
main_1000.pt (Main SAR Model)
- Training Dataset: Main dataset with 8,000 samples across 8 domains
- Domains: CMV, SciXGen, WP, XSum (with human and AI-generated continuations)
tiny_200.pt (Tiny SAR Model)
- Training Dataset: Tiny dataset with 1,600 samples across 4 domains
- Domains: CNN, DialogSum, IMDB, PubMed (with human and GPT-4-generated variants)
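The checkpoints can probably be fetched and loaded along the lines of the sketch below. It assumes the files live at the top level of the `zhengliu8/Stylistics_Aware_Router` repository given in the citation URL, and that `huggingface_hub` and `torch` are installed. The helper names `checkpoint_filename` and `load_sar_checkpoint` are hypothetical, and the structure of the loaded object (state dict or full module) is not documented here, so inspect it before use.

```python
# Repository ID taken from the citation URL in this README.
REPO_ID = "zhengliu8/Stylistics_Aware_Router"

# Checkpoint file names as listed in the Model Overview.
CHECKPOINTS = {"main": "main_1000.pt", "tiny": "tiny_200.pt"}


def checkpoint_filename(variant: str) -> str:
    """Map a variant name ("main" or "tiny") to its checkpoint file name."""
    if variant not in CHECKPOINTS:
        raise ValueError(
            f"unknown variant {variant!r}; expected one of {sorted(CHECKPOINTS)}"
        )
    return CHECKPOINTS[variant]


def load_sar_checkpoint(variant: str = "main"):
    """Download a checkpoint from the Hugging Face Hub and load it on the CPU.

    Hypothetical helper: what the file contains is an open question, so the
    return value is handed back unchanged for the caller to inspect.
    """
    # Imported lazily so the helpers above work without these packages.
    from huggingface_hub import hf_hub_download
    import torch

    local_path = hf_hub_download(
        repo_id=REPO_ID, filename=checkpoint_filename(variant)
    )
    # Recent PyTorch versions default to weights_only=True. If that rejects
    # the file and you trust the source, pass weights_only=False instead.
    return torch.load(local_path, map_location="cpu")
```

Calling `load_sar_checkpoint("tiny")` downloads `tiny_200.pt` into the local Hub cache and returns whatever object the file holds. Building the router from that object requires the model definition in the MoSEs code.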
Citation
If you use these models in your research, please cite the MoSEs paper:
@inproceedings{wu2025moses,
title={MoSEs: Uncertainty-Aware AI-Generated Text Detection via Mixture of Stylistics Experts with Conditional Thresholds},
author={Wu, Junxi and Wang, Jinpeng and Liu, Zheng and Chen, Bin and Hu, Dongjian and Wu, Hao and Xia, Shu-Tao},
booktitle={Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing},
year={2025},
publisher={Association for Computational Linguistics}
}
To cite the SAR models specifically:
@misc{moses_sar_models,
title={MoSEs Stylistics-Aware Router},
author={Wu, Junxi and Wang, Jinpeng and Liu, Zheng and Chen, Bin and Hu, Dongjian and Wu, Hao and Xia, Shu-Tao},
year={2025},
url={https://huggingface.co/zhengliu8/Stylistics_Aware_Router}
}
Related Resources
- MoSEs Paper: arXiv:2509.02499
- MoSEs Code: GitHub Repository
- Stylistics Reference Repository: HuggingFace Dataset
License
These models are released under the MIT License.