| | --- |
| | datasets: |
| | - Hello-SimpleAI/HC3-Chinese |
| | language: |
| | - zh |
| | pipeline_tag: text-classification |
| | tags: |
| | - chatgpt |
| | --- |
| | |
| | # Model Card for `Hello-SimpleAI/chatgpt-detector-roberta-chinese` |
| |
|
| | This model is trained on **the mix of full-text and splitted sentences** of `answer`s from [Hello-SimpleAI/HC3-Chinese](https://huggingface.co/datasets/Hello-SimpleAI/HC3-Chinese). |
| |
|
| | More details refer to [arxiv: 2301.07597](https://arxiv.org/abs/2301.07597) and Gtihub project [Hello-SimpleAI/chatgpt-comparison-detection](https://github.com/Hello-SimpleAI/chatgpt-comparison-detection). |
| |
|
| |
|
| | The base checkpoint is [hfl/chinese-roberta-wwm-ext](https://huggingface.co/hfl/chinese-roberta-wwm-ext). |
| | We train it with all [Hello-SimpleAI/HC3-Chinese](https://huggingface.co/datasets/Hello-SimpleAI/HC3-Chinese) data (without held-out) for 2 epochs. |
| |
|
| | (2-epoch is consistent with the experiments in [our paper](https://arxiv.org/abs/2301.07597).) |
| |
|
| | ## Citation |
| |
|
| | Checkout this papaer [arxiv: 2301.07597](https://arxiv.org/abs/2301.07597) |
| |
|
| | ``` |
| | @article{guo-etal-2023-hc3, |
| | title = "How Close is ChatGPT to Human Experts? Comparison Corpus, Evaluation, and Detection", |
| | author = "Guo, Biyang and |
| | Zhang, Xin and |
| | Wang, Ziyuan and |
| | Jiang, Minqi and |
| | Nie, Jinran and |
| | Ding, Yuxuan and |
| | Yue, Jianwei and |
| | Wu, Yupeng", |
| | journal={arXiv preprint arxiv:2301.07597} |
| | year = "2023", |
| | } |
| | ``` |