	---
	datasets:
	- Hello-SimpleAI/HC3-Chinese
	language:
	- zh
	pipeline_tag: text-classification
	tags:
	- chatgpt
	---

	# Model Card for `Hello-SimpleAI/chatgpt-detector-roberta-chinese`

	This model is trained on a mix of full-text answers and their split sentences, both taken from the `answer` fields of [Hello-SimpleAI/HC3-Chinese](https://huggingface.co/datasets/Hello-SimpleAI/HC3-Chinese).
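To illustrate what a "mix" of full-text and sentence-level examples could look like, here is a minimal sketch. The splitting rule (breaking after `。！？!?`), the helper names `split_sentences` and `build_mixed_examples`, and the integer `label` are assumptions made for illustration; the actual preprocessing is described in the paper and the GitHub project, not here.

```python
import re

# Assumption: a sentence is a run of text followed by any closing
# punctuation marks. The real preprocessing may use a different rule.
_SENTENCE = re.compile(r"[^。！？!?]+[。！？!?]*")


def split_sentences(text):
    """Split text into sentences, keeping the end punctuation attached."""
    return [s.strip() for s in _SENTENCE.findall(text) if s.strip()]


def build_mixed_examples(answer, label):
    """Return the full answer plus its individual sentences, all with one label.

    `label` is a hypothetical class id (for example human vs. ChatGPT).
    """
    full = answer.strip()
    examples = [(full, label)]
    sentences = split_sentences(full)
    # A single-sentence answer would only duplicate the full text.
    if len(sentences) > 1:
        examples.extend((sentence, label) for sentence in sentences)
    return examples
```

Calling `build_mixed_examples` on each `answer` and concatenating the results would yield a training set that contains both granularities, which is the idea the sentence above describes.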

	For more details, refer to [arXiv: 2301.07597](https://arxiv.org/abs/2301.07597) and the GitHub project [Hello-SimpleAI/chatgpt-comparison-detection](https://github.com/Hello-SimpleAI/chatgpt-comparison-detection).


	The base checkpoint is [hfl/chinese-roberta-wwm-ext](https://huggingface.co/hfl/chinese-roberta-wwm-ext).
	We train it on all of the [Hello-SimpleAI/HC3-Chinese](https://huggingface.co/datasets/Hello-SimpleAI/HC3-Chinese) data (with no held-out split) for 2 epochs.

	(Training for 2 epochs is consistent with the experiments in [our paper](https://arxiv.org/abs/2301.07597).)
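Since the card declares `pipeline_tag: text-classification`, the checkpoint can presumably be loaded with the `transformers` text-classification pipeline. The sketch below assumes `transformers` and a backend such as PyTorch are installed and that the model can be downloaded from the Hub; the helper names `build_detector` and `classify` are illustrative, and the label strings returned depend on the model's own config, so none are hard-coded here.

```python
MODEL_ID = "Hello-SimpleAI/chatgpt-detector-roberta-chinese"


def build_detector(model_id=MODEL_ID):
    """Create a text-classification pipeline for the detector checkpoint."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def classify(text, detector=None):
    """Return the top prediction as a dict with 'label' and 'score' keys.

    `detector` can be any callable with the pipeline's call signature;
    if omitted, the real model is downloaded and loaded.
    """
    if detector is None:
        detector = build_detector()
    # truncation=True keeps long answers within the model's input limit.
    predictions = detector(text, truncation=True)
    return max(predictions, key=lambda p: p["score"])
```

For example, `classify("今天天气很好，适合出去散步。")` would return the predicted label and its score. When scoring many texts, build the detector once with `build_detector()` and pass it in, rather than reloading the model on every call.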

	## Citation

	Check out this paper: [arXiv: 2301.07597](https://arxiv.org/abs/2301.07597)

	```
	@article{guo-etal-2023-hc3,
	title = "How Close is ChatGPT to Human Experts? Comparison Corpus, Evaluation, and Detection",
	author = "Guo, Biyang and
	Zhang, Xin and
	Wang, Ziyuan and
	Jiang, Minqi and
	Nie, Jinran and
	Ding, Yuxuan and
	Yue, Jianwei and
	Wu, Yupeng",
	journal = "arXiv preprint arXiv:2301.07597",
	year = "2023",
	}
	```