hugfaceguy0001/AITextDetector


hugfaceguy0001 committed on Sep 3, 2024

Commit 6cc6947 · verified · 1 Parent(s): 20e0f0f

Update README.md

Files changed (1)

README.md +35 -3


@@ -1,3 +1,35 @@
----
-license: openrail
----

+---
+license: openrail
+datasets:
+- Skywork/SkyPile-150B
+- wangrui6/Zhihu-KOL
+- silk-road/alpaca-data-gpt4-chinese
+language:
+- zh
+base_model: openai-community/gpt2
+pipeline_tag: text-classification
+tags:
+- text-classification
+---
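+# AI Text Detector
+This model is a fine-tuned version of gpt2 for text classification.
+The model supports three classes: `AI`, `zhihu`, and `other`. They denote AI-generated text, answers written by Zhihu users, and other text, respectively.
+## Training data
+About 52,000 answer texts from `alpaca-data-gpt4-chinese` serve as the AI-generated text, about 52,000 randomly selected answer texts from `Zhihu-KOL` serve as the Zhihu user answers, and about 52,000 randomly selected texts from `SkyPile-150B` serve as the other text.
+Together these form a dataset of about 156,000 labeled texts, of which 80% is used for training and 20% for testing.
+## Performance
+Classification accuracy: `accuracy = 0.9802627363024672`.
+The confusion matrix formed from each sample's true label and detection result is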
+|       | AI    | zhihu | other |
+|:-----:|:-----:|:-----:|:-----:|
+| AI    | 10325 | 326   | 18    |
+| zhihu | 143   | 9969  | 87    |
+| other | 0     | 42    | 10300 |
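
As a check on the figures in the diff above, the reported accuracy can be recomputed from the confusion matrix. The sketch below is not part of the commit; it is a minimal plain-Python illustration, and the variable names (`matrix`, `row_share`, `col_share`) are assumptions introduced here. The README does not say whether rows or columns hold the true labels, so the per-class figures are reported as row-wise and column-wise diagonal shares rather than labeled as recall or precision.

```python
# Confusion matrix copied from the README table.
# Row and column order: AI, zhihu, other.
# Assumption: which axis holds the true labels is not stated in the README.
labels = ["AI", "zhihu", "other"]
matrix = [
    [10325, 326, 18],
    [143, 9969, 87],
    [0, 42, 10300],
]

n = len(labels)

# Accuracy is the sum of the diagonal divided by the sum of all cells,
# so it is the same whichever axis holds the true labels.
total = sum(sum(row) for row in matrix)        # 31210 samples
correct = sum(matrix[i][i] for i in range(n))  # 30594 on the diagonal
accuracy = correct / total

# Diagonal share of each row and of each column. One of these is
# per-class recall and the other is per-class precision, depending on
# the orientation of the table.
row_share = {labels[i]: matrix[i][i] / sum(matrix[i]) for i in range(n)}
col_share = {
    labels[j]: matrix[j][j] / sum(matrix[i][j] for i in range(n))
    for j in range(n)
}

print(f"accuracy = {accuracy}")
for name in labels:
    print(f"{name}: row share = {row_share[name]:.4f}, "
          f"column share = {col_share[name]:.4f}")
```

With this table the accuracy works out to 30594 / 31210, which agrees with the value stated in the README. The total of 31,210 samples is also consistent with a 20% test split of roughly 156,000 texts.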