ChatGPT Informational (Wikipedia) Text Detector
Description
A model that detects whether a Korean Wikipedia-style text was written by ChatGPT.
It is a KoBigBird model fine-tuned on approximately 30,000 samples of human-written Wikipedia summaries and ChatGPT-generated Wikipedia summaries.
Our classifier based on this checkpoint reached a validation accuracy of 99.2% after 10 epochs of training.
Example Usage
import torch
import torch.nn as nn
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load the fine-tuned binary classifier and switch it to inference mode
model = AutoModelForSequenceClassification.from_pretrained("LDKSolutions/Ko-Wiki-ChatGPT-Detector-v1", num_labels=2)
model.eval()
tokenizer = AutoTokenizer.from_pretrained("LDKSolutions/Ko-Wiki-ChatGPT-Detector-v1")
# This text was generated by ChatGPT (GPT-3.5); it stays in Korean because the detector targets Korean text
text = '''스마트 컨트랙트는 블록체인 기술에서 사용되는 프로그램 코드의 일종으로, 계약 조건을 자동으로 검증하고 실행하는 프로그램입니다. 스마트 컨트랙트는 블록체인 네트워크 상에서 실행되며, 블록체인에서 발생하는 모든 트랜잭션은 스마트 컨트랙트를 통해 처리됩니다.
스마트 컨트랙트는 조건과 실행 코드로 이루어져 있습니다. 예를 들어, A가 B에게 1,000달러를 지불해야하는 계약이 있다면, 이 계약의 조건을 스마트 컨트랙트로 작성하여 계약이 자동으로 실행되도록 할 수 있습니다. 이러한 스마트 컨트랙트는 자동으로 조건을 검증하고, 지정된 조건이 충족되었을 때 계약의 실행 코드를 실행하여 계약을 이행합니다. 이를 통해 계약 당사자는 서로를 신뢰하지 않아도 안전하게 거래를 진행할 수 있습니다.
스마트 컨트랙트는 블록체인에서 실행되기 때문에 모든 거래 내역이 투명하게 기록되며, 중개인이나 중앙 기관의 개입이 없기 때문에 거래 비용이 줄어듭니다. 또한 스마트 컨트랙트는 코드로 작성되기 때문에 자동화가 가능하며, 프로그램에 따른 비즈니스 로직을 실행하는 것이 가능합니다. 따라서 스마트 컨트랙트는 블록체인 기술의 핵심 기능 중 하나로, 분산형 애플리케이션(DApp) 개발 등에 활용되고 있습니다.'''
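# Tokenize, truncating or padding to 512 tokens
encoded_inputs = tokenizer(text, max_length=512, truncation=True, padding="max_length", return_tensors="pt")
with torch.no_grad():
    outputs = model(**encoded_inputs).logits
# Softmax over the class dimension (an explicit dim avoids the implicit-dim warning)
probability = nn.Softmax(dim=-1)(outputs)
predicted_class = torch.argmax(probability, dim=-1).item()
# Class 1 = ChatGPT-generated, class 0 = human-written
if predicted_class == 1:
    print("This text was most likely written by ChatGPT!")
else:
    print("This text was most likely written by a human!")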
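The final step of the example, turning the two logits into a label and a confidence score, can be sketched on its own without loading the model. This is a minimal illustration, not part of the model's API: `softmax`, `interpret_logits`, and `LABELS` are hypothetical helper names, the logits are made-up values, and the only thing taken from the example is the mapping of class 1 to ChatGPT and class 0 to human.

```python
import math

# Label mapping taken from the example above: index 1 = ChatGPT, index 0 = human.
LABELS = {0: "human", 1: "ChatGPT"}


def softmax(logits):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def interpret_logits(logits):
    """Return (label, probability) for one pair of classifier logits."""
    probs = softmax(logits)
    predicted_class = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[predicted_class], probs[predicted_class]


# Hypothetical logits for illustration only, not real model output.
label, confidence = interpret_logits([-2.0, 3.0])
print(f"{label} ({confidence:.3f})")
```

With the real model, the logits for a single input could be passed in as `outputs[0].tolist()`; reporting the probability alongside the label makes borderline cases easier to spot than the bare argmax.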