WangKaiLin committed on
Commit 39b9688 · verified · 1 Parent(s): e1d1b36

Update README.md

Files changed (1)
  1. README.md +157 -154
README.md CHANGED
@@ -1,155 +1,158 @@
- ---
- language:
- - zh
- - en
- tags:
- - embeddings
- - retrieval
- - transformer-free
- - safetensors
- - edge-ai
- license: mit
- ---
-
- # CleanOwl — AI Slop Detector
-
- **I hate AI-SLOP SO I MADE THIS.**
-
- ![CleanOwl](./CleanOwl.png)
-
- CleanOwl is a lightweight **human-likeness scoring engine**.
-
- It estimates how “human” a piece of text feels — not by classification,
- but by analyzing structural signals such as:
-
- - token distribution irregularity
- - semantic continuity
- - punctuation behavior
-
- No transformers. No fine-tuning. Just statistical signals.
-
- ---
-
- ## 🧠 What it actually measures
-
- CleanOwl does **not** directly detect AI.
-
- Instead, it measures how **smooth vs irregular** a piece of writing is:
-
- - Human writing → irregular, biased, “spiky”
- - AI / formal text → smooth, evenly distributed
-
- ---
-
- ## 📊 Score Interpretation
-
- | Score | Meaning |
- |------|--------|
- | < 60 | Likely AI-generated / formal text |
- | 60–75 | Mixed / ambiguous |
- | > 75 | Likely human-like message |
-
- > This is a heuristic scoring system, not a classifier.
-
- ---
-
- ## ⚠️ Limitations
-
- - Short sentences may be unstable
- - Highly polished human writing (e.g. essays, Wikipedia) may look AI-like
- - AI can mimic human irregularity
-
- This is a **lightweight detector**, not a definitive AI classifier.
-
- ---
-
- ## Quickstart
-
- ### 1️⃣ Install
-
- ```bash
- git clone https://huggingface.co/WangKaiLin/CleanOwl-AI-Slop-Detector
- cd CleanOwl-AI-Slop-Detector
-
- pip install numpy safetensors fastapi uvicorn
- ```
-
- ### 2️⃣ Run Local API
- uvicorn app:app --host 127.0.0.1 --port 8000 --reload
-
- Open in browser:
-
- http://127.0.0.1:8000/docs
-
- If you see /detect, the API is running correctly.
-
- ### 3️⃣ Chrome Extension Setup
-
- CleanOwl works via a local API + Chrome extension.
-
- Open Chrome:
- chrome://extensions/
- Enable Developer Mode (top right)
- Click Load unpacked
- Select:
- CleanOwl-AI-Slop-Detector/extension/
- Refresh any webpage (Ctrl + R)
-
- 👉 CleanOwl will now analyze the page automatically.
-
- ### 🔒 Privacy
-
- CleanOwl runs entirely on your local machine.
- No data is sent to any external server.
-
- ### Usage
- ```bash
- # CLI (scoring)
- python ai_score.py
-
- # Embedding demo
- python quickstart.py
- ```
-
- ## Extension perform
-
- ![detect](./detect.png)
-
- ## Example(ai_score.py)
-
- ```bash
- 請輸入文字:先思考:在 AI 時代,什麼樣的人才不會被取代?我的答案是:具備溝通能力的人、擁有韌性的人,以及始終願意站在第一線的人。
-
- human score: 47.13
- label: ai_slop_like
-
- 請輸入文字:身為專業的肥宅 都會把脂肪放在身上
-
- human score: 76.88
- label: maybe_human_like
- ```
-
- ## Repository Structure
-
- ```bash
- CleanOwl-AI-Slop-Detector/
- ├─ ai_score.py # scoring logic (CleanOwl core)
- ├─ quickstart.py # embedding demo CLI
- ├─ engine.py # PipeOwl tokenizer + embedding loader
- ├─ pipeowl.safetensors # embeddings + delta_field
- ├─ tokenizer.json
- ├─ ptt.npy # style field (PTT-like distribution)
- ├─ config.json
- ├─ app.py # FastAPI server
- ├─ requirements.txt
- ├─ extension/
- ├─ content.js # Chrome content script
- │ └manifest.json
- ├─ example.md
- ├─ README.md
- └─ LICENSE
- ```
-
- ## LICENSE
-
  MIT
+ ---
+ language:
+ - zh
+ - en
+ tags:
+ - embeddings
+ - retrieval
+ - transformer-free
+ - safetensors
+ - edge-ai
+ license: mit
+ ---
+
+ # CleanOwl — AI Slop Detector
+
+ **I hate AI-SLOP SO I MADE THIS.**
+
+ ![CleanOwl](./CleanOwl.png)
+
+ CleanOwl is a lightweight **human-likeness scoring engine**.
+
+ It estimates how “human” a piece of text feels — not by classification,
+ but by analyzing structural signals such as:
+
+ - token distribution irregularity
+ - semantic continuity
+ - punctuation behavior
+
+ No transformers. No fine-tuning. Just statistical signals.
+
+ ---
+
+ ## 🧠 What it actually measures
+
+ CleanOwl does **not** directly detect AI.
+
+ Instead, it measures how **smooth vs irregular** a piece of writing is:
+
+ - Human writing → irregular, biased, “spiky”
+ - AI / formal text → smooth, evenly distributed
+
+ ---
+
+ ## 📊 Score Interpretation
+
+ | Score | Meaning |
+ |------|--------|
+ | < 60 | Likely AI-generated / formal text |
+ | 60–75 | Mixed / ambiguous |
+ | > 75 | Likely human-like message |
+
+ > This is a heuristic scoring system, not a classifier.
+
+ ---
+
+ ## ⚠️ Limitations
+
+ - Short sentences may be unstable
+ - Highly polished human writing (e.g. essays, Wikipedia) may look AI-like
+ - AI can mimic human irregularity
+
+ This is a **lightweight detector**, not a definitive AI classifier.
+
+ ---
+
+ ## Quickstart
+
+ ### 1️⃣ Install
+
+ ```bash
+ git clone https://huggingface.co/WangKaiLin/CleanOwl-AI-Slop-Detector
+ cd CleanOwl-AI-Slop-Detector
+
+ pip install numpy safetensors fastapi uvicorn
+ ```
+
+ ### 2️⃣ Run Local API
+
+ ```bash
+ uvicorn app:app --host 127.0.0.1 --port 8000 --reload
+ ```
+
+ Open in browser:
+
+ http://127.0.0.1:8000/docs
+
+ If you see /detect, the API is running correctly.
+
+ ### 3️⃣ Chrome Extension Setup
+
+ CleanOwl works via a local API + Chrome extension.
+
+ Open Chrome:
+ chrome://extensions/
+ Enable Developer Mode (top right)
+ Click Load unpacked
+ Select:
+ CleanOwl-AI-Slop-Detector/extension/
+ Refresh any webpage (Ctrl + R)
+
+ 👉 CleanOwl will now analyze the page automatically.
+
+ ### 🔒 Privacy
+
+ CleanOwl runs entirely on your local machine.
+ No data is sent to any external server.
+
+ ### Usage
+ ```bash
+ # CLI (scoring)
+ python ai_score.py
+
+ # Embedding demo
+ python quickstart.py
+ ```
+
+ ## Extension perform
+
+ ![detect](./detect.png)
+
+ ## Example(ai_score.py)
+
+ ```bash
+ 請輸入文字:先思考:在 AI 時代,什麼樣的人才不會被取代?我的答案是:具備溝通能力的人、擁有韌性的人,以及始終願意站在第一線的人。
+
+ human score: 47.13
+ label: ai_slop_like
+
+ 請輸入文字:身為專業的肥宅 都會把脂肪放在身上
+
+ human score: 76.88
+ label: maybe_human_like
+ ```
+
+ ## Repository Structure
+
+ ```bash
+ CleanOwl-AI-Slop-Detector/
+ ├─ ai_score.py # scoring logic (CleanOwl core)
+ ├─ quickstart.py # embedding demo CLI
+ ├─ engine.py # PipeOwl tokenizer + embedding loader
+ ├─ pipeowl.safetensors # embeddings + delta_field
+ ├─ tokenizer.json
+ ├─ ptt.npy # style field (PTT-like distribution)
+ ├─ config.json
+ ├─ app.py # FastAPI server
+ requirements.txt
+ ├─ extension/
+ ├─ content.js # Chrome content script
+ └─ manifest.json
+ ├─ example.md
+ ├─ README.md
+ └─ LICENSE
+ ```
+
+ ## LICENSE
+
  MIT
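
As a reading aid for the Score Interpretation table in this README, the three bands can be sketched in Python. This is a minimal sketch, assuming the table's thresholds (below 60, 60 to 75, above 75) apply to the `human score` value that `ai_score.py` prints. The function name `interpret_score` and the returned strings are hypothetical and are not taken from the repository, whose own labels (such as `ai_slop_like`) may be assigned with different cutoffs.

```python
def interpret_score(score: float) -> str:
    """Map a human-likeness score to the README's interpretation bands.

    Hypothetical helper: the thresholds come from the README table,
    not from the repository's scoring code.
    """
    if score < 60:
        return "likely AI-generated / formal text"
    if score <= 75:
        # The table lists 60-75 as one band, so both ends are included here.
        return "mixed / ambiguous"
    return "likely human-like message"


if __name__ == "__main__":
    # Scores taken from the README's example session.
    for s in (47.13, 76.88):
        print(f"{s:.2f} -> {interpret_score(s)}")
```

Under this reading, the README's two example scores, 47.13 and 76.88, fall in the first and third bands respectively.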