---
language:
- zh
- en
tags:
- embeddings
- retrieval
- transformer-free
- safetensors
- edge-ai
license: mit
---

# CleanOwl — AI Slop Detector

**I hate AI-SLOP SO I MADE THIS.**

![CleanOwl](./CleanOwl.png)

CleanOwl is a lightweight **human-likeness scoring engine**.

It estimates how “human” a piece of text feels — not by classification,  
but by analyzing structural signals such as:

- token distribution irregularity  
- semantic continuity  
- punctuation behavior  

No transformers. No fine-tuning. Just statistical signals.
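
As a rough illustration of what such statistical signals can look like, here is a minimal sketch of two of them: distribution irregularity (approximated by character-level Shannon entropy) and punctuation behavior (approximated by a punctuation ratio). The function names, the punctuation set, and the choice of characters instead of tokens are assumptions made for this example; they are not taken from `ai_score.py`. Semantic continuity is left out because it would need the embeddings.

```python
import math
from collections import Counter

# Hypothetical punctuation set covering common English and Chinese marks.
PUNCTUATION = set(",.!?;:,。!?;:、…")


def char_entropy(text: str) -> float:
    """Shannon entropy (in bits) of the character distribution, ignoring whitespace."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    total = len(chars)
    counts = Counter(chars)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def punctuation_ratio(text: str) -> float:
    """Fraction of non-whitespace characters that are punctuation marks."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum(c in PUNCTUATION for c in chars) / len(chars)
```

Both functions make a single pass over the text, which is consistent with a transformer-free, CPU-only design.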

---

## Performance

- ~0.04 ms per token
- ~0.8 ms per sentence (typical)
- ~120 ms startup

Runs entirely on CPU.

Time complexity is linear in the number of tokens: O(n).
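
To check figures like these on your own machine, a small timing harness along the following lines may help. It is only a sketch: `score_fn` stands for whatever scoring callable you want to measure, and counting one character as one token is an assumption that will not match the real tokenizer.

```python
import time


def ms_per_token(score_fn, texts, repeats=100):
    """Average wall-clock milliseconds per token for score_fn over texts.

    Assumption: one character counts as one token.
    """
    total_tokens = sum(len(t) for t in texts) * repeats
    if total_tokens == 0:
        raise ValueError("need at least one non-empty text and repeats > 0")
    start = time.perf_counter()
    for _ in range(repeats):
        for t in texts:
            score_fn(t)
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / total_tokens
```

For example, `ms_per_token(len, ["hello world"])` times the built-in `len` as a stand-in scorer; swap in the real scoring function to get meaningful numbers.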

---

## 🧠 What it actually measures

CleanOwl does **not** directly detect AI.

Instead, it measures how **smooth vs irregular** a piece of writing is:

- Human writing → irregular, biased, “spiky”
- AI / formal text → smooth, evenly distributed
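
One simple way to picture "smooth vs irregular" is the variation in sentence length. The sketch below computes the coefficient of variation (standard deviation divided by mean) of sentence lengths. This is an illustrative stand-in, not CleanOwl's actual measure, and the name `spikiness` is made up for this example.

```python
import re
import statistics


def sentence_lengths(text: str) -> list[int]:
    """Split on common English and Chinese sentence enders; return lengths in characters."""
    parts = re.split(r"[.!?。!?]+", text)
    return [len(p.strip()) for p in parts if p.strip()]


def spikiness(text: str) -> float:
    """Coefficient of variation of sentence lengths (0.0 means perfectly even)."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0  # not enough sentences to measure variation
    return statistics.pstdev(lengths) / statistics.mean(lengths)
```

Evenly sized sentences give a value near 0, while a mix of very short and very long sentences gives a larger value.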

---

## 📊 Score Interpretation

| Score | Meaning |
|------|--------|
| < 60 | Likely AI-generated / formal text |
| 60–75 | Mixed / ambiguous |
| > 75 | Likely human-like text |

> This is a heuristic scoring system, not a classifier.
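
The thresholds in the table can be expressed as a small mapping function. This is a sketch: the labels `ai_slop_like` and `maybe_human_like` appear in the example output of `ai_score.py`, but the `mixed` label and the handling of the exact boundary values (60 and 75) are assumptions.

```python
def label_for(score: float) -> str:
    """Map a human-likeness score to a coarse label using the table's thresholds."""
    if score < 60:
        return "ai_slop_like"
    if score <= 75:
        return "mixed"  # hypothetical label for the ambiguous band
    return "maybe_human_like"
```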

---

## ⚠️ Limitations

- Scores for short sentences may be unstable  
- Highly polished human writing (e.g. essays, Wikipedia) may look AI-like  
- AI can mimic human irregularity  

This is a **lightweight detector**, not a definitive AI classifier.

---

## Quickstart

### 1️⃣ Install

```bash
git clone https://huggingface.co/WangKaiLin/CleanOwl-AI-Slop-Detector
cd CleanOwl-AI-Slop-Detector

pip install numpy safetensors fastapi uvicorn
```

### 2️⃣ Run Local API

```bash
uvicorn app:app --host 127.0.0.1 --port 8000 --reload
```

Open the interactive docs in your browser:

http://127.0.0.1:8000/docs

If `/detect` is listed there, the API is running correctly.
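
Once the server is up, `/detect` can also be called from a script. The sketch below uses only the Python standard library. The request shape (a POST with a JSON body of the form `{"text": ...}`) and the JSON response are assumptions; check the schema shown at `/docs` and adjust accordingly.

```python
import json
import urllib.request

API_URL = "http://127.0.0.1:8000/detect"


def build_request(text: str, url: str = API_URL) -> urllib.request.Request:
    """Build a JSON POST request. The {"text": ...} body is an assumed schema."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def detect(text: str, url: str = API_URL, timeout: float = 5.0) -> dict:
    """Send text to the local API and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(text, url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))

# Usage, with the server running:
#   print(detect("some text to score"))
```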

### 3️⃣ Chrome Extension Setup

CleanOwl works via a local API + Chrome extension.

1. Open `chrome://extensions/` in Chrome.
2. Enable **Developer mode** (top right).
3. Click **Load unpacked**.
4. Select the `CleanOwl-AI-Slop-Detector/extension/` folder.
5. Refresh any webpage (Ctrl + R).

👉 CleanOwl will now analyze the page automatically.

### 🔒 Privacy

CleanOwl runs entirely on your local machine.
No data is sent to any external server.

### Usage
```bash
# CLI (scoring)
python ai_score.py

# Embedding demo
python quickstart.py
```

## Extension in Action

![detect](./detect.png)

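## Example (`ai_score.py`)

```text
請輸入文字:先思考:在 AI 時代,什麼樣的人才不會被取代?我的答案是:具備溝通能力的人、擁有韌性的人,以及始終願意站在第一線的人。

human score: 47.13
label: ai_slop_like

請輸入文字:身為專業的肥宅 都會把脂肪放在身上

human score: 76.88
label: maybe_human_like
```

The prompt `請輸入文字:` means "Please enter text:". In English, the two inputs read roughly as:

- "Think first: in the AI era, what kind of person will not be replaced? My answer: people with communication skills, people with resilience, and people who are always willing to stand on the front line." (scored 47.13, `ai_slop_like`)
- "As a professional couch-potato nerd, of course I keep my fat on my body." (scored 76.88, `maybe_human_like`)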

## Repository Structure

```text
CleanOwl-AI-Slop-Detector/
├─ ai_score.py          # scoring logic (CleanOwl core)
├─ quickstart.py        # embedding demo CLI
├─ engine.py            # PipeOwl tokenizer + embedding loader
├─ pipeowl.safetensors  # embeddings + delta_field
├─ tokenizer.json
├─ ptt.npy              # style field (PTT-like distribution)
├─ config.json
├─ app.py               # FastAPI server
├─ requirements.txt
├─ extension/
│  ├─ content.js        # Chrome content script
│  └─ manifest.json
├─ example.md
├─ README.md
└─ LICENSE
```

## LICENSE

MIT