---
language:
- zh
- en
tags:
- embeddings
- retrieval
- transformer-free
- safetensors
- edge-ai
license: mit
---
# CleanOwl — AI Slop Detector
**I hate AI-SLOP SO I MADE THIS.**

CleanOwl is a lightweight **human-likeness scoring engine**.
It estimates how “human” a piece of text feels — not by classification,
but by analyzing structural signals such as:
- token distribution irregularity
- semantic continuity
- punctuation behavior

No transformers. No fine-tuning. Just statistical signals.
---
## Performance
- ~0.04 ms per token
- ~0.8 ms per sentence (typical)
- ~120 ms startup

Runs entirely on CPU. Time complexity is linear in the input length: O(n).
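To check per-token figures like the ones above on your own machine, a small timing harness is enough. The sketch below is only an illustration: `ms_per_token` and `toy_score` are hypothetical names, and `toy_score` is a placeholder scorer, not CleanOwl's scoring logic. Swap in the real scoring function to measure it.

```python
import time
from typing import Callable, Sequence


def ms_per_token(score_fn: Callable[[Sequence[str]], float],
                 tokens: Sequence[str],
                 repeats: int = 100) -> float:
    """Average wall-clock milliseconds spent per token over `repeats` runs."""
    if not tokens or repeats < 1:
        raise ValueError("need at least one token and one repeat")
    start = time.perf_counter()
    for _ in range(repeats):
        score_fn(tokens)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return elapsed_ms / (repeats * len(tokens))


def toy_score(tokens: Sequence[str]) -> float:
    """Hypothetical stand-in scorer (share of distinct tokens). Not CleanOwl's logic."""
    return len(set(tokens)) / len(tokens)


sample = "this is only a placeholder sentence for timing".split()
print(f"{ms_per_token(toy_score, sample):.5f} ms per token")
```

Results depend on hardware and input length, so expect numbers that differ from the figures listed here.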
---
## 🧠 What it actually measures
CleanOwl does **not** directly detect AI.
Instead, it measures how **smooth vs irregular** a piece of writing is:
- Human writing → irregular, biased, “spiky”
- AI / formal text → smooth, evenly distributed
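One way to make "smooth vs spiky" concrete is the normalized entropy of a token frequency distribution. The sketch below illustrates that idea only. It is an assumption chosen for illustration, not the signal CleanOwl computes, and `normalized_entropy` is a hypothetical helper.

```python
import math
from collections import Counter
from typing import Sequence


def normalized_entropy(tokens: Sequence[str]) -> float:
    """Shannon entropy of the token frequencies, scaled to [0, 1].

    1.0 means every distinct token appears equally often (evenly distributed);
    lower values mean a few tokens dominate (a spikier distribution).
    """
    counts = Counter(tokens)
    if len(counts) < 2:
        return 0.0  # zero or one distinct token: no spread to measure
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return entropy / math.log2(len(counts))


even_tokens = list("abcdabcd")   # four tokens, each used twice
spiky_tokens = list("aaaaaabc")  # one token dominates

print(normalized_entropy(even_tokens))
print(normalized_entropy(spiky_tokens))
```

The evenly distributed sequence scores higher than the dominated one, which matches the "smooth" versus "spiky" contrast described above.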
---
## 📊 Score Interpretation
| Score | Meaning |
|------|--------|
| < 60 | Likely AI-generated / formal text |
| 60–75 | Mixed / ambiguous |
| > 75 | Likely human-like message |
> This is a heuristic scoring system, not a classifier.
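The bands in the table can be applied in code with a small helper. This is a minimal sketch: `interpret_score` is a hypothetical name, and treating exactly 60 and exactly 75 as "Mixed / ambiguous" is an assumption based on the 60–75 row.

```python
def interpret_score(score: float) -> str:
    """Map a human-likeness score to the band meanings from the table."""
    if score < 60:
        return "Likely AI-generated / formal text"
    if score <= 75:  # assumption: both boundaries belong to the middle band
        return "Mixed / ambiguous"
    return "Likely human-like message"


# Scores taken from the ai_score.py example in this README.
print(interpret_score(47.13))
print(interpret_score(76.88))
```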
---
## ⚠️ Limitations
- Short sentences may be unstable
- Highly polished human writing (e.g. essays, Wikipedia) may look AI-like
- AI can mimic human irregularity
This is a **lightweight detector**, not a definitive AI classifier.
---
## Quickstart
### 1️⃣ Install
```bash
git clone https://huggingface.co/WangKaiLin/CleanOwl-AI-Slop-Detector
cd CleanOwl-AI-Slop-Detector
pip install numpy safetensors fastapi uvicorn
```
### 2️⃣ Run Local API
```bash
uvicorn app:app --host 127.0.0.1 --port 8000 --reload
```
Then open http://127.0.0.1:8000/docs in a browser.

If `/detect` is listed there, the API is running correctly.
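Once the server is up, `/detect` can also be called from a script. The sketch below uses only the Python standard library. The HTTP method (POST) and the JSON field name `text` are assumptions, since this README does not document the request schema. Check the `/docs` page for the actual shape before relying on it. `build_detect_request` and `detect` are hypothetical helper names.

```python
import json
import urllib.request

# Host and port match the uvicorn command above.
API_URL = "http://127.0.0.1:8000/detect"


def build_detect_request(text: str) -> urllib.request.Request:
    """Build a JSON POST request for /detect (assumed schema: {"text": ...})."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def detect(text: str, timeout: float = 5.0) -> dict:
    """Send the request to the local server and decode its JSON reply."""
    with urllib.request.urlopen(build_detect_request(text), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, `detect("some text")` returns whatever JSON the endpoint produces. If the server is not running, the call raises `urllib.error.URLError`.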
### 3️⃣ Chrome Extension Setup
CleanOwl works through the local API plus a Chrome extension.

1. Open `chrome://extensions/` in Chrome.
2. Enable Developer Mode (top right).
3. Click Load unpacked.
4. Select `CleanOwl-AI-Slop-Detector/extension/`.
5. Refresh any webpage (Ctrl + R).

👉 CleanOwl will now analyze the page automatically.
### 🔒 Privacy
CleanOwl runs entirely on your local machine.
No data is sent to any external server.
### Usage
```bash
# CLI (scoring)
python ai_score.py
# Embedding demo
python quickstart.py
```
## Extension Performance

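## Example (ai_score.py)
```bash
# The prompt "請輸入文字:" means "Enter text:".
# Input 1 in English: "Think first: in the AI era, what kind of person will not be replaced? My answer: people who can communicate, people with resilience, and people who are always willing to stand on the front line."
請輸入文字:先思考:在 AI 時代,什麼樣的人才不會被取代?我的答案是:具備溝通能力的人、擁有韌性的人,以及始終願意站在第一線的人。
human score: 47.13
label: ai_slop_like
# Input 2 in English: "As a professional fat otaku, I of course keep my fat on my body."
請輸入文字:身為專業的肥宅 都會把脂肪放在身上
human score: 76.88
label: maybe_human_like
```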
## Repository Structure
```bash
CleanOwl-AI-Slop-Detector/
├─ ai_score.py # scoring logic (CleanOwl core)
├─ quickstart.py # embedding demo CLI
├─ engine.py # PipeOwl tokenizer + embedding loader
├─ pipeowl.safetensors # embeddings + delta_field
├─ tokenizer.json
├─ ptt.npy # style field (PTT-like distribution)
├─ config.json
├─ app.py # FastAPI server
├─ requirements.txt
├─ extension/
│ ├─ content.js # Chrome content script
│ └─ manifest.json
├─ example.md
├─ README.md
└─ LICENSE
```
## LICENSE
MIT