---
language:
- zh
- en
tags:
- embeddings
- retrieval
- transformer-free
- safetensors
- edge-ai
license: mit
---
# CleanOwl — AI Slop Detector
**I hate AI-SLOP SO I MADE THIS.**
![CleanOwl](./CleanOwl.png)
CleanOwl is a lightweight **human-likeness scoring engine**.
It estimates how “human” a piece of text feels — not by classification,
but by analyzing structural signals such as:
- token distribution irregularity
- semantic continuity
- punctuation behavior
No transformers. No fine-tuning. Just statistical signals.
---
## Performance
- ~0.04 ms per token
- ~0.8 ms per sentence (typical)
- ~120 ms startup
Runs entirely on CPU.
Time complexity is linear in the number of tokens: O(n)
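
To check these figures on your own hardware, a minimal timing harness along the following lines may help. It is only a sketch: `ms_per_call` and `placeholder_score` are hypothetical names, and `placeholder_score` is a stand-in that should be swapped for whatever scoring function `ai_score.py` actually exposes.

```python
import time


def ms_per_call(score_fn, text, repeats=1000):
    """Return the average wall-clock milliseconds per call of score_fn(text)."""
    # Warm-up call so one-time setup cost is not counted in the average.
    score_fn(text)
    start = time.perf_counter()
    for _ in range(repeats):
        score_fn(text)
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / repeats


def placeholder_score(text):
    # ASSUMPTION: stand-in scorer for illustration only, NOT CleanOwl's logic.
    return float(len(text))


sentence = "Replace this with a typical sentence."
print(f"{ms_per_call(placeholder_score, sentence):.4f} ms per sentence")
```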
---
## 🧠 What it actually measures
CleanOwl does **not** directly detect AI.
Instead, it measures how **smooth vs irregular** a piece of writing is:
- Human writing → irregular, biased, “spiky”
- AI / formal text → smooth, evenly distributed
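
As a rough illustration of the "smooth vs irregular" idea, the toy metric below computes the coefficient of variation of sentence lengths: evenly sized sentences give a value near 0, while a mix of short and long sentences gives a larger one. This is not CleanOwl's scoring logic, which combines several signals; `sentence_length_irregularity` is a hypothetical helper written only for this example.

```python
import re
from statistics import mean, pstdev


def sentence_length_irregularity(text):
    """Coefficient of variation of sentence lengths (0.0 means perfectly even)."""
    # Split on common Latin and CJK sentence-ending punctuation.
    sentences = [s.strip() for s in re.split(r"[.!?。!?]+", text) if s.strip()]
    if len(sentences) < 2:
        # A single sentence has no variation to measure.
        return 0.0
    lengths = [len(s) for s in sentences]
    return pstdev(lengths) / mean(lengths)


smooth = "The model is fast. The model is small. The model is cheap."
spiky = "No. I really did not expect that to work on the first try. Wow."
print(sentence_length_irregularity(smooth) < sentence_length_irregularity(spiky))
```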
---
## 📊 Score Interpretation
| Score | Meaning |
|------|--------|
| < 60 | Likely AI-generated / formal text |
| 60–75 | Mixed / ambiguous |
| > 75 | Likely human-like text |
> This is a heuristic scoring system, not a classifier.
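
The table can be read as a simple threshold function, sketched below. The labels `ai_slop_like` and `maybe_human_like` appear in the example output of `ai_score.py`; the `mixed` label for the middle band and the exact boundary handling are assumptions, so the real script may differ.

```python
def interpret_score(score):
    """Map a human-likeness score to a coarse label using the table's bands."""
    if score < 60:
        return "ai_slop_like"
    if score <= 75:
        # ASSUMPTION: hypothetical label; the source does not name this band.
        return "mixed"
    return "maybe_human_like"


for value in (47.13, 68.0, 76.88):
    print(value, interpret_score(value))
```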
---
## ⚠️ Limitations
- Scores for short sentences may be unstable
- Highly polished human writing (e.g. essays, Wikipedia) may look AI-like
- AI can mimic human irregularity
This is a **lightweight detector**, not a definitive AI classifier.
---
## Quickstart
### 1️⃣ Install
```bash
git clone https://huggingface.co/WangKaiLin/CleanOwl-AI-Slop-Detector
cd CleanOwl-AI-Slop-Detector
pip install numpy safetensors fastapi uvicorn
```
### 2️⃣ Run Local API
```bash
uvicorn app:app --host 127.0.0.1 --port 8000 --reload
```
Open http://127.0.0.1:8000/docs in a browser.
If the `/detect` endpoint is listed, the API is running correctly.
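
Once the server is up, the endpoint can also be called from a script. The sketch below uses only the standard library and assumes that `/detect` accepts a `POST` with a JSON body of the form `{"text": "..."}`; that schema is a guess, so check the `/docs` page for the real request and response shapes. `build_detect_request` and `detect` are hypothetical helper names.

```python
import json
import urllib.request


def build_detect_request(text, base_url="http://127.0.0.1:8000"):
    """Build a POST request for the local /detect endpoint."""
    # ASSUMPTION: the API expects {"text": ...}; verify against /docs.
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/detect",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def detect(text, base_url="http://127.0.0.1:8000", timeout=5.0):
    """Send text to the local server and return the parsed JSON response."""
    request = build_detect_request(text, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `detect("some text")` returns whatever JSON the endpoint produces. The request only goes to `127.0.0.1`, in keeping with the local-only design.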
### 3️⃣ Chrome Extension Setup
CleanOwl works through the local API plus a Chrome extension.

1. Open `chrome://extensions/` in Chrome.
2. Enable Developer Mode (top right).
3. Click **Load unpacked**.
4. Select `CleanOwl-AI-Slop-Detector/extension/`.
5. Refresh any webpage (Ctrl + R).

👉 CleanOwl will now analyze the page automatically.
### 🔒 Privacy
CleanOwl runs entirely on your local machine.
No data is sent to any external server.
### Usage
```bash
# CLI (scoring)
python ai_score.py
# Embedding demo
python quickstart.py
```
## Extension in action
![detect](./detect.png)
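## Example (ai_score.py)
```text
請輸入文字:先思考:在 AI 時代,什麼樣的人才不會被取代?我的答案是:具備溝通能力的人、擁有韌性的人,以及始終願意站在第一線的人。
human score: 47.13
label: ai_slop_like
請輸入文字:身為專業的肥宅 都會把脂肪放在身上
human score: 76.88
label: maybe_human_like
```
The prompt `請輸入文字:` means "Enter text:". The first input reads roughly: "Think first: in the AI era, what kind of person will not be replaced? My answer: people who can communicate, people who are resilient, and people who are always willing to stand on the front line." The second reads roughly: "As a professional fat otaku, you always keep your fat on you."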
## Repository Structure
```bash
CleanOwl-AI-Slop-Detector/
├─ ai_score.py # scoring logic (CleanOwl core)
├─ quickstart.py # embedding demo CLI
├─ engine.py # PipeOwl tokenizer + embedding loader
├─ pipeowl.safetensors # embeddings + delta_field
├─ tokenizer.json
├─ ptt.npy # style field (PTT-like distribution)
├─ config.json
├─ app.py # FastAPI server
├─ requirements.txt
├─ extension/
│ ├─ content.js # Chrome content script
│ └─ manifest.json
├─ example.md
├─ README.md
└─ LICENSE
```
## LICENSE
MIT