	---
	language:
	- zh
	- en
	tags:
	- embeddings
	- retrieval
	- transformer-free
	- safetensors
	- edge-ai
	license: mit
	---

	# CleanOwl — AI Slop Detector

	I hate AI slop, so I made this.

	![CleanOwl](./CleanOwl.png)

	CleanOwl is a lightweight human-likeness scoring engine.

	It estimates how “human” a piece of text feels, not by classification
	but by analyzing structural signals such as:

	- token distribution irregularity
	- semantic continuity
	- punctuation behavior

	No transformers. No fine-tuning. Just statistical signals.
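
To make the idea of "statistical signals" concrete, here is a minimal sketch of two toy signals in plain Python. It is not CleanOwl's actual scoring code: the function names, the punctuation set, and the choice of normalized entropy as a stand-in for "token distribution irregularity" are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical punctuation set covering common English and Chinese marks.
PUNCTUATION = set(".,!?;:…、，。！？；：")


def normalized_entropy(tokens):
    """Shannon entropy of the token frequencies, scaled to [0, 1].

    1.0 means every distinct token is equally frequent (perfectly even);
    lower values mean a few tokens dominate (more irregular).
    """
    counts = Counter(tokens)
    if len(counts) < 2:
        return 0.0
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return entropy / math.log2(len(counts))


def punctuation_ratio(text):
    """Fraction of non-whitespace characters that are punctuation marks."""
    chars = [ch for ch in text if not ch.isspace()]
    if not chars:
        return 0.0
    return sum(ch in PUNCTUATION for ch in chars) / len(chars)
```

Both functions run in a single pass over the input, which is the kind of cost profile a transformer-free approach relies on.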

	---

	## Performance

	- ~0.04 ms per token
	- ~0.8 ms per sentence (typical)
	- ~120 ms startup

	Runs entirely on CPU.

	Time complexity is linear in the input length: O(n), where n is the number of tokens.
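
As a rough worked example of what linear cost means here, the sketch below turns the figures above into a back-of-the-envelope estimate. The function name and the idea of adding startup as a one-time constant are assumptions; real timings depend on your hardware.

```python
MS_PER_TOKEN = 0.04  # per-token figure quoted above
STARTUP_MS = 120.0   # one-time startup figure quoted above


def estimated_latency_ms(n_tokens, include_startup=False):
    """Linear cost model: time grows in proportion to the token count."""
    if n_tokens < 0:
        raise ValueError("n_tokens must be non-negative")
    cost = MS_PER_TOKEN * n_tokens
    return cost + STARTUP_MS if include_startup else cost
```

Under this model a 20-token sentence costs about 0.8 ms, which lines up with the typical per-sentence figure, and a cold run over 1,000 tokens costs about 160 ms, most of it startup.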

	---

	## 🧠 What it actually measures

	CleanOwl does not directly detect AI.

	Instead, it measures how smooth vs irregular a piece of writing is:

	- Human writing → irregular, biased, “spiky”
	- AI / formal text → smooth, evenly distributed
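
One simple way to picture "smooth vs spiky" is to look at how much sentence lengths vary. The sketch below uses the coefficient of variation of sentence lengths as a proxy. This is an illustrative assumption, not the measure CleanOwl computes, and the names `sentence_lengths` and `spikiness` are hypothetical.

```python
import re
import statistics

# Split on common English and Chinese sentence-ending marks.
SENTENCE_END = re.compile(r"[.!?。！？]+")


def sentence_lengths(text):
    """Length in characters of each non-empty sentence in the text."""
    parts = (part.strip() for part in SENTENCE_END.split(text))
    return [len(part) for part in parts if part]


def spikiness(text):
    """Coefficient of variation of sentence lengths (0.0 = perfectly even)."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0  # too little text to measure variation
    return statistics.pstdev(lengths) / statistics.mean(lengths)
```

Evenly sized sentences give a value near 0, while a mix of very short and very long sentences pushes the value up.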

	---

	## 📊 Score Interpretation

	| Score | Meaning |
	|-------|---------|
	| < 60 | Likely AI-generated / formal text |
	| 60–75 | Mixed / ambiguous |
	| > 75 | Likely human-like message |

	> This is a heuristic scoring system, not a classifier.
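
If you post-process scores yourself, the bands in the table above can be applied with a small helper like the sketch below. The function name is hypothetical, the returned strings are the table's wording rather than the labels `ai_score.py` prints, and treating exactly 60 and exactly 75 as "mixed" is an assumption, since the table does not say which side the boundaries fall on.

```python
def interpret_score(score):
    """Map a CleanOwl human score to the band from the table above."""
    if score < 60:
        return "likely AI-generated / formal text"
    if score <= 75:
        return "mixed / ambiguous"
    return "likely human-like message"
```

For example, a score of 47.13 falls in the first band and 76.88 in the last.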

	---

	## ⚠️ Limitations

	- Scores for short sentences may be unstable
	- Highly polished human writing (e.g. essays, Wikipedia) may look AI-like
	- AI-generated text can mimic human irregularity

	This is a lightweight detector, not a definitive AI classifier.

	---

	## Quickstart

	### 1️⃣ Install

	```bash
	git clone https://huggingface.co/WangKaiLin/CleanOwl-AI-Slop-Detector
	cd CleanOwl-AI-Slop-Detector

	pip install numpy safetensors fastapi uvicorn
	```

	### 2️⃣ Run Local API

	```bash
	uvicorn app:app --host 127.0.0.1 --port 8000 --reload
	```

	Open the interactive API docs in a browser:

	http://127.0.0.1:8000/docs

	If the `/detect` endpoint is listed, the API is running correctly.
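
Once the server is up, you can also call it from a script. The sketch below uses only the Python standard library. The request shape is an assumption: it guesses that `/detect` accepts a POST with a JSON body like `{"text": "..."}`, which this README does not confirm, so check the schema shown at `/docs` and adjust. The helper names are hypothetical.

```python
import json
import urllib.request

API_URL = "http://127.0.0.1:8000/detect"  # host and port from the uvicorn command


def build_detect_request(text):
    """Build a POST request for /detect.

    ASSUMPTION: the endpoint accepts a JSON body shaped like {"text": "..."}.
    Check http://127.0.0.1:8000/docs for the real schema.
    """
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def detect(text, timeout=5.0):
    """Send the text to the local server and return the decoded JSON reply."""
    with urllib.request.urlopen(build_detect_request(text), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `detect("some text")` requires the local server to be running. The fields in the reply are whatever `app.py` returns and are not assumed here.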

	### 3️⃣ Chrome Extension Setup

	CleanOwl works via a local API + Chrome extension.

	1. Open `chrome://extensions/` in Chrome.
	2. Enable Developer Mode (top right).
	3. Click "Load unpacked".
	4. Select the `CleanOwl-AI-Slop-Detector/extension/` folder.
	5. Refresh any webpage (Ctrl + R).

	👉 CleanOwl will now analyze the page automatically.

	### 🔒 Privacy

	CleanOwl runs entirely on your local machine.
	No data is sent to any external server.

	### Usage
	```bash
	# CLI (scoring): prompts for text, then prints a human score and label
	python ai_score.py

	# Embedding demo
	python quickstart.py
	```

	## Extension in Action

	![detect](./detect.png)

	## Example (ai_score.py)

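	```bash
	# The prompt 請輸入文字 means "Please enter text". English glosses of the inputs are added as comments.
	請輸入文字：先思考：在 AI 時代，什麼樣的人才不會被取代？我的答案是：具備溝通能力的人、擁有韌性的人，以及始終願意站在第一線的人。
	# "Think first: in the AI era, what kind of person will not be replaced? My answer: people who can communicate, people with resilience, and people who are always willing to stand on the front line."

	human score: 47.13
	label: ai_slop_like

	請輸入文字：身為專業的肥宅都會把脂肪放在身上
	# "A professional fat otaku always keeps the fat on his own body."

	human score: 76.88
	label: maybe_human_like
	```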

	## Repository Structure

	```bash
	CleanOwl-AI-Slop-Detector/
	├─ ai_score.py # scoring logic (CleanOwl core)
	├─ quickstart.py # embedding demo CLI
	├─ engine.py # PipeOwl tokenizer + embedding loader
	├─ pipeowl.safetensors # embeddings + delta_field
	├─ tokenizer.json
	├─ ptt.npy # style field (PTT-like distribution)
	├─ config.json
	├─ app.py # FastAPI server
	├─ requirements.txt
	├─ extension/
	│ ├─ content.js # Chrome content script
	│ └─ manifest.json
	├─ example.md
	├─ README.md
	└─ LICENSE
	```

	## LICENSE

	MIT