---
language:
- zh
- en
tags:
- embeddings
- retrieval
- transformer-free
- safetensors
- edge-ai
license: mit
---

# CleanOwl: AI Slop Detector

**I hate AI-SLOP SO I MADE THIS.**

![CleanOwl](./CleanOwl.png)

CleanOwl is a lightweight **human-likeness scoring engine**.

It estimates how “human” a piece of text feels, not by classification but by analyzing structural signals such as:

- token distribution irregularity
- semantic continuity
- punctuation behavior

No transformers. No fine-tuning. Just statistical signals.

---

## Performance

- ~0.04 ms per token
- ~0.8 ms per sentence (typical)
- ~120 ms startup

Runs entirely on CPU.
Linear time complexity: O(n)

---

## 🧠 What it actually measures

CleanOwl does **not** directly detect AI.

Instead, it measures how **smooth vs irregular** a piece of writing is:

- Human writing → irregular, biased, “spiky”
- AI / formal text → smooth, evenly distributed

---

## 📊 Score Interpretation

| Score | Meaning |
|------|--------|
| < 60 | Likely AI-generated / formal text |
| 60–75 | Mixed / ambiguous |
| > 75 | Likely human-like message |

> This is a heuristic scoring system, not a classifier.

---

## ⚠️ Limitations

- Short sentences may produce unstable scores
- Highly polished human writing (e.g. essays, Wikipedia) may look AI-like
- AI can mimic human irregularity

This is a **lightweight detector**, not a definitive AI classifier.

---

## Quickstart

### 1️⃣ Install

```bash
git clone https://huggingface.co/WangKaiLin/CleanOwl-AI-Slop-Detector
cd CleanOwl-AI-Slop-Detector
pip install numpy safetensors fastapi uvicorn
```

### 2️⃣ Run Local API

```bash
uvicorn app:app --host 127.0.0.1 --port 8000 --reload
```

Open in browser: http://127.0.0.1:8000/docs

If you see `/detect`, the API is running correctly.

### 3️⃣ Chrome Extension Setup

CleanOwl works via a local API + Chrome extension.
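For the local API half of this setup, a client request could look roughly like the sketch below. It relies only on what the Quickstart states: the server listens on `127.0.0.1:8000` and exposes `/detect`. The HTTP method (POST), the JSON field name `text`, and the helper names `build_detect_request` and `send_request` are assumptions made for illustration; the real request schema is whatever http://127.0.0.1:8000/docs shows.

```python
import json
import urllib.request

# Host and port come from the uvicorn command in step 2.
API_URL = "http://127.0.0.1:8000/detect"


def build_detect_request(text: str, url: str = API_URL) -> urllib.request.Request:
    """Build (but do not send) a request for the local /detect endpoint.

    ASSUMPTION: the endpoint accepts a POST with JSON shaped like
    {"text": "..."}. Check /docs for the actual schema.
    """
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_request(request: urllib.request.Request, timeout: float = 5.0) -> dict:
    """Send the request and decode the JSON response (server must be running)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    req = build_detect_request("Some text to score")
    print(req.get_method(), req.full_url)
    # With the API running, uncomment to see the raw response:
    # print(send_request(req))
```

Building the request separately from sending it makes the payload easy to inspect before the server is up.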
To load the extension:

1. Open Chrome and go to `chrome://extensions/`
2. Enable Developer Mode (top right)
3. Click Load unpacked
4. Select `CleanOwl-AI-Slop-Detector/extension/`
5. Refresh any webpage (Ctrl + R)

👉 CleanOwl will now analyze the page automatically.

### 🔒 Privacy

CleanOwl runs entirely on your local machine.
No data is sent to any external server.

### Usage

```bash
# CLI (scoring)
python ai_score.py

# Embedding demo
python quickstart.py
```

## Extension in Action

![detect](./detect.png)

## Example (ai_score.py)

```bash
請輸入文字：先思考：在 AI 時代，什麼樣的人才不會被取代？我的答案是：具備溝通能力的人、擁有韌性的人，以及始終願意站在第一線的人。
human score: 47.13
label: ai_slop_like

請輸入文字：身為專業的肥宅 都會把脂肪放在身上
human score: 76.88
label: maybe_human_like
```

The prompt `請輸入文字：` means “Enter text:”. The first input translates roughly to “Think first: in the AI era, what kind of person will not be replaced? My answer: people with communication skills, people with resilience, and people who are always willing to stand on the front line.” The second is a casual joke, roughly “As a professional fat otaku, I keep all my fat on my body.”

## Repository Structure

```bash
CleanOwl-AI-Slop-Detector/
├─ ai_score.py            # scoring logic (CleanOwl core)
├─ quickstart.py          # embedding demo CLI
├─ engine.py              # PipeOwl tokenizer + embedding loader
├─ pipeowl.safetensors    # embeddings + delta_field
├─ tokenizer.json
├─ ptt.npy                # style field (PTT-like distribution)
├─ config.json
├─ app.py                 # FastAPI server
├─ requirements.txt
├─ extension/
│  ├─ content.js          # Chrome content script
│  └─ manifest.json
├─ example.md
├─ README.md
└─ LICENSE
```

## LICENSE

MIT
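## Appendix: Score Buckets in Code

The Score Interpretation table can be expressed as a small helper. This is only a sketch of the thresholds listed in that table, not the logic inside `ai_score.py`; the function name `interpret_score` is hypothetical, and treating exactly 60 and exactly 75 as "mixed" is an assumption, since the table does not say which side the boundaries fall on.

```python
def interpret_score(score: float) -> str:
    """Map a CleanOwl human score to the README's interpretation buckets.

    ASSUMPTION: the boundary values 60 and 75 are counted as "mixed".
    """
    if score < 60:
        return "Likely AI-generated / formal text"
    if score <= 75:
        return "Mixed / ambiguous"
    return "Likely human-like message"


if __name__ == "__main__":
    # The two scores from the ai_score.py example in this README.
    for value in (47.13, 76.88):
        print(f"{value:.2f}: {interpret_score(value)}")
```

Because the score is a heuristic, these buckets are best read as rough guidance rather than a verdict on any single text.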