---
title: φ-Coherence v3
emoji: 🔬
colorFrom: purple
colorTo: blue
sdk: docker
app_file: app.py
pinned: true
license: mit
short_description: Credibility scoring for any text (88% accuracy, pure math)
---
# φ-Coherence v3 — Credibility Scoring
**Detect fabrication patterns in ANY text — human or AI.** 88% accuracy. No knowledge base. Pure math.
## The Insight
> Truth and fabrication have different structural fingerprints. You don't need to know the facts to detect the fingerprints.
LLMs generate text that *sounds like* truth. Humans inflate resumes, pad essays, write fake reviews. Both exhibit the same patterns:
- Vague attribution ("Studies show...")
- Overclaiming ("Every scientist agrees")
- Absolutist language ("Exactly 25,000", "Always", "Never")
This tool detects the **structural signature of fabrication** — regardless of whether a human or AI wrote it.
## Use Cases
| Domain | What It Catches |
|--------|-----------------|
| **AI Output Screening** | LLM hallucinations before they reach users |
| **Fake Review Detection** | "This product completely changed my life. Everyone agrees it's the best." |
| **Resume/Essay Inflation** | Vague claims, overclaiming, padding |
| **Marketing Copy** | Unsubstantiated superlatives |
| **News/Article Verification** | Fabricated quotes, fake consensus claims |
| **RAG Quality Filtering** | Rank retrieved content by credibility |
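The RAG-filtering row above can be sketched as a small helper. Here `score_fn` is a hypothetical stand-in for any credibility scorer that maps text to a float in [0, 1], such as a wrapper around this Space's `/analyze_text` endpoint:

```python
def filter_by_credibility(passages, score_fn, threshold=0.6):
    """Keep passages whose credibility score meets the threshold,
    highest-scoring first.

    score_fn: hypothetical scorer, text -> float in [0, 1].
    """
    scored = [(score_fn(p), p) for p in passages]
    # Sort by score descending, then drop anything below the cutoff.
    return [p for s, p in sorted(scored, reverse=True) if s >= threshold]
```

In a RAG pipeline this would sit between retrieval and generation, so low-credibility passages never reach the prompt.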
## What It Detects
| Pattern | Fabrication Example | Truth Example |
|---------|--------------------| --------------|
| **Vague Attribution** | "Studies show..." | "According to the 2012 WHO report..." |
| **Overclaiming** | "Every scientist agrees" | "The leading theory suggests..." |
| **Absolutist Language** | "Exactly 25,000 km" | "Approximately 21,196 km" |
| **Stasis Claims** | "Has never been questioned" | "Continues to be refined" |
| **Excessive Negation** | "Requires NO sunlight" | "Uses sunlight as energy" |
| **Topic Drift** | "Saturn... wedding rings... aliens" | Stays on subject |
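A minimal sketch of how patterns like these could be flagged with regular expressions. The pattern lists below are illustrative assumptions, not the actual rules used by `app.py`:

```python
import re

# Illustrative pattern lists only -- the real scorer's rules are not
# published in this README.
FABRICATION_PATTERNS = {
    "vague_attribution": r"\b(studies show|experts say|research suggests)\b",
    "overclaiming": r"\b(every(one)?|all|no one) (scientist|expert|doctor)s? agree",
    "absolutist": r"\b(always|never|exactly \d)",
}

def count_fabrication_signals(text: str) -> dict:
    """Count how many times each fabrication pattern fires in the text."""
    lowered = text.lower()
    return {name: len(re.findall(pattern, lowered))
            for name, pattern in FABRICATION_PATTERNS.items()}

signals = count_fabrication_signals(
    "Studies show the wall is exactly 25,000 km long. Every scientist agrees."
)
```

The actual tool scores whole paragraphs rather than counting isolated matches, but the idea is the same: fabrication leaves lexical and structural traces that need no knowledge base to detect.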
## Why It Works
LLMs are next-token predictors. They generate sequences with high probability — "sounds right." But "sounds right" ≠ "is right."
This tool detects when "sounds like truth" and "structured like truth" diverge.
**The LLM is good at mimicking content. This tool checks the structural signature.**
## Benchmark
| Version | Test | Accuracy |
|---------|------|----------|
| v1 | Single sentences | 40% |
| v2 | Paragraphs (12 pairs) | 75% |
| **v3** | **Paragraphs (25 pairs)** | **88%** |
| Random | Coin flip | 50% |
## API
```python
# pip install gradio_client
from gradio_client import Client

client = Client("bitsabhi/phi-coherence")
result = client.predict(text="Your text here...", api_name="/analyze_text")
print(result)
```
## Limitations
- Cannot distinguish swapped numbers ("299,792" vs "150,000") without knowledge
- Well-crafted lies with proper hedging will score high
- Performs best on paragraphs (two or more sentences), not single claims
---
**Built by [Space (Abhishek Srivastava)](https://github.com/0x-auth/bazinga-indeed)**
*"Truth and fabrication have different structural fingerprints."*