StephenSAI committed 1 day ago

Commit e50ced5 (verified) · 1 parent: df87edc

Initial org card

Files changed (1)

README.md +90 -5

@@ -1,10 +1,95 @@
 ---
-title: README
-emoji: 🐢
-colorFrom: purple
-colorTo: green
 sdk: static
 pinned: false
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

 ---
+title: Scam.AI
+emoji: 🛡️
+colorFrom: blue
+colorTo: indigo
 sdk: static
 pinned: false
 ---
+# Scam.AI
+**Detection systems for AI-driven fraud — deepfakes, document forgery, synthetic media, and adversarial attacks against identity verification.**
+[![Website](https://img.shields.io/badge/scam.ai-Website-blue)](https://www.scam.ai)
+[![Research](https://img.shields.io/badge/Research-Publications-orange)](https://www.scam.ai/en/research)
+[![Datasets](https://img.shields.io/badge/Datasets-7%20open-green)](https://huggingface.co/Scam-AI)
+---
+## What We Do
+Scam.AI builds detection systems that protect identity-verification pipelines, financial-document workflows, and digital media ecosystems from the next generation of AI-driven fraud. Our research portfolio spans **deepfake detection, document forgery forensics, AI-generated image attribution, age-estimation robustness, and behavioral-biometric verification**. The work is published at CVPR and on arXiv, and the datasets are released here as open benchmarks for the community.
+---
+## 🔬 Research Areas
+| Area | Focus | Key Datasets |
+|------|-------|--------------|
+| **🎭 Deepfake Detection** | Real-world faceswap detection beyond academic benchmarks | [RWFS](./datasets/Scam-AI/RWFS) |
+| **📄 Document Forgery** | AI-inpainted receipts, forms, and financial documents | [AIForge-Doc-v2](./datasets/Scam-AI/AIForge-Doc-v2) · [AIForge-Doc-v1](./datasets/Scam-AI/AIForge-Doc-v1) · [gpt4o-receipt](./datasets/Scam-AI/gpt4o-receipt) |
+| **🖼️ AI-Generated Image Detection** | Self-reported AI-generated images in the wild | [gpt-image-2](./datasets/Scam-AI/gpt-image-2) |
+| **🛡️ Age Estimation Robustness** | Cosmetic adversarial attacks against age verification | [age-adversarial-attack](./datasets/Scam-AI/age-adversarial-attack) |
+| **👁️ Behavioral Biometrics** | Gaze-based liveness for video interview verification | [synthetic-gaze-reading](./datasets/Scam-AI/synthetic-gaze-reading) |
+---
+## 📚 Featured Datasets
+All datasets are released for **academic research and non-commercial use** under CC-BY-NC-SA 4.0. Downloads are email-gated with automatic approval.
+### 🎭 Deepfake Detection
+- **[RWFS — Real-World Faceswap Dataset](./datasets/Scam-AI/RWFS)** — 847 deepfakes from 8 production faceswap tools (Pixlr, Magic Hour, Remaker, etc.) plus 900 authentic faces. The first dataset reflecting how deepfakes actually appear in the wild.
+  > *Ren et al., "Do Deepfake Detectors Work in Reality?" — arXiv:2502.10920*
+### 📄 Document Forgery & Forensics
+- **[AIForge-Doc v2](./datasets/Scam-AI/AIForge-Doc-v2)** — 3,066 GPT-Image-2 inpainted document forgeries paired with authentic source + pixel-precise tampering masks. DocTamper-compatible.
+- **[AIForge-Doc v1](./datasets/Scam-AI/AIForge-Doc-v1)** — 4,061 forgeries via Gemini 2.5 / Ideogram v2. Same-spec pairing with v2 enables cross-generator detector analysis.
+- **[GPT4o-Receipt](./datasets/Scam-AI/gpt4o-receipt)** — 935 fully AI-synthesized receipts (GPT-4o + GPT-Image-1) across 159 merchant categories. Companion human-vs-LLM forensic detection study.
+### 🖼️ AI-Generated Image Detection
+- **[GPT-Image-2 Twitter Dataset](./datasets/Scam-AI/gpt-image-2)** — 10,217 confirmed GPT-Image-2 outputs scraped from Twitter/X in the first week post-launch. Multi-language: EN (40%), JA (33%), ZH (19%).
+### 🛡️ Identity Verification Robustness
+- **[Age Adversarial Attack Dataset](./datasets/Scam-AI/age-adversarial-attack)** — 5,809 VLM-simulated cosmetic attacks (beard, gray hair, makeup, wrinkles) demonstrating a 29–65% attack-conversion rate on production age estimators.
+  > *Ren et al., CVPR 2026*
+- **[Synthetic Eye Movement Dataset](./datasets/Scam-AI/synthetic-gaze-reading)** — 12 hours of synthetic eye-movement video (144 sessions × 5 min) for script-reading detection in video interviews.
+---
+## 📑 Publications
+13 papers across deepfake detection, AI-generated image detection, document forgery, age estimation, and interview technology. Browse the full list at **[scam.ai/research](https://www.scam.ai/en/research)**.
+Selected work:
+- **Do Deepfake Detectors Work in Reality?** — Ren, Patil, Zewde et al.
+- **AIForge-Doc: A Benchmark for Detecting AI-Forged Tampering in Financial and Form Documents** — Wu, Zhou, Xu et al. (arXiv:2602.20569)
+- **GPT-Image-2 in the Wild** — Zewde, Ren, Shen et al. (arXiv:2604.25370)
+- **Can a Teenager Fool an AI? Evaluating Low-Cost Cosmetic Attacks on Age Estimation Systems** — Shen, Duong, An et al. (arXiv:2602.19539, CVPR 2026)
+---
+## 💼 For Enterprise
+The datasets above are released for the research community. For production needs we offer:
+- **Detection APIs** — Deepfake, document forgery, AI-image, and age-verification endpoints with latency and accuracy SLAs
+- **On-premise deployment** — Private cloud or air-gapped installations for regulated industries (banking, government, healthcare)
+- **Commercial licensing** — Use our datasets and models in commercial pipelines
+- **Custom models** — Trained on your domain, evaluated against the threat models we've published
+📧 **sales@scam.ai** · 🌐 **[scam.ai](https://www.scam.ai)**
+---
+## 🤝 Get Involved
+- ⭐ **Follow** this org to get notified of new dataset releases
+- 📥 **Download** any dataset (free for non-commercial research; just provide your name and email)
+- 📝 **Cite** our papers if you publish work building on these resources
+- 🐛 **Open a discussion** on any dataset to report issues or share results
+---
+*Building detection systems for an era when generative AI makes every digital artifact suspect.*
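
Note for readers of this change (not part of the committed README): the card links each dataset with a relative path such as `./datasets/Scam-AI/RWFS` and describes downloads as email-gated. The sketch below shows one way such a link could be turned into a Hub repo id and fetched with `huggingface_hub.snapshot_download`. It assumes the relative links correspond to dataset repositories of the same name on the Hub and that access has already been granted on the dataset page. The helper names `dataset_repo_id` and `download_dataset` are hypothetical, and the on-disk layout of each dataset is not specified by the card.

```python
from typing import Optional


def dataset_repo_id(relative_link: str) -> str:
    """Turn a card link like './datasets/Scam-AI/RWFS' into 'Scam-AI/RWFS'."""
    # Drop empty segments and the leading '.' of the relative path.
    parts = [p for p in relative_link.split("/") if p not in ("", ".")]
    if len(parts) != 3 or parts[0] != "datasets":
        raise ValueError(f"not a dataset link: {relative_link!r}")
    return f"{parts[1]}/{parts[2]}"


def download_dataset(relative_link: str, token: Optional[str] = None) -> str:
    """Download a snapshot of the dataset repo and return the local directory.

    Assumption: the repo is gated, so `token` must belong to an account
    that has already been approved on the dataset page.
    """
    # Imported lazily so dataset_repo_id works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=dataset_repo_id(relative_link),
        repo_type="dataset",
        token=token,
    )
```

A call such as `download_dataset("./datasets/Scam-AI/RWFS", token="hf_...")` would then return the local snapshot path, provided the account behind the token has been approved. Any use of the files remains subject to the CC-BY-NC-SA 4.0 terms stated in the card.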