StephenSAI committed on
Commit 157dc45 · verified · 1 Parent(s): 6bfbab6

Shorten org card — drop research-areas table, get-involved, publications section

Files changed (1)
  1. README.md +21 -67
README.md CHANGED
@@ -9,87 +9,41 @@ pinned: false
 
  # Scam.AI
 
- **Detection systems for AI-driven fraud — deepfakes, document forgery, synthetic media, and adversarial attacks against identity verification.**
+ **Detection systems for AI-driven fraud.**
 
- [![Website](https://img.shields.io/badge/scam.ai-Website-blue)](https://www.scam.ai)
- [![Research](https://img.shields.io/badge/Research-Publications-orange)](https://www.scam.ai/en/research)
- [![Datasets](https://img.shields.io/badge/Datasets-7%20open-green)](https://huggingface.co/Scam-AI)
+ We build production-grade detectors for deepfakes, document forgery, AI-generated media, and adversarial attacks against identity verification — and release the underlying benchmarks for the research community.
 
- ---
-
- ## What We Do
-
- Scam.AI builds detection systems that protect identity-verification pipelines, financial-document workflows, and digital media ecosystems from the next generation of AI-driven fraud. Our research portfolio spans **deepfake detection, document forgery forensics, AI-generated image attribution, age-estimation robustness, and behavioral-biometric verification** — published at top venues (CVPR, arXiv) and released here as open benchmarks for the community.
-
- ---
-
- ## 🔬 Research Areas
-
- | Area | Focus | Key Datasets |
- |------|-------|--------------|
- | **🎭 Deepfake Detection** | Real-world faceswap detection beyond academic benchmarks | [RWFS](https://huggingface.co/datasets/Scam-AI/RWFS) |
- | **📄 Document Forgery** | AI-inpainted receipts, forms, and financial documents | [AIForge-Doc-v2](https://huggingface.co/datasets/Scam-AI/AIForge-Doc-v2) · [AIForge-Doc-v1](https://huggingface.co/datasets/Scam-AI/AIForge-Doc-v1) · [gpt4o-receipt](https://huggingface.co/datasets/Scam-AI/gpt4o-receipt) |
- | **🖼️ AI-Generated Image Detection** | Self-reported AI-generated images in the wild | [gpt-image-2](https://huggingface.co/datasets/Scam-AI/gpt-image-2) |
- | **🛡️ Age Estimation Robustness** | Cosmetic adversarial attacks against age verification | [age-adversarial-attack](https://huggingface.co/datasets/Scam-AI/age-adversarial-attack) |
- | **👁️ Behavioral Biometrics** | Gaze-based liveness for video interview verification | [synthetic-gaze-reading](https://huggingface.co/datasets/Scam-AI/synthetic-gaze-reading) |
+ 🌐 [scam.ai](https://www.scam.ai) · 📑 [Research](https://www.scam.ai/en/research) · 💼 sales@scam.ai
 
  ---
 
- ## 📚 Featured Datasets
-
- All datasets are released for **academic research and non-commercial use** under CC-BY-NC-SA 4.0. Email-gated download with automatic approval.
-
- ### 🎭 Deepfake Detection
- - **[RWFS — Real-World Faceswap Dataset](https://huggingface.co/datasets/Scam-AI/RWFS)** — 847 deepfakes from 8 production faceswap tools (Pixlr, Magic Hour, Remaker, etc) + 900 authentic faces. The first dataset reflecting how deepfakes actually appear in the wild.
- > *Ren et al., "Do Deepfake Detectors Work in Reality?" — arXiv:2502.10920*
-
- ### 📄 Document Forgery & Forensics
- - **[AIForge-Doc v2](https://huggingface.co/datasets/Scam-AI/AIForge-Doc-v2)** — 3,066 GPT-Image-2 inpainted document forgeries paired with authentic source + pixel-precise tampering masks. DocTamper-compatible.
- - **[AIForge-Doc v1](https://huggingface.co/datasets/Scam-AI/AIForge-Doc-v1)** — 4,061 forgeries via Gemini 2.5 / Ideogram v2. Same-spec pairing with v2 enables cross-generator detector analysis.
- - **[GPT4o-Receipt](https://huggingface.co/datasets/Scam-AI/gpt4o-receipt)** — 935 fully AI-synthesized receipts (GPT-4o + GPT-Image-1) across 159 merchant categories. Companion human-vs-LLM forensic detection study.
+ ## 📚 Open Datasets
 
- ### 🖼️ AI-Generated Image Detection
- - **[GPT-Image-2 Twitter Dataset](https://huggingface.co/datasets/Scam-AI/gpt-image-2)** — 10,217 confirmed GPT-Image-2 outputs scraped from Twitter/X in the first week post-launch. Multi-language: EN (40%), JA (33%), ZH (19%).
+ 7 datasets · email-gated · CC-BY-NC-SA 4.0 · auto-approved
 
- ### 🛡️ Identity Verification Robustness
- - **[Age Adversarial Attack Dataset](https://huggingface.co/datasets/Scam-AI/age-adversarial-attack)** — 5,809 VLM-simulated cosmetic attacks (beard, gray hair, makeup, wrinkles) demonstrating 29–65% attack-conversion rate on production age estimators.
- > *Ren et al., CVPR 2026*
- - **[Synthetic Eye Movement Dataset](https://huggingface.co/datasets/Scam-AI/synthetic-gaze-reading)** 12 hours of synthetic eye-movement video (144 sessions × 5 min) for script-reading detection in video interviews.
-
- ---
-
- ## 📑 Publications
-
- 13 papers across deepfake detection, AI-generated detection, document forgery, age estimation, and interview technology. Browse the full list at **[scam.ai/research](https://www.scam.ai/en/research)**.
-
- Selected work:
- - **Do Deepfake Detectors Work in Reality?** — Ren, Patil, Zewde et al.
- - **AIForge-Doc: A Benchmark for Detecting AI-Forged Tampering in Financial and Form Documents** — Wu, Zhou, Xu et al. (arXiv:2602.20569)
- - **GPT-Image-2 in the Wild** — Zewde, Ren, Shen et al. (arXiv:2604.25370)
- - **Can a Teenager Fool an AI? Evaluating Low-Cost Cosmetic Attacks on Age Estimation Systems** — Shen, Duong, An et al. (arXiv:2602.19539, CVPR 2026)
+ | Dataset | What it is |
+ |---|---|
+ | [**RWFS**](https://huggingface.co/datasets/Scam-AI/RWFS) 🎭 | 847 real-world deepfakes from 8 production faceswap tools. Reveals a 30+ pt AUC gap between academic and real-world performance. |
+ | [**AIForge-Doc v2**](https://huggingface.co/datasets/Scam-AI/AIForge-Doc-v2) 📄 | 3,066 GPT-Image-2 inpainted document forgeries with pixel-precise masks. |
+ | [**AIForge-Doc v1**](https://huggingface.co/datasets/Scam-AI/AIForge-Doc-v1) 📄 | 4,061 forgeries via Gemini 2.5 / Ideogram v2. Cross-generator pairing with v2. |
+ | [**GPT4o-Receipt**](https://huggingface.co/datasets/Scam-AI/gpt4o-receipt) 📄 | 935 fully AI-synthesized receipts across 159 merchant categories. |
+ | [**GPT-Image-2 Twitter**](https://huggingface.co/datasets/Scam-AI/gpt-image-2) 🖼️ | 10,217 confirmed GPT-Image-2 outputs scraped in the first week post-launch. |
+ | [**Age Adversarial Attack**](https://huggingface.co/datasets/Scam-AI/age-adversarial-attack) 🛡️ | 5,809 cosmetic attacks fooling production age estimators 69% of the time. *(CVPR 2026)* |
+ | [**Synthetic Gaze Reading**](https://huggingface.co/datasets/Scam-AI/synthetic-gaze-reading) 👁️ | 12 hours of synthetic eye-movement video for interview liveness. |
 
  ---
 
  ## 💼 For Enterprise
 
- The datasets above are released for the research community. For production needs we offer:
-
- - **Detection APIs** — Deepfake, document forgery, AI-image, and age-verification endpoints with latency and accuracy SLAs
- - **On-premise deployment** — Private cloud or air-gapped installations for regulated industries (banking, government, healthcare)
- - **Commercial licensing** — Use our datasets and models in commercial pipelines
- - **Custom models** — Trained on your domain, evaluated against the threat models we've published
-
- 📧 **sales@scam.ai** · 🌐 **[scam.ai](https://www.scam.ai)**
-
- ---
+ Need production-grade detection?
 
- ## 🤝 Get Involved
+ - **Detection APIs** with latency / accuracy SLAs
+ - **On-premise deployment** for regulated industries
+ - **Commercial licensing** of our datasets and models
+ - **Custom models** trained on your domain
 
- - **Follow** this org to get notified of new dataset releases
- - 📥 **Download** any dataset (free for non-commercial research, just provide name + email)
- - 📝 **Cite** our papers if you publish work building on these resources
- - 🐛 **Open a discussion** on any dataset to report issues or share results
+ 📧 **sales@scam.ai**
 
  ---
 
- *Building detection systems for an era when generative AI makes every digital artifact suspect.*
+ *Building detection for an era when every digital artifact is suspect.*
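
Reviewer note: the hunk header `@@ -9,87 +9,41 @@` and the `+21 -67` summary can be cross-checked against each other. The sketch below is a minimal illustration, assuming the change is a single hunk (as it is here). The helper names `parse_hunk_header` and `context_lines` are hypothetical and not part of any tool used on this page. The idea is that subtracting removed lines from the old range and added lines from the new range should leave the same number of unchanged context lines.

```python
import re

# Unified-diff hunk header: "@@ -old_start,old_len +new_start,new_len @@ ..."
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def parse_hunk_header(header):
    """Return (old_start, old_len, new_start, new_len) for one hunk header."""
    match = HUNK_RE.match(header)
    if match is None:
        raise ValueError(f"not a hunk header: {header!r}")
    old_start, old_len, new_start, new_len = match.groups()
    # An omitted length means the range covers exactly one line.
    return (int(old_start), int(old_len or 1), int(new_start), int(new_len or 1))


def context_lines(header, added, removed):
    """Context-line count implied by a single hunk and its diffstat.

    Raises ValueError when the header and the diffstat disagree.
    """
    _, old_len, _, new_len = parse_hunk_header(header)
    from_old = old_len - removed  # old-side lines that were not removed
    from_new = new_len - added    # new-side lines that were not added
    if from_old != from_new or from_old < 0:
        raise ValueError("diffstat does not match hunk header")
    return from_old


# Values taken from this commit: 87 old lines, 41 new lines, +21 -67.
print(context_lines("@@ -9,87 +9,41 @@ pinned: false", added=21, removed=67))  # prints 20
```

For this commit both sides agree on 20 unchanged lines (87 - 67 and 41 - 21), which matches the number of context lines in the diff above.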