StephenSAI committed on
Commit 6bfbab6 · verified · 1 Parent(s): 8921005

Fix relative dataset links to absolute URLs

Files changed (1)
  1. README.md +12 -12
README.md CHANGED
@@ -27,11 +27,11 @@ Scam.AI builds detection systems that protect identity-verification pipelines, f
 
  | Area | Focus | Key Datasets |
  |------|-------|--------------|
- | **🎭 Deepfake Detection** | Real-world faceswap detection beyond academic benchmarks | [RWFS](./datasets/Scam-AI/RWFS) |
- | **📄 Document Forgery** | AI-inpainted receipts, forms, and financial documents | [AIForge-Doc-v2](./datasets/Scam-AI/AIForge-Doc-v2) · [AIForge-Doc-v1](./datasets/Scam-AI/AIForge-Doc-v1) · [gpt4o-receipt](./datasets/Scam-AI/gpt4o-receipt) |
- | **🖼️ AI-Generated Image Detection** | Self-reported AI-generated images in the wild | [gpt-image-2](./datasets/Scam-AI/gpt-image-2) |
- | **🛡️ Age Estimation Robustness** | Cosmetic adversarial attacks against age verification | [age-adversarial-attack](./datasets/Scam-AI/age-adversarial-attack) |
- | **👁️ Behavioral Biometrics** | Gaze-based liveness for video interview verification | [synthetic-gaze-reading](./datasets/Scam-AI/synthetic-gaze-reading) |
 
  ---
 
@@ -40,21 +40,21 @@ Scam.AI builds detection systems that protect identity-verification pipelines, f
  All datasets are released for **academic research and non-commercial use** under CC-BY-NC-SA 4.0. Email-gated download with automatic approval.
 
  ### 🎭 Deepfake Detection
- - **[RWFS — Real-World Faceswap Dataset](./datasets/Scam-AI/RWFS)** — 847 deepfakes from 8 production faceswap tools (Pixlr, Magic Hour, Remaker, etc) + 900 authentic faces. The first dataset reflecting how deepfakes actually appear in the wild.
  > *Ren et al., "Do Deepfake Detectors Work in Reality?" — arXiv:2502.10920*
 
  ### 📄 Document Forgery & Forensics
- - **[AIForge-Doc v2](./datasets/Scam-AI/AIForge-Doc-v2)** — 3,066 GPT-Image-2 inpainted document forgeries paired with authentic source + pixel-precise tampering masks. DocTamper-compatible.
- - **[AIForge-Doc v1](./datasets/Scam-AI/AIForge-Doc-v1)** — 4,061 forgeries via Gemini 2.5 / Ideogram v2. Same-spec pairing with v2 enables cross-generator detector analysis.
- - **[GPT4o-Receipt](./datasets/Scam-AI/gpt4o-receipt)** — 935 fully AI-synthesized receipts (GPT-4o + GPT-Image-1) across 159 merchant categories. Companion human-vs-LLM forensic detection study.
 
  ### 🖼️ AI-Generated Image Detection
- - **[GPT-Image-2 Twitter Dataset](./datasets/Scam-AI/gpt-image-2)** — 10,217 confirmed GPT-Image-2 outputs scraped from Twitter/X in the first week post-launch. Multi-language: EN (40%), JA (33%), ZH (19%).
 
  ### 🛡️ Identity Verification Robustness
- - **[Age Adversarial Attack Dataset](./datasets/Scam-AI/age-adversarial-attack)** — 5,809 VLM-simulated cosmetic attacks (beard, gray hair, makeup, wrinkles) demonstrating 29–65% attack-conversion rate on production age estimators.
  > *Ren et al., CVPR 2026*
- - **[Synthetic Eye Movement Dataset](./datasets/Scam-AI/synthetic-gaze-reading)** — 12 hours of synthetic eye-movement video (144 sessions × 5 min) for script-reading detection in video interviews.
 
  ---
 
 
 
  | Area | Focus | Key Datasets |
  |------|-------|--------------|
+ | **🎭 Deepfake Detection** | Real-world faceswap detection beyond academic benchmarks | [RWFS](https://huggingface.co/datasets/Scam-AI/RWFS) |
+ | **📄 Document Forgery** | AI-inpainted receipts, forms, and financial documents | [AIForge-Doc-v2](https://huggingface.co/datasets/Scam-AI/AIForge-Doc-v2) · [AIForge-Doc-v1](https://huggingface.co/datasets/Scam-AI/AIForge-Doc-v1) · [gpt4o-receipt](https://huggingface.co/datasets/Scam-AI/gpt4o-receipt) |
+ | **🖼️ AI-Generated Image Detection** | Self-reported AI-generated images in the wild | [gpt-image-2](https://huggingface.co/datasets/Scam-AI/gpt-image-2) |
+ | **🛡️ Age Estimation Robustness** | Cosmetic adversarial attacks against age verification | [age-adversarial-attack](https://huggingface.co/datasets/Scam-AI/age-adversarial-attack) |
+ | **👁️ Behavioral Biometrics** | Gaze-based liveness for video interview verification | [synthetic-gaze-reading](https://huggingface.co/datasets/Scam-AI/synthetic-gaze-reading) |
 
  ---
 
 
  All datasets are released for **academic research and non-commercial use** under CC-BY-NC-SA 4.0. Email-gated download with automatic approval.
 
  ### 🎭 Deepfake Detection
+ - **[RWFS — Real-World Faceswap Dataset](https://huggingface.co/datasets/Scam-AI/RWFS)** — 847 deepfakes from 8 production faceswap tools (Pixlr, Magic Hour, Remaker, etc) + 900 authentic faces. The first dataset reflecting how deepfakes actually appear in the wild.
  > *Ren et al., "Do Deepfake Detectors Work in Reality?" — arXiv:2502.10920*
 
  ### 📄 Document Forgery & Forensics
+ - **[AIForge-Doc v2](https://huggingface.co/datasets/Scam-AI/AIForge-Doc-v2)** — 3,066 GPT-Image-2 inpainted document forgeries paired with authentic source + pixel-precise tampering masks. DocTamper-compatible.
+ - **[AIForge-Doc v1](https://huggingface.co/datasets/Scam-AI/AIForge-Doc-v1)** — 4,061 forgeries via Gemini 2.5 / Ideogram v2. Same-spec pairing with v2 enables cross-generator detector analysis.
+ - **[GPT4o-Receipt](https://huggingface.co/datasets/Scam-AI/gpt4o-receipt)** — 935 fully AI-synthesized receipts (GPT-4o + GPT-Image-1) across 159 merchant categories. Companion human-vs-LLM forensic detection study.
 
  ### 🖼️ AI-Generated Image Detection
+ - **[GPT-Image-2 Twitter Dataset](https://huggingface.co/datasets/Scam-AI/gpt-image-2)** — 10,217 confirmed GPT-Image-2 outputs scraped from Twitter/X in the first week post-launch. Multi-language: EN (40%), JA (33%), ZH (19%).
 
  ### 🛡️ Identity Verification Robustness
+ - **[Age Adversarial Attack Dataset](https://huggingface.co/datasets/Scam-AI/age-adversarial-attack)** — 5,809 VLM-simulated cosmetic attacks (beard, gray hair, makeup, wrinkles) demonstrating 29–65% attack-conversion rate on production age estimators.
  > *Ren et al., CVPR 2026*
+ - **[Synthetic Eye Movement Dataset](https://huggingface.co/datasets/Scam-AI/synthetic-gaze-reading)** — 12 hours of synthetic eye-movement video (144 sessions × 5 min) for script-reading detection in video interviews.
 
  ---
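
The link change in this commit (each `./datasets/...` target becoming `https://huggingface.co/datasets/...`) could be reproduced mechanically. The following is a minimal sketch, not the method the author used: the function name `absolutize_dataset_links`, the constant `BASE_URL`, and the regular expression are all hypothetical, and the pattern assumes every relative link starts with exactly `./datasets/` as it does in the removed lines.

```python
import re

# Assumed base URL, taken from the absolute links on the "+" side of the diff.
BASE_URL = "https://huggingface.co"

# Matches the target of a Markdown inline link that starts with "./datasets/".
# Group 1 captures the path without the leading "./".
RELATIVE_LINK = re.compile(r"\]\(\./(datasets/[^)\s]+)\)")


def absolutize_dataset_links(markdown: str, base_url: str = BASE_URL) -> str:
    """Rewrite [text](./datasets/...) links to [text](<base_url>/datasets/...)."""
    # A function replacement avoids backslash-escape surprises in the URL.
    return RELATIVE_LINK.sub(lambda m: f"]({base_url}/{m.group(1)})", markdown)


if __name__ == "__main__":
    before = "| Deepfake Detection | [RWFS](./datasets/Scam-AI/RWFS) |"
    print(absolutize_dataset_links(before))
```

Links that are already absolute do not match the pattern, so running the function twice on the same text leaves it unchanged after the first pass.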