---
title: Digital Integrity
emoji: 🐠
colorFrom: yellow
colorTo: green
sdk: gradio
sdk_version: 6.3.0
app_file: app.py
license: apache-2.0
short_description: Elm Challenge 2 - Computer Vision
---

# Theme: Detecting GenAI & Sophisticated Manipulation in Public Media 🏆🏆🏆

## Context & Motivation

As Generative AI becomes mainstream, the line between reality and synthetic media is blurring. On social media, "perfect" AI influencers are indistinguishable from humans, and in real estate, "virtual staging" can mislead buyers by hiding structural flaws.

Existing content moderation tools often check for "Community Guidelines" violations (violence, hate speech) but fail to verify authenticity. This hackathon challenges you to build a two-module system that identifies GenAI-generated or heavily manipulated images in high-stakes public domains (social media and real estate).

## System Design Requirements

Participants must design a dual-path pipeline:

**Module 1: The Forensic Signal Detector (Pixel-Level)**

- Objective: Identify low-level technical anomalies.
- Target: Frequency analysis, noise residuals, and GAN/Diffusion artifacts.
- Expected Signals:
  - Texture Consistency: Detecting "unnatural smoothness" in skin or wall textures.
  - Compression Discrepancies: Identifying whether an object (e.g., a furniture piece or a person) was digitally "spliced" into a scene.
  - Frequency Domain Analysis: Using the FFT to find the mathematical "fingerprint" left by upscalers or generators.
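The frequency-domain signal above can be sketched in a few lines of NumPy. This is only an illustration: `high_freq_energy_ratio` is an invented name, and a real detector would feed features like this into a trained classifier rather than thresholding a single ratio.

```python
import numpy as np

def high_freq_energy_ratio(gray: np.ndarray, radius_frac: float = 0.25) -> float:
    """Fraction of spectral energy outside a low-frequency disc.
    Upscalers and generators often leave periodic high-frequency
    artifacts that shift this ratio relative to camera-native images."""
    # Centered 2D power spectrum of the grayscale image
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(gray))) ** 2
    h, w = spectrum.shape
    yy, xx = np.ogrid[:h, :w]
    # Distance of each frequency bin from the DC (center) bin
    dist = np.sqrt((yy - h / 2) ** 2 + (xx - w / 2) ** 2)
    low = dist <= radius_frac * min(h, w)
    total = spectrum.sum()
    return float(spectrum[~low].sum() / total) if total > 0 else 0.0

# A smooth gradient concentrates energy at low frequencies; adding noise
# pushes energy outward and raises the ratio.
smooth = np.tile(np.linspace(0.0, 1.0, 64), (64, 1))
noisy = smooth + 0.2 * np.random.default_rng(0).standard_normal((64, 64))
```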

**Module 2: The VLM Logic Reasoner (Semantic-Level)**

- Objective: Use a Vision-Language Model (VLM) to provide "human-in-the-loop" style reasoning.
- Target: Detecting the "uncanny valley" and physical impossibilities.
- Expected Reasoning:
  - Physics Check: Do the shadows of the AI-generated model match the sun's direction in the background?
  - Structural Integrity: Does the "renovated" real estate kitchen have impossible geometry (e.g., cabinets merging into walls)?
  - Explanation: A natural-language output explaining why the image is flagged (e.g., "The reflection in the window shows a different room layout than the one pictured.").
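One way to steer the VLM toward the checks above is a structured prompt. The template below is only a sketch, with the actual model call omitted because it depends on which VLM you choose; the check wording and function name are illustrative, not prescribed by the challenge.

```python
# Semantic checks derived from the Module 2 requirements.
CHECKS = [
    "Physics: do shadows agree with the apparent light direction?",
    "Structure: is the scene geometry physically possible?",
    "Reflections: do mirrors and windows show the same scene as pictured?",
]

def build_reasoner_prompt(track: str) -> str:
    """Assemble a single instruction string to send to the chosen VLM."""
    checks = "\n".join(f"{i}. {c}" for i, c in enumerate(CHECKS, 1))
    return (
        f"You are auditing a {track} image for manipulation.\n"
        f"Run these checks:\n{checks}\n"
        "Answer with a verdict (real/manipulated) and a 2-sentence "
        "explanation of the red flags."
    )

print(build_reasoner_prompt("real estate"))
```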

## Challenge Tracks (Choose One)

**Track A: Social Media & Influencer Authenticity**

- Problem: Detection of "AI-wash" filters and fully synthetic personas.
- Data Focus: Portraits, lifestyle photography, and high-fashion edits.
- Goal: Differentiate between "touch-ups" (acceptable) and "identity fabrication" (adversarial).

**Track B: Real Estate & Commercial Integrity**

- Problem: Detecting deceptive virtual staging or AI-generated property photos.
- Data Focus: Interior/exterior architectural photos.
- Goal: Identify where AI has been used to remove power lines, hide cracks in walls, or completely replace furniture in a misleading way.

## Submission Deliverables

1. **Inference Script**: A clean Python script that processes a folder of images.
2. **The "Audit Report"**: For every flagged image, the system must produce a JSON output containing:
   - `authenticity_score`: a float from 0.0 to 1.0
   - `manipulation_type`: e.g., "In-painting", "Full Synthesis", "Filter"
   - `vlm_reasoning`: a 2-sentence explanation of the red flags
3. **Technical Report**: A 3-page summary of the architecture and the "Fusion Strategy" used to combine Module 1 and Module 2.
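A minimal helper that emits the three required audit-report fields could look like this; the function name and example values are invented, only the JSON keys come from the deliverables list.

```python
import json

def make_audit_report(score: float, manipulation_type: str, reasoning: str) -> str:
    """Serialize one flagged image's audit entry with the required keys."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("authenticity_score must be in [0.0, 1.0]")
    return json.dumps(
        {
            "authenticity_score": round(score, 3),
            "manipulation_type": manipulation_type,  # e.g. "In-painting"
            "vlm_reasoning": reasoning,
        },
        indent=2,
    )

print(make_audit_report(
    0.12,
    "Full Synthesis",
    "Shadows contradict the apparent light source. The window reflection "
    "shows a different room layout than the one pictured.",
))
```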

## Evaluation Criteria

- Detection Accuracy (40%): Performance on a hidden test set containing a 50/50 split of real and manipulated images.
- Explainability (30%): How logical and accurate are the VLM's explanations? (Evaluated by human judges.)
- Generalization (20%): Does the model work across different lighting conditions, resolutions, and "unseen" AI generators (e.g., Midjourney v6 vs. Flux)?
- Efficiency (10%): Inference speed and model size.
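The simplest baseline for the "Fusion Strategy" asked for in the technical report is a convex combination of the two module scores; submissions will likely do something smarter (e.g., learned gating), and the weight here is a free design choice, not mandated by the challenge.

```python
def fuse_scores(pixel_score: float, vlm_score: float, w_pixel: float = 0.5) -> float:
    """Convex combination of the Module 1 (pixel) and Module 2 (VLM) scores.
    Both inputs and the output are authenticity scores in [0.0, 1.0]."""
    if not 0.0 <= w_pixel <= 1.0:
        raise ValueError("w_pixel must be in [0, 1]")
    return w_pixel * pixel_score + (1.0 - w_pixel) * vlm_score
```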