ArtifactDetect: Forensic Pixel Detector & VLM Reasoner 🏆

Overview

ArtifactDetect is a dual-stream forensic pipeline designed to detect sophisticated Generative AI manipulations in high-stakes media, specifically focusing on Real Estate & Commercial Integrity (Track B).

Developed for the MenaML Winter School 2026 GenAI Detection Challenge, this system moves beyond black-box detection by grounding its analysis in physical signal processing and semantic reasoning.

Key Performance

Detection Accuracy: 99.13% on the blind test set (229/231 images correctly classified).
Calibration: Optimized decision threshold (0.20) for high-sensitivity detection of "in-the-wild" web-scraped content.
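
As a quick sanity check on the numbers above, the following minimal sketch recomputes the reported accuracy and shows how a 0.20 threshold would be applied. The helper names `accuracy` and `is_flagged` are hypothetical, and the direction of the score (higher means more likely manipulated) is an assumption, since the source does not state it.

```python
def accuracy(correct: int, total: int) -> float:
    """Return accuracy as a percentage."""
    return 100.0 * correct / total


def is_flagged(score: float, threshold: float = 0.20) -> bool:
    """Flag an image when its score meets or exceeds the threshold.

    Assumption: higher scores mean "more likely manipulated", so a low
    threshold such as 0.20 trades some precision for higher sensitivity.
    """
    return score >= threshold


print(f"{accuracy(229, 231):.2f}%")          # 99.13%
print(is_flagged(0.35), is_flagged(0.10))    # True False
```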

🧠 System Architecture

Our solution employs a Dual-Stream Fusion Strategy to ensure both technical accuracy and logical explainability:

Module 1: Forensic Signal Detector (Pixel-Level)

Backbone: EfficientNet-B0 (Pretrained) for robust feature extraction.
Forensic Extractors: Custom layers that analyze:
- Frequency Domain: FFT fingerprinting to detect GAN grid artifacts.
- Noise Residuals: High-pass filtering to identify splicing anomalies.
- Texture Consistency: Gradient analysis for unnatural "waxy" smoothing.
Output: Binary Authenticity Score + Specific Manipulation Technique (12 classes).
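
The three forensic cues listed above can be illustrated with plain NumPy. This is only a sketch of the underlying signal-processing ideas, not the trained custom layers. The function names are hypothetical, and the 3x3 box blur used for the high-pass step is an assumed stand-in for whatever filter the model actually uses.

```python
import numpy as np


def fft_log_spectrum(gray):
    """Centered log-magnitude spectrum of a 2D grayscale image.

    Periodic grid artifacts appear as sharp peaks away from the center.
    """
    gray = np.asarray(gray, dtype=float)
    spectrum = np.fft.fftshift(np.fft.fft2(gray))
    return np.log1p(np.abs(spectrum))


def noise_residual(gray):
    """High-pass residual: the image minus a 3x3 box-blurred copy."""
    gray = np.asarray(gray, dtype=float)
    h, w = gray.shape
    padded = np.pad(gray, 1, mode="edge")
    # Average the nine shifted views of the padded image.
    blurred = sum(
        padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)
    ) / 9.0
    return gray - blurred


def gradient_energy(gray):
    """Mean gradient magnitude; very low values suggest over-smoothed texture."""
    gy, gx = np.gradient(np.asarray(gray, dtype=float))
    return float(np.mean(np.hypot(gx, gy)))


# Toy input: vertical stripes with a period of 4 pixels on a 16x16 image.
x = np.arange(16)
stripes = 0.5 + 0.1 * np.cos(2 * np.pi * x / 4)[None, :] * np.ones((16, 1))
flat = np.full((16, 16), 0.5)

print(gradient_energy(flat), gradient_energy(stripes) > 0)
```

In a real pipeline these maps would typically be fed to the network as extra input channels or computed inside learnable layers; the sketch only shows what each cue measures.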

Module 2: VLM Logic Reasoner (Semantic-Level)

Model: llava-hf/llava-1.5-7b-hf (Large Language-and-Vision Assistant).
Optimization: 4-bit quantization (BitsAndBytes) for efficient inference on standard GPUs.
Mechanism: Uses Prompt-Guided Injection. The specific technique detected by Module 1 (e.g., "Shadow Mismatch") is injected into the VLM prompt to generate grounded, human-readable explanations.
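
A minimal sketch of the prompt-guided injection step described above. The function name `build_vlm_prompt` and the instruction wording are hypothetical. The `USER: <image>` / `ASSISTANT:` framing follows the conversation format commonly used with LLaVA-1.5 checkpoints, and the exact prompt used by this project may differ.

```python
def build_vlm_prompt(technique: str) -> str:
    """Insert the technique predicted by the pixel-level detector into the VLM prompt."""
    technique = technique.strip()
    if not technique:
        raise ValueError("technique must be a non-empty string")
    return (
        "USER: <image>\n"
        f"A forensic detector flagged this image for: {technique}. "
        "Describe the visible evidence in the image that supports this finding.\n"
        "ASSISTANT:"
    )


print(build_vlm_prompt("Shadow Mismatch"))
```

Passing the detected technique into the prompt narrows what the VLM is asked to explain, which is the grounding effect the mechanism relies on.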

🧪 Data Engineering: "Mathematical Injection"

Unlike standard approaches relying on generative models, we engineered a deterministic mathematical pipeline to create our training data. This ensures precise ground truth without hallucination.

Techniques: We rigorously injected specific artifacts (e.g., quantization noise, affine geometric shifts) into 2,000 authentic images.
Datasets:
- Training Set: genai-manipulation-detection-interior
- Test Set: Forensic-Manipulation-Test-Set
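
To illustrate what a deterministic injection can look like, the sketch below applies two of the artifact types named above to a grayscale array with values in [0, 1]. Both functions, their parameters, and the choice of a circular patch shift as the affine transform are assumptions made for illustration; the actual pipeline is not shown in this document.

```python
import numpy as np


def inject_quantization_noise(img, levels=16):
    """Snap pixel values in [0, 1] to `levels` evenly spaced values."""
    img = np.asarray(img, dtype=float)
    return np.round(img * (levels - 1)) / (levels - 1)


def inject_affine_shift(img, top, left, size, dx):
    """Circularly shift a square patch by `dx` pixels along the horizontal axis.

    Returns the manipulated image and a boolean mask marking the edited
    region, so the ground-truth location is known exactly.
    """
    img = np.asarray(img, dtype=float)
    out = img.copy()
    patch = img[top:top + size, left:left + size]
    out[top:top + size, left:left + size] = np.roll(patch, dx, axis=1)
    mask = np.zeros(img.shape, dtype=bool)
    mask[top:top + size, left:left + size] = True
    return out, mask


rng = np.random.default_rng(0)
image = rng.random((32, 32))
quantized = inject_quantization_noise(image, levels=16)
shifted, mask = inject_affine_shift(image, top=8, left=8, size=8, dx=3)
print(len(np.unique(quantized)) <= 16, int(mask.sum()))
```

Because every step is a fixed function of its inputs, the label and the edited region are known by construction rather than inferred afterwards.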

🛠️ Installation & Usage

1. Requirements

pip install -r requirements.txt

Key dependencies: torch, transformers, timm, opencv-python, huggingface_hub, bitsandbytes

2. Run Inference

Our predict.py script is fully self-contained. It automatically downloads our custom weights and the test dataset from Hugging Face.

python predict.py

Output: A submission.json file containing:
- authenticity_score: (0.0 - 1.0)
- manipulation_type: Specific technique (e.g., "Shadow Mismatch")
- vlm_reasoning: Explanation (e.g., "Shadows from the chair point left, while window light suggests they should point right.")
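
For orientation, here is a sketch of how one entry with the three fields above could be built and serialized. The helper `make_entry` is hypothetical, and the top-level layout of submission.json (a list of entries here) is an assumption, since only the field names are documented.

```python
import json


def make_entry(authenticity_score: float, manipulation_type: str, vlm_reasoning: str) -> dict:
    """Build one result entry, checking that the score lies in [0.0, 1.0]."""
    if not 0.0 <= authenticity_score <= 1.0:
        raise ValueError("authenticity_score must be between 0.0 and 1.0")
    return {
        "authenticity_score": authenticity_score,
        "manipulation_type": manipulation_type,
        "vlm_reasoning": vlm_reasoning,
    }


entry = make_entry(
    0.12,
    "Shadow Mismatch",
    "Shadows from the chair point left, while window light suggests they should point right.",
)
# Assumed layout: a JSON list with one entry per image.
print(json.dumps([entry], indent=2))
```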

🔗 Resources

All assets are publicly hosted on Hugging Face for reproducibility:

Model Weights: FatimahEmadEldin/Forensic-Pixel-Detector
Combined Weights File: best_combined_model.pth

👥 Team Details

Team Name: ArtifactDetect
Track: B (Real Estate & Commercial Integrity)

Fatimah Emad Eldin (Fatemah.it@gmail.com)
Mohammed Mustafa Bremoo (mohabremoo@gmail.com)
Abdulellah Mojalled (abdulellah.mazen@gmail.com)
