Real Estate Manipulation Detector (Hybrid VLM-CNN)
Team Name: Lina Alkhatib
Track: Track B (Real Estate)
Date: January 28, 2026
1. Executive Summary
This project implements an automated forensic system that detects and explains digital manipulations in real estate imagery. To address the problem of "fake listings," our solution uses a Hybrid Vision-Language Architecture: a Convolutional Neural Network (ResNet-18) provides high-speed pattern recognition, and a Vision-Language Model (BLIP) provides semantic reasoning, so the system pairs accurate detection with human-readable explanations.
2. System Architecture
The system operates on a Serial Cascading Pipeline, utilizing two distinct modules:
Module 1: The Detector (Quantitative Analysis)
- Architecture: ResNet-18 (Residual Neural Network).
- Role: Rapid binary classification and manipulation type scoring.
- Classes: Real, Fake_AI, Fake_Splice.
- Output: An Authenticity Score (0.0 - 1.0) and a predicted class label.
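As a rough illustration of how a three-class output could be turned into the score and label described above, the sketch below applies a softmax to raw logits. It assumes the Authenticity Score is the probability assigned to the Real class, which this document does not specify; the function name score_from_logits and the class ordering are likewise hypothetical.

```python
import math

# Class order is an assumption; the trained model's index mapping may differ.
CLASSES = ["Real", "Fake_AI", "Fake_Splice"]

def score_from_logits(logits):
    """Convert raw 3-class logits into (authenticity_score, predicted_label)."""
    # Subtract the max logit for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Assumption: authenticity score = probability of the Real class.
    authenticity = probs[CLASSES.index("Real")]
    label = CLASSES[probs.index(max(probs))]
    return authenticity, label

score, label = score_from_logits([0.2, 2.5, 0.1])
print(f"{label} (authenticity {score:.2f})")
```

Under this assumption, a score near 1.0 means the classifier considers the image authentic, and a score near 0.0 means it favors one of the two fake classes.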
Module 2: The Reasoner (Qualitative Forensics)
- Architecture: BLIP (Visual Question Answering).
- Role: Semantic analysis and report generation.
- Mechanism: The model answers specific physics-based questions about shadows, lighting, and floating objects; the answers are used to generate a forensic report.
3. The "Fusion Strategy"
We use a Conditional Logic Fusion Strategy:
- Step 1: The image is passed through ResNet-18.
- Step 2: If flagged as Fake, the image is passed to BLIP.
- Step 3: BLIP is "interrogated" with targeted prompts ("Does the object cast a shadow?", "Is lighting consistent?").
- Step 4: A logic layer synthesizes the answers into a final text report (e.g., "Manipulation detected: the chair lacks a grounded contact shadow").
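The four steps above can be sketched as a small function. This is a minimal illustration, not the project's actual logic layer: the detector and the VQA model are passed in as plain callables (detect and ask, both hypothetical names), and the finding phrases and the rule "an answer starting with 'no' indicates a problem" are assumptions made for the example.

```python
from typing import Callable, Dict, Tuple

# The two prompts quoted in Step 3, keyed by a hypothetical finding name.
QUESTIONS: Dict[str, str] = {
    "shadow": "Does the object cast a shadow?",
    "lighting": "Is lighting consistent?",
}

# Hypothetical report phrases used when the reasoner answers "no".
FINDINGS: Dict[str, str] = {
    "shadow": "an object lacks a cast shadow",
    "lighting": "lighting is inconsistent",
}

def analyze(image,
            detect: Callable[[object], Tuple[float, str]],
            ask: Callable[[object, str], str]) -> dict:
    """Run the detector, and query the reasoner only when the image is flagged."""
    score, label = detect(image)  # Step 1
    result = {"label": label, "authenticity_score": score, "report": ""}
    if not label.startswith("Fake"):  # Step 2: images judged Real skip the reasoner
        return result
    # Step 3: ask each targeted question and normalize the answers.
    answers = {k: ask(image, q).strip().lower() for k, q in QUESTIONS.items()}
    issues = [FINDINGS[k] for k, a in answers.items() if a.startswith("no")]
    # Step 4: synthesize the answers into a short text report.
    if issues:
        result["report"] = "Manipulation detected: " + "; ".join(issues) + "."
    else:
        result["report"] = "Manipulation flagged by the detector; no physical inconsistency reported."
    return result

# Usage with stub models standing in for ResNet-18 and BLIP.
stub_detect = lambda img: (0.08, "Fake_Splice")
stub_ask = lambda img, q: "no" if "shadow" in q else "yes"
print(analyze("listing.jpg", stub_detect, stub_ask)["report"])
# prints "Manipulation detected: an object lacks a cast shadow."
```

Injecting the two models as callables keeps the conditional logic testable without loading any weights, and it makes the cost saving explicit: the slower reasoner is never called for images the detector labels Real.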
4. How to Run
- Clone this repository.
- Install dependencies: pip install -r requirements.txt
- Run the inference script: python predict.py --input_dir ./test_images --output_file submission.json --model_path detector_model.pth
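For readers who want to see how the three flags in the command above might be parsed, here is a minimal argparse sketch. Only the flag names come from this document; the help strings, the defaults, and the choice to make --input_dir required are assumptions, and the actual predict.py may differ.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the three flags used in the run command."""
    parser = argparse.ArgumentParser(
        description="Run manipulation detection on a folder of images."
    )
    # Required/default choices below are assumptions for illustration.
    parser.add_argument("--input_dir", required=True,
                        help="Folder of images to analyze")
    parser.add_argument("--output_file", default="submission.json",
                        help="Path of the results file to write")
    parser.add_argument("--model_path", default="detector_model.pth",
                        help="Path to the trained ResNet-18 weights")
    return parser

# Parse the same arguments as the run command, passed explicitly for illustration.
args = build_parser().parse_args([
    "--input_dir", "./test_images",
    "--output_file", "submission.json",
    "--model_path", "detector_model.pth",
])
print(args.input_dir, args.output_file, args.model_path)
```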
5. Files in this Repo
- predict.py: The main inference script.
- detector_model.pth: The trained ResNet-18 weights.
- requirements.txt: Python dependencies.