metadata
title: Real Estate Manipulation Detector
emoji: 🏠
colorFrom: blue
colorTo: indigo
sdk: static
pinned: false
tags:
  - computer-vision
  - forensics
  - real-estate
  - blip
  - resnet
license: mit

🕵️‍♂️ Real Estate Manipulation Detector (Hybrid VLM-CNN)

Team Name: Lina Alkhatib
Track: Track B (Real Estate)
Date: January 28, 2026

1. Executive Summary

This project implements an automated forensic system that detects and explains digital manipulations in real estate imagery. To address the problem of "fake listings," our solution employs a Hybrid Vision-Language Architecture. It combines the high-speed pattern recognition of a Convolutional Neural Network (ResNet-18) with the semantic reasoning capabilities of a Vision-Language Model (BLIP), so that each detection comes with a human-readable explanation.

2. System Architecture

The system operates on a Serial Cascading Pipeline, utilizing two distinct modules:

Module 1: The Detector (Quantitative Analysis)

  • Architecture: ResNet-18 (Residual Neural Network).
  • Role: Rapid screening of each image and classification of the manipulation type.
  • Classes: Real, Fake_AI, Fake_Splice.
  • Output: An Authenticity Score (0.0 - 1.0) and a predicted class label.
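One way to turn the detector's three class logits into the outputs listed above is sketched below. It rests on assumptions this README does not state: that the Authenticity Score is the softmax probability of the Real class (so higher means more authentic), and that the classes are ordered as in `CLASSES`. The function names are hypothetical and not taken from predict.py.

```python
import math

# Assumed class order; the README lists these labels but not their indices.
CLASSES = ["Real", "Fake_AI", "Fake_Splice"]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def score_and_label(logits):
    """Return (authenticity_score, predicted_label) from raw class logits.

    Assumption: the authenticity score is P(Real), which lies in [0.0, 1.0].
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    authenticity = probs[CLASSES.index("Real")]
    return authenticity, CLASSES[best]
```

With this convention, an image whose largest logit belongs to Fake_AI or Fake_Splice gets a score below 0.5.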

Module 2: The Reasoner (Qualitative Forensics)

  • Architecture: BLIP (Visual Question Answering).
  • Role: Semantic analysis and report generation.
  • Mechanism: The model answers specific physics-based questions about shadows, lighting, and floating objects; the answers are used to generate a forensic report.
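The question-answering step might look roughly like the sketch below. The `Salesforce/blip-vqa-base` checkpoint, the `QUESTIONS` keys, and both function names are assumptions for illustration; this README does not say which BLIP weights or prompt set predict.py uses. The model is loaded lazily, so the interrogation loop can be exercised with any `ask(question) -> answer` callable.

```python
from typing import Callable, Dict

# The two prompts quoted in this README; the dictionary keys are hypothetical.
QUESTIONS = {
    "shadow": "Does the object cast a shadow?",
    "lighting": "Is lighting consistent?",
}


def make_blip_asker(image_path: str,
                    checkpoint: str = "Salesforce/blip-vqa-base") -> Callable[[str], str]:
    """Build ask(question) -> answer backed by a BLIP VQA model.

    Assumes `transformers`, `torch`, and `Pillow` are installed; the
    checkpoint name is an assumption, not confirmed by this repository.
    """
    from PIL import Image
    from transformers import BlipForQuestionAnswering, BlipProcessor

    processor = BlipProcessor.from_pretrained(checkpoint)
    model = BlipForQuestionAnswering.from_pretrained(checkpoint)
    image = Image.open(image_path).convert("RGB")

    def ask(question: str) -> str:
        inputs = processor(image, question, return_tensors="pt")
        output_ids = model.generate(**inputs)
        return processor.decode(output_ids[0], skip_special_tokens=True)

    return ask


def interrogate(ask: Callable[[str], str],
                questions: Dict[str, str] = QUESTIONS) -> Dict[str, str]:
    """Ask every question and return normalized (lowercase, stripped) answers."""
    return {key: ask(text).strip().lower() for key, text in questions.items()}
```

A call such as `interrogate(make_blip_asker("listing.jpg"))` would return a dictionary of short answers keyed by check name; `listing.jpg` is a placeholder path.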

3. The "Fusion Strategy"

We use a Conditional Logic Fusion Strategy:

  1. The image is passed through ResNet-18.
  2. If the image is flagged as fake, it is passed to BLIP.
  3. BLIP is "interrogated" with targeted prompts (e.g., "Does the object cast a shadow?", "Is lighting consistent?").
  4. A logic layer synthesizes the answers into a final text report (e.g., "Manipulation detected: the chair lacks a grounded contact shadow").
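The conditional cascade above can be sketched as follows. This is a minimal illustration, not the logic in predict.py: `detect` and `reason` are hypothetical callables standing in for the ResNet-18 and BLIP stages, the answer keys `shadow` and `lighting` are assumed, and the report wording is invented for the example.

```python
from typing import Callable, Dict, Optional, Tuple


def synthesize_report(label: str, answers: Dict[str, str]) -> str:
    """Turn yes/no answers into a one-line report (illustrative rules only)."""
    findings = []
    if answers.get("shadow") == "no":
        findings.append("an object casts no shadow")
    if answers.get("lighting") == "no":
        findings.append("the lighting is inconsistent")
    if not findings:
        return f"Manipulation suspected ({label}): no specific physical inconsistency identified."
    return f"Manipulation detected ({label}): " + "; ".join(findings) + "."


def analyze(image,
            detect: Callable[[object], Tuple[str, float]],
            reason: Callable[[object], Dict[str, str]]) -> Dict[str, Optional[object]]:
    """Run the detector first; call the reasoner only for images flagged as fake."""
    label, score = detect(image)
    if label == "Real":
        # The reasoning stage is skipped entirely for images judged authentic.
        return {"label": label, "score": score, "report": None}
    answers = reason(image)
    return {"label": label, "score": score, "report": synthesize_report(label, answers)}
```

Gating the reasoner behind the detector means the slower vision-language stage runs only on the subset of images that need an explanation.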

4. How to Run

  1. Clone this repository.
  2. Install dependencies: pip install -r requirements.txt
  3. Run the inference script:
    python predict.py --input_dir ./test_images --output_file submission.json --model_path detector_model.pth
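For readers who want to see how the three flags in this command could be parsed, here is a hypothetical skeleton using the standard-library `argparse` module. Only the flag names come from the command above; the defaults, help strings, and the `build_parser` name are assumptions and may differ from the actual predict.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser for the three flags shown in the run command (defaults are assumed)."""
    parser = argparse.ArgumentParser(
        description="Run the manipulation detector on a folder of images.")
    parser.add_argument("--input_dir", required=True,
                        help="Directory containing the images to analyze.")
    parser.add_argument("--output_file", default="submission.json",
                        help="Path of the JSON file to write results to.")
    parser.add_argument("--model_path", default="detector_model.pth",
                        help="Path to the trained ResNet-18 weights.")
    return parser


# Parse an explicit argument list instead of sys.argv, for demonstration.
args = build_parser().parse_args(["--input_dir", "./test_images"])
```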
    

5. Files in this Repo

  • predict.py: The main inference script.
  • detector_model.pth: The trained ResNet-18 weights.
  • requirements.txt: Python dependencies.