Multimodal Fake Review Detector
Detect AI-generated fake reviews using text-image contrastive learning.
Model Description
This model analyzes both review text and associated product images to detect fake reviews. It leverages:
- BERT for text encoding (contextual semantics)
- CLIP Vision for image encoding (visual features)
- Contrastive Learning for text-image alignment scoring
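As one illustration of the contrastive objective listed above, here is a minimal, dependency-free sketch of a symmetric InfoNCE loss, the formulation popularized by CLIP. The model card does not say which contrastive loss is used, so the loss form, the `temperature` value, and every function name below are assumptions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def cross_entropy(logits, target_index):
    """Negative log-softmax of the target entry (numerically stable)."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum_exp - logits[target_index]

def symmetric_info_nce(text_embs, image_embs, temperature=0.07):
    """Average of text-to-image and image-to-text InfoNCE over a batch.

    Row i of text_embs is assumed to be paired with row i of image_embs;
    every other row in the batch acts as a negative example.
    """
    n = len(text_embs)
    # Temperature-scaled similarity matrix: sims[i][j] = sim(text i, image j).
    sims = [[cosine_similarity(t, v) / temperature for v in image_embs]
            for t in text_embs]
    text_to_image = sum(cross_entropy(sims[i], i) for i in range(n)) / n
    image_to_text = sum(
        cross_entropy([sims[r][i] for r in range(n)], i) for i in range(n)
    ) / n
    return (text_to_image + image_to_text) / 2
```

Under this formulation, a batch whose text and image embeddings line up pair by pair yields a small loss, while mismatched pairs are penalized.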
Key Insight
Authentic reviews tend to show higher semantic consistency between the review text and the product image. AI-generated fake reviews often show lower text-image alignment because their text and images are generated independently.
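To make the idea concrete, the sketch below scores text-image consistency as the cosine similarity of two embedding vectors. The 3-dimensional toy vectors stand in for the model's projected embeddings and are purely illustrative; `consistency_score` is a hypothetical name, not a function from this repository.

```python
import math

def consistency_score(text_emb, image_emb):
    """Cosine similarity in [-1, 1]; returns 0.0 if either vector is all zeros."""
    dot = sum(t * v for t, v in zip(text_emb, image_emb))
    norm = (math.sqrt(sum(t * t for t in text_emb))
            * math.sqrt(sum(v * v for v in image_emb)))
    return dot / norm if norm else 0.0

# Toy vectors, not real model outputs.
text_emb = [0.9, 0.1, 0.2]
matching_image = [0.8, 0.2, 0.1]     # points in a similar direction
unrelated_image = [-0.1, 0.9, -0.3]  # points elsewhere

print(consistency_score(text_emb, matching_image))
print(consistency_score(text_emb, unrelated_image))
```

The matching pair scores higher than the unrelated pair, which is the signal the classifier is meant to exploit.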
Performance
| Metric | Score |
|---|---|
| Accuracy | 91.2% |
| F1 Score | 91.0% |
| Precision | 91.5% |
| Recall | 91.2% |
| AUC-ROC | 96.2% |
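For reference, the threshold-based metrics in the table follow the standard binary-classification definitions. The sketch below computes them from label lists, treating Fake (label `1`) as the positive class. That choice is an assumption: the card does not say which class is positive or how the scores are averaged. The labels are toy data, not the model's evaluation set.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Toy labels: 1 = Fake, 0 = Authentic.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
print(classification_metrics(y_true, y_pred))
```

AUC-ROC is not covered here because it is computed from the predicted scores across all thresholds rather than from hard labels.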
Dataset
Trained on the AiGen-FoodReview dataset:
- 20,144 review-image pairs
- 50% authentic (real Yelp/TripAdvisor reviews)
- 50% fake (GPT-4-Turbo text + DALL-E-2 images)
Usage
Web Interface
Simply upload an image and enter review text to get:
- Prediction: Authentic or Fake
- Confidence Score: Model certainty
- Consistency Score: Text-image alignment measure
- Interpretation: Human-readable explanation
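A minimal sketch of how these four fields might be derived from a two-class logit vector and a cosine-based consistency score. The function name, the dictionary keys, the use of `[Authentic, Fake]` as the logit order, and the `0.5` consistency cutoff are all assumptions for illustration; the card does not document the app's actual return format.

```python
import math

LABELS = ["Authentic", "Fake"]  # assumed logit order

def summarize_prediction(logits, consistency, cutoff=0.5):
    """Turn raw logits and a consistency score into a readable result."""
    # Softmax with the max subtracted for numerical stability.
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    best = max(range(len(probs)), key=probs.__getitem__)
    alignment = "high" if consistency >= cutoff else "low"
    explanation = (
        f"Text-image alignment is {alignment} ({consistency:.2f}); "
        f"the review is classified as {LABELS[best]}."
    )
    return {
        "prediction": LABELS[best],
        "confidence": probs[best],
        "consistency": consistency,
        "interpretation": explanation,
    }

print(summarize_prediction([0.2, 2.0], consistency=0.31))
```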
Python API
from app import load_model, predict
from PIL import Image
# Load model
load_model()
# Predict
image = Image.open("food.jpg")
result = predict("Amazing pizza with perfect crust!", image)
print(result)
Architecture
Input: (Review Text, Product Image)
     │             │
     ▼             ▼
┌─────────┐   ┌──────────┐
│  BERT   │   │   CLIP   │
│ Encoder │   │  Vision  │
└────┬────┘   └────┬─────┘
     │             │
┌────┴────┐   ┌────┴─────┐
│ 768-dim │   │ 768-dim  │
└────┬────┘   └────┬─────┘
     │             │
┌────┴────┐   ┌────┴─────┐
│Text Proj│   │Image Proj│
│  Head   │   │   Head   │
└────┬────┘   └────┬─────┘
     │             │
     ▼             ▼
 ┌─────────────────────┐
 │  Consistency Score  │ ← Cosine Similarity
 │  (Contrastive Loss) │
 └─────────────────────┘
            │
            ▼
 ┌─────────────────────┐
 │    Concatenation    │
 │     (1536-dim)      │
 └──────────┬──────────┘
            │
 ┌──────────▼──────────┐
 │   MLP Classifier    │
 │   512 → 256 → 2     │
 └──────────┬──────────┘
            │
            ▼
Output: [Authentic, Fake]
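The data flow in the diagram can be sketched as a forward pass. This is a dependency-free illustration with random weights, not the trained model: a real implementation would presumably use a tensor library. The projection output size of 768 (chosen so the concatenation matches the 1536-dim figure), the ReLU activations, and all names are assumptions. The encoder outputs are taken as given 768-dim feature vectors.

```python
import math
import random

def init_linear(in_dim, out_dim, rng):
    """Random weights (out_dim x in_dim) and zero biases for one dense layer."""
    bound = 1.0 / math.sqrt(in_dim)
    weights = [[rng.uniform(-bound, bound) for _ in range(in_dim)]
               for _ in range(out_dim)]
    return weights, [0.0] * out_dim

def linear(x, layer):
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def relu(x):
    return [max(0.0, v) for v in x]

def cosine(a, b):
    norm = math.sqrt(sum(v * v for v in a)) * math.sqrt(sum(v * v for v in b))
    return sum(p * q for p, q in zip(a, b)) / norm if norm else 0.0

class FusionSketch:
    """Projection heads, consistency score, concatenation, MLP classifier."""

    def __init__(self, enc_dim=768, proj_dim=768, hidden=(512, 256),
                 num_classes=2, seed=0):
        rng = random.Random(seed)
        self.text_proj = init_linear(enc_dim, proj_dim, rng)
        self.image_proj = init_linear(enc_dim, proj_dim, rng)
        # MLP layer sizes: 2 * proj_dim -> 512 -> 256 -> 2 with the defaults.
        dims = [2 * proj_dim, *hidden, num_classes]
        self.mlp = [init_linear(i, o, rng) for i, o in zip(dims, dims[1:])]

    def forward(self, text_feat, image_feat):
        text_emb = linear(text_feat, self.text_proj)
        image_emb = linear(image_feat, self.image_proj)
        consistency = cosine(text_emb, image_emb)
        hidden = text_emb + image_emb  # list concatenation of both embeddings
        for layer in self.mlp[:-1]:
            hidden = relu(linear(hidden, layer))
        logits = linear(hidden, self.mlp[-1])  # order: [Authentic, Fake]
        return logits, consistency

# Small dimensions keep the pure-Python demo fast; defaults mirror the diagram.
model = FusionSketch(enc_dim=8, proj_dim=8, hidden=(6, 4))
logits, consistency = model.forward([0.1] * 8, [0.2] * 8)
print(len(logits), consistency)
```

Whether the consistency score is also fed to the classifier is not stated in the diagram; the 1536-dim concatenation suggests only the two embeddings are, which is what this sketch does.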
Citation
If you use this model, please cite:
@article{multimodal_fake_review_2026,
title={Multimodal Fake Review Detection Using Contrastive Learning:
Leveraging Text-Image Alignment for AI-Generated Content Identification},
author={Your Name},
year={2026}
}
License
MIT License
Acknowledgments
- AiGen-FoodReview Dataset: Hugging Face
- BERT: Google Research
- CLIP: OpenAI