title: FactSight
emoji: 🚀
colorFrom: blue
colorTo: purple
sdk: docker

FactSight: Advanced Fake News Detection System

Python 3.11+ | Flask | PyTorch | License: MIT

🎯 Overview

FactSight is a comprehensive fake news detection system that combines multiple AI/ML technologies to analyze news content for authenticity. The system performs multi-modal analysis using both text and image content to determine if news articles are genuine or fabricated.

Key Features

  • 🔍 Multi-Modal Analysis: Combines text and image analysis for comprehensive fact-checking
  • 🤖 AI Content Detection: Identifies AI-generated text and images
  • 🧠 Deep Learning Models: Uses state-of-the-art neural networks for classification
  • 🔗 External Fact-Checking: Integrates with Google Fact Check Tools API
  • 📊 Emotion Analysis: Detects sensationalism in news content
  • 👀 Face Analysis: Analyzes faces in images for deepfake detection
  • 🌐 Web Interface: User-friendly web application for easy analysis

🏗️ Architecture

┌─────────────────────────────────────────────────────────────────┐
│                        FactSight System                         │
├─────────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐              │
│  │   Flask     │  │   Core      │  │  Services   │              │
│  │   Web App   │  │  Manager    │  │ & Models    │              │
│  └─────────────┘  └─────────────┘  └─────────────┘              │
├─────────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐              │
│  │   Text      │  │   Image     │  │   Fact      │              │
│  │  Analysis   │  │  Analysis   │  │  Checking   │              │
│  └─────────────┘  └─────────────┘  └─────────────┘              │
├─────────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐              │
│  │   LSTM      │  │ EfficientNet│  │   Google    │              │
│  │ Classifier  │  │    B3       │  │ Fact Check  │              │
│  └─────────────┘  └─────────────┘  └─────────────┘              │
└─────────────────────────────────────────────────────────────────┘

🚀 Installation

Prerequisites

  • Python 3.11+
  • CUDA-compatible GPU (recommended for faster inference)
  • 8GB+ RAM
  • 4GB+ free disk space

Setup Instructions

  1. Clone the repository

    git clone <repository-url>
    cd FactSight
    
  2. Install dependencies

    pip install -r requirements.txt
    
  3. Download required models from Hugging Face (if not included)

    # Models will be automatically downloaded on first run
    # or placed in the models/ directory
    
  4. Configure environment

    # Set your Google Fact Check API key in config.py
    FACT_API_KEY = "your-api-key-here"
    
  5. Run the application

    python app.py
    
  6. Access the web interface

    Open http://localhost:5000 in your browser
    

📖 Usage

Web Interface

  1. Navigate to the homepage (/)
  2. Paste news text in the text area (max 10,000 characters)
  3. Upload images related to the news (drag & drop or browse)
  4. Click "Analyze" to start the fact-checking process
  5. View detailed results including:
    • Overall authenticity score
    • Text analysis breakdown
    • Image analysis results
    • Fact-checking verification
    • Emotion analysis

API Usage

import requests

# Submit for analysis
response = requests.post('http://localhost:5000/analyze', json={
    'text': 'Your news article text here...',
    'images': [
        {'data': 'base64-encoded-image-data'}
    ]
})

analysis_id = response.json()['analysis_id']

# Get results
results = requests.get(f'http://localhost:5000/analysis/{analysis_id}')
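
The request body above expects each image as a base64 string. A minimal sketch of building that body from raw image bytes follows. The helper name `build_analyze_payload` is hypothetical, and the sketch assumes the server accepts plain base64 without a data-URI prefix, which this README does not confirm.

```python
import base64


def build_analyze_payload(text: str, image_blobs: list[bytes]) -> dict:
    """Build the JSON body for POST /analyze from text and raw image bytes."""
    return {
        "text": text,
        "images": [
            # Assumption: plain base64 under the 'data' key, no data-URI prefix.
            {"data": base64.b64encode(blob).decode("ascii")}
            for blob in image_blobs
        ],
    }


# Placeholder bytes stand in for the contents of a real image file.
payload = build_analyze_payload("Your news article text here...", [b"\x89PNG placeholder"])
```

The resulting dictionary can be passed as the `json=` argument of `requests.post`.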

🔬 Services & Models Deep Dive

Text Analysis Services

1. AI Text Detection (ai_text_service.py)

  • Model: Custom-trained classifier [placeholder for url]
  • Architecture: Joblib-based scikit-learn model
  • Performance: 99% F1-Score
  • Purpose: Distinguishes between AI-generated and human-written text
  • Features: Text preprocessing, feature extraction, binary classification

2. Fake News Classification (fake_text_news_service.py)

  • Model: Custom LSTM Neural Network [placeholder for url]
  • Architecture:
    • Embedding layer (100 dimensions)
    • Bidirectional LSTM (128 hidden units)
    • Dropout regularization (0.5)
    • Sigmoid output layer
  • Performance: 99% F1-Score
  • Purpose: Classifies news as fake or real
  • Features:
    • Text cleaning and preprocessing
    • Contraction expansion
    • Stopword removal
    • Sequence padding/truncation (300 tokens)
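
The preprocessing steps listed above can be sketched in plain Python. This is an illustration only, not the code in fake_text_news_service.py: the lookup tables are deliberately tiny, and the padding index, unknown-token index, and function names are assumptions.

```python
import re

# Hypothetical, deliberately small tables; a real pipeline would use fuller lists.
CONTRACTIONS = {"can't": "cannot", "won't": "will not", "it's": "it is"}
STOPWORDS = {"the", "a", "an", "is", "it", "of", "to"}
MAX_LEN = 300  # sequence length stated in the feature list
PAD_ID = 0     # assumed padding index
UNK_ID = 1     # assumed out-of-vocabulary index


def preprocess(text: str) -> list[str]:
    """Lowercase, expand contractions, strip punctuation, drop stopwords."""
    text = text.lower()
    for short, full in CONTRACTIONS.items():
        text = text.replace(short, full)
    tokens = re.findall(r"[a-z0-9]+", text)
    return [t for t in tokens if t not in STOPWORDS]


def encode(tokens: list[str], vocab: dict[str, int], max_len: int = MAX_LEN) -> list[int]:
    """Map tokens to ids, then truncate or right-pad to exactly max_len."""
    ids = [vocab.get(t, UNK_ID) for t in tokens][:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))


vocab = {"senate": 2, "passes": 3, "bill": 4}  # toy vocabulary
tokens = preprocess("It's official: the Senate passes a bill!")
ids = encode(tokens, vocab)
```

Fixing every sequence at 300 ids lets inputs be batched into a single tensor before they reach the embedding layer.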

3. Emotion Analysis (text_emotion_service.py)

  • Model: j-hartmann/emotion-english-distilroberta-base
  • Architecture: DistilRoBERTa transformer model
  • Purpose: Detects emotional tone (anger, fear, joy, etc.)
  • Features:
    • Chunked processing for long texts
    • Confidence-weighted aggregation
    • Sensationalism detection
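
Chunked processing with confidence-weighted aggregation could look roughly like the sketch below. The per-chunk predictions are stubbed as `(label, score)` pairs rather than produced by the model, and the chunk size, the set of labels treated as sensational, and all function names are assumptions.

```python
from collections import defaultdict

# Assumption: which emotion labels count as "sensational" is chosen here for illustration.
SENSATIONAL_LABELS = {"anger", "fear", "disgust", "surprise"}


def chunk_words(text: str, size: int = 200) -> list[str]:
    """Split text into consecutive chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def aggregate(chunk_predictions: list[tuple[str, float]]) -> dict[str, float]:
    """Sum each label's confidence across chunks and normalise to a distribution."""
    totals: dict[str, float] = defaultdict(float)
    for label, score in chunk_predictions:
        totals[label] += score
    grand_total = sum(totals.values()) or 1.0  # guard against all-zero scores
    return {label: value / grand_total for label, value in totals.items()}


def sensationalism_score(distribution: dict[str, float]) -> float:
    """Share of the aggregated confidence that falls on sensational labels."""
    return sum(v for k, v in distribution.items() if k in SENSATIONAL_LABELS)


# Stubbed top predictions for three chunks of one article.
predictions = [("fear", 0.9), ("joy", 0.6), ("fear", 0.5)]
distribution = aggregate(predictions)
score = sensationalism_score(distribution)
```

Weighting by confidence means a chunk the classifier is unsure about contributes less to the final distribution than a confident one.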

4. Search Query Extraction (search_queries_service.py)

  • Model: transformers pipeline for question generation
  • Purpose: Extracts key claims from text for fact-checking
  • Features: NLP-based query generation

Image Analysis Services

1. AI Image Detection (ai_image_service.py)

  • Model: Custom EfficientNet-B3 [placeholder for url]
  • Architecture:
    • EfficientNet-B3 backbone
    • Custom classification head
    • Binary classification (AI vs Real)
  • Performance: 99% F1-Score
  • Purpose: Identifies AI-generated vs real images
  • Features:
    • 300x300 input resolution
    • Batch normalization
    • GPU acceleration

2. Face Detection (face_detection_service.py)

  • Model: SCRFD (pre-trained ONNX model)
  • Purpose: Detects and locates faces in images
  • Features:
    • Multi-scale face detection
    • Landmark extraction (eyes, nose, mouth)
    • Confidence scoring
    • NMS (Non-Maximum Suppression)
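
Non-Maximum Suppression keeps the highest-scoring box and discards lower-scoring boxes that overlap it too much. A minimal pure-Python sketch of the greedy variant is shown below; the `(x1, y1, x2, y2)` box format and the 0.4 IoU threshold are assumptions, not values taken from face_detection_service.py.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.4):
    """Return indices of the boxes to keep, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any already-kept box too much.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Two near-duplicate detections of one face plus a separate face.
boxes = [(10, 10, 110, 110), (12, 12, 112, 112), (200, 200, 260, 260)]
scores = [0.95, 0.80, 0.70]
kept = nms(boxes, scores)
```

Here the second box is suppressed because it overlaps the first almost entirely, while the third is kept because it does not overlap either.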

3. Deepfake Detection (deepfake_service.py)

  • Model: Meso4 architecture (two variants)
    • Meso4_DF.h5: Trained on DeepFake dataset
    • Meso4_F2F.h5: Trained on Face2Face dataset
  • Architecture:
    • 4 convolutional layers
    • Batch normalization
    • Max pooling
    • Dropout (0.5)
    • Sigmoid classification
  • Purpose: Detects manipulated faces in images
  • Features:
    • Dual-model ensemble
    • Face cropping integration
    • Confidence threshold adjustment
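
One simple way to combine the two Meso4 variants is to average their scores per cropped face and compare the result with a threshold. The sketch below assumes both models output a probability that the face is fake and that an unweighted mean is used; the README does not state either, and the function names are hypothetical.

```python
def ensemble_score(score_df: float, score_f2f: float) -> float:
    """Unweighted mean of the two model outputs (assumed to be P(fake) in [0, 1])."""
    return (score_df + score_f2f) / 2.0


def image_has_deepfake(face_scores, threshold: float = 0.5) -> bool:
    """face_scores holds one (score_df, score_f2f) pair per cropped face.

    The image is flagged if any single face reaches the threshold.
    """
    return any(ensemble_score(df, f2f) >= threshold for df, f2f in face_scores)


# Two faces: the first looks genuine to both models, the second does not.
flagged = image_has_deepfake([(0.10, 0.20), (0.90, 0.70)])
clean = image_has_deepfake([(0.10, 0.20), (0.30, 0.40)])
```

Raising `threshold` trades missed manipulations for fewer false alarms, which is one reading of the "confidence threshold adjustment" feature.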

Fact-Checking Integration

Fact Check Service (fact_search_service.py)

  • API: Google Fact Check Tools API v1alpha1
  • Purpose: External verification of claims
  • Features:
    • Multi-source fact-checking
    • Verdict aggregation
    • Confidence scoring
    • Source credibility assessment
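
A minimal sketch of querying the `claims:search` method of the Fact Check Tools API using only the standard library. The helper names are hypothetical, and it is an assumption that fact_search_service.py reads these particular response fields; verdict aggregation and credibility scoring are not shown.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

FACT_CHECK_URL = "https://factchecktools.googleapis.com/v1alpha1/claims:search"


def build_search_url(query: str, api_key: str, language: str = "en") -> str:
    """Build the claims:search URL for one extracted claim."""
    params = {"query": query, "languageCode": language, "key": api_key}
    return f"{FACT_CHECK_URL}?{urlencode(params)}"


def extract_verdicts(response: dict) -> list[dict]:
    """Flatten the response into one record per published review."""
    verdicts = []
    for claim in response.get("claims", []):
        for review in claim.get("claimReview", []):
            verdicts.append({
                "claim": claim.get("text", ""),
                "publisher": review.get("publisher", {}).get("name", ""),
                "rating": review.get("textualRating", ""),
                "url": review.get("url", ""),
            })
    return verdicts


def search_claims(query: str, api_key: str) -> list[dict]:
    """Perform the HTTP request; needs network access and a valid API key."""
    with urlopen(build_search_url(query, api_key), timeout=10) as resp:
        return extract_verdicts(json.load(resp))


# Offline example with a hand-written response in the API's shape.
sample = {"claims": [{"text": "Example claim", "claimReview": [
    {"publisher": {"name": "Example Checker"}, "textualRating": "False",
     "url": "https://example.org/review"}]}]}
verdicts = extract_verdicts(sample)
```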

📊 Model Performance

| Model | Type | F1-Score | Dataset | Purpose |
|---|---|---|---|---|
| AI Text Detector | Custom Classifier | 99% | Custom | AI vs Human Text |
| Fake News LSTM | LSTM Neural Network | 99% | Custom | Fake vs Real News |
| AI Image Detector | EfficientNet-B3 | 99% | Custom | AI vs Real Images |
| Emotion Detector | DistilRoBERTa | N/A | Pre-trained | Emotion Analysis |
| Face Detector | SCRFD | N/A | Pre-trained | Face Detection |
| Deepfake Detector | Meso4 | N/A | Custom | Deepfake Detection |

🎬 Demo Media Files

The demo/ folder contains sample media:

Images

  • demo1.png - Demo 1
  • demo2.png - Demo 2
  • demo3.png - Demo 3
  • demo4.png - Demo 4
  • demo5.png - Demo 5
  • demo6.png - Demo 6
  • demo7.png - Demo 7

⚙️ Configuration

Environment Variables

# config.py
import os

# FlaskConfig and ServiceConfig are assumed to be defined or imported elsewhere in the project.
class Config:
    flask: FlaskConfig = FlaskConfig(
        SECRET_KEY=os.environ.get('SECRET_KEY', 'dev-secret-key'),  # set SECRET_KEY in production
        UPLOAD_FOLDER='static/uploads',
        MAX_CONTENT_LENGTH=50 * 1024 * 1024,  # 50 MB upload limit
    )

    service: ServiceConfig = ServiceConfig(
        FACT_API_KEY="your-google-fact-check-api-key",
        FAKENESS_SCORE_THRESHOLD=0.6,
        FACE_DETECTION_THRESHOLD=0.5,
    )
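
To keep the API key out of source control, the service settings could be read from the environment in the same way `SECRET_KEY` already is. The sketch below is one possible approach, not the project's config.py: `ServiceSettings` is a hypothetical stand-in for `ServiceConfig`, and the environment variable names are assumptions.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceSettings:
    """Hypothetical stand-in for ServiceConfig with the same three fields."""
    fact_api_key: str
    fakeness_score_threshold: float = 0.6
    face_detection_threshold: float = 0.5


def load_service_settings(env=os.environ) -> ServiceSettings:
    """Read settings from a mapping, falling back to the README's defaults."""
    return ServiceSettings(
        fact_api_key=env.get("FACT_API_KEY", ""),
        fakeness_score_threshold=float(env.get("FAKENESS_SCORE_THRESHOLD", "0.6")),
        face_detection_threshold=float(env.get("FACE_DETECTION_THRESHOLD", "0.5")),
    )


# A plain dict is passed here so the example does not depend on the real environment.
settings = load_service_settings({"FACT_API_KEY": "abc", "FAKENESS_SCORE_THRESHOLD": "0.7"})
```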

Model Paths

# Model file locations
AI_TEXT_DETECTOR = "./models/ai_text_detector.joblib"
FAKE_NEWS_DETECTOR = "models/fake_news_detector.pt"
EFFICIENTNET_AI_IMAGE = "./models/efficientnet_b3_full_ai_image_classifier.pt"
FACE_DETECTION = "models/face_det_10g.onnx"
MESO4_DF = "models/Meso4_DF.h5"
MESO4_F2F = "models/Meso4_F2F.h5"

🔗 API Endpoints

| Method | Endpoint | Description |
|---|---|---|
| GET | / | Main analysis interface |
| POST | /analyze | Submit content for analysis |
| GET | /analysis/<id> | View analysis results |
| GET | /health | System health check |

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Development Guidelines

  • Follow PEP 8 style guidelines
  • Add tests for new features
  • Update documentation
  • Ensure model compatibility

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • Hugging Face for transformer models and pipelines
  • Google for Fact Check Tools API
  • PyTorch Team for deep learning framework
  • OpenCV for computer vision utilities
  • SCRFD for face detection models

Support

For support and questions:

Version History

v1.0.0

  • Initial release with full multi-modal fake news detection
  • 99% F1-scores on custom-trained models
  • Comprehensive web interface
  • Integration with external fact-checking services

FactSight - Bringing truth to the digital age through advanced AI-powered analysis. 🛡️