metadata
title: Test
emoji: 🦀
colorFrom: green
colorTo: purple
sdk: docker
pinned: false
short_description: deepfakes

Anti-Phishing AI Backend

FastAPI backend and HTML demo for phishing and deepfake detection. Built for a hackathon.

Stack

  • Framework: FastAPI
  • AI Agent: LangChain + Google Gemini 2.0 Flash (text, image, video, audio)
  • Deepfake APIs: HuggingFace Inference API (image + audio)
  • Video: Gemini Files API + frame-level HF image model

Setup

cd antiphish
pip install -r requirements.txt

Copy the .env file from the parent directory or create one:

GEMINI_API_KEY=your_key_here
HUGGING_FACE_TOKEN=your_hf_token_here
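A minimal sketch of how the two keys could be checked at startup, assuming they are exposed as environment variables once the .env file has been loaded. The helper name load_settings is hypothetical and not taken from the project.

```python
import os


def load_settings(env=None):
    """Return the two API credentials, failing early if either is missing."""
    # Hypothetical helper: the real app may load its settings differently.
    env = os.environ if env is None else env
    required = ("GEMINI_API_KEY", "HUGGING_FACE_TOKEN")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in required}
```

Failing at startup gives a clearer error than a rejected request to Gemini or HuggingFace later on.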

Run

uvicorn app.main:app --reload

Walkthrough: How To Use

The Anti-Phishing AI app analyzes text, images, videos, and audio for phishing attempts, scams, and deepfakes.

1. Web Interface Walkthrough

When you open http://localhost:8000, you will see a simple user interface. Switch between tabs depending on the type of media you want to analyze:

  • Text & URLs: Paste suspicious emails, SMS messages, or links. The app uses Gemini to detect urgency language, impersonation tactics, and credential harvesting, and checks any URLs against a list of suspicious top-level domains.
  • Images: Upload an image (like a screenshot of a login page or a photo of a document). The app uses a HuggingFace model to detect if the face in the image is a deepfake, and Gemini Vision to see if the image is a fake login screen or brand impersonation.
  • Video: Upload a short .mp4 video. The app samples frames and runs deepfake diagnostics on them, while simultaneously uploading the video to Gemini to check for unnatural blinking, lip-sync inconsistencies, and visual anomalies.
  • Audio: Upload an audio file (like a voicemail or recorded phone call). The HuggingFace integration checks the audio waveform for synthetic/AI-generated markers, while Gemini listens for common scam scripts (e.g., "fake bank security alert" or "tech support").
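The URL check in the Text & URLs tab can be pictured roughly as follows. This is a sketch only: the TLD list, the regular expression, and the function name find_suspicious_urls are assumptions for illustration, since the list the app actually uses is not documented here.

```python
import re
from urllib.parse import urlparse

# Illustrative list only; the TLDs the app actually flags are an assumption.
SUSPICIOUS_TLDS = {"ml", "tk", "ga", "cf", "gq"}

# Matches http(s) URLs up to the next whitespace, quote, or angle bracket.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def find_suspicious_urls(text):
    """Return URLs in `text` whose hostname ends in a listed TLD."""
    flagged = []
    for raw in URL_PATTERN.findall(text):
        # Drop sentence punctuation that often trails a pasted link.
        url = raw.rstrip(".,;:!?)")
        host = urlparse(url).hostname or ""
        tld = host.rsplit(".", 1)[-1].lower()
        if tld in SUSPICIOUS_TLDS:
            flagged.append(url)
    return flagged
```

For example, find_suspicious_urls("Verify at http://paypal-secure.ml/login now") returns the single .ml link, while a .com link in the same message would be left alone.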

2. API / Developer Walkthrough

You can integrate this backend with another app or bot by sending requests directly to the API endpoints.

Checking the API documentation: The auto-generated Swagger docs are available at http://localhost:8000/docs.

Testing the Text Endpoint via terminal:

curl -X POST http://localhost:8000/analyze/text \
-H "Content-Type: application/json" \
-d '{"text": "URGENT: Your Paypal account has been locked. Click here to verify your identity: http://paypal-secure.ml/login"}'
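The same call can be made from Python. The sketch below uses only the standard library and assumes the server is running locally on port 8000; the function names build_text_request and analyze_text are illustrative, not part of the project.

```python
import json
import urllib.request


def build_text_request(text, base_url="http://localhost:8000"):
    """Build the POST request for /analyze/text without sending it."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/analyze/text",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze_text(text, base_url="http://localhost:8000"):
    """Send the request and return the decoded JSON response."""
    request = build_text_request(text, base_url)
    with urllib.request.urlopen(request, timeout=60) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling analyze_text("URGENT: Your account has been locked...") should return a dictionary in the response format described in this README.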

Testing the Image/Audio/Video Endpoints: For media, send the file as a multipart/form-data upload:

curl -X POST http://localhost:8000/analyze/image \
  -F "file=@/path/to/suspicious_image.jpg"
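A rough Python equivalent of the upload, again using only the standard library. The form field name "file" matches the curl example; the helper name build_upload_request and the hand-built multipart body are assumptions for illustration, and a client library could be used instead.

```python
import mimetypes
import urllib.request
import uuid


def build_upload_request(endpoint, filename, content, base_url="http://localhost:8000"):
    """Build a multipart/form-data POST with a single part named "file"."""
    boundary = uuid.uuid4().hex
    # Guess the part's MIME type from the file extension.
    content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        base_url + endpoint,
        data=head + content + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )
```

Read the file as bytes, pass it as content with an endpoint such as "/analyze/image", and send the result with urllib.request.urlopen.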

Endpoints

| Method | Endpoint | Input | Description |
|--------|----------|-------|-------------|
| POST | /analyze/text | JSON {"text": "..."} | Phishing text + URL detection |
| POST | /analyze/image | multipart file | Deepfake + phishing screenshot detection |
| POST | /analyze/video | multipart file | Deepfake video detection |
| POST | /analyze/audio | multipart file | Deepfake / AI voice detection |

Response Format

{
  "risk_score": 0.87,
  "risk_level": "CRITICAL",
  "threat_types": ["phishing", "urgency_language", "malicious_url"],
  "explanation": "Human-readable analysis from Gemini.",
  "tool_outputs": { ... }
}

risk_level: LOW (0-0.3) | MEDIUM (0.3-0.6) | HIGH (0.6-0.85) | CRITICAL (0.85-1.0)
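The bands share their endpoints, so a client that recomputes the level from risk_score has to pick a side. The sketch below assumes a score exactly on a boundary falls into the higher band; that choice and the function name risk_level are assumptions, not documented behavior of the backend.

```python
def risk_level(score):
    """Map a risk score in [0, 1] to one of the four level names."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("risk score must be between 0 and 1")
    # Assumption: a score exactly on a boundary belongs to the higher band.
    if score >= 0.85:
        return "CRITICAL"
    if score >= 0.6:
        return "HIGH"
    if score >= 0.3:
        return "MEDIUM"
    return "LOW"
```

With this convention risk_level(0.3) is "MEDIUM" and risk_level(0.95) is "CRITICAL".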

Models Used

| Modality | Model |
|----------|-------|
| Text / URL | Gemini 2.0 Flash (structured JSON prompt) |
| Image deepfake | dima806/deepfake_vs_real_image_detection (HF API) |
| Image phishing | Gemini Vision (multimodal) |
| Video deepfake | Gemini Files API + frame-sampled HF image model |
| Audio deepfake | motheecreator/deepfake-audio-detection-v2 (HF API) |
| Audio voice scam | Gemini Audio (multimodal) |
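For the video path, "frame-sampled" means only a subset of frames is sent to the HF image model. How the app chooses those frames is not documented here; the sketch below shows one plausible approach, evenly spaced sampling, with a hypothetical function name and an arbitrary default of 8 frames.

```python
def sample_frame_indices(total_frames, max_samples=8):
    """Pick up to `max_samples` evenly spaced frame indices."""
    if total_frames <= 0 or max_samples <= 0:
        return []
    count = min(total_frames, max_samples)
    step = total_frames / count
    # Take the middle of each equal-width segment so samples span the clip.
    return [int(step * i + step / 2) for i in range(count)]
```

Each selected frame would then be decoded and classified on its own, and the per-frame scores combined into a single video-level result.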