broskiiii committed on
Commit
8d1d8b8
·
0 Parent(s):

Initial commit including Dockerized FastAPI app

.dockerignore ADDED
@@ -0,0 +1,15 @@
+ # Ignore local virtual environment
+ venv/
+ .env
+
+ # Ignore git
+ .git/
+
+ # Ignore Python cache
+ __pycache__/
+ *.pyc
+ *.pyo
+ *.pyd
+ .Python
+ env/
+ .pytest_cache/
.gitignore ADDED
@@ -0,0 +1,5 @@
+ venv/
+ .env
+ __pycache__/
+ *.pyc
+ .DS_Store
Dockerfile ADDED
@@ -0,0 +1,50 @@
+ # Use a slim Python base image
+ FROM python:3.10-slim
+
+ # Create a non-root user 'user' with UID 1000
+ # This is required by Hugging Face Spaces
+ RUN useradd -m -u 1000 user
+
+ # Set environment variables
+ ENV PYTHONDONTWRITEBYTECODE=1
+ ENV PYTHONUNBUFFERED=1
+ ENV HOME=/home/user
+ ENV PATH=$HOME/.local/bin:$PATH
+
+ # Install system dependencies required for OpenCV, audio processing, etc.
+ RUN apt-get update && apt-get install -y \
+     libgl1 \
+     libglib2.0-0 \
+     libsndfile1 \
+     && rm -rf /var/lib/apt/lists/*
+
+ # Set the working directory
+ WORKDIR $HOME/app
+
+ # Change ownership of the app directory to the 'user'
+ RUN chown -R user:user $HOME/app
+
+ # Switch to the non-root user
+ USER user
+
+ # Copy requirements and install them
+ COPY --chown=user:user requirements.txt .
+ RUN pip install --no-cache-dir -r requirements.txt
+
+ # Copy the download script and execute it to cache models
+ # This bakes the downloaded models directly into the Docker image layers
+ COPY --chown=user:user download_models.py .
+ ARG HUGGING_FACE_TOKEN
+ ENV HUGGING_FACE_TOKEN=$HUGGING_FACE_TOKEN
+ RUN python download_models.py
+
+ # Copy the rest of the application code
+ COPY --chown=user:user app ./app
+ COPY --chown=user:user frontend ./frontend
+
+
+ # Expose port (HF Spaces routes traffic to 7860 by default)
+ EXPOSE 7860
+
+ # Command to run the application (FastAPI via Uvicorn)
+ CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "7860"]
README.md ADDED
@@ -0,0 +1,106 @@
+ ---
+ title: Test
+ emoji: 🦀
+ colorFrom: green
+ colorTo: purple
+ sdk: docker
+ pinned: false
+ short_description: deepfakes
+ ---
+
+ # Anti-Phishing AI Backend
+
+ FastAPI backend and HTML demo for phishing and deepfake detection. Built for a hackathon.
+
+ ## Stack
+ - **Framework**: FastAPI
+ - **AI Agent**: LangChain + Google Gemini 2.5 Flash (text, image, video, audio)
+ - **Deepfake models**: local HuggingFace `transformers` pipelines (image + audio)
+ - **Video**: Gemini Files API + frame-level HF image model
+
+ ## Setup
+
+ ```bash
+ cd antiphish
+ pip install -r requirements.txt
+ ```
+
+ Copy the `.env` file from the parent directory or create one:
+ ```
+ GEMINI_API_KEY=your_key_here
+ HUGGING_FACE_TOKEN=your_hf_token_here
+ ```
+
+ ## Run
+
+ ```bash
+ uvicorn app.main:app --reload
+ ```
+
+ - **API docs**: http://localhost:8000/docs
+ - **Demo frontend**: http://localhost:8000
+
+ ## Walkthrough: How To Use
+
+ The Anti-Phishing AI app analyzes text, images, videos, and audio for phishing attempts, scams, and deepfakes.
+
+ ### 1. Web Interface Walkthrough
+ When you open `http://localhost:8000`, you will see a simple user interface. Switch between tabs depending on the type of media you want to analyze:
+
+ * **Text & URLs:** Paste suspicious emails, SMS messages, or links. The app uses Gemini to detect urgency language, impersonation tactics, and credential harvesting, and checks any URLs against a list of suspicious top-level domains.
+ * **Images:** Upload an image (like a screenshot of a login page or a photo of a document). A HuggingFace model checks whether any face in the image is a deepfake, while Gemini Vision checks whether the image is a fake login screen or brand impersonation.
+ * **Video:** Upload a short `.mp4` video. The app samples frames and runs deepfake diagnostics on them, while simultaneously uploading the video to Gemini to check for unnatural blinking, lip-sync inconsistencies, and visual anomalies.
+ * **Audio:** Upload an audio file (like a voicemail or recorded phone call). The HuggingFace model checks the audio waveform for synthetic/AI-generated markers, while Gemini listens for common scam scripts (e.g., "fake bank security alert" or "tech support").
+
+ ### 2. API / Developer Walkthrough
+ You can integrate this backend with another app or bot by sending requests directly to the API endpoints.
+
+ **Checking the API documentation:** The auto-generated Swagger docs are at http://localhost:8000/docs.
+
+ **Testing the Text Endpoint via terminal:**
+ ```bash
+ curl -X POST http://localhost:8000/analyze/text \
+   -H "Content-Type: application/json" \
+   -d '{"text": "URGENT: Your Paypal account has been locked. Click here to verify your identity: http://paypal-secure.ml/login"}'
+ ```
+
+ **Testing the Image/Audio/Video Endpoints:**
+ For media, send the file as a `multipart/form-data` upload:
+ ```bash
+ curl -X POST http://localhost:8000/analyze/image \
+   -F "file=@/path/to/suspicious_image.jpg"
+ ```
+
+ ## Endpoints
+
+ | Method | Endpoint | Input | Description |
+ |---|---|---|---|
+ | POST | `/analyze/text` | JSON `{"text": "..."}` | Phishing text + URL detection |
+ | POST | `/analyze/image` | multipart file | Deepfake + phishing screenshot detection |
+ | POST | `/analyze/video` | multipart file | Deepfake video detection |
+ | POST | `/analyze/audio` | multipart file | Deepfake / AI voice detection |
+
+ ## Response Format
+
+ ```json
+ {
+   "risk_score": 0.87,
+   "risk_level": "CRITICAL",
+   "threat_types": ["phishing", "urgency_language", "malicious_url"],
+   "explanation": "Human-readable analysis from Gemini.",
+   "tool_outputs": { ... }
+ }
+ ```
+
+ `risk_level`: `LOW` (0-0.3) | `MEDIUM` (0.3-0.6) | `HIGH` (0.6-0.85) | `CRITICAL` (0.85-1.0)
+
+ ## Models Used
+
+ | Modality | Model |
+ |---|---|
+ | Text / URL | Gemini 2.5 Flash (structured JSON prompt) |
+ | Image deepfake | `dima806/deepfake_vs_real_image_detection` (local pipeline) |
+ | Image phishing | Gemini Vision (multimodal) |
+ | Video deepfake | Gemini Files API + frame-sampled HF image model |
+ | Audio deepfake | `mo-thecreator/Deepfake-audio-detection` (local pipeline) |
+ | Audio voice scam | Gemini Audio (multimodal) |
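As a convenience for API consumers, the risk bands above can be mirrored client-side. A minimal sketch (the `risk_band` helper is ours for illustration, not part of the backend):

```python
def risk_band(score: float) -> str:
    """Map a risk_score in [0, 1] to the band documented above."""
    if score < 0.3:
        return "LOW"
    if score < 0.6:
        return "MEDIUM"
    if score < 0.85:
        return "HIGH"
    return "CRITICAL"

print(risk_band(0.87))  # → CRITICAL
```

Note that 0.87 lands in the CRITICAL band, which is why the example response above reports `"risk_level": "CRITICAL"`.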
app/__init__.py ADDED
@@ -0,0 +1 @@
+ """Empty init to make app a package."""
app/agent.py ADDED
@@ -0,0 +1,120 @@
+ """
+ LangChain agent wiring: registers all tools and invokes them per modality.
+ Returns a structured AnalysisResult.
+ """
+ import json
+ from langchain_google_genai import ChatGoogleGenerativeAI
+ from langchain_core.messages import HumanMessage, SystemMessage
+ from app.config import GEMINI_API_KEY, GEMINI_MODEL, GEMINI_MODEL_FALLBACKS
+ from app.models import AnalysisResult
+
+
+ def _risk_level(score: float) -> str:
+     if score < 0.3:
+         return "LOW"
+     elif score < 0.6:
+         return "MEDIUM"
+     elif score < 0.85:
+         return "HIGH"
+     return "CRITICAL"
+
+
+ def invoke_with_fallback(messages: list) -> str:
+     """Try GEMINI_MODEL then each fallback until one succeeds."""
+     models_to_try = [GEMINI_MODEL] + GEMINI_MODEL_FALLBACKS
+     last_err = None
+     for model_name in models_to_try:
+         try:
+             llm = ChatGoogleGenerativeAI(
+                 model=model_name,
+                 google_api_key=GEMINI_API_KEY,
+                 temperature=0.1,
+             )
+             return llm.invoke(messages).content
+         except Exception as e:
+             last_err = e
+             if "429" not in str(e) and "RESOURCE_EXHAUSTED" not in str(e):
+                 raise
+     raise RuntimeError(f"All Gemini models exhausted. Last error: {last_err}")
+
+
+ def run_text_agent(text: str, url_flags: dict) -> AnalysisResult:
+     system = (
+         "You are a cybersecurity expert specializing in phishing detection. "
+         "Analyse the provided text for phishing indicators: urgency language, "
+         "impersonation, social engineering, suspicious URLs, credential harvesting. "
+         "Respond ONLY with valid JSON matching this schema: "
+         '{"risk_score": <float 0-1>, "threat_types": [<strings>], "explanation": <string>}'
+     )
+     prompt = f"TEXT TO ANALYSE:\n{text}\n\nURL SCAN RESULTS:\n{json.dumps(url_flags)}"
+     raw = invoke_with_fallback([SystemMessage(content=system), HumanMessage(content=prompt)])
+     raw = raw.strip().removeprefix("```json").removesuffix("```").strip()
+     data = json.loads(raw)
+     score = float(data["risk_score"])
+     return AnalysisResult(
+         risk_score=score,
+         risk_level=_risk_level(score),
+         threat_types=data.get("threat_types", []),
+         explanation=data.get("explanation", ""),
+         tool_outputs={"gemini_text": data, "url_scan": url_flags},
+     )
+
+
+ def run_image_agent(hf_result: dict, gemini_result: dict) -> AnalysisResult:
+     hf_score = hf_result.get("deepfake_score", 0.0)
+     gemini_score = gemini_result.get("risk_score", 0.0)
+     combined = round((hf_score * 0.5) + (gemini_score * 0.5), 3)
+     threat_types = list(
+         set(hf_result.get("threat_types", []) + gemini_result.get("threat_types", []))
+     )
+     explanation = (
+         f"HuggingFace deepfake model: {hf_result.get('label', 'N/A')} "
+         f"(confidence {hf_score:.2f}). "
+         f"Gemini vision analysis: {gemini_result.get('explanation', '')}"
+     )
+     return AnalysisResult(
+         risk_score=combined,
+         risk_level=_risk_level(combined),
+         threat_types=threat_types,
+         explanation=explanation,
+         tool_outputs={"hf_deepfake": hf_result, "gemini_vision": gemini_result},
+     )
+
+
+ def run_video_agent(gemini_result: dict, frame_scores: list[float]) -> AnalysisResult:
+     gemini_score = gemini_result.get("risk_score", 0.0)
+     avg_frame = sum(frame_scores) / len(frame_scores) if frame_scores else 0.0
+     combined = round((gemini_score * 0.6) + (avg_frame * 0.4), 3)
+     explanation = (
+         f"Gemini video analysis score: {gemini_score:.2f}. "
+         f"Frame-level deepfake average: {avg_frame:.2f} over {len(frame_scores)} frames. "
+         f"{gemini_result.get('explanation', '')}"
+     )
+     return AnalysisResult(
+         risk_score=combined,
+         risk_level=_risk_level(combined),
+         threat_types=gemini_result.get("threat_types", ["deepfake_video"]),
+         explanation=explanation,
+         tool_outputs={"gemini_video": gemini_result, "frame_scores": frame_scores},
+     )
+
+
+ def run_audio_agent(hf_result: dict, gemini_result: dict) -> AnalysisResult:
+     hf_score = hf_result.get("deepfake_score", 0.0)
+     gemini_score = gemini_result.get("risk_score", 0.0)
+     combined = round((hf_score * 0.5) + (gemini_score * 0.5), 3)
+     threat_types = list(
+         set(hf_result.get("threat_types", []) + gemini_result.get("threat_types", []))
+     )
+     explanation = (
+         f"HuggingFace audio deepfake model: {hf_result.get('label', 'N/A')} "
+         f"(confidence {hf_score:.2f}). "
+         f"Gemini audio analysis: {gemini_result.get('explanation', '')}"
+     )
+     return AnalysisResult(
+         risk_score=combined,
+         risk_level=_risk_level(combined),
+         threat_types=threat_types,
+         explanation=explanation,
+         tool_outputs={"hf_audio": hf_result, "gemini_audio": gemini_result},
+     )
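All four `run_*_agent` functions share one fusion pattern: a weighted average of the two detector scores plus a union of their threat types. A standalone sketch of that pattern (the `fuse` name is illustrative, not from the codebase; the weights mirror the 0.5/0.5 split in `run_image_agent` and can be shifted as in `run_video_agent`'s 0.6/0.4):

```python
def fuse(hf_score, gemini_score, hf_threats, gemini_threats, w_hf=0.5):
    """Weighted fusion of two detector scores with a union of threat labels."""
    combined = round(hf_score * w_hf + gemini_score * (1 - w_hf), 3)
    return {
        "risk_score": combined,
        "threat_types": sorted(set(hf_threats) | set(gemini_threats)),
    }

print(fuse(0.9, 0.5, ["deepfake_image"], ["brand_impersonation"]))
# → {'risk_score': 0.7, 'threat_types': ['brand_impersonation', 'deepfake_image']}
```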
app/config.py ADDED
@@ -0,0 +1,12 @@
+ import os
+ from dotenv import load_dotenv
+
+ load_dotenv(dotenv_path=os.path.join(os.path.dirname(__file__), "..", ".env"))
+
+ GEMINI_API_KEY: str = os.environ["GEMINI_API_KEY"]
+ HUGGING_FACE_TOKEN: str = os.environ["HUGGING_FACE_TOKEN"]
+
+ HF_IMAGE_MODEL = "dima806/deepfake_vs_real_image_detection"
+ HF_AUDIO_MODEL = "mo-thecreator/Deepfake-audio-detection"
+ GEMINI_MODEL = "gemini-2.5-flash"
+ GEMINI_MODEL_FALLBACKS = ["gemini-2.5-flash"]
app/main.py ADDED
@@ -0,0 +1,40 @@
+ """
+ FastAPI entrypoint for the Anti-Phishing Backend.
+ Serves all 4 analysis routers and the plain HTML demo frontend.
+ """
+ from fastapi import FastAPI
+ from fastapi.middleware.cors import CORSMiddleware
+ from fastapi.responses import FileResponse
+ import os
+
+ from app.routers import text, image, video, audio
+
+ app = FastAPI(
+     title="Anti-Phishing AI Backend",
+     description="LangChain + Gemini powered phishing and deepfake detection API",
+     version="1.0.0",
+ )
+
+ app.add_middleware(
+     CORSMiddleware,
+     allow_origins=["*"],
+     allow_methods=["*"],
+     allow_headers=["*"],
+ )
+
+ app.include_router(text.router, prefix="/analyze", tags=["Text"])
+ app.include_router(image.router, prefix="/analyze", tags=["Image"])
+ app.include_router(video.router, prefix="/analyze", tags=["Video"])
+ app.include_router(audio.router, prefix="/analyze", tags=["Audio"])
+
+ FRONTEND_DIR = os.path.join(os.path.dirname(__file__), "..", "frontend")
+
+
+ @app.get("/", include_in_schema=False)
+ async def serve_frontend():
+     return FileResponse(os.path.join(FRONTEND_DIR, "index.html"))
+
+
+ @app.get("/health")
+ async def health():
+     return {"status": "ok"}
app/models.py ADDED
@@ -0,0 +1,14 @@
+ from pydantic import BaseModel, Field
+ from typing import Any
+
+
+ class TextRequest(BaseModel):
+     text: str
+
+
+ class AnalysisResult(BaseModel):
+     risk_score: float = Field(..., ge=0.0, le=1.0, description="0.0 = safe, 1.0 = critical threat")
+     risk_level: str = Field(..., description="LOW | MEDIUM | HIGH | CRITICAL")
+     threat_types: list[str] = Field(default_factory=list)
+     explanation: str
+     tool_outputs: dict[str, Any] = Field(default_factory=dict)
app/routers/__init__.py ADDED
@@ -0,0 +1 @@
+ """Empty init to make app/routers a package."""
app/routers/audio.py ADDED
@@ -0,0 +1,20 @@
+ from fastapi import APIRouter, UploadFile, File, HTTPException
+ from app.models import AnalysisResult
+ from app.tools.audio_tools import hf_detect_audio_deepfake, gemini_analyze_audio
+ from app.agent import run_audio_agent
+
+ router = APIRouter()
+
+
+ @router.post("/audio", response_model=AnalysisResult)
+ async def analyze_audio(file: UploadFile = File(...)):
+     allowed = {"audio/wav", "audio/mpeg", "audio/mp3", "audio/ogg", "audio/x-wav"}
+     if file.content_type not in allowed:
+         raise HTTPException(status_code=400, detail=f"Unsupported audio type: {file.content_type}")
+     try:
+         audio_bytes = await file.read()
+         hf_result = hf_detect_audio_deepfake(audio_bytes)
+         gemini_result = gemini_analyze_audio(audio_bytes, mime_type=file.content_type)
+         return run_audio_agent(hf_result, gemini_result)
+     except Exception as e:
+         raise HTTPException(status_code=500, detail=str(e))
app/routers/image.py ADDED
@@ -0,0 +1,20 @@
+ from fastapi import APIRouter, UploadFile, File, HTTPException
+ from app.models import AnalysisResult
+ from app.tools.image_tools import hf_detect_image_deepfake, gemini_analyze_image
+ from app.agent import run_image_agent
+
+ router = APIRouter()
+
+
+ @router.post("/image", response_model=AnalysisResult)
+ async def analyze_image(file: UploadFile = File(...)):
+     allowed = {"image/jpeg", "image/png", "image/webp", "image/gif"}
+     if file.content_type not in allowed:
+         raise HTTPException(status_code=400, detail=f"Unsupported image type: {file.content_type}")
+     try:
+         image_bytes = await file.read()
+         hf_result = hf_detect_image_deepfake(image_bytes, mime_type=file.content_type)
+         gemini_result = gemini_analyze_image(image_bytes, mime_type=file.content_type)
+         return run_image_agent(hf_result, gemini_result)
+     except Exception as e:
+         raise HTTPException(status_code=500, detail=str(e))
app/routers/text.py ADDED
@@ -0,0 +1,16 @@
+ from fastapi import APIRouter, HTTPException
+ from app.models import TextRequest, AnalysisResult
+ from app.tools.text_tools import analyze_urls_in_text
+ from app.agent import run_text_agent
+
+ router = APIRouter()
+
+
+ @router.post("/text", response_model=AnalysisResult)
+ async def analyze_text(request: TextRequest):
+     try:
+         url_flags = analyze_urls_in_text(request.text)
+         # run_text_agent invokes Gemini itself, so no separate gemini_analyze_text call is needed
+         return run_text_agent(request.text, url_flags)
+     except Exception as e:
+         raise HTTPException(status_code=500, detail=str(e))
app/routers/video.py ADDED
@@ -0,0 +1,20 @@
+ from fastapi import APIRouter, UploadFile, File, HTTPException
+ from app.models import AnalysisResult
+ from app.tools.video_tools import gemini_analyze_video, score_video_frames
+ from app.agent import run_video_agent
+
+ router = APIRouter()
+
+
+ @router.post("/video", response_model=AnalysisResult)
+ async def analyze_video(file: UploadFile = File(...)):
+     allowed = {"video/mp4", "video/mpeg", "video/webm", "video/quicktime"}
+     if file.content_type not in allowed:
+         raise HTTPException(status_code=400, detail=f"Unsupported video type: {file.content_type}")
+     try:
+         video_bytes = await file.read()
+         gemini_result = gemini_analyze_video(video_bytes, mime_type=file.content_type)
+         frame_scores = score_video_frames(video_bytes)
+         return run_video_agent(gemini_result, frame_scores)
+     except Exception as e:
+         raise HTTPException(status_code=500, detail=str(e))
app/tools/__init__.py ADDED
@@ -0,0 +1 @@
+ """Empty init to make app/tools a package."""
app/tools/audio_tools.py ADDED
@@ -0,0 +1,85 @@
+ """
+ Audio tools:
+ - local HuggingFace `transformers` pipeline for deepfake/AI-voice detection
+ - Gemini for voice-scam content analysis
+ """
+ import json
+ import base64
+ from langchain_google_genai import ChatGoogleGenerativeAI
+ from langchain_core.messages import HumanMessage
+ from app.config import GEMINI_API_KEY, GEMINI_MODEL, GEMINI_MODEL_FALLBACKS, HF_AUDIO_MODEL
+ from transformers import pipeline
+
+ # Global pipeline instance to load the model once at startup (or first use)
+ _audio_classifier = None
+
+
+ def get_audio_classifier():
+     global _audio_classifier
+     if _audio_classifier is None:
+         # Load the local model that was downloaded by download_models.py
+         _audio_classifier = pipeline("audio-classification", model=HF_AUDIO_MODEL)
+     return _audio_classifier
+
+
+ def hf_detect_audio_deepfake(audio_bytes: bytes) -> dict:
+     import io
+     import soundfile as sf
+
+     try:
+         # Read the audio bytes directly into a numpy array (skipping the disk write);
+         # pipeline("audio-classification") accepts either a file path or a raw waveform array
+         audio_data, sample_rate = sf.read(io.BytesIO(audio_bytes))
+
+         # sf.read may return stereo (a 2D array); convert to mono if necessary
+         if len(audio_data.shape) > 1:
+             audio_data = audio_data.mean(axis=1)
+
+         classifier = get_audio_classifier()
+
+         # When passing a raw array, the sampling rate must be supplied explicitly;
+         # HF pipelines accept a dict of the form {"raw": array, "sampling_rate": sr}
+         results = classifier({"raw": audio_data, "sampling_rate": sample_rate})
+
+         # The pipeline returns a list of dicts like [{'label': 'real', 'score': 0.9}, ...]
+         label_map = {r["label"].lower(): r["score"] for r in results}
+         fake_score = label_map.get("fake", label_map.get("spoof", label_map.get("ai-generated", 0.0)))
+         label = "FAKE" if fake_score > 0.5 else "REAL"
+
+         return {
+             "label": label,
+             "deepfake_score": round(fake_score, 4),
+             "threat_types": ["deepfake_audio", "ai_voice"] if fake_score > 0.5 else [],
+             "raw": [{"label": r["label"], "score": r["score"]} for r in results],
+         }
+     except Exception as e:
+         return {"label": "ERROR", "deepfake_score": 0.0, "threat_types": [], "error": str(e)}
+
+
+ def gemini_analyze_audio(audio_bytes: bytes, mime_type: str = "audio/wav") -> dict:
+     b64 = base64.b64encode(audio_bytes).decode()
+     system = (
+         "You are a voice fraud and deepfake audio expert. Analyse this audio for: "
+         "AI-generated voice patterns, robotic/synthetic speech artifacts, voice cloning indicators, "
+         "scam scripts (fake tech support, fake bank calls, urgent threats). "
+         "Reply ONLY with valid JSON: "
+         '{"risk_score": <float 0-1>, "threat_types": [<strings>], "explanation": <string>}'
+     )
+     message = HumanMessage(
+         content=[
+             {"type": "text", "text": system},
+             {"type": "media", "data": b64, "mime_type": mime_type},
+         ]
+     )
+     from app.tools.retry_utils import execute_with_retry
+     for model in [GEMINI_MODEL] + GEMINI_MODEL_FALLBACKS:
+         try:
+             resp = execute_with_retry(
+                 lambda m=model: ChatGoogleGenerativeAI(model=m, google_api_key=GEMINI_API_KEY, temperature=0.1).invoke([message])
+             )
+             raw = resp.content.strip().removeprefix("```json").removesuffix("```").strip()
+             return json.loads(raw)
+         except Exception as e:
+             if "429" not in str(e) and "RESOURCE_EXHAUSTED" not in str(e):
+                 raise
+     return {"risk_score": 0.0, "threat_types": [], "explanation": "Gemini quota exhausted for all models"}
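The stereo-to-mono step in `hf_detect_audio_deepfake` is just a per-frame average across channels; the same operation without numpy, as a hypothetical `to_mono` helper for illustration:

```python
def to_mono(frames: list[list[float]]) -> list[float]:
    """Average the channel samples of each frame, like audio_data.mean(axis=1)."""
    return [sum(channels) / len(channels) for channels in frames]

stereo = [[0.25, 0.75], [1.0, 0.0]]  # two frames, two channels each
print(to_mono(stereo))  # → [0.5, 0.5]
```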
app/tools/image_tools.py ADDED
@@ -0,0 +1,79 @@
+ """
+ Image tools:
+ - local HuggingFace `transformers` pipeline for deepfake image detection
+ - Gemini Vision for phishing screenshot / fake document analysis
+ """
+ import json
+ import base64
+ from langchain_google_genai import ChatGoogleGenerativeAI
+ from langchain_core.messages import HumanMessage
+ from app.config import GEMINI_API_KEY, GEMINI_MODEL, GEMINI_MODEL_FALLBACKS, HF_IMAGE_MODEL
+ from transformers import pipeline
+
+ # Global pipeline instance to load the model once at startup (or first use)
+ _image_classifier = None
+
+
+ def get_image_classifier():
+     global _image_classifier
+     if _image_classifier is None:
+         # Load the local model that was downloaded by download_models.py
+         _image_classifier = pipeline("image-classification", model=HF_IMAGE_MODEL)
+     return _image_classifier
+
+
+ def hf_detect_image_deepfake(image_bytes: bytes, mime_type: str = "image/jpeg") -> dict:
+     import io
+     from PIL import Image
+
+     try:
+         # Open the image with PIL
+         image = Image.open(io.BytesIO(image_bytes)).convert("RGB")
+
+         classifier = get_image_classifier()
+
+         # Run inference locally
+         results = classifier(image)
+
+         # The pipeline returns a list of dicts like [{'label': 'real', 'score': 0.9}, ...]
+         label_map = {r["label"].lower(): r["score"] for r in results}
+         fake_score = label_map.get("fake", label_map.get("deepfake", 0.0))
+         label = "FAKE" if fake_score > 0.5 else "REAL"
+
+         return {
+             "label": label,
+             "deepfake_score": round(fake_score, 4),
+             "threat_types": ["deepfake_image"] if fake_score > 0.5 else [],
+             "raw": [{"label": r["label"], "score": r["score"]} for r in results],
+         }
+     except Exception as e:
+         return {"label": "ERROR", "deepfake_score": 0.0, "threat_types": [], "error": str(e)}
+
+
+ def gemini_analyze_image(image_bytes: bytes, mime_type: str = "image/jpeg") -> dict:
+     b64 = base64.b64encode(image_bytes).decode()
+     system = (
+         "You are a cybersecurity image analyst. Examine the image for: "
+         "fake login pages, phishing screenshots, fake documents, impersonated brands, "
+         "deepfake or AI-generated faces/content. "
+         "Reply ONLY with valid JSON: "
+         '{"risk_score": <float 0-1>, "threat_types": [<strings>], "explanation": <string>}'
+     )
+     message = HumanMessage(
+         content=[
+             {"type": "text", "text": system},
+             {"type": "image_url", "image_url": {"url": f"data:{mime_type};base64,{b64}"}},
+         ]
+     )
+     from app.tools.retry_utils import execute_with_retry
+     for model in [GEMINI_MODEL] + GEMINI_MODEL_FALLBACKS:
+         try:
+             resp = execute_with_retry(
+                 lambda m=model: ChatGoogleGenerativeAI(model=m, google_api_key=GEMINI_API_KEY, temperature=0.1).invoke([message])
+             )
+             raw = resp.content.strip().removeprefix("```json").removesuffix("```").strip()
+             return json.loads(raw)
+         except Exception as e:
+             if "429" not in str(e) and "RESOURCE_EXHAUSTED" not in str(e):
+                 raise
+     return {"risk_score": 0.0, "threat_types": [], "explanation": "Gemini quota exhausted for all models"}
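The label handling is the subtle part: classification pipelines return a ranked list of `{label, score}` dicts, and the code looks up the "fake" class by name with a fallback default of 0.0. That lookup in isolation (the `fake_score_from` name is ours, for illustration):

```python
def fake_score_from(results: list[dict]) -> float:
    """Extract the 'fake'/'deepfake' probability from pipeline output, else 0.0."""
    label_map = {r["label"].lower(): r["score"] for r in results}
    return label_map.get("fake", label_map.get("deepfake", 0.0))

preds = [{"label": "Real", "score": 0.12}, {"label": "Fake", "score": 0.88}]
print(fake_score_from(preds))  # → 0.88
```

Lowercasing the labels makes the lookup robust to models that emit "Fake" vs "fake"; a model whose classes are named differently would silently score 0.0, which is why the `raw` list is also returned.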
app/tools/retry_utils.py ADDED
@@ -0,0 +1,25 @@
+ import time
+ import logging
+ from typing import Callable, Any
+
+ logger = logging.getLogger(__name__)
+
+ def execute_with_retry(func: Callable, max_retries: int = 4, initial_backoff: float = 2.0, *args, **kwargs) -> Any:
+     """
+     Executes a function with exponential backoff for 429 and RESOURCE_EXHAUSTED errors.
+     """
+     backoff = initial_backoff
+     for attempt in range(max_retries):
+         try:
+             return func(*args, **kwargs)
+         except Exception as e:
+             error_msg = str(e)
+             if "429" in error_msg or "RESOURCE_EXHAUSTED" in error_msg:
+                 if attempt == max_retries - 1:
+                     logger.error(f"Max retries reached for 429 error: {e}")
+                     raise
+                 logger.warning(f"Rate limited (429/Resource Exhausted). Retrying in {backoff} seconds...")
+                 time.sleep(backoff)
+                 backoff *= 2
+             else:
+                 raise
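The behaviour of `execute_with_retry` in brief: retry only on rate-limit errors, doubling the delay each attempt, and re-raise everything else immediately. A self-contained sketch of the same pattern (simplified to match only "429", with a tiny backoff so it runs instantly; `flaky` simulates two rate-limit failures):

```python
import time

def with_backoff(func, max_retries: int = 4, initial_backoff: float = 0.01):
    """Exponential backoff on 429-style errors; other exceptions propagate at once."""
    backoff = initial_backoff
    for attempt in range(max_retries):
        try:
            return func()
        except Exception as e:
            if "429" not in str(e) or attempt == max_retries - 1:
                raise
            time.sleep(backoff)
            backoff *= 2  # 0.01s, 0.02s, 0.04s, ...

calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("429 rate limited")
    return "ok"

print(with_backoff(flaky))  # → ok
```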
app/tools/text_tools.py ADDED
@@ -0,0 +1,81 @@
+ """
+ Text tools: phishing text analysis via Gemini + URL extraction/scoring.
+ """
+ import re
+ import json
+ from langchain_google_genai import ChatGoogleGenerativeAI
+ from langchain_core.messages import HumanMessage, SystemMessage
+ from app.config import GEMINI_API_KEY, GEMINI_MODEL, GEMINI_MODEL_FALLBACKS
+
+ _URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)
+ _BAD_TLDS = {".tk", ".ml", ".ga", ".cf", ".gq", ".xyz", ".top", ".click", ".loan", ".work"}
+
+
+ def extract_urls(text: str) -> list[str]:
+     return _URL_RE.findall(text)
+
+
+ def score_url(url: str) -> dict:
+     from urllib.parse import urlparse
+     parsed = urlparse(url)
+     domain = parsed.netloc.lower()
+     flags = []
+     is_suspicious = False
+
+     for tld in _BAD_TLDS:
+         if domain.endswith(tld):
+             flags.append(f"suspicious_tld:{tld}")
+             is_suspicious = True
+
+     # Flag brand keywords in the host unless the brand *is* the registrable domain
+     # (so www.paypal.com passes while paypal-secure.ml is flagged)
+     brand_impersonations = ["paypal", "amazon", "google", "microsoft", "apple", "bank", "secure", "login", "verify"]
+     labels = domain.split(".")
+     registrable = labels[-2] if len(labels) >= 2 else domain
+     for brand in brand_impersonations:
+         if brand in domain and registrable != brand:
+             flags.append(f"impersonation:{brand}")
+             is_suspicious = True
+
+     if re.search(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}", domain):
+         flags.append("ip_address_url")
+         is_suspicious = True
+
+     return {"url": url, "suspicious": is_suspicious, "flags": flags}
+
+
+ def analyze_urls_in_text(text: str) -> dict:
+     urls = extract_urls(text)
+     scored = [score_url(u) for u in urls]
+     suspicious_count = sum(1 for s in scored if s["suspicious"])
+     return {
+         "urls_found": len(urls),
+         "suspicious_count": suspicious_count,
+         "url_details": scored,
+     }
+
+
+ def _invoke(messages):
+     from app.tools.retry_utils import execute_with_retry
+     for model in [GEMINI_MODEL] + GEMINI_MODEL_FALLBACKS:
+         try:
+             return execute_with_retry(
+                 lambda m=model: ChatGoogleGenerativeAI(model=m, google_api_key=GEMINI_API_KEY, temperature=0.1).invoke(messages).content
+             )
+         except Exception as e:
+             if "429" not in str(e) and "RESOURCE_EXHAUSTED" not in str(e):
+                 raise
+     raise RuntimeError("All Gemini models quota exhausted")
+
+
+ def gemini_analyze_text(text: str) -> dict:
+     system = (
+         "You are an expert phishing and scam text analyser. "
+         "Detect: urgency language, impersonation, social engineering, credential harvesting, "
+         "suspicious links, fake authority claims. "
+         "Reply ONLY with valid JSON: "
+         '{"risk_score": <float 0.0-1.0>, "threat_types": [<strings>], "explanation": <string>}'
+     )
+     raw = _invoke([SystemMessage(content=system), HumanMessage(content=text)])
+     raw = raw.strip().removeprefix("```json").removesuffix("```").strip()
+     return json.loads(raw)
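The URL heuristics are plain string checks. A condensed, runnable subset covering just the bad-TLD and raw-IP detection (trimmed TLD list and the `quick_url_flags` name are ours for illustration):

```python
import re
from urllib.parse import urlparse

BAD_TLDS = (".tk", ".ml", ".xyz")

def quick_url_flags(url: str) -> list[str]:
    """Subset of score_url: suspicious-TLD and IP-address-host checks."""
    host = urlparse(url).netloc.lower()
    flags = [f"suspicious_tld:{tld}" for tld in BAD_TLDS if host.endswith(tld)]
    if re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host):
        flags.append("ip_address_url")
    return flags

print(quick_url_flags("http://paypal-secure.ml/login"))  # → ['suspicious_tld:.ml']
```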
app/tools/video_tools.py ADDED
@@ -0,0 +1,75 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
+ import os
+ import json
+ import tempfile
+ import time
+ import cv2
+ from google import genai
+ from google.genai import types
+ from langchain_google_genai import ChatGoogleGenerativeAI
+ from langchain_core.messages import HumanMessage, SystemMessage
+ from app.config import GEMINI_API_KEY, GEMINI_MODEL
+ from app.tools.image_tools import hf_detect_image_deepfake
+
+ client = genai.Client(api_key=GEMINI_API_KEY)
+
+
+ def sample_frames(video_bytes: bytes, n_frames: int = 8) -> list[bytes]:
+     with tempfile.NamedTemporaryFile(suffix=".mp4", delete=False) as tmp:
+         tmp.write(video_bytes)
+         tmp_path = tmp.name
+     frames = []
+     try:
+         cap = cv2.VideoCapture(tmp_path)
+         total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
+         step = max(1, total // n_frames)
+         for i in range(n_frames):
+             cap.set(cv2.CAP_PROP_POS_FRAMES, i * step)
+             ret, frame = cap.read()
+             if ret:
+                 _, buf = cv2.imencode(".jpg", frame)
+                 frames.append(buf.tobytes())
+         cap.release()
+     finally:
+         os.unlink(tmp_path)
+     return frames
+
+
+ def gemini_analyze_video(video_bytes: bytes, mime_type: str = "video/mp4") -> dict:
+     with tempfile.NamedTemporaryFile(suffix=".mp4", delete=False) as tmp:
+         tmp.write(video_bytes)
+         tmp_path = tmp.name
+     try:
+         uploaded = client.files.upload(file=tmp_path, config=types.UploadFileConfig(mime_type=mime_type))
+         while uploaded.state.name == "PROCESSING":
+             time.sleep(2)
+             uploaded = client.files.get(name=uploaded.name)
+
+         system_prompt = (
+             "You are a deepfake and media manipulation expert. Analyse this video for: "
+             "AI-generated faces, lip-sync inconsistencies, unnatural blinking, "
+             "visual artefacts, lighting inconsistencies, and audio-visual mismatch. "
+             "Reply ONLY with valid JSON: "
+             '{"risk_score": <float 0-1>, "threat_types": [<strings>], "explanation": <string>}'
+         )
+         from app.tools.retry_utils import execute_with_retry
+         response = execute_with_retry(
+             lambda: client.models.generate_content(
+                 model=GEMINI_MODEL,
+                 contents=[system_prompt, uploaded],
+             )
+         )
+         raw = response.text.strip().removeprefix("```json").removesuffix("```").strip()
+         return json.loads(raw)
+     except Exception as e:
+         return {"risk_score": 0.0, "threat_types": [], "explanation": f"Error: {e}"}
+     finally:
+         os.unlink(tmp_path)
+
+
+ def score_video_frames(video_bytes: bytes) -> list[float]:
+     frames = sample_frames(video_bytes)
+     scores = []
+     for frame_bytes in frames:
+         result = hf_detect_image_deepfake(frame_bytes, mime_type="image/jpeg")
+         scores.append(result.get("deepfake_score", 0.0))
+     return scores
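`score_video_frames` returns the raw per-frame scores; how a caller combines them into one verdict is not shown in this commit. One plausible aggregation, sketched under the assumption that the max is used (a single convincingly fake frame is enough to flag the clip; the function name and strategy are illustrative):

```python
def aggregate_frame_scores(scores: list[float]) -> dict:
    """Collapse per-frame deepfake scores into a single verdict.

    Taking the max is deliberately conservative: one high-scoring frame
    flags the whole clip, even if most frames look clean.
    """
    if not scores:
        # No decodable frames: report zero risk rather than crashing.
        return {"risk_score": 0.0, "mean_score": 0.0, "frames_scored": 0}
    return {
        "risk_score": max(scores),
        "mean_score": round(sum(scores) / len(scores), 3),
        "frames_scored": len(scores),
    }
```

Reporting both max and mean lets downstream code distinguish "one suspicious frame" from "uniformly suspicious video".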
docker-compose.yml ADDED
@@ -0,0 +1,13 @@
+ version: '3.8'
+
+ services:
+   antiphish:
+     build:
+       context: .
+       args:
+         HUGGING_FACE_TOKEN: ${HUGGING_FACE_TOKEN}
+     ports:
+       - "7860:7860"
+     env_file:
+       - .env
+     restart: unless-stopped
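The compose file reads secrets from `.env` (and forwards `HUGGING_FACE_TOKEN` as a build arg). Based on the config values visible in this commit (`GEMINI_API_KEY` imported from `app.config`, `HUGGING_FACE_TOKEN` used in `download_models.py`), a local `.env` might look like the following; any variables beyond these two are assumptions, and the values are placeholders:

```ini
# .env — keep out of version control (already listed in .gitignore/.dockerignore)
GEMINI_API_KEY=your-gemini-api-key
HUGGING_FACE_TOKEN=hf_your-token
```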
download_models.py ADDED
@@ -0,0 +1,39 @@
+ import os
+ from transformers import pipeline
+ import logging
+
+ logging.basicConfig(level=logging.INFO)
+ logger = logging.getLogger(__name__)
+
+ # Hardcoded model names to avoid needing .env during build
+ HF_IMAGE_MODEL = "dima806/deepfake_vs_real_image_detection"
+ HF_AUDIO_MODEL = "mo-thecreator/Deepfake-audio-detection"
+
+ def download_models():
+     """
+     Downloads and caches the models in the Hugging Face cache directory.
+     This script is intended to be run during the Docker image build process
+     so that the models are baked into the image layers.
+     """
+     logger.info(f"Downloading image model: {HF_IMAGE_MODEL}...")
+     try:
+         # Load pipeline to force download of weights, tokenizer, config
+         _ = pipeline("image-classification", model=HF_IMAGE_MODEL)
+         logger.info("Successfully cached image model.")
+     except Exception as e:
+         logger.error(f"Failed to cache image model: {e}")
+         raise
+
+     logger.info(f"Downloading audio model: {HF_AUDIO_MODEL}...")
+     try:
+         # Load pipeline to force download, providing token for gated models
+         hf_token = os.environ.get("HUGGING_FACE_TOKEN")
+         _ = pipeline("audio-classification", model=HF_AUDIO_MODEL, token=hf_token)
+         logger.info("Successfully cached audio model.")
+     except Exception as e:
+         logger.error(f"Failed to cache audio model: {e}")
+         raise
+
+ if __name__ == "__main__":
+     download_models()
+     logger.info("All models downloaded and cached successfully.")
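Because `download_models.py` bakes the weights into the image layers, runtime code only pays for pipeline construction once per process and never for a download. A minimal lazy-singleton sketch of that runtime pattern (the loader body here is a stand-in; the real app would call `transformers.pipeline`, which resolves against the baked-in cache):

```python
from functools import lru_cache

load_count = {"n": 0}  # instrumentation to show construction happens once

@lru_cache(maxsize=None)
def get_pipeline(task: str, model: str):
    # Real app: return transformers.pipeline(task, model=model) — the weights
    # are already on disk thanks to the build-time download script.
    load_count["n"] += 1
    return (task, model)

p1 = get_pipeline("image-classification", "dima806/deepfake_vs_real_image_detection")
p2 = get_pipeline("image-classification", "dima806/deepfake_vs_real_image_detection")
# p1 is p2: the second call hits the cache instead of rebuilding the pipeline.
```

This keeps cold-start cost in the image build, where Hugging Face Spaces tolerates it, rather than in the first request.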
frontend/index.html ADDED
@@ -0,0 +1,122 @@
+ <!DOCTYPE html>
+ <html lang="en">
+ <head>
+     <meta charset="UTF-8">
+     <title>Anti-Phishing AI - Demo</title>
+ </head>
+ <body>
+     <h1>Anti-Phishing AI Backend - Demo</h1>
+     <p>Test all analysis endpoints. Results shown as raw JSON below each form.</p>
+     <hr>
+
+     <!-- TEXT -->
+     <h2>Text Analysis (phishing + URL detection)</h2>
+     <form id="textForm">
+         <label>Paste text or email body:<br>
+             <textarea id="textInput" rows="6" cols="80" placeholder="URGENT: Your account has been suspended. Click http://fake-bank.tk to verify..."></textarea>
+         </label><br>
+         <button type="submit">Analyze Text</button>
+     </form>
+     <pre id="textResult"></pre>
+     <hr>
+
+     <!-- IMAGE -->
+     <h2>Image Analysis (deepfake + phishing screenshot)</h2>
+     <form id="imageForm">
+         <label>Upload image (JPG, PNG, WEBP):<br>
+             <input type="file" id="imageInput" accept="image/*">
+         </label><br>
+         <button type="submit">Analyze Image</button>
+     </form>
+     <pre id="imageResult"></pre>
+     <hr>
+
+     <!-- VIDEO -->
+     <h2>Video Analysis (deepfake video detection)</h2>
+     <form id="videoForm">
+         <label>Upload video (MP4, WEBM):<br>
+             <input type="file" id="videoInput" accept="video/*">
+         </label><br>
+         <button type="submit">Analyze Video</button>
+         <span id="videoStatus"></span>
+     </form>
+     <pre id="videoResult"></pre>
+     <hr>
+
+     <!-- AUDIO -->
+     <h2>Audio Analysis (deepfake / AI voice detection)</h2>
+     <form id="audioForm">
+         <label>Upload audio (WAV, MP3, OGG):<br>
+             <input type="file" id="audioInput" accept="audio/*">
+         </label><br>
+         <button type="submit">Analyze Audio</button>
+     </form>
+     <pre id="audioResult"></pre>
+
+     <script>
+         const BASE = "";
+
+         async function postJSON(url, body, resultId) {
+             const el = document.getElementById(resultId);
+             el.textContent = "Analyzing...";
+             try {
+                 const res = await fetch(BASE + url, {
+                     method: "POST",
+                     headers: { "Content-Type": "application/json" },
+                     body: JSON.stringify(body),
+                 });
+                 const data = await res.json();
+                 el.textContent = JSON.stringify(data, null, 2);
+             } catch (e) {
+                 el.textContent = "ERROR: " + e.message;
+             }
+         }
+
+         async function postFile(url, file, resultId, statusId) {
+             const el = document.getElementById(resultId);
+             const statusEl = statusId ? document.getElementById(statusId) : null;
+             el.textContent = "Uploading and analyzing...";
+             if (statusEl) statusEl.textContent = " (this may take 10-30s for video)";
+             const form = new FormData();
+             form.append("file", file);
+             try {
+                 const res = await fetch(BASE + url, { method: "POST", body: form });
+                 const data = await res.json();
+                 el.textContent = JSON.stringify(data, null, 2);
+                 if (statusEl) statusEl.textContent = "";
+             } catch (e) {
+                 el.textContent = "ERROR: " + e.message;
+                 if (statusEl) statusEl.textContent = "";
+             }
+         }
+
+         document.getElementById("textForm").addEventListener("submit", e => {
+             e.preventDefault();
+             const text = document.getElementById("textInput").value.trim();
+             if (!text) return;
+             postJSON("/analyze/text", { text }, "textResult");
+         });
+
+         document.getElementById("imageForm").addEventListener("submit", e => {
+             e.preventDefault();
+             const file = document.getElementById("imageInput").files[0];
+             if (!file) return;
+             postFile("/analyze/image", file, "imageResult", null);
+         });
+
+         document.getElementById("videoForm").addEventListener("submit", e => {
+             e.preventDefault();
+             const file = document.getElementById("videoInput").files[0];
+             if (!file) return;
+             postFile("/analyze/video", file, "videoResult", "videoStatus");
+         });
+
+         document.getElementById("audioForm").addEventListener("submit", e => {
+             e.preventDefault();
+             const file = document.getElementById("audioInput").files[0];
+             if (!file) return;
+             postFile("/analyze/audio", file, "audioResult", null);
+         });
+     </script>
+ </body>
+ </html>
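Every endpoint the demo page calls returns the same JSON shape (`risk_score`, `threat_types`, `explanation`), so a non-browser client can render results with one small helper. A sketch of such a summarizer (the function is hypothetical, not part of this commit):

```python
def summarize_result(result: dict) -> str:
    """One-line summary of an /analyze/* response dict."""
    score = result.get("risk_score", 0.0)
    threats = ", ".join(result.get("threat_types", [])) or "none"
    return f"risk={score:.2f} threats={threats}"

print(summarize_result({"risk_score": 0.87, "threat_types": ["deepfake_video"],
                        "explanation": "Lip-sync drift detected."}))
# risk=0.87 threats=deepfake_video
```

Using `.get()` with defaults keeps the helper tolerant of the error payloads the tools return on failure.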
requirements.txt ADDED
@@ -0,0 +1,18 @@
+ fastapi
+ uvicorn[standard]
+ python-multipart
+ google-genai
+ langchain
+ langchain-google-genai
+ langchain-core
+ huggingface_hub
+ pydantic
+ python-dotenv
+ Pillow
+ opencv-python-headless
+ requests
+ transformers>=4.41.2
+ torch>=2.1.2
+ torchaudio>=2.1.2
+ torchvision>=0.16.2
+ soundfile>=0.12.1