broskiiii committed on
Commit
8d1d8b8
·
0 Parent(s):

Initial commit including Dockerized FastAPI app

.dockerignore ADDED
@@ -0,0 +1,15 @@
+ # Ignore local virtual environment
+ venv/
+ .env
+
+ # Ignore git
+ .git/
+
+ # Ignore Python cache
+ __pycache__/
+ *.pyc
+ *.pyo
+ *.pyd
+ .Python
+ env/
+ .pytest_cache/
.gitignore ADDED
@@ -0,0 +1,5 @@
+ venv/
+ .env
+ __pycache__/
+ *.pyc
+ .DS_Store
Dockerfile ADDED
@@ -0,0 +1,50 @@
+ # Use a slim Python base image
+ FROM python:3.10-slim
+
+ # Create a non-root user 'user' with UID 1000
+ # This is required by Hugging Face Spaces
+ RUN useradd -m -u 1000 user
+
+ # Set environment variables
+ ENV PYTHONDONTWRITEBYTECODE=1
+ ENV PYTHONUNBUFFERED=1
+ ENV HOME=/home/user
+ ENV PATH=$HOME/.local/bin:$PATH
+
+ # Install system dependencies required for OpenCV, audio processing, etc.
+ RUN apt-get update && apt-get install -y \
+     libgl1 \
+     libglib2.0-0 \
+     libsndfile1 \
+     && rm -rf /var/lib/apt/lists/*
+
+ # Set the working directory
+ WORKDIR $HOME/app
+
+ # Change ownership of the app directory to the 'user'
+ RUN chown -R user:user $HOME/app
+
+ # Switch to the non-root user
+ USER user
+
+ # Copy requirements and install them
+ COPY --chown=user:user requirements.txt .
+ RUN pip install --no-cache-dir -r requirements.txt
+
+ # Copy the download script and execute it to cache models
+ # This bakes the downloaded models directly into the Docker image layers
+ COPY --chown=user:user download_models.py .
+ ARG HUGGING_FACE_TOKEN
+ ENV HUGGING_FACE_TOKEN=$HUGGING_FACE_TOKEN
+ RUN python download_models.py
+
+ # Copy the rest of the application code
+ COPY --chown=user:user app ./app
+ COPY --chown=user:user frontend ./frontend
+
+
+ # Expose port (HF Spaces routes traffic to 7860 by default)
+ EXPOSE 7860
+
+ # Command to run the application (FastAPI via Uvicorn)
+ CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "7860"]
README.md ADDED
@@ -0,0 +1,106 @@
+ ---
+ title: Test
+ emoji: 🦀
+ colorFrom: green
+ colorTo: purple
+ sdk: docker
+ pinned: false
+ short_description: deepfakes
+ ---
+
+ # Anti-Phishing AI Backend
+
+ FastAPI backend and HTML demo for phishing and deepfake detection. Built for a hackathon.
+
+ ## Stack
+ - **Framework**: FastAPI
+ - **AI Agent**: LangChain + Google Gemini 2.5 Flash (text, image, video, audio)
+ - **Deepfake models**: local HuggingFace `transformers` pipelines (image + audio)
+ - **Video**: Gemini Files API + frame-level HF image model
+
+ ## Setup
+
+ ```bash
+ cd antiphish
+ pip install -r requirements.txt
+ ```
+
+ Copy the `.env` file from the parent directory or create one:
+ ```
+ GEMINI_API_KEY=your_key_here
+ HUGGING_FACE_TOKEN=your_hf_token_here
+ ```
+
+ ## Run
+
+ ```bash
+ uvicorn app.main:app --reload
+ ```
+
+ - **API docs**: http://localhost:8000/docs
+ - **Demo frontend**: http://localhost:8000
+
+ ## Walkthrough: How To Use
+
+ The Anti-Phishing AI app analyzes text, images, videos, and audio for phishing attempts, scams, and deepfakes.
+
+ ### 1. Web Interface Walkthrough
+ When you open `http://localhost:8000`, you will see a simple user interface. Switch between tabs depending on the type of media you want to analyze:
+
+ * **Text & URLs:** Paste suspicious emails, SMS messages, or links. The app uses Gemini to detect urgency language, impersonation tactics, and credential harvesting, and checks any URLs against a list of suspicious top-level domains.
+ * **Images:** Upload an image (like a screenshot of a login page or a photo of a document). A HuggingFace model checks whether any face in the image is a deepfake, while Gemini Vision checks whether the image is a fake login screen or brand impersonation.
+ * **Video:** Upload a short `.mp4` video. The app samples frames and runs deepfake diagnostics on them, while simultaneously uploading the video to Gemini to check for unnatural blinking, lip-sync inconsistencies, and visual anomalies.
+ * **Audio:** Upload an audio file (like a voicemail or recorded phone call). The HuggingFace model checks the audio waveform for synthetic/AI-generated markers, while Gemini listens for common scam scripts (e.g., "fake bank security alert" or "tech support").
+
+ ### 2. API / Developer Walkthrough
+ You can integrate this backend with another app or bot by sending requests directly to the API endpoints.
+
+ **Checking the API documentation:** The auto-generated Swagger docs are at http://localhost:8000/docs.
+
+ **Testing the Text Endpoint via terminal:**
+ ```bash
+ curl -X POST http://localhost:8000/analyze/text \
+   -H "Content-Type: application/json" \
+   -d '{"text": "URGENT: Your Paypal account has been locked. Click here to verify your identity: http://paypal-secure.ml/login"}'
+ ```
+
+ **Testing the Image/Audio/Video Endpoints:**
+ For media, send the file as a `multipart/form-data` upload:
+ ```bash
+ curl -X POST http://localhost:8000/analyze/image \
+   -F "file=@/path/to/suspicious_image.jpg"
+ ```
+
+ ## Endpoints
+
+ | Method | Endpoint | Input | Description |
+ |---|---|---|---|
+ | POST | `/analyze/text` | JSON `{"text": "..."}` | Phishing text + URL detection |
+ | POST | `/analyze/image` | multipart file | Deepfake + phishing screenshot detection |
+ | POST | `/analyze/video` | multipart file | Deepfake video detection |
+ | POST | `/analyze/audio` | multipart file | Deepfake / AI voice detection |
+
+ ## Response Format
+
+ ```json
+ {
+   "risk_score": 0.87,
+   "risk_level": "CRITICAL",
+   "threat_types": ["phishing", "urgency_language", "malicious_url"],
+   "explanation": "Human-readable analysis from Gemini.",
+   "tool_outputs": { ... }
+ }
+ ```
+
+ `risk_level`: `LOW` (0-0.3) | `MEDIUM` (0.3-0.6) | `HIGH` (0.6-0.85) | `CRITICAL` (0.85-1.0)
+
+ ## Models Used
+
+ | Modality | Model |
+ |---|---|
+ | Text / URL | Gemini 2.5 Flash (structured JSON prompt) |
+ | Image deepfake | `dima806/deepfake_vs_real_image_detection` (local pipeline) |
+ | Image phishing | Gemini Vision (multimodal) |
+ | Video deepfake | Gemini Files API + frame-sampled HF image model |
+ | Audio deepfake | `mo-thecreator/Deepfake-audio-detection` (local pipeline) |
+ | Audio voice scam | Gemini Audio (multimodal) |
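As a convenience for API consumers, the risk bands above can be mirrored client-side. A minimal sketch (the `risk_band` helper is ours for illustration, not part of the backend):

```python
def risk_band(score: float) -> str:
    """Map a risk_score in [0, 1] to the band documented above."""
    if score < 0.3:
        return "LOW"
    if score < 0.6:
        return "MEDIUM"
    if score < 0.85:
        return "HIGH"
    return "CRITICAL"

print(risk_band(0.87))  # → CRITICAL
```

Note that 0.87 lands in the CRITICAL band, which is why the example response above reports `"risk_level": "CRITICAL"`.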
app/__init__.py ADDED
@@ -0,0 +1 @@
+ """Empty init to make app a package."""
app/agent.py ADDED
@@ -0,0 +1,120 @@
+ """
+ LangChain agent wiring: registers all tools and invokes them per modality.
+ Returns a structured AnalysisResult.
+ """
+ import json
+ from langchain_google_genai import ChatGoogleGenerativeAI
+ from langchain_core.messages import HumanMessage, SystemMessage
+ from app.config import GEMINI_API_KEY, GEMINI_MODEL, GEMINI_MODEL_FALLBACKS
+ from app.models import AnalysisResult
+
+
+ def _risk_level(score: float) -> str:
+     if score < 0.3:
+         return "LOW"
+     elif score < 0.6:
+         return "MEDIUM"
+     elif score < 0.85:
+         return "HIGH"
+     return "CRITICAL"
+
+
+ def invoke_with_fallback(messages: list) -> str:
+     """Try GEMINI_MODEL then each fallback until one succeeds."""
+     models_to_try = [GEMINI_MODEL] + GEMINI_MODEL_FALLBACKS
+     last_err = None
+     for model_name in models_to_try:
+         try:
+             llm = ChatGoogleGenerativeAI(
+                 model=model_name,
+                 google_api_key=GEMINI_API_KEY,
+                 temperature=0.1,
+             )
+             return llm.invoke(messages).content
+         except Exception as e:
+             last_err = e
+             if "429" not in str(e) and "RESOURCE_EXHAUSTED" not in str(e):
+                 raise
+     raise RuntimeError(f"All Gemini models exhausted. Last error: {last_err}")
+
+
+ def run_text_agent(text: str, url_flags: dict) -> AnalysisResult:
+     system = (
+         "You are a cybersecurity expert specializing in phishing detection. "
+         "Analyse the provided text for phishing indicators: urgency language, "
+         "impersonation, social engineering, suspicious URLs, credential harvesting. "
+         "Respond ONLY with valid JSON matching this schema: "
+         '{"risk_score": <float 0-1>, "threat_types": [<strings>], "explanation": <string>}'
+     )
+     prompt = f"TEXT TO ANALYSE:\n{text}\n\nURL SCAN RESULTS:\n{json.dumps(url_flags)}"
+     raw = invoke_with_fallback([SystemMessage(content=system), HumanMessage(content=prompt)])
+     raw = raw.strip().removeprefix("```json").removesuffix("```").strip()
+     data = json.loads(raw)
+     score = float(data["risk_score"])
+     return AnalysisResult(
+         risk_score=score,
+         risk_level=_risk_level(score),
+         threat_types=data.get("threat_types", []),
+         explanation=data.get("explanation", ""),
+         tool_outputs={"gemini_text": data, "url_scan": url_flags},
+     )
+
+
+ def run_image_agent(hf_result: dict, gemini_result: dict) -> AnalysisResult:
+     hf_score = hf_result.get("deepfake_score", 0.0)
+     gemini_score = gemini_result.get("risk_score", 0.0)
+     combined = round((hf_score * 0.5) + (gemini_score * 0.5), 3)
+     threat_types = list(
+         set(hf_result.get("threat_types", []) + gemini_result.get("threat_types", []))
+     )
+     explanation = (
+         f"HuggingFace deepfake model: {hf_result.get('label', 'N/A')} "
+         f"(confidence {hf_score:.2f}). "
+         f"Gemini vision analysis: {gemini_result.get('explanation', '')}"
+     )
+     return AnalysisResult(
+         risk_score=combined,
+         risk_level=_risk_level(combined),
+         threat_types=threat_types,
+         explanation=explanation,
+         tool_outputs={"hf_deepfake": hf_result, "gemini_vision": gemini_result},
+     )
+
+
+ def run_video_agent(gemini_result: dict, frame_scores: list[float]) -> AnalysisResult:
+     gemini_score = gemini_result.get("risk_score", 0.0)
+     avg_frame = sum(frame_scores) / len(frame_scores) if frame_scores else 0.0
+     combined = round((gemini_score * 0.6) + (avg_frame * 0.4), 3)
+     explanation = (
+         f"Gemini video analysis score: {gemini_score:.2f}. "
+         f"Frame-level deepfake average: {avg_frame:.2f} over {len(frame_scores)} frames. "
+         f"{gemini_result.get('explanation', '')}"
+     )
+     return AnalysisResult(
+         risk_score=combined,
+         risk_level=_risk_level(combined),
+         threat_types=gemini_result.get("threat_types", ["deepfake_video"]),
+         explanation=explanation,
+         tool_outputs={"gemini_video": gemini_result, "frame_scores": frame_scores},
+     )
+
+
+ def run_audio_agent(hf_result: dict, gemini_result: dict) -> AnalysisResult:
+     hf_score = hf_result.get("deepfake_score", 0.0)
+     gemini_score = gemini_result.get("risk_score", 0.0)
+     combined = round((hf_score * 0.5) + (gemini_score * 0.5), 3)
+     threat_types = list(
+         set(hf_result.get("threat_types", []) + gemini_result.get("threat_types", []))
+     )
+     explanation = (
+         f"HuggingFace audio deepfake model: {hf_result.get('label', 'N/A')} "
+         f"(confidence {hf_score:.2f}). "
+         f"Gemini audio analysis: {gemini_result.get('explanation', '')}"
+     )
+     return AnalysisResult(
+         risk_score=combined,
+         risk_level=_risk_level(combined),
+         threat_types=threat_types,
+         explanation=explanation,
+         tool_outputs={"hf_audio": hf_result, "gemini_audio": gemini_result},
+     )
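All four `run_*_agent` functions share one fusion pattern: a weighted average of the two detector scores plus a union of their threat types. A standalone sketch of that pattern (the `fuse` name is illustrative, not from the codebase; the weights mirror the 0.5/0.5 split in `run_image_agent` and can be shifted as in `run_video_agent`'s 0.6/0.4):

```python
def fuse(hf_score, gemini_score, hf_threats, gemini_threats, w_hf=0.5):
    """Weighted fusion of two detector scores with a union of threat labels."""
    combined = round(hf_score * w_hf + gemini_score * (1 - w_hf), 3)
    return {
        "risk_score": combined,
        "threat_types": sorted(set(hf_threats) | set(gemini_threats)),
    }

print(fuse(0.9, 0.5, ["deepfake_image"], ["brand_impersonation"]))
# → {'risk_score': 0.7, 'threat_types': ['brand_impersonation', 'deepfake_image']}
```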
app/config.py ADDED
@@ -0,0 +1,12 @@
+ import os
+ from dotenv import load_dotenv
+
+ load_dotenv(dotenv_path=os.path.join(os.path.dirname(__file__), "..", ".env"))
+
+ GEMINI_API_KEY: str = os.environ["GEMINI_API_KEY"]
+ HUGGING_FACE_TOKEN: str = os.environ["HUGGING_FACE_TOKEN"]
+
+ HF_IMAGE_MODEL = "dima806/deepfake_vs_real_image_detection"
+ HF_AUDIO_MODEL = "mo-thecreator/Deepfake-audio-detection"
+ GEMINI_MODEL = "gemini-2.5-flash"
+ GEMINI_MODEL_FALLBACKS = ["gemini-2.5-flash"]
app/main.py ADDED
@@ -0,0 +1,40 @@
+ """
+ FastAPI entrypoint for the Anti-Phishing Backend.
+ Serves all 4 analysis routers and the plain HTML demo frontend.
+ """
+ from fastapi import FastAPI
+ from fastapi.middleware.cors import CORSMiddleware
+ from fastapi.responses import FileResponse
+ import os
+
+ from app.routers import text, image, video, audio
+
+ app = FastAPI(
+     title="Anti-Phishing AI Backend",
+     description="LangChain + Gemini powered phishing and deepfake detection API",
+     version="1.0.0",
+ )
+
+ app.add_middleware(
+     CORSMiddleware,
+     allow_origins=["*"],
+     allow_methods=["*"],
+     allow_headers=["*"],
+ )
+
+ app.include_router(text.router, prefix="/analyze", tags=["Text"])
+ app.include_router(image.router, prefix="/analyze", tags=["Image"])
+ app.include_router(video.router, prefix="/analyze", tags=["Video"])
+ app.include_router(audio.router, prefix="/analyze", tags=["Audio"])
+
+ FRONTEND_DIR = os.path.join(os.path.dirname(__file__), "..", "frontend")
+
+
+ @app.get("/", include_in_schema=False)
+ async def serve_frontend():
+     return FileResponse(os.path.join(FRONTEND_DIR, "index.html"))
+
+
+ @app.get("/health")
+ async def health():
+     return {"status": "ok"}
app/models.py ADDED
@@ -0,0 +1,14 @@
+ from pydantic import BaseModel, Field
+ from typing import Any
+
+
+ class TextRequest(BaseModel):
+     text: str
+
+
+ class AnalysisResult(BaseModel):
+     risk_score: float = Field(..., ge=0.0, le=1.0, description="0.0 = safe, 1.0 = critical threat")
+     risk_level: str = Field(..., description="LOW | MEDIUM | HIGH | CRITICAL")
+     threat_types: list[str] = Field(default_factory=list)
+     explanation: str
+     tool_outputs: dict[str, Any] = Field(default_factory=dict)
app/routers/__init__.py ADDED
@@ -0,0 +1 @@
+ """Empty init to make app/routers a package."""
app/routers/audio.py ADDED
@@ -0,0 +1,20 @@
+ from fastapi import APIRouter, UploadFile, File, HTTPException
+ from app.models import AnalysisResult
+ from app.tools.audio_tools import hf_detect_audio_deepfake, gemini_analyze_audio
+ from app.agent import run_audio_agent
+
+ router = APIRouter()
+
+
+ @router.post("/audio", response_model=AnalysisResult)
+ async def analyze_audio(file: UploadFile = File(...)):
+     allowed = {"audio/wav", "audio/mpeg", "audio/mp3", "audio/ogg", "audio/x-wav"}
+     if file.content_type not in allowed:
+         raise HTTPException(status_code=400, detail=f"Unsupported audio type: {file.content_type}")
+     try:
+         audio_bytes = await file.read()
+         hf_result = hf_detect_audio_deepfake(audio_bytes)
+         gemini_result = gemini_analyze_audio(audio_bytes, mime_type=file.content_type)
+         return run_audio_agent(hf_result, gemini_result)
+     except Exception as e:
+         raise HTTPException(status_code=500, detail=str(e))
app/routers/image.py ADDED
@@ -0,0 +1,20 @@
+ from fastapi import APIRouter, UploadFile, File, HTTPException
+ from app.models import AnalysisResult
+ from app.tools.image_tools import hf_detect_image_deepfake, gemini_analyze_image
+ from app.agent import run_image_agent
+
+ router = APIRouter()
+
+
+ @router.post("/image", response_model=AnalysisResult)
+ async def analyze_image(file: UploadFile = File(...)):
+     allowed = {"image/jpeg", "image/png", "image/webp", "image/gif"}
+     if file.content_type not in allowed:
+         raise HTTPException(status_code=400, detail=f"Unsupported image type: {file.content_type}")
+     try:
+         image_bytes = await file.read()
+         hf_result = hf_detect_image_deepfake(image_bytes, mime_type=file.content_type)
+         gemini_result = gemini_analyze_image(image_bytes, mime_type=file.content_type)
+         return run_image_agent(hf_result, gemini_result)
+     except Exception as e:
+         raise HTTPException(status_code=500, detail=str(e))
app/routers/text.py ADDED
@@ -0,0 +1,16 @@
+ from fastapi import APIRouter, HTTPException
+ from app.models import TextRequest, AnalysisResult
+ from app.tools.text_tools import analyze_urls_in_text
+ from app.agent import run_text_agent
+
+ router = APIRouter()
+
+
+ @router.post("/text", response_model=AnalysisResult)
+ async def analyze_text(request: TextRequest):
+     try:
+         url_flags = analyze_urls_in_text(request.text)
+         # run_text_agent invokes Gemini itself, so no separate gemini_analyze_text call is needed
+         return run_text_agent(request.text, url_flags)
+     except Exception as e:
+         raise HTTPException(status_code=500, detail=str(e))
app/routers/video.py ADDED
@@ -0,0 +1,20 @@
+ from fastapi import APIRouter, UploadFile, File, HTTPException
+ from app.models import AnalysisResult
+ from app.tools.video_tools import gemini_analyze_video, score_video_frames
+ from app.agent import run_video_agent
+
+ router = APIRouter()
+
+
+ @router.post("/video", response_model=AnalysisResult)
+ async def analyze_video(file: UploadFile = File(...)):
+     allowed = {"video/mp4", "video/mpeg", "video/webm", "video/quicktime"}
+     if file.content_type not in allowed:
+         raise HTTPException(status_code=400, detail=f"Unsupported video type: {file.content_type}")
+     try:
+         video_bytes = await file.read()
+         gemini_result = gemini_analyze_video(video_bytes, mime_type=file.content_type)
+         frame_scores = score_video_frames(video_bytes)
+         return run_video_agent(gemini_result, frame_scores)
+     except Exception as e:
+         raise HTTPException(status_code=500, detail=str(e))
app/tools/__init__.py ADDED
@@ -0,0 +1 @@
+ """Empty init to make app/tools a package."""
app/tools/audio_tools.py ADDED
@@ -0,0 +1,85 @@
+ """
+ Audio tools:
+ - local HuggingFace `transformers` pipeline for deepfake/AI-voice detection
+ - Gemini for voice-scam content analysis
+ """
+ import json
+ import base64
+ from langchain_google_genai import ChatGoogleGenerativeAI
+ from langchain_core.messages import HumanMessage
+ from app.config import GEMINI_API_KEY, GEMINI_MODEL, GEMINI_MODEL_FALLBACKS, HF_AUDIO_MODEL
+ from transformers import pipeline
+
+ # Global pipeline instance to load the model once at startup (or first use)
+ _audio_classifier = None
+
+
+ def get_audio_classifier():
+     global _audio_classifier
+     if _audio_classifier is None:
+         # Load the local model that was downloaded by download_models.py
+         _audio_classifier = pipeline("audio-classification", model=HF_AUDIO_MODEL)
+     return _audio_classifier
+
+
+ def hf_detect_audio_deepfake(audio_bytes: bytes) -> dict:
+     import io
+     import soundfile as sf
+
+     try:
+         # Read the audio bytes directly into a numpy array (skipping the disk write);
+         # pipeline("audio-classification") accepts either a file path or a raw waveform array
+         audio_data, sample_rate = sf.read(io.BytesIO(audio_bytes))
+
+         # sf.read may return stereo (a 2D array); convert to mono if necessary
+         if len(audio_data.shape) > 1:
+             audio_data = audio_data.mean(axis=1)
+
+         classifier = get_audio_classifier()
+
+         # When passing a raw array, the sampling rate must be supplied explicitly;
+         # HF pipelines accept a dict of the form {"raw": array, "sampling_rate": sr}
+         results = classifier({"raw": audio_data, "sampling_rate": sample_rate})
+
+         # The pipeline returns a list of dicts like [{'label': 'real', 'score': 0.9}, ...]
+         label_map = {r["label"].lower(): r["score"] for r in results}
+         fake_score = label_map.get("fake", label_map.get("spoof", label_map.get("ai-generated", 0.0)))
+         label = "FAKE" if fake_score > 0.5 else "REAL"
+
+         return {
+             "label": label,
+             "deepfake_score": round(fake_score, 4),
+             "threat_types": ["deepfake_audio", "ai_voice"] if fake_score > 0.5 else [],
+             "raw": [{"label": r["label"], "score": r["score"]} for r in results],
+         }
+     except Exception as e:
+         return {"label": "ERROR", "deepfake_score": 0.0, "threat_types": [], "error": str(e)}
+
+
+ def gemini_analyze_audio(audio_bytes: bytes, mime_type: str = "audio/wav") -> dict:
+     b64 = base64.b64encode(audio_bytes).decode()
+     system = (
+         "You are a voice fraud and deepfake audio expert. Analyse this audio for: "
+         "AI-generated voice patterns, robotic/synthetic speech artifacts, voice cloning indicators, "
+         "scam scripts (fake tech support, fake bank calls, urgent threats). "
+         "Reply ONLY with valid JSON: "
+         '{"risk_score": <float 0-1>, "threat_types": [<strings>], "explanation": <string>}'
+     )
+     message = HumanMessage(
+         content=[
+             {"type": "text", "text": system},
+             {"type": "media", "data": b64, "mime_type": mime_type},
+         ]
+     )
+     from app.tools.retry_utils import execute_with_retry
+     for model in [GEMINI_MODEL] + GEMINI_MODEL_FALLBACKS:
+         try:
+             resp = execute_with_retry(
+                 lambda m=model: ChatGoogleGenerativeAI(model=m, google_api_key=GEMINI_API_KEY, temperature=0.1).invoke([message])
+             )
+             raw = resp.content.strip().removeprefix("```json").removesuffix("```").strip()
+             return json.loads(raw)
+         except Exception as e:
+             if "429" not in str(e) and "RESOURCE_EXHAUSTED" not in str(e):
+                 raise
+     return {"risk_score": 0.0, "threat_types": [], "explanation": "Gemini quota exhausted for all models"}
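The stereo-to-mono step in `hf_detect_audio_deepfake` is just a per-frame average across channels; the same operation without numpy, as a hypothetical `to_mono` helper for illustration:

```python
def to_mono(frames: list[list[float]]) -> list[float]:
    """Average the channel samples of each frame, like audio_data.mean(axis=1)."""
    return [sum(channels) / len(channels) for channels in frames]

stereo = [[0.25, 0.75], [1.0, 0.0]]  # two frames, two channels each
print(to_mono(stereo))  # → [0.5, 0.5]
```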
app/tools/image_tools.py ADDED
@@ -0,0 +1,79 @@
+ """
+ Image tools:
+ - local HuggingFace `transformers` pipeline for deepfake image detection
+ - Gemini Vision for phishing screenshot / fake document analysis
+ """
+ import json
+ import base64
+ from langchain_google_genai import ChatGoogleGenerativeAI
+ from langchain_core.messages import HumanMessage
+ from app.config import GEMINI_API_KEY, GEMINI_MODEL, GEMINI_MODEL_FALLBACKS, HF_IMAGE_MODEL
+ from transformers import pipeline
+
+ # Global pipeline instance to load the model once at startup (or first use)
+ _image_classifier = None
+
+
+ def get_image_classifier():
+     global _image_classifier
+     if _image_classifier is None:
+         # Load the local model that was downloaded by download_models.py
+         _image_classifier = pipeline("image-classification", model=HF_IMAGE_MODEL)
+     return _image_classifier
+
+
+ def hf_detect_image_deepfake(image_bytes: bytes, mime_type: str = "image/jpeg") -> dict:
+     import io
+     from PIL import Image
+
+     try:
+         # Open the image with PIL
+         image = Image.open(io.BytesIO(image_bytes)).convert("RGB")
+
+         classifier = get_image_classifier()
+
+         # Run inference locally
+         results = classifier(image)
+
+         # The pipeline returns a list of dicts like [{'label': 'real', 'score': 0.9}, ...]
+         label_map = {r["label"].lower(): r["score"] for r in results}
+         fake_score = label_map.get("fake", label_map.get("deepfake", 0.0))
+         label = "FAKE" if fake_score > 0.5 else "REAL"
+
+         return {
+             "label": label,
+             "deepfake_score": round(fake_score, 4),
+             "threat_types": ["deepfake_image"] if fake_score > 0.5 else [],
+             "raw": [{"label": r["label"], "score": r["score"]} for r in results],
+         }
+     except Exception as e:
+         return {"label": "ERROR", "deepfake_score": 0.0, "threat_types": [], "error": str(e)}
+
+
+ def gemini_analyze_image(image_bytes: bytes, mime_type: str = "image/jpeg") -> dict:
+     b64 = base64.b64encode(image_bytes).decode()
+     system = (
+         "You are a cybersecurity image analyst. Examine the image for: "
+         "fake login pages, phishing screenshots, fake documents, impersonated brands, "
+         "deepfake or AI-generated faces/content. "
+         "Reply ONLY with valid JSON: "
+         '{"risk_score": <float 0-1>, "threat_types": [<strings>], "explanation": <string>}'
+     )
+     message = HumanMessage(
+         content=[
+             {"type": "text", "text": system},
+             {"type": "image_url", "image_url": {"url": f"data:{mime_type};base64,{b64}"}},
+         ]
+     )
+     from app.tools.retry_utils import execute_with_retry
+     for model in [GEMINI_MODEL] + GEMINI_MODEL_FALLBACKS:
+         try:
+             resp = execute_with_retry(
+                 lambda m=model: ChatGoogleGenerativeAI(model=m, google_api_key=GEMINI_API_KEY, temperature=0.1).invoke([message])
+             )
+             raw = resp.content.strip().removeprefix("```json").removesuffix("```").strip()
+             return json.loads(raw)
+         except Exception as e:
+             if "429" not in str(e) and "RESOURCE_EXHAUSTED" not in str(e):
+                 raise
+     return {"risk_score": 0.0, "threat_types": [], "explanation": "Gemini quota exhausted for all models"}
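The label handling is the subtle part: classification pipelines return a ranked list of `{label, score}` dicts, and the code looks up the "fake" class by name with a fallback default of 0.0. That lookup in isolation (the `fake_score_from` name is ours, for illustration):

```python
def fake_score_from(results: list[dict]) -> float:
    """Extract the 'fake'/'deepfake' probability from pipeline output, else 0.0."""
    label_map = {r["label"].lower(): r["score"] for r in results}
    return label_map.get("fake", label_map.get("deepfake", 0.0))

preds = [{"label": "Real", "score": 0.12}, {"label": "Fake", "score": 0.88}]
print(fake_score_from(preds))  # → 0.88
```

Lowercasing the labels makes the lookup robust to models that emit "Fake" vs "fake"; a model whose classes are named differently would silently score 0.0, which is why the `raw` list is also returned.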
app/tools/retry_utils.py ADDED
@@ -0,0 +1,25 @@
+ import time
+ import logging
+ from typing import Callable, Any
+
+ logger = logging.getLogger(__name__)
+
+ def execute_with_retry(func: Callable, max_retries: int = 4, initial_backoff: float = 2.0, *args, **kwargs) -> Any:
+     """
+     Executes a function with exponential backoff for 429 and RESOURCE_EXHAUSTED errors.
+     """
+     backoff = initial_backoff
+     for attempt in range(max_retries):
+         try:
+             return func(*args, **kwargs)
+         except Exception as e:
+             error_msg = str(e)
+             if "429" in error_msg or "RESOURCE_EXHAUSTED" in error_msg:
+                 if attempt == max_retries - 1:
+                     logger.error(f"Max retries reached for 429 error: {e}")
+                     raise
+                 logger.warning(f"Rate limited (429/Resource Exhausted). Retrying in {backoff} seconds...")
+                 time.sleep(backoff)
+                 backoff *= 2
+             else:
+                 raise
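The behaviour of `execute_with_retry` in brief: retry only on rate-limit errors, doubling the delay each attempt, and re-raise everything else immediately. A self-contained sketch of the same pattern (simplified to match only "429", with a tiny backoff so it runs instantly; `flaky` simulates two rate-limit failures):

```python
import time

def with_backoff(func, max_retries: int = 4, initial_backoff: float = 0.01):
    """Exponential backoff on 429-style errors; other exceptions propagate at once."""
    backoff = initial_backoff
    for attempt in range(max_retries):
        try:
            return func()
        except Exception as e:
            if "429" not in str(e) or attempt == max_retries - 1:
                raise
            time.sleep(backoff)
            backoff *= 2  # 0.01s, 0.02s, 0.04s, ...

calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("429 rate limited")
    return "ok"

print(with_backoff(flaky))  # → ok
```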
app/tools/text_tools.py ADDED
@@ -0,0 +1,81 @@
+ """
+ Text tools: phishing text analysis via Gemini + URL extraction/scoring.
+ """
+ import re
+ import json
+ from langchain_google_genai import ChatGoogleGenerativeAI
+ from langchain_core.messages import HumanMessage, SystemMessage
+ from app.config import GEMINI_API_KEY, GEMINI_MODEL, GEMINI_MODEL_FALLBACKS
+
+ _URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)
+ _BAD_TLDS = {".tk", ".ml", ".ga", ".cf", ".gq", ".xyz", ".top", ".click", ".loan", ".work"}
+
+
+ def extract_urls(text: str) -> list[str]:
+     return _URL_RE.findall(text)
+
+
+ def score_url(url: str) -> dict:
+     from urllib.parse import urlparse
+     parsed = urlparse(url)
+     domain = parsed.netloc.lower()
+     flags = []
+     is_suspicious = False
+
+     for tld in _BAD_TLDS:
+         if domain.endswith(tld):
+             flags.append(f"suspicious_tld:{tld}")
+             is_suspicious = True
+
+     # Flag brand keywords in the host unless the brand *is* the registrable domain
+     # (so www.paypal.com passes while paypal-secure.ml is flagged)
+     brand_impersonations = ["paypal", "amazon", "google", "microsoft", "apple", "bank", "secure", "login", "verify"]
+     labels = domain.split(".")
+     registrable = labels[-2] if len(labels) >= 2 else domain
+     for brand in brand_impersonations:
+         if brand in domain and registrable != brand:
+             flags.append(f"impersonation:{brand}")
+             is_suspicious = True
+
+     if re.search(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}", domain):
+         flags.append("ip_address_url")
+         is_suspicious = True
+
+     return {"url": url, "suspicious": is_suspicious, "flags": flags}
+
+
+ def analyze_urls_in_text(text: str) -> dict:
+     urls = extract_urls(text)
+     scored = [score_url(u) for u in urls]
+     suspicious_count = sum(1 for s in scored if s["suspicious"])
+     return {
+         "urls_found": len(urls),
+         "suspicious_count": suspicious_count,
+         "url_details": scored,
+     }
+
+
+ def _invoke(messages):
+     from app.tools.retry_utils import execute_with_retry
+     for model in [GEMINI_MODEL] + GEMINI_MODEL_FALLBACKS:
+         try:
+             return execute_with_retry(
+                 lambda m=model: ChatGoogleGenerativeAI(model=m, google_api_key=GEMINI_API_KEY, temperature=0.1).invoke(messages).content
+             )
+         except Exception as e:
+             if "429" not in str(e) and "RESOURCE_EXHAUSTED" not in str(e):
+                 raise
+     raise RuntimeError("All Gemini models quota exhausted")
+
+
+ def gemini_analyze_text(text: str) -> dict:
+     system = (
+         "You are an expert phishing and scam text analyser. "
+         "Detect: urgency language, impersonation, social engineering, credential harvesting, "
+         "suspicious links, fake authority claims. "
+         "Reply ONLY with valid JSON: "
+         '{"risk_score": <float 0.0-1.0>, "threat_types": [<strings>], "explanation": <string>}'
+     )
+     raw = _invoke([SystemMessage(content=system), HumanMessage(content=text)])
+     raw = raw.strip().removeprefix("```json").removesuffix("```").strip()
+     return json.loads(raw)
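The URL heuristics are plain string checks. A condensed, runnable subset covering just the bad-TLD and raw-IP detection (trimmed TLD list and the `quick_url_flags` name are ours for illustration):

```python
import re
from urllib.parse import urlparse

BAD_TLDS = (".tk", ".ml", ".xyz")

def quick_url_flags(url: str) -> list[str]:
    """Subset of score_url: suspicious-TLD and IP-address-host checks."""
    host = urlparse(url).netloc.lower()
    flags = [f"suspicious_tld:{tld}" for tld in BAD_TLDS if host.endswith(tld)]
    if re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host):
        flags.append("ip_address_url")
    return flags

print(quick_url_flags("http://paypal-secure.ml/login"))  # → ['suspicious_tld:.ml']
```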
app/tools/video_tools.py ADDED
@@ -0,0 +1,75 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
+ import os
+ import json
+ import tempfile
+ import time
+ import cv2
+ from google import genai
+ from google.genai import types
+ from langchain_google_genai import ChatGoogleGenerativeAI
+ from langchain_core.messages import HumanMessage, SystemMessage
+ from app.config import GEMINI_API_KEY, GEMINI_MODEL
+ from app.tools.image_tools import hf_detect_image_deepfake
+
+ client = genai.Client(api_key=GEMINI_API_KEY)
+
+
+ def sample_frames(video_bytes: bytes, n_frames: int = 8) -> list[bytes]:
+     with tempfile.NamedTemporaryFile(suffix=".mp4", delete=False) as tmp:
+         tmp.write(video_bytes)
+         tmp_path = tmp.name
+     frames = []
+     try:
+         cap = cv2.VideoCapture(tmp_path)
+         total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
+         step = max(1, total // n_frames)
+         for i in range(n_frames):
+             cap.set(cv2.CAP_PROP_POS_FRAMES, i * step)
+             ret, frame = cap.read()
+             if ret:
+                 _, buf = cv2.imencode(".jpg", frame)
+                 frames.append(buf.tobytes())
+         cap.release()
+     finally:
+         os.unlink(tmp_path)
+     return frames
+
+
+ def gemini_analyze_video(video_bytes: bytes, mime_type: str = "video/mp4") -> dict:
+     with tempfile.NamedTemporaryFile(suffix=".mp4", delete=False) as tmp:
+         tmp.write(video_bytes)
+         tmp_path = tmp.name
+     try:
+         uploaded = client.files.upload(file=tmp_path, config=types.UploadFileConfig(mime_type=mime_type))
+         while uploaded.state.name == "PROCESSING":
+             time.sleep(2)
+             uploaded = client.files.get(name=uploaded.name)
+
+         system_prompt = (
+             "You are a deepfake and media manipulation expert. Analyse this video for: "
+             "AI-generated faces, lip-sync inconsistencies, unnatural blinking, "
+             "visual artefacts, lighting inconsistencies, and audio-visual mismatch. "
+             "Reply ONLY with valid JSON: "
+             '{"risk_score": <float 0-1>, "threat_types": [<strings>], "explanation": <string>}'
+         )
+         from app.tools.retry_utils import execute_with_retry
+         response = execute_with_retry(
+             lambda: client.models.generate_content(
+                 model=GEMINI_MODEL,
+                 contents=[system_prompt, uploaded],
+             )
+         )
+         raw = response.text.strip().removeprefix("```json").removesuffix("```").strip()
+         return json.loads(raw)
+     except Exception as e:
+         return {"risk_score": 0.0, "threat_types": [], "explanation": f"Error: {e}"}
+     finally:
+         os.unlink(tmp_path)
+
+
+ def score_video_frames(video_bytes: bytes) -> list[float]:
+     frames = sample_frames(video_bytes)
+     scores = []
+     for frame_bytes in frames:
+         result = hf_detect_image_deepfake(frame_bytes, mime_type="image/jpeg")
+         scores.append(result.get("deepfake_score", 0.0))
+     return scores
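`score_video_frames` returns the raw per-frame scores; how a caller combines them into one verdict is not shown in this commit. One plausible aggregation, sketched under the assumption that the max is used (a single convincingly fake frame is enough to flag the clip; the function name and strategy are illustrative):

```python
def aggregate_frame_scores(scores: list[float]) -> dict:
    """Collapse per-frame deepfake scores into a single verdict.

    Taking the max is deliberately conservative: one high-scoring frame
    flags the whole clip, even if most frames look clean.
    """
    if not scores:
        # No decodable frames: report zero risk rather than crashing.
        return {"risk_score": 0.0, "mean_score": 0.0, "frames_scored": 0}
    return {
        "risk_score": max(scores),
        "mean_score": round(sum(scores) / len(scores), 3),
        "frames_scored": len(scores),
    }
```

Reporting both max and mean lets downstream code distinguish "one suspicious frame" from "uniformly suspicious video".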
docker-compose.yml ADDED
@@ -0,0 +1,13 @@
+ version: '3.8'
+
+ services:
+   antiphish:
+     build:
+       context: .
+       args:
+         HUGGING_FACE_TOKEN: ${HUGGING_FACE_TOKEN}
+     ports:
+       - "7860:7860"
+     env_file:
+       - .env
+     restart: unless-stopped
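The compose file reads secrets from `.env` (and forwards `HUGGING_FACE_TOKEN` as a build arg). Based on the config values visible in this commit (`GEMINI_API_KEY` imported from `app.config`, `HUGGING_FACE_TOKEN` used in `download_models.py`), a local `.env` might look like the following; any variables beyond these two are assumptions, and the values are placeholders:

```ini
# .env — keep out of version control (already listed in .gitignore/.dockerignore)
GEMINI_API_KEY=your-gemini-api-key
HUGGING_FACE_TOKEN=hf_your-token
```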
download_models.py ADDED
@@ -0,0 +1,39 @@
+ import os
+ from transformers import pipeline
+ import logging
+
+ logging.basicConfig(level=logging.INFO)
+ logger = logging.getLogger(__name__)
+
+ # Hardcoded model names to avoid needing .env during build
+ HF_IMAGE_MODEL = "dima806/deepfake_vs_real_image_detection"
+ HF_AUDIO_MODEL = "mo-thecreator/Deepfake-audio-detection"
+
+ def download_models():
+     """
+     Downloads and caches the models in the Hugging Face cache directory.
+     This script is intended to be run during the Docker image build process
+     so that the models are baked into the image layers.
+     """
+     logger.info(f"Downloading image model: {HF_IMAGE_MODEL}...")
+     try:
+         # Load pipeline to force download of weights, tokenizer, config
+         _ = pipeline("image-classification", model=HF_IMAGE_MODEL)
+         logger.info("Successfully cached image model.")
+     except Exception as e:
+         logger.error(f"Failed to cache image model: {e}")
+         raise
+
+     logger.info(f"Downloading audio model: {HF_AUDIO_MODEL}...")
+     try:
+         # Load pipeline to force download, providing token for gated models
+         hf_token = os.environ.get("HUGGING_FACE_TOKEN")
+         _ = pipeline("audio-classification", model=HF_AUDIO_MODEL, token=hf_token)
+         logger.info("Successfully cached audio model.")
+     except Exception as e:
+         logger.error(f"Failed to cache audio model: {e}")
+         raise
+
+ if __name__ == "__main__":
+     download_models()
+     logger.info("All models downloaded and cached successfully.")
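Because `download_models.py` bakes the weights into the image layers, runtime code only pays for pipeline construction once per process and never for a download. A minimal lazy-singleton sketch of that runtime pattern (the loader body here is a stand-in; the real app would call `transformers.pipeline`, which resolves against the baked-in cache):

```python
from functools import lru_cache

load_count = {"n": 0}  # instrumentation to show construction happens once

@lru_cache(maxsize=None)
def get_pipeline(task: str, model: str):
    # Real app: return transformers.pipeline(task, model=model) — the weights
    # are already on disk thanks to the build-time download script.
    load_count["n"] += 1
    return (task, model)

p1 = get_pipeline("image-classification", "dima806/deepfake_vs_real_image_detection")
p2 = get_pipeline("image-classification", "dima806/deepfake_vs_real_image_detection")
# p1 is p2: the second call hits the cache instead of rebuilding the pipeline.
```

This keeps cold-start cost in the image build, where Hugging Face Spaces tolerates it, rather than in the first request.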
frontend/index.html ADDED
@@ -0,0 +1,122 @@
+ <!DOCTYPE html>
+ <html lang="en">
+ <head>
+     <meta charset="UTF-8">
+     <title>Anti-Phishing AI - Demo</title>
+ </head>
+ <body>
+     <h1>Anti-Phishing AI Backend - Demo</h1>
+     <p>Test all analysis endpoints. Results shown as raw JSON below each form.</p>
+     <hr>
+
+     <!-- TEXT -->
+     <h2>Text Analysis (phishing + URL detection)</h2>
+     <form id="textForm">
+         <label>Paste text or email body:<br>
+             <textarea id="textInput" rows="6" cols="80" placeholder="URGENT: Your account has been suspended. Click http://fake-bank.tk to verify..."></textarea>
+         </label><br>
+         <button type="submit">Analyze Text</button>
+     </form>
+     <pre id="textResult"></pre>
+     <hr>
+
+     <!-- IMAGE -->
+     <h2>Image Analysis (deepfake + phishing screenshot)</h2>
+     <form id="imageForm">
+         <label>Upload image (JPG, PNG, WEBP):<br>
+             <input type="file" id="imageInput" accept="image/*">
+         </label><br>
+         <button type="submit">Analyze Image</button>
+     </form>
+     <pre id="imageResult"></pre>
+     <hr>
+
+     <!-- VIDEO -->
+     <h2>Video Analysis (deepfake video detection)</h2>
+     <form id="videoForm">
+         <label>Upload video (MP4, WEBM):<br>
+             <input type="file" id="videoInput" accept="video/*">
+         </label><br>
+         <button type="submit">Analyze Video</button>
+         <span id="videoStatus"></span>
+     </form>
+     <pre id="videoResult"></pre>
+     <hr>
+
+     <!-- AUDIO -->
+     <h2>Audio Analysis (deepfake / AI voice detection)</h2>
+     <form id="audioForm">
+         <label>Upload audio (WAV, MP3, OGG):<br>
+             <input type="file" id="audioInput" accept="audio/*">
+         </label><br>
+         <button type="submit">Analyze Audio</button>
+     </form>
+     <pre id="audioResult"></pre>
+
+     <script>
+         const BASE = "";
+
+         async function postJSON(url, body, resultId) {
+             const el = document.getElementById(resultId);
+             el.textContent = "Analyzing...";
+             try {
+                 const res = await fetch(BASE + url, {
+                     method: "POST",
+                     headers: { "Content-Type": "application/json" },
+                     body: JSON.stringify(body),
+                 });
+                 const data = await res.json();
+                 el.textContent = JSON.stringify(data, null, 2);
+             } catch (e) {
+                 el.textContent = "ERROR: " + e.message;
+             }
+         }
+
+         async function postFile(url, file, resultId, statusId) {
+             const el = document.getElementById(resultId);
+             const statusEl = statusId ? document.getElementById(statusId) : null;
+             el.textContent = "Uploading and analyzing...";
+             if (statusEl) statusEl.textContent = " (this may take 10-30s for video)";
+             const form = new FormData();
+             form.append("file", file);
+             try {
+                 const res = await fetch(BASE + url, { method: "POST", body: form });
+                 const data = await res.json();
+                 el.textContent = JSON.stringify(data, null, 2);
+                 if (statusEl) statusEl.textContent = "";
+             } catch (e) {
+                 el.textContent = "ERROR: " + e.message;
+                 if (statusEl) statusEl.textContent = "";
+             }
+         }
+
+         document.getElementById("textForm").addEventListener("submit", e => {
+             e.preventDefault();
+             const text = document.getElementById("textInput").value.trim();
+             if (!text) return;
+             postJSON("/analyze/text", { text }, "textResult");
+         });
+
+         document.getElementById("imageForm").addEventListener("submit", e => {
+             e.preventDefault();
+             const file = document.getElementById("imageInput").files[0];
+             if (!file) return;
+             postFile("/analyze/image", file, "imageResult", null);
+         });
+
+         document.getElementById("videoForm").addEventListener("submit", e => {
+             e.preventDefault();
+             const file = document.getElementById("videoInput").files[0];
+             if (!file) return;
+             postFile("/analyze/video", file, "videoResult", "videoStatus");
+         });
+
+         document.getElementById("audioForm").addEventListener("submit", e => {
+             e.preventDefault();
+             const file = document.getElementById("audioInput").files[0];
+             if (!file) return;
+             postFile("/analyze/audio", file, "audioResult", null);
+         });
+     </script>
+ </body>
+ </html>
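Every endpoint the demo page calls returns the same JSON shape (`risk_score`, `threat_types`, `explanation`), so a non-browser client can render results with one small helper. A sketch of such a summarizer (the function is hypothetical, not part of this commit):

```python
def summarize_result(result: dict) -> str:
    """One-line summary of an /analyze/* response dict."""
    score = result.get("risk_score", 0.0)
    threats = ", ".join(result.get("threat_types", [])) or "none"
    return f"risk={score:.2f} threats={threats}"

print(summarize_result({"risk_score": 0.87, "threat_types": ["deepfake_video"],
                        "explanation": "Lip-sync drift detected."}))
# risk=0.87 threats=deepfake_video
```

Using `.get()` with defaults keeps the helper tolerant of the error payloads the tools return on failure.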
requirements.txt ADDED
@@ -0,0 +1,18 @@
+ fastapi
+ uvicorn[standard]
+ python-multipart
+ google-genai
+ langchain
+ langchain-google-genai
+ langchain-core
+ huggingface_hub
+ pydantic
+ python-dotenv
+ Pillow
+ opencv-python-headless
+ requests
+ transformers>=4.41.2
+ torch>=2.1.2
+ torchaudio>=2.1.2
+ torchvision>=0.16.2
+ soundfile>=0.12.1