Commit 0cd2409 by deveshpunjabi · verified · 1 Parent(s): 9de1b3f

Update README.md

Files changed (1): README.md +95 -99
README.md CHANGED
@@ -1,99 +1,95 @@
- ---
- title: PhishingInsight
- emoji: 🛡️
- colorFrom: blue
- colorTo: gray
- sdk: docker
- pinned: false
- app_port: 7860
- ---
-
- # PhishingInsight 🛡️
-
- **Next-Gen Phishing Detection System with Micro-Agent Architecture**
-
- ![Python](https://img.shields.io/badge/Python-3.10%2B-blue)
- ![AI](https://img.shields.io/badge/AI-Hybrid%20SLM-purple)
- ![Security](https://img.shields.io/badge/Security-Enterprise%20Grade-green)
-
- PhishingInsight is a **state-of-the-art URL verification system**. Unlike traditional detectors that rely on slow blacklists, PhishingInsight uses a **Fusion Engine** that combines mathematical entropy checks, homograph detection, and AI analysis to explain *why* a site is dangerous.
-
- ---
-
- ## 🚀 Innovation Highlights (The "Winning Edge")
-
- | Feature | Competitors (Standard) | PhishingInsight (Ours) |
- | :--- | :--- | :--- |
- | **Detection Speed** | Slow (API calls) | **Instant (< 800ms)** (Local Cache) |
- | **Logic** | Static Rules | **Hybrid Fusion (AI + Heuristic)** |
- | **Explainability** | "Phishing Detected" | "This site targets PayPal but uses a Russian domain." |
- | **Defense** | Vulnerable to SSRF | **SSRF Firewall & Localhost Block** |
- | **Typosquatting** | Misses `goog1e.com` | **Homograph Shield Active** |
-
- ## 🌟 Key Features
-
- * **Micro-Agent Architecture**: Modular agents for specific checks (Lexical, DNS, Content).
- * **Fusion Engine**: Weighted scoring system combining signals from all agents.
- * **AI Explainer**: Uses `TinyLlama` (SLM) to generate human-readable explanations.
- * **Interactive UI**: Clean, professional Gradio interface with Real-time Analysis.
- * **REST API**: FastAPI backend for seamless integration.
-
- ---
-
- ## 📂 Professional Architecture
-
- The codebase follows the **Atom of Thoughts** reliability principle:
- ```
- PhishingInsight/
- ├── data/              # Datasets (e.g., Phishing URLs.csv)
- ├── src/
- │   ├── agents/        # Micro-Agents (Lexical, DNS, Content, Homograph)
- │   ├── core/          # Pipeline & Fusion Engine (The Brain)
- │   ├── interface/     # Professional UI (Gradio) & API
- │   └── ml/            # SLM Model Integration
- ├── tests/             # Comprehensive Test Suite (100% Coverage)
- └── main.py            # System Entry Point
- ```
-
- ---
-
- ## 🛠️ Installation
-
- 1. **Clone the Repository**:
- ```bash
- git clone https://github.com/deveshpunjabi/PhishingInsight.git
- cd PhishingInsight
- ```
- 2. **Install Dependencies**:
- ```bash
- pip install -r requirements.txt
- ```
-
- ## 🚦 Usage
-
- **Run the Full System (UI + API):**
- ```bash
- python main.py
- ```
- * **Dashboard:** `http://localhost:7860`
- * **Live Logs:** Check terminal for real-time Matrix-style analysis.
-
- **Run Individual Components:**
- * **UI Only**: `python src/interface/ui.py`
- * **Retrain Model**: `python src/ml/train_model.py`
-
- ---
-
- ## 🧠 Tech Stack
- * **Core:** Python 3.10+, AsyncIO
- * **AI/ML:** PyTorch, Transformers (Quantized SLM), LightGBM
- * **Security:** Urllib Parse, Levenshtein (Custom Impl), SSRF Guardrails
- * **Interface:** Gradio 5.0, FastAPI
- * **Analysis:** dnspython, beautifulsoup4, tldextract
-
- ## 🛡️ License
-
- MIT License. Designed for Educational & Enterprise Security use.
-
- ---
- **Developed by Devesh Punjabi** | MCA Final Year Project
 
+ ---
+ title: PhishingInsight
+ emoji: 🛡️
+ colorFrom: blue
+ colorTo: gray
+ sdk: docker
+ pinned: false
+ app_port: 7860
+ ---
+
+ # PhishingInsight 🛡️
+
+ **Phishing Detection System with Micro-Agent Architecture**
+
+ PhishingInsight is a **state-of-the-art URL verification system**. Unlike traditional detectors that rely on slow blacklists, PhishingInsight uses a **Fusion Engine** that combines mathematical entropy checks, homograph detection, and AI analysis to explain *why* a site is dangerous.
+
+ ---
+
+ ## 🚀 Innovation Highlights (The "Winning Edge")
+
+ | Feature | Competitors (Standard) | PhishingInsight (Ours) |
+ | :--- | :--- | :--- |
+ | **Detection Speed** | Slow (API calls) | **Instant (< 800ms)** (Local Cache) |
+ | **Logic** | Static Rules | **Hybrid Fusion (AI + Heuristic)** |
+ | **Explainability** | "Phishing Detected" | "This site targets PayPal but uses a Russian domain." |
+ | **Defense** | Vulnerable to SSRF | **SSRF Firewall & Localhost Block** |
+ | **Typosquatting** | Misses `goog1e.com` | **Homograph Shield Active** |
+
+ ## 🌟 Key Features
+
+ * **Micro-Agent Architecture**: Modular agents for specific checks (Lexical, DNS, Content).
+ * **Fusion Engine**: Weighted scoring system combining signals from all agents.
+ * **AI Explainer**: Uses `TinyLlama` (SLM) to generate human-readable explanations.
+ * **Interactive UI**: Clean, professional Gradio interface with Real-time Analysis.
+ * **REST API**: FastAPI backend for seamless integration.
+
+ ---
+
+ ## 📂 Professional Architecture
+
+ The codebase follows the **Atom of Thoughts** reliability principle:
+ ```
+ PhishingInsight/
+ ├── data/              # Datasets (e.g., Phishing URLs.csv)
+ ├── src/
+ │   ├── agents/        # Micro-Agents (Lexical, DNS, Content, Homograph)
+ │   ├── core/          # Pipeline & Fusion Engine (The Brain)
+ │   ├── interface/     # Professional UI (Gradio) & API
+ │   └── ml/            # SLM Model Integration
+ ├── tests/             # Comprehensive Test Suite (100% Coverage)
+ └── main.py            # System Entry Point
+ ```
+
+ ---
+
+ ## 🛠️ Installation
+
+ 1. **Clone the Repository**:
+ ```bash
+ git clone https://github.com/deveshpunjabi/PhishingInsight.git
+ cd PhishingInsight
+ ```
+ 2. **Install Dependencies**:
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ ## 🚦 Usage
+
+ **Run the Full System (UI + API):**
+ ```bash
+ python main.py
+ ```
+ * **Dashboard:** `http://localhost:7860`
+ * **Live Logs:** Check terminal for real-time Matrix-style analysis.
+
+ **Run Individual Components:**
+ * **UI Only**: `python src/interface/ui.py`
+ * **Retrain Model**: `python src/ml/train_model.py`
+
+ ---
+
+ ## 🧠 Tech Stack
+ * **Core:** Python 3.10+, AsyncIO
+ * **AI/ML:** PyTorch, Transformers (Quantized SLM), LightGBM
+ * **Security:** Urllib Parse, Levenshtein (Custom Impl), SSRF Guardrails
+ * **Interface:** Gradio 5.0, FastAPI
+ * **Analysis:** dnspython, beautifulsoup4, tldextract
+
+ ## 🛡️ License
+
+ MIT License. Designed for educational and enterprise security use.
+
+ ---
+ **Developed by Devesh Punjabi** | MCA Final Year Project
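
Note on the README content above: it describes a Fusion Engine that combines entropy checks, a custom Levenshtein implementation, and weighted scoring across agents, but the diff does not show how any of these are computed. The sketch below is one plausible reading, not the project's actual code. The function names (`shannon_entropy`, `levenshtein`, `fuse`), the signal names, and the weights are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    # Sum p * log2(1/p) over every distinct character.
    return sum((n / total) * math.log2(total / n) for n in Counter(text).values())


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) via row-by-row dynamic programming."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        previous = current
    return previous[-1]


def fuse(signals: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of per-agent risk signals, each assumed to lie in [0, 1]."""
    total_weight = sum(weights[name] for name in signals)
    if total_weight == 0:
        return 0.0
    return sum(signals[name] * weights[name] for name in signals) / total_weight


# Hypothetical usage: the README's typosquatting example is one edit from the brand.
distance = levenshtein("goog1e", "google")
risk = fuse({"lexical": 1.0, "dns": 0.0}, {"lexical": 3.0, "dns": 1.0})
```

Under these assumptions, a small edit distance to a known brand and a high hostname entropy would each become one signal, and `fuse` would reduce them to a single score. How the real agents normalise and weight their outputs is not stated in the README.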