mstfknn committed
Commit 0be6ad3 · verified · 1 Parent(s): 0fa218c

Upload folder using huggingface_hub

Files changed (1)
  1. README.md +33 -33
README.md CHANGED
@@ -1,44 +1,44 @@
- # 🛡️ Phishing Domain Classifier (FastText)
-
- This repository contains a **FastText-based supervised classification model** trained to detect phishing domains.
-
- ## 🚀 Model Overview
-
- - **Algorithm**: Facebook's [fastText](https://fasttext.cc/)
- - **Task**: Binary classification (`phishing` vs `clean`)
- - **Input format**: Domain names (e.g., `paypal-login.su`)
- - **Labels**: `__label__phishing`, `__label__clean`
- - **Features**:
- - Fast and lightweight
- - Trained with `wordNgrams = 2`
- - 10 epochs
-
  ---

- ## 📂 Files Included

- ```text
- phishing_model.bin # Trained model file (binary format)
- phishing_model.vec # Vector embeddings
- fasttext_train.txt # Training data file
- README.md # Documentation


- 🔧 Installation

- Option 1: From Source

- git clone https://github.com/facebookresearch/fastText.git
- cd fastText
- mkdir build && cd build
- cmake ..
- make

- Option 2: Using pip (limited support)

- pip install fasttext

- ⚠️ The pip version does not support all features. Compiling from source is recommended.

- Usage
- echo "carreeffoursa.site" | ./fasttext predict phishing_model.bin -

  ---
+ tags:
+ - fasttext
+ - phishing
+ - domain-classification
+ license: mit
+ language:
+ - en
+ ---
+
+ # Phishing Detection Model (FastText)

+ This is a lightweight FastText model trained to classify domain names as either phishing or clean. It uses supervised learning with `wordNgrams=2` for better n-gram feature coverage.

+ ## Usage

+ ```bash
+ # Predict a single domain
+ echo "carreeffoursa.site" | ./fasttext predict phishing_model.bin -
+ ```

+ ## Training Info

+ - Framework: FastText
+ - Labels: `__label__phishing`, `__label__clean`
+ - Epochs: 10
+ - Learning rate: 0.5
+ - wordNgrams: 2

+ ## Example

+ Input:
+ ```
+ carreeffoursa.site
+ ```

+ Output:
+ ```
+ __label__phishing
+ ```

+ ## License

+ MIT
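
The single-domain command in the new README's Usage section extends naturally to batches. As a minimal sketch (not part of this commit), the helper below reads one domain per line and uses fastText's `predict-prob` subcommand to print each domain next to its top label and probability. The function name `classify_domains` and the `FASTTEXT_BIN` and `MODEL` variables are hypothetical; their defaults assume the binary and model paths shown in the README.

```bash
# Hypothetical defaults; override them to match your local layout.
FASTTEXT_BIN="${FASTTEXT_BIN:-./fasttext}"
MODEL="${MODEL:-phishing_model.bin}"

# Reads one domain per line from stdin and prints
# "<domain> <label> <probability>" for the top prediction (k=1).
# Assumes the classifier emits exactly one output line per input line,
# so avoid empty lines in the input.
classify_domains() (
  tmp="$(mktemp)" || exit 1
  cat > "$tmp"                      # buffer stdin so it can be read twice
  "$FASTTEXT_BIN" predict-prob "$MODEL" "$tmp" 1 | paste -d' ' "$tmp" -
  rm -f "$tmp"
)

# Example call (requires the fastText binary and the model file):
# printf 'carreeffoursa.site\npaypal-login.su\n' | classify_domains
```

Buffering the input in a temporary file lets `paste` pair each domain with its prediction, which the plain `predict` command does not do on its own.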