---
tags:
- fasttext
- phishing
- domain-classification
license: mit
language:
- en
datasets:
- mstfknn/phishing-domain-list-2m-plus
---
# Phishing Detection Model (FastText)
This is a lightweight FastText model that classifies domain names as either phishing or clean. It is trained in supervised mode with `wordNgrams=2`, so word bigrams are used as features in addition to single tokens.
## Installation
Option 1: From Source
```bash
git clone https://github.com/facebookresearch/fastText.git
cd fastText
mkdir build && cd build
cmake ..
make
```
Option 2: Using pip (limited support)
```bash
pip install fasttext
```
⚠️ The pip version does not support all features. Compiling from source is recommended.
## Usage
```bash
# Predict a single domain
echo "carreeffoursa.site" | ./fasttext predict phishing_model.bin -
```
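To score many domains at once and see a confidence value next to each label, fastText's `predict-prob` subcommand can be wrapped in a small helper. This is a minimal sketch: the function name `predict_domains`, the `FASTTEXT_BIN` variable, and the file `domains.txt` are hypothetical and not part of this repository.

```bash
# Hypothetical helper: FASTTEXT_BIN and domains.txt are illustrative names.
predict_domains() {
  # $1: model file, $2: file with one domain per line ("-" reads stdin)
  # The trailing 1 asks fastText for the single most likely label per line.
  "${FASTTEXT_BIN:-./fasttext}" predict-prob "$1" "$2" 1
}

# Only run when the binary, the model, and the input file are present.
if [ -x "${FASTTEXT_BIN:-./fasttext}" ] && [ -f phishing_model.bin ] && [ -f domains.txt ]; then
  predict_domains phishing_model.bin domains.txt
fi
```

Each output line should contain a label followed by its probability, in the same order as the input lines.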
## Training Info
- Framework: FastText
- Labels: `__label__phishing`, `__label__clean`
- Epochs: 10
- Learning rate: 0.5
- wordNgrams: 2
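The hyperparameters above map directly onto flags of the fastText command-line tool. The following is a sketch of how a comparable training run might be launched, not necessarily the exact command used for this model. The function name `train_phishing_model`, the `FASTTEXT_BIN` variable, and the file names `train.txt` and `phishing_model` are assumptions.

```bash
# Hypothetical helper: names and paths are illustrative, not from this repository.
train_phishing_model() {
  # $1: labeled training file, $2: output prefix (fastText writes "$2.bin")
  "${FASTTEXT_BIN:-./fasttext}" supervised \
    -input "$1" -output "$2" \
    -epoch 10 -lr 0.5 -wordNgrams 2
}

# Only run when the binary and the training file are present.
if [ -x "${FASTTEXT_BIN:-./fasttext}" ] && [ -f train.txt ]; then
  train_phishing_model train.txt phishing_model
fi
```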
## Training Data
The model was trained on [mstfknn/phishing-domain-list-2m-plus](https://huggingface.co/datasets/mstfknn/phishing-domain-list-2m-plus), a dataset of 2,000,000 domain names, each labeled as either phishing or clean.
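fastText's supervised mode expects one example per line, with the label prefix (`__label__` by default) in front of the text. As an illustration only, the sketch below converts a CSV of `domain,label` rows into that layout. The column order and the function name `to_fasttext_format` are assumptions; the actual dataset may be structured differently.

```bash
# Assumed input: CSV rows of "domain,label", with label "phishing" or "clean".
# Adjust the awk fields if the real dataset uses a different column layout.
to_fasttext_format() {
  awk -F',' 'NF >= 2 { printf "__label__%s %s\n", $2, $1 }'
}

printf 'carreeffoursa.site,phishing\nexample.com,clean\n' | to_fasttext_format
# Prints:
# __label__phishing carreeffoursa.site
# __label__clean example.com
```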
## Example
Input:
```
carreeffoursa.site
```
Output:
```
__label__phishing
```
## License
MIT
---
## Links
- 💻 [GitHub Repository](https://github.com/mstfknn/phishing-fasttext-model)
- 🐳 [Docker Hub Image](https://hub.docker.com/r/mstfknn/phishing-fasttext)