---
tags:
- fasttext
- phishing
- domain-classification
license: mit
language:
- en
datasets:
- mstfknn/phishing-domain-list-2m-plus
---
# Phishing Detection Model (FastText)
This is a lightweight FastText model that classifies domain names as either phishing or clean. It was trained in supervised mode with `wordNgrams=2`, so word bigrams are used as features in addition to single tokens.
## Installation
Option 1: From Source
```bash
git clone https://github.com/facebookresearch/fastText.git
cd fastText
mkdir build && cd build
cmake ..
make
```
Option 2: Using pip (limited support)
```bash
pip install fasttext
```
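⚠️ The pip package installs the Python bindings, not the `fasttext` command-line binary used in the Usage section, and it does not support all features. Compiling from source is recommended.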
## Usage
```bash
# Predict a single domain
echo "carreeffoursa.site" | ./fasttext predict phishing_model.bin -
```
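To score more than one domain, the same binary can read a file with one domain per line, and the `predict-prob` command also prints a probability next to each label. The following is a minimal sketch, assuming the binary and `phishing_model.bin` are in the current directory; `domains.txt` and the `classify_file` helper are hypothetical names used only for illustration.

```bash
# Assumptions: ./fasttext was built from source, phishing_model.bin is in the
# current directory, and domains.txt is a hypothetical input file.
FASTTEXT="${FASTTEXT:-./fasttext}"
MODEL="${MODEL:-phishing_model.bin}"

# Hypothetical helper: print "<domain> <label> <probability>" per input line.
classify_file() {
  # predict-prob prints one "<label> <probability>" line per input line;
  # paste puts each result next to the domain it belongs to.
  "$FASTTEXT" predict-prob "$MODEL" "$1" 1 | paste -d' ' "$1" -
}

# Only run when the binary and the input file are actually present.
if [ -x "$FASTTEXT" ] && [ -f domains.txt ]; then
  classify_file domains.txt
fi
```

The trailing `1` asks for the single most likely label per domain.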
## Training Info
- Framework: FastText
- Labels: `__label__phishing`, `__label__clean`
- Epochs: 10
- Learning rate: 0.5
- wordNgrams: 2
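The settings above map directly onto flags of the `fasttext supervised` command. The sketch below shows one way such a run could look; it is not necessarily the exact command used for this model. The file names `train.txt` and `phishing_model` and the `train_model` helper are assumptions, and every option not listed above is left at its fastText default.

```bash
# Hypothetical file names; the hyperparameters mirror the Training Info list.
FASTTEXT="${FASTTEXT:-./fasttext}"
TRAIN_FILE="${TRAIN_FILE:-train.txt}"          # one "__label__<class> <domain>" per line
MODEL_PREFIX="${MODEL_PREFIX:-phishing_model}" # fastText appends .bin and .vec

# Hypothetical helper wrapping the supervised training call.
train_model() {
  "$FASTTEXT" supervised \
    -input "$TRAIN_FILE" \
    -output "$MODEL_PREFIX" \
    -epoch 10 \
    -lr 0.5 \
    -wordNgrams 2
}

# Only train when the binary and the training file are actually present.
if [ -x "$FASTTEXT" ] && [ -f "$TRAIN_FILE" ]; then
  train_model
fi
```

A held-out file in the same format (for example a hypothetical `valid.txt`) can then be scored with `./fasttext test phishing_model.bin valid.txt`, which reports the number of examples along with precision and recall at 1.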
## 📊 Training Data
The model was trained on [mstfknn/phishing-domain-list-2m-plus](https://huggingface.co/datasets/mstfknn/phishing-domain-list-2m-plus), a dataset of 2,000,000 domain names, each labeled as either phishing or clean.
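fastText's supervised mode expects one example per line, with the label written as a `__label__` prefix in front of the text. The sketch below converts a two-column CSV into that format. The CSV layout (`<domain>,<label>`) and the `to_fasttext` helper are assumptions for illustration; the dataset's actual layout may differ.

```bash
# Assumption: input rows look like "<domain>,<label>", where <label> is
# "phishing" or "clean". The real dataset layout may differ.
to_fasttext() {
  # Skip malformed rows and emit "__label__<label> <domain>".
  awk -F',' 'NF == 2 && $1 != "" { printf "__label__%s %s\n", $2, $1 }'
}

printf 'carreeffoursa.site,phishing\nexample.com,clean\n' | to_fasttext
# prints:
#   __label__phishing carreeffoursa.site
#   __label__clean example.com
```

For a full file, the same helper would be used as `to_fasttext < labels.csv > train.txt`, where both file names are hypothetical.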
## Example
Input:
```
carreeffoursa.site
```
Output:
```
__label__phishing
```
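In a script, the `__label__` prefix is usually not wanted. The following sketch defines a hypothetical `verdict` helper (not part of fastText) that strips the prefix and turns the prediction into an exit status, so the result can drive an `if` or `||`.

```bash
# Hypothetical helper, not part of fastText: reads one predicted label on
# stdin, prints it without the "__label__" prefix, and returns 1 for phishing.
verdict() {
  read -r label || return 2        # nothing on stdin
  label="${label#__label__}"       # "__label__phishing" -> "phishing"
  printf '%s\n' "$label"
  [ "$label" != "phishing" ]       # status 0 = clean, 1 = phishing
}

# With the real binary (assumed to be in the current directory):
#   echo "carreeffoursa.site" | ./fasttext predict phishing_model.bin - | verdict
echo "__label__phishing" | verdict || echo "blocked"
```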
## License
MIT
---
## 🔗 Links
- 💻 [GitHub Repository](https://github.com/mstfknn/phishing-fasttext-model)
- 🐳 [Docker Hub Image](https://hub.docker.com/r/mstfknn/phishing-fasttext)