	---
	tags:
	- fasttext
	- phishing
	- domain-classification
	license: mit
	language:
	- en
	datasets:
	- mstfknn/phishing-domain-list-2m-plus
	---

	# Phishing Detection Model (FastText)

	This is a lightweight FastText model that classifies domain names as either phishing or clean. It is trained in supervised mode with `wordNgrams=2`, so word bigrams are used as features alongside single tokens.

	## Installation

	Option 1: From Source
	```bash
	git clone https://github.com/facebookresearch/fastText.git
	cd fastText
	mkdir build && cd build
	cmake ..
	make
	```
	Option 2: Using pip (limited support)
	```bash
	pip install fasttext
	```
	⚠️ The pip package installs only the Python bindings, not the `fasttext` command-line binary used in the Usage section, so compiling from source is recommended.

	## Usage

	```bash
	# Predict a single domain
	echo "carreeffoursa.site" | ./fasttext predict phishing_model.bin -
	```
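
To score several domains at once, the same binary can read a file with one domain per line. The sketch below assumes the `fasttext` binary and `phishing_model.bin` are in the current directory. `domains.txt` is a hypothetical file name and `example.com` is only a placeholder input. `predict-prob` prints the top label together with its probability.

```bash
# Write one domain per line (domains.txt is a hypothetical file name).
printf '%s\n' "carreeffoursa.site" "example.com" > domains.txt

# Print the top label and its probability for each input line.
# Skipped when the binary or the model file is not present.
if [ -x ./fasttext ] && [ -f phishing_model.bin ]; then
  ./fasttext predict-prob phishing_model.bin domains.txt 1
fi
```

The trailing `1` asks for the single most likely label per line.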

	## Training Info

	- Framework: FastText
	- Labels: `__label__phishing`, `__label__clean`
	- Epochs: 10
	- Learning rate: 0.5
	- wordNgrams: 2
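
The settings above map onto flags of the `fasttext supervised` command. The following is a sketch under stated assumptions, not necessarily the command used to build this model: `train.txt` and its two sample lines are hypothetical, and only the flag values come from the list above. fastText expects each training line to start with the label, followed by the text.

```bash
# Hypothetical training file: one example per line, label first.
printf '%s\n' \
  "__label__phishing carreeffoursa.site" \
  "__label__clean example.com" > train.txt

# Train with the listed settings; the output prefix yields phishing_model.bin.
# Skipped when the binary has not been built in the current directory.
if [ -x ./fasttext ]; then
  ./fasttext supervised -input train.txt -output phishing_model \
    -epoch 10 -lr 0.5 -wordNgrams 2
fi
```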

	## 📊 Training Data

	The model was trained on [mstfknn/phishing-domain-list-2m-plus](https://huggingface.co/datasets/mstfknn/phishing-domain-list-2m-plus), a dataset consisting of 2,000,000 domain names labeled as either phishing or clean.


	## Example

	Input:
	```
	carreeffoursa.site
	```

	Output:
	```
	__label__phishing
	```
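
When the raw label is passed on to other tooling, the `__label__` prefix is usually unwanted. A minimal sketch of removing it in the shell follows. The helper name `strip_label` is hypothetical and not part of fastText.

```bash
# Strip the __label__ prefix from a fastText prediction line.
# strip_label is a hypothetical helper, not a fastText command.
strip_label() {
  printf '%s\n' "${1#__label__}"
}

strip_label "__label__phishing"   # prints: phishing
```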

	## License

	MIT

	---
	## 🔗 Links
	- 💻 [GitHub Repository](https://github.com/mstfknn/phishing-fasttext-model)
	- 🐳 [Docker Hub Image](https://hub.docker.com/r/mstfknn/phishing-fasttext)