	---
	license: cc-by-nc-nd-4.0
	datasets:
	- alt-gnome/telegram-spam
	language:
	- ru
	metrics:
	- accuracy
	- f1
	- recall
	- precision
	base_model:
	- deepvk/RuModernBERT-base
	pipeline_tag: text-classification
	tags:
	- spam
	- detection
	- classification
	- russian
	library_name: transformers
	---
	# russian_spam_detector

	The russian_spam_detector model performs binary classification of Russian-language text into two categories:
	- LABEL_1: spam message
	- LABEL_0: normal message (not spam)

	## 🚀 Usage

	```python
	from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

	# Load the fine-tuned tokenizer and classifier from the Hugging Face Hub
	model_name = "corall88/russian_spam_detector"
	tokenizer = AutoTokenizer.from_pretrained(model_name)
	model = AutoModelForSequenceClassification.from_pretrained(model_name)

	detector = pipeline("text-classification", model=model, tokenizer=tokenizer)

	# "Congratulations! You have won 1000000 rubles, follow the link - ..."
	message = "Поздравляем! Вы выиграли 1000000 рублей, пройдите по ссылке - ..."
	predict = detector(message)
	# Prints a list with one dict holding the predicted label and its score
	print(predict)
	```
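
The pipeline returns one dict per message with a `label` and a `score` key. The snippet below is a minimal sketch of turning such a result into a spam decision. The helper names `is_spam` and `describe`, the `LABEL_NAMES` mapping, and the 0.5 threshold are illustrative assumptions rather than part of the model, and the example score is hand-written, not a real model output.

```python
# Hypothetical helpers: names and the default threshold are illustrative only.
LABEL_NAMES = {"LABEL_0": "not spam", "LABEL_1": "spam"}


def is_spam(prediction: dict, threshold: float = 0.5) -> bool:
    """Return True when the top label is LABEL_1 and its score reaches the threshold."""
    return prediction["label"] == "LABEL_1" and prediction["score"] >= threshold


def describe(prediction: dict) -> str:
    """Format one pipeline result as a readable string."""
    return f"{LABEL_NAMES[prediction['label']]} ({prediction['score']:.2f})"


# Hand-written result in the pipeline's output format (score is made up).
example = {"label": "LABEL_1", "score": 0.98}
print(is_spam(example), describe(example))
```

With the real pipeline, the same helpers could be applied as `is_spam(detector(message)[0])`. Raising the threshold trades recall for precision, which may suit cases where flagging a normal message is costly.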

	## 📊 Dataset
	The model was fine-tuned on the alt-gnome/telegram-spam [dataset](https://huggingface.co/datasets/alt-gnome/telegram-spam) of spam messages.

	## 🧠 Architecture
	The model is based on [RuModernBERT-base](https://huggingface.co/deepvk/RuModernBERT-base) and fine-tuned for binary classification.

	## ⚙️ Training parameters
	- Epochs: 4
	- Batch size: 16
	- Optimizer: AdamW
	- Learning rate: 2e-5
	- Loss: CrossEntropyLoss
	- Max sequence length: 256
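
The training script itself is not included in this card, so the following is only a sketch of how the listed values could map onto keyword names accepted by `transformers.TrainingArguments` and by a tokenizer call. The function names, the `HYPERPARAMS` dict, and the single-device reading of "Batch size" are assumptions. No loss is configured here because `AutoModelForSequenceClassification` applies cross-entropy for single-label classification when integer labels are passed.

```python
# Sketch only: values come from the list above, the structure is assumed.
HYPERPARAMS = {
    "epochs": 4,
    "batch_size": 16,
    "learning_rate": 2e-5,
    "max_seq_length": 256,
}


def trainer_kwargs(hp: dict) -> dict:
    """Map the listed values onto TrainingArguments keyword names."""
    return {
        "num_train_epochs": hp["epochs"],
        # Assumes the batch size of 16 is per device.
        "per_device_train_batch_size": hp["batch_size"],
        "learning_rate": hp["learning_rate"],
        "optim": "adamw_torch",  # AdamW optimizer
    }


def tokenizer_kwargs(hp: dict) -> dict:
    """Tokenizer call arguments that truncate inputs to the listed maximum length."""
    return {"truncation": True, "max_length": hp["max_seq_length"]}


print(trainer_kwargs(HYPERPARAMS))
print(tokenizer_kwargs(HYPERPARAMS))
```

These dicts could then be unpacked as `TrainingArguments(output_dir="out", **trainer_kwargs(HYPERPARAMS))` and `tokenizer(texts, **tokenizer_kwargs(HYPERPARAMS))`, where `output_dir="out"` and `texts` are placeholders.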

	## 📈 Results
	| Metric    | Value |
	|-----------|-------|
	| Accuracy  | 0.99  |
	| F1-score  | 0.99  |
	| Precision | 0.99  |
	| Recall    | 0.99  |
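
For reference, the four metrics in the table follow the standard binary-classification definitions, with spam (label 1) as the positive class. The sketch below computes them from true and predicted labels. The function name `binary_metrics` and the toy label lists are hypothetical and do not reproduce the evaluation behind the reported values.

```python
def binary_metrics(y_true: list, y_pred: list) -> dict:
    """Compute accuracy, precision, recall and F1 with label 1 (spam) as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # spam caught
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # normal flagged as spam
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # spam missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal passed through

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels for illustration only (not data from the model's evaluation).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 0, 1, 1]
print(binary_metrics(y_true, y_pred))
```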

	## Citation
	```
	@misc{russian_spam_detector,
	title={russian_spam_detector: modern model for spam detection},
	author={corall88},
	url={https://huggingface.co/corall88/russian_spam_detector},
	publisher={Hugging Face},
	year={2025},
	}
	```