JavicR22/SpamVision


	---
	language: es
	license: mit
	library_name: transformers
	tags:
	- spam-detection
	- sms
	- text-classification
	- beto
	- bert
	- spanish
	- pytorch
	datasets:
	- sms_spam
	metrics:
	- accuracy
	- f1
	- precision
	- recall
	base_model: dccuchile/bert-base-spanish-wwm-cased
	pipeline_tag: text-classification
	widget:
	- text: "¡FELICIDADES! Ganaste un premio de $1000. Haz clic aquí para reclamarlo"
	  example_title: "Spam - Premio falso"
	- text: "¡Increíble! Ha ganado un viaje con todos los gastos pagados a Cancún. Llame al 1-800-VIAJES"
	  example_title: "Spam - Oferta fraudulenta"
	- text: "URGENTE: Su cuenta ha sido suspendida. Haga clic aquí para reactivarla"
	  example_title: "Spam - Phishing bancario"
	- text: "Hola mamá, llegaré tarde a casa. Nos vemos en la cena"
	  example_title: "Legítimo - Mensaje familiar"
	- text: "Buenos días, confirmo la reunión de mañana a las 3pm"
	  example_title: "Legítimo - Mensaje de trabajo"
	model-index:
	- name: spamvision-beto
	  results:
	  - task:
	      type: text-classification
	      name: Text Classification
	    dataset:
	      name: Spanish SMS Spam Detection
	      type: sms_spam
	    metrics:
	    - type: accuracy
	      value: 0.962
	      name: Accuracy
	    - type: f1
	      value: 0.951
	      name: F1 Score
	    - type: precision
	      value: 0.948
	      name: Precision
	    - type: recall
	      value: 0.955
	      name: Recall
	---

	# 🛡️ SpamVision BETO - Spanish SMS Spam Detector

	<div align="center">
	<img src="https://img.shields.io/badge/Language-Spanish-green" alt="Spanish">
	<img src="https://img.shields.io/badge/Accuracy-96.2%25-blue" alt="Accuracy">
	<img src="https://img.shields.io/badge/F1--Score-95.1%25-orange" alt="F1">
	<img src="https://img.shields.io/badge/License-MIT-yellow" alt="License">
	</div>

	## 📖 Model Description

	SpamVision BETO is a BERT model for Spanish, fine-tuned to detect spam SMS messages. It is built on top of [BETO](https://github.com/dccuchile/beto), a BERT model trained on a Spanish corpus, and reaches 96.2% accuracy on its test set when distinguishing legitimate messages from spam.

	This model is part of the [SpamVision project](https://github.com/tu-usuario/spamvision-api), a hybrid AI system that combines rule-based filtering using a deterministic finite automaton (AFD, from the Spanish *autómata finito determinista*) with deep learning to improve spam detection.
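One way to picture the hybrid arrangement is a rule stage that decides the obvious cases and passes everything else to the classifier. The sketch below is only an illustration under stated assumptions: the single rule (flag any message containing `http`), the `spam`/`ham` label names, and the `fake_model` stand-in are all hypothetical, and the actual SpamVision rules are not described in this model card.

```python
from typing import Callable, Dict, Tuple

# Hypothetical rule: flag any message containing the substring "http".
# The table below is a DFA for that one pattern; states 0-3 track how much
# of the pattern has been matched, and state 4 is the accepting state.
ACCEPT = 4
TRANSITIONS: Dict[Tuple[int, str], int] = {
    (0, "h"): 1,
    (1, "t"): 2,
    (2, "t"): 3,
    (3, "p"): ACCEPT,
}


def dfa_accepts(text: str) -> bool:
    """Return True if the DFA reaches its accepting state on `text`."""
    state = 0
    for ch in text.lower():
        if state == ACCEPT:
            break  # accepting state is absorbing
        # On a mismatch, restart at state 1 if the character is "h", else at 0.
        state = TRANSITIONS.get((state, ch), 1 if ch == "h" else 0)
    return state == ACCEPT


def classify(text: str, model: Callable[[str], str]) -> Tuple[str, str]:
    """Return (label, source): the rules decide first, the model handles the rest."""
    if dfa_accepts(text):
        return "spam", "rules"
    return model(text), "model"


def fake_model(text: str) -> str:
    """Stand-in for the fine-tuned classifier so the sketch runs offline."""
    return "spam" if "premio" in text.lower() else "ham"


print(classify("Reclame su premio en http://example.com", fake_model))
print(classify("Hola mamá, llegaré tarde a casa", fake_model))
```

In a real deployment the `model` argument would wrap the fine-tuned classifier; keeping it as a plain callable makes the rule stage easy to test on its own.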

	### Key Features

	- 🎯 High Accuracy: 96.2% on the test dataset
	- ⚡ Fast Inference: < 200 ms per message
	- 🇪🇸 Spanish-optimized: Fine-tuned on Spanish SMS data
	- 📱 SMS-focused: Optimized for short messages (< 160 characters)
	- 🔄 Production-ready: Used in a real-world mobile app
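The latency figure above does not name the hardware it was measured on, so it is worth checking on your own machine. Below is a minimal timing sketch, assuming you wrap the classifier in a callable that takes one message; `median_latency_ms` and `dummy_predict` are hypothetical names used only for illustration.

```python
import statistics
import time
from typing import Callable, Iterable


def median_latency_ms(
    predict: Callable[[str], object],
    messages: Iterable[str],
    warmup: int = 3,
) -> float:
    """Median wall-clock time per call to `predict`, in milliseconds."""
    msgs = list(messages)
    # Warm-up calls are not timed, so one-off setup cost does not skew the result.
    for text in msgs[:warmup]:
        predict(text)
    timings = []
    for text in msgs:
        start = time.perf_counter()
        predict(text)
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)


def dummy_predict(text: str) -> str:
    """Stand-in predictor so the sketch runs without downloading the model."""
    return "spam" if "premio" in text.lower() else "ham"


samples = [
    "Hola mamá, llegaré tarde a casa. Nos vemos en la cena",
    "URGENTE: Su cuenta ha sido suspendida. Haga clic aquí para reactivarla",
]
print(f"median latency: {median_latency_ms(dummy_predict, samples):.3f} ms")
```

Replacing `dummy_predict` with a call into the real model gives a per-message figure that can be compared against the 200 ms target.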

	### Model Architecture

	- Base Model: `dccuchile/bert-base-spanish-wwm-cased`
	- Parameters: ~110M
	- Layers: 12 transformer encoder layers
	- Hidden Size: 768
	- Max Sequence Length: 128 tokens
	- Vocabulary Size: 31,002 tokens
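The parameter count can be sanity-checked from the other figures in this list. The sketch below takes the vocabulary size, hidden size, and layer count from this card; the maximum position count (512), the token type count (2), the feed-forward size (3,072), and the two-label classification head are assumptions based on standard BERT-base defaults and are not stated here.

```python
# Architecture values stated in this model card.
VOCAB_SIZE = 31_002
HIDDEN = 768
LAYERS = 12

# Assumptions (standard BERT-base defaults, not stated in this card).
MAX_POSITIONS = 512
TOKEN_TYPES = 2
INTERMEDIATE = 3_072
NUM_LABELS = 2


def linear(n_in: int, n_out: int) -> int:
    """Weights plus biases of a dense layer."""
    return n_in * n_out + n_out


def layer_norm(size: int) -> int:
    """Scale and shift vectors of a LayerNorm."""
    return 2 * size


# Word, position, and token-type embeddings share the hidden size.
embeddings = (VOCAB_SIZE + MAX_POSITIONS + TOKEN_TYPES) * HIDDEN + layer_norm(HIDDEN)
encoder_layer = (
    4 * linear(HIDDEN, HIDDEN)        # query, key, value, attention output
    + layer_norm(HIDDEN)
    + linear(HIDDEN, INTERMEDIATE)    # feed-forward up-projection
    + linear(INTERMEDIATE, HIDDEN)    # feed-forward down-projection
    + layer_norm(HIDDEN)
)
pooler = linear(HIDDEN, HIDDEN)
classifier = linear(HIDDEN, NUM_LABELS)

total = embeddings + LAYERS * encoder_layer + pooler + classifier
print(f"{total:,} parameters (~{total / 1e6:.0f}M)")
```

Under these assumptions the total lands at roughly 110M, consistent with the figure listed above. Note that the 128-token maximum sequence length is an inference-time truncation setting and does not change the parameter count.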

	---

	## 🚀 Quick Start

	### Installation