---
language: es
license: mit
library_name: transformers
tags:
- spam-detection
- sms
- text-classification
- beto
- bert
- spanish
- pytorch
datasets:
- sms_spam
metrics:
- accuracy
- f1
- precision
- recall
base_model: dccuchile/bert-base-spanish-wwm-cased
pipeline_tag: text-classification
widget:
- text: "¡FELICIDADES! Ganaste un premio de $1000. Haz clic aquí para reclamarlo"
  example_title: "Spam - Premio falso"
- text: "¡Increíble! Ha ganado un viaje con todos los gastos pagados a Cancún. Llame al 1-800-VIAJES"
  example_title: "Spam - Oferta fraudulenta"
- text: "URGENTE: Su cuenta ha sido suspendida. Haga clic aquí para reactivarla"
  example_title: "Spam - Phishing bancario"
- text: "Hola mamá, llegaré tarde a casa. Nos vemos en la cena"
  example_title: "Legítimo - Mensaje familiar"
- text: "Buenos días, confirmo la reunión de mañana a las 3pm"
  example_title: "Legítimo - Mensaje de trabajo"
model-index:
- name: spamvision-beto
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      name: Spanish SMS Spam Detection
      type: sms_spam
    metrics:
    - type: accuracy
      value: 0.962
      name: Accuracy
    - type: f1
      value: 0.951
      name: F1 Score
    - type: precision
      value: 0.948
      name: Precision
    - type: recall
      value: 0.955
      name: Recall
---

# 🛡️ SpamVision BETO - Spanish SMS Spam Detector

<div align="center">
  <img src="https://img.shields.io/badge/Language-Spanish-green" alt="Spanish">
  <img src="https://img.shields.io/badge/Accuracy-96.2%25-blue" alt="Accuracy">
  <img src="https://img.shields.io/badge/F1--Score-95.1%25-orange" alt="F1">
  <img src="https://img.shields.io/badge/License-MIT-yellow" alt="License">
</div>

## 📖 Model Description

**SpamVision BETO** is a BERT model for Spanish, fine-tuned to detect spam in SMS messages. It is built on [BETO](https://github.com/dccuchile/beto), a BERT model pre-trained on a Spanish corpus, and reaches **96.2% accuracy** on its test dataset when separating legitimate messages from spam.

This model is part of the [SpamVision project](https://github.com/tu-usuario/spamvision-api), a hybrid system that runs rule-based filtering (AFD) alongside this deep learning classifier to improve spam detection.
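As a rough illustration of how a rule-based stage and a learned classifier can be combined, here is a minimal sketch. It is not the SpamVision implementation: the patterns in `RULES`, the function name `hybrid_classify`, the `model_score` callable, and the 0.5 threshold are all hypothetical. Regular expressions stand in for the rule stage, on the assumption that AFD refers to a deterministic finite automaton.

```python
import re
from typing import Callable, Sequence, Tuple

# Hypothetical rules for illustration only; they do not come from the project.
RULES: Sequence[re.Pattern] = (
    re.compile(r"haz clic aquí", re.IGNORECASE),
    re.compile(r"\b1-800-\w+", re.IGNORECASE),
)


def hybrid_classify(
    text: str,
    model_score: Callable[[str], float],
    rules: Sequence[re.Pattern] = RULES,
    threshold: float = 0.5,
) -> Tuple[str, str]:
    """Return (label, stage), where stage names the component that decided.

    model_score is any callable that maps a message to a spam probability,
    for example a wrapper around a text-classification model.
    """
    # Stage 1: cheap pattern rules catch obvious spam without running the model.
    for rule in rules:
        if rule.search(text):
            return "spam", "rules"
    # Stage 2: everything else is scored by the learned classifier.
    label = "spam" if model_score(text) >= threshold else "legitimate"
    return label, "model"
```

With a stub scorer such as `lambda text: 0.1`, a message containing "Haz clic aquí" is flagged by the rule stage, while "Hola mamá, llegaré tarde a casa" falls through to the model stage and is labeled legitimate.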

### Key Features

- 🎯 **High Accuracy**: 96.2% on the test dataset
- ⚡ **Fast Inference**: < 200 ms per message
- 🇪🇸 **Spanish-optimized**: Fine-tuned on Spanish SMS data
- 📱 **SMS-focused**: Optimized for short messages (< 160 characters)
- 🔄 **Production-ready**: Used in a real-world mobile app
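The card metadata also reports precision (0.948), recall (0.955), and F1 (0.951). Since F1 is the harmonic mean of precision and recall, the three figures can be cross-checked with a few lines of Python. The helper name `f1_score` below is only illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values taken from the model-index metadata of this card.
precision, recall = 0.948, 0.955
f1 = f1_score(precision, recall)
print(f"F1 = {f1:.3f}")  # prints "F1 = 0.951"
```

The result agrees with the reported F1 to three decimal places.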

### Model Architecture

- **Base Model**: `dccuchile/bert-base-spanish-wwm-cased`
- **Parameters**: ~110M
- **Layers**: 12 transformer encoder layers
- **Hidden Size**: 768
- **Max Sequence Length**: 128 tokens
- **Vocabulary Size**: 31,002 tokens
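The ~110M figure can be roughly reproduced from the numbers above. The sketch below counts weights for a BERT-style encoder with a two-class classification head. The feed-forward size (3,072), the 512-entry position embedding table, the two token types, and the two output labels are assumptions based on the usual BERT-base configuration and are not stated in this card. The position table size is separate from the 128-token sequence limit listed above.

```python
def bert_param_count(
    vocab_size: int,
    hidden: int,
    layers: int,
    intermediate: int,
    max_positions: int,
    type_vocab: int = 2,
    num_labels: int = 2,
) -> int:
    """Approximate parameter count of a BERT encoder plus a linear classifier."""
    layer_norm = 2 * hidden  # scale and bias
    # Token, position, and token-type embeddings followed by a LayerNorm.
    embeddings = (vocab_size + max_positions + type_vocab) * hidden + layer_norm
    # Query, key, value, and output projections (with biases) plus a LayerNorm.
    attention = 4 * (hidden * hidden + hidden) + layer_norm
    # Two dense layers (with biases) plus a LayerNorm.
    feed_forward = (
        (hidden * intermediate + intermediate)
        + (intermediate * hidden + hidden)
        + layer_norm
    )
    pooler = hidden * hidden + hidden
    classifier = hidden * num_labels + num_labels
    return embeddings + layers * (attention + feed_forward) + pooler + classifier


total = bert_param_count(
    vocab_size=31_002,   # from this card
    hidden=768,          # from this card
    layers=12,           # from this card
    intermediate=3_072,  # assumed: standard BERT-base feed-forward size
    max_positions=512,   # assumed: standard BERT-base position table
)
print(f"{total / 1e6:.1f}M parameters")
```

Under these assumptions the total comes to just under 110 million, in line with the figure listed above.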

---

## 🚀 Quick Start

### Installation