gplsi/Aitana-FraudDetection-R-1.0 (Text Classification)

ivanmartinezmurillo committed on Oct 3, 2025

Commit 9b0e57f (verified) · 1 Parent(s): 199413d

Create README.md

Files changed (1)

README.md +81 -0 (added)

	@@ -0,0 +1,81 @@

+---
+license: apache-2.0
+language:
+- en
+base_model:
+- BSC-LT/mRoBERTa
+pipeline_tag: text-classification
+library_name: transformers
+---
+# mRoBERTa_FT1_DFT1_fraude_phishing
+## Description
+This model is a fine-tuned version of `BSC-LT/mRoBERTa` for **binary phishing detection** in English text.
+Given an **SMS or email message**, it predicts one of two classes: **phishing** or **not phishing**.
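A minimal inference sketch using the `transformers` text-classification pipeline is shown below. It rests on two assumptions the README does not confirm: that the checkpoint is available under the repository ID shown at the top of this page (`gplsi/Aitana-FraudDetection-R-1.0`), and that the pipeline returns the default `LABEL_0`/`LABEL_1` names. `readable_label` and `classify` are hypothetical helpers introduced for illustration.

```python
import re

# Assumption: repository ID taken from this page's header; the README's
# BibTeX entry cites a different repository name.
MODEL_ID = "gplsi/Aitana-FraudDetection-R-1.0"

# Class indices as used in the Results tables of the README.
CLASS_NAMES = {0: "Not phishing", 1: "Phishing"}


def readable_label(raw_label: str) -> str:
    """Map a pipeline label such as 'LABEL_1' to a class name.

    Assumption: the model config keeps the default 'LABEL_<index>' names.
    Labels that do not match this pattern are returned unchanged.
    """
    match = re.fullmatch(r"LABEL_(\d+)", raw_label)
    if match is None:
        return raw_label
    return CLASS_NAMES.get(int(match.group(1)), raw_label)


def classify(texts, model_id=MODEL_ID):
    """Classify a list of messages; downloads the model on first use."""
    from transformers import pipeline  # imported lazily: heavy dependency

    classifier = pipeline("text-classification", model=model_id)
    results = classifier(texts, truncation=True)
    return [(readable_label(r["label"]), r["score"]) for r in results]
```

Calling `classify(["Your account is locked, verify your details now"])` would return a list of `(class name, score)` pairs, one per input message.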
+## Dataset
+The dataset used for fine-tuning contains **SMS and email texts** labeled as phishing or not phishing.
+- **Training set**: 9,422 instances
+- **Test set**: 2,357 instances
+## Training Parameters
+- learning_rate: 2e-5
+- num_train_epochs: 2
+- per_device_train_batch_size: 8
+- per_device_eval_batch_size: 8
+- overwrite_output_dir: true
+- logging_strategy: steps
+- logging_steps: 10
+- seed: 852
+- fp16: true
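The parameters listed above map onto keyword arguments of `transformers.TrainingArguments`. The sketch below collects them in a dictionary and estimates the number of optimizer steps. It assumes a single device, no gradient accumulation, and a `transformers` version that accepts these argument names; `optimizer_steps`, `build_training_arguments`, and the `output_dir` value are hypothetical, and the README does not state how the training was actually launched.

```python
import math

# Values copied from the "Training Parameters" list.
TRAINING_PARAMS = {
    "learning_rate": 2e-5,
    "num_train_epochs": 2,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "overwrite_output_dir": True,
    "logging_strategy": "steps",
    "logging_steps": 10,
    "seed": 852,
    "fp16": True,  # needs a GPU with half-precision support
}

TRAIN_SET_SIZE = 9422  # from the "Dataset" section


def optimizer_steps(num_examples, params, num_devices=1):
    """Total optimizer steps, assuming no gradient accumulation
    and that the last partial batch of each epoch is kept."""
    per_step = params["per_device_train_batch_size"] * num_devices
    steps_per_epoch = math.ceil(num_examples / per_step)
    return steps_per_epoch * params["num_train_epochs"]


def build_training_arguments(output_dir):
    """Build TrainingArguments; output_dir is a caller-chosen path."""
    from transformers import TrainingArguments  # imported lazily

    return TrainingArguments(output_dir=output_dir, **TRAINING_PARAMS)
```

Under these assumptions, `optimizer_steps(TRAIN_SET_SIZE, TRAINING_PARAMS)` gives 1,178 steps per epoch and 2,356 steps in total.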
+## Results
+### Combined dataset (SMS + emails)
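+**Confusion Matrix** (rows: true class, columns: predicted class)
+
+| | Predicted 0 | Predicted 1 |
+|---|---|---|
+| True 0 | 1793 | 16 |
+| True 1 | 18 | 530 |
+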
+| Class | Precision | Recall | F1-score | Support |
+|-------|-----------|--------|----------|---------|
+| 0 (Not phishing) | 0.9901 | 0.9912 | 0.9906 | 1809 |
+| 1 (Phishing)     | 0.9707 | 0.9672 | 0.9689 | 548  |
+- Accuracy: **0.9856**
+- Macro Avg F1: **0.9798**
+---
+### Only Emails
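+**Confusion Matrix** (rows: true class, columns: predicted class)
+
+| | Predicted 0 | Predicted 1 |
+|---|---|---|
+| True 0 | 823 | 12 |
+| True 1 | 14 | 313 |
+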
+| Class | Precision | Recall | F1-score | Support |
+|-------|-----------|--------|----------|---------|
+| 0 (Not phishing) | 0.9833 | 0.9856 | 0.9845 | 835 |
+| 1 (Phishing)     | 0.9631 | 0.9572 | 0.9601 | 327 |
+- Accuracy: **0.9776**
+- Macro Avg F1: **0.9723**
+---
+### Only SMS
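+**Confusion Matrix** (rows: true class, columns: predicted class)
+
+| | Predicted 0 | Predicted 1 |
+|---|---|---|
+| True 0 | 969 | 5 |
+| True 1 | 6 | 215 |
+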
+| Class | Precision | Recall | F1-score | Support |
+|-------|-----------|--------|----------|---------|
+| 0 (Not phishing) | 0.9939 | 0.9949 | 0.9944 | 974 |
+| 1 (Phishing)     | 0.9773 | 0.9729 | 0.9751 | 221 |
+- Accuracy: **0.9908**
+- Macro Avg F1: **0.9847**
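The per-class and aggregate figures in the three result sections can be recomputed from the confusion matrices alone. The sketch below does this in plain Python, reading each matrix with rows as true classes and columns as predicted classes (the orientation under which the row sums equal the reported supports). `metrics_from_confusion` is a hypothetical helper, not the evaluation code used by the authors.

```python
def metrics_from_confusion(matrix):
    """Per-class precision/recall/F1 plus accuracy and macro F1.

    matrix[i][j] = number of examples of true class i predicted as class j.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    per_class = []
    for c in range(n):
        tp = matrix[c][c]
        predicted = sum(matrix[r][c] for r in range(n))  # column sum
        support = sum(matrix[c])                         # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / support if support else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class.append(
            {"precision": precision, "recall": recall, "f1": f1, "support": support}
        )
    macro_f1 = sum(m["f1"] for m in per_class) / n
    return {"per_class": per_class, "accuracy": correct / total, "macro_f1": macro_f1}


# Confusion matrices as reported in the Results section.
CONFUSION_MATRICES = {
    "combined": [[1793, 16], [18, 530]],
    "emails": [[823, 12], [14, 313]],
    "sms": [[969, 5], [6, 215]],
}

for name, matrix in CONFUSION_MATRICES.items():
    result = metrics_from_confusion(matrix)
    print(f"{name}: accuracy={result['accuracy']:.4f} macro_f1={result['macro_f1']:.4f}")
```

Rounded to four decimals, the values computed this way agree with the accuracy and macro F1 reported for each subset.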
+---
+## Reference
+```bibtex
+@misc{gplsi-mroberta-fraudephishing,
+  author       = {Martínez-Murillo, Iván and Bonora, Mar and Sepúlveda-Torres, Robiert},
+  title        = {mRoBERTa_FT1_DFT1_fraude_phishing: Fine-tuned model for phishing detection},
+  year         = {2025},
+  howpublished = {\url{https://huggingface.co/gplsi/mRoBERTa_FT1_DFT1_fraude_phishing}},
+  note         = {Accessed: 2025-10-03}
+}
+```