ivanmartinezmurillo committed on
Commit fa8d7de · verified · 1 Parent(s): 4bdf289

Create README.md

Files changed (1)
  1. README.md +60 -0
README.md ADDED
@@ -0,0 +1,60 @@
+ ---
+ license: apache-2.0
+ language:
+ - en
+ - es
+ base_model:
+ - BSC-LT/mRoBERTa
+ pipeline_tag: text-classification
+ library_name: transformers
+ ---
+
+ # mRoBERTa_FT3_DFT3_fraude_spam
+
+ ## Description
+ This model is a fine-tuned version of `BSC-LT/mRoBERTa` for **binary spam classification** of English and Spanish texts.
+ It predicts whether a given **SMS or email message** is **spam** or **not spam**.
+
+ The model was trained with the Hugging Face `Trainer` API, using the same configuration as the phishing detection model.
+
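As a rough illustration of the usage described above, the following sketch loads the model through the `transformers` text-classification pipeline. The repository id is taken from the URL in the Reference section; the `LABEL_0`/`LABEL_1` names and their mapping to "not spam"/"spam" are assumptions based on the class indices in the results table, so check the checkpoint's `config.json` before relying on them. The helper names `readable` and `classify` are hypothetical.

```python
from typing import Dict, List

# Assumed repository id, taken from the URL in the Reference section.
MODEL_ID = "gplsi/mRoBERTa_FT3_DFT3_fraude_spam"

# Assumed label mapping (0 = not spam, 1 = spam, as in the results table).
# The checkpoint's config may expose different label names.
ID2NAME: Dict[str, str] = {"LABEL_0": "not spam", "LABEL_1": "spam"}


def readable(prediction: Dict[str, object]) -> Dict[str, object]:
    """Map a raw pipeline prediction to a human-readable label."""
    raw = str(prediction["label"])
    # Fall back to the raw label if it is not one of the assumed names.
    return {"label": ID2NAME.get(raw, raw), "score": prediction["score"]}


def classify(texts: List[str]) -> List[Dict[str, object]]:
    """Download the model and classify each text (needs network access)."""
    from transformers import pipeline  # imported lazily: heavy dependency

    classifier = pipeline("text-classification", model=MODEL_ID)
    return [readable(p) for p in classifier(texts)]
```

Calling `classify(["You have won a free prize! Reply YES to claim it now.", "Nos vemos mañana a las 10 en la oficina."])` should return one dictionary per message with a `label` and a `score` entry.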
+ ## Dataset
+ The fine-tuning dataset contains English and Spanish **SMS and email texts**, each labeled as spam or not spam.
+
+ - **Training set**: 15,893 instances
+ - **Test set**: 1,766 instances
+
+ ## Training Parameters
+ - learning_rate: 2e-5
+ - num_train_epochs: 2
+ - per_device_train_batch_size: 8
+ - per_device_eval_batch_size: 8
+ - overwrite_output_dir: true
+ - logging_strategy: steps
+ - logging_steps: 10
+ - seed: 852
+ - fp16: true
+
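The parameters listed above could be passed to `transformers.TrainingArguments` roughly as follows. This is a sketch, not the authors' training script: the output directory `./mroberta-spam` and the helper names are hypothetical, it assumes a `transformers` version that accepts all of these argument names, and the step estimate assumes a single device with no gradient accumulation.

```python
import math

# Hyperparameters copied from the list above.
HYPERPARAMETERS = {
    "learning_rate": 2e-5,
    "num_train_epochs": 2,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "overwrite_output_dir": True,
    "logging_strategy": "steps",
    "logging_steps": 10,
    "seed": 852,
    "fp16": True,
}


def build_training_args(output_dir: str = "./mroberta-spam"):
    """Build TrainingArguments; fp16=True expects a CUDA-capable GPU."""
    from transformers import TrainingArguments  # imported lazily

    # output_dir is a hypothetical path, not taken from the model card.
    return TrainingArguments(output_dir=output_dir, **HYPERPARAMETERS)


def total_optimizer_steps(num_train_examples: int) -> int:
    """Estimate optimizer steps, assuming one device and no accumulation."""
    batch_size = HYPERPARAMETERS["per_device_train_batch_size"]
    steps_per_epoch = math.ceil(num_train_examples / batch_size)
    return steps_per_epoch * HYPERPARAMETERS["num_train_epochs"]
```

Under those assumptions, `total_optimizer_steps(15893)` gives the number of updates for the 15,893-instance training set, with a log entry written every 10 steps.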
+ ## Results (Test set)
+
+ **Confusion Matrix** (rows: true class, columns: predicted class)
+
+ | | Predicted 0 (Not spam) | Predicted 1 (Spam) |
+ |---|---|---|
+ | True 0 (Not spam) | 1506 | 4 |
+ | True 1 (Spam) | 8 | 248 |
+
+ | Class | Precision | Recall | F1-score | Support |
+ |-------|-----------|--------|----------|---------|
+ | 0 (Not spam) | 0.9947 | 0.9974 | 0.9960 | 1510 |
+ | 1 (Spam) | 0.9841 | 0.9688 | 0.9764 | 256 |
+
+ - Accuracy: **0.9932**
+ - Macro Avg F1: **0.9862**
+
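The reported metrics can be recomputed from the confusion matrix with a few lines of plain Python. The sketch below reads the matrix with rows as true classes and columns as predicted classes; that orientation is an assumption, chosen because it reproduces the per-class table. The function names are illustrative.

```python
from typing import Dict, List

# Confusion matrix from the results above.
# Assumed orientation: rows = true class, columns = predicted class.
CONFUSION: List[List[int]] = [[1506, 4],
                              [8, 248]]


def per_class_metrics(cm: List[List[int]], cls: int) -> Dict[str, float]:
    """Precision, recall, F1 and support for one class."""
    true_positives = cm[cls][cls]
    support = sum(cm[cls])                   # all examples of this class
    predicted = sum(row[cls] for row in cm)  # all predictions of this class
    precision = true_positives / predicted
    recall = true_positives / support
    f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall,
            "f1": f1, "support": support}


def accuracy(cm: List[List[int]]) -> float:
    """Fraction of examples on the diagonal."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / sum(sum(row) for row in cm)


def macro_f1(cm: List[List[int]]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    scores = [per_class_metrics(cm, c)["f1"] for c in range(len(cm))]
    return sum(scores) / len(scores)


for cls in range(len(CONFUSION)):
    print(cls, {k: round(v, 4) for k, v in per_class_metrics(CONFUSION, cls).items()})
print("accuracy", round(accuracy(CONFUSION), 4))
print("macro F1", round(macro_f1(CONFUSION), 4))
```

For example, spam recall is 248 / (8 + 248) and spam precision is 248 / (4 + 248), which match the table to four decimal places.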
+ ---
+
+ ## Reference
+ ```bibtex
+ @misc{gplsi-mroberta-fraudespam,
+   author = {Bonora, Mar and Sepúlveda-Torres, Robiert and Martínez-Murillo, Iván},
+   title = {mRoBERTa_FT3_DFT3_fraude_spam: Fine-tuned model for spam detection},
+   year = {2025},
+   howpublished = {\url{https://huggingface.co/gplsi/mRoBERTa_FT3_DFT3_fraude_spam}},
+   note = {Accessed: 2025-10-03}
+ }
+ ```