LaProfeClaudis committed on
Commit
28781d3
·
verified ·
1 Parent(s): bf177b5

Update README.md

  1. README.md +35 -10
README.md CHANGED
@@ -3,6 +3,8 @@ library_name: transformers
3
  base_model: dccuchile/bert-base-spanish-wwm-uncased
4
  tags:
5
  - generated_from_trainer
 
 
6
  metrics:
7
  - accuracy
8
  - f1
@@ -11,35 +13,58 @@ metrics:
11
  model-index:
12
  - name: LGBeTO_detection_Model
13
  results: []
 
 
 
 
14
  ---
15
 
16
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
17
- should probably proofread and complete it, then remove this comment. -->
18
-
19
  # LGBeTO_detection_Model
20
 
21
- This model is a fine-tuned version of [dccuchile/bert-base-spanish-wwm-uncased](https://huggingface.co/dccuchile/bert-base-spanish-wwm-uncased) on the None dataset.
22
  It achieves the following results on the evaluation set:
23
- - Loss: 0.5393
24
  - Accuracy: 0.835
25
  - F1: 0.8533
26
  - Precision: 0.8205
27
  - Recall: 0.8889
28
 
 
29
  ## Model description
30
 
31
- More information needed
32
 
33
  ## Intended uses & limitations
34
 
35
- More information needed
36
 
37
  ## Training and evaluation data
38
 
39
- More information needed
40
 
41
  ## Training procedure
42
 
43
  ### Training hyperparameters
44
 
45
  The following hyperparameters were used during training:
@@ -57,7 +82,7 @@ The following hyperparameters were used during training:
57
  |:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|:---------:|:------:|
58
  | 0.4655 | 1.0 | 50 | 0.5517 | 0.755 | 0.7538 | 0.8242 | 0.6944 |
59
  | 0.1928 | 2.0 | 100 | 0.4830 | 0.825 | 0.8523 | 0.7829 | 0.9352 |
60
- | 0.0718 | 3.0 | 150 | 0.5393 | 0.835 | 0.8533 | 0.8205 | 0.8889 |
61
 
62
 
63
  ### Framework versions
@@ -65,4 +90,4 @@ The following hyperparameters were used during training:
65
  - Transformers 4.51.3
66
  - Pytorch 2.6.0+cu124
67
  - Datasets 3.6.0
68
- - Tokenizers 0.21.1
 
3
  base_model: dccuchile/bert-base-spanish-wwm-uncased
4
  tags:
5
  - generated_from_trainer
6
+ - hatetoLGBTcomunities
7
+ - BETO
8
  metrics:
9
  - accuracy
10
  - f1
 
13
  model-index:
14
  - name: LGBeTO_detection_Model
15
  results: []
16
+ license: cc-by-4.0
17
+ language:
18
+ - es
19
+ pipeline_tag: text-classification
20
  ---
21
 
 
 
 
22
  # LGBeTO_detection_Model
23
 
24
+ LGBeTO is a fine-tuned version of [dccuchile/bert-base-spanish-wwm-uncased](https://huggingface.co/dccuchile/bert-base-spanish-wwm-uncased) (Cañete et al., 2023).
25
  It achieves the following results on the evaluation set:
26
+
27
  - Accuracy: 0.835
28
  - F1: 0.8533
29
  - Precision: 0.8205
30
  - Recall: 0.8889
31
 
32
+
33
  ## Model description
34
 
35
+ LGBeTO is a Spanish-language text classifier designed to detect discriminatory or hateful language directed toward the LGBTQIA+ community, with the aim of supporting safer and more inclusive online environments.
36
 
37
  ## Intended uses & limitations
38
 
39
+ This model was created for a study conducted strictly for academic and research purposes. The target of hate speech has been anonymised, and there is no intent to harm the perpetrators
40
+ in any way. We prioritize protecting the privacy and confidentiality of vulnerable individuals.
41
+ We carefully removed identifying data, such as user IDs, phone numbers, and addresses, to safeguard privacy before
42
+ sharing the data with our annotators. All data collected comes from public sources.
43
+
44
+ As authors, we affirm our deep respect for all individuals and explicitly state that we have no intention of prejudicing,
45
+ biasing, or disrespecting the LGBTQIA+ community or any group. Our work seeks to contribute constructively to inclusive
46
+ and ethical research in artificial intelligence.
47
+
48
 
49
  ## Training and evaluation data
50
 
51
+ LGBeTO was fine-tuned on comments collected from digital media such as Twitter, Instagram, YouTube, and websites.
52
+ The dataset is available in the Zenodo Repository.
53
+
54
+ Cite as:
55
+ Martínez-Araneda, C., Maldonado Montiel, D., Gutiérrez Valenzuela, M., Gómez Meneses, P., Segura Navarrete, A.,
56
+ & Vidal-Castro, C. (2025). LGBTQIAphobia dataset (augmented and balanced) [Data set]. Zenodo.
57
+ https://doi.org/10.5281/zenodo.15385622
58
 
59
  ## Training procedure
60
 
61
+ - Step 1: Load the dataset
62
+ - Step 2: Tokenization and model generation
63
+ - Step 3: Train-validation split
64
+ - Step 4: Training configuration
65
+ - Step 5: Training and evaluation
66
+
67
+
68
  ### Training hyperparameters
69
 
70
  The following hyperparameters were used during training:
 
82
  |:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|:---------:|:------:|
83
  | 0.4655 | 1.0 | 50 | 0.5517 | 0.755 | 0.7538 | 0.8242 | 0.6944 |
84
  | 0.1928 | 2.0 | 100 | 0.4830 | 0.825 | 0.8523 | 0.7829 | 0.9352 |
85
+ ##| 0.0718 | 3.0 | 150 | 0.5393 | 0.835 | 0.8533 | 0.8205 | 0.8889 |
86
 
87
 
88
  ### Framework versions
 
90
  - Transformers 4.51.3
91
  - Pytorch 2.6.0+cu124
92
  - Datasets 3.6.0
93
+ - Tokenizers 0.21.1
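
The model card reports accuracy, F1, precision, and recall for each epoch. Because F1 is the harmonic mean of precision and recall, the table can be cross-checked with a few lines of arithmetic. The following is a minimal sketch of that check, not part of the commit: the helper name `f1_from_precision_recall`, the `reported` list, and the tolerance are illustrative assumptions, while the numbers are copied from the training-results rows shown in the diff.

```python
# Cross-check of the F1 values in the training-results table.
# The helper name and the `reported` list are illustrative assumptions;
# the numbers are copied from the table rows for epochs 1 to 3.


def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (epoch, precision, recall, reported F1)
reported = [
    (1, 0.8242, 0.6944, 0.7538),
    (2, 0.7829, 0.9352, 0.8523),
    (3, 0.8205, 0.8889, 0.8533),
]

# Precision and recall are already rounded to four decimals, so compare
# with a small tolerance instead of requiring exact equality.
TOLERANCE = 5e-4

for epoch, precision, recall, f1_reported in reported:
    f1 = f1_from_precision_recall(precision, recall)
    consistent = abs(f1 - f1_reported) < TOLERANCE
    print(f"epoch {epoch}: recomputed F1 = {f1:.4f}, "
          f"reported = {f1_reported}, consistent = {consistent}")
```

With the values above, each recomputed F1 agrees with the table to within rounding, and the epoch 3 row matches the headline metrics listed at the top of the card.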