LaProfeClaudis/LGBeTO_detection_Model

@@ -21,7 +21,7 @@ pipeline_tag: text-classification
 # LGBeTO_detection_Model
-This model is LGBeTO model. Corresponding to a fine-tuned version of [dccuchile/bert-base-spanish-wwm-uncased](https://huggingface.co/dccuchile/bert-base-spanish-wwm-uncased) (Cañete et al., 2023).
 It achieves the following results on the evaluation set:
 - Accuracy: 0.835
@@ -29,6 +29,13 @@ It achieves the following results on the evaluation set:
 - Precision: 0.8205
 - Recall: 0.8889
 ## Model description
@@ -36,19 +43,17 @@ LGBeTO was designed to detect discriminatory or hateful language directed toward
 ## Intended uses & limitations
-This model was created for a study that was conducted strictly for academic and research purposes. The target of hate speech has been anonymised, and there is no intent to harm the perpetrators
-in any way. We prioritize protecting the privacy and confidentiality of vulnerable individuals.
-We carefully remove identifying data, such as user IDs, phone numbers, and addresses, to safeguard privacy before
 sharing the data with our annotators. All data collected comes from public sources.
-As authors, we affirm our deep respect for all individuals and explicitly state that we have no intention of prejudicing,
-biasing, or disrespecting the LGBTQIA+ community or any group. Our work seeks to contribute constructively to inclusive
 and ethical research in artificial intelligence.
 ## Training and evaluation data
-LGBeTO was fine-tuned using comments collected from digital media, such as Twitter, Instagram, websites, and YouTube comments
 The dataset is available in the Zenodo Repository.
 Cite as:
@@ -58,11 +63,11 @@ https://doi.org/10.5281/zenodo.15385622
 ## Training procedure
-- step 1: Load the dataSet
-- step 2: Tokenization and model generation
-- step 3: Split train-validation
-- step 4: Training configuration
-- step 5: Training/Evaluation
 ### Training hyperparameters
@@ -73,7 +78,6 @@ The following hyperparameters were used during training:
 - eval_batch_size: 16
 - seed: 42
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
-- lr_scheduler_type: linear
 - num_epochs: 3
 ### Training results
@@ -82,7 +86,7 @@ The following hyperparameters were used during training:
 |:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|:---------:|:------:|
 | 0.4655        | 1.0   | 50   | 0.5517          | 0.755    | 0.7538 | 0.8242    | 0.6944 |
 | 0.1928        | 2.0   | 100  | 0.4830          | 0.825    | 0.8523 | 0.7829    | 0.9352 |
-| 0.0718        | 3.0   | 150  | 0.5393          | 0.835    | 0.8533 | 0.8205    | 0.8889 |
 ### Framework versions

 # LGBeTO_detection_Model
+This is the LGBeTO model, a fine-tuned version of [dccuchile/bert-base-spanish-wwm-uncased](https://huggingface.co/dccuchile/bert-base-spanish-wwm-uncased) (Cañete et al., 2023).
 It achieves the following results on the evaluation set:
 - Accuracy: 0.835
 - Precision: 0.8205
 - Recall: 0.8889
+## Authors
+- **Developed by:** Claudia Martínez-Araneda, Mariella Gutiérrez V., Pedro Gómez M., Diego Maldonado M., Alejandra Segura N., Christian Vidal-Castro
+- **Model type:** BERT-based sentiment analysis, BERT-based text classification.
+- **Language(s) (NLP):** Spanish
+- **License:** CC BY 4.0
+- **Finetuned from model:** BETO (Cañete et al., 2023)
 ## Model description
 ## Intended uses & limitations
+This model was created for a study conducted strictly for academic and research purposes. The target of hate speech has been anonymised, and there is no intent to harm the perpetrators
+in any way. We prioritise protecting the privacy and confidentiality of vulnerable individuals. We carefully remove identifying data, such as user IDs, phone numbers, and addresses, to safeguard privacy before
 sharing the data with our annotators. All data collected comes from public sources.
+As authors, we affirm our deep respect for all individuals and explicitly state that we have no intention of prejudicing, biasing, or disrespecting the LGBTQIA+ community or any group. Our work seeks to contribute constructively to inclusive
 and ethical research in artificial intelligence.
 ## Training and evaluation data
+LGBeTO was fine-tuned on comments collected from digital media such as Twitter, Instagram, YouTube, and websites.
 The dataset is available in the Zenodo Repository.
 Cite as:
 ## Training procedure
+- **Step 1:** Load the dataset
+- **Step 2:** Tokenization and model generation
+- **Step 3:** Train-validation split
+- **Step 4:** Training configuration
+- **Step 5:** Training and evaluation
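
The train-validation split step can be illustrated with a minimal sketch. This is not the authors' code: the helper name `train_validation_split`, the 0.2 validation fraction, and the toy data are all assumptions made for illustration, and reusing the card's seed of 42 for the shuffle is likewise an assumption.

```python
import random


def train_validation_split(examples, validation_fraction=0.2, seed=42):
    """Shuffle a copy of `examples` and return (train, validation).

    Hypothetical helper: the fraction is illustrative, not taken from the card.
    """
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be strictly between 0 and 1")
    shuffled = list(examples)
    # A dedicated Random instance keeps the shuffle reproducible for a given seed.
    random.Random(seed).shuffle(shuffled)
    n_validation = round(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]


# Toy (text, label) pairs standing in for the real annotated comments.
toy_examples = [(f"comentario {i}", i % 2) for i in range(10)]
train, validation = train_validation_split(toy_examples)
print(len(train), len(validation))
```

Fixing the seed makes the split repeatable, so evaluation metrics from different runs are computed on the same held-out comments.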
 ### Training hyperparameters
 - eval_batch_size: 16
 - seed: 42
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - num_epochs: 3
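
As a hedged illustration, the hyperparameters visible in this list can be collected under the keyword names used by `transformers.TrainingArguments`. The mapping of `eval_batch_size` to `per_device_eval_batch_size` assumes single-device training, and any settings not shown in this excerpt are deliberately left out rather than guessed.

```python
# Keyword names follow transformers.TrainingArguments; values come from the list above.
# Only hyperparameters visible in this excerpt are included.
training_kwargs = {
    "per_device_eval_batch_size": 16,  # assumes a single device
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "num_train_epochs": 3,
}

for name, value in training_kwargs.items():
    print(f"{name} = {value}")
```

A dictionary like this could be unpacked into `TrainingArguments` together with an `output_dir` of your choice.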
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss | Accuracy | F1     | Precision | Recall |
 |:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|:---------:|:------:|
 | 0.4655        | 1.0   | 50   | 0.5517          | 0.755    | 0.7538 | 0.8242    | 0.6944 |
 | 0.1928        | 2.0   | 100  | 0.4830          | 0.825    | 0.8523 | 0.7829    | 0.9352 |
+| **0.0718**    | **3.0** | **150** | **0.5393**    | **0.835** | **0.8533** | **0.8205** | **0.8889** |
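
The rows above can be sanity-checked with a short sketch, assuming the column between accuracy and precision is F1 (the values are consistent with that reading). F1 is the harmonic mean of precision and recall, so it can be recomputed from the last two columns. Because the table values are rounded to four decimals, small differences in the last digit are possible.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (epoch, precision, recall, reported F1) copied from the table rows above.
rows = [
    (1, 0.8242, 0.6944, 0.7538),
    (2, 0.7829, 0.9352, 0.8523),
    (3, 0.8205, 0.8889, 0.8533),
]

for epoch, precision, recall, reported in rows:
    computed = f1_score(precision, recall)
    print(f"epoch {epoch}: computed F1 = {computed:.4f}, reported = {reported:.4f}")
```

The check also shows why the epoch 2 and epoch 3 rows have nearly the same F1: epoch 2 trades lower precision for higher recall, and the harmonic mean comes out almost unchanged.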
 ### Framework versions