gplsi/Toxicity_model_binary


rsepulvedat committed on Jul 29, 2025

Commit 302fa43 (verified)

1 Parent(s): eea9fe1

Update README.md

Files changed (1)

README.md +44 -3


@@ -1,3 +1,44 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+datasets:
+- gplsi/SocialTOX
+language:
+- es
+metrics:
+- accuracy
+- f1
+- precision
+- recall
+base_model:
+- BSC-LT/roberta-base-bne
+pipeline_tag: text-classification
+---
+# 🧠 Toxicity_model_RoBERTa-base-bne: Spanish Binary Toxicity Classifier (Fine-tuned)
+## 📌 Model Description
+This model is a **fine-tuned version** of `RoBERTa-base-bne`, trained to classify the toxicity of **Spanish-language user comments on news articles**. It distinguishes between two categories:
+- **Non-toxic**
+- **Toxic**
+The model follows instruction-based prompts and returns a single classification label in response.
+---
+## 📂 Training Data
+The model was fine-tuned on the **[SocialTOX dataset](https://huggingface.co/datasets/gplsi/SocialTOX)**, a collection of Spanish-language comments annotated for varying levels of toxicity. These comments come from news platforms and represent real-world scenarios of online discourse. For this model, a binary classifier was developed by merging the *Slightly toxic* and *Toxic* classes into a single *Toxic* category.
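Editorial sketch (not part of the commit): the class merge described above could look roughly like the following. The three label strings come from the model card text; the function name `to_binary_label` and the assumption that the dataset stores labels as these exact strings are hypothetical, and this is not the authors' preprocessing code.

```python
# Hypothetical mapping from the original SocialTOX labels to the binary scheme.
# Only the labels named in the model card are covered; the exact label strings
# stored in the dataset are an assumption.
BINARY_LABEL = {
    "Non-toxic": "Non-toxic",
    "Slightly toxic": "Toxic",
    "Toxic": "Toxic",
}


def to_binary_label(original_label: str) -> str:
    """Collapse a fine-grained toxicity label into Non-toxic or Toxic."""
    try:
        return BINARY_LABEL[original_label]
    except KeyError:
        # Fail loudly on labels this sketch does not know about.
        raise ValueError(f"Unexpected toxicity label: {original_label!r}") from None


if __name__ == "__main__":
    for label in ("Non-toxic", "Slightly toxic", "Toxic"):
        print(label, "->", to_binary_label(label))
```

Raising on unknown labels, instead of defaulting to one class, keeps any additional toxicity level in the dataset from being merged silently.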
+---
+## Training hyperparameters
+- epochs: 10
+- learning_rate: 2.45e-6
+- beta1: 0.9
+- beta2: 0.95
+- adam_epsilon: 1.00e-8
+- weight_decay: 0
+- batch_size: 16
+- max_seq_length: 512
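
Editorial sketch (not part of the commit): the hyperparameters listed above, expressed as keyword arguments using the parameter names of `transformers.TrainingArguments`. The model card does not say which training loop was used, so the use of the Hugging Face `Trainer`, and the reading of `batch_size` as a per-device value, are assumptions.

```python
# Values copied from the hyperparameter list; keys follow the keyword names
# of transformers.TrainingArguments (assumed training setup, not confirmed).
training_kwargs = {
    "num_train_epochs": 10,
    "learning_rate": 2.45e-6,
    "adam_beta1": 0.9,
    "adam_beta2": 0.95,
    "adam_epsilon": 1.00e-8,
    "weight_decay": 0.0,
    # Assumption: the listed batch_size is per device.
    "per_device_train_batch_size": 16,
}

# max_seq_length is not a TrainingArguments field; it would be applied at
# tokenization time, e.g. tokenizer(text, truncation=True, max_length=MAX_SEQ_LENGTH).
MAX_SEQ_LENGTH = 512
```

With `transformers` installed, these could be passed as `TrainingArguments(output_dir="out", **training_kwargs)`, where `"out"` is a placeholder path.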