Commit e0dd549 (verified) by fc63 · Parent: 928cf5f

Update README.md

---
{}
---
# Model Card for Toxicity Detection Model

This model is a fine-tuned version of `dbmdz/bert-base-turkish-cased` for toxicity detection in Turkish text. It was trained on labeled datasets of online comments categorized by toxicity level. The model uses the Hugging Face `transformers` library and is suitable for sequence classification tasks. This work was completed as a project assignment for the Natural Language Processing (CENG493) course at Çankaya University.

- **Model Type:** Sequence Classification
- **Language(s):** Turkish
- **License:** GNU General Public License (GPL)
- **Fine-tuned from:** `dbmdz/bert-base-turkish-cased`

## Uses

This model can be used directly to analyze the toxicity of Turkish text. For example:

- Content moderation in online forums and social media platforms
- Filtering harmful language in customer reviews or feedback
- Monitoring and preventing cyberbullying in messaging applications
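As a minimal usage sketch, inference follows the standard `transformers` sequence-classification flow. The Hub repo id below is a placeholder (this card does not state the published id), and the exact label names come from the checkpoint's `id2label` config:

```python
from typing import Dict, List, Tuple

# Placeholder -- replace with this model's actual Hugging Face Hub repo id.
MODEL_ID = "your-username/turkish-toxicity-bert"


def to_label(probs: List[float], id2label: Dict[int, str]) -> Tuple[str, float]:
    """Pick the highest-probability class and return (label, confidence)."""
    idx = max(range(len(probs)), key=probs.__getitem__)
    return id2label[idx], probs[idx]


def classify(texts: List[str]) -> List[Tuple[str, float]]:
    """Run the fine-tuned checkpoint over a batch of Turkish texts."""
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
    enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        probs = torch.softmax(model(**enc).logits, dim=-1)
    return [to_label(row.tolist(), model.config.id2label) for row in probs]
```

For content-moderation use, `classify` can be called on incoming comments and posts flagged when the toxic-class confidence exceeds a threshold tuned on validation data.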

### Downstream Use

- Integrating toxic-language filtering into chatbots or virtual assistants
- Using it as part of a sentiment analysis pipeline


### Out-of-Scope Use

- Not suitable for analyzing languages other than Turkish
- Should not be used for sensitive decision-making without human oversight


## Bias, Risks, and Limitations

The model may inherit biases from the training data, including over- or under-representation of certain demographics or topics. It may also misclassify non-toxic content as toxic or fail to detect subtler forms of toxicity.

### Recommendations

Users should:

- Avoid deploying the model in high-stakes scenarios without additional validation.
- Regularly monitor performance and update the model if new biases are detected.

## Training Data

https://huggingface.co/datasets/Overfit-GM/turkish-toxic-language

## Evaluation

The model was evaluated on a held-out test set containing a balanced mix of toxic and non-toxic examples.
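No numeric results are reported on this card. As an illustrative sketch (the helper name and the encoding of toxic as the positive class `1` are assumptions, not part of the original evaluation code), the held-out evaluation reduces to standard binary-classification metrics:

```python
from typing import List, Tuple


def binary_metrics(y_true: List[int], y_pred: List[int]) -> Tuple[float, float, float, float]:
    """Accuracy, precision, recall, and F1 for the toxic class (positive = 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)

    accuracy = correct / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1
```

On a balanced test set, accuracy is a reasonable headline number, but precision and recall on the toxic class matter more for moderation use: low precision over-flags benign comments, while low recall lets toxic content through.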