Commit c784df1 · Parent: d7cdb79 · Create README.md

README.md ADDED · @@ -0,0 +1,80 @@
---
license: apache-2.0
pipeline_tag: text-classification
tags:
- not-for-all-audiences
---

# Model Card: Fine-Tuned DistilBERT for Offensive/Hate Speech Detection

## Model Description

**Fine-Tuned DistilBERT** is a distilled variant of the BERT transformer model: smaller and faster than BERT while retaining most of its accuracy. It has been adapted and fine-tuned for the specific task of offensive/hate speech detection in text data.

The model builds on the `distilbert-base-uncased` checkpoint, which is pre-trained on a large corpus of text and therefore captures the semantic nuances and contextual information present in natural language. It was fine-tuned with careful attention to hyperparameter settings, including batch size and learning rate, to ensure strong performance on the offensive/hate speech detection task.

During fine-tuning, the batch size was chosen for efficient computation and learning, and the learning rate was selected to balance rapid convergence against stable optimization, so that the model learns quickly while steadily refining its capabilities throughout training.

This model was trained on a proprietary dataset built specifically for offensive/hate speech detection. The dataset consists of text samples, each labeled "non-offensive" or "offensive," and its diversity allowed the model to learn to identify offensive content accurately.

The goal of this training process is a model that detects offensive and hate speech in text effectively, ready to contribute to content moderation and safety while maintaining high standards of accuracy and reliability.

## Intended Uses & Limitations

### Intended Uses

- **Offensive/Hate Speech Detection**: The primary intended use of this model is to detect offensive or hate speech in text data. It is well suited to filtering and identifying inappropriate content in a variety of applications.

### How to Use

To use this model for offensive/hate speech detection:

```python
from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub
classifier = pipeline("text-classification", model="Falconsai/offensive_speech_detection")

text = "Your text to classify here."
result = classifier(text)
print(result)
```

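In practice the pipeline's output is usually post-processed into a moderation decision. A minimal sketch, assuming the pipeline's usual `[{"label": ..., "score": ...}]` output shape; the label strings and the 0.8 threshold here are assumptions for illustration (check the model's `id2label` config for the real label names):

```python
def moderate(predictions, threshold=0.8):
    # predictions: a list of {"label": str, "score": float} dicts, the shape
    # returned by a transformers text-classification pipeline.
    # The "OFFENSIVE" label name and the threshold are illustrative assumptions.
    top = max(predictions, key=lambda p: p["score"])
    if top["label"] == "OFFENSIVE" and top["score"] >= threshold:
        return "block"
    return "allow"

print(moderate([{"label": "OFFENSIVE", "score": 0.97}]))      # block
print(moderate([{"label": "NON-OFFENSIVE", "score": 0.91}]))  # allow
```

Requiring a minimum confidence before blocking trades some recall for fewer false positives, which is often the safer default in moderation settings.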
### Limitations

- **Specialized Task Fine-Tuning**: While the model is adept at offensive/hate speech detection, its performance may vary when applied to other natural language processing tasks.
- Users who want to employ this model for a different task should explore the fine-tuned versions available on the model hub for optimal results.

## Training Data

The training data is a proprietary dataset designed for offensive/hate speech detection: a diverse collection of text samples, each categorized as "non-offensive" or "offensive." The training process aimed to equip the model to distinguish offensive from non-offensive content effectively.

### Training Stats

- Evaluation Loss: *Insert Evaluation Loss*
- Evaluation Accuracy: *Insert Evaluation Accuracy*
- Evaluation Runtime: *Insert Evaluation Runtime*
- Evaluation Samples per Second: *Insert Evaluation Samples per Second*
- Evaluation Steps per Second: *Insert Evaluation Steps per Second*

**Note:** Replace these placeholders with the model's measured evaluation statistics.
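For reference, an "Evaluation Accuracy" figure like the placeholder above is simply the fraction of evaluation examples whose predicted label matches the reference label; a minimal sketch (the example labels are made up):

```python
def accuracy(predicted, gold):
    # Fraction of examples whose predicted label matches the reference label.
    assert len(predicted) == len(gold)
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)

acc = accuracy(
    ["offensive", "non-offensive", "offensive"],
    ["offensive", "non-offensive", "non-offensive"],
)
# acc is 2/3: two of the three predictions match the reference labels
```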

## Responsible Usage

Use this model responsibly and ethically: adhere to content guidelines and applicable regulations when deploying it in real-world applications, particularly those involving potentially sensitive content.

## References

- [Hugging Face Model Hub](https://huggingface.co/models)
- [DistilBERT paper](https://arxiv.org/abs/1910.01108)

**Disclaimer:** The model's performance may be influenced by the quality and representativeness of the data it was fine-tuned on. Users are encouraged to assess the model's suitability for their specific applications and datasets.