Update README.md
README.md
CHANGED
|
@@ -9,5 +9,28 @@ app_file: app.py
|
|
| 9 |
pinned: false
|
| 10 |
license: mit
|
| 11 |
---
|
| 12 |
+
# Comment Toxicity Detector
|
| 13 |
+
|
| 14 |
+
This project implements a sentiment analysis model that detects different types of toxicity in a given comment, such as threats, obscenity, insults, and identity-based hate. An LSTM-based model was trained on 153 thousand sequences.
|
| 15 |
+
|
| 16 |
+
## Project Structure
|
| 17 |
+
|
| 18 |
+
- __01. Data Preparation:__
|
| 19 |
+
* `Data Collection`: The [dataset](https://www.kaggle.com/datasets/julian3833/jigsaw-toxic-comment-classification-challenge) comes from the [Toxic Comment Classification Challenge](https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge) held on Kaggle.
|
| 20 |
+
* `Data Cleaning & Preprocessing`:
|
| 21 |
+
- Vectorized the text with the `TextVectorization` layer from Keras
|
| 22 |
+
- Prepared a TensorFlow dataset for model training
|
| 23 |
+
|
| 24 |
+
- __02. Model Training:__
|
| 25 |
+
* A Bidirectional LSTM model with an embedding layer has been trained on the preprocessed data.
|
| 26 |
+
|
| 27 |
+
- __03. App Deployment:__
|
| 28 |
+
* Developed a web app with a Gradio interface
|
| 29 |
+
* Deployed the [App](https://huggingface.co/spaces/mazed/Comment_Toxicity_Detector) on Hugging Face Spaces.
|
| 30 |
+
|
| 31 |
+
- `requirements.txt`: Contains the dependencies needed for the project:
|
| 32 |
+
- `pandas`
|
| 33 |
+
- `gradio`
|
| 34 |
+
- `tensorflow==2.15.0`
|
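The data preparation and model training steps listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the vocabulary size, sequence length, layer widths, batch size, and the helper names `build_vectorizer`, `build_dataset`, and `build_model` are all assumptions made for the example. The label names are the six columns of the Kaggle challenge data.

```python
import numpy as np
import tensorflow as tf

# Label columns of the Jigsaw Toxic Comment Classification Challenge data.
LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

# Hypothetical sizes, chosen for illustration only.
MAX_TOKENS = 20_000      # vocabulary size kept by the vectorizer
SEQUENCE_LENGTH = 200    # comments are padded or truncated to this many tokens


def build_vectorizer(texts):
    """Fit a TextVectorization layer that maps strings to integer token ids."""
    vectorizer = tf.keras.layers.TextVectorization(
        max_tokens=MAX_TOKENS,
        output_sequence_length=SEQUENCE_LENGTH,
        output_mode="int",
    )
    vectorizer.adapt(tf.constant(texts))
    return vectorizer


def build_dataset(vectorizer, texts, labels, batch_size=16):
    """Turn raw comments and 0/1 label rows into a batched tf.data.Dataset."""
    token_ids = vectorizer(tf.constant(texts))
    dataset = tf.data.Dataset.from_tensor_slices((token_ids, labels))
    return dataset.shuffle(1_000).batch(batch_size).prefetch(tf.data.AUTOTUNE)


def build_model():
    """Embedding layer followed by a bidirectional LSTM, one sigmoid per label."""
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(MAX_TOKENS, 32),
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32)),
        tf.keras.layers.Dense(64, activation="relu"),
        # Sigmoid rather than softmax: a comment can carry several labels at once.
        tf.keras.layers.Dense(len(LABELS), activation="sigmoid"),
    ])
    model.compile(loss="binary_crossentropy", optimizer="adam")
    return model


if __name__ == "__main__":
    # Two made-up comments stand in for the real training data.
    texts = ["have a nice day", "i will hurt you"]
    labels = np.array([[0, 0, 0, 0, 0, 0], [1, 0, 0, 1, 0, 0]], dtype="float32")

    vectorizer = build_vectorizer(texts)
    dataset = build_dataset(vectorizer, texts, labels, batch_size=2)
    model = build_model()
    model.fit(dataset, epochs=1, verbose=0)
```

Binary cross-entropy with one sigmoid output per label treats each toxicity type as an independent yes/no decision, which fits a dataset where labels can overlap.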
| 35 |
|
| 36 |
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
|
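Because the model reports several toxicity types for one comment, the app needs a step that turns a row of scores into readable labels. The sketch below shows one way this could look. It assumes the model returns one probability per label in the order of the Kaggle label columns, and the 0.5 threshold and the function names are hypothetical; the README does not describe how `app.py` handles this.

```python
# Label columns of the Jigsaw Toxic Comment Classification Challenge data.
LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]


def decode_scores(scores, threshold=0.5):
    """Return the labels whose score is at or above the threshold."""
    if len(scores) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} scores, got {len(scores)}")
    return [label for label, score in zip(LABELS, scores) if score >= threshold]


def format_report(scores, threshold=0.5):
    """Build a one-line-per-label text report, e.g. for a text output box."""
    flagged = set(decode_scores(scores, threshold))
    return "\n".join(f"{label}: {label in flagged}" for label in LABELS)


if __name__ == "__main__":
    # Made-up scores standing in for a real model prediction.
    example_scores = [0.9, 0.1, 0.7, 0.2, 0.6, 0.05]
    print(decode_scores(example_scores))
    print(format_report(example_scores))
```

A function of this shape, combined with the vectorizer and model, could be passed as the `fn` argument of `gradio.Interface` with text input and text output.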