--- title: Comment Toxicity Detector emoji: 📈 colorFrom: red colorTo: pink sdk: gradio sdk_version: 4.31.5 app_file: app.py pinned: false license: mit --- # Comment Toxicity Detector This project implements a sentiment analysis model that’s capable of detecting different types of toxicity like threats, obscenity, insults, and identity-based hate of a given comment. An LSTM-based model has been trained on 153 thousand sequence. ## Project Structure - __01. Data Preparation:__ * `Data Collection`: The [dataset](https://www.kaggle.com/datasets/julian3833/jigsaw-toxic-comment-classification-challenge) has been collected from the [Toxic Comment Classificatio Challenge](https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge) held on Kaggle. * `Data Cleaning & Preprocessing`: - Vectorized the data utilizing "TextVectorization" from keras - Prepared a tensorflow dataset for model training - __02. Model Training:__ * A Bidirectional LSTM model with an embedding layer has been trained on the preprocessed data. - __03. App Deployment:__ * Developed a web-app with Gradio interface * Deployed the [App](https://huggingface.co/spaces/mazed/Comment_Toxicity_Detector) in HuggingFace Spaces. - `requirements.txt`: Contains the dependencies needed for the project: - `pandas` - `gradio` - `tensorflow==2.15.0` Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference