+---
+language: en
+tags:
+- toxic-content
+- text-classification
+- keras
+- tensorflow
+- deep-learning
+- safety
+- multiclass
+license: mit
+datasets:
+- custom
+metrics:
+- accuracy
+- f1
+pipeline_tag: text-classification
+model-index:
+- name: Toxic_Classification
+  results: []
+---
+# Toxic_Classification (Keras / TensorFlow Model)
+This is a **multi-class text classification model** for toxic content detection.
+It was trained as part of the **Cellula Internship - Safe and Responsible Multi-Modal Toxic Content Moderation** project.
+---
+## 🚩 Task: Multi-class Toxic Content Detection
+The model classifies text (query + image description) into **9 categories:**
+| Label ID | Category                     |
+|--------- |------------------------------|
+| 0        | Child Sexual Exploitation    |
+| 1        | Elections                    |
+| 2        | Non-Violent Crimes           |
+| 3        | Safe                         |
+| 4        | Sex-Related Crimes           |
+| 5        | Suicide & Self-Harm          |
+| 6        | Unknown S-Type               |
+| 7        | Violent Crimes               |
+| 8        | Unsafe                       |
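
To turn the model's 9-way softmax output into a readable category, the table above can be expressed as a simple lookup. The following is a minimal sketch, not code from this repository: `ID_TO_LABEL` and `decode_prediction` are illustrative names, the label names are copied from the table, and the probability vector in the example is made up rather than real model output. The repository's `config.json` may already store the same mapping, in which case reading it from there would be preferable.

```python
# Label IDs and names copied from the category table in this model card.
# ID_TO_LABEL and decode_prediction are illustrative names, not part of the repository.
ID_TO_LABEL = {
    0: "Child Sexual Exploitation",
    1: "Elections",
    2: "Non-Violent Crimes",
    3: "Safe",
    4: "Sex-Related Crimes",
    5: "Suicide & Self-Harm",
    6: "Unknown S-Type",
    7: "Violent Crimes",
    8: "Unsafe",
}


def decode_prediction(probabilities):
    """Return (label_id, label_name, confidence) for one softmax vector."""
    probs = [float(p) for p in probabilities]
    if len(probs) != len(ID_TO_LABEL):
        raise ValueError(
            f"expected {len(ID_TO_LABEL)} probabilities, got {len(probs)}"
        )
    # Index of the largest probability, equivalent to argmax over the vector.
    label_id = max(range(len(probs)), key=probs.__getitem__)
    return label_id, ID_TO_LABEL[label_id], probs[label_id]


# Made-up probability vector for illustration only (not real model output).
example = [0.01, 0.02, 0.05, 0.80, 0.02, 0.03, 0.02, 0.03, 0.02]
label_id, label_name, confidence = decode_prediction(example)
print(f"{label_id}: {label_name} ({confidence:.2f})")
```

With a real prediction, one row of the model output (for example `prediction[0]`) would be passed in place of `example`.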
+---
+## ✅ Model Details
+- **Framework:** TensorFlow 2.19.0 + Keras 3.7.0
+- **Input:** User query text and image description, concatenated into a single string
+- **Tokenizer:** Keras tokenizer stored as JSON (`tokenizer.json`), with out-of-vocabulary (OOV) handling and a vocabulary size of 10,000
+- **Max Sequence Length:** 150 tokens
+- **Output:** Softmax probabilities over 9 classes
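
The fixed sequence length means every tokenized example is cut or padded to exactly 150 token IDs before it reaches the model. The sketch below mimics, in plain Python, what the `pad_sequences(..., maxlen=150, padding='post', truncating='post')` call in the usage example does for a single sequence. `pad_or_truncate` is a hypothetical helper, and the pad value of 0 is an assumption based on the Keras default.

```python
MAX_LEN = 150  # maximum sequence length stated in the model details


def pad_or_truncate(token_ids, max_len=MAX_LEN, pad_value=0):
    """Keep the first max_len IDs, then right-pad with pad_value up to max_len.

    Hypothetical helper that mirrors post-truncation and post-padding
    for one sequence; pad_value=0 is assumed from the Keras default.
    """
    kept = list(token_ids)[:max_len]
    return kept + [pad_value] * (max_len - len(kept))


# A short sequence is padded on the right with zeros.
short = pad_or_truncate([5, 12, 7])
print(len(short), short[:5])  # 150 [5, 12, 7, 0, 0]

# A long sequence (200 IDs) loses everything after position 150.
long = pad_or_truncate(range(1, 201))
print(len(long), long[-1])  # 150 150
```

Truncating at the end keeps the start of the concatenated string, so with the query placed first, a very long image description is what gets cut.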
+---
+## ✅ Files Included in this Repository:
+| File                   | Description                         |
+|----------------------- |------------------------------------ |
+| `toxic_classifier.keras` | Saved Keras v3 model file |
+| `tokenizer.json`       | Keras tokenizer for preprocessing |
+| `config.json`          | Model configuration (architecture, vocab size, labels, etc.) |
+| `requirements.txt`     | Python dependencies |
+| `README.md`            | This model card |
+---
+## ✅ Example Usage (Python):
+```python
+from keras.saving import load_model
+from tensorflow.keras.preprocessing.text import tokenizer_from_json
+from tensorflow.keras.preprocessing.sequence import pad_sequences
+import numpy as np
+# Load tokenizer
+with open("tokenizer.json", "r", encoding="utf-8") as f:
+    tokenizer = tokenizer_from_json(f.read())
+# Load model
+model = load_model("toxic_classifier.keras")
+# Example inference
+query = "Example user query"
+image_desc = "Image describes a dangerous situation"
+text = query + " " + image_desc
+sequence = tokenizer.texts_to_sequences([text])
+padded = pad_sequences(sequence, maxlen=150, padding='post', truncating='post')
+prediction = model.predict(padded)
+predicted_label = np.argmax(prediction, axis=1)[0]
+print(f"Predicted Label ID: {predicted_label}")
+```
+---
+## 📚 Resources
+- [Cellula Internship Project Proposal](#)
+- [BLIP: Bootstrapped Language-Image Pre-training](https://github.com/salesforce/BLIP)
+- [Llama Guard](https://llama.meta.com/llama-guard/)
+- [DistilBERT](https://huggingface.co/distilbert-base-uncased)
+- [Streamlit](https://streamlit.io/)
+---
+## License
+MIT License
+---
+- **Author:** Yahya Muhammad Alnwsany
+- **Contact:** yahyaalnwsany39@gmail.com
+- **Portfolio:** https://nightprincey.github.io/Portfolio/