+---
+license: mit
+language:
+- ar
+- en
+library_name: transformers
+tags:
+- arabic
+- text-generation
+- detoxification
+- ensemble
+- bloom
+- nlp
+pipeline_tag: text-generation
+base_model:
+- bigscience/bloom-1b7
+datasets:
+- custom
+metrics:
+- accuracy
+model-index:
+- name: arabic-detox-ensemble
+  results:
+  - task:
+      type: text-generation
+      name: Text Detoxification
+    metrics:
+    - type: j-score
+      value: 0.7129
+      name: J-Score
+    - type: accuracy
+      value: 0.95
+      name: Style Transfer Accuracy
+    - type: similarity
+      value: 0.9995
+      name: Reference Similarity
+---
+<div align="center">
+# 🛡️ Arabic Text Detoxification Model
+### Ensemble Knowledge Distillation Approach
+[![Model](https://img.shields.io/badge/Model-Bloom--1b7-blue)](https://huggingface.co/bigscience/bloom-1b7)
+[![License](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)
+[![Language](https://img.shields.io/badge/Language-Arabic-red)](https://en.wikipedia.org/wiki/Arabic)
+[![HuggingFace](https://img.shields.io/badge/🤗-HuggingFace-yellow)](https://huggingface.co/ispromashka/arab-detoxification-isp)
+**Transform toxic Arabic text into polite, neutral alternatives while preserving meaning**
+[Usage](#usage) | [Methodology](#methodology) | [Dataset](#dataset) | [Results](#evaluation-results)
+</div>
+---
+## 📊 Architecture Overview
+<div align="center">
+<img src="https://huggingface.co/ispromashka/arab-detoxification-isp/resolve/main/architecture.png" alt="Model Architecture" width="100%">
+</div>
+---
+## 🎯 Model Description
+This model performs **text detoxification** for Arabic: it converts offensive, toxic, or aggressive text into neutral, polite alternatives while preserving the original meaning.
+### Key Features
+| Feature | Description |
+|---------|-------------|
+| 🏗️ **Architecture** | Bloom-1b7 (1.7B parameters) fine-tuned with ensemble distillation |
+| 🌍 **Language** | Arabic (primarily Modern Standard Arabic; dialect performance may vary) |
+| 📚 **Training** | Ensemble of 3 models → Knowledge distillation → Final model |
+| ⚡ **Hardware** | Trained on an NVIDIA A100 40GB; inference also works on consumer GPUs |
+| 📏 **Context** | Up to 2048 tokens |
+### Ensemble Components
+| Model | Parameters | Role | Source |
+|-------|------------|------|--------|
+| AraGPT2-Medium | 370M | Arabic Language Expert | AUB MIND Lab |
+| Bloom-560m | 560M | Multilingual Generalization | BigScience |
+| Bloom-1b7 | 1.7B | High Capacity Patterns | BigScience |
+---
+## 📈 Evaluation Results
+<div align="center">
+| Metric | Score | Description |
+|--------|-------|-------------|
+| **J-Score** | **0.7129** | Joint metric (geometric mean) |
+| **STA** | 0.9500 | Style Transfer Accuracy |
+| **SIM (ref)** | 0.9995 | Similarity to reference |
+| **Fluency** | 1.0000 | Grammatical correctness |
+</div>
+```
+J-Score    ████████████████████████████░░░░░░░░░░  0.71
+STA        ██████████████████████████████████████  0.95
+SIM (ref)  ██████████████████████████████████████  1.00
+Fluency    ██████████████████████████████████████  1.00
+```
+---
+## 🚀 Quick Start
+### Installation
+```bash
+pip install transformers torch
+```
+### Usage
+```python
+import torch
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+# Pick the device first: float16 is intended for GPU inference,
+# while float32 is the safer choice on CPU.
+model_name = "ispromashka/arab-detoxification-isp"
+device = "cuda" if torch.cuda.is_available() else "cpu"
+dtype = torch.float16 if device == "cuda" else torch.float32
+
+# Load model
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=dtype)
+model.to(device)
+model.eval()
+
+
+def detoxify(text: str) -> str:
+    """Convert toxic Arabic text to neutral form."""
+    # The prompt mirrors the training format: toxic line, then the polite marker.
+    prompt = f"سام: {text}\nمهذب:"
+    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+    outputs = model.generate(
+        **inputs,
+        max_new_tokens=50,
+        temperature=0.7,
+        top_p=0.9,
+        repetition_penalty=1.2,
+        do_sample=True,
+        pad_token_id=tokenizer.pad_token_id,
+    )
+    result = tokenizer.decode(outputs[0], skip_special_tokens=True)
+    # Keep only the first line generated after the "مهذب:" marker.
+    return result.split("مهذب:")[-1].strip().split("\n")[0]
+
+
+# Example (sampling is enabled, so the output can differ between runs)
+toxic_text = "أنت غبي جداً"
+neutral_text = detoxify(toxic_text)
+print(f"Input:  {toxic_text}")
+print(f"Output: {neutral_text}")
+```
+---
+## 💡 Examples
+| Category | Toxic Input (سام) | Neutral Output (مهذب) |
+|----------|-------------------|----------------------|
+| Insult | أنت غبي جداً | ربما تحتاج إلى مزيد من الوقت للفهم |
+| Command | اخرس يا أحمق | أرجو أن تكون أكثر هدوءاً |
+| Criticism | هذا العمل تافه وسخيف | العمل يمكن تطويره |
+| Threat | سأجعلك تندم | دعنا نحل هذا بسلام |
+| Contempt | أنت فاشل تماماً | النجاح يحتاج لمزيد من الجهد |
+| Mockery | يا له من غبي | ربما لم يفهم جيداً |
+| Blame | كل شيء خطؤك | نحتاج تحديد المسؤوليات |
+| Appearance | منظرك سيء | المظهر يمكن تحسينه |
+---
+## 🔬 Methodology
+### Training Pipeline
+```
+┌─────────────────────────────────────────────────────────────┐
+│                    STAGE 1: Base Models                     │
+├─────────────────────────────────────────────────────────────┤
+│  Train 3 specialized models independently on detox dataset  │
+│  • AraGPT2-Medium (25 epochs)                               │
+│  • Bloom-560m (25 epochs)                                   │
+│  • Bloom-1b7 (20 epochs)                                    │
+└─────────────────────────────────────────────────────────────┘
+                              ↓
+┌─────────────────────────────────────────────────────────────┐
+│                 STAGE 2: Ensemble Selection                 │
+├─────────────────────────────────────────────────────────────┤
+│  For each input, select best prediction using:              │
+│  Sentence-BERT (paraphrase-multilingual-mpnet-base-v2)      │
+│  Selection: argmax(cosine_similarity(pred, reference))      │
+└─────────────────────────────────────────────────────────────┘
+                              ↓
+┌─────────────────────────────────────────────────────────────┐
+│               STAGE 3: Knowledge Distillation               │
+├─────────────────────────────────────────────────────────────┤
+│  Fine-tune fresh Bloom-1b7 on:                              │
+│  • Original dataset (3000+ examples)                        │
+│  • Ensemble best predictions (1500+ examples)               │
+│  • Total: 4500+ training examples                           │
+└─────────────────────────────────────────────────────────────┘
+```
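The Stage 2 selection rule can be sketched in a few lines. This is a minimal illustration, not the author's training code: the helper names `cosine_similarity` and `select_best` are hypothetical, and the short toy vectors stand in for the Sentence-BERT embeddings that the pipeline would compute for each model's prediction and for the reference.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def select_best(candidate_embeddings: Sequence[Sequence[float]],
                reference_embedding: Sequence[float]) -> int:
    """Return the index of the candidate most similar to the reference."""
    scores = [cosine_similarity(c, reference_embedding)
              for c in candidate_embeddings]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy 3-dimensional vectors standing in for real sentence embeddings
# (one per ensemble member); real embeddings are much longer.
reference = [1.0, 0.0, 1.0]
candidates = [
    [0.0, 1.0, 0.0],    # orthogonal to the reference
    [1.0, 0.1, 0.9],    # nearly parallel to the reference
    [-1.0, 0.0, -1.0],  # opposite direction
]
best = select_best(candidates, reference)
print(f"Selected candidate index: {best}")
```

Because the rule compares against a reference, it only applies where a reference rewrite exists, which is consistent with using it to build extra training data for Stage 3.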
+### Evaluation Metrics
+**J-Score** (Primary metric):
+$$J = \sqrt[3]{STA \times SIM \times FL}$$
+Where:
+- **STA** (Style Transfer Accuracy): Measures toxicity removal success
+- **SIM** (Semantic Similarity): Content preservation (Sentence-BERT cosine similarity)
+- **FL** (Fluency): Ratio of grammatically valid outputs
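As a quick check of the formula, the sketch below evaluates the geometric mean on the aggregate scores from the Evaluation Results table. The function name `j_score` is hypothetical, and applying the formula to aggregate values (rather than per sample) is an assumption.

```python
def j_score(sta: float, sim: float, fl: float) -> float:
    """Geometric mean of the three component scores."""
    return (sta * sim * fl) ** (1.0 / 3.0)


# Aggregate values copied from the Evaluation Results table.
j = j_score(sta=0.95, sim=0.9995, fl=1.0)
print(f"J from aggregate scores: {j:.4f}")
```

With these inputs the geometric mean comes out at roughly 0.98, not the reported 0.7129. The reported J-Score was therefore presumably computed differently, for example per sample or with a similarity score other than SIM (ref); the card does not say which.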
+---
+## 📁 Dataset
+### Composition
+| Category | Examples | Description |
+|----------|----------|-------------|
+| Personal Insults | 30 | Direct personal attacks |
+| Aggressive Commands | 20 | Hostile imperatives |
+| Work Criticism | 25 | Professional negative feedback |
+| Threats | 15 | Intimidation and warnings |
+| Contempt | 15 | Expressions of superiority |
+| Blame | 15 | Accusatory statements |
+| Appearance Criticism | 15 | Physical/aesthetic insults |
+| Mockery | 15 | Sarcastic belittling |
+| **Total Unique** | **150** | — |
+| **Augmented (×20)** | **3,000+** | Training examples |
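The totals in the table can be verified with a short tally. The counts and the ×20 factor are taken from the table; the dictionary keys are just labels, and the augmentation procedure itself is not described in the card.

```python
# Unique examples per category, as listed in the Composition table.
category_counts = {
    "Personal Insults": 30,
    "Aggressive Commands": 20,
    "Work Criticism": 25,
    "Threats": 15,
    "Contempt": 15,
    "Blame": 15,
    "Appearance Criticism": 15,
    "Mockery": 15,
}

AUGMENTATION_FACTOR = 20  # the "×20" from the table

unique_total = sum(category_counts.values())
augmented_total = unique_total * AUGMENTATION_FACTOR
print(f"Unique examples:    {unique_total}")
print(f"Augmented examples: {augmented_total}")
```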
+### Data Format
+```
+سام: {toxic_text}
+مهذب: {neutral_text}<EOS>
+```
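A minimal sketch of helpers that produce this format is shown below. The function names are hypothetical, and the `"</s>"` end-of-sequence string in the example is an assumption; in practice the value should come from `tokenizer.eos_token`.

```python
TOXIC_PREFIX = "سام: "    # "toxic"
POLITE_MARKER = "مهذب:"   # "polite"


def format_example(toxic_text: str, neutral_text: str, eos_token: str) -> str:
    """Build one training string: toxic line, polite line, then EOS."""
    return f"{TOXIC_PREFIX}{toxic_text}\n{POLITE_MARKER} {neutral_text}{eos_token}"


def format_prompt(toxic_text: str) -> str:
    """Build the inference prompt, ending at the polite marker."""
    return f"{TOXIC_PREFIX}{toxic_text}\n{POLITE_MARKER}"


# "</s>" is assumed here for illustration only.
example = format_example("أنت غبي جداً", "ربما تحتاج إلى مزيد من الوقت للفهم", "</s>")
prompt = format_prompt("أنت غبي جداً")
print(example)
print(prompt)
```

The inference prompt is a prefix of the training string, so generation continues exactly where the polite rewrite would start.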
+---
+## ⚙️ Training Configuration
+| Parameter | Base Models | Final Model |
+|-----------|-------------|-------------|
+| Hardware | NVIDIA A100 40GB | NVIDIA A100 40GB |
+| Precision | BF16 | BF16 |
+| Batch Size | 8-16 | 8 |
+| Learning Rate | 2e-5 - 3e-5 | 1.5e-5 |
+| Epochs | 20-25 | 15 |
+| Optimizer | AdamW | AdamW |
+| Scheduler | Cosine | Cosine |
+| Warmup | 10% | 10% |
+| Total Time | ~85 min | ~30 min |
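To make the scheduler row concrete, here is a sketch of a cosine schedule with 10% linear warmup, using the final model's peak learning rate of 1.5e-5. The function `lr_at_step` and the step count are hypothetical, and the exact scheduler implementation used for training is not stated in the card; this follows the common linear-warmup, half-cosine-decay shape.

```python
import math


def lr_at_step(step: int, total_steps: int,
               peak_lr: float = 1.5e-5, warmup_ratio: float = 0.10) -> float:
    """Learning rate at a given optimizer step: linear warmup, cosine decay."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Half-cosine decay from the peak down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


total = 1000  # hypothetical number of optimizer steps
for step in (0, 50, 100, 550, 1000):
    print(f"step {step:4d}: lr = {lr_at_step(step, total):.2e}")
```

The rate rises linearly over the first 10% of steps, peaks at 1.5e-5, and then decays smoothly to zero by the final step.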
+---
+## ⚠️ Limitations
+- **Language Coverage**: Optimized for Modern Standard Arabic; dialectal performance may vary
+- **Text Length**: Works best on short to medium-length texts (< 100 tokens)
+- **Domain**: Trained on general toxicity; domain-specific content may need fine-tuning
+- **Context**: Does not consider conversation history
+---
+## 🔮 Future Work
+- Expand to Arabic dialects (Egyptian, Gulf, Levantine)
+- Add toxicity detection classifier
+- Multi-turn conversation support
+- Larger model variants (3B, 7B)
+- Arabic-English code-switching support
+---
+## 📖 Citation
+```bibtex
+@misc{arabicdetox2024,
+  author = {ispromashka},
+  title = {Arabic Text Detoxification: Ensemble Knowledge Distillation Approach},
+  year = {2024},
+  publisher = {HuggingFace},
+  url = {https://huggingface.co/ispromashka/arab-detoxification-isp}
+}
+```
+---
+## 🙏 Acknowledgments
+- [BigScience](https://bigscience.huggingface.co/) for BLOOM models
+- [AUB MIND Lab](https://mind.aub.edu.lb/) for AraGPT2
+- [SBERT](https://www.sbert.net/) for multilingual embeddings
+- [Hugging Face](https://huggingface.co/) for model hosting and Transformers library
+---
+## 📄 License
+This project is licensed under the MIT License; see the [LICENSE](LICENSE) file for details.
+---
+<div align="center">
+**Made with ❤️ for the Arabic NLP community**
+[GitHub](https://github.com/ispromashka)
+</div>