---
license: mit
language:
- ar
- en
library_name: transformers
tags:
- arabic
- text-generation
- detoxification
- ensemble
- bloom
pipeline_tag: text-generation
model-index:
- name: arab-detoxification-isp
results:
- task:
type: text-generation
name: Text Generation
dataset:
type: custom
name: Arabic Detox Dataset
metrics:
- type: accuracy
value: 0.95
name: STA
---
# Arabic Text Detoxification Model
### Ensemble Knowledge Distillation Approach
[Base model: bloom-1b7](https://huggingface.co/bigscience/bloom-1b7) | [License: MIT](https://opensource.org/licenses/MIT) | [Language: Arabic](https://en.wikipedia.org/wiki/Arabic) | [Model page](https://huggingface.co/ispromashka/arab-detoxification-isp)
**Transform toxic Arabic text into polite, neutral alternatives while preserving meaning**
[Model Demo](#quick-start) | [Methodology](#methodology) | [Dataset](https://huggingface.co/datasets/ispromashka/arabic-detox-dataset) | [Results](#evaluation-results)
---
## Model Description
This model performs **text detoxification** for Arabic: it converts offensive, toxic, or aggressive text into neutral, polite alternatives while preserving the original semantic meaning.
### Key Features
| Feature | Description |
|---------|-------------|
| **Architecture** | Bloom-1b7 (1.7B parameters) fine-tuned with ensemble distillation |
| **Language** | Arabic (Modern Standard Arabic + dialects) |
| **Training** | Ensemble of 3 models → knowledge distillation → final model |
| **Hardware** | Optimized for NVIDIA A100 40GB; runs on consumer GPUs |
| **Context** | Up to 2,048 tokens |
### Ensemble Components
| Model | Parameters | Role | Source |
|-------|------------|------|--------|
| AraGPT2-Medium | 370M | Arabic Language Expert | AUB MIND Lab |
| Bloom-560m | 560M | Multilingual Generalization | BigScience |
| Bloom-1b7 | 1.7B | High Capacity Patterns | BigScience |
---
## Evaluation Results
| Metric | Score | Description |
|--------|-------|-------------|
| **J-Score** | **0.7129** | Joint metric (geometric mean) |
| **STA** | 0.9500 | Style Transfer Accuracy |
| **SIM (ref)** | 0.9995 | Similarity to reference |
| **Fluency** | 1.0000 | Grammatical correctness |
```
J-Score   ███████████████████████████░░░░░░░░░░░  0.71
STA       ████████████████████████████████████░░  0.95
SIM (ref) ██████████████████████████████████████  1.00
Fluency   ██████████████████████████████████████  1.00
```
---
## Quick Start
### Installation
```bash
pip install transformers torch
```
### Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load model
model_name = "ispromashka/arab-detoxification-isp"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)
model.to("cuda")  # or "cpu"

def detoxify(text: str) -> str:
    """Convert toxic Arabic text to neutral form."""
    # Prompt template: "سام:" (toxic) on the first line, "مهذب:" (polite) on the second
    prompt = f"سام: {text}\nمهذب:"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        max_new_tokens=50,
        temperature=0.7,
        top_p=0.9,
        repetition_penalty=1.2,
        do_sample=True,
        pad_token_id=tokenizer.pad_token_id,
    )
    result = tokenizer.decode(outputs[0], skip_special_tokens=True)
    # Keep only the first line of the completion after the "مهذب:" label
    return result.split("مهذب:")[-1].strip().split("\n")[0]

# Example
toxic_text = "أنت غبي جداً"  # "You are very stupid"
neutral_text = detoxify(toxic_text)
print(f"Input: {toxic_text}")
print(f"Output: {neutral_text}")
```
---
## Examples
| Category | Toxic Input (سام) | Neutral Output (مهذب) |
|----------|-------------------|----------------------|
| Insult | أنت غبي جداً (*"You are very stupid"*) | ربما تحتاج إلى مزيد من الوقت للفهم (*"Perhaps you need more time to understand"*) |
| Command | اخرس يا أحمق (*"Shut up, you idiot"*) | أرجو أن تكون أكثر هدوءاً (*"Please be calmer"*) |
| Criticism | هذا العمل تافه وسخيف (*"This work is trivial and silly"*) | العمل يمكن تطويره (*"The work can be improved"*) |
| Threat | سأجعلك تندم (*"I will make you regret it"*) | دعنا نحل هذا بسلام (*"Let's settle this peacefully"*) |
| Contempt | أنت فاشل تماماً (*"You are a total failure"*) | النجاح يحتاج لمزيد من الجهد (*"Success takes more effort"*) |
| Mockery | يا له من غبي (*"What an idiot"*) | ربما لم يفهم جيداً (*"Perhaps he did not understand well"*) |
| Blame | كل شيء خطؤك (*"Everything is your fault"*) | نحتاج تحديد المسؤوليات (*"We need to clarify responsibilities"*) |
| Appearance | منظرك سيء (*"You look bad"*) | المظهر يمكن تحسينه (*"Appearance can be improved"*) |
---
## Methodology
### Training Pipeline
```
┌───────────────────────────────────────────────────────────────┐
│ STAGE 1: Base Models                                          │
├───────────────────────────────────────────────────────────────┤
│ Train 3 specialized models independently on the detox dataset │
│   • AraGPT2-Medium (25 epochs)                                │
│   • Bloom-560m (25 epochs)                                    │
│   • Bloom-1b7 (20 epochs)                                     │
└───────────────────────────────────────────────────────────────┘
                                ↓
┌───────────────────────────────────────────────────────────────┐
│ STAGE 2: Ensemble Selection                                   │
├───────────────────────────────────────────────────────────────┤
│ For each input, select the best prediction using:             │
│   Sentence-BERT (paraphrase-multilingual-mpnet-base-v2)       │
│   Selection: argmax(cosine_similarity(pred, reference))       │
└───────────────────────────────────────────────────────────────┘
                                ↓
┌───────────────────────────────────────────────────────────────┐
│ STAGE 3: Knowledge Distillation                               │
├───────────────────────────────────────────────────────────────┤
│ Fine-tune a fresh Bloom-1b7 on:                               │
│   • Original dataset (3,000+ examples)                        │
│   • Ensemble best predictions (1,500+ examples)               │
│   • Total: 4,500+ training examples                           │
└───────────────────────────────────────────────────────────────┘
```
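Stage 2's selection rule is simple to sketch. The snippet below is a minimal stand-in, not the released training code: `embed` is any caller-supplied string-to-vector function (the actual pipeline uses sentence-transformers' paraphrase-multilingual-mpnet-base-v2 embeddings), and `select_best` returns the candidate with the highest cosine similarity to the reference:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def select_best(predictions, reference, embed):
    """Pick the prediction whose embedding is closest to the reference's.

    `embed` maps a string to a vector; in the real pipeline it would be
    a Sentence-BERT encoder.
    """
    ref_vec = embed(reference)
    scores = [cosine_similarity(embed(p), ref_vec) for p in predictions]
    return predictions[max(range(len(scores)), key=scores.__getitem__)]
```

With three base models this runs once per input, so the ensemble costs three forward passes plus one embedding comparison per candidate.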
### Evaluation Metrics
**J-Score** (Primary metric):
$$J = \sqrt[3]{STA \times SIM \times FL}$$
Where:
- **STA** (Style Transfer Accuracy): Measures toxicity removal success
- **SIM** (Semantic Similarity): Content preservation (Sentence-BERT cosine similarity)
- **FL** (Fluency): Ratio of grammatically valid outputs
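The formula above is a one-line helper. Note that plugging the corpus-level averages from the results table into it gives a higher value than the reported J of 0.7129, which suggests the official score is aggregated per sample; treat this as the formula itself, not a reproduction of the table:

```python
def j_score(sta: float, sim: float, fl: float) -> float:
    """J = cube root (geometric mean) of STA * SIM * FL."""
    return (sta * sim * fl) ** (1.0 / 3.0)
```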
---
## Dataset
Dataset used for training and evaluation:
[**ispromashka/arabic-detox-dataset**](https://huggingface.co/datasets/ispromashka/arabic-detox-dataset)
### Composition
| Category | Examples | Description |
|----------|----------|-------------|
| Personal Insults | 30 | Direct personal attacks |
| Aggressive Commands | 20 | Hostile imperatives |
| Work Criticism | 25 | Professional negative feedback |
| Threats | 15 | Intimidation and warnings |
| Contempt | 15 | Expressions of superiority |
| Blame | 15 | Accusatory statements |
| Appearance Criticism | 15 | Physical/aesthetic insults |
| Mockery | 15 | Sarcastic belittling |
| **Total Unique** | **150** | |
| **Augmented (×20)** | **3,000+** | Training examples |
### Data Format
```
سام: {toxic_text}
مهذب: {neutral_text}
```
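Prompts in this format can be rendered with a small helper (a convenience sketch; `format_example` is not part of the released code). Leaving `neutral` empty yields the inference prompt from Quick Start, ending at the "مهذب:" label:

```python
def format_example(toxic: str, neutral: str = "") -> str:
    """Render a training/inference example in the سام/مهذب template.

    With `neutral` empty, the trailing space after the label is stripped
    so the model generates directly after "مهذب:".
    """
    return f"سام: {toxic}\nمهذب: {neutral}".rstrip()
```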
---
## Training Configuration
| Parameter | Base Models | Final Model |
|-----------|-------------|-------------|
| Hardware | NVIDIA A100 40GB | NVIDIA A100 40GB |
| Precision | BF16 | BF16 |
| Batch Size | 8–16 | 8 |
| Learning Rate | 2e-5 to 3e-5 | 1.5e-5 |
| Epochs | 20–25 | 15 |
| Optimizer | AdamW | AdamW |
| Scheduler | Cosine | Cosine |
| Warmup | 10% | 10% |
| Total Time | ~85 min | ~30 min |
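For reference, the final-model column maps onto `transformers.TrainingArguments` fields roughly as below. This is a sketch inferred from the table, not the original training script; the keys follow the standard `TrainingArguments` API, and everything the table omits (output paths, dataset wiring) is left out:

```python
# Final-model hyperparameters, keyed by transformers.TrainingArguments
# field names (inferred from the configuration table above).
final_training_args = {
    "per_device_train_batch_size": 8,
    "learning_rate": 1.5e-5,
    "num_train_epochs": 15,
    "bf16": True,                   # BF16 precision (A100)
    "optim": "adamw_torch",         # AdamW optimizer
    "lr_scheduler_type": "cosine",  # cosine schedule
    "warmup_ratio": 0.1,            # 10% warmup
}
```

These can then be passed through directly, e.g. `TrainingArguments(output_dir="out", **final_training_args)`.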
---
## Limitations
- **Language Coverage**: Optimized for Modern Standard Arabic; dialectal performance may vary
- **Text Length**: Best for short to medium texts (< 100 tokens)
- **Domain**: Trained on general toxicity; domain-specific content may need fine-tuning
- **Context**: Does not consider conversation history
---
## Citation
```bibtex
@misc{arabicdetox2024,
author = {ispromashka},
title = {Arabic Text Detoxification: Ensemble Knowledge Distillation Approach},
year = {2024},
publisher = {HuggingFace},
url = {https://huggingface.co/ispromashka/arab-detoxification-isp}
}
```
---
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
---
**Made with ❤️ for the Arabic NLP community**
[GitHub](https://github.com/ispromashka)