---
license: mit
language:
- ar
- en
library_name: transformers
tags:
- arabic
- text-generation
- detoxification
- ensemble
- bloom
pipeline_tag: text-generation
model-index:
- name: arab-detoxification-isp
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      type: custom
      name: Arabic Detox Dataset
    metrics:
    - type: accuracy
      value: 0.95
      name: STA
---

<div align="center">

# 🛡️ Arabic Text Detoxification Model

### Ensemble Knowledge Distillation Approach

[Base model: Bloom-1b7](https://huggingface.co/bigscience/bloom-1b7) | [License: MIT](https://opensource.org/licenses/MIT) | [Language: Arabic](https://en.wikipedia.org/wiki/Arabic) | [Model on Hugging Face](https://huggingface.co/ispromashka/arab-detoxification-isp)

**Transform toxic Arabic text into polite, neutral alternatives while preserving meaning**

[Model Demo](#-quick-start) | [Architecture](#-architecture-overview) | [Dataset](https://huggingface.co/datasets/ispromashka/arabic-detox-dataset) | [Results](#-evaluation-results)

</div>

---

## 📐 Architecture Overview

<div align="center">
<img src="https://huggingface.co/ispromashka/arab-detoxification-isp/resolve/main/architecture.png" alt="Model Architecture" width="100%">
</div>

---

## 🎯 Model Description

This model performs **text detoxification** for Arabic: it converts offensive, toxic, or aggressive text into neutral, polite alternatives while preserving the original meaning.

### Key Features

| Feature | Description |
|---------|-------------|
| 🏗️ **Architecture** | Bloom-1b7 (1.7B parameters) fine-tuned with ensemble distillation |
| 🌍 **Language** | Arabic (Modern Standard Arabic + dialects) |
| 🔄 **Training** | Ensemble of 3 models → Knowledge distillation → Final model |
| ⚡ **Hardware** | Trained on an NVIDIA A100 40GB; also runs on consumer GPUs |
| 📏 **Context** | Up to 2048 tokens |

### Ensemble Components

| Model | Parameters | Role | Source |
|-------|------------|------|--------|
| AraGPT2-Medium | 370M | Arabic Language Expert | AUB MIND Lab |
| Bloom-560m | 560M | Multilingual Generalization | BigScience |
| Bloom-1b7 | 1.7B | High Capacity Patterns | BigScience |

---

## 📊 Evaluation Results

<div align="center">

| Metric | Score | Description |
|--------|-------|-------------|
| **J-Score** | **0.7129** | Joint metric (geometric mean) |
| **STA** | 0.9500 | Style Transfer Accuracy |
| **SIM (ref)** | 0.9995 | Similarity to reference |
| **Fluency** | 1.0000 | Grammatical correctness |

</div>

```
J-Score    ████████████████████████████░░░░░░░░░░░░  0.71
STA        ██████████████████████████████████████░░  0.95
SIM (ref)  ████████████████████████████████████████  1.00
Fluency    ████████████████████████████████████████  1.00
```

---

## 🚀 Quick Start

### Installation

```bash
pip install transformers torch
```

### Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load the model; use half precision only when a GPU is available
model_name = "ispromashka/arab-detoxification-isp"
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=dtype)
model.to(device)
model.eval()

def detoxify(text: str) -> str:
    """Convert toxic Arabic text to neutral form."""
    # Prompt markers follow the data format: "سام" (toxic) and "مهذب" (polite)
    prompt = f"سام: {text}\nمهذب:"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=50,
            temperature=0.7,
            top_p=0.9,
            repetition_penalty=1.2,
            do_sample=True,
            pad_token_id=tokenizer.pad_token_id,
        )

    # Keep only the first line generated after the output marker
    result = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return result.split("مهذب:")[-1].strip().split("\n")[0]

# Example
toxic_text = "أنت غبي جداً"
neutral_text = detoxify(toxic_text)
print(f"Input: {toxic_text}")
print(f"Output: {neutral_text}")
```

---

## 💡 Examples

| Category | Toxic Input (سام) | Neutral Output (مهذب) |
|----------|-------------------|----------------------|
| Insult | أنت غبي جداً | ربما تحتاج إلى مزيد من الوقت للفهم |
| Command | اخرس يا أحمق | أرجو أن تكون أكثر هدوءاً |
| Criticism | هذا العمل تافه وسخيف | العمل يمكن تطويره |
| Threat | سأجعلك تندم | دعنا نحل هذا بسلام |
| Contempt | أنت فاشل تماماً | النجاح يحتاج لمزيد من الجهد |
| Mockery | يا لك من غبي | ربما لم يفهم جيداً |
| Blame | كل شيء خطؤك | نحتاج تحديد المسؤوليات |
| Appearance | منظرك سيء | المظهر يمكن تحسينه |

---

## 🔬 Methodology

### Training Pipeline

```
┌─────────────────────────────────────────────────────────────┐
│                    STAGE 1: Base Models                     │
├─────────────────────────────────────────────────────────────┤
│  Train 3 specialized models independently on detox dataset  │
│    • AraGPT2-Medium (25 epochs)                             │
│    • Bloom-560m (25 epochs)                                 │
│    • Bloom-1b7 (20 epochs)                                  │
└─────────────────────────────────────────────────────────────┘
                               ↓
┌─────────────────────────────────────────────────────────────┐
│                 STAGE 2: Ensemble Selection                 │
├─────────────────────────────────────────────────────────────┤
│  For each input, select best prediction using:              │
│  Sentence-BERT (paraphrase-multilingual-mpnet-base-v2)      │
│  Selection: argmax(cosine_similarity(pred, reference))      │
└─────────────────────────────────────────────────────────────┘
                               ↓
┌─────────────────────────────────────────────────────────────┐
│               STAGE 3: Knowledge Distillation               │
├─────────────────────────────────────────────────────────────┤
│  Fine-tune fresh Bloom-1b7 on:                              │
│    • Original dataset (3000+ examples)                      │
│    • Ensemble best predictions (1500+ examples)             │
│    • Total: 4500+ training examples                         │
└─────────────────────────────────────────────────────────────┘
```
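
The selection rule in Stage 2 can be sketched as follows. This is a minimal illustration, not the training code: it assumes each base model has already produced one candidate per input, and that an `embed` callable maps a sentence to a vector. In the pipeline above that role is played by Sentence-BERT; here `cosine_similarity`, `select_best`, and `toy_embed` are hypothetical names, and the letter-count embedding is only a stand-in so the example runs without downloading a model.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]

def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def select_best(candidates: Sequence[str], reference: str,
                embed: Callable[[str], Vector]) -> str:
    """Return the candidate whose embedding is closest to the reference."""
    ref_vec = embed(reference)
    return max(candidates, key=lambda c: cosine_similarity(embed(c), ref_vec))

def toy_embed(text: str) -> Vector:
    """Stand-in embedding: letter counts over a-z (NOT Sentence-BERT)."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return counts

# One candidate per base model (hypothetical outputs), scored against a reference
candidates = ["please be quiet", "go away now", "please be calmer"]
best = select_best(candidates, "please be calm", toy_embed)
print(best)
```

With a real sentence encoder, `embed` could wrap that model's encoding call; the argmax logic stays the same.
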

### Evaluation Metrics

**J-Score** (Primary metric):

$$J = \sqrt[3]{\mathrm{STA} \times \mathrm{SIM} \times \mathrm{FL}}$$

Where:
- **STA** (Style Transfer Accuracy): Measures toxicity removal success
- **SIM** (Semantic Similarity): Content preservation (Sentence-BERT cosine similarity)
- **FL** (Fluency): Ratio of grammatically valid outputs
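
As a quick check of the formula, the sketch below evaluates the geometric mean on the aggregate scores from the results table. The function name `j_score` is illustrative. With the table's STA, SIM (ref), and Fluency values it gives roughly 0.98 rather than the reported 0.7129, so the reported J-Score was presumably aggregated differently (for example per example, or with a different similarity term); the card does not say which.

```python
def j_score(sta: float, sim: float, fl: float) -> float:
    """Geometric mean of the three component scores."""
    return (sta * sim * fl) ** (1.0 / 3.0)

# Aggregate STA, SIM (ref), and Fluency taken from the results table
j = j_score(sta=0.95, sim=0.9995, fl=1.0)
print(f"{j:.4f}")
```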

---

## 📁 Dataset

Dataset used for training and evaluation:
[**ispromashka/arabic-detox-dataset**](https://huggingface.co/datasets/ispromashka/arabic-detox-dataset)

### Composition

| Category | Examples | Description |
|----------|----------|-------------|
| Personal Insults | 30 | Direct personal attacks |
| Aggressive Commands | 20 | Hostile imperatives |
| Work Criticism | 25 | Professional negative feedback |
| Threats | 15 | Intimidation and warnings |
| Contempt | 15 | Expressions of superiority |
| Blame | 15 | Accusatory statements |
| Appearance Criticism | 15 | Physical/aesthetic insults |
| Mockery | 15 | Sarcastic belittling |
| **Total Unique** | **150** | - |
| **Augmented (×20)** | **3,000+** | Training examples |

### Data Format

```
سام: {toxic_text}
مهذب: {neutral_text}<EOS>
```
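
A minimal sketch of how a pair could be serialized into this format, and how the neutral part could be read back from generated text. The helper names `format_example` and `extract_neutral` are hypothetical, and the default `</s>` end-of-sequence string is an assumption for illustration; in practice the EOS string would come from the tokenizer in use.

```python
TOXIC_TAG = "سام"      # "toxic" marker from the data format
NEUTRAL_TAG = "مهذب"   # "polite" marker from the data format

def format_example(toxic: str, neutral: str, eos: str = "</s>") -> str:
    """Serialize one training pair; `eos` is an assumed end-of-sequence string."""
    return f"{TOXIC_TAG}: {toxic}\n{NEUTRAL_TAG}: {neutral}{eos}"

def extract_neutral(generated: str) -> str:
    """Return the first line that follows the last neutral marker."""
    tail = generated.split(f"{NEUTRAL_TAG}:")[-1].strip()
    return tail.split("\n")[0] if tail else ""

# Placeholder strings stand in for real toxic/neutral sentences
sample = format_example("toxic text", "neutral text")
decoded = extract_neutral("سام: toxic text\nمهذب: neutral text\nسام: next")
print(sample)
print(decoded)
```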

---

## ⚙️ Training Configuration

| Parameter | Base Models | Final Model |
|-----------|-------------|-------------|
| Hardware | NVIDIA A100 40GB | NVIDIA A100 40GB |
| Precision | BF16 | BF16 |
| Batch Size | 8–16 | 8 |
| Learning Rate | 2e-5 – 3e-5 | 1.5e-5 |
| Epochs | 20–25 | 15 |
| Optimizer | AdamW | AdamW |
| Scheduler | Cosine | Cosine |
| Warmup | 10% | 10% |
| Total Time | ~85 min | ~30 min |

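
The scheduler and warmup rows can be read as a learning-rate curve: a ramp over the first 10% of steps, then cosine decay. The sketch below assumes linear warmup and decay to zero, which the table does not state; `lr_at_step` and the 1,000-step run length are illustrative, and only the peak value 1.5e-5 comes from the table.

```python
import math

def lr_at_step(step: int, total_steps: int, peak_lr: float,
               warmup_frac: float = 0.10) -> float:
    """Linear warmup to `peak_lr`, then cosine decay to zero (assumed shape)."""
    warmup_steps = max(1, round(total_steps * warmup_frac))
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Final-model peak learning rate from the table; the step count is hypothetical
for step in (0, 50, 100, 550, 1000):
    print(step, f"{lr_at_step(step, 1000, 1.5e-5):.2e}")
```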

---

## ⚠️ Limitations

- **Language Coverage**: Optimized for Modern Standard Arabic; dialectal performance may vary
- **Text Length**: Works best on short to medium texts (under 100 tokens)
- **Domain**: Trained on general toxicity; domain-specific content may need further fine-tuning
- **Context**: Does not consider conversation history

---

## 📝 Citation

```bibtex
@misc{arabicdetox2024,
  author = {ispromashka},
  title = {Arabic Text Detoxification: Ensemble Knowledge Distillation Approach},
  year = {2024},
  publisher = {HuggingFace},
  url = {https://huggingface.co/ispromashka/arab-detoxification-isp}
}
```

---

## 📄 License

This project is licensed under the MIT License; see the [LICENSE](LICENSE) file for details.

---

<div align="center">

**Made with ❤️ for the Arabic NLP community**

[GitHub](https://github.com/ispromashka)

</div>