---
license: mit
language:
- ar
- en
library_name: transformers
tags:
- arabic
- text-generation
- detoxification
- ensemble
- bloom
pipeline_tag: text-generation
model-index:
- name: arab-detoxification-isp
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      type: custom
      name: Arabic Detox Dataset
    metrics:
    - type: accuracy
      value: 0.95
      name: STA
---
<div align="center">
# 🛡️ Arabic Text Detoxification Model
### Ensemble Knowledge Distillation Approach
[![Model](https://img.shields.io/badge/Model-Bloom--1b7-blue)](https://huggingface.co/bigscience/bloom-1b7)
[![License](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)
[![Language](https://img.shields.io/badge/Language-Arabic-red)](https://en.wikipedia.org/wiki/Arabic)
[![HuggingFace](https://img.shields.io/badge/🤗-HuggingFace-yellow)](https://huggingface.co/ispromashka/arab-detoxification-isp)
**Transform toxic Arabic text into polite, neutral alternatives while preserving meaning**
[Model Demo](#-quick-start) | [Architecture](#-architecture-overview) | [Dataset](https://huggingface.co/datasets/ispromashka/arabic-detox-dataset) | [Results](#-evaluation-results)
</div>
---
## 📊 Architecture Overview
<div align="center">
<img src="https://huggingface.co/ispromashka/arab-detoxification-isp/resolve/main/architecture.png" alt="Model Architecture" width="100%">
</div>
---
## 🎯 Model Description
This model performs **text detoxification** for Arabic: it rewrites offensive, toxic, or aggressive text as neutral, polite alternatives while preserving the original meaning.
### Key Features
| Feature | Description |
|---------|-------------|
| 🏗️ **Architecture** | Bloom-1b7 (1.7B parameters) fine-tuned with ensemble distillation |
| 🌍 **Language** | Arabic (Modern Standard Arabic + dialects) |
| 📚 **Training** | Ensemble of 3 models → Knowledge distillation → Final model |
| ⚡ **Hardware** | Trained on NVIDIA A100 40GB; inference also works on consumer GPUs |
| 📏 **Context** | Up to 2048 tokens |
### Ensemble Components
| Model | Parameters | Role | Source |
|-------|------------|------|--------|
| AraGPT2-Medium | 370M | Arabic Language Expert | AUB MIND Lab |
| Bloom-560m | 560M | Multilingual Generalization | BigScience |
| Bloom-1b7 | 1.7B | High Capacity Patterns | BigScience |
---
## 📈 Evaluation Results
<div align="center">
| Metric | Score | Description |
|--------|-------|-------------|
| **J-Score** | **0.7129** | Joint metric (geometric mean) |
| **STA** | 0.9500 | Style Transfer Accuracy |
| **SIM (ref)** | 0.9995 | Similarity to reference |
| **Fluency** | 1.0000 | Grammatical correctness |
</div>
```
J-Score   ████████████████████████████░░░░░░░░░░░░ 0.71
STA       ██████████████████████████████████████░░ 0.95
SIM (ref) ████████████████████████████████████████ 1.00
Fluency   ████████████████████████████████████████ 1.00
```
---
## 🚀 Quick Start
### Installation
```bash
pip install transformers torch
```
### Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load the model: float16 on GPU, float32 on CPU
model_name = "ispromashka/arab-detoxification-isp"
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=dtype)
model.to(device)
model.eval()


def detoxify(text: str) -> str:
    """Convert toxic Arabic text to neutral form."""
    # Prompt template: a "سام" (toxic) line followed by an open "مهذب" (polite) line
    prompt = f"سام: {text}\nمهذب:"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=50,
            temperature=0.7,
            top_p=0.9,
            repetition_penalty=1.2,
            do_sample=True,
            pad_token_id=tokenizer.pad_token_id,
        )
    result = tokenizer.decode(outputs[0], skip_special_tokens=True)
    # Keep only the first line generated after the "مهذب:" marker
    return result.split("مهذب:")[-1].strip().split("\n")[0]


# Example
toxic_text = "أنت غبي جداً"
neutral_text = detoxify(toxic_text)
print(f"Input: {toxic_text}")
print(f"Output: {neutral_text}")
```
---
## 💡 Examples
| Category | Toxic Input (سام) | Neutral Output (مهذب) |
|----------|-------------------|----------------------|
| Insult | أنت غبي جداً | ربما تحتاج إلى مزيد من الوقت للفهم |
| Command | اخرس يا أحمق | أرجو أن تكون أكثر هدوءاً |
| Criticism | هذا العمل تافه وسخيف | العمل يمكن تطويره |
| Threat | سأجعلك تندم | دعنا نحل هذا بسلام |
| Contempt | أنت فاشل تماماً | النجاح يحتاج لمزيد من الجهد |
| Mockery | يا له من غبي | ربما لم يفهم جيداً |
| Blame | كل شيء خطؤك | نحتاج تحديد المسؤوليات |
| Appearance | منظرك سيء | المظهر يمكن تحسينه |
---
## 🔬 Methodology
### Training Pipeline
```
┌─────────────────────────────────────────────────────────────┐
│                    STAGE 1: Base Models                     │
├─────────────────────────────────────────────────────────────┤
│  Train 3 specialized models independently on detox dataset  │
│  • AraGPT2-Medium (25 epochs)                               │
│  • Bloom-560m (25 epochs)                                   │
│  • Bloom-1b7 (20 epochs)                                    │
└─────────────────────────────────────────────────────────────┘
                               ↓
┌─────────────────────────────────────────────────────────────┐
│                 STAGE 2: Ensemble Selection                 │
├─────────────────────────────────────────────────────────────┤
│  For each input, select best prediction using:              │
│  Sentence-BERT (paraphrase-multilingual-mpnet-base-v2)      │
│  Selection: argmax(cosine_similarity(pred, reference))      │
└─────────────────────────────────────────────────────────────┘
                               ↓
┌─────────────────────────────────────────────────────────────┐
│               STAGE 3: Knowledge Distillation               │
├─────────────────────────────────────────────────────────────┤
│  Fine-tune fresh Bloom-1b7 on:                              │
│  • Original dataset (3000+ examples)                        │
│  • Ensemble best predictions (1500+ examples)               │
│  • Total: 4500+ training examples                           │
└─────────────────────────────────────────────────────────────┘
```
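The selection rule in Stage 2 can be sketched as below. This is a minimal illustration, not the training code used for this model: `select_best`, `cosine_similarity`, and `toy_embed` are hypothetical names, and `toy_embed` is a character-count stand-in for the Sentence-BERT encoder so that the snippet runs without downloading a model. In the real pipeline the `embed` argument would wrap `paraphrase-multilingual-mpnet-base-v2`.

```python
import math
from collections import Counter
from typing import Callable, Dict, List


def toy_embed(text: str) -> Dict[str, float]:
    """Stand-in embedding (character counts); NOT Sentence-BERT."""
    return dict(Counter(text.replace(" ", "")))


def cosine_similarity(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_best(predictions: List[str], reference: str,
                embed: Callable[[str], Dict[str, float]] = toy_embed) -> str:
    """Return the prediction most similar to the reference (argmax)."""
    ref_vec = embed(reference)
    return max(predictions, key=lambda p: cosine_similarity(embed(p), ref_vec))


# One candidate per ensemble member for the same toxic input
candidates = ["العمل سيء", "العمل يمكن تطويره", "يمكن تطوير هذا"]
best = select_best(candidates, reference="العمل يمكن تطويره")
print(best)
```

Because the rule compares each prediction with a reference, it applies only to inputs that already have a gold neutral rewrite, which is the case for the training pairs.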
### Evaluation Metrics
**J-Score** (Primary metric):
$$J = \sqrt[3]{STA \times SIM \times FL}$$
Where:
- **STA** (Style Transfer Accuracy): Measures toxicity removal success
- **SIM** (Semantic Similarity): Content preservation (Sentence-BERT cosine similarity)
- **FL** (Fluency): Ratio of grammatically valid outputs
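
As a worked example, the formula can be applied to the aggregate scores from the results table. One assumption to keep in mind: plugging in the corpus-level STA, SIM (ref), and Fluency gives about 0.983, not the reported J-Score of 0.7129. The reported value was presumably computed differently (for instance per example, or with a different similarity score); this card does not say which.

```python
def j_score(sta: float, sim: float, fl: float) -> float:
    """Geometric mean of the three component scores."""
    return (sta * sim * fl) ** (1.0 / 3.0)


# Aggregate values taken from the results table
j = j_score(sta=0.95, sim=0.9995, fl=1.0)
print(f"{j:.4f}")
```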
---
## 📁 Dataset
Dataset used for training and evaluation:
[**ispromashka/arabic-detox-dataset**](https://huggingface.co/datasets/ispromashka/arabic-detox-dataset)
### Composition
| Category | Examples | Description |
|----------|----------|-------------|
| Personal Insults | 30 | Direct personal attacks |
| Aggressive Commands | 20 | Hostile imperatives |
| Work Criticism | 25 | Professional negative feedback |
| Threats | 15 | Intimidation and warnings |
| Contempt | 15 | Expressions of superiority |
| Blame | 15 | Accusatory statements |
| Appearance Criticism | 15 | Physical/aesthetic insults |
| Mockery | 15 | Sarcastic belittling |
| **Total Unique** | **150** | - |
| **Augmented (×20)** | **3,000+** | Training examples |
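
The totals in the table can be checked with a few lines of arithmetic. The dictionary below simply restates the table; `AUGMENTATION_FACTOR` is a name chosen for this sketch.

```python
category_counts = {
    "Personal Insults": 30,
    "Aggressive Commands": 20,
    "Work Criticism": 25,
    "Threats": 15,
    "Contempt": 15,
    "Blame": 15,
    "Appearance Criticism": 15,
    "Mockery": 15,
}
AUGMENTATION_FACTOR = 20  # the "×20" in the table

unique_total = sum(category_counts.values())
augmented_total = unique_total * AUGMENTATION_FACTOR
print(unique_total, augmented_total)
```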
### Data Format
```
سام: {toxic_text}
مهذب: {neutral_text}<EOS>
```
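A minimal sketch of helpers that follow this format is shown below. The function names are hypothetical, and the end-of-sequence string is passed in rather than hard-coded; in practice it would presumably be `tokenizer.eos_token`.

```python
TOXIC_TAG = "سام"    # "toxic"
POLITE_TAG = "مهذب"  # "polite"


def format_example(toxic_text: str, neutral_text: str, eos_token: str) -> str:
    """Build one training string in the data format shown above."""
    return f"{TOXIC_TAG}: {toxic_text}\n{POLITE_TAG}: {neutral_text}{eos_token}"


def build_prompt(toxic_text: str) -> str:
    """Inference prompt: the same template, cut off after the polite tag."""
    return f"{TOXIC_TAG}: {toxic_text}\n{POLITE_TAG}:"


def extract_output(generated: str) -> str:
    """Take the first line after the last polite tag in decoded text."""
    return generated.split(f"{POLITE_TAG}:")[-1].strip().split("\n")[0]


# "</s>" is only a placeholder for the tokenizer's EOS token
print(format_example("منظرك سيء", "المظهر يمكن تحسينه", eos_token="</s>"))
print(extract_output(build_prompt("منظرك سيء") + " المظهر يمكن تحسينه\n"))
```

Keeping the training format and the inference prompt in one place avoids the two drifting apart, since the model only learned this exact template.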
---
## ⚙️ Training Configuration
| Parameter | Base Models | Final Model |
|-----------|-------------|-------------|
| Hardware | NVIDIA A100 40GB | NVIDIA A100 40GB |
| Precision | BF16 | BF16 |
| Batch Size | 8–16 | 8 |
| Learning Rate | 2e-5 – 3e-5 | 1.5e-5 |
| Epochs | 20–25 | 15 |
| Optimizer | AdamW | AdamW |
| Scheduler | Cosine | Cosine |
| Warmup | 10% | 10% |
| Total Time | ~85 min | ~30 min |
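
To make the scheduler rows concrete, the sketch below evaluates a cosine schedule with 10% linear warmup at the final model's peak learning rate of 1.5e-5. It follows the commonly used formula (linear ramp, then half-cosine decay to zero) and is an assumption about the exact schedule shape; `cosine_lr_with_warmup` and `TOTAL_STEPS` are hypothetical, since the card does not report step counts.

```python
import math


def cosine_lr_with_warmup(step: int, total_steps: int, peak_lr: float,
                          warmup_ratio: float = 0.1) -> float:
    """Linear warmup to peak_lr, then cosine decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


TOTAL_STEPS = 1000  # hypothetical; not reported in this card
for step in (0, 50, 100, 550, 1000):
    lr = cosine_lr_with_warmup(step, TOTAL_STEPS, peak_lr=1.5e-5)
    print(step, f"{lr:.2e}")
```

The rate rises linearly over the first 100 steps, peaks at 1.5e-5, falls to half of that midway through the decay phase, and reaches zero at the last step.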
---
## ⚠️ Limitations
- **Language Coverage**: Optimized for Modern Standard Arabic; dialectal performance may vary
- **Text Length**: Works best on short to medium texts (under 100 tokens)
- **Domain**: Trained on general toxicity; domain-specific content may need fine-tuning
- **Context**: Does not consider conversation history
---
## 📖 Citation
```bibtex
@misc{arabicdetox2024,
author = {ispromashka},
title = {Arabic Text Detoxification: Ensemble Knowledge Distillation Approach},
year = {2024},
publisher = {HuggingFace},
url = {https://huggingface.co/ispromashka/arab-detoxification-isp}
}
```
---
## 📄 License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
---
<div align="center">
**Made with ❤️ for the Arabic NLP community**
[GitHub](https://github.com/ispromashka)
</div>