	---
	license: mit
	language:
	- ar
	- en
	library_name: transformers
	tags:
	- arabic
	- text-generation
	- detoxification
	- ensemble
	- bloom
	pipeline_tag: text-generation
	model-index:
	- name: arab-detoxification-isp
	  results:
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      type: custom
	      name: Arabic Detox Dataset
	    metrics:
	    - type: accuracy
	      value: 0.95
	      name: STA
	---

	<div align="center">

	# 🛡️ Arabic Text Detoxification Model

	### Ensemble Knowledge Distillation Approach

	[![Model](https://img.shields.io/badge/Model-Bloom--1b7-blue)](https://huggingface.co/bigscience/bloom-1b7)
	[![License](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)
	[![Language](https://img.shields.io/badge/Language-Arabic-red)](https://en.wikipedia.org/wiki/Arabic)
	[![HuggingFace](https://img.shields.io/badge/🤗-HuggingFace-yellow)](https://huggingface.co/ispromashka/arab-detoxification-isp)

	Transform toxic Arabic text into polite, neutral alternatives while preserving meaning

	[Model Demo](#-quick-start) \| [Architecture](#-architecture-overview) \| [Dataset](https://huggingface.co/datasets/ispromashka/arabic-detox-dataset) \| [Results](#-evaluation-results)

	</div>

	---

	## 📊 Architecture Overview

	<div align="center">
	<img src="https://huggingface.co/ispromashka/arab-detoxification-isp/resolve/main/architecture.png" alt="Model Architecture" width="100%">
	</div>

	---

	## 🎯 Model Description

	This model performs text detoxification for Arabic: it converts offensive, toxic, or aggressive text into neutral, polite alternatives while preserving the original meaning.

	### Key Features

	| Feature | Description |
	|---------|-------------|
	| 🏗️ Architecture | Bloom-1b7 (1.7B parameters) fine-tuned with ensemble distillation |
	| 🌍 Language | Arabic (Modern Standard Arabic + dialects) |
	| 📚 Training | Ensemble of 3 models → Knowledge distillation → Final model |
	| ⚡ Hardware | Optimized for NVIDIA A100 40GB, works on consumer GPUs |
	| 📏 Context | Up to 2048 tokens |

	### Ensemble Components

	| Model | Parameters | Role | Source |
	|-------|------------|------|--------|
	| AraGPT2-Medium | 370M | Arabic Language Expert | AUB MIND Lab |
	| Bloom-560m | 560M | Multilingual Generalization | BigScience |
	| Bloom-1b7 | 1.7B | High Capacity Patterns | BigScience |

	---

	## 📈 Evaluation Results

	<div align="center">

	| Metric | Score | Description |
	|--------|-------|-------------|
	| J-Score | 0.7129 | Joint metric (geometric mean) |
	| STA | 0.9500 | Style Transfer Accuracy |
	| SIM (ref) | 0.9995 | Similarity to reference |
	| Fluency | 1.0000 | Grammatical correctness |

	</div>

	```
	J-Score   ████████████████████████████░░░░░░░░░░ 0.71
	STA       ██████████████████████████████████████ 0.95
	SIM (ref) ██████████████████████████████████████ 1.00
	Fluency   ██████████████████████████████████████ 1.00
	```

	---

	## 🚀 Quick Start

	### Installation

	```bash
	pip install transformers torch
	```

	### Usage

	```python
	from transformers import AutoTokenizer, AutoModelForCausalLM
	import torch

	# Load model; use FP16 on a GPU and fall back to FP32 on CPU
	model_name = "ispromashka/arab-detoxification-isp"
	device = "cuda" if torch.cuda.is_available() else "cpu"
	dtype = torch.float16 if device == "cuda" else torch.float32
	tokenizer = AutoTokenizer.from_pretrained(model_name)
	model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=dtype)
	model.to(device)
	model.eval()

	def detoxify(text: str) -> str:
	    """Convert toxic Arabic text to neutral form."""
	    # Same "toxic: ... / polite:" prompt layout as the training data
	    prompt = f"سام: {text}\nمهذب:"
	    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

	    # Sampling is enabled, so outputs can differ between calls
	    with torch.no_grad():
	        outputs = model.generate(
	            **inputs,
	            max_new_tokens=50,
	            temperature=0.7,
	            top_p=0.9,
	            repetition_penalty=1.2,
	            do_sample=True,
	            pad_token_id=tokenizer.pad_token_id,
	        )

	    # Keep only the first line generated after the "مهذب:" marker
	    result = tokenizer.decode(outputs[0], skip_special_tokens=True)
	    return result.split("مهذب:")[-1].strip().split("\n")[0]

	# Example
	toxic_text = "أنت غبي جداً"
	neutral_text = detoxify(toxic_text)
	print(f"Input: {toxic_text}")
	print(f"Output: {neutral_text}")
	```

	---

	## 💡 Examples

	| Category | Toxic Input (سام) | Neutral Output (مهذب) |
	|----------|-------------------|----------------------|
	| Insult | أنت غبي جداً | ربما تحتاج إلى مزيد من الوقت للفهم |
	| Command | اخرس يا أحمق | أرجو أن تكون أكثر هدوءاً |
	| Criticism | هذا العمل تافه وسخيف | العمل يمكن تطويره |
	| Threat | سأجعلك تندم | دعنا نحل هذا بسلام |
	| Contempt | أنت فاشل تماماً | النجاح يحتاج لمزيد من الجهد |
	| Mockery | يا له من غبي | ربما لم يفهم جيداً |
	| Blame | كل شيء خطؤك | نحتاج تحديد المسؤوليات |
	| Appearance | منظرك سيء | المظهر يمكن تحسينه |

	---

	## 🔬 Methodology

	### Training Pipeline

	```
	┌─────────────────────────────────────────────────────────────┐
	│  STAGE 1: Base Models                                       │
	├─────────────────────────────────────────────────────────────┤
	│  Train 3 specialized models independently on detox dataset  │
	│    • AraGPT2-Medium (25 epochs)                             │
	│    • Bloom-560m (25 epochs)                                 │
	│    • Bloom-1b7 (20 epochs)                                  │
	└─────────────────────────────────────────────────────────────┘
	                               ↓
	┌─────────────────────────────────────────────────────────────┐
	│  STAGE 2: Ensemble Selection                                │
	├─────────────────────────────────────────────────────────────┤
	│  For each input, select best prediction using:              │
	│    Sentence-BERT (paraphrase-multilingual-mpnet-base-v2)    │
	│    Selection: argmax(cosine_similarity(pred, reference))    │
	└─────────────────────────────────────────────────────────────┘
	                               ↓
	┌─────────────────────────────────────────────────────────────┐
	│  STAGE 3: Knowledge Distillation                            │
	├─────────────────────────────────────────────────────────────┤
	│  Fine-tune fresh Bloom-1b7 on:                              │
	│    • Original dataset (3000+ examples)                      │
	│    • Ensemble best predictions (1500+ examples)             │
	│    • Total: 4500+ training examples                         │
	└─────────────────────────────────────────────────────────────┘
	```
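
The Stage 2 selection rule can be sketched in a few lines. This is a minimal illustration, not the project's training code: `select_best`, `cosine_similarity`, and `toy_embed` are hypothetical names, and `toy_embed` is a character-count stand-in so the snippet runs without downloading anything. In the pipeline described above, the embeddings would come from Sentence-BERT (`paraphrase-multilingual-mpnet-base-v2`) instead.

```python
import math
from collections import Counter
from typing import Callable, Mapping, Sequence

Vector = Mapping[str, float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors stored as mappings."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def toy_embed(text: str) -> Vector:
    """Stand-in embedding (character counts). This is NOT Sentence-BERT."""
    return Counter(text)


def select_best(
    predictions: Sequence[str],
    reference: str,
    embed: Callable[[str], Vector] = toy_embed,
) -> str:
    """Return the prediction whose embedding is closest to the reference."""
    ref_vec = embed(reference)
    return max(predictions, key=lambda p: cosine_similarity(embed(p), ref_vec))


# One candidate per base model for a single input (illustrative strings)
candidates = ["العمل يمكن تطويره", "دعنا نحل هذا بسلام", "المظهر يمكن تحسينه"]
best = select_best(candidates, reference="العمل يمكن تطويره")
print(best)
```

Passing a different `embed` callable swaps in a real sentence encoder without changing the selection logic.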

	### Evaluation Metrics

	J-Score (Primary metric):

	$$J = \sqrt[3]{STA \times SIM \times FL}$$

	Where:
	- STA (Style Transfer Accuracy): Measures toxicity removal success
	- SIM (Semantic Similarity): Content preservation (Sentence-BERT cosine similarity)
	- FL (Fluency): Ratio of grammatically valid outputs
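
As a worked example of the formula, the sketch below computes the geometric mean directly. The function name `j_score` and the range check are assumptions added for illustration. Plugging the aggregate scores from the results table (STA 0.95, SIM 0.9995, FL 1.0) into this formula gives roughly 0.98 rather than the reported 0.7129, so the reported J-Score was presumably computed differently (for example per sample, or with a different similarity term); the model card does not say.

```python
def j_score(sta: float, sim: float, fl: float) -> float:
    """Geometric mean of the three component scores, each expected in [0, 1]."""
    for name, value in (("sta", sta), ("sim", sim), ("fl", fl)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return (sta * sim * fl) ** (1.0 / 3.0)


# Aggregate scores taken from the results table
aggregate_j = j_score(sta=0.95, sim=0.9995, fl=1.0)
print(f"J from aggregate scores: {aggregate_j:.4f}")
```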

	---

	## 📁 Dataset

	Dataset used for training and evaluation:
	[ispromashka/arabic-detox-dataset](https://huggingface.co/datasets/ispromashka/arabic-detox-dataset)

	### Composition

	| Category | Examples | Description |
	|----------|----------|-------------|
	| Personal Insults | 30 | Direct personal attacks |
	| Aggressive Commands | 20 | Hostile imperatives |
	| Work Criticism | 25 | Professional negative feedback |
	| Threats | 15 | Intimidation and warnings |
	| Contempt | 15 | Expressions of superiority |
	| Blame | 15 | Accusatory statements |
	| Appearance Criticism | 15 | Physical/aesthetic insults |
	| Mockery | 15 | Sarcastic belittling |
	| Total Unique | 150 | - |
	| Augmented (×20) | 3,000+ | Training examples |
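
The totals in the table can be checked with a short script. The counts are copied from the table; the names `CATEGORY_COUNTS` and `AUGMENTATION_FACTOR` are illustrative only, and the augmentation procedure itself is not described in this card, so only the arithmetic is shown.

```python
# Unique examples per category, copied from the composition table
CATEGORY_COUNTS = {
    "Personal Insults": 30,
    "Aggressive Commands": 20,
    "Work Criticism": 25,
    "Threats": 15,
    "Contempt": 15,
    "Blame": 15,
    "Appearance Criticism": 15,
    "Mockery": 15,
}
AUGMENTATION_FACTOR = 20  # the "×20" from the table

unique_total = sum(CATEGORY_COUNTS.values())
augmented_total = unique_total * AUGMENTATION_FACTOR

print(f"Unique examples:    {unique_total}")
print(f"Augmented examples: {augmented_total}")
for category, count in CATEGORY_COUNTS.items():
    print(f"{category:<22} {count / unique_total:6.1%}")
```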

	### Data Format

	```
	سام: {toxic_text}
	مهذب: {neutral_text}<EOS>
	```
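
A minimal sketch of building strings in this format follows. The helper names are hypothetical, and the default `eos_token="</s>"` is an assumption; in practice the value should come from `tokenizer.eos_token` of whichever tokenizer is in use.

```python
# Arabic markers used by the data format: "toxic:" and "polite:"
TOXIC_PREFIX = "سام:"
POLITE_PREFIX = "مهذب:"


def format_training_example(toxic_text: str, neutral_text: str,
                            eos_token: str = "</s>") -> str:
    """Full training string: toxic line, polite line, then end-of-sequence."""
    return f"{TOXIC_PREFIX} {toxic_text}\n{POLITE_PREFIX} {neutral_text}{eos_token}"


def format_prompt(toxic_text: str) -> str:
    """Inference prompt: stops after the polite marker so the model completes it."""
    return f"{TOXIC_PREFIX} {toxic_text}\n{POLITE_PREFIX}"


example = format_training_example("منظرك سيء", "المظهر يمكن تحسينه")
prompt = format_prompt("منظرك سيء")
print(example)
print(prompt)
```

Because the training string starts with exactly the inference prompt, the model sees the same prefix at training and generation time.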

	---

	## ⚙️ Training Configuration

	| Parameter | Base Models | Final Model |
	|-----------|-------------|-------------|
	| Hardware | NVIDIA A100 40GB | NVIDIA A100 40GB |
	| Precision | BF16 | BF16 |
	| Batch Size | 8–16 | 8 |
	| Learning Rate | 2e-5 – 3e-5 | 1.5e-5 |
	| Epochs | 20–25 | 15 |
	| Optimizer | AdamW | AdamW |
	| Scheduler | Cosine | Cosine |
	| Warmup | 10% | 10% |
	| Total Time | ~85 min | ~30 min |
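
For readers reproducing the final-model run with the `transformers` `Trainer`, the column above might map onto keyword arguments roughly as follows. This is a sketch under assumptions: the card does not state which training loop was used, `optim="adamw_torch"` is one possible reading of "AdamW", the batch size is taken as per-device on a single GPU without gradient accumulation, and 4500 stands in for the approximate "4500+" training set size.

```python
import math

# Final Model column expressed as TrainingArguments-style keyword arguments
FINAL_MODEL_ARGS = {
    "per_device_train_batch_size": 8,
    "learning_rate": 1.5e-5,
    "num_train_epochs": 15,
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.10,
    "bf16": True,
    "optim": "adamw_torch",  # assumption: the card only says "AdamW"
}


def schedule_summary(num_examples: int, args: dict) -> dict:
    """Derive optimizer step counts implied by the configuration."""
    steps_per_epoch = math.ceil(num_examples / args["per_device_train_batch_size"])
    total_steps = steps_per_epoch * args["num_train_epochs"]
    warmup_steps = math.ceil(total_steps * args["warmup_ratio"])
    return {
        "steps_per_epoch": steps_per_epoch,
        "total_steps": total_steps,
        "warmup_steps": warmup_steps,
    }


summary = schedule_summary(num_examples=4500, args=FINAL_MODEL_ARGS)
print(summary)

# With transformers installed, the same dict could be passed on, e.g.:
#   from transformers import TrainingArguments
#   training_args = TrainingArguments(output_dir="out", **FINAL_MODEL_ARGS)
# where "out" is a placeholder directory name.
```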

	---

	## ⚠️ Limitations

	- Language Coverage: Optimized for Modern Standard Arabic; dialectal performance may vary
	- Text Length: Best for short-medium texts (< 100 tokens)
	- Domain: Trained on general toxicity; domain-specific content may need fine-tuning
	- Context: Does not consider conversation history

	---

	## 📖 Citation

	```bibtex
	@misc{arabicdetox2024,
	author = {ispromashka},
	title = {Arabic Text Detoxification: Ensemble Knowledge Distillation Approach},
	year = {2024},
	publisher = {HuggingFace},
	url = {https://huggingface.co/ispromashka/arab-detoxification-isp}
	}
	```

	---

	## 📄 License

	This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

	---

	<div align="center">

	Made with ❤️ for the Arabic NLP community

	[GitHub](https://github.com/ispromashka)

	</div>