	---
	language:
	- en
	tags:
	- text-detoxification
	- text2text-generation
	- detoxification
	- content-moderation
	- toxicity-reduction
	- llama
	- gguf
	- minibase
	- medium-model
	- 4096-context
	license: apache-2.0
	datasets:
	- paradetox
	metrics:
	- toxicity-reduction
	- semantic-similarity
	- fluency
	- latency
	model-index:
	- name: Detoxify-Medium
	  results:
	  - task:
	      type: text-detoxification
	      name: Toxicity Reduction
	    dataset:
	      type: paradetox
	      name: ParaDetox
	      config: toxic-neutral
	      split: test
	    metrics:
	    - type: toxicity-reduction
	      value: 0.178
	      name: Average Toxicity Reduction
	    - type: semantic-similarity
	      value: 0.561
	      name: Semantic to Expected
	    - type: fluency
	      value: 0.929
	      name: Text Fluency
	    - type: latency
	      value: 160.2
	      name: Average Latency (ms)
	---

	# Detoxify-Medium 🤖

	<div align="center">

	A medium-sized, high-capacity text detoxification model for advanced toxicity removal while preserving meaning.

	[![Model Size](https://img.shields.io/badge/Model_Size-369MB-blue)](https://huggingface.co/)
	[![Architecture](https://img.shields.io/badge/Architecture-LlamaForCausalLM-green)](https://huggingface.co/)
	[![Context Window](https://img.shields.io/badge/Context-4096_Tokens-orange)](https://huggingface.co/)
	[![License](https://img.shields.io/badge/License-Apache_2.0-yellow)](LICENSE)
	[![Discord](https://img.shields.io/badge/Discord-Join_Community-5865F2)](https://discord.com/invite/BrJn4D2Guh)

	Built by [Minibase](https://minibase.ai) - Train and deploy small AI models from your browser.
	Browse all of the models and datasets available on the [Minibase Marketplace](https://minibase.ai/wiki/Special:Marketplace).

	</div>

	## 📋 Model Summary

	Minibase-Detoxify-Medium is a 279M-parameter language model fine-tuned for text detoxification. It takes toxic or inappropriate text as input and generates a non-toxic rewrite that preserves the original meaning and intent as far as possible. Its 4,096-token context window, four times that of Detoxify-Small, lets it handle longer inputs and more complex detoxification cases.

	### Key Features
	- ⚡ Balanced Performance: ~160ms average response time
	- 🎯 High Fluency: 92.9% well-formed output text
	- 🧹 Advanced Detoxification: 17.8-point average toxicity reduction (0.196 → 0.018, a 91% relative drop)
	- 💾 Medium Size: 369MB (GGUF Q8_0 quantized)
	- 🔒 Privacy-First: Runs locally, no data sent to external servers
	- 📏 Extended Context: 4,096 token context window (4x larger than Small)

	## 🚀 Quick Start

	### Local Inference (Recommended)

	1. Install llama.cpp (if not already installed):
	```bash
	git clone https://github.com/ggerganov/llama.cpp
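	cd llama.cpp
	# Recent llama.cpp versions build with CMake; binaries are written to build/bin
	cmake -B build && cmake --build build --config Release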
	```

	2. Download and run the model:
	```bash
	# Download model files
	wget https://huggingface.co/Minibase/Detoxify-Language-Medium/resolve/main/detoxify-medium-q8_0.gguf
	wget https://huggingface.co/Minibase/Detoxify-Language-Medium/resolve/main/detoxify_inference.py

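	# Make the launcher executable and run it.
	# Note: run_server.sh is not fetched by the commands above, so it must already
	# be present in this directory. Otherwise start llama-server directly, as
	# shown under Server Configuration.
	chmod +x run_server.sh
	./run_server.sh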
	```

	3. Make API calls:
	```python
	import requests

	# Detoxify text
	response = requests.post("http://127.0.0.1:8000/completion", json={
	    "prompt": "Instruction: Rewrite the provided text to remove the toxicity.\n\nInput: This is fucking terrible!\n\nResponse: ",
	    "max_tokens": 256,
	    "temperature": 0.7
	})

	result = response.json()
	print(result["content"]) # "This is really terrible!"
	```

	### Python Client

	```python
	from detoxify_inference import DetoxifyClient

	# Initialize client
	client = DetoxifyClient()

	# Detoxify text
	toxic_text = "This product is fucking amazing, no bullshit!"
	clean_text = client.detoxify_text(toxic_text)

	print(clean_text) # "This product is really amazing, no kidding!"
	```

	## 📊 Benchmarks & Performance

	### ParaDetox Dataset Results (1,011 samples)

	| Metric | Value | Description |
	|--------|-------|-------------|
	| Original Toxicity | 0.196 (19.6%) | Input toxicity level |
	| Final Toxicity | 0.018 (1.8%) | Output toxicity level |
	| Toxicity Reduction | 91% | Relative reduction in toxicity score (0.196 → 0.018) |
	| Semantic Similarity (Expected) | 0.561 (56.1%) | Similarity to human expert rewrites |
	| Semantic Similarity (Original) | 0.625 (62.5%) | How much original meaning is preserved |
	| Fluency | 0.929 (92.9%) | Quality of generated text structure |
	| Latency | 160.2ms | Average response time |
	| Throughput | ~6 req/sec | Estimated requests per second |
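
The headline figures in this table are related by simple arithmetic. The sketch below recomputes them from the reported toxicity scores and average latency. The throughput line assumes requests are served one at a time, which the table does not state.

```python
# Values copied from the ParaDetox results table above.
original_toxicity = 0.196
final_toxicity = 0.018
avg_latency_ms = 160.2

# Absolute reduction: difference between input and output toxicity scores.
absolute_reduction = original_toxicity - final_toxicity

# Relative reduction: share of the original toxicity that was removed.
relative_reduction = absolute_reduction / original_toxicity

# Sequential throughput (assumption: one request in flight at a time).
requests_per_second = 1000.0 / avg_latency_ms

print(f"absolute reduction: {absolute_reduction:.3f}")
print(f"relative reduction: {relative_reduction:.0%}")
print(f"throughput: {requests_per_second:.1f} req/sec")
```

The absolute figure (0.178) is the "17.8%" quoted elsewhere in this README, while the 91% in the table is the relative figure.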

	### Dataset Breakdown

	#### General Toxic Content (1,000 samples)
	- Toxicity Reduction: 17.8%
	- Semantic Preservation: 56.1%
	- Fluency: 92.9%

	#### High-Toxicity Content (11 samples)
	- Toxicity Reduction: 31.3% ⭐ Strong performance!
	- Semantic Preservation: 47.7%
	- Fluency: 93.6%

	### Comparison with Detoxify-Small

	| Model | Context Window | Toxicity Reduction | Semantic Similarity | Latency | Size |
	|-------|----------------|--------------------|---------------------|---------|------|
	| Detoxify-Medium | 4,096 tokens | 17.8% | 56.1% | 160ms | 369MB |
	| Detoxify-Small | 1,024 tokens | 3.2% | 47.1% | 66ms | 138MB |

	Key Improvements:
	- ✅ 4x larger context window
	- ✅ 5.6x higher toxicity reduction
	- ✅ 19% higher semantic similarity (relative: 56.1% vs. 47.1%)

	Trade-offs:
	- 2.7x larger model size and roughly 2.4x higher latency
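
The multipliers listed above follow directly from the comparison table. As a quick check, the sketch below divides each Detoxify-Medium value by the corresponding Detoxify-Small value. The dictionary keys are illustrative names chosen here, not identifiers from any Minibase tool.

```python
# Values copied from the comparison table above.
medium = {"context": 4096, "tox_reduction": 0.178, "similarity": 0.561,
          "latency_ms": 160, "size_mb": 369}
small = {"context": 1024, "tox_reduction": 0.032, "similarity": 0.471,
         "latency_ms": 66, "size_mb": 138}

def ratio(key: str) -> float:
    """Return the Medium-to-Small ratio for one table column."""
    return medium[key] / small[key]

for key in medium:
    print(f"{key}: {ratio(key):.2f}x")
```

The same calculation also gives the latency ratio, which is the cost side of the comparison.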

	### Comparison with Baselines

	| Model | Semantic Similarity | Toxicity Reduction | Fluency |
	|-------|---------------------|--------------------|---------|
	| Detoxify-Medium | 0.561 | 0.178 | 0.929 |
	| Detoxify-Small | 0.471 | 0.032 | 0.919 |
	| BART-base (ParaDetox) | 0.750 | ~0.15 | ~0.85 |
	| Human Performance | 0.850 | ~0.25 | ~0.95 |

	Performance Notes:
	- 📈 Semantic Similarity: How well meaning is preserved
	- 🧹 Toxicity Reduction: How effectively toxicity is removed
	- ✍️ Fluency: Quality of generated text
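	- 🎯 Detoxify-Medium exceeds the BART-base baseline on toxicity reduction (0.178 vs. ~0.15) and fluency (0.929 vs. ~0.85) but trails it on semantic similarity (0.561 vs. 0.750)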

	## 🏗️ Technical Details

	### Model Architecture
	- Architecture: LlamaForCausalLM
	- Parameters: 279M (medium capacity)
	- Context Window: 4,096 tokens (4x larger than Small)
	- Max Position Embeddings: 8,192
	- Quantization: GGUF (Q8_0 quantization)
	- File Size: 369MB
	- Memory Requirements: 12GB RAM minimum, 24GB recommended

	### Training Details
	- Base Model: Custom-trained Llama architecture
	- Fine-tuning Dataset: Curated toxic-neutral parallel pairs
	- Training Objective: Instruction-following for detoxification
	- Optimization: Quantized for edge deployment
	- Model Scale: Medium capacity for enhanced performance

	### System Requirements

	| Component | Minimum | Recommended |
	|-----------|---------|-------------|
	| Operating System | Linux, macOS, Windows | Linux or macOS |
	| RAM | 12GB | 24GB |
	| Storage | 400MB free space | 1GB free space |
	| Python | 3.8+ | 3.10+ |
	| Dependencies | llama.cpp | llama.cpp, requests |
	| GPU | Optional | NVIDIA RTX 30-series or Apple M2/M3 |

	Notes:
	- ✅ CPU-only inference is supported but slower
	- ✅ GPU acceleration provides significant speed improvements
	- ✅ Apple Silicon users get Metal acceleration automatically

	## 📖 Usage Examples

	### Basic Detoxification
	```python
	# Input: "This is fucking awesome!"
	# Output: "This is really awesome!"

	# Input: "You stupid idiot, get out of my way!"
	# Output: "You silly person, please move aside!"
	```

	### Long-Form Text Detoxification
	```python
	# Input: "This article is complete bullshit and the author is a fucking moron who doesn't know what they're talking about. The whole thing is garbage and worthless."
	# Output: "This article is not well-founded and the author seems uninformed about the topic. The whole thing seems questionable."
	```

	### API Integration
	```python
	import requests

	def detoxify_text(text: str) -> str:
	    """Detoxify text using the Detoxify-Medium API."""
	    prompt = f"Instruction: Rewrite the provided text to remove the toxicity.\n\nInput: {text}\n\nResponse: "

	    response = requests.post("http://127.0.0.1:8000/completion", json={
	        "prompt": prompt,
	        "max_tokens": 256,
	        "temperature": 0.7
	    }, timeout=30)
	    # Surface HTTP errors instead of failing later on a missing "content" key
	    response.raise_for_status()

	    return response.json()["content"]

	# Usage
	toxic_comment = "This product sucks donkey balls!"
	clean_comment = detoxify_text(toxic_comment)
	print(clean_comment) # "This product is not very good!"
	```

	### Batch Processing
	```python
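	import asyncio
	import aiohttp

	async def detoxify_one(session: aiohttp.ClientSession, text: str) -> str:
	    """Send one text to the server and return the detoxified completion."""
	    prompt = f"Instruction: Rewrite the provided text to remove the toxicity.\n\nInput: {text}\n\nResponse: "
	    payload = {
	        "prompt": prompt,
	        "max_tokens": 256,
	        "temperature": 0.7
	    }
	    # The context manager releases the connection once the body is read
	    async with session.post("http://127.0.0.1:8000/completion", json=payload) as resp:
	        resp.raise_for_status()
	        data = await resp.json()
	        return data["content"]

	async def detoxify_batch(texts: list) -> list:
	    """Process multiple texts concurrently"""
	    async with aiohttp.ClientSession() as session:
	        # gather returns results in the same order as the input texts
	        return await asyncio.gather(*(detoxify_one(session, t) for t in texts))

	# Process multiple comments
	comments = [
	    "This is fucking brilliant!",
	    "You stupid moron!",
	    "What the hell is wrong with you?"
	]

	# In a script, drive the coroutine with asyncio.run; a bare `await` only
	# works inside a coroutine or a notebook cell.
	clean_comments = asyncio.run(detoxify_batch(comments))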
	```
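
With a large batch it may help to cap how many requests are in flight at once. The sketch below shows one way to do that with `asyncio.Semaphore`. `gather_bounded` is a hypothetical helper, the default limit of 4 is arbitrary, and `fake_detoxify` is a stand-in so the example runs without a server.

```python
import asyncio
from typing import Awaitable, Callable, List

async def gather_bounded(
    worker: Callable[[str], Awaitable[str]],
    texts: List[str],
    max_concurrency: int = 4,
) -> List[str]:
    """Run worker over texts with at most max_concurrency calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(text: str) -> str:
        async with semaphore:
            return await worker(text)

    # gather preserves input order in its result list
    return await asyncio.gather(*(run_one(t) for t in texts))

async def fake_detoxify(text: str) -> str:
    """Stand-in for a real request so the sketch runs without a server."""
    await asyncio.sleep(0.01)
    return text.upper()

print(asyncio.run(gather_bounded(fake_detoxify, ["a", "b", "c"], max_concurrency=2)))
```

In real use, `worker` would wrap a single HTTP call to the `/completion` endpoint.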

	## 🔧 Advanced Configuration

	### Server Configuration
	```bash
	# GPU acceleration (macOS; Metal is enabled automatically on Apple Silicon builds)
	llama-server \
	  -m detoxify-medium-q8_0.gguf \
	  --host 127.0.0.1 \
	  --port 8000 \
	  --n-gpu-layers 35 \
	  --ctx-size 4096

	# CPU-only (higher memory usage)
	llama-server \
	  -m detoxify-medium-q8_0.gguf \
	  --host 127.0.0.1 \
	  --port 8000 \
	  --n-gpu-layers 0 \
	  --threads 8 \
	  --ctx-size 4096

	# Custom context window
	llama-server \
	  -m detoxify-medium-q8_0.gguf \
	  --ctx-size 2048 \
	  --host 127.0.0.1 \
	  --port 8000
	```
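
When the server is launched from a Python script, the same flags can be assembled programmatically. The sketch below is illustrative only. `build_server_command` is a hypothetical helper, not part of llama.cpp or the Minibase tooling, and it uses only flags that appear in the commands above.

```python
from typing import List, Optional

def build_server_command(
    model_path: str = "detoxify-medium-q8_0.gguf",
    host: str = "127.0.0.1",
    port: int = 8000,
    ctx_size: int = 4096,
    gpu_layers: int = 0,
    threads: Optional[int] = None,
) -> List[str]:
    """Assemble a llama-server argument list (hypothetical helper)."""
    # 4,096 is the context window stated in this README.
    if not 1 <= ctx_size <= 4096:
        raise ValueError("ctx_size must be between 1 and 4096 for this model")
    command = [
        "llama-server",
        "-m", model_path,
        "--host", host,
        "--port", str(port),
        "--n-gpu-layers", str(gpu_layers),
        "--ctx-size", str(ctx_size),
    ]
    if threads is not None:
        command += ["--threads", str(threads)]
    return command

# CPU-only configuration with 8 threads, mirroring the second command above.
print(" ".join(build_server_command(gpu_layers=0, threads=8)))
```

The returned list can be passed to `subprocess.Popen` to start the server in the background.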

	### Alternative: Use the macOS Application
	```bash
	# If using the provided macOS app bundle
	cd /path/to/downloaded/model
	./Minibase-detoxify-medium.app/Contents/MacOS/run_server
	```

	### Temperature Settings

	| Temperature Range | Approach | Description |
	|-------------------|----------|-------------|
	| 0.1-0.3 | Conservative | Minimal changes, preserves original style |
	| 0.4-0.7 | Balanced (Recommended) | Best trade-off between detoxification and naturalness |
	| 0.8-1.0 | Creative | More aggressive detoxification, may alter style |
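
One way to apply these ranges is to map each approach to a single temperature when building the request body. The sketch below is a minimal illustration. `TEMPERATURE_PRESETS` and `build_payload` are hypothetical names, and the preset values are simply the midpoints of the ranges in the table, which is an assumption rather than a tuned recommendation. The prompt template is the one used in the other examples in this README.

```python
# Midpoints of the ranges in the table above; the exact values are an assumption.
TEMPERATURE_PRESETS = {
    "conservative": 0.2,
    "balanced": 0.55,
    "creative": 0.9,
}

PROMPT_TEMPLATE = (
    "Instruction: Rewrite the provided text to remove the toxicity.\n\n"
    "Input: {text}\n\nResponse: "
)

def build_payload(text: str, approach: str = "balanced", max_tokens: int = 256) -> dict:
    """Build a /completion request body for the chosen approach (hypothetical helper)."""
    if approach not in TEMPERATURE_PRESETS:
        raise ValueError(f"unknown approach: {approach!r}")
    return {
        "prompt": PROMPT_TEMPLATE.format(text=text),
        "max_tokens": max_tokens,
        "temperature": TEMPERATURE_PRESETS[approach],
    }

payload = build_payload("This is awful!", approach="conservative")
print(payload["temperature"])
```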

	### Context Window Optimization

	| Context Size | Use Case | Performance |
	|--------------|----------|-------------|
	| 4,096 tokens | Long documents, complex detoxification | Best quality, slower processing |
	| 2,048 tokens | Balanced performance and quality | Good compromise (recommended) |
	| 1,024 tokens | Simple tasks, fast processing | Faster inference, adequate quality |
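
Because the context size is fixed when the server starts, it can help to estimate how many tokens the longest expected input needs. The sketch below picks the smallest size from the table that fits the prompt plus the generated output. `pick_context_size` is a hypothetical helper, and both the 4-characters-per-token estimate and the 32-token prompt overhead are rough assumptions, not measured values for this model's tokenizer.

```python
import math

CONTEXT_SIZES = (1024, 2048, 4096)  # options from the table above
CHARS_PER_TOKEN = 4  # rough rule of thumb for English text (assumption)

def pick_context_size(
    text: str,
    max_new_tokens: int = 256,
    prompt_overhead_tokens: int = 32,
) -> int:
    """Return the smallest listed context size that fits prompt plus output.

    Raises ValueError when even the largest window is too small.
    """
    estimated_input = math.ceil(len(text) / CHARS_PER_TOKEN)
    needed = estimated_input + prompt_overhead_tokens + max_new_tokens
    for size in CONTEXT_SIZES:
        if needed <= size:
            return size
    raise ValueError(f"about {needed} tokens needed; split the text into smaller chunks")

print(pick_context_size("You are being very rude. " * 10))
```

For exact counts, the estimate would need to be replaced with the model's real tokenizer.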

	## 📚 Limitations & Biases

	### Current Limitations

	| Limitation | Description | Impact |
	|------------|-------------|--------|
	| Vocabulary Scope | Trained primarily on English toxic content | May not handle other languages effectively |
	| Context Awareness | Limited detection of sarcasm or cultural context | May miss nuanced toxicity |
	| Length Constraints | Limited to 4,096 token context window | Cannot process very long documents |
	| Domain Specificity | Optimized for general web content | May perform differently on specialized domains |
	| Memory Requirements | Higher RAM usage than smaller models | Requires more system resources |

	### Potential Biases

	| Bias Type | Description | Mitigation |
	|-----------|-------------|------------|
	| Cultural Context | May not handle culture-specific expressions | Use with awareness of cultural differences |
	| Dialect Variations | Limited exposure to regional dialects | May not recognize regional toxic patterns |
	| Emerging Slang | May not recognize newest internet slang | Regular model updates recommended |
	| Long-form Content | May struggle with very complex toxicity | Break long content into smaller chunks |
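
The chunking mitigation in the last row can be sketched as follows. This is a minimal illustration, not the method used by Minibase. `split_into_chunks` is a hypothetical helper, the 2,000-character default is arbitrary, and the sentence splitter is deliberately naive.

```python
import re
from typing import List

def split_into_chunks(text: str, max_chars: int = 2000) -> List[str]:
    """Group whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept whole rather than cut.
    """
    # Naive split on ., !, or ? followed by whitespace (assumption: it will
    # mis-split abbreviations such as "e.g. this").
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}" if current else sentence
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow, so close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks

for chunk in split_into_chunks("First point. Second point! Third point?", max_chars=25):
    print(chunk)
```

Each chunk can then be sent through `detoxify_text` from the API Integration example and the results joined back together.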

	## 🤝 Contributing

	We welcome contributions! Please see our [Contributing Guide](CONTRIBUTING.md) for details.

	### Development Setup
	```bash
	# Clone the repository
	git clone https://github.com/minibase-ai/detoxify-medium
	cd detoxify-medium

	# Install dependencies
	pip install -r requirements.txt

	# Run tests
	python -m pytest tests/
	```

	## 📜 Citation

	If you use Detoxify-Medium in your research, please cite:

	```bibtex
	@misc{detoxify-medium-2025,
	  title={Detoxify-Medium: A High-Capacity Text Detoxification Model},
	  author={Minibase AI Team},
	  year={2025},
	  publisher={Hugging Face},
	  url={https://huggingface.co/Minibase/Detoxify-Language-Medium}
	}
	```

	## 📞 Contact & Community

	- Website: [minibase.ai](https://minibase.ai)
	- Discord Community: [Join our Discord](https://discord.com/invite/BrJn4D2Guh)
	- Bug Reports & Feature Requests: [Report bugs or request features on Discord](https://discord.com/invite/BrJn4D2Guh)
	- Email: hello@minibase.ai

	### Support
	- 📖 Documentation: [help.minibase.ai](https://help.minibase.ai)
	- 💬 Community Forum: [Join our Discord Community](https://discord.com/invite/BrJn4D2Guh)

	## 📋 License

	This model is released under the [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0).

	## 🙏 Acknowledgments

	- ParaDetox Dataset: Used for benchmarking and evaluation
	- llama.cpp: For efficient local inference
	- Hugging Face: For model hosting and community
	- Our amazing community: For feedback and contributions

	---

	<div align="center">

	Built with ❤️ by the Minibase team

	Making AI more accessible for everyone

	[📖 Minibase Help Center](https://help.minibase.ai) • [💬 Join our Discord](https://discord.com/invite/BrJn4D2Guh)

	</div>