	# Desklib Pre-trained Model Integration

	## ✅ What Was Added

	Instead of training your own model (which hits PyTorch MPS bugs on M2 Macs), the project now uses Desklib's pre-trained AI text detector, a state-of-the-art model that leads the RAID benchmark.

	## 🎯 Model Details

	- Model: `desklib/ai-text-detector-v1.01`
	- Base: `microsoft/deberta-v3-large`
	- Architecture: DeBERTa with mean pooling + classifier head
	- Performance: Top performer on RAID benchmark
	- No Training Needed: Pre-trained and ready to use!
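
To make "mean pooling + classifier head" concrete, here is a minimal pure-Python sketch of the idea. It is not the project's `DesklibAIDetectionModel`; the function names, toy vectors, and weights are made up for illustration, and it assumes the head is a single linear layer whose output goes through a sigmoid.

```python
import math

def mean_pool(hidden_states, attention_mask):
    """Average token vectors, ignoring padding positions (mask == 0)."""
    dim = len(hidden_states[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(hidden_states, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    count = max(count, 1)  # guard against an all-padding sequence
    return [s / count for s in summed]

def classifier_head(pooled, weights, bias):
    """Assumed head: one linear layer to a single logit, then a sigmoid."""
    logit = sum(p * w for p, w in zip(pooled, weights)) + bias
    return 1.0 / (1.0 + math.exp(-logit))

# Toy input: three token positions with 2-dim vectors; the last is padding.
hidden = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]

pooled = mean_pool(hidden, mask)                  # padding vector is ignored
prob = classifier_head(pooled, [0.5, -0.5], 0.5)  # illustrative weights
print(pooled, prob)
```

In the real model the token vectors would come from the DeBERTa encoder and the weights from the pre-trained checkpoint; only the pooling-then-classify shape is shown here.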

	## 📝 Changes Made

	### 1. `ai_text_detector/models.py`
	- ✅ Added `DesklibAIDetectionModel` class (custom architecture)
	- ✅ Updated `DetectorModel` to support Desklib model
	- ✅ Added `predict()` method for easy inference
	- ✅ Automatic CPU placement for macOS compatibility
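
One way the automatic CPU placement could work is sketched below. The helper `choose_device` and its parameters are hypothetical, not taken from `models.py`; the sketch assumes the rule is "always CPU on macOS, otherwise CUDA when available".

```python
import platform

def choose_device(system_name=None, cuda_available=False):
    """Hypothetical helper: pick a device string for model placement.

    On macOS ("Darwin") it always returns "cpu" so the MPS backend is never
    used; elsewhere it prefers CUDA when the caller reports it as available.
    """
    system_name = system_name or platform.system()
    if system_name == "Darwin":
        return "cpu"
    return "cuda" if cuda_available else "cpu"

print(choose_device("Darwin", cuda_available=False))  # macOS stays on CPU
print(choose_device("Linux", cuda_available=True))    # GPU used when present
```

In practice the `cuda_available` flag would typically come from `torch.cuda.is_available()`.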

	### 2. `gradio_app.py`
	- ✅ Now uses Desklib model by default (instead of RoBERTa-base)
	- ✅ Updated detection logic to use new `predict()` method
	- ✅ Better error handling

	## 🚀 Usage

	### In Gradio App
	```bash
	python gradio_app.py
	```
	The app will automatically use the Desklib model!

	### In Your Code
	```python
	from ai_text_detector.models import DetectorModel

	# Load Desklib model
	model = DetectorModel("desklib/ai-text-detector-v1.01", use_desklib=True)

	# Predict
	ai_prob, label = model.predict("Your text here")
	print(f"AI Probability: {ai_prob:.2%}")
	print(f"Label: {'AI-generated' if label == 1 else 'Human-written'}")
	```
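
For more than one text, a small wrapper around `predict()` may be convenient. The sketch below relies only on `predict()` returning an `(ai_prob, label)` pair as shown above; `classify_many` and `StubDetector` are hypothetical names, and the stub stands in for `DetectorModel` so the example runs without downloading weights.

```python
def classify_many(model, texts):
    """Run model.predict on each text and collect readable results."""
    results = []
    for text in texts:
        ai_prob, label = model.predict(text)
        results.append({
            "text": text,
            "ai_probability": ai_prob,
            "verdict": "AI-generated" if label == 1 else "Human-written",
        })
    return results

class StubDetector:
    """Stand-in for DetectorModel; uses a toy keyword rule, not a real model."""
    def predict(self, text):
        ai_prob = 0.9 if "as an ai" in text.lower() else 0.1
        return ai_prob, int(ai_prob >= 0.5)

rows = classify_many(
    StubDetector(),
    ["As an AI language model, I cannot.", "See you at lunch!"],
)
for row in rows:
    print(f"{row['ai_probability']:.0%}  {row['verdict']}  {row['text']}")
```

To use the real detector, pass the `model` object from the snippet above in place of `StubDetector()`.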

	### Test It
	```bash
	python test_desklib.py
	```

	## 🎉 Benefits

	- ✅ No Training Needed - Pre-trained model ready to use
	- ✅ Better Accuracy - State-of-the-art performance
	- ✅ Works on M2 Mac - Avoids PyTorch MPS training bugs
	- ✅ Easy to Use - Same interface as before
	- ✅ Production Ready - Already fine-tuned and optimized

	## 📊 Model Performance

	- RAID Benchmark: Top performer
	- Robust: Handles adversarial attacks well
	- Domain Generalization: Works across different text types
	- Fast Inference: Optimized for production use

	## 🔄 Fallback

	If the Desklib model fails to load, the code falls back to:
	- Your trained model (if one exists in `models/ai_detector`)
	- RoBERTa-base (standard classification model)
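
The fallback behaviour can be pictured as a chain of loaders tried one after another. The sketch below is an illustration, not the project's code: `load_with_fallback` and the three loader functions are hypothetical, the failures are simulated, and it assumes the options are tried in the order listed above.

```python
def load_with_fallback(loaders):
    """Try each (name, loader) pair in order; return the first that succeeds."""
    errors = []
    for name, loader in loaders:
        try:
            return name, loader()
        except Exception as exc:  # any load failure moves on to the next option
            errors.append(f"{name}: {exc}")
    raise RuntimeError("No model could be loaded: " + "; ".join(errors))

# Placeholder loaders that simulate the first two options failing.
def load_desklib():
    raise OSError("simulated download failure")

def load_local():
    raise FileNotFoundError("models/ai_detector not found")

def load_roberta():
    return "roberta-base classifier (placeholder object)"

name, model = load_with_fallback([
    ("desklib", load_desklib),
    ("local", load_local),
    ("roberta-base", load_roberta),
])
print(f"Loaded: {name}")
```

Collecting the error messages along the way makes it easier to see why the earlier options were skipped if every loader fails.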

	## 📚 References

	- Model Card: https://huggingface.co/desklib/ai-text-detector-v1.01
	- GitHub: https://github.com/desklib/ai-text-detector
	- Try Online: https://desklib.com/ai-detector

	---

	You now have a production-ready AI text detector without needing to train one! 🎉