AITextDetector / DESKLIB_INTEGRATION.md
ChauHPham's picture
Upload folder using huggingface_hub
25faba3 verified
# Desklib Pre-trained Model Integration
## βœ… What Was Added
Instead of training your own model (which hits PyTorch MPS bugs on M2 Mac), the project now uses **Desklib's pre-trained AI text detector** - a state-of-the-art model that leads the RAID Benchmark.
## 🎯 Model Details
- **Model**: `desklib/ai-text-detector-v1.01`
- **Base**: microsoft/deberta-v3-large
- **Architecture**: DeBERTa with mean pooling + classifier head
- **Performance**: Top performer on RAID benchmark
- **No Training Needed**: Pre-trained and ready to use!
## πŸ“ Changes Made
### 1. `ai_text_detector/models.py`
- βœ… Added `DesklibAIDetectionModel` class (custom architecture)
- βœ… Updated `DetectorModel` to support Desklib model
- βœ… Added `predict()` method for easy inference
- βœ… Automatic CPU placement for macOS compatibility
### 2. `gradio_app.py`
- βœ… Now uses Desklib model by default (instead of RoBERTa-base)
- βœ… Updated detection logic to use new `predict()` method
- βœ… Better error handling
## πŸš€ Usage
### In Gradio App
```bash
python gradio_app.py
```
The app will automatically use the Desklib model!
### In Your Code
```python
from ai_text_detector.models import DetectorModel
# Load Desklib model
model = DetectorModel("desklib/ai-text-detector-v1.01", use_desklib=True)
# Predict
ai_prob, label = model.predict("Your text here")
print(f"AI Probability: {ai_prob:.2%}")
print(f"Label: {'AI-generated' if label == 1 else 'Human-written'}")
```
### Test It
```bash
python test_desklib.py
```
## πŸŽ‰ Benefits
- βœ… **No Training Needed** - Pre-trained model ready to use
- βœ… **Better Accuracy** - State-of-the-art performance
- βœ… **Works on M2 Mac** - Avoids PyTorch MPS training bugs
- βœ… **Easy to Use** - Same interface as before
- βœ… **Production Ready** - Already fine-tuned and optimized
## πŸ“Š Model Performance
- **RAID Benchmark**: Top performer
- **Robust**: Handles adversarial attacks well
- **Domain Generalization**: Works across different text types
- **Fast Inference**: Optimized for production use
## πŸ”„ Fallback
If Desklib model fails to load, the code falls back to:
- Your trained model (if exists in `models/ai_detector`)
- RoBERTa-base (standard classification model)
## πŸ“š References
- **Model Card**: https://huggingface.co/desklib/ai-text-detector-v1.01
- **GitHub**: https://github.com/desklib/ai-text-detector
- **Try Online**: https://desklib.com/ai-detector
---
**You now have a production-ready AI text detector without needing to train!** πŸŽ‰