---
title: Website Category Classifier
emoji: π
colorFrom: blue
colorTo: green
sdk: gradio
app_file: app.py
pinned: false
license: mit
hardware: zero-gpu
---

# Website Category Classifier (Fixed Version)

This application classifies websites into three categories using a fine-tuned Mistral 7B model:

- **OTHER**: General websites
- **NEWS/BLOG**: News websites and blogs
- **E-COMMERCE**: Online shopping sites

## 🔧 Fixed Issues

This version resolves the `torch.int1` `AttributeError` by:

- Removing the unsloth dependency, which caused compatibility issues
- Using the transformers library directly
- Pinning the PyTorch version to avoid conflicts
- Adding proper error handling

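
To check whether an environment is exposed to this error, the minimal sketch below reports the installed PyTorch version and whether that build defines `torch.int1`. The helper name `torch_int1_status` is hypothetical and not part of the app. It assumes the error comes from a dependency referencing `torch.int1` on a PyTorch build that lacks the attribute.

```python
import importlib
import importlib.util


def torch_int1_status():
    """Return (version, has_int1) for the installed PyTorch, or None if absent."""
    if importlib.util.find_spec("torch") is None:
        return None  # PyTorch is not installed in this environment
    torch = importlib.import_module("torch")
    # hasattr is False on builds where accessing torch.int1 would raise AttributeError
    return str(torch.__version__), hasattr(torch, "int1")


if __name__ == "__main__":
    status = torch_int1_status()
    if status is None:
        print("PyTorch is not installed")
    else:
        version, has_int1 = status
        print(f"PyTorch {version}, torch.int1 available: {has_int1}")
```
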
## Features

- **Batch Processing**: Classify up to 20 URLs at once
- **AI-Powered**: Uses a fine-tuned Mistral 7B model for classification
- **Real-time Progress**: Shows progress while a batch is processed
- **GPU Acceleration**: Powered by Hugging Face ZeroGPU
- **Error Recovery**: Handles failed URLs gracefully

## Usage

1. Enter URLs (one per line) in the input textbox
2. Click "Classify Websites"
3. View the results, which show each URL and its predicted category

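
The one-URL-per-line input from step 1 could be normalized roughly as follows. This is a sketch, not the app's actual parsing code: the function name `parse_urls`, the default `https://` prefix for bare domains, the removal of duplicates, and the truncation to the 20-URL batch limit are all assumptions made for illustration.

```python
MAX_URLS = 20  # batch limit stated in this README


def parse_urls(raw_text, limit=MAX_URLS):
    """Split textbox input into a cleaned list of at most `limit` URLs."""
    urls = []
    for line in raw_text.splitlines():
        url = line.strip()
        if not url:
            continue  # skip blank lines
        if not url.startswith(("http://", "https://")):
            url = "https://" + url  # assumption: bare domains default to HTTPS
        if url not in urls:
            urls.append(url)  # drop exact duplicates, keep input order
    return urls[:limit]  # assumption: extra URLs are truncated, not rejected
```

For example, `parse_urls("example.com\n\nhttps://a.org")` yields two entries, with the bare domain prefixed by `https://`.
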
## Model

This app uses the `limitedonly41/website_mistral7b_v02` model, loaded through the transformers library with 4-bit quantization to reduce memory use.

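
A minimal sketch of such a load is shown below. It is not taken from `app.py`: the function name `load_model`, the NF4 quantization type, and the float16 compute dtype are assumptions. It also assumes the `bitsandbytes` and `accelerate` packages are installed, a CUDA GPU is available, and the repository holds weights that `AutoModelForCausalLM` can load. If the repository contains only a LoRA adapter, the `peft` package would be needed as well.

```python
MODEL_ID = "limitedonly41/website_mistral7b_v02"


def load_model(model_id=MODEL_ID):
    """Load the tokenizer and a 4-bit quantized model (needs a CUDA GPU)."""
    # Imported lazily so this module can be imported without the heavy deps.
    import torch
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        BitsAndBytesConfig,
    )

    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",             # assumption: NF4 quantization
        bnb_4bit_compute_dtype=torch.float16,  # assumption: fp16 compute
    )
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant_config,
        device_map="auto",  # let accelerate place the weights on the GPU
    )
    return tokenizer, model
```
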
## Technical Details

- Built with Gradio for the interface
- Uses transformers instead of unsloth for compatibility
- Uses the ZeroGPU decorator for efficient GPU utilization
- Processes URLs asynchronously for better performance
- Supports translation of non-English websites

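
The ZeroGPU decorator mentioned above is `spaces.GPU` from the `spaces` package. The sketch below shows one way to apply it while keeping the code runnable outside Spaces. The fallback decorator, the function name `classify_batch`, and its placeholder body are hypothetical; the real inference code lives in `app.py`.

```python
try:
    import spaces  # available on Hugging Face Spaces

    gpu = spaces.GPU
except ImportError:

    def gpu(fn):
        """Fallback for local runs: leave the function unchanged."""
        return fn


@gpu
def classify_batch(texts):
    """Placeholder for GPU-bound inference over a batch of page texts."""
    # Hypothetical stand-in: a real implementation would run the model here.
    return ["OTHER" for _ in texts]
```

Keeping all GPU work inside the decorated function means a GPU is only requested while that function runs.
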
## Limitations

- Maximum 20 URLs per batch (reduced for stability)
- 30-second timeout per URL
- Requires an internet connection for URL scraping
- Model loading may take a few minutes on first run

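
The per-URL timeout and graceful handling of failed URLs can be sketched with `asyncio` as below. This illustrates the pattern only and is not the app's scraping code: `fetch_with_timeout`, `fetch_all`, and the caller-supplied `fetch` coroutine are hypothetical names.

```python
import asyncio

URL_TIMEOUT_SECONDS = 30  # per-URL limit stated above


async def fetch_with_timeout(fetch, url, timeout=URL_TIMEOUT_SECONDS):
    """Run `fetch(url)` and return (url, text, error); never raises."""
    try:
        text = await asyncio.wait_for(fetch(url), timeout=timeout)
        return url, text, None
    except asyncio.TimeoutError:
        return url, None, f"timed out after {timeout}s"
    except Exception as exc:  # keep one bad URL from failing the batch
        return url, None, str(exc)


async def fetch_all(fetch, urls, timeout=URL_TIMEOUT_SECONDS):
    """Fetch all URLs concurrently, preserving input order."""
    tasks = [fetch_with_timeout(fetch, url, timeout) for url in urls]
    return await asyncio.gather(*tasks)
```

Each result carries either the page text or an error message, so a slow or unreachable site shows up as a failed row instead of aborting the whole batch.
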