---
title: Website Category Classifier
emoji: 🔍
colorFrom: blue
colorTo: green
sdk: gradio
app_file: app.py
pinned: false
license: mit
hardware: zero-gpu
---

# Website Category Classifier (Fixed Version)

This application classifies websites into three categories using a fine-tuned Mistral 7B model:
- **OTHER**: General websites
- **NEWS/BLOG**: News websites and blogs  
- **E-COMMERCE**: Online shopping sites
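
A generative model returns free-form text, so its output usually has to be mapped back onto the three labels above. The following is a minimal sketch of that step, not the app's actual code: `normalize_label` and its fallback to `OTHER` are hypothetical.

```python
# The three labels listed in this README.
CATEGORIES = ("OTHER", "NEWS/BLOG", "E-COMMERCE")


def normalize_label(raw: str) -> str:
    """Map free-form model output to one of the three categories.

    Hypothetical helper: anything unrecognized falls back to "OTHER".
    """
    text = raw.strip().upper()
    # Check the specific labels first so "OTHER" stays the fallback.
    if "E-COMMERCE" in text or "ECOMMERCE" in text:
        return "E-COMMERCE"
    if "NEWS" in text or "BLOG" in text:
        return "NEWS/BLOG"
    return "OTHER"
```

Falling back to `OTHER` keeps every URL in the result table even when the model produces unexpected text.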

## 🔧 Fixed Issues

This version resolves the `torch.int1` `AttributeError` by:
- Removing the unsloth dependency, which caused the compatibility issue
- Using the transformers library directly
- Pinning the PyTorch version to avoid conflicts
- Adding proper error handling

## Features

- **Batch Processing**: Classify up to 20 URLs at once
- **AI-Powered**: Uses Mistral 7B model for accurate classification
- **Real-time Progress**: Shows processing progress
- **GPU Acceleration**: Powered by Hugging Face ZeroGPU
- **Error Recovery**: Handles failed URLs gracefully

## Usage

1. Enter URLs (one per line) in the input textbox
2. Click "🚀 Classify Websites"
3. View results showing each URL and its predicted category
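
The textbox input from step 1 could be turned into a clean URL list roughly as follows. This is a sketch under assumptions: `parse_urls` is a hypothetical helper, and the `https://` prefixing and de-duplication are illustrative choices, not documented app behavior. Only the 20-URL limit comes from this README.

```python
from urllib.parse import urlparse

MAX_URLS = 20  # batch limit stated in this README


def parse_urls(text: str, max_urls: int = MAX_URLS) -> list[str]:
    """Split one-URL-per-line input into a de-duplicated, capped list."""
    urls = []
    for line in text.splitlines():
        candidate = line.strip()
        if not candidate:
            continue  # skip blank lines
        # Assumption: bare domains get an https:// prefix.
        if "://" not in candidate:
            candidate = "https://" + candidate
        parsed = urlparse(candidate)
        if parsed.scheme in ("http", "https") and parsed.netloc:
            urls.append(candidate)
    # Drop duplicates while preserving order, then enforce the batch limit.
    unique = list(dict.fromkeys(urls))
    return unique[:max_urls]
```

For example, `parse_urls("example.com\n\nhttps://news.example.org")` yields two entries, with the first normalized to `https://example.com`.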

## Model

This app uses the `limitedonly41/website_mistral7b_v02` model, loaded via the transformers library with 4-bit quantization for efficiency.
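
Loading a model in 4-bit with transformers typically looks like the sketch below. It assumes the checkpoint loads as a causal language model and that `bitsandbytes` and `accelerate` are installed. The quantization settings (`nf4`, `float16` compute) are illustrative assumptions, not values taken from `app.py`.

```python
MODEL_ID = "limitedonly41/website_mistral7b_v02"


def load_model(model_id: str = MODEL_ID):
    """Load the tokenizer and a 4-bit quantized model (illustrative sketch)."""
    # Imported lazily so this module can be imported without the GPU stack.
    import torch
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        BitsAndBytesConfig,
    )

    # Assumed quantization settings; the app's actual values may differ.
    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,
    )
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant_config,
        device_map="auto",  # let accelerate place the weights
    )
    return model, tokenizer
```

Calling `load_model()` downloads the weights on first use, which is consistent with the slow first run noted under Limitations.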

## Technical Details

- Built with Gradio for the interface
- Uses transformers instead of unsloth for compatibility
- ZeroGPU decorator for efficient GPU utilization
- Async processing for better performance
- Translation support for non-English websites

## Limitations

- Maximum 20 URLs per batch (reduced for stability)
- 30-second timeout per URL
- Requires internet connection for URL scraping
- Model loading may take a few minutes on first run
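
The per-URL timeout and the graceful handling of failed URLs can be sketched with `asyncio`. This is an assumption about the approach, not the app's actual code: `classify_with_timeout` and `classify_batch` are hypothetical names, and `classify` stands for any coroutine that scrapes and classifies a single URL.

```python
import asyncio

URL_TIMEOUT_SECONDS = 30  # per-URL timeout stated in this README


async def classify_with_timeout(url, classify, timeout=URL_TIMEOUT_SECONDS):
    """Run classify(url), turning timeouts and errors into result rows."""
    try:
        category = await asyncio.wait_for(classify(url), timeout=timeout)
        return url, category
    except asyncio.TimeoutError:
        return url, "ERROR: timeout"
    except Exception as exc:
        # One failing URL must not abort the rest of the batch.
        return url, f"ERROR: {exc}"


async def classify_batch(urls, classify, timeout=URL_TIMEOUT_SECONDS):
    """Classify all URLs concurrently; results keep the input order."""
    tasks = [classify_with_timeout(u, classify, timeout) for u in urls]
    return await asyncio.gather(*tasks)
```

Because each URL is wrapped individually, a slow or broken site produces an error row while the other results are still returned.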