---
title: Content Classifier
emoji: 🔍
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 4.44.0
app_file: app_hf.py
pinned: false
license: mit
---
# Content Classifier
This Space provides a content classification service using an ONNX model. It categorizes text as either "safe" or "unsafe" content.
## Features
- **Single Text Classification**: Classify individual pieces of text
- **Batch Processing**: Process multiple texts at once
- **API Access**: Use as a web service via HTTP requests
- **Real-time Interface**: Interactive Gradio web interface
## Usage
### Web Interface
Simply enter text in the interface and click "Classify" to get predictions.
### API Usage
#### Single Text Classification
```bash
curl -X POST https://your-space-name.hf.space/predict \
-H "Content-Type: application/json" \
-d '{"text": "Your content to classify"}'
```
#### Batch Processing
```bash
curl -X POST https://your-space-name.hf.space/predict \
-H "Content-Type: application/json" \
-d '{"text": ["Text 1", "Text 2", "Text 3"]}'
```
### Response Format
```json
{
  "is_threat": false,
  "final_confidence": 0.85,
  "threat_prediction": "safe",
  "onnx_prediction": {
    "safe": 0.85,
    "unsafe": 0.15
  },
  "models_used": ["onnx"]
}
```
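The same endpoint can be called from Python using only the standard library (a minimal sketch; the Space URL is a placeholder and the `/predict` route is taken from the curl examples above):

```python
import json
import urllib.request

# Placeholder: replace with your actual Space URL
SPACE_URL = "https://your-space-name.hf.space/predict"

def build_payload(text):
    """Accepts a single string or a list of strings, mirroring the API examples."""
    return {"text": text}

def classify(text):
    """POST the payload to the classifier endpoint and return the parsed JSON response."""
    data = json.dumps(build_payload(text)).encode("utf-8")
    req = urllib.request.Request(
        SPACE_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Passing a list to `classify` produces the batch request shown above; the payload shape is the only difference between the two endpoints.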
## Model Information
The classifier uses an ONNX model (`contextClassifier.onnx`) for efficient inference. The model processes text and outputs probability scores for "safe" and "unsafe" classifications.
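Inference against an ONNX model of this kind typically follows a load-run-softmax pattern. The sketch below is illustrative, not the Space's actual code: the input tensor name `input_ids` and the label order (`safe` at index 0) are assumptions and may differ for `contextClassifier.onnx`.

```python
import numpy as np

def softmax(logits):
    # Convert raw model outputs to probabilities that sum to 1
    e = np.exp(logits - np.max(logits))
    return e / e.sum()

def classify_onnx(model_path, input_ids):
    """Hypothetical sketch of ONNX Runtime inference for a 2-class text model."""
    import onnxruntime as ort  # lazy import so the softmax helper stands alone
    session = ort.InferenceSession(model_path)
    (logits,) = session.run(None, {"input_ids": input_ids})
    probs = softmax(logits[0])
    # Assumed label order: index 0 = "safe", index 1 = "unsafe"
    return {"safe": float(probs[0]), "unsafe": float(probs[1])}
```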
## Local Development
1. Clone this repository
2. Install dependencies: `pip install -r requirements.txt`
3. Run the application: `python app_hf.py`
4. Access the interface at `http://localhost:7860`
## Basic Python Usage
```python
from inference import ContentClassifierInference

# Initialize the classifier
classifier = ContentClassifierInference()

# Classify a single text
result = classifier.predict("Your text here")
print(f"Threat: {result['is_threat']}, Confidence: {result['final_confidence']}")

# Classify multiple texts
texts = ["Text 1", "Text 2"]
results = classifier.predict_batch(texts)
```
### Full Response Format
The `predict` method returns a dictionary in the following format:
```json
{
  "is_threat": false,
  "final_confidence": 0.75,
  "threat_prediction": "safe",
  "sentiment_analysis": null,
  "onnx_prediction": {
    "safe": 0.75,
    "unsafe": 0.25
  },
  "models_used": ["onnx"],
  "raw_predictions": {
    "onnx": {
      "safe": 0.75,
      "unsafe": 0.25
    },
    "sentiment": null
  }
}
```
### Configuration
Modify `config.json` to adjust:
- `labels`: Class labels for your model
- `max_length`: Maximum input sequence length
- `threshold`: Classification confidence threshold
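For example, a `config.json` using these keys might look like the following (the keys come from the list above; the values are illustrative defaults, not the shipped configuration):

```json
{
  "labels": ["safe", "unsafe"],
  "max_length": 128,
  "threshold": 0.5
}
```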
## Testing
Run the test script:
```bash
python test_inference.py
```
## Model Requirements
- Input: Text string
- Output: Classification probabilities
- Format: ONNX model file
Note: You may need to adjust the `preprocess` method in `inference.py` based on your specific model's input requirements (tokenization, encoding, etc.).
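As a starting point for such an adjustment, a generic `preprocess` usually maps tokens to ids and pads or truncates to `max_length`. The sketch below uses naive whitespace tokenization with a hypothetical vocabulary; a real model almost always requires the tokenizer it was trained with:

```python
import numpy as np

def preprocess(text, vocab, max_length=128):
    """Hypothetical sketch: map whitespace tokens to ids, then pad/truncate.

    `vocab` is an assumed dict of token -> id; "[UNK]" covers unknown tokens.
    """
    unk_id = vocab.get("[UNK]", 0)
    ids = [vocab.get(tok, unk_id) for tok in text.lower().split()]
    ids = ids[:max_length]                    # truncate long inputs
    ids += [0] * (max_length - len(ids))      # pad short inputs with 0
    return np.array([ids], dtype=np.int64)    # batch of 1 for ONNX Runtime
```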