---
title: Content Classifier
emoji: 🔍
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 4.44.0
app_file: app_hf.py
pinned: false
license: mit
---

# Content Classifier

This Space provides a content classification service using an ONNX model. It categorizes text as either "safe" or "unsafe".

## Features

- **Single Text Classification**: Classify individual pieces of text
- **Batch Processing**: Process multiple texts at once
- **API Access**: Use as a web service via HTTP requests
- **Real-time Interface**: Interactive Gradio web interface

## Usage

### Web Interface

Enter text in the interface and click "Classify" to get predictions.

### API Usage

#### Single Text Classification

```bash
curl -X POST https://your-space-name.hf.space/predict \
  -H "Content-Type: application/json" \
  -d '{"text": "Your content to classify"}'
```

#### Batch Processing

```bash
curl -X POST https://your-space-name.hf.space/predict \
  -H "Content-Type: application/json" \
  -d '{"text": ["Text 1", "Text 2", "Text 3"]}'
```

### Response Format

```json
{
  "is_threat": false,
  "final_confidence": 0.85,
  "threat_prediction": "safe",
  "onnx_prediction": {
    "safe": 0.85,
    "unsafe": 0.15
  },
  "models_used": ["onnx"]
}
```

## Model Information

The classifier uses an ONNX model (`contextClassifier.onnx`) for efficient inference. The model processes text and outputs probability scores for the "safe" and "unsafe" classes.

## Local Development

1. Clone this repository
2. Install dependencies: `pip install -r requirements.txt`
3. Run the application: `python app_hf.py`
4. Access the interface at `http://localhost:7860`

## Basic Python Usage

```python
from inference import ContentClassifierInference

# Initialize the classifier
classifier = ContentClassifierInference()

# Classify a single text
result = classifier.predict("Your text here")
print(f"Threat: {result['is_threat']}, Confidence: {result['final_confidence']}")

# Classify multiple texts
texts = ["Text 1", "Text 2"]
results = classifier.predict_batch(texts)
```

### Detailed Response Format

The model returns predictions in the following format:

```json
{
  "is_threat": false,
  "final_confidence": 0.75,
  "threat_prediction": "safe",
  "sentiment_analysis": null,
  "onnx_prediction": {
    "safe": 0.75,
    "unsafe": 0.25
  },
  "models_used": ["onnx"],
  "raw_predictions": {
    "onnx": {
      "safe": 0.75,
      "unsafe": 0.25
    },
    "sentiment": null
  }
}
```

### Configuration

Modify `config.json` to adjust:

- `labels`: Class labels for your model
- `max_length`: Maximum input sequence length
- `threshold`: Classification confidence threshold

## Testing

Run the test script:

```bash
python test_inference.py
```

## Model Requirements

- Input: Text string
- Output: Classification probabilities
- Format: ONNX model file

Note: You may need to adjust the `preprocess` method in `inference.py` based on your specific model's input requirements (tokenization, encoding, etc.).
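For reference, a `config.json` using the keys described under Configuration might look like this; the values shown are placeholders, not the Space's actual settings:

```json
{
  "labels": ["safe", "unsafe"],
  "max_length": 128,
  "threshold": 0.5
}
```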
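The exact preprocessing depends on how the model was trained, and `inference.py` is the source of truth. Purely as an illustration, a `preprocess` method for a model expecting fixed-length token IDs might look like the sketch below; the vocabulary, special token IDs, and `max_length` default here are hypothetical, and a real model should use the tokenizer it was trained with:

```python
def preprocess(text, vocab, max_length=128, pad_id=0, unk_id=1):
    """Map text to fixed-length token IDs plus an attention mask.

    Illustrative sketch only: vocab, pad_id, and unk_id are
    hypothetical, not the actual tokenizer used by this Space.
    """
    tokens = text.lower().split()
    # Look up each token, falling back to the unknown-token ID,
    # and truncate to max_length
    ids = [vocab.get(tok, unk_id) for tok in tokens][:max_length]
    attention_mask = [1] * len(ids)
    # Pad both sequences out to max_length
    ids += [pad_id] * (max_length - len(ids))
    attention_mask += [0] * (max_length - len(attention_mask))
    return {"input_ids": ids, "attention_mask": attention_mask}
```

The returned dictionary mirrors the input names commonly fed to an ONNX text model, but the real input names and shapes depend on how `contextClassifier.onnx` was exported.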
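How the `threshold` setting interacts with the model's probabilities is defined in `inference.py`. One plausible reading, assuming `is_threat` is set when the "unsafe" probability meets the threshold, can be sketched as follows (`decide` is a hypothetical helper, not part of the actual API):

```python
def decide(onnx_prediction, threshold=0.5):
    """Turn class probabilities into the documented response fields.

    Hypothetical helper: mirrors the response format shown above,
    but the real decision logic lives in inference.py.
    """
    is_threat = onnx_prediction["unsafe"] >= threshold
    label = "unsafe" if is_threat else "safe"
    return {
        "is_threat": is_threat,
        "threat_prediction": label,
        "final_confidence": onnx_prediction[label],
    }
```

Raising `threshold` makes the classifier more conservative about flagging content as unsafe; lowering it does the opposite.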
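As the curl examples under API Usage show, the `/predict` endpoint accepts either a single string or a list of strings in the `text` field. A small helper for building that request body from Python (the `build_payload` name is illustrative; the resulting JSON can then be POSTed with any HTTP client):

```python
import json

def build_payload(text):
    """Build the JSON body for /predict.

    Accepts a str for a single text or a list of str for a batch,
    matching the two curl examples. Helper name is illustrative.
    """
    if not isinstance(text, (str, list)):
        raise TypeError("text must be a string or a list of strings")
    return json.dumps({"text": text})
```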