---
title: Content Classifier
emoji: 🔍
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 4.44.0
app_file: app_hf.py
pinned: false
license: mit
---

# Content Classifier

This Space provides a content classification service using an ONNX model. It categorizes text as either "safe" or "unsafe" content.

## Features

- **Single Text Classification**: classify individual pieces of text
- **Batch Processing**: process multiple texts at once
- **API Access**: use the service via HTTP requests
- **Real-time Interface**: interactive Gradio web interface

## Usage

### Web Interface

Simply enter text in the interface and click "Classify" to get predictions.

### API Usage

#### Single Text Classification

```bash
curl -X POST https://your-space-name.hf.space/predict \
  -H "Content-Type: application/json" \
  -d '{"text": "Your content to classify"}'
```

#### Batch Processing

```bash
curl -X POST https://your-space-name.hf.space/predict \
  -H "Content-Type: application/json" \
  -d '{"text": ["Text 1", "Text 2", "Text 3"]}'
```

#### Response Format

```json
{
  "is_threat": false,
  "final_confidence": 0.85,
  "threat_prediction": "safe",
  "onnx_prediction": {
    "safe": 0.85,
    "unsafe": 0.15
  },
  "models_used": ["onnx"]
}
```
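From a script, the response body can be parsed with the standard library. The snippet below works through the example response above (the values are illustrative, and the Space URL in the curl examples is a placeholder you must replace):

```python
import json

# Example response body from the /predict endpoint (values illustrative)
body = """
{
  "is_threat": false,
  "final_confidence": 0.85,
  "threat_prediction": "safe",
  "onnx_prediction": {"safe": 0.85, "unsafe": 0.15},
  "models_used": ["onnx"]
}
"""

result = json.loads(body)
if result["is_threat"]:
    print(f"Flagged as unsafe ({result['final_confidence']:.0%} confidence)")
else:
    print(f"Classified as {result['threat_prediction']}")
```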

## Model Information

The classifier uses an ONNX model (`contextClassifier.onnx`) for efficient inference. The model processes text and outputs probability scores for the "safe" and "unsafe" classes.
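As a rough sketch, turning the model's two raw scores into the probability pair shown in the response format might look like this (a hypothetical post-processing step; the actual output layout depends on how the model was exported):

```python
import numpy as np

def scores_to_probs(logits, labels=("safe", "unsafe")):
    """Softmax over the two class scores, returned as a label -> probability dict."""
    shifted = np.exp(logits - np.max(logits))  # subtract max for numerical stability
    probs = shifted / shifted.sum()
    return dict(zip(labels, probs.tolist()))

probs = scores_to_probs(np.array([2.0, 0.0]))  # "safe" dominates here
```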

## Local Development

1. Clone this repository
2. Install dependencies: `pip install -r requirements.txt`
3. Run the application: `python app_hf.py`
4. Open the interface at http://localhost:7860

### Basic Python Usage

```python
from inference import ContentClassifierInference

# Initialize classifier
classifier = ContentClassifierInference()

# Classify single text
result = classifier.predict("Your text here")
print(f"Threat: {result['is_threat']}, Confidence: {result['final_confidence']}")

# Classify multiple texts
texts = ["Text 1", "Text 2"]
results = classifier.predict_batch(texts)
```

### Full Response Format

The model returns predictions in the following format:

```json
{
  "is_threat": false,
  "final_confidence": 0.75,
  "threat_prediction": "safe",
  "sentiment_analysis": null,
  "onnx_prediction": {
    "safe": 0.75,
    "unsafe": 0.25
  },
  "models_used": ["onnx"],
  "raw_predictions": {
    "onnx": {
      "safe": 0.75,
      "unsafe": 0.25
    },
    "sentiment": null
  }
}
```

## Configuration

Modify `config.json` to adjust:

- `labels`: class labels for your model
- `max_length`: maximum input sequence length
- `threshold`: classification confidence threshold
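A minimal `config.json` might look like this (the values are illustrative; match them to your model):

```json
{
  "labels": ["safe", "unsafe"],
  "max_length": 128,
  "threshold": 0.5
}
```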

## Testing

Run the test script:

```bash
python test_inference.py
```

## Model Requirements

- **Input**: text string
- **Output**: classification probabilities
- **Format**: ONNX model file

**Note**: You may need to adjust the `preprocess` method in `inference.py` based on your specific model's input requirements (tokenization, encoding, etc.).
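For instance, a model that expects fixed-length token-id input could use a preprocess step along these lines (purely illustrative; `vocab`, the unknown/pad id, and the padding scheme are assumptions, not the actual `inference.py` code):

```python
import numpy as np

def preprocess(text, vocab, max_length=128):
    """Map whitespace-split tokens to ids, then pad/truncate to max_length."""
    ids = [vocab.get(token, 0) for token in text.lower().split()]  # 0 = unknown/pad
    ids = ids[:max_length] + [0] * max(0, max_length - len(ids))
    return np.array([ids], dtype=np.int64)  # shape (1, max_length) for ONNX Runtime

vocab = {"hello": 1, "world": 2}
batch = preprocess("Hello world", vocab, max_length=8)
```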