---
title: Content Classifier
emoji: ๐
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 4.44.0
app_file: app_hf.py
pinned: false
license: mit
---
# Content Classifier

This Space provides a content classification service using an ONNX model. It categorizes text as either "safe" or "unsafe" content.

## Features

- **Single Text Classification**: Classify individual pieces of text
- **Batch Processing**: Process multiple texts at once
- **API Access**: Use as a web service via HTTP requests
- **Real-time Interface**: Interactive Gradio web interface

## Usage

### Web Interface

Enter text in the interface and click "Classify" to get a prediction.

### API Usage

#### Single Text Classification
```bash
curl -X POST https://your-space-name.hf.space/predict \
  -H "Content-Type: application/json" \
  -d '{"text": "Your content to classify"}'
```

#### Batch Processing

```bash
curl -X POST https://your-space-name.hf.space/predict \
  -H "Content-Type: application/json" \
  -d '{"text": ["Text 1", "Text 2", "Text 3"]}'
```
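The same endpoint can be called from Python. A minimal sketch using only the standard library; the Space URL is a placeholder, as in the `curl` examples above:

```python
import json
import urllib.request

# Placeholder URL -- replace with your actual Space endpoint
SPACE_URL = "https://your-space-name.hf.space/predict"

def classify(text):
    """POST a single text (or a list of texts) to the /predict endpoint."""
    payload = json.dumps({"text": text}).encode("utf-8")
    req = urllib.request.Request(
        SPACE_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))

# Single text:  classify("Your content to classify")
# Batch:        classify(["Text 1", "Text 2", "Text 3"])
```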
### Response Format

```json
{
  "is_threat": false,
  "final_confidence": 0.85,
  "threat_prediction": "safe",
  "onnx_prediction": {
    "safe": 0.85,
    "unsafe": 0.15
  },
  "models_used": ["onnx"]
}
```
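The `is_threat` flag and final confidence can be derived from the probability scores. A sketch assuming a simple threshold rule; the 0.5 default here is an assumption (the actual threshold lives in `config.json`):

```python
def to_decision(onnx_prediction, threshold=0.5):
    """Turn {"safe": p, "unsafe": q} scores into a threat decision.

    Assumed rule: flag as a threat when the "unsafe" probability
    meets or exceeds the threshold; report the winning class's score.
    """
    is_threat = onnx_prediction["unsafe"] >= threshold
    prediction = "unsafe" if is_threat else "safe"
    return {
        "is_threat": is_threat,
        "threat_prediction": prediction,
        "final_confidence": onnx_prediction[prediction],
    }

print(to_decision({"safe": 0.85, "unsafe": 0.15}))
# {'is_threat': False, 'threat_prediction': 'safe', 'final_confidence': 0.85}
```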
## Model Information

The classifier uses an ONNX model (`contextClassifier.onnx`) for efficient inference. The model processes text and outputs probability scores for the "safe" and "unsafe" classes.

## Local Development

1. Clone this repository
2. Install dependencies: `pip install -r requirements.txt`
3. Run the application: `python app_hf.py`
4. Access the interface at `http://localhost:7860`

## Basic Python Usage
```python
from inference import ContentClassifierInference

# Initialize the classifier
classifier = ContentClassifierInference()

# Classify a single text
result = classifier.predict("Your text here")
print(f"Threat: {result['is_threat']}, Confidence: {result['final_confidence']}")

# Classify multiple texts
texts = ["Text 1", "Text 2"]
results = classifier.predict_batch(texts)
```
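Assuming `predict_batch` returns one result dictionary per input (mirroring the single-text response format), the batch output can be iterated like this; the `results` list below is a stand-in for illustration:

```python
texts = ["Text 1", "Text 2"]
# results = classifier.predict_batch(texts)  # one dict per input (assumed)
# Stand-in results so the example is self-contained:
results = [
    {"is_threat": False, "final_confidence": 0.91, "threat_prediction": "safe"},
    {"is_threat": True, "final_confidence": 0.77, "threat_prediction": "unsafe"},
]

for text, result in zip(texts, results):
    flag = "UNSAFE" if result["is_threat"] else "safe"
    print(f"{flag:>6}  {result['final_confidence']:.2f}  {text}")
```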
### Full Response Format

The classifier returns predictions in the following format (the full structure, including raw per-model scores):

```json
{
  "is_threat": false,
  "final_confidence": 0.75,
  "threat_prediction": "safe",
  "sentiment_analysis": null,
  "onnx_prediction": {
    "safe": 0.75,
    "unsafe": 0.25
  },
  "models_used": ["onnx"],
  "raw_predictions": {
    "onnx": {
      "safe": 0.75,
      "unsafe": 0.25
    },
    "sentiment": null
  }
}
```
### Configuration

Modify `config.json` to adjust:

- `labels`: Class labels for your model
- `max_length`: Maximum input sequence length
- `threshold`: Classification confidence threshold
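A minimal `config.json` illustrating those keys; the values shown are assumptions, adjust them to your model:

```json
{
  "labels": ["safe", "unsafe"],
  "max_length": 512,
  "threshold": 0.5
}
```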
## Testing

Run the test script:

```bash
python test_inference.py
```
## Model Requirements

- Input: Text string
- Output: Classification probabilities
- Format: ONNX model file

Note: You may need to adjust the `preprocess` method in `inference.py` based on your specific model's input requirements (tokenization, encoding, etc.).
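As a starting point, a `preprocess` adjustment might look like the following sketch. The vocabulary lookup, special-token ids, and padding scheme here are all placeholders; a real transformer-based model would use its matching tokenizer instead:

```python
def preprocess(text, vocab=None, max_length=128):
    """Toy preprocessing sketch: lowercase, whitespace-tokenize,
    map tokens to ids, and pad/truncate to max_length.

    Replace with your model's actual tokenizer and encoding.
    """
    vocab = vocab or {}          # hypothetical token -> id mapping
    pad_id, unk_id = 0, 1        # assumed special-token ids
    tokens = text.lower().split()
    ids = [vocab.get(tok, unk_id) for tok in tokens][:max_length]
    ids += [pad_id] * (max_length - len(ids))  # right-pad to fixed length
    return ids

ids = preprocess("Hello World", vocab={"hello": 2, "world": 3}, max_length=4)
# [2, 3, 0, 0]
```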