---
title: Content Classifier
emoji: 🔍
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 4.44.0
app_file: app_hf.py
pinned: false
license: mit
---

# Content Classifier

This Space provides a content classification service using an ONNX model. It categorizes text as either "safe" or "unsafe".

## Features

- **Single Text Classification**: Classify individual pieces of text
- **Batch Processing**: Process multiple texts at once
- **API Access**: Use as a web service via HTTP requests
- **Real-time Interface**: Interactive Gradio web interface

## Usage

### Web Interface

Enter text in the interface and click "Classify" to get predictions.

### API Usage

#### Single Text Classification

```bash
curl -X POST https://your-space-name.hf.space/predict \
  -H "Content-Type: application/json" \
  -d '{"text": "Your content to classify"}'
```

#### Batch Processing

```bash
curl -X POST https://your-space-name.hf.space/predict \
  -H "Content-Type: application/json" \
  -d '{"text": ["Text 1", "Text 2", "Text 3"]}'
```

### Response Format

```json
{
  "is_threat": false,
  "final_confidence": 0.85,
  "threat_prediction": "safe",
  "onnx_prediction": {
    "safe": 0.85,
    "unsafe": 0.15
  },
  "models_used": ["onnx"]
}
```

## Model Information

The classifier uses an ONNX model (`contextClassifier.onnx`) for efficient inference. The model processes text and outputs probability scores for the "safe" and "unsafe" classes.

## Local Development

1. Clone this repository
2. Install dependencies: `pip install -r requirements.txt`
3. Run the application: `python app_hf.py`
4. Access the interface at `http://localhost:7860`

## Basic Python Usage

```python
from inference import ContentClassifierInference

# Initialize the classifier
classifier = ContentClassifierInference()

# Classify a single text
result = classifier.predict("Your text here")
print(f"Threat: {result['is_threat']}, Confidence: {result['final_confidence']}")

# Classify multiple texts
texts = ["Text 1", "Text 2"]
results = classifier.predict_batch(texts)
```

### Detailed Response Format

The model returns predictions in the following format:

```json
{
  "is_threat": false,
  "final_confidence": 0.75,
  "threat_prediction": "safe",
  "sentiment_analysis": null,
  "onnx_prediction": {
    "safe": 0.75,
    "unsafe": 0.25
  },
  "models_used": ["onnx"],
  "raw_predictions": {
    "onnx": {
      "safe": 0.75,
      "unsafe": 0.25
    },
    "sentiment": null
  }
}
```

### Configuration

Modify `config.json` to adjust:

- `labels`: Class labels for your model
- `max_length`: Maximum input sequence length
- `threshold`: Classification confidence threshold

## Testing

Run the test script:

```bash
python test_inference.py
```

## Model Requirements

- Input: Text string
- Output: Classification probabilities
- Format: ONNX model file

Note: You may need to adjust the `preprocess` method in `inference.py` based on your specific model's input requirements (tokenization, encoding, etc.).
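For reference, a `config.json` using the keys described under Configuration might look like this; the values shown are placeholders, not the Space's actual settings:

```json
{
  "labels": ["safe", "unsafe"],
  "max_length": 128,
  "threshold": 0.5
}
```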
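The exact preprocessing depends on how the model was trained, and `inference.py` is the source of truth. Purely as an illustration, a `preprocess` method for a model expecting fixed-length token IDs might look like the sketch below; the vocabulary, special token IDs, and `max_length` default here are hypothetical, and a real model should use the tokenizer it was trained with:

```python
def preprocess(text, vocab, max_length=128, pad_id=0, unk_id=1):
    """Map text to fixed-length token IDs plus an attention mask.

    Illustrative sketch only: vocab, pad_id, and unk_id are
    hypothetical, not the actual tokenizer used by this Space.
    """
    tokens = text.lower().split()
    # Look up each token, falling back to the unknown-token ID,
    # and truncate to max_length
    ids = [vocab.get(tok, unk_id) for tok in tokens][:max_length]
    attention_mask = [1] * len(ids)
    # Pad both sequences out to max_length
    ids += [pad_id] * (max_length - len(ids))
    attention_mask += [0] * (max_length - len(attention_mask))
    return {"input_ids": ids, "attention_mask": attention_mask}
```

The returned dictionary mirrors the input names commonly fed to an ONNX text model, but the real input names and shapes depend on how `contextClassifier.onnx` was exported.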
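How the `threshold` setting interacts with the model's probabilities is defined in `inference.py`. One plausible reading, assuming `is_threat` is set when the "unsafe" probability meets the threshold, can be sketched as follows (`decide` is a hypothetical helper, not part of the actual API):

```python
def decide(onnx_prediction, threshold=0.5):
    """Turn class probabilities into the documented response fields.

    Hypothetical helper: mirrors the response format shown above,
    but the real decision logic lives in inference.py.
    """
    is_threat = onnx_prediction["unsafe"] >= threshold
    label = "unsafe" if is_threat else "safe"
    return {
        "is_threat": is_threat,
        "threat_prediction": label,
        "final_confidence": onnx_prediction[label],
    }
```

Raising `threshold` makes the classifier more conservative about flagging content as unsafe; lowering it does the opposite.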
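As the curl examples under API Usage show, the `/predict` endpoint accepts either a single string or a list of strings in the `text` field. A small helper for building that request body from Python (the `build_payload` name is illustrative; the resulting JSON can then be POSTed with any HTTP client):

```python
import json

def build_payload(text):
    """Build the JSON body for /predict.

    Accepts a str for a single text or a list of str for a batch,
    matching the two curl examples. Helper name is illustrative.
    """
    if not isinstance(text, (str, list)):
        raise TypeError("text must be a string or a list of strings")
    return json.dumps({"text": text})
```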