---
title: Content Classifier
emoji: 🔍
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 4.44.0
app_file: app_hf.py
pinned: false
license: mit
---
# Content Classifier
This Space provides a content classification service using an ONNX model. It categorizes text as either "safe" or "unsafe" content.
## Features
- **Single Text Classification**: Classify individual pieces of text
- **Batch Processing**: Process multiple texts at once
- **API Access**: Use as a web service via HTTP requests
- **Real-time Interface**: Interactive Gradio web interface
## Usage
### Web Interface
Simply enter text in the interface and click "Classify" to get predictions.
### API Usage
#### Single Text Classification
```bash
curl -X POST https://your-space-name.hf.space/predict \
-H "Content-Type: application/json" \
-d '{"text": "Your content to classify"}'
```
#### Batch Processing
```bash
curl -X POST https://your-space-name.hf.space/predict \
-H "Content-Type: application/json" \
-d '{"text": ["Text 1", "Text 2", "Text 3"]}'
```
### Response Format
```json
{
  "is_threat": false,
  "final_confidence": 0.85,
  "threat_prediction": "safe",
  "onnx_prediction": {
    "safe": 0.85,
    "unsafe": 0.15
  },
  "models_used": ["onnx"]
}
```
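The same endpoint can be called from Python using only the standard library (a minimal sketch; the Space URL is a placeholder and the `/predict` route is taken from the curl examples above):

```python
import json
import urllib.request

# Placeholder: replace with your actual Space URL
SPACE_URL = "https://your-space-name.hf.space/predict"

def build_payload(text):
    """Accepts a single string or a list of strings, mirroring the API examples."""
    return {"text": text}

def classify(text):
    """POST the payload to the classifier endpoint and return the parsed JSON response."""
    data = json.dumps(build_payload(text)).encode("utf-8")
    req = urllib.request.Request(
        SPACE_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Passing a list to `classify` produces the batch request shown above; the payload shape is the only difference between the two endpoints.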
## Model Information
The classifier uses an ONNX model (`contextClassifier.onnx`) for efficient inference. The model processes text and outputs probability scores for "safe" and "unsafe" classifications.
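Inference against an ONNX model of this kind typically follows a load-run-softmax pattern. The sketch below is illustrative, not the Space's actual code: the input tensor name `input_ids` and the label order (`safe` at index 0) are assumptions and may differ for `contextClassifier.onnx`.

```python
import numpy as np

def softmax(logits):
    # Convert raw model outputs to probabilities that sum to 1
    e = np.exp(logits - np.max(logits))
    return e / e.sum()

def classify_onnx(model_path, input_ids):
    """Hypothetical sketch of ONNX Runtime inference for a 2-class text model."""
    import onnxruntime as ort  # lazy import so the softmax helper stands alone
    session = ort.InferenceSession(model_path)
    (logits,) = session.run(None, {"input_ids": input_ids})
    probs = softmax(logits[0])
    # Assumed label order: index 0 = "safe", index 1 = "unsafe"
    return {"safe": float(probs[0]), "unsafe": float(probs[1])}
```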
## Local Development
1. Clone this repository
2. Install dependencies: `pip install -r requirements.txt`
3. Run the application: `python app_hf.py`
4. Access the interface at `http://localhost:7860`
## Basic Python Usage
```python
from inference import ContentClassifierInference

# Initialize the classifier
classifier = ContentClassifierInference()

# Classify a single text
result = classifier.predict("Your text here")
print(f"Threat: {result['is_threat']}, Confidence: {result['final_confidence']}")

# Classify multiple texts
texts = ["Text 1", "Text 2"]
results = classifier.predict_batch(texts)
```
### Full Response Format
The `predict` method returns a dictionary in the following format:
```json
{
  "is_threat": false,
  "final_confidence": 0.75,
  "threat_prediction": "safe",
  "sentiment_analysis": null,
  "onnx_prediction": {
    "safe": 0.75,
    "unsafe": 0.25
  },
  "models_used": ["onnx"],
  "raw_predictions": {
    "onnx": {
      "safe": 0.75,
      "unsafe": 0.25
    },
    "sentiment": null
  }
}
```
### Configuration
Modify `config.json` to adjust:
- `labels`: Class labels for your model
- `max_length`: Maximum input sequence length
- `threshold`: Classification confidence threshold
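For example, a `config.json` using these keys might look like the following (the keys come from the list above; the values are illustrative defaults, not the shipped configuration):

```json
{
  "labels": ["safe", "unsafe"],
  "max_length": 128,
  "threshold": 0.5
}
```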
## Testing
Run the test script:
```bash
python test_inference.py
```
## Model Requirements
- Input: Text string
- Output: Classification probabilities
- Format: ONNX model file
Note: You may need to adjust the `preprocess` method in `inference.py` based on your specific model's input requirements (tokenization, encoding, etc.).
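As a starting point for such an adjustment, a generic `preprocess` usually maps tokens to ids and pads or truncates to `max_length`. The sketch below uses naive whitespace tokenization with a hypothetical vocabulary; a real model almost always requires the tokenizer it was trained with:

```python
import numpy as np

def preprocess(text, vocab, max_length=128):
    """Hypothetical sketch: map whitespace tokens to ids, then pad/truncate.

    `vocab` is an assumed dict of token -> id; "[UNK]" covers unknown tokens.
    """
    unk_id = vocab.get("[UNK]", 0)
    ids = [vocab.get(tok, unk_id) for tok in text.lower().split()]
    ids = ids[:max_length]                    # truncate long inputs
    ids += [0] * (max_length - len(ids))      # pad short inputs with 0
    return np.array([ids], dtype=np.int64)    # batch of 1 for ONNX Runtime
```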