|
|
--- |
|
|
license: apache-2.0 |
|
|
datasets: |
|
|
- allenai/real-toxicity-prompts |
|
|
language: |
|
|
- en |
|
|
metrics: |
|
|
- accuracy |
|
|
pipeline_tag: text-classification |
|
|
tags: |
|
|
- toxic_classification |
|
|
--- |
|
|
# SegmentCNN Model for Toxic Text Classification |
|
|
|
|
|
## Overview |
|
|
|
|
|
The SegmentCNN model, also known as SpanCNN, is designed for toxic text classification, distinguishing between safe and toxic content. This model is part of the research presented in the paper [CMD: A Framework for Context-aware Model Self-Detoxification](https://arxiv.org/abs/2308.08295).
|
|
|
|
|
## Model Details |
|
|
|
|
|
- **Input**: Text data |
|
|
- **Output**: Integer label
|
|
- `0` represents **safe** content |
|
|
- `1` represents **toxic** content |
|
|
|
|
|
## Usage |
|
|
|
|
|
To use the SegmentCNN model for toxic text classification, follow the example below: |
|
|
|
|
|
```python |
|
|
from transformers import pipeline |
|
|
|
|
|
# Load the SegmentCNN (SpanCNN) model via its custom pipeline
|
|
classifier = pipeline("spancnn-classification", model="ZetangForward/SegmentCNN", trust_remote_code=True) |
|
|
|
|
|
# Example 1: Safe text |
|
|
pos_text = "You look good today~!" |
|
|
result = classifier(pos_text) |
|
|
print(result) # Output: 0 (safe) |
|
|
|
|
|
# Example 2: Toxic text |
|
|
neg_text = "You're too stupid, you're just like a fool" |
|
|
result = classifier(neg_text) |
|
|
print(result) # Output: 1 (toxic) |
|
|
``` |
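For downstream use, you may want to translate the integer output into a human-readable label. The sketch below is a minimal, hypothetical helper, assuming the pipeline returns `0` or `1` as described above; the `stub` classifier stands in for the real pipeline so the example runs without downloading the model:

```python
# Map the model's integer output to a human-readable label.
LABELS = {0: "safe", 1: "toxic"}

def label_text(classifier, text):
    """Run a classifier that returns 0/1 and translate the result to a label."""
    result = classifier(text)
    return LABELS.get(result, "unknown")

# A stub standing in for the real pipeline (which would be loaded as shown above):
stub = lambda text: 1 if "stupid" in text else 0

print(label_text(stub, "You look good today~!"))              # safe
print(label_text(stub, "You're too stupid, you're just like a fool"))  # toxic
```

In production, replace `stub` with the `classifier` loaded from `ZetangForward/SegmentCNN`.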
|
|
|
|
|
## Citation |
|
|
|
|
|
If you find this model useful, please consider citing the original paper: |
|
|
|
|
|
```bibtex |
|
|
@article{tang2023detoxify, |
|
|
title={Detoxify language model step-by-step}, |
|
|
author={Tang, Zecheng and Zhou, Keyan and Wang, Pinzheng and Ding, Yuyang and Li, Juntao and others}, |
|
|
journal={arXiv preprint arXiv:2308.08295}, |
|
|
year={2023} |
|
|
} |
|
|
``` |
|
|
|
|
|
## Disclaimer |
|
|
|
|
|
While the SegmentCNN model is effective at detecting toxic segments within text, we strongly recommend that users carefully review its results and exercise caution when applying this method in real-world scenarios. The model is not infallible, and its outputs should be validated in context-sensitive applications.