LiteDetective🕵️

Lightweight and Accurate Chinese Toxic Text Detection

Disclaimer: The paper contains content that may be profane, vulgar, or offensive.
{zhaoq, humin}@kean.edu
* Equal Contribution
CPS 3320 2025
Kean University

Abstract

Harmful content detection is a critical task for any social media platform, as misinformation and age, gender, and racial discrimination can drive away active users. This paper introduces a novel approach that leverages large language models (LLMs) to analyze specific social media data and generate training data, combined with a BERT-based Dynamic TextCNN architecture. We first crawl potentially harmful comments from targeted communities (e.g., "ShunBa"). These comments are then randomly filtered and clustered with a smaller LLM to generate policy-guided seed examples. Next, we employ a larger LLM (Qwen-3) for context-aware and context-free data augmentation. Finally, we integrate BERT embeddings with a Dynamic TextCNN classifier trained on our custom dataset.

Online Demo

Model Architecture

BERT Encoder

We utilize hfl/chinese-roberta-wwm-ext as the base transformer, concatenating the hidden states of its last two layers along the feature dimension:

$H_{\text{concat}} = [H_{L-1}; H_L] \in \mathbb{R}^{L \times 1536}$
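The concatenation above can be sketched as follows. This is a minimal illustration with randomly generated hidden states of the right shapes; in practice they would come from `AutoModel.from_pretrained("hfl/chinese-roberta-wwm-ext", output_hidden_states=True)`, and the helper name is ours, not from the source.

```python
import torch

def concat_last_two_layers(hidden_states):
    """hidden_states: sequence of (B, L, 768) tensors, one per layer.

    Concatenates the last two layers along the feature dimension,
    producing (B, L, 1536) as in the architecture description.
    """
    h_prev, h_last = hidden_states[-2], hidden_states[-1]
    return torch.cat([h_prev, h_last], dim=-1)

# Simulated encoder output: 13 layers (embeddings + 12 transformer blocks),
# batch of 2 sequences of length 32, hidden size 768.
states = [torch.randn(2, 32, 768) for _ in range(13)]
h_concat = concat_last_two_layers(states)
print(h_concat.shape)  # torch.Size([2, 32, 1536])
```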

Dynamic Multi-Scale Convolution

  • Kernel Adaptation: Each DynamicConv1d layer contains $K=4$ parallel kernels with an attention mechanism:
    $\alpha = \text{Softmax}(W_2\,\sigma(W_1\,\text{AvgPool}(x)))$
    $\text{Output} = \sum_{k=1}^{K} \alpha^{(k)} \otimes \text{Conv}_k(x) + x$
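A sketch of such a layer is below. Assumptions on our part: the four parallel kernels use sizes (1, 3, 5, 7) to realize "multi-scale", σ is taken as ReLU, the attention bottleneck uses a reduction ratio of 4, and the demo channel count is 128 (the real model would use 1536). None of these specifics are stated in the source.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicConv1d(nn.Module):
    """K parallel convolutions mixed by input-dependent attention
    weights, plus a residual connection (a sketch, not the exact
    implementation from the paper)."""

    def __init__(self, channels, kernel_sizes=(1, 3, 5, 7), reduction=4):
        super().__init__()
        # K parallel kernels at different scales; odd sizes keep length.
        self.convs = nn.ModuleList(
            nn.Conv1d(channels, channels, k, padding=k // 2)
            for k in kernel_sizes)
        # Attention: AvgPool -> W1 -> sigma -> W2 -> Softmax over K.
        self.w1 = nn.Linear(channels, channels // reduction)
        self.w2 = nn.Linear(channels // reduction, len(kernel_sizes))

    def forward(self, x):                  # x: (B, C, L)
        pooled = x.mean(dim=-1)            # AvgPool over length -> (B, C)
        alpha = F.softmax(self.w2(F.relu(self.w1(pooled))), dim=-1)  # (B, K)
        out = sum(alpha[:, k, None, None] * conv(x)
                  for k, conv in enumerate(self.convs))
        return out + x                     # residual term from the formula

x = torch.randn(2, 128, 32)
y = DynamicConv1d(128)(x)
print(y.shape)  # torch.Size([2, 128, 32])
```

Because the mixing weights α depend on the pooled input, each example effectively selects its own blend of kernel scales, which is what distinguishes this from a plain multi-branch convolution.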

Hierarchical Classification

The final prediction head implements dimension reduction with layer normalization:

$\hat{y} = W_3(\text{Dropout}(\sigma(W_2(\text{Dropout}(\sigma(W_1 F_{\text{pooled}}))))))$
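The head can be sketched as a small `nn.Sequential`. The intermediate widths (512, 128), the dropout rate, the binary output size, and the exact LayerNorm placement are our assumptions; the source only specifies two reduction stages with activation, dropout, and layer normalization.

```python
import torch
import torch.nn as nn

# Hypothetical instantiation of the prediction head:
# W1, W2 reduce 1536 -> 512 -> 128 with LayerNorm, sigma (ReLU here),
# and Dropout between stages; W3 maps to 2 logits (toxic / non-toxic).
head = nn.Sequential(
    nn.Linear(1536, 512), nn.LayerNorm(512), nn.ReLU(), nn.Dropout(0.3),
    nn.Linear(512, 128), nn.LayerNorm(128), nn.ReLU(), nn.Dropout(0.3),
    nn.Linear(128, 2),
)

f_pooled = torch.randn(4, 1536)  # pooled features for a batch of 4
logits = head(f_pooled)
print(logits.shape)  # torch.Size([4, 2])
```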