pp-nsfw_Inspector (Chinese version)

pp-nsfw_Inspector is an image content moderation pipeline running on the Axera NPU. It combines OCR, NSFW detection, QR code scanning, and keyword rule matching to classify images as PASS / REVIEW / REJECT.

Pipeline Overview

flowchart TD
    A[Input Image] --> B[Preprocess Layer<br/>Scene Classification + Image Processing + Long Image Slicing]
    B --> C{Process by Slice}

    C --> D[Image Branch<br/>NSFW + QR Code]
    C --> E{OCR Routing}

    E -->|SCREENSHOT| F[PP-OCRv5]
    E -->|DOCUMENT / POSTER / UNKNOWN| G[PP-DocLayout-S]
    G --> H[Text Region OCR]
    G --> I[Figure Region Extraction]
    I --> J[Figure Region NSFW]

    F --> K[OCR Result<br/>blocks + avg_score + text_state]
    H --> K
    D --> L[Image Signals<br/>nsfw / qr]
    J --> L

    K --> M[Understanding Layer<br/>Text Normalization + Strong/Weak Rules]
    M --> N[Text Signals<br/>rule_hits]
    L --> O[Decision Layer]
    K --> O
    N --> O

    O --> P{Final Action}
    P -->|PASS| Q[Release]
    P -->|REVIEW| R[Review]
    P -->|REJECT| S[Reject]
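
The OCR routing step in the flowchart can be sketched as a small dispatch function. This is an illustrative sketch only: the `Scene` enum and the step names returned by `route_ocr` are hypothetical and are not taken from the project's code; only the routing rule itself (SCREENSHOT goes to PP-OCRv5, everything else goes through PP-DocLayout-S first) comes from the diagram.

```python
from enum import Enum


class Scene(Enum):
    SCREENSHOT = "SCREENSHOT"
    DOCUMENT = "DOCUMENT"
    POSTER = "POSTER"
    UNKNOWN = "UNKNOWN"


def route_ocr(scene: Scene) -> list[str]:
    """Return the ordered perception steps for one slice (hypothetical step names)."""
    if scene is Scene.SCREENSHOT:
        # Screenshots are sent straight to PP-OCRv5.
        return ["ppocrv5"]
    # Documents, posters, and unknown scenes are laid out first, then split
    # into text regions (OCR) and figure regions (NSFW).
    return ["pp_doclayout_s", "text_region_ocr", "figure_region_nsfw"]
```

For example, `route_ocr(Scene.POSTER)` starts with the layout step, while `route_ocr(Scene.SCREENSHOT)` contains only the OCR step.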

Supported Tasks

Layer	Task	Method
Preprocess	Scene classification (rule-based)	Screenshot / Document / Poster / Unknown
Perception	Text recognition (OCR)	PP-OCRv5 (det + cls + rec)
Perception	Layout analysis	PP-DocLayout-S
Perception	NSFW detection	ViT-based classifier
Perception	QR code detection & domain filtering	pyzbar + HTTP redirect expansion
Understanding	Text normalization	Traditional↔Simplified, full↔half-width, homophone map
Understanding	Keyword rule matching	pyahocorasick + google-re2
Decision	Three-tier verdict	PASS / REVIEW / REJECT
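
To illustrate the text normalization row, here is a minimal sketch using only the Python standard library. It covers full-width to half-width folding (via Unicode NFKC) and a homophone substitution map. The map entries, the lowercasing step, and the whitespace removal are assumptions made for this example; Traditional↔Simplified conversion needs a dedicated conversion table and is left out.

```python
import unicodedata

# Hypothetical substitution map: obfuscated spellings -> canonical form.
HOMOPHONE_MAP = {"薇信": "微信", "v信": "微信"}


def normalize_text(text: str) -> str:
    # NFKC folds full-width letters, digits, and the ideographic space
    # to their half-width equivalents.
    text = unicodedata.normalize("NFKC", text)
    # Assumption: matching is case-insensitive.
    text = text.lower()
    # Assumption: whitespace inserted between characters is dropped.
    text = "".join(text.split())
    for variant, canonical in HOMOPHONE_MAP.items():
        text = text.replace(variant, canonical)
    return text
```

Normalizing before rule matching means each keyword only has to be listed once in its canonical form, rather than once per spelling variant.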

Model Details

All models are quantized to w8a16 (8-bit weights, 16-bit activations) and exported in .axmodel format for the Axera NPU. The figures below were measured with ax_run_model -r 100 -w 10 (single-model benchmark: 100 iterations after 10 warm-up runs).

Model	NPU Model	CMM Size (NPU1 / NPU3)	Latency NPU1 (1 core)	Latency NPU3 (3 cores)
PP-OCRv5 Det	`axmodel/ppocrv5/det_npu{1,3}.axmodel`	57.79 / 50.88 MiB	29.2 ms	17.1 ms
PP-OCRv5 Cls	`axmodel/ppocrv5/cls_npu{1,3}.axmodel`	0.62 / 0.75 MiB	0.3 ms	0.2 ms
PP-OCRv5 Rec	`axmodel/ppocrv5/rec_npu{1,3}.axmodel`	6.14 / 6.43 MiB	3.4 ms	1.4 ms
PP-DocLayout-S	`axmodel/ppstructurev3/ppstructure_npu{1,3}.axmodel`	6.90 / 4.08 MiB	5.5 ms	2.0 ms
NSFW	`axmodel/nsfw/nsfw_npu{1,3}.axmodel`	91.14 / 92.10 MiB	30.0 ms	11.7 ms

NPU1: Single core, OCR models via Pulsar2 5.2, PP-DocLayout-S / NSFW via Pulsar2 6.0
NPU3: Triple core, Pulsar2 6.0 (CMM size differs due to 3-core fusion strategy)

Model conversion tools: Pulsar2-docs (ver 5.2+). Engine version: 2.10.1s.

Note: Toolchain version differences may affect operator optimization, resulting in minor performance variations.
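
As a quick reading aid for the latency table, the snippet below computes the NPU1-to-NPU3 latency ratio per model. The numbers are copied from the table; the dictionary layout is just for this example.

```python
# (NPU1 ms, NPU3 ms) copied from the benchmark table.
LATENCY_MS = {
    "PP-OCRv5 Det": (29.2, 17.1),
    "PP-OCRv5 Cls": (0.3, 0.2),
    "PP-OCRv5 Rec": (3.4, 1.4),
    "PP-DocLayout-S": (5.5, 2.0),
    "NSFW": (30.0, 11.7),
}

# Ratio > 1 means the 3-core build is faster than the single-core build.
speedups = {name: npu1 / npu3 for name, (npu1, npu3) in LATENCY_MS.items()}

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.2f}x")
```

The ratios range from roughly 1.5x to 2.75x. They are single-model benchmark figures, and the NPU1 OCR models were built with a different Pulsar2 version, so they should not be read as a pure core-count comparison.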

Supported Platforms

AX650N / AX8850

How to Use

Python Environment

pyaxengine

# pyaxengine
wget https://github.com/AXERA-TECH/pyaxengine/releases/download/0.1.3.rc3/axengine-0.1.3-py3-none-any.whl
pip install axengine-0.1.3-py3-none-any.whl

# Other dependencies
pip install -r requirements.txt

Note: pyzbar requires the system zbar shared library.

Local Test (CLI)

python test.py

The script iterates over the images in images/ and prints the decision for each one, for example:

action:     REJECT
risk_level: high
labels:     ['nsfw']
score:      0.987
evidence:   [{'source': 'nsfw_model', 'score': 0.987}]

Web Demo

python app.py

Open http://127.0.0.1:5000/ in a browser. The web page supports:

Viewing a default sample image result
Uploading local images for moderation
Displaying decision, risk level, labels, evidence, slice details, OCR status, normalized text, and rule hits

Note: For safety, this public release enables only a partial keyword list. To obtain the complete list of prohibited keywords, please file an issue or contact us through official channels.

Run in background:

setsid python app.py > web.log 2>&1 < /dev/null &
# Stop: pkill -f 'python app.py'
# Logs: tail -f web.log

Decision Semantics

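The decision layer applies tiered OR logic: any single strong signal yields REJECT; otherwise, any single weak signal yields REVIEW; an image with neither is released as PASS.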

Strong Signals → REJECT

Strong keyword rule hit
QR code resolves to a blacklisted domain
NSFW score at or above the scene-specific reject threshold

Weak Signals → REVIEW

Weak keyword rule hit
QR code resolves to an unknown domain
NSFW score at or above the review threshold
OCR low quality (avg_score < 0.65 and blocks < 3)
OCR missing expected text / missing uncertain
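
The strong/weak tiers above can be sketched as follows. This is a minimal illustration, not the project's decision code: the `SliceSignals` field names are hypothetical, the NSFW thresholds are passed in by the caller, `ocr_text_missing` stands in for the "missing expected text / missing uncertain" states, and returning PASS when no signal fires is an assumption based on the flowchart.

```python
from dataclasses import dataclass


@dataclass
class SliceSignals:
    # Field names are illustrative, not the project's actual schema.
    strong_rule_hit: bool = False
    weak_rule_hit: bool = False
    qr_blacklisted: bool = False
    qr_unknown: bool = False
    nsfw_score: float = 0.0
    ocr_avg_score: float = 1.0
    ocr_blocks: int = 0
    ocr_text_missing: bool = False


def decide(s: SliceSignals, nsfw_review: float, nsfw_reject: float) -> str:
    # Tier 1: any strong signal is enough to reject.
    if s.strong_rule_hit or s.qr_blacklisted or s.nsfw_score >= nsfw_reject:
        return "REJECT"
    # OCR counts as low quality only when both conditions hold.
    ocr_low_quality = s.ocr_avg_score < 0.65 and s.ocr_blocks < 3
    # Tier 2: any weak signal sends the image to review.
    if (
        s.weak_rule_hit
        or s.qr_unknown
        or s.nsfw_score >= nsfw_review
        or ocr_low_quality
        or s.ocr_text_missing
    ):
        return "REVIEW"
    return "PASS"
```

Because strong signals are checked first, a weak keyword hit never downgrades an image that already qualifies for REJECT.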

NSFW Thresholds by Scene

Scene	review	reject
SCREENSHOT	0.60	0.93
DOCUMENT	0.60	0.95
POSTER	0.60	0.85
UNKNOWN	0.60	0.90
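
A small lookup makes the effect of the per-scene thresholds concrete. The values are copied from the table; the function name `nsfw_tier` and the fallback to the UNKNOWN row for unrecognized scene names are assumptions made for this sketch.

```python
# scene -> (review threshold, reject threshold), copied from the table.
NSFW_THRESHOLDS = {
    "SCREENSHOT": (0.60, 0.93),
    "DOCUMENT": (0.60, 0.95),
    "POSTER": (0.60, 0.85),
    "UNKNOWN": (0.60, 0.90),
}


def nsfw_tier(scene: str, score: float) -> str:
    # Assumption: unrecognized scene names fall back to the UNKNOWN row.
    review, reject = NSFW_THRESHOLDS.get(scene, NSFW_THRESHOLDS["UNKNOWN"])
    if score >= reject:
        return "REJECT"
    if score >= review:
        return "REVIEW"
    return "PASS"
```

The same score can land in different tiers depending on the scene: a score of 0.90 gives REVIEW for a SCREENSHOT but REJECT for a POSTER.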

Output Fields

Field	Description
`action`	Final verdict: `PASS` / `REVIEW` / `REJECT`
`risk_level`	`low` / `medium` / `high`
`primary_reason`	Top contributing factor for quick triage
`labels`	All matched reasons (includes soft signals even on REJECT)
`score`	Max confidence score across all signals
`evidence`	All matched signal details (source, score, domains, etc.)
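
The relationship between these fields can be sketched as below. Only the field names and the "max score across signals" rule come from the table; the `build_result` helper, the action-to-risk mapping, and the choice of the highest-scoring evidence source as `primary_reason` are assumptions for illustration.

```python
# Assumed mapping; the source only shows REJECT paired with "high".
RISK_BY_ACTION = {"PASS": "low", "REVIEW": "medium", "REJECT": "high"}


def build_result(action: str, labels: list[str], evidence: list[dict]) -> dict:
    # Highest-scoring evidence entry, or None when nothing matched.
    top = max(evidence, key=lambda e: e.get("score", 0.0), default=None)
    return {
        "action": action,
        "risk_level": RISK_BY_ACTION[action],
        # Assumption: the top-scoring source is reported as the primary reason.
        "primary_reason": top["source"] if top else None,
        # Keep every matched label once, in first-seen order.
        "labels": list(dict.fromkeys(labels)),
        "score": top.get("score", 0.0) if top else 0.0,
        "evidence": evidence,
    }


result = build_result(
    "REJECT",
    ["nsfw"],
    [{"source": "nsfw_model", "score": 0.987}],
)
```

With the sample evidence from the CLI output, `result["score"]` is 0.987 and `result["risk_level"]` is `"high"`.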

Limitations

Not a production service (no auth, no access control)
REVIEW strategy is conservative (cold-start phase)
Rules and thresholds are in early tuning stage
Designed for development, integration testing, and demonstrations

Other

PP-OCRv5 models: AXERA-TECH/PPOCR_v5
PP-DocLayout-S model: PP-DocLayout
NSFW model: Falconsai/nsfw_image_detection_26
Test images: All images under images/ are website screenshots used for internal regression testing.
