pp-nsfw_Inspector (Chinese version)
pp-nsfw_Inspector is an image content moderation pipeline running on the Axera NPU. It combines OCR, NSFW detection, QR code scanning, and keyword rule matching to classify images as PASS / REVIEW / REJECT.
Pipeline Overview
flowchart TD
A[Input Image] --> B[Preprocess Layer<br/>Scene Classification + Image Processing + Long Image Slicing]
B --> C{Process by Slice}
C --> D[Image Branch<br/>NSFW + QR Code]
C --> E{OCR Routing}
E -->|SCREENSHOT| F[PP-OCRv5]
E -->|DOCUMENT / POSTER / UNKNOWN| G[PP-DocLayout-S]
G --> H[Text Region OCR]
G --> I[Figure Region Extraction]
I --> J[Figure Region NSFW]
F --> K[OCR Result<br/>blocks + avg_score + text_state]
H --> K
D --> L[Image Signals<br/>nsfw / qr]
J --> L
K --> M[Understanding Layer<br/>Text Normalization + Strong/Weak Rules]
M --> N[Text Signals<br/>rule_hits]
L --> O[Decision Layer]
K --> O
N --> O
O --> P{Final Action}
P -->|PASS| Q[Release]
P -->|REVIEW| R[Review]
P -->|REJECT| S[Reject]
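The OCR routing step in the flowchart can be sketched as a small lookup. This is a minimal illustration only: the `Scene` enum and the `ocr_route` function are hypothetical names, not taken from the project's code, and the returned strings simply name the model that handles the slice first.

```python
from enum import Enum


class Scene(Enum):
    """Scene labels produced by the preprocess layer (names from the flowchart)."""
    SCREENSHOT = "SCREENSHOT"
    DOCUMENT = "DOCUMENT"
    POSTER = "POSTER"
    UNKNOWN = "UNKNOWN"


def ocr_route(scene: Scene) -> str:
    """Return the first model on the OCR path for a slice of the given scene."""
    if scene is Scene.SCREENSHOT:
        # Screenshots go straight to OCR.
        return "PP-OCRv5"
    # DOCUMENT / POSTER / UNKNOWN run layout analysis first; text regions are
    # then sent to OCR and figure regions to the NSFW model.
    return "PP-DocLayout-S"


if __name__ == "__main__":
    for scene in Scene:
        print(scene.value, "->", ocr_route(scene))
```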
Supported Tasks
| Layer | Task | Method |
|---|---|---|
| Preprocess | Scene classification (rule-based) | Screenshot / Document / Poster / Unknown |
| Perception | Text recognition (OCR) | PP-OCRv5 (det + cls + rec) |
| Perception | Layout analysis | PP-DocLayout-S |
| Perception | NSFW detection | ViT-based classifier |
| Perception | QR code detection & domain filtering | pyzbar + HTTP redirect expansion |
| Understanding | Text normalization | Traditional↔Simplified, full↔half-width, homophone map |
| Understanding | Keyword rule matching | pyahocorasick + google-re2 |
| Decision | Three-tier verdict | PASS / REVIEW / REJECT |
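To illustrate the text normalization row, here is a minimal sketch using only the Python standard library. It assumes that full-width to half-width folding can be approximated with Unicode NFKC normalization, and it uses two tiny hypothetical character maps (`TRAD_TO_SIMP`, `HOMOPHONES`) in place of the project's real tables, which are not shown in this README. It covers only the full-width to half-width and Traditional to Simplified directions.

```python
import unicodedata

# Hypothetical, deliberately tiny maps for illustration only.
TRAD_TO_SIMP = {"體": "体", "學": "学"}
HOMOPHONES = {"薇": "微"}  # an evasive spelling mapped to its canonical character


def normalize_text(text: str) -> str:
    """Fold width variants, then apply the character maps in order."""
    # NFKC maps full-width letters, digits, and punctuation to half-width forms.
    text = unicodedata.normalize("NFKC", text)
    text = "".join(TRAD_TO_SIMP.get(ch, ch) for ch in text)
    text = "".join(HOMOPHONES.get(ch, ch) for ch in text)
    return text


if __name__ == "__main__":
    print(normalize_text("加薇信：ＶＸ１２３"))
```

Normalizing before keyword matching means the rule lists only need to contain canonical forms, which keeps them smaller and easier to audit.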
Model Details
All models are quantized to w8a16 and exported in .axmodel format for the Axera NPU. The figures below were measured with ax_run_model -r 100 -w 10 (single-model benchmark, 100 iterations, 10 warmup runs).
| Model | NPU Model | Size (CMM, NPU1 / NPU3) | Latency NPU1 | Latency NPU3 (3 Core) |
|---|---|---|---|---|
| PP-OCRv5 Det | axmodel/ppocrv5/det_npu{1,3}.axmodel | 57.79 / 50.88 MiB | 29.2 ms | 17.1 ms |
| PP-OCRv5 Cls | axmodel/ppocrv5/cls_npu{1,3}.axmodel | 0.62 / 0.75 MiB | 0.3 ms | 0.2 ms |
| PP-OCRv5 Rec | axmodel/ppocrv5/rec_npu{1,3}.axmodel | 6.14 / 6.43 MiB | 3.4 ms | 1.4 ms |
| PP-DocLayout-S | axmodel/ppstructurev3/ppstructure_npu{1,3}.axmodel | 6.90 / 4.08 MiB | 5.5 ms | 2.0 ms |
| NSFW | axmodel/nsfw/nsfw_npu{1,3}.axmodel | 91.14 / 92.10 MiB | 30.0 ms | 11.7 ms |
- NPU1: Single core, OCR models via Pulsar2 5.2, PP-DocLayout-S / NSFW via Pulsar2 6.0
- NPU3: Triple core, Pulsar2 6.0 (CMM size differs due to 3-core fusion strategy)
Model conversion tools: Pulsar2-docs (ver 5.2+). Engine version: 2.10.1s.
Note: Toolchain version differences may affect operator optimization, resulting in minor performance variations.
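As a quick way to read the latency columns, the sketch below computes the NPU3-over-NPU1 speedup per model. The numbers are copied from the table above; the `speedup` helper is only illustrative, and since the source latencies are rounded to 0.1 ms the ratios are approximate.

```python
# Latencies in milliseconds, copied from the table above: (NPU1, NPU3).
LATENCY_MS = {
    "PP-OCRv5 Det": (29.2, 17.1),
    "PP-OCRv5 Cls": (0.3, 0.2),
    "PP-OCRv5 Rec": (3.4, 1.4),
    "PP-DocLayout-S": (5.5, 2.0),
    "NSFW": (30.0, 11.7),
}


def speedup(npu1_ms: float, npu3_ms: float) -> float:
    """Ratio of single-core latency to three-core latency."""
    return npu1_ms / npu3_ms


if __name__ == "__main__":
    for name, (npu1, npu3) in LATENCY_MS.items():
        print(f"{name:<15} {speedup(npu1, npu3):.2f}x")
```

None of the ratios reaches 3x, so three cores do not give a linear speedup for these models in the single-model benchmark.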
Supported Platforms
- AX650N / AX8850
How to Use
Python Environment
pyaxengine
# pyaxengine
wget https://github.com/AXERA-TECH/pyaxengine/releases/download/0.1.3.rc3/axengine-0.1.3-py3-none-any.whl
pip install axengine-0.1.3-py3-none-any.whl
# Other dependencies
pip install -r requirements.txt
Note: pyzbar requires the system zbar shared library.
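A quick pre-flight check for the zbar dependency might look like the sketch below. It only asks the system's dynamic loader whether a library named zbar can be located; pyzbar's own lookup may differ on some platforms, so treat a negative result as a hint rather than proof. The `zbar_available` helper is a hypothetical name.

```python
import ctypes.util


def zbar_available() -> bool:
    """Return True if the dynamic loader can locate a zbar shared library."""
    return ctypes.util.find_library("zbar") is not None


if __name__ == "__main__":
    if zbar_available():
        print("zbar shared library found")
    else:
        print("zbar shared library not found; install it with your system "
              "package manager (for example libzbar0 on Debian/Ubuntu)")
```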
Local Test (CLI)
python test.py
Iterates over images in images/ and prints the decision for each:
action: REJECT
risk_level: high
labels: ['nsfw']
score: 0.987
evidence: [{'source': 'nsfw_model', 'score': 0.987}]
Web Demo
python app.py
Open http://127.0.0.1:5000/ in a browser. The web page supports:
- Viewing a default sample image result
- Uploading local images for moderation
- Displaying decision, risk level, labels, evidence, slice details, OCR status, normalized text, and rule hits
Note: For safety, only partial keyword anomaly detection is enabled in this public release. For the complete illegal keyword list, please file an issue or contact us through official channels.
Run in background:
setsid python app.py > web.log 2>&1 < /dev/null &
# Stop: pkill -f 'python app.py'
# Logs: tail -f web.log
Decision Semantics
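The decision layer applies tiered OR logic: any single strong signal yields REJECT; otherwise any single weak signal yields REVIEW; an image with no signals is a PASS.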
Strong Signals → REJECT
- Strong keyword rule hit
- QR code blacklist domain
- NSFW score at or above scene-specific reject threshold
Weak Signals → REVIEW
- Weak keyword rule hit
- QR code unknown domain
- NSFW score at or above review threshold
- OCR low quality (avg_score < 0.65 and blocks < 3)
- OCR missing expected text / missing uncertain
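The two signal lists above can be sketched as a small function. This is a minimal illustration under stated assumptions: the `Signals` dataclass and its field names are hypothetical, not the project's actual schema, and returning PASS when nothing fires is inferred from the three-tier verdict rather than stated explicitly.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    # Illustrative field names only; not the project's real data model.
    strong_keyword_hit: bool = False
    weak_keyword_hit: bool = False
    qr_blacklisted: bool = False
    qr_unknown_domain: bool = False
    nsfw_reject: bool = False   # NSFW score >= scene-specific reject threshold
    nsfw_review: bool = False   # NSFW score >= review threshold
    ocr_avg_score: float = 1.0
    ocr_blocks: int = 3
    ocr_text_missing: bool = False


def ocr_low_quality(s: Signals) -> bool:
    """Both conditions must hold: low average score AND few text blocks."""
    return s.ocr_avg_score < 0.65 and s.ocr_blocks < 3


def decide(s: Signals) -> str:
    # Tier 1: any strong signal rejects.
    if s.strong_keyword_hit or s.qr_blacklisted or s.nsfw_reject:
        return "REJECT"
    # Tier 2: any weak signal sends the image to review.
    if (s.weak_keyword_hit or s.qr_unknown_domain or s.nsfw_review
            or ocr_low_quality(s) or s.ocr_text_missing):
        return "REVIEW"
    # Assumed fallthrough: no signal fired.
    return "PASS"


if __name__ == "__main__":
    print(decide(Signals(weak_keyword_hit=True, qr_blacklisted=True)))
```

Checking the strong tier first means a weak signal can never downgrade a REJECT, which matches the OR semantics described above.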
NSFW Thresholds by Scene
| Scene | review | reject |
|---|---|---|
| SCREENSHOT | 0.60 | 0.93 |
| DOCUMENT | 0.60 | 0.95 |
| POSTER | 0.60 | 0.85 |
| UNKNOWN | 0.60 | 0.90 |
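The threshold table can be applied with a simple lookup, sketched below. The values are copied from the table above; the `nsfw_tier` function name, the "NONE" return value for scores under the review threshold, and the fallback to the UNKNOWN row for unrecognized scene names are assumptions made for this example.

```python
# (review, reject) thresholds per scene, copied from the table above.
NSFW_THRESHOLDS = {
    "SCREENSHOT": (0.60, 0.93),
    "DOCUMENT": (0.60, 0.95),
    "POSTER": (0.60, 0.85),
    "UNKNOWN": (0.60, 0.90),
}


def nsfw_tier(scene: str, score: float) -> str:
    """Map an NSFW score to REJECT / REVIEW / NONE for the given scene."""
    # Assumption: unrecognized scene names use the UNKNOWN thresholds.
    review, reject = NSFW_THRESHOLDS.get(scene, NSFW_THRESHOLDS["UNKNOWN"])
    if score >= reject:
        return "REJECT"
    if score >= review:
        return "REVIEW"
    return "NONE"


if __name__ == "__main__":
    # The same score lands in different tiers depending on the scene.
    for scene in NSFW_THRESHOLDS:
        print(scene, nsfw_tier(scene, 0.87))
```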
Output Fields
| Field | Description |
|---|---|
| action | Final verdict: PASS / REVIEW / REJECT |
| risk_level | low / medium / high |
| primary_reason | Top contributing factor for quick triage |
| labels | All matched reasons (includes soft signals even on REJECT) |
| score | Max confidence score across all signals |
| evidence | All matched signal details (source, score, domains, etc.) |
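As an illustration of how these fields relate, the sketch below assembles a result record from a list of evidence items. The `Decision` type and `summarize` function are hypothetical. Taking score as the maximum evidence score follows the field description; choosing primary_reason as the source of the highest-scoring evidence item is an assumption, since the README does not say how it is selected.

```python
from typing import TypedDict


class Decision(TypedDict):
    action: str          # PASS / REVIEW / REJECT
    risk_level: str      # low / medium / high
    primary_reason: str
    labels: list[str]
    score: float
    evidence: list[dict]


def summarize(action: str, risk_level: str, labels: list[str],
              evidence: list[dict]) -> Decision:
    """Build a result record; each evidence item needs 'source' and 'score'."""
    top = max(evidence, key=lambda e: e["score"]) if evidence else None
    return Decision(
        action=action,
        risk_level=risk_level,
        # Assumption: the strongest evidence item is the primary reason.
        primary_reason=top["source"] if top else "",
        labels=labels,
        score=top["score"] if top else 0.0,
        evidence=evidence,
    )


if __name__ == "__main__":
    # Values taken from the CLI sample output shown earlier.
    result = summarize("REJECT", "high", ["nsfw"],
                       [{"source": "nsfw_model", "score": 0.987}])
    for key, value in result.items():
        print(f"{key}: {value}")
```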
Limitations
- Not a production service (no auth, no access control)
- REVIEW strategy is conservative (cold-start phase)
- Rules and thresholds are in early tuning stage
- Designed for development, integration testing, and demonstrations
Other
- PP-OCRv5 models: AXERA-TECH/PPOCR_v5
- PP-DocLayout-S model: PP-DocLayout
- nsfw model: Falconsai/nsfw_image_detection_26
- Test images: All images under images/ are website screenshots used for internal regression testing.