---
license: agpl-3.0
language:
- en
base_model:
- fushh7/WeDetect
pipeline_tag: zero-shot-object-detection
tags:
- Axera
- WeDetect
- NPU
- Open Vocabulary Object Detection
---

# WeDetect demo for Axera

## The original repo

[WeDetect](https://github.com/WeChatCV/WeDetect/tree/main)

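## Background
Open-vocabulary detection aims to detect arbitrary objects from text descriptions. Instead of relying on cross-modal interaction, WeDetect treats recognition as a retrieval task: it matches region features against text features in a unified feature space. This project guides developers through the following:

- Exporting the WeDetect ONNX models with class num = 4;
- Generating the text quantization calibration datasets required by Pulsar2, the AXERA NPU model conversion toolchain;
- Compiling the ONNX models with the Pulsar2 toolchain and deploying them on AX650N.
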
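The retrieval view described above can be illustrated with a small NumPy sketch. It is not WeDetect's actual scoring code: the feature dimension, the toy vectors, the plain cosine similarity, and the `score_thr` threshold are all assumptions chosen for illustration, and the real model may scale or calibrate its scores differently.

```python
import numpy as np


def l2_normalize(x: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)


def match_regions_to_text(region_feats, text_feats, score_thr=0.3):
    """Assign each region its best-matching class by cosine similarity.

    region_feats: (num_regions, dim), text_feats: (num_classes, dim).
    Returns (region_index, class_index, similarity) for regions whose
    best similarity reaches score_thr (a hypothetical threshold).
    """
    sims = l2_normalize(region_feats) @ l2_normalize(text_feats).T  # (R, C)
    best_cls = sims.argmax(axis=1)
    best_sim = sims[np.arange(len(sims)), best_cls]
    return [(int(r), int(c), float(s))
            for r, (c, s) in enumerate(zip(best_cls, best_sim)) if s >= score_thr]


# Toy data: 4 class embeddings and 3 regions in an 8-dimensional space.
text_feats = np.eye(4, 8)  # one axis-aligned embedding per class
region_feats = np.array([
    [0.1, 0.9, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],  # mostly aligned with class 1
    [0.0, 0.0, 0.2, 1.0, 0.0, 0.0, 0.0, 0.0],  # mostly aligned with class 3
    [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0],  # background: orthogonal to every class
])
matches = match_regions_to_text(region_feats, text_feats)
print(matches)
```

Because the text embeddings depend only on the class names, they can be computed separately from the image features, which is why the project exports the image encoder and the text encoder as two models.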

## Model export

This project uses the wedetect_base model, which can be downloaded from [Hugging Face](https://huggingface.co/fushh7/WeDetect/tree/main).
To generate ONNX models suitable for conversion with Pulsar2, the AXERA NPU toolchain:

- Download `wedetect_base.pth` and place it in the `checkpoints` directory.
- Use `export_onnx.py` to export the image encoder and the text encoder:

```
python export_onnx.py --config config/wedetect_base.py --checkpoint checkpoints/wedetect_base.pth
```

- Generate the quantization calibration data that Pulsar2 needs at compile time: `class_embedding_4cls.tar.gz`, `input_ids.tar.gz`, and `attention_mask.tar.gz`. Also prepare an image dataset such as `coco100.tar.gz` as calibration data for quantizing the image encoder.

```
python generate_class_embedding.py --wedetect_checkpoint checkpoints/wedetect_base.pth --classname_file ./coco_zh_class_texts.json --calib-dir ./quant
```
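
For readers who want to see what such calibration archives can look like, the sketch below packs NumPy arrays into `.tar.gz` files. It is an assumption-laden illustration, not the contents of `generate_class_embedding.py`: the helper `pack_npy_samples`, the member file names, the token IDs, and the sequence length of 16 are hypothetical, and it assumes the Pulsar2 build config is set up to read NumPy-format calibration files.

```python
import io
import tarfile
import tempfile
from pathlib import Path

import numpy as np


def pack_npy_samples(samples, tar_path, prefix="sample"):
    """Write each array as <prefix>_<index>.npy inside a gzip-compressed tar archive."""
    tar_path = Path(tar_path)
    tar_path.parent.mkdir(parents=True, exist_ok=True)
    with tarfile.open(tar_path, "w:gz") as tar:
        for i, arr in enumerate(samples):
            buf = io.BytesIO()
            np.save(buf, arr)  # serialize to the .npy format in memory
            data = buf.getvalue()
            info = tarfile.TarInfo(name=f"{prefix}_{i:04d}.npy")
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return tar_path


# Hypothetical tokenizer output: 4 class names padded to 16 tokens each.
input_ids = np.zeros((4, 16), dtype=np.int64)
input_ids[:, :3] = [[101, 7, 102], [101, 8, 102], [101, 9, 102], [101, 10, 102]]
attention_mask = (input_ids != 0).astype(np.int64)

out_dir = Path(tempfile.mkdtemp())
ids_tar = pack_npy_samples([input_ids], out_dir / "input_ids.tar.gz", prefix="input_ids")
mask_tar = pack_npy_samples([attention_mask], out_dir / "attention_mask.tar.gz",
                            prefix="attention_mask")

# Read the first member back to confirm the archive round-trips.
with tarfile.open(ids_tar, "r:gz") as tar:
    member = tar.getmembers()[0]
    restored = np.load(io.BytesIO(tar.extractfile(member).read()))
print(member.name, restored.shape, restored.dtype)
```

Inspecting the archives written to `./quant` by the real script is the reliable way to confirm the actual file layout and dtypes.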

## Model compilation

- For Pulsar2 installation and usage, see the documentation:
- [Online documentation](https://pulsar2-docs.readthedocs.io/zh-cn/latest/index.html)

- Build commands
```
# image encoder
pulsar2 build --config quant/image_encoder_4cls.json

# text encoder
pulsar2 build --config quant/text_encoder_4cls.json
```
- Model performance

| Models                                  | Platforms | Latency       | CMM size (MB) |
| --------------------------------------- | --------- | ------------- | ------------- |
| wedetect_image_encoder_npu3_u16.axmodel | AX650     | 100.3ms       | 152.2         |
| wedetect_text_encoder_npu3_u16.axmodel  | AX650     | 7.2ms         | 457.8         |

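As a rough reading of the table, the arithmetic below turns the reported latencies into frame rates and sums the CMM footprints. It assumes the latencies are per-inference figures, that the two models run back to back, and that their CMM usage simply adds up when both are loaded; pre-processing and post-processing time is not included.

```python
# Figures copied from the performance table.
image_encoder_ms = 100.3
text_encoder_ms = 7.2
image_encoder_cmm_mb = 152.2
text_encoder_cmm_mb = 457.8

# Throughput if only the image encoder runs per frame (text embeddings reused).
image_fps = 1000.0 / image_encoder_ms

# Per-frame cost if the text encoder is re-run for every frame as well.
sequential_ms = image_encoder_ms + text_encoder_ms
sequential_fps = 1000.0 / sequential_ms

# Memory footprint with both models resident at the same time.
total_cmm_mb = image_encoder_cmm_mb + text_encoder_cmm_mb

print(f"image encoder alone: {image_fps:.2f} FPS")
print(f"image + text per frame: {sequential_ms:.1f} ms ({sequential_fps:.2f} FPS)")
print(f"CMM with both models loaded: {total_cmm_mb:.1f} MB")
```

Under these assumptions the image encoder dominates the per-frame cost, so re-running the text encoder only when the class list changes would save little time but is still the cheaper option.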

## Python demo

### ONNX demo
```
python3 onnx_infer.py
```

### Axmodel demo

Deployment on AX650N requires [PyAXEngine](https://github.com/AXERA-TECH/pyaxengine).
Run the following command on the board. The four classes in this run are 鞋 (shoe), 床 (bed), 人 (person), and 衣架 (clothes hanger):
```
root@ax650:~/WeDetect# python3 axmodel_infer.py
[INFO] Available providers:  ['AxEngineExecutionProvider', 'AXCLRTExecutionProvider']
Classes: ['鞋', '床', '人', '衣架']
[INFO] Using provider: AxEngineExecutionProvider
[INFO] Chip type: ChipType.MC50
[INFO] VNPU type: VNPUType.DISABLED
[INFO] Engine version: 2.12.0s
[INFO] Model type: 2 (triple core)
[INFO] Compiler version: 6.0 6965315a
[INFO] Using provider: AxEngineExecutionProvider
[INFO] Model type: 2 (triple core)
[INFO] Compiler version: 6.0 62ad4ff7
Detections: 3
  床 0.863 (336, 427, 837, 679)
  鞋 0.430 (316, 629, 345, 688)
  鞋 0.397 (268, 628, 331, 679)
Saved: axmodel_res.jpg
```

|  |