	---
	license: agpl-3.0
	language:
	- en
	base_model:
	- fushh7/WeDetect
	pipeline_tag: zero-shot-object-detection
	tags:
	- Axera
	- WeDetect
	- NPU
	- Open Vocabulary Object Detection
	---

	# WeDetect demo for Axera

	## The original repo

	[WeDetect](https://github.com/WeChatCV/WeDetect/tree/main)

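	## Background
	Open-vocabulary detection aims to detect arbitrary objects from text descriptions. Rather than relying on cross-modal interaction, WeDetect treats recognition as a retrieval task: region features and text features are matched in a unified feature space. This project guides developers through the following:

	- Exporting the WeDetect ONNX models with class num = 4;
	- Generating the text quantization calibration datasets required by Pulsar2, the AXERA NPU model conversion tool;
	- Compiling the ONNX models with the Pulsar2 toolchain and deploying them on the AX650N.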
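
The retrieval formulation described in the background can be illustrated with a toy example. This is a minimal sketch, not WeDetect's actual scoring head: the function name `match_regions_to_classes`, the threshold, and the feature shapes are assumptions made for illustration, and the real model may apply additional scaling, activation, or non-maximum suppression steps that are not covered here.

```python
import numpy as np

def match_regions_to_classes(region_feats, text_feats, class_names, score_thr=0.3):
    """Assign each region its best-matching class by cosine similarity.

    Hypothetical helper for illustration only; not part of the WeDetect repo.
    """
    # L2-normalize both sides so the dot product equals cosine similarity.
    r = region_feats / np.linalg.norm(region_feats, axis=1, keepdims=True)
    t = text_feats / np.linalg.norm(text_feats, axis=1, keepdims=True)
    sim = r @ t.T                      # shape: (num_regions, num_classes)
    best = sim.argmax(axis=1)          # index of the closest class per region
    scores = sim[np.arange(len(best)), best]
    return [
        (i, class_names[c], float(s))
        for i, (c, s) in enumerate(zip(best, scores))
        if s >= score_thr              # drop regions that match no class well
    ]

# Toy data: 4 classes and 3 regions in an 8-dimensional feature space.
class_names = ["shoe", "bed", "person", "hanger"]
text_feats = np.eye(4, 8)
region_feats = np.array([
    [0.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],   # points along the "bed" axis
    [0.9, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],   # mostly "shoe"
    [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0],   # orthogonal to every class
])
matches = match_regions_to_classes(region_feats, text_feats, class_names)
for region_idx, name, score in matches:
    print(region_idx, name, round(score, 3))
```

Because text features are computed independently of the image, they can be produced once per class list and reused, which is what makes the separate text encoder export practical.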

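	## Model export

	This project uses the wedetect_base model, which can be downloaded from [Hugging Face](https://huggingface.co/fushh7/WeDetect/tree/main).
	To generate ONNX models suitable for conversion with Pulsar2, the AXERA NPU toolchain:

	- Download `wedetect_base.pth` and place it in the `checkpoints` directory.
	- Use `export_onnx.py` to export the image encoder and the text encoder separately: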

	```
	python export_onnx.py --config config/wedetect_base.py --checkpoint checkpoints/wedetect_base.pth
	```

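	- Generate the quantization calibration data that Pulsar2 depends on when compiling the models: `class_embedding_4cls.tar.gz`, `input_ids.tar.gz`, and `attention_mask.tar.gz`. Also prepare an image dataset such as `coco100.tar.gz` as quantization calibration data for the image encoder.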

	```
	python generate_class_embedding.py --wedetect_checkpoint checkpoints/wedetect_base.pth --classname_file ./coco_zh_class_texts.json --calib-dir ./quant
	```
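For readers who want to inspect or rebuild a calibration archive by hand, the sketch below packs NumPy arrays into a `.tar.gz` using only the standard library and NumPy. It is an illustration under stated assumptions: the helper `pack_npy_tar`, the member names, and the array shape and dtype are hypothetical, and the exact layout written by `generate_class_embedding.py` and accepted by Pulsar2 should be checked against the script and the Pulsar2 documentation.

```python
import io
import os
import tarfile
import tempfile

import numpy as np

def pack_npy_tar(arrays, tar_path, prefix="sample"):
    """Write each array as an .npy member of a gzip-compressed tar archive.

    Hypothetical helper for illustration; not part of the WeDetect repo.
    """
    with tarfile.open(tar_path, "w:gz") as tar:
        for i, arr in enumerate(arrays):
            buf = io.BytesIO()
            np.save(buf, arr)                      # serialize in .npy format
            data = buf.getvalue()
            info = tarfile.TarInfo(name=f"{prefix}_{i:04d}.npy")
            info.size = len(data)                  # tar needs the size up front
            tar.addfile(info, io.BytesIO(data))
    return tar_path

# Dummy token ids; the (1, 16) int32 shape is an assumption, not the model's real input spec.
rng = np.random.default_rng(0)
dummy_ids = [rng.integers(0, 1000, size=(1, 16), dtype=np.int32) for _ in range(3)]

tmp_dir = tempfile.mkdtemp()
archive = pack_npy_tar(dummy_ids, os.path.join(tmp_dir, "input_ids.tar.gz"))

# Read the archive back to confirm the round trip.
with tarfile.open(archive, "r:gz") as tar:
    names = tar.getnames()
    first = np.load(io.BytesIO(tar.extractfile(names[0]).read()))
print(names, first.shape, first.dtype)
```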

	## Model compilation

	- For Pulsar2 installation and usage, refer to the relevant documentation:
	- [Online documentation](https://pulsar2-docs.readthedocs.io/zh-cn/latest/index.html)

	- Build commands
	```
	# image encoder
	pulsar2 build --config quant/image_encoder_4cls.json

	# text encoder
	pulsar2 build --config quant/text_encoder_4cls.json
	```
	- Model performance

	| Models | Platforms | Latency (ms) | CMM size (MB) |
	| --------------------------------------- | --------- | ------------- | ------------- |
	| wedetect_image_encoder_npu3_u16.axmodel | AX650 | 100.3 | 152.2 |
	| wedetect_text_encoder_npu3_u16.axmodel | AX650 | 7.2 | 457.8 |
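
As a rough reading of the latency figures, the arithmetic below converts them into an upper bound on throughput. It assumes the models run back to back on a single stream and ignores pre-processing, post-processing, and data transfer, none of which are measured here. Whether the text encoder has to run for every frame depends on the application; with a fixed class list its output could plausibly be cached, which is the second case shown.

```python
# Latencies taken from the AX650 measurements reported in this README (milliseconds).
IMAGE_ENCODER_MS = 100.3
TEXT_ENCODER_MS = 7.2

def fps(latency_ms):
    """Upper-bound frames per second for a given per-frame latency."""
    return 1000.0 / latency_ms

# Case 1: both encoders run for every frame.
both_fps = fps(IMAGE_ENCODER_MS + TEXT_ENCODER_MS)
# Case 2: text embeddings are computed once and reused (an assumption).
image_only_fps = fps(IMAGE_ENCODER_MS)

print(f"both encoders per frame: {both_fps:.2f} FPS")    # 9.30 FPS
print(f"image encoder only:      {image_only_fps:.2f} FPS")  # 9.97 FPS
```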


	## Python demo

	### ONNX demo
	```
	python3 onnx_infer.py
	```

	### Axmodel demo

	Deployment on the AX650N requires [PyAXEngine](https://github.com/AXERA-TECH/pyaxengine).
	Run the following command on the board:
	```
	root@ax650:~/WeDetect# python3 axmodel_infer.py
	[INFO] Available providers: ['AxEngineExecutionProvider', 'AXCLRTExecutionProvider']
	Classes: ['鞋', '床', '人', '衣架']
	[INFO] Using provider: AxEngineExecutionProvider
	[INFO] Chip type: ChipType.MC50
	[INFO] VNPU type: VNPUType.DISABLED
	[INFO] Engine version: 2.12.0s
	[INFO] Model type: 2 (triple core)
	[INFO] Compiler version: 6.0 6965315a
	[INFO] Using provider: AxEngineExecutionProvider
	[INFO] Model type: 2 (triple core)
	[INFO] Compiler version: 6.0 62ad4ff7
	Detections: 3
	床 0.863 (336, 427, 837, 679)
	鞋 0.430 (316, 629, 345, 688)
	鞋 0.397 (268, 628, 331, 679)
	Saved: axmodel_res.jpg

	```
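If the printed detections need to be consumed by another tool, the lines in the log above can be parsed with a small regular expression. This is a sketch based only on the output format shown here: the `Detection` type and `parse_detections` function are hypothetical, and the interpretation of the four integers as `(x1, y1, x2, y2)` pixel coordinates is an assumption that should be verified against `axmodel_infer.py`.

```python
import re
from typing import NamedTuple, Tuple

class Detection(NamedTuple):
    label: str
    score: float
    box: Tuple[int, int, int, int]  # assumed to be (x1, y1, x2, y2) in pixels

# Matches lines such as: 床 0.863 (336, 427, 837, 679)
LINE_RE = re.compile(
    r"^(?P<label>\S+)\s+(?P<score>\d*\.\d+)\s+\((?P<box>\d+(?:,\s*\d+){3})\)$"
)

def parse_detections(log_text):
    """Extract detections from demo output, skipping all other log lines."""
    detections = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            box = tuple(int(v) for v in m.group("box").split(","))
            detections.append(Detection(m.group("label"), float(m.group("score")), box))
    return detections

# Excerpt copied from the demo output above.
sample_log = """\
[INFO] Compiler version: 6.0 62ad4ff7
Detections: 3
床 0.863 (336, 427, 837, 679)
鞋 0.430 (316, 629, 345, 688)
鞋 0.397 (268, 628, 331, 679)
Saved: axmodel_res.jpg
"""
detections = parse_detections(sample_log)
for det in detections:
    print(det)
```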

	![](axmodel_res.jpg)