diff --git a/.gitattributes b/.gitattributes
index c7d9f3332a950355d5a77d85000f05e6f45435ea..6636f7b144effac61f806581a4bc14e7152b6ac8 100644
--- a/.gitattributes
+++ b/.gitattributes
@@ -32,3 +32,12 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
+model_final.pdparams filter=lfs diff=lfs merge=lfs -text
+PaddleDetection-release-2.6/demo/car.jpg filter=lfs diff=lfs merge=lfs -text
+PaddleDetection-release-2.6/demo/P0072__1.0__0___0.png filter=lfs diff=lfs merge=lfs -text
+PaddleDetection-release-2.6/demo/P0861__1.0__1154___824.png filter=lfs diff=lfs merge=lfs -text
+PaddleDetection-release-2.6/docs/images/picodet_android_demo1.jpg filter=lfs diff=lfs merge=lfs -text
+PaddleDetection-release-2.6/docs/images/picodet_android_demo2.jpg filter=lfs diff=lfs merge=lfs -text
+PaddleDetection-release-2.6/docs/images/picodet_android_demo3.jpg filter=lfs diff=lfs merge=lfs -text
+PaddleDetection-release-2.6/docs/images/picodet_map.png filter=lfs diff=lfs merge=lfs -text
+PaddleDetection-release-2.6/docs/images/tinypose_demo.png filter=lfs diff=lfs merge=lfs -text
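The LFS rules added above can be verified locally. A minimal sketch using a throwaway repository and `git check-attr` (assumes `git` is on PATH; the file name `data.zst` is just an example):

```shell
# Create a scratch repo, add one of the LFS patterns from above, and ask
# git which filter applies to a file name that should match it.
repo=$(mktemp -d)
cd "$repo"
git init -q .
printf '*.zst filter=lfs diff=lfs merge=lfs -text\n' > .gitattributes
out=$(git check-attr filter -- data.zst)
echo "$out"   # data.zst: filter: lfs
```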
diff --git a/PaddleDetection-release-2.6/.github/ISSUE_TEMPLATE/1_bug-report.yml b/PaddleDetection-release-2.6/.github/ISSUE_TEMPLATE/1_bug-report.yml
new file mode 100644
index 0000000000000000000000000000000000000000..e2afdaa5ee2f2275ca567500bd2b640680e35b73
--- /dev/null
+++ b/PaddleDetection-release-2.6/.github/ISSUE_TEMPLATE/1_bug-report.yml
@@ -0,0 +1,106 @@
+name: 🐛 报BUG Bug Report
+description: 报告一个可复现的Bug以帮助我们修复PaddleDetection。 Report a bug to help us reproduce and fix it.
+labels: [type/bug-report, status/new-issue]
+
+body:
+- type: markdown
+ attributes:
+ value: |
+ Thank you for submitting a PaddleDetection Bug Report!
+
+- type: checkboxes
+ attributes:
+ label: 问题确认 Search before asking
+ description: >
+ (必选项) 在向PaddleDetection报bug之前,请先查询[历史issue](https://github.com/PaddlePaddle/PaddleDetection/issues)是否报过同样的bug。
+
+ (Required) Before submitting a bug, please make sure the issue hasn't been already addressed by searching through [the existing and past issues](https://github.com/PaddlePaddle/PaddleDetection/issues).
+
+ options:
+ - label: >
+ 我已经查询[历史issue](https://github.com/PaddlePaddle/PaddleDetection/issues),没有发现相似的bug。I have searched the [issues](https://github.com/PaddlePaddle/PaddleDetection/issues) and found no similar bug report.
+ required: true
+
+- type: dropdown
+ attributes:
+ label: Bug组件 Bug Component
+ description: |
+ (可选项) 请选择在哪部分代码发现这个bug。(Optional) Please select the part of PaddleDetection where you found the bug.
+ multiple: true
+ options:
+ - "Training"
+ - "Validation"
+ - "Inference"
+ - "Export"
+ - "Deploy"
+ - "Installation"
+ - "DataProcess"
+ - "Other"
+ validations:
+ required: false
+
+- type: textarea
+ id: code
+ attributes:
+ label: Bug描述 Describe the Bug
+ description: |
+      请清晰而简洁地描述这个bug,并附上bug复现步骤、报错信息或截图、代码改动说明或最小可复现代码。如果代码太长,请将可执行代码放到[AIStudio](https://aistudio.baidu.com/aistudio/index)中并将项目设置为公开(或者放到github gist上),并在项目中描述清楚bug复现步骤,在issue中描述期望结果与实际结果。 Please describe the bug clearly and concisely, attaching the reproduction steps, the error message or a screenshot, a description of your code changes, or a minimal reproducible example. If the code is too long, put a runnable project on [AIStudio](https://aistudio.baidu.com/aistudio/index) and make it public (or use a GitHub gist), describe the reproduction steps in the project, and state the expected and actual results in the issue.
+
+      如果你报告的是一个报错信息,请将完整回溯的报错贴在这里,并使用 ` ```三引号块``` `展示错误信息。 If you are reporting an error, please paste the full traceback here inside a ` ```triple-backtick block``` `.
+
+
+ placeholder: |
+ 请清晰简洁的描述这个bug。 A clear and concise description of what the bug is.
+
+ ```python
+ 代码改动说明,或最小可复现代码。 Code change description, or sample code to reproduce the problem.
+ ```
+
+ ```shell
+ 带有完整回溯信息的报错日志或截图。 The error log or screenshot you got, with the full traceback.
+ ```
+ validations:
+ required: true
+
+- type: textarea
+ attributes:
+ label: 复现环境 Environment
+ description: 请具体说明复现bug的环境信息。Please specify the environment information for reproducing the bug.
+ placeholder: |
+ - OS: Linux/Windows
+ - PaddlePaddle: 2.2.2
+ - PaddleDetection: release/2.4
+ - Python: 3.8.0
+ - CUDA: 10.2
+ - CUDNN: 7.6
+ - GCC: 8.2.0
+ validations:
+ required: true
+
+- type: checkboxes
+ attributes:
+ label: Bug描述确认 Bug description confirmation
+ description: >
+ (必选项) 请确认是否提供了详细的Bug描述和环境信息,确认问题是否可以复现。
+
+ (Required) Please confirm whether the bug description and environment information are provided, and whether the problem can be reproduced.
+
+ options:
+ - label: >
+ 我确认已经提供了Bug复现步骤、代码改动说明、以及环境信息,确认问题是可以复现的。I confirm that the bug replication steps, code change instructions, and environment information have been provided, and the problem can be reproduced.
+ required: true
+
+- type: checkboxes
+ attributes:
+ label: 是否愿意提交PR? Are you willing to submit a PR?
+ description: >
+ (可选项) 如果你对修复bug有自己的想法,十分鼓励提交[Pull Request](https://github.com/PaddlePaddle/PaddleDetection/pulls),共同提升PaddleDetection。
+
+ (Optional) We encourage you to submit a [Pull Request](https://github.com/PaddlePaddle/PaddleDetection/pulls) (PR) to help improve PaddleDetection for everyone, especially if you have a good understanding of how to implement a fix or feature.
+ options:
+ - label: 我愿意提交PR!I'd like to help by submitting a PR!
+
+- type: markdown
+ attributes:
+ value: >
+ 感谢你的贡献 🎉!Thanks for your contribution 🎉!
diff --git a/PaddleDetection-release-2.6/.github/ISSUE_TEMPLATE/2_feature-request.yml b/PaddleDetection-release-2.6/.github/ISSUE_TEMPLATE/2_feature-request.yml
new file mode 100644
index 0000000000000000000000000000000000000000..dcf9ec4462886c7064315f0fc6ac167dd6c6dbf5
--- /dev/null
+++ b/PaddleDetection-release-2.6/.github/ISSUE_TEMPLATE/2_feature-request.yml
@@ -0,0 +1,50 @@
+name: 🚀 新需求 Feature Request
+description: 提交一个你对PaddleDetection的新需求。 Submit a request for a new PaddleDetection feature.
+labels: [type/feature-request, status/new-issue]
+
+body:
+- type: markdown
+ attributes:
+ value: >
+ #### 你可以在这里提出你对PaddleDetection的新需求,包括但不限于:功能或模型缺失、功能不全或无法使用、精度/性能不符合预期等。
+
+ #### You could submit a request for a new feature here, including but not limited to: new features or models, incomplete or unusable features, accuracy/performance not as expected, etc.
+
+- type: checkboxes
+ attributes:
+ label: 问题确认 Search before asking
+ description: >
+ 在向PaddleDetection提新需求之前,请先查询[历史issue](https://github.com/PaddlePaddle/PaddleDetection/issues)是否报过同样的需求。
+
+ Before submitting a feature request, please make sure the issue hasn't been already addressed by searching through [the existing and past issues](https://github.com/PaddlePaddle/PaddleDetection/issues).
+
+ options:
+ - label: >
+ 我已经查询[历史issue](https://github.com/PaddlePaddle/PaddleDetection/issues),没有类似需求。I have searched the [issues](https://github.com/PaddlePaddle/PaddleDetection/issues) and found no similar feature requests.
+ required: true
+
+- type: textarea
+ id: description
+ attributes:
+ label: 需求描述 Feature Description
+ description: |
+ 请尽可能包含任务目标、需求场景、功能描述等信息,全面的信息有利于我们准确评估你的需求。
+ Please include as much information as possible, such as mission objectives, requirement scenarios, functional descriptions, etc. Comprehensive information will help us accurately assess your feature request.
+ value: "1. 任务目标(请描述你正在做的项目是什么,如模型、论文、项目是什么?); 2. 需求场景(请描述你的项目中为什么需要用此功能); 3. 功能描述(请简单描述或设计这个功能)"
+ validations:
+ required: true
+
+- type: checkboxes
+ attributes:
+ label: 是否愿意提交PR Are you willing to submit a PR?
+ description: >
+ (可选)如果你对新feature有自己的想法,十分鼓励提交[Pull Request](https://github.com/PaddlePaddle/PaddleDetection/pulls),共同提升PaddleDetection
+
+ (Optional) We encourage you to submit a [Pull Request](https://github.com/PaddlePaddle/PaddleDetection/pulls) (PR) to help improve PaddleDetection for everyone, especially if you have a good understanding of how to implement a fix or feature.
+ options:
+ - label: Yes I'd like to help by submitting a PR!
+
+- type: markdown
+ attributes:
+ value: >
+ 感谢你的贡献 🎉!Thanks for your contribution 🎉!
diff --git a/PaddleDetection-release-2.6/.github/ISSUE_TEMPLATE/3_documentation-issue.yml b/PaddleDetection-release-2.6/.github/ISSUE_TEMPLATE/3_documentation-issue.yml
new file mode 100644
index 0000000000000000000000000000000000000000..4ea08cd5f4b99003d2323e1578bd0456a9dcf848
--- /dev/null
+++ b/PaddleDetection-release-2.6/.github/ISSUE_TEMPLATE/3_documentation-issue.yml
@@ -0,0 +1,38 @@
+name: 📚 文档 Documentation Issue
+description: 反馈一个官网文档错误。 Report an issue related to https://github.com/PaddlePaddle/PaddleDetection.
+labels: [type/docs, status/new-issue]
+
+body:
+- type: markdown
+ attributes:
+ value: >
+ #### 请确认反馈的问题来自PaddlePaddle官网文档:https://github.com/PaddlePaddle/PaddleDetection 。
+
+ #### Before submitting a Documentation Issue, Please make sure that issue is related to https://github.com/PaddlePaddle/PaddleDetection.
+
+- type: textarea
+ id: link
+ attributes:
+ label: 文档链接&描述 Document Links & Description
+ description: |
+ 请说明有问题的文档链接以及该文档存在的问题。
+ Please fill in the link to the document and describe the question.
+ validations:
+ required: true
+
+
+- type: textarea
+ id: error
+ attributes:
+ label: 请提出你的建议 Please give your suggestion
+ description: |
+ 请告诉我们,你希望如何改进这个文档。或者你可以提个PR修复这个问题。
+ Please tell us how you would like to improve this document. Or you can submit a PR to fix this problem.
+
+ validations:
+ required: false
+
+- type: markdown
+ attributes:
+ value: >
+ 感谢你的贡献 🎉!Thanks for your contribution 🎉!
diff --git a/PaddleDetection-release-2.6/.github/ISSUE_TEMPLATE/4_ask-a-question.yml b/PaddleDetection-release-2.6/.github/ISSUE_TEMPLATE/4_ask-a-question.yml
new file mode 100644
index 0000000000000000000000000000000000000000..af237f516eb333d4c5f33bba4b7dc9c0dec2e30f
--- /dev/null
+++ b/PaddleDetection-release-2.6/.github/ISSUE_TEMPLATE/4_ask-a-question.yml
@@ -0,0 +1,37 @@
+name: 🙋🏼♀️🙋🏻♂️提问 Ask a Question
+description: 提出一个使用/咨询问题。 Ask a usage or consultation question.
+labels: [type/question, status/new-issue]
+
+body:
+- type: checkboxes
+ attributes:
+ label: 问题确认 Search before asking
+ description: >
+ #### 你可以在这里提出一个使用/咨询问题,提问之前请确保:
+
+ - 1)已经百度/谷歌搜索过你的问题,但是没有找到解答;
+
+ - 2)已经在官网查询过[教程文档](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.4/docs/tutorials/GETTING_STARTED_cn.md)与[FAQ](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.4/docs/tutorials/FAQ),但是没有找到解答;
+
+ - 3)已经在[历史issue](https://github.com/PaddlePaddle/PaddleDetection/issues)中搜索过,没有找到同类issue或issue未被解答。
+
+
+ #### You could ask a usage or consultation question here, before your start, please make sure:
+
+ - 1) You have searched your question on Baidu/Google, but found no answer;
+
+ - 2) You have checked the [tutorials](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.4/docs/tutorials/GETTING_STARTED.md), but found no answer;
+
+ - 3) You have searched [the existing and past issues](https://github.com/PaddlePaddle/PaddleDetection/issues), but found no similar issue or the issue has not been answered.
+
+ options:
+ - label: >
+ 我已经搜索过问题,但是没有找到解答。I have searched the question and found no related answer.
+ required: true
+
+- type: textarea
+ id: question
+ attributes:
+ label: 请提出你的问题 Please ask your question
+ validations:
+ required: true
diff --git a/PaddleDetection-release-2.6/.github/ISSUE_TEMPLATE/5_others.yml b/PaddleDetection-release-2.6/.github/ISSUE_TEMPLATE/5_others.yml
new file mode 100644
index 0000000000000000000000000000000000000000..ec2f08ae16098cd8987f3b6bc726d9a28696833a
--- /dev/null
+++ b/PaddleDetection-release-2.6/.github/ISSUE_TEMPLATE/5_others.yml
@@ -0,0 +1,23 @@
+name: 🧩 其他 Others
+description: 提出其他问题。 Report any other non-support related issues.
+labels: [type/others, status/new-issue]
+
+body:
+- type: markdown
+ attributes:
+ value: >
+ #### 你可以在这里提出任何前面几类模板不适用的问题,包括但不限于:优化性建议、框架使用体验反馈、版本兼容性问题、报错信息不清楚等。
+
+ #### You can report any issues that are not applicable to the previous types of templates, including but not limited to: enhancement suggestions, feedback on the use of the framework, version compatibility issues, unclear error information, etc.
+
+- type: textarea
+ id: others
+ attributes:
+ label: 问题描述 Please describe your issue
+ validations:
+ required: true
+
+- type: markdown
+ attributes:
+ value: >
+ 感谢你的贡献 🎉! Thanks for your contribution 🎉!
diff --git a/PaddleDetection-release-2.6/.gitignore b/PaddleDetection-release-2.6/.gitignore
new file mode 100644
index 0000000000000000000000000000000000000000..4b6a6e8246385c676e00a412f6030ec4100d090f
--- /dev/null
+++ b/PaddleDetection-release-2.6/.gitignore
@@ -0,0 +1,88 @@
+# Virtualenv
+/.venv/
+/venv/
+
+# Byte-compiled / optimized / DLL files
+__pycache__/
+.ipynb_checkpoints/
+*.py[cod]
+
+# C extensions
+*.so
+
+# json file
+*.json
+
+# log file
+*.log
+
+# Distribution / packaging
+/bin/
+*build/
+/develop-eggs/
+*dist/
+/eggs/
+/lib/
+/lib64/
+/output/
+/inference_model/
+/output_inference/
+/parts/
+/sdist/
+/var/
+*.egg-info/
+/.installed.cfg
+/*.egg
+/.eggs
+
+# AUTHORS and ChangeLog will be generated while packaging
+/AUTHORS
+/ChangeLog
+
+# BCloud / BuildSubmitter
+/build_submitter.*
+/logger_client_log
+
+# Installer logs
+pip-log.txt
+pip-delete-this-directory.txt
+
+# Unit test / coverage reports
+.tox/
+.coverage
+.cache
+.pytest_cache
+nosetests.xml
+coverage.xml
+
+# Translations
+*.mo
+
+# Sphinx documentation
+/docs/_build/
+
+*.tar
+*.pyc
+
+.idea/
+
+dataset/coco/annotations
+dataset/coco/train2017
+dataset/coco/val2017
+dataset/voc/VOCdevkit
+dataset/fruit/fruit-detection/
+dataset/voc/test.txt
+dataset/voc/trainval.txt
+dataset/wider_face/WIDER_test
+dataset/wider_face/WIDER_train
+dataset/wider_face/WIDER_val
+dataset/wider_face/wider_face_split
+
+ppdet/version.py
+
+# NPU meta folder
+kernel_meta/
+
+# MAC
+*.DS_Store
+
diff --git a/PaddleDetection-release-2.6/.pre-commit-config.yaml b/PaddleDetection-release-2.6/.pre-commit-config.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..099148ac4ed123b68803486f7d30d157005b617d
--- /dev/null
+++ b/PaddleDetection-release-2.6/.pre-commit-config.yaml
@@ -0,0 +1,44 @@
+- repo: https://github.com/PaddlePaddle/mirrors-yapf.git
+ sha: 0d79c0c469bab64f7229c9aca2b1186ef47f0e37
+ hooks:
+ - id: yapf
+ files: \.py$
+- repo: https://github.com/pre-commit/pre-commit-hooks
+ sha: a11d9314b22d8f8c7556443875b731ef05965464
+ hooks:
+ - id: check-merge-conflict
+ - id: check-symlinks
+ - id: detect-private-key
+ files: (?!.*paddle)^.*$
+ - id: end-of-file-fixer
+ files: \.(md|yml)$
+ - id: trailing-whitespace
+ files: \.(md|yml)$
+- repo: https://github.com/Lucas-C/pre-commit-hooks
+ sha: v1.0.1
+ hooks:
+ - id: forbid-crlf
+ files: \.(md|yml)$
+ - id: remove-crlf
+ files: \.(md|yml)$
+ - id: forbid-tabs
+ files: \.(md|yml)$
+ - id: remove-tabs
+ files: \.(md|yml)$
+- repo: local
+ hooks:
+ - id: clang-format-with-version-check
+ name: clang-format
+ description: Format files with ClangFormat.
+ entry: bash ./.travis/codestyle/clang_format.hook -i
+ language: system
+ files: \.(c|cc|cxx|cpp|cu|h|hpp|hxx|proto)$
+
+- repo: local
+ hooks:
+ - id: cpplint-cpp-source
+ name: cpplint
+ description: Check C++ code style using cpplint.py.
+ entry: bash ./.travis/codestyle/cpplint_pre_commit.hook
+ language: system
+ files: \.(c|cc|cxx|cpp|cu|h|hpp|hxx)$
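The `files:` keys in the config above are Python regular expressions matched against each staged path. As a rough standalone illustration (the file names below are made up), `grep -E` approximates the `\.(md|yml)$` filter that scopes the whitespace and CRLF hooks:

```shell
# Only .md and .yml paths survive the filter, mirroring which files the
# end-of-file / trailing-whitespace / CRLF hooks above will touch.
matched=$(printf 'README.md\nmain.py\nconfig.yml\nnotes.txt\n' | grep -E '\.(md|yml)$')
echo "$matched"
```

Note that pre-commit anchors the regex at the start of the path, so patterns like `\.py$` rely on the regex engine's unanchored-suffix matching via `search` semantics; the `grep` approximation above behaves the same way for these suffix patterns.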
diff --git a/PaddleDetection-release-2.6/.style.yapf b/PaddleDetection-release-2.6/.style.yapf
new file mode 100644
index 0000000000000000000000000000000000000000..4741fb4f3bbc6681088cf9e960321e7b857a93a8
--- /dev/null
+++ b/PaddleDetection-release-2.6/.style.yapf
@@ -0,0 +1,3 @@
+[style]
+based_on_style = pep8
+column_limit = 80
diff --git a/PaddleDetection-release-2.6/.travis.yml b/PaddleDetection-release-2.6/.travis.yml
new file mode 100644
index 0000000000000000000000000000000000000000..b8eff51456d9723695cd037543e73f921ad4d009
--- /dev/null
+++ b/PaddleDetection-release-2.6/.travis.yml
@@ -0,0 +1,35 @@
+language: cpp
+cache: ccache
+sudo: required
+dist: trusty
+services:
+ - docker
+os:
+ - linux
+env:
+ - JOB=PRE_COMMIT
+
+addons:
+ apt:
+ packages:
+ - git
+ - python
+ - python-pip
+ - python2.7-dev
+ ssh_known_hosts: 13.229.163.131
+before_install:
+ - sudo pip install -U virtualenv pre-commit pip -i https://pypi.tuna.tsinghua.edu.cn/simple
+ - docker pull paddlepaddle/paddle:latest
+ - git pull https://github.com/PaddlePaddle/PaddleDetection develop
+
+script:
+ - exit_code=0
+ - .travis/precommit.sh || exit_code=$(( exit_code | $? ))
+ # - docker run -i --rm -v "$PWD:/py_unittest" paddlepaddle/paddle:latest /bin/bash -c
+ # 'cd /py_unittest; sh .travis/unittest.sh' || exit_code=$(( exit_code | $? ))
+ - if [ $exit_code -eq 0 ]; then true; else exit 1; fi;
+
+notifications:
+ email:
+ on_success: change
+ on_failure: always
diff --git a/PaddleDetection-release-2.6/.travis/codestyle/clang_format.hook b/PaddleDetection-release-2.6/.travis/codestyle/clang_format.hook
new file mode 100644
index 0000000000000000000000000000000000000000..1c4aa5b164a9871a227e753c9dce57827eabd748
--- /dev/null
+++ b/PaddleDetection-release-2.6/.travis/codestyle/clang_format.hook
@@ -0,0 +1,4 @@
+#!/bin/bash
+set -e
+
+clang-format "$@"
diff --git a/PaddleDetection-release-2.6/.travis/codestyle/cpplint_pre_commit.hook b/PaddleDetection-release-2.6/.travis/codestyle/cpplint_pre_commit.hook
new file mode 100644
index 0000000000000000000000000000000000000000..c90bf29ecb794bde52df7468d7626211397b0391
--- /dev/null
+++ b/PaddleDetection-release-2.6/.travis/codestyle/cpplint_pre_commit.hook
@@ -0,0 +1,27 @@
+#!/bin/bash
+
+TOTAL_ERRORS=0
+if [[ ! $TRAVIS_BRANCH ]]; then
+ # install cpplint on local machine.
+ if [[ ! $(which cpplint) ]]; then
+ pip install cpplint
+ fi
+ # diff files on local machine.
+  # diff files on local machine; the awk filter drops deleted files.
+  files=$(git diff --cached --name-status | awk '$1 != "D" {print $2}')
+else
+ # diff files between PR and latest commit on Travis CI.
+ branch_ref=$(git rev-parse "$TRAVIS_BRANCH")
+ head_ref=$(git rev-parse HEAD)
+ files=$(git diff --name-status $branch_ref $head_ref | awk '$1 != "D" {print $2}')
+fi
+# The trick to remove deleted files: https://stackoverflow.com/a/2413151
+for file in $files; do
+ if [[ $file =~ ^(patches/.*) ]]; then
+ continue;
+ else
+ cpplint --filter=-readability/fn_size,-build/include_what_you_use,-build/c++11 $file;
+ TOTAL_ERRORS=$(expr $TOTAL_ERRORS + $?);
+ fi
+done
+
+exit $TOTAL_ERRORS
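The `awk '$1 != "D" {print $2}'` pipeline above is what skips deleted files (the trick from the linked Stack Overflow answer). A quick standalone check with fabricated `git diff --name-status` output:

```shell
# Rows are "<status>\t<path>"; the filter drops deleted ("D") entries and
# keeps only the path column, so cpplint is never run on removed files.
files=$(printf 'M\tsrc/a.cc\nD\tsrc/gone.cc\nA\tinclude/b.h\n' | awk '$1 != "D" {print $2}')
echo "$files"
```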
diff --git a/PaddleDetection-release-2.6/.travis/precommit.sh b/PaddleDetection-release-2.6/.travis/precommit.sh
new file mode 100644
index 0000000000000000000000000000000000000000..bcbfb2bb530ca6fecd1ac4c9e049c292a61e5e64
--- /dev/null
+++ b/PaddleDetection-release-2.6/.travis/precommit.sh
@@ -0,0 +1,21 @@
+#!/bin/bash
+function abort(){
+    echo "Your commit does not fit the PaddlePaddle code style" 1>&2
+    echo "Please use the pre-commit scripts to auto-format your code" 1>&2
+ exit 1
+}
+
+trap 'abort' 0
+set -e
+cd "$(dirname "$0")"
+cd ..
+export PATH=/usr/bin:$PATH
+pre-commit install
+
+if ! pre-commit run -a ; then
+ ls -lh
+ git diff --exit-code
+ exit 1
+fi
+
+trap : 0
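The `trap 'abort' 0` / `trap : 0` pair in the script above is a common shell idiom: an EXIT (signal 0) trap reports failure unless the script reaches its last line and replaces the handler with the no-op `:`. A minimal sketch of both paths using two subshells:

```shell
# Failure path: the EXIT trap is still armed when the shell exits.
bad=$(bash -c "trap 'echo ABORT' 0; exit 0")
# Success path: the trap is replaced with ':' before exit, so nothing prints.
ok=$(bash -c "trap 'echo ABORT' 0; trap : 0; exit 0")
echo "bad=$bad ok=$ok"
```

Combined with `set -e`, any failing command terminates the script before the trap is cleared, so the `abort` message fires exactly on error.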
diff --git a/PaddleDetection-release-2.6/.travis/requirements.txt b/PaddleDetection-release-2.6/.travis/requirements.txt
new file mode 100644
index 0000000000000000000000000000000000000000..27a340d8f53b1adf94d99b887c525984d53dfd4c
--- /dev/null
+++ b/PaddleDetection-release-2.6/.travis/requirements.txt
@@ -0,0 +1,8 @@
+# add python requirements for unittests here; note that installing
+# pycocotools directly is not supported on Travis CI, so it is compiled
+# from source in unittest.sh instead
+tqdm
+cython
+shapely
+llvmlite==0.33
+numba==0.50
diff --git a/PaddleDetection-release-2.6/.travis/unittest.sh b/PaddleDetection-release-2.6/.travis/unittest.sh
new file mode 100644
index 0000000000000000000000000000000000000000..e71833134fe62db0b3eddfe2806951e1880624d9
--- /dev/null
+++ b/PaddleDetection-release-2.6/.travis/unittest.sh
@@ -0,0 +1,47 @@
+#!/bin/bash
+
+abort(){
+    echo "Running the unit tests failed" 1>&2
+    echo "Please check your code" 1>&2
+    echo " 1. you can run the unit tests locally with 'bash .travis/unittest.sh'" 1>&2
+    echo " 2. add any new python requirements used by the unit tests to .travis/requirements.txt" 1>&2
+ exit 1
+}
+
+unittest(){
+ if [ $? != 0 ]; then
+ exit 1
+ fi
+ find "./ppdet" -name 'tests' -type d -print0 | \
+ xargs -0 -I{} -n1 bash -c \
+ 'python -m unittest discover -v -s {}'
+}
+
+trap 'abort' 0
+set -e
+
+# install travis python dependencies exclude pycocotools
+if [ -f ".travis/requirements.txt" ]; then
+ pip install -r .travis/requirements.txt
+fi
+
+# install pycocotools
+if ! pip list | grep -q pycocotools; then
+  # install git if it is missing (-z: `which git` returned an empty path)
+  if [ -z "$(which git)" ]; then
+    apt-get update
+    apt-get install -y git
+  fi
+ git clone https://github.com/cocodataset/cocoapi.git
+ cd cocoapi/PythonAPI
+ make install
+ python setup.py install --user
+ cd ../..
+ rm -rf cocoapi
+fi
+
+export PYTHONPATH=`pwd`:$PYTHONPATH
+
+unittest .
+
+trap : 0
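The `find … -print0 | xargs -0` pattern in `unittest()` passes NUL-delimited paths so that directory names containing spaces survive the pipe. A standalone sketch with a temporary tree:

```shell
# Create a directory whose name contains a space, then locate it the same
# way unittest() locates 'tests' directories: NUL-delimited find/xargs.
root=$(mktemp -d)
mkdir -p "$root/my tests" "$root/other"
found=$(find "$root" -name '*tests*' -type d -print0 | xargs -0 -n1 basename)
rm -rf "$root"
echo "$found"   # my tests
```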
diff --git a/PaddleDetection-release-2.6/LICENSE b/PaddleDetection-release-2.6/LICENSE
new file mode 100644
index 0000000000000000000000000000000000000000..261eeb9e9f8b2b4b0d119366dda99c6fd7d35c64
--- /dev/null
+++ b/PaddleDetection-release-2.6/LICENSE
@@ -0,0 +1,201 @@
+ Apache License
+ Version 2.0, January 2004
+ http://www.apache.org/licenses/
+
+ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+ 1. Definitions.
+
+ "License" shall mean the terms and conditions for use, reproduction,
+ and distribution as defined by Sections 1 through 9 of this document.
+
+ "Licensor" shall mean the copyright owner or entity authorized by
+ the copyright owner that is granting the License.
+
+ "Legal Entity" shall mean the union of the acting entity and all
+ other entities that control, are controlled by, or are under common
+ control with that entity. For the purposes of this definition,
+ "control" means (i) the power, direct or indirect, to cause the
+ direction or management of such entity, whether by contract or
+ otherwise, or (ii) ownership of fifty percent (50%) or more of the
+ outstanding shares, or (iii) beneficial ownership of such entity.
+
+ "You" (or "Your") shall mean an individual or Legal Entity
+ exercising permissions granted by this License.
+
+ "Source" form shall mean the preferred form for making modifications,
+ including but not limited to software source code, documentation
+ source, and configuration files.
+
+ "Object" form shall mean any form resulting from mechanical
+ transformation or translation of a Source form, including but
+ not limited to compiled object code, generated documentation,
+ and conversions to other media types.
+
+ "Work" shall mean the work of authorship, whether in Source or
+ Object form, made available under the License, as indicated by a
+ copyright notice that is included in or attached to the work
+ (an example is provided in the Appendix below).
+
+ "Derivative Works" shall mean any work, whether in Source or Object
+ form, that is based on (or derived from) the Work and for which the
+ editorial revisions, annotations, elaborations, or other modifications
+ represent, as a whole, an original work of authorship. For the purposes
+ of this License, Derivative Works shall not include works that remain
+ separable from, or merely link (or bind by name) to the interfaces of,
+ the Work and Derivative Works thereof.
+
+ "Contribution" shall mean any work of authorship, including
+ the original version of the Work and any modifications or additions
+ to that Work or Derivative Works thereof, that is intentionally
+ submitted to Licensor for inclusion in the Work by the copyright owner
+ or by an individual or Legal Entity authorized to submit on behalf of
+ the copyright owner. For the purposes of this definition, "submitted"
+ means any form of electronic, verbal, or written communication sent
+ to the Licensor or its representatives, including but not limited to
+ communication on electronic mailing lists, source code control systems,
+ and issue tracking systems that are managed by, or on behalf of, the
+ Licensor for the purpose of discussing and improving the Work, but
+ excluding communication that is conspicuously marked or otherwise
+ designated in writing by the copyright owner as "Not a Contribution."
+
+ "Contributor" shall mean Licensor and any individual or Legal Entity
+ on behalf of whom a Contribution has been received by Licensor and
+ subsequently incorporated within the Work.
+
+ 2. Grant of Copyright License. Subject to the terms and conditions of
+ this License, each Contributor hereby grants to You a perpetual,
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+ copyright license to reproduce, prepare Derivative Works of,
+ publicly display, publicly perform, sublicense, and distribute the
+ Work and such Derivative Works in Source or Object form.
+
+ 3. Grant of Patent License. Subject to the terms and conditions of
+ this License, each Contributor hereby grants to You a perpetual,
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+ (except as stated in this section) patent license to make, have made,
+ use, offer to sell, sell, import, and otherwise transfer the Work,
+ where such license applies only to those patent claims licensable
+ by such Contributor that are necessarily infringed by their
+ Contribution(s) alone or by combination of their Contribution(s)
+ with the Work to which such Contribution(s) was submitted. If You
+ institute patent litigation against any entity (including a
+ cross-claim or counterclaim in a lawsuit) alleging that the Work
+ or a Contribution incorporated within the Work constitutes direct
+ or contributory patent infringement, then any patent licenses
+ granted to You under this License for that Work shall terminate
+ as of the date such litigation is filed.
+
+ 4. Redistribution. You may reproduce and distribute copies of the
+ Work or Derivative Works thereof in any medium, with or without
+ modifications, and in Source or Object form, provided that You
+ meet the following conditions:
+
+ (a) You must give any other recipients of the Work or
+ Derivative Works a copy of this License; and
+
+ (b) You must cause any modified files to carry prominent notices
+ stating that You changed the files; and
+
+ (c) You must retain, in the Source form of any Derivative Works
+ that You distribute, all copyright, patent, trademark, and
+ attribution notices from the Source form of the Work,
+ excluding those notices that do not pertain to any part of
+ the Derivative Works; and
+
+ (d) If the Work includes a "NOTICE" text file as part of its
+ distribution, then any Derivative Works that You distribute must
+ include a readable copy of the attribution notices contained
+ within such NOTICE file, excluding those notices that do not
+ pertain to any part of the Derivative Works, in at least one
+ of the following places: within a NOTICE text file distributed
+ as part of the Derivative Works; within the Source form or
+ documentation, if provided along with the Derivative Works; or,
+ within a display generated by the Derivative Works, if and
+ wherever such third-party notices normally appear. The contents
+ of the NOTICE file are for informational purposes only and
+ do not modify the License. You may add Your own attribution
+ notices within Derivative Works that You distribute, alongside
+ or as an addendum to the NOTICE text from the Work, provided
+ that such additional attribution notices cannot be construed
+ as modifying the License.
+
+ You may add Your own copyright statement to Your modifications and
+ may provide additional or different license terms and conditions
+ for use, reproduction, or distribution of Your modifications, or
+ for any such Derivative Works as a whole, provided Your use,
+ reproduction, and distribution of the Work otherwise complies with
+ the conditions stated in this License.
+
+ 5. Submission of Contributions. Unless You explicitly state otherwise,
+ any Contribution intentionally submitted for inclusion in the Work
+ by You to the Licensor shall be under the terms and conditions of
+ this License, without any additional terms or conditions.
+ Notwithstanding the above, nothing herein shall supersede or modify
+ the terms of any separate license agreement you may have executed
+ with Licensor regarding such Contributions.
+
+ 6. Trademarks. This License does not grant permission to use the trade
+ names, trademarks, service marks, or product names of the Licensor,
+ except as required for reasonable and customary use in describing the
+ origin of the Work and reproducing the content of the NOTICE file.
+
+ 7. Disclaimer of Warranty. Unless required by applicable law or
+ agreed to in writing, Licensor provides the Work (and each
+ Contributor provides its Contributions) on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+ implied, including, without limitation, any warranties or conditions
+ of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+ PARTICULAR PURPOSE. You are solely responsible for determining the
+ appropriateness of using or redistributing the Work and assume any
+ risks associated with Your exercise of permissions under this License.
+
+ 8. Limitation of Liability. In no event and under no legal theory,
+ whether in tort (including negligence), contract, or otherwise,
+ unless required by applicable law (such as deliberate and grossly
+ negligent acts) or agreed to in writing, shall any Contributor be
+ liable to You for damages, including any direct, indirect, special,
+ incidental, or consequential damages of any character arising as a
+ result of this License or out of the use or inability to use the
+ Work (including but not limited to damages for loss of goodwill,
+ work stoppage, computer failure or malfunction, or any and all
+ other commercial damages or losses), even if such Contributor
+ has been advised of the possibility of such damages.
+
+ 9. Accepting Warranty or Additional Liability. While redistributing
+ the Work or Derivative Works thereof, You may choose to offer,
+ and charge a fee for, acceptance of support, warranty, indemnity,
+ or other liability obligations and/or rights consistent with this
+ License. However, in accepting such obligations, You may act only
+ on Your own behalf and on Your sole responsibility, not on behalf
+ of any other Contributor, and only if You agree to indemnify,
+ defend, and hold each Contributor harmless for any liability
+ incurred by, or claims asserted against, such Contributor by reason
+ of your accepting any such warranty or additional liability.
+
+ END OF TERMS AND CONDITIONS
+
+ APPENDIX: How to apply the Apache License to your work.
+
+ To apply the Apache License to your work, attach the following
+ boilerplate notice, with the fields enclosed by brackets "[]"
+ replaced with your own identifying information. (Don't include
+ the brackets!) The text should be enclosed in the appropriate
+ comment syntax for the file format. We also recommend that a
+ file or class name and description of purpose be included on the
+ same "printed page" as the copyright notice for easier
+ identification within third-party archives.
+
+ Copyright [yyyy] [name of copyright owner]
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
diff --git a/PaddleDetection-release-2.6/README.md b/PaddleDetection-release-2.6/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..4015683cfa5969297febc12e7ca1264afabbc0b5
--- /dev/null
+++ b/PaddleDetection-release-2.6/README.md
@@ -0,0 +1 @@
+README_cn.md
\ No newline at end of file
diff --git a/PaddleDetection-release-2.6/README_cn.md b/PaddleDetection-release-2.6/README_cn.md
new file mode 100644
index 0000000000000000000000000000000000000000..15b0896cbb74e7712a52b77e0c0dada6bde248b0
--- /dev/null
+++ b/PaddleDetection-release-2.6/README_cn.md
@@ -0,0 +1,800 @@
+简体中文 | [English](README_en.md)
+
+
+
+## 🌈Introduction
+
+PaddleDetection is an end-to-end object detection development kit based on PaddlePaddle. Alongside a rich set of model components and benchmarks, it focuses on end-to-end industrial deployment: through industry-grade featured models and tools and industrial application examples, it helps developers connect the whole workflow of data preparation, model selection, model training, and model deployment for fast real-world adoption.
+
+Examples of the main model capabilities are shown below (click a title to jump to the section):
+
+| [**General Object Detection**](#pp-yoloe-高精度目标检测模型) | [**Small Object Detection**](#pp-yoloe-sod-高精度小目标检测模型) | [**Rotated Object Detection**](#pp-yoloe-r-高性能旋转框检测模型) | [**3D Object Detection**](https://github.com/PaddlePaddle/Paddle3D) |
+| :--: | :--: | :--: | :--: |
+| [**Face Detection**](#模型库) | [**2D Keypoint Detection**](#️pp-tinypose-人体骨骼关键点识别) | [**Multi-Object Tracking**](#pp-tracking-实时多目标跟踪系统) | [**Instance Segmentation**](#模型库) |
+| [**Vehicle Analysis: License Plate Recognition**](#️pp-vehicle-实时车辆分析工具) | [**Vehicle Analysis: Traffic Flow Statistics**](#️pp-vehicle-实时车辆分析工具) | [**Vehicle Analysis: Violation Detection**](#️pp-vehicle-实时车辆分析工具) | [**Vehicle Analysis: Attribute Recognition**](#️pp-vehicle-实时车辆分析工具) |
+| [**Pedestrian Analysis: Intrusion Detection**](#pp-human-实时行人分析工具) | [**Pedestrian Analysis: Behavior Recognition**](#pp-human-实时行人分析工具) | [**Pedestrian Analysis: Attribute Recognition**](#pp-human-实时行人分析工具) | [**Pedestrian Analysis: Crowd Counting**](#pp-human-实时行人分析工具) |
+
+In addition, PaddleDetection provides an online demo where users can run inference on their own data.
+
+`Note`: to limit server load, online inference runs on CPU only. For complete model development examples and industrial deployment code, see [🎗️Industry-Featured Models & Tools](#️产业特色模型产业工具-1).
+
+`Quick link`: [online model demo](https://www.paddlepaddle.org.cn/models)
+
+
+
+
+
+
+
+## ✨Key Features
+
+#### 🧩Modular Design
+PaddleDetection decouples detection models into modular components; by combining custom components, users can build detection models quickly and efficiently. `Quick link`: [🧩Module Components](#模块组件).
+
+#### 📱Rich Model Zoo
+PaddleDetection supports a large number of state-of-the-art algorithm benchmarks and pre-trained models, covering 2D/3D object detection, instance segmentation, face detection, keypoint detection, multi-object tracking, semi-supervised learning, and more. `Quick link`: [📱Model Zoo](#模型库), [⚖️Model Performance Comparison](#️模型性能对比).
+
+#### 🎗️Industry-Featured Models & Tools
+PaddleDetection builds industry-grade featured models and analysis tools such as PP-YOLOE+, PP-PicoDet, PP-TinyPose, PP-HumanV2, and PP-Vehicle. For common, high-frequency vertical scenarios it provides deeply optimized solutions and highly integrated analysis tools, reducing developers' trial-and-error and selection costs and enabling fast deployment in business scenarios. `Quick link`: [🎗️Industry-Featured Models & Tools](#️产业特色模型产业工具-1).
+
+#### 💡🏆Industrial Deployment Practice
+PaddleDetection curates AI application examples for industry, agriculture, forestry, transportation, healthcare, finance, energy, and more, connecting the full pipeline of data annotation, model training, model tuning, and inference deployment, continuously lowering the barrier to applying object detection in industry. `Quick link`: [💡Industrial Practice Examples](#产业实践范例), [🏆Enterprise Application Cases](#企业应用案例).
+
+
+
+
+
+
+
+## 📣Latest News
+
+PaddleDetection 2.6 has been released! [See the release notes](https://github.com/PaddlePaddle/PaddleDetection/releases/tag/v2.6.0)
+
+## 👫Open-Source Community
+
+- **📑Project cooperation:** If you are an enterprise developer with a concrete vertical object detection need, scan the QR code below to join the group and contact the `group admin AI` for free cooperation with the official team at different levels.
+- **🏅️Community contribution:** PaddleDetection welcomes you to join the open-source development of the PaddlePaddle community; see the [open-source project development guide](docs/contribution/README.md) for how to contribute.
+- **💻Live tutorials:** PaddleDetection regularly hosts live streams ([Bilibili: 飞桨PaddlePaddle](https://space.bilibili.com/476867757), [WeChat: 飞桨PaddlePaddle](https://mp.weixin.qq.com/s/6ji89VKqoXDY6SSGkxS8NQ)) covering new releases, industrial examples, and tutorials.
+- **🎁Join the community:** **after scanning the QR code on WeChat and filling in the questionnaire, you get timely access to:**
+  - previews of the latest community articles and live courses
+  - recordings and slides of past live streams
+  - 30+ high-performance pre-trained models for verticals such as pedestrians and vehicles
+  - download links for open datasets across seven tasks
+  - 40+ cutting-edge detection algorithms from top conferences
+  - 15+ video courses on object detection theory and practice from scratch
+  - 10+ hands-on end-to-end projects in industry, security, and transportation (with source code)
+
+Official PaddleDetection community group QR code
+
+- **🎈Recent community activities**
+
+  - **👀YOLO series special**
+    - `Article`: [YOLOv8 is here! How to choose among YOLO models, and how to deploy quickly on 9+ AI hardware platforms: an in-depth analysis](https://mp.weixin.qq.com/s/rPwprZeHEpmGOe5wxrmO5g)
+    - `Code`: [full PaddleYOLO series](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.5/docs/feature_models/PaddleYOLO_MODEL.md)
+
+  - **🎯Few-shot transfer learning special**
+    - `Article`: [Limited data? Poor generalization? PaddleDetection few-shot transfer learning helps you break through!](https://mp.weixin.qq.com/s/dFEQoxSzVCOaWVZPb3N7WA)
+
+  - **⚽️2022 FIFA World Cup special**
+    - `Article`: [The World Cup final is near: build a 3D+AI football analysis system over the weekend!](https://mp.weixin.qq.com/s/koJxjWDPBOlqgI-98UsfKQ)
+
+  - **🔍Rotated-box and small object detection special**
+    - `Article`: [Yes, PP-YOLOE! 80.73 mAP and 38.5 mAP: dual SOTA in rotated-box and small object detection!](https://mp.weixin.qq.com/s/6ji89VKqoXDY6SSGkxS8NQ)
+
+  - **🎊YOLO Vision world academic exchange conference**
+    - **PaddleDetection** was invited to the first **YOLO-themed** **YOLO-VISION** world conference to learn from and exchange ideas with leading AI developers worldwide.
+    - `Event link`: [YOLO-VISION](https://ultralytics.com/yolo-vision)
+
+- **🏅️Community contribution**
+  - `Event link`: [Yes, PP-YOLOE! Algorithm development based on PP-YOLOE](https://github.com/PaddlePaddle/PaddleDetection/issues/7345)
+
+
+## 🍱Installation
+
+Follow the [installation guide](docs/tutorials/INSTALL_cn.md) to install PaddleDetection.
+
+## 🔥Tutorials
+
+**Deep learning beginner tutorials**
+
+- [Deep learning from scratch](https://www.paddlepaddle.org.cn/tutorials/projectdetail/4676538)
+- [Object detection from scratch](https://aistudio.baidu.com/aistudio/education/group/info/1617)
+
+**Quick start**
+
+- [Quick experience](docs/tutorials/QUICK_STARTED_cn.md)
+- [Example: develop a traffic sign detection model in 30 minutes](docs/tutorials/GETTING_STARTED_cn.md)
+
+**Data preparation**
+- [Data preparation](docs/tutorials/data/README.md)
+- [Data processing module](docs/advanced_tutorials/READER.md)
+
+**Configuration files**
+- [RCNN parameter description](docs/tutorials/config_annotation/faster_rcnn_r50_fpn_1x_coco_annotation.md)
+- [PP-YOLO parameter description](docs/tutorials/config_annotation/ppyolo_r50vd_dcn_1x_coco_annotation.md)
+
+**Model development**
+
+- [Adding a new detection model](docs/advanced_tutorials/MODEL_TECHNICAL.md)
+- Secondary development
+  - [Object detection](docs/advanced_tutorials/customization/detection.md)
+  - [Keypoint detection](docs/advanced_tutorials/customization/keypoint_detection.md)
+  - [Multi-object tracking](docs/advanced_tutorials/customization/pphuman_mot.md)
+  - [Action recognition](docs/advanced_tutorials/customization/action_recognotion/)
+  - [Attribute recognition](docs/advanced_tutorials/customization/pphuman_attribute.md)
+
+**Deployment and inference**
+
+- [Model export tutorial](deploy/EXPORT_MODEL.md)
+- [Model compression](https://github.com/PaddlePaddle/PaddleSlim)
+  - [Pruning/quantization/distillation tutorial](configs/slim)
+- [Paddle Inference deployment](deploy/README.md)
+  - [Python inference deployment](deploy/python)
+  - [C++ inference deployment](deploy/cpp)
+- [Paddle Lite deployment](deploy/lite)
+- [Paddle Serving deployment](deploy/serving)
+- [ONNX model export](deploy/EXPORT_ONNX_MODEL.md)
+- [Inference benchmark](deploy/BENCHMARK_INFER.md)
+
+## 🔑FAQ
+- [FAQ](docs/tutorials/FAQ)
+
+## 🧩Module Components
+
+- Backbones
+- Necks
+- Loss
+- Common
+- Data Augmentation
+- Post-processing
+- Training
+
+## 📱Model Zoo
+
+- 2D Detection
+- Multi Object Tracking
+- KeyPoint Detection
+- Instance Segmentation
+- Face Detection
+- Semi-Supervised Detection
+- 3D Detection
+- Vehicle Analysis Toolbox
+- Human Analysis Toolbox
+- Sport Analysis Toolbox
+- Others
+
+## ⚖️Model Performance Comparison
+
+#### 🖥️Server-Side Model Performance
+
+Comparison of accuracy (COCO mAP) and prediction speed (FPS on a single Tesla V100) for representative models of each architecture and backbone.
+
+**Test notes:**
+
+- ViT refers to the ViT-Cascade-Faster-RCNN model, reaching 55.7% mAP on COCO
+- Cascade-Faster-RCNN refers to Cascade-Faster-RCNN-ResNet50vd-DCN, which PaddleDetection optimized to 47.8% COCO mAP at 20 FPS inference
+- PP-YOLOE is a further optimization of PP-YOLOv2; the L version reaches 51.6% mAP on COCO at 78.1 FPS on Tesla V100
+- PP-YOLOE+ is a further optimization of PP-YOLOE; the L version reaches 53.3% mAP on COCO at 78.1 FPS on Tesla V100
+- YOLOX and YOLOv5 are algorithms reproduced in PaddleDetection; the YOLOv5 code lives in [PaddleYOLO](https://github.com/PaddlePaddle/PaddleYOLO), see [PaddleYOLO_MODEL](docs/feature_models/PaddleYOLO_MODEL.md)
+- All models in the chart are available in the [📱Model Zoo](#模型库)
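The mAP figures throughout this section score detections by their IoU overlap with ground-truth boxes. As a rough, self-contained illustration of the IoU computation that underlies these metrics (not PaddleDetection's actual COCO evaluator):

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two axis-aligned boxes in (x1, y1, x2, y2) form."""
    # Intersection rectangle (empty if the boxes do not overlap)
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    if inter == 0.0:
        return 0.0
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# A detection counts as a true positive at threshold t when iou >= t;
# COCO-style mAP averages precision over thresholds 0.50, 0.55, ..., 0.95.
print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175 ≈ 0.1429
```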
+
+
+#### ⌚️Mobile Model Performance
+
+Comparison of accuracy (COCO mAP) and prediction speed (FPS on a Qualcomm Snapdragon 865) for mobile models.
+
+**Test notes:**
+
+- All tests use a Qualcomm Snapdragon 865 (4xA77 + 4xA55) processor, batch size 1, 4 threads, with the NCNN inference library; test scripts: [MobileDetBenchmark](https://github.com/JiweiMaster/MobileDetBenchmark)
+- PP-PicoDet and PP-YOLO-Tiny are PaddleDetection's in-house models, available in the [📱Model Zoo](#模型库); the remaining models are not yet provided by PaddleDetection
+
+## 🎗️Industry-Featured Models & Tools
+
+Industry-featured models and tools are models and toolboxes that PaddleDetection builds for high-frequency industrial scenarios, balancing accuracy and speed. They emphasize the end-to-end pipeline from data processing through model training and tuning to deployment, and ship with practice example code from real production environments, helping developers with similar needs bring products to deployment efficiently.
+
+All models and tools in this series are named with the PP prefix. Their descriptions, pre-trained models, and industrial practice example code follow.
+
+### 💎PP-YOLOE: High-Accuracy Object Detection
+
+**Introduction**
+
+PP-YOLOE is an excellent single-stage anchor-free model built on PP-YOLOv2, surpassing many popular YOLO models. PP-YOLOE avoids special operators such as Deformable Convolution and Matrix NMS so that it deploys easily on a wide range of hardware. It is pre-trained on the large-scale Objects365 dataset, so it converges quickly when fine-tuned on datasets for different scenarios.
+
+`Quick link`: [PP-YOLOE documentation](configs/ppyoloe/README_cn.md)
+
+`Quick link`: [arXiv paper](https://arxiv.org/abs/2203.16250)
+
+**Pre-trained models**
+
+| Model | COCO mAP | V100 TensorRT FP16 (FPS) | Recommended hardware | Config | Download |
+| :---------- | :-------------: | :-------------------------: | :----------: | :-----------------------------------------------------: | :-------------------------------------------------------------------------------------: |
+| PP-YOLOE+_l | 53.3 | 149.2 | Server | [config](configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml) | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_m_80e_coco.pdparams) |
+
+`Quick link`: [all pre-trained models](configs/ppyoloe/README_cn.md)
+
+**Industrial application examples**
+
+| Industry | Category | Highlights | Documentation | Download |
+| ---- | ----------------- | --------------------------------------------------------------------------------------------- | ------------------------------------------------------------- | --------------------------------------------------- |
+| Agriculture | Crop detection | Image-based monitoring and field robotics in viticulture, with field instances of 5 grape varieties | [PP-YOLOE+ downstream tasks](./configs/ppyoloe/application/README.md) | [download](./configs/ppyoloe/application/README.md) |
+| General | Low-light detection | Uses the ExDark low-light dataset, with images under 10 lighting conditions from extreme low light to twilight | [PP-YOLOE+ downstream tasks](./configs/ppyoloe/application/README.md) | [download](./configs/ppyoloe/application/README.md) |
+| Industrial | PCB defect detection | Uses the PKU-Market-PCB industrial dataset for printed circuit board (PCB) defect detection, covering 6 common defect types | [PP-YOLOE+ downstream tasks](./configs/ppyoloe/application/README.md) | [download](./configs/ppyoloe/application/README.md) |
+
+### 💎PP-YOLOE-R: High-Performance Rotated Object Detection
+
+**Introduction**
+
+PP-YOLOE-R is an efficient single-stage anchor-free rotated-box detection model that builds on PP-YOLOE+ with a series of improvements to boost accuracy. To match different accuracy/speed requirements across hardware, PP-YOLOE-R comes in s/m/l/x sizes. On the DOTA 1.0 dataset, PP-YOLOE-R-l and PP-YOLOE-R-x reach 78.14 mAP and 78.28 mAP with single-scale training and testing, surpassing almost all rotated-box detectors under single-scale evaluation. With multi-scale training and testing, their accuracy further improves to 80.02 mAP and 80.73 mAP, exceeding all anchor-free methods and nearly matching the best anchor-based two-stage models. While keeping high accuracy, PP-YOLOE-R avoids special operators such as Deformable Convolution and Rotated RoI Align, so it deploys easily on diverse hardware.
+
+`Quick link`: [PP-YOLOE-R documentation](configs/rotate/ppyoloe_r)
+
+`Quick link`: [arXiv paper](https://arxiv.org/abs/2211.02386)
+
+**Pre-trained models**
+
+| Model | Backbone | mAP | V100 TRT FP16 (FPS) | RTX 2080 Ti TRT FP16 (FPS) | Params (M) | FLOPs (G) | LR schedule | Angle | Augmentation | GPUs | Images/GPU | Download | Config |
+| :----------: | :------: | :---: | :-----------------: | :------------------------: | :--------: | :-------: | :--------: | :------: | :------: | :-----: | :-----------: | :---------------------------------------------------------------------------------: | :----------------------------------------------------------------------------------------------------------------------------: |
+| PP-YOLOE-R-l | CRN-l | 80.02 | 69.7 | 48.3 | 53.29 | 281.65 | 3x | oc | MS+RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota_ms.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota_ms.yml) |
+
+`Quick link`: [all pre-trained models](configs/rotate/ppyoloe_r)
+
+**Industrial application examples**
+
+| Industry | Category | Highlights | Documentation | Download |
+| ---- | ---------- | --------------------------------------------------------------------- | --------------------------------------------------------------------------------------- | --------------------------------------------------------------------- |
+| General | Rotated-box detection | A hands-on PP-YOLOE-R tutorial: train the spine dataset to 95 mAP in 10 minutes | [Rotated-box detection with PP-YOLOE-R](https://aistudio.baidu.com/aistudio/projectdetail/5058293) | [download](https://aistudio.baidu.com/aistudio/projectdetail/5058293) |
+
+### 💎PP-YOLOE-SOD: High-Accuracy Small Object Detection
+
+**Introduction**
+
+PP-YOLOE-SOD (Small Object Detection) is the PaddleDetection team's solution for small object detection, reaching 38.5 mAP with a single model on the VisDrone-DET dataset, a SOTA result. It provides two schemes: a slice-and-stitch pipeline optimized for sliced images, and an algorithm-optimization scheme on original images. It also ships a dataset analysis script: given only the annotation file, it reports dataset statistics to help judge whether the dataset is dominated by small objects and whether a slicing strategy is needed, and suggests reference hyperparameters.
+
+`Quick link`: [PP-YOLOE-SOD small object detection](configs/smalldet)
+
+**Pre-trained models**
+
+- VisDrone pre-trained models
+
+| Model | COCOAPI mAP<sup>val</sup><br>0.5:0.95 | COCOAPI mAP<sup>val</sup><br>0.5 | COCOAPI mAP<sup>test_dev</sup><br>0.5:0.95 | COCOAPI mAP<sup>test_dev</sup><br>0.5 | MatlabAPI mAP<sup>test_dev</sup><br>0.5:0.95 | MatlabAPI mAP<sup>test_dev</sup><br>0.5 | Download | Config |
+| :------------------ | :-----------------------------: | :------------------------: | :----------------------------------: | :-----------------------------: | :------------------------------------: | :-------------------------------: | :---------------------------------------------------------------------------------------------: | :----------------------------------------------------------: |
+| **PP-YOLOE+_SOD-l** | **31.9** | **52.1** | **25.6** | **43.5** | **30.25** | **51.18** | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_sod_crn_l_80e_visdrone.pdparams) | [config](visdrone/ppyoloe_plus_sod_crn_l_80e_visdrone.yml) |
+
+`Quick link`: [all pre-trained models](configs/smalldet)
+
+**Industrial application examples**
+
+| Industry | Category | Highlights | Documentation | Download |
+| ---- | ---------- | ---------------------------------------------------- | ------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------- |
+| General | Small object detection | End-to-end walkthrough of drone aerial image detection with PP-YOLOE-SOD | [Drone aerial image detection with PP-YOLOE-SOD](https://aistudio.baidu.com/aistudio/projectdetail/5036782) | [download](https://aistudio.baidu.com/aistudio/projectdetail/5036782) |
+
+### 💫PP-PicoDet: Ultra-Lightweight Real-Time Object Detection
+
+**Introduction**
+
+PP-PicoDet is a new family of lightweight models with outstanding performance on mobile devices, setting a new SOTA for lightweight detection.
+
+`Quick link`: [PP-PicoDet documentation](configs/picodet/README.md)
+
+`Quick link`: [arXiv paper](https://arxiv.org/abs/2111.00902)
+
+**Pre-trained models**
+
+| Model | COCO mAP | Snapdragon 865 4-thread speed (FPS) | Recommended hardware | Config | Download |
+| :-------- | :-------------: | :---------------------: | :------------: | :--------------------------------------------------: | :----------------------------------------------------------------------------------: |
+| PicoDet-L | 36.1 | 39.7 | Mobile, embedded | [config](configs/picodet/picodet_l_320_coco_lcnet.yml) | [download](https://paddledet.bj.bcebos.com/models/picodet_l_320_coco_lcnet.pdparams) |
+
+`Quick link`: [all pre-trained models](configs/picodet/README.md)
+
+**Industrial application examples**
+
+| Industry | Category | Highlights | Documentation | Download |
+| -------- | ------------ | ------------------------------------------------------------------------------------------------------------------------------ | ----------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------- |
+| Smart city | Road litter detection | Cameras mounted on sanitation vehicles detect and analyze litter on the road, monitoring spills, recording them, and notifying sanitation staff for cleanup, greatly improving efficiency | [Road litter detection with PP-PicoDet](https://aistudio.baidu.com/aistudio/projectdetail/3846170?channelType=0&channel=0) | [download](https://aistudio.baidu.com/aistudio/projectdetail/3846170?channelType=0&channel=0) |
+
+### 📡PP-Tracking: Real-Time Multi-Object Tracking System
+
+**Introduction**
+
+PP-Tracking is the industry's first open-source real-time multi-object tracking system based on the PaddlePaddle deep learning framework, with three key strengths: rich models, broad applications, and efficient deployment. PP-Tracking supports single-camera tracking (MOT) and multi-target multi-camera tracking (MTMCT). Addressing real-world pain points, it provides pedestrian tracking, vehicle tracking, multi-class tracking, small object tracking, traffic counting, and cross-camera tracking, with deployment via API calls or a GUI, in Python or C++, on platforms including Linux and NVIDIA Jetson.
+
+`Quick link`: [PP-Tracking documentation](configs/mot/README.md)
+
+**Pre-trained models**
+
+| Model | Description | Accuracy | Speed (FPS) | Recommended hardware | Config | Download |
+| :-------- | :----------------------------------: | :--------------------: | :-------: | :--------------------: | :--------------------------------------------------------: | :------------------------------------------------------------------------------------------------: |
+| ByteTrack | SDE tracker, detection model only | MOT-17 test: 78.4 | - | Server, mobile, embedded | [config](configs/mot/bytetrack/bytetrack_yolox.yml) | [download](https://bj.bcebos.com/v1/paddledet/models/mot/yolox_x_24e_800x1440_mix_det.pdparams) |
+| FairMOT | JDE tracker, multi-task joint learning | MOT-16 test: 75.0 | - | Server, mobile, embedded | [config](configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml) | [download](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608.pdparams) |
+| OC-SORT | SDE tracker, detection model only | MOT-17 half val: 75.5 | - | Server, mobile, embedded | [config](configs/mot/ocsort/ocsort_yolox.yml) | [download](https://bj.bcebos.com/v1/paddledet/models/mot/yolox_x_24e_800x1440_mix_mot_ch.pdparams) |
+
+**Industrial application examples**
+
+| Industry | Category | Highlights | Documentation | Download |
+| ---- | ---------- | -------------------------- | ---------------------------------------------------------------------------------------------- | --------------------------------------------------------------------- |
+| General | Multi-object tracking | Quick start for single- and multi-camera tracking | [Hands-on multi-object tracking with PP-Tracking](https://aistudio.baidu.com/aistudio/projectdetail/3022582) | [download](https://aistudio.baidu.com/aistudio/projectdetail/3022582) |
+
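SDE trackers such as ByteTrack and OC-SORT in the table above associate per-frame detections with existing tracks. At its simplest, association greedily pairs detections and tracks by box overlap; the toy sketch below shows the idea (it is not PP-Tracking's actual matcher, which adds Kalman-filter motion prediction and score-aware matching cascades):

```python
def iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    if inter == 0.0:
        return 0.0
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def greedy_match(tracks, detections, thresh=0.3):
    """Greedily pair track boxes with detection boxes by descending IoU."""
    pairs = sorted(((iou(t, d), ti, di)
                    for ti, t in enumerate(tracks)
                    for di, d in enumerate(detections)),
                   reverse=True)
    matches, used_t, used_d = [], set(), set()
    for score, ti, di in pairs:
        if score < thresh:
            break
        if ti not in used_t and di not in used_d:
            matches.append((ti, di))
            used_t.add(ti)
            used_d.add(di)
    return matches  # unmatched detections would spawn new tracks

tracks = [(0, 0, 10, 10), (50, 50, 60, 60)]
dets = [(49, 51, 59, 61), (1, 0, 11, 10)]
print(greedy_match(tracks, dets))  # [(0, 1), (1, 0)]
```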
+
+### ⛷️PP-TinyPose: Human Keypoint Detection
+
+**Introduction**
+
+The keypoint detection part of PaddleDetection follows state-of-the-art algorithms, covering both Top-Down and Bottom-Up approaches to meet different user needs. PaddleDetection also provides PP-TinyPose, a self-developed real-time keypoint detection model optimized for mobile devices.
+
+`Quick link`: [PP-TinyPose documentation](configs/keypoint/tiny_pose)
+
+**Pre-trained models**
+
+| Model | Description | COCO AP | Speed (FPS) | Recommended hardware | Config | Download |
+| :---------: | :----------------------------------: | :------------: | :-----------------------: | :------------: | :-----------------------------------------------------: | :--------------------------------------------------------------------------------------: |
+| PP-TinyPose | Lightweight keypoint model, 256x192 input | 68.8 | Snapdragon 865, 4 threads: 158.7 FPS | Mobile, embedded | [config](configs/keypoint/tiny_pose/tinypose_256x192.yml) | [download](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_256x192.pdparams) |
+
+`Quick link`: [all pre-trained models](configs/keypoint/README.md)
+
+**Industrial application examples**
+
+| Industry | Category | Highlights | Documentation | Download |
+| ---- | ---- | ---------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------- |
+| Sports | Fitness | A reusable end-to-end pipeline from model selection, data preparation, and training optimization to post-processing logic and deployment, enabling efficient recognition of complex fitness movements for an AI virtual fitness coach | [Smart fitness action recognition with enhanced PP-TinyPose](https://aistudio.baidu.com/aistudio/projectdetail/4385813) | [download](https://aistudio.baidu.com/aistudio/projectdetail/4385813) |
+
+### 🏃🏻PP-Human: Real-Time Pedestrian Analysis Tool
+
+**Introduction**
+
+PaddleDetection explores high-frequency scenarios in core industries with an out-of-the-box pedestrian analysis tool. It accepts images, single-camera video, multi-camera video, and online video streams, and is widely used in smart transportation, smart cities, and industrial inspection. It supports server-side deployment with TensorRT acceleration and runs in real time on a T4 server.
+PP-Human offers four industrial-grade capabilities: recognition of five abnormal behaviors, analysis of 26 human attributes, real-time crowd counting, and cross-camera (ReID) tracking.
+
+`Quick link`: [PP-Human pedestrian analysis guide](deploy/pipeline/README.md)
+
+**Pre-trained models**
+
+| Task | T4 TensorRT FP16 speed (FPS) | Recommended hardware | Download | Model size |
+| :----------------: | :---------------------------: | :----------: | :--: | :--: |
+| Pedestrian detection (high accuracy) | 39.8 | Server | [object detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) | 182M |
+| Pedestrian tracking (high accuracy) | 31.4 | Server | [multi-object tracking](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) | 182M |
+| Attribute recognition (high accuracy) | 117.6 per person | Server | [object detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip)<br>[attribute recognition](https://bj.bcebos.com/v1/paddledet/models/pipeline/PPHGNet_small_person_attribute_954_infer.zip) | detection: 182M<br>attributes: 86M |
+| Fall detection | 100 per person | Server | [multi-object tracking](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip)<br>[keypoint detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/dark_hrnet_w32_256x192.zip)<br>[keypoint-based action recognition](https://bj.bcebos.com/v1/paddledet/models/pipeline/STGCN.zip) | tracking: 182M<br>keypoints: 101M<br>action recognition: 21.8M |
+| Intrusion detection | 31.4 | Server | [multi-object tracking](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) | 182M |
+| Fight recognition | 50.8 | Server | [video classification](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) | 90M |
+| Smoking recognition | 340.1 | Server | [object detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip)<br>[ID-based object detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/ppyoloe_crn_s_80e_smoking_visdrone.zip) | detection: 182M<br>ID-based detection: 27M |
+| Phone-call recognition | 166.7 | Server | [object detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip)<br>[ID-based image classification](https://bj.bcebos.com/v1/paddledet/models/pipeline/PPHGNet_tiny_calling_halfbody.zip) | detection: 182M<br>ID-based classification: 45M |
+
+`Quick link`: [all pre-trained models](deploy/pipeline/README.md)
+
+**Industrial application examples**
+
+| Industry | Category | Highlights | Documentation | Download |
+| -------- | -------- | ---------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------ | ---------------------------------------------------------------------------------------- |
+| Smart security | Fall detection | PP-Human's fall recognition combines keypoints with a spatio-temporal graph convolutional network, with no constraints on fall posture or background | [Fall detection with PP-Human v2](https://aistudio.baidu.com/aistudio/projectdetail/4606001) | [download](https://aistudio.baidu.com/aistudio/projectdetail/4606001) |
+| Smart security | Fight recognition | A fight recognition model trained with the PaddleVideo toolkit and integrated into PP-Human for pedestrian behavior analysis | [Fight recognition with PP-Human](https://aistudio.baidu.com/aistudio/projectdetail/4086987?contributionType=1) | [download](https://aistudio.baidu.com/aistudio/projectdetail/4086987?contributionType=1) |
+| Smart security | Visitor analysis | An end-to-end visitor analysis workflow with PP-Human covering two common scenarios: 1. visitor attribute recognition (single- and cross-camera visualization); 2. visitor behavior recognition (fall detection) | [Visitor analysis tutorial with PP-Human](https://aistudio.baidu.com/aistudio/projectdetail/4537344) | [download](https://aistudio.baidu.com/aistudio/projectdetail/4537344) |
+
+### 🏎️PP-Vehicle: Real-Time Vehicle Analysis Tool
+
+**Introduction**
+
+PaddleDetection explores high-frequency scenarios in core industries with an out-of-the-box vehicle analysis tool. It accepts images, single-camera video, multi-camera video, and online video streams, and is widely used in smart transportation, smart cities, and industrial inspection. It supports server-side deployment with TensorRT acceleration and runs in real time on a T4 server.
+PP-Vehicle covers four core functions for traffic scenarios: license plate recognition, attribute recognition, traffic flow statistics, and violation detection.
+
+`Quick link`: [PP-Vehicle vehicle analysis guide](deploy/pipeline/README.md)
+
+**Pre-trained models**
+
+| Task | T4 TensorRT FP16 speed (FPS) | Recommended hardware | Models | Model size |
+| :----------------: | :-------------------------: | :----------: | :--: | :-------------------------------------: |
+| Vehicle detection (high accuracy) | 38.9 | Server | [object detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_ppvehicle.zip) | 182M |
+| Vehicle tracking (high accuracy) | 25 | Server | [multi-object tracking](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_ppvehicle.zip) | 182M |
+| License plate recognition | 213.7 | Server | [plate detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/ch_PP-OCRv3_det_infer.tar.gz)<br>[plate recognition](https://bj.bcebos.com/v1/paddledet/models/pipeline/ch_PP-OCRv3_rec_infer.tar.gz) | plate detection: 3.9M<br>plate character recognition: 12M |
+| Vehicle attributes | 136.8 | Server | [attribute recognition](https://bj.bcebos.com/v1/paddledet/models/pipeline/vehicle_attribute_model.zip) | 7.2M |
+
+`Quick link`: [all pre-trained models](deploy/pipeline/README.md)
+
+**Industrial application examples**
+
+| Industry | Category | Highlights | Documentation | Download |
+| -------- | ---------------- | ------------------------------------------------------------------------------------------------------------------ | --------------------------------------------------------------------------------------------- | --------------------------------------------------------------------- |
+| Smart transportation | Traffic monitoring vehicle analysis | Demonstrates the three most-needed traffic scenarios with PP-Vehicle: traffic flow monitoring, illegal parking detection, and structured vehicle analysis (plate, model, color) | [Traffic monitoring and analysis with PP-Vehicle](https://aistudio.baidu.com/aistudio/projectdetail/4512254) | [download](https://aistudio.baidu.com/aistudio/projectdetail/4512254) |
+
+
+## 💡Industrial Practice Examples
+
+Industrial practice examples are end-to-end development walkthroughs that PaddleDetection provides for high-frequency object detection scenarios, helping developers connect the full workflow of data annotation, model training, model tuning, and inference deployment.
+Each example comes with project code and documentation on [AI-Studio](https://ai.baidu.com/ai-doc/AISTUDIO/Tk39ty6ho) that users can run directly.
+
+`Quick link`: [full list of industrial practice examples](industrial_tutorial/README.md)
+
+- [Rotated-box detection with PP-YOLOE-R](https://aistudio.baidu.com/aistudio/projectdetail/5058293)
+- [Drone aerial image detection with PP-YOLOE-SOD](https://aistudio.baidu.com/aistudio/projectdetail/5036782)
+- [Traffic monitoring and analysis with PP-Vehicle](https://aistudio.baidu.com/aistudio/projectdetail/4512254)
+- [Fall detection with PP-Human v2](https://aistudio.baidu.com/aistudio/projectdetail/4606001)
+- [Smart fitness action recognition with enhanced PP-TinyPose](https://aistudio.baidu.com/aistudio/projectdetail/4385813)
+- [Fight recognition with PP-Human](https://aistudio.baidu.com/aistudio/projectdetail/4086987?contributionType=1)
+- [Tile surface defect detection with Faster-RCNN](https://aistudio.baidu.com/aistudio/projectdetail/2571419)
+- [PCB defect detection with PaddleDetection](https://aistudio.baidu.com/aistudio/projectdetail/2367089)
+- [Crowd counting with FairMOT](https://aistudio.baidu.com/aistudio/projectdetail/2421822)
+- [Fall detection with YOLOv3](https://aistudio.baidu.com/aistudio/projectdetail/2500639)
+- [Road litter detection with PP-PicoDet v2](https://aistudio.baidu.com/aistudio/projectdetail/3846170?channelType=0&channel=0)
+- [Compliance checking based on human keypoint detection](https://aistudio.baidu.com/aistudio/projectdetail/4061642?contributionType=1)
+- [Visitor analysis tutorial with PP-Human](https://aistudio.baidu.com/aistudio/projectdetail/4537344)
+- More coming soon...
+
+## 🏆Enterprise Application Cases
+
+Enterprise application cases describe how companies apply PaddleDetection in real production environments. Compared with the industrial practice examples, they emphasize overall solution design, which developers can use as a reference when designing their own projects.
+
+`Quick link`: [full list of enterprise application cases](https://www.paddlepaddle.org.cn/customercase)
+
+- [中国南方电网 (China Southern Power Grid): smart substation inspection](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2330)
+- [国铁电气: online smart rail inspection system](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2280)
+- [京东物流 (JD Logistics): vehicle behavior recognition in logistics parks](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2611)
+- [中兴克拉: monitoring of traditional meters in factories](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2618)
+- [宁德时代 (CATL): high-precision quality inspection of power batteries](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2609)
+- [中国科学院空天信息创新研究院: remote sensing monitoring of golf courses](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2483)
+- [御航智能: edge-based smart drone inspection](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2481)
+- [普宙无人机: high-precision forest inspection by drone](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2121)
+- [领邦智能: contactless infrared temperature monitoring](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2615)
+- [北京地铁 (Beijing Subway): mask detection](https://mp.weixin.qq.com/s/znrqaJmtA7CcjG0yQESWig)
+- [音智达: detection of staff rule violations in factories](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2288)
+- [华夏天信: robotic smart inspection of coal conveyor belts](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2331)
+- [优恩物联网: resident classification for targeted advertising](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2485)
+- [螳螂慧视: indoor 3D point cloud object segmentation and detection](https://www.paddlepaddle.org.cn/support/news?action=detail&id=2599)
+- More coming soon...
+
+## 📝License
+
+This project is released under the [Apache 2.0 license](LICENSE).
+
+
+## 📌Citation
+
+```
+@misc{ppdet2019,
+title={PaddleDetection, Object detection and instance segmentation toolkit based on PaddlePaddle.},
+author={PaddlePaddle Authors},
+howpublished = {\url{https://github.com/PaddlePaddle/PaddleDetection}},
+year={2019}
+}
+```
diff --git a/PaddleDetection-release-2.6/README_en.md b/PaddleDetection-release-2.6/README_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..c45b100e1e2e7fef7e979a10f16ad52d786351ae
--- /dev/null
+++ b/PaddleDetection-release-2.6/README_en.md
@@ -0,0 +1,541 @@
+[简体中文](README_cn.md) | English
+
+
+
+
+
+
+**A High-Efficient Development Toolkit for Object Detection based on [PaddlePaddle](https://github.com/paddlepaddle/paddle)**
+
+
+
+
+
+
+
+
+
+
+
+

+
+
+
+## Product Update
+
+- 🔥 **2022.11.15: SOTA rotated object detector and small object detector based on PP-YOLOE**
+  - Rotated object detector [PP-YOLOE-R](configs/rotate/ppyoloe_r)
+    - SOTA anchor-free rotated object detection model with high accuracy and efficiency
+    - A series of models, named s/m/l/x, for cloud and edge devices
+    - Avoids special operators for TensorRT-friendly deployment
+  - Small object detector [PP-YOLOE-SOD](configs/smalldet)
+    - End-to-end detection pipeline based on sliced images
+    - SOTA model on VisDrone based on original images
+
+- 2022.8.26: PaddleDetection releases [release/2.5](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.5)
+
+ - 🗳 Model features:
+
+ - Release [PP-YOLOE+](configs/ppyoloe): Increased accuracy by a maximum of 2.4% mAP to 54.9% mAP, 3.75 times faster model training convergence rate, and up to 2.3 times faster end-to-end inference speed; improved generalization for multiple downstream tasks
+ - Release [PicoDet-NPU](configs/picodet) model which supports full quantization deployment of models; add [PicoDet](configs/picodet) layout analysis model
+ - Release [PP-TinyPose Plus](./configs/keypoint/tiny_pose/). With 9.1% AP accuracy improvement in physical exercise, dance, and other scenarios, our PP-TinyPose Plus supports unconventional movements such as turning to one side, lying down, jumping, and high lifts
+
+ - 🔮 Functions in different scenarios
+
+ - Release the pedestrian analysis tool [PP-Human v2](./deploy/pipeline). It introduces four new behavior recognition capabilities: fighting, phone calling, smoking, and trespassing. The underlying algorithm performance is optimized, covering three core algorithm capabilities: detection, tracking, and attribute recognition of pedestrians. Our model provides end-to-end development and model optimization strategies for beginners and supports online video stream input.
+ - First release [PP-Vehicle](./deploy/pipeline), which has four major functions: license plate recognition, vehicle attribute analysis (color, model), traffic flow statistics, and violation detection. It is compatible with input formats, including pictures, online video streaming, and video. And we also offer our users a comprehensive set of tutorials for customization.
+
+ - 💡 Cutting-edge algorithms:
+
+ - Release [PaddleYOLO](https://github.com/PaddlePaddle/PaddleYOLO), which covers classic and latest models of the [YOLO family](https://github.com/PaddlePaddle/PaddleYOLO/tree/develop/docs/MODEL_ZOO_en.md): YOLOv3, PP-YOLOE (a real-time high-precision object detection model developed by Baidu PaddlePaddle), and cutting-edge detection algorithms such as YOLOv4, YOLOv5, YOLOX, YOLOv6, YOLOv7 and YOLOv8
+ - Newly add high precision detection model based on [ViT](configs/vitdet) backbone network, with a 55.7% mAP accuracy on COCO dataset; newly add multi-object tracking model [OC-SORT](configs/mot/ocsort); newly add [ConvNeXt](configs/convnext) backbone network.
+
+ - 📋 Industrial applications: Newly add [Smart Fitness](https://aistudio.baidu.com/aistudio/projectdetail/4385813), [Fighting recognition](https://aistudio.baidu.com/aistudio/projectdetail/4086987?channelType=0&channel=0),[ and Visitor Analysis](https://aistudio.baidu.com/aistudio/projectdetail/4230123?channelType=0&channel=0).
+
+- 2022.3.24: PaddleDetection released [release/2.4](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.4)
+ - Release high-performance SOTA object detection model [PP-YOLOE](configs/ppyoloe). It integrates cloud and edge devices and provides S/M/L/X versions. In particular, version L reaches 51.4% mAP on the COCO test 2017 dataset and 78.1 FPS on a single Tesla V100. It supports mixed-precision training and trains 33% faster than PP-YOLOv2. Its full range of model sizes meets different hardware compute requirements, adapting to server GPUs, edge-device GPUs, and other AI accelerator cards on servers.
+ - Release ultra-lightweight SOTA object detection model [PP-PicoDet Plus](configs/picodet) with 2% improvement in accuracy and 63% improvement in CPU inference speed. Add the PicoDet-XS model with only 0.7M parameters, providing model sparsification and quantization functions for model acceleration. No dedicated post-processing module is required on any hardware, simplifying deployment.
+ - Release the real-time pedestrian analysis tool [PP-Human](deploy/pphuman). It has four major functions: pedestrian tracking, visitor flow statistics, human attribute recognition and fall detection. Fall detection is optimized on real-life data and accurately recognizes various falling postures, adapting to different environmental backgrounds, lighting and camera angles.
+ - Add the [YOLOX](configs/yolox) object detection model with nano/tiny/S/M/L/X sizes; the X version reaches 51.8% mAP on the COCO Val2017 dataset.
+
+- [More releases](https://github.com/PaddlePaddle/PaddleDetection/releases)
+
+## Brief Introduction
+
+**PaddleDetection** is an end-to-end object detection development kit based on PaddlePaddle. Providing **over 30 model algorithms** and **over 300 pre-trained models**, it covers object detection, instance segmentation, keypoint detection, and multi-object tracking. In particular, PaddleDetection offers **high-performance and lightweight** industrial SOTA models on **servers and mobile** devices, champion solutions, and cutting-edge algorithms. PaddleDetection provides various data augmentation methods, configurable network components, loss functions, and other advanced optimization and deployment schemes. In addition to running through the whole process of data processing, model development, training, compression and deployment, PaddlePaddle also provides rich cases and tutorials to accelerate the industrial application of these algorithms.
+
+
+

+
+
+## Features
+
+- **Rich model library**: PaddleDetection provides over 250 pre-trained models covering **object detection, instance segmentation, face detection and multi-object tracking**, including a variety of **global competition champion** schemes.
+- **Simple to use**: Modular design decouples each network component, making it easy for developers to build and try various detection models and optimization strategies and to quickly obtain high-performance, customized algorithms.
+- **End-to-end workflow**: PaddlePaddle covers the whole pipeline from data augmentation and model construction to training, compression and deployment, and supports multi-architecture, multi-device deployment for **cloud and edge** devices.
+- **High performance**: Thanks to its high-performance core, PaddlePaddle has clear advantages in training speed and memory usage, and it supports FP16 training and multi-machine training.
+
+
+

+
+
+## Exchanges
+
+- If you have any question or suggestion, please give us your valuable input via [GitHub Issues](https://github.com/PaddlePaddle/PaddleDetection/issues)
+
+ Welcome to join the PaddleDetection user group on WeChat (scan the QR code, add the assistant and reply "D")
+
+
+

+
+
+## Kit Structure
+
+#### Architectures
+
+**Object Detection**
+
+- Faster RCNN
+- FPN
+- Cascade-RCNN
+- PSS-Det
+- RetinaNet
+- YOLOv3
+- YOLOF
+- YOLOX
+- YOLOv5
+- YOLOv6
+- YOLOv7
+- YOLOv8
+- RTMDet
+- PP-YOLO
+- PP-YOLO-Tiny
+- PP-PicoDet
+- PP-YOLOv2
+- PP-YOLOE
+- PP-YOLOE+
+- PP-YOLOE-SOD
+- PP-YOLOE-R
+- SSD
+- CenterNet
+- FCOS
+- FCOSR
+- TTFNet
+- TOOD
+- GFL
+- GFLv2
+- DETR
+- Deformable DETR
+- Swin Transformer
+- Sparse RCNN
+
+**Instance Segmentation**
+
+- Mask RCNN
+- Cascade Mask RCNN
+- SOLOv2
+
+**Face Detection**
+
+**Multi-Object-Tracking**
+
+- JDE
+- FairMOT
+- DeepSORT
+- ByteTrack
+- OC-SORT
+- BoT-SORT
+- CenterTrack
+
+**KeyPoint-Detection**
+
+- HRNet
+- HigherHRNet
+- Lite-HRNet
+- PP-TinyPose
+
+#### Backbones
+
+- ResNet(&vd)
+- Res2Net(&vd)
+- CSPResNet
+- SENet
+- Res2Net
+- HRNet
+- Lite-HRNet
+- DarkNet
+- CSPDarkNet
+- MobileNetv1/v3
+- ShuffleNet
+- GhostNet
+- BlazeNet
+- DLA
+- HardNet
+- LCNet
+- ESNet
+- Swin-Transformer
+- ConvNeXt
+- Vision Transformer
+
+#### Components
+
+**Common**
+
+- Sync-BN
+- Group Norm
+- DCNv2
+- EMA
+
+**KeyPoint**
+
+**FPN**
+
+- BiFPN
+- CSP-PAN
+- Custom-PAN
+- ES-PAN
+- HRFPN
+
+**Loss**
+
+- Smooth-L1
+- GIoU/DIoU/CIoU
+- IoUAware
+- Focal Loss
+- CT Focal Loss
+- VariFocal Loss
+
+**Post-processing**
+
+**Speed**
+
+- FP16 training
+- Multi-machine training
+
+#### Data Augmentation
+
+- Resize
+- Lighting
+- Flipping
+- Expand
+- Crop
+- Color Distort
+- Random Erasing
+- Mixup
+- AugmentHSV
+- Mosaic
+- Cutmix
+- Grid Mask
+- Auto Augment
+- Random Perspective
+
+
+
+## Model Performance
+
+
+### Performance comparison of cloud models
+
+A comparison of COCO mAP and FPS on Tesla V100 for representative models of each architecture and backbone.
+
+
+

+
+
+**Clarification:**
+
+- `ViT` stands for `ViT-Cascade-Faster-RCNN`, which achieves the highest COCO mAP of 55.7%
+- `Cascade-Faster-RCNN` stands for `Cascade-Faster-RCNN-ResNet50vd-DCN`, which PaddleDetection has optimized to reach 20 FPS inference speed at 47.8% COCO mAP
+- `PP-YOLOE` is an optimized version of `PP-YOLO v2`, reaching 51.4% mAP on COCO and 78.1 FPS on Tesla V100
+- `PP-YOLOE+` is an optimized version of `PP-YOLOE`, reaching 53.3% mAP on COCO and 78.1 FPS on Tesla V100
+- The models in the figure are available in the [model library](#模型库)
+
+
+
+
+### Performance comparison on mobile devices
+
+A comparison of COCO mAP and FPS on the Qualcomm Snapdragon 865 processor for mobile-side models.
+
+
+

+
+
+**Clarification:**
+
+- Tests were conducted on a Qualcomm Snapdragon 865 (4 \*A77 + 4 \*A55) with batch_size=1, 4 threads, and the NCNN inference library; for the test script see [MobileDetBenchmark](https://github.com/JiweiMaster/MobileDetBenchmark)
+- [PP-PicoDet](configs/picodet) and [PP-YOLO-Tiny](configs/ppyolo) are self-developed models of PaddleDetection; the other models have not been tested yet.
+
+
+
+## Model libraries
+
+
+### 1. General detection
+
+#### PP-YOLOE series. Recommended scenarios: cloud GPUs such as NVIDIA V100 and T4, and edge devices such as the Jetson series
+
+| Model | COCO Accuracy(mAP) | V100 TensorRT FP16 Speed(FPS) | Configuration | Download |
+|:---------- |:------------------:|:-----------------------------:|:-------------------------------------------------------:|:----------------------------------------------------------------------------------------:|
+| PP-YOLOE+_s | 43.9 | 333.3 | [link](configs/ppyoloe/ppyoloe_plus_crn_s_80e_coco.yml) | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_s_80e_coco.pdparams) |
+| PP-YOLOE+_m | 50.0 | 208.3 | [link](configs/ppyoloe/ppyoloe_plus_crn_m_80e_coco.yml) | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_m_80e_coco.pdparams) |
+| PP-YOLOE+_l | 53.3 | 149.2 | [link](configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml) | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams) |
+| PP-YOLOE+_x | 54.9 | 95.2 | [link](configs/ppyoloe/ppyoloe_plus_crn_x_80e_coco.yml) | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_x_80e_coco.pdparams) |
+
+#### PP-PicoDet series. Recommended scenarios: mobile chips and x86 CPU devices, such as ARM CPUs (RK3399, Raspberry Pi) and NPUs (BITMAIN)
+
+| Model | COCO Accuracy(mAP) | Snapdragon 865 four-thread speed (ms) | Configuration | Download |
+|:---------- |:------------------:|:-------------------------------------:|:-----------------------------------------------------:|:-------------------------------------------------------------------------------------:|
+| PicoDet-XS | 23.5 | 7.81 | [Link](configs/picodet/picodet_xs_320_coco_lcnet.yml) | [Download](https://paddledet.bj.bcebos.com/models/picodet_xs_320_coco_lcnet.pdparams) |
+| PicoDet-S | 29.1 | 9.56 | [Link](configs/picodet/picodet_s_320_coco_lcnet.yml) | [Download](https://paddledet.bj.bcebos.com/models/picodet_s_320_coco_lcnet.pdparams) |
+| PicoDet-M | 34.4 | 17.68 | [Link](configs/picodet/picodet_m_320_coco_lcnet.yml) | [Download](https://paddledet.bj.bcebos.com/models/picodet_m_320_coco_lcnet.pdparams) |
+| PicoDet-L | 36.1 | 25.21 | [Link](configs/picodet/picodet_l_320_coco_lcnet.yml) | [Download](https://paddledet.bj.bcebos.com/models/picodet_l_320_coco_lcnet.pdparams) |
+
+#### [Frontier detection algorithm](docs/feature_models/PaddleYOLO_MODEL.md)
+
+| Model | COCO Accuracy(mAP) | V100 TensorRT FP16 speed(FPS) | Configuration | Download |
+|:-------- |:------------------:|:-----------------------------:|:--------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------:|
+| [YOLOX-l](configs/yolox) | 50.1 | 107.5 | [Link](configs/yolox/yolox_l_300e_coco.yml) | [Download](https://paddledet.bj.bcebos.com/models/yolox_l_300e_coco.pdparams) |
+| [YOLOv5-l](https://github.com/PaddlePaddle/PaddleYOLO/tree/develop/configs/yolov5) | 48.6 | 136.0 | [Link](https://github.com/PaddlePaddle/PaddleYOLO/tree/develop/configs/yolov5/yolov5_l_300e_coco.yml) | [Download](https://paddledet.bj.bcebos.com/models/yolov5_l_300e_coco.pdparams) |
+| [YOLOv7-l](https://github.com/PaddlePaddle/PaddleYOLO/tree/develop/configs/yolov7) | 51.0 | 135.0 | [Link](https://github.com/PaddlePaddle/PaddleYOLO/tree/develop/configs/yolov7/yolov7_l_300e_coco.yml) | [Download](https://paddledet.bj.bcebos.com/models/yolov7_l_300e_coco.pdparams) |
+
+#### Other general purpose models [doc](docs/MODEL_ZOO_en.md)
+
+
+
+
+### 2. Instance segmentation
+
+| Model | Introduction | Recommended Scenarios | COCO Accuracy(mAP) | Configuration | Download |
+|:----------------- |:-------------------------------------------------------- |:--------------------------------------------- |:--------------------------------:|:-----------------------------------------------------------------------:|:-----------------------------------------------------------------------------------------------------:|
+| Mask RCNN | Two-stage instance segmentation algorithm | Edge-Cloud end | box AP: 41.4 <br> mask AP: 37.5 | [Link](configs/mask_rcnn/mask_rcnn_r50_vd_fpn_2x_coco.yml) | [Download](https://paddledet.bj.bcebos.com/models/mask_rcnn_r50_vd_fpn_2x_coco.pdparams) |
+| Cascade Mask RCNN | Two-stage instance segmentation algorithm | Edge-Cloud end | box AP: 45.7 <br> mask AP: 39.7 | [Link](configs/mask_rcnn/cascade_mask_rcnn_r50_vd_fpn_ssld_2x_coco.yml) | [Download](https://paddledet.bj.bcebos.com/models/cascade_mask_rcnn_r50_vd_fpn_ssld_2x_coco.pdparams) |
+| SOLOv2 | Lightweight single-stage instance segmentation algorithm | Edge-Cloud end | mask AP: 38.0 | [Link](configs/solov2/solov2_r50_fpn_3x_coco.yml) | [Download](https://paddledet.bj.bcebos.com/models/solov2_r50_fpn_3x_coco.pdparams) |
+
+
+
+
+### 3. Keypoint detection
+
+| Model | Introduction | Recommended scenarios | COCO Accuracy(AP) | Speed | Configuration | Download |
+|:-------------------- |:--------------------------------------------------------------------------------------------- |:--------------------------------------------- |:-----------------:|:---------------------------------:|:---------------------------------------------------------:|:-------------------------------------------------------------------------------------------:|
+| HRNet-w32 + DarkPose | Top-down keypoint detection algorithm <br> Input size: 384x288 | Edge-Cloud end | 78.3 | T4 TensorRT FP16 2.96ms | [Link](configs/keypoint/hrnet/dark_hrnet_w32_384x288.yml) | [Download](https://paddledet.bj.bcebos.com/models/keypoint/dark_hrnet_w32_384x288.pdparams) |
+| HRNet-w32 + DarkPose | Top-down keypoint detection algorithm <br> Input size: 256x192 | Edge-Cloud end | 78.0 | T4 TensorRT FP16 1.75ms | [Link](configs/keypoint/hrnet/dark_hrnet_w32_256x192.yml) | [Download](https://paddledet.bj.bcebos.com/models/keypoint/dark_hrnet_w32_256x192.pdparams) |
+| PP-TinyPose | Lightweight keypoint algorithm <br> Input size: 256x192 | Mobile | 68.8 | Snapdragon 865 four-thread 6.30ms | [Link](configs/keypoint/tiny_pose/tinypose_256x192.yml) | [Download](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_256x192.pdparams) |
+| PP-TinyPose | Lightweight keypoint algorithm <br> Input size: 128x96 | Mobile | 58.1 | Snapdragon 865 four-thread 2.37ms | [Link](configs/keypoint/tiny_pose/tinypose_128x96.yml) | [Download](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_128x96.pdparams) |
+
+#### Other keypoint detection models [doc](configs/keypoint)
+
+
+
+
+### 4. Multi-object tracking: PP-Tracking
+
+| Model | Introduction | Recommended scenarios | Accuracy | Configuration | Download |
+|:--------- |:------------------------------------------------------------- |:--------------------- |:----------------------:|:-----------------------------------------------------------------------:|:-----------------------------------------------------------------------------------------------------:|
+| ByteTrack | SDE Multi-object tracking algorithm with detection model only | Edge-Cloud end | MOT-17 half val: 77.3 | [Link](configs/mot/bytetrack/detector/yolox_x_24e_800x1440_mix_det.yml) | [Download](https://paddledet.bj.bcebos.com/models/mot/deepsort/yolox_x_24e_800x1440_mix_det.pdparams) |
+| FairMOT | JDE multi-object tracking algorithm with multi-task learning | Edge-Cloud end | MOT-16 test: 75.0 | [Link](configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml) | [Download](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608.pdparams) |
+| OC-SORT | SDE multi-object tracking algorithm with detection model only | Edge-Cloud end | MOT-16 half val: 75.5 | [Link](configs/mot/ocsort/ocsort_yolox.yml) | - |
+
+#### Other multi-object tracking models [docs](configs/mot)
+
+
+
+
+### 5. Industrial real-time pedestrian analysis tool: PP-Human
+
+| Task | End-to-End Speed(ms) | Model | Size |
+|:--------------------------------------:|:--------------------:|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------:|
+| Pedestrian detection (high precision) | 25.1ms | [Multi-object tracking](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) | 182M |
+| Pedestrian detection (lightweight) | 16.2ms | [Multi-object tracking](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_s_36e_pipeline.zip) | 27M |
+| Pedestrian tracking (high precision) | 31.8ms | [Multi-object tracking](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) | 182M |
+| Pedestrian tracking (lightweight) | 21.0ms | [Multi-object tracking](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_s_36e_pipeline.zip) | 27M |
+| Attribute recognition (high precision) | Single person 8.5ms | [Object detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) <br> [Attribute recognition](https://bj.bcebos.com/v1/paddledet/models/pipeline/strongbaseline_r50_30e_pa100k.zip) | Object detection: 182M <br> Attribute recognition: 86M |
+| Attribute recognition (lightweight) | Single person 7.1ms | [Object detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) <br> [Attribute recognition](https://bj.bcebos.com/v1/paddledet/models/pipeline/strongbaseline_r50_30e_pa100k.zip) | Object detection: 182M <br> Attribute recognition: 86M |
+| Falling detection | Single person 10ms | [Multi-object tracking](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) <br> [Keypoint detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/dark_hrnet_w32_256x192.zip) <br> [Behavior detection based on key points](https://bj.bcebos.com/v1/paddledet/models/pipeline/STGCN.zip) | Multi-object tracking: 182M <br> Keypoint detection: 101M <br> Behavior detection based on key points: 21.8M |
+| Intrusion detection | 31.8ms | [Multi-object tracking](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) | 182M |
+| Fighting detection | 19.7ms | [Video classification](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) | 90M |
+| Smoking detection | Single person 15.1ms | [Object detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) <br> [Object detection based on Human ID](https://bj.bcebos.com/v1/paddledet/models/pipeline/ppyoloe_crn_s_80e_smoking_visdrone.zip) | Object detection: 182M <br> Object detection based on Human ID: 27M |
+| Phoning detection | Single person ms | [Object detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip) <br> [Image classification based on Human ID](https://bj.bcebos.com/v1/paddledet/models/pipeline/PPHGNet_tiny_calling_halfbody.zip) | Object detection: 182M <br> Image classification based on Human ID: 45M |
+
+Please refer to [docs](deploy/pipeline/README_en.md) for details.
+
+
+
+
+### 6. Industrial real-time vehicle analysis tool: PP-Vehicle
+
+| Task | End-to-End Speed(ms) | Model | Size |
+|:--------------------------------------:|:--------------------:|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------:|
+| Vehicle detection (high precision) | 25.7ms | [object detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_ppvehicle.zip) | 182M |
+| Vehicle detection (lightweight) | 13.2ms | [object detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_s_36e_ppvehicle.zip) | 27M |
+| Vehicle tracking (high precision) | 40ms | [multi-object tracking](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_ppvehicle.zip) | 182M |
+| Vehicle tracking (lightweight) | 25ms | [multi-object tracking](https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_s_36e_pipeline.zip) | 27M |
+| Plate recognition | 4.68ms | [Plate detection](https://bj.bcebos.com/v1/paddledet/models/pipeline/ch_PP-OCRv3_det_infer.tar.gz) <br> [Plate recognition](https://bj.bcebos.com/v1/paddledet/models/pipeline/ch_PP-OCRv3_rec_infer.tar.gz) | Plate detection: 3.9M <br> Plate recognition: 12M |
+| Vehicle attribute | 7.31ms | [attribute recognition](https://bj.bcebos.com/v1/paddledet/models/pipeline/vehicle_attribute_model.zip) | 7.2M |
+
+Please refer to [docs](deploy/pipeline/README_en.md) for details.
+
+
+
+
+## Document tutorials
+
+### Introductory tutorials
+
+- [Installation](docs/tutorials/INSTALL_cn.md)
+- [Quick start](docs/tutorials/QUICK_STARTED_cn.md)
+- [Data preparation](docs/tutorials/data/README.md)
+- [Getting Started on PaddleDetection](docs/tutorials/GETTING_STARTED_cn.md)
+- [FAQ](docs/tutorials/FAQ)
+
+### Advanced tutorials
+
+- Configuration
+
+ - [RCNN Configuration](docs/tutorials/config_annotation/faster_rcnn_r50_fpn_1x_coco_annotation.md)
+ - [PP-YOLO Configuration](docs/tutorials/config_annotation/ppyolo_r50vd_dcn_1x_coco_annotation.md)
+
+- Compression based on [PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim)
+
+ - [Pruning/Quantization/Distillation Tutorial](configs/slim)
+
+- [Inference deployment](deploy/README.md)
+
+ - [Export model for inference](deploy/EXPORT_MODEL.md)
+
+ - [Paddle Inference deployment](deploy/README.md)
+
+ - [Inference deployment with Python](deploy/python)
+ - [Inference deployment with C++](deploy/cpp)
+
+ - [Paddle-Lite deployment](deploy/lite)
+
+ - [Paddle Serving deployment](deploy/serving)
+
+ - [ONNX model export](deploy/EXPORT_ONNX_MODEL.md)
+
+ - [Inference benchmark](deploy/BENCHMARK_INFER.md)
+
+- Advanced development
+
+ - [Data processing module](docs/advanced_tutorials/READER.md)
+ - [New object detection models](docs/advanced_tutorials/MODEL_TECHNICAL.md)
+ - Customization
+ - [Object detection](docs/advanced_tutorials/customization/detection.md)
+ - [Keypoint detection](docs/advanced_tutorials/customization/keypoint_detection.md)
+ - [Multiple object tracking](docs/advanced_tutorials/customization/pphuman_mot.md)
+ - [Action recognition](docs/advanced_tutorials/customization/action_recognotion/)
+ - [Attribute recognition](docs/advanced_tutorials/customization/pphuman_attribute.md)
+
+### Courses
+
+- **[Theoretical foundation] [Object detection 7-day camp](https://aistudio.baidu.com/aistudio/education/group/info/1617):** an overview of object detection tasks, details of the RCNN-series and YOLO-series object detection algorithms, PP-YOLO optimization strategies and case sharing, and an introduction to and practice with anchor-free algorithms
+
+- **[Industrial application] [AI Fast Track: industrial object detection technology and application](https://aistudio.baidu.com/aistudio/education/group/info/23670):** SOTA object detection algorithms, the real-time pedestrian analysis system PP-Human, and breakdowns and practice of industrial object detection applications
+
+- **[Industrial features] 2022.3.26** **[Smart City Industry Seven-Day Class](https://aistudio.baidu.com/aistudio/education/group/info/25620)**: urban planning, urban governance, smart governance services, traffic management, and community governance
+
+- **[Academic exchange] 2022.9.27 [YOLO Vision Event](https://www.youtube.com/playlist?list=PL1FZnkj4ad1NHVC7CMc3pjSQ-JRK-Ev6O):** At the first YOLO-themed event, PaddleDetection was invited to exchange ideas with computer vision experts from around the world.
+
+### [Industrial tutorial examples](./industrial_tutorial/README.md)
+
+- [Rotated object detection based on PP-YOLOE-R](https://aistudio.baidu.com/aistudio/projectdetail/5058293)
+
+- [Aerial image detection based on PP-YOLOE-SOD](https://aistudio.baidu.com/aistudio/projectdetail/5036782)
+
+- [Fall down recognition based on PP-Human v2](https://aistudio.baidu.com/aistudio/projectdetail/4606001)
+
+- [Intelligent fitness recognition based on PP-TinyPose Plus](https://aistudio.baidu.com/aistudio/projectdetail/4385813)
+
+- [Road litter detection based on PP-PicoDet Plus](https://aistudio.baidu.com/aistudio/projectdetail/3561097)
+
+- [Visitor flow statistics based on FairMOT](https://aistudio.baidu.com/aistudio/projectdetail/2421822)
+
+- [Guest analysis based on PP-Human](https://aistudio.baidu.com/aistudio/projectdetail/4537344)
+
+- [More examples](./industrial_tutorial/README.md)
+
+## Applications
+
+- [Fitness app on android mobile](https://github.com/zhiboniu/pose_demo_android)
+- [PP-Tracking GUI Visualization Interface](https://github.com/yangyudong2020/PP-Tracking_GUi)
+
+## Recommended third-party tutorials
+
+- [Deployment of PaddleDetection for Windows I ](https://zhuanlan.zhihu.com/p/268657833)
+- [Deployment of PaddleDetection for Windows II](https://zhuanlan.zhihu.com/p/280206376)
+- [Deployment of PaddleDetection on Jetson Nano](https://zhuanlan.zhihu.com/p/319371293)
+- [How to deploy YOLOv3 model on Raspberry Pi for Helmet detection](https://github.com/PaddleCV-FAQ/PaddleDetection-FAQ/blob/main/Lite%E9%83%A8%E7%BD%B2/yolov3_for_raspi.md)
+- [Use SSD-MobileNetv1 for a project -- From dataset to deployment on Raspberry Pi](https://github.com/PaddleCV-FAQ/PaddleDetection-FAQ/blob/main/Lite%E9%83%A8%E7%BD%B2/ssd_mobilenet_v1_for_raspi.md)
+
+## Version updates
+
+Please refer to the [release notes](https://github.com/PaddlePaddle/Paddle/wiki/PaddlePaddle-2.3.0-Release-Note-EN) for more details about the updates
+
+## License
+
+PaddlePaddle is provided under the [Apache 2.0 license](LICENSE)
+
+## Contribute your code
+
+We appreciate your contributions and your feedback!
+
+- Thanks to [Mandroide](https://github.com/Mandroide) for code cleanup
+- Thanks to [FL77N](https://github.com/FL77N/) for the `Sparse-RCNN` model
+- Thanks to [Chen-Song](https://github.com/Chen-Song) for the `Swin Faster-RCNN` model
+- Thanks to [yangyudong](https://github.com/yangyudong2020) and [hchhtc123](https://github.com/hchhtc123) for developing the PP-Tracking GUI interface
+- Thanks to Shigure19 for developing the PP-TinyPose fitness APP
+- Thanks to [manangoel99](https://github.com/manangoel99) for Wandb visualization support
+
+## Citation
+
+```
+@misc{ppdet2019,
+title={PaddleDetection, Object detection and instance segmentation toolkit based on PaddlePaddle.},
+author={PaddlePaddle Authors},
+howpublished = {\url{https://github.com/PaddlePaddle/PaddleDetection}},
+year={2019}
+}
+```
diff --git "a/PaddleDetection-release-2.6/activity/\347\233\264\346\222\255\347\255\224\347\226\221\347\254\254\344\270\200\346\234\237.md" "b/PaddleDetection-release-2.6/activity/\347\233\264\346\222\255\347\255\224\347\226\221\347\254\254\344\270\200\346\234\237.md"
new file mode 100644
index 0000000000000000000000000000000000000000..393bf18f7e64bb7360c4bba43d7b9b48662dd8e0
--- /dev/null
+++ "b/PaddleDetection-release-2.6/activity/\347\233\264\346\222\255\347\255\224\347\226\221\347\254\254\344\270\200\346\234\237.md"
@@ -0,0 +1,125 @@
+# Live Q&A Session 1
+
+### The full replay of the Q&A session can be downloaded from https://pan.baidu.com/s/168ouju4MxN5XJEb-GU1iAw (extraction code: 92mw)
+
+## PaddleDetection framework / API questions
+
+#### Q1. Could you explain warmup in detail?
+A1. Warmup gradually raises the learning rate from 0 to the preset value during the early phase of training. For configuration, see the [source code](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.4/ppdet/optimizer.py#L156); it can be specified by step count or by epoch count.
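To make the warmup idea concrete, here is a minimal pure-Python sketch of a linear warmup schedule (an illustration only, not PaddleDetection's actual optimizer code; the function name and defaults are made up):

```python
def warmup_lr(step, base_lr, warmup_steps, start_lr=0.0):
    """Linear warmup: interpolate from start_lr to base_lr over warmup_steps,
    then hold base_lr (a real schedule would hand off to e.g. cosine decay)."""
    if step >= warmup_steps:
        return base_lr
    return start_lr + (base_lr - start_lr) * step / warmup_steps
```

For example, with `base_lr=0.01` and `warmup_steps=100`, step 50 yields half the base learning rate.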
+
+#### Q2. Can pretrained weights still be used when the number of classes does not match?
+A2. Yes. When the classes do not match, the model automatically skips loading any weights whose shapes do not match; the class-dependent weights usually sit in the head layers.
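The shape-mismatch skipping can be illustrated by partitioning a pretrained state dict by shape (a simplified sketch; the parameter names and shapes below are hypothetical, not real PaddleDetection parameters):

```python
def partition_pretrained(model_params, pretrained):
    """model_params / pretrained map parameter names to shape tuples.
    Weights whose shape matches the model are loadable; the rest
    (typically class-dependent head weights) are skipped."""
    loadable = {n: s for n, s in pretrained.items() if model_params.get(n) == s}
    skipped = [n for n in pretrained if n not in loadable]
    return loadable, skipped
```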
+
+#### Q3. How is nms_eta used? Neither the source code nor the API docs explain it clearly.
+A3. For dense scenes, nms_eta dynamically lowers the NMS threshold after each round, which avoids suppressing two heavily overlapping boxes that belong to different objects. See the [source code](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/operators/detection/multiclass_nms_op.cc#L139) for details. The default is 1, and it usually does not need to be set.
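A rough sketch of the adaptive-threshold behaviour (modeled on the multiclass NMS logic, where the threshold is multiplied by nms_eta while it stays above 0.5; this is a simplified illustration, not the actual operator code):

```python
def adaptive_thresholds(threshold, nms_eta, rounds):
    """Track how the NMS IoU threshold evolves across suppression rounds:
    when nms_eta < 1 and the threshold is still above 0.5, it shrinks
    after each round, so later rounds suppress less aggressively."""
    history = [threshold]
    for _ in range(rounds - 1):
        if nms_eta < 1.0 and threshold > 0.5:
            threshold *= nms_eta
        history.append(threshold)
    return history
```

With the default `nms_eta=1.0` the threshold never changes, which matches the note that it usually does not need to be set.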
+
+#### Q4. In anchor_cluster.py, is --size the model input size or the size of the actual images?
+A4. It is the image size used at inference time; in general you can refer to the image_shape setting in TestReader.
+
+#### Q5. Why do predicted coordinates sometimes come out negative?
+A5. Negative values are possible in the model's output. First check whether the predictions otherwise meet expectations. If they do, you can add a clip step in post-processing to constrain the output boxes to the image; if not, the model is poorly trained and needs further debugging or tuning.
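Such a clip step in post-processing can be sketched as follows (a hypothetical helper for illustration, not part of PaddleDetection):

```python
def clip_boxes(boxes, width, height):
    """Clamp [x1, y1, x2, y2] boxes to the image bounds, so slightly
    negative or out-of-range coordinates from the model are corrected."""
    clipped = []
    for x1, y1, x2, y2 in boxes:
        clipped.append((
            max(0, min(x1, width)), max(0, min(y1, height)),
            max(0, min(x2, width)), max(0, min(y2, height)),
        ))
    return clipped
```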
+
+#### Q6. For the BlazeFace face detection model, one-click prediction fails because load_params has no parameter file. Where can it be downloaded?
+A6. BlazeFace models can be downloaded from the [model zoo](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.4/configs/face_detection#%E6%A8%A1%E5%9E%8B%E5%BA%93). For deployment, export the model following these [steps](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.4/deploy/EXPORT_MODEL.md).
+
+## PP-YOLOE questions
+#### Q1. When training PP-YOLOE, the loss keeps rising. Is this a dataset problem?
+A1. Check the following:
+
+1. Data: first confirm the dataset is sound, including annotations and classes
+2. Hyperparameters: adjust base_lr in proportion to batch_size (the linear scaling rule); adjust warmup_iters according to the total number of epochs
+3. Pretrained weights: load the officially provided weights pretrained on the COCO dataset
+4. Network structure: analyze the distribution of boxes and adjust the DFL parameters accordingly
+
+#### Q2. Model selection: how do I choose between PicoDet and the PP-YOLO series?
+A2. PicoDet is designed for mobile and other low-compute devices (ARM, x86 CPUs); the PP-YOLO series is designed for servers with NVIDIA GPUs, Baidu Kunlun cards, and similar hardware. For phones or desktops without a GPU, prefer PicoDet; with high-compute devices such as NVIDIA GPUs, or in latency-insensitive scenarios where accuracy matters most, prefer the PP-YOLO series.
+
+#### Q3. Is it true that the BN layers in ConvBNLayer use no L2Decay, while the rest of PP-YOLOE-s uses the 0.0005 L2Decay from the config file?
+A3. Yes. The backbone and neck of PP-YOLOE use ConvBNLayer, whose BN layers use no L2Decay; the rest uses the globally configured L2Decay of 0.0005.
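The decay/no-decay split can be sketched as grouping parameters by name (the name patterns below are purely illustrative, not PaddleDetection's real parameter names):

```python
def weight_decay_groups(param_names):
    """Split parameters into a decay group (global L2Decay applies) and a
    no-decay group (BN scale/offset). The '.bn.' / suffix patterns here
    are hypothetical stand-ins for a framework's naming convention."""
    no_decay = [n for n in param_names
                if ".bn." in n or n.endswith((".bn_scale", ".bn_offset"))]
    decay = [n for n in param_names if n not in no_decay]
    return decay, no_decay
```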
+
+#### Q4. Do the Conv biases in PP-YOLOE also skip decay?
+A4. The Conv layers in the backbone and neck of PP-YOLOE have no bias parameters; the Conv biases in the head use the global decay.
+
+#### Q5. When measuring speed, why use Paddle Inference rather than timing the model directly?
+A5. Paddle Inference fuses the forward operators of the exported model to optimize speed, and real deployments also run on Paddle Inference.
+
+#### Q6. Is the pre- and post-processing the same when deploying the PP-YOLOE series?
+A6. At deployment, the pre-processing of the PP-YOLO series follows the decode-resize-normalize-permute pipeline. For post-processing, PP-YOLOv2 uses Matrix NMS while PP-YOLOE uses the ordinary NMS algorithm.
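The normalize and permute steps of that pipeline can be sketched in pure Python (decode and resize omitted; a toy illustration on nested lists rather than real tensors):

```python
def normalize_permute(img_hwc, mean, std):
    """'normalize' + 'permute' from the decode-resize-normalize-permute
    pipeline: scale pixels to [0, 1], standardize per channel with the
    given mean/std, then reorder HWC -> CHW."""
    h, w, c = len(img_hwc), len(img_hwc[0]), len(img_hwc[0][0])
    return [[[(img_hwc[y][x][ch] / 255.0 - mean[ch]) / std[ch]
              for x in range(w)]
             for y in range(h)]
            for ch in range(c)]
```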
+
+#### Q7. Does PP-YOLOE have tuning strategies for small objects and class-imbalanced datasets?
+A7. For small-object datasets, you can appropriately increase the input size of PP-YOLOE and add attention mechanisms to the model; small-object detection based on PP-YOLOE is under development. For class imbalance, handle it through data sampling; PP-YOLOE currently has no dedicated optimization for class imbalance.
+
+## PP-Human questions
+#### Q1. Why does prediction fail when PP-Human uses an exported 18-keypoint model instead of the official 17-keypoint one?
+A1. The number of keypoints output by the keypoint model does not match what the action recognition model expects. To predict with an 18-keypoint model, you must also build your own 18-keypoint action recognition model.
+
+#### Q2. Why is window_size set to 50 in the officially exported model?
+A2. The exported setting matches the input sequence length used for training and prediction. Our data mainly comes from NTU plus real-world data provided by partner companies. Statistics over the falling clips in this data show that most action clips last roughly 40 to 80 frames. Balancing runtime latency against prediction quality, we chose 50, which on our data is long enough to describe a complete action without making the delay excessive.
+
+In general, window_size is best chosen according to the actual actions and device. If, on some device, 50 frames cannot contain a complete action, increase the value; if an action is very short and 50 frames would include too many unrelated movements, which invites misrecognition, decrease it.
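Collecting per-frame keypoints into fixed-length windows for the action classifier can be sketched as follows (a hypothetical buffer class for illustration, not PP-Human's actual code):

```python
from collections import deque

class KeypointBuffer:
    """Accumulate per-frame keypoints into a sliding window of
    window_size frames; once full, each new frame yields a fresh
    window that an action classifier could consume."""
    def __init__(self, window_size=50):
        self.window_size = window_size
        self.frames = deque(maxlen=window_size)

    def push(self, keypoints):
        self.frames.append(keypoints)
        return len(self.frames) == self.window_size  # ready to classify?
```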
+
+
+#### Q3. How do I replace the detection, tracking and keypoint models in PP-Human?
+A3. All models we use are exported from PaddleDetection. In principle, every model used by PP-Human can be replaced directly, but note that the replacement must share the same pipeline and pre/post-processing.
+
+#### Q4. Data annotation in PP-Human (detection, tracking, keypoints, actions, attributes): which tools and steps are recommended?
+A4. Annotation tools: detection with labelme, labelImg or CVAT; tracking with DarkLabel or CVAT; keypoints with labelme or CVAT. Detection annotations can be converted to COCO format with tools/x2coco.py.
+
+#### Q5. How do I change the labels in PP-Human (attributes and action recognition)?
+A5. In PP-Human, action recognition is defined as a classification problem over skeleton keypoint sequences; the open-sourced falling recognition is a binary classification task. Attribute training is not released yet and is under construction.
+
+#### Q6. Which PP-Human features support a single person, and which support multiple people?
+A6. PP-Human is built on one pipeline: detection -> tracking -> specific task. The task-specific models currently process one person at a time, i.e. attributes, actions and so on belong to each individual in the image; but since every person in the image passes through the pipeline, single-person and multi-person cases are supported equally.
+
+#### Q7. Does PP-Human support video-stream prediction and serving deployment?
+A7. These are under development and will be supported in the next release.
+
+#### Q8. After training PP-Human on my own dataset, the visualized label at test time is still "falling". How do I change it?
+
+A8. The visualization function is at https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.4/deploy/python/visualize.py#L368; replace action_text there with the desired class name.
+
+#### Q9. Can keypoint detection recognize a continuous action, e.g. fitness form checking?
+A9. Yes, this is feasible based on keypoints, with two possible approaches:
+
+1. If the goal is to judge how standard an action is, and the action can be described well, add hand-written logic on top of the keypoint coordinates. See our Android fitness app demo: https://github.com/zhiboniu/pose_demo_android ; the logic for judging each action is at https://github.com/zhiboniu/pose_demo_android/blob/release/1.0/app/src/main/cpp/pose_action.cc .
+
+2. If an action is hard to describe with rules, follow the existing falling-detection case and train a model to recognize the fitness action, though this places much higher demands on data collection.
+
+
+#### Q10. On a partially occluded ladder in a production environment, can keypoint detection judge whether a person climbs up or down in a compliant way?
+A10. It depends on the degree of occlusion. Under severe occlusion the keypoint model degrades sharply, so action judgments become unreliable. Also, keypoint-based approaches discard appearance information: judging from the person's motion alone works when occlusion is mild, but if the ladder itself is essential for deciding compliance, this approach is not necessarily the best choice.
+
+#### Q11. Isn't keypoint-based action recognition a temporal action recognition task?
+A11. It is temporal. The per-frame keypoint coordinates within a time window form a temporal keypoint sequence, and the action recognition model predicts the action class of that sequence.
+
+
+## Detection algorithm questions
+#### Q1. For small objects in large images, where inference also runs on large images, how should pre-processing be handled?
+A1. Common approaches for small objects are tiling the image and increasing the network input size. With anchor-based detectors, anchors can be generated by clustering object sizes; see the [script](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.4/tools/anchor_cluster.py). Small-object detection based on PP-YOLOE is under development.
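The anchor-clustering idea can be sketched with a toy k-means over (w, h) box sizes (anchor_cluster.py uses an IoU-based distance and more options; this only shows the core idea with a plain Euclidean distance):

```python
import random

def kmeans_anchors(sizes, k, iters=20, seed=0):
    """Toy k-means: cluster ground-truth (w, h) box sizes into k anchor
    shapes. Centers are the per-cluster mean width/height."""
    random.seed(seed)
    centers = random.sample(sizes, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for w, h in sizes:
            nearest = min(range(k), key=lambda j: (w - centers[j][0]) ** 2
                                                  + (h - centers[j][1]) ** 2)
            groups[nearest].append((w, h))
        centers = [(sum(w for w, _ in grp) / len(grp),
                    sum(h for _, h in grp) / len(grp)) if grp else centers[i]
                   for i, grp in enumerate(groups)]
    return sorted(centers)
```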
+
+#### Q2. How should very large objects, such as invoices, be detected?
+A2. With anchor-based detectors, generate anchors by clustering object sizes using this [script](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.4/tools/anchor_cluster.py). Strengthening the deep feature levels also improves large-object detection.
+
+#### Q3. Prediction produces a large number of boxes, some with confidence even below 0.1. How can such boxes be filtered out, so that inference avoids unnecessary computation and stays fast?
+A3. Post-processing filters twice: 1) the top 100 boxes by confidence are kept for NMS; 2) boxes below the configured threshold are discarded. If you can be sure images contain relatively few objects (fewer than 10), you can lower the top-100 value to 50 or less, which speeds up the NMS stage. Then tune threshold, which trades off the final precision and recall.
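The two filtering stages can be sketched as follows (a simplified stand-in for the real post-processing, which runs NMS between the two steps):

```python
def filter_detections(dets, keep_top_k=100, score_threshold=0.5):
    """Stage 1: keep the top-k detections by score (these would feed NMS).
    Stage 2: drop everything under the score threshold."""
    top = sorted(dets, key=lambda d: d["score"], reverse=True)[:keep_top_k]
    return [d for d in top if d["score"] >= score_threshold]
```

Lowering `keep_top_k` shrinks the NMS workload; raising `score_threshold` trades recall for precision.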
+
+#### Q4. How should the ratio of positive to negative samples be designed?
+A4. PaddleDetection supports negative-sample training: set allow_empty: true under TrainDataset. In our dataset tests, a negative-sample ratio of 0.3 improved the model most clearly.
+
+## Compression and deployment questions
+#### Q1. After exporting an inference model trained with PaddleDetection, how should the pre/post-processing code be written for deployment? Are there tutorials?
+A1. Most models in PaddleDetection support C++ inference. The processing differs by purpose: for example, PP-YOLOE speed tests exclude post-processing, and PicoDet can choose whether to export NMS so as to support different third-party inference engines.
+
+object_detector.cc implements the pipeline for all detection models. Pre-processing is mostly decode-resize-normalize-permute, with some networks adding a padding step; most models embed post-processing inside the model itself, and PicoDet provides separate NMS post-processing code.
+
+Detection models uniformly take image, im_shape and scale_factor as inputs. If a model does not use im_shape, the number of inputs decreases, but the whole pre-processing pipeline works without extra development.
+
+#### Q2. On TensorRT acceleration: FP16 on V100 works, but the timings look off. On a 1080 Ti, 1000 runs on a single image take 50 s in FP32, yet FP16 on V100 takes 97 s.
+A2. PP-YOLOE and other models have been speed-tested on V100 with TensorRT FP16. Check the following:
+
+1. Make sure warmup is set correctly during speed tests, so that the long startup time does not skew the measurement
+2. With TensorRT enabled, generating the engine file takes a long time; set use_static to True in https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.4/deploy/python/infer.py#L745
+
+
+#### Q3. PaddleDetection already supports QAT for some models. Can QAT be applied easily to a new model? If not, why is only a limited set of models supported, and why do other models hit all kinds of problems?
+A3. PaddleDetection has many models, so QAT configs are open-sourced only for some of them. Other models support QAT as well; the config files simply do not cover them yet, and quantization errors are usually configuration problems. For detection models we generally recommend skipping the last conv of the head. To skip quantization for particular layers, set skip_quant; see the [code](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.4/ppdet/modeling/heads/yolo_head.py#L97).
diff --git a/PaddleDetection-release-2.6/benchmark/README.md b/PaddleDetection-release-2.6/benchmark/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..1c8bb2bf084a77226b069167ffc19ec9723e437d
--- /dev/null
+++ b/PaddleDetection-release-2.6/benchmark/README.md
@@ -0,0 +1,47 @@
+# Benchmark test scripts for general detection
+
+```
+├── benchmark
+│ ├── analysis_log.py
+│ ├── prepare.sh
+│ ├── README.md
+│ ├── run_all.sh
+│ ├── run_benchmark.sh
+```
+
+## Script description
+
+### prepare.sh
+Data preparation script; automatically downloads the data and models
+### run_all.sh
+Main entry script; runs the test plans for all related models
+### run_benchmark.sh
+Single-model script; runs the test plan for the specified model
+
+## Docker runtime environment
+* docker image: registry.baidubce.com/paddlepaddle/paddle:2.1.2-gpu-cuda10.2-cudnn7
+* paddle = 2.1.2
+* python = 3.7
+
+## Running the benchmark tests
+
+### Run all models
+```
+git clone https://github.com/PaddlePaddle/PaddleDetection.git
+cd PaddleDetection
+bash benchmark/run_all.sh
+```
+
+### Run a specified model
+* Usage: bash run_benchmark.sh ${run_mode} ${batch_size} ${fp_item} ${max_epoch} ${model_name}
+* model_name: faster_rcnn, fcos, deformable_detr, gfl, hrnet, higherhrnet, solov2, jde, fairmot
+```
+git clone https://github.com/PaddlePaddle/PaddleDetection.git
+cd PaddleDetection
+bash benchmark/prepare.sh
+
+# single GPU
+CUDA_VISIBLE_DEVICES=0 bash benchmark/run_benchmark.sh sp 2 fp32 1 faster_rcnn
+# multiple GPUs
+CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 bash benchmark/run_benchmark.sh mp 2 fp32 1 faster_rcnn
+```
diff --git a/PaddleDetection-release-2.6/benchmark/configs/faster_rcnn_r50_fpn_1x_coco.yml b/PaddleDetection-release-2.6/benchmark/configs/faster_rcnn_r50_fpn_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..02f138559c8e866ce1d1c9e7dde40720df9cf400
--- /dev/null
+++ b/PaddleDetection-release-2.6/benchmark/configs/faster_rcnn_r50_fpn_1x_coco.yml
@@ -0,0 +1,48 @@
+_BASE_: [
+ '../../configs/datasets/coco_detection.yml',
+ '../../configs/runtime.yml',
+ '../../configs/faster_rcnn/_base_/optimizer_1x.yml',
+ '../../configs/faster_rcnn/_base_/faster_rcnn_r50_fpn.yml',
+]
+weights: output/faster_rcnn_r50_fpn_1x_coco/model_final
+
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - RandomFlip: {prob: 0.5}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
diff --git a/PaddleDetection-release-2.6/benchmark/prepare.sh b/PaddleDetection-release-2.6/benchmark/prepare.sh
new file mode 100644
index 0000000000000000000000000000000000000000..0133f2b6847e6e2baa445d499bf7cf9a2d77743b
--- /dev/null
+++ b/PaddleDetection-release-2.6/benchmark/prepare.sh
@@ -0,0 +1,17 @@
+#!/usr/bin/env bash
+
+pip install -U pip Cython
+pip install -r requirements.txt
+
+mv ./dataset/coco/download_coco.py . && rm -rf ./dataset/coco/* && mv ./download_coco.py ./dataset/coco/
+# prepare lite train data
+wget -nc -P ./dataset/coco/ https://paddledet.bj.bcebos.com/data/coco_benchmark.tar
+cd ./dataset/coco/ && tar -xvf coco_benchmark.tar && mv -u coco_benchmark/* .
+rm -rf coco_benchmark/
+
+cd ../../
+rm -rf ./dataset/mot/*
+# prepare mot mini train data
+wget -nc -P ./dataset/mot/ https://paddledet.bj.bcebos.com/data/mot_benchmark.tar
+cd ./dataset/mot/ && tar -xvf mot_benchmark.tar && mv -u mot_benchmark/* .
+rm -rf mot_benchmark/
diff --git a/PaddleDetection-release-2.6/benchmark/run_all.sh b/PaddleDetection-release-2.6/benchmark/run_all.sh
new file mode 100644
index 0000000000000000000000000000000000000000..cffeb09421cd14f45fc05e51b8922daab815ab67
--- /dev/null
+++ b/PaddleDetection-release-2.6/benchmark/run_all.sh
@@ -0,0 +1,47 @@
+# Use docker: paddlepaddle/paddle:latest-gpu-cuda10.1-cudnn7 paddle=2.1.2 python3.7
+#
+# Usage:
+# git clone https://github.com/PaddlePaddle/PaddleDetection.git
+# cd PaddleDetection
+# bash benchmark/run_all.sh
+log_path=${LOG_PATH_INDEX_DIR:-$(pwd)} # set by the benchmark system; when no profiling run is needed, log_path points to the directory that stores the speed logs
+
+# run prepare.sh
+bash benchmark/prepare.sh
+
+model_name_list=(faster_rcnn fcos deformable_detr gfl hrnet higherhrnet solov2 jde fairmot)
+fp_item_list=(fp32)
+max_epoch=2
+
+for model_item in ${model_name_list[@]}; do
+ for fp_item in ${fp_item_list[@]}; do
+ case ${model_item} in
+ faster_rcnn) bs_list=(1 8) ;;
+ fcos) bs_list=(2) ;;
+ deformable_detr) bs_list=(2) ;;
+ gfl) bs_list=(2) ;;
+ hrnet) bs_list=(64) ;;
+ higherhrnet) bs_list=(20) ;;
+ solov2) bs_list=(2) ;;
+ jde) bs_list=(4) ;;
+ fairmot) bs_list=(6) ;;
+ *) echo "wrong model_name"; exit 1;
+ esac
+ for bs_item in ${bs_list[@]}
+ do
+ run_mode=sp
+      log_name=detection_${model_item}_bs${bs_item}_${fp_item} # e.g. clas_MobileNetv1_mp_bs32_fp32_8
+ echo "index is speed, 1gpus, begin, ${log_name}"
+ CUDA_VISIBLE_DEVICES=0 bash benchmark/run_benchmark.sh ${run_mode} ${bs_item} \
+ ${fp_item} ${max_epoch} ${model_item} | tee ${log_path}/${log_name}_speed_1gpus 2>&1
+ sleep 60
+
+ run_mode=mp
+      log_name=detection_${model_item}_bs${bs_item}_${fp_item} # e.g. clas_MobileNetv1_mp_bs32_fp32_8
+ echo "index is speed, 8gpus, run_mode is multi_process, begin, ${log_name}"
+ CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 bash benchmark/run_benchmark.sh ${run_mode} \
+ ${bs_item} ${fp_item} ${max_epoch} ${model_item}| tee ${log_path}/${log_name}_speed_8gpus8p 2>&1
+ sleep 60
+ done
+ done
+done
diff --git a/PaddleDetection-release-2.6/benchmark/run_benchmark.sh b/PaddleDetection-release-2.6/benchmark/run_benchmark.sh
new file mode 100644
index 0000000000000000000000000000000000000000..908bfe59fe0e88d73783bb1328017b561be24bec
--- /dev/null
+++ b/PaddleDetection-release-2.6/benchmark/run_benchmark.sh
@@ -0,0 +1,92 @@
+#!/usr/bin/env bash
+set -xe
+# Usage: CUDA_VISIBLE_DEVICES=0 bash benchmark/run_benchmark.sh ${run_mode} ${batch_size} ${fp_item} ${max_epoch} ${model_name}
+python="python3.7"
+# Parameter description
+function _set_params(){
+ run_mode=${1:-"sp"} # sp|mp
+ batch_size=${2:-"2"}
+ fp_item=${3:-"fp32"} # fp32|fp16
+ max_epoch=${4:-"1"}
+ model_item=${5:-"model_item"}
+ run_log_path=${TRAIN_LOG_DIR:-$(pwd)}
+# parameters required by the log parser
+ base_batch_size=${batch_size}
+ mission_name="目标检测"
+ direction_id="0"
+ ips_unit="images/s"
+    skip_steps=10 # log parsing: skip the first few steps, which take longer for some models (required)
+    keyword="ips:" # log parsing: keyword that marks the lines containing the data (required)
+ index="1"
+ model_name=${model_item}_bs${batch_size}_${fp_item}
+
+ device=${CUDA_VISIBLE_DEVICES//,/ }
+ arr=(${device})
+ num_gpu_devices=${#arr[*]}
+ log_file=${run_log_path}/${model_item}_${run_mode}_bs${batch_size}_${fp_item}_${num_gpu_devices}
+}
+function _train(){
+ echo "Train on ${num_gpu_devices} GPUs"
+ echo "current CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES, gpus=$num_gpu_devices, batch_size=$batch_size"
+
+ # set runtime params
+ set_optimizer_lr_sp=" "
+ set_optimizer_lr_mp=" "
+ # parse model_item
+ case ${model_item} in
+ faster_rcnn) model_yml="benchmark/configs/faster_rcnn_r50_fpn_1x_coco.yml"
+ set_optimizer_lr_sp="LearningRate.base_lr=0.001" ;;
+ fcos) model_yml="configs/fcos/fcos_r50_fpn_1x_coco.yml"
+ set_optimizer_lr_sp="LearningRate.base_lr=0.001" ;;
+ deformable_detr) model_yml="configs/deformable_detr/deformable_detr_r50_1x_coco.yml" ;;
+ gfl) model_yml="configs/gfl/gfl_r50_fpn_1x_coco.yml"
+ set_optimizer_lr_sp="LearningRate.base_lr=0.001" ;;
+ hrnet) model_yml="configs/keypoint/hrnet/hrnet_w32_256x192.yml" ;;
+ higherhrnet) model_yml="configs/keypoint/higherhrnet/higherhrnet_hrnet_w32_512.yml" ;;
+ solov2) model_yml="configs/solov2/solov2_r50_fpn_1x_coco.yml" ;;
+ jde) model_yml="configs/mot/jde/jde_darknet53_30e_1088x608.yml" ;;
+ fairmot) model_yml="configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml" ;;
+ *) echo "Undefined model_item"; exit 1;
+ esac
+
+ set_batch_size="TrainReader.batch_size=${batch_size}"
+ set_max_epoch="epoch=${max_epoch}"
+ set_log_iter="log_iter=1"
+ if [ ${fp_item} = "fp16" ]; then
+ set_fp_item="--fp16"
+ else
+ set_fp_item=" "
+ fi
+
+ case ${run_mode} in
+ sp) train_cmd="${python} -u tools/train.py -c ${model_yml} ${set_fp_item} \
+ -o ${set_batch_size} ${set_max_epoch} ${set_log_iter} ${set_optimizer_lr_sp}" ;;
+ mp) rm -rf mylog
+ train_cmd="${python} -m paddle.distributed.launch --log_dir=./mylog \
+ --gpus=${CUDA_VISIBLE_DEVICES} tools/train.py -c ${model_yml} ${set_fp_item} \
+ -o ${set_batch_size} ${set_max_epoch} ${set_log_iter} ${set_optimizer_lr_mp}"
+ log_parse_file="mylog/workerlog.0" ;;
+ *) echo "choose run_mode(sp or mp)"; exit 1;
+ esac
+
+ timeout 15m ${train_cmd} > ${log_file} 2>&1
+ if [ $? -ne 0 ];then
+ echo -e "${train_cmd}, FAIL"
+ export job_fail_flag=1
+ else
+ echo -e "${train_cmd}, SUCCESS"
+ export job_fail_flag=0
+ fi
+    kill -9 `ps -ef | grep 'python' | grep -v grep | awk '{print $2}'`
+
+ if [ $run_mode = "mp" -a -d mylog ]; then
+ rm ${log_file}
+ cp mylog/workerlog.0 ${log_file}
+ fi
+}
+
+source ${BENCHMARK_ROOT}/scripts/run_model.sh # run_model.sh parses benchmark-format logs with the analysis.py script; during integration it can be downloaded from the benchmark repo: https://github.com/PaddlePaddle/benchmark/blob/master/scripts/run_model.sh. Comment this line out to produce only the training log without integration, but re-enable it before submitting
+_set_params $@
+# _train # uncomment to produce only the training log, without parsing
+_run # defined in run_model.sh; it calls _train internally. Comment this line out to produce only the training log without integration, but re-enable it before submitting
+
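The `skip_steps` and `keyword` parameters set in `_set_params` drive the log analysis performed by run_model.sh; the real parser is the benchmark repo's analysis.py. A minimal sketch of that parsing, with a hypothetical `mean_ips` helper:

```python
import re

def mean_ips(log_lines, keyword="ips:", skip_steps=10):
    # Keep lines containing the keyword, extract the ips value, drop the
    # first skip_steps warm-up entries, and average the remainder.
    vals = [float(m.group(1))
            for line in log_lines if keyword in line
            for m in [re.search(r"ips:\s*([\d.]+)", line)] if m]
    vals = vals[skip_steps:]
    return sum(vals) / max(len(vals), 1)
```

Feeding it training-log lines such as `"step 12 ips: 22"` yields the mean throughput over the steady-state steps only, which is why slow warm-up steps must be skipped.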
diff --git a/PaddleDetection-release-2.6/configs/cascade_rcnn/README.md b/PaddleDetection-release-2.6/configs/cascade_rcnn/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..8084fac2e999cea4db11bca2f9bf12b56b0a44d9
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/cascade_rcnn/README.md
@@ -0,0 +1,28 @@
+# Cascade R-CNN: High Quality Object Detection and Instance Segmentation
+
+## Model Zoo
+
+| Backbone | Type | Images/GPU | Lr schedule | Inf time (fps) | Box AP | Mask AP | Download | Config |
+| :------------------- | :------------- | :-----: | :-----: | :------------: | :-----: | :-----: | :-----------------------------------------------------: | :-----: |
+| ResNet50-FPN | Cascade Faster | 1 | 1x | ---- | 41.1 | - | [model](https://paddledet.bj.bcebos.com/models/cascade_rcnn_r50_fpn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/cascade_rcnn/cascade_rcnn_r50_fpn_1x_coco.yml) |
+| ResNet50-FPN | Cascade Mask | 1 | 1x | ---- | 41.8 | 36.3 | [model](https://paddledet.bj.bcebos.com/models/cascade_mask_rcnn_r50_fpn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/cascade_rcnn/cascade_mask_rcnn_r50_fpn_1x_coco.yml) |
+| ResNet50-vd-SSLDv2-FPN | Cascade Faster | 1 | 1x | ---- | 44.4 | - | [model](https://paddledet.bj.bcebos.com/models/cascade_rcnn_r50_vd_fpn_ssld_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/cascade_rcnn/cascade_rcnn_r50_vd_fpn_ssld_1x_coco.yml) |
+| ResNet50-vd-SSLDv2-FPN | Cascade Faster | 1 | 2x | ---- | 45.0 | - | [model](https://paddledet.bj.bcebos.com/models/cascade_rcnn_r50_vd_fpn_ssld_2x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/cascade_rcnn/cascade_rcnn_r50_vd_fpn_ssld_2x_coco.yml) |
+| ResNet50-vd-SSLDv2-FPN | Cascade Mask | 1 | 1x | ---- | 44.9 | 39.1 | [model](https://paddledet.bj.bcebos.com/models/cascade_mask_rcnn_r50_vd_fpn_ssld_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/cascade_rcnn/cascade_mask_rcnn_r50_vd_fpn_ssld_1x_coco.yml) |
+| ResNet50-vd-SSLDv2-FPN | Cascade Mask | 1 | 2x | ---- | 45.7 | 39.7 | [model](https://paddledet.bj.bcebos.com/models/cascade_mask_rcnn_r50_vd_fpn_ssld_2x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/cascade_rcnn/cascade_mask_rcnn_r50_vd_fpn_ssld_2x_coco.yml) |
+
+
+## Citations
+```
+@article{Cai_2019,
+ title={Cascade R-CNN: High Quality Object Detection and Instance Segmentation},
+ ISSN={1939-3539},
+ url={http://dx.doi.org/10.1109/tpami.2019.2956516},
+ DOI={10.1109/tpami.2019.2956516},
+ journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
+ publisher={Institute of Electrical and Electronics Engineers (IEEE)},
+ author={Cai, Zhaowei and Vasconcelos, Nuno},
+ year={2019},
+ pages={1–1}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/cascade_rcnn/_base_/cascade_fpn_reader.yml b/PaddleDetection-release-2.6/configs/cascade_rcnn/_base_/cascade_fpn_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..9b9abccd63e499bfa9402f3038425470e4a6e953
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/cascade_rcnn/_base_/cascade_fpn_reader.yml
@@ -0,0 +1,40 @@
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], interp: 2, keep_ratio: True}
+ - RandomFlip: {prob: 0.5}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
diff --git a/PaddleDetection-release-2.6/configs/cascade_rcnn/_base_/cascade_mask_fpn_reader.yml b/PaddleDetection-release-2.6/configs/cascade_rcnn/_base_/cascade_mask_fpn_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..9b9abccd63e499bfa9402f3038425470e4a6e953
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/cascade_rcnn/_base_/cascade_mask_fpn_reader.yml
@@ -0,0 +1,40 @@
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], interp: 2, keep_ratio: True}
+ - RandomFlip: {prob: 0.5}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
diff --git a/PaddleDetection-release-2.6/configs/cascade_rcnn/_base_/cascade_mask_rcnn_r50_fpn.yml b/PaddleDetection-release-2.6/configs/cascade_rcnn/_base_/cascade_mask_rcnn_r50_fpn.yml
new file mode 100644
index 0000000000000000000000000000000000000000..ea2937babd488b1e874f75494093d942366315e5
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/cascade_rcnn/_base_/cascade_mask_rcnn_r50_fpn.yml
@@ -0,0 +1,97 @@
+architecture: CascadeRCNN
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_cos_pretrained.pdparams
+
+
+CascadeRCNN:
+ backbone: ResNet
+ neck: FPN
+ rpn_head: RPNHead
+ bbox_head: CascadeHead
+ mask_head: MaskHead
+ # post process
+ bbox_post_process: BBoxPostProcess
+ mask_post_process: MaskPostProcess
+
+ResNet:
+ # index 0 stands for res2
+ depth: 50
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+
+FPN:
+ out_channel: 256
+
+RPNHead:
+ anchor_generator:
+ aspect_ratios: [0.5, 1.0, 2.0]
+ anchor_sizes: [[32], [64], [128], [256], [512]]
+ strides: [4, 8, 16, 32, 64]
+ rpn_target_assign:
+ batch_size_per_im: 256
+ fg_fraction: 0.5
+ negative_overlap: 0.3
+ positive_overlap: 0.7
+ use_random: True
+ train_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 2000
+ post_nms_top_n: 2000
+ topk_after_collect: True
+ test_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 1000
+ post_nms_top_n: 1000
+
+
+CascadeHead:
+ head: CascadeTwoFCHead
+ roi_extractor:
+ resolution: 7
+ sampling_ratio: 0
+ aligned: True
+ bbox_assigner: BBoxAssigner
+
+BBoxAssigner:
+ batch_size_per_im: 512
+ bg_thresh: 0.5
+ fg_thresh: 0.5
+ fg_fraction: 0.25
+ cascade_iou: [0.5, 0.6, 0.7]
+ use_random: True
+
+CascadeTwoFCHead:
+ out_channel: 1024
+
+BBoxPostProcess:
+ decode:
+ name: RCNNBox
+ prior_box_var: [30.0, 30.0, 15.0, 15.0]
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.05
+ nms_threshold: 0.5
+
+
+MaskHead:
+ head: MaskFeat
+ roi_extractor:
+ resolution: 14
+ sampling_ratio: 0
+ aligned: True
+ mask_assigner: MaskAssigner
+ share_bbox_feat: False
+
+MaskFeat:
+ num_convs: 4
+ out_channel: 256
+
+MaskAssigner:
+ mask_resolution: 28
+
+MaskPostProcess:
+ binary_thresh: 0.5
diff --git a/PaddleDetection-release-2.6/configs/cascade_rcnn/_base_/cascade_rcnn_r50_fpn.yml b/PaddleDetection-release-2.6/configs/cascade_rcnn/_base_/cascade_rcnn_r50_fpn.yml
new file mode 100644
index 0000000000000000000000000000000000000000..c5afe774347209812ed759e31fb03e5aff677d96
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/cascade_rcnn/_base_/cascade_rcnn_r50_fpn.yml
@@ -0,0 +1,75 @@
+architecture: CascadeRCNN
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_cos_pretrained.pdparams
+
+
+CascadeRCNN:
+ backbone: ResNet
+ neck: FPN
+ rpn_head: RPNHead
+ bbox_head: CascadeHead
+ # post process
+ bbox_post_process: BBoxPostProcess
+
+ResNet:
+ # index 0 stands for res2
+ depth: 50
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+
+FPN:
+ out_channel: 256
+
+RPNHead:
+ anchor_generator:
+ aspect_ratios: [0.5, 1.0, 2.0]
+ anchor_sizes: [[32], [64], [128], [256], [512]]
+ strides: [4, 8, 16, 32, 64]
+ rpn_target_assign:
+ batch_size_per_im: 256
+ fg_fraction: 0.5
+ negative_overlap: 0.3
+ positive_overlap: 0.7
+ use_random: True
+ train_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 2000
+ post_nms_top_n: 2000
+ topk_after_collect: True
+ test_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 1000
+ post_nms_top_n: 1000
+
+
+CascadeHead:
+ head: CascadeTwoFCHead
+ roi_extractor:
+ resolution: 7
+ sampling_ratio: 0
+ aligned: True
+ bbox_assigner: BBoxAssigner
+
+BBoxAssigner:
+ batch_size_per_im: 512
+ bg_thresh: 0.5
+ fg_thresh: 0.5
+ fg_fraction: 0.25
+ cascade_iou: [0.5, 0.6, 0.7]
+ use_random: True
+
+CascadeTwoFCHead:
+ out_channel: 1024
+
+BBoxPostProcess:
+ decode:
+ name: RCNNBox
+ prior_box_var: [30.0, 30.0, 15.0, 15.0]
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.05
+ nms_threshold: 0.5
diff --git a/PaddleDetection-release-2.6/configs/cascade_rcnn/_base_/optimizer_1x.yml b/PaddleDetection-release-2.6/configs/cascade_rcnn/_base_/optimizer_1x.yml
new file mode 100644
index 0000000000000000000000000000000000000000..63f898e9c52556bfa0fbbe9c369900c09ab3f94c
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/cascade_rcnn/_base_/optimizer_1x.yml
@@ -0,0 +1,19 @@
+epoch: 12
+
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [8, 11]
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 1000
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
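The 1x schedule above combines LinearWarmup with PiecewiseDecay. A minimal sketch of the resulting learning rate, assuming the standard Paddle semantics (warmup ramps linearly from `start_factor * base_lr` over the first 1000 steps; the rate is then divided by 10 at epochs 8 and 11); `lr_at` is a hypothetical helper, not PaddleDetection's API:

```python
def lr_at(step, epoch, base_lr=0.01, warmup_steps=1000,
          start_factor=0.001, milestones=(8, 11), gamma=0.1):
    # PiecewiseDecay: multiply by gamma once per milestone epoch reached
    lr = base_lr * gamma ** sum(epoch >= m for m in milestones)
    # LinearWarmup: ramp linearly from start_factor * lr up to lr
    if step < warmup_steps:
        alpha = step / warmup_steps
        lr *= start_factor + (1.0 - start_factor) * alpha
    return lr
```

So training starts near `1e-5`, reaches `0.01` after warmup, and drops to `0.001` and then `0.0001` at the two milestones.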
diff --git a/PaddleDetection-release-2.6/configs/cascade_rcnn/cascade_mask_rcnn_r50_fpn_1x_coco.yml b/PaddleDetection-release-2.6/configs/cascade_rcnn/cascade_mask_rcnn_r50_fpn_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..b2c7e536d5ccaedf3ef25e7e36624b664897cfef
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/cascade_rcnn/cascade_mask_rcnn_r50_fpn_1x_coco.yml
@@ -0,0 +1,8 @@
+_BASE_: [
+ '../datasets/coco_instance.yml',
+ '../runtime.yml',
+ '_base_/optimizer_1x.yml',
+ '_base_/cascade_mask_rcnn_r50_fpn.yml',
+ '_base_/cascade_mask_fpn_reader.yml',
+]
+weights: output/cascade_mask_rcnn_r50_fpn_1x_coco/model_final
diff --git a/PaddleDetection-release-2.6/configs/cascade_rcnn/cascade_mask_rcnn_r50_vd_fpn_ssld_1x_coco.yml b/PaddleDetection-release-2.6/configs/cascade_rcnn/cascade_mask_rcnn_r50_vd_fpn_ssld_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..0ab507caa9548e9118aeafb32f5c7394409601c8
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/cascade_rcnn/cascade_mask_rcnn_r50_vd_fpn_ssld_1x_coco.yml
@@ -0,0 +1,18 @@
+_BASE_: [
+ '../datasets/coco_instance.yml',
+ '../runtime.yml',
+ '_base_/optimizer_1x.yml',
+ '_base_/cascade_mask_rcnn_r50_fpn.yml',
+ '_base_/cascade_mask_fpn_reader.yml',
+]
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_v2_pretrained.pdparams
+weights: output/cascade_mask_rcnn_r50_vd_fpn_ssld_1x_coco/model_final
+
+ResNet:
+ depth: 50
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ lr_mult_list: [0.05, 0.05, 0.1, 0.15]
diff --git a/PaddleDetection-release-2.6/configs/cascade_rcnn/cascade_mask_rcnn_r50_vd_fpn_ssld_2x_coco.yml b/PaddleDetection-release-2.6/configs/cascade_rcnn/cascade_mask_rcnn_r50_vd_fpn_ssld_2x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..736ba2e7430717781364343312716d5b3f2ef4aa
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/cascade_rcnn/cascade_mask_rcnn_r50_vd_fpn_ssld_2x_coco.yml
@@ -0,0 +1,29 @@
+_BASE_: [
+ '../datasets/coco_instance.yml',
+ '../runtime.yml',
+ '_base_/optimizer_1x.yml',
+ '_base_/cascade_mask_rcnn_r50_fpn.yml',
+ '_base_/cascade_mask_fpn_reader.yml',
+]
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_v2_pretrained.pdparams
+weights: output/cascade_mask_rcnn_r50_vd_fpn_ssld_2x_coco/model_final
+
+ResNet:
+ depth: 50
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ lr_mult_list: [0.05, 0.05, 0.1, 0.15]
+
+epoch: 24
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [12, 22]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
diff --git a/PaddleDetection-release-2.6/configs/cascade_rcnn/cascade_rcnn_r50_fpn_1x_coco.yml b/PaddleDetection-release-2.6/configs/cascade_rcnn/cascade_rcnn_r50_fpn_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..b2cc7993b9885bb5feafbff53bcc82ae3049a148
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/cascade_rcnn/cascade_rcnn_r50_fpn_1x_coco.yml
@@ -0,0 +1,8 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/optimizer_1x.yml',
+ '_base_/cascade_rcnn_r50_fpn.yml',
+ '_base_/cascade_fpn_reader.yml',
+]
+weights: output/cascade_rcnn_r50_fpn_1x_coco/model_final
diff --git a/PaddleDetection-release-2.6/configs/cascade_rcnn/cascade_rcnn_r50_vd_fpn_ssld_1x_coco.yml b/PaddleDetection-release-2.6/configs/cascade_rcnn/cascade_rcnn_r50_vd_fpn_ssld_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..905adbd61a5b2b5213d737d5ad2a49df650c8425
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/cascade_rcnn/cascade_rcnn_r50_vd_fpn_ssld_1x_coco.yml
@@ -0,0 +1,18 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/optimizer_1x.yml',
+ '_base_/cascade_rcnn_r50_fpn.yml',
+ '_base_/cascade_fpn_reader.yml',
+]
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_v2_pretrained.pdparams
+weights: output/cascade_rcnn_r50_vd_fpn_ssld_1x_coco/model_final
+
+ResNet:
+ depth: 50
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ lr_mult_list: [0.05, 0.05, 0.1, 0.15]
diff --git a/PaddleDetection-release-2.6/configs/cascade_rcnn/cascade_rcnn_r50_vd_fpn_ssld_2x_coco.yml b/PaddleDetection-release-2.6/configs/cascade_rcnn/cascade_rcnn_r50_vd_fpn_ssld_2x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a6272145d03bb273c90ccf8d950c8b88f9b3e13b
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/cascade_rcnn/cascade_rcnn_r50_vd_fpn_ssld_2x_coco.yml
@@ -0,0 +1,29 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/optimizer_1x.yml',
+ '_base_/cascade_rcnn_r50_fpn.yml',
+ '_base_/cascade_fpn_reader.yml',
+]
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_v2_pretrained.pdparams
+weights: output/cascade_rcnn_r50_vd_fpn_ssld_2x_coco/model_final
+
+ResNet:
+ depth: 50
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ lr_mult_list: [0.05, 0.05, 0.1, 0.15]
+
+epoch: 24
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [12, 22]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
diff --git a/PaddleDetection-release-2.6/configs/centernet/README.md b/PaddleDetection-release-2.6/configs/centernet/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..6dd52cd32608d6e76f08e50bbda8f3c2f4190418
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/centernet/README.md
@@ -0,0 +1,37 @@
+English | [简体中文](README_cn.md)
+
+# CenterNet (CenterNet: Objects as Points)
+
+## Table of Contents
+- [Introduction](#Introduction)
+- [Model Zoo](#Model_Zoo)
+- [Citations](#Citations)
+
+## Introduction
+
+[CenterNet](http://arxiv.org/abs/1904.07850) is an anchor-free detector that models an object as a single point, the center point of its bounding box. The detector uses keypoint estimation to find center points and regresses all other object properties. This center-point-based approach is end-to-end differentiable, and simpler, faster, and more accurate than the corresponding bounding-box-based detectors.
+
+## Model Zoo
+
+### CenterNet Results on COCO-val 2017
+
+| backbone | input shape | mAP | FPS | download | config |
+| :--------------| :------- | :----: | :------: | :----: |:-----: |
+| DLA-34(paper) | 512x512 | 37.4 | - | - | - |
+| DLA-34 | 512x512 | 37.6 | - | [model](https://bj.bcebos.com/v1/paddledet/models/centernet_dla34_140e_coco.pdparams) | [config](./centernet_dla34_140e_coco.yml) |
+| ResNet50 + DLAUp | 512x512 | 38.9 | - | [model](https://bj.bcebos.com/v1/paddledet/models/centernet_r50_140e_coco.pdparams) | [config](./centernet_r50_140e_coco.yml) |
+| MobileNetV1 + DLAUp | 512x512 | 28.2 | - | [model](https://bj.bcebos.com/v1/paddledet/models/centernet_mbv1_140e_coco.pdparams) | [config](./centernet_mbv1_140e_coco.yml) |
+| MobileNetV3_small + DLAUp | 512x512 | 17.0 | - | [model](https://bj.bcebos.com/v1/paddledet/models/centernet_mbv3_small_140e_coco.pdparams) | [config](./centernet_mbv3_small_140e_coco.yml) |
+| MobileNetV3_large + DLAUp | 512x512 | 27.1 | - | [model](https://bj.bcebos.com/v1/paddledet/models/centernet_mbv3_large_140e_coco.pdparams) | [config](./centernet_mbv3_large_140e_coco.yml) |
+| ShuffleNetV2 + DLAUp | 512x512 | 23.8 | - | [model](https://bj.bcebos.com/v1/paddledet/models/centernet_shufflenetv2_140e_coco.pdparams) | [config](./centernet_shufflenetv2_140e_coco.yml) |
+
+
+## Citations
+```
+@article{zhou2019objects,
+ title={Objects as points},
+ author={Zhou, Xingyi and Wang, Dequan and Kr{\"a}henb{\"u}hl, Philipp},
+ journal={arXiv preprint arXiv:1904.07850},
+ year={2019}
+}
+```
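The center-point formulation above can be sketched as a toy decode step: take the top-k heatmap peaks as object centers and read the regressed width/height at each peak. Real CenterNet additionally applies a 3x3 max-pool NMS and a sub-pixel offset head; `decode_centers` is a hypothetical helper, not PaddleDetection's API.

```python
def decode_centers(heatmap, wh, k=100):
    # heatmap: H x W list of peak scores; wh: H x W list of (w, h) pairs
    # regressed at each location, both in output-feature-map coordinates.
    peaks = sorted(((score, x, y)
                    for y, row in enumerate(heatmap)
                    for x, score in enumerate(row)),
                   reverse=True)[:k]
    boxes = []
    for score, x, y in peaks:
        bw, bh = wh[y][x]
        boxes.append((x - bw / 2, y - bh / 2, x + bw / 2, y + bh / 2, score))
    return boxes
```

Because every object is just a peak plus per-pixel regressions, no anchor matching or conventional NMS over thousands of candidate boxes is needed.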
diff --git a/PaddleDetection-release-2.6/configs/centernet/README_cn.md b/PaddleDetection-release-2.6/configs/centernet/README_cn.md
new file mode 100644
index 0000000000000000000000000000000000000000..d78cd40e3d3d8751422d3d6bde078ff21d08223d
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/centernet/README_cn.md
@@ -0,0 +1,36 @@
+简体中文 | [English](README.md)
+
+# CenterNet (CenterNet: Objects as Points)
+
+## Contents
+- [Introduction](#Introduction)
+- [Model Zoo](#Model_Zoo)
+- [Citations](#Citations)
+
+## Introduction
+
+[CenterNet](http://arxiv.org/abs/1904.07850) is an anchor-free detector that represents an object as a single point, the center of its bounding box. CenterNet locates the center point via keypoint estimation and regresses the object's other properties. Being center-point based, it is end-to-end trainable and detects more efficiently than anchor-based detectors.
+
+## Model Zoo
+
+### CenterNet results on COCO-val 2017
+
+| backbone | input shape | mAP | FPS | download | config |
+| :--------------| :------- | :----: | :------: | :----: |:-----: |
+| DLA-34 (paper) | 512x512 | 37.4 | - | - | - |
+| DLA-34 | 512x512 | 37.6 | - | [model](https://bj.bcebos.com/v1/paddledet/models/centernet_dla34_140e_coco.pdparams) | [config](./centernet_dla34_140e_coco.yml) |
+| ResNet50 + DLAUp | 512x512 | 38.9 | - | [model](https://bj.bcebos.com/v1/paddledet/models/centernet_r50_140e_coco.pdparams) | [config](./centernet_r50_140e_coco.yml) |
+| MobileNetV1 + DLAUp | 512x512 | 28.2 | - | [model](https://bj.bcebos.com/v1/paddledet/models/centernet_mbv1_140e_coco.pdparams) | [config](./centernet_mbv1_140e_coco.yml) |
+| MobileNetV3_small + DLAUp | 512x512 | 17.0 | - | [model](https://bj.bcebos.com/v1/paddledet/models/centernet_mbv3_small_140e_coco.pdparams) | [config](./centernet_mbv3_small_140e_coco.yml) |
+| MobileNetV3_large + DLAUp | 512x512 | 27.1 | - | [model](https://bj.bcebos.com/v1/paddledet/models/centernet_mbv3_large_140e_coco.pdparams) | [config](./centernet_mbv3_large_140e_coco.yml) |
+| ShuffleNetV2 + DLAUp | 512x512 | 23.8 | - | [model](https://bj.bcebos.com/v1/paddledet/models/centernet_shufflenetv2_140e_coco.pdparams) | [config](./centernet_shufflenetv2_140e_coco.yml) |
+
+## Citations
+```
+@article{zhou2019objects,
+ title={Objects as points},
+ author={Zhou, Xingyi and Wang, Dequan and Kr{\"a}henb{\"u}hl, Philipp},
+ journal={arXiv preprint arXiv:1904.07850},
+ year={2019}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/centernet/_base_/centernet_dla34.yml b/PaddleDetection-release-2.6/configs/centernet/_base_/centernet_dla34.yml
new file mode 100644
index 0000000000000000000000000000000000000000..f8fb86912ef1ef48bdcde6363b5b228966fefd09
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/centernet/_base_/centernet_dla34.yml
@@ -0,0 +1,22 @@
+architecture: CenterNet
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/DLA34_pretrain.pdparams
+
+CenterNet:
+ backbone: DLA
+ neck: CenterNetDLAFPN
+ head: CenterNetHead
+ post_process: CenterNetPostProcess
+
+DLA:
+ depth: 34
+
+CenterNetDLAFPN:
+ down_ratio: 4
+
+CenterNetHead:
+ head_planes: 256
+ regress_ltrb: False
+
+CenterNetPostProcess:
+ max_per_img: 100
+ regress_ltrb: False
diff --git a/PaddleDetection-release-2.6/configs/centernet/_base_/centernet_r50.yml b/PaddleDetection-release-2.6/configs/centernet/_base_/centernet_r50.yml
new file mode 100644
index 0000000000000000000000000000000000000000..f2dc2ee2897d190066250113c9dbf01a7b92e130
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/centernet/_base_/centernet_r50.yml
@@ -0,0 +1,34 @@
+architecture: CenterNet
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_pretrained.pdparams
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+CenterNet:
+ backbone: ResNet
+ neck: CenterNetDLAFPN
+ head: CenterNetHead
+ post_process: CenterNetPostProcess
+
+ResNet:
+ depth: 50
+ variant: d
+ return_idx: [0, 1, 2, 3]
+ freeze_at: -1
+ norm_decay: 0.
+ dcn_v2_stages: [3]
+
+
+CenterNetDLAFPN:
+ first_level: 0
+ last_level: 4
+ down_ratio: 4
+ dcn_v2: False
+
+CenterNetHead:
+ head_planes: 256
+ regress_ltrb: False
+
+CenterNetPostProcess:
+ max_per_img: 100
+ regress_ltrb: False
diff --git a/PaddleDetection-release-2.6/configs/centernet/_base_/centernet_reader.yml b/PaddleDetection-release-2.6/configs/centernet/_base_/centernet_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..81af4ab840502da6e738ac667dd0883041ba8992
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/centernet/_base_/centernet_reader.yml
@@ -0,0 +1,35 @@
+worker_num: 4
+TrainReader:
+ inputs_def:
+ image_shape: [3, 512, 512]
+ sample_transforms:
+ - Decode: {}
+ - FlipWarpAffine: {keep_res: False, input_h: 512, input_w: 512, use_random: True}
+ - CenterRandColor: {}
+ - Lighting: {eigval: [0.2141788, 0.01817699, 0.00341571], eigvec: [[-0.58752847, -0.69563484, 0.41340352], [-0.5832747, 0.00994535, -0.81221408], [-0.56089297, 0.71832671, 0.41158938]]}
+ - NormalizeImage: {mean: [0.40789655, 0.44719303, 0.47026116], std: [0.2886383 , 0.27408165, 0.27809834], is_scale: False}
+ - Permute: {}
+ - Gt2CenterNetTarget: {down_ratio: 4, max_objs: 128}
+ batch_size: 16
+ shuffle: True
+ drop_last: True
+ use_shared_memory: True
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - WarpAffine: {keep_res: True, input_h: 512, input_w: 512}
+ - NormalizeImage: {mean: [0.40789655, 0.44719303, 0.47026116], std: [0.2886383 , 0.27408165, 0.27809834]}
+ - Permute: {}
+ batch_size: 1
+
+
+TestReader:
+ inputs_def:
+ image_shape: [3, 512, 512]
+ sample_transforms:
+ - Decode: {}
+ - WarpAffine: {keep_res: True, input_h: 512, input_w: 512}
+ - NormalizeImage: {mean: [0.40789655, 0.44719303, 0.47026116], std: [0.2886383 , 0.27408165, 0.27809834], is_scale: True}
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/centernet/_base_/optimizer_140e.yml b/PaddleDetection-release-2.6/configs/centernet/_base_/optimizer_140e.yml
new file mode 100644
index 0000000000000000000000000000000000000000..8c014e1ffe9f4971a9f322644bc943880ed57cec
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/centernet/_base_/optimizer_140e.yml
@@ -0,0 +1,14 @@
+epoch: 140
+
+LearningRate:
+ base_lr: 0.0005
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [90, 120]
+ use_warmup: False
+
+OptimizerBuilder:
+ optimizer:
+ type: Adam
+ regularizer: NULL
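The `PiecewiseDecay` schedule above can be sketched in plain Python: starting from `base_lr: 0.0005`, the rate is multiplied by `gamma: 0.1` at each milestone epoch (90 and 120). This is a minimal illustration under that reading of the config; the helper name is ours, not a PaddleDetection API.

```python
def piecewise_lr(epoch, base_lr=0.0005, gamma=0.1, milestones=(90, 120)):
    """Return the learning rate in effect at a given epoch under PiecewiseDecay."""
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma  # decay by gamma at each milestone already passed
    return lr

print(piecewise_lr(50))   # base_lr, before the first milestone
print(piecewise_lr(100))  # roughly 5e-05, after epoch 90
print(piecewise_lr(130))  # roughly 5e-06, after epoch 120
```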
diff --git a/PaddleDetection-release-2.6/configs/centernet/centernet_dla34_140e_coco.yml b/PaddleDetection-release-2.6/configs/centernet/centernet_dla34_140e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..e6a66a9b955e1c53512e3d399a40a1e120f8a0d2
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/centernet/centernet_dla34_140e_coco.yml
@@ -0,0 +1,9 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/optimizer_140e.yml',
+ '_base_/centernet_dla34.yml',
+ '_base_/centernet_reader.yml',
+]
+
+weights: output/centernet_dla34_140e_coco/model_final
diff --git a/PaddleDetection-release-2.6/configs/centernet/centernet_mbv1_140e_coco.yml b/PaddleDetection-release-2.6/configs/centernet/centernet_mbv1_140e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..48429a1dd9ced07b4e906304a199a8e2193e235d
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/centernet/centernet_mbv1_140e_coco.yml
@@ -0,0 +1,21 @@
+_BASE_: [
+ 'centernet_r50_140e_coco.yml'
+]
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/MobileNetV1_pretrained.pdparams
+weights: output/centernet_mbv1_140e_coco/model_final
+
+CenterNet:
+ backbone: MobileNet
+ neck: CenterNetDLAFPN
+ head: CenterNetHead
+ post_process: CenterNetPostProcess
+
+MobileNet:
+ scale: 1.
+ with_extra_blocks: false
+ extra_block_filters: []
+ feature_maps: [3, 5, 11, 13]
+
+TrainReader:
+ batch_size: 32
diff --git a/PaddleDetection-release-2.6/configs/centernet/centernet_mbv3_large_140e_coco.yml b/PaddleDetection-release-2.6/configs/centernet/centernet_mbv3_large_140e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..57830a9b5ab3c4124138a1283f964ae62fa2c00e
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/centernet/centernet_mbv3_large_140e_coco.yml
@@ -0,0 +1,22 @@
+_BASE_: [
+ 'centernet_r50_140e_coco.yml'
+]
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/MobileNetV3_large_x1_0_ssld_pretrained.pdparams
+weights: output/centernet_mbv3_large_140e_coco/model_final
+
+CenterNet:
+ backbone: MobileNetV3
+ neck: CenterNetDLAFPN
+ head: CenterNetHead
+ post_process: CenterNetPostProcess
+
+MobileNetV3:
+ model_name: large
+ scale: 1.
+ with_extra_blocks: false
+ extra_block_filters: []
+ feature_maps: [4, 7, 13, 16]
+
+TrainReader:
+ batch_size: 32
diff --git a/PaddleDetection-release-2.6/configs/centernet/centernet_mbv3_small_140e_coco.yml b/PaddleDetection-release-2.6/configs/centernet/centernet_mbv3_small_140e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..de73f1b2f4023ecb9bfee96436403192b7f6d80f
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/centernet/centernet_mbv3_small_140e_coco.yml
@@ -0,0 +1,28 @@
+_BASE_: [
+ 'centernet_r50_140e_coco.yml'
+]
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/MobileNetV3_small_x1_0_ssld_pretrained.pdparams
+weights: output/centernet_mbv3_small_140e_coco/model_final
+
+CenterNet:
+ backbone: MobileNetV3
+ neck: CenterNetDLAFPN
+ head: CenterNetHead
+ post_process: CenterNetPostProcess
+
+MobileNetV3:
+ model_name: small
+ scale: 1.
+ with_extra_blocks: false
+ extra_block_filters: []
+ feature_maps: [4, 9, 12]
+
+CenterNetDLAFPN:
+ first_level: 0
+ last_level: 3
+ down_ratio: 8
+ dcn_v2: False
+
+TrainReader:
+ batch_size: 32
diff --git a/PaddleDetection-release-2.6/configs/centernet/centernet_r50_140e_coco.yml b/PaddleDetection-release-2.6/configs/centernet/centernet_r50_140e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a8b1e98e3373507c8572ee77fab868f2b21bed64
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/centernet/centernet_r50_140e_coco.yml
@@ -0,0 +1,9 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/optimizer_140e.yml',
+ '_base_/centernet_r50.yml',
+ '_base_/centernet_reader.yml',
+]
+
+weights: output/centernet_r50_140e_coco/model_final
diff --git a/PaddleDetection-release-2.6/configs/centernet/centernet_shufflenetv2_140e_coco.yml b/PaddleDetection-release-2.6/configs/centernet/centernet_shufflenetv2_140e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..9ccdae16400c77c0c7f2775db531e4687379f545
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/centernet/centernet_shufflenetv2_140e_coco.yml
@@ -0,0 +1,33 @@
+_BASE_: [
+ 'centernet_r50_140e_coco.yml'
+]
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ShuffleNetV2_x1_0_pretrained.pdparams
+weights: output/centernet_shufflenetv2_140e_coco/model_final
+
+CenterNet:
+ backbone: ShuffleNetV2
+ neck: CenterNetDLAFPN
+ head: CenterNetHead
+ post_process: CenterNetPostProcess
+
+ShuffleNetV2:
+ scale: 1.0
+ feature_maps: [5, 13, 17]
+ act: leaky_relu
+
+CenterNetDLAFPN:
+ first_level: 0
+ last_level: 3
+ down_ratio: 8
+ dcn_v2: False
+
+TrainReader:
+ batch_size: 32
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - WarpAffine: {keep_res: False, input_h: 512, input_w: 512}
+ - NormalizeImage: {mean: [0.40789655, 0.44719303, 0.47026116], std: [0.2886383 , 0.27408165, 0.27809834]}
+ - Permute: {}
diff --git a/PaddleDetection-release-2.6/configs/convnext/README.md b/PaddleDetection-release-2.6/configs/convnext/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..644d66815660427d2a6cdf587c014d8cb877eb15
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/convnext/README.md
@@ -0,0 +1,20 @@
+# ConvNeXt (A ConvNet for the 2020s)
+
+## Model Zoo
+### ConvNeXt on COCO
+
+| Model | Input size | Images/GPU | LR schedule | mAP<sup>val</sup><br>0.5:0.95 | mAP<sup>val</sup><br>0.5 | Params(M) | FLOPs(G) | Download | Config |
+| :------------- | :------- | :-------: | :------: | :------------: | :---------------------: | :----------------: |:---------: | :------: |:---------------: |
+| PP-YOLOE-ConvNeXt-tiny | 640 | 16 | 36e | 44.6 | 63.3 | 33.04 | 13.87 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_convnext_tiny_36e_coco.pdparams) | [config](./ppyoloe_convnext_tiny_36e_coco.yml) |
+| YOLOX-ConvNeXt-s | 640 | 8 | 36e | 44.6 | 65.3 | 36.20 | 27.52 | [model](https://paddledet.bj.bcebos.com/models/yolox_convnext_s_36e_coco.pdparams) | [config](./yolox_convnext_s_36e_coco.yml) |
+
+
+## Citations
+```
+@Article{liu2022convnet,
+ author = {Zhuang Liu and Hanzi Mao and Chao-Yuan Wu and Christoph Feichtenhofer and Trevor Darrell and Saining Xie},
+ title = {A ConvNet for the 2020s},
+ journal = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
+ year = {2022},
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/convnext/ppyoloe_convnext_tiny_36e_coco.yml b/PaddleDetection-release-2.6/configs/convnext/ppyoloe_convnext_tiny_36e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..360a368ec0837033ab408db59aa0d4ea5b7972dd
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/convnext/ppyoloe_convnext_tiny_36e_coco.yml
@@ -0,0 +1,55 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '../ppyoloe/_base_/ppyoloe_crn.yml',
+ '../ppyoloe/_base_/ppyoloe_reader.yml',
+]
+depth_mult: 0.25
+width_mult: 0.50
+
+log_iter: 100
+snapshot_epoch: 5
+weights: output/ppyoloe_convnext_tiny_36e_coco/model_final
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/convnext_tiny_22k_224.pdparams
+
+
+YOLOv3:
+ backbone: ConvNeXt
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+ConvNeXt:
+ arch: 'tiny'
+ drop_path_rate: 0.4
+ layer_scale_init_value: 1.0
+ return_idx: [1, 2, 3]
+
+
+PPYOLOEHead:
+ static_assigner_epoch: 12
+ nms:
+ nms_top_k: 10000
+ keep_top_k: 300
+ score_threshold: 0.01
+ nms_threshold: 0.7
+
+
+TrainReader:
+ batch_size: 16
+
+
+epoch: 36
+LearningRate:
+ base_lr: 0.0002
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [36]
+ use_warmup: false
+
+OptimizerBuilder:
+ regularizer: false
+ optimizer:
+ type: AdamW
+ weight_decay: 0.0005
diff --git a/PaddleDetection-release-2.6/configs/convnext/yolox_convnext_s_36e_coco.yml b/PaddleDetection-release-2.6/configs/convnext/yolox_convnext_s_36e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..b41551dee8a2e2793ac09d474c0e7d2a8868299f
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/convnext/yolox_convnext_s_36e_coco.yml
@@ -0,0 +1,58 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '../yolox/_base_/yolox_cspdarknet.yml',
+ '../yolox/_base_/yolox_reader.yml'
+]
+depth_mult: 0.33
+width_mult: 0.50
+
+log_iter: 100
+snapshot_epoch: 5
+weights: output/yolox_convnext_s_36e_coco/model_final
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/convnext_tiny_22k_224.pdparams
+
+
+YOLOX:
+ backbone: ConvNeXt
+ neck: YOLOCSPPAN
+ head: YOLOXHead
+ size_stride: 32
+ size_range: [15, 25] # multi-scale range [480*480 ~ 800*800]
+
+ConvNeXt:
+ arch: 'tiny'
+ drop_path_rate: 0.4
+ layer_scale_init_value: 1.0
+ return_idx: [1, 2, 3]
+
+
+TrainReader:
+ batch_size: 8
+ mosaic_epoch: 30
+
+
+YOLOXHead:
+ l1_epoch: 30
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 10000
+ keep_top_k: 1000
+ score_threshold: 0.001
+ nms_threshold: 0.65
+
+
+epoch: 36
+LearningRate:
+ base_lr: 0.0002
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [36]
+ use_warmup: false
+
+OptimizerBuilder:
+ regularizer: false
+ optimizer:
+ type: AdamW
+ weight_decay: 0.0005
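The `# multi-scale range [480*480 ~ 800*800]` comment above follows from the two settings next to it: each `size_range` entry is multiplied by `size_stride` to get the training image side length. A one-line sanity check of that arithmetic:

```python
size_stride = 32
size_range = [15, 25]

lo, hi = (s * size_stride for s in size_range)
print(lo, hi)  # 480 800 -- matches the "[480*480 ~ 800*800]" comment
```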
diff --git a/PaddleDetection-release-2.6/configs/datasets/coco_detection.yml b/PaddleDetection-release-2.6/configs/datasets/coco_detection.yml
new file mode 100644
index 0000000000000000000000000000000000000000..176ba271c7910531ef1e6f8ed72572cd2a5d4efa
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/datasets/coco_detection.yml
@@ -0,0 +1,21 @@
+metric: COCO
+num_classes: 80
+
+TrainDataset:
+ name: COCODataSet
+ image_dir: train2017
+ anno_path: annotations/instances_train2017.json
+ dataset_dir: dataset/coco
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ name: COCODataSet
+ image_dir: val2017
+ anno_path: annotations/instances_val2017.json
+ dataset_dir: dataset/coco
+ allow_empty: true
+
+TestDataset:
+ name: ImageFolder
+ anno_path: annotations/instances_val2017.json # also support txt (like VOC's label_list.txt)
+ dataset_dir: dataset/coco # if set, anno_path will be 'dataset_dir/anno_path'
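The trailing comment in `TestDataset` says that when `dataset_dir` is set, the annotation file is resolved as `dataset_dir/anno_path`. A simplified sketch of that lookup, assuming plain path joining as the comment states (the helper name is ours):

```python
import os

def resolve_anno_path(dataset_dir, anno_path):
    # If dataset_dir is set, anno_path is looked up underneath it;
    # otherwise anno_path is used as-is.
    return os.path.join(dataset_dir, anno_path) if dataset_dir else anno_path

print(resolve_anno_path('dataset/coco', 'annotations/instances_val2017.json'))
# dataset/coco/annotations/instances_val2017.json (on POSIX)
```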
diff --git a/PaddleDetection-release-2.6/configs/datasets/coco_instance.yml b/PaddleDetection-release-2.6/configs/datasets/coco_instance.yml
new file mode 100644
index 0000000000000000000000000000000000000000..91c4ab8890e5353becf0deb43d9e0d256a991987
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/datasets/coco_instance.yml
@@ -0,0 +1,20 @@
+metric: COCO
+num_classes: 80
+
+TrainDataset:
+ name: COCODataSet
+ image_dir: train2017
+ anno_path: annotations/instances_train2017.json
+ dataset_dir: dataset/coco
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_poly', 'is_crowd']
+
+EvalDataset:
+ name: COCODataSet
+ image_dir: val2017
+ anno_path: annotations/instances_val2017.json
+ dataset_dir: dataset/coco
+
+TestDataset:
+ name: ImageFolder
+ anno_path: annotations/instances_val2017.json # also support txt (like VOC's label_list.txt)
+ dataset_dir: dataset/coco # if set, anno_path will be 'dataset_dir/anno_path'
diff --git a/PaddleDetection-release-2.6/configs/datasets/dota.yml b/PaddleDetection-release-2.6/configs/datasets/dota.yml
new file mode 100644
index 0000000000000000000000000000000000000000..9dda08400aaac1d914b1858dda32ff0f82717b49
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/datasets/dota.yml
@@ -0,0 +1,21 @@
+metric: RBOX
+num_classes: 15
+
+TrainDataset:
+ !COCODataSet
+ image_dir: trainval1024/images
+ anno_path: trainval1024/DOTA_trainval1024.json
+ dataset_dir: dataset/dota/
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd', 'gt_poly']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: trainval1024/images
+ anno_path: trainval1024/DOTA_trainval1024.json
+ dataset_dir: dataset/dota/
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd', 'gt_poly']
+
+TestDataset:
+ !ImageFolder
+ anno_path: test1024/DOTA_test1024.json
+ dataset_dir: dataset/dota/
diff --git a/PaddleDetection-release-2.6/configs/datasets/dota_ms.yml b/PaddleDetection-release-2.6/configs/datasets/dota_ms.yml
new file mode 100644
index 0000000000000000000000000000000000000000..802e8846d7f443a7032cf49a88bfe79328ea41db
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/datasets/dota_ms.yml
@@ -0,0 +1,21 @@
+metric: RBOX
+num_classes: 15
+
+TrainDataset:
+ !COCODataSet
+ image_dir: trainval1024/images
+ anno_path: trainval1024/DOTA_trainval1024.json
+ dataset_dir: dataset/dota_ms/
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd', 'gt_poly']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: trainval1024/images
+ anno_path: trainval1024/DOTA_trainval1024.json
+ dataset_dir: dataset/dota_ms/
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd', 'gt_poly']
+
+TestDataset:
+ !ImageFolder
+ anno_path: test1024/DOTA_test1024.json
+ dataset_dir: dataset/dota_ms/
diff --git a/PaddleDetection-release-2.6/configs/datasets/mcmot.yml b/PaddleDetection-release-2.6/configs/datasets/mcmot.yml
new file mode 100644
index 0000000000000000000000000000000000000000..5f639a045b0630fcd4fe87fd01ff461be2c5d8a8
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/datasets/mcmot.yml
@@ -0,0 +1,25 @@
+metric: MCMOT
+num_classes: 10
+# using VisDrone2019 MOT dataset with 10 classes as default, you can modify it for your needs.
+
+# for MCMOT training
+TrainDataset:
+ !MCMOTDataSet
+ dataset_dir: dataset/mot
+ image_lists: ['visdrone_mcmot.train']
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
+ label_list: label_list.txt
+
+# for MCMOT evaluation
+# If you want to change the MCMOT evaluation dataset, please modify 'data_root'
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: visdrone_mcmot/images/val
+ keep_ori_im: False # set True if save visualization images or video, or used in DeepSORT
+
+# for MCMOT video inference
+TestMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ keep_ori_im: True # set True if save visualization images or video
diff --git a/PaddleDetection-release-2.6/configs/datasets/mot.yml b/PaddleDetection-release-2.6/configs/datasets/mot.yml
new file mode 100644
index 0000000000000000000000000000000000000000..7107da4905e88847aba29e66346a5c05bc418462
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/datasets/mot.yml
@@ -0,0 +1,23 @@
+metric: MOT
+num_classes: 1
+
+# for MOT training
+TrainDataset:
+ !MOTDataSet
+ dataset_dir: dataset/mot
+ image_lists: ['mot17.train', 'caltech.all', 'cuhksysu.train', 'prw.train', 'citypersons.train', 'eth.train']
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
+
+# for MOT evaluation
+# If you want to change the MOT evaluation dataset, please modify 'data_root'
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: MOT16/images/train
+ keep_ori_im: False # set True if save visualization images or video, or used in DeepSORT
+
+# for MOT video inference
+TestMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ keep_ori_im: True # set True if save visualization images or video
diff --git a/PaddleDetection-release-2.6/configs/datasets/objects365_detection.yml b/PaddleDetection-release-2.6/configs/datasets/objects365_detection.yml
new file mode 100644
index 0000000000000000000000000000000000000000..735ebf96dcea828459428016bad764c8461e8ee8
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/datasets/objects365_detection.yml
@@ -0,0 +1,21 @@
+metric: COCO
+num_classes: 365
+
+TrainDataset:
+ !COCODataSet
+ image_dir: train
+ anno_path: annotations/zhiyuan_objv2_train.json
+ dataset_dir: dataset/objects365
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: val
+ anno_path: annotations/zhiyuan_objv2_val.json
+ dataset_dir: dataset/objects365
+ allow_empty: true
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/zhiyuan_objv2_val.json
+ dataset_dir: dataset/objects365/
diff --git a/PaddleDetection-release-2.6/configs/datasets/roadsign_voc.yml b/PaddleDetection-release-2.6/configs/datasets/roadsign_voc.yml
new file mode 100644
index 0000000000000000000000000000000000000000..9a081611aa8dafef5d5c6f1af1476cc038db5702
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/datasets/roadsign_voc.yml
@@ -0,0 +1,21 @@
+metric: VOC
+map_type: integral
+num_classes: 4
+
+TrainDataset:
+ name: VOCDataSet
+ dataset_dir: dataset/roadsign_voc
+ anno_path: train.txt
+ label_list: label_list.txt
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'difficult']
+
+EvalDataset:
+ name: VOCDataSet
+ dataset_dir: dataset/roadsign_voc
+ anno_path: valid.txt
+ label_list: label_list.txt
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'difficult']
+
+TestDataset:
+ name: ImageFolder
+ anno_path: dataset/roadsign_voc/label_list.txt
diff --git a/PaddleDetection-release-2.6/configs/datasets/sniper_coco_detection.yml b/PaddleDetection-release-2.6/configs/datasets/sniper_coco_detection.yml
new file mode 100644
index 0000000000000000000000000000000000000000..b5cff989f5b58e79836e95efa2070c580e5edc44
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/datasets/sniper_coco_detection.yml
@@ -0,0 +1,47 @@
+metric: SNIPERCOCO
+num_classes: 80
+
+TrainDataset:
+ !SniperCOCODataSet
+ image_dir: train2017
+ anno_path: annotations/instances_train2017.json
+ dataset_dir: dataset/coco
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+ allow_empty: true
+ is_trainset: true
+ image_target_sizes: [2000, 1000]
+ valid_box_ratio_ranges: [[-1, 0.1],[0.08, -1]]
+ chip_target_size: 512
+ chip_target_stride: 200
+ use_neg_chip: false
+ max_neg_num_per_im: 8
+
+
+EvalDataset:
+ !SniperCOCODataSet
+ image_dir: val2017
+ anno_path: annotations/instances_val2017.json
+ dataset_dir: dataset/coco
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+ allow_empty: true
+ is_trainset: false
+ image_target_sizes: [2000, 1000]
+ valid_box_ratio_ranges: [[-1, 0.1], [0.08, -1]]
+ chip_target_size: 512
+ chip_target_stride: 200
+ max_per_img: -1
+ nms_thresh: 0.5
+
+TestDataset:
+ !SniperCOCODataSet
+ image_dir: val2017
+ dataset_dir: dataset/coco
+ is_trainset: false
+ image_target_sizes: [2000, 1000]
+ valid_box_ratio_ranges: [[-1, 0.1],[0.08, -1]]
+ chip_target_size: 500
+ chip_target_stride: 200
+ max_per_img: -1
+ nms_thresh: 0.5
+
+
diff --git a/PaddleDetection-release-2.6/configs/datasets/sniper_visdrone_detection.yml b/PaddleDetection-release-2.6/configs/datasets/sniper_visdrone_detection.yml
new file mode 100644
index 0000000000000000000000000000000000000000..f6c12a9516b71026e3451e10b128aec1fbf96160
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/datasets/sniper_visdrone_detection.yml
@@ -0,0 +1,47 @@
+metric: SNIPERCOCO
+num_classes: 9
+
+TrainDataset:
+ !SniperCOCODataSet
+ image_dir: train
+ anno_path: annotations/train.json
+ dataset_dir: dataset/VisDrone2019_coco
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+ allow_empty: true
+ is_trainset: true
+ image_target_sizes: [8145, 2742]
+ valid_box_ratio_ranges: [[-1, 0.03142857142857144], [0.02333211853008726, -1]]
+ chip_target_size: 1536
+ chip_target_stride: 1184
+ use_neg_chip: false
+ max_neg_num_per_im: 8
+
+
+EvalDataset:
+ !SniperCOCODataSet
+ image_dir: val
+ anno_path: annotations/val.json
+ dataset_dir: dataset/VisDrone2019_coco
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+ allow_empty: true
+ is_trainset: false
+ image_target_sizes: [8145, 2742]
+ valid_box_ratio_ranges: [[-1, 0.03142857142857144], [0.02333211853008726, -1]]
+ chip_target_size: 1536
+ chip_target_stride: 1184
+ max_per_img: -1
+ nms_thresh: 0.5
+
+TestDataset:
+ !SniperCOCODataSet
+ image_dir: val
+ dataset_dir: dataset/VisDrone2019_coco
+ is_trainset: false
+ image_target_sizes: [8145, 2742]
+ valid_box_ratio_ranges: [[-1, 0.03142857142857144], [0.02333211853008726, -1]]
+ chip_target_size: 1536
+ chip_target_stride: 1184
+ max_per_img: -1
+ nms_thresh: 0.5
+
+
diff --git a/PaddleDetection-release-2.6/configs/datasets/spine_coco.yml b/PaddleDetection-release-2.6/configs/datasets/spine_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..2339c26db1fcd55a52c8cc7b7dc2623964b7c97a
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/datasets/spine_coco.yml
@@ -0,0 +1,21 @@
+metric: RBOX
+num_classes: 9
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/train.json
+ dataset_dir: dataset/spine_coco
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd', 'gt_poly']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/valid.json
+ dataset_dir: dataset/spine_coco
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd', 'gt_poly']
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/valid.json
+ dataset_dir: dataset/spine_coco
diff --git a/PaddleDetection-release-2.6/configs/datasets/visdrone_detection.yml b/PaddleDetection-release-2.6/configs/datasets/visdrone_detection.yml
new file mode 100644
index 0000000000000000000000000000000000000000..37feb6e2618ff9d83ce2842a9e581dcfd31efc78
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/datasets/visdrone_detection.yml
@@ -0,0 +1,22 @@
+metric: COCO
+num_classes: 10
+
+TrainDataset:
+ !COCODataSet
+ image_dir: VisDrone2019-DET-train
+ anno_path: train.json
+ dataset_dir: dataset/visdrone
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: VisDrone2019-DET-val
+ anno_path: val.json
+ # image_dir: test_dev
+ # anno_path: test_dev.json
+ dataset_dir: dataset/visdrone
+
+TestDataset:
+ !ImageFolder
+ anno_path: val.json
+ dataset_dir: dataset/visdrone
diff --git a/PaddleDetection-release-2.6/configs/datasets/voc.yml b/PaddleDetection-release-2.6/configs/datasets/voc.yml
new file mode 100644
index 0000000000000000000000000000000000000000..72182bed9d17ca076c94a1872613ce7ad29d36d9
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/datasets/voc.yml
@@ -0,0 +1,21 @@
+metric: VOC
+map_type: 11point
+num_classes: 20
+
+TrainDataset:
+ name: VOCDataSet
+ dataset_dir: dataset/voc
+ anno_path: trainval.txt
+ label_list: label_list.txt
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'difficult']
+
+EvalDataset:
+ name: VOCDataSet
+ dataset_dir: dataset/voc
+ anno_path: test.txt
+ label_list: label_list.txt
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'difficult']
+
+TestDataset:
+ name: ImageFolder
+ anno_path: dataset/voc/label_list.txt
diff --git a/PaddleDetection-release-2.6/configs/datasets/wider_face.yml b/PaddleDetection-release-2.6/configs/datasets/wider_face.yml
new file mode 100644
index 0000000000000000000000000000000000000000..cc01378d728af95ce7072001aefea08ba80631e2
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/datasets/wider_face.yml
@@ -0,0 +1,20 @@
+metric: WiderFace
+num_classes: 1
+
+TrainDataset:
+ !WIDERFaceDataSet
+ dataset_dir: dataset/wider_face
+ anno_path: wider_face_split/wider_face_train_bbx_gt.txt
+ image_dir: WIDER_train/images
+ data_fields: ['image', 'gt_bbox', 'gt_class']
+
+EvalDataset:
+ !WIDERFaceDataSet
+ dataset_dir: dataset/wider_face
+ anno_path: wider_face_split/wider_face_val_bbx_gt.txt
+ image_dir: WIDER_val/images
+ data_fields: ['image']
+
+TestDataset:
+ !ImageFolder
+ use_default_label: true
diff --git a/PaddleDetection-release-2.6/configs/dcn/README.md b/PaddleDetection-release-2.6/configs/dcn/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..90502dcea6f3ae3a452fa3f48d2005801900064f
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/dcn/README.md
@@ -0,0 +1,37 @@
+# Deformable ConvNets v2
+
+| Backbone | Model type | Conv | Images/GPU | LR schedule | Inference time (fps) | Box AP | Mask AP | Download | Config |
+| :------------------- | :------------- | :-----: |:--------: | :-----: | :-----------: |:----: | :-----: | :----------------------------------------------------------: | :----: |
+| ResNet50-FPN | Faster | c3-c5 | 1 | 1x | - | 42.1 | - | [model](https://paddledet.bj.bcebos.com/models/faster_rcnn_dcn_r50_fpn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/dcn/faster_rcnn_dcn_r50_fpn_1x_coco.yml) |
+| ResNet50-vd-FPN | Faster | c3-c5 | 1 | 1x | - | 42.7 | - | [model](https://paddledet.bj.bcebos.com/models/faster_rcnn_dcn_r50_vd_fpn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/dcn/faster_rcnn_dcn_r50_vd_fpn_1x_coco.yml) |
+| ResNet50-vd-FPN | Faster | c3-c5 | 1 | 2x | - | 43.7 | - | [model](https://paddledet.bj.bcebos.com/models/faster_rcnn_dcn_r50_vd_fpn_2x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/dcn/faster_rcnn_dcn_r50_vd_fpn_2x_coco.yml) |
+| ResNet101-vd-FPN | Faster | c3-c5 | 1 | 1x | - | 45.1 | - | [model](https://paddledet.bj.bcebos.com/models/faster_rcnn_dcn_r101_vd_fpn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/dcn/faster_rcnn_dcn_r101_vd_fpn_1x_coco.yml) |
+| ResNeXt101-vd-FPN | Faster | c3-c5 | 1 | 1x | - | 46.5 | - | [model](https://paddledet.bj.bcebos.com/models/faster_rcnn_dcn_x101_vd_64x4d_fpn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/dcn/faster_rcnn_dcn_x101_vd_64x4d_fpn_1x_coco.yml) |
+| ResNet50-FPN | Mask | c3-c5 | 1 | 1x | - | 42.7 | 38.4 | [model](https://paddledet.bj.bcebos.com/models/mask_rcnn_dcn_r50_fpn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/dcn/mask_rcnn_dcn_r50_fpn_1x_coco.yml) |
+| ResNet50-vd-FPN | Mask | c3-c5 | 1 | 2x | - | 44.6 | 39.8 | [model](https://paddledet.bj.bcebos.com/models/mask_rcnn_dcn_r50_vd_fpn_2x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/dcn/mask_rcnn_dcn_r50_vd_fpn_2x_coco.yml) |
+| ResNet101-vd-FPN | Mask | c3-c5 | 1 | 1x | - | 45.6 | 40.6 | [model](https://paddledet.bj.bcebos.com/models/mask_rcnn_dcn_r101_vd_fpn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/dcn/mask_rcnn_dcn_r101_vd_fpn_1x_coco.yml) |
+| ResNeXt101-vd-FPN | Mask | c3-c5 | 1 | 1x | - | 47.3 | 42.0 | [model](https://paddledet.bj.bcebos.com/models/mask_rcnn_dcn_x101_vd_64x4d_fpn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/dcn/mask_rcnn_dcn_x101_vd_64x4d_fpn_1x_coco.yml) |
+| ResNet50-FPN | Cascade Faster | c3-c5 | 1 | 1x | - | 42.1 | - | [model](https://paddledet.bj.bcebos.com/models/cascade_rcnn_dcn_r50_fpn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/dcn/cascade_rcnn_dcn_r50_fpn_1x_coco.yml) |
+| ResNeXt101-vd-FPN | Cascade Faster | c3-c5 | 1 | 1x | - | 48.8 | - | [model](https://paddledet.bj.bcebos.com/models/cascade_rcnn_dcn_x101_vd_64x4d_fpn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/dcn/cascade_rcnn_dcn_x101_vd_64x4d_fpn_1x_coco.yml) |
+
+
+**Notes:**
+
+- Deformable ConvNets v2 (dcn_v2) follows the paper [Deformable ConvNets v2](https://arxiv.org/abs/1811.11168).
+- `c3-c5` means `dcn` is added in stages 3 through 5 of the ResNet backbone.
+
+## Citations
+```
+@inproceedings{dai2017deformable,
+ title={Deformable Convolutional Networks},
+ author={Dai, Jifeng and Qi, Haozhi and Xiong, Yuwen and Li, Yi and Zhang, Guodong and Hu, Han and Wei, Yichen},
+ booktitle={Proceedings of the IEEE international conference on computer vision},
+ year={2017}
+}
+@article{zhu2018deformable,
+ title={Deformable ConvNets v2: More Deformable, Better Results},
+ author={Zhu, Xizhou and Hu, Han and Lin, Stephen and Dai, Jifeng},
+ journal={arXiv preprint arXiv:1811.11168},
+ year={2018}
+}
+```
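For reference, the `c3-c5` setting described in the notes above corresponds to the following backbone fragment in the configs that follow. Stage indices in `dcn_v2_stages` start at 0, where index 0 stands for res2 (as the config comments note), so `[1, 2, 3]` covers res3 through res5:

```yaml
ResNet:
  depth: 50
  return_idx: [0, 1, 2, 3]
  num_stages: 4
  dcn_v2_stages: [1, 2, 3]  # res3, res4, res5 -- the "c3-c5" in the table
```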
diff --git a/PaddleDetection-release-2.6/configs/dcn/cascade_rcnn_dcn_r50_fpn_1x_coco.yml b/PaddleDetection-release-2.6/configs/dcn/cascade_rcnn_dcn_r50_fpn_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..9f2738f3a85175b7cde3e1dac962177cb5852912
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/dcn/cascade_rcnn_dcn_r50_fpn_1x_coco.yml
@@ -0,0 +1,16 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '../cascade_rcnn/_base_/optimizer_1x.yml',
+ '../cascade_rcnn/_base_/cascade_rcnn_r50_fpn.yml',
+ '../cascade_rcnn/_base_/cascade_fpn_reader.yml',
+]
+weights: output/cascade_rcnn_dcn_r50_fpn_1x_coco/model_final
+
+ResNet:
+ depth: 50
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ dcn_v2_stages: [1,2,3]
diff --git a/PaddleDetection-release-2.6/configs/dcn/cascade_rcnn_dcn_x101_vd_64x4d_fpn_1x_coco.yml b/PaddleDetection-release-2.6/configs/dcn/cascade_rcnn_dcn_x101_vd_64x4d_fpn_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..4180919edcbf139f4c61109c084e1cd289caba0e
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/dcn/cascade_rcnn_dcn_x101_vd_64x4d_fpn_1x_coco.yml
@@ -0,0 +1,16 @@
+_BASE_: [
+ 'cascade_rcnn_dcn_r50_fpn_1x_coco.yml',
+]
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNeXt101_vd_64x4d_pretrained.pdparams
+weights: output/cascade_rcnn_dcn_x101_vd_64x4d_fpn_1x_coco/model_final
+
+ResNet:
+ depth: 101
+ groups: 64
+ base_width: 4
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ dcn_v2_stages: [1,2,3]
diff --git a/PaddleDetection-release-2.6/configs/dcn/faster_rcnn_dcn_r101_vd_fpn_1x_coco.yml b/PaddleDetection-release-2.6/configs/dcn/faster_rcnn_dcn_r101_vd_fpn_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..274c1710bb11612798c8368e3edd048b4fddad97
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/dcn/faster_rcnn_dcn_r101_vd_fpn_1x_coco.yml
@@ -0,0 +1,15 @@
+_BASE_: [
+ 'faster_rcnn_dcn_r50_fpn_1x_coco.yml',
+]
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet101_vd_pretrained.pdparams
+weights: output/faster_rcnn_dcn_r101_vd_fpn_1x_coco/model_final
+
+ResNet:
+ # index 0 stands for res2
+ depth: 101
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ dcn_v2_stages: [1,2,3]
diff --git a/PaddleDetection-release-2.6/configs/dcn/faster_rcnn_dcn_r50_fpn_1x_coco.yml b/PaddleDetection-release-2.6/configs/dcn/faster_rcnn_dcn_r50_fpn_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..1cd02ac1f2cedcc7892497752a4d4779dc635718
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/dcn/faster_rcnn_dcn_r50_fpn_1x_coco.yml
@@ -0,0 +1,16 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '../faster_rcnn/_base_/optimizer_1x.yml',
+ '../faster_rcnn/_base_/faster_rcnn_r50_fpn.yml',
+ '../faster_rcnn/_base_/faster_fpn_reader.yml',
+]
+weights: output/faster_rcnn_dcn_r50_fpn_1x_coco/model_final
+
+ResNet:
+ depth: 50
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ dcn_v2_stages: [1,2,3]
diff --git a/PaddleDetection-release-2.6/configs/dcn/faster_rcnn_dcn_r50_vd_fpn_1x_coco.yml b/PaddleDetection-release-2.6/configs/dcn/faster_rcnn_dcn_r50_vd_fpn_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..735edbbd1e8160467761eb1e79406ac4ed89de9b
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/dcn/faster_rcnn_dcn_r50_vd_fpn_1x_coco.yml
@@ -0,0 +1,15 @@
+_BASE_: [
+ 'faster_rcnn_dcn_r50_fpn_1x_coco.yml',
+]
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_pretrained.pdparams
+weights: output/faster_rcnn_dcn_r50_vd_fpn_1x_coco/model_final
+
+ResNet:
+ # index 0 stands for res2
+ depth: 50
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ dcn_v2_stages: [1,2,3]
diff --git a/PaddleDetection-release-2.6/configs/dcn/faster_rcnn_dcn_r50_vd_fpn_2x_coco.yml b/PaddleDetection-release-2.6/configs/dcn/faster_rcnn_dcn_r50_vd_fpn_2x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..685d9671068e76139da5212e3059403626843ccc
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/dcn/faster_rcnn_dcn_r50_vd_fpn_2x_coco.yml
@@ -0,0 +1,26 @@
+_BASE_: [
+ 'faster_rcnn_dcn_r50_fpn_1x_coco.yml',
+]
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_pretrained.pdparams
+weights: output/faster_rcnn_dcn_r50_vd_fpn_2x_coco/model_final
+
+ResNet:
+ # index 0 stands for res2
+ depth: 50
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ dcn_v2_stages: [1,2,3]
+
+epoch: 24
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
diff --git a/PaddleDetection-release-2.6/configs/dcn/faster_rcnn_dcn_x101_vd_64x4d_fpn_1x_coco.yml b/PaddleDetection-release-2.6/configs/dcn/faster_rcnn_dcn_x101_vd_64x4d_fpn_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..68fef482bed4eeaa09faf27c9babece0c57adaed
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/dcn/faster_rcnn_dcn_x101_vd_64x4d_fpn_1x_coco.yml
@@ -0,0 +1,17 @@
+_BASE_: [
+ 'faster_rcnn_dcn_r50_fpn_1x_coco.yml',
+]
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNeXt101_vd_64x4d_pretrained.pdparams
+weights: output/faster_rcnn_dcn_x101_vd_64x4d_fpn_1x_coco/model_final
+
+ResNet:
+ # for ResNeXt: groups, base_width, base_channels
+ depth: 101
+ groups: 64
+ base_width: 4
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ dcn_v2_stages: [1,2,3]
diff --git a/PaddleDetection-release-2.6/configs/dcn/mask_rcnn_dcn_r101_vd_fpn_1x_coco.yml b/PaddleDetection-release-2.6/configs/dcn/mask_rcnn_dcn_r101_vd_fpn_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..930bd89875e360374aed7a970989878cc63c34c0
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/dcn/mask_rcnn_dcn_r101_vd_fpn_1x_coco.yml
@@ -0,0 +1,15 @@
+_BASE_: [
+ 'mask_rcnn_dcn_r50_fpn_1x_coco.yml',
+]
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet101_vd_pretrained.pdparams
+weights: output/mask_rcnn_dcn_r101_vd_fpn_1x_coco/model_final
+
+ResNet:
+ # index 0 stands for res2
+ depth: 101
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ dcn_v2_stages: [1,2,3]
diff --git a/PaddleDetection-release-2.6/configs/dcn/mask_rcnn_dcn_r50_fpn_1x_coco.yml b/PaddleDetection-release-2.6/configs/dcn/mask_rcnn_dcn_r50_fpn_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..b14a1ed1dd2ed3bf8522ebe810be8b0d4d0f80a7
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/dcn/mask_rcnn_dcn_r50_fpn_1x_coco.yml
@@ -0,0 +1,16 @@
+_BASE_: [
+ '../datasets/coco_instance.yml',
+ '../runtime.yml',
+ '../mask_rcnn/_base_/optimizer_1x.yml',
+ '../mask_rcnn/_base_/mask_rcnn_r50_fpn.yml',
+ '../mask_rcnn/_base_/mask_fpn_reader.yml',
+]
+weights: output/mask_rcnn_dcn_r50_fpn_1x_coco/model_final
+
+ResNet:
+ depth: 50
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ dcn_v2_stages: [1,2,3]
diff --git a/PaddleDetection-release-2.6/configs/dcn/mask_rcnn_dcn_r50_vd_fpn_2x_coco.yml b/PaddleDetection-release-2.6/configs/dcn/mask_rcnn_dcn_r50_vd_fpn_2x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d36b5f56f0c790b9eb6aa7e9f7778057a66fc1be
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/dcn/mask_rcnn_dcn_r50_vd_fpn_2x_coco.yml
@@ -0,0 +1,26 @@
+_BASE_: [
+ 'mask_rcnn_dcn_r50_fpn_1x_coco.yml',
+]
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_pretrained.pdparams
+weights: output/mask_rcnn_dcn_r50_vd_fpn_2x_coco/model_final
+
+ResNet:
+ # index 0 stands for res2
+ depth: 50
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ dcn_v2_stages: [1,2,3]
+
+epoch: 24
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
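
The schedule block above combines a `LinearWarmup` over the first 1000 iterations with a `PiecewiseDecay` at epochs 16 and 22. A minimal sketch of the resulting learning rate (the helper name and the exact warmup interpolation are illustrative assumptions, not PaddleDetection API):

```python
def lr_at(step, epoch, base_lr=0.01, gamma=0.1, milestones=(16, 22),
          start_factor=0.1, warmup_steps=1000):
    """Illustrative sketch: LinearWarmup for the first `warmup_steps`
    iterations, then PiecewiseDecay over epochs."""
    if step < warmup_steps:
        # linearly ramp from base_lr * start_factor up to base_lr
        alpha = step / warmup_steps
        return base_lr * (start_factor + (1.0 - start_factor) * alpha)
    # multiply by gamma once per milestone already passed
    passed = sum(1 for m in milestones if epoch >= m)
    return base_lr * (gamma ** passed)
```

With `base_lr: 0.01` and `start_factor: 0.1`, training starts at 0.001, reaches 0.01 after warmup, then drops to 0.001 and 0.0001 at the milestone epochs.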
diff --git a/PaddleDetection-release-2.6/configs/dcn/mask_rcnn_dcn_x101_vd_64x4d_fpn_1x_coco.yml b/PaddleDetection-release-2.6/configs/dcn/mask_rcnn_dcn_x101_vd_64x4d_fpn_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..8e7857c5916ccd0da2177fee64dc662971e8922f
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/dcn/mask_rcnn_dcn_x101_vd_64x4d_fpn_1x_coco.yml
@@ -0,0 +1,17 @@
+_BASE_: [
+ 'mask_rcnn_dcn_r50_fpn_1x_coco.yml',
+]
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNeXt101_vd_64x4d_pretrained.pdparams
+weights: output/mask_rcnn_dcn_x101_vd_64x4d_fpn_1x_coco/model_final
+
+ResNet:
+ # for ResNeXt: groups, base_width, base_channels
+ depth: 101
+ variant: d
+ groups: 64
+ base_width: 4
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ dcn_v2_stages: [1,2,3]
diff --git a/PaddleDetection-release-2.6/configs/deformable_detr/README.md b/PaddleDetection-release-2.6/configs/deformable_detr/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..09118ec5b547e754facf22d0ed5202231dc85174
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/deformable_detr/README.md
@@ -0,0 +1,36 @@
+# Deformable DETR
+
+## Introduction
+
+
+Deformable DETR is an object detection model based on DETR. This directory provides a reproduction of the model described in the paper.
+
+
+## Model Zoo
+
+| Backbone | Model | Images/GPU | Inf time (fps) | Box AP | Config | Download |
+|:------:|:--------:|:--------:|:--------------:|:------:|:------:|:--------:|
+| R-50 | Deformable DETR | 2 | --- | 44.5 | [config](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/configs/deformable_detr/deformable_detr_r50_1x_coco.yml) | [model](https://paddledet.bj.bcebos.com/models/deformable_detr_r50_1x_coco.pdparams) |
+
+**Notes:**
+
+- Deformable DETR is trained on the COCO train2017 dataset and evaluated on val2017 with `mAP(IoU=0.5:0.95)`.
+- Deformable DETR is trained for 50 epochs on 8 GPUs.
+
+Multi-GPU training:
+```bash
+export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
+python -m paddle.distributed.launch --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/deformable_detr/deformable_detr_r50_1x_coco.yml --fleet
+```
+
+## Citations
+```
+@inproceedings{
+zhu2021deformable,
+title={Deformable DETR: Deformable Transformers for End-to-End Object Detection},
+author={Xizhou Zhu and Weijie Su and Lewei Lu and Bin Li and Xiaogang Wang and Jifeng Dai},
+booktitle={International Conference on Learning Representations},
+year={2021},
+url={https://openreview.net/forum?id=gZ9hCDWe6ke}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/deformable_detr/_base_/deformable_detr_r50.yml b/PaddleDetection-release-2.6/configs/deformable_detr/_base_/deformable_detr_r50.yml
new file mode 100644
index 0000000000000000000000000000000000000000..641129a6e519dd234d1a418d702f31bd97e6365a
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/deformable_detr/_base_/deformable_detr_r50.yml
@@ -0,0 +1,48 @@
+architecture: DETR
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vb_normal_pretrained.pdparams
+hidden_dim: 256
+use_focal_loss: True
+
+
+DETR:
+ backbone: ResNet
+ transformer: DeformableTransformer
+ detr_head: DeformableDETRHead
+ post_process: DETRBBoxPostProcess
+
+
+ResNet:
+ # index 0 stands for res2
+ depth: 50
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [1, 2, 3]
+ lr_mult_list: [0.0, 0.1, 0.1, 0.1]
+ num_stages: 4
+
+
+DeformableTransformer:
+ num_queries: 300
+ position_embed_type: sine
+ nhead: 8
+ num_encoder_layers: 6
+ num_decoder_layers: 6
+ dim_feedforward: 1024
+ dropout: 0.1
+ activation: relu
+ num_feature_levels: 4
+ num_encoder_points: 4
+ num_decoder_points: 4
+
+
+DeformableDETRHead:
+ num_mlp_layers: 3
+
+
+DETRLoss:
+ loss_coeff: {class: 2, bbox: 5, giou: 2, mask: 1, dice: 1}
+ aux_loss: True
+
+
+HungarianMatcher:
+ matcher_coeff: {class: 2, bbox: 5, giou: 2}
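
The `HungarianMatcher` coefficients above (`class: 2, bbox: 5, giou: 2`) weight three pairwise costs that are summed into a cost matrix and solved with the Hungarian algorithm. A simplified NumPy sketch, where the GIoU term is approximated by plain `1 - IoU` and all names are illustrative rather than PaddleDetection's actual API:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match(prob, pred_xywh, gt_cls, gt_xywh, w_class=2.0, w_bbox=5.0, w_giou=2.0):
    """Toy Hungarian matcher: returns (prediction, ground-truth) index pairs."""
    def to_xyxy(b):
        cx, cy, w, h = b.T
        return np.stack([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2], axis=1)

    p, g = to_xyxy(pred_xywh), to_xyxy(gt_xywh)
    # pairwise IoU between every prediction and every ground truth
    tl = np.maximum(p[:, None, :2], g[None, :, :2])
    br = np.minimum(p[:, None, 2:], g[None, :, 2:])
    inter = np.clip(br - tl, 0, None).prod(-1)
    area_p = (p[:, 2:] - p[:, :2]).prod(-1)
    area_g = (g[:, 2:] - g[:, :2]).prod(-1)
    iou = inter / (area_p[:, None] + area_g[None] - inter)

    cost = (w_class * (1 - prob[:, gt_cls])                               # classification cost
            + w_bbox * np.abs(pred_xywh[:, None] - gt_xywh[None]).sum(-1)  # L1 box cost
            + w_giou * (1 - iou))                                          # box-overlap cost
    rows, cols = linear_sum_assignment(cost)
    return list(zip(rows, cols))
```

Each ground truth is assigned the prediction that minimizes the total weighted cost, which is what makes DETR-style training work without NMS.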
diff --git a/PaddleDetection-release-2.6/configs/deformable_detr/_base_/deformable_detr_reader.yml b/PaddleDetection-release-2.6/configs/deformable_detr/_base_/deformable_detr_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..c15a0f3b6390fb7627f46f040fbd5054398b0e6b
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/deformable_detr/_base_/deformable_detr_reader.yml
@@ -0,0 +1,48 @@
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomFlip: {prob: 0.5}
+ - RandomSelect: { transforms1: [ RandomShortSideResize: { short_side_sizes: [ 480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800 ], max_size: 1333 } ],
+ transforms2: [
+ RandomShortSideResize: { short_side_sizes: [ 400, 500, 600 ] },
+ RandomSizeCrop: { min_size: 384, max_size: 600 },
+ RandomShortSideResize: { short_side_sizes: [ 480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800 ], max_size: 1333 } ]
+ }
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - NormalizeBox: {}
+ - BboxXYXY2XYWH: {}
+ - Permute: {}
+ batch_transforms:
+ - PadMaskBatch: {pad_to_stride: -1, return_pad_mask: true}
+ batch_size: 2
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+ use_shared_memory: false
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadMaskBatch: {pad_to_stride: -1, return_pad_mask: true}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadMaskBatch: {pad_to_stride: -1, return_pad_mask: true}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
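
The readers above rely on `RandomShortSideResize` with a `max_size` cap. A sketch of the sizing rule under the usual DETR convention (the helper name and rounding are assumptions):

```python
def short_side_resize(h, w, short_side, max_size=1333):
    """Scale so the short side hits `short_side`, then shrink further
    if the long side would exceed `max_size`."""
    scale = short_side / min(h, w)
    if max_size is not None and scale * max(h, w) > max_size:
        scale = max_size / max(h, w)
    return int(round(h * scale)), int(round(w * scale))
```

For example, an 800x1200 image resized to short side 480 becomes 480x720, while a very wide 480x2000 image is limited by `max_size` instead.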
diff --git a/PaddleDetection-release-2.6/configs/deformable_detr/_base_/deformable_optimizer_1x.yml b/PaddleDetection-release-2.6/configs/deformable_detr/_base_/deformable_optimizer_1x.yml
new file mode 100644
index 0000000000000000000000000000000000000000..c068f4de493fabb52fac94d3d55c8b2b04efd850
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/deformable_detr/_base_/deformable_optimizer_1x.yml
@@ -0,0 +1,16 @@
+epoch: 50
+
+LearningRate:
+ base_lr: 0.0002
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [40]
+ use_warmup: false
+
+OptimizerBuilder:
+ clip_grad_by_norm: 0.1
+ regularizer: false
+ optimizer:
+ type: AdamW
+ weight_decay: 0.0001
diff --git a/PaddleDetection-release-2.6/configs/deformable_detr/deformable_detr_r50_1x_coco.yml b/PaddleDetection-release-2.6/configs/deformable_detr/deformable_detr_r50_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..4ca749106d418ad7e188a3ebba33fc0cb2860279
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/deformable_detr/deformable_detr_r50_1x_coco.yml
@@ -0,0 +1,9 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/deformable_optimizer_1x.yml',
+ '_base_/deformable_detr_r50.yml',
+ '_base_/deformable_detr_reader.yml',
+]
+weights: output/deformable_detr_r50_1x_coco/model_final
+find_unused_parameters: True
diff --git a/PaddleDetection-release-2.6/configs/detr/README.md b/PaddleDetection-release-2.6/configs/detr/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..8f661212a950e858161acdf44efa00d4c343f209
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/detr/README.md
@@ -0,0 +1,39 @@
+# DETR
+
+## Introduction
+
+
+DETR is a transformer-based object detection model. This directory provides a reproduction of the model described in the paper.
+
+
+## Model Zoo
+
+| Backbone | Model | Images/GPU | Inf time (fps) | Box AP | Config | Download |
+|:------:|:--------:|:--------:|:--------------:|:------:|:------:|:--------:|
+| R-50 | DETR | 4 | --- | 42.3 | [config](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/configs/detr/detr_r50_1x_coco.yml) | [model](https://paddledet.bj.bcebos.com/models/detr_r50_1x_coco.pdparams) |
+
+**Notes:**
+
+- DETR is trained on the COCO train2017 dataset and evaluated on val2017 with `mAP(IoU=0.5:0.95)`.
+- DETR is trained for 500 epochs on 8 GPUs.
+
+Multi-GPU training:
+```bash
+export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
+python -m paddle.distributed.launch --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/detr/detr_r50_1x_coco.yml --fleet
+```
+
+## Citations
+```
+@inproceedings{detr,
+ author = {Nicolas Carion and
+ Francisco Massa and
+ Gabriel Synnaeve and
+ Nicolas Usunier and
+ Alexander Kirillov and
+ Sergey Zagoruyko},
+ title = {End-to-End Object Detection with Transformers},
+ booktitle = {ECCV},
+ year = {2020}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/detr/_base_/detr_r50.yml b/PaddleDetection-release-2.6/configs/detr/_base_/detr_r50.yml
new file mode 100644
index 0000000000000000000000000000000000000000..5006f11937c9a7c2566913a08144fbb6ee3d0efa
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/detr/_base_/detr_r50.yml
@@ -0,0 +1,44 @@
+architecture: DETR
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vb_normal_pretrained.pdparams
+hidden_dim: 256
+
+
+DETR:
+ backbone: ResNet
+ transformer: DETRTransformer
+ detr_head: DETRHead
+ post_process: DETRBBoxPostProcess
+
+
+ResNet:
+ # index 0 stands for res2
+ depth: 50
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [3]
+ lr_mult_list: [0.0, 0.1, 0.1, 0.1]
+ num_stages: 4
+
+
+DETRTransformer:
+ num_queries: 100
+ position_embed_type: sine
+ nhead: 8
+ num_encoder_layers: 6
+ num_decoder_layers: 6
+ dim_feedforward: 2048
+ dropout: 0.1
+ activation: relu
+
+
+DETRHead:
+ num_mlp_layers: 3
+
+
+DETRLoss:
+ loss_coeff: {class: 1, bbox: 5, giou: 2, no_object: 0.1, mask: 1, dice: 1}
+ aux_loss: True
+
+
+HungarianMatcher:
+ matcher_coeff: {class: 1, bbox: 5, giou: 2}
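
`position_embed_type: sine` in the transformer config above selects a fixed 2-D sinusoidal position embedding: half of `hidden_dim` encodes the y coordinate and half the x coordinate, alternating sin/cos across geometric frequencies. A rough scalar sketch (the exact frequency layout varies between implementations, so treat this as illustrative):

```python
import math

def sine_position_embedding(x, y, hidden_dim=256, temperature=10000.0):
    """Return a hidden_dim-long embedding for one (x, y) position."""
    num_pos_feats = hidden_dim // 2
    emb = []
    for coord in (y, x):  # first half encodes y, second half x
        for i in range(num_pos_feats):
            freq = temperature ** (2 * (i // 2) / num_pos_feats)
            emb.append(math.sin(coord / freq) if i % 2 == 0 else math.cos(coord / freq))
    return emb
```

Because the embedding is a fixed function of position, it needs no training and generalizes to any feature-map size.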
diff --git a/PaddleDetection-release-2.6/configs/detr/_base_/detr_reader.yml b/PaddleDetection-release-2.6/configs/detr/_base_/detr_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..997ef724afcebc3ba648ea3f09858b9950dd0550
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/detr/_base_/detr_reader.yml
@@ -0,0 +1,48 @@
+worker_num: 0
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomFlip: {prob: 0.5}
+ - RandomSelect: { transforms1: [ RandomShortSideResize: { short_side_sizes: [ 480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800 ], max_size: 1333 } ],
+ transforms2: [
+ RandomShortSideResize: { short_side_sizes: [ 400, 500, 600 ] },
+ RandomSizeCrop: { min_size: 384, max_size: 600 },
+ RandomShortSideResize: { short_side_sizes: [ 480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800 ], max_size: 1333 } ]
+ }
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - NormalizeBox: {}
+ - BboxXYXY2XYWH: {}
+ - Permute: {}
+ batch_transforms:
+ - PadMaskBatch: {pad_to_stride: -1, return_pad_mask: true}
+ batch_size: 2
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+ use_shared_memory: false
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadMaskBatch: {pad_to_stride: -1, return_pad_mask: true}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadMaskBatch: {pad_to_stride: -1, return_pad_mask: true}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
diff --git a/PaddleDetection-release-2.6/configs/detr/_base_/optimizer_1x.yml b/PaddleDetection-release-2.6/configs/detr/_base_/optimizer_1x.yml
new file mode 100644
index 0000000000000000000000000000000000000000..13528c5eba5bc81092c7af62e289a4d887c6f15f
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/detr/_base_/optimizer_1x.yml
@@ -0,0 +1,16 @@
+epoch: 500
+
+LearningRate:
+ base_lr: 0.0001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [400]
+ use_warmup: false
+
+OptimizerBuilder:
+ clip_grad_by_norm: 0.1
+ regularizer: false
+ optimizer:
+ type: AdamW
+ weight_decay: 0.0001
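
`clip_grad_by_norm: 0.1` in the optimizer builder above keeps DETR training stable by rescaling gradients whose L2 norm exceeds 0.1. A one-tensor sketch of the rule (whether Paddle applies it per tensor or globally is an implementation detail; this is illustrative):

```python
import math

def clip_by_norm(grad, max_norm=0.1):
    """Rescale a gradient vector so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= max_norm:
        return grad
    scale = max_norm / norm
    return [g * scale for g in grad]
```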
diff --git a/PaddleDetection-release-2.6/configs/detr/detr_r50_1x_coco.yml b/PaddleDetection-release-2.6/configs/detr/detr_r50_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d8838fac01b446168338ba27fcf4c2ae1722f0ef
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/detr/detr_r50_1x_coco.yml
@@ -0,0 +1,9 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/optimizer_1x.yml',
+ '_base_/detr_r50.yml',
+ '_base_/detr_reader.yml',
+]
+weights: output/detr_r50_1x_coco/model_final
+find_unused_parameters: True
diff --git a/PaddleDetection-release-2.6/configs/dino/README.md b/PaddleDetection-release-2.6/configs/dino/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..e7d666f8b0f6e4bf28f37bbea5dfdcf64b68ce97
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/dino/README.md
@@ -0,0 +1,39 @@
+# DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection
+
+## Introduction
+
+
+[DINO](https://arxiv.org/abs/2203.03605) is an object detection model based on DETR. This directory provides a reproduction of the model described in the paper.
+
+
+## Model Zoo
+
+| Backbone | Model | Epochs | Box AP | Config | Download |
+|:------:|:---------------:|:------:|:------:|:---------------------------------------:|:--------------------------------------------------------------------------------:|
+| R-50 | dino_r50_4scale | 12 | 49.1 | [config](./dino_r50_4scale_1x_coco.yml) | [model](https://paddledet.bj.bcebos.com/models/dino_r50_4scale_1x_coco.pdparams) |
+| R-50 | dino_r50_4scale | 24 | 50.5 | [config](./dino_r50_4scale_2x_coco.yml) | [model](https://paddledet.bj.bcebos.com/models/dino_r50_4scale_2x_coco.pdparams) |
+
+**Notes:**
+
+- DINO is trained on the COCO train2017 dataset and evaluated on val2017 with `mAP(IoU=0.5:0.95)`.
+- DINO is trained on 4 GPUs.
+
+Multi-GPU training:
+```bash
+python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/dino/dino_r50_4scale_1x_coco.yml --fleet --eval
+```
+
+## Custom Operator
+- For the multi-scale deformable attention custom operator, see [here](../../ppdet/modeling/transformers/ext_op).
+
+## Citations
+```
+@misc{zhang2022dino,
+ title={DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection},
+ author={Hao Zhang and Feng Li and Shilong Liu and Lei Zhang and Hang Su and Jun Zhu and Lionel M. Ni and Heung-Yeung Shum},
+ year={2022},
+ eprint={2203.03605},
+ archivePrefix={arXiv},
+ primaryClass={cs.CV}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/dino/_base_/dino_r50.yml b/PaddleDetection-release-2.6/configs/dino/_base_/dino_r50.yml
new file mode 100644
index 0000000000000000000000000000000000000000..0b151bd48960fbfbba90962018525d53bd5a8865
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/dino/_base_/dino_r50.yml
@@ -0,0 +1,49 @@
+architecture: DETR
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_cos_pretrained.pdparams
+hidden_dim: 256
+use_focal_loss: True
+
+
+DETR:
+ backbone: ResNet
+ transformer: DINOTransformer
+ detr_head: DINOHead
+ post_process: DETRBBoxPostProcess
+
+ResNet:
+ # index 0 stands for res2
+ depth: 50
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [1, 2, 3]
+ lr_mult_list: [0.0, 0.1, 0.1, 0.1]
+ num_stages: 4
+
+DINOTransformer:
+ num_queries: 900
+ position_embed_type: sine
+ num_levels: 4
+ nhead: 8
+ num_encoder_layers: 6
+ num_decoder_layers: 6
+ dim_feedforward: 2048
+ dropout: 0.0
+ activation: relu
+ pe_temperature: 20
+ pe_offset: 0.0
+ num_denoising: 100
+ label_noise_ratio: 0.5
+ box_noise_scale: 1.0
+ learnt_init_query: True
+
+DINOHead:
+ loss:
+ name: DINOLoss
+ loss_coeff: {class: 1, bbox: 5, giou: 2}
+ aux_loss: True
+ matcher:
+ name: HungarianMatcher
+ matcher_coeff: {class: 2, bbox: 5, giou: 2}
+
+DETRBBoxPostProcess:
+ num_top_queries: 300
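
`num_denoising`, `label_noise_ratio`, and `box_noise_scale` above configure DINO's denoising queries, which are built from ground-truth annotations corrupted with random noise. A toy sketch of the label-noising step (the `* 0.5` factor follows the reference DINO code, but the helper itself is an assumption, not PaddleDetection API):

```python
import random

def noise_labels(gt_labels, num_classes, label_noise_ratio=0.5, rng=None):
    """Replace each GT label with a random class with probability
    label_noise_ratio * 0.5; the model must learn to denoise them."""
    rng = rng or random.Random(0)
    out = []
    for c in gt_labels:
        if rng.random() < label_noise_ratio * 0.5:
            out.append(rng.randrange(num_classes))
        else:
            out.append(c)
    return out
```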
diff --git a/PaddleDetection-release-2.6/configs/dino/_base_/dino_reader.yml b/PaddleDetection-release-2.6/configs/dino/_base_/dino_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..c62a8054cf3593d75f51a85562dfd816ce1c3463
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/dino/_base_/dino_reader.yml
@@ -0,0 +1,48 @@
+worker_num: 4
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomFlip: {prob: 0.5}
+ - RandomSelect: { transforms1: [ RandomShortSideResize: { short_side_sizes: [ 480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800 ], max_size: 1333 } ],
+ transforms2: [
+ RandomShortSideResize: { short_side_sizes: [ 400, 500, 600 ] },
+ RandomSizeCrop: { min_size: 384, max_size: 600 },
+ RandomShortSideResize: { short_side_sizes: [ 480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800 ], max_size: 1333 } ]
+ }
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - NormalizeBox: {}
+ - BboxXYXY2XYWH: {}
+ - Permute: {}
+ batch_transforms:
+ - PadMaskBatch: {pad_to_stride: -1, return_pad_mask: true}
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+ use_shared_memory: false
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadMaskBatch: {pad_to_stride: -1, return_pad_mask: true}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadMaskBatch: {pad_to_stride: -1, return_pad_mask: true}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
diff --git a/PaddleDetection-release-2.6/configs/dino/_base_/optimizer_1x.yml b/PaddleDetection-release-2.6/configs/dino/_base_/optimizer_1x.yml
new file mode 100644
index 0000000000000000000000000000000000000000..63b3a9ed27949559e77696af0b026c49118a1a5c
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/dino/_base_/optimizer_1x.yml
@@ -0,0 +1,16 @@
+epoch: 12
+
+LearningRate:
+ base_lr: 0.0001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [11]
+ use_warmup: false
+
+OptimizerBuilder:
+ clip_grad_by_norm: 0.1
+ regularizer: false
+ optimizer:
+ type: AdamW
+ weight_decay: 0.0001
diff --git a/PaddleDetection-release-2.6/configs/dino/_base_/optimizer_2x.yml b/PaddleDetection-release-2.6/configs/dino/_base_/optimizer_2x.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d009dfd2e0f8e6432b1dfd8888e15876b5cb8f3b
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/dino/_base_/optimizer_2x.yml
@@ -0,0 +1,16 @@
+epoch: 24
+
+LearningRate:
+ base_lr: 0.0001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [20]
+ use_warmup: false
+
+OptimizerBuilder:
+ clip_grad_by_norm: 0.1
+ regularizer: false
+ optimizer:
+ type: AdamW
+ weight_decay: 0.0001
diff --git a/PaddleDetection-release-2.6/configs/dino/dino_r50_4scale_1x_coco.yml b/PaddleDetection-release-2.6/configs/dino/dino_r50_4scale_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..c3f471e14e1554844b93235663e0f2ad4a611bfe
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/dino/dino_r50_4scale_1x_coco.yml
@@ -0,0 +1,11 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/optimizer_1x.yml',
+ '_base_/dino_r50.yml',
+ '_base_/dino_reader.yml',
+]
+
+weights: output/dino_r50_4scale_1x_coco/model_final
+find_unused_parameters: True
+log_iter: 100
diff --git a/PaddleDetection-release-2.6/configs/dino/dino_r50_4scale_2x_coco.yml b/PaddleDetection-release-2.6/configs/dino/dino_r50_4scale_2x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..e8588f55e03aef6e6287bd7653ea3973f5bfedb1
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/dino/dino_r50_4scale_2x_coco.yml
@@ -0,0 +1,11 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/optimizer_2x.yml',
+ '_base_/dino_r50.yml',
+ '_base_/dino_reader.yml',
+]
+
+weights: output/dino_r50_4scale_2x_coco/model_final
+find_unused_parameters: True
+log_iter: 100
diff --git a/PaddleDetection-release-2.6/configs/face_detection/README.md b/PaddleDetection-release-2.6/configs/face_detection/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..badfa494d049d38f4563293c4adea560deea900c
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/face_detection/README.md
@@ -0,0 +1,176 @@
+# 人脸检测模型
+
+## 简介
+`face_detection`中提供高效、高速的人脸检测解决方案,包括最先进的模型和经典模型。
+
+
+
+## 模型库
+
+#### WIDER-FACE数据集上的mAP
+
+| 网络结构 | 输入尺寸 | 图片个数/GPU | 学习率策略 | Easy/Medium/Hard Set | 预测时延(SD855)| 模型大小(MB) | 下载 | 配置文件 |
+|:------------:|:--------:|:----:|:-------:|:-------:|:---------:|:----------:|:---------:|:--------:|
+| BlazeFace | 640 | 8 | 1000e | 0.885 / 0.855 / 0.731 | - | 0.472 |[下载链接](https://paddledet.bj.bcebos.com/models/blazeface_1000e.pdparams) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/face_detection/blazeface_1000e.yml) |
+| BlazeFace-FPN-SSH | 640 | 8 | 1000e | 0.907 / 0.883 / 0.793 | - | 0.479 |[下载链接](https://paddledet.bj.bcebos.com/models/blazeface_fpn_ssh_1000e.pdparams) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/face_detection/blazeface_fpn_ssh_1000e.yml) |
+
+**注意:**
+- 我们使用多尺度评估策略得到`Easy/Medium/Hard Set`里的mAP。具体细节请参考[在WIDER-FACE数据集上评估](#在WIDER-FACE数据集上评估)。
+
+## 快速开始
+
+### 数据准备
+我们使用[WIDER-FACE数据集](http://shuoyang1213.me/WIDERFACE/)进行训练和模型测试,官方网站提供了详细的数据介绍。
+- WIDER-Face数据源:
+使用如下目录结构加载`wider_face`类型的数据集:
+
+ ```
+ dataset/wider_face/
+ ├── wider_face_split
+ │ ├── wider_face_train_bbx_gt.txt
+ │ ├── wider_face_val_bbx_gt.txt
+ ├── WIDER_train
+ │ ├── images
+ │ │ ├── 0--Parade
+ │ │ │ ├── 0_Parade_marchingband_1_100.jpg
+ │ │ │ ├── 0_Parade_marchingband_1_381.jpg
+ │ │ │ │ ...
+ │ │ ├── 10--People_Marching
+ │ │ │ ...
+ ├── WIDER_val
+ │ ├── images
+ │ │ ├── 0--Parade
+ │ │ │ ├── 0_Parade_marchingband_1_1004.jpg
+ │ │ │ ├── 0_Parade_marchingband_1_1045.jpg
+ │ │ │ │ ...
+ │ │ ├── 10--People_Marching
+ │ │ │ ...
+ ```
+
+- 手动下载数据集:
+要下载WIDER-FACE数据集,请运行以下命令:
+```
+cd dataset/wider_face && ./download_wider_face.sh
+```
+
+### 参数配置
+基础模型的配置可以参考`configs/face_detection/_base_/blazeface.yml`;
+改进模型增加FPN和SSH的neck结构,配置文件可以参考`configs/face_detection/_base_/blazeface_fpn.yml`,可以根据需求配置FPN和SSH,具体如下:
+```yaml
+BlazeNet:
+ blaze_filters: [[24, 24], [24, 24], [24, 48, 2], [48, 48], [48, 48]]
+ double_blaze_filters: [[48, 24, 96, 2], [96, 24, 96], [96, 24, 96],
+ [96, 24, 96, 2], [96, 24, 96], [96, 24, 96]]
+ act: hard_swish #配置backbone中BlazeBlock的激活函数,基础模型为relu,增加FPN和SSH时需使用hard_swish
+
+BlazeNeck:
+  neck_type: fpn_ssh #可选only_fpn、only_ssh和fpn_ssh
+ in_channel: [96,96]
+```
+
+
+
+### 训练与评估
+训练流程与评估流程方法与其他算法一致,请参考[GETTING_STARTED_cn.md](../../docs/tutorials/GETTING_STARTED_cn.md)。
+**注意:** 人脸检测模型目前不支持边训练边评估。
+
+#### 在WIDER-FACE数据集上评估
+- 步骤一:评估并生成结果文件:
+```shell
+python -u tools/eval.py -c configs/face_detection/blazeface_1000e.yml \
+ -o weights=output/blazeface_1000e/model_final \
+ multi_scale=True
+```
+设置`multi_scale=True`进行多尺度评估,评估完成后,将在`output/pred`中生成txt格式的测试结果。
+
+- 步骤二:下载官方评估脚本和Ground Truth文件:
+```
+wget http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/support/eval_script/eval_tools.zip
+unzip eval_tools.zip && rm -f eval_tools.zip
+```
+
+- 步骤三:开始评估
+
+方法一:python评估:
+```
+git clone https://github.com/wondervictor/WiderFace-Evaluation.git
+cd WiderFace-Evaluation
+# 编译
+python3 setup.py build_ext --inplace
+# 开始评估
+python3 evaluation.py -p /path/to/PaddleDetection/output/pred -g /path/to/eval_tools/ground_truth
+```
+
+方法二:MatLab评估:
+
+在`eval_tools/wider_eval.m`中修改保存结果路径和绘制曲线的名称:
+```
+pred_dir = './pred';
+legend_name = 'Paddle-BlazeFace';
+```
+
+`wider_eval.m`是评估模块的主要执行程序,运行命令如下:
+```
+matlab -nodesktop -nosplash -nojvm -r "run wider_eval.m;quit;"
+```
+
+### Python脚本预测
+为了支持二次开发,这里提供通过Python脚本使用Paddle Detection whl包来进行预测的示例。
+```python
+import cv2
+import paddle
+import numpy as np
+from ppdet.core.workspace import load_config
+from ppdet.engine import Trainer
+from ppdet.metrics import get_infer_results
+from ppdet.data.transform.operators import NormalizeImage, Permute
+
+
+if __name__ == '__main__':
+ # 准备基础的参数
+ config_path = 'PaddleDetection/configs/face_detection/blazeface_1000e.yml'
+ cfg = load_config(config_path)
+ weight_path = 'PaddleDetection/output/blazeface_1000e.pdparams'
+ infer_img_path = 'PaddleDetection/demo/hrnet_demo.jpg'
+ cfg.weights = weight_path
+ bbox_thre = 0.8
+ paddle.set_device('gpu')
+ # 创建所需的类
+ trainer = Trainer(cfg, mode='test')
+ trainer.load_weights(cfg.weights)
+ trainer.model.eval()
+ normaler = NormalizeImage(mean=[123, 117, 104], std=[127.502231, 127.502231, 127.502231], is_scale=False)
+ permuter = Permute()
+ # 进行图片读取
+ im = cv2.imread(infer_img_path)
+ im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
+ # 准备数据字典
+ data_dict = {'image': im}
+ data_dict = normaler(data_dict)
+ data_dict = permuter(data_dict)
+ h, w, c = im.shape
+ data_dict['im_id'] = paddle.Tensor(np.array([[0]]))
+ data_dict['im_shape'] = paddle.Tensor(np.array([[h, w]], dtype=np.float32))
+ data_dict['scale_factor'] = paddle.Tensor(np.array([[1., 1.]], dtype=np.float32))
+ data_dict['image'] = paddle.Tensor(data_dict['image'].reshape((1, c, h, w)))
+ data_dict['curr_iter'] = paddle.Tensor(np.array([0]))
+ # 进行预测
+ outs = trainer.model(data_dict)
+ # 对预测的数据进行后处理得到最终的bbox信息
+ for key in ['im_shape', 'scale_factor', 'im_id']:
+ outs[key] = data_dict[key]
+ for key, value in outs.items():
+ outs[key] = value.numpy()
+ clsid2catid, catid2name = {0: 'face'}, {0: 0}
+ batch_res = get_infer_results(outs, clsid2catid)
+ bbox = [sub_dict for sub_dict in batch_res['bbox'] if sub_dict['score'] > bbox_thre]
+ print(bbox)
+```
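
上述脚本输出的`bbox`为COCO格式(`[x, y, w, h]`)。下面给出一个将其转换为左上/右下角点坐标并裁剪到图像边界的最小示例(辅助函数为示意,非PaddleDetection提供的API):

```python
def xywh_to_xyxy(det, img_h, img_w):
    """把COCO格式bbox([x, y, w, h])转换为(x1, y1, x2, y2)并裁剪到图像范围内。"""
    x, y, w, h = det['bbox']
    x1, y1 = max(0.0, x), max(0.0, y)
    x2, y2 = min(float(img_w), x + w), min(float(img_h), y + h)
    return x1, y1, x2, y2
```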
+
+## Citations
+
+```
+@article{bazarevsky2019blazeface,
+ title={BlazeFace: Sub-millisecond Neural Face Detection on Mobile GPUs},
+ author={Valentin Bazarevsky and Yury Kartynnik and Andrey Vakunov and Karthik Raveendran and Matthias Grundmann},
+ year={2019},
+ eprint={1907.05047},
+  archivePrefix={arXiv},
+  primaryClass={cs.CV}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/face_detection/README_en.md b/PaddleDetection-release-2.6/configs/face_detection/README_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..96bb53280ef442cbf5c0f12ee5e0cdef3bb57c33
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/face_detection/README_en.md
@@ -0,0 +1,176 @@
+# Face Detection Model
+
+## Introduction
+`face_detection` provides efficient, high-speed face detection solutions, including state-of-the-art models as well as classic ones.
+
+
+
+## Model Library
+
+#### mAP on the WIDER-FACE dataset
+
+| Network structure | Size | Images/GPU | Learning rate schedule | Easy/Medium/Hard Set | Prediction latency (SD855) | Model size (MB) | Download | Configuration file |
+|:------------:|:--------:|:----:|:-------:|:-------:|:---------:|:----------:|:---------:|:--------:|
+| BlazeFace | 640 | 8 | 1000e | 0.885 / 0.855 / 0.731 | - | 0.472 |[link](https://paddledet.bj.bcebos.com/models/blazeface_1000e.pdparams) | [Configuration File](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/face_detection/blazeface_1000e.yml) |
+| BlazeFace-FPN-SSH | 640 | 8 | 1000e | 0.907 / 0.883 / 0.793 | - | 0.479 |[link](https://paddledet.bj.bcebos.com/models/blazeface_fpn_ssh_1000e.pdparams) | [Configuration File](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/face_detection/blazeface_fpn_ssh_1000e.yml) |
+
+**Attention:**
+- We use a multi-scale evaluation strategy to get the mAP in `Easy/Medium/Hard Set`. Please refer to the [evaluation on the WIDER FACE dataset](#Evaluated-on-the-WIDER-FACE-Dataset) for details.
+
+## Quick Start
+
+### Data preparation
+We use the [WIDER-FACE dataset](http://shuoyang1213.me/WIDERFACE/) for training and model testing. The official website provides a detailed introduction to the data.
+- WIDER-Face data source: load a dataset of type `wider_face` with the following directory structure:
+ ```
+ dataset/wider_face/
+ ├── wider_face_split
+ │ ├── wider_face_train_bbx_gt.txt
+ │ ├── wider_face_val_bbx_gt.txt
+ ├── WIDER_train
+ │ ├── images
+ │ │ ├── 0--Parade
+ │ │ │ ├── 0_Parade_marchingband_1_100.jpg
+ │ │ │ ├── 0_Parade_marchingband_1_381.jpg
+ │ │ │ │ ...
+ │ │ ├── 10--People_Marching
+ │ │ │ ...
+ ├── WIDER_val
+ │ ├── images
+ │ │ ├── 0--Parade
+ │ │ │ ├── 0_Parade_marchingband_1_1004.jpg
+ │ │ │ ├── 0_Parade_marchingband_1_1045.jpg
+ │ │ │ │ ...
+ │ │ ├── 10--People_Marching
+ │ │ │ ...
+ ```
+
+- Manually download the dataset: run the following command to download the WIDER-FACE dataset:
+```
+cd dataset/wider_face && ./download_wider_face.sh
+```
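
After downloading, the layout can be sanity-checked with a short script before training. A minimal sketch, assuming the `dataset/wider_face` root shown in the tree above (`check_wider_face` is a hypothetical helper, not part of PaddleDetection):

```python
import os

# Paths that should exist under the wider_face dataset root (see the tree above)
EXPECTED = [
    "wider_face_split/wider_face_train_bbx_gt.txt",
    "wider_face_split/wider_face_val_bbx_gt.txt",
    "WIDER_train/images",
    "WIDER_val/images",
]

def check_wider_face(root):
    """Return the expected paths that are missing under `root`."""
    return [p for p in EXPECTED if not os.path.exists(os.path.join(root, p))]

missing = check_wider_face("dataset/wider_face")
print("missing entries:", missing)
```

An empty list means the reader can resolve every annotation and image directory it expects.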
+
+### Parameter configuration
+The configuration of the base model is in `configs/face_detection/_base_/blazeface.yml`.
+The improved model adds FPN and SSH neck structures; see `configs/face_detection/_base_/blazeface_fpn.yml`. FPN and SSH can be configured as needed:
+```yaml
+BlazeNet:
+ blaze_filters: [[24, 24], [24, 24], [24, 48, 2], [48, 48], [48, 48]]
+ double_blaze_filters: [[48, 24, 96, 2], [96, 24, 96], [96, 24, 96],
+ [96, 24, 96, 2], [96, 24, 96], [96, 24, 96]]
+ act: hard_swish #Blaze block activation in the backbone; the base model uses relu, while hard_swish is required when adding FPN and SSH
+
+BlazeNeck:
+ neck_type: fpn_ssh #options: only_fpn, only_ssh, fpn_ssh
+ in_channel: [96,96]
+```
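
Since `neck_type` silently controls which neck is built, it can help to validate the choice before launching a run. A minimal sketch under the assumption that only the three values listed in the comment above are valid; `blazeneck_config` is a hypothetical helper, not a PaddleDetection API:

```python
# Valid BlazeNeck variants, per the comment in the YAML above
VALID_NECKS = ("only_fpn", "only_ssh", "fpn_ssh")

def blazeneck_config(neck_type="fpn_ssh", in_channel=(96, 96)):
    """Build a BlazeNeck config fragment, rejecting unknown neck types."""
    if neck_type not in VALID_NECKS:
        raise ValueError(f"neck_type must be one of {VALID_NECKS}, got {neck_type!r}")
    return {"BlazeNeck": {"neck_type": neck_type, "in_channel": list(in_channel)}}

print(blazeneck_config("only_ssh"))
```

A typo such as `fpn` then fails fast with a clear message instead of surfacing as a shape error deep in model construction.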
+
+
+
+### Training and Evaluation
+The training and evaluation procedures are the same as for other algorithms; please refer to [GETTING_STARTED_cn.md](../../docs/tutorials/GETTING_STARTED_cn.md).
+**Attention:** Face detection models currently do not support evaluation during training.
+
+#### Evaluated on the WIDER-FACE Dataset
+- Step 1: Evaluate and generate a result file:
+```shell
+python -u tools/eval.py -c configs/face_detection/blazeface_1000e.yml \
+ -o weights=output/blazeface_1000e/model_final \
+ multi_scale=True
+```
+Set `multi_scale=True` for multi-scale evaluation. After evaluation, test results in TXT format will be generated in `output/pred`.
+
+- Step 2: Download the official evaluation script and Ground Truth file:
+```
+wget http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/support/eval_script/eval_tools.zip
+unzip eval_tools.zip && rm -f eval_tools.zip
+```
+
+- Step 3: Start the evaluation
+
+Method 1: Python evaluation:
+```
+git clone https://github.com/wondervictor/WiderFace-Evaluation.git
+cd WiderFace-Evaluation
+# compile
+python3 setup.py build_ext --inplace
+# run the evaluation
+python3 evaluation.py -p /path/to/PaddleDetection/output/pred -g /path/to/eval_tools/ground_truth
+```
+
+Method 2: MatLab evaluation:
+```
+# Set the result save path and the curve legend name in `eval_tools/wider_eval.m`:
+pred_dir = './pred';
+legend_name = 'Paddle-BlazeFace';
+
+# `wider_eval.m` is the main entry of the evaluation module. Run the following command:
+matlab -nodesktop -nosplash -nojvm -r "run wider_eval.m;quit;"
+```
+
+### Use by Python Code
+To support secondary development, here is an example of making predictions through Python code using the PaddleDetection whl package.
+```python
+import cv2
+import paddle
+import numpy as np
+from ppdet.core.workspace import load_config
+from ppdet.engine import Trainer
+from ppdet.metrics import get_infer_results
+from ppdet.data.transform.operators import NormalizeImage, Permute
+
+
+if __name__ == '__main__':
+ # prepare the parameters
+ config_path = 'PaddleDetection/configs/face_detection/blazeface_1000e.yml'
+ cfg = load_config(config_path)
+ weight_path = 'PaddleDetection/output/blazeface_1000e.pdparams'
+ infer_img_path = 'PaddleDetection/demo/hrnet_demo.jpg'
+ cfg.weights = weight_path
+ bbox_thre = 0.8
+ paddle.set_device('gpu')
+ # create the class object
+ trainer = Trainer(cfg, mode='test')
+ trainer.load_weights(cfg.weights)
+ trainer.model.eval()
+ normaler = NormalizeImage(mean=[123, 117, 104], std=[127.502231, 127.502231, 127.502231], is_scale=False)
+ permuter = Permute()
+ # read the image file
+ im = cv2.imread(infer_img_path)
+ im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
+ # prepare the data dict
+ data_dict = {'image': im}
+ data_dict = normaler(data_dict)
+ data_dict = permuter(data_dict)
+ h, w, c = im.shape
+ data_dict['im_id'] = paddle.Tensor(np.array([[0]]))
+ data_dict['im_shape'] = paddle.Tensor(np.array([[h, w]], dtype=np.float32))
+ data_dict['scale_factor'] = paddle.Tensor(np.array([[1., 1.]], dtype=np.float32))
+ data_dict['image'] = paddle.Tensor(data_dict['image'].reshape((1, c, h, w)))
+ data_dict['curr_iter'] = paddle.Tensor(np.array([0]))
+ # do the prediction
+ outs = trainer.model(data_dict)
+ # post-process the outputs to get the final bbox info
+ for key in ['im_shape', 'scale_factor', 'im_id']:
+ outs[key] = data_dict[key]
+ for key, value in outs.items():
+ outs[key] = value.numpy()
+ clsid2catid, catid2name = {0: 0}, {0: 'face'}  # class id -> category id, category id -> name
+ batch_res = get_infer_results(outs, clsid2catid)
+ bbox = [sub_dict for sub_dict in batch_res['bbox'] if sub_dict['score'] > bbox_thre]
+ print(bbox)
+```
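
Each entry in the printed `bbox` list carries a COCO-style `[x, y, width, height]` box, while drawing utilities such as `cv2.rectangle` expect corner points. A minimal conversion sketch (the sample detection below is made up for illustration):

```python
def to_xyxy(bbox_results):
    """Convert COCO-style [x, y, w, h] boxes (as printed above) to
    (xmin, ymin, xmax, ymax) corners alongside their scores."""
    out = []
    for det in bbox_results:
        x, y, w, h = det["bbox"]
        out.append({"xyxy": (x, y, x + w, y + h), "score": det["score"]})
    return out

# A made-up detection in the same format as the printed `bbox` list
sample = [{"image_id": 0, "category_id": 0, "bbox": [10.0, 20.0, 30.0, 40.0], "score": 0.91}]
print(to_xyxy(sample))
```

The resulting corner tuples can be passed directly to a rectangle-drawing call on the original image.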
+
+
+## Citations
+
+```
+@article{bazarevsky2019blazeface,
+ title={BlazeFace: Sub-millisecond Neural Face Detection on Mobile GPUs},
+ author={Valentin Bazarevsky and Yury Kartynnik and Andrey Vakunov and Karthik Raveendran and Matthias Grundmann},
+ year={2019},
+ eprint={1907.05047},
+ archivePrefix={arXiv},
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/face_detection/_base_/blazeface.yml b/PaddleDetection-release-2.6/configs/face_detection/_base_/blazeface.yml
new file mode 100644
index 0000000000000000000000000000000000000000..de54100fe63c1d0dd004c5c1797b6a6587106993
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/face_detection/_base_/blazeface.yml
@@ -0,0 +1,45 @@
+architecture: BlazeFace
+
+BlazeFace:
+ backbone: BlazeNet
+ neck: BlazeNeck
+ blaze_head: FaceHead
+ post_process: BBoxPostProcess
+
+BlazeNet:
+ blaze_filters: [[24, 24], [24, 24], [24, 48, 2], [48, 48], [48, 48]]
+ double_blaze_filters: [[48, 24, 96, 2], [96, 24, 96], [96, 24, 96],
+ [96, 24, 96, 2], [96, 24, 96], [96, 24, 96]]
+ act: relu
+
+BlazeNeck:
+ neck_type: None
+ in_channel: [96,96]
+
+FaceHead:
+ in_channels: [96,96]
+ anchor_generator: AnchorGeneratorSSD
+ loss: SSDLoss
+
+SSDLoss:
+ overlap_threshold: 0.35
+
+AnchorGeneratorSSD:
+ steps: [8., 16.]
+ aspect_ratios: [[1.], [1.]]
+ min_sizes: [[16.,24.], [32., 48., 64., 80., 96., 128.]]
+ max_sizes: [[], []]
+ offset: 0.5
+ flip: False
+ min_max_aspect_ratios_order: false
+
+BBoxPostProcess:
+ decode:
+ name: SSDBox
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 750
+ score_threshold: 0.01
+ nms_threshold: 0.3
+ nms_top_k: 5000
+ nms_eta: 1.0
diff --git a/PaddleDetection-release-2.6/configs/face_detection/_base_/blazeface_fpn.yml b/PaddleDetection-release-2.6/configs/face_detection/_base_/blazeface_fpn.yml
new file mode 100644
index 0000000000000000000000000000000000000000..6572a99d301eda65a65c485e133cc00497a2eee2
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/face_detection/_base_/blazeface_fpn.yml
@@ -0,0 +1,45 @@
+architecture: BlazeFace
+
+BlazeFace:
+ backbone: BlazeNet
+ neck: BlazeNeck
+ blaze_head: FaceHead
+ post_process: BBoxPostProcess
+
+BlazeNet:
+ blaze_filters: [[24, 24], [24, 24], [24, 48, 2], [48, 48], [48, 48]]
+ double_blaze_filters: [[48, 24, 96, 2], [96, 24, 96], [96, 24, 96],
+ [96, 24, 96, 2], [96, 24, 96], [96, 24, 96]]
+ act: hard_swish
+
+BlazeNeck:
+ neck_type: fpn_ssh
+ in_channel: [96,96]
+
+FaceHead:
+ in_channels: [48, 48]
+ anchor_generator: AnchorGeneratorSSD
+ loss: SSDLoss
+
+SSDLoss:
+ overlap_threshold: 0.35
+
+AnchorGeneratorSSD:
+ steps: [8., 16.]
+ aspect_ratios: [[1.], [1.]]
+ min_sizes: [[16.,24.], [32., 48., 64., 80., 96., 128.]]
+ max_sizes: [[], []]
+ offset: 0.5
+ flip: False
+ min_max_aspect_ratios_order: false
+
+BBoxPostProcess:
+ decode:
+ name: SSDBox
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 750
+ score_threshold: 0.01
+ nms_threshold: 0.3
+ nms_top_k: 5000
+ nms_eta: 1.0
diff --git a/PaddleDetection-release-2.6/configs/face_detection/_base_/face_reader.yml b/PaddleDetection-release-2.6/configs/face_detection/_base_/face_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..5a25e8aa0f1acdd1b3b235a8c1a3923eb2af4ba6
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/face_detection/_base_/face_reader.yml
@@ -0,0 +1,44 @@
+worker_num: 2
+TrainReader:
+ inputs_def:
+ num_max_boxes: 90
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {brightness: [0.5, 1.125, 0.875], random_apply: False}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomFlip: {}
+ - CropWithDataAchorSampling: {
+ anchor_sampler: [[1, 10, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.2, 0.0]],
+ batch_sampler: [
+ [1, 50, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0],
+ [1, 50, 0.3, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0],
+ [1, 50, 0.3, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0],
+ [1, 50, 0.3, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0],
+ [1, 50, 0.3, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0],
+ ],
+ target_size: 640}
+ - Resize: {target_size: [640, 640], keep_ratio: False, interp: 1}
+ - NormalizeBox: {}
+ - PadBox: {num_max_boxes: 90}
+ batch_transforms:
+ - NormalizeImage: {mean: [123, 117, 104], std: [127.502231, 127.502231, 127.502231], is_scale: false}
+ - Permute: {}
+ batch_size: 8
+ shuffle: true
+ drop_last: true
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - NormalizeImage: {mean: [123, 117, 104], std: [127.502231, 127.502231, 127.502231], is_scale: false}
+ - Permute: {}
+ batch_size: 1
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - NormalizeImage: {mean: [123, 117, 104], std: [127.502231, 127.502231, 127.502231], is_scale: false}
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/face_detection/_base_/optimizer_1000e.yml b/PaddleDetection-release-2.6/configs/face_detection/_base_/optimizer_1000e.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d67da4c6786e9418029b3a336c7e8e7d80e2d0bf
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/face_detection/_base_/optimizer_1000e.yml
@@ -0,0 +1,21 @@
+epoch: 1000
+
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 333
+ - 800
+ - !LinearWarmup
+ start_factor: 0.3333333333333333
+ steps: 500
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.0
+ type: RMSProp
+ regularizer:
+ factor: 0.0005
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/face_detection/blazeface_1000e.yml b/PaddleDetection-release-2.6/configs/face_detection/blazeface_1000e.yml
new file mode 100644
index 0000000000000000000000000000000000000000..58fc908f81fc53c3ea3b39714826f2de5ea0fcea
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/face_detection/blazeface_1000e.yml
@@ -0,0 +1,9 @@
+_BASE_: [
+ '../datasets/wider_face.yml',
+ '../runtime.yml',
+ '_base_/optimizer_1000e.yml',
+ '_base_/blazeface.yml',
+ '_base_/face_reader.yml',
+]
+weights: output/blazeface_1000e/model_final
+multi_scale_eval: True
diff --git a/PaddleDetection-release-2.6/configs/face_detection/blazeface_fpn_ssh_1000e.yml b/PaddleDetection-release-2.6/configs/face_detection/blazeface_fpn_ssh_1000e.yml
new file mode 100644
index 0000000000000000000000000000000000000000..21dbd26443856710a5674f8e93e1cc0075836a38
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/face_detection/blazeface_fpn_ssh_1000e.yml
@@ -0,0 +1,9 @@
+_BASE_: [
+ '../datasets/wider_face.yml',
+ '../runtime.yml',
+ '_base_/optimizer_1000e.yml',
+ '_base_/blazeface_fpn.yml',
+ '_base_/face_reader.yml',
+]
+weights: output/blazeface_fpn_ssh_1000e/model_final
+multi_scale_eval: True
diff --git a/PaddleDetection-release-2.6/configs/faster_rcnn/README.md b/PaddleDetection-release-2.6/configs/faster_rcnn/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..da495599ce180b80ce019ff1828ae63c1140a7ff
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/faster_rcnn/README.md
@@ -0,0 +1,38 @@
+# Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
+
+## Model Zoo
+
+| Backbone | Network type | Images/GPU | Learning rate schedule | Inference time (fps) | Box AP | Download | Configuration file |
+| :------------------- | :------------- | :-----: | :-----: | :------------: | :-----: | :-----------------------------------------------------: | :-----: |
+| ResNet50 | Faster | 1 | 1x | ---- | 36.7 | [Download link](https://paddledet.bj.bcebos.com/models/faster_rcnn_r50_1x_coco.pdparams) | [Configuration file](./faster_rcnn_r50_1x_coco.yml) |
+| ResNet50-vd | Faster | 1 | 1x | ---- | 37.6 | [Download link](https://paddledet.bj.bcebos.com/models/faster_rcnn_r50_vd_1x_coco.pdparams) | [Configuration file](./faster_rcnn_r50_vd_1x_coco.yml) |
+| ResNet101 | Faster | 1 | 1x | ---- | 39.0 | [Download link](https://paddledet.bj.bcebos.com/models/faster_rcnn_r101_1x_coco.pdparams) | [Configuration file](./faster_rcnn_r101_1x_coco.yml) |
+| ResNet34-FPN | Faster | 1 | 1x | ---- | 37.8 | [Download link](https://paddledet.bj.bcebos.com/models/faster_rcnn_r34_fpn_1x_coco.pdparams) | [Configuration file](./faster_rcnn_r34_fpn_1x_coco.yml) |
+| ResNet34-FPN-MultiScaleTest | Faster | 1 | 1x | ---- | 38.2 | [Download link](https://paddledet.bj.bcebos.com/models/faster_rcnn_r34_fpn_multiscaletest_1x_coco.pdparams) | [Configuration file](./faster_rcnn_r34_fpn_multiscaletest_1x_coco.yml) |
+| ResNet34-vd-FPN | Faster | 1 | 1x | ---- | 38.5 | [Download link](https://paddledet.bj.bcebos.com/models/faster_rcnn_r34_vd_fpn_1x_coco.pdparams) | [Configuration file](./faster_rcnn_r34_vd_fpn_1x_coco.yml) |
+| ResNet50-FPN | Faster | 1 | 1x | ---- | 38.4 | [Download link](https://paddledet.bj.bcebos.com/models/faster_rcnn_r50_fpn_1x_coco.pdparams) | [Configuration file](./faster_rcnn_r50_fpn_1x_coco.yml) |
+| ResNet50-FPN | Faster | 1 | 2x | ---- | 40.0 | [Download link](https://paddledet.bj.bcebos.com/models/faster_rcnn_r50_fpn_2x_coco.pdparams) | [Configuration file](./faster_rcnn_r50_fpn_2x_coco.yml) |
+| ResNet50-vd-FPN | Faster | 1 | 1x | ---- | 39.5 | [Download link](https://paddledet.bj.bcebos.com/models/faster_rcnn_r50_vd_fpn_1x_coco.pdparams) | [Configuration file](./faster_rcnn_r50_vd_fpn_1x_coco.yml) |
+| ResNet50-vd-FPN | Faster | 1 | 2x | ---- | 40.8 | [Download link](https://paddledet.bj.bcebos.com/models/faster_rcnn_r50_vd_fpn_2x_coco.pdparams) | [Configuration file](./faster_rcnn_r50_vd_fpn_2x_coco.yml) |
+| ResNet101-FPN | Faster | 1 | 2x | ---- | 41.4 | [Download link](https://paddledet.bj.bcebos.com/models/faster_rcnn_r101_fpn_2x_coco.pdparams) | [Configuration file](./faster_rcnn_r101_fpn_2x_coco.yml) |
+| ResNet101-vd-FPN | Faster | 1 | 1x | ---- | 42.0 | [Download link](https://paddledet.bj.bcebos.com/models/faster_rcnn_r101_vd_fpn_1x_coco.pdparams) | [Configuration file](./faster_rcnn_r101_vd_fpn_1x_coco.yml) |
+| ResNet101-vd-FPN | Faster | 1 | 2x | ---- | 43.0 | [Download link](https://paddledet.bj.bcebos.com/models/faster_rcnn_r101_vd_fpn_2x_coco.pdparams) | [Configuration file](./faster_rcnn_r101_vd_fpn_2x_coco.yml) |
+| ResNeXt101-vd-FPN | Faster | 1 | 1x | ---- | 43.4 | [Download link](https://paddledet.bj.bcebos.com/models/faster_rcnn_x101_vd_64x4d_fpn_1x_coco.pdparams) | [Configuration file](./faster_rcnn_x101_vd_64x4d_fpn_1x_coco.yml) |
+| ResNeXt101-vd-FPN | Faster | 1 | 2x | ---- | 44.0 | [Download link](https://paddledet.bj.bcebos.com/models/faster_rcnn_x101_vd_64x4d_fpn_2x_coco.pdparams) | [Configuration file](./faster_rcnn_x101_vd_64x4d_fpn_2x_coco.yml) |
+| ResNet50-vd-SSLDv2-FPN | Faster | 1 | 1x | ---- | 41.4 | [Download link](https://paddledet.bj.bcebos.com/models/faster_rcnn_r50_vd_fpn_ssld_1x_coco.pdparams) | [Configuration file](./faster_rcnn_r50_vd_fpn_ssld_1x_coco.yml) |
+| ResNet50-vd-SSLDv2-FPN | Faster | 1 | 2x | ---- | 42.3 | [Download link](https://paddledet.bj.bcebos.com/models/faster_rcnn_r50_vd_fpn_ssld_2x_coco.pdparams) | [Configuration file](./faster_rcnn_r50_vd_fpn_ssld_2x_coco.yml) |
+| Swin-Tiny-FPN | Faster | 2 | 1x | ---- | 42.6 | [Download link](https://paddledet.bj.bcebos.com/models/faster_rcnn_swin_tiny_fpn_1x_coco.pdparams) | [Configuration file](./faster_rcnn_swin_tiny_fpn_1x_coco.yml) |
+| Swin-Tiny-FPN | Faster | 2 | 2x | ---- | 44.8 | [Download link](https://paddledet.bj.bcebos.com/models/faster_rcnn_swin_tiny_fpn_2x_coco.pdparams) | [Configuration file](./faster_rcnn_swin_tiny_fpn_2x_coco.yml) |
+| Swin-Tiny-FPN | Faster | 2 | 3x | ---- | 45.3 | [Download link](https://paddledet.bj.bcebos.com/models/faster_rcnn_swin_tiny_fpn_3x_coco.pdparams) | [Configuration file](./faster_rcnn_swin_tiny_fpn_3x_coco.yml) |
+
+## Citations
+```
+@article{Ren_2017,
+ title={Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks},
+ journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
+ publisher={Institute of Electrical and Electronics Engineers (IEEE)},
+ author={Ren, Shaoqing and He, Kaiming and Girshick, Ross and Sun, Jian},
+ year={2017},
+ month={Jun},
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/faster_rcnn/_base_/faster_fpn_reader.yml b/PaddleDetection-release-2.6/configs/faster_rcnn/_base_/faster_fpn_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..9b9abccd63e499bfa9402f3038425470e4a6e953
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/faster_rcnn/_base_/faster_fpn_reader.yml
@@ -0,0 +1,40 @@
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], interp: 2, keep_ratio: True}
+ - RandomFlip: {prob: 0.5}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
diff --git a/PaddleDetection-release-2.6/configs/faster_rcnn/_base_/faster_rcnn_r50.yml b/PaddleDetection-release-2.6/configs/faster_rcnn/_base_/faster_rcnn_r50.yml
new file mode 100644
index 0000000000000000000000000000000000000000..fd29f5ea1a1df9e2599d3efcff344c5d3363945e
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/faster_rcnn/_base_/faster_rcnn_r50.yml
@@ -0,0 +1,66 @@
+architecture: FasterRCNN
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_cos_pretrained.pdparams
+
+FasterRCNN:
+ backbone: ResNet
+ rpn_head: RPNHead
+ bbox_head: BBoxHead
+ # post process
+ bbox_post_process: BBoxPostProcess
+
+
+ResNet:
+ # index 0 stands for res2
+ depth: 50
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [2]
+ num_stages: 3
+
+RPNHead:
+ anchor_generator:
+ aspect_ratios: [0.5, 1.0, 2.0]
+ anchor_sizes: [32, 64, 128, 256, 512]
+ strides: [16]
+ rpn_target_assign:
+ batch_size_per_im: 256
+ fg_fraction: 0.5
+ negative_overlap: 0.3
+ positive_overlap: 0.7
+ use_random: True
+ train_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 12000
+ post_nms_top_n: 2000
+ topk_after_collect: False
+ test_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 6000
+ post_nms_top_n: 1000
+
+
+BBoxHead:
+ head: Res5Head
+ roi_extractor:
+ resolution: 14
+ sampling_ratio: 0
+ aligned: True
+ bbox_assigner: BBoxAssigner
+ with_pool: true
+
+BBoxAssigner:
+ batch_size_per_im: 512
+ bg_thresh: 0.5
+ fg_thresh: 0.5
+ fg_fraction: 0.25
+ use_random: True
+
+BBoxPostProcess:
+ decode: RCNNBox
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.05
+ nms_threshold: 0.5
diff --git a/PaddleDetection-release-2.6/configs/faster_rcnn/_base_/faster_rcnn_r50_fpn.yml b/PaddleDetection-release-2.6/configs/faster_rcnn/_base_/faster_rcnn_r50_fpn.yml
new file mode 100644
index 0000000000000000000000000000000000000000..38ee81def0cb528f3f67e8ed616b9589bd72de9e
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/faster_rcnn/_base_/faster_rcnn_r50_fpn.yml
@@ -0,0 +1,73 @@
+architecture: FasterRCNN
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_cos_pretrained.pdparams
+
+FasterRCNN:
+ backbone: ResNet
+ neck: FPN
+ rpn_head: RPNHead
+ bbox_head: BBoxHead
+ # post process
+ bbox_post_process: BBoxPostProcess
+
+
+ResNet:
+ # index 0 stands for res2
+ depth: 50
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+
+FPN:
+ out_channel: 256
+
+RPNHead:
+ anchor_generator:
+ aspect_ratios: [0.5, 1.0, 2.0]
+ anchor_sizes: [[32], [64], [128], [256], [512]]
+ strides: [4, 8, 16, 32, 64]
+ rpn_target_assign:
+ batch_size_per_im: 256
+ fg_fraction: 0.5
+ negative_overlap: 0.3
+ positive_overlap: 0.7
+ use_random: True
+ train_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 2000
+ post_nms_top_n: 1000
+ topk_after_collect: True
+ test_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 1000
+ post_nms_top_n: 1000
+
+
+BBoxHead:
+ head: TwoFCHead
+ roi_extractor:
+ resolution: 7
+ sampling_ratio: 0
+ aligned: True
+ bbox_assigner: BBoxAssigner
+
+BBoxAssigner:
+ batch_size_per_im: 512
+ bg_thresh: 0.5
+ fg_thresh: 0.5
+ fg_fraction: 0.25
+ use_random: True
+
+TwoFCHead:
+ out_channel: 1024
+
+
+BBoxPostProcess:
+ decode: RCNNBox
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.05
+ nms_threshold: 0.5
diff --git a/PaddleDetection-release-2.6/configs/faster_rcnn/_base_/faster_rcnn_swin_reader.yml b/PaddleDetection-release-2.6/configs/faster_rcnn/_base_/faster_rcnn_swin_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..e1165cd0a03fd07f41eaea2701526639010cc7e9
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/faster_rcnn/_base_/faster_rcnn_swin_reader.yml
@@ -0,0 +1,41 @@
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResizeCrop: {resizes: [400, 500, 600], cropsizes: [[384, 600], ], prob: 0.5}
+ - RandomResize: {target_size: [[480, 1333], [512, 1333], [544, 1333], [576, 1333], [608, 1333], [640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], keep_ratio: True, interp: 2}
+ - RandomFlip: {prob: 0.5}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 2
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+
+TestReader:
+ inputs_def:
+ image_shape: [-1, 3, 640, 640]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: 640, keep_ratio: True}
+ - Pad: {size: 640}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
diff --git a/PaddleDetection-release-2.6/configs/faster_rcnn/_base_/faster_rcnn_swin_tiny_fpn.yml b/PaddleDetection-release-2.6/configs/faster_rcnn/_base_/faster_rcnn_swin_tiny_fpn.yml
new file mode 100644
index 0000000000000000000000000000000000000000..6208600e324a2be5ae2a16f799d58d315dbe1692
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/faster_rcnn/_base_/faster_rcnn_swin_tiny_fpn.yml
@@ -0,0 +1,72 @@
+architecture: FasterRCNN
+
+FasterRCNN:
+ backbone: SwinTransformer
+ neck: FPN
+ rpn_head: RPNHead
+ bbox_head: BBoxHead
+ bbox_post_process: BBoxPostProcess
+
+SwinTransformer:
+ embed_dim: 96
+ depths: [2, 2, 6, 2]
+ num_heads: [3, 6, 12, 24]
+ window_size: 7
+ ape: false
+ drop_path_rate: 0.1
+ patch_norm: true
+ out_indices: [0,1,2,3]
+ pretrained: https://paddledet.bj.bcebos.com/models/pretrained/swin_tiny_patch4_window7_224.pdparams
+
+FPN:
+ out_channel: 256
+
+RPNHead:
+ anchor_generator:
+ aspect_ratios: [0.5, 1.0, 2.0]
+ anchor_sizes: [[32], [64], [128], [256], [512]]
+ strides: [4, 8, 16, 32, 64]
+ rpn_target_assign:
+ batch_size_per_im: 256
+ fg_fraction: 0.5
+ negative_overlap: 0.3
+ positive_overlap: 0.7
+ use_random: True
+ train_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 2000
+ post_nms_top_n: 1000
+ topk_after_collect: True
+ test_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 1000
+ post_nms_top_n: 1000
+
+
+BBoxHead:
+ head: TwoFCHead
+ roi_extractor:
+ resolution: 7
+ sampling_ratio: 0
+ aligned: True
+ bbox_assigner: BBoxAssigner
+
+BBoxAssigner:
+ batch_size_per_im: 512
+ bg_thresh: 0.5
+ fg_thresh: 0.5
+ fg_fraction: 0.25
+ use_random: True
+
+TwoFCHead:
+ out_channel: 1024
+
+BBoxPostProcess:
+ decode: RCNNBox
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.05
+ nms_threshold: 0.5
diff --git a/PaddleDetection-release-2.6/configs/faster_rcnn/_base_/faster_reader.yml b/PaddleDetection-release-2.6/configs/faster_rcnn/_base_/faster_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..e1c1bb6bc262e86ea69ae78919064aa2b6834311
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/faster_rcnn/_base_/faster_reader.yml
@@ -0,0 +1,40 @@
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], interp: 2, keep_ratio: True}
+ - RandomFlip: {prob: 0.5}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 1
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
diff --git a/PaddleDetection-release-2.6/configs/faster_rcnn/_base_/optimizer_1x.yml b/PaddleDetection-release-2.6/configs/faster_rcnn/_base_/optimizer_1x.yml
new file mode 100644
index 0000000000000000000000000000000000000000..4caaa63bda15917137a9ac22b736ae83c3d04856
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/faster_rcnn/_base_/optimizer_1x.yml
@@ -0,0 +1,19 @@
+epoch: 12
+
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [8, 11]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/faster_rcnn/_base_/optimizer_swin_1x.yml b/PaddleDetection-release-2.6/configs/faster_rcnn/_base_/optimizer_swin_1x.yml
new file mode 100644
index 0000000000000000000000000000000000000000..5c1c6679940834f8ff3bb985bb44f6dc2f281428
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/faster_rcnn/_base_/optimizer_swin_1x.yml
@@ -0,0 +1,22 @@
+epoch: 12
+
+LearningRate:
+ base_lr: 0.0001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [8, 11]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
+
+OptimizerBuilder:
+ clip_grad_by_norm: 1.0
+ optimizer:
+ type: AdamW
+ weight_decay: 0.05
+
+ param_groups:
+ -
+ params: ['absolute_pos_embed', 'relative_position_bias_table', 'norm']
+ weight_decay: 0.
diff --git a/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r101_1x_coco.yml b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r101_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..8876426fb6d6d4f5b89c39e050f1331520d02656
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r101_1x_coco.yml
@@ -0,0 +1,14 @@
+_BASE_: [
+ 'faster_rcnn_r50_1x_coco.yml',
+]
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet101_pretrained.pdparams
+weights: output/faster_rcnn_r101_1x_coco/model_final
+
+ResNet:
+ # index 0 stands for res2
+ depth: 101
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [2]
+ num_stages: 3
diff --git a/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r101_fpn_1x_coco.yml b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r101_fpn_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a2e5ee527b60e95b121959492dba1855337467c9
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r101_fpn_1x_coco.yml
@@ -0,0 +1,14 @@
+_BASE_: [
+ 'faster_rcnn_r50_fpn_1x_coco.yml',
+]
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet101_pretrained.pdparams
+weights: output/faster_rcnn_r101_fpn_1x_coco/model_final
+
+ResNet:
+ # index 0 stands for res2
+ depth: 101
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
diff --git a/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r101_fpn_2x_coco.yml b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r101_fpn_2x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..0a07dec75890977c9d717cce4a704dad59cec237
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r101_fpn_2x_coco.yml
@@ -0,0 +1,25 @@
+_BASE_: [
+ 'faster_rcnn_r50_fpn_1x_coco.yml',
+]
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet101_pretrained.pdparams
+weights: output/faster_rcnn_r101_fpn_2x_coco/model_final
+
+ResNet:
+ # index 0 stands for res2
+ depth: 101
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+
+epoch: 24
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
diff --git a/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r101_vd_fpn_1x_coco.yml b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r101_vd_fpn_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..32e308b86ef9937601d29c9026dafd7650d86080
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r101_vd_fpn_1x_coco.yml
@@ -0,0 +1,14 @@
+_BASE_: [
+ 'faster_rcnn_r50_fpn_1x_coco.yml',
+]
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet101_vd_pretrained.pdparams
+weights: output/faster_rcnn_r101_vd_fpn_1x_coco/model_final
+
+ResNet:
+ # index 0 stands for res2
+ depth: 101
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
diff --git a/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r101_vd_fpn_2x_coco.yml b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r101_vd_fpn_2x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..65b8226b9ec8f493fcdb6e82e5f6f9bba903cecf
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r101_vd_fpn_2x_coco.yml
@@ -0,0 +1,25 @@
+_BASE_: [
+ 'faster_rcnn_r50_fpn_1x_coco.yml',
+]
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet101_vd_pretrained.pdparams
+weights: output/faster_rcnn_r101_vd_fpn_2x_coco/model_final
+
+ResNet:
+ # index 0 stands for res2
+ depth: 101
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+
+epoch: 24
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
diff --git a/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r34_fpn_1x_coco.yml b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r34_fpn_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..f1083528578c8ea681f4a550c6726fad31214d16
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r34_fpn_1x_coco.yml
@@ -0,0 +1,14 @@
+_BASE_: [
+ 'faster_rcnn_r50_fpn_1x_coco.yml',
+]
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet34_pretrained.pdparams
+weights: output/faster_rcnn_r34_fpn_1x_coco/model_final
+
+ResNet:
+ # index 0 stands for res2
+ depth: 34
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
diff --git a/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r34_fpn_multiscaletest_1x_coco.yml b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r34_fpn_multiscaletest_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..559d5f1fe9fdcbf42189383a69f9d1a056792cda
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r34_fpn_multiscaletest_1x_coco.yml
@@ -0,0 +1,22 @@
+_BASE_: [
+ 'faster_rcnn_r34_fpn_1x_coco.yml',
+]
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet34_pretrained.pdparams
+weights: output/faster_rcnn_r34_fpn_multiscaletest_1x_coco/model_final
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+# - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - MultiscaleTestResize: {origin_target_size: [800, 1333], target_size: [700, 900], use_flip: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+# - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - MultiscaleTestResize: {origin_target_size: [800, 1333], target_size: [700, 900], use_flip: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
\ No newline at end of file
diff --git a/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r34_vd_fpn_1x_coco.yml b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r34_vd_fpn_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..5cf576b6384ec25bb92fd40705cac8b6196ca793
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r34_vd_fpn_1x_coco.yml
@@ -0,0 +1,15 @@
+_BASE_: [
+ 'faster_rcnn_r50_fpn_1x_coco.yml',
+]
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet34_vd_pretrained.pdparams
+weights: output/faster_rcnn_r34_vd_fpn_1x_coco/model_final
+
+ResNet:
+ # index 0 stands for res2
+ depth: 34
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
diff --git a/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r50_1x_coco.yml b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r50_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a49bde88a8e90cfd55264262d3475b18954d1bd4
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r50_1x_coco.yml
@@ -0,0 +1,8 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/optimizer_1x.yml',
+ '_base_/faster_rcnn_r50.yml',
+ '_base_/faster_reader.yml',
+]
+weights: output/faster_rcnn_r50_1x_coco/model_final
diff --git a/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.yml b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..e7b4518957b46bb6310cc65820cb2afd75aaa8bf
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.yml
@@ -0,0 +1,8 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/optimizer_1x.yml',
+ '_base_/faster_rcnn_r50_fpn.yml',
+ '_base_/faster_fpn_reader.yml',
+]
+weights: output/faster_rcnn_r50_fpn_1x_coco/model_final
diff --git a/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r50_fpn_2x_coco.yml b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r50_fpn_2x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..7edaadc30250ca5dec06f2db69650291027d4fd3
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r50_fpn_2x_coco.yml
@@ -0,0 +1,15 @@
+_BASE_: [
+ 'faster_rcnn_r50_fpn_1x_coco.yml',
+]
+weights: output/faster_rcnn_r50_fpn_2x_coco/model_final
+
+epoch: 24
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
diff --git a/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r50_vd_1x_coco.yml b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r50_vd_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..ac0e720499ad204ad3a09785ac30ac6e6b1ef21c
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r50_vd_1x_coco.yml
@@ -0,0 +1,14 @@
+_BASE_: [
+ 'faster_rcnn_r50_1x_coco.yml',
+]
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_pretrained.pdparams
+weights: output/faster_rcnn_r50_vd_1x_coco/model_final
+
+ResNet:
+ # index 0 stands for res2
+ depth: 50
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [2]
+ num_stages: 3
diff --git a/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r50_vd_fpn_1x_coco.yml b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r50_vd_fpn_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..6bf9d7101ee9a89b2c480e04bb3279d608c2f9e3
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r50_vd_fpn_1x_coco.yml
@@ -0,0 +1,14 @@
+_BASE_: [
+ 'faster_rcnn_r50_fpn_1x_coco.yml',
+]
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_pretrained.pdparams
+weights: output/faster_rcnn_r50_vd_fpn_1x_coco/model_final
+
+ResNet:
+ # index 0 stands for res2
+ depth: 50
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
diff --git a/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r50_vd_fpn_2x_coco.yml b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r50_vd_fpn_2x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..7fc3a883574a6694b5379b21a38be7de354ee6df
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r50_vd_fpn_2x_coco.yml
@@ -0,0 +1,25 @@
+_BASE_: [
+ 'faster_rcnn_r50_fpn_1x_coco.yml',
+]
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_pretrained.pdparams
+weights: output/faster_rcnn_r50_vd_fpn_2x_coco/model_final
+
+ResNet:
+ # index 0 stands for res2
+ depth: 50
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+
+epoch: 24
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
diff --git a/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r50_vd_fpn_ssld_1x_coco.yml b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r50_vd_fpn_ssld_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d71b82d8301ffd26c86d68245be890fd99e4dec0
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r50_vd_fpn_ssld_1x_coco.yml
@@ -0,0 +1,29 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/optimizer_1x.yml',
+ '_base_/faster_rcnn_r50_fpn.yml',
+ '_base_/faster_fpn_reader.yml',
+]
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_v2_pretrained.pdparams
+weights: output/faster_rcnn_r50_vd_fpn_ssld_1x_coco/model_final
+
+ResNet:
+ depth: 50
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ lr_mult_list: [0.05, 0.05, 0.1, 0.15]
+
+epoch: 12
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [8, 11]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
diff --git a/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r50_vd_fpn_ssld_2x_coco.yml b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r50_vd_fpn_ssld_2x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..0562354e7a3c64bf1dd96a21108868dbca70d46e
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_r50_vd_fpn_ssld_2x_coco.yml
@@ -0,0 +1,29 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/optimizer_1x.yml',
+ '_base_/faster_rcnn_r50_fpn.yml',
+ '_base_/faster_fpn_reader.yml',
+]
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_v2_pretrained.pdparams
+weights: output/faster_rcnn_r50_vd_fpn_ssld_2x_coco/model_final
+
+ResNet:
+ depth: 50
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ lr_mult_list: [0.05, 0.05, 0.1, 0.15]
+
+epoch: 24
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [12, 22]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
diff --git a/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_swin_tiny_fpn_1x_coco.yml b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_swin_tiny_fpn_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..7bb783b6aae6e76cf88eeed750087573aa6a0060
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_swin_tiny_fpn_1x_coco.yml
@@ -0,0 +1,8 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/optimizer_swin_1x.yml',
+ '_base_/faster_rcnn_swin_tiny_fpn.yml',
+ '_base_/faster_rcnn_swin_reader.yml',
+]
+weights: output/faster_rcnn_swin_tiny_fpn_1x_coco/model_final
diff --git a/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_swin_tiny_fpn_2x_coco.yml b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_swin_tiny_fpn_2x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..5848c4943b4a40a5b306fb87d9aae7508f56a8c7
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_swin_tiny_fpn_2x_coco.yml
@@ -0,0 +1,22 @@
+_BASE_: [
+ 'faster_rcnn_swin_tiny_fpn_1x_coco.yml',
+]
+weights: output/faster_rcnn_swin_tiny_fpn_2x_coco/model_final
+
+epoch: 24
+
+LearningRate:
+ base_lr: 0.0001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
+
+OptimizerBuilder:
+ clip_grad_by_norm: 1.0
+ optimizer:
+ type: AdamW
+ weight_decay: 0.05
diff --git a/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_swin_tiny_fpn_3x_coco.yml b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_swin_tiny_fpn_3x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a1b68cf4703886be497d8efa6aea4b9c5d256797
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_swin_tiny_fpn_3x_coco.yml
@@ -0,0 +1,22 @@
+_BASE_: [
+ 'faster_rcnn_swin_tiny_fpn_1x_coco.yml',
+]
+weights: output/faster_rcnn_swin_tiny_fpn_3x_coco/model_final
+
+epoch: 36
+
+LearningRate:
+ base_lr: 0.0001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [24, 33]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
+
+OptimizerBuilder:
+ clip_grad_by_norm: 1.0
+ optimizer:
+ type: AdamW
+ weight_decay: 0.05
diff --git a/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_x101_vd_64x4d_fpn_1x_coco.yml b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_x101_vd_64x4d_fpn_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..317d3741e38e5e4a4720add59e1f0792bf8c4a82
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_x101_vd_64x4d_fpn_1x_coco.yml
@@ -0,0 +1,17 @@
+_BASE_: [
+ 'faster_rcnn_r50_fpn_1x_coco.yml',
+]
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNeXt101_vd_64x4d_pretrained.pdparams
+weights: output/faster_rcnn_x101_vd_64x4d_fpn_1x_coco/model_final
+
+ResNet:
+ # for ResNeXt: groups, base_width, base_channels
+ depth: 101
+ groups: 64
+ base_width: 4
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
diff --git a/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_x101_vd_64x4d_fpn_2x_coco.yml b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_x101_vd_64x4d_fpn_2x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..939878f247b2552d6e9e4364f5c9e6443c71de31
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/faster_rcnn/faster_rcnn_x101_vd_64x4d_fpn_2x_coco.yml
@@ -0,0 +1,28 @@
+_BASE_: [
+ 'faster_rcnn_r50_fpn_1x_coco.yml',
+]
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNeXt101_vd_64x4d_pretrained.pdparams
+weights: output/faster_rcnn_x101_vd_64x4d_fpn_2x_coco/model_final
+
+ResNet:
+ # for ResNeXt: groups, base_width, base_channels
+ depth: 101
+ groups: 64
+ base_width: 4
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+
+epoch: 24
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
diff --git a/PaddleDetection-release-2.6/configs/fcos/README.md b/PaddleDetection-release-2.6/configs/fcos/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..7b3a58004de7f43264365182949a0532e8d91897
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/fcos/README.md
@@ -0,0 +1,37 @@
+# FCOS (Fully Convolutional One-Stage Object Detection)
+
+## Model Zoo on COCO
+
+| Backbone | Model | Images/GPU | LR schedule | Inference time (fps) | Box AP | Download | Config |
+| :------------------- | :------------- | :-----: | :-----: | :------------: | :-----: | :-----------------------------------------------------: | :-----: |
+| ResNet50-FPN | FCOS | 2 | 1x | ---- | 39.6 | [download](https://paddledet.bj.bcebos.com/models/fcos_r50_fpn_1x_coco.pdparams) | [config](./fcos_r50_fpn_1x_coco.yml) |
+| ResNet50-FPN | FCOS + iou | 2 | 1x | ---- | 40.0 | [download](https://paddledet.bj.bcebos.com/models/fcos_r50_fpn_iou_1x_coco.pdparams) | [config](./fcos_r50_fpn_iou_1x_coco.yml) |
+| ResNet50-FPN | FCOS + DCN | 2 | 1x | ---- | 44.3 | [download](https://paddledet.bj.bcebos.com/models/fcos_dcn_r50_fpn_1x_coco.pdparams) | [config](./fcos_dcn_r50_fpn_1x_coco.yml) |
+| ResNet50-FPN | FCOS + multiscale_train | 2 | 2x | ---- | 41.8 | [download](https://paddledet.bj.bcebos.com/models/fcos_r50_fpn_multiscale_2x_coco.pdparams) | [config](./fcos_r50_fpn_multiscale_2x_coco.yml) |
+| ResNet50-FPN | FCOS + multiscale_train + iou | 2 | 2x | ---- | 42.6 | [download](https://paddledet.bj.bcebos.com/models/fcos_r50_fpn_iou_multiscale_2x_coco.pdparams) | [config](./fcos_r50_fpn_iou_multiscale_2x_coco.yml) |
+
+**Notes:**
+ - `+ iou` means that, compared with the original FCOS, `iou` is used instead of `centerness` when computing the loss.
+ - The FCOS-based semi-supervised detection method `DenseTeacher` can be used as described in [DenseTeacher](../semi_det/denseteacher); combining it with unlabeled data can further improve detection performance.
+ - PaddleDetection uses `R50-vb` pretraining by default. Using an `R50-vd` model pretrained with [SSLD](../../../docs/feature_models/SSLD_PRETRAINED_MODEL.md) can further improve detection accuracy significantly; the backbone part of the config needs to be changed accordingly, e.g.:
+ ```yaml
+ pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_v2_pretrained.pdparams
+ ResNet:
+ depth: 50
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [1, 2, 3]
+ num_stages: 4
+ lr_mult_list: [0.05, 0.05, 0.1, 0.15]
+ ```
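
These configs compose through the `_BASE_` key: a child file lists its bases and then overrides individual keys, with nested sections merged recursively. A minimal sketch of that merge rule in plain Python (an illustration only — `deep_merge` is a hypothetical helper, not PaddleDetection's actual config loader):

```python
def deep_merge(base, child):
    """Recursively merge two config dicts: nested dicts combine, other values in child win."""
    out = dict(base)
    for key, val in child.items():
        if isinstance(val, dict) and isinstance(out.get(key), dict):
            out[key] = deep_merge(out[key], val)
        else:
            out[key] = val
    return out

# Mimic fcos_dcn_r50_fpn_1x_coco.yml overriding the ResNet section of its _BASE_.
base = {"ResNet": {"depth": 50, "variant": "b", "norm_type": "bn"}}
child = {"ResNet": {"dcn_v2_stages": [1, 2, 3]}}
merged = deep_merge(base, child)
print(merged["ResNet"])
# {'depth': 50, 'variant': 'b', 'norm_type': 'bn', 'dcn_v2_stages': [1, 2, 3]}
```

Untouched base keys (`depth`, `variant`, `norm_type`) survive the merge, which is why the child configs above only need to list the fields they change.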
+
+## Citations
+```
+@inproceedings{tian2019fcos,
+ title = {{FCOS}: Fully Convolutional One-Stage Object Detection},
+ author = {Tian, Zhi and Shen, Chunhua and Chen, Hao and He, Tong},
+ booktitle = {Proc. Int. Conf. Computer Vision (ICCV)},
+ year = {2019}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/fcos/_base_/fcos_r50_fpn.yml b/PaddleDetection-release-2.6/configs/fcos/_base_/fcos_r50_fpn.yml
new file mode 100644
index 0000000000000000000000000000000000000000..6ace6b51f17fcd9fc2015a549039b8919f312012
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/fcos/_base_/fcos_r50_fpn.yml
@@ -0,0 +1,48 @@
+architecture: FCOS
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_cos_pretrained.pdparams
+
+FCOS:
+ backbone: ResNet
+ neck: FPN
+ fcos_head: FCOSHead
+
+ResNet:
+ depth: 50
+ variant: 'b'
+ norm_type: bn
+ freeze_at: 0 # res2
+ return_idx: [1, 2, 3]
+ num_stages: 4
+
+FPN:
+ out_channel: 256
+ spatial_scales: [0.125, 0.0625, 0.03125]
+ extra_stage: 2
+ has_extra_convs: True
+ use_c5: False
+
+FCOSHead:
+ fcos_feat:
+ name: FCOSFeat
+ feat_in: 256
+ feat_out: 256
+ num_convs: 4
+ norm_type: "gn"
+ use_dcn: False
+ fpn_stride: [8, 16, 32, 64, 128]
+ prior_prob: 0.01
+ norm_reg_targets: True
+ centerness_on_reg: True
+ num_shift: 0.5
+ fcos_loss:
+ name: FCOSLoss
+ loss_alpha: 0.25
+ loss_gamma: 2.0
+ iou_loss_type: "giou"
+ reg_weights: 1.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.025
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/fcos/_base_/fcos_reader.yml b/PaddleDetection-release-2.6/configs/fcos/_base_/fcos_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..8f0016125eb666ca5526ec5edd3373cf081adf6e
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/fcos/_base_/fcos_reader.yml
@@ -0,0 +1,41 @@
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [800, 1333], keep_ratio: True, interp: 1}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - RandomFlip: {}
+ batch_transforms:
+ - Permute: {}
+ - PadBatch: {pad_to_stride: 128}
+ - Gt2FCOSTarget:
+ object_sizes_boundary: [64, 128, 256, 512]
+ center_sampling_radius: 1.5
+ downsample_ratios: [8, 16, 32, 64, 128]
+ norm_reg_targets: True
+ batch_size: 2
+ shuffle: True
+ drop_last: True
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [800, 1333], keep_ratio: True, interp: 1}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 128}
+ batch_size: 1
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [800, 1333], keep_ratio: True, interp: 1}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 128}
+ batch_size: 1
+ fuse_normalize: True
diff --git a/PaddleDetection-release-2.6/configs/fcos/_base_/optimizer_1x.yml b/PaddleDetection-release-2.6/configs/fcos/_base_/optimizer_1x.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d28b0947b9fb6567a70f11acfe6663dac89b0771
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/fcos/_base_/optimizer_1x.yml
@@ -0,0 +1,19 @@
+epoch: 12
+
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [8, 11]
+ - !LinearWarmup
+ start_factor: 0.3333333333333333
+ steps: 500
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/fcos/fcos_dcn_r50_fpn_1x_coco.yml b/PaddleDetection-release-2.6/configs/fcos/fcos_dcn_r50_fpn_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..93ac2c0eb0dd7860259bf38549d3aa176b00cdc8
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/fcos/fcos_dcn_r50_fpn_1x_coco.yml
@@ -0,0 +1,16 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/fcos_r50_fpn.yml',
+ '_base_/optimizer_1x.yml',
+ '_base_/fcos_reader.yml',
+]
+
+weights: output/fcos_dcn_r50_fpn_1x_coco/model_final
+
+ResNet:
+ dcn_v2_stages: [1, 2, 3]
+
+FCOSHead:
+ fcos_feat:
+ use_dcn: True
diff --git a/PaddleDetection-release-2.6/configs/fcos/fcos_r50_fpn_1x_coco.yml b/PaddleDetection-release-2.6/configs/fcos/fcos_r50_fpn_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..0b47d454fffa5056453356faf9e073a4a9d4ec60
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/fcos/fcos_r50_fpn_1x_coco.yml
@@ -0,0 +1,9 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/fcos_r50_fpn.yml',
+ '_base_/optimizer_1x.yml',
+ '_base_/fcos_reader.yml',
+]
+
+weights: output/fcos_r50_fpn_1x_coco/model_final
diff --git a/PaddleDetection-release-2.6/configs/fcos/fcos_r50_fpn_iou_1x_coco.yml b/PaddleDetection-release-2.6/configs/fcos/fcos_r50_fpn_iou_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..18c33cf8e221af72ca3edb1ad355572ee456a3ae
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/fcos/fcos_r50_fpn_iou_1x_coco.yml
@@ -0,0 +1,79 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/fcos_r50_fpn.yml',
+ '_base_/optimizer_1x.yml',
+ '_base_/fcos_reader.yml',
+]
+
+weights: output/fcos_r50_fpn_iou_1x_coco/model_final
+
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [800, 1333], keep_ratio: True, interp: 1}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - RandomFlip: {}
+ batch_transforms:
+ - Permute: {}
+ - PadBatch: {pad_to_stride: 32}
+ - Gt2FCOSTarget:
+ object_sizes_boundary: [64, 128, 256, 512]
+ center_sampling_radius: 1.5
+ downsample_ratios: [8, 16, 32, 64, 128]
+ norm_reg_targets: True
+ batch_size: 2
+ shuffle: True
+ drop_last: True
+ use_shared_memory: True
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [800, 1333], keep_ratio: True, interp: 1}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [800, 1333], keep_ratio: True, interp: 1}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ fuse_normalize: True
+
+
+FCOSHead:
+ fcos_feat:
+ name: FCOSFeat
+ feat_in: 256
+ feat_out: 256
+ num_convs: 4
+ norm_type: "gn"
+ use_dcn: False
+ fpn_stride: [8, 16, 32, 64, 128]
+ prior_prob: 0.01
+ norm_reg_targets: True
+ centerness_on_reg: True
+ fcos_loss:
+ name: FCOSLoss
+ loss_alpha: 0.25
+ loss_gamma: 2.0
+ iou_loss_type: "giou"
+ reg_weights: 1.0
+ quality: "iou" # default 'centerness'
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.025
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/fcos/fcos_r50_fpn_iou_multiscale_2x_coco.yml b/PaddleDetection-release-2.6/configs/fcos/fcos_r50_fpn_iou_multiscale_2x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d53ea17f57b6c78076668489ca5246122bfe8edb
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/fcos/fcos_r50_fpn_iou_multiscale_2x_coco.yml
@@ -0,0 +1,91 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/fcos_r50_fpn.yml',
+ '_base_/optimizer_1x.yml',
+ '_base_/fcos_reader.yml',
+]
+
+weights: output/fcos_r50_fpn_iou_multiscale_2x_coco/model_final
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], keep_ratio: True, interp: 1}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - RandomFlip: {}
+ batch_transforms:
+ - Permute: {}
+ - PadBatch: {pad_to_stride: 32}
+ - Gt2FCOSTarget:
+ object_sizes_boundary: [64, 128, 256, 512]
+ center_sampling_radius: 1.5
+ downsample_ratios: [8, 16, 32, 64, 128]
+ norm_reg_targets: True
+ batch_size: 2
+ shuffle: True
+ drop_last: True
+ use_shared_memory: True
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [800, 1333], keep_ratio: True, interp: 1}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [800, 1333], keep_ratio: True, interp: 1}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ fuse_normalize: True
+
+
+epoch: 24
+
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 1000
+
+
+FCOSHead:
+ fcos_feat:
+ name: FCOSFeat
+ feat_in: 256
+ feat_out: 256
+ num_convs: 4
+ norm_type: "gn"
+ use_dcn: False
+ fpn_stride: [8, 16, 32, 64, 128]
+ prior_prob: 0.01
+ norm_reg_targets: True
+ centerness_on_reg: True
+ fcos_loss:
+ name: FCOSLoss
+ loss_alpha: 0.25
+ loss_gamma: 2.0
+ iou_loss_type: "giou"
+ reg_weights: 1.0
+ quality: "iou" # default 'centerness'
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.025
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/fcos/fcos_r50_fpn_multiscale_2x_coco.yml b/PaddleDetection-release-2.6/configs/fcos/fcos_r50_fpn_multiscale_2x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..0afdbbc5be62468ae073258badb0ee2773948e3c
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/fcos/fcos_r50_fpn_multiscale_2x_coco.yml
@@ -0,0 +1,40 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/fcos_r50_fpn.yml',
+ '_base_/optimizer_1x.yml',
+ '_base_/fcos_reader.yml',
+]
+
+weights: output/fcos_r50_fpn_multiscale_2x_coco/model_final
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], keep_ratio: True, interp: 1}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - RandomFlip: {}
+ batch_transforms:
+ - Permute: {}
+ - PadBatch: {pad_to_stride: 128}
+ - Gt2FCOSTarget:
+ object_sizes_boundary: [64, 128, 256, 512]
+ center_sampling_radius: 1.5
+ downsample_ratios: [8, 16, 32, 64, 128]
+ norm_reg_targets: True
+ batch_size: 2
+ shuffle: True
+ drop_last: True
+ use_shared_memory: True
+
+epoch: 24
+
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.3333333333333333
+ steps: 500
diff --git a/PaddleDetection-release-2.6/configs/few-shot/README.md b/PaddleDetection-release-2.6/configs/few-shot/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..e000a90d4355d5daf738232c67b6a49f739c7495
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/few-shot/README.md
@@ -0,0 +1,76 @@
+# Co-tuning for Transfer Learning & Supervised Contrastive Learning
+
+## Data preparation
+Using the [Kaggle dataset](https://www.kaggle.com/andrewmvd/road-sign-detection) competition data as an example, this section shows how to prepare custom data.
+The Kaggle [road-sign-detection](https://www.kaggle.com/andrewmvd/road-sign-detection) competition data contains 877 images in 4 classes: crosswalk, speedlimit, stop, and trafficlight.
+It can be downloaded from Kaggle or from this [download link](https://fsdet-dataset.bj.bcebos.com/roadsign_coco.tar.gz).
+For few-shot training, sample the same number of instances per class from the original dataset (e.g., 10-shot means ten training samples per class).
+The industrial dataset is PKU-Market-PCB, which is used for defect detection on printed circuit boards (PCB) and covers 6 common PCB defect types. [Download link](https://fsdet-dataset.bj.bcebos.com/pcb.tar.gz)
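
The per-class sampling step above can be sketched as a small helper over COCO-style annotation dicts (an illustration only — `kshot_subset` is a hypothetical function, not part of PaddleDetection):

```python
import random

def kshot_subset(annotations, k, seed=0):
    """Pick at most k annotated instances per category from a list of COCO-style dicts."""
    rng = random.Random(seed)
    by_cat = {}
    for ann in annotations:
        by_cat.setdefault(ann["category_id"], []).append(ann)
    picked = []
    for cat, anns in sorted(by_cat.items()):
        anns = list(anns)
        rng.shuffle(anns)          # random but reproducible selection per class
        picked.extend(anns[:k])
    return picked

# Toy example: 10 instances split evenly between 2 categories, 3-shot per class.
toy = [{"id": i, "category_id": i % 2} for i in range(10)]
subset = kshot_subset(toy, k=3)
print(len(subset))  # 6
```

A real preparation script would additionally keep the images and COCO metadata referenced by the picked annotations.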
+
+
+## Model Zoo
+| Backbone | Model | Images/GPU | Shots per class | Box AP | Download | Config |
+| :------------------- | :------------- | :-----: | :------------: | :-----: | :-----------------------------------------------------: | :-----: |
+| ResNet50-vd | Faster | 1 | 10 | 60.1 | [download](https://bj.bcebos.com/v1/paddledet/models/faster_rcnn_r50_vd_fpn_1x_coco.pdparams) | [config](./faster_rcnn_r50_vd_fpn_1x_coco_cotuning_roadsign.yml) |
+| PPYOLOE_crn_s | PPYOLOE | 1 | 30 | 17.8 | [download](https://bj.bcebos.com/v1/paddledet/models/ppyoloe_plus_crn_s_80e_contrast_pcb.pdparams) |[config](./ppyoloe_plus_crn_s_80e_contrast_pcb.yml) |
+
+## Compare-cotuning
+| Backbone | Model | Images/GPU | Shots per class | Cotuning | Box AP |
+| :------------------- | :------------- | :-----: | :-----: | :------------: | :-----: |
+| ResNet50-vd | Faster | 1 | 10 | False | 56.7 |
+| ResNet50-vd | Faster | 1 | 10 | True | 60.1 |
+
+## Compare-contrast
+| Backbone | Model | Images/GPU | Shots per class | Contrast | Box AP |
+| :------------------- | :------------- | :-----: | :-----: | :------------: | :-----: |
+| PPYOLOE_crn_s | PPYOLOE | 1 | 30 | False | 15.4 |
+| PPYOLOE_crn_s | PPYOLOE | 1 | 30 | True | 17.8 |
+
+## Training & Evaluation & Inference
+### 1. Training
+
+```
+# -c specifies which config file to use
+# --eval evaluates during training; the checkpoint with the best validation result is saved
+
+python tools/train.py -c configs/few-shot/faster_rcnn_r50_vd_fpn_1x_coco_cotuning_roadsign.yml --eval
+```
+### 2. Evaluation
+```
+# -c specifies which config file to use
+# -o overrides global settings defined in the config file
+
+python tools/eval.py -c configs/few-shot/faster_rcnn_r50_vd_fpn_1x_coco_cotuning_roadsign.yml \
+ -o weights=output/faster_rcnn_r50_vd_fpn_1x_coco_cotuning_roadsign/best_model
+```
+
+
+### 3. Inference
+```
+# -c specifies which config file to use
+# --infer_img specifies the path of the image to predict
+
+python tools/infer.py -c configs/few-shot/faster_rcnn_r50_vd_fpn_1x_coco_cotuning_roadsign.yml \
+ --infer_img=demo/road554.png
+```
+
+## Citations
+```
+@article{you2020co,
+ title={Co-tuning for transfer learning},
+ author={You, Kaichao and Kou, Zhi and Long, Mingsheng and Wang, Jianmin},
+ journal={Advances in Neural Information Processing Systems},
+ volume={33},
+ pages={17236--17246},
+ year={2020}
+}
+
+@article{khosla2020supervised,
+ title={Supervised contrastive learning},
+ author={Khosla, Prannay and Teterwak, Piotr and Wang, Chen and Sarna, Aaron and Tian, Yonglong and Isola, Phillip and Maschinot, Aaron and Liu, Ce and Krishnan, Dilip},
+ journal={Advances in Neural Information Processing Systems},
+ volume={33},
+ pages={18661--18673},
+ year={2020}
+}
+```
\ No newline at end of file
diff --git a/PaddleDetection-release-2.6/configs/few-shot/_base_/faster_fpn_reader.yml b/PaddleDetection-release-2.6/configs/few-shot/_base_/faster_fpn_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..9b9abccd63e499bfa9402f3038425470e4a6e953
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/few-shot/_base_/faster_fpn_reader.yml
@@ -0,0 +1,40 @@
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], interp: 2, keep_ratio: True}
+ - RandomFlip: {prob: 0.5}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
diff --git a/PaddleDetection-release-2.6/configs/few-shot/_base_/faster_rcnn_r50.yml b/PaddleDetection-release-2.6/configs/few-shot/_base_/faster_rcnn_r50.yml
new file mode 100644
index 0000000000000000000000000000000000000000..fd29f5ea1a1df9e2599d3efcff344c5d3363945e
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/few-shot/_base_/faster_rcnn_r50.yml
@@ -0,0 +1,66 @@
+architecture: FasterRCNN
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_cos_pretrained.pdparams
+
+FasterRCNN:
+ backbone: ResNet
+ rpn_head: RPNHead
+ bbox_head: BBoxHead
+ # post process
+ bbox_post_process: BBoxPostProcess
+
+
+ResNet:
+ # index 0 stands for res2
+ depth: 50
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [2]
+ num_stages: 3
+
+RPNHead:
+ anchor_generator:
+ aspect_ratios: [0.5, 1.0, 2.0]
+ anchor_sizes: [32, 64, 128, 256, 512]
+ strides: [16]
+ rpn_target_assign:
+ batch_size_per_im: 256
+ fg_fraction: 0.5
+ negative_overlap: 0.3
+ positive_overlap: 0.7
+ use_random: True
+ train_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 12000
+ post_nms_top_n: 2000
+ topk_after_collect: False
+ test_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 6000
+ post_nms_top_n: 1000
+
+
+BBoxHead:
+ head: Res5Head
+ roi_extractor:
+ resolution: 14
+ sampling_ratio: 0
+ aligned: True
+ bbox_assigner: BBoxAssigner
+ with_pool: true
+
+BBoxAssigner:
+ batch_size_per_im: 512
+ bg_thresh: 0.5
+ fg_thresh: 0.5
+ fg_fraction: 0.25
+ use_random: True
+
+BBoxPostProcess:
+ decode: RCNNBox
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.05
+ nms_threshold: 0.5
diff --git a/PaddleDetection-release-2.6/configs/few-shot/_base_/faster_rcnn_r50_fpn.yml b/PaddleDetection-release-2.6/configs/few-shot/_base_/faster_rcnn_r50_fpn.yml
new file mode 100644
index 0000000000000000000000000000000000000000..38ee81def0cb528f3f67e8ed616b9589bd72de9e
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/few-shot/_base_/faster_rcnn_r50_fpn.yml
@@ -0,0 +1,73 @@
+architecture: FasterRCNN
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_cos_pretrained.pdparams
+
+FasterRCNN:
+ backbone: ResNet
+ neck: FPN
+ rpn_head: RPNHead
+ bbox_head: BBoxHead
+ # post process
+ bbox_post_process: BBoxPostProcess
+
+
+ResNet:
+ # index 0 stands for res2
+ depth: 50
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+
+FPN:
+ out_channel: 256
+
+RPNHead:
+ anchor_generator:
+ aspect_ratios: [0.5, 1.0, 2.0]
+ anchor_sizes: [[32], [64], [128], [256], [512]]
+ strides: [4, 8, 16, 32, 64]
+ rpn_target_assign:
+ batch_size_per_im: 256
+ fg_fraction: 0.5
+ negative_overlap: 0.3
+ positive_overlap: 0.7
+ use_random: True
+ train_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 2000
+ post_nms_top_n: 1000
+ topk_after_collect: True
+ test_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 1000
+ post_nms_top_n: 1000
+
+
+BBoxHead:
+ head: TwoFCHead
+ roi_extractor:
+ resolution: 7
+ sampling_ratio: 0
+ aligned: True
+ bbox_assigner: BBoxAssigner
+
+BBoxAssigner:
+ batch_size_per_im: 512
+ bg_thresh: 0.5
+ fg_thresh: 0.5
+ fg_fraction: 0.25
+ use_random: True
+
+TwoFCHead:
+ out_channel: 1024
+
+
+BBoxPostProcess:
+ decode: RCNNBox
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.05
+ nms_threshold: 0.5
diff --git a/PaddleDetection-release-2.6/configs/few-shot/_base_/faster_reader.yml b/PaddleDetection-release-2.6/configs/few-shot/_base_/faster_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..e1c1bb6bc262e86ea69ae78919064aa2b6834311
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/few-shot/_base_/faster_reader.yml
@@ -0,0 +1,40 @@
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], interp: 2, keep_ratio: True}
+ - RandomFlip: {prob: 0.5}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 1
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
diff --git a/PaddleDetection-release-2.6/configs/few-shot/_base_/optimizer_1x.yml b/PaddleDetection-release-2.6/configs/few-shot/_base_/optimizer_1x.yml
new file mode 100644
index 0000000000000000000000000000000000000000..4caaa63bda15917137a9ac22b736ae83c3d04856
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/few-shot/_base_/optimizer_1x.yml
@@ -0,0 +1,19 @@
+epoch: 12
+
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [8, 11]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/few-shot/_base_/optimizer_80e.yml b/PaddleDetection-release-2.6/configs/few-shot/_base_/optimizer_80e.yml
new file mode 100644
index 0000000000000000000000000000000000000000..7a8773df15aa103f3194f56634604d84a2a084eb
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/few-shot/_base_/optimizer_80e.yml
@@ -0,0 +1,18 @@
+epoch: 80
+
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !CosineDecay
+ max_epochs: 96
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 5
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/few-shot/_base_/ppyoloe_plus_crn.yml b/PaddleDetection-release-2.6/configs/few-shot/_base_/ppyoloe_plus_crn.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a83f35008f4797311689ed952abef15df0c0eea7
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/few-shot/_base_/ppyoloe_plus_crn.yml
@@ -0,0 +1,49 @@
+architecture: YOLOv3
+norm_type: sync_bn
+use_ema: true
+use_cot: False
+ema_decay: 0.9998
+ema_black_list: ['proj_conv.weight']
+custom_black_list: ['reduce_mean']
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
+ use_alpha: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192]
+ stage_num: 1
+ block_num: 3
+ act: 'swish'
+ spp: true
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: 30
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 300
+ score_threshold: 0.01
+ nms_threshold: 0.7
diff --git a/PaddleDetection-release-2.6/configs/few-shot/_base_/ppyoloe_plus_reader.yml b/PaddleDetection-release-2.6/configs/few-shot/_base_/ppyoloe_plus_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..cd9cdeff8b9d46e41a4e6fb518339168dfd4b154
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/few-shot/_base_/ppyoloe_plus_reader.yml
@@ -0,0 +1,40 @@
+worker_num: 4
+eval_height: &eval_height 640
+eval_width: &eval_width 640
+eval_size: &eval_size [*eval_height, *eval_width]
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [320, 352, 384, 416, 448, 480, 512, 544, 576, 608, 640, 672, 704, 736, 768], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 8
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ batch_size: 2
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/few-shot/faster_rcnn_r50_vd_fpn_1x_coco_cotuning_roadsign.yml b/PaddleDetection-release-2.6/configs/few-shot/faster_rcnn_r50_vd_fpn_1x_coco_cotuning_roadsign.yml
new file mode 100644
index 0000000000000000000000000000000000000000..75fd9e3d0ccaa56fd77d8851711b8a44720df566
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/few-shot/faster_rcnn_r50_vd_fpn_1x_coco_cotuning_roadsign.yml
@@ -0,0 +1,67 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/optimizer_1x.yml',
+ '_base_/faster_rcnn_r50_fpn.yml',
+ '_base_/faster_fpn_reader.yml',
+]
+pretrain_weights: https://paddledet.bj.bcebos.com/models/faster_rcnn_r50_vd_fpn_1x_coco.pdparams
+weights: output/faster_rcnn_r50_vd_fpn_1x_coco_cotuning_roadsign/model_final
+
+snapshot_epoch: 5
+
+ResNet:
+ # index 0 stands for res2
+ depth: 50
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+
+epoch: 30
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [8, 11]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
+
+use_cot: True
+BBoxHead:
+ head: TwoFCHead
+ roi_extractor:
+ resolution: 7
+ sampling_ratio: 0
+ aligned: True
+ bbox_assigner: BBoxAssigner
+ cot_classes: 80
+ loss_cot:
+ name: COTLoss
+ cot_lambda: 1
+ cot_scale: 1
+
+num_classes: 4
+metric: COCO
+map_type: integral
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/train_shots10.json
+ dataset_dir: dataset/roadsign_coco
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/roadsign_valid.json
+ dataset_dir: dataset/roadsign_coco
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/roadsign_valid.json
+ dataset_dir: dataset/roadsign_coco
\ No newline at end of file
diff --git a/PaddleDetection-release-2.6/configs/few-shot/ppyoloe_plus_crn_s_80e_contrast_pcb.yml b/PaddleDetection-release-2.6/configs/few-shot/ppyoloe_plus_crn_s_80e_contrast_pcb.yml
new file mode 100644
index 0000000000000000000000000000000000000000..05320089fcb1f19a690edd030f9b57b502909a38
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/few-shot/ppyoloe_plus_crn_s_80e_contrast_pcb.yml
@@ -0,0 +1,81 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ './_base_/optimizer_80e.yml',
+ './_base_/ppyoloe_plus_crn.yml',
+ './_base_/ppyoloe_plus_reader.yml',
+]
+
+log_iter: 100
+snapshot_epoch: 10
+weights: output/ppyoloe_plus_crn_s_80e_contrast_pcb/model_final
+
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_s_obj365_pretrained.pdparams
+depth_mult: 0.33
+width_mult: 0.50
+
+epoch: 80
+
+LearningRate:
+ base_lr: 0.0001
+ schedulers:
+ - !CosineDecay
+ max_epochs: 96
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 5
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEContrastHead
+ post_process: ~
+
+PPYOLOEContrastHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: 100
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5, contrast: 0.2}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ contrast_loss:
+ name: SupContrast
+ temperature: 100
+ sample_num: 2048
+ thresh: 0.75
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 300
+ score_threshold: 0.01
+ nms_threshold: 0.7
+
+num_classes: 6
+metric: COCO
+map_type: integral
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: pcb_cocoanno/train_shots30.json
+ dataset_dir: dataset/pcb
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: pcb_cocoanno/val.json
+ dataset_dir: dataset/pcb
+
+TestDataset:
+ !ImageFolder
+ anno_path: pcb_cocoanno/val.json
+ dataset_dir: dataset/pcb
\ No newline at end of file
diff --git a/PaddleDetection-release-2.6/configs/gfl/README.md b/PaddleDetection-release-2.6/configs/gfl/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..b2b79cbac443e2e519660e7ab99b3d29695d5aa8
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/gfl/README.md
@@ -0,0 +1,40 @@
+# Generalized Focal Loss Model (GFL)
+
+## Introduction
+
+We reproduce the object detection results of the papers [Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection](https://arxiv.org/abs/2006.04388) and [Generalized Focal Loss V2](https://arxiv.org/pdf/2011.12885.pdf). We also use a better-performing pre-trained model and the ResNet-vd structure to improve mAP.
+
+## Model Zoo
+
+| Backbone | Model | batch-size/GPU | lr schedule |FPS | Box AP | download | config |
+| :-------------- | :------------- | :-----: | :-----: | :------------: | :-----: | :-----------------------------------------------------: | :-----: |
+| ResNet50 | GFL | 2 | 1x | ---- | 41.0 | [model](https://paddledet.bj.bcebos.com/models/gfl_r50_fpn_1x_coco.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_gfl_r50_fpn_1x_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/gfl/gfl_r50_fpn_1x_coco.yml) |
+| ResNet50 | GFL + [CWD](../slim/README.md) | 2 | 2x | ---- | 44.0 | [model](https://paddledet.bj.bcebos.com/models/gfl_r50_fpn_2x_coco_cwd.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_gfl_r50_fpn_2x_coco_cwd.log) | [config1](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/gfl/gfl_r50_fpn_1x_coco.yml), [config2](../slim/distill/gfl_r101vd_fpn_coco_distill_cwd.yml) |
+| ResNet101-vd | GFL | 2 | 2x | ---- | 46.8 | [model](https://paddledet.bj.bcebos.com/models/gfl_r101vd_fpn_mstrain_2x_coco.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_gfl_r101vd_fpn_mstrain_2x_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/gfl/gfl_r101vd_fpn_mstrain_2x_coco.yml) |
+| ResNet34-vd | GFL | 2 | 1x | ---- | 40.8 | [model](https://paddledet.bj.bcebos.com/models/gfl_r34vd_1x_coco.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_gfl_r34vd_1x_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/gfl/gfl_r34vd_1x_coco.yml) |
+| ResNet18-vd | GFL | 2 | 1x | ---- | 36.6 | [model](https://paddledet.bj.bcebos.com/models/gfl_r18vd_1x_coco.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_gfl_r18vd_1x_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/gfl/gfl_r18vd_1x_coco.yml) |
+| ResNet18-vd | GFL + [LD](../slim/README.md) | 2 | 1x | ---- | 38.2 | [model](https://bj.bcebos.com/v1/paddledet/models/gfl_slim_ld_r18vd_1x_coco.pdparams) | [log](https://bj.bcebos.com/v1/paddledet/logs/train_gfl_slim_ld_r18vd_1x_coco.log) | [config1](./gfl_slim_ld_r18vd_1x_coco.yml), [config2](../slim/distill/gfl_ld_distill.yml) |
+| ResNet50 | GFLv2 | 2 | 1x | ---- | 41.2 | [model](https://paddledet.bj.bcebos.com/models/gflv2_r50_fpn_1x_coco.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_gflv2_r50_fpn_1x_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/gfl/gflv2_r50_fpn_1x_coco.yml) |
+
+
+**Notes:**
+
+- GFL models are trained on the COCO train2017 dataset with 8 GPUs and evaluated on val2017; the reported metric is `mAP(IoU=0.5:0.95)`.
+
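The `QualityFocalLoss` used by the head (`beta: 2.0` in the configs) replaces the one-hot classification target with the predicted box's IoU quality y ∈ [0, 1] and down-weights easy examples by |y − σ|^β. A scalar sketch of the formula from the GFL paper:

```python
import math

def quality_focal_loss(sigma, y, beta=2.0):
    """Scalar sketch of QFL: binary cross-entropy against a soft IoU target y,
    modulated by |y - sigma|**beta, where sigma is the predicted score after
    the sigmoid."""
    eps = 1e-12
    bce = -(y * math.log(sigma + eps) + (1.0 - y) * math.log(1.0 - sigma + eps))
    return (abs(y - sigma) ** beta) * bce
```

When the prediction matches the quality target exactly, the modulating factor vanishes and the loss is zero; the further σ drifts from y, the closer the loss gets to plain cross-entropy.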
+## Citations
+```
+@article{li2020generalized,
+ title={Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection},
+ author={Li, Xiang and Wang, Wenhai and Wu, Lijun and Chen, Shuo and Hu, Xiaolin and Li, Jun and Tang, Jinhui and Yang, Jian},
+ journal={arXiv preprint arXiv:2006.04388},
+ year={2020}
+}
+
+@article{li2020gflv2,
+ title={Generalized Focal Loss V2: Learning Reliable Localization Quality Estimation for Dense Object Detection},
+ author={Li, Xiang and Wang, Wenhai and Hu, Xiaolin and Li, Jun and Tang, Jinhui and Yang, Jian},
+ journal={arXiv preprint arXiv:2011.12885},
+ year={2020}
+}
+
+```
diff --git a/PaddleDetection-release-2.6/configs/gfl/_base_/gfl_r50_fpn.yml b/PaddleDetection-release-2.6/configs/gfl/_base_/gfl_r50_fpn.yml
new file mode 100644
index 0000000000000000000000000000000000000000..488bec61ee11f93200edb352dc0088f83f48bba3
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/gfl/_base_/gfl_r50_fpn.yml
@@ -0,0 +1,51 @@
+architecture: GFL
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_cos_pretrained.pdparams
+
+GFL:
+ backbone: ResNet
+ neck: FPN
+ head: GFLHead
+
+ResNet:
+ depth: 50
+ variant: b
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [1,2,3]
+ num_stages: 4
+
+FPN:
+ out_channel: 256
+ spatial_scales: [0.125, 0.0625, 0.03125]
+ extra_stage: 2
+ has_extra_convs: true
+ use_c5: false
+
+GFLHead:
+ conv_feat:
+ name: FCOSFeat
+ feat_in: 256
+ feat_out: 256
+ num_convs: 4
+ norm_type: "gn"
+ use_dcn: false
+ fpn_stride: [8, 16, 32, 64, 128]
+ prior_prob: 0.01
+ reg_max: 16
+ loss_class:
+ name: QualityFocalLoss
+ use_sigmoid: True
+ beta: 2.0
+ loss_weight: 1.0
+ loss_dfl:
+ name: DistributionFocalLoss
+ loss_weight: 0.25
+ loss_bbox:
+ name: GIoULoss
+ loss_weight: 2.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.025
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/gfl/_base_/gfl_reader.yml b/PaddleDetection-release-2.6/configs/gfl/_base_/gfl_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..e36395c569f32b69f040e20ccd38aff350fbf91e
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/gfl/_base_/gfl_reader.yml
@@ -0,0 +1,41 @@
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomFlip: {prob: 0.5}
+ - Resize: {target_size: [800, 1333], keep_ratio: true, interp: 1}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ - Gt2GFLTarget:
+ downsample_ratios: [8, 16, 32, 64, 128]
+ grid_cell_scale: 8
+ batch_size: 2
+ shuffle: true
+ drop_last: true
+ use_shared_memory: True
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 1, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 2
+ shuffle: false
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 1, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
diff --git a/PaddleDetection-release-2.6/configs/gfl/_base_/gflv2_r50_fpn.yml b/PaddleDetection-release-2.6/configs/gfl/_base_/gflv2_r50_fpn.yml
new file mode 100644
index 0000000000000000000000000000000000000000..e9708d86a140ef03bf67d6abc18bbcf00dd3baa4
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/gfl/_base_/gflv2_r50_fpn.yml
@@ -0,0 +1,56 @@
+architecture: GFL
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_cos_pretrained.pdparams
+
+GFL:
+ backbone: ResNet
+ neck: FPN
+ head: GFLHead
+
+ResNet:
+ depth: 50
+ variant: b
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [1,2,3]
+ num_stages: 4
+
+FPN:
+ out_channel: 256
+ spatial_scales: [0.125, 0.0625, 0.03125]
+ extra_stage: 2
+ has_extra_convs: true
+ use_c5: false
+
+GFLHead:
+ conv_feat:
+ name: FCOSFeat
+ feat_in: 256
+ feat_out: 256
+ num_convs: 4
+ norm_type: "gn"
+ use_dcn: false
+ fpn_stride: [8, 16, 32, 64, 128]
+ prior_prob: 0.01
+ reg_max: 16
+ dgqp_module:
+ name: DGQP
+ reg_topk: 4
+ reg_channels: 64
+ add_mean: True
+ loss_class:
+ name: QualityFocalLoss
+ use_sigmoid: False
+ beta: 2.0
+ loss_weight: 1.0
+ loss_dfl:
+ name: DistributionFocalLoss
+ loss_weight: 0.25
+ loss_bbox:
+ name: GIoULoss
+ loss_weight: 2.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.025
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/gfl/_base_/optimizer_1x.yml b/PaddleDetection-release-2.6/configs/gfl/_base_/optimizer_1x.yml
new file mode 100644
index 0000000000000000000000000000000000000000..39c54ac805031619debf9b31119afa86b3ead857
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/gfl/_base_/optimizer_1x.yml
@@ -0,0 +1,19 @@
+epoch: 12
+
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [8, 11]
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 500
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/gfl/gfl_r101vd_fpn_mstrain_2x_coco.yml b/PaddleDetection-release-2.6/configs/gfl/gfl_r101vd_fpn_mstrain_2x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..04e6804b7180b8df07062f5e48dfd90f38fc45c2
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/gfl/gfl_r101vd_fpn_mstrain_2x_coco.yml
@@ -0,0 +1,46 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/gfl_r50_fpn.yml',
+ '_base_/optimizer_1x.yml',
+ '_base_/gfl_reader.yml',
+]
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet101_vd_pretrained.pdparams
+weights: output/gfl_r101vd_fpn_mstrain_2x_coco/model_final
+find_unused_parameters: True
+use_ema: true
+ema_decay: 0.9998
+
+ResNet:
+ depth: 101
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [1,2,3]
+ num_stages: 4
+
+epoch: 24
+
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 500
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResize: {target_size: [[480, 1333], [512, 1333], [544, 1333], [576, 1333], [608, 1333], [640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], interp: 2, keep_ratio: True}
+ - RandomFlip: {prob: 0.5}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ - Gt2GFLTarget:
+ downsample_ratios: [8, 16, 32, 64, 128]
+ grid_cell_scale: 8
diff --git a/PaddleDetection-release-2.6/configs/gfl/gfl_r18vd_1x_coco.yml b/PaddleDetection-release-2.6/configs/gfl/gfl_r18vd_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a38c86eee8d5a0669aa6a09f2e66ff08311450e3
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/gfl/gfl_r18vd_1x_coco.yml
@@ -0,0 +1,19 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/gfl_r50_fpn.yml',
+ '_base_/optimizer_1x.yml',
+ '_base_/gfl_reader.yml',
+]
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet18_vd_pretrained.pdparams
+weights: output/gfl_r18vd_1x_coco/model_final
+find_unused_parameters: True
+
+ResNet:
+ depth: 18
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [1,2,3]
+ num_stages: 4
diff --git a/PaddleDetection-release-2.6/configs/gfl/gfl_r34vd_1x_coco.yml b/PaddleDetection-release-2.6/configs/gfl/gfl_r34vd_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..1f15085556b0fab9cf5693afe406510f5d55684a
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/gfl/gfl_r34vd_1x_coco.yml
@@ -0,0 +1,19 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/gfl_r50_fpn.yml',
+ '_base_/optimizer_1x.yml',
+ '_base_/gfl_reader.yml',
+]
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet34_vd_pretrained.pdparams
+weights: output/gfl_r34vd_1x_coco/model_final
+find_unused_parameters: True
+
+ResNet:
+ depth: 34
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [1,2,3]
+ num_stages: 4
diff --git a/PaddleDetection-release-2.6/configs/gfl/gfl_r50_fpn_1x_coco.yml b/PaddleDetection-release-2.6/configs/gfl/gfl_r50_fpn_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..2e17b23d8b34b6094731c9f3115cc3325ce697bc
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/gfl/gfl_r50_fpn_1x_coco.yml
@@ -0,0 +1,10 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/gfl_r50_fpn.yml',
+ '_base_/optimizer_1x.yml',
+ '_base_/gfl_reader.yml',
+]
+
+weights: output/gfl_r50_fpn_1x_coco/model_final
+find_unused_parameters: True
diff --git a/PaddleDetection-release-2.6/configs/gfl/gfl_slim_ld_r18vd_1x_coco.yml b/PaddleDetection-release-2.6/configs/gfl/gfl_slim_ld_r18vd_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..4417b701d33c31be672a6e2a66ce8b19882d39fa
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/gfl/gfl_slim_ld_r18vd_1x_coco.yml
@@ -0,0 +1,73 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/optimizer_1x.yml',
+ '_base_/gfl_reader.yml',
+]
+
+weights: output/gfl_r18vd_1x_coco/model_final
+find_unused_parameters: True
+
+architecture: GFL
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet18_vd_pretrained.pdparams
+
+GFL:
+ backbone: ResNet
+ neck: FPN
+ head: LDGFLHead
+
+ResNet:
+ depth: 18
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [1,2,3]
+ num_stages: 4
+
+FPN:
+ out_channel: 256
+ spatial_scales: [0.125, 0.0625, 0.03125]
+ extra_stage: 2
+ has_extra_convs: true
+ use_c5: false
+
+LDGFLHead: # new head
+ conv_feat:
+ name: FCOSFeat
+ feat_in: 256
+ feat_out: 256
+ num_convs: 4
+ norm_type: "gn"
+ use_dcn: false
+ fpn_stride: [8, 16, 32, 64, 128]
+ prior_prob: 0.01
+ reg_max: 16
+ loss_class:
+ name: QualityFocalLoss
+ use_sigmoid: True
+ beta: 2.0
+ loss_weight: 1.0
+ loss_dfl:
+ name: DistributionFocalLoss
+ loss_weight: 0.25
+ loss_bbox:
+ name: GIoULoss
+ loss_weight: 2.0
+ loss_ld:
+ name: KnowledgeDistillationKLDivLoss
+ loss_weight: 0.25
+ T: 10
+ loss_ld_vlr:
+ name: KnowledgeDistillationKLDivLoss
+ loss_weight: 0.25
+ T: 10
+ loss_kd:
+ name: KnowledgeDistillationKLDivLoss
+ loss_weight: 10
+ T: 2
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.025
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/gfl/gflv2_r50_fpn_1x_coco.yml b/PaddleDetection-release-2.6/configs/gfl/gflv2_r50_fpn_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..b0a3d410b19696c27f87841966fd9b51ad1088eb
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/gfl/gflv2_r50_fpn_1x_coco.yml
@@ -0,0 +1,10 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/gflv2_r50_fpn.yml',
+ '_base_/optimizer_1x.yml',
+ '_base_/gfl_reader.yml',
+]
+
+weights: output/gflv2_r50_fpn_1x_coco/model_final
+find_unused_parameters: True
diff --git a/PaddleDetection-release-2.6/configs/gn/README.md b/PaddleDetection-release-2.6/configs/gn/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..dc3b88db8324c3a2a9f897f8f02024e382ad6ff6
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/gn/README.md
@@ -0,0 +1,23 @@
+# Group Normalization
+
+## Model Zoo
+
+| Backbone | Model | Images/GPU | LR schedule | Inference time (fps) | Box AP | Mask AP | Download | Config |
+| :------------- | :------------- | :-----------: | :------: | :--------: |:-----: | :-----: | :----: | :----: |
+| ResNet50-FPN | Faster | 1 | 2x | - | 41.9 | - | [model](https://paddledet.bj.bcebos.com/models/faster_rcnn_r50_fpn_gn_2x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/gn/faster_rcnn_r50_fpn_gn_2x_coco.yml) |
+| ResNet50-FPN | Mask | 1 | 2x | - | 42.3 | 38.4 | [model](https://paddledet.bj.bcebos.com/models/mask_rcnn_r50_fpn_gn_2x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/gn/mask_rcnn_r50_fpn_gn_2x_coco.yml) |
+| ResNet50-FPN | Cascade Faster | 1 | 2x | - | 44.6 | - | [model](https://paddledet.bj.bcebos.com/models/cascade_rcnn_r50_fpn_gn_2x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/gn/cascade_rcnn_r50_fpn_gn_2x_coco.yml) |
+| ResNet50-FPN | Cascade Mask | 1 | 2x | - | 45.0 | 39.3 | [model](https://paddledet.bj.bcebos.com/models/cascade_mask_rcnn_r50_fpn_gn_2x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/gn/cascade_mask_rcnn_r50_fpn_gn_2x_coco.yml) |
+
+
+**Note:** The Faster R-CNN baseline uses only a `2fc` head, whereas the models here use a [`4conv1fc` head](https://arxiv.org/abs/1803.08494) (GN applied between the 4 conv layers) and also apply GN in the FPN; for Mask R-CNN, GN is additionally applied between the 4 conv layers of the mask head.
+
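Group Normalization divides the channels into groups and normalizes each group by its own mean and variance, so the statistics do not depend on the batch size. A scalar-level sketch of the per-group computation (spatial dimensions folded into a flat vector; a real GN layer additionally learns a per-channel scale and shift):

```python
import math

def group_norm_1d(x, num_groups, eps=1e-5):
    """Normalize a flat feature vector group by group, mirroring GN's
    per-(sample, group) statistics over channels x H x W elements."""
    size = len(x) // num_groups
    out = []
    for g in range(num_groups):
        grp = x[g * size:(g + 1) * size]
        mean = sum(grp) / size
        var = sum((v - mean) ** 2 for v in grp) / size
        out.extend((v - mean) / math.sqrt(var + eps) for v in grp)
    return out
```

Setting `num_groups` equal to the channel count recovers Instance Norm statistics, and a single group recovers Layer Norm.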
+## Citations
+```
+@inproceedings{wu2018group,
+ title={Group Normalization},
+ author={Wu, Yuxin and He, Kaiming},
+ booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
+ year={2018}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/gn/cascade_mask_rcnn_r50_fpn_gn_2x_coco.yml b/PaddleDetection-release-2.6/configs/gn/cascade_mask_rcnn_r50_fpn_gn_2x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..e2c750dfbe481eb6875fff6df0febba69d0ab947
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/gn/cascade_mask_rcnn_r50_fpn_gn_2x_coco.yml
@@ -0,0 +1,61 @@
+_BASE_: [
+ '../datasets/coco_instance.yml',
+ '../runtime.yml',
+ '../cascade_rcnn/_base_/optimizer_1x.yml',
+ '../cascade_rcnn/_base_/cascade_mask_rcnn_r50_fpn.yml',
+ '../cascade_rcnn/_base_/cascade_mask_fpn_reader.yml',
+]
+weights: output/cascade_mask_rcnn_r50_fpn_gn_2x_coco/model_final
+
+CascadeRCNN:
+ backbone: ResNet
+ neck: FPN
+ rpn_head: RPNHead
+ bbox_head: CascadeHead
+ mask_head: MaskHead
+ # post process
+ bbox_post_process: BBoxPostProcess
+ mask_post_process: MaskPostProcess
+
+FPN:
+ out_channel: 256
+ norm_type: gn
+
+CascadeHead:
+ head: CascadeXConvNormHead
+ roi_extractor:
+ resolution: 7
+ sampling_ratio: 0
+ aligned: True
+ bbox_assigner: BBoxAssigner
+
+CascadeXConvNormHead:
+ num_convs: 4
+ out_channel: 1024
+ norm_type: gn
+
+MaskHead:
+ head: MaskFeat
+ roi_extractor:
+ resolution: 14
+ sampling_ratio: 0
+ aligned: True
+ mask_assigner: MaskAssigner
+ share_bbox_feat: False
+
+MaskFeat:
+ num_convs: 4
+ out_channel: 256
+ norm_type: gn
+
+
+epoch: 24
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
diff --git a/PaddleDetection-release-2.6/configs/gn/cascade_rcnn_r50_fpn_gn_2x_coco.yml b/PaddleDetection-release-2.6/configs/gn/cascade_rcnn_r50_fpn_gn_2x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..2706790ed77301739e9d1374e9292f16a0c1c090
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/gn/cascade_rcnn_r50_fpn_gn_2x_coco.yml
@@ -0,0 +1,37 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '../cascade_rcnn/_base_/optimizer_1x.yml',
+ '../cascade_rcnn/_base_/cascade_rcnn_r50_fpn.yml',
+ '../cascade_rcnn/_base_/cascade_fpn_reader.yml',
+]
+weights: output/cascade_rcnn_r50_fpn_gn_2x_coco/model_final
+
+FPN:
+ out_channel: 256
+ norm_type: gn
+
+CascadeHead:
+ head: CascadeXConvNormHead
+ roi_extractor:
+ resolution: 7
+ sampling_ratio: 0
+ aligned: True
+ bbox_assigner: BBoxAssigner
+
+CascadeXConvNormHead:
+ num_convs: 4
+ out_channel: 1024
+ norm_type: gn
+
+
+epoch: 24
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
diff --git a/PaddleDetection-release-2.6/configs/gn/faster_rcnn_r50_fpn_gn_2x_coco.yml b/PaddleDetection-release-2.6/configs/gn/faster_rcnn_r50_fpn_gn_2x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..200a98b4b9fb615c17b7bd42f88b3bb1b2474370
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/gn/faster_rcnn_r50_fpn_gn_2x_coco.yml
@@ -0,0 +1,45 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '../faster_rcnn/_base_/optimizer_1x.yml',
+ '../faster_rcnn/_base_/faster_rcnn_r50_fpn.yml',
+ '../faster_rcnn/_base_/faster_fpn_reader.yml',
+]
+weights: output/faster_rcnn_r50_fpn_gn_2x_coco/model_final
+
+FasterRCNN:
+ backbone: ResNet
+ neck: FPN
+ rpn_head: RPNHead
+ bbox_head: BBoxHead
+ # post process
+ bbox_post_process: BBoxPostProcess
+
+FPN:
+ out_channel: 256
+ norm_type: gn
+
+BBoxHead:
+ head: XConvNormHead
+ roi_extractor:
+ resolution: 7
+ sampling_ratio: 0
+ aligned: True
+ bbox_assigner: BBoxAssigner
+
+XConvNormHead:
+ num_convs: 4
+ out_channel: 1024
+ norm_type: gn
+
+
+epoch: 24
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
diff --git a/PaddleDetection-release-2.6/configs/gn/mask_rcnn_r50_fpn_gn_2x_coco.yml b/PaddleDetection-release-2.6/configs/gn/mask_rcnn_r50_fpn_gn_2x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..70beaf5851df945745c904dc9932928d9cedac01
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/gn/mask_rcnn_r50_fpn_gn_2x_coco.yml
@@ -0,0 +1,61 @@
+_BASE_: [
+ '../datasets/coco_instance.yml',
+ '../runtime.yml',
+ '../mask_rcnn/_base_/optimizer_1x.yml',
+ '../mask_rcnn/_base_/mask_rcnn_r50_fpn.yml',
+ '../mask_rcnn/_base_/mask_fpn_reader.yml',
+]
+weights: output/mask_rcnn_r50_fpn_gn_2x_coco/model_final
+
+MaskRCNN:
+ backbone: ResNet
+ neck: FPN
+ rpn_head: RPNHead
+ bbox_head: BBoxHead
+ mask_head: MaskHead
+ # post process
+ bbox_post_process: BBoxPostProcess
+ mask_post_process: MaskPostProcess
+
+FPN:
+ out_channel: 256
+ norm_type: gn
+
+BBoxHead:
+ head: XConvNormHead
+ roi_extractor:
+ resolution: 7
+ sampling_ratio: 0
+ aligned: True
+ bbox_assigner: BBoxAssigner
+
+XConvNormHead:
+ num_convs: 4
+ out_channel: 1024
+ norm_type: gn
+
+MaskHead:
+ head: MaskFeat
+ roi_extractor:
+ resolution: 14
+ sampling_ratio: 0
+ aligned: True
+ mask_assigner: MaskAssigner
+ share_bbox_feat: False
+
+MaskFeat:
+ num_convs: 4
+ out_channel: 256
+ norm_type: gn
+
+
+epoch: 24
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
diff --git a/PaddleDetection-release-2.6/configs/hrnet/README.md b/PaddleDetection-release-2.6/configs/hrnet/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..f96ee347779fbebc31a44829be8e65765d3c089d
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/hrnet/README.md
@@ -0,0 +1,34 @@
+# High-resolution networks (HRNets) for object detection
+
+## Introduction
+
+- Deep High-Resolution Representation Learning for Human Pose Estimation: [https://arxiv.org/abs/1902.09212](https://arxiv.org/abs/1902.09212)
+
+```
+@inproceedings{SunXLW19,
+ title={Deep High-Resolution Representation Learning for Human Pose Estimation},
+ author={Ke Sun and Bin Xiao and Dong Liu and Jingdong Wang},
+ booktitle={CVPR},
+ year={2019}
+}
+```
+
+- High-Resolution Representations for Labeling Pixels and Regions: [https://arxiv.org/abs/1904.04514](https://arxiv.org/abs/1904.04514)
+
+```
+@article{SunZJCXLMWLW19,
+ title={High-Resolution Representations for Labeling Pixels and Regions},
+ author={Ke Sun and Yang Zhao and Borui Jiang and Tianheng Cheng and Bin Xiao
+ and Dong Liu and Yadong Mu and Xinggang Wang and Wenyu Liu and Jingdong Wang},
+ journal = {CoRR},
+ volume = {abs/1904.04514},
+ year={2019}
+}
+```
+
+## Model Zoo
+
+| Backbone | Type | Image/gpu | Lr schd | Inf time (fps) | Box AP | Mask AP | Download | Configs |
+| :---------------------- | :------------- | :-------: | :-----: | :------------: | :----: | :-----: | :----------------------------------------------------------: | :-----: |
+| HRNetV2p_W18 | Faster | 1 | 1x | - | 36.8 | - | [model](https://paddledet.bj.bcebos.com/models/faster_rcnn_hrnetv2p_w18_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/hrnet/faster_rcnn_hrnetv2p_w18_1x_coco.yml) |
+| HRNetV2p_W18 | Faster | 1 | 2x | - | 39.0 | - | [model](https://paddledet.bj.bcebos.com/models/faster_rcnn_hrnetv2p_w18_2x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/hrnet/faster_rcnn_hrnetv2p_w18_2x_coco.yml) |
diff --git a/PaddleDetection-release-2.6/configs/hrnet/_base_/faster_rcnn_hrnetv2p_w18.yml b/PaddleDetection-release-2.6/configs/hrnet/_base_/faster_rcnn_hrnetv2p_w18.yml
new file mode 100644
index 0000000000000000000000000000000000000000..6c556f306fdc2ea5bd320376236143984f4cba6a
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/hrnet/_base_/faster_rcnn_hrnetv2p_w18.yml
@@ -0,0 +1,68 @@
+architecture: FasterRCNN
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/HRNet_W18_C_pretrained.pdparams
+
+FasterRCNN:
+ backbone: HRNet
+ neck: HRFPN
+ rpn_head: RPNHead
+ bbox_head: BBoxHead
+ # post process
+ bbox_post_process: BBoxPostProcess
+
+HRNet:
+ width: 18
+ freeze_at: 0
+ return_idx: [0, 1, 2, 3]
+
+HRFPN:
+ out_channel: 256
+ share_conv: false
+
+RPNHead:
+ anchor_generator:
+ aspect_ratios: [0.5, 1.0, 2.0]
+ anchor_sizes: [[32], [64], [128], [256], [512]]
+ strides: [4, 8, 16, 32, 64]
+ rpn_target_assign:
+ batch_size_per_im: 256
+ fg_fraction: 0.5
+ negative_overlap: 0.3
+ positive_overlap: 0.7
+ use_random: True
+ train_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 2000
+ post_nms_top_n: 2000
+ topk_after_collect: True
+ test_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 1000
+ post_nms_top_n: 1000
+
+BBoxHead:
+ head: TwoFCHead
+ roi_extractor:
+ resolution: 7
+ sampling_ratio: 0
+ aligned: True
+ bbox_assigner: BBoxAssigner
+
+BBoxAssigner:
+ batch_size_per_im: 512
+ bg_thresh: 0.5
+ fg_thresh: 0.5
+ fg_fraction: 0.25
+ use_random: True
+
+TwoFCHead:
+ out_channel: 1024
+
+BBoxPostProcess:
+ decode: RCNNBox
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.05
+ nms_threshold: 0.5
diff --git a/PaddleDetection-release-2.6/configs/hrnet/faster_rcnn_hrnetv2p_w18_1x_coco.yml b/PaddleDetection-release-2.6/configs/hrnet/faster_rcnn_hrnetv2p_w18_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..6ff05964c41e05b2d7aaee9bf6ef330cee2337c0
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/hrnet/faster_rcnn_hrnetv2p_w18_1x_coco.yml
@@ -0,0 +1,23 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ './_base_/faster_rcnn_hrnetv2p_w18.yml',
+ '../faster_rcnn/_base_/optimizer_1x.yml',
+ '../faster_rcnn/_base_/faster_fpn_reader.yml',
+ '../runtime.yml',
+]
+
+weights: output/faster_rcnn_hrnetv2p_w18_1x_coco/model_final
+epoch: 12
+
+LearningRate:
+ base_lr: 0.02
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [8, 11]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
+
+TrainReader:
+ batch_size: 2
diff --git a/PaddleDetection-release-2.6/configs/hrnet/faster_rcnn_hrnetv2p_w18_2x_coco.yml b/PaddleDetection-release-2.6/configs/hrnet/faster_rcnn_hrnetv2p_w18_2x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..73d9dc8850f67956c724bb033726b177aa703d37
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/hrnet/faster_rcnn_hrnetv2p_w18_2x_coco.yml
@@ -0,0 +1,23 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ './_base_/faster_rcnn_hrnetv2p_w18.yml',
+ '../faster_rcnn/_base_/optimizer_1x.yml',
+ '../faster_rcnn/_base_/faster_fpn_reader.yml',
+ '../runtime.yml',
+]
+
+weights: output/faster_rcnn_hrnetv2p_w18_2x_coco/model_final
+epoch: 24
+
+LearningRate:
+ base_lr: 0.02
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
+
+TrainReader:
+ batch_size: 2
diff --git a/PaddleDetection-release-2.6/configs/keypoint/KeypointBenchmark.md b/PaddleDetection-release-2.6/configs/keypoint/KeypointBenchmark.md
new file mode 100644
index 0000000000000000000000000000000000000000..c7e5bd6ac090ea332c794013f7855faa0fa438c9
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/keypoint/KeypointBenchmark.md
@@ -0,0 +1,50 @@
+# Keypoint Inference Benchmark
+
+## Benchmark on Server
+We benchmarked these models in different runtime environments. See the tables below for details.
+
+| Model | CPU + MKLDNN (thread=1) | CPU + MKLDNN (thread=4) | GPU | TensorRT (FP32) | TensorRT (FP16) |
+| :------------------------ | :------: | :------: | :-----: | :---: | :---: |
+| LiteHRNet-18-256x192 | 88.8 ms | 40.7 ms | 4.4 ms | 2.0 ms | 1.8 ms |
+| LiteHRNet-18-384x288 | 188.0 ms | 79.3 ms | 4.8 ms | 3.6 ms | 3.2 ms |
+| LiteHRNet-30-256x192 | 148.4 ms | 69.0 ms | 7.1 ms | 3.1 ms | 2.8 ms |
+| LiteHRNet-30-384x288 | 309.8 ms | 133.5 ms | 8.2 ms | 6.0 ms | 5.3 ms |
+| PP-TinyPose-128x96 | 25.2 ms | 14.1 ms | 2.7 ms | 0.9 ms | 0.8 ms |
+| PP-TinyPose-256x192 | 82.4 ms | 36.1 ms | 3.0 ms | 1.5 ms | 1.1 ms |
+
+**Notes:**
+- The tests above are based on Python deployment.
+- The environment is NVIDIA T4 / PaddlePaddle(commit: 7df301f2fc0602745e40fa3a7c43ccedd41786ca) / CUDA10.1 / CUDNN7 / Python3.7 / TensorRT6.
+- The test is based on deploy/python/det_keypoint_unite_infer.py with the image demo/000000014439.jpg, and the input batch size for the keypoint model is set to 8.
+- The time only includes inference time.
+
+
+| Model | CPU + MKLDNN (thread=1) | CPU + MKLDNN (thread=4) | GPU | TensorRT (FP32) | TensorRT (FP16) |
+| :------------------------ | :------: | :------: | :-----: | :---: | :---: |
+| DARK_HRNet_w32-256x192 | 363.93 ms | 97.38 ms | 4.13 ms | 3.74 ms | 1.75 ms |
+| DARK_HRNet_w32-384x288 | 823.71 ms | 218.55 ms | 9.44 ms | 8.91 ms | 2.96 ms |
+| HRNet_w32-256x192 | 363.67 ms | 97.64 ms | 4.11 ms | 3.71 ms | 1.72 ms |
+| HRNet_w32-256x256_mpii | 485.56 ms | 131.48 ms | 4.81 ms | 4.26 ms | 2.00 ms |
+| HRNet_w32-384x288 | 822.73 ms | 215.48 ms | 9.40 ms | 8.81 ms | 2.97 ms |
+| PP-TinyPose-128x96 | 24.06 ms | 13.05 ms | 2.43 ms | 0.75 ms | 0.72 ms |
+| PP-TinyPose-256x192 | 82.73 ms | 36.25 ms | 2.57 ms | 1.38 ms | 1.15 ms |
+
+
+**Notes:**
+- The tests above are based on C++ deployment.
+- The environment is NVIDIA T4 / PaddlePaddle(commit: 7df301f2fc0602745e40fa3a7c43ccedd41786ca) / CUDA10.1 / CUDNN7 / Python3.7 / TensorRT6.
+- The test is based on deploy/python/det_keypoint_unite_infer.py with the image demo/000000014439.jpg, and the input batch size for the keypoint model is set to 8.
+- The time only includes inference time.
+
+## Benchmark on Mobile
+We benchmarked these models on Kirin and Qualcomm Snapdragon devices. See the table below for details.
+
+| Model | Kirin 980 (1-thread) | Kirin 980 (4-threads) | Qualcomm Snapdragon 845 (1-thread) | Qualcomm Snapdragon 845 (4-threads) | Qualcomm Snapdragon 660 (1-thread) | Qualcomm Snapdragon 660 (4-threads) |
+| :------------------------ | :---: | :---: | :---: | :---: | :---: | :---: |
+| PicoDet-s-192x192 (det) | 14.85 ms | 5.45 ms | 17.50 ms | 7.56 ms | 80.08 ms | 27.36 ms |
+| PicoDet-s-320x320 (det) | 38.09 ms | 12.00 ms | 45.26 ms | 17.07 ms | 232.81 ms | 58.68 ms |
+| PP-TinyPose-128x96 (pose) | 12.03 ms | 5.09 ms | 13.14 ms | 6.73 ms | 71.87 ms | 20.04 ms |
+
+**Notes:**
+- The tests above are based on Paddle Lite deployment, version v2.10-rc.
+- The time only includes inference time.
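As a reference for reproducing numbers like those above, here is a minimal sketch of the usual latency-measurement protocol (warm-up iterations discarded, latency averaged over repeats). It is an illustration of the methodology, not the exact benchmark script used for these tables:

```python
import time

def benchmark_ms(infer_fn, warmup=10, repeats=100):
    """Return the average latency of infer_fn in milliseconds."""
    for _ in range(warmup):
        infer_fn()  # discard warm-up runs (model load, caches, kernel autotuning)
    start = time.perf_counter()
    for _ in range(repeats):
        infer_fn()
    return (time.perf_counter() - start) / repeats * 1000.0
```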
diff --git a/PaddleDetection-release-2.6/configs/keypoint/README.md b/PaddleDetection-release-2.6/configs/keypoint/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..f81b2fabbd4a27c5bb7a56fca7abce34660af556
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/keypoint/README.md
@@ -0,0 +1,287 @@
+简体中文 | [English](README_en.md)
+
+# Keypoint Detection Models
+
+
+

+
+
+## Content
+
+- [Introduction](#introduction)
+- [Model Recommendation](#model-recommendation)
+- [Model Zoo](#model-zoo)
+- [Getting Started](#getting-started)
+  - [Environment Installation](#1-environment-installation)
+  - [Dataset Preparation](#2-dataset-preparation)
+  - [Training and Testing](#3-training-and-testing)
+    - [Training on a Single GPU](#training-on-a-single-gpu)
+    - [Training on Multiple GPUs](#training-on-multiple-gpus)
+    - [Evaluation](#evaluation)
+    - [Inference](#inference)
+    - [Deployment](#deployment)
+      - [Joint Deployment of Top-Down Models](#joint-deployment-of-top-down-models)
+      - [Standalone Deployment of Bottom-Up Models](#standalone-deployment-of-bottom-up-models)
+      - [Joint Deployment with FairMOT](#joint-deployment-with-the-multi-object-tracking-model-fairmot)
+      - [Complete Deployment Tutorials and Demos](#complete-deployment-tutorials-and-demos)
+- [Training with Custom Data](#training-with-custom-data)
+- [BenchMark](#benchmark)
+
+## Introduction
+
+The keypoint detection part of PaddleDetection closely follows state-of-the-art algorithms and covers both Top-Down and Bottom-Up methods to meet different needs. Top-Down methods detect each object first and then estimate the keypoints within the detected box; they are more accurate, but slow down as the number of objects grows. Bottom-Up methods instead detect keypoints first and then group or connect them into individual human pose instances; their speed is fixed and does not degrade with more objects, but they are less accurate.
+
+PaddleDetection also provides [PP-TinyPose](./tiny_pose/README.md), a self-developed real-time keypoint detection model optimized for mobile devices.
+
+## Model Recommendation
+
+### Recommended Mobile Models
+
+| Detection Model | Keypoint Model | Input Size | COCO Accuracy | Average Inference Time (FP16) | Params (M) | Flops (G) | Model Weights | Paddle-Lite Inference Model (FP16) |
+| :--- | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
+| [PicoDet-S-Pedestrian](../picodet/legacy_model/application/pedestrian_detection/picodet_s_192_pedestrian.yml) | [PP-TinyPose](./tiny_pose/tinypose_128x96.yml) | Detection: 192x192<br>Keypoint: 128x96 | Detection mAP: 29.0<br>Keypoint AP: 58.1 | Detection: 2.37ms<br>Keypoint: 3.27ms | Detection: 1.18<br>Keypoint: 1.36 | Detection: 0.35<br>Keypoint: 0.08 | [Detection](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_192_pedestrian.pdparams)<br>[Keypoint](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_128x96.pdparams) | [Detection](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_192_pedestrian_fp16.nb)<br>[Keypoint](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_128x96_fp16.nb) |
+| [PicoDet-S-Pedestrian](../picodet/legacy_model/application/pedestrian_detection/picodet_s_320_pedestrian.yml) | [PP-TinyPose](./tiny_pose/tinypose_256x192.yml) | Detection: 320x320<br>Keypoint: 256x192 | Detection mAP: 38.5<br>Keypoint AP: 68.8 | Detection: 6.30ms<br>Keypoint: 8.33ms | Detection: 1.18<br>Keypoint: 1.36 | Detection: 0.97<br>Keypoint: 0.32 | [Detection](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_320_pedestrian.pdparams)<br>[Keypoint](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_256x192.pdparams) | [Detection](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_320_pedestrian_fp16.nb)<br>[Keypoint](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_256x192_fp16.nb) |
+
+
+*For details on using PP-TinyPose, please refer to the [document](./tiny_pose/README.md).
+
+### Recommended Server Models
+
+| Detection Model | Keypoint Model | Input Size | COCO Accuracy | Params (M) | Flops (G) | Model Weights |
+| :--- | :--- | :---: | :---: | :---: | :---: | :---: |
+| [PP-YOLOv2](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.3/configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml) | [HRNet-w32](./hrnet/hrnet_w32_384x288.yml) | Detection: 640x640<br>Keypoint: 384x288 | Detection mAP: 49.5<br>Keypoint AP: 77.8 | Detection: 54.6<br>Keypoint: 28.6 | Detection: 115.8<br>Keypoint: 17.3 | [Detection](https://paddledet.bj.bcebos.com/models/ppyolov2_r50vd_dcn_365e_coco.pdparams)<br>[Keypoint](https://paddledet.bj.bcebos.com/models/keypoint/hrnet_w32_384x288.pdparams) |
+| [PP-YOLOv2](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.3/configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml) | [HRNet-w32](./hrnet/hrnet_w32_256x192.yml) | Detection: 640x640<br>Keypoint: 256x192 | Detection mAP: 49.5<br>Keypoint AP: 76.9 | Detection: 54.6<br>Keypoint: 28.6 | Detection: 115.8<br>Keypoint: 7.68 | [Detection](https://paddledet.bj.bcebos.com/models/ppyolov2_r50vd_dcn_365e_coco.pdparams)<br>[Keypoint](https://paddledet.bj.bcebos.com/models/keypoint/hrnet_w32_256x192.pdparams) |
+
+
+## Model Zoo
+
+COCO Dataset
+
+| Model | Approach | Input Size | AP (COCO val) | Model Download | Config |
+| :---------------- | -------- | :----------: | :----------------------------------------------------------: | ----------------------------------------------------| ------- |
+| PETR_Res50 |One-Stage| 512 | 65.5 | [petr_res50.pdparams](https://bj.bcebos.com/v1/paddledet/models/keypoint/petr_resnet50_16x2_coco.pdparams) | [config](./petr/petr_resnet50_16x2_coco.yml) |
+| HigherHRNet-w32 |Bottom-Up| 512 | 67.1 | [higherhrnet_hrnet_w32_512.pdparams](https://paddledet.bj.bcebos.com/models/keypoint/higherhrnet_hrnet_w32_512.pdparams) | [config](./higherhrnet/higherhrnet_hrnet_w32_512.yml) |
+| HigherHRNet-w32 | Bottom-Up| 640 | 68.3 | [higherhrnet_hrnet_w32_640.pdparams](https://paddledet.bj.bcebos.com/models/keypoint/higherhrnet_hrnet_w32_640.pdparams) | [config](./higherhrnet/higherhrnet_hrnet_w32_640.yml) |
+| HigherHRNet-w32+SWAHR |Bottom-Up| 512 | 68.9 | [higherhrnet_hrnet_w32_512_swahr.pdparams](https://paddledet.bj.bcebos.com/models/keypoint/higherhrnet_hrnet_w32_512_swahr.pdparams) | [config](./higherhrnet/higherhrnet_hrnet_w32_512_swahr.yml) |
+| HRNet-w32 | Top-Down| 256x192 | 76.9 | [hrnet_w32_256x192.pdparams](https://paddledet.bj.bcebos.com/models/keypoint/hrnet_w32_256x192.pdparams) | [config](./hrnet/hrnet_w32_256x192.yml) |
+| HRNet-w32 |Top-Down| 384x288 | 77.8 | [hrnet_w32_384x288.pdparams](https://paddledet.bj.bcebos.com/models/keypoint/hrnet_w32_384x288.pdparams) | [config](./hrnet/hrnet_w32_384x288.yml) |
+| HRNet-w32+DarkPose |Top-Down| 256x192 | 78.0 | [dark_hrnet_w32_256x192.pdparams](https://paddledet.bj.bcebos.com/models/keypoint/dark_hrnet_w32_256x192.pdparams) | [config](./hrnet/dark_hrnet_w32_256x192.yml) |
+| HRNet-w32+DarkPose |Top-Down| 384x288 | 78.3 | [dark_hrnet_w32_384x288.pdparams](https://paddledet.bj.bcebos.com/models/keypoint/dark_hrnet_w32_384x288.pdparams) | [config](./hrnet/dark_hrnet_w32_384x288.yml) |
+| WiderNaiveHRNet-18 | Top-Down|256x192 | 67.6(+DARK 68.4) | [wider_naive_hrnet_18_256x192_coco.pdparams](https://bj.bcebos.com/v1/paddledet/models/keypoint/wider_naive_hrnet_18_256x192_coco.pdparams) | [config](./lite_hrnet/wider_naive_hrnet_18_256x192_coco.yml) |
+| LiteHRNet-18 |Top-Down| 256x192 | 66.5 | [lite_hrnet_18_256x192_coco.pdparams](https://bj.bcebos.com/v1/paddledet/models/keypoint/lite_hrnet_18_256x192_coco.pdparams) | [config](./lite_hrnet/lite_hrnet_18_256x192_coco.yml) |
+| LiteHRNet-18 |Top-Down| 384x288 | 69.7 | [lite_hrnet_18_384x288_coco.pdparams](https://bj.bcebos.com/v1/paddledet/models/keypoint/lite_hrnet_18_384x288_coco.pdparams) | [config](./lite_hrnet/lite_hrnet_18_384x288_coco.yml) |
+| LiteHRNet-30 | Top-Down|256x192 | 69.4 | [lite_hrnet_30_256x192_coco.pdparams](https://bj.bcebos.com/v1/paddledet/models/keypoint/lite_hrnet_30_256x192_coco.pdparams) | [config](./lite_hrnet/lite_hrnet_30_256x192_coco.yml) |
+| LiteHRNet-30 |Top-Down| 384x288 | 72.5 | [lite_hrnet_30_384x288_coco.pdparams](https://bj.bcebos.com/v1/paddledet/models/keypoint/lite_hrnet_30_384x288_coco.pdparams) | [config](./lite_hrnet/lite_hrnet_30_384x288_coco.yml) |
+
+Note: the AP results of Top-Down models are evaluated with ground-truth bounding boxes.
+
+MPII Dataset
+| Model | Approach | Input Size | PCKh (Mean) | PCKh (Mean@0.1) | Model Download | Config |
+| :---- | ---|----- | :--------: | :------------: | :----------------------------------------------------------: | -------------------------------------------- |
+| HRNet-w32 | Top-Down|256x256 | 90.6 | 38.5 | [hrnet_w32_256x256_mpii.pdparams](https://paddledet.bj.bcebos.com/models/keypoint/hrnet_w32_256x256_mpii.pdparams) | [config](./hrnet/hrnet_w32_256x256_mpii.yml) |
+
+Scenario Models
+| Model | Approach | Input Size | Accuracy | Inference Speed | Model Weights | Deployment Model | Notes |
+| :---- | :--- | :---: | :---: | :---: | :---: | :---: | :--- |
+| HRNet-w32 + DarkPose | Top-Down | 256x192 | AP: 87.1 (internal business dataset) | 2.9 ms per person | [Download](https://bj.bcebos.com/v1/paddledet/models/pipeline/dark_hrnet_w32_256x192.pdparams) | [Download](https://bj.bcebos.com/v1/paddledet/models/pipeline/dark_hrnet_w32_256x192.zip) | Specially optimized for fall-detection scenarios; used in [PP-Human](../../deploy/pipeline/README.md) |
+
+
+We also provide [PP-TinyPose](./tiny_pose/README.md), a real-time keypoint detection model based on LiteHRNet (Top-Down) and optimized for mobile devices. You are welcome to try it.
+
+
+
+## Getting Started
+
+### 1. Environment Installation
+
+Please refer to the PaddleDetection [installation document](../../docs/tutorials/INSTALL_cn.md) to install PaddlePaddle and PaddleDetection.
+
+### 2. Dataset Preparation
+
+The keypoint models currently support the [COCO](https://cocodataset.org/#keypoints-2017) and [MPII](http://human-pose.mpi-inf.mpg.de/#overview) datasets. For dataset preparation, please refer to [Keypoint Dataset Preparation](../../docs/tutorials/data/PrepareKeypointDataSet.md).
+
+For a description of the config files, please refer to the [Keypoint Config Guide](../../docs/tutorials/KeyPointConfigGuide_cn.md).
+
+- Note that when Top-Down models are tested with detected boxes, a bbox.json file generated by a detection model is required. Detection results on COCO val2017 can be downloaded from [Detector having human AP of 56.4 on COCO val2017 dataset](https://paddledet.bj.bcebos.com/data/bbox.json); place it under the root directory (PaddleDetection), set `use_gt_bbox: False` in the config file, and then run the test command as usual.
+
+### 3. Training and Testing
+
+#### Training on a Single GPU
+
+```shell
+#COCO DataSet
+CUDA_VISIBLE_DEVICES=0 python3 tools/train.py -c configs/keypoint/higherhrnet/higherhrnet_hrnet_w32_512.yml
+
+#MPII DataSet
+CUDA_VISIBLE_DEVICES=0 python3 tools/train.py -c configs/keypoint/hrnet/hrnet_w32_256x256_mpii.yml
+```
+
+#### Training on Multiple GPUs
+
+```shell
+#COCO DataSet
+CUDA_VISIBLE_DEVICES=0,1,2,3 python3 -m paddle.distributed.launch tools/train.py -c configs/keypoint/higherhrnet/higherhrnet_hrnet_w32_512.yml
+
+#MPII DataSet
+CUDA_VISIBLE_DEVICES=0,1,2,3 python3 -m paddle.distributed.launch tools/train.py -c configs/keypoint/hrnet/hrnet_w32_256x256_mpii.yml
+```
+
+#### Evaluation
+
+```shell
+#COCO DataSet
+CUDA_VISIBLE_DEVICES=0 python3 tools/eval.py -c configs/keypoint/higherhrnet/higherhrnet_hrnet_w32_512.yml
+
+#MPII DataSet
+CUDA_VISIBLE_DEVICES=0 python3 tools/eval.py -c configs/keypoint/hrnet/hrnet_w32_256x256_mpii.yml
+
+#To save only the evaluation predictions, set the save_prediction_only flag; results are saved to output/keypoints_results.json by default
+CUDA_VISIBLE_DEVICES=0 python3 tools/eval.py -c configs/keypoint/higherhrnet/higherhrnet_hrnet_w32_512.yml --save_prediction_only
+```
+
+#### Inference
+
+Note: Top-Down models only support inference on cropped single-person images. For multi-person images, please use the joint deployment inference, or use a Bottom-Up model.
+
+```shell
+CUDA_VISIBLE_DEVICES=0 python3 tools/infer.py -c configs/keypoint/higherhrnet/higherhrnet_hrnet_w32_512.yml -o weights=./output/higherhrnet_hrnet_w32_512/model_final.pdparams --infer_dir=../images/ --draw_threshold=0.5 --save_txt=True
+```
+
+#### Deployment
+
+##### Joint Deployment of Top-Down Models
+
+```shell
+#Export the detection model
+python tools/export_model.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyolov2_r50vd_dcn_365e_coco.pdparams
+
+#Export the keypoint model
+python tools/export_model.py -c configs/keypoint/hrnet/hrnet_w32_256x192.yml -o weights=https://paddledet.bj.bcebos.com/models/keypoint/hrnet_w32_256x192.pdparams
+
+#Joint deployment of detector + top-down keypoint model (joint inference only supports the top-down approach)
+python deploy/python/det_keypoint_unite_infer.py --det_model_dir=output_inference/ppyolo_r50vd_dcn_2x_coco/ --keypoint_model_dir=output_inference/hrnet_w32_384x288/ --video_file=../video/xxx.mp4 --device=gpu
+```
+
+##### Standalone Deployment of Bottom-Up Models
+
+```shell
+#Export the model
+python tools/export_model.py -c configs/keypoint/higherhrnet/higherhrnet_hrnet_w32_512.yml -o weights=output/higherhrnet_hrnet_w32_512/model_final.pdparams
+
+#Deploy and run inference
+python deploy/python/keypoint_infer.py --model_dir=output_inference/higherhrnet_hrnet_w32_512/ --image_file=./demo/000000014439_640x640.jpg --device=gpu --threshold=0.5
+```
+
+##### Joint Deployment with the Multi-Object Tracking Model FairMOT
+
+```shell
+#Export the FairMOT tracking model
+python tools/export_model.py -c configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608.pdparams
+
+#Joint inference in Python with the exported tracking and keypoint models
+python deploy/python/mot_keypoint_unite_infer.py --mot_model_dir=output_inference/fairmot_dla34_30e_1088x608/ --keypoint_model_dir=output_inference/higherhrnet_hrnet_w32_512/ --video_file={your video name}.mp4 --device=GPU
+```
+
+**Note:**
+For exporting the tracking model, please refer to the [document](../mot/README.md).
+
+### Complete Deployment Tutorials and Demos
+
+
+We provide deployment support for Paddle Inference (server side), Paddle Lite (mobile side), and third-party engines (MNN, OpenVINO). Standalone, complete deployment code that does not depend on the training code is available in the corresponding subfolders under deploy. See the [deployment documentation](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/deploy/README.md) for details.
+
+## Training with Custom Data
+
+We take [tinypose_256x192](./tiny_pose/README.md) as an example to explain what to modify for custom data:
+
+#### 1. Config file [tinypose_256x192.yml](../../configs/keypoint/tiny_pose/tinypose_256x192.yml)
+
+The basic modifications and their meanings are as follows:
+
+```
+num_joints: &num_joints 17    #number of keypoints in the custom data
+train_height: &train_height 256    #training image height h
+train_width: &train_width 192    #training image width w
+hmsize: &hmsize [48, 64]    #output size for the training size, here 1/4 of the input [w, h]
+flip_perm: &flip_perm [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12], [13, 14], [15, 16]]    #left/right symmetric keypoint pairs, used for flip augmentation. If the keypoints have no symmetric structure, add a line "flip: False" after flip_pairs under RandomFlipHalfBodyTransform in TrainReader (mind the indentation)
+num_joints_half_body: 8    #number of half-body keypoints, used for half-body augmentation
+prob_half_body: 0.3    #probability of applying half-body augmentation; set to 0 if not needed
+upper_body_ids: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]    #keypoint ids of the upper body, used to pick upper-body keypoints during half-body augmentation
+```
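The values above can be sanity-checked with two tiny helpers (hypothetical, assuming the 1/4 output stride stated in the comments):

```python
def heatmap_size(train_width, train_height, stride=4):
    # the output heatmap is [w, h] at 1/stride of the input resolution
    return [train_width // stride, train_height // stride]

def check_flip_perm(flip_perm, num_joints):
    # every index in the left/right pairs must be a valid keypoint id
    return all(0 <= i < num_joints for pair in flip_perm for i in pair)
```

For tinypose_256x192, `heatmap_size(192, 256)` gives `[48, 64]`, matching the `hmsize` entry above.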
+
+The above covers the modifications needed for custom data. For the complete config and descriptions, please refer to the [Keypoint Config Guide](../../docs/tutorials/KeyPointConfigGuide_cn.md).
+
+#### 2. Other code modifications (affecting testing and visualization)
+- sigmas in keypoint_utils.py: sigmas = np.array([.26, .25, .25, .35, .35, .79, .79, .72, .72, .62, .62, 1.07, 1.07, .87, .87, .89, .89]) / 10.0 gives the variance of each keypoint's confident region; set it according to your actual keypoints. Use roughly 0.25-0.5 for precisely localized parts (e.g., eyes) and 0.5-1.0 for large regions (e.g., shoulders); if unsure, 0.75 is a reasonable default.
+- EDGES in the draw_pose function of visualizer.py defines the connections drawn between keypoints during visualization.
+- sigmas in pycocotools: same as the setting in keypoint_utils.py above; it is used when computing the COCO evaluation metrics.
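These sigmas enter COCO's OKS (object keypoint similarity) metric. A minimal sketch of the per-keypoint similarity term, mirroring the formula used by pycocotools (d2 is the squared pixel distance to the ground-truth keypoint, area is the ground-truth object area):

```python
import math

def oks_keypoint(d2, area, sigma):
    # exp(-d^2 / (2 * s^2 * k^2)) with s^2 = object area and k = 2 * sigma
    k2 = (2.0 * sigma) ** 2
    return math.exp(-d2 / (2.0 * area * k2 + 1e-12))
```

A larger sigma makes the metric more forgiving for that keypoint, which is why loosely localized parts such as hips get larger values than eyes.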
+
+#### 3. Notes on data preparation
+- Prepare the training data in COCO format, including keypoint [Nx3] and bounding box [N] annotations.
+- Make sure area > 0: annotations with area = 0 are filtered out during training. In addition, due to COCO's evaluation mechanism, annotations with small area are also filtered out during evaluation, so for custom data we recommend setting `area = bbox_w * bbox_h`.
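Following that note, a minimal sketch of building one COCO-format annotation with `area = bbox_w * bbox_h` (field names follow the COCO keypoint format; the helper itself is illustrative):

```python
def make_annotation(ann_id, image_id, keypoints, bbox):
    # bbox is [x, y, w, h]; keypoints is a flat [x1, y1, v1, ...] list (N x 3)
    x, y, w, h = bbox
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": 1,  # person
        "keypoints": keypoints,
        # count keypoints whose visibility flag v is > 0
        "num_keypoints": sum(1 for i in range(2, len(keypoints), 3) if keypoints[i] > 0),
        "bbox": bbox,
        "area": w * h,  # area = bbox_w * bbox_h, must be > 0
        "iscrowd": 0,
    }
```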
+
+Feedback is welcome if anything is missing.
+
+
+## Keypoint Stabilization Strategy (Video Data Only)
+When keypoint algorithms process video data, predictions are made frame by frame, so the video results often jitter. In applications that depend on fine-grained coordinates (e.g., fitness rep counting or keypoint-based virtual rendering), this easily causes false detections or a poor experience. To address this, PaddleDetection's keypoint video inference integrates two stabilization methods: the [One Euro filter](http://www.lifl.fr/~casiez/publications/CHI2012-casiez.pdf) and EMA. Both combine the current keypoint coordinates with historical results so that the output coordinates are more stable and smooth, and both can be enabled with a single switch in Python and C++ inference.
+
+```bash
+# Python inference
+python deploy/python/det_keypoint_unite_infer.py \
+ --det_model_dir output_inference/picodet_s_320 \
+ --keypoint_model_dir output_inference/tinypose_256x192 \
+ --video_file test_video.mp4 --device gpu --smooth True
+
+# C++ inference
+./deploy/cpp/build/main --det_model_dir output_inference/picodet_s_320 \
+ --keypoint_model_dir output_inference/tinypose_256x192 \
+ --video_file test_video.mp4 --device gpu --smooth True
+```
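Of the two stabilization methods, EMA is the simpler. A minimal sketch of exponentially smoothing per-frame keypoint coordinates (an illustration of the idea, not the PaddleDetection implementation):

```python
class EMASmoother:
    """Blend each new keypoint estimate with the previous smoothed one."""

    def __init__(self, alpha=0.5):
        self.alpha = alpha  # higher alpha -> follow new observations faster
        self.prev = None

    def smooth(self, keypoints):
        # keypoints: list of (x, y) coordinates for one person in one frame
        if self.prev is None:
            self.prev = keypoints
        else:
            self.prev = [
                (self.alpha * x + (1 - self.alpha) * px,
                 self.alpha * y + (1 - self.alpha) * py)
                for (x, y), (px, py) in zip(keypoints, self.prev)
            ]
        return self.prev
```

The One Euro filter refines the same idea by adapting the smoothing factor to the keypoint's speed, reducing lag during fast motion while still suppressing jitter when the subject is nearly still.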
+The effect is shown below:
+
+
+
+## BenchMark
+
+We provide test results in different runtime environments for reference when choosing a model. For detailed data, see [Keypoint Inference Benchmark](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/configs/keypoint/KeypointBenchmark.md).
+
+## Citation
+
+```
+@inproceedings{cheng2020bottom,
+ title={HigherHRNet: Scale-Aware Representation Learning for Bottom-Up Human Pose Estimation},
+ author={Bowen Cheng and Bin Xiao and Jingdong Wang and Honghui Shi and Thomas S. Huang and Lei Zhang},
+ booktitle={CVPR},
+ year={2020}
+}
+
+@inproceedings{SunXLW19,
+ title={Deep High-Resolution Representation Learning for Human Pose Estimation},
+ author={Ke Sun and Bin Xiao and Dong Liu and Jingdong Wang},
+ booktitle={CVPR},
+ year={2019}
+}
+
+@article{wang2019deep,
+ title={Deep High-Resolution Representation Learning for Visual Recognition},
+ author={Wang, Jingdong and Sun, Ke and Cheng, Tianheng and Jiang, Borui and Deng, Chaorui and Zhao, Yang and Liu, Dong and Mu, Yadong and Tan, Mingkui and Wang, Xinggang and Liu, Wenyu and Xiao, Bin},
+ journal={TPAMI},
+ year={2019}
+}
+
+@InProceedings{Zhang_2020_CVPR,
+ author = {Zhang, Feng and Zhu, Xiatian and Dai, Hanbin and Ye, Mao and Zhu, Ce},
+ title = {Distribution-Aware Coordinate Representation for Human Pose Estimation},
+ booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
+ month = {June},
+ year = {2020}
+}
+
+@inproceedings{Yulitehrnet21,
+ title={Lite-HRNet: A Lightweight High-Resolution Network},
+ author={Yu, Changqian and Xiao, Bin and Gao, Changxin and Yuan, Lu and Zhang, Lei and Sang, Nong and Wang, Jingdong},
+ booktitle={CVPR},
+ year={2021}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/keypoint/README_en.md b/PaddleDetection-release-2.6/configs/keypoint/README_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..64ffc39d61c63a3893b079e255facaec3620aeb6
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/keypoint/README_en.md
@@ -0,0 +1,269 @@
+[简体中文](README.md) | English
+
+# KeyPoint Detection Models
+
+## Content
+
+- [Introduction](#introduction)
+- [Model Recommendation](#model-recommendation)
+- [Model Zoo](#model-zoo)
+- [Getting Started](#getting-start)
+ - [Environmental Installation](#1environmental-installation)
+ - [Dataset Preparation](#2dataset-preparation)
+ - [Training and Testing](#3training-and-testing)
+ - [Training on single GPU](#training-on-single-gpu)
+ - [Training on multiple GPU](#training-on-multiple-gpu)
+ - [Evaluation](#evaluation)
+ - [Inference](#inference)
+ - [Deploy Inference](#deploy-inference)
+ - [Deployment for Top-Down models](#deployment-for-top-down-models)
+ - [Deployment for Bottom-Up models](#deployment-for-bottom-up-models)
+ - [Joint Inference with Multi-Object Tracking Model FairMOT](#joint-inference-with-multi-object-tracking-model-fairmot)
+ - [Complete Deploy Instruction and Demo](#complete-deploy-instruction-and-demo)
+- [Train with custom data](#train-with-custom-data)
+- [BenchMark](#benchmark)
+
+## Introduction
+
+The keypoint detection part in PaddleDetection closely follows state-of-the-art algorithms, including both Top-Down and Bottom-Up methods, to satisfy the different needs of users. Top-Down methods detect the object first and then estimate the keypoints within the detected box; they are more accurate, but become slower as the number of objects increases. Bottom-Up methods detect the keypoints first and then group or connect them to form several human pose instances; their speed is fixed and does not degrade as the number of objects increases, but they are less accurate.
+
+At the same time, PaddleDetection provides a self-developed real-time keypoint detection model [PP-TinyPose](./tiny_pose/README_en.md) optimized for mobile devices.
+
+
+

+
+
+## Model Recommendation
+
+### Recommended Mobile Models
+
+
+
+
+| Detection Model | Keypoint Model | Input Size | Accuracy of COCO | Average Inference Time (FP16) | Params (M) | Flops (G) | Model Weight | Paddle-Lite Inference Model (FP16) |
+| :--- | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
+| [PicoDet-S-Pedestrian](../picodet/legacy_model/application/pedestrian_detection/picodet_s_192_pedestrian.yml) | [PP-TinyPose](./tiny_pose/tinypose_128x96.yml) | Detection: 192x192<br>Keypoint: 128x96 | Detection mAP: 29.0<br>Keypoint AP: 58.1 | Detection: 2.37ms<br>Keypoint: 3.27ms | Detection: 1.18<br>Keypoint: 1.36 | Detection: 0.35<br>Keypoint: 0.08 | [Detection](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_192_pedestrian.pdparams)<br>[Keypoint](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_128x96.pdparams) | [Detection](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_192_pedestrian_fp16.nb)<br>[Keypoint](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_128x96_fp16.nb) |
+| [PicoDet-S-Pedestrian](../picodet/legacy_model/application/pedestrian_detection/picodet_s_320_pedestrian.yml) | [PP-TinyPose](./tiny_pose/tinypose_256x192.yml) | Detection: 320x320<br>Keypoint: 256x192 | Detection mAP: 38.5<br>Keypoint AP: 68.8 | Detection: 6.30ms<br>Keypoint: 8.33ms | Detection: 1.18<br>Keypoint: 1.36 | Detection: 0.97<br>Keypoint: 0.32 | [Detection](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_320_pedestrian.pdparams)<br>[Keypoint](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_256x192.pdparams) | [Detection](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_320_pedestrian_fp16.nb)<br>[Keypoint](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_256x192_fp16.nb) |
+
+
+*For detailed documentation of PP-TinyPose, please refer to the [Document](./tiny_pose/README.md).
+
+### Terminal Server
+
+
+| Detection Model | Keypoint Model | Input Size | Accuracy of COCO | Params (M) | Flops (G) | Model Weight |
+| :--- | :--- | :---: | :---: | :---: | :---: | :---: |
+| [PP-YOLOv2](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.3/configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml) | [HRNet-w32](./hrnet/hrnet_w32_384x288.yml) | Detection: 640x640<br>Keypoint: 384x288 | Detection mAP: 49.5<br>Keypoint AP: 77.8 | Detection: 54.6<br>Keypoint: 28.6 | Detection: 115.8<br>Keypoint: 17.3 | [Detection](https://paddledet.bj.bcebos.com/models/ppyolov2_r50vd_dcn_365e_coco.pdparams)<br>[Keypoint](https://paddledet.bj.bcebos.com/models/keypoint/hrnet_w32_384x288.pdparams) |
+| [PP-YOLOv2](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.3/configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml) | [HRNet-w32](./hrnet/hrnet_w32_256x192.yml) | Detection: 640x640<br>Keypoint: 256x192 | Detection mAP: 49.5<br>Keypoint AP: 76.9 | Detection: 54.6<br>Keypoint: 28.6 | Detection: 115.8<br>Keypoint: 7.68 | [Detection](https://paddledet.bj.bcebos.com/models/ppyolov2_r50vd_dcn_365e_coco.pdparams)<br>[Keypoint](https://paddledet.bj.bcebos.com/models/keypoint/hrnet_w32_256x192.pdparams) |
+
+
+## Model Zoo
+
+COCO Dataset
+| Model | Input Size | AP(coco val) | Model Download | Config File |
+| :---------------- | -------- | :----------: | :----------------------------------------------------------: | ----------------------------------------------------------- |
+| PETR_Res50 (One-Stage) | 512 | 65.5 | [petr_res50.pdparams](https://bj.bcebos.com/v1/paddledet/models/keypoint/petr_resnet50_16x2_coco.pdparams) | [config](./petr/petr_resnet50_16x2_coco.yml) |
+| HigherHRNet-w32 | 512 | 67.1 | [higherhrnet_hrnet_w32_512.pdparams](https://paddledet.bj.bcebos.com/models/keypoint/higherhrnet_hrnet_w32_512.pdparams) | [config](./higherhrnet/higherhrnet_hrnet_w32_512.yml) |
+| HigherHRNet-w32 | 640 | 68.3 | [higherhrnet_hrnet_w32_640.pdparams](https://paddledet.bj.bcebos.com/models/keypoint/higherhrnet_hrnet_w32_640.pdparams) | [config](./higherhrnet/higherhrnet_hrnet_w32_640.yml) |
+| HigherHRNet-w32+SWAHR | 512 | 68.9 | [higherhrnet_hrnet_w32_512_swahr.pdparams](https://paddledet.bj.bcebos.com/models/keypoint/higherhrnet_hrnet_w32_512_swahr.pdparams) | [config](./higherhrnet/higherhrnet_hrnet_w32_512_swahr.yml) |
+| HRNet-w32 | 256x192 | 76.9 | [hrnet_w32_256x192.pdparams](https://paddledet.bj.bcebos.com/models/keypoint/hrnet_w32_256x192.pdparams) | [config](./hrnet/hrnet_w32_256x192.yml) |
+| HRNet-w32 | 384x288 | 77.8 | [hrnet_w32_384x288.pdparams](https://paddledet.bj.bcebos.com/models/keypoint/hrnet_w32_384x288.pdparams) | [config](./hrnet/hrnet_w32_384x288.yml) |
+| HRNet-w32+DarkPose | 256x192 | 78.0 | [dark_hrnet_w32_256x192.pdparams](https://paddledet.bj.bcebos.com/models/keypoint/dark_hrnet_w32_256x192.pdparams) | [config](./hrnet/dark_hrnet_w32_256x192.yml) |
+| HRNet-w32+DarkPose | 384x288 | 78.3 | [dark_hrnet_w32_384x288.pdparams](https://paddledet.bj.bcebos.com/models/keypoint/dark_hrnet_w32_384x288.pdparams) | [config](./hrnet/dark_hrnet_w32_384x288.yml) |
+| WiderNaiveHRNet-18 | 256x192 | 67.6(+DARK 68.4) | [wider_naive_hrnet_18_256x192_coco.pdparams](https://bj.bcebos.com/v1/paddledet/models/keypoint/wider_naive_hrnet_18_256x192_coco.pdparams) | [config](./lite_hrnet/wider_naive_hrnet_18_256x192_coco.yml) |
+| LiteHRNet-18 | 256x192 | 66.5 | [lite_hrnet_18_256x192_coco.pdparams](https://bj.bcebos.com/v1/paddledet/models/keypoint/lite_hrnet_18_256x192_coco.pdparams) | [config](./lite_hrnet/lite_hrnet_18_256x192_coco.yml) |
+| LiteHRNet-18 | 384x288 | 69.7 | [lite_hrnet_18_384x288_coco.pdparams](https://bj.bcebos.com/v1/paddledet/models/keypoint/lite_hrnet_18_384x288_coco.pdparams) | [config](./lite_hrnet/lite_hrnet_18_384x288_coco.yml) |
+| LiteHRNet-30 | 256x192 | 69.4 | [lite_hrnet_30_256x192_coco.pdparams](https://bj.bcebos.com/v1/paddledet/models/keypoint/lite_hrnet_30_256x192_coco.pdparams) | [config](./lite_hrnet/lite_hrnet_30_256x192_coco.yml) |
+| LiteHRNet-30 | 384x288 | 72.5 | [lite_hrnet_30_384x288_coco.pdparams](https://bj.bcebos.com/v1/paddledet/models/keypoint/lite_hrnet_30_384x288_coco.pdparams) | [config](./lite_hrnet/lite_hrnet_30_384x288_coco.yml) |
+
+Note: The AP results of Top-Down models are based on the bounding boxes from the ground truth annotations.
+
+MPII Dataset
+| Model | Input Size | PCKh(Mean) | PCKh(Mean@0.1) | Model Download | Config File |
+| :---- | -------- | :--------: | :------------: | :----------------------------------------------------------: | -------------------------------------------- |
+| HRNet-w32 | 256x256 | 90.6 | 38.5 | [hrnet_w32_256x256_mpii.pdparams](https://paddledet.bj.bcebos.com/models/keypoint/hrnet_w32_256x256_mpii.pdparams) | [config](./hrnet/hrnet_w32_256x256_mpii.yml) |
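
PCKh counts a predicted joint as correct when its distance to the ground truth falls below a fraction of the head segment size (0.5 for PCKh, 0.1 for PCKh@0.1); a minimal sketch of the metric under that definition (`pckh` is a hypothetical helper, not part of the repo):

```python
import math

def pckh(preds, gts, head_sizes, alpha=0.5):
    # Fraction of joints whose prediction lands within alpha * head size
    # of the ground truth; preds/gts are per-sample lists of (x, y) joints.
    correct = total = 0
    for pred, gt, head in zip(preds, gts, head_sizes):
        for (px, py), (gx, gy) in zip(pred, gt):
            if math.hypot(px - gx, py - gy) <= alpha * head:
                correct += 1
            total += 1
    return correct / total
```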
+
+
+Models for Specific Scenarios
+| Model | Strategy | Input Size | Precision | Inference Speed | Model Weights | Model Inference and Deployment | Description |
+| :---- | :---: | :---: | :---: | :---: | :---: | :---: | :--- |
+| HRNet-w32 + DarkPose | Top-Down | 256x192 | AP: 87.1 (on internal dataset) | 2.9ms per person | [Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/dark_hrnet_w32_256x192.pdparams) | [Link](https://bj.bcebos.com/v1/paddledet/models/pipeline/dark_hrnet_w32_256x192.zip) | Specially optimized for fall scenarios; the model is applied in [PP-Human](../../deploy/pipeline/README.md) |
+
+
+We also release [PP-TinyPose](./tiny_pose/README_en.md), a real-time keypoint detection model optimized for mobile devices. You are welcome to try it out.
+
+## Getting Start
+
+### 1.Environmental Installation
+
+ Please refer to [PaddleDetection Installation Guide](../../docs/tutorials/INSTALL.md) to install PaddlePaddle and PaddleDetection correctly.
+
+### 2.Dataset Preparation
+
+ Currently, the keypoint detection models support the [COCO](https://cocodataset.org/#keypoints-2017) and [MPII](http://human-pose.mpi-inf.mpg.de/#overview) datasets. Please refer to [Keypoint Dataset Preparation](../../docs/tutorials/data/PrepareKeypointDataSet_en.md) to prepare the dataset.
+
+ For a description of the config files, please refer to the [Keypoint Config Guide](../../docs/tutorials/KeyPointConfigGuide_en.md).
+
+- Note: when evaluating Top-Down models with detected bounding boxes, you need a `bbox.json` produced by a detection model. You can directly download the detection results for COCO val2017 [(a detector with human AP of 56.4 on COCO val2017)](https://paddledet.bj.bcebos.com/data/bbox.json), put it at the repository root (`PaddleDetection/`), and set `use_gt_bbox: False` in the config file.
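If you switch between ground-truth and detected boxes often, the flag can also be flipped programmatically; a minimal sketch, where `set_use_gt_bbox` is a hypothetical helper (not part of PaddleDetection) that edits the raw config text:

```python
import re

def set_use_gt_bbox(config_text, value):
    # Rewrite the use_gt_bbox flag inside the raw YAML text of a config file.
    return re.sub(r"use_gt_bbox:\s*\w+", "use_gt_bbox: {}".format(value), config_text)
```

It can be applied by reading a config file, rewriting the text, and writing it back.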
+
+### 3.Training and Testing
+
+#### Training on single GPU
+
+```shell
+#COCO DataSet
+CUDA_VISIBLE_DEVICES=0 python3 tools/train.py -c configs/keypoint/higherhrnet/higherhrnet_hrnet_w32_512.yml
+
+#MPII DataSet
+CUDA_VISIBLE_DEVICES=0 python3 tools/train.py -c configs/keypoint/hrnet/hrnet_w32_256x256_mpii.yml
+```
+
+#### Training on multiple GPU
+
+```shell
+#COCO DataSet
+CUDA_VISIBLE_DEVICES=0,1,2,3 python3 -m paddle.distributed.launch tools/train.py -c configs/keypoint/higherhrnet/higherhrnet_hrnet_w32_512.yml
+
+#MPII DataSet
+CUDA_VISIBLE_DEVICES=0,1,2,3 python3 -m paddle.distributed.launch tools/train.py -c configs/keypoint/hrnet/hrnet_w32_256x256_mpii.yml
+```
+
+#### Evaluation
+
+```shell
+#COCO DataSet
+CUDA_VISIBLE_DEVICES=0 python3 tools/eval.py -c configs/keypoint/higherhrnet/higherhrnet_hrnet_w32_512.yml
+
+#MPII DataSet
+CUDA_VISIBLE_DEVICES=0 python3 tools/eval.py -c configs/keypoint/hrnet/hrnet_w32_256x256_mpii.yml
+
+#If you only need the prediction result, you can set --save_prediction_only. Then the result will be saved at output/keypoints_results.json by default.
+CUDA_VISIBLE_DEVICES=0 python3 tools/eval.py -c configs/keypoint/higherhrnet/higherhrnet_hrnet_w32_512.yml --save_prediction_only
+```
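The file written by `--save_prediction_only` follows the COCO keypoint result format: a JSON list where each entry is assumed to carry an `image_id`, a flat `keypoints` list of `[x, y, score]` triples, and an instance `score`. A small inspection sketch under that assumption (`summarize_keypoint_results` is a hypothetical helper, not part of PaddleDetection):

```python
import json

def summarize_keypoint_results(path):
    # Load a COCO-format keypoint result file and report basic statistics.
    with open(path) as f:
        results = json.load(f)
    images = {r["image_id"] for r in results}
    mean_score = sum(r["score"] for r in results) / max(len(results), 1)
    return {"instances": len(results), "images": len(images),
            "mean_score": mean_score}
```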
+
+#### Inference
+
+ Note: Top-Down models only support inference on a cropped image containing a single person. To run inference on images with multiple people, please use the joint deployment of detection and keypoint models described below in "Deploy Inference", or choose a Bottom-Up model.
+
+```shell
+CUDA_VISIBLE_DEVICES=0 python3 tools/infer.py -c configs/keypoint/higherhrnet/higherhrnet_hrnet_w32_512.yml -o weights=./output/higherhrnet_hrnet_w32_512/model_final.pdparams --infer_dir=../images/ --draw_threshold=0.5 --save_txt=True
+```
+
+#### Deploy Inference
+
+##### Deployment for Top-Down models
+
+```shell
+#Export Detection Model
+python tools/export_model.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyolov2_r50vd_dcn_365e_coco.pdparams
+
+
+#Export Keypoint Model
+python tools/export_model.py -c configs/keypoint/hrnet/hrnet_w32_256x192.yml -o weights=https://paddledet.bj.bcebos.com/models/keypoint/hrnet_w32_256x192.pdparams
+
+#Deployment for detector and keypoint, which is only for Top-Down models
+python deploy/python/det_keypoint_unite_infer.py --det_model_dir=output_inference/ppyolov2_r50vd_dcn_365e_coco/ --keypoint_model_dir=output_inference/hrnet_w32_256x192/ --video_file=../video/xxx.mp4 --device=gpu
+```
+
+##### Deployment for Bottom-Up models
+
+```shell
+#Export model
+python tools/export_model.py -c configs/keypoint/higherhrnet/higherhrnet_hrnet_w32_512.yml -o weights=output/higherhrnet_hrnet_w32_512/model_final.pdparams
+
+
+#Keypoint independent deployment, which is only for bottom-up models
+python deploy/python/keypoint_infer.py --model_dir=output_inference/higherhrnet_hrnet_w32_512/ --image_file=./demo/000000014439_640x640.jpg --device=gpu --threshold=0.5
+```
+
+##### Joint Inference with Multi-Object Tracking Model FairMOT
+
+```shell
+#export FairMOT model
+python tools/export_model.py -c configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608.pdparams
+
+#joint inference with Multi-Object Tracking model FairMOT
+python deploy/python/mot_keypoint_unite_infer.py --mot_model_dir=output_inference/fairmot_dla34_30e_1088x608/ --keypoint_model_dir=output_inference/higherhrnet_hrnet_w32_512/ --video_file={your video name}.mp4 --device=gpu
+```
+
+**Note:**
+ To export the MOT model, please refer to [Here](../../configs/mot/README_en.md).
+
+### Complete Deploy Instruction and Demo
+
+ We provide standalone deployment with Paddle Inference (server GPU), Paddle Lite (mobile, ARM), and third-party engines (MNN, OpenVINO), all independent of the training code. For details, please see the [deployment docs](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/deploy/README_en.md).
+
+## Train with custom data
+
+We use [tinypose_256x192](./tiny_pose/README_en.md) as an example to show how to train with custom data.
+
+#### 1. Configuration: [tinypose_256x192.yml](../../configs/keypoint/tiny_pose/tinypose_256x192.yml)
+
+You may need to modify the following fields for your task:
+
+```yaml
+num_joints: &num_joints 17    # the number of joints in your task
+train_height: &train_height 256    # the height of the model input
+train_width: &train_width 192    # the width of the model input
+hmsize: &hmsize [48, 64]    # the shape of the model output, usually 1/4 of [w, h]
+flip_perm: &flip_perm [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12], [13, 14], [15, 16]]    # the correspondence between left and right keypoint ids, used for the flip transform; if you don't need it, add the line "flip: False" after flip_pairs in RandomFlipHalfBodyTransform of TrainReader
+num_joints_half_body: 8    # the number of joints in the half body, used for the half_body transform
+prob_half_body: 0.3    # the probability of the half_body transform; set to 0 if you don't need it
+upper_body_ids: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]    # the joint ids of the upper (half) body, used to select the upper joints in the half_body transform
+```
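As a sanity check on these values: the heatmap size is the input size divided by the model's output stride (4 here), and flip_perm must list each left/right joint id exactly once. A small sketch of both checks (hypothetical helpers, not part of the repo):

```python
def heatmap_size(train_width, train_height, stride=4):
    # hmsize is the [w, h] of the output heatmap: the input size over the stride.
    return [train_width // stride, train_height // stride]

def check_flip_perm(flip_perm, num_joints):
    # Every id in a left/right pair must be a valid joint id, with no repeats.
    ids = [i for pair in flip_perm for i in pair]
    return len(ids) == len(set(ids)) and all(0 <= i < num_joints for i in ids)
```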
+
+For more configs, please refer to the [KeyPointConfigGuide](../../docs/tutorials/KeyPointConfigGuide_en.md).
+
+#### 2. Others (used for testing and visualization)
+- In keypoint_utils.py, set `sigmas = np.array([.26, .25, .25, .35, .35, .79, .79, .72, .72, .62, .62, 1.07, 1.07, .87, .87, .89, .89]) / 10.0`. Each value indicates the annotation variance of a joint location: 0.25-0.5 means the location is highly accurate (for example, eyes), while 0.5-1.0 means the location is less certain (for example, shoulders). 0.75 is recommended if you are not sure.
+- In visualizer.py, set `EDGES` in the `draw_pose` function, which defines the lines drawn between joints for visualization.
+- In the pycocotools package you installed, set `sigmas` to the same values as in keypoint_utils.py; they are used for COCO evaluation.
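These sigmas define the per-joint variances in the Object Keypoint Similarity (OKS) used by COCO evaluation. A minimal, stdlib-only sketch of the standard OKS formula, assuming all 17 keypoints are labeled and visible:

```python
import math

# Per-joint annotation variances from keypoint_utils.py (COCO joint order).
SIGMAS = [s / 10.0 for s in (.26, .25, .25, .35, .35, .79, .79, .72, .72,
                             .62, .62, 1.07, 1.07, .87, .87, .89, .89)]

def oks(pred, gt, area, sigmas=SIGMAS):
    # OKS between two lists of (x, y) keypoints for an instance of given area:
    # exp(-d^2 / (2 * area * (2*sigma)^2)), averaged over joints.
    total = 0.0
    for (px, py), (gx, gy), s in zip(pred, gt, sigmas):
        d2 = (px - gx) ** 2 + (py - gy) ** 2
        k2 = (2 * s) ** 2
        total += math.exp(-d2 / (2 * area * k2 + 1e-12))
    return total / len(sigmas)
```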
+
+#### 3. Notes on data preparation
+- The data should follow the same format as COCO data, with keypoints (Nx3) and bboxes (N) annotated.
+- Please set `area > 0` in the annotation files, otherwise the annotation will be skipped during training. Moreover, due to the COCO evaluation mechanism, samples with a small area may also be filtered out during evaluation. We recommend setting `area = bbox_w * bbox_h` when customizing your dataset.
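That recommendation can be applied in one pass over a COCO-style annotation dict; `fill_areas` below is a hypothetical helper (not part of PaddleDetection) that also drops zero-area boxes, per the note above:

```python
def fill_areas(coco_dict):
    # Set area = bbox_w * bbox_h for each annotation; drop zero-area boxes
    # so they are not silently skipped during training or evaluation.
    kept = []
    for ann in coco_dict["annotations"]:
        x, y, w, h = ann["bbox"]
        ann["area"] = w * h
        if ann["area"] > 0:
            kept.append(ann)
    coco_dict["annotations"] = kept
    return coco_dict
```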
+
+
+## Benchmark
+
+We provide benchmarks in different runtime environments for your reference when choosing models. See [Keypoint Inference Benchmark](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/configs/keypoint/KeypointBenchmark.md) for details.
+
+## Reference
+
+```
+@inproceedings{cheng2020bottom,
+ title={HigherHRNet: Scale-Aware Representation Learning for Bottom-Up Human Pose Estimation},
+ author={Bowen Cheng and Bin Xiao and Jingdong Wang and Honghui Shi and Thomas S. Huang and Lei Zhang},
+ booktitle={CVPR},
+ year={2020}
+}
+
+@inproceedings{SunXLW19,
+ title={Deep High-Resolution Representation Learning for Human Pose Estimation},
+ author={Ke Sun and Bin Xiao and Dong Liu and Jingdong Wang},
+ booktitle={CVPR},
+ year={2019}
+}
+
+@article{wang2019deep,
+ title={Deep High-Resolution Representation Learning for Visual Recognition},
+ author={Wang, Jingdong and Sun, Ke and Cheng, Tianheng and Jiang, Borui and Deng, Chaorui and Zhao, Yang and Liu, Dong and Mu, Yadong and Tan, Mingkui and Wang, Xinggang and Liu, Wenyu and Xiao, Bin},
+ journal={TPAMI},
+ year={2019}
+}
+
+@InProceedings{Zhang_2020_CVPR,
+ author = {Zhang, Feng and Zhu, Xiatian and Dai, Hanbin and Ye, Mao and Zhu, Ce},
+ title = {Distribution-Aware Coordinate Representation for Human Pose Estimation},
+ booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
+ month = {June},
+ year = {2020}
+}
+
+@inproceedings{Yulitehrnet21,
+ title={Lite-HRNet: A Lightweight High-Resolution Network},
+ author={Yu, Changqian and Xiao, Bin and Gao, Changxin and Yuan, Lu and Zhang, Lei and Sang, Nong and Wang, Jingdong},
+ booktitle={CVPR},
+ year={2021}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/keypoint/higherhrnet/higherhrnet_hrnet_w32_512.yml b/PaddleDetection-release-2.6/configs/keypoint/higherhrnet/higherhrnet_hrnet_w32_512.yml
new file mode 100644
index 0000000000000000000000000000000000000000..5dedfb32bb1bad8d130306db8ce55143055049c9
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/keypoint/higherhrnet/higherhrnet_hrnet_w32_512.yml
@@ -0,0 +1,139 @@
+use_gpu: true
+log_iter: 10
+save_dir: output
+snapshot_epoch: 10
+weights: output/higherhrnet_hrnet_w32_512/model_final
+epoch: 300
+num_joints: &num_joints 17
+flip_perm: &flip_perm [0, 2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11, 14, 13, 16, 15]
+input_size: &input_size 512
+hm_size: &hm_size 128
+hm_size_2x: &hm_size_2x 256
+max_people: &max_people 30
+metric: COCO
+IouType: keypoints
+num_classes: 1
+
+
+#####model
+architecture: HigherHRNet
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/Trunc_HRNet_W32_C_pretrained.pdparams
+
+HigherHRNet:
+ backbone: HRNet
+ hrhrnet_head: HrHRNetHead
+ post_process: HrHRNetPostProcess
+ flip_perm: *flip_perm
+ eval_flip: true
+
+HRNet:
+ width: &width 32
+ freeze_at: -1
+ freeze_norm: false
+ return_idx: [0]
+
+HrHRNetHead:
+ num_joints: *num_joints
+ width: *width
+ loss: HrHRNetLoss
+ swahr: false
+
+HrHRNetLoss:
+ num_joints: *num_joints
+ swahr: false
+
+
+#####optimizer
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !PiecewiseDecay
+ milestones: [200, 260]
+ gamma: 0.1
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 1000
+
+OptimizerBuilder:
+ optimizer:
+ type: Adam
+ regularizer: None
+
+#####data
+TrainDataset:
+ !KeypointBottomUpCocoDataset
+ image_dir: train2017
+ anno_path: annotations/person_keypoints_train2017.json
+ dataset_dir: dataset/coco
+ num_joints: *num_joints
+ return_bbox: False
+ return_area: False
+ return_class: False
+
+EvalDataset:
+ !KeypointBottomUpCocoDataset
+ image_dir: val2017
+ anno_path: annotations/person_keypoints_val2017.json
+ dataset_dir: dataset/coco
+ num_joints: *num_joints
+ test_mode: true
+ return_bbox: False
+ return_area: False
+ return_class: False
+
+TestDataset:
+ !ImageFolder
+ anno_path: dataset/coco/keypoint_imagelist.txt
+
+worker_num: 8
+global_mean: &global_mean [0.485, 0.456, 0.406]
+global_std: &global_std [0.229, 0.224, 0.225]
+TrainReader:
+ sample_transforms:
+ - RandomAffine:
+ max_degree: 30
+ scale: [0.75, 1.5]
+ max_shift: 0.2
+ trainsize: [*input_size, *input_size]
+ hmsize: [*hm_size, *hm_size_2x]
+ - KeyPointFlip:
+ flip_prob: 0.5
+ flip_permutation: *flip_perm
+ hmsize: [*hm_size, *hm_size_2x]
+ - ToHeatmaps:
+ num_joints: *num_joints
+ hmsize: [*hm_size, *hm_size_2x]
+ sigma: 2
+ - TagGenerate:
+ num_joints: *num_joints
+ max_people: *max_people
+ - NormalizePermute:
+ mean: *global_mean
+ std: *global_std
+ batch_size: 20
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+
+EvalReader:
+ sample_transforms:
+ - EvalAffine:
+ size: *input_size
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 1
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - EvalAffine:
+ size: *input_size
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/keypoint/higherhrnet/higherhrnet_hrnet_w32_512_swahr.yml b/PaddleDetection-release-2.6/configs/keypoint/higherhrnet/higherhrnet_hrnet_w32_512_swahr.yml
new file mode 100644
index 0000000000000000000000000000000000000000..7b0f7560a0c192dda869c2546c2c7c3dd7baa79d
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/keypoint/higherhrnet/higherhrnet_hrnet_w32_512_swahr.yml
@@ -0,0 +1,140 @@
+use_gpu: true
+log_iter: 10
+save_dir: output
+snapshot_epoch: 10
+weights: output/higherhrnet_hrnet_w32_512_swahr/model_final
+epoch: 300
+num_joints: &num_joints 17
+flip_perm: &flip_perm [0, 2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11, 14, 13, 16, 15]
+input_size: &input_size 512
+hm_size: &hm_size 128
+hm_size_2x: &hm_size_2x 256
+max_people: &max_people 30
+metric: COCO
+IouType: keypoints
+num_classes: 1
+
+
+#####model
+architecture: HigherHRNet
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/Trunc_HRNet_W32_C_pretrained.pdparams
+
+HigherHRNet:
+ backbone: HRNet
+ hrhrnet_head: HrHRNetHead
+ post_process: HrHRNetPostProcess
+ flip_perm: *flip_perm
+ eval_flip: true
+
+HRNet:
+ width: &width 32
+ freeze_at: -1
+ freeze_norm: false
+ return_idx: [0]
+
+HrHRNetHead:
+ num_joints: *num_joints
+ width: *width
+ loss: HrHRNetLoss
+ swahr: true
+
+HrHRNetLoss:
+ num_joints: *num_joints
+ swahr: true
+
+
+#####optimizer
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !PiecewiseDecay
+ milestones: [200, 260]
+ gamma: 0.1
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 1000
+
+OptimizerBuilder:
+ optimizer:
+ type: Adam
+ regularizer: None
+
+
+#####data
+TrainDataset:
+ !KeypointBottomUpCocoDataset
+ image_dir: train2017
+ anno_path: annotations/person_keypoints_train2017.json
+ dataset_dir: dataset/coco
+ num_joints: *num_joints
+ return_bbox: False
+ return_area: False
+ return_class: False
+
+EvalDataset:
+ !KeypointBottomUpCocoDataset
+ image_dir: val2017
+ anno_path: annotations/person_keypoints_val2017.json
+ dataset_dir: dataset/coco
+ num_joints: *num_joints
+ test_mode: true
+ return_bbox: False
+ return_area: False
+ return_class: False
+
+TestDataset:
+ !ImageFolder
+ anno_path: dataset/coco/keypoint_imagelist.txt
+
+worker_num: 8
+global_mean: &global_mean [0.485, 0.456, 0.406]
+global_std: &global_std [0.229, 0.224, 0.225]
+TrainReader:
+ sample_transforms:
+ - RandomAffine:
+ max_degree: 30
+ scale: [0.75, 1.5]
+ max_shift: 0.2
+ trainsize: [*input_size, *input_size]
+ hmsize: [*hm_size, *hm_size_2x]
+ - KeyPointFlip:
+ flip_prob: 0.5
+ flip_permutation: *flip_perm
+ hmsize: [*hm_size, *hm_size_2x]
+ - ToHeatmaps:
+ num_joints: *num_joints
+ hmsize: [*hm_size, *hm_size_2x]
+ sigma: 2
+ - TagGenerate:
+ num_joints: *num_joints
+ max_people: *max_people
+ - NormalizePermute:
+ mean: *global_mean
+ std: *global_std
+ batch_size: 16
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+
+EvalReader:
+ sample_transforms:
+ - EvalAffine:
+ size: *input_size
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 1
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - EvalAffine:
+ size: *input_size
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/keypoint/higherhrnet/higherhrnet_hrnet_w32_640.yml b/PaddleDetection-release-2.6/configs/keypoint/higherhrnet/higherhrnet_hrnet_w32_640.yml
new file mode 100644
index 0000000000000000000000000000000000000000..edd66e55d5295abc23466a355dad18714afa6e15
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/keypoint/higherhrnet/higherhrnet_hrnet_w32_640.yml
@@ -0,0 +1,139 @@
+use_gpu: true
+log_iter: 10
+save_dir: output
+snapshot_epoch: 10
+weights: output/higherhrnet_hrnet_w32_640/model_final
+epoch: 300
+num_joints: &num_joints 17
+flip_perm: &flip_perm [0, 2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11, 14, 13, 16, 15]
+input_size: &input_size 640
+hm_size: &hm_size 160
+hm_size_2x: &hm_size_2x 320
+max_people: &max_people 30
+metric: COCO
+IouType: keypoints
+num_classes: 1
+
+
+#####model
+architecture: HigherHRNet
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/Trunc_HRNet_W32_C_pretrained.pdparams
+
+HigherHRNet:
+ backbone: HRNet
+ hrhrnet_head: HrHRNetHead
+ post_process: HrHRNetPostProcess
+ flip_perm: *flip_perm
+ eval_flip: true
+
+HRNet:
+ width: &width 32
+ freeze_at: -1
+ freeze_norm: false
+ return_idx: [0]
+
+HrHRNetHead:
+ num_joints: *num_joints
+ width: *width
+ loss: HrHRNetLoss
+ swahr: false
+
+HrHRNetLoss:
+ num_joints: *num_joints
+ swahr: false
+
+
+#####optimizer
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !PiecewiseDecay
+ milestones: [200, 260]
+ gamma: 0.1
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 1000
+
+OptimizerBuilder:
+ optimizer:
+ type: Adam
+ regularizer: None
+
+#####data
+TrainDataset:
+ !KeypointBottomUpCocoDataset
+ image_dir: train2017
+ anno_path: annotations/person_keypoints_train2017.json
+ dataset_dir: dataset/coco
+ num_joints: *num_joints
+ return_bbox: False
+ return_area: False
+ return_class: False
+
+EvalDataset:
+ !KeypointBottomUpCocoDataset
+ image_dir: val2017
+ anno_path: annotations/person_keypoints_val2017.json
+ dataset_dir: dataset/coco
+ num_joints: *num_joints
+ test_mode: true
+ return_bbox: False
+ return_area: False
+ return_class: False
+
+TestDataset:
+ !ImageFolder
+ anno_path: dataset/coco/keypoint_imagelist.txt
+
+worker_num: 8
+global_mean: &global_mean [0.485, 0.456, 0.406]
+global_std: &global_std [0.229, 0.224, 0.225]
+TrainReader:
+ sample_transforms:
+ - RandomAffine:
+ max_degree: 30
+ scale: [0.75, 1.5]
+ max_shift: 0.2
+ trainsize: [*input_size, *input_size]
+ hmsize: [*hm_size, *hm_size_2x]
+ - KeyPointFlip:
+ flip_prob: 0.5
+ flip_permutation: *flip_perm
+ hmsize: [*hm_size, *hm_size_2x]
+ - ToHeatmaps:
+ num_joints: *num_joints
+ hmsize: [*hm_size, *hm_size_2x]
+ sigma: 2
+ - TagGenerate:
+ num_joints: *num_joints
+ max_people: *max_people
+ - NormalizePermute:
+ mean: *global_mean
+ std: *global_std
+ batch_size: 20
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+
+EvalReader:
+ sample_transforms:
+ - EvalAffine:
+ size: *input_size
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 1
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - EvalAffine:
+ size: *input_size
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/keypoint/hrnet/dark_hrnet_w32_256x192.yml b/PaddleDetection-release-2.6/configs/keypoint/hrnet/dark_hrnet_w32_256x192.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a759c121a1e891e510f802cfbf53962c98a368be
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/keypoint/hrnet/dark_hrnet_w32_256x192.yml
@@ -0,0 +1,141 @@
+use_gpu: true
+log_iter: 5
+save_dir: output
+snapshot_epoch: 10
+weights: output/hrnet_w32_256x192/model_final
+epoch: 210
+num_joints: &num_joints 17
+pixel_std: &pixel_std 200
+metric: KeyPointTopDownCOCOEval
+num_classes: 1
+train_height: &train_height 256
+train_width: &train_width 192
+trainsize: &trainsize [*train_width, *train_height]
+hmsize: &hmsize [48, 64]
+flip_perm: &flip_perm [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12], [13, 14], [15, 16]]
+
+
+#####model
+architecture: TopDownHRNet
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/Trunc_HRNet_W32_C_pretrained.pdparams
+
+TopDownHRNet:
+ backbone: HRNet
+ post_process: HRNetPostProcess
+ flip_perm: *flip_perm
+ num_joints: *num_joints
+ width: &width 32
+ loss: KeyPointMSELoss
+
+HRNet:
+ width: *width
+ freeze_at: -1
+ freeze_norm: false
+ return_idx: [0]
+
+KeyPointMSELoss:
+ use_target_weight: true
+
+
+#####optimizer
+LearningRate:
+ base_lr: 0.0005
+ schedulers:
+ - !PiecewiseDecay
+ milestones: [170, 200]
+ gamma: 0.1
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 1000
+
+OptimizerBuilder:
+ optimizer:
+ type: Adam
+ regularizer:
+ factor: 0.0
+ type: L2
+
+
+#####data
+TrainDataset:
+ !KeypointTopDownCocoDataset
+ image_dir: train2017
+ anno_path: annotations/person_keypoints_train2017.json
+ dataset_dir: dataset/coco
+ num_joints: *num_joints
+ trainsize: *trainsize
+ pixel_std: *pixel_std
+ use_gt_bbox: True
+
+
+EvalDataset:
+ !KeypointTopDownCocoDataset
+ image_dir: val2017
+ anno_path: annotations/person_keypoints_val2017.json
+ dataset_dir: dataset/coco
+ bbox_file: bbox.json
+ num_joints: *num_joints
+ trainsize: *trainsize
+ pixel_std: *pixel_std
+ use_gt_bbox: True
+ image_thre: 0.0
+
+
+TestDataset:
+ !ImageFolder
+ anno_path: dataset/coco/keypoint_imagelist.txt
+
+worker_num: 2
+global_mean: &global_mean [0.485, 0.456, 0.406]
+global_std: &global_std [0.229, 0.224, 0.225]
+TrainReader:
+ sample_transforms:
+ - RandomFlipHalfBodyTransform:
+ scale: 0.5
+ rot: 40
+ num_joints_half_body: 8
+ prob_half_body: 0.3
+ pixel_std: *pixel_std
+ trainsize: *trainsize
+ upper_body_ids: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
+ flip_pairs: *flip_perm
+ - TopDownAffine:
+ trainsize: *trainsize
+ - ToHeatmapsTopDown_DARK:
+ hmsize: *hmsize
+ sigma: 2
+ batch_transforms:
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 64
+ shuffle: true
+ drop_last: false
+
+EvalReader:
+ sample_transforms:
+ - TopDownAffine:
+ trainsize: *trainsize
+ batch_transforms:
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 16
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *train_height, *train_width]
+ sample_transforms:
+ - Decode: {}
+ - TopDownEvalAffine:
+ trainsize: *trainsize
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/keypoint/hrnet/dark_hrnet_w32_384x288.yml b/PaddleDetection-release-2.6/configs/keypoint/hrnet/dark_hrnet_w32_384x288.yml
new file mode 100644
index 0000000000000000000000000000000000000000..6eaa0ec0ba17e25cd3787e082963c3c863388eeb
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/keypoint/hrnet/dark_hrnet_w32_384x288.yml
@@ -0,0 +1,145 @@
+use_gpu: true
+log_iter: 5
+save_dir: output
+snapshot_epoch: 10
+weights: output/hrnet_w32_384x288/model_final
+epoch: 210
+num_joints: &num_joints 17
+pixel_std: &pixel_std 200
+metric: KeyPointTopDownCOCOEval
+num_classes: 1
+train_height: &train_height 384
+train_width: &train_width 288
+trainsize: &trainsize [*train_width, *train_height]
+hmsize: &hmsize [72, 96]
+flip_perm: &flip_perm [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12], [13, 14], [15, 16]]
+
+
+#####model
+architecture: TopDownHRNet
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/Trunc_HRNet_W32_C_pretrained.pdparams
+
+TopDownHRNet:
+ backbone: HRNet
+ post_process: HRNetPostProcess
+ flip_perm: *flip_perm
+ num_joints: *num_joints
+ width: &width 32
+ loss: KeyPointMSELoss
+ flip: true
+
+HRNet:
+ width: *width
+ freeze_at: -1
+ freeze_norm: false
+ return_idx: [0]
+
+KeyPointMSELoss:
+ use_target_weight: true
+
+
+#####optimizer
+LearningRate:
+ base_lr: 0.0005
+ schedulers:
+ - !PiecewiseDecay
+ milestones: [170, 200]
+ gamma: 0.1
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 1000
+
+OptimizerBuilder:
+ optimizer:
+ type: Adam
+ regularizer:
+ factor: 0.0
+ type: L2
+
+
+#####data
+TrainDataset:
+ !KeypointTopDownCocoDataset
+ image_dir: train2017
+ anno_path: annotations/person_keypoints_train2017.json
+ dataset_dir: dataset/coco
+ num_joints: *num_joints
+ trainsize: *trainsize
+ pixel_std: *pixel_std
+ use_gt_bbox: True
+
+
+EvalDataset:
+ !KeypointTopDownCocoDataset
+ image_dir: val2017
+ anno_path: annotations/person_keypoints_val2017.json
+ dataset_dir: dataset/coco
+ bbox_file: bbox.json
+ num_joints: *num_joints
+ trainsize: *trainsize
+ pixel_std: *pixel_std
+ use_gt_bbox: True
+ image_thre: 0.0
+
+
+TestDataset:
+ !ImageFolder
+ anno_path: dataset/coco/keypoint_imagelist.txt
+
+worker_num: 2
+global_mean: &global_mean [0.485, 0.456, 0.406]
+global_std: &global_std [0.229, 0.224, 0.225]
+TrainReader:
+ sample_transforms:
+ - RandomFlipHalfBodyTransform:
+ scale: 0.5
+ rot: 40
+ num_joints_half_body: 8
+ prob_half_body: 0.3
+ pixel_std: *pixel_std
+ trainsize: *trainsize
+ upper_body_ids: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
+ flip_pairs: *flip_perm
+ - TopDownAffine:
+ trainsize: *trainsize
+ - ToHeatmapsTopDown_DARK:
+ hmsize: *hmsize
+ sigma: 2
+ batch_transforms:
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 32
+ shuffle: true
+ drop_last: false
+
+EvalReader:
+ sample_transforms:
+ - TopDownAffine:
+ trainsize: *trainsize
+ - ToHeatmapsTopDown_DARK:
+ hmsize: *hmsize
+ sigma: 2
+ batch_transforms:
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 16
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *train_height, *train_width]
+ sample_transforms:
+ - Decode: {}
+ - TopDownEvalAffine:
+ trainsize: *trainsize
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/keypoint/hrnet/dark_hrnet_w48_256x192.yml b/PaddleDetection-release-2.6/configs/keypoint/hrnet/dark_hrnet_w48_256x192.yml
new file mode 100644
index 0000000000000000000000000000000000000000..1417e03d22e96aea50833eab7d2a522f192ebfee
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/keypoint/hrnet/dark_hrnet_w48_256x192.yml
@@ -0,0 +1,141 @@
+use_gpu: true
+log_iter: 5
+save_dir: output
+snapshot_epoch: 10
+weights: output/hrnet_w48_256x192/model_final
+epoch: 210
+num_joints: &num_joints 17
+pixel_std: &pixel_std 200
+metric: KeyPointTopDownCOCOEval
+num_classes: 1
+train_height: &train_height 256
+train_width: &train_width 192
+trainsize: &trainsize [*train_width, *train_height]
+hmsize: &hmsize [48, 64]
+flip_perm: &flip_perm [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12], [13, 14], [15, 16]]
+
+
+#####model
+architecture: TopDownHRNet
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/Trunc_HRNet_W48_C_pretrained.pdparams
+
+TopDownHRNet:
+ backbone: HRNet
+ post_process: HRNetPostProcess
+ flip_perm: *flip_perm
+ num_joints: *num_joints
+ width: &width 48
+ loss: KeyPointMSELoss
+
+HRNet:
+ width: *width
+ freeze_at: -1
+ freeze_norm: false
+ return_idx: [0]
+
+KeyPointMSELoss:
+ use_target_weight: true
+
+
+#####optimizer
+LearningRate:
+ base_lr: 0.0005
+ schedulers:
+ - !PiecewiseDecay
+ milestones: [170, 200]
+ gamma: 0.1
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 1000
+
+OptimizerBuilder:
+ optimizer:
+ type: Adam
+ regularizer:
+ factor: 0.0
+ type: L2
+
+
+#####data
+TrainDataset:
+ !KeypointTopDownCocoDataset
+ image_dir: train2017
+ anno_path: annotations/person_keypoints_train2017.json
+ dataset_dir: dataset/coco
+ num_joints: *num_joints
+ trainsize: *trainsize
+ pixel_std: *pixel_std
+ use_gt_bbox: True
+
+
+EvalDataset:
+ !KeypointTopDownCocoDataset
+ image_dir: val2017
+ anno_path: annotations/person_keypoints_val2017.json
+ dataset_dir: dataset/coco
+ bbox_file: bbox.json
+ num_joints: *num_joints
+ trainsize: *trainsize
+ pixel_std: *pixel_std
+ use_gt_bbox: True
+ image_thre: 0.0
+
+
+TestDataset:
+ !ImageFolder
+ anno_path: dataset/coco/keypoint_imagelist.txt
+
+worker_num: 2
+global_mean: &global_mean [0.485, 0.456, 0.406]
+global_std: &global_std [0.229, 0.224, 0.225]
+TrainReader:
+ sample_transforms:
+ - RandomFlipHalfBodyTransform:
+ scale: 0.5
+ rot: 40
+ num_joints_half_body: 8
+ prob_half_body: 0.3
+ pixel_std: *pixel_std
+ trainsize: *trainsize
+ upper_body_ids: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
+ flip_pairs: *flip_perm
+ - TopDownAffine:
+ trainsize: *trainsize
+ - ToHeatmapsTopDown_DARK:
+ hmsize: *hmsize
+ sigma: 2
+ batch_transforms:
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 64
+ shuffle: true
+ drop_last: false
+
+EvalReader:
+ sample_transforms:
+ - TopDownAffine:
+ trainsize: *trainsize
+ batch_transforms:
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 16
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *train_height, *train_width]
+ sample_transforms:
+ - Decode: {}
+ - TopDownEvalAffine:
+ trainsize: *trainsize
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/keypoint/hrnet/hrnet_w32_256x192.yml b/PaddleDetection-release-2.6/configs/keypoint/hrnet/hrnet_w32_256x192.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d80d97264082aa1de2af906042a656252f0cd356
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/keypoint/hrnet/hrnet_w32_256x192.yml
@@ -0,0 +1,142 @@
+use_gpu: true
+log_iter: 5
+save_dir: output
+snapshot_epoch: 10
+weights: output/hrnet_w32_256x192/model_final
+epoch: 210
+num_joints: &num_joints 17
+pixel_std: &pixel_std 200
+metric: KeyPointTopDownCOCOEval
+num_classes: 1
+train_height: &train_height 256
+train_width: &train_width 192
+trainsize: &trainsize [*train_width, *train_height]
+hmsize: &hmsize [48, 64]
+flip_perm: &flip_perm [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12], [13, 14], [15, 16]]
+
+
+#####model
+architecture: TopDownHRNet
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/Trunc_HRNet_W32_C_pretrained.pdparams
+
+TopDownHRNet:
+ backbone: HRNet
+ post_process: HRNetPostProcess
+ flip_perm: *flip_perm
+ num_joints: *num_joints
+ width: &width 32
+ loss: KeyPointMSELoss
+
+HRNet:
+ width: *width
+ freeze_at: -1
+ freeze_norm: false
+ return_idx: [0]
+
+KeyPointMSELoss:
+ use_target_weight: true
+
+
+#####optimizer
+LearningRate:
+ base_lr: 0.0005
+ schedulers:
+ - !PiecewiseDecay
+ milestones: [170, 200]
+ gamma: 0.1
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 1000
+
+OptimizerBuilder:
+ optimizer:
+ type: Adam
+ regularizer:
+ factor: 0.0
+ type: L2
+
+
+#####data
+TrainDataset:
+ !KeypointTopDownCocoDataset
+ image_dir: train2017
+ anno_path: annotations/person_keypoints_train2017.json
+ dataset_dir: dataset/coco
+ num_joints: *num_joints
+ trainsize: *trainsize
+ pixel_std: *pixel_std
+ use_gt_bbox: True
+
+
+EvalDataset:
+ !KeypointTopDownCocoDataset
+ image_dir: val2017
+ anno_path: annotations/person_keypoints_val2017.json
+ dataset_dir: dataset/coco
+ bbox_file: bbox.json
+ num_joints: *num_joints
+ trainsize: *trainsize
+ pixel_std: *pixel_std
+ use_gt_bbox: True
+ image_thre: 0.0
+
+
+TestDataset:
+ !ImageFolder
+ anno_path: dataset/coco/keypoint_imagelist.txt
+
+worker_num: 2
+global_mean: &global_mean [0.485, 0.456, 0.406]
+global_std: &global_std [0.229, 0.224, 0.225]
+TrainReader:
+ sample_transforms:
+ - RandomFlipHalfBodyTransform:
+ scale: 0.5
+ rot: 40
+ num_joints_half_body: 8
+ prob_half_body: 0.3
+ pixel_std: *pixel_std
+ trainsize: *trainsize
+ upper_body_ids: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
+ flip_pairs: *flip_perm
+ - TopDownAffine:
+ trainsize: *trainsize
+ - ToHeatmapsTopDown:
+ hmsize: *hmsize
+ sigma: 2
+ batch_transforms:
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 64
+ shuffle: true
+ drop_last: false
+
+EvalReader:
+ sample_transforms:
+ - TopDownAffine:
+ trainsize: *trainsize
+ batch_transforms:
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 16
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *train_height, *train_width]
+ sample_transforms:
+ - Decode: {}
+ - TopDownEvalAffine:
+ trainsize: *trainsize
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 1
+  fuse_normalize: false  # whether to fuse the normalize layer into the model when exporting it
diff --git a/PaddleDetection-release-2.6/configs/keypoint/hrnet/hrnet_w32_256x256_mpii.yml b/PaddleDetection-release-2.6/configs/keypoint/hrnet/hrnet_w32_256x256_mpii.yml
new file mode 100644
index 0000000000000000000000000000000000000000..09e860989dea37e479a5a3217994dd948c405627
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/keypoint/hrnet/hrnet_w32_256x256_mpii.yml
@@ -0,0 +1,132 @@
+use_gpu: true
+log_iter: 5
+save_dir: output
+snapshot_epoch: 10
+weights: output/hrnet_w32_256x256_mpii/model_final
+epoch: 210
+num_joints: &num_joints 16
+pixel_std: &pixel_std 200
+metric: KeyPointTopDownMPIIEval
+num_classes: 1
+train_height: &train_height 256
+train_width: &train_width 256
+trainsize: &trainsize [*train_width, *train_height]
+hmsize: &hmsize [64, 64]
+flip_perm: &flip_perm [[0, 5], [1, 4], [2, 3], [10, 15], [11, 14], [12, 13]]
+
+#####model
+architecture: TopDownHRNet
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/Trunc_HRNet_W32_C_pretrained.pdparams
+
+TopDownHRNet:
+ backbone: HRNet
+ post_process: HRNetPostProcess
+ flip_perm: *flip_perm
+ num_joints: *num_joints
+ width: &width 32
+ loss: KeyPointMSELoss
+
+HRNet:
+ width: *width
+ freeze_at: -1
+ freeze_norm: false
+ return_idx: [0]
+
+KeyPointMSELoss:
+ use_target_weight: true
+
+
+#####optimizer
+LearningRate:
+ base_lr: 0.0005
+ schedulers:
+ - !PiecewiseDecay
+ milestones: [170, 200]
+ gamma: 0.1
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 1000
+
+OptimizerBuilder:
+ optimizer:
+ type: Adam
+ regularizer:
+ factor: 0.0
+ type: L2
+
+
+#####data
+TrainDataset:
+ !KeypointTopDownMPIIDataset
+ image_dir: images
+ anno_path: annotations/mpii_train.json
+ dataset_dir: dataset/mpii
+ num_joints: *num_joints
+
+
+EvalDataset:
+ !KeypointTopDownMPIIDataset
+ image_dir: images
+ anno_path: annotations/mpii_val.json
+ dataset_dir: dataset/mpii
+ num_joints: *num_joints
+
+
+TestDataset:
+ !ImageFolder
+ anno_path: dataset/coco/keypoint_imagelist.txt
+
+worker_num: 4
+global_mean: &global_mean [0.485, 0.456, 0.406]
+global_std: &global_std [0.229, 0.224, 0.225]
+TrainReader:
+ sample_transforms:
+ - RandomFlipHalfBodyTransform:
+ scale: 0.5
+ rot: 40
+ num_joints_half_body: 8
+ prob_half_body: 0.3
+ pixel_std: *pixel_std
+ trainsize: *trainsize
+ upper_body_ids: [7, 8, 9, 10, 11, 12, 13, 14, 15]
+ flip_pairs: *flip_perm
+ - TopDownAffine:
+ trainsize: *trainsize
+ - ToHeatmapsTopDown:
+ hmsize: *hmsize
+ sigma: 2
+ batch_transforms:
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 64
+ shuffle: true
+ drop_last: false
+
+EvalReader:
+ sample_transforms:
+ - TopDownAffine:
+ trainsize: *trainsize
+ batch_transforms:
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 16
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *train_height, *train_width]
+ sample_transforms:
+ - Decode: {}
+ - TopDownEvalAffine:
+ trainsize: *trainsize
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/keypoint/hrnet/hrnet_w32_384x288.yml b/PaddleDetection-release-2.6/configs/keypoint/hrnet/hrnet_w32_384x288.yml
new file mode 100644
index 0000000000000000000000000000000000000000..15425059a0f4723d5a13b36aec4962a4dc586d4d
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/keypoint/hrnet/hrnet_w32_384x288.yml
@@ -0,0 +1,142 @@
+use_gpu: true
+log_iter: 5
+save_dir: output
+snapshot_epoch: 10
+weights: output/hrnet_w32_384x288/model_final
+epoch: 210
+num_joints: &num_joints 17
+pixel_std: &pixel_std 200
+metric: KeyPointTopDownCOCOEval
+num_classes: 1
+train_height: &train_height 384
+train_width: &train_width 288
+trainsize: &trainsize [*train_width, *train_height]
+hmsize: &hmsize [72, 96]
+flip_perm: &flip_perm [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12], [13, 14], [15, 16]]
+
+
+#####model
+architecture: TopDownHRNet
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/Trunc_HRNet_W32_C_pretrained.pdparams
+
+TopDownHRNet:
+ backbone: HRNet
+ post_process: HRNetPostProcess
+ flip_perm: *flip_perm
+ num_joints: *num_joints
+ width: &width 32
+ loss: KeyPointMSELoss
+ flip: true
+
+HRNet:
+ width: *width
+ freeze_at: -1
+ freeze_norm: false
+ return_idx: [0]
+
+KeyPointMSELoss:
+ use_target_weight: true
+
+
+#####optimizer
+LearningRate:
+ base_lr: 0.0005
+ schedulers:
+ - !PiecewiseDecay
+ milestones: [170, 200]
+ gamma: 0.1
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 1000
+
+OptimizerBuilder:
+ optimizer:
+ type: Adam
+ regularizer:
+ factor: 0.0
+ type: L2
+
+
+#####data
+TrainDataset:
+ !KeypointTopDownCocoDataset
+ image_dir: train2017
+ anno_path: annotations/person_keypoints_train2017.json
+ dataset_dir: dataset/coco
+ num_joints: *num_joints
+ trainsize: *trainsize
+ pixel_std: *pixel_std
+ use_gt_bbox: True
+
+
+EvalDataset:
+ !KeypointTopDownCocoDataset
+ image_dir: val2017
+ anno_path: annotations/person_keypoints_val2017.json
+ dataset_dir: dataset/coco
+ bbox_file: bbox.json
+ num_joints: *num_joints
+ trainsize: *trainsize
+ pixel_std: *pixel_std
+ use_gt_bbox: True
+ image_thre: 0.0
+
+
+TestDataset:
+ !ImageFolder
+ anno_path: dataset/coco/keypoint_imagelist.txt
+
+worker_num: 2
+global_mean: &global_mean [0.485, 0.456, 0.406]
+global_std: &global_std [0.229, 0.224, 0.225]
+TrainReader:
+ sample_transforms:
+ - RandomFlipHalfBodyTransform:
+ scale: 0.5
+ rot: 40
+ num_joints_half_body: 8
+ prob_half_body: 0.3
+ pixel_std: *pixel_std
+ trainsize: *trainsize
+ upper_body_ids: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
+ flip_pairs: *flip_perm
+ - TopDownAffine:
+ trainsize: *trainsize
+ - ToHeatmapsTopDown:
+ hmsize: *hmsize
+ sigma: 2
+ batch_transforms:
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 64
+ shuffle: true
+ drop_last: false
+
+EvalReader:
+ sample_transforms:
+ - TopDownAffine:
+ trainsize: *trainsize
+ batch_transforms:
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 16
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *train_height, *train_width]
+ sample_transforms:
+ - Decode: {}
+ - TopDownEvalAffine:
+ trainsize: *trainsize
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/keypoint/lite_hrnet/lite_hrnet_18_256x192_coco.yml b/PaddleDetection-release-2.6/configs/keypoint/lite_hrnet/lite_hrnet_18_256x192_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..2664082465aa666ff7be789c008cd0873bde4021
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/keypoint/lite_hrnet/lite_hrnet_18_256x192_coco.yml
@@ -0,0 +1,140 @@
+use_gpu: true
+log_iter: 5
+save_dir: output
+snapshot_epoch: 10
+weights: output/lite_hrnet_18_256x192_coco/model_final
+epoch: 210
+num_joints: &num_joints 17
+pixel_std: &pixel_std 200
+metric: KeyPointTopDownCOCOEval
+num_classes: 1
+train_height: &train_height 256
+train_width: &train_width 192
+trainsize: &trainsize [*train_width, *train_height]
+hmsize: &hmsize [48, 64]
+flip_perm: &flip_perm [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12], [13, 14], [15, 16]]
+
+
+#####model
+architecture: TopDownHRNet
+
+TopDownHRNet:
+ backbone: LiteHRNet
+ post_process: HRNetPostProcess
+ flip_perm: *flip_perm
+ num_joints: *num_joints
+ width: &width 40
+ loss: KeyPointMSELoss
+ use_dark: false
+
+LiteHRNet:
+ network_type: lite_18
+ freeze_at: -1
+ freeze_norm: false
+ return_idx: [0]
+
+KeyPointMSELoss:
+ use_target_weight: true
+ loss_scale: 1.0
+
+#####optimizer
+LearningRate:
+ base_lr: 0.002
+ schedulers:
+ - !PiecewiseDecay
+ milestones: [170, 200]
+ gamma: 0.1
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 500
+
+OptimizerBuilder:
+ optimizer:
+ type: Adam
+ regularizer:
+ factor: 0.0
+ type: L2
+
+
+#####data
+TrainDataset:
+ !KeypointTopDownCocoDataset
+ image_dir: train2017
+ anno_path: annotations/person_keypoints_train2017.json
+ dataset_dir: dataset/coco
+ num_joints: *num_joints
+ trainsize: *trainsize
+ pixel_std: *pixel_std
+ use_gt_bbox: True
+
+
+EvalDataset:
+ !KeypointTopDownCocoDataset
+ image_dir: val2017
+ anno_path: annotations/person_keypoints_val2017.json
+ dataset_dir: dataset/coco
+ num_joints: *num_joints
+ trainsize: *trainsize
+ pixel_std: *pixel_std
+ use_gt_bbox: True
+ image_thre: 0.0
+
+
+TestDataset:
+ !ImageFolder
+ anno_path: dataset/coco/keypoint_imagelist.txt
+
+worker_num: 2
+global_mean: &global_mean [0.485, 0.456, 0.406]
+global_std: &global_std [0.229, 0.224, 0.225]
+TrainReader:
+ sample_transforms:
+ - RandomFlipHalfBodyTransform:
+ scale: 0.25
+ rot: 30
+ num_joints_half_body: 8
+ prob_half_body: 0.3
+ pixel_std: *pixel_std
+ trainsize: *trainsize
+ upper_body_ids: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
+ flip_pairs: *flip_perm
+ - TopDownAffine:
+ trainsize: *trainsize
+ - ToHeatmapsTopDown:
+ hmsize: *hmsize
+ sigma: 2
+ batch_transforms:
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 64
+ shuffle: true
+ drop_last: false
+
+EvalReader:
+ sample_transforms:
+ - TopDownAffine:
+ trainsize: *trainsize
+ batch_transforms:
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 16
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *train_height, *train_width]
+ sample_transforms:
+ - Decode: {}
+ - TopDownEvalAffine:
+ trainsize: *trainsize
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/keypoint/lite_hrnet/lite_hrnet_18_384x288_coco.yml b/PaddleDetection-release-2.6/configs/keypoint/lite_hrnet/lite_hrnet_18_384x288_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..f2bddfcca023d4a25a58e36565dfa25b84778957
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/keypoint/lite_hrnet/lite_hrnet_18_384x288_coco.yml
@@ -0,0 +1,140 @@
+use_gpu: true
+log_iter: 5
+save_dir: output
+snapshot_epoch: 10
+weights: output/lite_hrnet_18_384x288_coco/model_final
+epoch: 210
+num_joints: &num_joints 17
+pixel_std: &pixel_std 200
+metric: KeyPointTopDownCOCOEval
+num_classes: 1
+train_height: &train_height 384
+train_width: &train_width 288
+trainsize: &trainsize [*train_width, *train_height]
+hmsize: &hmsize [72, 96]
+flip_perm: &flip_perm [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12], [13, 14], [15, 16]]
+
+
+#####model
+architecture: TopDownHRNet
+
+TopDownHRNet:
+ backbone: LiteHRNet
+ post_process: HRNetPostProcess
+ flip_perm: *flip_perm
+ num_joints: *num_joints
+ width: &width 40
+ loss: KeyPointMSELoss
+ use_dark: false
+
+LiteHRNet:
+ network_type: lite_18
+ freeze_at: -1
+ freeze_norm: false
+ return_idx: [0]
+
+KeyPointMSELoss:
+ use_target_weight: true
+ loss_scale: 1.0
+
+#####optimizer
+LearningRate:
+ base_lr: 0.002
+ schedulers:
+ - !PiecewiseDecay
+ milestones: [170, 200]
+ gamma: 0.1
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 500
+
+OptimizerBuilder:
+ optimizer:
+ type: Adam
+ regularizer:
+ factor: 0.0
+ type: L2
+
+
+#####data
+TrainDataset:
+ !KeypointTopDownCocoDataset
+ image_dir: train2017
+ anno_path: annotations/person_keypoints_train2017.json
+ dataset_dir: dataset/coco
+ num_joints: *num_joints
+ trainsize: *trainsize
+ pixel_std: *pixel_std
+ use_gt_bbox: True
+
+
+EvalDataset:
+ !KeypointTopDownCocoDataset
+ image_dir: val2017
+ anno_path: annotations/person_keypoints_val2017.json
+ dataset_dir: dataset/coco
+ num_joints: *num_joints
+ trainsize: *trainsize
+ pixel_std: *pixel_std
+ use_gt_bbox: True
+ image_thre: 0.0
+
+
+TestDataset:
+ !ImageFolder
+ anno_path: dataset/coco/keypoint_imagelist.txt
+
+worker_num: 2
+global_mean: &global_mean [0.485, 0.456, 0.406]
+global_std: &global_std [0.229, 0.224, 0.225]
+TrainReader:
+ sample_transforms:
+ - RandomFlipHalfBodyTransform:
+ scale: 0.25
+ rot: 30
+ num_joints_half_body: 8
+ prob_half_body: 0.3
+ pixel_std: *pixel_std
+ trainsize: *trainsize
+ upper_body_ids: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
+ flip_pairs: *flip_perm
+ - TopDownAffine:
+ trainsize: *trainsize
+ - ToHeatmapsTopDown:
+ hmsize: *hmsize
+ sigma: 3
+ batch_transforms:
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 32
+ shuffle: true
+ drop_last: false
+
+EvalReader:
+ sample_transforms:
+ - TopDownAffine:
+ trainsize: *trainsize
+ batch_transforms:
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 16
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *train_height, *train_width]
+ sample_transforms:
+ - Decode: {}
+ - TopDownEvalAffine:
+ trainsize: *trainsize
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/keypoint/lite_hrnet/lite_hrnet_30_256x192_coco.yml b/PaddleDetection-release-2.6/configs/keypoint/lite_hrnet/lite_hrnet_30_256x192_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..118ba360448e59491c2df3b0bc0a23f29e205827
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/keypoint/lite_hrnet/lite_hrnet_30_256x192_coco.yml
@@ -0,0 +1,140 @@
+use_gpu: true
+log_iter: 5
+save_dir: output
+snapshot_epoch: 10
+weights: output/lite_hrnet_30_256x192_coco/model_final
+epoch: 210
+num_joints: &num_joints 17
+pixel_std: &pixel_std 200
+metric: KeyPointTopDownCOCOEval
+num_classes: 1
+train_height: &train_height 256
+train_width: &train_width 192
+trainsize: &trainsize [*train_width, *train_height]
+hmsize: &hmsize [48, 64]
+flip_perm: &flip_perm [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12], [13, 14], [15, 16]]
+
+
+#####model
+architecture: TopDownHRNet
+
+TopDownHRNet:
+ backbone: LiteHRNet
+ post_process: HRNetPostProcess
+ flip_perm: *flip_perm
+ num_joints: *num_joints
+ width: &width 40
+ loss: KeyPointMSELoss
+ use_dark: false
+
+LiteHRNet:
+ network_type: lite_30
+ freeze_at: -1
+ freeze_norm: false
+ return_idx: [0]
+
+KeyPointMSELoss:
+ use_target_weight: true
+ loss_scale: 1.0
+
+#####optimizer
+LearningRate:
+ base_lr: 0.002
+ schedulers:
+ - !PiecewiseDecay
+ milestones: [170, 200]
+ gamma: 0.1
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 500
+
+OptimizerBuilder:
+ optimizer:
+ type: Adam
+ regularizer:
+ factor: 0.0
+ type: L2
+
+
+#####data
+TrainDataset:
+ !KeypointTopDownCocoDataset
+ image_dir: train2017
+ anno_path: annotations/person_keypoints_train2017.json
+ dataset_dir: dataset/coco
+ num_joints: *num_joints
+ trainsize: *trainsize
+ pixel_std: *pixel_std
+ use_gt_bbox: True
+
+
+EvalDataset:
+ !KeypointTopDownCocoDataset
+ image_dir: val2017
+ anno_path: annotations/person_keypoints_val2017.json
+ dataset_dir: dataset/coco
+ num_joints: *num_joints
+ trainsize: *trainsize
+ pixel_std: *pixel_std
+ use_gt_bbox: True
+ image_thre: 0.0
+
+
+TestDataset:
+ !ImageFolder
+ anno_path: dataset/coco/keypoint_imagelist.txt
+
+worker_num: 4
+global_mean: &global_mean [0.485, 0.456, 0.406]
+global_std: &global_std [0.229, 0.224, 0.225]
+TrainReader:
+ sample_transforms:
+ - RandomFlipHalfBodyTransform:
+ scale: 0.25
+ rot: 30
+ num_joints_half_body: 8
+ prob_half_body: 0.3
+ pixel_std: *pixel_std
+ trainsize: *trainsize
+ upper_body_ids: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
+ flip_pairs: *flip_perm
+ - TopDownAffine:
+ trainsize: *trainsize
+ - ToHeatmapsTopDown:
+ hmsize: *hmsize
+ sigma: 2
+ batch_transforms:
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 64
+ shuffle: true
+ drop_last: false
+
+EvalReader:
+ sample_transforms:
+ - TopDownAffine:
+ trainsize: *trainsize
+ batch_transforms:
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 16
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *train_height, *train_width]
+ sample_transforms:
+ - Decode: {}
+ - TopDownEvalAffine:
+ trainsize: *trainsize
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/keypoint/lite_hrnet/lite_hrnet_30_384x288_coco.yml b/PaddleDetection-release-2.6/configs/keypoint/lite_hrnet/lite_hrnet_30_384x288_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..97f3aa8e3e671a05828bc20416cfb17bd538c8fc
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/keypoint/lite_hrnet/lite_hrnet_30_384x288_coco.yml
@@ -0,0 +1,140 @@
+use_gpu: true
+log_iter: 5
+save_dir: output
+snapshot_epoch: 10
+weights: output/lite_hrnet_30_384x288_coco/model_final
+epoch: 210
+num_joints: &num_joints 17
+pixel_std: &pixel_std 200
+metric: KeyPointTopDownCOCOEval
+num_classes: 1
+train_height: &train_height 384
+train_width: &train_width 288
+trainsize: &trainsize [*train_width, *train_height]
+hmsize: &hmsize [72, 96]
+flip_perm: &flip_perm [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12], [13, 14], [15, 16]]
+
+
+#####model
+architecture: TopDownHRNet
+
+TopDownHRNet:
+ backbone: LiteHRNet
+ post_process: HRNetPostProcess
+ flip_perm: *flip_perm
+ num_joints: *num_joints
+ width: &width 40
+ loss: KeyPointMSELoss
+ use_dark: false
+
+LiteHRNet:
+ network_type: lite_30
+ freeze_at: -1
+ freeze_norm: false
+ return_idx: [0]
+
+KeyPointMSELoss:
+ use_target_weight: true
+ loss_scale: 1.0
+
+#####optimizer
+LearningRate:
+ base_lr: 0.002
+ schedulers:
+ - !PiecewiseDecay
+ milestones: [170, 200]
+ gamma: 0.1
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 500
+
+OptimizerBuilder:
+ optimizer:
+ type: Adam
+ regularizer:
+ factor: 0.0
+ type: L2
+
+
+#####data
+TrainDataset:
+ !KeypointTopDownCocoDataset
+ image_dir: train2017
+ anno_path: annotations/person_keypoints_train2017.json
+ dataset_dir: dataset/coco
+ num_joints: *num_joints
+ trainsize: *trainsize
+ pixel_std: *pixel_std
+ use_gt_bbox: True
+
+
+EvalDataset:
+ !KeypointTopDownCocoDataset
+ image_dir: val2017
+ anno_path: annotations/person_keypoints_val2017.json
+ dataset_dir: dataset/coco
+ num_joints: *num_joints
+ trainsize: *trainsize
+ pixel_std: *pixel_std
+ use_gt_bbox: True
+ image_thre: 0.0
+
+
+TestDataset:
+ !ImageFolder
+ anno_path: dataset/coco/keypoint_imagelist.txt
+
+worker_num: 2
+global_mean: &global_mean [0.485, 0.456, 0.406]
+global_std: &global_std [0.229, 0.224, 0.225]
+TrainReader:
+ sample_transforms:
+ - RandomFlipHalfBodyTransform:
+ scale: 0.25
+ rot: 30
+ num_joints_half_body: 8
+ prob_half_body: 0.3
+ pixel_std: *pixel_std
+ trainsize: *trainsize
+ upper_body_ids: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
+ flip_pairs: *flip_perm
+ - TopDownAffine:
+ trainsize: *trainsize
+ - ToHeatmapsTopDown:
+ hmsize: *hmsize
+ sigma: 3
+ batch_transforms:
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 32
+ shuffle: true
+ drop_last: false
+
+EvalReader:
+ sample_transforms:
+ - TopDownAffine:
+ trainsize: *trainsize
+ batch_transforms:
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 16
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *train_height, *train_width]
+ sample_transforms:
+ - Decode: {}
+ - TopDownEvalAffine:
+ trainsize: *trainsize
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/keypoint/lite_hrnet/wider_naive_hrnet_18_256x192_coco.yml b/PaddleDetection-release-2.6/configs/keypoint/lite_hrnet/wider_naive_hrnet_18_256x192_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a80d08c1fc98e49b8197c1fea1144cdf1efe34ac
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/keypoint/lite_hrnet/wider_naive_hrnet_18_256x192_coco.yml
@@ -0,0 +1,140 @@
+use_gpu: true
+log_iter: 5
+save_dir: output
+snapshot_epoch: 10
+weights: output/wider_naive_hrnet_18_256x192_coco/model_final
+epoch: 210
+num_joints: &num_joints 17
+pixel_std: &pixel_std 200
+metric: KeyPointTopDownCOCOEval
+num_classes: 1
+train_height: &train_height 256
+train_width: &train_width 192
+trainsize: &trainsize [*train_width, *train_height]
+hmsize: &hmsize [48, 64]
+flip_perm: &flip_perm [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12], [13, 14], [15, 16]]
+
+
+#####model
+architecture: TopDownHRNet
+
+TopDownHRNet:
+ backbone: LiteHRNet
+ post_process: HRNetPostProcess
+ flip_perm: *flip_perm
+ num_joints: *num_joints
+ width: &width 40
+ loss: KeyPointMSELoss
+ use_dark: false
+
+LiteHRNet:
+ network_type: wider_naive
+ freeze_at: -1
+ freeze_norm: false
+ return_idx: [0]
+
+KeyPointMSELoss:
+ use_target_weight: true
+ loss_scale: 1.0
+
+#####optimizer
+LearningRate:
+ base_lr: 0.002
+ schedulers:
+ - !PiecewiseDecay
+ milestones: [170, 200]
+ gamma: 0.1
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 500
+
+OptimizerBuilder:
+ optimizer:
+ type: Adam
+ regularizer:
+ factor: 0.0
+ type: L2
+
+
+#####data
+TrainDataset:
+ !KeypointTopDownCocoDataset
+ image_dir: train2017
+ anno_path: annotations/person_keypoints_train2017.json
+ dataset_dir: dataset/coco
+ num_joints: *num_joints
+ trainsize: *trainsize
+ pixel_std: *pixel_std
+ use_gt_bbox: True
+
+
+EvalDataset:
+ !KeypointTopDownCocoDataset
+ image_dir: val2017
+ anno_path: annotations/person_keypoints_val2017.json
+ dataset_dir: dataset/coco
+ num_joints: *num_joints
+ trainsize: *trainsize
+ pixel_std: *pixel_std
+ use_gt_bbox: True
+ image_thre: 0.0
+
+
+TestDataset:
+ !ImageFolder
+ anno_path: dataset/coco/keypoint_imagelist.txt
+
+worker_num: 2
+global_mean: &global_mean [0.485, 0.456, 0.406]
+global_std: &global_std [0.229, 0.224, 0.225]
+TrainReader:
+ sample_transforms:
+ - RandomFlipHalfBodyTransform:
+ scale: 0.25
+ rot: 30
+ num_joints_half_body: 8
+ prob_half_body: 0.3
+ pixel_std: *pixel_std
+ trainsize: *trainsize
+ upper_body_ids: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
+ flip_pairs: *flip_perm
+ - TopDownAffine:
+ trainsize: *trainsize
+ - ToHeatmapsTopDown:
+ hmsize: *hmsize
+ sigma: 2
+ batch_transforms:
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 64
+ shuffle: true
+ drop_last: false
+
+EvalReader:
+ sample_transforms:
+ - TopDownAffine:
+ trainsize: *trainsize
+ batch_transforms:
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 16
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *train_height, *train_width]
+ sample_transforms:
+ - Decode: {}
+ - TopDownEvalAffine:
+ trainsize: *trainsize
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/keypoint/petr/petr_resnet50_16x2_coco.yml b/PaddleDetection-release-2.6/configs/keypoint/petr/petr_resnet50_16x2_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a97eff63ab19882e827f780fba024afafa49abca
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/keypoint/petr/petr_resnet50_16x2_coco.yml
@@ -0,0 +1,254 @@
+use_gpu: true
+log_iter: 50
+save_dir: output
+snapshot_epoch: 1
+weights: output/petr_resnet50_16x2_coco/model_final
+epoch: 100
+num_joints: &num_joints 17
+pixel_std: &pixel_std 200
+metric: COCO
+num_classes: 1
+trainsize: &trainsize 512
+flip_perm: &flip_perm [0, 2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11, 14, 13, 16, 15]
+find_unused_parameters: False
+
+#####model
+architecture: PETR
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/PETR_pretrained.pdparams
+
+PETR:
+ backbone:
+ name: ResNet
+ depth: 50
+ variant: b
+ norm_type: bn
+ freeze_norm: True
+ freeze_at: 0
+ return_idx: [1,2,3]
+ num_stages: 4
+ lr_mult_list: [0.1, 0.1, 0.1, 0.1]
+ neck:
+ name: ChannelMapper
+ in_channels: [512, 1024, 2048]
+ kernel_size: 1
+ out_channels: 256
+ norm_type: "gn"
+ norm_groups: 32
+ act: None
+ num_outs: 4
+ bbox_head:
+ name: PETRHead
+ num_query: 300
+ num_classes: 1 # only person
+ in_channels: 2048
+ sync_cls_avg_factor: true
+ with_kpt_refine: true
+ transformer:
+ name: PETRTransformer
+ as_two_stage: true
+ encoder:
+ name: TransformerEncoder
+ encoder_layer:
+ name: TransformerEncoderLayer
+ d_model: 256
+ attn:
+ name: MSDeformableAttention
+ embed_dim: 256
+ num_heads: 8
+ num_levels: 4
+ num_points: 4
+ dim_feedforward: 1024
+ dropout: 0.1
+ num_layers: 6
+ decoder:
+ name: PETR_TransformerDecoder
+ num_layers: 3
+ return_intermediate: true
+ decoder_layer:
+ name: PETR_TransformerDecoderLayer
+ d_model: 256
+ dim_feedforward: 1024
+ dropout: 0.1
+ self_attn:
+ name: MultiHeadAttention
+ embed_dim: 256
+ num_heads: 8
+ dropout: 0.1
+ cross_attn:
+ name: MultiScaleDeformablePoseAttention
+ embed_dims: 256
+ num_heads: 8
+ num_levels: 4
+ num_points: 17
+ hm_encoder:
+ name: TransformerEncoder
+ encoder_layer:
+ name: TransformerEncoderLayer
+ d_model: 256
+ attn:
+ name: MSDeformableAttention
+ embed_dim: 256
+ num_heads: 8
+ num_levels: 1
+ num_points: 4
+ dim_feedforward: 1024
+ dropout: 0.1
+ num_layers: 1
+ refine_decoder:
+ name: PETR_DeformableDetrTransformerDecoder
+ num_layers: 2
+ return_intermediate: true
+ decoder_layer:
+ name: PETR_TransformerDecoderLayer
+ d_model: 256
+ dim_feedforward: 1024
+ dropout: 0.1
+ self_attn:
+ name: MultiHeadAttention
+ embed_dim: 256
+ num_heads: 8
+ dropout: 0.1
+ cross_attn:
+ name: MSDeformableAttention
+ embed_dim: 256
+ num_levels: 4
+ positional_encoding:
+ name: PositionEmbedding
+ num_pos_feats: 128
+ normalize: true
+ offset: -0.5
+ loss_cls:
+ name: Weighted_FocalLoss
+ use_sigmoid: true
+ gamma: 2.0
+ alpha: 0.25
+ loss_weight: 2.0
+ reduction: "mean"
+ loss_kpt:
+ name: L1Loss
+ loss_weight: 70.0
+ loss_kpt_rpn:
+ name: L1Loss
+ loss_weight: 70.0
+ loss_oks:
+ name: OKSLoss
+ loss_weight: 2.0
+ loss_hm:
+ name: CenterFocalLoss
+ loss_weight: 4.0
+ loss_kpt_refine:
+ name: L1Loss
+ loss_weight: 80.0
+ loss_oks_refine:
+ name: OKSLoss
+ loss_weight: 3.0
+ assigner:
+ name: PoseHungarianAssigner
+ cls_cost:
+ name: FocalLossCost
+ weight: 2.0
+ kpt_cost:
+ name: KptL1Cost
+ weight: 70.0
+ oks_cost:
+ name: OksCost
+ weight: 7.0
+
+#####optimizer
+LearningRate:
+ base_lr: 0.0002
+ schedulers:
+ - !PiecewiseDecay
+ milestones: [80]
+ gamma: 0.1
+ use_warmup: false
+ # - !LinearWarmup
+ # start_factor: 0.001
+ # steps: 1000
+
+OptimizerBuilder:
+ clip_grad_by_norm: 0.1
+ optimizer:
+ type: AdamW
+ regularizer:
+ factor: 0.0001
+ type: L2
+
+
+#####data
+TrainDataset:
+ !KeypointBottomUpCocoDataset
+ image_dir: train2017
+ anno_path: annotations/person_keypoints_train2017.json
+ dataset_dir: dataset/coco
+ num_joints: *num_joints
+ return_mask: false
+
+EvalDataset:
+ !KeypointBottomUpCocoDataset
+ image_dir: val2017
+ anno_path: annotations/person_keypoints_val2017.json
+ dataset_dir: dataset/coco
+ num_joints: *num_joints
+ test_mode: true
+ return_mask: false
+
+TestDataset:
+ !ImageFolder
+ anno_path: dataset/coco/keypoint_imagelist.txt
+
+worker_num: 2
+global_mean: &global_mean [0.485, 0.456, 0.406]
+global_std: &global_std [0.229, 0.224, 0.225]
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - PhotoMetricDistortion:
+ brightness_delta: 32
+ contrast_range: [0.5, 1.5]
+ saturation_range: [0.5, 1.5]
+ hue_delta: 18
+ - KeyPointFlip:
+ flip_prob: 0.5
+ flip_permutation: *flip_perm
+ - RandomAffine:
+ max_degree: 30
+ scale: [1.0, 1.0]
+ max_shift: 0.
+ trainsize: -1
+ - RandomSelect: { transforms1: [ RandomShortSideRangeResize: { scales: [[400, 1400], [1400, 1400]]} ],
+ transforms2: [
+ RandomShortSideResize: { short_side_sizes: [ 400, 500, 600 ] },
+ RandomSizeCrop: { min_size: 384, max_size: 600},
+ RandomShortSideRangeResize: { scales: [[400, 1400], [1400, 1400]]} ]}
+ batch_transforms:
+ - NormalizeImage: {mean: *global_mean, std: *global_std, is_scale: True}
+ - PadGT: {pad_img: True, minimum_gtnum: 1}
+ - Permute: {}
+ batch_size: 2
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - PETR_Resize: {img_scale: [[800, 1333]], keep_ratio: True}
+ # - MultiscaleTestResize: {origin_target_size: [[800, 1333]], use_flip: false}
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 1
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - EvalAffine: {size: 800}
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/keypoint/tiny_pose/README.md b/PaddleDetection-release-2.6/configs/keypoint/tiny_pose/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..d9181c507ecdf4c0c025eed7d776fff0db2e756a
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/keypoint/tiny_pose/README.md
@@ -0,0 +1,281 @@
+Simplified Chinese | [English](README_en.md)
+
+# PP-TinyPose
+
+
+
+
+
+Image source: the COCO2017 open-source dataset
+
+
+## Recent Updates
+- **2022.8.01: Released the upgraded PP-TinyPose. End-to-end AP on business datasets for scenarios such as fitness and dance improved by 9.1**
+  - Added real-world sports-scene data; recognition of complex movements is significantly improved, covering unconventional poses such as side-facing, lying down, jumping, and high leg raises
+  - The detection model is upgraded to the [enhanced PP-PicoDet](../../../configs/picodet/README.md), with a 3.1% accuracy gain on the COCO dataset
+  - Enhanced keypoint stability. Added filter-based smoothing, making video prediction results more stable and smooth
+
+ 
+
+## Introduction
+PP-TinyPose is a real-time keypoint detection model optimized by PaddleDetection for mobile devices, which can smoothly run multi-person pose estimation tasks on mobile devices. With PaddleDetection's excellent self-developed lightweight detection model [PicoDet](../../picodet/README.md), we also provide a specialized lightweight pedestrian detection model. PP-TinyPose has the following runtime dependency requirements:
+- [PaddlePaddle](https://github.com/PaddlePaddle/Paddle)>=2.2
+
+If you want to deploy it on mobile devices, you also need:
+- [Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite)>=2.11
+
+
+
+
+
+
+
+## Deployment Case
+
+- [Android Fitness Demo](https://github.com/zhiboniu/pose_demo_android) based on PP-TinyPose, which efficiently implements fitness calibration and counting.
+
+
+
+
+
+
+- Welcome to scan the QR code for a quick experience.
+
+
+
+
+
+
+## Model Zoo
+
+### Pipeline Performance
+| Single-Person Model Config | AP (Business Dataset) | AP (COCO Val Single-Person) | Time for Single Person (FP32) | Time for Single Person (FP16) |
+| :---------------------------------- | :------: | :------: | :---: | :---: |
+| PicoDet-S-Lcnet-Pedestrian-192\*192 + PP-TinyPose-128\*96 | 77.1 (+9.1) | 52.3 (+0.5) | 12.90 ms| 9.61 ms |
+
+| Multi-Person Model Config | AP (Business Dataset) | AP (COCO Val Multi-Person) | Time for Six Persons (FP32) | Time for Six Persons (FP16)|
+| :------------------------ | :-------: | :-------: | :---: | :---: |
+| PicoDet-S-Lcnet-Pedestrian-320\*320 + PP-TinyPose-128\*96 | 78.0 (+7.7) | 50.1 (-0.2) | 47.63 ms| 34.62 ms |
+
+**Tips**
+- The AP results of the keypoint detection models are based on bounding boxes detected by the corresponding pedestrian detection model.
+- In accuracy evaluation, the flip operation is removed, and the detection confidence threshold is set to 0.5.
+- Speed tests were performed on a Qualcomm Snapdragon 865 with 4 threads at arm8.
+- Pipeline time includes model preprocessing, inference, and postprocessing.
+- The accuracy deltas are relative to the corresponding model combinations in previous versions; see **Previous Versions - Pipeline Performance** for details.
+- For fairness, in the multi-person test we removed images with more than 6 people.
+
+### Keypoint Detection Model
+| Model | Input Size | AP (Business Dataset) | AP (COCO Val) | Params | FLOPS | Inference Time for Single Person (FP32) | Inference Time for Single Person (FP16) | Config | Model Weights | Deployment Model | Paddle-Lite Model (FP32) | Paddle-Lite Model (FP16) |
+| :---------- | :------: | :-----------: | :-----------: | :-----------: | :-----------: | :-----------------: | :-----------------: | :------------------------------: | :----------------------------------------------------------: | :----------------------------------------------------------: | :----------------------------------------------------------: | :----------------------------------------------------------: |
+| PP-TinyPose | 128*96 | 84.3 | 58.4 | 1.32 M | 81.56 M | 4.57ms | 3.27ms | [Config](./tinypose_128x96.yml) | [Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_enhance/tinypose_128x96.pdparams) | [Deployment Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_enhance/tinypose_128x96.zip) | [Lite Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_enhance/tinypose_128x96_fp32.nb) | [Lite Model (FP16)](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_enhance/tinypose_128x96_fp16.nb) |
+| PP-TinyPose | 256*192 | 91.0 | 68.3 | 1.32 M | 326.24M |14.07ms | 8.33ms | [Config](./tinypose_256x192.yml) | [Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_enhance/tinypose_256x192.pdparams) | [Deployment Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_enhance/tinypose_256x192.zip) | [Lite Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_enhance/tinypose_256x192_fp32.nb) | [Lite Model (FP16)](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_enhance/tinypose_256x192_fp16.nb) |
+
+
+### Pedestrian Detection Model
+| Model | Input Size | mAP (COCO Val-Person) | Params | FLOPS | Average Inference Time (FP32) | Average Inference Time (FP16) | Config | Model Weights | Deployment Model | Paddle-Lite Model (FP32) | Paddle-Lite Model (FP16) |
+| :------------------- | :------: | :------------: | :------------: | :------------: | :-----------------: | :-----------------: | :----------------------------------------------------------: | :----------------------------------------------------------: | :----------------------------------------------------------: | :----------------------------------------------------------: | :----------------------------------------------------------: |
+| PicoDet-S-Lcnet-Pedestrian | 192*192 | 31.7 | 1.16 M | 170.03 M | 5.24ms | 3.66ms | [Config](../../picodet/application/pedestrian_detection/picodet_s_192_lcnet_pedestrian.yml) | [Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_enhance/picodet_s_192_lcnet_pedestrian.pdparams) | [Deployment Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_enhance/picodet_s_192_lcnet_pedestrian.zip) | [Lite Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_enhance/picodet_s_192_lcnet_pedestrian_fp32.nb) | [Lite Model (FP16)](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_enhance/picodet_s_192_lcnet_pedestrian_fp16.nb) |
+| PicoDet-S-Lcnet-Pedestrian | 320*320 | 41.6 | 1.16 M | 472.07 M | 13.87ms | 8.94ms | [Config](../../picodet/application/pedestrian_detection/picodet_s_320_lcnet_pedestrian.yml) | [Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_enhance/picodet_s_320_lcnet_pedestrian.pdparams) | [Deployment Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_enhance/picodet_s_320_lcnet_pedestrian.zip) | [Lite Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_enhance/picodet_s_320_lcnet_pedestrian_fp32.nb) | [Lite Model (FP16)](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_enhance/picodet_s_320_lcnet_pedestrian_fp16.nb) |
+
+**Tips**
+- Both the keypoint detection model and the pedestrian detection model are trained on `COCO train2017`, the `AI Challenger trainset`, and a collected multi-pose scene dataset. The keypoint detection model is evaluated on the multi-pose scene dataset, and the pedestrian detection model is evaluated on `COCO instances val2017`.
+- The AP results of the keypoint detection models are based on ground-truth bounding boxes.
+- Both models are trained in a 4-GPU environment. If the number of GPUs or the batch size needs to be changed for your training environment, refer to the [FAQ](../../../docs/tutorials/FAQ/README.md) to adjust the learning rate accordingly.
+- The inference time is tested on a Qualcomm Snapdragon 865, with 4 threads at arm8.
+
+## Previous Versions
+
+
+2021 Version
+
+
+### Pipeline Performance
+| Single-Person Model Config | AP (COCO Val Single-Person) | Time for Single Person (FP32) | Time for Single Person (FP16) |
+| :------------------------ | :------: | :---: | :---: |
+| PicoDet-S-Pedestrian-192\*192 + PP-TinyPose-128\*96 | 51.8 | 11.72 ms| 8.18 ms |
+| Other open-source model-192\*192 | 22.3 | 12.0 ms| - |
+
+| Multi-Person Model Config | AP (COCO Val Multi-Person) | Time for Six Persons (FP32) | Time for Six Persons (FP16)|
+| :------------------------ | :-------: | :---: | :---: |
+| PicoDet-S-Pedestrian-320\*320 + PP-TinyPose-128\*96 | 50.3 | 44.0 ms| 32.57 ms |
+| Other open-source model-256\*256 | 39.4 | 51.0 ms| - |
+
+**Tips**
+- The AP results of the keypoint detection models are based on bounding boxes detected by the corresponding pedestrian detection model.
+- In accuracy evaluation, the flip operation is removed, and the detection confidence threshold is set to 0.5.
+- For fairness, in the multi-person test we removed images with more than 6 people.
+- Speed tests were performed on a Qualcomm Snapdragon 865 with 4 threads at arm8, FP32.
+- Pipeline time includes model preprocessing, inference, and postprocessing.
+- For the testing and deployment of other open-source models, please refer to [Here](https://github.com/zhiboniu/MoveNet-PaddleLite).
+- For performance results in more environments, please refer to [Keypoint Inference Benchmark](../KeypointBenchmark.md).
+
+
+### Keypoint Detection Model
+| Model | Input Size | AP (COCO Val) | Inference Time for Single Person (FP32) | Inference Time for Single Person (FP16) | Config | Model Weights | Deployment Model | Paddle-Lite Model (FP32) | Paddle-Lite Model (FP16) |
+| :---------- | :------: | :-----------: | :-----------------: | :-----------------: | :------------------------------: | :----------------------------------------------------------: | :----------------------------------------------------------: | :----------------------------------------------------------: | :----------------------------------------------------------: |
+| PP-TinyPose | 128*96 | 58.1 | 4.57ms | 3.27ms | [Config](./tinypose_128x96.yml) | [Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_128x96.pdparams) | [Deployment Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_128x96.tar) | [Lite Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_128x96_lite.tar) | [Lite Model (FP16)](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_128x96_fp16_lite.tar) |
+| PP-TinyPose | 256*192 | 68.8 | 14.07ms | 8.33ms | [Config](./tinypose_256x192.yml) | [Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_256x192.pdparams) | [Deployment Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_256x192.tar) | [Lite Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_256x192_lite.tar) | [Lite Model (FP16)](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_256x192_fp16_lite.tar) |
+
+### Pedestrian Detection Model
+| Model | Input Size | mAP (COCO Val-Person) | Average Inference Time (FP32) | Average Inference Time (FP16) | Config | Model Weights | Deployment Model | Paddle-Lite Model (FP32) | Paddle-Lite Model (FP16) |
+| :------------------- | :------: | :------------: | :-----------------: | :-----------------: | :----------------------------------------------------------: | :----------------------------------------------------------: | :----------------------------------------------------------: | :----------------------------------------------------------: | :----------------------------------------------------------: |
+| PicoDet-S-Pedestrian | 192*192 | 29.0 | 4.30ms | 2.37ms | [Config](../../picodet/legacy_model/application/pedestrian_detection/picodet_s_192_pedestrian.yml) | [Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_192_pedestrian.pdparams) | [Deployment Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_192_pedestrian.tar) | [Lite Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_192_pedestrian_lite.tar) | [Lite Model (FP16)](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_192_pedestrian_fp16_lite.tar) |
+| PicoDet-S-Pedestrian | 320*320 | 38.5 | 10.26ms | 6.30ms | [Config](../../picodet/legacy_model/application/pedestrian_detection/picodet_s_320_pedestrian.yml) | [Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_320_pedestrian.pdparams) | [Deployment Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_320_pedestrian.tar) | [Lite Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_320_pedestrian_lite.tar) | [Lite Model (FP16)](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_320_pedestrian_fp16_lite.tar) |
+
+
+**Tips**
+- Both the keypoint detection model and the pedestrian detection model are trained on `COCO train2017` and the `AI Challenger trainset`. The keypoint detection model is evaluated on `COCO person keypoints val2017`, and the pedestrian detection model is evaluated on `COCO instances val2017`.
+- The AP results of the keypoint detection models are based on ground-truth bounding boxes.
+- Both models are trained in a 4-GPU environment. If the number of GPUs or the batch size needs to be changed for your training environment, refer to the [FAQ](../../../docs/tutorials/FAQ/README.md) to adjust the learning rate accordingly.
+- The inference time is tested on a Qualcomm Snapdragon 865, with 4 threads at arm8.
+
+
+
+
+## Model Training
+In addition to `COCO`, the training set for the keypoint detection model and the pedestrian detection model also includes the [AI Challenger](https://arxiv.org/abs/1711.06475) dataset. Keypoints of each dataset are defined as follows:
+```
+COCO keypoint Description:
+ 0: "Nose",
+ 1: "Left Eye",
+ 2: "Right Eye",
+ 3: "Left Ear",
+ 4: "Right Ear",
+    5: "Left Shoulder",
+ 6: "Right Shoulder",
+ 7: "Left Elbow",
+ 8: "Right Elbow",
+ 9: "Left Wrist",
+ 10: "Right Wrist",
+ 11: "Left Hip",
+ 12: "Right Hip",
+ 13: "Left Knee",
+ 14: "Right Knee",
+ 15: "Left Ankle",
+ 16: "Right Ankle"
+
+AI Challenger Description:
+ 0: "Right Shoulder",
+ 1: "Right Elbow",
+ 2: "Right Wrist",
+ 3: "Left Shoulder",
+ 4: "Left Elbow",
+ 5: "Left Wrist",
+ 6: "Right Hip",
+ 7: "Right Knee",
+ 8: "Right Ankle",
+ 9: "Left Hip",
+ 10: "Left Knee",
+ 11: "Left Ankle",
+ 12: "Head top",
+ 13: "Neck"
+```
+
+Since the two datasets use different keypoint annotation formats, we aligned the annotations of both datasets to the COCO format. You can download the [training reference list](https://bj.bcebos.com/v1/paddledet/data/keypoint/aic_coco_train_cocoformat.json) and put it under `dataset/`. The main steps to align the two datasets' annotation files are:
+- Reorder the `AI Challenger` keypoints to be consistent with `COCO`, and unify the labeled/visible flags;
+- Discard the keypoints unique to `AI Challenger`; mark the keypoints unique to `COCO` as not labeled in the `AI Challenger` data;
+- Rearrange `image_id` and `annotation id`.
+
+Train the models with the merged annotations converted to `COCO` format:
+```bash
+# keypoint detection model
+python3 -m paddle.distributed.launch tools/train.py -c configs/keypoint/tiny_pose/tinypose_128x96.yml
+
+# pedestrian detection model
+python3 -m paddle.distributed.launch tools/train.py -c configs/picodet/application/pedestrian_detection/picodet_s_320_pedestrian.yml
+```
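The annotation alignment described above can be sketched in Python. This is a minimal illustration derived from the keypoint index tables listed earlier; the function name `aic_to_coco_keypoints` and the mapping layout are our own and are not part of PaddleDetection:

```python
# Minimal sketch: reorder AI Challenger keypoints (14 points, [x, y, v] each)
# into the 17-point COCO layout. Face keypoints absent from AI Challenger
# stay unlabeled (x = y = v = 0); AI Challenger's "Head top"/"Neck" are dropped.
# Keys are COCO indexes, values are AI Challenger indexes, per the tables above.
AIC_TO_COCO = {5: 3, 6: 0, 7: 4, 8: 1, 9: 5, 10: 2,
               11: 9, 12: 6, 13: 10, 14: 7, 15: 11, 16: 8}

def aic_to_coco_keypoints(aic_kpts):
    """aic_kpts: flat list of 14 * 3 floats -> flat list of 17 * 3 floats."""
    coco_kpts = [0.0] * (17 * 3)  # every slot starts as "not labeled"
    for coco_idx, aic_idx in AIC_TO_COCO.items():
        coco_kpts[coco_idx * 3:coco_idx * 3 + 3] = \
            aic_kpts[aic_idx * 3:aic_idx * 3 + 3]
    return coco_kpts
```

Applying this per annotation (plus reassigning `image_id`/`annotation id`) yields the merged COCO-format file used above.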
+
+## Model Deployment
+### Deploy Inference
+1. Export the trained models through the following commands:
+```bash
+python3 tools/export_model.py -c configs/picodet/application/pedestrian_detection/picodet_s_192_pedestrian.yml --output_dir=output_inference -o weights=output/picodet_s_192_pedestrian/model_final
+
+python3 tools/export_model.py -c configs/keypoint/tiny_pose/tinypose_128x96.yml --output_dir=output_inference -o weights=output/tinypose_128x96/model_final
+```
+The exported model looks like:
+```
+picodet_s_192_pedestrian
+├── infer_cfg.yml
+├── model.pdiparams
+├── model.pdiparams.info
+└── model.pdmodel
+```
+You can also directly download the corresponding `Deployment Model` provided in the Model Zoo to obtain the deployment models of the pedestrian detection model and the keypoint detection model, then unzip them.
+
+2. Run Python joint inference with the detection and keypoint models
+```bash
+# inference for one image
+python3 deploy/python/det_keypoint_unite_infer.py --det_model_dir=output_inference/picodet_s_320_pedestrian --keypoint_model_dir=output_inference/tinypose_128x96 --image_file={your image file} --device=GPU
+
+# inference for several images
+python3 deploy/python/det_keypoint_unite_infer.py --det_model_dir=output_inference/picodet_s_320_pedestrian --keypoint_model_dir=output_inference/tinypose_128x96 --image_dir={dir of image file} --device=GPU
+
+# inference for a video
+python3 deploy/python/det_keypoint_unite_infer.py --det_model_dir=output_inference/picodet_s_320_pedestrian --keypoint_model_dir=output_inference/tinypose_128x96 --video_file={your video file} --device=GPU
+```
+
+3. Run C++ joint inference with the detection and keypoint models
+- First, please follow [C++ Deploy Inference](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.3/deploy/cpp) to prepare the corresponding `paddle_inference` library and related dependencies for your environment.
+- We provide a [one-click compile script](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.3/deploy/cpp/scripts/build.sh); fill in the locations of the relevant environment variables in it and compile the above code to obtain an executable file. Please ensure `WITH_KEYPOINT=ON` during this process.
+- After compilation, you can run deploy inference, for example:
+```bash
+# inference for one image
+./build/main --model_dir=output_inference/picodet_s_320_pedestrian --model_dir_keypoint=output_inference/tinypose_128x96 --image_file={your image file} --device=GPU
+
+# inference for several images
+./build/main --model_dir=output_inference/picodet_s_320_pedestrian --model_dir_keypoint=output_inference/tinypose_128x96 --image_dir={dir of image file} --device=GPU
+
+# inference for a video
+./build/main --model_dir=output_inference/picodet_s_320_pedestrian --model_dir_keypoint=output_inference/tinypose_128x96 --video_file={your video file} --device=GPU
+```
+
+### Deployment on Mobile Devices
+#### Deploy directly with the models we provide
+1. Download the `Paddle-Lite Model` provided in the Model Zoo to obtain the `.nb` files of the pedestrian detection model and the keypoint detection model.
+2. Prepare the Paddle-Lite runtime environment. You can obtain precompiled libraries directly from [PaddleLite Precompiled Libraries](https://paddle-lite.readthedocs.io/zh/latest/quick_start/release_lib.html) without compiling them yourself. If FP16 inference is needed, download the FP16 precompiled library.
+3. Compile the code to run the models. See [Paddle-Lite Deployment on Mobile Devices](../../../deploy/lite/README.md) for detailed steps.
+
+#### Deploy Self-trained Models on Mobile Devices
+If you want to deploy your own trained models, you can refer to the following steps:
+1. Export the trained models
+```bash
+python3 tools/export_model.py -c configs/picodet/application/pedestrian_detection/picodet_s_192_pedestrian.yml --output_dir=output_inference -o weights=output/picodet_s_192_pedestrian/model_final TestReader.fuse_normalize=true
+
+python3 tools/export_model.py -c configs/keypoint/tiny_pose/tinypose_128x96.yml --output_dir=output_inference -o weights=output/tinypose_128x96/model_final TestReader.fuse_normalize=true
+```
+2. Convert to Lite models (relies on [Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite))
+
+- Install Paddle-Lite:
+```bash
+pip install paddlelite
+```
+- Run the following commands to obtain the Paddle-Lite models (with the `.nb` suffix) for on-device deployment:
+```
+# 1. Convert pedestrian detection model
+# FP32
+paddle_lite_opt --model_dir=inference_model/picodet_s_192_pedestrian --valid_targets=arm --optimize_out=picodet_s_192_pedestrian_fp32
+# FP16
+paddle_lite_opt --model_dir=inference_model/picodet_s_192_pedestrian --valid_targets=arm --optimize_out=picodet_s_192_pedestrian_fp16 --enable_fp16=true
+
+# 2. Convert keypoint detection model
+# FP32
+paddle_lite_opt --model_dir=inference_model/tinypose_128x96 --valid_targets=arm --optimize_out=tinypose_128x96_fp32
+# FP16
+paddle_lite_opt --model_dir=inference_model/tinypose_128x96 --valid_targets=arm --optimize_out=tinypose_128x96_fp16 --enable_fp16=true
+```
+
+3. Compile the code to run the models. See [Paddle-Lite Deployment on Mobile Devices](../../../deploy/lite/README.md) for detailed steps.
+
+We provide [end-to-end example code](../../../deploy/lite/) covering data preprocessing, model inference, and postprocessing, which you can modify according to your actual needs.
+
+**Note**
+- Adding the `TestReader.fuse_normalize=true` option when exporting the model merges the image Normalize operation into the model, which speeds up inference.
+- FP16 inference achieves faster model inference. To deploy an FP16 model, besides the model conversion step you also need to compile a Paddle-Lite prediction library that supports FP16; see [Paddle Lite deployment on ARM CPU](https://paddle-lite.readthedocs.io/zh/latest/demo_guides/arm_cpu.html).
+
+## Keypoint Stabilization Strategy (video inference only)
+Please refer to [Keypoint Stabilization Strategy](../README.md#关键点稳定策略仅适用于视频数据).
+
+## Optimization Strategies
+PP-TinyPose adopts the following strategies to balance model speed and accuracy:
+- A lightweight backbone for pose estimation, [wider naive Lite-HRNet](https://arxiv.org/abs/2104.06403).
+- A smaller input size, to improve overall inference speed.
+- Distribution-Aware coordinate Representation of Keypoints ([DARK](https://arxiv.org/abs/1910.06278)), to improve accuracy with low-resolution heatmaps.
+- Unbiased Data Processing ([UDP](https://arxiv.org/abs/1911.07524)), using unbiased data encoding/decoding to improve accuracy.
+- Augmentation by Information Dropping ([AID](https://arxiv.org/abs/2008.07139v2)), improving keypoint localization through information-dropping augmentation.
+- FP16 inference, for faster model inference.
diff --git a/PaddleDetection-release-2.6/configs/keypoint/tiny_pose/README_en.md b/PaddleDetection-release-2.6/configs/keypoint/tiny_pose/README_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..6bf5d9b70877dac66c82dfdc821ec5dd4d1fe6f4
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/keypoint/tiny_pose/README_en.md
@@ -0,0 +1,224 @@
+[Simplified Chinese](README.md) | English
+
+# PP-TinyPose
+
+
+
+
+
+Image Source: COCO2017
+
+
+## Introduction
+PP-TinyPose is a real-time keypoint detection model optimized by PaddleDetection for mobile devices, which can smoothly run multi-person pose estimation tasks on mobile devices. With the excellent self-developed lightweight detection model [PicoDet](../../picodet/README.md), we also provide a lightweight pedestrian detection model. PP-TinyPose has the following dependency requirements:
+- [PaddlePaddle](https://github.com/PaddlePaddle/Paddle)>=2.2
+
+If you want to deploy it on mobile devices, you also need:
+- [Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite)>=2.10
+
+
+
+
+
+
+
+## Deployment Case
+
+- [Android Fitness Demo](https://github.com/zhiboniu/pose_demo_android) based on PP-TinyPose, which efficiently implements fitness calibration and counting.
+
+
+
+
+
+
+- Welcome to scan the QR code for quick experience.
+
+
+
+
+
+
+## Model Zoo
+### Keypoint Detection Model
+| Model | Input Size | AP (COCO Val) | Inference Time for Single Person (FP32)| Inference Time for Single Person(FP16) | Config | Model Weights | Deployment Model | Paddle-Lite Model(FP32) | Paddle-Lite Model(FP16)|
+| :------------------------ | :-------: | :------: | :------: |:---: | :---: | :---: | :---: | :---: | :---: |
+| PP-TinyPose | 128*96 | 58.1 | 4.57ms | 3.27ms | [Config](./tinypose_128x96.yml) |[Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_128x96.pdparams) | [Deployment Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_128x96.tar) | [Lite Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_128x96_lite.tar) | [Lite Model(FP16)](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_128x96_fp16_lite.tar) |
+| PP-TinyPose | 256*192 | 68.8 | 14.07ms | 8.33ms | [Config](./tinypose_256x192.yml) | [Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_256x192.pdparams) | [Deployment Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_256x192.tar) | [Lite Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_256x192_lite.tar) | [Lite Model(FP16)](https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_256x192_fp16_lite.tar) |
+
+### Pedestrian Detection Model
+| Model | Input Size | mAP (COCO Val) | Average Inference Time (FP32)| Average Inference Time (FP16) | Config | Model Weights | Deployment Model | Paddle-Lite Model(FP32) | Paddle-Lite Model(FP16)|
+| :------------------------ | :-------: | :------: | :------: | :---: | :---: | :---: | :---: | :---: | :---: |
+| PicoDet-S-Pedestrian | 192*192 | 29.0 | 4.30ms | 2.37ms | [Config](../../picodet/legacy_model/application/pedestrian_detection/picodet_s_192_pedestrian.yml) |[Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_192_pedestrian.pdparams) | [Deployment Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_192_pedestrian.tar) | [Lite Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_192_pedestrian_lite.tar) | [Lite Model(FP16)](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_192_pedestrian_fp16_lite.tar) |
+| PicoDet-S-Pedestrian | 320*320 | 38.5 | 10.26ms | 6.30ms | [Config](../../picodet/legacy_model/application/pedestrian_detection/picodet_s_320_pedestrian.yml) | [Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_320_pedestrian.pdparams) | [Deployment Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_320_pedestrian.tar) | [Lite Model](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_320_pedestrian_lite.tar) | [Lite Model(FP16)](https://bj.bcebos.com/v1/paddledet/models/keypoint/picodet_s_320_pedestrian_fp16_lite.tar) |
+
+
+**Tips**
+- The keypoint detection model and pedestrian detection model are both trained on `COCO train2017` and `AI Challenger trainset`. The keypoint detection model is evaluated on `COCO person keypoints val2017`, and the pedestrian detection model is evaluated on `COCO instances val2017`.
+- The AP results of keypoint detection models are based on bounding boxes in GroundTruth.
+- Both the keypoint detection model and the pedestrian detection model are trained in a 4-GPU environment. If the number of GPUs or the batch size needs to be changed for your training environment, refer to the [FAQ](../../../docs/tutorials/FAQ/README.md) to adjust the learning rate accordingly.
+- The inference time is tested on a Qualcomm Snapdragon 865, with 4 threads at arm8.
+
+### Pipeline Performance
+| Model for Single-Pose | AP (COCO Val Single-Person) | Time for Single Person(FP32) | Time for Single Person(FP16) |
+| :------------------------ | :------: | :---: | :---: |
+| PicoDet-S-Pedestrian-192\*192 + PP-TinyPose-128\*96 | 51.8 | 11.72 ms| 8.18 ms |
+| Other opensource model-192\*192 | 22.3 | 12.0 ms| - |
+
+| Model for Multi-Pose | AP (COCO Val Multi-Persons) | Time for Six Persons(FP32) | Time for Six Persons(FP16)|
+| :------------------------ | :-------: | :---: | :---: |
+| PicoDet-S-Pedestrian-320\*320 + PP-TinyPose-128\*96 | 50.3 | 44.0 ms| 32.57 ms |
+| Other opensource model-256\*256 | 39.4 | 51.0 ms| - |
+
+**Tips**
+- The AP results of keypoint detection models are based on bounding boxes detected by corresponding detection model.
+- In accuracy evaluation, the flip operation is removed, and the confidence threshold for bounding boxes is set to 0.5.
+- For fairness, in multi-persons test, we remove images with more than 6 people.
+- The inference time is tested on a Qualcomm Snapdragon 865, with 4 threads at arm8, FP32.
+- Pipeline time includes time for preprocessing, inference, and postprocessing.
+- About the deployment and testing for other opensource model, please refer to [Here](https://github.com/zhiboniu/MoveNet-PaddleLite).
+- For more performance data in other runtime environment, please refer to [Keypoint Inference Benchmark](../KeypointBenchmark.md).
+
+## Model Training
+In addition to `COCO`, the trainset for keypoint detection model and pedestrian detection model also includes [AI Challenger](https://arxiv.org/abs/1711.06475). Keypoints of each dataset are defined as follows:
+```
+COCO keypoint Description:
+ 0: "Nose",
+ 1: "Left Eye",
+ 2: "Right Eye",
+ 3: "Left Ear",
+ 4: "Right Ear",
+    5: "Left Shoulder",
+ 6: "Right Shoulder",
+ 7: "Left Elbow",
+ 8: "Right Elbow",
+ 9: "Left Wrist",
+ 10: "Right Wrist",
+ 11: "Left Hip",
+ 12: "Right Hip",
+ 13: "Left Knee",
+ 14: "Right Knee",
+ 15: "Left Ankle",
+ 16: "Right Ankle"
+
+AI Challenger Description:
+ 0: "Right Shoulder",
+ 1: "Right Elbow",
+ 2: "Right Wrist",
+ 3: "Left Shoulder",
+ 4: "Left Elbow",
+ 5: "Left Wrist",
+ 6: "Right Hip",
+ 7: "Right Knee",
+ 8: "Right Ankle",
+ 9: "Left Hip",
+ 10: "Left Knee",
+ 11: "Left Ankle",
+ 12: "Head top",
+ 13: "Neck"
+```
+
+Since the annotation formats of these two datasets are different, we aligned their annotations to the `COCO` format. You can download the [Training List](https://bj.bcebos.com/v1/paddledet/data/keypoint/aic_coco_train_cocoformat.json) and put it at `dataset/`. To align these two datasets, we mainly did the following:
+- Align the indexes of the `AI Challenger` keypoint to be consistent with `COCO` and unify the flags whether the keypoint is labeled/visible.
+- Discard the keypoints unique to `AI Challenger`. For keypoints not in this dataset but in `COCO`, mark them as not labeled.
+- Rearranged `image_id` and `annotation id`.
+
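The alignment steps above can be sketched in Python. This is a minimal illustration based on the keypoint index tables listed earlier; the helper name `aic_to_coco_keypoints` and the mapping layout are our own, not part of PaddleDetection:

```python
# Minimal sketch: reorder AI Challenger keypoints (14 points, [x, y, v] each)
# into the 17-point COCO layout. Face keypoints absent from AI Challenger
# stay unlabeled (x = y = v = 0); AI Challenger's "Head top"/"Neck" are dropped.
# Keys are COCO indexes, values are AI Challenger indexes, per the tables above.
AIC_TO_COCO = {5: 3, 6: 0, 7: 4, 8: 1, 9: 5, 10: 2,
               11: 9, 12: 6, 13: 10, 14: 7, 15: 11, 16: 8}

def aic_to_coco_keypoints(aic_kpts):
    """aic_kpts: flat list of 14 * 3 floats -> flat list of 17 * 3 floats."""
    coco_kpts = [0.0] * (17 * 3)  # every slot starts as "not labeled"
    for coco_idx, aic_idx in AIC_TO_COCO.items():
        coco_kpts[coco_idx * 3:coco_idx * 3 + 3] = \
            aic_kpts[aic_idx * 3:aic_idx * 3 + 3]
    return coco_kpts
```

Applying this per annotation, together with reassigned `image_id`/`annotation id` values, yields the merged COCO-format annotation file.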
+Training with merged annotation file converted to `COCO` format:
+```bash
+# keypoint detection model
+python3 -m paddle.distributed.launch tools/train.py -c configs/keypoint/tiny_pose/tinypose_128x96.yml
+
+# pedestrian detection model
+python3 -m paddle.distributed.launch tools/train.py -c configs/picodet/application/pedestrian_detection/picodet_s_320_pedestrian.yml
+```
+
+## Model Deployment
+### Deploy Inference
+1. Export the trained model through the following command:
+```bash
+python3 tools/export_model.py -c configs/picodet/application/pedestrian_detection/picodet_s_192_pedestrian.yml --output_dir=output_inference -o weights=output/picodet_s_192_pedestrian/model_final
+
+python3 tools/export_model.py -c configs/keypoint/tiny_pose/tinypose_128x96.yml --output_dir=output_inference -o weights=output/tinypose_128x96/model_final
+```
+The exported model looks like:
+```
+picodet_s_192_pedestrian
+├── infer_cfg.yml
+├── model.pdiparams
+├── model.pdiparams.info
+└── model.pdmodel
+```
+You can also directly download the `Deployment Model` provided in the `Model Zoo` to obtain the deployment models of the pedestrian detection model and the keypoint detection model, then unzip them.
+
+2. Python joint inference by detection and keypoint
+```bash
+# inference for one image
+python3 deploy/python/det_keypoint_unite_infer.py --det_model_dir=output_inference/picodet_s_320_pedestrian --keypoint_model_dir=output_inference/tinypose_128x96 --image_file={your image file} --device=GPU
+
+# inference for several images
+python3 deploy/python/det_keypoint_unite_infer.py --det_model_dir=output_inference/picodet_s_320_pedestrian --keypoint_model_dir=output_inference/tinypose_128x96 --image_dir={dir of image file} --device=GPU
+
+# inference for a video
+python3 deploy/python/det_keypoint_unite_infer.py --det_model_dir=output_inference/picodet_s_320_pedestrian --keypoint_model_dir=output_inference/tinypose_128x96 --video_file={your video file} --device=GPU
+```
+
+3. C++ joint inference by detection and keypoint
+- First, please refer to [C++ Deploy Inference](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.3/deploy/cpp) and prepare the corresponding `paddle_inference` library and related dependencies according to your environment.
+- We provide a [Compile Script](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.3/deploy/cpp/scripts/build.sh). Fill in the locations of the relevant environment variables in this script and execute it to compile the above code and obtain an executable file. Please ensure `WITH_KEYPOINT=ON` during this process.
+- After compilation, you can do inference like:
+```bash
+# inference for one image
+./build/main --model_dir=output_inference/picodet_s_320_pedestrian --model_dir_keypoint=output_inference/tinypose_128x96 --image_file={your image file} --device=GPU
+
+# inference for several images
+./build/main --model_dir=output_inference/picodet_s_320_pedestrian --model_dir_keypoint=output_inference/tinypose_128x96 --image_dir={dir of image file} --device=GPU
+
+# inference for a video
+./build/main --model_dir=output_inference/picodet_s_320_pedestrian --model_dir_keypoint=output_inference/tinypose_128x96 --video_file={your video file} --device=GPU
+```
+
+### Deployment on Mobile Devices
+#### Deploy directly using models we provide
+1. Download the `Lite Model` from the `Model Zoo` to obtain the `.nb` format files of the pedestrian detection model and the keypoint detection model.
+2. Prepare the environment for Paddle-Lite. You can obtain precompiled libraries from [PaddleLite Precompiled Libraries](https://paddle-lite.readthedocs.io/zh/latest/quick_start/release_lib.html). If FP16 is needed, you should download the [Precompiled Libraries for FP16](https://github.com/PaddlePaddle/Paddle-Lite/releases/download/v2.10-rc/inference_lite_lib.android.armv8_clang_c++_static_with_extra_with_cv_with_fp16.tiny_publish_427e46.zip).
+3. Compile the code to run models. The detail can be seen in [Paddle-Lite Deployment on Mobile Devices](../../../deploy/lite/README.md).
+
+#### Deploying Self-trained Models on Mobile Devices
+If you want to deploy self-trained models, you can refer to the following steps:
+1. Export the trained model
+```bash
+python3 tools/export_model.py -c configs/picodet/application/pedestrian_detection/picodet_s_192_pedestrian.yml --output_dir=output_inference -o weights=output/picodet_s_192_pedestrian/model_final TestReader.fuse_normalize=true
+
+python3 tools/export_model.py -c configs/keypoint/tiny_pose/tinypose_128x96.yml --output_dir=output_inference -o weights=output/tinypose_128x96/model_final TestReader.fuse_normalize=true
+```
+2. Convert to a Lite model (relies on [Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite))
+
+- Install Paddle-Lite:
+```bash
+pip install paddlelite
+```
+- Run the following commands to obtain `.nb` format models of Paddle-Lite:
+```bash
+# 1. Convert pedestrian detection model
+# FP32
+paddle_lite_opt --model_dir=output_inference/picodet_s_192_pedestrian --valid_targets=arm --optimize_out=picodet_s_192_pedestrian_fp32
+# FP16
+paddle_lite_opt --model_dir=output_inference/picodet_s_192_pedestrian --valid_targets=arm --optimize_out=picodet_s_192_pedestrian_fp16 --enable_fp16=true
+
+# 2. Convert keypoint detection model
+# FP32
+paddle_lite_opt --model_dir=output_inference/tinypose_128x96 --valid_targets=arm --optimize_out=tinypose_128x96_fp32
+# FP16
+paddle_lite_opt --model_dir=output_inference/tinypose_128x96 --valid_targets=arm --optimize_out=tinypose_128x96_fp16 --enable_fp16=true
+```
+
+3. Compile the code and run the models. See [Paddle-Lite Deployment on Mobile Devices](../../../deploy/lite/README.md) for details.
+
+We provide [Example Code](../../../deploy/lite/) covering data preprocessing, inference, and postprocessing. You can modify the code according to your actual needs.
+
+**Note:**
+- Add `TestReader.fuse_normalize=true` when exporting the model. The image normalization operation is then executed inside the model, which speeds up inference.
+- FP16 gives faster inference. To deploy an FP16 model, in addition to the model conversion step, you also need to compile the Paddle-Lite prediction library with FP16 support. For details, see [Paddle Lite Deployment on ARM CPU](https://paddle-lite.readthedocs.io/zh/latest/demo_guides/arm_cpu.html).
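To make the first note concrete, the host-side normalization that fusing removes can be sketched as follows (an illustrative sketch only, not the deploy code; `MEAN`/`STD` mirror the `global_mean`/`global_std` values in the TinyPose configs):

```python
import numpy as np

# ImageNet mean/std, matching global_mean/global_std in the TinyPose configs.
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def normalize_on_host(img_uint8):
    """Host-side normalize (what the reader does without fuse_normalize):
    scale HWC uint8 pixels to [0, 1], subtract mean, divide by std.
    With fuse_normalize=true this arithmetic runs inside the exported
    model instead, saving one extra pass over the image on the CPU."""
    img = img_uint8.astype(np.float32) / 255.0
    return (img - MEAN) / STD
```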
+
+## Optimization Strategies
+TinyPose adopts the following strategies to balance model speed and accuracy:
+- A lightweight pose-estimation backbone: the [wider naive Lite-HRNet](https://arxiv.org/abs/2104.06403).
+- A smaller input size.
+- Distribution-Aware coordinate Representation of Keypoints ([DARK](https://arxiv.org/abs/1910.06278)), which improves accuracy on low-resolution heatmaps.
+- Unbiased Data Processing ([UDP](https://arxiv.org/abs/1911.07524)).
+- Augmentation by Information Dropping ([AID](https://arxiv.org/abs/2008.07139v2)).
+- FP16 inference.
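For intuition, the DARK decoding listed above can be sketched as a second-order Taylor refinement of the heatmap argmax. This is a simplified, hypothetical illustration that assumes an interior (non-border) peak; the actual logic lives in `HRNetPostProcess` with `use_dark: true`:

```python
import numpy as np

def dark_refine(heatmap, px, py):
    """Refine the integer argmax (px, py) of a 2D heatmap to sub-pixel
    precision via a Newton step on the log-heatmap (DARK-style sketch)."""
    h = np.log(np.maximum(heatmap, 1e-10))
    # First derivatives at the peak (central differences).
    dx = 0.5 * (h[py, px + 1] - h[py, px - 1])
    dy = 0.5 * (h[py + 1, px] - h[py - 1, px])
    # Second derivatives (discrete Hessian entries).
    dxx = h[py, px + 1] - 2.0 * h[py, px] + h[py, px - 1]
    dyy = h[py + 1, px] - 2.0 * h[py, px] + h[py - 1, px]
    dxy = 0.25 * (h[py + 1, px + 1] - h[py + 1, px - 1]
                  - h[py - 1, px + 1] + h[py - 1, px - 1])
    hess = np.array([[dxx, dxy], [dxy, dyy]])
    grad = np.array([dx, dy])
    if abs(np.linalg.det(hess)) < 1e-10:
        return float(px), float(py)
    # Newton step toward the continuous maximum of the fitted Gaussian.
    off = -np.linalg.solve(hess, grad)
    return px + off[0], py + off[1]
```

For a perfectly Gaussian blob the log-heatmap is quadratic, so this single step recovers the true sub-pixel center exactly, which is why it helps most on coarse heatmaps such as the 24x32 grid used by `tinypose_128x96`.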
diff --git a/PaddleDetection-release-2.6/configs/keypoint/tiny_pose/tinypose_128x96.yml b/PaddleDetection-release-2.6/configs/keypoint/tiny_pose/tinypose_128x96.yml
new file mode 100644
index 0000000000000000000000000000000000000000..e213c299020cae7954ad1fb9214d3e53156e2ee5
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/keypoint/tiny_pose/tinypose_128x96.yml
@@ -0,0 +1,147 @@
+use_gpu: true
+log_iter: 5
+save_dir: output
+snapshot_epoch: 10
+weights: output/tinypose_128x96/model_final
+epoch: 420
+num_joints: &num_joints 17
+pixel_std: &pixel_std 200
+metric: KeyPointTopDownCOCOEval
+num_classes: 1
+train_height: &train_height 128
+train_width: &train_width 96
+trainsize: &trainsize [*train_width, *train_height]
+hmsize: &hmsize [24, 32]
+flip_perm: &flip_perm [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12], [13, 14], [15, 16]]
+
+
+#####model
+architecture: TopDownHRNet
+
+TopDownHRNet:
+ backbone: LiteHRNet
+ post_process: HRNetPostProcess
+ flip_perm: *flip_perm
+ num_joints: *num_joints
+ width: &width 40
+ loss: KeyPointMSELoss
+ use_dark: true
+
+LiteHRNet:
+ network_type: wider_naive
+ freeze_at: -1
+ freeze_norm: false
+ return_idx: [0]
+
+KeyPointMSELoss:
+ use_target_weight: true
+ loss_scale: 1.0
+
+#####optimizer
+LearningRate:
+ base_lr: 0.008
+ schedulers:
+ - !PiecewiseDecay
+ milestones: [380, 410]
+ gamma: 0.1
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 500
+
+OptimizerBuilder:
+ optimizer:
+ type: Adam
+ regularizer:
+ factor: 0.0
+ type: L2
+
+
+#####data
+TrainDataset:
+ !KeypointTopDownCocoDataset
+ image_dir: ""
+ anno_path: aic_coco_train_cocoformat.json
+ dataset_dir: dataset
+ num_joints: *num_joints
+ trainsize: *trainsize
+ pixel_std: *pixel_std
+ use_gt_bbox: True
+
+
+EvalDataset:
+ !KeypointTopDownCocoDataset
+ image_dir: val2017
+ anno_path: annotations/person_keypoints_val2017.json
+ dataset_dir: dataset/coco
+ num_joints: *num_joints
+ trainsize: *trainsize
+ pixel_std: *pixel_std
+ use_gt_bbox: True
+ image_thre: 0.5
+
+TestDataset:
+ !ImageFolder
+ anno_path: dataset/coco/keypoint_imagelist.txt
+
+worker_num: 2
+global_mean: &global_mean [0.485, 0.456, 0.406]
+global_std: &global_std [0.229, 0.224, 0.225]
+TrainReader:
+ sample_transforms:
+ - RandomFlipHalfBodyTransform:
+ scale: 0.25
+ rot: 30
+ num_joints_half_body: 8
+ prob_half_body: 0.3
+ pixel_std: *pixel_std
+ trainsize: *trainsize
+ upper_body_ids: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
+ flip_pairs: *flip_perm
+ - AugmentationbyInformantionDropping:
+ prob_cutout: 0.5
+ offset_factor: 0.05
+ num_patch: 1
+ trainsize: *trainsize
+ - TopDownAffine:
+ trainsize: *trainsize
+ use_udp: true
+ - ToHeatmapsTopDown_DARK:
+ hmsize: *hmsize
+ sigma: 1
+ batch_transforms:
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 512
+ shuffle: true
+ drop_last: false
+
+EvalReader:
+ sample_transforms:
+ - TopDownAffine:
+ trainsize: *trainsize
+ use_udp: true
+ batch_transforms:
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 16
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *train_height, *train_width]
+ sample_transforms:
+ - Decode: {}
+ - TopDownEvalAffine:
+ trainsize: *trainsize
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 1
+ fuse_normalize: false
diff --git a/PaddleDetection-release-2.6/configs/keypoint/tiny_pose/tinypose_256x192.yml b/PaddleDetection-release-2.6/configs/keypoint/tiny_pose/tinypose_256x192.yml
new file mode 100644
index 0000000000000000000000000000000000000000..9de2a635f4c8105335916c1a9200b189cc17f016
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/keypoint/tiny_pose/tinypose_256x192.yml
@@ -0,0 +1,147 @@
+use_gpu: true
+log_iter: 5
+save_dir: output
+snapshot_epoch: 10
+weights: output/tinypose_256x192/model_final
+epoch: 420
+num_joints: &num_joints 17
+pixel_std: &pixel_std 200
+metric: KeyPointTopDownCOCOEval
+num_classes: 1
+train_height: &train_height 256
+train_width: &train_width 192
+trainsize: &trainsize [*train_width, *train_height]
+hmsize: &hmsize [48, 64]
+flip_perm: &flip_perm [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12], [13, 14], [15, 16]]
+
+
+#####model
+architecture: TopDownHRNet
+
+TopDownHRNet:
+ backbone: LiteHRNet
+ post_process: HRNetPostProcess
+ flip_perm: *flip_perm
+ num_joints: *num_joints
+ width: &width 40
+ loss: KeyPointMSELoss
+ use_dark: true
+
+LiteHRNet:
+ network_type: wider_naive
+ freeze_at: -1
+ freeze_norm: false
+ return_idx: [0]
+
+KeyPointMSELoss:
+ use_target_weight: true
+ loss_scale: 1.0
+
+#####optimizer
+LearningRate:
+ base_lr: 0.002
+ schedulers:
+ - !PiecewiseDecay
+ milestones: [380, 410]
+ gamma: 0.1
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 500
+
+OptimizerBuilder:
+ optimizer:
+ type: Adam
+ regularizer:
+ factor: 0.0
+ type: L2
+
+
+#####data
+TrainDataset:
+ !KeypointTopDownCocoDataset
+ image_dir: ""
+ anno_path: aic_coco_train_cocoformat.json
+ dataset_dir: dataset
+ num_joints: *num_joints
+ trainsize: *trainsize
+ pixel_std: *pixel_std
+ use_gt_bbox: True
+
+
+EvalDataset:
+ !KeypointTopDownCocoDataset
+ image_dir: val2017
+ anno_path: annotations/person_keypoints_val2017.json
+ dataset_dir: dataset/coco
+ num_joints: *num_joints
+ trainsize: *trainsize
+ pixel_std: *pixel_std
+ use_gt_bbox: True
+ image_thre: 0.5
+
+TestDataset:
+ !ImageFolder
+ anno_path: dataset/coco/keypoint_imagelist.txt
+
+worker_num: 2
+global_mean: &global_mean [0.485, 0.456, 0.406]
+global_std: &global_std [0.229, 0.224, 0.225]
+TrainReader:
+ sample_transforms:
+ - RandomFlipHalfBodyTransform:
+ scale: 0.25
+ rot: 30
+ num_joints_half_body: 8
+ prob_half_body: 0.3
+ pixel_std: *pixel_std
+ trainsize: *trainsize
+ upper_body_ids: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
+ flip_pairs: *flip_perm
+ - AugmentationbyInformantionDropping:
+ prob_cutout: 0.5
+ offset_factor: 0.05
+ num_patch: 1
+ trainsize: *trainsize
+ - TopDownAffine:
+ trainsize: *trainsize
+ use_udp: true
+ - ToHeatmapsTopDown_DARK:
+ hmsize: *hmsize
+ sigma: 2
+ batch_transforms:
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 128
+ shuffle: true
+ drop_last: false
+
+EvalReader:
+ sample_transforms:
+ - TopDownAffine:
+ trainsize: *trainsize
+ use_udp: true
+ batch_transforms:
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 16
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *train_height, *train_width]
+ sample_transforms:
+ - Decode: {}
+ - TopDownEvalAffine:
+ trainsize: *trainsize
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 1
+ fuse_normalize: false
diff --git a/PaddleDetection-release-2.6/configs/mask_rcnn/README.md b/PaddleDetection-release-2.6/configs/mask_rcnn/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..7e7276127554b80b0552f4051702f3ad74a0cfbf
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mask_rcnn/README.md
@@ -0,0 +1,31 @@
+# Mask R-CNN
+
+## Model Zoo
+
+| Backbone | Type | Images/GPU | Lr Schedule | Inf Time (fps) | Box AP | Mask AP | Download | Config |
+| :------------------- | :------------| :-----: | :-----: | :------------: | :-----: | :-----: | :-----------------------------------------------------: | :-----: |
+| ResNet50 | Mask | 1 | 1x | ---- | 37.4 | 32.8 | [model](https://paddledet.bj.bcebos.com/models/mask_rcnn_r50_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/mask_rcnn/mask_rcnn_r50_1x_coco.yml) |
+| ResNet50 | Mask | 1 | 2x | ---- | 39.7 | 34.5 | [model](https://paddledet.bj.bcebos.com/models/mask_rcnn_r50_2x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/mask_rcnn/mask_rcnn_r50_2x_coco.yml) |
+| ResNet50-FPN | Mask | 1 | 1x | ---- | 39.2 | 35.6 | [model](https://paddledet.bj.bcebos.com/models/mask_rcnn_r50_fpn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.yml) |
+| ResNet50-FPN | Mask | 1 | 2x | ---- | 40.5 | 36.7 | [model](https://paddledet.bj.bcebos.com/models/mask_rcnn_r50_fpn_2x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/mask_rcnn/mask_rcnn_r50_fpn_2x_coco.yml) |
+| ResNet50-vd-FPN | Mask | 1 | 1x | ---- | 40.3 | 36.4 | [model](https://paddledet.bj.bcebos.com/models/mask_rcnn_r50_vd_fpn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/mask_rcnn/mask_rcnn_r50_vd_fpn_1x_coco.yml) |
+| ResNet50-vd-FPN | Mask | 1 | 2x | ---- | 41.4 | 37.5 | [model](https://paddledet.bj.bcebos.com/models/mask_rcnn_r50_vd_fpn_2x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/mask_rcnn/mask_rcnn_r50_vd_fpn_2x_coco.yml) |
+| ResNet101-FPN | Mask | 1 | 1x | ---- | 40.6 | 36.6 | [model](https://paddledet.bj.bcebos.com/models/mask_rcnn_r101_fpn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/mask_rcnn/mask_rcnn_r101_fpn_1x_coco.yml) |
+| ResNet101-vd-FPN | Mask | 1 | 1x | ---- | 42.4 | 38.1 | [model](https://paddledet.bj.bcebos.com/models/mask_rcnn_r101_vd_fpn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/mask_rcnn/mask_rcnn_r101_vd_fpn_1x_coco.yml) |
+| ResNeXt101-vd-FPN | Mask | 1 | 1x | ---- | 44.0 | 39.5 | [model](https://paddledet.bj.bcebos.com/models/mask_rcnn_x101_vd_64x4d_fpn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/mask_rcnn/mask_rcnn_x101_vd_64x4d_fpn_1x_coco.yml) |
+| ResNeXt101-vd-FPN | Mask | 1 | 2x | ---- | 44.6 | 39.8 | [model](https://paddledet.bj.bcebos.com/models/mask_rcnn_x101_vd_64x4d_fpn_2x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/mask_rcnn/mask_rcnn_x101_vd_64x4d_fpn_2x_coco.yml) |
+| ResNet50-vd-SSLDv2-FPN | Mask | 1 | 1x | ---- | 42.0 | 38.2 | [model](https://paddledet.bj.bcebos.com/models/mask_rcnn_r50_vd_fpn_ssld_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/mask_rcnn/mask_rcnn_r50_vd_fpn_ssld_1x_coco.yml) |
+| ResNet50-vd-SSLDv2-FPN | Mask | 1 | 2x | ---- | 42.7 | 38.9 | [model](https://paddledet.bj.bcebos.com/models/mask_rcnn_r50_vd_fpn_ssld_2x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/mask_rcnn/mask_rcnn_r50_vd_fpn_ssld_2x_coco.yml) |
+
+
+## Citations
+```
+@inproceedings{He_2017,
+ title={Mask R-CNN},
+ booktitle={2017 IEEE International Conference on Computer Vision (ICCV)},
+ publisher={IEEE},
+ author={He, Kaiming and Gkioxari, Georgia and Dollar, Piotr and Girshick, Ross},
+ year={2017},
+ month={Oct}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/mask_rcnn/_base_/mask_fpn_reader.yml b/PaddleDetection-release-2.6/configs/mask_rcnn/_base_/mask_fpn_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..6d95dc6a7cb2fe8c49a0fba79f9b6b71232d4c20
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mask_rcnn/_base_/mask_fpn_reader.yml
@@ -0,0 +1,40 @@
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], interp: 2, keep_ratio: True}
+ - RandomFlip: {prob: 0.5}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+ use_shared_memory: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
diff --git a/PaddleDetection-release-2.6/configs/mask_rcnn/_base_/mask_rcnn_r50.yml b/PaddleDetection-release-2.6/configs/mask_rcnn/_base_/mask_rcnn_r50.yml
new file mode 100644
index 0000000000000000000000000000000000000000..04dab63701171ada046b60e687422e06f8043c26
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mask_rcnn/_base_/mask_rcnn_r50.yml
@@ -0,0 +1,87 @@
+architecture: MaskRCNN
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_cos_pretrained.pdparams
+
+MaskRCNN:
+ backbone: ResNet
+ rpn_head: RPNHead
+ bbox_head: BBoxHead
+ mask_head: MaskHead
+ # post process
+ bbox_post_process: BBoxPostProcess
+ mask_post_process: MaskPostProcess
+
+ResNet:
+ # index 0 stands for res2
+ depth: 50
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [2]
+ num_stages: 3
+
+RPNHead:
+ anchor_generator:
+ aspect_ratios: [0.5, 1.0, 2.0]
+ anchor_sizes: [32, 64, 128, 256, 512]
+ strides: [16]
+ rpn_target_assign:
+ batch_size_per_im: 256
+ fg_fraction: 0.5
+ negative_overlap: 0.3
+ positive_overlap: 0.7
+ use_random: True
+ train_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 12000
+ post_nms_top_n: 2000
+ topk_after_collect: False
+ test_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 6000
+ post_nms_top_n: 1000
+
+
+BBoxHead:
+ head: Res5Head
+ roi_extractor:
+ resolution: 14
+ sampling_ratio: 0
+ aligned: True
+ bbox_assigner: BBoxAssigner
+ with_pool: true
+
+BBoxAssigner:
+ batch_size_per_im: 512
+ bg_thresh: 0.5
+ fg_thresh: 0.5
+ fg_fraction: 0.25
+ use_random: True
+
+
+BBoxPostProcess:
+ decode: RCNNBox
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.05
+ nms_threshold: 0.5
+
+MaskHead:
+ head: MaskFeat
+ roi_extractor:
+ resolution: 14
+ sampling_ratio: 0
+ aligned: True
+ mask_assigner: MaskAssigner
+ share_bbox_feat: true
+
+MaskFeat:
+ num_convs: 0
+ out_channel: 256
+
+MaskAssigner:
+ mask_resolution: 14
+
+MaskPostProcess:
+ binary_thresh: 0.5
diff --git a/PaddleDetection-release-2.6/configs/mask_rcnn/_base_/mask_rcnn_r50_fpn.yml b/PaddleDetection-release-2.6/configs/mask_rcnn/_base_/mask_rcnn_r50_fpn.yml
new file mode 100644
index 0000000000000000000000000000000000000000..dd7587669661a9e24431a167835ef89527f5e0c8
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mask_rcnn/_base_/mask_rcnn_r50_fpn.yml
@@ -0,0 +1,91 @@
+architecture: MaskRCNN
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_cos_pretrained.pdparams
+
+MaskRCNN:
+ backbone: ResNet
+ neck: FPN
+ rpn_head: RPNHead
+ bbox_head: BBoxHead
+ mask_head: MaskHead
+ # post process
+ bbox_post_process: BBoxPostProcess
+ mask_post_process: MaskPostProcess
+
+ResNet:
+ # index 0 stands for res2
+ depth: 50
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+
+FPN:
+ out_channel: 256
+
+RPNHead:
+ anchor_generator:
+ aspect_ratios: [0.5, 1.0, 2.0]
+ anchor_sizes: [[32], [64], [128], [256], [512]]
+ strides: [4, 8, 16, 32, 64]
+ rpn_target_assign:
+ batch_size_per_im: 256
+ fg_fraction: 0.5
+ negative_overlap: 0.3
+ positive_overlap: 0.7
+ use_random: True
+ train_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 2000
+ post_nms_top_n: 1000
+ topk_after_collect: True
+ test_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 1000
+ post_nms_top_n: 1000
+
+BBoxHead:
+ head: TwoFCHead
+ roi_extractor:
+ resolution: 7
+ sampling_ratio: 0
+ aligned: True
+ bbox_assigner: BBoxAssigner
+
+BBoxAssigner:
+ batch_size_per_im: 512
+ bg_thresh: 0.5
+ fg_thresh: 0.5
+ fg_fraction: 0.25
+ use_random: True
+
+TwoFCHead:
+ out_channel: 1024
+
+BBoxPostProcess:
+ decode: RCNNBox
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.05
+ nms_threshold: 0.5
+
+MaskHead:
+ head: MaskFeat
+ roi_extractor:
+ resolution: 14
+ sampling_ratio: 0
+ aligned: True
+ mask_assigner: MaskAssigner
+ share_bbox_feat: False
+
+MaskFeat:
+ num_convs: 4
+ out_channel: 256
+
+MaskAssigner:
+ mask_resolution: 28
+
+MaskPostProcess:
+ binary_thresh: 0.5
diff --git a/PaddleDetection-release-2.6/configs/mask_rcnn/_base_/mask_reader.yml b/PaddleDetection-release-2.6/configs/mask_rcnn/_base_/mask_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..7001af7ac980eeb8ca688a8e39cca9dfcf950129
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mask_rcnn/_base_/mask_reader.yml
@@ -0,0 +1,41 @@
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], interp: 2, keep_ratio: True}
+ - RandomFlip: {prob: 0.5}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 1
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+ use_shared_memory: true
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
diff --git a/PaddleDetection-release-2.6/configs/mask_rcnn/_base_/optimizer_1x.yml b/PaddleDetection-release-2.6/configs/mask_rcnn/_base_/optimizer_1x.yml
new file mode 100644
index 0000000000000000000000000000000000000000..63f898e9c52556bfa0fbbe9c369900c09ab3f94c
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mask_rcnn/_base_/optimizer_1x.yml
@@ -0,0 +1,19 @@
+epoch: 12
+
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [8, 11]
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 1000
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/mask_rcnn/mask_rcnn_r101_fpn_1x_coco.yml b/PaddleDetection-release-2.6/configs/mask_rcnn/mask_rcnn_r101_fpn_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..aae703c194db64158587baf86d3e6aca60bd8923
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mask_rcnn/mask_rcnn_r101_fpn_1x_coco.yml
@@ -0,0 +1,13 @@
+_BASE_: [
+ 'mask_rcnn_r50_fpn_1x_coco.yml',
+]
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet101_pretrained.pdparams
+weights: output/mask_rcnn_r101_fpn_1x_coco/model_final
+
+ResNet:
+ # index 0 stands for res2
+ depth: 101
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
diff --git a/PaddleDetection-release-2.6/configs/mask_rcnn/mask_rcnn_r101_vd_fpn_1x_coco.yml b/PaddleDetection-release-2.6/configs/mask_rcnn/mask_rcnn_r101_vd_fpn_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..58d7a7884d7886c39544ee56bf445590122d0acc
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mask_rcnn/mask_rcnn_r101_vd_fpn_1x_coco.yml
@@ -0,0 +1,14 @@
+_BASE_: [
+ 'mask_rcnn_r50_fpn_1x_coco.yml',
+]
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet101_vd_pretrained.pdparams
+weights: output/mask_rcnn_r101_vd_fpn_1x_coco/model_final
+
+ResNet:
+ # index 0 stands for res2
+ depth: 101
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
diff --git a/PaddleDetection-release-2.6/configs/mask_rcnn/mask_rcnn_r50_1x_coco.yml b/PaddleDetection-release-2.6/configs/mask_rcnn/mask_rcnn_r50_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..01f4721cb8b139fad640b8fbf884d6df76023f13
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mask_rcnn/mask_rcnn_r50_1x_coco.yml
@@ -0,0 +1,8 @@
+_BASE_: [
+ '../datasets/coco_instance.yml',
+ '../runtime.yml',
+ '_base_/optimizer_1x.yml',
+ '_base_/mask_rcnn_r50.yml',
+ '_base_/mask_reader.yml',
+]
+weights: output/mask_rcnn_r50_1x_coco/model_final
diff --git a/PaddleDetection-release-2.6/configs/mask_rcnn/mask_rcnn_r50_2x_coco.yml b/PaddleDetection-release-2.6/configs/mask_rcnn/mask_rcnn_r50_2x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..f1e6b669bd7a941608c785593314c6e7feff0b59
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mask_rcnn/mask_rcnn_r50_2x_coco.yml
@@ -0,0 +1,15 @@
+_BASE_: [
+ 'mask_rcnn_r50_1x_coco.yml',
+]
+weights: output/mask_rcnn_r50_2x_coco/model_final
+
+epoch: 24
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.3333333333333333
+ steps: 500
diff --git a/PaddleDetection-release-2.6/configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.yml b/PaddleDetection-release-2.6/configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..95e48c2b43d8ec8ca6dc95f2a6a45cf8359bcc49
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.yml
@@ -0,0 +1,8 @@
+_BASE_: [
+ '../datasets/coco_instance.yml',
+ '../runtime.yml',
+ '_base_/optimizer_1x.yml',
+ '_base_/mask_rcnn_r50_fpn.yml',
+ '_base_/mask_fpn_reader.yml',
+]
+weights: output/mask_rcnn_r50_fpn_1x_coco/model_final
diff --git a/PaddleDetection-release-2.6/configs/mask_rcnn/mask_rcnn_r50_fpn_2x_coco.yml b/PaddleDetection-release-2.6/configs/mask_rcnn/mask_rcnn_r50_fpn_2x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..f687fd69b1e0521a825da658f2ad14a33ef4b581
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mask_rcnn/mask_rcnn_r50_fpn_2x_coco.yml
@@ -0,0 +1,15 @@
+_BASE_: [
+ 'mask_rcnn_r50_fpn_1x_coco.yml',
+]
+weights: output/mask_rcnn_r50_fpn_2x_coco/model_final
+
+epoch: 24
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.3333333333333333
+ steps: 500
diff --git a/PaddleDetection-release-2.6/configs/mask_rcnn/mask_rcnn_r50_vd_fpn_1x_coco.yml b/PaddleDetection-release-2.6/configs/mask_rcnn/mask_rcnn_r50_vd_fpn_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d5387417b4bfac35a71b3edf8f062a751dcae3b3
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mask_rcnn/mask_rcnn_r50_vd_fpn_1x_coco.yml
@@ -0,0 +1,15 @@
+_BASE_: [
+ 'mask_rcnn_r50_fpn_1x_coco.yml',
+]
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_pretrained.pdparams
+weights: output/mask_rcnn_r50_vd_fpn_1x_coco/model_final
+
+ResNet:
+ # index 0 stands for res2
+ depth: 50
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
diff --git a/PaddleDetection-release-2.6/configs/mask_rcnn/mask_rcnn_r50_vd_fpn_2x_coco.yml b/PaddleDetection-release-2.6/configs/mask_rcnn/mask_rcnn_r50_vd_fpn_2x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..f85f0299cc2358ca08c548fd3c68eefd108f3d1f
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mask_rcnn/mask_rcnn_r50_vd_fpn_2x_coco.yml
@@ -0,0 +1,26 @@
+_BASE_: [
+ 'mask_rcnn_r50_fpn_1x_coco.yml',
+]
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_pretrained.pdparams
+weights: output/mask_rcnn_r50_vd_fpn_2x_coco/model_final
+
+ResNet:
+ # index 0 stands for res2
+ depth: 50
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+
+epoch: 24
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.3333333333333333
+ steps: 500
diff --git a/PaddleDetection-release-2.6/configs/mask_rcnn/mask_rcnn_r50_vd_fpn_ssld_1x_coco.yml b/PaddleDetection-release-2.6/configs/mask_rcnn/mask_rcnn_r50_vd_fpn_ssld_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..c5718a8d277d442081d91e89787be16c90b5e01a
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mask_rcnn/mask_rcnn_r50_vd_fpn_ssld_1x_coco.yml
@@ -0,0 +1,29 @@
+_BASE_: [
+ '../datasets/coco_instance.yml',
+ '../runtime.yml',
+ '_base_/optimizer_1x.yml',
+ '_base_/mask_rcnn_r50_fpn.yml',
+ '_base_/mask_fpn_reader.yml',
+]
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_v2_pretrained.pdparams
+weights: output/mask_rcnn_r50_vd_fpn_ssld_1x_coco/model_final
+
+ResNet:
+ depth: 50
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ lr_mult_list: [0.05, 0.05, 0.1, 0.15]
+
+epoch: 12
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [8, 11]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
diff --git a/PaddleDetection-release-2.6/configs/mask_rcnn/mask_rcnn_r50_vd_fpn_ssld_2x_coco.yml b/PaddleDetection-release-2.6/configs/mask_rcnn/mask_rcnn_r50_vd_fpn_ssld_2x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..65b31e6f18d9795db1758b651eccef5969b1f74c
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mask_rcnn/mask_rcnn_r50_vd_fpn_ssld_2x_coco.yml
@@ -0,0 +1,29 @@
+_BASE_: [
+ '../datasets/coco_instance.yml',
+ '../runtime.yml',
+ '_base_/optimizer_1x.yml',
+ '_base_/mask_rcnn_r50_fpn.yml',
+ '_base_/mask_fpn_reader.yml',
+]
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_v2_pretrained.pdparams
+weights: output/mask_rcnn_r50_vd_fpn_ssld_2x_coco/model_final
+
+ResNet:
+ depth: 50
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ lr_mult_list: [0.05, 0.05, 0.1, 0.15]
+
+epoch: 24
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [12, 22]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
diff --git a/PaddleDetection-release-2.6/configs/mask_rcnn/mask_rcnn_x101_vd_64x4d_fpn_1x_coco.yml b/PaddleDetection-release-2.6/configs/mask_rcnn/mask_rcnn_x101_vd_64x4d_fpn_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..238750294f9783a59ca6ff9f8bdcb4799865f5fe
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mask_rcnn/mask_rcnn_x101_vd_64x4d_fpn_1x_coco.yml
@@ -0,0 +1,28 @@
+_BASE_: [
+ 'mask_rcnn_r50_fpn_1x_coco.yml',
+]
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNeXt101_vd_64x4d_pretrained.pdparams
+weights: output/mask_rcnn_x101_vd_64x4d_fpn_1x_coco/model_final
+
+ResNet:
+ # for ResNeXt: groups, base_width, base_channels
+ depth: 101
+ variant: d
+ groups: 64
+ base_width: 4
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+
+epoch: 12
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [8, 11]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
diff --git a/PaddleDetection-release-2.6/configs/mask_rcnn/mask_rcnn_x101_vd_64x4d_fpn_2x_coco.yml b/PaddleDetection-release-2.6/configs/mask_rcnn/mask_rcnn_x101_vd_64x4d_fpn_2x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..6a0d0f789972b8f11fc04475b69726d42f150746
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mask_rcnn/mask_rcnn_x101_vd_64x4d_fpn_2x_coco.yml
@@ -0,0 +1,28 @@
+_BASE_: [
+ 'mask_rcnn_r50_fpn_1x_coco.yml',
+]
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNeXt101_vd_64x4d_pretrained.pdparams
+weights: output/mask_rcnn_x101_vd_64x4d_fpn_2x_coco/model_final
+
+ResNet:
+ # for ResNeXt: groups, base_width, base_channels
+ depth: 101
+ variant: d
+ groups: 64
+ base_width: 4
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+
+epoch: 24
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
diff --git a/PaddleDetection-release-2.6/configs/mot/DataDownload.md b/PaddleDetection-release-2.6/configs/mot/DataDownload.md
new file mode 100644
index 0000000000000000000000000000000000000000..a8c9207f8c9d1119b86acf2fdddea5da81e2aa3c
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/DataDownload.md
@@ -0,0 +1,39 @@
+# MOT Dataset Download Summary
+## Contents
+- [Pedestrian Tracking](#pedestrian-tracking)
+- [Vehicle Tracking](#vehicle-tracking)
+- [Head Tracking](#head-tracking)
+- [Multi-class Tracking](#multi-class-tracking)
+
+## Pedestrian Tracking
+
+| Dataset | Download | Notes |
+| :-------------| :-------------| :----: |
+| MOT17 | [download](https://bj.bcebos.com/v1/paddledet/data/mot/MOT17.zip) | - |
+| MOT16 | [download](https://bj.bcebos.com/v1/paddledet/data/mot/MOT16.zip) | - |
+| Caltech | [download](https://bj.bcebos.com/v1/paddledet/data/mot/Caltech.zip) | - |
+| Cityscapes | [download](https://bj.bcebos.com/v1/paddledet/data/mot/Cityscapes.zip) | - |
+| CUHKSYSU | [download](https://bj.bcebos.com/v1/paddledet/data/mot/CUHKSYSU.zip) | - |
+| PRW | [download](https://bj.bcebos.com/v1/paddledet/data/mot/PRW.zip) | - |
+| ETHZ | [download](https://bj.bcebos.com/v1/paddledet/data/mot/ETHZ.zip) | - |
+
+
+## Vehicle Tracking
+
+| Dataset | Download Link | Notes |
+| :-------------| :-------------| :----: |
+| AICity21 | [download](https://bj.bcebos.com/v1/paddledet/data/mot/aic21mtmct_vehicle.zip) | - |
+
+
+## Head Tracking
+
+| Dataset | Download Link | Notes |
+| :-------------| :-------------| :----: |
+| HT21 | [download](https://bj.bcebos.com/v1/paddledet/data/mot/HT21.zip) | - |
+
+
+## Multi-Class Tracking
+
+| Dataset | Download Link | Notes |
+| :-------------| :-------------| :----: |
+| VisDrone-MOT | [download](https://bj.bcebos.com/v1/paddledet/data/mot/visdrone_mcmot.zip) | - |
diff --git a/PaddleDetection-release-2.6/configs/mot/README.md b/PaddleDetection-release-2.6/configs/mot/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..73bf75fdfc31dfaf4344c8a7a5954e2e35c5baad
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/README.md
@@ -0,0 +1,301 @@
+简体中文 | [English](README_en.md)
+
+# Multi-Object Tracking (MOT)
+
+## Contents
+- [Introduction](#introduction)
+- [Installation](#installation)
+- [Model Zoo and Model Selection](#model-zoo-and-model-selection)
+- [MOT Dataset Preparation](#mot-dataset-preparation)
+  - [SDE Datasets](#sde-datasets)
+  - [JDE Datasets](#jde-datasets)
+  - [Custom Dataset Preparation](#custom-dataset-preparation)
+- [Citations](#citations)
+
+
+## Introduction
+Multi-object tracking (MOT) locates multiple objects of interest in a given video or image sequence, maintains each individual's ID across consecutive frames, and records its trajectory.
+The current mainstream approach is tracking-by-detection, which consists of two parts: Detection and Embedding. The Detection part detects potential targets in every frame of the video. The Embedding part assigns and updates the detected targets to existing tracks (i.e. the ReID task), providing long-term association between objects. Depending on how these two parts are implemented, algorithms can be divided into the **SDE** series and the **JDE** series.
+- **SDE** (Separate Detection and Embedding) algorithms fully separate the Detection and Embedding steps; the most representative one is **DeepSORT**. This design lets the system work with any detector without modification and allows the two parts to be tuned independently, but the serial pipeline makes it slow and time-consuming. Some algorithms, such as **ByteTrack**, reduce latency by not using Embedding features to compute appearance similarity, provided the detector is accurate enough.
+- **JDE** (Joint Detection and Embedding) algorithms learn Detection and Embedding simultaneously in a single shared network, with the loss function set up as multi-task learning. Representative algorithms are **JDE** and **FairMOT**. This design balances accuracy and speed and can achieve accurate real-time multi-object tracking.
+
+PaddleDetection provides multiple algorithm implementations from both the SDE and JDE series:
+- SDE
+ - [ByteTrack](./bytetrack)
+ - [OC-SORT](./ocsort)
+ - [BoT-SORT](./botsort)
+ - [DeepSORT](./deepsort)
+ - [CenterTrack](./centertrack)
+- JDE
+ - [JDE](./jde)
+ - [FairMOT](./fairmot)
+ - [MCFairMOT](./mcfairmot)
+
+**Notes:**
+ - The original papers of the algorithms above all target single-class multi-object tracking. The PaddleDetection team additionally supports multi-class multi-object tracking for [ByteTrack](./bytetrack) and FairMOT ([MCFairMOT](./mcfairmot)).
+ - [DeepSORT](./deepsort), [JDE](./jde), [OC-SORT](./ocsort), [BoT-SORT](./botsort) and [CenterTrack](./centertrack) support single-class multi-object tracking only.
+ - [DeepSORT](./deepsort) requires additional ReID weights to run; [ByteTrack](./bytetrack) can run with or without ReID weights, and runs without them by default.
+
+
+### PP-Tracking: A Real-Time Multi-Object Tracking System
+The PaddleDetection team provides [PP-Tracking](../../deploy/pptracking), the industry's first open-source real-time multi-object tracking system based on the PaddlePaddle deep learning framework, with three major advantages: rich models, wide applicability and efficient deployment.
+PP-Tracking supports two modes, single-camera tracking (MOT) and multi-target multi-camera tracking (MTMCT). Targeting the difficulties and pain points of real-world business, it provides various multi-object tracking functions and applications such as pedestrian tracking, vehicle tracking, multi-class tracking, small-object tracking, traffic counting and cross-camera tracking. Deployment supports API calls and a GUI, in both Python and C++, on platforms including Linux and NVIDIA Jetson.
+PP-Tracking uses [FairMOT](./fairmot) for single-camera tracking and [DeepSORT](./deepsort) for cross-camera tracking.
+
+
+
+
+
+
+
+
+
+
+ Video source: VisDrone and BDD100K public datasets
+
+
+#### AI Studio Public Project Tutorial
+Please refer to the tutorial [Hands-on Multi-Object Tracking with PP-Tracking](https://aistudio.baidu.com/aistudio/projectdetail/3022582).
+
+#### Python Inference and Deployment
+Please refer to the [PP-Tracking Python deployment doc](../../deploy/pptracking/python/README.md).
+
+#### C++ Inference and Deployment
+Please refer to the [PP-Tracking C++ deployment doc](../../deploy/pptracking/cpp/README.md).
+
+#### GUI Inference and Deployment
+Please refer to the [PP-Tracking GUI usage doc](https://github.com/yangyudong2020/PP-Tracking_GUi).
+
+
+### PP-Human: A Real-Time Pedestrian Analysis Tool
+The PaddleDetection team provides [PP-Human](../../deploy/pipeline), the industry's first open-source industrial-grade real-time pedestrian analysis tool based on the PaddlePaddle deep learning framework, with three major advantages: rich models, wide applicability and efficient deployment.
+PP-Human supports images, single-camera videos and multi-camera videos as input, and covers multi-object tracking, attribute recognition, behavior analysis, and people counting with trajectory recording. It can be widely applied in fields such as intelligent transportation, smart communities and industrial inspection. It supports server-side deployment and TensorRT acceleration, and achieves real-time performance on a T4 server.
+PP-Human uses [ByteTrack](./bytetrack) for tracking.
+
+
+
+#### AI Studio Public Project Tutorials
+End-to-end PP-Human real-time pedestrian analysis tutorial: [link](https://aistudio.baidu.com/aistudio/projectdetail/3842982).
+
+PP-Human for refined smart-community management tutorial: [link](https://aistudio.baidu.com/aistudio/projectdetail/3679564).
+
+
+
+## Installation
+Install all the MOT-related dependencies in one step:
+```
+pip install -r requirements.txt
+# or manually pip install the MOT-related libraries
+pip install lap motmetrics sklearn
+```
+**Notes:**
+ - Inference requires [ffmpeg](https://ffmpeg.org/ffmpeg.html). On Linux (Ubuntu) it can be installed directly with: `apt-get update && apt-get install -y ffmpeg`.
+
+
+
+## Model Zoo and Model Selection
+- Base models
+  - [ByteTrack](bytetrack/README_cn.md)
+  - [OC-SORT](ocsort/README_cn.md)
+  - [BoT-SORT](botsort/README_cn.md)
+  - [DeepSORT](deepsort/README_cn.md)
+  - [JDE](jde/README_cn.md)
+  - [FairMOT](fairmot/README_cn.md)
+  - [CenterTrack](centertrack/README_cn.md)
+- Feature vertical models
+  - [Pedestrian tracking](pedestrian/README_cn.md)
+  - [Head tracking](headtracking21/README_cn.md)
+  - [Vehicle tracking](vehicle/README_cn.md)
+- Multi-class tracking
+  - [Multi-class tracking](mcfairmot/README_cn.md)
+- Multi-target multi-camera tracking
+  - [MTMCT](mtmct/README_cn.md)
+
+### Model Selection Summary
+
+The PaddleDetection team's recommendations for model selection are summarized as follows:
+
+| MOT approach | Representative algorithms | Pipeline | Dataset requirements | Other characteristics |
+| :--------------| :--------------| :------- | :----: | :----: |
+| SDE series | DeepSORT, ByteTrack, OC-SORT, BoT-SORT, CenterTrack | Separate: two independent model weights, detection first then ReID (ReID optional) | Detection and ReID data are relatively independent; without ReID only a detection dataset is needed | Detection and ReID can be tuned separately; robust; common in AI competitions |
+| JDE series | FairMOT, JDE | Joint: one set of model weights detects and does ReID end to end | Requires both detection and ReID annotations | Detection and ReID are trained jointly; hard to tune; weaker generalization |
+
+**Notes:**
+ - Since data annotation is costly, consider the **dataset requirements** first when selecting a model. If the dataset has only detection-box annotations and no ReID annotations, JDE-series algorithms cannot be trained, and the SDE series is recommended.
+ - When the detector is accurate enough, SDE-series algorithms can also skip the ReID weights for long-term association between objects; see [ByteTrack](bytetrack).
+ - Latency correlates with model size and compute. In theory: `SDE without ReID < JDE < SDE with ReID`.
+
+
+
+## MOT Dataset Preparation
+The PaddleDetection team provides download links for many public or curated datasets; see the [dataset download summary](DataDownload.md).
+
+Based on the model selection summary, MOT datasets fall into two categories: datasets with detection-box annotations only, usable by the SDE series only; and datasets with both detection and ReID annotations, usable by both the SDE and JDE series.
+
+### SDE Datasets
+SDE datasets contain detection annotations only. Custom datasets can be prepared by following the [detection data preparation doc](../../docs/tutorials/data/PrepareDetDataSet.md).
+
+Taking the MOT17 dataset as an example, download and unzip it under the `PaddleDetection/dataset/mot` directory:
+```
+wget https://bj.bcebos.com/v1/paddledet/data/mot/MOT17.zip
+```
+Then modify the dataset section of the config file as follows:
+```
+num_classes: 1
+
+TrainDataset:
+ !COCODataSet
+ dataset_dir: dataset/mot/MOT17
+ anno_path: annotations/train_half.json
+ image_dir: images/train
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ dataset_dir: dataset/mot/MOT17
+ anno_path: annotations/val_half.json
+ image_dir: images/train
+
+TestDataset:
+ !ImageFolder
+ dataset_dir: dataset/mot/MOT17
+ anno_path: annotations/val_half.json
+```
+
+The dataset directory is:
+```
+dataset/mot
+ |——————MOT17
+ |——————annotations
+ |——————images
+```
+
+### JDE Datasets
+JDE datasets have both detection and ReID annotations. First download `image_lists.zip` with the following command and unzip it under the `PaddleDetection/dataset/mot` directory:
+```
+wget https://bj.bcebos.com/v1/paddledet/data/mot/image_lists.zip
+```
+
+Then the public datasets can be downloaded quickly with the following commands, and should also be unzipped under the `PaddleDetection/dataset/mot` directory:
+```
+# MIX data, the dataset used in the JDE and FairMOT papers
+wget https://bj.bcebos.com/v1/paddledet/data/mot/MOT17.zip
+wget https://bj.bcebos.com/v1/paddledet/data/mot/Caltech.zip
+wget https://bj.bcebos.com/v1/paddledet/data/mot/CUHKSYSU.zip
+wget https://bj.bcebos.com/v1/paddledet/data/mot/PRW.zip
+wget https://bj.bcebos.com/v1/paddledet/data/mot/Cityscapes.zip
+wget https://bj.bcebos.com/v1/paddledet/data/mot/ETHZ.zip
+wget https://bj.bcebos.com/v1/paddledet/data/mot/MOT16.zip
+```
+The dataset directory is:
+```
+dataset/mot
+ |——————image_lists
+ |——————caltech.all
+ |——————citypersons.train
+ |——————cuhksysu.train
+ |——————eth.train
+ |——————mot16.train
+ |——————mot17.train
+ |——————prw.train
+ |——————Caltech
+ |——————Cityscapes
+ |——————CUHKSYSU
+ |——————ETHZ
+ |——————MOT16
+ |——————MOT17
+ |——————PRW
+```
+
+#### JDE Dataset Format
+These related datasets all follow the structure below:
+```
+MOT17
+ |——————images
+ | └——————train
+ | └——————test
+ └——————labels_with_ids
+ └——————train
+```
+Annotations for all datasets are provided in a unified format. Every image has a corresponding annotation text file; given an image path, the annotation path can be generated by replacing the string `images` with `labels_with_ids` and replacing `.jpg` with `.txt`. In the annotation text, each line describes one bounding box in the following format:
+```
+[class] [identity] [x_center] [y_center] [width] [height]
+```
+ - `class` is the class id, starting from `0`; both single-class and multi-class tracking are supported, and for single-class it is always `0`.
+ - `identity` is an integer from `1` to `num_identities` (`num_identities` is the total number of distinct object instances across all videos or image sequences in the dataset), or `-1` if the box has no identity annotation.
+ - `[x_center] [y_center] [width] [height]` are the center coordinates, width and height, normalized by the image width/height, so they are floating-point numbers ranging from 0 to 1.
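A quick way to sanity-check labels in this format is to decode them back to pixel coordinates. The sketch below is ours (the helper name is not part of PaddleDetection) and assumes the normalized format described above:

```python
def parse_jde_label(line, img_w, img_h):
    """Decode one `[class] [identity] [xc] [yc] [w] [h]` label line
    into (class_id, track_id, (x1, y1, w, h)) in pixels."""
    cls, identity, xc, yc, w, h = line.split()
    xc, yc = float(xc) * img_w, float(yc) * img_h
    w, h = float(w) * img_w, float(h) * img_h
    # convert the normalized center/size to a top-left-corner pixel box
    return int(cls), int(identity), (xc - w / 2, yc - h / 2, w, h)

print(parse_jde_label("0 1 0.5 0.5 0.25 0.5", img_w=1920, img_h=1080))
# (0, 1, (720.0, 270.0, 480.0, 540.0))
```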
+
+
+**Notes:**
+ - The MIX dataset is the dataset used in the original [JDE](https://github.com/Zhongdao/Towards-Realtime-MOT) and [FairMOT](https://github.com/ifzhang/FairMOT) papers, comprising **Caltech Pedestrian, CityPersons, CUHK-SYSU, PRW, ETHZ, MOT17 and MOT16**. The first six are used as the joint training set and MOT16 as the evaluation set. If you want to use these datasets, please **follow their licenses**.
+ - The MIX dataset and its sub-datasets are all single-class pedestrian tracking datasets; they can be viewed as pedestrian detection datasets with additional ID annotations.
+ - Vertical models for more scenarios, such as vehicle, pedestrian and head tracking, also require their datasets to be processed into the same format as the MIX dataset; see the [dataset download summary](DataDownload.md), [vehicle tracking](vehicle/README_cn.md), [head tracking](headtracking21/README_cn.md) and the more general [pedestrian tracking](pedestrian/README_cn.md).
+ - Custom datasets can be prepared by following the [MOT dataset preparation tutorial](../../docs/tutorials/PrepareMOTDataSet_cn.md).
+
+
+### Custom Dataset Preparation
+To prepare a custom dataset, please refer to the [MOT dataset preparation tutorial](../../docs/tutorials/PrepareMOTDataSet_cn.md).
+
+## Citations
+```
+@inproceedings{Wojke2017simple,
+ title={Simple Online and Realtime Tracking with a Deep Association Metric},
+ author={Wojke, Nicolai and Bewley, Alex and Paulus, Dietrich},
+ booktitle={2017 IEEE International Conference on Image Processing (ICIP)},
+ year={2017},
+ pages={3645--3649},
+ organization={IEEE},
+ doi={10.1109/ICIP.2017.8296962}
+}
+
+@inproceedings{Wojke2018deep,
+ title={Deep Cosine Metric Learning for Person Re-identification},
+ author={Wojke, Nicolai and Bewley, Alex},
+ booktitle={2018 IEEE Winter Conference on Applications of Computer Vision (WACV)},
+ year={2018},
+ pages={748--756},
+ organization={IEEE},
+ doi={10.1109/WACV.2018.00087}
+}
+
+@article{wang2019towards,
+ title={Towards Real-Time Multi-Object Tracking},
+ author={Wang, Zhongdao and Zheng, Liang and Liu, Yixuan and Wang, Shengjin},
+ journal={arXiv preprint arXiv:1909.12605},
+ year={2019}
+}
+
+@article{zhang2020fair,
+ title={FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking},
+ author={Zhang, Yifu and Wang, Chunyu and Wang, Xinggang and Zeng, Wenjun and Liu, Wenyu},
+ journal={arXiv preprint arXiv:2004.01888},
+ year={2020}
+}
+
+@article{zhang2021bytetrack,
+ title={ByteTrack: Multi-Object Tracking by Associating Every Detection Box},
+ author={Zhang, Yifu and Sun, Peize and Jiang, Yi and Yu, Dongdong and Yuan, Zehuan and Luo, Ping and Liu, Wenyu and Wang, Xinggang},
+ journal={arXiv preprint arXiv:2110.06864},
+ year={2021}
+}
+
+@article{cao2022observation,
+ title={Observation-Centric SORT: Rethinking SORT for Robust Multi-Object Tracking},
+ author={Cao, Jinkun and Weng, Xinshuo and Khirodkar, Rawal and Pang, Jiangmiao and Kitani, Kris},
+ journal={arXiv preprint arXiv:2203.14360},
+ year={2022}
+}
+
+@article{aharon2022bot,
+ title={BoT-SORT: Robust Associations Multi-Pedestrian Tracking},
+ author={Aharon, Nir and Orfaig, Roy and Bobrovsky, Ben-Zion},
+ journal={arXiv preprint arXiv:2206.14651},
+ year={2022}
+}
+
+@article{zhou2020tracking,
+ title={Tracking Objects as Points},
+ author={Zhou, Xingyi and Koltun, Vladlen and Kr{\"a}henb{\"u}hl, Philipp},
+ journal={ECCV},
+ year={2020}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/mot/README_en.md b/PaddleDetection-release-2.6/configs/mot/README_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..3ae5444eb885591bab53bae7754dd35563a29964
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/README_en.md
@@ -0,0 +1,217 @@
+English | [简体中文](README.md)
+
+# MOT (Multi-Object Tracking)
+
+## Table of Contents
+- [Introduction](#Introduction)
+- [Installation](#Installation)
+- [Model Zoo](#Model_Zoo)
+- [Dataset Preparation](#Dataset_Preparation)
+- [Citations](#Citations)
+
+## Introduction
+Current mainstream tracking-by-detection multi-object tracking (MOT) algorithms are composed of two parts: detection and embedding. Detection detects the potential targets in each frame of the video. Embedding assigns and updates the detected targets to the corresponding tracks (the ReID task). According to how these two parts are implemented, algorithms can be divided into the **SDE** series and the **JDE** series.
+
+- **SDE** (Separate Detection and Embedding) algorithms completely separate detection and embedding. The most representative is **DeepSORT**. This design lets the system fit any kind of detector without modification, and each part can be improved separately. However, the serial pipeline makes it slow; latency is a great challenge when building a real-time MOT system.
+- **JDE** (Joint Detection and Embedding) algorithms learn detection and embedding simultaneously in a shared neural network, with the loss function set up as multi-task learning. The representative algorithms are **JDE** and **FairMOT**. This design can achieve high-precision real-time MOT performance.
+
+PaddleDetection implements three MOT algorithms from these two series: [DeepSORT](https://arxiv.org/abs/1812.00442) from the SDE series, and [JDE](https://arxiv.org/abs/1909.12605) and [FairMOT](https://arxiv.org/abs/2004.01888) from the JDE series.
+
+### PP-Tracking real-time MOT system
+In addition, PaddleDetection also provides the [PP-Tracking](../../deploy/pptracking/README.md) real-time multi-object tracking system.
+PP-Tracking is the first open source real-time Multi-Object Tracking system, and it is based on PaddlePaddle deep learning framework. It has rich models, wide application and high efficiency deployment.
+
+PP-Tracking supports two paradigms: single-camera tracking (MOT) and multi-target multi-camera tracking (MTMCT). Targeting the difficulties and pain points of real-world business, PP-Tracking provides various MOT functions and applications such as pedestrian tracking, vehicle tracking, multi-class tracking, small-object tracking, traffic statistics and multi-camera tracking. Deployment supports API calls and a GUI visual interface, the deployment languages are Python and C++, and the supported platforms include Linux, NVIDIA Jetson, etc.
+
+### AI Studio Public Project Tutorial
+PP-Tracking provides an AI Studio public project tutorial. Please refer to this [tutorial](https://aistudio.baidu.com/aistudio/projectdetail/3022582).
+
+### Python Prediction and Deployment
+PP-Tracking supports Python prediction and deployment. Please refer to this [doc](../../deploy/pptracking/python/README.md).
+
+### C++ Prediction and Deployment
+PP-Tracking supports C++ prediction and deployment. Please refer to this [doc](../../deploy/pptracking/cpp/README.md).
+
+### GUI Prediction and Deployment
+PP-Tracking supports GUI prediction and deployment. Please refer to this [doc](https://github.com/yangyudong2020/PP-Tracking_GUi).
+
+
+
+
+
+
+
+
+
+
+ Video source: VisDrone and BDD100K public datasets
+
+
+
+## Installation
+Install all the related dependencies for MOT:
+```
+pip install -r requirements.txt
+# or manually install the MOT-related libraries
+pip install lap motmetrics sklearn
+```
+**Notes:**
+- Please make sure that [ffmpeg](https://ffmpeg.org/ffmpeg.html) is installed first, on Linux(Ubuntu) platform you can directly install it by the following command:`apt-get update && apt-get install -y ffmpeg`.
+
+
+## Model Zoo
+- Base models
+ - [ByteTrack](bytetrack/README.md)
+ - [OC-SORT](ocsort/README.md)
+ - [BoT-SORT](botsort/README.md)
+ - [DeepSORT](deepsort/README.md)
+ - [JDE](jde/README.md)
+ - [FairMOT](fairmot/README.md)
+ - [CenterTrack](centertrack/README.md)
+- Feature models
+ - [Pedestrian](pedestrian/README.md)
+ - [Head](headtracking21/README.md)
+ - [Vehicle](vehicle/README.md)
+- Multi-Class Tracking
+ - [MCFairMOT](mcfairmot/README.md)
+- Multi-Target Multi-Camera Tracking
+ - [MTMCT](mtmct/README.md)
+
+
+## Dataset Preparation
+### MOT Dataset
+PaddleDetection implements [JDE](https://github.com/Zhongdao/Towards-Realtime-MOT) and [FairMOT](https://github.com/ifzhang/FairMOT), and uses the same training data as they do, named 'MIX', including **Caltech Pedestrian, CityPersons, CUHK-SYSU, PRW, ETHZ, MOT17 and MOT16**. The former six are used as the mixed dataset for training, and MOT16 is used as the evaluation dataset. If you want to use these datasets, please **follow their licenses**.
+
+**Notes:**
+- Multi-object tracking (MOT) datasets are usually used for single-class tracking. DeepSORT, JDE and FairMOT are single-class MOT models. The 'MIX' dataset and its sub-datasets are also single-class pedestrian tracking datasets; they can be viewed as detection datasets with additional ID ground truth.
+- In order to train feature models for more scenes, more datasets have been processed into the same format as the MIX dataset. The PaddleDetection team also provides feature datasets and models for [vehicle tracking](vehicle/README.md), [head tracking](headtracking21/README.md) and the more general [pedestrian tracking](pedestrian/README.md). User-defined datasets can be prepared by referring to the data preparation [doc](../../docs/tutorials/data/PrepareMOTDataSet.md).
+- The multi-class MOT model is [MCFairMOT](mcfairmot/README.md), and the multi-class dataset is an integrated version of the VisDrone dataset. Please refer to the doc of [MCFairMOT](mcfairmot/README.md).
+- The Multi-Target Multi-Camera Tracking (MTMCT) model is trained on the [AIC21 MTMCT](https://www.aicitychallenge.org) (CityFlow) multi-camera vehicle tracking dataset. For the dataset and model, please refer to the doc of [MTMCT](mtmct/README.md).
+
+### Dataset Directory
+First, download the image_lists.zip using the following command, and unzip them into `PaddleDetection/dataset/mot`:
+```
+wget https://bj.bcebos.com/v1/paddledet/data/mot/image_lists.zip
+```
+
+Then, download the MIX dataset using the following command, and unzip them into `PaddleDetection/dataset/mot`:
+```
+wget https://bj.bcebos.com/v1/paddledet/data/mot/MOT17.zip
+wget https://bj.bcebos.com/v1/paddledet/data/mot/Caltech.zip
+wget https://bj.bcebos.com/v1/paddledet/data/mot/CUHKSYSU.zip
+wget https://bj.bcebos.com/v1/paddledet/data/mot/PRW.zip
+wget https://bj.bcebos.com/v1/paddledet/data/mot/Cityscapes.zip
+wget https://bj.bcebos.com/v1/paddledet/data/mot/ETHZ.zip
+wget https://bj.bcebos.com/v1/paddledet/data/mot/MOT16.zip
+```
+
+The final directory is:
+```
+dataset/mot
+ |——————image_lists
+ |——————caltech.10k.val
+ |——————caltech.all
+ |——————caltech.train
+ |——————caltech.val
+ |——————citypersons.train
+ |——————citypersons.val
+ |——————cuhksysu.train
+ |——————cuhksysu.val
+ |——————eth.train
+ |——————mot16.train
+ |——————mot17.train
+ |——————prw.train
+ |——————prw.val
+ |——————Caltech
+ |——————Cityscapes
+ |——————CUHKSYSU
+ |——————ETHZ
+ |——————MOT16
+ |——————MOT17
+ |——————PRW
+```
+
+### Data Format
+These several relevant datasets have the following structure:
+```
+MOT17
+ |——————images
+ | └——————train
+ | └——————test
+ └——————labels_with_ids
+ └——————train
+```
+Annotations of these datasets are provided in a unified format. Every image has a corresponding annotation text. Given an image path, the annotation text path can be generated by replacing the string `images` with `labels_with_ids` and replacing `.jpg` with `.txt`.
+
+In the annotation text, each line describes a bounding box in the following format:
+```
+[class] [identity] [x_center] [y_center] [width] [height]
+```
+**Notes:**
+- `class` is the class id, starting from `0`; both single-class and multi-class tracking are supported, and for single-class it is always `0`.
+- `identity` is an integer from `1` to `num_identities` (`num_identities` is the total number of distinct object instances across all videos or image sequences in the dataset), or `-1` if this box has no identity annotation.
+- `[x_center] [y_center] [width] [height]` are the center coordinates, width and height; note that they are normalized by the image width/height, so they are floating-point numbers ranging from 0 to 1.
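The `images` → `labels_with_ids` path convention described above can be sketched in a few lines of Python (an illustrative helper of ours, not part of the PaddleDetection API):

```python
def label_path_for(image_path: str) -> str:
    """Map a JDE image path to its annotation text path."""
    return (image_path.replace("images", "labels_with_ids")
                      .replace(".jpg", ".txt"))

print(label_path_for("MOT17/images/train/MOT17-02/img1/000001.jpg"))
# MOT17/labels_with_ids/train/MOT17-02/img1/000001.txt
```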
+
+
+## Citations
+```
+@inproceedings{Wojke2017simple,
+ title={Simple Online and Realtime Tracking with a Deep Association Metric},
+ author={Wojke, Nicolai and Bewley, Alex and Paulus, Dietrich},
+ booktitle={2017 IEEE International Conference on Image Processing (ICIP)},
+ year={2017},
+ pages={3645--3649},
+ organization={IEEE},
+ doi={10.1109/ICIP.2017.8296962}
+}
+
+@inproceedings{Wojke2018deep,
+ title={Deep Cosine Metric Learning for Person Re-identification},
+ author={Wojke, Nicolai and Bewley, Alex},
+ booktitle={2018 IEEE Winter Conference on Applications of Computer Vision (WACV)},
+ year={2018},
+ pages={748--756},
+ organization={IEEE},
+ doi={10.1109/WACV.2018.00087}
+}
+
+@article{wang2019towards,
+ title={Towards Real-Time Multi-Object Tracking},
+ author={Wang, Zhongdao and Zheng, Liang and Liu, Yixuan and Wang, Shengjin},
+ journal={arXiv preprint arXiv:1909.12605},
+ year={2019}
+}
+
+@article{zhang2020fair,
+ title={FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking},
+ author={Zhang, Yifu and Wang, Chunyu and Wang, Xinggang and Zeng, Wenjun and Liu, Wenyu},
+ journal={arXiv preprint arXiv:2004.01888},
+ year={2020}
+}
+
+@article{zhang2021bytetrack,
+ title={ByteTrack: Multi-Object Tracking by Associating Every Detection Box},
+ author={Zhang, Yifu and Sun, Peize and Jiang, Yi and Yu, Dongdong and Yuan, Zehuan and Luo, Ping and Liu, Wenyu and Wang, Xinggang},
+ journal={arXiv preprint arXiv:2110.06864},
+ year={2021}
+}
+
+@article{cao2022observation,
+ title={Observation-Centric SORT: Rethinking SORT for Robust Multi-Object Tracking},
+ author={Cao, Jinkun and Weng, Xinshuo and Khirodkar, Rawal and Pang, Jiangmiao and Kitani, Kris},
+ journal={arXiv preprint arXiv:2203.14360},
+ year={2022}
+}
+
+@article{aharon2022bot,
+ title={BoT-SORT: Robust Associations Multi-Pedestrian Tracking},
+ author={Aharon, Nir and Orfaig, Roy and Bobrovsky, Ben-Zion},
+ journal={arXiv preprint arXiv:2206.14651},
+ year={2022}
+}
+
+@article{zhou2020tracking,
+ title={Tracking Objects as Points},
+ author={Zhou, Xingyi and Koltun, Vladlen and Kr{\"a}henb{\"u}hl, Philipp},
+ journal={ECCV},
+ year={2020}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/mot/botsort/README.md b/PaddleDetection-release-2.6/configs/mot/botsort/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..5cf981ca51fc682859c8a3d80f3a34dad36e54a0
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/botsort/README.md
@@ -0,0 +1,89 @@
+English | [简体中文](README_cn.md)
+
+# BOT_SORT (BoT-SORT: Robust Associations Multi-Pedestrian Tracking)
+
+## Contents
+- [Introduction](#introduction)
+- [Model Zoo](#model-zoo)
+- [Quick Start](#quick-start)
+- [Citation](#citation)
+
+## Introduction
+[BOT_SORT](https://arxiv.org/pdf/2206.14651v2.pdf) (BoT-SORT: Robust Associations Multi-Pedestrian Tracking). Configurations for commonly used detectors are provided here for reference. Since differences in training datasets, input scales, number of training epochs, NMS threshold settings, etc. lead to differences in model accuracy and performance, please adapt them to your own needs.
+
+## Model Zoo
+
+### BOT_SORT Results on the MOT-17 half Val Set
+
+| Training Dataset | Detector | Input Size | Detector mAP | MOTA | IDF1 | Config |
+| :-------- | :----- | :----: | :------: | :----: |:-----: |:----: |
+| MOT-17 half train | PP-YOLOE-l | 640x640 | 52.7 | 55.5 | 64.2 |[config](./botsort_ppyoloe.yml) |
+
+
+**Attention:**
+ - The model weights download link is the `det_weights` field in the config file; running the evaluation command will download the weights automatically.
+ - **MOT17-half train** is a dataset composed of the images and labels of the first half of the frames of each video in the MOT17 train sequences (7 in total). To verify accuracy, use the **MOT17-half val** set for evaluation, which is composed of the second half of the frames of each video; it can be downloaded from this [link](https://bj.bcebos.com/v1/paddledet/data/mot/MOT17.zip) and unzipped into `dataset/mot/`.
+
+ - BOT_SORT training means training a detector alone on the MOT dataset; at inference time the tracker is assembled to evaluate MOT metrics, and the standalone detection model can also be evaluated on detection metrics.
+ - For export and deployment, BOT_SORT exports the detection model alone and then assembles the tracker at run time; refer to [PP-Tracking](../../../deploy/pptracking/python).
+ - BOT_SORT is the main tracking scheme of pipeline analysis projects such as PP-Human and PP-Vehicle; for usage, refer to [Pipeline](../../../deploy/pipeline) and [MOT](../../../deploy/pipeline/docs/tutorials/pphuman_mot.md).
+
+
+## Quick Start
+
+### 1. Training
+Start training and evaluation with the following command:
+```bash
+# single-GPU training
+CUDA_VISIBLE_DEVICES=0 python tools/train.py -c configs/mot/bytetrack/detector/ppyoloe_crn_l_36e_640x640_mot17half.yml --eval --amp
+
+# multi-GPU training
+python -m paddle.distributed.launch --log_dir=ppyoloe --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/mot/bytetrack/detector/ppyoloe_crn_l_36e_640x640_mot17half.yml --eval --amp
+```
+
+### 2. Evaluation
+#### 2.1 Detection
+```bash
+CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/mot/bytetrack/detector/ppyoloe_crn_l_36e_640x640_mot17half.yml
+```
+
+**Attention:**
+ - Use ```tools/eval.py``` to evaluate detection and ```tools/eval_mot.py``` to evaluate tracking.
+
+#### 2.2 Tracking
+```bash
+CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/botsort/botsort_ppyoloe.yml --scaled=True
+```
+**Attention:**
+ - `--scaled` indicates whether the coordinates in the model output have already been scaled back to the original image. Set it to False if the detection model is JDE YOLOv3, and True for general detection models; the default is False.
+ - Tracking results are saved in `{output_dir}/mot_results/`, where each video sequence corresponds to one txt file and each line of a txt file is `frame,id,x1,y1,w,h,score,-1,-1,-1`; `{output_dir}` can be set via `--output_dir`.
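As an illustration of the result format above, the per-line records can be grouped by frame with a few lines of Python (a sketch of ours, not part of the PaddleDetection toolkit):

```python
from collections import defaultdict

def load_mot_results(lines):
    """Group `frame,id,x1,y1,w,h,score,-1,-1,-1` result lines by frame id."""
    per_frame = defaultdict(list)
    for line in lines:
        frame, tid, x1, y1, w, h, score = line.strip().split(",")[:7]
        per_frame[int(frame)].append({
            "id": int(tid),
            "bbox": (float(x1), float(y1), float(w), float(h)),  # x1, y1, w, h
            "score": float(score),
        })
    return per_frame

results = load_mot_results(["1,3,100,50,40,80,0.9,-1,-1,-1"])
print(results[1])
# [{'id': 3, 'bbox': (100.0, 50.0, 40.0, 80.0), 'score': 0.9}]
```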
+
+### 3. Export the detection model
+
+```bash
+python tools/export_model.py -c configs/mot/bytetrack/detector/ppyoloe_crn_l_36e_640x640_mot17half.yml --output_dir=output_inference -o weights=https://bj.bcebos.com/v1/paddledet/models/mot/ppyoloe_crn_l_36e_640x640_mot17half.pdparams
+```
+
+### 4. Predict with the exported model
+
+```bash
+# download demo video
+wget https://bj.bcebos.com/v1/paddledet/data/mot/demo/mot17_demo.mp4
+
+CUDA_VISIBLE_DEVICES=0 python deploy/pptracking/python/mot_sde_infer.py --model_dir=output_inference/ppyoloe_crn_l_36e_640x640_mot17half --tracker_config=deploy/pptracking/python/tracker_config.yml --video_file=mot17_demo.mp4 --device=GPU --threshold=0.5
+```
+**Attention:**
+ - Before running, manually change the tracker type in `tracker_config.yml` to `type: BOTSORTTracker` if you want to use BOT_SORT.
+ - The tracking model predicts on videos; prediction on a single image is not supported. By default the video with visualized tracking results is saved. You can add `--save_mot_txts` (save one txt per video) or `--save_mot_txt_per_img` (save one txt per image) to save tracking result txt files, or `--save_images` to save the visualized images.
+ - Each line of the tracking result txt file is `frame,id,x1,y1,w,h,score,-1,-1,-1`.
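For reference, switching the tracker in `deploy/pptracking/python/tracker_config.yml` looks roughly like the sketch below. The `BOTSORTTracker` values mirror those in `botsort_ppyoloe.yml`; the exact schema of your local `tracker_config.yml` may differ, so treat this as an illustration rather than a drop-in file:

```yaml
# deploy/pptracking/python/tracker_config.yml (sketch)
type: BOTSORTTracker   # switch the tracker type to BoT-SORT

BOTSORTTracker:
  track_high_thresh: 0.3
  track_low_thresh: 0.2
  new_track_thresh: 0.4
  match_thresh: 0.7
  track_buffer: 30
  min_box_area: 0
  camera_motion: False
```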
+
+
+## Citation
+```
+@article{aharon2022bot,
+ title={BoT-SORT: Robust Associations Multi-Pedestrian Tracking},
+ author={Aharon, Nir and Orfaig, Roy and Bobrovsky, Ben-Zion},
+ journal={arXiv preprint arXiv:2206.14651},
+ year={2022}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/mot/botsort/README_cn.md b/PaddleDetection-release-2.6/configs/mot/botsort/README_cn.md
new file mode 100644
index 0000000000000000000000000000000000000000..5c92653db2fda7c182ab887608f33755a01b7c66
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/botsort/README_cn.md
@@ -0,0 +1,89 @@
+简体中文 | [English](README.md)
+
+# BOT_SORT (BoT-SORT: Robust Associations Multi-Pedestrian Tracking)
+
+## Contents
+- [Introduction](#introduction)
+- [Model Zoo](#model-zoo)
+- [Quick Start](#quick-start)
+- [Citation](#citation)
+
+## Introduction
+[BOT_SORT](https://arxiv.org/pdf/2206.14651v2.pdf) (BoT-SORT: Robust Associations Multi-Pedestrian Tracking). Configurations for commonly used detectors are provided here for reference. Since differences in training datasets, input scales, number of training epochs, NMS threshold settings, etc. lead to differences in model accuracy and performance, please adapt them to your own needs.
+
+## Model Zoo
+
+### BOT_SORT Results on the MOT-17 half Val Set
+
+| Training Dataset | Detector | Input Size | Detector mAP | MOTA | IDF1 | Config |
+| :-------- | :----- | :----: | :------: | :----: |:-----: |:----: |
+| MOT-17 half train | PP-YOLOE-l | 640x640 | 52.7 | 55.5 | 64.2 | [config](./botsort_ppyoloe.yml) |
+
+
+**Notes:**
+ - The model weights download link is the ```det_weights``` field in the config file; running the evaluation command will download the weights automatically.
+ - **MOT17-half train** is a dataset composed of the images and labels of the first half of the frames of each video in the MOT17 train sequences (7 in total). To verify accuracy, use the **MOT17-half val** set for evaluation, which is composed of the second half of the frames of each video; it can be downloaded from this [link](https://bj.bcebos.com/v1/paddledet/data/mot/MOT17.zip) and unzipped into `dataset/mot/`.
+
+ - BOT_SORT training means training a detector alone on the MOT dataset; at inference time the tracker is assembled to evaluate MOT metrics, and the standalone detection model can also be evaluated on detection metrics.
+ - For export and deployment, BOT_SORT exports the detection model alone and then assembles the tracker at run time; refer to [PP-Tracking](../../../deploy/pptracking/python).
+ - BOT_SORT is the main tracking scheme of pipeline analysis projects such as PP-Human and PP-Vehicle; for usage, refer to [Pipeline](../../../deploy/pipeline) and [MOT](../../../deploy/pipeline/docs/tutorials/pphuman_mot.md).
+
+
+## Quick Start
+
+### 1. Training
+Start training and evaluation with the following command:
+```bash
+# single-GPU training
+CUDA_VISIBLE_DEVICES=0 python tools/train.py -c configs/mot/bytetrack/detector/ppyoloe_crn_l_36e_640x640_mot17half.yml --eval --amp
+
+# multi-GPU training
+python -m paddle.distributed.launch --log_dir=ppyoloe --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/mot/bytetrack/detector/ppyoloe_crn_l_36e_640x640_mot17half.yml --eval --amp
+```
+
+### 2. Evaluation
+#### 2.1 Evaluate detection
+```bash
+CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/mot/bytetrack/detector/ppyoloe_crn_l_36e_640x640_mot17half.yml
+```
+
+**Notes:**
+ - Use ```tools/eval.py``` to evaluate detection and ```tools/eval_mot.py``` to evaluate tracking.
+
+#### 2.2 Evaluate tracking
+```bash
+CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/botsort/botsort_ppyoloe.yml --scaled=True
+```
+**Notes:**
+ - `--scaled` indicates whether the coordinates in the model output have already been scaled back to the original image. Set it to False if the detection model is JDE YOLOv3, and True for general detection models; the default is False.
+ - Tracking results are saved in `{output_dir}/mot_results/`, where each video sequence corresponds to one txt file and each line of a txt file is `frame,id,x1,y1,w,h,score,-1,-1,-1`; `{output_dir}` can be set via `--output_dir`.
+
+### 3. Export the detection model
+
+```bash
+python tools/export_model.py -c configs/mot/bytetrack/detector/ppyoloe_crn_l_36e_640x640_mot17half.yml --output_dir=output_inference -o weights=https://bj.bcebos.com/v1/paddledet/models/mot/ppyoloe_crn_l_36e_640x640_mot17half.pdparams
+```
+
+### 4. Predict with the exported model in Python
+
+```bash
+# download the demo video
+wget https://bj.bcebos.com/v1/paddledet/data/mot/demo/mot17_demo.mp4
+
+CUDA_VISIBLE_DEVICES=0 python deploy/pptracking/python/mot_sde_infer.py --model_dir=output_inference/ppyoloe_crn_l_36e_640x640_mot17half --tracker_config=deploy/pptracking/python/tracker_config.yml --video_file=mot17_demo.mp4 --device=GPU --threshold=0.5
+```
+**Notes:**
+ - Before running, manually change the tracker type in `tracker_config.yml` to `type: BOTSORTTracker`.
+ - The tracking model predicts on videos; prediction on a single image is not supported. By default the video with visualized tracking results is saved. You can add `--save_mot_txts` (save one txt per video) or `--save_mot_txt_per_img` (save one txt per image) to save tracking result txt files, or `--save_images` to save the visualized images.
+ - Each line of the tracking result txt file is `frame,id,x1,y1,w,h,score,-1,-1,-1`.
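For reference, switching the tracker in `deploy/pptracking/python/tracker_config.yml` looks roughly like the sketch below. The `BOTSORTTracker` values mirror those in `botsort_ppyoloe.yml`; the exact schema of your local `tracker_config.yml` may differ, so treat this as an illustration rather than a drop-in file:

```yaml
# deploy/pptracking/python/tracker_config.yml (sketch)
type: BOTSORTTracker   # switch the tracker type to BoT-SORT

BOTSORTTracker:
  track_high_thresh: 0.3
  track_low_thresh: 0.2
  new_track_thresh: 0.4
  match_thresh: 0.7
  track_buffer: 30
  min_box_area: 0
  camera_motion: False
```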
+
+
+## Citation
+```
+@article{aharon2022bot,
+ title={BoT-SORT: Robust Associations Multi-Pedestrian Tracking},
+ author={Aharon, Nir and Orfaig, Roy and Bobrovsky, Ben-Zion},
+ journal={arXiv preprint arXiv:2206.14651},
+ year={2022}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/mot/botsort/botsort_ppyoloe.yml b/PaddleDetection-release-2.6/configs/mot/botsort/botsort_ppyoloe.yml
new file mode 100644
index 0000000000000000000000000000000000000000..5df704dbdda40742654696aa21fd6e872beda855
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/botsort/botsort_ppyoloe.yml
@@ -0,0 +1,75 @@
+# This config is an assembled config for ByteTrack MOT, used as eval/infer mode for MOT.
+_BASE_: [
+ '../bytetrack/detector/ppyoloe_crn_l_36e_640x640_mot17half.yml',
+ '../bytetrack/_base_/mot17.yml',
+ '../bytetrack/_base_/ppyoloe_mot_reader_640x640.yml'
+]
+weights: output/botsort_ppyoloe/model_final
+log_iter: 20
+snapshot_epoch: 2
+
+metric: MOT # eval/infer mode, set 'COCO' can be training mode
+num_classes: 1
+
+architecture: ByteTrack
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/ppyoloe_crn_l_300e_coco.pdparams
+ByteTrack:
+ detector: YOLOv3 # PPYOLOe version
+ reid: None
+ tracker: BOTSORTTracker
+det_weights: https://bj.bcebos.com/v1/paddledet/models/mot/ppyoloe_crn_l_36e_640x640_mot17half.pdparams
+reid_weights: None
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+# Tracking requires higher quality boxes, so NMS score_threshold will be higher
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: -1 # 100
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.1 # 0.01 in original detector
+ nms_threshold: 0.4 # 0.6 in original detector
+
+
+BOTSORTTracker:
+ track_high_thresh: 0.3
+ track_low_thresh: 0.2
+ new_track_thresh: 0.4
+ match_thresh: 0.7
+ track_buffer: 30
+ min_box_area: 0
+ camera_motion: False
+ cmc_method: 'sparseOptFlow' # used only when camera_motion is True
+ # options: sparseOptFlow | files (Vidstab GMC) | orb | ecc
+
+
+# MOTDataset for MOT evaluation and inference
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: MOT17/images/half
+ keep_ori_im: True # set as True in DeepSORT and ByteTrack
+
+TestMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ keep_ori_im: True # set True if save visualization images or video
diff --git a/PaddleDetection-release-2.6/configs/mot/bytetrack/README.md b/PaddleDetection-release-2.6/configs/mot/bytetrack/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..4015683cfa5969297febc12e7ca1264afabbc0b5
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/bytetrack/README.md
@@ -0,0 +1 @@
+README_cn.md
\ No newline at end of file
diff --git a/PaddleDetection-release-2.6/configs/mot/bytetrack/README_cn.md b/PaddleDetection-release-2.6/configs/mot/bytetrack/README_cn.md
new file mode 100644
index 0000000000000000000000000000000000000000..3e896ec0447fc9179d79828a1724f9dd968d1255
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/bytetrack/README_cn.md
@@ -0,0 +1,195 @@
+Simplified Chinese | [English](README.md)
+
+# ByteTrack (ByteTrack: Multi-Object Tracking by Associating Every Detection Box)
+
+## Contents
+- [Introduction](#introduction)
+- [Model Zoo](#model-zoo)
+ - [Pedestrian Tracking](#pedestrian-tracking)
+ - [Head Tracking](#head-tracking)
+- [Multi-class Adaptation](#multi-class-adaptation)
+- [Getting Started](#getting-started)
+- [Citation](#citation)
+
+
+## Introduction
+[ByteTrack](https://arxiv.org/abs/2110.06864) (ByteTrack: Multi-Object Tracking by Associating Every Detection Box) tracks by associating every detection box, not only the high-score ones. For low-score detection boxes, it uses their similarity to existing tracklets to recover true objects and filter out background detections. Configurations for several commonly used detectors are provided here for reference. Since differences in training dataset, input scale, number of training epochs, NMS threshold settings, etc. all affect accuracy and performance, please adapt them to your own needs.
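The two-stage association described above can be sketched roughly as follows (a simplified illustration only, not PaddleDetection's actual implementation; the greedy IoU matcher stands in for the Hungarian assignment used in practice, and the thresholds are placeholders):

```python
# Sketch of BYTE association: high-score boxes are matched to tracklets
# first; low-score boxes are then used to recover the remaining tracklets.

def iou(a, b):
    # a, b: boxes as [x1, y1, x2, y2]
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def greedy_match(tracks, boxes, thresh):
    # Greedy IoU matching; real trackers use Hungarian assignment here.
    matches, used = {}, set()
    for tid, tbox in tracks.items():
        best, best_iou = None, thresh
        for i, box in enumerate(boxes):
            if i not in used and iou(tbox, box) >= best_iou:
                best, best_iou = i, iou(tbox, box)
        if best is not None:
            matches[tid] = best
            used.add(best)
    return matches

def byte_associate(tracks, dets, high_thresh=0.6, low_thresh=0.1, iou_thresh=0.3):
    # tracks: {track_id: box}; dets: list of (box, score)
    high = [b for b, s in dets if s >= high_thresh]
    low = [b for b, s in dets if low_thresh <= s < high_thresh]
    first = greedy_match(tracks, high, iou_thresh)      # stage 1: high scores
    remaining = {t: b for t, b in tracks.items() if t not in first}
    second = greedy_match(remaining, low, iou_thresh)   # stage 2: recover with low scores
    return first, second
```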
+
+
+## Model Zoo
+
+### Pedestrian Tracking
+
+#### Results of ByteTrack with different detectors on the MOT-17 half Val Set
+
+| Detector training set | Detector | Input size | ReID | Det mAP(0.5:0.95) | MOTA | IDF1 | FPS | Config |
+| :-------- | :----- | :----: | :----:|:------: | :----: |:-----: |:----:|:----: |
+| MOT-17 half train | YOLOv3 | 608x608 | - | 42.7 | 49.5 | 54.8 | - |[config](./bytetrack_yolov3.yml) |
+| MOT-17 half train | PP-YOLOE-l | 640x640 | - | 52.9 | 50.4 | 59.7 | - |[config](./bytetrack_ppyoloe.yml) |
+| MOT-17 half train | PP-YOLOE-l | 640x640 |PPLCNet| 52.9 | 51.7 | 58.8 | - |[config](./bytetrack_ppyoloe_pplcnet.yml) |
+| **mix_mot_ch** | YOLOX-x | 800x1440| - | 61.9 | 77.3 | 71.6 | - |[config](./bytetrack_yolox.yml) |
+| **mix_det** | YOLOX-x | 800x1440| - | 65.4 | 84.5 | 77.4 | - |[config](./bytetrack_yolox.yml) |
+
+**Notes:**
+ - For detection-related configs and docs, see [detector](detector/).
+ - The model weight download links are given by ```det_weights``` and ```reid_weights``` in the config files and are downloaded automatically when running the ```tools/eval_mot.py``` evaluation command; `None` for ```reid_weights``` means ReID is not used.
+ - **ByteTrack does not use ReID weights by default.** To use ReID weights, see [bytetrack_ppyoloe_pplcnet.yml](./bytetrack_ppyoloe_pplcnet.yml); to **switch ReID weights, change its `reid_weights: ` to your own weight path**.
+ - **MOT17-half train** is a dataset composed of the images and annotations of the first half of each of the 7 videos in the MOT17 train sequences. For accuracy verification, the **MOT17-half val** set, composed of the second half of each video, can be used for evaluation. The dataset can be downloaded from [this link](https://bj.bcebos.com/v1/paddledet/data/mot/MOT17.zip) and should be extracted under the `dataset/mot/` folder.
+ - The **mix_mot_ch** dataset is a joint dataset composed of MOT17 and CrowdHuman; the **mix_det** dataset is a joint dataset composed of MOT17, CrowdHuman, Cityscapes, and ETHZ. For the dataset format and directory layout, see [this link](https://github.com/ifzhang/ByteTrack#data-preparation); place them under the `dataset/mot/` directory. For accuracy verification, the **MOT17-half val** set can be used for evaluation.
+
+
+#### Results of YOLOX-x ByteTrack (mix_det) on MOT-16/MOT-17
+
+[](https://paperswithcode.com/sota/multi-object-tracking-on-mot16?p=pp-yoloe-an-evolved-version-of-yolo)
+
+| Model | Test set | MOTA | IDF1 | IDS | FP | FN | FPS | Download | Config |
+| :---------: | :-------: | :----: | :----: | :----: | :----: | :----: | :------: | :----: |:-----: |
+| ByteTrack-x| MOT-17 Train | 84.4 | 72.8 | 837 | 5653 | 10985 | - |[download](https://paddledet.bj.bcebos.com/models/mot/yolox_x_24e_800x1440_mix_det.pdparams) | [config](./bytetrack_yolox.yml) |
+| ByteTrack-x| **MOT-17 Test** | **78.4** | 69.7 | 4974 | 37551 | 79524 | - |[download](https://paddledet.bj.bcebos.com/models/mot/yolox_x_24e_800x1440_mix_det.pdparams) | [config](./bytetrack_yolox.yml) |
+| ByteTrack-x| MOT-16 Train | 83.5 | 72.7 | 800 | 6973 | 10419 | - |[download](https://paddledet.bj.bcebos.com/models/mot/yolox_x_24e_800x1440_mix_det.pdparams) | [config](./bytetrack_yolox.yml) |
+| ByteTrack-x| **MOT-16 Test** | **77.7** | 70.1 | 1570 | 15695 | 23304 | - |[download](https://paddledet.bj.bcebos.com/models/mot/yolox_x_24e_800x1440_mix_det.pdparams) | [config](./bytetrack_yolox.yml) |
+
+
+**Notes:**
+ - The **mix_det** dataset is a joint dataset composed of MOT17, CrowdHuman, Cityscapes, and ETHZ. For the dataset format and directory layout, see [this link](https://github.com/ifzhang/ByteTrack#data-preparation), and place it under the `dataset/mot/` directory.
+ - The MOT-17 Train and MOT-16 Train numbers are from local evaluation on those sets. Since the Train sets are included in the training data, these MOTA numbers do not reflect the model's detection and tracking ability; they are reported only because MOT-17 and MOT-16 have no validation sets while their Train sets do have ground truth, which makes accuracy verification convenient.
+ - The MOT-17 Test and MOT-16 Test numbers are from submission to the official [MOTChallenge](https://motchallenge.net) evaluation server. Since the ground truth of the MOT-17 and MOT-16 Test sets is not released, these MOTA numbers do reflect the model's detection and tracking ability.
+ - For ByteTrack, training means training a standalone detector on the MOT data, and inference means assembling the tracker to evaluate the MOT metrics; the standalone detection model can also be evaluated on detection metrics.
+ - For export and deployment, ByteTrack exports the detection model alone and then assembles the tracker at runtime; see [PP-Tracking](../../../deploy/pptracking/python/README.md).
+
+
+### Head Tracking
+
+#### Results of YOLOX-x ByteTrack on the HT-21 Val Set
+
+| Model | Input size | MOTA | IDF1 | IDS | FP | FN | FPS | Download | Config |
+| :--------------| :------- | :----: | :----: | :---: | :----: | :---: | :------: | :----: |:----: |
+| ByteTrack-x | 1440x800 | 64.1 | 63.4 | 4191 | 185162 | 210240 | - | [download](https://paddledet.bj.bcebos.com/models/mot/bytetrack_yolox_ht21.pdparams) | [config](./bytetrack_yolox_ht21.yml) |
+
+#### Results of YOLOX-x ByteTrack on the HT-21 Test Set
+
+| Model | Input size | MOTA | IDF1 | IDS | FP | FN | FPS | Download | Config |
+| :--------------| :------- | :----: | :----: | :----: | :----: | :----: |:-------: | :----: | :----: |
+| ByteTrack-x | 1440x800 | 72.6 | 61.8 | 5163 | 71235 | 154139 | - | [download](https://paddledet.bj.bcebos.com/models/mot/bytetrack_yolox_ht21.pdparams) | [config](./bytetrack_yolox_ht21.yml) |
+
+**Notes:**
+ - For more head tracking models, see [headtracking21](../headtracking21).
+
+
+## Multi-class Adaptation
+
+For multi-class ByteTrack, see [bytetrack_ppyoloe_ppvehicle9cls.yml](./bytetrack_ppyoloe_ppvehicle9cls.yml), which uses model weights trained on the PPVehicle9cls dataset from [PP-Vehicle](../../ppvehicle/) for multi-class vehicle tracking. Since there is no tracking ground truth, evaluation is not possible and only tracking prediction is supported. You only need to modify `TestMOTDataset` and make sure the path exists; its `anno_path` should point to a `label_list.txt` listing the class names, one per line, which you need to write yourself. Note that if `anno_path` is wrong or the file cannot be found, the 80 COCO classes will be used by default.
+
+To **switch to your own detector weights, change the `det_weights: ` entry to your own weight path**, and update the **dataset path, `label_list.txt`, and number of classes** accordingly.
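For reference, a `label_list.txt` for the nine vehicle classes listed in the config comments can be generated like this (a sketch; adjust the class names and the output path to your own dataset):

```python
# Write one class name per line, in the same order as the model's class ids:
# pedestrian(1), rider(2), car(3), truck(4), bus(5), van(6),
# motorcycle(7), bicycle(8), others(9).
classes = ["pedestrian", "rider", "car", "truck", "bus",
           "van", "motorcycle", "bicycle", "others"]

with open("label_list.txt", "w") as f:  # place under dataset/mot/ in practice
    f.write("\n".join(classes) + "\n")
```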
+
+Predict multi-class vehicle tracking:
+```bash
+# Download the demo video
+wget https://bj.bcebos.com/v1/paddledet/data/mot/demo/bdd100k_demo.mp4
+
+# Use the PPYOLOE multi-class vehicle detection model
+CUDA_VISIBLE_DEVICES=1 python tools/infer_mot.py -c configs/mot/bytetrack/bytetrack_ppyoloe_ppvehicle9cls.yml --video_file=bdd100k_demo.mp4 --scaled=True --save_videos
+```
+
+**Notes:**
+ - Make sure [ffmpeg](https://ffmpeg.org/ffmpeg.html) is installed first; on Linux (Ubuntu) it can be installed directly with `apt-get update && apt-get install -y ffmpeg`.
+ - `--scaled` indicates whether the coordinates in the model output have already been scaled back to the original image: False for the JDE YOLOv3 detector, True for general detection models.
+ - `--save_videos` saves the visualized video; the visualized images are also saved under `{output_dir}/mot_outputs/`, where `{output_dir}` can be set with `--output_dir` and defaults to `output`.
+
+
+## Getting Started
+
+### 1. Training
+Launch training and evaluation with the following commands
+```bash
+python -m paddle.distributed.launch --log_dir=ppyoloe --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/mot/bytetrack/detector/ppyoloe_crn_l_36e_640x640_mot17half.yml --eval --amp
+# or
+python -m paddle.distributed.launch --log_dir=ppyoloe --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/mot/bytetrack/detector/yolox_x_24e_800x1440_mix_det.yml --eval --amp
+```
+
+**Notes:**
+ - `--eval` evaluates accuracy during training; `--amp` enables mixed-precision training to avoid overflow. PaddlePaddle 2.2.2 is recommended.
+
+### 2. Evaluation
+#### 2.1 Evaluate detection
+```bash
+CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/mot/bytetrack/detector/ppyoloe_crn_l_36e_640x640_mot17half.yml -o weights=https://bj.bcebos.com/v1/paddledet/models/mot/ppyoloe_crn_l_36e_640x640_mot17half.pdparams
+# or
+CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/mot/bytetrack/detector/yolox_x_24e_800x1440_mix_det.yml -o weights=https://bj.bcebos.com/v1/paddledet/models/mot/yolox_x_24e_800x1440_mix_det.pdparams
+```
+
+**Notes:**
+ - Detection is evaluated with ```tools/eval.py```; tracking is evaluated with ```tools/eval_mot.py```.
+
+#### 2.2 Evaluate tracking
+```bash
+CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/bytetrack/bytetrack_yolov3.yml --scaled=True
+# or
+CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/bytetrack/bytetrack_ppyoloe.yml --scaled=True
+# or
+CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/bytetrack/bytetrack_ppyoloe_pplcnet.yml --scaled=True
+# or
+CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/bytetrack/bytetrack_yolox.yml --scaled=True
+```
+**Notes:**
+ - `--scaled` indicates whether the coordinates in the model output have already been scaled back to the original image: False for the JDE YOLOv3 detector, True for general detection models; the default is False.
+ - Tracking results are stored under `{output_dir}/mot_results/`, one txt per video sequence; each line of a txt is `frame,id,x1,y1,w,h,score,-1,-1,-1`. `{output_dir}` can be set with `--output_dir` and defaults to `output`.
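As a quick sanity check on a saved result file, the number of distinct track ids and frames in a sequence can be counted like this (a sketch, not a PaddleDetection tool; lines are assumed to follow the format above):

```python
def summarize_mot_txt(lines):
    # Count distinct frames and track ids in MOT-format result lines
    # "frame,id,x1,y1,w,h,score,-1,-1,-1".
    frames, ids = set(), set()
    for line in lines:
        frame, tid = line.strip().split(",")[:2]
        frames.add(int(frame))
        ids.add(int(tid))
    return {"num_frames": len(frames), "num_tracks": len(ids)}
```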
+
+### 3. Prediction
+
+Predict a video on a single GPU with the following commands, saving the result as a video
+
+```bash
+# Download the demo video
+wget https://bj.bcebos.com/v1/paddledet/data/mot/demo/mot17_demo.mp4
+
+# Use the PPYOLOE pedestrian detection model
+CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/bytetrack/bytetrack_ppyoloe.yml --video_file=mot17_demo.mp4 --scaled=True --save_videos
+# Or use the YOLOX pedestrian detection model
+CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/bytetrack/bytetrack_yolox.yml --video_file=mot17_demo.mp4 --scaled=True --save_videos
+```
+
+**Notes:**
+ - Make sure [ffmpeg](https://ffmpeg.org/ffmpeg.html) is installed first; on Linux (Ubuntu) it can be installed directly with `apt-get update && apt-get install -y ffmpeg`.
+ - `--scaled` indicates whether the coordinates in the model output have already been scaled back to the original image: False for the JDE YOLOv3 detector, True for general detection models.
+ - `--save_videos` saves the visualized video; the visualized images are also saved under `{output_dir}/mot_outputs/`, where `{output_dir}` can be set with `--output_dir` and defaults to `output`.
+
+
+### 4. Export the inference model
+
+Step 1: Export the detection model
+```bash
+# Export the PPYOLOE pedestrian detection model
+CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/bytetrack/detector/ppyoloe_crn_l_36e_640x640_mot17half.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/ppyoloe_crn_l_36e_640x640_mot17half.pdparams
+# Or export the YOLOX pedestrian detection model
+CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/bytetrack/detector/yolox_x_24e_800x1440_mix_det.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/yolox_x_24e_800x1440_mix_det.pdparams
+```
+
+Step 2: Export the ReID model (optional, not needed by default)
+```bash
+# Export the PPLCNet ReID model
+CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/deepsort/reid/deepsort_pplcnet.yml -o reid_weights=https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_pplcnet.pdparams
+```
+
+### 5. Inference with the exported model in Python
+
+```bash
+python deploy/pptracking/python/mot_sde_infer.py --model_dir=output_inference/ppyoloe_crn_l_36e_640x640_mot17half/ --tracker_config=deploy/pptracking/python/tracker_config.yml --video_file=mot17_demo.mp4 --device=GPU --save_mot_txts
+# or
+python deploy/pptracking/python/mot_sde_infer.py --model_dir=output_inference/yolox_x_24e_800x1440_mix_det/ --tracker_config=deploy/pptracking/python/tracker_config.yml --video_file=mot17_demo.mp4 --device=GPU --save_mot_txts
+```
+
+**Notes:**
+ - The tracking model predicts on videos; single-image prediction is not supported. By default the visualized tracking result is saved as a video. Add `--save_mot_txts` (one txt per video) or `--save_mot_txt_per_img` (one txt per image) to save the tracking results as txt files, or `--save_images` to save the visualized frames.
+ - Each line of the tracking result txt is `frame,id,x1,y1,w,h,score,-1,-1,-1`.
+
+
+## Citation
+```
+@article{zhang2021bytetrack,
+ title={ByteTrack: Multi-Object Tracking by Associating Every Detection Box},
+ author={Zhang, Yifu and Sun, Peize and Jiang, Yi and Yu, Dongdong and Yuan, Zehuan and Luo, Ping and Liu, Wenyu and Wang, Xinggang},
+ journal={arXiv preprint arXiv:2110.06864},
+ year={2021}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/mot/bytetrack/_base_/ht21.yml b/PaddleDetection-release-2.6/configs/mot/bytetrack/_base_/ht21.yml
new file mode 100644
index 0000000000000000000000000000000000000000..8500af3165e1173cc442396ace1af54f09ab810a
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/bytetrack/_base_/ht21.yml
@@ -0,0 +1,34 @@
+metric: COCO
+num_classes: 1
+
+# Detection Dataset for training
+TrainDataset:
+ !COCODataSet
+ image_dir: images/train
+ anno_path: annotations/train.json
+ dataset_dir: dataset/mot/HT21
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images/train
+ anno_path: annotations/val_half.json
+ dataset_dir: dataset/mot/HT21
+
+TestDataset:
+ !ImageFolder
+ dataset_dir: dataset/mot/HT21
+ anno_path: annotations/val_half.json
+
+
+# MOTDataset for MOT evaluation and inference
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: HT21/images/test
+ keep_ori_im: True # set as True in DeepSORT and ByteTrack
+
+TestMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ keep_ori_im: True # set True if save visualization images or video
diff --git a/PaddleDetection-release-2.6/configs/mot/bytetrack/_base_/mix_det.yml b/PaddleDetection-release-2.6/configs/mot/bytetrack/_base_/mix_det.yml
new file mode 100644
index 0000000000000000000000000000000000000000..fbe19bdaa29246919189d5d93a3ea01e3734b52c
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/bytetrack/_base_/mix_det.yml
@@ -0,0 +1,34 @@
+metric: COCO
+num_classes: 1
+
+# Detection Dataset for training
+TrainDataset:
+ !COCODataSet
+ image_dir: ""
+ anno_path: annotations/train.json
+ dataset_dir: dataset/mot/mix_det
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images/train
+ anno_path: annotations/val_half.json
+ dataset_dir: dataset/mot/MOT17
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/val_half.json
+ dataset_dir: dataset/mot/MOT17
+
+
+# MOTDataset for MOT evaluation and inference
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: MOT17/images/half
+ keep_ori_im: True # set as True in DeepSORT and ByteTrack
+
+TestMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ keep_ori_im: True # set True if save visualization images or video
diff --git a/PaddleDetection-release-2.6/configs/mot/bytetrack/_base_/mix_mot_ch.yml b/PaddleDetection-release-2.6/configs/mot/bytetrack/_base_/mix_mot_ch.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a19f149301a1d993c552a12e60144f63990d6f4d
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/bytetrack/_base_/mix_mot_ch.yml
@@ -0,0 +1,34 @@
+metric: COCO
+num_classes: 1
+
+# Detection Dataset for training
+TrainDataset:
+ !COCODataSet
+ image_dir: ""
+ anno_path: annotations/train.json
+ dataset_dir: dataset/mot/mix_mot_ch
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images/train
+ anno_path: annotations/val_half.json
+ dataset_dir: dataset/mot/MOT17
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/val_half.json
+ dataset_dir: dataset/mot/MOT17
+
+
+# MOTDataset for MOT evaluation and inference
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: MOT17/images/half
+ keep_ori_im: True # set as True in DeepSORT and ByteTrack
+
+TestMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ keep_ori_im: True # set True if save visualization images or video
diff --git a/PaddleDetection-release-2.6/configs/mot/bytetrack/_base_/mot17.yml b/PaddleDetection-release-2.6/configs/mot/bytetrack/_base_/mot17.yml
new file mode 100644
index 0000000000000000000000000000000000000000..faf47f622d1c2847a9686dfa8d7e48a49c05436c
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/bytetrack/_base_/mot17.yml
@@ -0,0 +1,34 @@
+metric: COCO
+num_classes: 1
+
+# Detection Dataset for training
+TrainDataset:
+ !COCODataSet
+ dataset_dir: dataset/mot/MOT17
+ anno_path: annotations/train_half.json
+ image_dir: images/train
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ dataset_dir: dataset/mot/MOT17
+ anno_path: annotations/val_half.json
+ image_dir: images/train
+
+TestDataset:
+ !ImageFolder
+ dataset_dir: dataset/mot/MOT17
+ anno_path: annotations/val_half.json
+
+
+# MOTDataset for MOT evaluation and inference
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: MOT17/images/half
+ keep_ori_im: True # set as True in DeepSORT and ByteTrack
+
+TestMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ keep_ori_im: True # set True if save visualization images or video
diff --git a/PaddleDetection-release-2.6/configs/mot/bytetrack/_base_/ppyoloe_mot_reader_640x640.yml b/PaddleDetection-release-2.6/configs/mot/bytetrack/_base_/ppyoloe_mot_reader_640x640.yml
new file mode 100644
index 0000000000000000000000000000000000000000..ef6342fd0e9249acf386b7795cb538b73a26f108
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/bytetrack/_base_/ppyoloe_mot_reader_640x640.yml
@@ -0,0 +1,60 @@
+worker_num: 4
+eval_height: &eval_height 640
+eval_width: &eval_width 640
+eval_size: &eval_size [*eval_height, *eval_width]
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [320, 352, 384, 416, 448, 480, 512, 544, 576, 608, 640, 672, 704, 736, 768], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 8
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 8
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+
+# add MOTReader for MOT evaluation and inference, note batch_size should be 1 in MOT
+EvalMOTReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+TestMOTReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/mot/bytetrack/_base_/yolov3_mot_reader_608x608.yml b/PaddleDetection-release-2.6/configs/mot/bytetrack/_base_/yolov3_mot_reader_608x608.yml
new file mode 100644
index 0000000000000000000000000000000000000000..535a977033ebc346af5cc4625986233618a26917
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/bytetrack/_base_/yolov3_mot_reader_608x608.yml
@@ -0,0 +1,66 @@
+worker_num: 2
+TrainReader:
+ inputs_def:
+ num_max_boxes: 50
+ sample_transforms:
+ - Decode: {}
+ - Mixup: {alpha: 1.5, beta: 1.5}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [320, 352, 384, 416, 448, 480, 512, 544, 576, 608], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeBox: {}
+ - PadBox: {num_max_boxes: 50}
+ - BboxXYXY2XYWH: {}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - Gt2YoloTarget: {anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]], anchors: [[10, 13], [16, 30], [33, 23], [30, 61], [62, 45], [59, 119], [116, 90], [156, 198], [373, 326]], downsample_ratios: [32, 16, 8]}
+ batch_size: 8
+ shuffle: true
+ drop_last: true
+ mixup_epoch: 250
+ use_shared_memory: true
+
+EvalReader:
+ inputs_def:
+ num_max_boxes: 50
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [608, 608], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 8
+
+TestReader:
+ inputs_def:
+ image_shape: [3, 608, 608]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [608, 608], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+
+# add MOTReader for MOT evaluation and inference, note batch_size should be 1 in MOT
+EvalMOTReader:
+ inputs_def:
+ num_max_boxes: 50
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [608, 608], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+TestMOTReader:
+ inputs_def:
+ image_shape: [3, 608, 608]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [608, 608], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/mot/bytetrack/_base_/yolox_mot_reader_800x1440.yml b/PaddleDetection-release-2.6/configs/mot/bytetrack/_base_/yolox_mot_reader_800x1440.yml
new file mode 100644
index 0000000000000000000000000000000000000000..48d4144221f6fa353af90ce3781a21329a566751
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/bytetrack/_base_/yolox_mot_reader_800x1440.yml
@@ -0,0 +1,67 @@
+
+input_height: &input_height 800
+input_width: &input_width 1440
+input_size: &input_size [*input_height, *input_width]
+
+worker_num: 4
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - Mosaic:
+ prob: 1.0
+ input_dim: *input_size
+ degrees: [-10, 10]
+ scale: [0.1, 2.0]
+ shear: [-2, 2]
+ translate: [-0.1, 0.1]
+ enable_mixup: True
+ mixup_prob: 1.0
+ mixup_scale: [0.5, 1.5]
+ - AugmentHSV: {is_bgr: False, hgain: 5, sgain: 30, vgain: 30}
+ - PadResize: {target_size: *input_size}
+ - RandomFlip: {}
+ batch_transforms:
+ - Permute: {}
+ batch_size: 6
+ shuffle: True
+ drop_last: True
+ collate_batch: False
+ mosaic_epoch: 20
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *input_size, keep_ratio: True}
+ - Pad: {size: *input_size, fill_value: [114., 114., 114.]}
+ - Permute: {}
+ batch_size: 8
+
+TestReader:
+ inputs_def:
+ image_shape: [3, 800, 1440]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *input_size, keep_ratio: True}
+ - Pad: {size: *input_size, fill_value: [114., 114., 114.]}
+ - Permute: {}
+ batch_size: 1
+
+
+# add MOTReader for MOT evaluation and inference, note batch_size should be 1 in MOT
+EvalMOTReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *input_size, keep_ratio: True}
+ - Pad: {size: *input_size, fill_value: [114., 114., 114.]}
+ - Permute: {}
+ batch_size: 1
+
+TestMOTReader:
+ inputs_def:
+ image_shape: [3, 800, 1440]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *input_size, keep_ratio: True}
+ - Pad: {size: *input_size, fill_value: [114., 114., 114.]}
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/mot/bytetrack/bytetrack_ppyoloe.yml b/PaddleDetection-release-2.6/configs/mot/bytetrack/bytetrack_ppyoloe.yml
new file mode 100644
index 0000000000000000000000000000000000000000..5e7ffe07f0f758c641596e90ee0da4c31085fd85
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/bytetrack/bytetrack_ppyoloe.yml
@@ -0,0 +1,59 @@
+# This config is an assembled config for ByteTrack MOT, used as eval/infer mode for MOT.
+_BASE_: [
+ 'detector/ppyoloe_crn_l_36e_640x640_mot17half.yml',
+ '_base_/mot17.yml',
+ '_base_/ppyoloe_mot_reader_640x640.yml'
+]
+weights: output/bytetrack_ppyoloe/model_final
+log_iter: 20
+snapshot_epoch: 2
+
+metric: MOT # eval/infer mode; set 'COCO' for training mode
+num_classes: 1
+
+architecture: ByteTrack
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/ppyoloe_crn_l_300e_coco.pdparams
+ByteTrack:
+ detector: YOLOv3 # PPYOLOe version
+ reid: None
+ tracker: JDETracker
+det_weights: https://bj.bcebos.com/v1/paddledet/models/mot/ppyoloe_crn_l_36e_640x640_mot17half.pdparams
+reid_weights: None
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+# Tracking requires higher quality boxes, so NMS score_threshold will be higher
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: -1 # 100
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.1 # 0.01 in original detector
+ nms_threshold: 0.4 # 0.6 in original detector
+
+# BYTETracker
+JDETracker:
+ use_byte: True
+ match_thres: 0.9
+ conf_thres: 0.2
+ low_conf_thres: 0.1
+ min_box_area: 100
+ vertical_ratio: 1.6 # for pedestrian
diff --git a/PaddleDetection-release-2.6/configs/mot/bytetrack/bytetrack_ppyoloe_pplcnet.yml b/PaddleDetection-release-2.6/configs/mot/bytetrack/bytetrack_ppyoloe_pplcnet.yml
new file mode 100644
index 0000000000000000000000000000000000000000..60f81165d5b324943a997dbc26fbe56f249f2ef6
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/bytetrack/bytetrack_ppyoloe_pplcnet.yml
@@ -0,0 +1,59 @@
+# This config is an assembled config for ByteTrack MOT, used as eval/infer mode for MOT.
+_BASE_: [
+ 'detector/ppyoloe_crn_l_36e_640x640_mot17half.yml',
+ '_base_/mot17.yml',
+ '_base_/ppyoloe_mot_reader_640x640.yml'
+]
+weights: output/bytetrack_ppyoloe_pplcnet/model_final
+log_iter: 20
+snapshot_epoch: 2
+
+metric: MOT # eval/infer mode
+num_classes: 1
+
+architecture: ByteTrack
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/ppyoloe_crn_l_300e_coco.pdparams
+ByteTrack:
+ detector: YOLOv3 # PPYOLOe version
+ reid: PPLCNetEmbedding # use reid
+ tracker: JDETracker
+det_weights: https://bj.bcebos.com/v1/paddledet/models/mot/ppyoloe_crn_l_36e_640x640_mot17half.pdparams
+reid_weights: https://bj.bcebos.com/v1/paddledet/models/mot/deepsort_pplcnet.pdparams
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+# Tracking requires higher quality boxes, so NMS score_threshold will be higher
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: -1 # 100
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.1 # 0.01 in original detector
+ nms_threshold: 0.4 # 0.6 in original detector
+
+# BYTETracker
+JDETracker:
+ use_byte: True
+ match_thres: 0.9
+ conf_thres: 0.2
+ low_conf_thres: 0.1
+ min_box_area: 100
+ vertical_ratio: 1.6 # for pedestrian
diff --git a/PaddleDetection-release-2.6/configs/mot/bytetrack/bytetrack_ppyoloe_ppvehicle9cls.yml b/PaddleDetection-release-2.6/configs/mot/bytetrack/bytetrack_ppyoloe_ppvehicle9cls.yml
new file mode 100644
index 0000000000000000000000000000000000000000..f847a34d1e0c7af1ccaa6be33036f06a6473a7a4
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/bytetrack/bytetrack_ppyoloe_ppvehicle9cls.yml
@@ -0,0 +1,49 @@
+# This config is an assembled config for ByteTrack MOT, used as eval/infer mode for MOT.
+_BASE_: [
+ 'bytetrack_ppyoloe.yml',
+ '_base_/ppyoloe_mot_reader_640x640.yml'
+]
+weights: output/bytetrack_ppyoloe_ppvehicle9cls/model_final
+
+metric: MCMOT # multi-class, `MOT` for single class
+num_classes: 9
+# pedestrian(1), rider(2), car(3), truck(4), bus(5), van(6), motorcycle(7), bicycle(8), others(9)
+TestMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ keep_ori_im: True # set True if save visualization images or video
+ anno_path: dataset/mot/label_list.txt # absolute path
+
+### write in label_list.txt each line:
+# pedestrian
+# rider
+# car
+# truck
+# bus
+# van
+# motorcycle
+# bicycle
+# others
+###
+
+det_weights: https://paddledet.bj.bcebos.com/models/mot_ppyoloe_l_36e_ppvehicle9cls.pdparams
+depth_mult: 1.0
+width_mult: 1.0
+
+# Tracking requires higher quality boxes, so NMS score_threshold will be higher
+PPYOLOEHead:
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.1 # 0.01 in original detector
+ nms_threshold: 0.4 # 0.6 in original detector
+
+# BYTETracker
+JDETracker:
+ use_byte: True
+ match_thres: 0.9
+ conf_thres: 0.2
+ low_conf_thres: 0.1
+ min_box_area: 0
+ vertical_ratio: 0 # only use 1.6 in MOT17 pedestrian
diff --git a/PaddleDetection-release-2.6/configs/mot/bytetrack/bytetrack_yolov3.yml b/PaddleDetection-release-2.6/configs/mot/bytetrack/bytetrack_yolov3.yml
new file mode 100644
index 0000000000000000000000000000000000000000..0ce35ae7831d36ead906a60ccbd5632f1b147b2e
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/bytetrack/bytetrack_yolov3.yml
@@ -0,0 +1,50 @@
+# This config is an assembled config for ByteTrack MOT, used as eval/infer mode for MOT.
+_BASE_: [
+ 'detector/yolov3_darknet53_40e_608x608_mot17half.yml',
+ '_base_/mot17.yml',
+ '_base_/yolov3_mot_reader_608x608.yml'
+]
+weights: output/bytetrack_yolov3/model_final
+log_iter: 20
+snapshot_epoch: 2
+
+metric: MOT # eval/infer mode
+num_classes: 1
+
+architecture: ByteTrack
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/yolov3_darknet53_270e_coco.pdparams
+ByteTrack:
+ detector: YOLOv3 # General YOLOv3 version
+ reid: None
+ tracker: JDETracker
+det_weights: https://bj.bcebos.com/v1/paddledet/models/mot/yolov3_darknet53_40e_608x608_mot17half.pdparams
+reid_weights: None
+
+YOLOv3:
+ backbone: DarkNet
+ neck: YOLOv3FPN
+ yolo_head: YOLOv3Head
+ post_process: BBoxPostProcess
+
+# Tracking requires higher quality boxes, so NMS score_threshold will be higher
+BBoxPostProcess:
+ decode:
+ name: YOLOBox
+ conf_thresh: 0.005
+ downsample_ratio: 32
+ clip_bbox: true
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.45
+ nms_top_k: 1000
+
+# BYTETracker
+JDETracker:
+ use_byte: True
+ match_thres: 0.9
+ conf_thres: 0.2
+ low_conf_thres: 0.1
+ min_box_area: 100
+ vertical_ratio: 1.6 # for pedestrian
diff --git a/PaddleDetection-release-2.6/configs/mot/bytetrack/bytetrack_yolox.yml b/PaddleDetection-release-2.6/configs/mot/bytetrack/bytetrack_yolox.yml
new file mode 100644
index 0000000000000000000000000000000000000000..2e195c56d00cfc696e93fee4e9f709f123b5dcec
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/bytetrack/bytetrack_yolox.yml
@@ -0,0 +1,68 @@
+# This config is an assembled config for ByteTrack MOT, used as eval/infer mode for MOT.
+_BASE_: [
+ 'detector/yolox_x_24e_800x1440_mix_det.yml',
+ '_base_/mix_det.yml',
+ '_base_/yolox_mot_reader_800x1440.yml'
+]
+weights: output/bytetrack_yolox/model_final
+log_iter: 20
+snapshot_epoch: 2
+
+metric: MOT # eval/infer mode
+num_classes: 1
+
+architecture: ByteTrack
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/yolox_x_300e_coco.pdparams
+ByteTrack:
+ detector: YOLOX
+ reid: None
+ tracker: JDETracker
+det_weights: https://bj.bcebos.com/v1/paddledet/models/mot/yolox_x_24e_800x1440_mix_det.pdparams
+reid_weights: None
+
+depth_mult: 1.33
+width_mult: 1.25
+
+YOLOX:
+ backbone: CSPDarkNet
+ neck: YOLOCSPPAN
+ head: YOLOXHead
+ input_size: [800, 1440]
+ size_stride: 32
+ size_range: [18, 22] # multi-scale range [576*1024 ~ 800*1440], w/h ratio=1.8
+
+CSPDarkNet:
+ arch: "X"
+ return_idx: [2, 3, 4]
+ depthwise: False
+
+YOLOCSPPAN:
+ depthwise: False
+
+# Tracking requires higher quality boxes, so NMS score_threshold will be higher
+YOLOXHead:
+ l1_epoch: 20
+ depthwise: False
+ loss_weight: {cls: 1.0, obj: 1.0, iou: 5.0, l1: 1.0}
+ assigner:
+ name: SimOTAAssigner
+ candidate_topk: 10
+ use_vfl: False
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.7
+  # 'nms_top_k' and 'keep_top_k' are already set to 1000/100 here for speed, which costs only about 0.1% mAP.
+  # For a high-speed demo, you can set 'score_threshold' to 0.25 and 'nms_threshold' to 0.45, but mAP will drop a lot.
+
+
+# BYTETracker
+JDETracker:
+ use_byte: True
+ match_thres: 0.9
+ conf_thres: 0.6
+ low_conf_thres: 0.2
+ min_box_area: 100
+ vertical_ratio: 1.6 # for pedestrian
diff --git a/PaddleDetection-release-2.6/configs/mot/bytetrack/bytetrack_yolox_ht21.yml b/PaddleDetection-release-2.6/configs/mot/bytetrack/bytetrack_yolox_ht21.yml
new file mode 100644
index 0000000000000000000000000000000000000000..ea21a87c5ed1ec8297155c80b8e7136e1941c636
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/bytetrack/bytetrack_yolox_ht21.yml
@@ -0,0 +1,68 @@
+# This config is an assembled config for ByteTrack MOT, used as eval/infer mode for MOT.
+_BASE_: [
+ 'detector/yolox_x_24e_800x1440_ht21.yml',
+ '_base_/ht21.yml',
+ '_base_/yolox_mot_reader_800x1440.yml'
+]
+weights: output/bytetrack_yolox_ht21/model_final
+log_iter: 20
+snapshot_epoch: 2
+
+metric: MOT # eval/infer mode
+num_classes: 1
+
+architecture: ByteTrack
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/yolox_x_300e_coco.pdparams
+ByteTrack:
+ detector: YOLOX
+ reid: None
+ tracker: JDETracker
+det_weights: https://bj.bcebos.com/v1/paddledet/models/mot/yolox_x_24e_800x1440_ht21.pdparams
+reid_weights: None
+
+depth_mult: 1.33
+width_mult: 1.25
+
+YOLOX:
+ backbone: CSPDarkNet
+ neck: YOLOCSPPAN
+ head: YOLOXHead
+ input_size: [800, 1440]
+ size_stride: 32
+ size_range: [18, 22] # multi-scale range [576*1024 ~ 800*1440], w/h ratio=1.8
+
+CSPDarkNet:
+ arch: "X"
+ return_idx: [2, 3, 4]
+ depthwise: False
+
+YOLOCSPPAN:
+ depthwise: False
+
+# Tracking requires higher quality boxes, so NMS score_threshold will be higher
+YOLOXHead:
+ l1_epoch: 20
+ depthwise: False
+ loss_weight: {cls: 1.0, obj: 1.0, iou: 5.0, l1: 1.0}
+ assigner:
+ name: SimOTAAssigner
+ candidate_topk: 10
+ use_vfl: False
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 30000
+ keep_top_k: 1000
+ score_threshold: 0.01
+ nms_threshold: 0.7
+  # To trade a small mAP loss for speed, you can set 'nms_top_k' to 1000 and 'keep_top_k' to 100; mAP will drop about 0.1%.
+  # For a high-speed demo, you can set 'score_threshold' to 0.25 and 'nms_threshold' to 0.45, but mAP will drop a lot.
+
+
+# BYTETracker
+JDETracker:
+ use_byte: True
+ match_thres: 0.9
+ conf_thres: 0.7
+ low_conf_thres: 0.1
+ min_box_area: 0
+ vertical_ratio: 0 # 1.6 for pedestrian
diff --git a/PaddleDetection-release-2.6/configs/mot/bytetrack/detector/README.md b/PaddleDetection-release-2.6/configs/mot/bytetrack/detector/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..4015683cfa5969297febc12e7ca1264afabbc0b5
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/bytetrack/detector/README.md
@@ -0,0 +1 @@
+README_cn.md
\ No newline at end of file
diff --git a/PaddleDetection-release-2.6/configs/mot/bytetrack/detector/README_cn.md b/PaddleDetection-release-2.6/configs/mot/bytetrack/detector/README_cn.md
new file mode 100644
index 0000000000000000000000000000000000000000..6de434cb4c7c5b80084b926bfc5dd70cbf7e196e
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/bytetrack/detector/README_cn.md
@@ -0,0 +1,39 @@
+简体中文 | [English](README.md)
+
+# Detectors for ByteTrack
+
+## Introduction
+[ByteTrack](https://arxiv.org/abs/2110.06864) (ByteTrack: Multi-Object Tracking by Associating Every Detection Box) tracks by associating every detection box rather than only the high-score ones. This directory provides configs of several commonly used detectors for reference. Since differences in training dataset, input scale, number of training epochs, NMS threshold, etc. all affect model accuracy and performance, please adapt them to your own needs.
+
+## Model Zoo
+
+### Detection results on the MOT17-half val set
+| Backbone | Model | Input size | LR schedule | Inference time (fps) | Box AP | Download | Config |
+| :-------------- | :------------- | :--------: | :---------: | :-----------: | :-----: | :------: | :-----: |
+| DarkNet-53 | YOLOv3 | 608x608 | 40e | ---- | 42.7 | [download](https://paddledet.bj.bcebos.com/models/mot/deepsort/yolov3_darknet53_40e_608x608_mot17half.pdparams) | [config](./yolov3_darknet53_40e_608x608_mot17half.yml) |
+| CSPResNet | PP-YOLOE | 640x640 | 36e | ---- | 52.9 | [download](https://paddledet.bj.bcebos.com/models/mot/deepsort/ppyoloe_crn_l_36e_640x640_mot17half.pdparams) | [config](./ppyoloe_crn_l_36e_640x640_mot17half.yml) |
+| CSPDarkNet | YOLOX-x(mix_mot_ch) | 800x1440 | 24e | ---- | 61.9 | [download](https://paddledet.bj.bcebos.com/models/mot/deepsort/yolox_x_24e_800x1440_mix_mot_ch.pdparams) | [config](./yolox_x_24e_800x1440_mix_mot_ch.yml) |
+| CSPDarkNet | YOLOX-x(mix_det) | 800x1440 | 24e | ---- | 65.4 | [download](https://paddledet.bj.bcebos.com/models/mot/deepsort/yolox_x_24e_800x1440_mix_det.pdparams) | [config](./yolox_x_24e_800x1440_mix_det.yml) |
+
+**Notes:**
+  - Except for YOLOX, the models above are trained on the **MOT17-half train** set, which can be downloaded from [this link](https://bj.bcebos.com/v1/paddledet/data/mot/MOT17.zip).
+  - **MOT17-half train** consists of the images and annotations of the first half frames of each of the 7 train sequences of MOT17. For validation, all models can be evaluated on the **MOT17-half val** set, composed of the second half frames of each video; it can be downloaded from [this link](https://paddledet.bj.bcebos.com/data/mot/mot17half/annotations.zip) and should be extracted into the `dataset/mot/MOT17/images/` folder.
+  - YOLOX-x(mix_mot_ch) is trained on the **mix_mot_ch** dataset, a joint dataset of MOT17 and CrowdHuman; YOLOX-x(mix_det) is trained on the **mix_det** dataset, a joint dataset of MOT17, CrowdHuman, Cityscapes and ETHZ. See [this link](https://github.com/ifzhang/ByteTrack#data-preparation) for the dataset format and layout, and place the data under `dataset/mot/`. For validation, all models can be evaluated on the **MOT17-half val** set.
+  - For pedestrian tracking, use a pedestrian detector together with a pedestrian ReID model. For vehicle tracking, use a vehicle detector together with a vehicle ReID model.
+  - When these models are used for ByteTrack tracking, post-processing settings such as the NMS threshold differ from those used in the pure detection task.
+
+
+## Getting Started
+
+Launch training, evaluation and model export with the following commands:
+```bash
+job_name=ppyoloe_crn_l_36e_640x640_mot17half
+config=configs/mot/bytetrack/detector/${job_name}.yml
+log_dir=log_dir/${job_name}
+# 1. training
+python -m paddle.distributed.launch --log_dir=${log_dir} --gpus 0,1,2,3,4,5,6,7 tools/train.py -c ${config} --eval --amp
+# 2. evaluation
+CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c ${config} -o weights=output/${job_name}/model_final.pdparams
+# 3. export
+CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c ${config} -o weights=output/${job_name}/model_final.pdparams
+```
diff --git a/PaddleDetection-release-2.6/configs/mot/bytetrack/detector/ppyoloe_crn_l_36e_640x640_mot17half.yml b/PaddleDetection-release-2.6/configs/mot/bytetrack/detector/ppyoloe_crn_l_36e_640x640_mot17half.yml
new file mode 100644
index 0000000000000000000000000000000000000000..6c770e9bf85e953a30df43faf57c401518b7f6ad
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/bytetrack/detector/ppyoloe_crn_l_36e_640x640_mot17half.yml
@@ -0,0 +1,83 @@
+# This config is an assembled config for ByteTrack MOT, used as eval/infer mode for MOT.
+_BASE_: [
+ '../../../ppyoloe/ppyoloe_crn_l_300e_coco.yml',
+ '../_base_/mot17.yml',
+]
+weights: output/ppyoloe_crn_l_36e_640x640_mot17half/model_final
+log_iter: 20
+snapshot_epoch: 2
+
+
+# schedule configuration for fine-tuning
+epoch: 36
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !CosineDecay
+ max_epochs: 43
+ - !LinearWarmup
+ start_factor: 0.001
+ epochs: 1
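As an illustration of the schedule above (a per-epoch approximation; Paddle's schedulers actually step per iteration), the warmup ramps linearly from `start_factor * base_lr` over one epoch, and the cosine then decays toward zero at `max_epochs: 43`, which deliberately exceeds `epoch: 36` so training stops before the LR bottoms out:

```python
import math

# Per-epoch sketch (not PaddleDetection's implementation) of the
# LinearWarmup + CosineDecay schedule configured above.

BASE_LR = 0.001
MAX_EPOCHS = 43        # CosineDecay.max_epochs
WARMUP_EPOCHS = 1      # LinearWarmup.epochs
START_FACTOR = 0.001   # LinearWarmup.start_factor

def lr_at(epoch_frac):
    """Learning rate at a (fractional) epoch index."""
    if epoch_frac < WARMUP_EPOCHS:
        # linear ramp from START_FACTOR * BASE_LR up to BASE_LR
        alpha = epoch_frac / WARMUP_EPOCHS
        return BASE_LR * (START_FACTOR + (1 - START_FACTOR) * alpha)
    # half-cosine decay from BASE_LR toward 0 at MAX_EPOCHS
    return 0.5 * BASE_LR * (1 + math.cos(math.pi * epoch_frac / MAX_EPOCHS))
```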
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+
+TrainReader:
+ batch_size: 8
+
+
+# detector configuration
+architecture: YOLOv3
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/ppyoloe_crn_l_300e_coco.pdparams
+depth_mult: 1.0
+width_mult: 1.0
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192]
+ stage_num: 1
+ block_num: 3
+ act: 'swish'
+ spp: true
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: -1 # 100
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/mot/bytetrack/detector/yolov3_darknet53_40e_608x608_mot17half.yml b/PaddleDetection-release-2.6/configs/mot/bytetrack/detector/yolov3_darknet53_40e_608x608_mot17half.yml
new file mode 100644
index 0000000000000000000000000000000000000000..9b9df0f390133da5aaa4c4802245dce8d8d10229
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/bytetrack/detector/yolov3_darknet53_40e_608x608_mot17half.yml
@@ -0,0 +1,77 @@
+# This config is an assembled config for ByteTrack MOT, used as eval/infer mode for MOT.
+_BASE_: [
+ '../../../yolov3/yolov3_darknet53_270e_coco.yml',
+ '../_base_/mot17.yml',
+]
+weights: output/yolov3_darknet53_40e_608x608_mot17half/model_final
+log_iter: 20
+snapshot_epoch: 2
+
+# schedule configuration for fine-tuning
+epoch: 40
+LearningRate:
+ base_lr: 0.0001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 32
+ - 36
+ - !LinearWarmup
+ start_factor: 0.3333333333333333
+ steps: 100
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+TrainReader:
+ batch_size: 8
+ mixup_epoch: 35
+
+# detector configuration
+architecture: YOLOv3
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/yolov3_darknet53_270e_coco.pdparams
+norm_type: sync_bn
+
+YOLOv3:
+ backbone: DarkNet
+ neck: YOLOv3FPN
+ yolo_head: YOLOv3Head
+ post_process: BBoxPostProcess
+
+DarkNet:
+ depth: 53
+ return_idx: [2, 3, 4]
+
+# use default config
+# YOLOv3FPN:
+
+YOLOv3Head:
+ anchors: [[10, 13], [16, 30], [33, 23],
+ [30, 61], [62, 45], [59, 119],
+ [116, 90], [156, 198], [373, 326]]
+ anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
+ loss: YOLOv3Loss
+
+YOLOv3Loss:
+ ignore_thresh: 0.7
+ downsample: [32, 16, 8]
+ label_smooth: false
+
+BBoxPostProcess:
+ decode:
+ name: YOLOBox
+ conf_thresh: 0.005
+ downsample_ratio: 32
+ clip_bbox: true
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.45
+ nms_top_k: 1000
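The decode/NMS settings above can be read as a standard greedy NMS over the decoded boxes. A minimal NumPy sketch of what `MultiClassNMS` does with `score_threshold`, `nms_threshold`, and `keep_top_k` (illustrative only; the real op is Paddle's `multiclass_nms`):

```python
import numpy as np

# Greedy single-class NMS sketch mirroring the MultiClassNMS config above.

def nms(boxes, scores, score_threshold=0.01, nms_threshold=0.45, keep_top_k=100):
    """boxes: (N, 4) [x1, y1, x2, y2]; returns indices of kept boxes."""
    idx = np.flatnonzero(scores >= score_threshold)  # drop low-confidence boxes
    order = idx[np.argsort(-scores[idx])]            # highest score first
    area = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    kept = []
    while order.size and len(kept) < keep_top_k:
        i = order[0]
        kept.append(int(i))
        # IoU of the top box against the remaining candidates
        x1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        y1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        x2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        y2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        iou = inter / (area[i] + area[order[1:]] - inter + 1e-9)
        order = order[1:][iou <= nms_threshold]      # suppress heavy overlaps
    return kept
```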
diff --git a/PaddleDetection-release-2.6/configs/mot/bytetrack/detector/yolox_x_24e_800x1440_ht21.yml b/PaddleDetection-release-2.6/configs/mot/bytetrack/detector/yolox_x_24e_800x1440_ht21.yml
new file mode 100644
index 0000000000000000000000000000000000000000..bd102a48d1013b9e6399411562b47e1e85e2c2ec
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/bytetrack/detector/yolox_x_24e_800x1440_ht21.yml
@@ -0,0 +1,80 @@
+# This config is an assembled config for ByteTrack MOT, used as eval/infer mode for MOT.
+_BASE_: [
+ '../../../yolox/yolox_x_300e_coco.yml',
+ '../_base_/ht21.yml',
+]
+weights: output/yolox_x_24e_800x1440_ht21/model_final
+log_iter: 20
+snapshot_epoch: 2
+
+# schedule configuration for fine-tuning
+epoch: 24
+LearningRate:
+  base_lr: 0.0005 # fine-tune
+ schedulers:
+ - !CosineDecay
+ max_epochs: 24
+ min_lr_ratio: 0.05
+ last_plateau_epochs: 4
+ - !ExpWarmup
+ epochs: 1
+
+OptimizerBuilder:
+ optimizer:
+ type: Momentum
+ momentum: 0.9
+ use_nesterov: True
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+
+TrainReader:
+ batch_size: 4
+ mosaic_epoch: 20
+
+# detector configuration
+architecture: YOLOX
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/yolox_x_300e_coco.pdparams
+norm_type: sync_bn
+use_ema: True
+ema_decay: 0.9999
+ema_decay_type: "exponential"
+act: silu
+find_unused_parameters: True
+depth_mult: 1.33
+width_mult: 1.25
+
+YOLOX:
+ backbone: CSPDarkNet
+ neck: YOLOCSPPAN
+ head: YOLOXHead
+ input_size: [800, 1440]
+ size_stride: 32
+ size_range: [18, 32] # multi-scale range [576*1024 ~ 800*1440], w/h ratio=1.8
+
+CSPDarkNet:
+ arch: "X"
+ return_idx: [2, 3, 4]
+ depthwise: False
+
+YOLOCSPPAN:
+ depthwise: False
+
+# Tracking requires higher quality boxes, so NMS score_threshold will be higher
+YOLOXHead:
+ l1_epoch: 20
+ depthwise: False
+ loss_weight: {cls: 1.0, obj: 1.0, iou: 5.0, l1: 1.0}
+ assigner:
+ name: SimOTAAssigner
+ candidate_topk: 10
+ use_vfl: False
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.7
+  # 'nms_top_k' and 'keep_top_k' are already set to 1000/100 here for speed, which costs only about 0.1% mAP.
+  # For a high-speed demo, you can set 'score_threshold' to 0.25 and 'nms_threshold' to 0.45, but mAP will drop a lot.
diff --git a/PaddleDetection-release-2.6/configs/mot/bytetrack/detector/yolox_x_24e_800x1440_mix_det.yml b/PaddleDetection-release-2.6/configs/mot/bytetrack/detector/yolox_x_24e_800x1440_mix_det.yml
new file mode 100644
index 0000000000000000000000000000000000000000..2585e5a47ac0589f7d673803a5172b42f3b902bc
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/bytetrack/detector/yolox_x_24e_800x1440_mix_det.yml
@@ -0,0 +1,80 @@
+# This config is an assembled config for ByteTrack MOT, used as eval/infer mode for MOT.
+_BASE_: [
+ '../../../yolox/yolox_x_300e_coco.yml',
+ '../_base_/mix_det.yml',
+]
+weights: output/yolox_x_24e_800x1440_mix_det/model_final
+log_iter: 20
+snapshot_epoch: 2
+
+# schedule configuration for fine-tuning
+epoch: 24
+LearningRate:
+  base_lr: 0.00075 # fine-tune
+ schedulers:
+ - !CosineDecay
+ max_epochs: 24
+ min_lr_ratio: 0.05
+ last_plateau_epochs: 4
+ - !ExpWarmup
+ epochs: 1
+
+OptimizerBuilder:
+ optimizer:
+ type: Momentum
+ momentum: 0.9
+ use_nesterov: True
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+
+TrainReader:
+ batch_size: 6
+ mosaic_epoch: 20
+
+# detector configuration
+architecture: YOLOX
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/yolox_x_300e_coco.pdparams
+norm_type: sync_bn
+use_ema: True
+ema_decay: 0.9999
+ema_decay_type: "exponential"
+act: silu
+find_unused_parameters: True
+depth_mult: 1.33
+width_mult: 1.25
+
+YOLOX:
+ backbone: CSPDarkNet
+ neck: YOLOCSPPAN
+ head: YOLOXHead
+ input_size: [800, 1440]
+ size_stride: 32
+ size_range: [18, 30] # multi-scale range [576*1024 ~ 800*1440], w/h ratio=1.8
+
+CSPDarkNet:
+ arch: "X"
+ return_idx: [2, 3, 4]
+ depthwise: False
+
+YOLOCSPPAN:
+ depthwise: False
+
+# Tracking requires higher quality boxes, so NMS score_threshold will be higher
+YOLOXHead:
+ l1_epoch: 20
+ depthwise: False
+ loss_weight: {cls: 1.0, obj: 1.0, iou: 5.0, l1: 1.0}
+ assigner:
+ name: SimOTAAssigner
+ candidate_topk: 10
+ use_vfl: False
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.7
+  # 'nms_top_k' and 'keep_top_k' are already set to 1000/100 here for speed, which costs only about 0.1% mAP.
+  # For a high-speed demo, you can set 'score_threshold' to 0.25 and 'nms_threshold' to 0.45, but mAP will drop a lot.
diff --git a/PaddleDetection-release-2.6/configs/mot/bytetrack/detector/yolox_x_24e_800x1440_mix_mot_ch.yml b/PaddleDetection-release-2.6/configs/mot/bytetrack/detector/yolox_x_24e_800x1440_mix_mot_ch.yml
new file mode 100644
index 0000000000000000000000000000000000000000..ae0fba92e76f267a89ff88702811fe4fc332a6ad
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/bytetrack/detector/yolox_x_24e_800x1440_mix_mot_ch.yml
@@ -0,0 +1,80 @@
+# This config is an assembled config for ByteTrack MOT, used as eval/infer mode for MOT.
+_BASE_: [
+ '../../../yolox/yolox_x_300e_coco.yml',
+ '../_base_/mix_mot_ch.yml',
+]
+weights: output/yolox_x_24e_800x1440_mix_mot_ch/model_final
+log_iter: 20
+snapshot_epoch: 2
+
+# schedule configuration for fine-tuning
+epoch: 24
+LearningRate:
+ base_lr: 0.00075 # fine-tune
+ schedulers:
+ - !CosineDecay
+ max_epochs: 24
+ min_lr_ratio: 0.05
+ last_plateau_epochs: 4
+ - !ExpWarmup
+ epochs: 1
+
+OptimizerBuilder:
+ optimizer:
+ type: Momentum
+ momentum: 0.9
+ use_nesterov: True
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+
+TrainReader:
+ batch_size: 6
+ mosaic_epoch: 20
+
+# detector configuration
+architecture: YOLOX
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/yolox_x_300e_coco.pdparams
+norm_type: sync_bn
+use_ema: True
+ema_decay: 0.9999
+ema_decay_type: "exponential"
+act: silu
+find_unused_parameters: True
+depth_mult: 1.33
+width_mult: 1.25
+
+YOLOX:
+ backbone: CSPDarkNet
+ neck: YOLOCSPPAN
+ head: YOLOXHead
+ input_size: [800, 1440]
+ size_stride: 32
+ size_range: [18, 30] # multi-scale range [576*1024 ~ 800*1440], w/h ratio=1.8
+
+CSPDarkNet:
+ arch: "X"
+ return_idx: [2, 3, 4]
+ depthwise: False
+
+YOLOCSPPAN:
+ depthwise: False
+
+# Tracking requires higher quality boxes, so NMS score_threshold will be higher
+YOLOXHead:
+ l1_epoch: 20
+ depthwise: False
+ loss_weight: {cls: 1.0, obj: 1.0, iou: 5.0, l1: 1.0}
+ assigner:
+ name: SimOTAAssigner
+ candidate_topk: 10
+ use_vfl: False
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.7
+  # 'nms_top_k' and 'keep_top_k' are already set to 1000/100 here for speed, which costs only about 0.1% mAP.
+  # For a high-speed demo, you can set 'score_threshold' to 0.25 and 'nms_threshold' to 0.45, but mAP will drop a lot.
diff --git a/PaddleDetection-release-2.6/configs/mot/centertrack/README.md b/PaddleDetection-release-2.6/configs/mot/centertrack/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..4015683cfa5969297febc12e7ca1264afabbc0b5
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/centertrack/README.md
@@ -0,0 +1 @@
+README_cn.md
\ No newline at end of file
diff --git a/PaddleDetection-release-2.6/configs/mot/centertrack/README_cn.md b/PaddleDetection-release-2.6/configs/mot/centertrack/README_cn.md
new file mode 100644
index 0000000000000000000000000000000000000000..a91a844402ac3ddbcad27b44938fb35438c44e49
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/centertrack/README_cn.md
@@ -0,0 +1,156 @@
+简体中文 | [English](README.md)
+
+# CenterTrack (Tracking Objects as Points)
+
+## Contents
+- [Model Zoo](#model-zoo)
+- [Getting Started](#getting-started)
+- [Citations](#citations)
+
+## Model Zoo
+
+### MOT17
+
+| Training dataset | Input size | Total batch_size | val MOTA | test MOTA | FPS | Config | Download |
+| :---------------: | :-------: | :------------: | :----------------: | :---------: | :-------: | :----: | :-----: |
+| MOT17-half train | 544x960 | 32 | 69.2(MOT17-half) | - | - |[config](./centertrack_dla34_70e_mot17half.yml) | [download](https://paddledet.bj.bcebos.com/models/mot/centertrack_dla34_70e_mot17half.pdparams) |
+| MOT17 train | 544x960 | 32 | 87.9(MOT17-train) | 70.5(MOT17-test) | - |[config](./centertrack_dla34_70e_mot17.yml) | [download](https://paddledet.bj.bcebos.com/models/mot/centertrack_dla34_70e_mot17.pdparams) |
+| MOT17 train (paper) | 544x960 | 32 | - | 67.8(MOT17-test) | - | - | - |
+
+
+**Notes:**
+  - CenterTrack is trained on 2 GPUs with a total batch_size of 32 by default. If you change the number of GPUs or the per-GPU batch_size, it is best to keep the total batch_size at 32.
+  - **val MOTA** may fluctuate by about 1.0 MOTA; it is best to train with the default setting of 2 GPUs and a total batch_size of 32.
+  - **MOT17-half train** uses the images and annotations of the **first half frames** of each of the 7 train sequences of MOT17 as the training set, while **MOT17-half val**, composed of the second half frames of each video, is used as the validation set to obtain **val MOTA**. The dataset can be downloaded from [this link](https://bj.bcebos.com/v1/paddledet/data/mot/MOT17.zip) and should be extracted into the `dataset/mot/` folder.
+  - **MOT17 train** uses all frames of each of the 7 train sequences of MOT17 as the training set. Since MOT17 data is limited, **MOT17 train** is also used for evaluation to obtain **val MOTA**, while **test MOTA** is the result returned by the [MOT Challenge website](https://motchallenge.net).
+
+
+## Getting Started
+
+### 1. Training
+Start training with the following commands:
+```bash
+# single-GPU training (not recommended)
+CUDA_VISIBLE_DEVICES=0 python tools/train.py -c configs/mot/centertrack/centertrack_dla34_70e_mot17half.yml --amp
+# multi-GPU training
+python -m paddle.distributed.launch --log_dir=centertrack_dla34_70e_mot17half/ --gpus 0,1 tools/train.py -c configs/mot/centertrack/centertrack_dla34_70e_mot17half.yml --amp
+```
+**Notes:**
+  - `--eval` does not yet support evaluating tracking MOTA during training. To enable `--eval` for detection mAP during training, you need to **comment out `mot_metric: True` and `metric: MOT` in the config file**.
+  - `--amp` enables mixed-precision training to avoid running out of GPU memory.
+  - CenterTrack is trained on 2 GPUs with a total batch_size of 32 by default. If you change the number of GPUs or the per-GPU batch_size, it is best to keep the total batch_size at 32.
+
+
+### 2. Evaluation
+
+#### 2.1 Evaluate detection
+
+Note that you first need to **comment out `mot_metric: True` and `metric: MOT` in the config file**:
+```yaml
+### for detection eval.py/infer.py
+mot_metric: False
+metric: COCO
+
+### for MOT eval_mot.py/infer_mot.py
+#mot_metric: True # uncommented by default; must be True for tracking evaluation, overriding the mot_metric: False above
+#metric: MOT # uncommented by default; must be MOT for tracking evaluation, overriding the metric: COCO above
+```
+
+Then run:
+```bash
+CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/mot/centertrack/centertrack_dla34_70e_mot17half.yml -o weights=output/centertrack_dla34_70e_mot17half/model_final.pdparams
+```
+
+**Notes:**
+  - Use `tools/eval.py` to evaluate detection and `tools/eval_mot.py` to evaluate tracking.
+
+#### 2.2 Evaluate tracking
+
+First make sure that **`mot_metric: True` and `metric: MOT` are set in the config file**;
+
+then run:
+
+```bash
+CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/centertrack/centertrack_dla34_70e_mot17half.yml -o weights=output/centertrack_dla34_70e_mot17half/model_final.pdparams
+```
+**Notes:**
+  - Use `tools/eval.py` to evaluate detection and `tools/eval_mot.py` to evaluate tracking.
+  - Tracking results are saved in `{output_dir}/mot_results/`, one txt per video sequence. Each line of a txt file is `frame,id,x1,y1,w,h,score,-1,-1,-1`. `{output_dir}` can be set via `--output_dir` and defaults to `output`.
+
+
+### 3. Inference
+
+#### 3.1 Detection inference
+Note that you first need to **comment out `mot_metric: True` and `metric: MOT` in the config file**:
+```yaml
+### for detection eval.py/infer.py
+mot_metric: False
+metric: COCO
+
+### for MOT eval_mot.py/infer_mot.py
+#mot_metric: True # uncommented by default; must be True for tracking evaluation, overriding the mot_metric: False above
+#metric: MOT # uncommented by default; must be MOT for tracking evaluation, overriding the metric: COCO above
+```
+
+Then run:
+```bash
+CUDA_VISIBLE_DEVICES=0 python tools/infer.py -c configs/mot/centertrack/centertrack_dla34_70e_mot17half.yml -o weights=output/centertrack_dla34_70e_mot17half/model_final.pdparams --infer_img=demo/000000014439_640x640.jpg --draw_threshold=0.5
+```
+
+**Notes:**
+  - Use `tools/infer.py` for detection inference and `tools/infer_mot.py` for tracking inference.
+
+
+#### 3.2 Tracking inference
+
+First make sure that **`mot_metric: True` and `metric: MOT` are set in the config file**;
+
+then run:
+```bash
+# download the demo video
+wget https://bj.bcebos.com/v1/paddledet/data/mot/demo/mot17_demo.mp4
+# run inference on the video
+CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/centertrack/centertrack_dla34_70e_mot17half.yml --video_file=mot17_demo.mp4 --draw_threshold=0.5 --save_videos -o weights=output/centertrack_dla34_70e_mot17half/model_final.pdparams
+# or run inference on an image folder
+CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/centertrack/centertrack_dla34_70e_mot17half.yml --image_dir=mot17_demo/ --draw_threshold=0.5 --save_videos -o weights=output/centertrack_dla34_70e_mot17half/model_final.pdparams
+```
+
+**Notes:**
+  - Make sure [ffmpeg](https://ffmpeg.org/ffmpeg.html) is installed first. On Linux (Ubuntu) it can be installed with: `apt-get update && apt-get install -y ffmpeg`.
+  - `--save_videos` saves the visualized video; visualized images are also saved in `{output_dir}/mot_outputs/`. `{output_dir}` can be set via `--output_dir` and defaults to `output`.
+
+
+### 4. Export the inference model
+
+First make sure that **`mot_metric: True` and `metric: MOT` are set in the config file**;
+
+```bash
+CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/centertrack/centertrack_dla34_70e_mot17half.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/centertrack_dla34_70e_mot17half.pdparams
+```
+
+### 5. Python inference with the exported model
+
+Note that you should first set `type: CenterTracker` in `deploy/python/tracker_config.yml`.
+
+```bash
+# run inference on a video
+# wget https://bj.bcebos.com/v1/paddledet/data/mot/demo/mot17_demo.mp4
+python deploy/python/mot_centertrack_infer.py --model_dir=output_inference/centertrack_dla34_70e_mot17half/ --tracker_config=deploy/python/tracker_config.yml --video_file=mot17_demo.mp4 --device=GPU --save_images=True --save_mot_txts
+# run inference on an image folder
+python deploy/python/mot_centertrack_infer.py --model_dir=output_inference/centertrack_dla34_70e_mot17half/ --tracker_config=deploy/python/tracker_config.yml --image_dir=mot17_demo/ --device=GPU --save_images=True --save_mot_txts
+```
+
+**Notes:**
+  - The tracking model runs inference on videos; single-image inference is not supported. By default it saves the visualized tracking video. Add `--save_mot_txts` (one txt per video) or `--save_mot_txt_per_img` (one txt per image) to save the tracking results as txt files, or `--save_images` to save the visualized tracking images.
+  - Each line of the tracking result txt file is `frame,id,x1,y1,w,h,score,-1,-1,-1`.
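The result lines described above are plain CSV, so they are easy to consume downstream. A small hypothetical helper (not part of PaddleDetection) that parses one such line:

```python
# Parse one MOTChallenge-style result line: frame,id,x1,y1,w,h,score,-1,-1,-1
# (invented helper for illustration; trailing -1 fields are ignored).

def parse_mot_line(line):
    frame, tid, x1, y1, w, h, score = line.strip().split(',')[:7]
    return {
        'frame': int(frame), 'id': int(tid), 'score': float(score),
        'box': (float(x1), float(y1), float(w), float(h)),
    }
```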
+
+
+## Citations
+```
+@article{zhou2020tracking,
+ title={Tracking Objects as Points},
+ author={Zhou, Xingyi and Koltun, Vladlen and Kr{\"a}henb{\"u}hl, Philipp},
+ journal={ECCV},
+ year={2020}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/mot/centertrack/_base_/centertrack_dla34.yml b/PaddleDetection-release-2.6/configs/mot/centertrack/_base_/centertrack_dla34.yml
new file mode 100644
index 0000000000000000000000000000000000000000..159165bd159ff7f5ee310b546b5a137fbf470259
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/centertrack/_base_/centertrack_dla34.yml
@@ -0,0 +1,57 @@
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/crowdhuman_centertrack.pdparams
+architecture: CenterTrack
+for_mot: True
+mot_metric: True
+
+### model
+CenterTrack:
+ detector: CenterNet
+ plugin_head: CenterTrackHead
+ tracker: CenterTracker
+
+
+### CenterTrack.detector
+CenterNet:
+ backbone: DLA
+ neck: CenterNetDLAFPN
+ head: CenterNetHead
+ post_process: CenterNetPostProcess
+ for_mot: True # Note
+
+DLA:
+ depth: 34
+ pre_img: True # Note
+ pre_hm: True # Note
+
+CenterNetDLAFPN:
+ down_ratio: 4
+ last_level: 5
+ out_channel: 0
+ dcn_v2: True
+
+CenterNetHead:
+ head_planes: 256
+ prior_bias: -4.6 # Note
+ regress_ltrb: False
+ size_loss: 'L1'
+ loss_weight: {'heatmap': 1.0, 'size': 0.1, 'offset': 1.0}
+
+CenterNetPostProcess:
+ max_per_img: 100 # top-K
+ regress_ltrb: False
+
+
+### CenterTrack.plugin_head
+CenterTrackHead:
+ head_planes: 256
+ task: tracking
+ loss_weight: {'tracking': 1.0, 'ltrb_amodal': 0.1}
+ add_ltrb_amodal: True
+
+
+### CenterTrack.tracker
+CenterTracker:
+ min_box_area: -1
+ vertical_ratio: -1
+ track_thresh: 0.4
+ pre_thresh: 0.5
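A hypothetical sketch of how the CenterTracker gates above act on a single detection (the actual logic lives in ppdet's CenterTracker; `keep_detection` is an invented helper, and in this config `min_box_area` and `vertical_ratio` are -1, i.e. disabled):

```python
# Illustrative only: confidence and box-shape gates from the CenterTracker
# config above. A value <= 0 disables the corresponding gate.

def keep_detection(score, w, h,
                   track_thresh=0.4, min_box_area=-1, vertical_ratio=-1):
    """Return True if a detection survives the tracker's gates."""
    if score < track_thresh:               # confidence gate for tracked boxes
        return False
    if min_box_area > 0 and w * h < min_box_area:
        return False                       # drop tiny boxes (disabled when -1)
    if vertical_ratio > 0 and w / h > vertical_ratio:
        return False                       # drop overly wide boxes (disabled when -1)
    return True
```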
diff --git a/PaddleDetection-release-2.6/configs/mot/centertrack/_base_/centertrack_reader.yml b/PaddleDetection-release-2.6/configs/mot/centertrack/_base_/centertrack_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..7a5bf6fda60242be1628635bd97eac4d0a85bb2b
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/centertrack/_base_/centertrack_reader.yml
@@ -0,0 +1,75 @@
+input_h: &input_h 544
+input_w: &input_w 960
+input_size: &input_size [*input_h, *input_w]
+pre_img_epoch: &pre_img_epoch 70 # Add previous image as input
+
+worker_num: 4
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - FlipWarpAffine:
+ keep_res: False
+ input_h: *input_h
+ input_w: *input_w
+ not_rand_crop: False
+ flip: 0.5
+ is_scale: True
+ use_random: True
+ add_pre_img: True
+ - CenterRandColor: {saturation: 0.4, contrast: 0.4, brightness: 0.4}
+ - Lighting: {alphastd: 0.1, eigval: [0.2141788, 0.01817699, 0.00341571], eigvec: [[-0.58752847, -0.69563484, 0.41340352], [-0.5832747, 0.00994535, -0.81221408], [-0.56089297, 0.71832671, 0.41158938]]}
+ - NormalizeImage: {mean: [0.40789655, 0.44719303, 0.47026116], std: [0.2886383 , 0.27408165, 0.27809834], is_scale: False}
+ - Permute: {}
+ - Gt2CenterTrackTarget:
+ down_ratio: 4
+ max_objs: 256
+ hm_disturb: 0.05
+ lost_disturb: 0.4
+ fp_disturb: 0.1
+ pre_hm: True
+ add_tracking: True
+ add_ltrb_amodal: True
+ batch_size: 16 # total 32 for 2 GPUs
+ shuffle: True
+ drop_last: True
+ collate_batch: True
+ use_shared_memory: True
+ pre_img_epoch: *pre_img_epoch
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - WarpAffine: {keep_res: True, input_h: *input_h, input_w: *input_w}
+ - NormalizeImage: {mean: [0.40789655, 0.44719303, 0.47026116], std: [0.2886383 , 0.27408165, 0.27809834], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - WarpAffine: {keep_res: True, input_h: *input_h, input_w: *input_w}
+ - NormalizeImage: {mean: [0.40789655, 0.44719303, 0.47026116], std: [0.2886383 , 0.27408165, 0.27809834], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+ fuse_normalize: True
+
+
+EvalMOTReader:
+ sample_transforms:
+ - Decode: {}
+ - WarpAffine: {keep_res: False, input_h: *input_h, input_w: *input_w}
+ - NormalizeImage: {mean: [0.40789655, 0.44719303, 0.47026116], std: [0.2886383 , 0.27408165, 0.27809834], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+
+TestMOTReader:
+ sample_transforms:
+ - Decode: {}
+ - WarpAffine: {keep_res: False, input_h: *input_h, input_w: *input_w}
+ - NormalizeImage: {mean: [0.40789655, 0.44719303, 0.47026116], std: [0.2886383 , 0.27408165, 0.27809834], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+ fuse_normalize: True
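The eval/test pipelines above all end with `NormalizeImage` (`is_scale: True`) followed by `Permute`. A minimal sketch of those two steps, assuming an HWC uint8 input (the real transforms are in `ppdet/data/transform`; this is not their implementation):

```python
import numpy as np

# Sketch of NormalizeImage (is_scale: True) + Permute from the reader config.

MEAN = np.array([0.40789655, 0.44719303, 0.47026116], dtype=np.float32)
STD = np.array([0.2886383, 0.27408165, 0.27809834], dtype=np.float32)

def normalize_permute(img_hwc_uint8):
    """HWC uint8 image -> normalized CHW float32 tensor."""
    img = img_hwc_uint8.astype(np.float32) / 255.0  # is_scale: True divides by 255
    img = (img - MEAN) / STD                        # per-channel normalization
    return img.transpose(2, 0, 1)                   # Permute: HWC -> CHW
```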
diff --git a/PaddleDetection-release-2.6/configs/mot/centertrack/_base_/optimizer_70e.yml b/PaddleDetection-release-2.6/configs/mot/centertrack/_base_/optimizer_70e.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a336290f2cecb9597b8c5fe351f132eef3235e4c
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/centertrack/_base_/optimizer_70e.yml
@@ -0,0 +1,14 @@
+epoch: 70
+
+LearningRate:
+ base_lr: 0.000125
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [60]
+ use_warmup: False
+
+OptimizerBuilder:
+ optimizer:
+ type: Adam
+ regularizer: NULL
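The `PiecewiseDecay` schedule above keeps the LR at `base_lr` and multiplies it by `gamma` at each milestone, with warmup disabled. A one-function sketch (per-epoch view, illustrative only, not Paddle's scheduler):

```python
# Step schedule from the config above: LR drops by 10x at epoch 60.

BASE_LR = 0.000125
GAMMA = 0.1
MILESTONES = [60]

def piecewise_lr(epoch):
    lr = BASE_LR
    for m in MILESTONES:
        if epoch >= m:
            lr *= GAMMA   # multiply by gamma once each milestone is passed
    return lr
```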
diff --git a/PaddleDetection-release-2.6/configs/mot/centertrack/centertrack_dla34_70e_mot17.yml b/PaddleDetection-release-2.6/configs/mot/centertrack/centertrack_dla34_70e_mot17.yml
new file mode 100644
index 0000000000000000000000000000000000000000..2888a01747a078af34a92dfae014358f61bc668d
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/centertrack/centertrack_dla34_70e_mot17.yml
@@ -0,0 +1,66 @@
+_BASE_: [
+ '_base_/optimizer_70e.yml',
+ '_base_/centertrack_dla34.yml',
+ '_base_/centertrack_reader.yml',
+ '../../runtime.yml',
+]
+log_iter: 20
+snapshot_epoch: 5
+weights: output/centertrack_dla34_70e_mot17/model_final
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/crowdhuman_centertrack.pdparams
+
+
+### for Detection eval.py/infer.py
+# mot_metric: False
+# metric: COCO
+
+### for MOT eval_mot.py/infer_mot.py
+mot_metric: True
+metric: MOT
+
+
+worker_num: 4
+TrainReader:
+ batch_size: 16 # total 32 for 2 GPUs
+
+EvalReader:
+ batch_size: 1
+
+EvalMOTReader:
+ batch_size: 1
+
+
+# COCO style dataset for training
+num_classes: 1
+TrainDataset:
+ !COCODataSet
+ dataset_dir: dataset/mot/MOT17
+ anno_path: annotations/train.json
+ image_dir: images/train
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd', 'gt_track_id']
+ # add 'gt_track_id', the boxes annotations of json file should have 'gt_track_id'
+
+EvalDataset:
+ !COCODataSet
+ dataset_dir: dataset/mot/MOT17
+ anno_path: annotations/val_half.json
+ image_dir: images/train
+
+TestDataset:
+ !ImageFolder
+ dataset_dir: dataset/mot/MOT17
+ anno_path: annotations/val_half.json
+
+# for MOT evaluation
+# If you want to change the MOT evaluation dataset, please modify 'data_root'
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot/MOT17
+ data_root: images/train # set 'images/test' for MOTChallenge test
+ keep_ori_im: True # set True if save visualization images or video, or used in SDE MOT
+
+# for MOT video inference
+TestMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot/MOT17
+ keep_ori_im: True # set True if save visualization images or video
diff --git a/PaddleDetection-release-2.6/configs/mot/centertrack/centertrack_dla34_70e_mot17half.yml b/PaddleDetection-release-2.6/configs/mot/centertrack/centertrack_dla34_70e_mot17half.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a15dfdc70d7b073659796ea39e92485d56ccd654
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/centertrack/centertrack_dla34_70e_mot17half.yml
@@ -0,0 +1,66 @@
+_BASE_: [
+ '_base_/optimizer_70e.yml',
+ '_base_/centertrack_dla34.yml',
+ '_base_/centertrack_reader.yml',
+ '../../runtime.yml',
+]
+log_iter: 20
+snapshot_epoch: 5
+weights: output/centertrack_dla34_70e_mot17half/model_final
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/crowdhuman_centertrack.pdparams
+
+
+### for Detection eval.py/infer.py
+# mot_metric: False
+# metric: COCO
+
+### for MOT eval_mot.py/infer_mot.py
+mot_metric: True
+metric: MOT
+
+
+worker_num: 4
+TrainReader:
+ batch_size: 16 # total 32 for 2 GPUs
+
+EvalReader:
+ batch_size: 1
+
+EvalMOTReader:
+ batch_size: 1
+
+
+# COCO style dataset for training
+num_classes: 1
+TrainDataset:
+ !COCODataSet
+ dataset_dir: dataset/mot/MOT17
+ anno_path: annotations/train_half.json
+ image_dir: images/train
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd', 'gt_track_id']
+  # 'gt_track_id' is added here; the box annotations in the json file must contain 'gt_track_id'
+
+EvalDataset:
+ !COCODataSet
+ dataset_dir: dataset/mot/MOT17
+ anno_path: annotations/val_half.json
+ image_dir: images/train
+
+TestDataset:
+ !ImageFolder
+ dataset_dir: dataset/mot/MOT17
+ anno_path: annotations/val_half.json
+
+# for MOT evaluation
+# If you want to change the MOT evaluation dataset, please modify 'data_root'
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot/MOT17
+ data_root: images/half
+  keep_ori_im: True # set True to save visualization images or video, or when used in SDE MOT
+
+# for MOT video inference
+TestMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot/MOT17
+  keep_ori_im: True # set True to save visualization images or video
diff --git a/PaddleDetection-release-2.6/configs/mot/deepsort/README.md b/PaddleDetection-release-2.6/configs/mot/deepsort/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..4015683cfa5969297febc12e7ca1264afabbc0b5
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/deepsort/README.md
@@ -0,0 +1 @@
+README_cn.md
\ No newline at end of file
diff --git a/PaddleDetection-release-2.6/configs/mot/deepsort/README_cn.md b/PaddleDetection-release-2.6/configs/mot/deepsort/README_cn.md
new file mode 100644
index 0000000000000000000000000000000000000000..08bee2e1e4c173c426c608562a4bcd4334bcc5e7
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/deepsort/README_cn.md
@@ -0,0 +1,232 @@
+Simplified Chinese | [English](README.md)
+
+# DeepSORT (Deep Cosine Metric Learning for Person Re-identification)
+
+## Contents
+- [Introduction](#introduction)
+- [Model Zoo](#model-zoo)
+- [Getting Started](#getting-started)
+- [Adapting Other Detectors](#adapting-other-detectors)
+- [Citations](#citations)
+
+## Introduction
+[DeepSORT](https://arxiv.org/abs/1703.07402) (Simple Online and Realtime Tracking with a Deep Association Metric) extends the original [SORT](https://arxiv.org/abs/1602.00763) (Simple Online and Realtime Tracking) algorithm with a CNN that extracts appearance features from the person crops produced by the detector. By integrating this deep appearance descriptor, detected targets are matched and assigned to existing tracks, which is essentially a ReID (re-identification) task. The detection boxes required by DeepSORT can be generated by any detector; tracking is then performed by loading the saved detection results together with the video frames. For the ReID model, the `PCB+Pyramid ResNet101` and `PPLCNet` models provided by [PaddleClas](https://github.com/PaddlePaddle/PaddleClas) are used here.
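The appearance association described above reduces to computing cosine distances between the ReID embeddings of new detections and existing tracks, then matching pairs under a distance threshold. A minimal NumPy sketch (illustrative only, not the PaddleDetection implementation; the 0.2 threshold mirrors `matching_threshold` in the tracker configs, and a greedy assignment stands in for the Hungarian matching used by DeepSORT):

```python
import numpy as np

def cosine_distance_matrix(track_embs, det_embs):
    """Pairwise cosine distance between (non-zero) embedding vectors."""
    track_embs = track_embs / np.linalg.norm(track_embs, axis=1, keepdims=True)
    det_embs = det_embs / np.linalg.norm(det_embs, axis=1, keepdims=True)
    return 1.0 - track_embs @ det_embs.T  # shape: [num_tracks, num_dets]

def greedy_match(cost, matching_threshold=0.2):
    """Greedily assign detections to tracks below the distance threshold."""
    matches = []
    cost = cost.copy()
    while cost.size and cost.min() < matching_threshold:
        t, d = np.unravel_index(cost.argmin(), cost.shape)
        matches.append((int(t), int(d)))
        cost[t, :] = np.inf  # each track and detection is used at most once
        cost[:, d] = np.inf
    return matches
```

In the real tracker, this appearance cost is additionally gated by Kalman-filter motion consistency before assignment.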
+
+## Model Zoo
+
+### DeepSORT results on the MOT-16 Training Set
+
+| Backbone | Input Size | MOTA | IDF1 | IDS | FP | FN | FPS | Detection Results or Model | ReID Model | Config |
+| :---------| :------- | :----: | :----: | :--: | :----: | :---: | :---: | :-----:| :-----: | :-----: |
+| ResNet-101 | 1088x608 | 72.2 | 60.5 | 998 | 8054 | 21644 | - | [Detection Results](https://bj.bcebos.com/v1/paddledet/data/mot/det_results_dir.zip) |[ReID Model](https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_pcb_pyramid_r101.pdparams)|[Config](./reid/deepsort_pcb_pyramid_r101.yml) |
+| ResNet-101 | 1088x608 | 68.3 | 56.5 | 1722 | 17337 | 15890 | - | [Detection Model](https://paddledet.bj.bcebos.com/models/mot/deepsort/jde_yolov3_darknet53_30e_1088x608_mix.pdparams) |[ReID Model](https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_pcb_pyramid_r101.pdparams)|[Config](./deepsort_jde_yolov3_pcb_pyramid.yml) |
+| PPLCNet | 1088x608 | 72.2 | 59.5 | 1087 | 8034 | 21481 | - | [Detection Results](https://bj.bcebos.com/v1/paddledet/data/mot/det_results_dir.zip) |[ReID Model](https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_pplcnet.pdparams)|[Config](./reid/deepsort_pplcnet.yml) |
+| PPLCNet | 1088x608 | 68.1 | 53.6 | 1979 | 17446 | 15766 | - | [Detection Model](https://paddledet.bj.bcebos.com/models/mot/deepsort/jde_yolov3_darknet53_30e_1088x608_mix.pdparams) |[ReID Model](https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_pplcnet.pdparams)|[Config](./deepsort_jde_yolov3_pplcnet.yml) |
+
+### DeepSORT results on the MOT-16 Test Set
+
+| Backbone | Input Size | MOTA | IDF1 | IDS | FP | FN | FPS | Detection Results or Model | ReID Model | Config |
+| :---------| :------- | :----: | :----: | :--: | :----: | :---: | :---: | :-----: | :-----: |:-----: |
+| ResNet-101 | 1088x608 | 64.1 | 53.0 | 1024 | 12457 | 51919 | - | [Detection Results](https://bj.bcebos.com/v1/paddledet/data/mot/det_results_dir.zip) | [ReID Model](https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_pcb_pyramid_r101.pdparams)|[Config](./reid/deepsort_pcb_pyramid_r101.yml) |
+| ResNet-101 | 1088x608 | 61.2 | 48.5 | 1799 | 25796 | 43232 | - | [Detection Model](https://paddledet.bj.bcebos.com/models/mot/deepsort/jde_yolov3_darknet53_30e_1088x608_mix.pdparams) |[ReID Model](https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_pcb_pyramid_r101.pdparams)|[Config](./deepsort_jde_yolov3_pcb_pyramid.yml) |
+| PPLCNet | 1088x608 | 64.0 | 51.3 | 1208 | 12697 | 51784 | - | [Detection Results](https://bj.bcebos.com/v1/paddledet/data/mot/det_results_dir.zip) |[ReID Model](https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_pplcnet.pdparams)|[Config](./reid/deepsort_pplcnet.yml) |
+| PPLCNet | 1088x608 | 61.1 | 48.8 | 2010 | 25401 | 43432 | - | [Detection Model](https://paddledet.bj.bcebos.com/models/mot/deepsort/jde_yolov3_darknet53_30e_1088x608_mix.pdparams) |[ReID Model](https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_pplcnet.pdparams)|[Config](./deepsort_jde_yolov3_pplcnet.yml) |
+
+
+### DeepSORT results on the MOT-17 half Val Set
+
+| Detector Training Dataset | Detector | ReID | Detection mAP | MOTA | IDF1 | FPS | Config |
+| :-------- | :----- | :----: |:------: | :----: |:-----: |:----:|:----: |
+| MIX | JDE YOLOv3 | PCB Pyramid | - | 66.9 | 62.7 | - |[Config](./deepsort_jde_yolov3_pcb_pyramid.yml) |
+| MIX | JDE YOLOv3 | PPLCNet | - | 66.3 | 62.1 | - |[Config](./deepsort_jde_yolov3_pplcnet.yml) |
+| MOT-17 half train | YOLOv3 | PPLCNet | 42.7 | 50.2 | 52.4 | - |[Config](./deepsort_yolov3_pplcnet.yml) |
+| MOT-17 half train | PPYOLOv2 | PPLCNet | 46.8 | 51.8 | 55.8 | - |[Config](./deepsort_ppyolov2_pplcnet.yml) |
+| MOT-17 half train | PPYOLOE | PPLCNet | 52.7 | 56.7 | 60.5 | - |[Config](./deepsort_ppyoloe_pplcnet.yml) |
+| MOT-17 half train | PPYOLOE | ResNet-50 | 52.7 | 56.7 | 64.6 | - |[Config](./deepsort_ppyoloe_resnet.yml) |
+
+**Notes:**
+The download links of the model weights are in the ```det_weights``` and ```reid_weights``` fields of the config files and are downloaded automatically when the evaluation command is run.
+DeepSORT decouples the detector from the ReID model: the detector is trained separately on the MOT dataset, and the assembled DeepSORT model is used only for evaluation. Two evaluation modes are supported.
+- **Mode 1**: load a detection-results file and a ReID model. Before evaluating with DeepSORT, first obtain detection results with a detection model, then arrange the result files like this:
+```
+det_results_dir
+ |——————MOT16-02.txt
+ |——————MOT16-04.txt
+ |——————MOT16-05.txt
+ |——————MOT16-09.txt
+ |——————MOT16-10.txt
+ |——————MOT16-11.txt
+ |——————MOT16-13.txt
+```
+For the MOT16 dataset, you can download det_results_dir.zip, a set of matched detection boxes provided by PaddleDetection, and unzip it:
+```
+wget https://bj.bcebos.com/v1/paddledet/data/mot/det_results_dir.zip
+```
+Better results can be obtained with a stronger detection model. Each txt holds the detection results of all frames of one video; each line describes one bounding box in the following format:
+```
+[frame_id],[x0],[y0],[w],[h],[score],[class_id]
+```
+- `frame_id` is the index of the frame
+- `x0,y0` are the x and y coordinates of the top-left corner of the box
+- `w,h` are the width and height of the box in pixels
+- `score` is the confidence score of the box
+- `class_id` is the class of the box, `0` if there is only one class
+
+- **Mode 2**: load a detection model and a ReID model at the same time. The JDE version of YOLOv3 is used here; see `configs/mot/deepsort/deepsort_jde_yolov3_pcb_pyramid.yml` for details. To load other general detection models, modify the config following `configs/mot/deepsort/deepsort_ppyoloe_pplcnet.yml`.
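For Mode 1, a results file in the per-line format above can be grouped by frame before being fed to the tracker. A minimal sketch (the helper name `load_det_results` is illustrative, not a PaddleDetection API):

```python
from collections import defaultdict

def load_det_results(txt_path):
    """Group detection boxes by frame_id from a comma-separated results file.

    Each line: frame_id,x0,y0,w,h,score,class_id
    """
    frames = defaultdict(list)
    with open(txt_path) as f:
        for line in f:
            fields = line.strip().split(',')
            if len(fields) < 7:
                continue  # skip malformed or empty lines
            frame_id = int(float(fields[0]))
            x0, y0, w, h, score = map(float, fields[1:6])
            class_id = int(float(fields[6]))
            frames[frame_id].append((x0, y0, w, h, score, class_id))
    return frames
```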
+
+## Getting Started
+
+### 1. Evaluation
+
+#### 1.1 Evaluate detection
+```bash
+CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/mot/deepsort/detector/ppyoloe_crn_l_36e_640x640_mot17half.yml -o weights=https://bj.bcebos.com/v1/paddledet/models/mot/ppyoloe_crn_l_36e_640x640_mot17half.pdparams
+```
+
+**Notes:**
+ - Use ```tools/eval.py``` to evaluate detection and ```tools/eval_mot.py``` to evaluate tracking.
+
+#### 1.2 Evaluate tracking
+**Mode 1**: load a detection-results file and a ReID model to get tracking results
+```bash
+# Download the MOT16 detection-results file provided by PaddleDetection and unzip it; to generate results with another detector, follow the format used in this file
+wget https://bj.bcebos.com/v1/paddledet/data/mot/det_results_dir.zip
+
+CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/deepsort/reid/deepsort_pcb_pyramid_r101.yml --det_results_dir det_results_dir
+# or
+CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/deepsort/reid/deepsort_pplcnet.yml --det_results_dir det_results_dir
+```
+
+**Mode 2**: load a pedestrian detection model and a ReID model to get tracking results
+```bash
+CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/deepsort/deepsort_jde_yolov3_pcb_pyramid.yml
+# or
+CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/deepsort/deepsort_jde_yolov3_pplcnet.yml
+# or
+CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/deepsort/deepsort_ppyolov2_pplcnet.yml --scaled=True
+# or
+CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/deepsort/deepsort_ppyoloe_resnet.yml --scaled=True
+```
+**Notes:**
+ - The JDE YOLOv3 pedestrian detection model is trained on the same MOT datasets as JDE and FairMOT, hence its higher MOTA, while general detection models such as PPYOLOv2 are trained only on the MOT17 half dataset.
+ - The biggest difference between the JDE YOLOv3 model and general detection models such as YOLOv3 and PPYOLOv2 is the JDEBBoxPostProcess post-processing: its output coordinates are not scaled back to the original image, while general detection models output coordinates that are scaled back to the original image.
+ - `--scaled` indicates whether the coordinates in the model output are already scaled back to the original image: set it to False for the JDE YOLOv3 detection model and to True for general detection models; the default is False.
+ - Tracking results are saved in `{output_dir}/mot_results/`, one txt per video sequence; each line of a txt is `frame,id,x1,y1,w,h,score,-1,-1,-1`. `{output_dir}` can be set with `--output_dir`.
+
+### 2. Inference
+
+Use the following command to run inference on a video with a single GPU and save the result as a video
+
+```bash
+# Download the demo video
+wget https://bj.bcebos.com/v1/paddledet/data/mot/demo/mot17_demo.mp4
+
+# Load the JDE YOLOv3 pedestrian detection model and the PCB Pyramid ReID model, and save the result as a video
+CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/deepsort/deepsort_jde_yolov3_pcb_pyramid.yml --video_file=mot17_demo.mp4 --save_videos
+
+# Or load the PPYOLOE pedestrian detection model and the PPLCNet ReID model, and save the result as a video
+CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/deepsort/deepsort_ppyoloe_pplcnet.yml --video_file=mot17_demo.mp4 --scaled=True --save_videos
+```
+
+**Notes:**
+ - Make sure [ffmpeg](https://ffmpeg.org/ffmpeg.html) is installed. On Linux (Ubuntu) it can be installed directly with: `apt-get update && apt-get install -y ffmpeg`.
+ - `--scaled` indicates whether the coordinates in the model output are already scaled back to the original image: set it to False for the JDE YOLOv3 detection model and to True for general detection models.
+
+
+### 3. Export models for deployment
+
+Step 1: export the detection model
+```bash
+# Export the JDE YOLOv3 pedestrian detection model
+CUDA_VISIBLE_DEVICES=0 python3.7 tools/export_model.py -c configs/mot/deepsort/detector/jde_yolov3_darknet53_30e_1088x608_mix.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/deepsort/jde_yolov3_darknet53_30e_1088x608_mix.pdparams
+
+# Or export the PPYOLOE pedestrian detection model
+CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/deepsort/detector/ppyoloe_crn_l_36e_640x640_mot17half.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/deepsort/ppyoloe_crn_l_36e_640x640_mot17half.pdparams
+```
+
+Step 2: export the ReID model
+```bash
+# Export the PCB Pyramid ReID model
+CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/deepsort/reid/deepsort_pcb_pyramid_r101.yml -o reid_weights=https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_pcb_pyramid_r101.pdparams
+
+# Or export the PPLCNet ReID model
+CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/deepsort/reid/deepsort_pplcnet.yml -o reid_weights=https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_pplcnet.pdparams
+
+# Or export the ResNet ReID model
+CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/deepsort/reid/deepsort_resnet.yml -o reid_weights=https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_resnet.pdparams
+```
+
+### 4. Inference with the exported models in Python
+
+```bash
+# Use the exported PPYOLOE pedestrian detection model and the PPLCNet ReID model
+python3.7 deploy/pptracking/python/mot_sde_infer.py --model_dir=output_inference/ppyoloe_crn_l_36e_640x640_mot17half/ --reid_model_dir=output_inference/deepsort_pplcnet/ --tracker_config=deploy/pptracking/python/tracker_config.yml --video_file=mot17_demo.mp4 --device=GPU --save_mot_txts --threshold=0.5
+```
+**Notes:**
+ - Before running, change the tracker in `deploy/pptracking/python/tracker_config.yml` to `DeepSORTTracker`.
+ - The tracking model runs on videos; single-image inference is not supported. By default the visualized tracking result is saved as a video. Add `--save_mot_txts` to save one txt per video, or `--save_images` to save the visualized result images.
+ - Each line of a tracking-result txt is `frame,id,x1,y1,w,h,score,-1,-1,-1`.
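A line in this MOTChallenge-style result format can be unpacked with a tiny helper (an illustrative sketch; `parse_mot_result_line` is not a PaddleDetection API):

```python
def parse_mot_result_line(line):
    """Parse one tracking-result line: frame,id,x1,y1,w,h,score,-1,-1,-1."""
    fields = line.strip().split(',')
    frame_id, track_id = int(fields[0]), int(fields[1])
    x1, y1, w, h, score = map(float, fields[2:7])  # box is (x1, y1, w, h)
    return frame_id, track_id, (x1, y1, w, h), score
```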
+
+
+## Adapting Other Detectors
+
+### 1. Config directory layout
+- `detector/xxx.yml` is a pure detection config, e.g. `detector/ppyolov2_r50vd_dcn_365e_640x640_mot17half.yml`, and supports the full detection workflow (train/eval/infer/export/deploy). DeepSORT tracking eval/infer does not use this pure-detection yml, but export does: the detection model is exported separately with it. DeepSORT export is split into a detector part and a reid part, and users can define and assemble their own detector+reid into a complete DeepSORT tracking system.
+- For the detector configs under `detector/`, users need to convert their dataset to COCO format. Since the ground-truth IDs are not used, any detection model can be configured here, as long as its output contains the class, coordinates and score of each box.
+- `reid/deepsort_yyy.yml` files are the ReID model and tracker configs, e.g. `reid/deepsort_pplcnet.yml`. The ReID models here, `deepsort_pcb_pyramid_r101.yml` and `deepsort_pplcnet.yml`, are provided by [PaddleClas](https://github.com/PaddlePaddle/PaddleClas) and were trained on the Market1501 (751 identities) person ReID dataset; the training details are to be published by PaddleClas.
+- `deepsort_xxx_yyy.yml` is a complete DeepSORT tracking config, e.g. `deepsort_ppyolov2_pplcnet.yml`, where the detector part `xxx` comes from `detector/` and the reid and tracker part `yyy` comes from `reid/`.
+- DeepSORT tracking eval/infer supports two modes: Mode 1 uses only `reid/deepsort_yyy.yml` to load a detection-results file and the `yyy` ReID model; Mode 2 uses `deepsort_xxx_yyy.yml` to load the `xxx` detection model and the `yyy` ReID model. DeepSORT deployment, however, must use `deepsort_xxx_yyy.yml`.
+- Detector eval/infer/deploy uses only `detector/xxx.yml`. The ReID model is generally not used alone; if it must be, load a detection-results file first and use only `reid/deepsort_yyy.yml`.
+
+
+### 2. Adaptation steps
+1. Convert your dataset to COCO format and train a general detection model, following the model configs in the `detector/` folder, to produce `detector/xxx.yml`. Faster R-CNN, YOLOv3, PPYOLOv2, JDE YOLOv3, PicoDet and other models are already supported.
+
+2. Create `deepsort_xxx_yyy.yml`, where the `DeepSORT.detector` config comes from `detector/xxx.yml`, and `EvalMOTDataset` and `det_weights` can be set as needed. `yyy` refers to `reid/deepsort_yyy.yml`, e.g. `reid/deepsort_pplcnet.yml`.
+
+### 3. Usage steps
+#### 1. Evaluate with the detection model and the ReID model:
+```
+CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/deepsort/deepsort_xxx_yyy.yml --scaled=True
+```
+#### 2. Run inference with the detection model and the ReID model:
+```
+CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/deepsort/deepsort_xxx_yyy.yml --video_file=mot17_demo.mp4 --scaled=True --save_videos
+```
+#### 3. Export the detection model and the ReID model:
+```bash
+# Export the detection model
+CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/deepsort/detector/xxx.yml
+# Export the ReID model
+CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/deepsort/reid/deepsort_yyy.yml
+```
+#### 4. Deploy with the exported detection and ReID models:
+```
+python deploy/pptracking/python/mot_sde_infer.py --model_dir=output_inference/xxx/ --reid_model_dir=output_inference/deepsort_yyy/ --video_file=mot17_demo.mp4 --device=GPU --scaled=True --save_mot_txts
+```
+**Notes:**
+ - `--scaled` indicates whether the coordinates in the model output are already scaled back to the original image: set it to False for the JDE YOLOv3 detection model and to True for general detection models.
+
+
+## Citations
+```
+@inproceedings{Wojke2017simple,
+ title={Simple Online and Realtime Tracking with a Deep Association Metric},
+ author={Wojke, Nicolai and Bewley, Alex and Paulus, Dietrich},
+ booktitle={2017 IEEE International Conference on Image Processing (ICIP)},
+ year={2017},
+ pages={3645--3649},
+ organization={IEEE},
+ doi={10.1109/ICIP.2017.8296962}
+}
+
+@inproceedings{Wojke2018deep,
+ title={Deep Cosine Metric Learning for Person Re-identification},
+ author={Wojke, Nicolai and Bewley, Alex},
+ booktitle={2018 IEEE Winter Conference on Applications of Computer Vision (WACV)},
+ year={2018},
+ pages={748--756},
+ organization={IEEE},
+ doi={10.1109/WACV.2018.00087}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/mot/deepsort/_base_/deepsort_reader_1088x608.yml b/PaddleDetection-release-2.6/configs/mot/deepsort/_base_/deepsort_reader_1088x608.yml
new file mode 100644
index 0000000000000000000000000000000000000000..6ab950aa94e0ea203ea6184d7e3910164ef85993
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/deepsort/_base_/deepsort_reader_1088x608.yml
@@ -0,0 +1,22 @@
+# DeepSORT itself does not need to be trained on the MOT dataset; it is only used for evaluation.
+# The detector (e.g. YOLOv3) is trained separately on the MOT dataset using only the bboxes,
+# and the gt IDs do not need to be trained.
+
+EvalMOTReader:
+ sample_transforms:
+ - Decode: {}
+ - LetterBoxResize: {target_size: [608, 1088]}
+ - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+
+TestMOTReader:
+ inputs_def:
+ image_shape: [3, 608, 1088]
+ sample_transforms:
+ - Decode: {}
+ - LetterBoxResize: {target_size: [608, 1088]}
+ - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1], is_scale: True}
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/mot/deepsort/_base_/mot17.yml b/PaddleDetection-release-2.6/configs/mot/deepsort/_base_/mot17.yml
new file mode 100644
index 0000000000000000000000000000000000000000..faf47f622d1c2847a9686dfa8d7e48a49c05436c
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/deepsort/_base_/mot17.yml
@@ -0,0 +1,34 @@
+metric: COCO
+num_classes: 1
+
+# Detection Dataset for training
+TrainDataset:
+ !COCODataSet
+ dataset_dir: dataset/mot/MOT17
+ anno_path: annotations/train_half.json
+ image_dir: images/train
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ dataset_dir: dataset/mot/MOT17
+ anno_path: annotations/val_half.json
+ image_dir: images/train
+
+TestDataset:
+ !ImageFolder
+ dataset_dir: dataset/mot/MOT17
+ anno_path: annotations/val_half.json
+
+
+# MOTDataset for MOT evaluation and inference
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: MOT17/images/half
+ keep_ori_im: True # set as True in DeepSORT and ByteTrack
+
+TestMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+  keep_ori_im: True # set True to save visualization images or video
diff --git a/PaddleDetection-release-2.6/configs/mot/deepsort/deepsort_jde_yolov3_pcb_pyramid.yml b/PaddleDetection-release-2.6/configs/mot/deepsort/deepsort_jde_yolov3_pcb_pyramid.yml
new file mode 100644
index 0000000000000000000000000000000000000000..066e0ec08d1d99c6764f6d8ff3768e57f2a01563
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/deepsort/deepsort_jde_yolov3_pcb_pyramid.yml
@@ -0,0 +1,71 @@
+_BASE_: [
+ 'detector/jde_yolov3_darknet53_30e_1088x608_mix.yml',
+ '_base_/mot17.yml',
+ '_base_/deepsort_reader_1088x608.yml',
+]
+metric: MOT
+num_classes: 1
+
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: MOT16/images/train
+ keep_ori_im: True # set as True in DeepSORT
+
+det_weights: https://paddledet.bj.bcebos.com/models/mot/deepsort/jde_yolov3_darknet53_30e_1088x608_mix.pdparams
+reid_weights: https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_pcb_pyramid_r101.pdparams
+
+
+# DeepSORT configuration
+architecture: DeepSORT
+pretrain_weights: None
+
+DeepSORT:
+ detector: YOLOv3 # JDE version YOLOv3
+ reid: PCBPyramid
+ tracker: DeepSORTTracker
+
+
+# reid and tracker configuration
+# see 'configs/mot/deepsort/reid/deepsort_pcb_pyramid_r101.yml'
+PCBPyramid:
+ model_name: "ResNet101"
+ num_conv_out_channels: 128
+ num_classes: 751
+
+DeepSORTTracker:
+ input_size: [64, 192]
+ min_box_area: 0
+ vertical_ratio: -1
+ budget: 100
+ max_age: 70
+ n_init: 3
+ metric_type: cosine
+ matching_threshold: 0.2
+ max_iou_distance: 0.9
+ motion: KalmanFilter
+
+
+# detector configuration: JDE version YOLOv3
+# see 'configs/mot/deepsort/detector/jde_yolov3_darknet53_30e_1088x608_mix.yml'
+# The most obvious difference from the general YOLOv3 is JDEBBoxPostProcess: the output bbox coordinates are not scaled back to the original image.
+YOLOv3:
+ backbone: DarkNet
+ neck: YOLOv3FPN
+ yolo_head: YOLOv3Head
+ post_process: JDEBBoxPostProcess
+
+# Tracking requires higher quality boxes, so decode.conf_thresh will be higher
+JDEBBoxPostProcess:
+ decode:
+ name: JDEBox
+ conf_thresh: 0.3
+ downsample_ratio: 32
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 500
+ score_threshold: 0.01
+ nms_threshold: 0.5
+ nms_top_k: 2000
+ normalized: true
+ return_idx: false
diff --git a/PaddleDetection-release-2.6/configs/mot/deepsort/deepsort_jde_yolov3_pplcnet.yml b/PaddleDetection-release-2.6/configs/mot/deepsort/deepsort_jde_yolov3_pplcnet.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a412c9e5afb38e44715f0c7fa6c95b679fe2aa33
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/deepsort/deepsort_jde_yolov3_pplcnet.yml
@@ -0,0 +1,70 @@
+_BASE_: [
+ 'detector/jde_yolov3_darknet53_30e_1088x608_mix.yml',
+ '_base_/mot17.yml',
+ '_base_/deepsort_reader_1088x608.yml',
+]
+metric: MOT
+num_classes: 1
+
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: MOT16/images/train
+ keep_ori_im: True # set as True in DeepSORT
+
+det_weights: https://paddledet.bj.bcebos.com/models/mot/deepsort/jde_yolov3_darknet53_30e_1088x608_mix.pdparams
+reid_weights: https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_pplcnet.pdparams
+
+
+# DeepSORT configuration
+architecture: DeepSORT
+pretrain_weights: None
+
+DeepSORT:
+ detector: YOLOv3 # JDE version YOLOv3
+ reid: PPLCNetEmbedding
+ tracker: DeepSORTTracker
+
+
+# reid and tracker configuration
+# see 'configs/mot/deepsort/reid/deepsort_pplcnet.yml'
+PPLCNetEmbedding:
+ input_ch: 1280
+ output_ch: 512
+
+DeepSORTTracker:
+ input_size: [64, 192]
+ min_box_area: 0
+ vertical_ratio: -1
+ budget: 100
+ max_age: 70
+ n_init: 3
+ metric_type: cosine
+ matching_threshold: 0.2
+ max_iou_distance: 0.9
+ motion: KalmanFilter
+
+
+# detector configuration: JDE version YOLOv3
+# see 'configs/mot/deepsort/detector/jde_yolov3_darknet53_30e_1088x608_mix.yml'
+# The most obvious difference from the general YOLOv3 is JDEBBoxPostProcess: the output bbox coordinates are not scaled back to the original image.
+YOLOv3:
+ backbone: DarkNet
+ neck: YOLOv3FPN
+ yolo_head: YOLOv3Head
+ post_process: JDEBBoxPostProcess
+
+# Tracking requires higher quality boxes, so decode.conf_thresh will be higher
+JDEBBoxPostProcess:
+ decode:
+ name: JDEBox
+ conf_thresh: 0.3
+ downsample_ratio: 32
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 500
+ score_threshold: 0.01
+ nms_threshold: 0.5
+ nms_top_k: 2000
+ normalized: true
+ return_idx: false
diff --git a/PaddleDetection-release-2.6/configs/mot/deepsort/deepsort_ppyoloe_pplcnet.yml b/PaddleDetection-release-2.6/configs/mot/deepsort/deepsort_ppyoloe_pplcnet.yml
new file mode 100644
index 0000000000000000000000000000000000000000..0af80a7d899f02ac4b66c5191b2616ed1db1aa8e
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/deepsort/deepsort_ppyoloe_pplcnet.yml
@@ -0,0 +1,109 @@
+_BASE_: [
+ 'detector/ppyoloe_crn_l_36e_640x640_mot17half.yml',
+ '_base_/mot17.yml',
+ '_base_/deepsort_reader_1088x608.yml',
+]
+metric: MOT
+num_classes: 1
+
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: MOT17/images/half
+ keep_ori_im: True # set as True in DeepSORT
+
+det_weights: https://paddledet.bj.bcebos.com/models/mot/deepsort/ppyoloe_crn_l_36e_640x640_mot17half.pdparams
+reid_weights: https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_pplcnet.pdparams
+
+# reader
+EvalMOTReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+TestMOTReader:
+ inputs_def:
+ image_shape: [3, 640, 640]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+
+# DeepSORT configuration
+architecture: DeepSORT
+pretrain_weights: None
+
+DeepSORT:
+  detector: YOLOv3 # PPYOLOE version
+ reid: PPLCNetEmbedding
+ tracker: DeepSORTTracker
+
+
+# reid and tracker configuration
+# see 'configs/mot/deepsort/reid/deepsort_pplcnet.yml'
+PPLCNetEmbedding:
+ input_ch: 1280
+ output_ch: 512
+
+DeepSORTTracker:
+ input_size: [64, 192]
+ min_box_area: 0
+ vertical_ratio: -1
+ budget: 100
+ max_age: 70
+ n_init: 3
+ metric_type: cosine
+ matching_threshold: 0.2
+ max_iou_distance: 0.9
+ motion: KalmanFilter
+
+
+# detector configuration: PPYOLOE version
+# see 'configs/mot/deepsort/detector/ppyoloe_crn_l_36e_640x640_mot17half.yml'
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192]
+ stage_num: 1
+ block_num: 3
+ act: 'swish'
+ spp: true
+
+# Tracking requires higher quality boxes, so NMS score_threshold will be higher
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: -1 # 100
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.4 # 0.01 in original detector
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/mot/deepsort/deepsort_ppyoloe_resnet.yml b/PaddleDetection-release-2.6/configs/mot/deepsort/deepsort_ppyoloe_resnet.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d9692304b055040bb22c49a2f90e05e4e7ba53eb
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/deepsort/deepsort_ppyoloe_resnet.yml
@@ -0,0 +1,108 @@
+_BASE_: [
+ 'detector/ppyoloe_crn_l_36e_640x640_mot17half.yml',
+ '_base_/mot17.yml',
+ '_base_/deepsort_reader_1088x608.yml',
+]
+metric: MOT
+num_classes: 1
+
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: MOT17/images/half
+ keep_ori_im: True # set as True in DeepSORT
+
+det_weights: https://paddledet.bj.bcebos.com/models/mot/deepsort/ppyoloe_crn_l_36e_640x640_mot17half.pdparams
+reid_weights: https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_resnet.pdparams
+
+# reader
+EvalMOTReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+TestMOTReader:
+ inputs_def:
+ image_shape: [3, 640, 640]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+
+# DeepSORT configuration
+architecture: DeepSORT
+pretrain_weights: None
+
+DeepSORT:
+  detector: YOLOv3 # PPYOLOE version
+ reid: ResNetEmbedding
+ tracker: DeepSORTTracker
+
+
+# reid and tracker configuration
+# see 'configs/mot/deepsort/reid/deepsort_resnet.yml'
+ResNetEmbedding:
+ model_name: "ResNet50"
+
+DeepSORTTracker:
+ input_size: [64, 192]
+ min_box_area: 0
+ vertical_ratio: -1
+ budget: 100
+ max_age: 70
+ n_init: 3
+ metric_type: cosine
+ matching_threshold: 0.2
+ max_iou_distance: 0.9
+ motion: KalmanFilter
+
+
+# detector configuration: PPYOLOE version
+# see 'configs/mot/deepsort/detector/ppyoloe_crn_l_36e_640x640_mot17half.yml'
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192]
+ stage_num: 1
+ block_num: 3
+ act: 'swish'
+ spp: true
+
+# Tracking requires higher quality boxes, so NMS score_threshold will be higher
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: -1 # 100
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.4 # 0.01 in original detector
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/mot/deepsort/deepsort_ppyolov2_pplcnet.yml b/PaddleDetection-release-2.6/configs/mot/deepsort/deepsort_ppyolov2_pplcnet.yml
new file mode 100644
index 0000000000000000000000000000000000000000..8cd393e457b69588140b1c00b5e20ddb69932f5d
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/deepsort/deepsort_ppyolov2_pplcnet.yml
@@ -0,0 +1,98 @@
+_BASE_: [
+ 'detector/ppyolov2_r50vd_dcn_365e_640x640_mot17half.yml',
+ '_base_/mot17.yml',
+ '_base_/deepsort_reader_1088x608.yml',
+]
+metric: MOT
+num_classes: 1
+
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: MOT17/images/half
+ keep_ori_im: True # set as True in DeepSORT
+
+det_weights: https://paddledet.bj.bcebos.com/models/mot/deepsort/ppyolov2_r50vd_dcn_365e_640x640_mot17half.pdparams
+reid_weights: https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_pplcnet.pdparams
+
+# reader
+EvalMOTReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+TestMOTReader:
+ inputs_def:
+ image_shape: [3, 640, 640]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+
+# DeepSORT configuration
+architecture: DeepSORT
+pretrain_weights: None
+
+DeepSORT:
+ detector: YOLOv3 # PPYOLOv2 version
+ reid: PPLCNetEmbedding
+ tracker: DeepSORTTracker
+
+
+# reid and tracker configuration
+# see 'configs/mot/deepsort/reid/deepsort_pplcnet.yml'
+PPLCNetEmbedding:
+ input_ch: 1280
+ output_ch: 512
+
+DeepSORTTracker:
+ input_size: [64, 192]
+ min_box_area: 0
+ vertical_ratio: -1
+ budget: 100
+ max_age: 70
+ n_init: 3
+ metric_type: cosine
+ matching_threshold: 0.2
+ max_iou_distance: 0.9
+ motion: KalmanFilter
+
+
+# detector configuration: PPYOLOv2 version
+# see 'configs/mot/deepsort/detector/ppyolov2_r50vd_dcn_365e_640x640_mot17half.yml'
+YOLOv3:
+ backbone: ResNet
+ neck: PPYOLOPAN
+ yolo_head: YOLOv3Head
+ post_process: BBoxPostProcess
+
+ResNet:
+ depth: 50
+ variant: d
+ return_idx: [1, 2, 3]
+ dcn_v2_stages: [3]
+ freeze_at: -1
+ freeze_norm: false
+ norm_decay: 0.
+
+# Tracking requires higher quality boxes, so NMS score_threshold will be higher
+BBoxPostProcess:
+ decode:
+ name: YOLOBox
+ conf_thresh: 0.25 # 0.01 in original detector
+ downsample_ratio: 32
+ clip_bbox: true
+ scale_x_y: 1.05
+ nms:
+ name: MatrixNMS
+ keep_top_k: 100
+ score_threshold: 0.4 # 0.01 in original detector
+ post_threshold: 0.4 # 0.01 in original detector
+ nms_top_k: -1
+ background_label: -1
diff --git a/PaddleDetection-release-2.6/configs/mot/deepsort/deepsort_yolov3_pplcnet.yml b/PaddleDetection-release-2.6/configs/mot/deepsort/deepsort_yolov3_pplcnet.yml
new file mode 100644
index 0000000000000000000000000000000000000000..adb0aa9a07daea757a9d67119a71a4ee8e1d9e68
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/deepsort/deepsort_yolov3_pplcnet.yml
@@ -0,0 +1,87 @@
+_BASE_: [
+ 'detector/yolov3_darknet53_40e_608x608_mot17half.yml',
+ '_base_/mot17.yml',
+ '_base_/deepsort_reader_1088x608.yml',
+]
+metric: MOT
+num_classes: 1
+
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: MOT17/images/half
+ keep_ori_im: True # set as True in DeepSORT
+
+det_weights: https://paddledet.bj.bcebos.com/models/mot/deepsort/yolov3_darknet53_40e_608x608_mot17half.pdparams
+reid_weights: https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_pplcnet.pdparams
+
+# reader
+EvalMOTReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [608, 608], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+TestMOTReader:
+ inputs_def:
+ image_shape: [3, 608, 608]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [608, 608], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+
+# DeepSORT configuration
+architecture: DeepSORT
+pretrain_weights: None
+
+DeepSORT:
+ detector: YOLOv3 # General YOLOv3 version
+ reid: PPLCNetEmbedding
+ tracker: DeepSORTTracker
+
+
+# reid and tracker configuration
+# see 'configs/mot/deepsort/reid/deepsort_pplcnet.yml'
+PPLCNetEmbedding:
+ input_ch: 1280
+ output_ch: 512
+
+DeepSORTTracker:
+ input_size: [64, 192]
+ min_box_area: 0
+ vertical_ratio: -1
+ budget: 100
+ max_age: 70
+ n_init: 3
+ metric_type: cosine
+ matching_threshold: 0.2
+ max_iou_distance: 0.9
+ motion: KalmanFilter
+
+
+# detector configuration: General YOLOv3 version
+# see 'configs/mot/deepsort/detector/yolov3_darknet53_40e_608x608_mot17half.yml'
+YOLOv3:
+ backbone: DarkNet
+ neck: YOLOv3FPN
+ yolo_head: YOLOv3Head
+ post_process: BBoxPostProcess
+
+# Tracking requires higher quality boxes, so NMS score_threshold will be higher
+BBoxPostProcess:
+ decode:
+ name: YOLOBox
+ conf_thresh: 0.005
+ downsample_ratio: 32
+ clip_bbox: true
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.3 # 0.01 in original detector
+ nms_threshold: 0.45
+ nms_top_k: 1000
diff --git a/PaddleDetection-release-2.6/configs/mot/deepsort/detector/README.md b/PaddleDetection-release-2.6/configs/mot/deepsort/detector/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..4cfe273fee2add8bf970f503fa0a7fd363435f22
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/deepsort/detector/README.md
@@ -0,0 +1,34 @@
+English | [简体中文](README_cn.md)
+
+# Detector for DeepSORT
+
+## Introduction
+[DeepSORT](https://arxiv.org/abs/1812.00442) (Deep Cosine Metric Learning SORT) is composed of a detector and a ReID model connected in series. The configs of several common detectors are provided here as a reference. Note that differences in the training dataset, backbone, input size, training epochs and NMS threshold will all lead to differences in model accuracy and performance; please adapt the configs to your needs.
+
+## Model Zoo
+### Results on MOT17-half dataset
+| Backbone | Model | input size | lr schedule | FPS | Box AP | download | config |
+| :-------------- | :------------- | :--------: | :---------: | :-----------: | :-----: | :----------: | :-----: |
+| DarkNet-53 | YOLOv3 | 608x608 | 40e | ---- | 42.7 | [download](https://paddledet.bj.bcebos.com/models/mot/deepsort/yolov3_darknet53_40e_608x608_mot17half.pdparams) | [config](./yolov3_darknet53_40e_608x608_mot17half.yml) |
+| ResNet50-vd | PPYOLOv2 | 640x640 | 365e | ---- | 46.8 | [download](https://paddledet.bj.bcebos.com/models/mot/deepsort/ppyolov2_r50vd_dcn_365e_640x640_mot17half.pdparams) | [config](./ppyolov2_r50vd_dcn_365e_640x640_mot17half.yml) |
+| CSPResNet | PPYOLOE | 640x640 | 36e | ---- | 52.9 | [download](https://paddledet.bj.bcebos.com/models/mot/deepsort/ppyoloe_crn_l_36e_640x640_mot17half.pdparams) | [config](./ppyoloe_crn_l_36e_640x640_mot17half.yml) |
+
+**Notes:**
+ - The above models are trained on the **MOT17-half train** set, which can be downloaded from this [link](https://bj.bcebos.com/v1/paddledet/data/mot/MOT17.zip).
+ - The **MOT17-half train** set consists of the images and labels of the first half of the frames of each video in the MOT17 train set (7 sequences in total). The **MOT17-half val** set, composed of the second half of the frames of each video, is used for evaluation. The annotations can be downloaded from this [link](https://paddledet.bj.bcebos.com/data/mot/mot17half/annotations.zip); download and unzip them into the `dataset/mot/MOT17/images/` folder.
+ - YOLOv3 is trained on the same pedestrian dataset as `configs/pphuman/pedestrian_yolov3/pedestrian_yolov3_darknet.yml`, which has not been made public yet.
+ - For pedestrian tracking, use a pedestrian detector combined with a pedestrian ReID model. For vehicle tracking, use a vehicle detector combined with a vehicle ReID model.
+ - DeepSORT tracking requires high-quality detected boxes, so post-processing settings such as the NMS threshold of these models differ from those used in pure detection tasks.
+
+## Quick Start
+
+Start training and evaluation with the following commands
+```bash
+job_name=ppyoloe_crn_l_36e_640x640_mot17half
+config=configs/mot/deepsort/detector/${job_name}.yml
+log_dir=log_dir/${job_name}
+# 1. training
+python -m paddle.distributed.launch --log_dir=${log_dir} --gpus 0,1,2,3,4,5,6,7 tools/train.py -c ${config} --eval --amp --fleet
+# 2. evaluation
+CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c ${config} -o weights=https://paddledet.bj.bcebos.com/models/mot/deepsort/${job_name}.pdparams
+```
diff --git a/PaddleDetection-release-2.6/configs/mot/deepsort/detector/README_cn.md b/PaddleDetection-release-2.6/configs/mot/deepsort/detector/README_cn.md
new file mode 100644
index 0000000000000000000000000000000000000000..050cc8ade944190bef0931909d34724a5b99cb54
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/deepsort/detector/README_cn.md
@@ -0,0 +1,36 @@
+Simplified Chinese | [English](README.md)
+
+# Detector for DeepSORT
+
+## Introduction
+[DeepSORT](https://arxiv.org/abs/1812.00442) (Deep Cosine Metric Learning SORT) is composed of a detector and a ReID model connected in series. The configs of several common detectors are provided here as a reference. Since differences in the training dataset, input size, number of training epochs, NMS threshold and other settings all lead to differences in model accuracy and performance, please adapt the configs to your needs.
+
+## Model Zoo
+
+### Detection results on the MOT17-half val dataset
+| Backbone | Model | Input size | LR schedule | Inference time (FPS) | Box AP | Download | Config |
+| :-------------- | :------------- | :--------: | :---------: | :-----------: | :-----: | :------: | :-----: |
+| DarkNet-53 | YOLOv3 | 608x608 | 40e | ---- | 42.7 | [download](https://paddledet.bj.bcebos.com/models/mot/deepsort/yolov3_darknet53_40e_608x608_mot17half.pdparams) | [config](./yolov3_darknet53_40e_608x608_mot17half.yml) |
+| ResNet50-vd | PPYOLOv2 | 640x640 | 365e | ---- | 46.8 | [download](https://paddledet.bj.bcebos.com/models/mot/deepsort/ppyolov2_r50vd_dcn_365e_640x640_mot17half.pdparams) | [config](./ppyolov2_r50vd_dcn_365e_640x640_mot17half.yml) |
+| CSPResNet | PPYOLOE | 640x640 | 36e | ---- | 52.9 | [download](https://paddledet.bj.bcebos.com/models/mot/deepsort/ppyoloe_crn_l_36e_640x640_mot17half.pdparams) | [config](./ppyoloe_crn_l_36e_640x640_mot17half.yml) |
+
+**Notes:**
+ - The above models can all be trained on the **MOT17-half train** set, which can be downloaded from this [link](https://bj.bcebos.com/v1/paddledet/data/mot/MOT17.zip).
+ - The **MOT17-half train** set consists of the images and labels of the first half of the frames of each video in the MOT17 train set (7 sequences in total). For evaluation, the **MOT17-half val** set, composed of the second half of the frames of each video, can be used. It can be downloaded from this [link](https://paddledet.bj.bcebos.com/data/mot/mot17half/annotations.zip); unzip it into the `dataset/mot/MOT17/images/` folder.
+ - YOLOv3 is trained on the same pedestrian dataset as `configs/pphuman/pedestrian_yolov3/pedestrian_yolov3_darknet.yml`, which has not been made public yet.
+ - For pedestrian tracking, use a pedestrian detector combined with a pedestrian ReID model. For vehicle tracking, use a vehicle detector combined with a vehicle ReID model.
+ - DeepSORT tracking requires high-quality detected boxes, so post-processing settings such as the NMS threshold of these models differ from those used in pure detection tasks.
+
+
+## Quick Start
+
+Start training and evaluation with the following commands
+```bash
+job_name=ppyoloe_crn_l_36e_640x640_mot17half
+config=configs/mot/deepsort/detector/${job_name}.yml
+log_dir=log_dir/${job_name}
+# 1. training
+python -m paddle.distributed.launch --log_dir=${log_dir} --gpus 0,1,2,3,4,5,6,7 tools/train.py -c ${config} --eval --amp
+# 2. evaluation
+CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c ${config} -o weights=https://paddledet.bj.bcebos.com/models/mot/deepsort/${job_name}.pdparams
+```
diff --git a/PaddleDetection-release-2.6/configs/mot/deepsort/detector/jde_yolov3_darknet53_30e_1088x608_mix.yml b/PaddleDetection-release-2.6/configs/mot/deepsort/detector/jde_yolov3_darknet53_30e_1088x608_mix.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d895b0ae51ae0fda9266657bd183604af2e213cc
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/deepsort/detector/jde_yolov3_darknet53_30e_1088x608_mix.yml
@@ -0,0 +1,83 @@
+_BASE_: [
+ '../../../datasets/mot.yml',
+ '../../../runtime.yml',
+ '../../jde/_base_/optimizer_30e.yml',
+ '../../jde/_base_/jde_reader_1088x608.yml',
+]
+weights: output/jde_yolov3_darknet53_30e_1088x608_mix/model_final
+
+metric: MOTDet
+num_classes: 1
+EvalReader:
+ inputs_def:
+ num_max_boxes: 50
+ sample_transforms:
+ - Decode: {}
+ - LetterBoxResize: {target_size: [608, 1088]}
+ - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+TestReader:
+ inputs_def:
+ image_shape: [3, 608, 1088]
+ sample_transforms:
+ - Decode: {}
+ - LetterBoxResize: {target_size: [608, 1088]}
+ - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+EvalDataset:
+ !MOTDataSet
+ dataset_dir: dataset/mot
+ image_lists: ['mot17.half']
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
+
+TestDataset:
+ !ImageFolder
+ anno_path: None
+
+
+# detector configuration
+architecture: YOLOv3
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/DarkNet53_pretrained.pdparams
+
+# JDE version for MOT dataset
+YOLOv3:
+ backbone: DarkNet
+ neck: YOLOv3FPN
+ yolo_head: YOLOv3Head
+ post_process: JDEBBoxPostProcess
+
+DarkNet:
+ depth: 53
+ return_idx: [2, 3, 4]
+ freeze_norm: True
+
+YOLOv3FPN:
+ freeze_norm: True
+
+YOLOv3Head:
+ anchors: [[128,384], [180,540], [256,640], [512,640],
+ [32,96], [45,135], [64,192], [90,271],
+ [8,24], [11,34], [16,48], [23,68]]
+ anchor_masks: [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
+ loss: JDEDetectionLoss
+
+JDEDetectionLoss:
+ for_mot: False
+
+JDEBBoxPostProcess:
+ decode:
+ name: JDEBox
+ conf_thresh: 0.3
+ downsample_ratio: 32
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 500
+ score_threshold: 0.01
+ nms_threshold: 0.5
+ nms_top_k: 2000
+ normalized: true
+ return_idx: false
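The head above spreads 12 anchors over three output levels via `anchor_masks`, with the largest anchors assigned to the coarsest (stride-32) level. A small sketch of how the masks and the fixed 608x1088 eval input map to per-level anchor sets and grid sizes (helper names are illustrative):

```python
ANCHORS = [[128, 384], [180, 540], [256, 640], [512, 640],
           [32, 96], [45, 135], [64, 192], [90, 271],
           [8, 24], [11, 34], [16, 48], [23, 68]]
ANCHOR_MASKS = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
DOWNSAMPLES = [32, 16, 8]   # stride of each YOLO output level

def anchors_per_level(anchors, masks):
    """Group the flat anchor list into one set per FPN level."""
    return [[anchors[i] for i in mask] for mask in masks]

def grid_size(input_hw, stride):
    """Feature-map size at a given stride for a fixed input resolution."""
    return (input_hw[0] // stride, input_hw[1] // stride)

levels = anchors_per_level(ANCHORS, ANCHOR_MASKS)
grids = [grid_size((608, 1088), s) for s in DOWNSAMPLES]
```

The stride-32 level thus sees a 19x34 grid with the four largest anchors, while the stride-8 level sees a 76x136 grid with the four smallest.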
diff --git a/PaddleDetection-release-2.6/configs/mot/deepsort/detector/ppyoloe_crn_l_36e_640x640_mot17half.yml b/PaddleDetection-release-2.6/configs/mot/deepsort/detector/ppyoloe_crn_l_36e_640x640_mot17half.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a0501222c9f35d657826fb525e54bd7f4f663ae4
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/deepsort/detector/ppyoloe_crn_l_36e_640x640_mot17half.yml
@@ -0,0 +1,82 @@
+_BASE_: [
+ '../../../ppyoloe/ppyoloe_crn_l_300e_coco.yml',
+ '../_base_/mot17.yml',
+]
+weights: output/ppyoloe_crn_l_36e_640x640_mot17half/model_final
+log_iter: 20
+snapshot_epoch: 2
+
+
+# schedule configuration for fine-tuning
+epoch: 36
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !CosineDecay
+ max_epochs: 43
+ - !LinearWarmup
+ start_factor: 0.001
+ epochs: 1
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+
+TrainReader:
+ batch_size: 8
+
+
+# detector configuration
+architecture: YOLOv3
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/ppyoloe_crn_l_300e_coco.pdparams
+depth_mult: 1.0
+width_mult: 1.0
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192]
+ stage_num: 1
+ block_num: 3
+ act: 'swish'
+ spp: true
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: -1 # 100
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.6
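The fine-tuning schedule above warms the rate up linearly for one epoch from `start_factor: 0.001` and then follows a cosine curve whose period (`max_epochs: 43`) deliberately overshoots the 36 training epochs, so the rate never decays all the way to zero. A per-epoch sketch of the resulting value (an approximation of Paddle's schedulers, not their exact API):

```python
import math

def lr_at(epoch, base_lr=0.001, warmup_epochs=1, start_factor=0.001,
          cosine_max_epochs=43):
    """Linear warmup followed by cosine decay, both evaluated per epoch."""
    if epoch < warmup_epochs:
        frac = epoch / warmup_epochs
        return base_lr * (start_factor + (1.0 - start_factor) * frac)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * epoch / cosine_max_epochs))
```

At epoch 36, the end of training, the rate is still roughly 6% of `base_lr` rather than zero, which keeps the final epochs from stalling.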
diff --git a/PaddleDetection-release-2.6/configs/mot/deepsort/detector/ppyolov2_r50vd_dcn_365e_640x640_mot17half.yml b/PaddleDetection-release-2.6/configs/mot/deepsort/detector/ppyolov2_r50vd_dcn_365e_640x640_mot17half.yml
new file mode 100644
index 0000000000000000000000000000000000000000..cc55e46e722d9c93f76cc96278a6faa0cf29d3ef
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/deepsort/detector/ppyolov2_r50vd_dcn_365e_640x640_mot17half.yml
@@ -0,0 +1,75 @@
+_BASE_: [
+ '../../../ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml',
+ '../_base_/mot17.yml',
+]
+weights: output/ppyolov2_r50vd_dcn_365e_640x640_mot17half/model_final
+log_iter: 20
+snapshot_epoch: 2
+
+
+# detector configuration
+architecture: YOLOv3
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_pretrained.pdparams
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+YOLOv3:
+ backbone: ResNet
+ neck: PPYOLOPAN
+ yolo_head: YOLOv3Head
+ post_process: BBoxPostProcess
+
+ResNet:
+ depth: 50
+ variant: d
+ return_idx: [1, 2, 3]
+ dcn_v2_stages: [3]
+ freeze_at: -1
+ freeze_norm: false
+ norm_decay: 0.
+
+PPYOLOPAN:
+ drop_block: true
+ block_size: 3
+ keep_prob: 0.9
+ spp: true
+
+YOLOv3Head:
+ anchors: [[10, 13], [16, 30], [33, 23],
+ [30, 61], [62, 45], [59, 119],
+ [116, 90], [156, 198], [373, 326]]
+ anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
+ loss: YOLOv3Loss
+ iou_aware: true
+ iou_aware_factor: 0.5
+
+YOLOv3Loss:
+ ignore_thresh: 0.7
+ downsample: [32, 16, 8]
+ label_smooth: false
+ scale_x_y: 1.05
+ iou_loss: IouLoss
+ iou_aware_loss: IouAwareLoss
+
+IouLoss:
+ loss_weight: 2.5
+ loss_square: true
+
+IouAwareLoss:
+ loss_weight: 1.0
+
+BBoxPostProcess:
+ decode:
+ name: YOLOBox
+ conf_thresh: 0.01
+ downsample_ratio: 32
+ clip_bbox: true
+ scale_x_y: 1.05
+ nms:
+ name: MatrixNMS
+ keep_top_k: 100
+ score_threshold: 0.01
+ post_threshold: 0.01
+ nms_top_k: -1
+ background_label: -1
diff --git a/PaddleDetection-release-2.6/configs/mot/deepsort/detector/yolov3_darknet53_40e_608x608_mot17half.yml b/PaddleDetection-release-2.6/configs/mot/deepsort/detector/yolov3_darknet53_40e_608x608_mot17half.yml
new file mode 100644
index 0000000000000000000000000000000000000000..9ab55f977b9c76e794ce1e6eb83172b459ba4d27
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/deepsort/detector/yolov3_darknet53_40e_608x608_mot17half.yml
@@ -0,0 +1,76 @@
+_BASE_: [
+ '../../../yolov3/yolov3_darknet53_270e_coco.yml',
+ '../_base_/mot17.yml',
+]
+weights: output/yolov3_darknet53_40e_608x608_mot17half/model_final
+log_iter: 20
+snapshot_epoch: 2
+
+# schedule configuration for fine-tuning
+epoch: 40
+LearningRate:
+ base_lr: 0.0001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 32
+ - 36
+ - !LinearWarmup
+ start_factor: 0.3333333333333333
+ steps: 100
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+TrainReader:
+ batch_size: 8
+ mixup_epoch: 35
+
+# detector configuration
+architecture: YOLOv3
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/yolov3_darknet53_270e_coco.pdparams
+norm_type: sync_bn
+
+YOLOv3:
+ backbone: DarkNet
+ neck: YOLOv3FPN
+ yolo_head: YOLOv3Head
+ post_process: BBoxPostProcess
+
+DarkNet:
+ depth: 53
+ return_idx: [2, 3, 4]
+
+# use default config
+# YOLOv3FPN:
+
+YOLOv3Head:
+ anchors: [[10, 13], [16, 30], [33, 23],
+ [30, 61], [62, 45], [59, 119],
+ [116, 90], [156, 198], [373, 326]]
+ anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
+ loss: YOLOv3Loss
+
+YOLOv3Loss:
+ ignore_thresh: 0.7
+ downsample: [32, 16, 8]
+ label_smooth: false
+
+BBoxPostProcess:
+ decode:
+ name: YOLOBox
+ conf_thresh: 0.005
+ downsample_ratio: 32
+ clip_bbox: true
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.45
+ nms_top_k: 1000
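This schedule is stepwise rather than cosine: after a 100-step linear warmup from one third of `base_lr`, the rate is multiplied by `gamma: 0.1` at epochs 32 and 36. The post-warmup part can be sketched as (an illustrative helper, not Paddle's API):

```python
def piecewise_lr(epoch, base_lr=0.0001, gamma=0.1, milestones=(32, 36)):
    """Multiply the base rate by gamma at every milestone already passed."""
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr
```

So the 40-epoch run trains at 1e-4 until epoch 32, 1e-5 until epoch 36, and 1e-6 for the final epochs.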
diff --git a/PaddleDetection-release-2.6/configs/mot/deepsort/reid/README.md b/PaddleDetection-release-2.6/configs/mot/deepsort/reid/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..57a81b4fa5e059f3bd6c0d1e001d3fa19818f8b6
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/deepsort/reid/README.md
@@ -0,0 +1,26 @@
+English | [简体中文](README_cn.md)
+
+# ReID of DeepSORT
+
+## Introduction
+[DeepSORT](https://arxiv.org/abs/1812.00442) (Deep Cosine Metric Learning SORT) is composed of a detector and a ReID model connected in series. The configs of several common ReID models are provided here for use with DeepSORT as a reference.
+
+## Model Zoo
+
+### Results on Market1501 pedestrian ReID dataset
+
+| Backbone | Model | Params | FPS | mAP | Top1 | Top5 | download | config |
+| :-------------: | :-----------------: | :-------: | :------: | :-------: | :-------: | :-------: | :-------: | :-------: |
+| ResNet-101 | PCB Pyramid Embedding | 289M | --- | 86.31 | 94.95 | 98.28 | [download](https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_pcb_pyramid_r101.pdparams) | [config](./deepsort_pcb_pyramid_r101.yml) |
+| PPLCNet-2.5x | PPLCNet Embedding | 36M | --- | 71.59 | 87.38 | 95.49 | [download](https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_pplcnet.pdparams) | [config](./deepsort_pplcnet.yml) |
+
+### Results on VERI-Wild vehicle ReID dataset
+
+| Backbone | Model | Params | FPS | mAP | Top1 | Top5 | download | config |
+| :-------------: | :-----------------: | :-------: | :------: | :-------: | :-------: | :-------: | :-------: | :-------: |
+| PPLCNet-2.5x | PPLCNet Embedding | 93M | --- | 82.44 | 93.54 | 98.53 | [download](https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_pplcnet_vehicle.pdparams) | [config](./deepsort_pplcnet_vehicle.yml) |
+
+**Notes:**
+ - The ReID models are provided by [PaddleClas](https://github.com/PaddlePaddle/PaddleClas); the specific training process and code will be released by PaddleClas.
+ - For pedestrian tracking, please use the **Market1501** pedestrian ReID model in combination with a pedestrian detector.
+ - For vehicle tracking, please use the **VERI-Wild** vehicle ReID model in combination with a vehicle detector.
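Metrics such as Top-1 in the tables above come from embedding retrieval: each query embedding is L2-normalized and ranked against the gallery by cosine similarity. A minimal sketch of the Top-1 lookup (names are illustrative, not the evaluation code used for these numbers):

```python
import numpy as np

def top1_index(query, gallery):
    """Index of the gallery embedding with highest cosine similarity to the query."""
    q = query / np.linalg.norm(query)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    return int(np.argmax(g @ q))
```

Top-1 accuracy is then the fraction of queries whose best-ranked gallery item shares their identity, and mAP averages precision over the full ranking.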
diff --git a/PaddleDetection-release-2.6/configs/mot/deepsort/reid/README_cn.md b/PaddleDetection-release-2.6/configs/mot/deepsort/reid/README_cn.md
new file mode 100644
index 0000000000000000000000000000000000000000..d5930b652c456486b491c6ecec5b3739ac028f8b
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/deepsort/reid/README_cn.md
@@ -0,0 +1,26 @@
+Simplified Chinese | [English](README.md)
+
+# ReID models for DeepSORT
+
+## Introduction
+[DeepSORT](https://arxiv.org/abs/1812.00442) (Deep Cosine Metric Learning SORT) is composed of a detector and a ReID model connected in series. The configs of several common ReID models are provided here for use with DeepSORT as a reference.
+
+## Model Zoo
+
+### Results on the Market1501 pedestrian ReID dataset
+
+| Backbone | Model | Params | FPS | mAP | Top1 | Top5 | Download | Config |
+| :-------------: | :-----------------: | :-------: | :------: | :-------: | :-------: | :-------: | :-------: | :-------: |
+| ResNet-101 | PCB Pyramid Embedding | 289M | --- | 86.31 | 94.95 | 98.28 | [download](https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_pcb_pyramid_r101.pdparams) | [config](./deepsort_pcb_pyramid_r101.yml) |
+| PPLCNet-2.5x | PPLCNet Embedding | 36M | --- | 71.59 | 87.38 | 95.49 | [download](https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_pplcnet.pdparams) | [config](./deepsort_pplcnet.yml) |
+
+### Results on the VERI-Wild vehicle ReID dataset
+
+| Backbone | Model | Params | FPS | mAP | Top1 | Top5 | Download | Config |
+| :-------------: | :-----------------: | :-------: | :------: | :-------: | :-------: | :-------: | :-------: | :-------: |
+| PPLCNet-2.5x | PPLCNet Embedding | 93M | --- | 82.44 | 93.54 | 98.53 | [download](https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_pplcnet_vehicle.pdparams) | [config](./deepsort_pplcnet_vehicle.yml) |
+
+**Notes:**
+ - The ReID models are provided by [PaddleClas](https://github.com/PaddlePaddle/PaddleClas); the specific training process and code will be released by PaddleClas.
+ - For pedestrian tracking, use a ReID model trained on the **Market1501** pedestrian ReID dataset in combination with a pedestrian detector.
+ - For vehicle tracking, use a ReID model trained on the **VERI-Wild** vehicle ReID dataset in combination with a vehicle detector.
diff --git a/PaddleDetection-release-2.6/configs/mot/deepsort/reid/deepsort_pcb_pyramid_r101.yml b/PaddleDetection-release-2.6/configs/mot/deepsort/reid/deepsort_pcb_pyramid_r101.yml
new file mode 100644
index 0000000000000000000000000000000000000000..cbca94755fa97d1cdc8de9a55a39e7063de0417c
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/deepsort/reid/deepsort_pcb_pyramid_r101.yml
@@ -0,0 +1,45 @@
+# This config represents a ReID only configuration of DeepSORT, it has two uses.
+# One is used for loading the detection results and ReID model to get tracking results;
+# Another is used for exporting the ReID model to deploy infer.
+
+_BASE_: [
+ '../../../datasets/mot.yml',
+ '../../../runtime.yml',
+ '../_base_/deepsort_reader_1088x608.yml',
+]
+
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: MOT16/images/train
+ keep_ori_im: True # set as True in DeepSORT
+
+det_weights: None
+reid_weights: https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_pcb_pyramid_r101.pdparams
+
+
+# A ReID only configuration of DeepSORT, detector should be None.
+architecture: DeepSORT
+pretrain_weights: None
+
+DeepSORT:
+ detector: None
+ reid: PCBPyramid
+ tracker: DeepSORTTracker
+
+PCBPyramid:
+ model_name: "ResNet101"
+ num_conv_out_channels: 128
+ num_classes: 751 # default 751 classes in Market-1501 dataset.
+
+DeepSORTTracker:
+ input_size: [64, 192]
+ min_box_area: 0 # 0 means no need to filter out too small boxes
+ vertical_ratio: -1 # -1 means no need to filter out bboxes, usually set 1.6 for pedestrian
+ budget: 100
+ max_age: 70
+ n_init: 3
+ metric_type: cosine
+ matching_threshold: 0.2
+ max_iou_distance: 0.9
+ motion: KalmanFilter
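The tracker comments above note that `min_box_area: 0` and `vertical_ratio: -1` disable output-box filtering, while pedestrian trackers often set `vertical_ratio` to 1.6 to drop boxes that are too wide for a standing person. A sketch of that filter (a simplification of the tracker's post-processing, with illustrative names):

```python
def keep_box(w, h, min_box_area=0, vertical_ratio=-1):
    """Apply the tracker's output-box filters; non-positive values disable a check."""
    if min_box_area > 0 and w * h < min_box_area:
        return False                      # box too small
    if vertical_ratio > 0 and w / h > vertical_ratio:
        return False                      # too wide relative to height
    return True
```

With the defaults in this config, every tracked box is kept and filtering is left to the detector's NMS settings.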
diff --git a/PaddleDetection-release-2.6/configs/mot/deepsort/reid/deepsort_pplcnet.yml b/PaddleDetection-release-2.6/configs/mot/deepsort/reid/deepsort_pplcnet.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d50da28b2cadf80d42184d37b4428f564c2033ac
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/deepsort/reid/deepsort_pplcnet.yml
@@ -0,0 +1,44 @@
+# This config represents a ReID only configuration of DeepSORT, it has two uses.
+# One is used for loading the detection results and ReID model to get tracking results;
+# Another is used for exporting the ReID model to deploy infer.
+
+_BASE_: [
+ '../../../datasets/mot.yml',
+ '../../../runtime.yml',
+ '../_base_/deepsort_reader_1088x608.yml',
+]
+
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: MOT16/images/train
+ keep_ori_im: True # set as True in DeepSORT
+
+det_weights: None
+reid_weights: https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_pplcnet.pdparams
+
+
+# A ReID only configuration of DeepSORT, detector should be None.
+architecture: DeepSORT
+pretrain_weights: None
+
+DeepSORT:
+ detector: None
+ reid: PPLCNetEmbedding
+ tracker: DeepSORTTracker
+
+PPLCNetEmbedding:
+ input_ch: 1280
+ output_ch: 512
+
+DeepSORTTracker:
+ input_size: [64, 192]
+  min_box_area: 0 # 0 means no need to filter out too small boxes
+  vertical_ratio: -1 # -1 means no need to filter out bboxes, usually set 1.6 for pedestrian
+ budget: 100
+ max_age: 70
+ n_init: 3
+ metric_type: cosine
+ matching_threshold: 0.2
+ max_iou_distance: 0.9
+ motion: KalmanFilter
diff --git a/PaddleDetection-release-2.6/configs/mot/deepsort/reid/deepsort_pplcnet_vehicle.yml b/PaddleDetection-release-2.6/configs/mot/deepsort/reid/deepsort_pplcnet_vehicle.yml
new file mode 100644
index 0000000000000000000000000000000000000000..6e07042d837eb3f6be29f6eef7cfb35275433fa3
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/deepsort/reid/deepsort_pplcnet_vehicle.yml
@@ -0,0 +1,44 @@
+# This config represents a ReID only configuration of DeepSORT, it has two uses.
+# One is used for loading the detection results and ReID model to get tracking results;
+# Another is used for exporting the ReID model to deploy infer.
+
+_BASE_: [
+ '../../../datasets/mot.yml',
+ '../../../runtime.yml',
+ '../_base_/deepsort_reader_1088x608.yml',
+]
+
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: kitti_vehicle/images/train
+ keep_ori_im: True # set as True in DeepSORT
+
+det_weights: None
+reid_weights: https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_pplcnet_vehicle.pdparams
+
+
+# A ReID only configuration of DeepSORT, detector should be None.
+architecture: DeepSORT
+pretrain_weights: None
+
+DeepSORT:
+ detector: None
+ reid: PPLCNetEmbedding
+ tracker: DeepSORTTracker
+
+PPLCNetEmbedding:
+ input_ch: 1280
+ output_ch: 512
+
+DeepSORTTracker:
+ input_size: [64, 192]
+ min_box_area: 0 # 0 means no need to filter out too small boxes
+ vertical_ratio: -1 # -1 means no need to filter out bboxes, usually set 1.6 for pedestrian
+ budget: 100
+ max_age: 70
+ n_init: 3
+ metric_type: cosine
+ matching_threshold: 0.2
+ max_iou_distance: 0.9
+ motion: KalmanFilter
diff --git a/PaddleDetection-release-2.6/configs/mot/deepsort/reid/deepsort_resnet.yml b/PaddleDetection-release-2.6/configs/mot/deepsort/reid/deepsort_resnet.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a9460586b6485b055d59efb7fe204f044edb2e21
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/deepsort/reid/deepsort_resnet.yml
@@ -0,0 +1,43 @@
+# This config represents a ReID only configuration of DeepSORT, it has two uses.
+# One is used for loading the detection results and ReID model to get tracking results;
+# Another is used for exporting the ReID model to deploy infer.
+
+_BASE_: [
+ '../../../datasets/mot.yml',
+ '../../../runtime.yml',
+ '../_base_/deepsort_reader_1088x608.yml',
+]
+
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: MOT16/images/train
+ keep_ori_im: True # set as True in DeepSORT
+
+det_weights: None
+reid_weights: https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_resnet.pdparams
+
+
+# A ReID only configuration of DeepSORT, detector should be None.
+architecture: DeepSORT
+pretrain_weights: None
+
+DeepSORT:
+ detector: None
+ reid: ResNetEmbedding
+ tracker: DeepSORTTracker
+
+ResNetEmbedding:
+ model_name: "ResNet50"
+
+DeepSORTTracker:
+ input_size: [64, 192]
+  min_box_area: 0 # 0 means no need to filter out too small boxes
+  vertical_ratio: -1 # -1 means no need to filter out bboxes, usually set 1.6 for pedestrian
+ budget: 100
+ max_age: 70
+ n_init: 3
+ metric_type: cosine
+ matching_threshold: 0.2
+ max_iou_distance: 0.9
+ motion: KalmanFilter
diff --git a/PaddleDetection-release-2.6/configs/mot/fairmot/README.md b/PaddleDetection-release-2.6/configs/mot/fairmot/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..adb20bb28120e2b03c55020e5f0ba25d4a7bfa57
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/fairmot/README.md
@@ -0,0 +1,208 @@
+English | [简体中文](README_cn.md)
+
+# FairMOT (FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking)
+
+## Table of Contents
+- [Introduction](#Introduction)
+- [Model Zoo](#Model_Zoo)
+- [Getting Start](#Getting_Start)
+- [Citations](#Citations)
+
+## Introduction
+
+[FairMOT](https://arxiv.org/abs/2004.01888) is based on the anchor-free detector CenterNet, which avoids the anchor-and-feature misalignment problem of anchor-based detection frameworks. The fusion of deep and shallow features lets the detection and ReID tasks each obtain the features they need, and low-dimensional ReID features are used. FairMOT is a simple baseline composed of two homogeneous branches that predict pixel-level object scores and ReID features. It achieves fairness between the two tasks and reaches a high level of real-time MOT performance.
+
+### PP-Tracking real-time MOT system
+In addition, PaddleDetection also provides the [PP-Tracking](../../../deploy/pptracking/README.md) real-time multi-object tracking system.
+PP-Tracking is the first open-source real-time multi-object tracking system based on the PaddlePaddle deep learning framework. It offers rich models, broad applicability and efficient deployment.
+
+PP-Tracking supports two paradigms: single-camera tracking (MOT) and multi-camera tracking (MTMCT). Targeting the difficulties and pain points of real business scenarios, PP-Tracking provides MOT functions and applications such as pedestrian tracking, vehicle tracking, multi-class tracking, small-object tracking, traffic statistics and multi-camera tracking. Deployment is supported through both an API and a GUI, in Python and C++, and on platforms including Linux and NVIDIA Jetson.
+
+### AI Studio public project tutorial
+PP-Tracking provides a public AI Studio project tutorial. Please refer to this [tutorial](https://aistudio.baidu.com/aistudio/projectdetail/3022582).
+
+
+## Model Zoo
+
+### FairMOT Results on MOT-16 Training Set
+
+| backbone | input shape | MOTA | IDF1 | IDS | FP | FN | FPS | download | config |
+| :--------------| :------- | :----: | :----: | :----: | :----: | :----: | :------: | :----: |:-----: |
+| DLA-34(paper) | 1088x608 | 83.3 | 81.9 | 544 | 3822 | 14095 | - | - | - |
+| DLA-34 | 1088x608 | 83.2 | 83.1 | 499 | 3861 | 14223 | - | [model](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608.pdparams) | [config](./fairmot_dla34_30e_1088x608.yml) |
+| DLA-34 | 864x480 | 80.8 | 81.1 | 561 | 3643 | 16967 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_864x480.pdparams) | [config](./fairmot_dla34_30e_864x480.yml) |
+| DLA-34 | 576x320 | 74.0 | 76.1 | 640 | 4989 | 23034 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_576x320.pdparams) | [config](./fairmot_dla34_30e_576x320.yml) |
+
+
+### FairMOT Results on MOT-16 Test Set
+
+| backbone | input shape | MOTA | IDF1 | IDS | FP | FN | FPS | download | config |
+| :--------------| :------- | :----: | :----: | :----: | :----: | :----: | :------: | :----: |:-----: |
+| DLA-34(paper) | 1088x608 | 74.9 | 72.8 | 1074 | - | - | 25.9 | - | - |
+| DLA-34 | 1088x608 | 75.0 | 74.7 | 919 | 7934 | 36747 | - | [model](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608.pdparams) | [config](./fairmot_dla34_30e_1088x608.yml) |
+| DLA-34 | 864x480 | 73.0 | 72.6 | 977 | 7578 | 40601 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_864x480.pdparams) | [config](./fairmot_dla34_30e_864x480.yml) |
+| DLA-34 | 576x320 | 69.9 | 70.2 | 1044 | 8869 | 44898 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_576x320.pdparams) | [config](./fairmot_dla34_30e_576x320.yml) |
+
+**Notes:**
+ - FairMOT DLA-34 was trained on 2 GPUs with a mini-batch size of 6 per GPU for 30 epochs.
+
+
+### FairMOT enhance model
+### Results on MOT-16 Test Set
+| backbone | input shape | MOTA | IDF1 | IDS | FP | FN | FPS | download | config |
+| :--------------| :------- | :----: | :----: | :----: | :----: | :----: | :------: | :----: |:-----: |
+| DLA-34 | 1088x608 | 75.9 | 74.7 | 1021 | 11425 | 31475 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_enhance_dla34_60e_1088x608.pdparams) | [config](./fairmot_enhance_dla34_60e_1088x608.yml) |
+| HarDNet-85 | 1088x608 | 75.0 | 70.0 | 1050 | 11837 | 32774 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_enhance_hardnet85_30e_1088x608.pdparams) | [config](./fairmot_enhance_hardnet85_30e_1088x608.yml) |
+
+### Results on MOT-17 Test Set
+| backbone | input shape | MOTA | IDF1 | IDS | FP | FN | FPS | download | config |
+| :--------------| :------- | :----: | :----: | :----: | :----: | :----: | :------: | :----: |:-----: |
+| DLA-34 | 1088x608 | 75.3 | 74.2 | 3270 | 29112 | 106749 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_enhance_dla34_60e_1088x608.pdparams) | [config](./fairmot_enhance_dla34_60e_1088x608.yml) |
+| HarDNet-85 | 1088x608 | 74.7 | 70.7 | 3210 | 29790 | 109914 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_enhance_hardnet85_30e_1088x608.pdparams) | [config](./fairmot_enhance_hardnet85_30e_1088x608.yml) |
+
+**Notes:**
+ - The FairMOT enhance models were trained on 8 GPUs, with the CrowdHuman dataset added to the train set.
+ - FairMOT enhance DLA-34 used a batch size of 16 per GPU and was trained for 60 epochs.
+ - FairMOT enhance HarDNet-85 used a batch size of 10 per GPU and was trained for 30 epochs.
+
+### FairMOT light model
+### Results on MOT-16 Test Set
+| backbone | input shape | MOTA | IDF1 | IDS | FP | FN | FPS | download | config |
+| :--------------| :------- | :----: | :----: | :----: | :----: | :----: | :------: | :----: |:-----: |
+| HRNetV2-W18 | 1088x608 | 71.7 | 66.6 | 1340 | 8642 | 41592 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608.pdparams) | [config](./fairmot_hrnetv2_w18_dlafpn_30e_1088x608.yml) |
+
+### Results on MOT-17 Test Set
+| backbone | input shape | MOTA | IDF1 | IDS | FP | FN | FPS | download | config |
+| :--------------| :------- | :----: | :----: | :----: | :----: | :----: | :------: | :----: |:-----: |
+| HRNetV2-W18 | 1088x608 | 70.7 | 65.7 | 4281 | 22485 | 138468 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608.pdparams) | [config](./fairmot_hrnetv2_w18_dlafpn_30e_1088x608.yml) |
+| HRNetV2-W18 | 864x480 | 70.3 | 65.8 | 4056 | 18927 | 144486 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_864x480.pdparams) | [config](./fairmot_hrnetv2_w18_dlafpn_30e_864x480.yml) |
+| HRNetV2-W18 | 576x320 | 65.3 | 64.8 | 4137 | 28860 | 163017 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.pdparams) | [config](./fairmot_hrnetv2_w18_dlafpn_30e_576x320.yml) |
+
+**Notes:**
+ - FairMOT HRNetV2-W18 was trained on 8 GPUs with a mini-batch size of 4 per GPU for 30 epochs. Only the ImageNet pre-trained model is used, the optimizer is Momentum, and the CrowdHuman dataset is added to the training set.
+
+### FairMOT + BYTETracker
+
+### Results on MOT-17 Half Set
+| backbone | input shape | MOTA | IDF1 | IDS | FP | FN | FPS | download | config |
+| :--------------| :------- | :----: | :----: | :----: | :----: | :----: | :------: | :----: |:-----: |
+| DLA-34 | 1088x608 | 69.1 | 72.8 | 299 | 1957 | 14412 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608.pdparams) | [config](./fairmot_dla34_30e_1088x608.yml) |
+| DLA-34 + BYTETracker| 1088x608 | 70.3 | 73.2 | 234 | 2176 | 13598 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608_bytetracker.pdparams) | [config](./fairmot_dla34_30e_1088x608_bytetracker.yml) |
+
+**Notes:**
+ - The FairMOT model here is for the ablation study. The training set is the five MIX datasets (Caltech, CUHKSYSU, PRW, Cityscapes, ETHZ) plus the first half of MOT17 Train, the pre-trained weights are the CenterNet COCO model, and evaluation is on the second half of MOT17 Train.
+ - To apply BYTETracker to other FairMOT models in PaddleDetection, modify the tracker section of the corresponding config as follows:
+ ```yaml
+ JDETracker:
+   use_byte: True
+   match_thres: 0.8
+   conf_thres: 0.4
+   low_conf_thres: 0.2
+ ```
+
+### FairMOT transfer learning model
+
+### Results on GMOT-40 airplane subset
+| backbone | input shape | MOTA | IDF1 | IDS | FP | FN | FPS | download | config |
+| :--------------| :------- | :----: | :----: | :----: | :----: | :----: | :------: | :----: |:-----: |
+| DLA-34 | 1088x608 | 96.6 | 94.7 | 19 | 300 | 466 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608_airplane.pdparams) | [config](./fairmot_dla34_30e_1088x608_airplane.yml) |
+
+**Notes:**
+ - The dataset of this model is the airplane subset extracted from the GMOT-40 dataset. The download link provided by the PaddleDetection team is ```wget https://bj.bcebos.com/v1/paddledet/data/mot/airplane.zip```; unzip and store it in ```dataset/mot```, then copy ```airplane.train``` to ```dataset/mot/image_lists```.
+ - The FairMOT model here uses the trained pedestrian FairMOT model as pre-trained weights. The training set is the complete airplane set, 4 video sequences in total, which is also used for evaluation.
+ - When applying the model to track other objects, modify ```min_box_area``` and ```vertical_ratio``` of the tracker in the corresponding config file, like this:
+ ```yaml
+ JDETracker:
+   conf_thres: 0.4
+   tracked_thresh: 0.4
+   metric_type: cosine
+   min_box_area: 0 # 200 for pedestrian
+   vertical_ratio: 0 # 1.6 for pedestrian
+ ```
+
+
+## Getting Started
+
+### 1. Training
+
+Train FairMOT on 2 GPUs with the following command:
+
+```bash
+python -m paddle.distributed.launch --log_dir=./fairmot_dla34_30e_1088x608/ --gpus 0,1 tools/train.py -c configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml
+```
+
+
+### 2. Evaluation
+
+Evaluate the tracking performance of FairMOT on the val dataset on a single GPU with the following commands:
+
+```bash
+# use weights released in PaddleDetection model zoo
+CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608.pdparams
+
+# use saved checkpoint in training
+CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml -o weights=output/fairmot_dla34_30e_1088x608/model_final.pdparams
+```
+**Notes:**
+ - The default evaluation dataset is the MOT-16 Train Set. To change the evaluation dataset, modify `configs/datasets/mot.yml` following this example:
+ ```yaml
+ EvalMOTDataset:
+   !MOTImageFolder
+   dataset_dir: dataset/mot
+   data_root: MOT17/images/train
+   keep_ori_im: False # set True to save visualization images or video
+ ```
+ - Tracking results will be saved in `{output_dir}/mot_results/`. Each sequence has one txt file, each line of which is `frame,id,x1,y1,w,h,score,-1,-1,-1`; you can set `{output_dir}` with `--output_dir`.
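+
+ Each line of those txt files follows the MOT Challenge text format. A minimal parsing sketch (the sample line is made up for illustration):
+
+ ```python
+ def parse_mot_line(line):
+     """Parse 'frame,id,x1,y1,w,h,score,-1,-1,-1' into a dict."""
+     frame, tid, x1, y1, w, h, score = line.strip().split(",")[:7]
+     return {
+         "frame": int(frame),
+         "id": int(tid),
+         "box": (float(x1), float(y1), float(w), float(h)),
+         "score": float(score),
+     }
+
+ rec = parse_mot_line("1,3,792.5,421.0,61.0,189.0,0.98,-1,-1,-1")
+ ```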
+
+### 3. Inference
+
+Run inference on a video on a single GPU with the following command:
+
+```bash
+# inference on video and save a video
+CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608.pdparams --video_file={your video name}.mp4 --save_videos
+```
+**Notes:**
+ - Please make sure [ffmpeg](https://ffmpeg.org/ffmpeg.html) is installed first. On Linux (Ubuntu) you can install it directly with: `apt-get update && apt-get install -y ffmpeg`.
+
+
+### 4. Export model
+
+```bash
+CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608.pdparams
+```
+
+### 5. Using the exported model for Python inference
+
+```bash
+python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/fairmot_dla34_30e_1088x608 --video_file={your video name}.mp4 --device=GPU --save_mot_txts
+```
+**Notes:**
+ - The tracking model predicts on videos and does not support single-image prediction. The visualization video of the tracking results is saved by default; add `--save_mot_txts` to save the txt result files, or `--save_images` to save the visualization images.
+ - Each line of the tracking results txt file is `frame,id,x1,y1,w,h,score,-1,-1,-1`.
+
+
+### 6. Using the exported MOT and keypoint models for joint Python inference
+
+```bash
+python deploy/python/mot_keypoint_unite_infer.py --mot_model_dir=output_inference/fairmot_dla34_30e_1088x608/ --keypoint_model_dir=output_inference/higherhrnet_hrnet_w32_512/ --video_file={your video name}.mp4 --device=GPU
+```
+**Notes:**
+ - For the keypoint model export tutorial, please refer to `configs/keypoint/README.md`.
+
+
+## Citations
+```
+@article{zhang2020fair,
+ title={FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking},
+ author={Zhang, Yifu and Wang, Chunyu and Wang, Xinggang and Zeng, Wenjun and Liu, Wenyu},
+ journal={arXiv preprint arXiv:2004.01888},
+ year={2020}
+}
+@article{shao2018crowdhuman,
+ title={CrowdHuman: A Benchmark for Detecting Human in a Crowd},
+ author={Shao, Shuai and Zhao, Zijian and Li, Boxun and Xiao, Tete and Yu, Gang and Zhang, Xiangyu and Sun, Jian},
+ journal={arXiv preprint arXiv:1805.00123},
+ year={2018}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/mot/fairmot/README_cn.md b/PaddleDetection-release-2.6/configs/mot/fairmot/README_cn.md
new file mode 100644
index 0000000000000000000000000000000000000000..dd5a27874e6c7439222ca9f8648099ca25bf9863
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/fairmot/README_cn.md
@@ -0,0 +1,202 @@
+Simplified Chinese | [English](README.md)
+
+# FairMOT (FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking)
+
+## Contents
+- [Introduction](#introduction)
+- [Model Zoo](#model-zoo)
+- [Getting Started](#getting-started)
+- [Citations](#citations)
+
+## Introduction
+
+[FairMOT](https://arxiv.org/abs/2004.01888) is built on the anchor-free CenterNet detector, which avoids the misalignment between anchors and features found in anchor-based frameworks. Fusing deep and shallow features lets the detection and ReID tasks each obtain the features they need, and low-dimensional ReID features are used. It proposes a simple baseline composed of two homogeneous branches that predict pixel-level object scores and ReID features, achieving fairness between the two tasks and a higher level of real-time multi-object tracking accuracy.
+
+### PP-Tracking real-time multi-object tracking system
+In addition, PaddleDetection provides [PP-Tracking](../../../deploy/pptracking/README.md), a real-time multi-object tracking system. PP-Tracking is the first open-source real-time multi-object tracking system in the industry based on the PaddlePaddle deep learning framework, with three key advantages: rich models, broad applications, and efficient deployment.
+PP-Tracking supports two modes, single-camera tracking (MOT) and multi-target multi-camera tracking (MTMCT). Addressing the difficulties and pain points of real-world applications, it provides various multi-object tracking capabilities such as pedestrian tracking, vehicle tracking, multi-class tracking, small-object tracking, traffic counting, and cross-camera tracking. Deployment supports API calls and a GUI, with Python and C++ as deployment languages, on platforms including Linux and NVIDIA Jetson.
+
+### AI Studio public project tutorial
+PP-Tracking provides a public AI Studio project; for the tutorial please refer to [PP-Tracking: a hands-on guide to multi-object tracking](https://aistudio.baidu.com/aistudio/projectdetail/3022582).
+
+## Model Zoo
+
+### FairMOT results on the MOT-16 Training Set
+
+| backbone | input shape | MOTA | IDF1 | IDS | FP | FN | FPS | download | config |
+| :--------------| :------- | :----: | :----: | :---: | :----: | :---: | :------: | :----: |:----: |
+| DLA-34(paper) | 1088x608 | 83.3 | 81.9 | 544 | 3822 | 14095 | - | - | - |
+| DLA-34 | 1088x608 | 83.2 | 83.1 | 499 | 3861 | 14223 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608.pdparams) | [config](./fairmot_dla34_30e_1088x608.yml) |
+| DLA-34 | 864x480 | 80.8 | 81.1 | 561 | 3643 | 16967 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_864x480.pdparams) | [config](./fairmot_dla34_30e_864x480.yml) |
+| DLA-34 | 576x320 | 74.0 | 76.1 | 640 | 4989 | 23034 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_576x320.pdparams) | [config](./fairmot_dla34_30e_576x320.yml) |
+
+### FairMOT results on the MOT-16 Test Set
+
+| backbone | input shape | MOTA | IDF1 | IDS | FP | FN | FPS | download | config |
+| :--------------| :------- | :----: | :----: | :----: | :----: | :----: |:-------: | :----: | :----: |
+| DLA-34(paper) | 1088x608 | 74.9 | 72.8 | 1074 | - | - | 25.9 | - | - |
+| DLA-34 | 1088x608 | 75.0 | 74.7 | 919 | 7934 | 36747 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608.pdparams) | [config](./fairmot_dla34_30e_1088x608.yml) |
+| DLA-34 | 864x480 | 73.0 | 72.6 | 977 | 7578 | 40601 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_864x480.pdparams) | [config](./fairmot_dla34_30e_864x480.yml) |
+| DLA-34 | 576x320 | 69.9 | 70.2 | 1044 | 8869 | 44898 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_576x320.pdparams) | [config](./fairmot_dla34_30e_576x320.yml) |
+
+**Notes:**
+ - FairMOT DLA-34 was trained on 2 GPUs with a batch size of 6 per GPU for 30 epochs.
+
+
+### FairMOT enhance model
+### Results on MOT-16 Test Set
+| backbone | input shape | MOTA | IDF1 | IDS | FP | FN | FPS | download | config |
+| :--------------| :------- | :----: | :----: | :----: | :----: | :----: | :------: | :----: |:-----: |
+| DLA-34 | 1088x608 | 75.9 | 74.7 | 1021 | 11425 | 31475 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_enhance_dla34_60e_1088x608.pdparams) | [config](./fairmot_enhance_dla34_60e_1088x608.yml) |
+| HarDNet-85 | 1088x608 | 75.0 | 70.0 | 1050 | 11837 | 32774 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_enhance_hardnet85_30e_1088x608.pdparams) | [config](./fairmot_enhance_hardnet85_30e_1088x608.yml) |
+
+### Results on MOT-17 Test Set
+| backbone | input shape | MOTA | IDF1 | IDS | FP | FN | FPS | download | config |
+| :--------------| :------- | :----: | :----: | :----: | :----: | :----: | :------: | :----: |:-----: |
+| DLA-34 | 1088x608 | 75.3 | 74.2 | 3270 | 29112 | 106749 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_enhance_dla34_60e_1088x608.pdparams) | [config](./fairmot_enhance_dla34_60e_1088x608.yml) |
+| HarDNet-85 | 1088x608 | 74.7 | 70.7 | 3210 | 29790 | 109914 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_enhance_hardnet85_30e_1088x608.pdparams) | [config](./fairmot_enhance_hardnet85_30e_1088x608.yml) |
+
+**Notes:**
+ - The FairMOT enhance models were trained on 8 GPUs, with the CrowdHuman dataset added to the training set.
+ - FairMOT enhance DLA-34 used a batch size of 16 per GPU and was trained for 60 epochs.
+ - FairMOT enhance HarDNet-85 used a batch size of 10 per GPU and was trained for 30 epochs.
+
+### FairMOT lightweight model
+### Results on MOT-16 Test Set
+| backbone | input shape | MOTA | IDF1 | IDS | FP | FN | FPS | download | config |
+| :--------------| :------- | :----: | :----: | :----: | :----: | :----: | :------: | :----: |:-----: |
+| HRNetV2-W18 | 1088x608 | 71.7 | 66.6 | 1340 | 8642 | 41592 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608.pdparams) | [config](./fairmot_hrnetv2_w18_dlafpn_30e_1088x608.yml) |
+
+### Results on MOT-17 Test Set
+| backbone | input shape | MOTA | IDF1 | IDS | FP | FN | FPS | download | config |
+| :--------------| :------- | :----: | :----: | :----: | :----: | :----: | :------: | :----: |:-----: |
+| HRNetV2-W18 | 1088x608 | 70.7 | 65.7 | 4281 | 22485 | 138468 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608.pdparams) | [config](./fairmot_hrnetv2_w18_dlafpn_30e_1088x608.yml) |
+| HRNetV2-W18 | 864x480 | 70.3 | 65.8 | 4056 | 18927 | 144486 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_864x480.pdparams) | [config](./fairmot_hrnetv2_w18_dlafpn_30e_864x480.yml) |
+| HRNetV2-W18 | 576x320 | 65.3 | 64.8 | 4137 | 28860 | 163017 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.pdparams) | [config](./fairmot_hrnetv2_w18_dlafpn_30e_576x320.yml) |
+
+**Notes:**
+ - FairMOT HRNetV2-W18 was trained on 8 GPUs with a batch size of 4 per GPU for 30 epochs. Only the ImageNet pre-trained model is used, the optimizer is Momentum, and the CrowdHuman dataset is added to the training set.
+
+### FairMOT + BYTETracker
+
+### Results on MOT-17 Half Set
+| backbone | input shape | MOTA | IDF1 | IDS | FP | FN | FPS | download | config |
+| :--------------| :------- | :----: | :----: | :----: | :----: | :----: | :------: | :----: |:-----: |
+| DLA-34 | 1088x608 | 69.1 | 72.8 | 299 | 1957 | 14412 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608.pdparams) | [config](./fairmot_dla34_30e_1088x608.yml) |
+| DLA-34 + BYTETracker| 1088x608 | 70.3 | 73.2 | 234 | 2176 | 13598 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608_bytetracker.pdparams) | [config](./fairmot_dla34_30e_1088x608_bytetracker.yml) |
+
+
+**Notes:**
+ - The FairMOT model here is for the ablation study. The training set is the five original MIX datasets (Caltech, CUHKSYSU, PRW, Cityscapes, ETHZ) plus the first half of MOT17 Train, the pre-trained weights are CenterNet's COCO weights, and evaluation is on the second half of MOT17 Train.
+ - To apply BYTETracker to other FairMOT models in PaddleDetection, modify the tracker section of the corresponding config as follows:
+ ```yaml
+ JDETracker:
+   use_byte: True
+   match_thres: 0.8
+   conf_thres: 0.4
+   low_conf_thres: 0.2
+ ```
+
+### FairMOT transfer learning model
+
+### Results on the GMOT-40 airplane subset
+| backbone | input shape | MOTA | IDF1 | IDS | FP | FN | FPS | download | config |
+| :--------------| :------- | :----: | :----: | :----: | :----: | :----: | :------: | :----: |:-----: |
+| DLA-34 | 1088x608 | 96.6 | 94.7 | 19 | 300 | 466 | - |[model](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608_airplane.pdparams) | [config](./fairmot_dla34_30e_1088x608_airplane.yml) |
+
+**Notes:**
+ - The dataset of this model is the airplane subset extracted from the GMOT-40 dataset. The download link provided by the PaddleDetection team is ```wget https://bj.bcebos.com/v1/paddledet/data/mot/airplane.zip```; unzip and store it in ```dataset/mot```, then copy ```airplane.train``` to ```dataset/mot/image_lists```.
+ - The FairMOT model here uses the trained pedestrian FairMOT model as pre-trained weights. The training set is the complete airplane set, 4 video sequences in total, which is also used for evaluation.
+ - When applying the model to track other objects, modify ```min_box_area``` and ```vertical_ratio``` of the tracker in the corresponding config file, like this:
+ ```yaml
+ JDETracker:
+   conf_thres: 0.4
+   tracked_thresh: 0.4
+   metric_type: cosine
+   min_box_area: 0 # 200 for pedestrian
+   vertical_ratio: 0 # 1.6 for pedestrian
+ ```
+
+## Getting Started
+
+### 1. Training
+
+Train FairMOT on 2 GPUs with the following command:
+
+```bash
+python -m paddle.distributed.launch --log_dir=./fairmot_dla34_30e_1088x608/ --gpus 0,1 tools/train.py -c configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml
+```
+
+### 2. Evaluation
+
+Evaluate on a single GPU with the following commands:
+
+```bash
+# use weights released in the PaddleDetection model zoo
+CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608.pdparams
+
+# use a checkpoint saved during training
+CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml -o weights=output/fairmot_dla34_30e_1088x608/model_final.pdparams
+```
+**Notes:**
+ - The default evaluation dataset is the MOT-16 Train Set. To change the evaluation dataset, modify `configs/datasets/mot.yml` following this example:
+ ```yaml
+ EvalMOTDataset:
+   !MOTImageFolder
+   dataset_dir: dataset/mot
+   data_root: MOT17/images/train
+   keep_ori_im: False # set True to save visualization images or video
+ ```
+ - Tracking results are saved in `{output_dir}/mot_results/`. Each video sequence corresponds to one txt file, each line of which is `frame,id,x1,y1,w,h,score,-1,-1,-1`; `{output_dir}` can be set with `--output_dir`.
+
+### 3. Inference
+
+Run inference on a video on a single GPU and save the result as a video with the following command:
+
+```bash
+# inference on a video
+CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608.pdparams --video_file={your video name}.mp4 --save_videos
+```
+
+**Notes:**
+ - Please make sure [ffmpeg](https://ffmpeg.org/ffmpeg.html) is installed first. On Linux (Ubuntu) you can install it directly with: `apt-get update && apt-get install -y ffmpeg`.
+
+### 4. Export model
+
+```bash
+CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608.pdparams
+```
+
+### 5. Using the exported model for Python inference
+
+```bash
+python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/fairmot_dla34_30e_1088x608 --video_file={your video name}.mp4 --device=GPU --save_mot_txts
+```
+**Notes:**
+ - The tracking model predicts on videos and does not support single-image prediction. The visualization video of the tracking results is saved by default; add `--save_mot_txts` to save the txt result files, or `--save_images` to save the visualization images.
+ - Each line of the tracking results txt file is `frame,id,x1,y1,w,h,score,-1,-1,-1`.
+
+### 6. Using the exported MOT and keypoint models for joint Python inference
+
+```bash
+python deploy/python/mot_keypoint_unite_infer.py --mot_model_dir=output_inference/fairmot_dla34_30e_1088x608/ --keypoint_model_dir=output_inference/higherhrnet_hrnet_w32_512/ --video_file={your video name}.mp4 --device=GPU
+```
+**Notes:**
+ - For the keypoint model export tutorial, please refer to `configs/keypoint/README.md`.
+
+
+## Citations
+```
+@article{zhang2020fair,
+ title={FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking},
+ author={Zhang, Yifu and Wang, Chunyu and Wang, Xinggang and Zeng, Wenjun and Liu, Wenyu},
+ journal={arXiv preprint arXiv:2004.01888},
+ year={2020}
+}
+@article{shao2018crowdhuman,
+ title={CrowdHuman: A Benchmark for Detecting Human in a Crowd},
+ author={Shao, Shuai and Zhao, Zijian and Li, Boxun and Xiao, Tete and Yu, Gang and Zhang, Xiangyu and Sun, Jian},
+ journal={arXiv preprint arXiv:1805.00123},
+ year={2018}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/mot/fairmot/_base_/fairmot_dla34.yml b/PaddleDetection-release-2.6/configs/mot/fairmot/_base_/fairmot_dla34.yml
new file mode 100644
index 0000000000000000000000000000000000000000..9388ab6692be242f5532c696393944b71b232821
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/fairmot/_base_/fairmot_dla34.yml
@@ -0,0 +1,46 @@
+architecture: FairMOT
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/fairmot_dla34_crowdhuman_pretrained.pdparams
+for_mot: True
+
+FairMOT:
+ detector: CenterNet
+ reid: FairMOTEmbeddingHead
+ loss: FairMOTLoss
+ tracker: JDETracker
+
+CenterNet:
+ backbone: DLA
+ neck: CenterNetDLAFPN
+ head: CenterNetHead
+ post_process: CenterNetPostProcess
+
+CenterNetDLAFPN:
+ down_ratio: 4
+ last_level: 5
+ out_channel: 0
+ dcn_v2: True
+ with_sge: False
+
+CenterNetHead:
+ head_planes: 256
+ prior_bias: -2.19
+ regress_ltrb: True
+ size_loss: 'L1'
+ loss_weight: {'heatmap': 1.0, 'size': 0.1, 'offset': 1.0, 'iou': 0.0}
+ add_iou: False
+
+FairMOTEmbeddingHead:
+ ch_head: 256
+ ch_emb: 128
+
+CenterNetPostProcess:
+ max_per_img: 500
+ down_ratio: 4
+ regress_ltrb: True
+
+JDETracker:
+ conf_thres: 0.4
+ tracked_thresh: 0.4
+ metric_type: cosine
+ min_box_area: 200
+ vertical_ratio: 1.6 # for pedestrian
diff --git a/PaddleDetection-release-2.6/configs/mot/fairmot/_base_/fairmot_hardnet85.yml b/PaddleDetection-release-2.6/configs/mot/fairmot/_base_/fairmot_hardnet85.yml
new file mode 100644
index 0000000000000000000000000000000000000000..0924d5fcfa656d2ea6d753930ff8a0ddc7324eaa
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/fairmot/_base_/fairmot_hardnet85.yml
@@ -0,0 +1,43 @@
+architecture: FairMOT
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/centernet_hardnet85_coco.pdparams
+for_mot: True
+
+FairMOT:
+ detector: CenterNet
+ reid: FairMOTEmbeddingHead
+ loss: FairMOTLoss
+ tracker: JDETracker
+
+CenterNet:
+ backbone: HarDNet
+ neck: CenterNetHarDNetFPN
+ head: CenterNetHead
+ post_process: CenterNetPostProcess
+
+HarDNet:
+ depth_wise: False
+ return_idx: [1,3,8,13]
+ arch: 85
+
+CenterNetHarDNetFPN:
+ num_layers: 85
+ down_ratio: 4
+ last_level: 4
+ out_channel: 0
+
+CenterNetHead:
+ head_planes: 128
+
+FairMOTEmbeddingHead:
+ ch_head: 512
+
+CenterNetPostProcess:
+ max_per_img: 500
+ regress_ltrb: True
+
+JDETracker:
+ conf_thres: 0.4
+ tracked_thresh: 0.4
+ metric_type: cosine
+ min_box_area: 200
+ vertical_ratio: 1.6 # for pedestrian
diff --git a/PaddleDetection-release-2.6/configs/mot/fairmot/_base_/fairmot_hrnetv2_w18_dlafpn.yml b/PaddleDetection-release-2.6/configs/mot/fairmot/_base_/fairmot_hrnetv2_w18_dlafpn.yml
new file mode 100644
index 0000000000000000000000000000000000000000..36f761c6f134d48fc54a19854af2a2f37899ad4b
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/fairmot/_base_/fairmot_hrnetv2_w18_dlafpn.yml
@@ -0,0 +1,38 @@
+architecture: FairMOT
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/HRNet_W18_C_pretrained.pdparams
+for_mot: True
+
+FairMOT:
+ detector: CenterNet
+ reid: FairMOTEmbeddingHead
+ loss: FairMOTLoss
+ tracker: JDETracker
+
+CenterNet:
+ backbone: HRNet
+ head: CenterNetHead
+ post_process: CenterNetPostProcess
+ neck: CenterNetDLAFPN
+
+HRNet:
+ width: 18
+ freeze_at: 0
+ return_idx: [0, 1, 2, 3]
+ upsample: False
+
+CenterNetDLAFPN:
+ down_ratio: 4
+ last_level: 3
+ out_channel: 0
+ first_level: 0
+ dcn_v2: False
+
+CenterNetPostProcess:
+ max_per_img: 500
+
+JDETracker:
+ conf_thres: 0.4
+ tracked_thresh: 0.4
+ metric_type: cosine
+ min_box_area: 200
+ vertical_ratio: 1.6 # for pedestrian
diff --git a/PaddleDetection-release-2.6/configs/mot/fairmot/_base_/fairmot_reader_1088x608.yml b/PaddleDetection-release-2.6/configs/mot/fairmot/_base_/fairmot_reader_1088x608.yml
new file mode 100644
index 0000000000000000000000000000000000000000..6c6f8f51636bbe9b374ce76c6583e512758e1120
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/fairmot/_base_/fairmot_reader_1088x608.yml
@@ -0,0 +1,41 @@
+worker_num: 4
+TrainReader:
+ inputs_def:
+ image_shape: [3, 608, 1088]
+ sample_transforms:
+ - Decode: {}
+ - RGBReverse: {}
+ - AugmentHSV: {}
+ - LetterBoxResize: {target_size: [608, 1088]}
+ - MOTRandomAffine: {reject_outside: False}
+ - RandomFlip: {}
+ - BboxXYXY2XYWH: {}
+ - NormalizeBox: {}
+ - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1]}
+ - RGBReverse: {}
+ - Permute: {}
+ batch_transforms:
+ - Gt2FairMOTTarget: {}
+ batch_size: 6
+ shuffle: True
+ drop_last: True
+ use_shared_memory: True
+
+EvalMOTReader:
+ sample_transforms:
+ - Decode: {}
+ - LetterBoxResize: {target_size: [608, 1088]}
+ - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+
+TestMOTReader:
+ inputs_def:
+ image_shape: [3, 608, 1088]
+ sample_transforms:
+ - Decode: {}
+ - LetterBoxResize: {target_size: [608, 1088]}
+ - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1], is_scale: True}
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/mot/fairmot/_base_/fairmot_reader_576x320.yml b/PaddleDetection-release-2.6/configs/mot/fairmot/_base_/fairmot_reader_576x320.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a24cd3630141ce2d97785646b78ff3defb59c279
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/fairmot/_base_/fairmot_reader_576x320.yml
@@ -0,0 +1,41 @@
+worker_num: 4
+TrainReader:
+ inputs_def:
+ image_shape: [3, 320, 576]
+ sample_transforms:
+ - Decode: {}
+ - RGBReverse: {}
+ - AugmentHSV: {}
+ - LetterBoxResize: {target_size: [320, 576]}
+ - MOTRandomAffine: {reject_outside: False}
+ - RandomFlip: {}
+ - BboxXYXY2XYWH: {}
+ - NormalizeBox: {}
+ - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1]}
+ - RGBReverse: {}
+ - Permute: {}
+ batch_transforms:
+ - Gt2FairMOTTarget: {}
+ batch_size: 6
+ shuffle: True
+ drop_last: True
+ use_shared_memory: True
+
+EvalMOTReader:
+ sample_transforms:
+ - Decode: {}
+ - LetterBoxResize: {target_size: [320, 576]}
+ - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1]}
+ - Permute: {}
+ batch_size: 1
+
+
+TestMOTReader:
+ inputs_def:
+ image_shape: [3, 320, 576]
+ sample_transforms:
+ - Decode: {}
+ - LetterBoxResize: {target_size: [320, 576]}
+ - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1], is_scale: True}
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/mot/fairmot/_base_/fairmot_reader_864x480.yml b/PaddleDetection-release-2.6/configs/mot/fairmot/_base_/fairmot_reader_864x480.yml
new file mode 100644
index 0000000000000000000000000000000000000000..92020e9718e30544f18096b59d2d7aed9db33c67
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/fairmot/_base_/fairmot_reader_864x480.yml
@@ -0,0 +1,41 @@
+worker_num: 4
+TrainReader:
+ inputs_def:
+ image_shape: [3, 480, 864]
+ sample_transforms:
+ - Decode: {}
+ - RGBReverse: {}
+ - AugmentHSV: {}
+ - LetterBoxResize: {target_size: [480, 864]}
+ - MOTRandomAffine: {reject_outside: False}
+ - RandomFlip: {}
+ - BboxXYXY2XYWH: {}
+ - NormalizeBox: {}
+ - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1]}
+ - RGBReverse: {}
+ - Permute: {}
+ batch_transforms:
+ - Gt2FairMOTTarget: {}
+ batch_size: 6
+ shuffle: True
+ drop_last: True
+ use_shared_memory: True
+
+EvalMOTReader:
+ sample_transforms:
+ - Decode: {}
+ - LetterBoxResize: {target_size: [480, 864]}
+ - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1]}
+ - Permute: {}
+ batch_size: 1
+
+
+TestMOTReader:
+ inputs_def:
+ image_shape: [3, 480, 864]
+ sample_transforms:
+ - Decode: {}
+ - LetterBoxResize: {target_size: [480, 864]}
+ - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1], is_scale: True}
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/mot/fairmot/_base_/optimizer_30e.yml b/PaddleDetection-release-2.6/configs/mot/fairmot/_base_/optimizer_30e.yml
new file mode 100644
index 0000000000000000000000000000000000000000..6e7ec0dc45e9180cf0e632bd19d0de66d619ec7d
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/fairmot/_base_/optimizer_30e.yml
@@ -0,0 +1,14 @@
+epoch: 30
+
+LearningRate:
+ base_lr: 0.0001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [20,]
+ use_warmup: False
+
+OptimizerBuilder:
+ optimizer:
+ type: Adam
+ regularizer: NULL
diff --git a/PaddleDetection-release-2.6/configs/mot/fairmot/_base_/optimizer_30e_momentum.yml b/PaddleDetection-release-2.6/configs/mot/fairmot/_base_/optimizer_30e_momentum.yml
new file mode 100644
index 0000000000000000000000000000000000000000..987a9af72ef9e69c5354d53d3c2c74919fea5365
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/fairmot/_base_/optimizer_30e_momentum.yml
@@ -0,0 +1,19 @@
+epoch: 30
+
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [15, 22]
+ use_warmup: True
+ - !ExpWarmup
+ steps: 1000
+ power: 4
+
+OptimizerBuilder:
+ optimizer:
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml b/PaddleDetection-release-2.6/configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml
new file mode 100644
index 0000000000000000000000000000000000000000..3ef2b55c4e15ac8f3e4a484947c3d11fb8fb9d02
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/fairmot/fairmot_dla34_30e_1088x608.yml
@@ -0,0 +1,9 @@
+_BASE_: [
+ '../../datasets/mot.yml',
+ '../../runtime.yml',
+ '_base_/optimizer_30e.yml',
+ '_base_/fairmot_dla34.yml',
+ '_base_/fairmot_reader_1088x608.yml',
+]
+
+weights: output/fairmot_dla34_30e_1088x608/model_final
diff --git a/PaddleDetection-release-2.6/configs/mot/fairmot/fairmot_dla34_30e_1088x608_airplane.yml b/PaddleDetection-release-2.6/configs/mot/fairmot/fairmot_dla34_30e_1088x608_airplane.yml
new file mode 100644
index 0000000000000000000000000000000000000000..441947c95cadc3ec2554d110ee04f95752b974b9
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/fairmot/fairmot_dla34_30e_1088x608_airplane.yml
@@ -0,0 +1,33 @@
+_BASE_: [
+ 'fairmot_dla34_30e_1088x608.yml',
+]
+pretrain_weights: https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608.pdparams
+weights: output/fairmot_dla34_30e_1088x608_airplane/model_final
+
+JDETracker:
+ conf_thres: 0.4
+ tracked_thresh: 0.4
+ metric_type: cosine
+ min_box_area: 0
+ vertical_ratio: 0
+
+# for MOT training
+TrainDataset:
+ !MOTDataSet
+ dataset_dir: dataset/mot
+ image_lists: ['airplane.train']
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
+
+# for MOT evaluation
+# If you want to change the MOT evaluation dataset, please modify 'data_root'
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: airplane/images/train
+ keep_ori_im: False # set True if save visualization images or video, or used in DeepSORT
+
+# for MOT video inference
+TestMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ keep_ori_im: True # set True if save visualization images or video
diff --git a/PaddleDetection-release-2.6/configs/mot/fairmot/fairmot_dla34_30e_1088x608_bytetracker.yml b/PaddleDetection-release-2.6/configs/mot/fairmot/fairmot_dla34_30e_1088x608_bytetracker.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a0ad44a0f9a6ef12d3904f1d78ede896f917a90b
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/fairmot/fairmot_dla34_30e_1088x608_bytetracker.yml
@@ -0,0 +1,31 @@
+_BASE_: [
+ '../../datasets/mot.yml',
+ '../../runtime.yml',
+ '_base_/optimizer_30e.yml',
+ '_base_/fairmot_dla34.yml',
+ '_base_/fairmot_reader_1088x608.yml',
+]
+weights: output/fairmot_dla34_30e_1088x608_bytetracker/model_final
+
+# for ablation study, MIX + MOT17-half
+TrainDataset:
+ !MOTDataSet
+ dataset_dir: dataset/mot
+ image_lists: ['mot17.half', 'caltech.all', 'cuhksysu.train', 'prw.train', 'citypersons.train', 'eth.train']
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
+
+# for MOT evaluation
+# If you want to change the MOT evaluation dataset, please modify 'data_root'
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: MOT17/images/half
+ keep_ori_im: False # set True if save visualization images or video, or used in DeepSORT
+
+JDETracker:
+ use_byte: True
+ match_thres: 0.8
+ conf_thres: 0.4
+ low_conf_thres: 0.2
+ min_box_area: 200
+ vertical_ratio: 1.6 # for pedestrian
diff --git a/PaddleDetection-release-2.6/configs/mot/fairmot/fairmot_dla34_30e_576x320.yml b/PaddleDetection-release-2.6/configs/mot/fairmot/fairmot_dla34_30e_576x320.yml
new file mode 100644
index 0000000000000000000000000000000000000000..e2f9ca5fde20c3601b5c4eb009e26a3359f6b5e5
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/fairmot/fairmot_dla34_30e_576x320.yml
@@ -0,0 +1,9 @@
+_BASE_: [
+ '../../datasets/mot.yml',
+ '../../runtime.yml',
+ '_base_/optimizer_30e.yml',
+ '_base_/fairmot_dla34.yml',
+ '_base_/fairmot_reader_576x320.yml',
+]
+
+weights: output/fairmot_dla34_30e_576x320/model_final
diff --git a/PaddleDetection-release-2.6/configs/mot/fairmot/fairmot_dla34_30e_864x480.yml b/PaddleDetection-release-2.6/configs/mot/fairmot/fairmot_dla34_30e_864x480.yml
new file mode 100644
index 0000000000000000000000000000000000000000..8bc152d040a11b0e809b63a3ad0f4a3e2da492b6
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/fairmot/fairmot_dla34_30e_864x480.yml
@@ -0,0 +1,9 @@
+_BASE_: [
+ '../../datasets/mot.yml',
+ '../../runtime.yml',
+ '_base_/optimizer_30e.yml',
+ '_base_/fairmot_dla34.yml',
+ '_base_/fairmot_reader_864x480.yml',
+]
+
+weights: output/fairmot_dla34_30e_864x480/model_final
diff --git a/PaddleDetection-release-2.6/configs/mot/fairmot/fairmot_enhance_dla34_60e_1088x608.yml b/PaddleDetection-release-2.6/configs/mot/fairmot/fairmot_enhance_dla34_60e_1088x608.yml
new file mode 100644
index 0000000000000000000000000000000000000000..c404468e3ecdbf1d6faf7ec0fd47c83a356b18a4
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/fairmot/fairmot_enhance_dla34_60e_1088x608.yml
@@ -0,0 +1,56 @@
+_BASE_: [
+ '../../datasets/mot.yml',
+ '../../runtime.yml',
+ '_base_/optimizer_30e.yml',
+ '_base_/fairmot_dla34.yml',
+ '_base_/fairmot_reader_1088x608.yml',
+]
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+# add crowdhuman
+TrainDataset:
+ !MOTDataSet
+ dataset_dir: dataset/mot
+ image_lists: ['mot17.train', 'caltech.all', 'cuhksysu.train', 'prw.train', 'citypersons.train', 'eth.train', 'crowdhuman.train', 'crowdhuman.val']
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
+
+worker_num: 4
+TrainReader:
+ inputs_def:
+ image_shape: [3, 608, 1088]
+ sample_transforms:
+ - Decode: {}
+ - RGBReverse: {}
+ - AugmentHSV: {}
+ - LetterBoxResize: {target_size: [608, 1088]}
+ - MOTRandomAffine: {reject_outside: False}
+ - RandomFlip: {}
+ - BboxXYXY2XYWH: {}
+ - NormalizeBox: {}
+ - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1]}
+ - RGBReverse: {}
+ - Permute: {}
+ batch_transforms:
+ - Gt2FairMOTTarget: {}
+ batch_size: 16
+ shuffle: True
+ drop_last: True
+ use_shared_memory: True
+
+epoch: 60
+LearningRate:
+ base_lr: 0.0005
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [40,]
+ use_warmup: False
+
+OptimizerBuilder:
+ optimizer:
+ type: Adam
+ regularizer: NULL
+
+weights: output/fairmot_enhance_dla34_60e_1088x608/model_final
diff --git a/PaddleDetection-release-2.6/configs/mot/fairmot/fairmot_enhance_hardnet85_30e_1088x608.yml b/PaddleDetection-release-2.6/configs/mot/fairmot/fairmot_enhance_hardnet85_30e_1088x608.yml
new file mode 100644
index 0000000000000000000000000000000000000000..5cf598c836a3174d513f809c058618263673e069
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/fairmot/fairmot_enhance_hardnet85_30e_1088x608.yml
@@ -0,0 +1,56 @@
+_BASE_: [
+ '../../datasets/mot.yml',
+ '../../runtime.yml',
+ '_base_/optimizer_30e.yml',
+ '_base_/fairmot_hardnet85.yml',
+ '_base_/fairmot_reader_1088x608.yml',
+]
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+# add crowdhuman
+TrainDataset:
+ !MOTDataSet
+ dataset_dir: dataset/mot
+ image_lists: ['mot17.train', 'caltech.all', 'cuhksysu.train', 'prw.train', 'citypersons.train', 'eth.train', 'crowdhuman.train', 'crowdhuman.val']
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
+
+worker_num: 4
+TrainReader:
+ inputs_def:
+ image_shape: [3, 608, 1088]
+ sample_transforms:
+ - Decode: {}
+ - RGBReverse: {}
+ - AugmentHSV: {}
+ - LetterBoxResize: {target_size: [608, 1088]}
+ - MOTRandomAffine: {reject_outside: False}
+ - RandomFlip: {}
+ - BboxXYXY2XYWH: {}
+ - NormalizeBox: {}
+ - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1]}
+ - RGBReverse: {}
+ - Permute: {}
+ batch_transforms:
+ - Gt2FairMOTTarget: {}
+ batch_size: 10
+ shuffle: True
+ drop_last: True
+ use_shared_memory: True
+
+epoch: 30
+LearningRate:
+ base_lr: 0.0001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [20,]
+ use_warmup: False
+
+OptimizerBuilder:
+ optimizer:
+ type: Adam
+ regularizer: NULL
+
+weights: output/fairmot_enhance_hardnet85_30e_1088x608/model_final
diff --git a/PaddleDetection-release-2.6/configs/mot/fairmot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608.yml b/PaddleDetection-release-2.6/configs/mot/fairmot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608.yml
new file mode 100644
index 0000000000000000000000000000000000000000..bd0645fdfe149fb1e00472142b1fcc70224c8641
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/fairmot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608.yml
@@ -0,0 +1,43 @@
+_BASE_: [
+ '../../datasets/mot.yml',
+ '../../runtime.yml',
+ '_base_/optimizer_30e_momentum.yml',
+ '_base_/fairmot_hrnetv2_w18_dlafpn.yml',
+ '_base_/fairmot_reader_1088x608.yml',
+]
+
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+# add crowdhuman
+TrainDataset:
+ !MOTDataSet
+ dataset_dir: dataset/mot
+ image_lists: ['mot17.train', 'caltech.all', 'cuhksysu.train', 'prw.train', 'citypersons.train', 'eth.train', 'crowdhuman.train', 'crowdhuman.val']
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
+
+worker_num: 4
+TrainReader:
+ inputs_def:
+ image_shape: [3, 608, 1088]
+ sample_transforms:
+ - Decode: {}
+ - RGBReverse: {}
+ - AugmentHSV: {}
+ - LetterBoxResize: {target_size: [608, 1088]}
+ - MOTRandomAffine: {reject_outside: False}
+ - RandomFlip: {}
+ - BboxXYXY2XYWH: {}
+ - NormalizeBox: {}
+ - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1]}
+ - RGBReverse: {}
+ - Permute: {}
+ batch_transforms:
+ - Gt2FairMOTTarget: {}
+ batch_size: 4
+ shuffle: True
+ drop_last: True
+ use_shared_memory: True
+
+weights: output/fairmot_hrnetv2_w18_dlafpn_30e_1088x608/model_final
diff --git a/PaddleDetection-release-2.6/configs/mot/fairmot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.yml b/PaddleDetection-release-2.6/configs/mot/fairmot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.yml
new file mode 100644
index 0000000000000000000000000000000000000000..bc35d346e048fa285514e8a4908af60a4f1937c0
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/fairmot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.yml
@@ -0,0 +1,43 @@
+_BASE_: [
+ '../../datasets/mot.yml',
+ '../../runtime.yml',
+ '_base_/optimizer_30e_momentum.yml',
+ '_base_/fairmot_hrnetv2_w18_dlafpn.yml',
+ '_base_/fairmot_reader_576x320.yml',
+]
+
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+# add crowdhuman
+TrainDataset:
+ !MOTDataSet
+ dataset_dir: dataset/mot
+ image_lists: ['mot17.train', 'caltech.all', 'cuhksysu.train', 'prw.train', 'citypersons.train', 'eth.train', 'crowdhuman.train', 'crowdhuman.val']
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
+
+worker_num: 4
+TrainReader:
+ inputs_def:
+ image_shape: [3, 320, 576]
+ sample_transforms:
+ - Decode: {}
+ - RGBReverse: {}
+ - AugmentHSV: {}
+ - LetterBoxResize: {target_size: [320, 576]}
+ - MOTRandomAffine: {reject_outside: False}
+ - RandomFlip: {}
+ - BboxXYXY2XYWH: {}
+ - NormalizeBox: {}
+ - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1]}
+ - RGBReverse: {}
+ - Permute: {}
+ batch_transforms:
+ - Gt2FairMOTTarget: {}
+ batch_size: 4
+ shuffle: True
+ drop_last: True
+ use_shared_memory: True
+
+weights: output/fairmot_hrnetv2_w18_dlafpn_30e_576x320/model_final
diff --git a/PaddleDetection-release-2.6/configs/mot/fairmot/fairmot_hrnetv2_w18_dlafpn_30e_864x480.yml b/PaddleDetection-release-2.6/configs/mot/fairmot/fairmot_hrnetv2_w18_dlafpn_30e_864x480.yml
new file mode 100644
index 0000000000000000000000000000000000000000..061734a48bba1641d1a37b183980622506bf6cb1
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/fairmot/fairmot_hrnetv2_w18_dlafpn_30e_864x480.yml
@@ -0,0 +1,43 @@
+_BASE_: [
+ '../../datasets/mot.yml',
+ '../../runtime.yml',
+ '_base_/optimizer_30e_momentum.yml',
+ '_base_/fairmot_hrnetv2_w18_dlafpn.yml',
+ '_base_/fairmot_reader_864x480.yml',
+]
+
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+# add crowdhuman
+TrainDataset:
+ !MOTDataSet
+ dataset_dir: dataset/mot
+ image_lists: ['mot17.train', 'caltech.all', 'cuhksysu.train', 'prw.train', 'citypersons.train', 'eth.train', 'crowdhuman.train', 'crowdhuman.val']
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
+
+worker_num: 4
+TrainReader:
+ inputs_def:
+ image_shape: [3, 480, 864]
+ sample_transforms:
+ - Decode: {}
+ - RGBReverse: {}
+ - AugmentHSV: {}
+ - LetterBoxResize: {target_size: [480, 864]}
+ - MOTRandomAffine: {reject_outside: False}
+ - RandomFlip: {}
+ - BboxXYXY2XYWH: {}
+ - NormalizeBox: {}
+ - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1]}
+ - RGBReverse: {}
+ - Permute: {}
+ batch_transforms:
+ - Gt2FairMOTTarget: {}
+ batch_size: 4
+ shuffle: True
+ drop_last: True
+ use_shared_memory: True
+
+weights: output/fairmot_hrnetv2_w18_dlafpn_30e_864x480/model_final
diff --git a/PaddleDetection-release-2.6/configs/mot/headtracking21/README.md b/PaddleDetection-release-2.6/configs/mot/headtracking21/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..4015683cfa5969297febc12e7ca1264afabbc0b5
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/headtracking21/README.md
@@ -0,0 +1 @@
+README_cn.md
\ No newline at end of file
diff --git a/PaddleDetection-release-2.6/configs/mot/headtracking21/README_cn.md b/PaddleDetection-release-2.6/configs/mot/headtracking21/README_cn.md
new file mode 100644
index 0000000000000000000000000000000000000000..b7f9274ee1e1e59b9330988edca4d247c39788ac
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/headtracking21/README_cn.md
@@ -0,0 +1,95 @@
+[English](README.md) | Simplified Chinese
+# Featured Vertical-Category Tracking Models
+
+## Head Tracking
+
+Existing pedestrian trackers perform poorly in high-density crowd scenes; head tracking is better suited to such dense scenes.
+[HT-21](https://motchallenge.net/data/Head_Tracking_21) is a head tracking dataset of highly crowded scenes, covering crowded indoor and outdoor scenes under various lighting and environmental conditions; all sequences are recorded at 25 fps.
+
+

+
+
+## Model Zoo
+### FairMOT and ByteTrack results on the HT-21 training set
+| Model | Input Shape | MOTA | IDF1 | IDS | FP | FN | FPS | Download | Config |
+| :--------------| :------- | :----: | :----: | :---: | :----: | :---: | :------: | :----: |:----: |
+| FairMOT DLA-34 | 1088x608 | 64.7 | 69.0 | 8533 | 148817 | 234970 | - | [model](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608_headtracking21.pdparams) | [config](./fairmot_dla34_30e_1088x608_headtracking21.yml) |
+| ByteTrack-x | 1440x800 | 64.1 | 63.4 | 4191 | 185162 | 210240 | - | [model](https://paddledet.bj.bcebos.com/models/mot/bytetrack_yolox_ht21.pdparams) | [config](../bytetrack/bytetrack_yolox_ht21.yml) |
+
+### FairMOT and ByteTrack results on the HT-21 test set
+| Model | Input Shape | MOTA | IDF1 | IDS | FP | FN | FPS | Download | Config |
+| :--------------| :------- | :----: | :----: | :----: | :----: | :----: |:-------: | :----: | :----: |
+| FairMOT DLA-34 | 1088x608 | 60.8 | 62.8 | 12781 | 118109 | 198896 | - | [model](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608_headtracking21.pdparams) | [config](./fairmot_dla34_30e_1088x608_headtracking21.yml) |
+| ByteTrack-x | 1440x800 | 72.6 | 61.8 | 5163 | 71235 | 154139 | - | [model](https://paddledet.bj.bcebos.com/models/mot/bytetrack_yolox_ht21.pdparams) | [config](../bytetrack/bytetrack_yolox_ht21.yml) |
+
+**Notes:**
+ - FairMOT DLA-34 was trained on 2 GPUs with a batch size of 6 per GPU for 30 epochs.
+ - ByteTrack uses YOLOX-x as its detector and was trained on 8 GPUs with a batch size of 8 per GPU for 30 epochs; see [bytetrack](../bytetrack/) for details.
+ - A [download link](https://bj.bcebos.com/v1/paddledet/data/mot/HT21.zip) prepared by the PaddleDetection team is provided here; after downloading, unzip it into the `dataset/mot/` directory. Results on the HT-21 test set must be submitted to the [official website](https://motchallenge.net) for evaluation.
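+
The ByteTrack entries above rely on the BYTE association strategy: detections are split by score, high-score boxes are matched to existing tracks first, and low-score boxes are then used to recover occluded tracks. A simplified, framework-free sketch of the split step — the threshold defaults mirror the `conf_thres`/`low_conf_thres` keys in the tracker configs, and treating them this way is our assumption:

```python
def byte_split(detections, conf_thres=0.4, low_conf_thres=0.2):
    """Split detections (dicts with a 'score' key) into the high- and
    low-score sets used by the two-stage BYTE association; boxes scoring
    below low_conf_thres are discarded."""
    high = [d for d in detections if d['score'] >= conf_thres]
    low = [d for d in detections if low_conf_thres <= d['score'] < conf_thres]
    return high, low
```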
+
+
+## Quick Start
+
+### 1. Training
+Start training on 2 GPUs with the following command:
+```bash
+python -m paddle.distributed.launch --log_dir=./fairmot_dla34_30e_1088x608_headtracking21/ --gpus 0,1 tools/train.py -c configs/mot/headtracking21/fairmot_dla34_30e_1088x608_headtracking21.yml
+```
+
+### 2. Evaluation
+Start evaluation on a single GPU with the following command:
+```bash
+# use weights released by PaddleDetection
+CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/headtracking21/fairmot_dla34_30e_1088x608_headtracking21.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608_headtracking21.pdparams
+
+# use a checkpoint saved during training
+CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/headtracking21/fairmot_dla34_30e_1088x608_headtracking21.yml -o weights=output/fairmot_dla34_30e_1088x608_headtracking21/model_final.pdparams
+```
+
+### 3. Inference
+Predict a video on a single GPU and save the result as a video with the following command:
+```bash
+# predict a video
+CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/headtracking21/fairmot_dla34_30e_1088x608_headtracking21.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608_headtracking21.pdparams --video_file={your video name}.mp4 --save_videos
+```
+**Notes:**
+ - Please make sure [ffmpeg](https://ffmpeg.org/ffmpeg.html) is installed first; on the Linux (Ubuntu) platform you can install it directly with: `apt-get update && apt-get install -y ffmpeg`.
+
+### 4. Export the inference model
+```bash
+CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/headtracking21/fairmot_dla34_30e_1088x608_headtracking21.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608_headtracking21.pdparams
+```
+
+### 5. Python inference with the exported model
+```bash
+python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/fairmot_dla34_30e_1088x608_headtracking21 --video_file={your video name}.mp4 --device=GPU --save_mot_txts
+```
+**Notes:**
+ - The tracking model predicts on videos and does not support single-image prediction. By default, a video visualizing the tracking results is saved; add `--save_mot_txts` to save txt result files, or `--save_images` to save visualization images.
+ - Each line of the tracking results txt file is `frame,id,x1,y1,w,h,score,-1,-1,-1`.
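+
The result txt format above makes simple crowd analytics straightforward, for example counting tracked heads per frame. A minimal sketch, where the score threshold and file handling are illustrative assumptions:

```python
from collections import Counter

def heads_per_frame(path, score_thresh=0.0):
    """Count tracked heads per frame from a MOT result txt whose lines are
    frame,id,x1,y1,w,h,score,-1,-1,-1."""
    counts = Counter()
    with open(path) as f:
        for line in f:
            fields = line.strip().split(',')
            if float(fields[6]) >= score_thresh:
                counts[int(fields[0])] += 1
    return counts
```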
+
+## Citations
+```
+@article{zhang2020fair,
+ title={FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking},
+ author={Zhang, Yifu and Wang, Chunyu and Wang, Xinggang and Zeng, Wenjun and Liu, Wenyu},
+ journal={arXiv preprint arXiv:2004.01888},
+ year={2020}
+}
+
+@InProceedings{Sundararaman_2021_CVPR,
+ author = {Sundararaman, Ramana and De Almeida Braga, Cedric and Marchand, Eric and Pettre, Julien},
+ title = {Tracking Pedestrian Heads in Dense Crowd},
+ booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
+ month = {June},
+ year = {2021},
+ pages = {3865-3875}
+}
+
+@article{zhang2021bytetrack,
+ title={ByteTrack: Multi-Object Tracking by Associating Every Detection Box},
+ author={Zhang, Yifu and Sun, Peize and Jiang, Yi and Yu, Dongdong and Yuan, Zehuan and Luo, Ping and Liu, Wenyu and Wang, Xinggang},
+ journal={arXiv preprint arXiv:2110.06864},
+ year={2021}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/mot/headtracking21/fairmot_dla34_30e_1088x608_headtracking21.yml b/PaddleDetection-release-2.6/configs/mot/headtracking21/fairmot_dla34_30e_1088x608_headtracking21.yml
new file mode 100644
index 0000000000000000000000000000000000000000..8bfbc7ca8b7b76f4b0dbab42999cd6e15f392aaa
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/headtracking21/fairmot_dla34_30e_1088x608_headtracking21.yml
@@ -0,0 +1,26 @@
+_BASE_: [
+ '../fairmot/fairmot_dla34_30e_1088x608.yml'
+]
+
+weights: output/fairmot_dla34_30e_1088x608_headtracking21/model_final
+
+# for MOT training
+TrainDataset:
+ !MOTDataSet
+ dataset_dir: dataset/mot
+ image_lists: ['ht21.train']
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
+
+# for MOT evaluation
+# If you want to change the MOT evaluation dataset, please modify 'data_root'
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: HT21/images/test
+ keep_ori_im: False # set True if save visualization images or video, or used in DeepSORT
+
+# for MOT video inference
+TestMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ keep_ori_im: True # set True if save visualization images or video
diff --git a/PaddleDetection-release-2.6/configs/mot/jde/README.md b/PaddleDetection-release-2.6/configs/mot/jde/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..6eab26e4fb12d862ff953b977a9393bef70df04f
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/jde/README.md
@@ -0,0 +1,119 @@
+English | [Simplified Chinese](README_cn.md)
+
+# JDE (Towards Real-Time Multi-Object Tracking)
+
+## Table of Contents
+- [Introduction](#Introduction)
+- [Model Zoo](#Model_Zoo)
+- [Getting Started](#Getting_Started)
+- [Citations](#Citations)
+
+## Introduction
+
+- [JDE](https://arxiv.org/abs/1909.12605) (Joint Detection and Embedding) learns the object detection task and the appearance embedding task simultaneously in a shared neural network, outputting detection results and their corresponding embeddings at the same time. The original JDE paper builds on the anchor-based YOLOv3 detector, adding a new ReID branch to learn embeddings. Training is formulated as a multi-task learning problem that balances accuracy and speed.
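+
The paper formulates training as multi-task learning with automatic loss balancing: each task loss is weighted by a learnable task-uncertainty term. A framework-free sketch of that weighting — the parameter names `s_det`/`s_emb` are our own, and in practice they are learnable scalars updated by the optimizer:

```python
import math

def jde_joint_loss(loss_det, loss_emb, s_det, s_emb):
    """Uncertainty-weighted sum of the detection and embedding losses,
    following the automatic loss balancing scheme described in the JDE
    paper: each log-variance term discounts its loss and is itself
    penalized to keep the weights from collapsing."""
    return 0.5 * (math.exp(-s_det) * loss_det +
                  math.exp(-s_emb) * loss_emb +
                  s_det + s_emb)
```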
+
+### PP-Tracking real-time MOT system
+In addition, PaddleDetection provides the [PP-Tracking](../../../deploy/pptracking/README.md) real-time multi-object tracking system.
+PP-Tracking is the first open-source real-time multi-object tracking system based on the PaddlePaddle deep learning framework, offering rich models, broad applicability, and efficient deployment.
+
+PP-Tracking supports two paradigms: single-camera tracking (MOT) and multi-camera tracking (MTMCT). Targeting the difficulties and pain points of real-world applications, PP-Tracking provides MOT functions and applications such as pedestrian tracking, vehicle tracking, multi-class tracking, small-object tracking, traffic statistics, and multi-camera tracking. Deployment is supported through an API or a GUI, in Python or C++, on platforms including Linux and NVIDIA Jetson.
+
+### AI Studio public project tutorial
+PP-Tracking provides an AI Studio public project tutorial; please refer to [this tutorial](https://aistudio.baidu.com/aistudio/projectdetail/3022582).
+
+
+

+
+
+## Model Zoo
+
+### JDE Results on MOT-16 Training Set
+
+| backbone | input shape | MOTA | IDF1 | IDS | FP | FN | FPS | download | config |
+| :----------------- | :------- | :----: | :----: | :---: | :----: | :---: | :---: | :---: | :---: |
+| DarkNet53 | 1088x608 | 72.0 | 66.9 | 1397 | 7274 | 22209 | - |[model](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_1088x608.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/mot/jde/jde_darknet53_30e_1088x608.yml) |
+| DarkNet53 | 864x480 | 69.1 | 64.7 | 1539 | 7544 | 25046 | - |[model](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_864x480.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/mot/jde/jde_darknet53_30e_864x480.yml) |
+| DarkNet53 | 576x320 | 63.7 | 64.4 | 1310 | 6782 | 31964 | - |[model](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_576x320.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/mot/jde/jde_darknet53_30e_576x320.yml) |
+
+### JDE Results on MOT-16 Test Set
+
+| backbone | input shape | MOTA | IDF1 | IDS | FP | FN | FPS | download | config |
+| :----------------- | :------- | :----: | :----: | :---: | :----: | :---: | :---: | :---: | :---: |
+| DarkNet53(paper) | 1088x608 | 64.4 | 55.8 | 1544 | - | - | - | - | - |
+| DarkNet53 | 1088x608 | 64.6 | 58.5 | 1864 | 10550 | 52088 | - |[model](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_1088x608.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/mot/jde/jde_darknet53_30e_1088x608.yml) |
+| DarkNet53(paper) | 864x480 | 62.1 | 56.9 | 1608 | - | - | - | - | - |
+| DarkNet53 | 864x480 | 63.2 | 57.7 | 1966 | 10070 | 55081 | - |[model](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_864x480.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/mot/jde/jde_darknet53_30e_864x480.yml) |
+| DarkNet53 | 576x320 | 59.1 | 56.4 | 1911 | 10923 | 61789 | - |[model](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_576x320.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/mot/jde/jde_darknet53_30e_576x320.yml) |
+
+**Notes:**
+ - JDE was trained on 8 GPUs with a mini-batch size of 4 per GPU for 30 epochs.
+
+## Getting Started
+
+### 1. Training
+
+Train JDE on 8 GPUs with the following command:
+
+```bash
+python -m paddle.distributed.launch --log_dir=./jde_darknet53_30e_1088x608/ --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/mot/jde/jde_darknet53_30e_1088x608.yml
+```
+
+### 2. Evaluation
+
+Evaluate the tracking performance of JDE on the validation dataset on a single GPU with the following commands:
+
+```bash
+# use weights released in PaddleDetection model zoo
+CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/jde/jde_darknet53_30e_1088x608.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_1088x608.pdparams
+
+# use saved checkpoint in training
+CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/jde/jde_darknet53_30e_1088x608.yml -o weights=output/jde_darknet53_30e_1088x608/model_final.pdparams
+```
+**Notes:**
+ - The default evaluation dataset is the MOT-16 training set. To change the evaluation dataset, modify `configs/datasets/mot.yml` following this example:
+ ```
+ EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: MOT17/images/train
+ keep_ori_im: False # set True if save visualization images or video
+ ```
+ - Tracking results will be saved in `{output_dir}/mot_results/`, with one txt file per sequence. Each line of a txt file is `frame,id,x1,y1,w,h,score,-1,-1,-1`; `{output_dir}` can be set by `--output_dir`.
+
+### 3. Inference
+
+Run inference on a video on a single GPU with the following command:
+
+```bash
+# inference on video and save a video
+CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/jde/jde_darknet53_30e_1088x608.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_1088x608.pdparams --video_file={your video name}.mp4 --save_videos
+```
+**Notes:**
+ - Please make sure [ffmpeg](https://ffmpeg.org/ffmpeg.html) is installed first; on the Linux (Ubuntu) platform you can install it directly with: `apt-get update && apt-get install -y ffmpeg`.
+
+
+### 4. Export model
+
+```bash
+CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/jde/jde_darknet53_30e_1088x608.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_1088x608.pdparams
+```
+
+### 5. Using the exported model for Python inference
+
+```bash
+python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/jde_darknet53_30e_1088x608 --video_file={your video name}.mp4 --device=GPU --save_mot_txts
+```
+**Notes:**
+ - The tracking model predicts on videos and does not support single-image prediction. By default, a video visualizing the tracking results is saved; you can add `--save_mot_txts` to save txt result files, or `--save_images` to save visualization images.
+ - Each line of the tracking results txt file is `frame,id,x1,y1,w,h,score,-1,-1,-1`.
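+
For downstream analysis, the txt files can be parsed back into per-identity trajectories. A minimal sketch; the function name and the returned structure are illustrative assumptions, not part of the PaddleDetection API:

```python
def parse_mot_results(path):
    """Parse a MOT-challenge-style result file whose lines are
    frame,id,x1,y1,w,h,score,-1,-1,-1 into a dict mapping
    track_id -> [(frame, x1, y1, w, h, score), ...]."""
    tracks = {}
    with open(path) as f:
        for line in f:
            frame, tid, x1, y1, w, h, score = line.strip().split(',')[:7]
            tracks.setdefault(int(tid), []).append(
                (int(frame), float(x1), float(y1), float(w), float(h), float(score)))
    return tracks
```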
+
+
+## Citations
+```
+@article{wang2019towards,
+ title={Towards Real-Time Multi-Object Tracking},
+ author={Wang, Zhongdao and Zheng, Liang and Liu, Yixuan and Wang, Shengjin},
+ journal={arXiv preprint arXiv:1909.12605},
+ year={2019}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/mot/jde/README_cn.md b/PaddleDetection-release-2.6/configs/mot/jde/README_cn.md
new file mode 100644
index 0000000000000000000000000000000000000000..9fd0375f6e2cfebffe226a837ac99023130f0fc2
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/jde/README_cn.md
@@ -0,0 +1,118 @@
+Simplified Chinese | [English](README.md)
+
+# JDE (Towards Real-Time Multi-Object Tracking)
+
+## Contents
+- [Introduction](#Introduction)
+- [Model Zoo](#Model_Zoo)
+- [Quick Start](#Quick_Start)
+- [Citations](#Citations)
+
+## Introduction
+
+[JDE](https://arxiv.org/abs/1909.12605) (Joint Detection and Embedding) learns the object detection task and the appearance embedding task simultaneously in a single shared neural network, outputting detection results and their matching appearance embeddings at the same time. The original JDE paper builds on the anchor-based YOLOv3 detector, adding a new ReID branch to learn embeddings; training is formulated as a multi-task joint learning problem that balances accuracy and speed.
+
+### PP-Tracking real-time MOT system
+In addition, PaddleDetection provides the [PP-Tracking](../../../deploy/pptracking/README.md) real-time multi-object tracking system. PP-Tracking is the first open-source real-time multi-object tracking system based on the PaddlePaddle deep learning framework, offering rich models, broad applicability, and efficient deployment.
+PP-Tracking supports two paradigms: single-camera tracking (MOT) and multi-camera tracking (MTMCT). Targeting the difficulties and pain points of real-world applications, it provides MOT functions and applications such as pedestrian tracking, vehicle tracking, multi-class tracking, small-object tracking, traffic statistics, and multi-camera tracking. Deployment is supported through an API or a GUI, in Python or C++, on platforms including Linux and NVIDIA Jetson.
+
+### AI Studio public project tutorial
+PP-Tracking provides an AI Studio public project tutorial; please refer to [PP-Tracking: a hands-on guide to multi-object tracking](https://aistudio.baidu.com/aistudio/projectdetail/3022582).
+
+
+

+
+
+## Model Zoo
+
+### JDE results on the MOT-16 training set
+
+| Backbone | Input Shape | MOTA | IDF1 | IDS | FP | FN | FPS | Download | Config |
+| :----------------- | :------- | :----: | :----: | :---: | :----: | :---: | :---: | :---: | :---: |
+| DarkNet53 | 1088x608 | 72.0 | 66.9 | 1397 | 7274 | 22209 | - |[model](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_1088x608.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/mot/jde/jde_darknet53_30e_1088x608.yml) |
+| DarkNet53 | 864x480 | 69.1 | 64.7 | 1539 | 7544 | 25046 | - |[model](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_864x480.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/mot/jde/jde_darknet53_30e_864x480.yml) |
+| DarkNet53 | 576x320 | 63.7 | 64.4 | 1310 | 6782 | 31964 | - |[model](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_576x320.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/mot/jde/jde_darknet53_30e_576x320.yml) |
+
+
+### JDE results on the MOT-16 test set
+
+| Backbone | Input Shape | MOTA | IDF1 | IDS | FP | FN | FPS | Download | Config |
+| :----------------- | :------- | :----: | :----: | :---: | :----: | :---: | :---: | :---: | :---: |
+| DarkNet53(paper) | 1088x608 | 64.4 | 55.8 | 1544 | - | - | - | - | - |
+| DarkNet53 | 1088x608 | 64.6 | 58.5 | 1864 | 10550 | 52088 | - |[model](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_1088x608.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/mot/jde/jde_darknet53_30e_1088x608.yml) |
+| DarkNet53(paper) | 864x480 | 62.1 | 56.9 | 1608 | - | - | - | - | - |
+| DarkNet53 | 864x480 | 63.2 | 57.7 | 1966 | 10070 | 55081 | - |[model](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_864x480.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/mot/jde/jde_darknet53_30e_864x480.yml) |
+| DarkNet53 | 576x320 | 59.1 | 56.4 | 1911 | 10923 | 61789 | - |[model](https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_576x320.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/mot/jde/jde_darknet53_30e_576x320.yml) |
+
+**Notes:**
+ - JDE was trained on 8 GPUs with a batch size of 4 per GPU for 30 epochs.
+
+## Quick Start
+
+### 1. Training
+
+Start training on 8 GPUs with the following command:
+
+```bash
+python -m paddle.distributed.launch --log_dir=./jde_darknet53_30e_1088x608/ --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/mot/jde/jde_darknet53_30e_1088x608.yml
+```
+
+### 2. Evaluation
+
+Start evaluation on a single GPU with the following command:
+
+```bash
+# use weights released by PaddleDetection
+CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/jde/jde_darknet53_30e_1088x608.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_1088x608.pdparams
+
+# use a checkpoint saved during training
+CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/jde/jde_darknet53_30e_1088x608.yml -o weights=output/jde_darknet53_30e_1088x608/model_final.pdparams
+```
+**Notes:**
+ - The default evaluation dataset is the MOT-16 training set. To change the evaluation dataset, modify `configs/datasets/mot.yml` following this example:
+ ```
+ EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: MOT17/images/train
+ keep_ori_im: False # set True if save visualization images or video
+ ```
+ - Tracking results will be saved in `{output_dir}/mot_results/`, with one txt file per video sequence. Each line of a txt file is `frame,id,x1,y1,w,h,score,-1,-1,-1`; `{output_dir}` can be set by `--output_dir`.
+
+### 3. Inference
+
+Predict a video on a single GPU and save the result as a video with the following command:
+
+```bash
+# predict a video
+CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/jde/jde_darknet53_30e_1088x608.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_1088x608.pdparams --video_file={your video name}.mp4 --save_videos
+```
+
+**Notes:**
+ - Please make sure [ffmpeg](https://ffmpeg.org/ffmpeg.html) is installed first; on the Linux (Ubuntu) platform you can install it directly with: `apt-get update && apt-get install -y ffmpeg`.
+
+### 4. Export the inference model
+
+```bash
+CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/jde/jde_darknet53_30e_1088x608.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/jde_darknet53_30e_1088x608.pdparams
+```
+
+### 5. Python inference with the exported model
+
+```bash
+python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/jde_darknet53_30e_1088x608 --video_file={your video name}.mp4 --device=GPU --save_mot_txts
+```
+**Notes:**
+ - The tracking model predicts on videos and does not support single-image prediction. By default, a video visualizing the tracking results is saved; add `--save_mot_txts` to save txt result files, or `--save_images` to save visualization images.
+ - Each line of the tracking results txt file is `frame,id,x1,y1,w,h,score,-1,-1,-1`.
+
+
+## Citations
+```
+@article{wang2019towards,
+ title={Towards Real-Time Multi-Object Tracking},
+ author={Wang, Zhongdao and Zheng, Liang and Liu, Yixuan and Wang, Shengjin},
+ journal={arXiv preprint arXiv:1909.12605},
+ year={2019}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/mot/jde/_base_/jde_darknet53.yml b/PaddleDetection-release-2.6/configs/mot/jde/_base_/jde_darknet53.yml
new file mode 100644
index 0000000000000000000000000000000000000000..f5370fc6affa10f33af04185c48d61d5a2f06d98
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/jde/_base_/jde_darknet53.yml
@@ -0,0 +1,56 @@
+architecture: JDE
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/DarkNet53_pretrained.pdparams
+find_unused_parameters: True
+
+JDE:
+ detector: YOLOv3
+ reid: JDEEmbeddingHead
+ tracker: JDETracker
+
+YOLOv3:
+ backbone: DarkNet
+ neck: YOLOv3FPN
+ yolo_head: YOLOv3Head
+ post_process: JDEBBoxPostProcess
+ for_mot: True
+
+DarkNet:
+ depth: 53
+ return_idx: [2, 3, 4]
+ freeze_norm: True
+
+YOLOv3FPN:
+ freeze_norm: True
+
+YOLOv3Head:
+ anchors: [[128,384], [180,540], [256,640], [512,640],
+ [32,96], [45,135], [64,192], [90,271],
+ [8,24], [11,34], [16,48], [23,68]]
+ anchor_masks: [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
+ loss: JDEDetectionLoss
+
+JDEBBoxPostProcess:
+ decode:
+ name: JDEBox
+ conf_thresh: 0.3
+ downsample_ratio: 32
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 500
+ score_threshold: 0.01
+ nms_threshold: 0.5
+ nms_top_k: 2000
+ normalized: true
+
+JDEEmbeddingHead:
+ anchor_levels: 3
+ anchor_scales: 4
+ embedding_dim: 512
+ emb_loss: JDEEmbeddingLoss
+ jde_loss: JDELoss
+
+JDETracker:
+ det_thresh: 0.3
+ track_buffer: 30
+ min_box_area: 200
+ vertical_ratio: 1.6 # for pedestrian
diff --git a/PaddleDetection-release-2.6/configs/mot/jde/_base_/jde_reader_1088x608.yml b/PaddleDetection-release-2.6/configs/mot/jde/_base_/jde_reader_1088x608.yml
new file mode 100644
index 0000000000000000000000000000000000000000..e34e2a366217758d8d4bedaefacff027e7ef857c
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/jde/_base_/jde_reader_1088x608.yml
@@ -0,0 +1,48 @@
+worker_num: 8
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RGBReverse: {}
+ - AugmentHSV: {}
+ - LetterBoxResize: {target_size: [608, 1088]}
+ - MOTRandomAffine: {}
+ - RandomFlip: {}
+ - BboxXYXY2XYWH: {}
+ - NormalizeBox: {}
+ - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1], is_scale: True}
+ - RGBReverse: {}
+ - Permute: {}
+ batch_transforms:
+ - Gt2JDETargetThres:
+ anchor_masks: [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
+ anchors: [[[128,384], [180,540], [256,640], [512,640]],
+ [[32,96], [45,135], [64,192], [90,271]],
+ [[8,24], [11,34], [16,48], [23,68]]]
+ downsample_ratios: [32, 16, 8]
+ ide_thresh: 0.5
+ fg_thresh: 0.5
+ bg_thresh: 0.4
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+
+
+EvalMOTReader:
+ sample_transforms:
+ - Decode: {}
+ - LetterBoxResize: {target_size: [608, 1088]}
+ - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+
+TestMOTReader:
+ inputs_def:
+ image_shape: [3, 608, 1088]
+ sample_transforms:
+ - Decode: {}
+ - LetterBoxResize: {target_size: [608, 1088]}
+ - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1], is_scale: True}
+ - Permute: {}
+ batch_size: 1
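`LetterBoxResize: {target_size: [608, 1088]}` scales each frame to fit a 608x1088 canvas while preserving aspect ratio, then pads the remainder. A minimal sketch of the usual letterbox arithmetic (the helper name and return layout are assumptions, not the actual transform, which also resizes pixels and adjusts boxes):

```python
def letterbox_shape(h, w, target_h=608, target_w=1088):
    """Compute the scaled size and symmetric padding for a letterbox resize."""
    ratio = min(target_h / h, target_w / w)   # fit inside the target canvas
    new_h, new_w = round(h * ratio), round(w * ratio)
    pad_h = (target_h - new_h) / 2            # split padding top/bottom
    pad_w = (target_w - new_w) / 2            # split padding left/right
    return (new_h, new_w), (pad_h, pad_w)

# A 1080x1920 frame scales by 608/1080 ≈ 0.563 and is padded horizontally.
```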
diff --git a/PaddleDetection-release-2.6/configs/mot/jde/_base_/jde_reader_576x320.yml b/PaddleDetection-release-2.6/configs/mot/jde/_base_/jde_reader_576x320.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d1205ada7e8b0d793b6966178b76fea5351f17a5
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/jde/_base_/jde_reader_576x320.yml
@@ -0,0 +1,48 @@
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RGBReverse: {}
+ - AugmentHSV: {}
+ - LetterBoxResize: {target_size: [320, 576]}
+ - MOTRandomAffine: {}
+ - RandomFlip: {}
+ - BboxXYXY2XYWH: {}
+ - NormalizeBox: {}
+ - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1], is_scale: True}
+ - RGBReverse: {}
+ - Permute: {}
+ batch_transforms:
+ - Gt2JDETargetThres:
+ anchor_masks: [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
+ anchors: [[[85,255], [120,360], [170,420], [340,420]],
+ [[21,64], [30,90], [43,128], [60,180]],
+ [[6,16], [8,23], [11,32], [16,45]]]
+ downsample_ratios: [32, 16, 8]
+ ide_thresh: 0.5
+ fg_thresh: 0.5
+ bg_thresh: 0.4
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+
+
+EvalMOTReader:
+ sample_transforms:
+ - Decode: {}
+ - LetterBoxResize: {target_size: [320, 576]}
+ - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+
+TestMOTReader:
+ inputs_def:
+ image_shape: [3, 320, 576]
+ sample_transforms:
+ - Decode: {}
+ - LetterBoxResize: {target_size: [320, 576]}
+ - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1], is_scale: True}
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/mot/jde/_base_/jde_reader_864x480.yml b/PaddleDetection-release-2.6/configs/mot/jde/_base_/jde_reader_864x480.yml
new file mode 100644
index 0000000000000000000000000000000000000000..439eced58e59fe893c6f5ea591c9cc7ec7daf240
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/jde/_base_/jde_reader_864x480.yml
@@ -0,0 +1,48 @@
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RGBReverse: {}
+ - AugmentHSV: {}
+ - LetterBoxResize: {target_size: [480, 864]}
+ - MOTRandomAffine: {}
+ - RandomFlip: {}
+ - BboxXYXY2XYWH: {}
+ - NormalizeBox: {}
+ - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1], is_scale: True}
+ - RGBReverse: {}
+ - Permute: {}
+ batch_transforms:
+ - Gt2JDETargetThres:
+ anchor_masks: [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
+ anchors: [[[102,305], [143,429], [203,508], [407,508]],
+ [[25,76], [36,107], [51,152], [71,215]],
+ [[6,19], [9,27], [13,38], [18,54]]]
+ downsample_ratios: [32, 16, 8]
+ ide_thresh: 0.5
+ fg_thresh: 0.5
+ bg_thresh: 0.4
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+
+
+EvalMOTReader:
+ sample_transforms:
+ - Decode: {}
+ - LetterBoxResize: {target_size: [480, 864]}
+ - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+
+TestMOTReader:
+ inputs_def:
+ image_shape: [3, 480, 864]
+ sample_transforms:
+ - Decode: {}
+ - LetterBoxResize: {target_size: [480, 864]}
+ - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1], is_scale: True}
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/mot/jde/_base_/optimizer_30e.yml b/PaddleDetection-release-2.6/configs/mot/jde/_base_/optimizer_30e.yml
new file mode 100644
index 0000000000000000000000000000000000000000..f90439a5c52573ccfa73c208dc82289b7de9ed31
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/jde/_base_/optimizer_30e.yml
@@ -0,0 +1,20 @@
+epoch: 30
+
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [15, 22]
+ use_warmup: True
+ - !ExpWarmup
+ steps: 1000
+ power: 4
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/mot/jde/_base_/optimizer_60e.yml b/PaddleDetection-release-2.6/configs/mot/jde/_base_/optimizer_60e.yml
new file mode 100644
index 0000000000000000000000000000000000000000..64b81300ded7b5b43ddab2edaaf0c24da547fe89
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/jde/_base_/optimizer_60e.yml
@@ -0,0 +1,20 @@
+epoch: 60
+
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [30, 44]
+ use_warmup: True
+ - !ExpWarmup
+ steps: 1000
+ power: 4
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
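The schedule above warms the learning rate up for 1000 steps and then applies piecewise decay at epochs 30 and 44. A sketch of the resulting curve, assuming `ExpWarmup` scales `base_lr` by `(step / steps) ** power` (an interpretation of the config, not the framework source):

```python
def lr_at(step, steps_per_epoch, base_lr=0.01, warmup_steps=1000,
          power=4, gamma=0.1, milestones=(30, 44)):
    """Sketch of the schedule above: exponential warmup into piecewise decay."""
    if step < warmup_steps:
        # Assumed ExpWarmup form: ramp from 0 to base_lr with exponent `power`.
        return base_lr * (step / warmup_steps) ** power
    epoch = step // steps_per_epoch
    decays = sum(epoch >= m for m in milestones)  # count passed milestones
    return base_lr * gamma ** decays

# At epoch 30 the LR drops to 0.001, and at epoch 44 to 0.0001.
```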
diff --git a/PaddleDetection-release-2.6/configs/mot/jde/jde_darknet53_30e_1088x608.yml b/PaddleDetection-release-2.6/configs/mot/jde/jde_darknet53_30e_1088x608.yml
new file mode 100644
index 0000000000000000000000000000000000000000..9aa2eaa96e50e1d5cd4267e36793fad2967514ad
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/jde/jde_darknet53_30e_1088x608.yml
@@ -0,0 +1,47 @@
+_BASE_: [
+ '../../datasets/mot.yml',
+ '../../runtime.yml',
+ '_base_/optimizer_30e.yml',
+ '_base_/jde_darknet53.yml',
+ '_base_/jde_reader_1088x608.yml',
+]
+weights: output/jde_darknet53_30e_1088x608/model_final
+
+JDE:
+ detector: YOLOv3
+ reid: JDEEmbeddingHead
+ tracker: JDETracker
+
+YOLOv3:
+ backbone: DarkNet
+ neck: YOLOv3FPN
+ yolo_head: YOLOv3Head
+ post_process: JDEBBoxPostProcess
+ for_mot: True
+
+YOLOv3Head:
+ anchors: [[128,384], [180,540], [256,640], [512,640],
+ [32,96], [45,135], [64,192], [90,271],
+ [8,24], [11,34], [16,48], [23,68]]
+ anchor_masks: [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
+ loss: JDEDetectionLoss
+
+JDETracker:
+ det_thresh: 0.3
+ track_buffer: 30
+ min_box_area: 200
+ motion: KalmanFilter
+
+JDEBBoxPostProcess:
+ decode:
+ name: JDEBox
+ conf_thresh: 0.5
+ downsample_ratio: 32
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 500
+ score_threshold: 0.01
+ nms_threshold: 0.4
+ nms_top_k: 2000
+ normalized: true
+ return_index: true
diff --git a/PaddleDetection-release-2.6/configs/mot/jde/jde_darknet53_30e_576x320.yml b/PaddleDetection-release-2.6/configs/mot/jde/jde_darknet53_30e_576x320.yml
new file mode 100644
index 0000000000000000000000000000000000000000..7cee1aafc867d8352d7297dcc6d3bbee59e463f0
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/jde/jde_darknet53_30e_576x320.yml
@@ -0,0 +1,47 @@
+_BASE_: [
+ '../../datasets/mot.yml',
+ '../../runtime.yml',
+ '_base_/optimizer_30e.yml',
+ '_base_/jde_darknet53.yml',
+ '_base_/jde_reader_576x320.yml',
+]
+weights: output/jde_darknet53_30e_576x320/model_final
+
+JDE:
+ detector: YOLOv3
+ reid: JDEEmbeddingHead
+ tracker: JDETracker
+
+YOLOv3:
+ backbone: DarkNet
+ neck: YOLOv3FPN
+ yolo_head: YOLOv3Head
+ post_process: JDEBBoxPostProcess
+ for_mot: True
+
+YOLOv3Head:
+ anchors: [[85,255], [120,360], [170,420], [340,420],
+ [21,64], [30,90], [43,128], [60,180],
+ [6,16], [8,23], [11,32], [16,45]]
+ anchor_masks: [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
+ loss: JDEDetectionLoss
+
+JDETracker:
+ det_thresh: 0.3
+ track_buffer: 30
+ min_box_area: 200
+ motion: KalmanFilter
+
+JDEBBoxPostProcess:
+ decode:
+ name: JDEBox
+ conf_thresh: 0.5
+ downsample_ratio: 32
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 500
+ score_threshold: 0.01
+ nms_threshold: 0.4
+ nms_top_k: 2000
+ normalized: true
+ return_index: true
diff --git a/PaddleDetection-release-2.6/configs/mot/jde/jde_darknet53_30e_864x480.yml b/PaddleDetection-release-2.6/configs/mot/jde/jde_darknet53_30e_864x480.yml
new file mode 100644
index 0000000000000000000000000000000000000000..96ed2232c3b9d2a5ab2b702792fb61f8dcdffc9a
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/jde/jde_darknet53_30e_864x480.yml
@@ -0,0 +1,47 @@
+_BASE_: [
+ '../../datasets/mot.yml',
+ '../../runtime.yml',
+ '_base_/optimizer_30e.yml',
+ '_base_/jde_darknet53.yml',
+ '_base_/jde_reader_864x480.yml',
+]
+weights: output/jde_darknet53_30e_864x480/model_final
+
+JDE:
+ detector: YOLOv3
+ reid: JDEEmbeddingHead
+ tracker: JDETracker
+
+YOLOv3:
+ backbone: DarkNet
+ neck: YOLOv3FPN
+ yolo_head: YOLOv3Head
+ post_process: JDEBBoxPostProcess
+ for_mot: True
+
+YOLOv3Head:
+ anchors: [[102,305], [143,429], [203,508], [407,508],
+ [25,76], [36,107], [51,152], [71,215],
+ [6,19], [9,27], [13,38], [18,54]]
+ anchor_masks: [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
+ loss: JDEDetectionLoss
+
+JDETracker:
+ det_thresh: 0.3
+ track_buffer: 30
+ min_box_area: 200
+ motion: KalmanFilter
+
+JDEBBoxPostProcess:
+ decode:
+ name: JDEBox
+ conf_thresh: 0.5
+ downsample_ratio: 32
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 500
+ score_threshold: 0.01
+ nms_threshold: 0.4
+ nms_top_k: 2000
+ normalized: true
+ return_index: true
diff --git a/PaddleDetection-release-2.6/configs/mot/mcfairmot/README.md b/PaddleDetection-release-2.6/configs/mot/mcfairmot/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..d77a252afc3543226a16b770e0796e4e073d14ce
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/mcfairmot/README.md
@@ -0,0 +1,140 @@
+English | [简体中文](README_cn.md)
+
+# MCFairMOT (Multi-class FairMOT)
+
+## Table of Contents
+- [Introduction](#Introduction)
+- [Model Zoo](#Model_Zoo)
+- [Getting Started](#Getting_Start)
+- [Citations](#Citations)
+
+## Introduction
+
+MCFairMOT is the Multi-class extended version of [FairMOT](https://arxiv.org/abs/2004.01888).
+
+### PP-Tracking real-time MOT system
+In addition, PaddleDetection also provides the [PP-Tracking](../../../deploy/pptracking/README.md) real-time multi-object tracking system.
+PP-Tracking is the first open-source real-time multi-object tracking system based on the PaddlePaddle deep learning framework, offering rich models, broad applications, and efficient deployment.
+
+PP-Tracking supports two paradigms: single-camera tracking (MOT) and multi-camera tracking (MTMCT). Aiming at the difficulties and pain points of real-world business, it provides various MOT functions and applications such as pedestrian tracking, vehicle tracking, multi-class tracking, small-object tracking, traffic statistics, and multi-camera tracking. Deployment supports both API calls and a GUI, the deployment languages include Python and C++, and the supported platforms include Linux, NVIDIA Jetson, etc.
+
+### AI Studio public project tutorial
+PP-Tracking provides a public AI Studio project tutorial; please refer to this [tutorial](https://aistudio.baidu.com/aistudio/projectdetail/3022582).
+
+## Model Zoo
+### MCFairMOT Results on VisDrone2019 Val Set
+| backbone | input shape | MOTA | IDF1 | IDS | FPS | download | config |
+| :--------------| :------- | :----: | :----: | :---: | :------: | :----: |:----: |
+| DLA-34 | 1088x608 | 24.3 | 41.6 | 2314 | - |[model](https://paddledet.bj.bcebos.com/models/mot/mcfairmot_dla34_30e_1088x608_visdrone.pdparams) | [config](./mcfairmot_dla34_30e_1088x608_visdrone.yml) |
+| HRNetV2-W18 | 1088x608 | 20.4 | 39.9 | 2603 | - |[model](https://paddledet.bj.bcebos.com/models/mot/mcfairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone.pdparams) | [config](./mcfairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone.yml) |
+| HRNetV2-W18 | 864x480 | 18.2 | 38.7 | 2416 | - |[model](https://paddledet.bj.bcebos.com/models/mot/mcfairmot_hrnetv2_w18_dlafpn_30e_864x480_visdrone.pdparams) | [config](./mcfairmot_hrnetv2_w18_dlafpn_30e_864x480_visdrone.yml) |
+| HRNetV2-W18 | 576x320 | 12.0 | 33.8 | 2178 | - |[model](https://paddledet.bj.bcebos.com/models/mot/mcfairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone.pdparams) | [config](./mcfairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone.yml) |
+
+**Notes:**
+ - MOTA is the average MOTA over the 10 categories in the VisDrone2019 MOT dataset; its value also equals the average MOTA over all evaluated video sequences. The dataset can be downloaded [here](https://bj.bcebos.com/v1/paddledet/data/mot/visdrone_mcmot.zip).
+ - All MCFairMOT models were trained for 30 epochs on 4 GPUs. The per-GPU batch size is 6 for MCFairMOT DLA-34 and 8 for MCFairMOT HRNetV2-W18.
+
+### MCFairMOT Results on VisDrone Vehicle Val Set
+| backbone | input shape | MOTA | IDF1 | IDS | FPS | download | config |
+| :--------------| :------- | :----: | :----: | :---: | :------: | :----: |:----: |
+| DLA-34 | 1088x608 | 37.7 | 56.8 | 199 | - |[model](https://paddledet.bj.bcebos.com/models/mot/mcfairmot_dla34_30e_1088x608_visdrone_vehicle_bytetracker.pdparams) | [config](./mcfairmot_dla34_30e_1088x608_visdrone_vehicle_bytetracker.yml) |
+| HRNetV2-W18 | 1088x608 | 35.6 | 56.3 | 190 | - |[model](https://paddledet.bj.bcebos.com/models/mot/mcfairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone_vehicle_bytetracker.pdparams) | [config](./mcfairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone_vehicle_bytetracker.yml) |
+
+**Notes:**
+ - MOTA is the average MOTA over the 4 categories in the VisDrone Vehicle dataset, which is extracted from the VisDrone2019 MOT dataset. The dataset can be downloaded [here](https://bj.bcebos.com/v1/paddledet/data/mot/visdrone_mcmot_vehicle.zip).
+ - The tracker used in the MCFairMOT models here is ByteTracker.
+
+### MCFairMOT off-line quantization results on VisDrone Vehicle val-set
+| Model | Compression Strategy | Prediction Latency (T4) | Prediction Latency (V100) | Model Configuration File | Compression Algorithm Configuration File |
+| :--------------| :------- | :------: | :----: | :----: | :----: |
+| DLA-34 | baseline | 41.3 | 21.9 |[Configuration File](./mcfairmot_dla34_30e_1088x608_visdrone_vehicle_bytetracker.yml)| - |
+| DLA-34 | off-line quantization | 37.8 | 21.2 |[Configuration File](./mcfairmot_dla34_30e_1088x608_visdrone_vehicle_bytetracker.yml)|[Configuration File](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/configs/slim/post_quant/mcfairmot_ptq.yml)|
+
+
+## Getting Started
+
+### 1. Training
+Train MCFairMOT on 4 GPUs with the following command:
+```bash
+python -m paddle.distributed.launch --log_dir=./mcfairmot_dla34_30e_1088x608_visdrone/ --gpus 0,1,2,3 tools/train.py -c configs/mot/mcfairmot/mcfairmot_dla34_30e_1088x608_visdrone.yml
+```
+
+### 2. Evaluation
+Evaluate the tracking performance of MCFairMOT on the val dataset with a single GPU using the following commands:
+```bash
+# use weights released in PaddleDetection model zoo
+CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/mcfairmot/mcfairmot_dla34_30e_1088x608_visdrone.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/mcfairmot_dla34_30e_1088x608_visdrone.pdparams
+
+# use saved checkpoint in training
+CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/mcfairmot/mcfairmot_dla34_30e_1088x608_visdrone.yml -o weights=output/mcfairmot_dla34_30e_1088x608_visdrone/model_final.pdparams
+```
+**Notes:**
+ - The default evaluation dataset is the VisDrone2019 MOT val-set. To evaluate on a different dataset, modify `configs/datasets/mcmot.yml` as follows:
+ ```
+ EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: your_dataset/images/val
+ keep_ori_im: False # set True if save visualization images or video
+ ```
+ - Tracking results are saved in `{output_dir}/mot_results/`, one txt file per video sequence; each line of a txt file is `frame,id,x1,y1,w,h,score,cls_id,-1,-1`. You can set `{output_dir}` via `--output_dir`.
+
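The per-sequence result files noted above can be read back with a few lines of Python; the field names below are taken from the `frame,id,x1,y1,w,h,score,cls_id,-1,-1` layout (the helper itself is illustrative, not part of PaddleDetection):

```python
def parse_mot_line(line):
    """Parse one line of a tracking results txt file.

    Field layout taken from the note above:
    frame,id,x1,y1,w,h,score,cls_id,-1,-1
    """
    frame, tid, x1, y1, w, h, score, cls_id, _, _ = line.strip().split(",")
    return {
        "frame": int(frame), "id": int(tid),
        "box": (float(x1), float(y1), float(w), float(h)),
        "score": float(score), "cls_id": int(cls_id),
    }

rec = parse_mot_line("1,3,100.5,200.0,40.0,80.0,0.91,2,-1,-1")
# rec["frame"] == 1, rec["cls_id"] == 2, rec["box"] == (100.5, 200.0, 40.0, 80.0)
```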
+### 3. Inference
+Run inference on a video with a single GPU using the following command:
+```bash
+# inference on video and save a video
+CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/mcfairmot/mcfairmot_dla34_30e_1088x608_visdrone.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/mcfairmot_dla34_30e_1088x608_visdrone.pdparams --video_file={your video name}.mp4 --save_videos
+```
+**Notes:**
+ - Please make sure [ffmpeg](https://ffmpeg.org/ffmpeg.html) is installed first; on Linux (Ubuntu) you can install it directly with: `apt-get update && apt-get install -y ffmpeg`.
+
+
+### 4. Export model
+```bash
+CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/mcfairmot/mcfairmot_dla34_30e_1088x608_visdrone.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/mcfairmot_dla34_30e_1088x608_visdrone.pdparams
+```
+
+### 5. Use the exported model for Python inference
+```bash
+python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/mcfairmot_dla34_30e_1088x608_visdrone --video_file={your video name}.mp4 --device=GPU --save_mot_txts
+```
+**Notes:**
+ - The tracking model predicts on videos and does not support single-image prediction. The visualization video of the tracking results is saved by default; you can add `--save_mot_txts` to save txt result files, or `--save_images` to save visualization images.
+ - Each line of the tracking results txt file is `frame,id,x1,y1,w,h,score,cls_id,-1,-1`.
+
+### 6. Off-line quantization
+
+The offline quantization model is calibrated on the VisDrone Vehicle val-set; run:
+```bash
+CUDA_VISIBLE_DEVICES=0 python3.7 tools/post_quant.py -c configs/mot/mcfairmot/mcfairmot_dla34_30e_1088x608_visdrone_vehicle_bytetracker.yml --slim_config=configs/slim/post_quant/mcfairmot_ptq.yml
+```
+**Notes:**
+ - Offline quantization uses the VisDrone Vehicle val-set dataset and a 4-class vehicle tracking model by default.
+
+## Citations
+```
+@article{zhang2020fair,
+ title={FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking},
+ author={Zhang, Yifu and Wang, Chunyu and Wang, Xinggang and Zeng, Wenjun and Liu, Wenyu},
+ journal={arXiv preprint arXiv:2004.01888},
+ year={2020}
+}
+
+@ARTICLE{9573394,
+ author={Zhu, Pengfei and Wen, Longyin and Du, Dawei and Bian, Xiao and Fan, Heng and Hu, Qinghua and Ling, Haibin},
+ journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
+ title={Detection and Tracking Meet Drones Challenge},
+ year={2021},
+ volume={},
+ number={},
+ pages={1-1},
+ doi={10.1109/TPAMI.2021.3119563}
+}
+
+@article{zhang2021bytetrack,
+ title={ByteTrack: Multi-Object Tracking by Associating Every Detection Box},
+ author={Zhang, Yifu and Sun, Peize and Jiang, Yi and Yu, Dongdong and Yuan, Zehuan and Luo, Ping and Liu, Wenyu and Wang, Xinggang},
+ journal={arXiv preprint arXiv:2110.06864},
+ year={2021}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/mot/mcfairmot/README_cn.md b/PaddleDetection-release-2.6/configs/mot/mcfairmot/README_cn.md
new file mode 100644
index 0000000000000000000000000000000000000000..e8cd6bbc6fc2f953303932f918f7d47864fd4182
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/mcfairmot/README_cn.md
@@ -0,0 +1,137 @@
+Simplified Chinese | [English](README.md)
+
+# MCFairMOT (Multi-class FairMOT)
+
+## Table of Contents
+- [Introduction](#Introduction)
+- [Model Zoo](#Model_Zoo)
+- [Getting Started](#Getting_Started)
+- [Citations](#Citations)
+
+## Introduction
+
+MCFairMOT is the multi-class extended version of [FairMOT](https://arxiv.org/abs/2004.01888).
+
+### PP-Tracking real-time multi-object tracking system
+In addition, PaddleDetection also provides the [PP-Tracking](../../../deploy/pptracking/README.md) real-time multi-object tracking system. PP-Tracking is the first open-source real-time multi-object tracking system in the industry based on the PaddlePaddle deep learning framework, with three major advantages: rich models, broad applications, and efficient deployment.
+PP-Tracking supports two modes: single-camera tracking (MOT) and multi-camera tracking (MTMCT). Aiming at the difficulties and pain points of real-world business, it provides various multi-object tracking functions and applications such as pedestrian tracking, vehicle tracking, multi-class tracking, small-object tracking, traffic statistics, and multi-camera tracking. Deployment supports both API calls and a GUI, the deployment languages include Python and C++, and the supported platforms include Linux, NVIDIA Jetson, etc.
+
+### AI Studio public project tutorial
+PP-Tracking provides a public AI Studio project tutorial; please refer to [PP-Tracking之手把手玩转多目标跟踪](https://aistudio.baidu.com/aistudio/projectdetail/3022582).
+
+## Model Zoo
+
+### MCFairMOT results on the VisDrone2019 MOT val-set
+| Backbone | Input Shape | MOTA | IDF1 | IDS | FPS | Download | Config |
+| :--------------| :------- | :----: | :----: | :---: | :------: | :----: |:----: |
+| DLA-34 | 1088x608 | 24.3 | 41.6 | 2314 | - |[model](https://paddledet.bj.bcebos.com/models/mot/mcfairmot_dla34_30e_1088x608_visdrone.pdparams) | [config](./mcfairmot_dla34_30e_1088x608_visdrone.yml) |
+| HRNetV2-W18 | 1088x608 | 20.4 | 39.9 | 2603 | - |[model](https://paddledet.bj.bcebos.com/models/mot/mcfairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone.pdparams) | [config](./mcfairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone.yml) |
+| HRNetV2-W18 | 864x480 | 18.2 | 38.7 | 2416 | - |[model](https://paddledet.bj.bcebos.com/models/mot/mcfairmot_hrnetv2_w18_dlafpn_30e_864x480_visdrone.pdparams) | [config](./mcfairmot_hrnetv2_w18_dlafpn_30e_864x480_visdrone.yml) |
+| HRNetV2-W18 | 576x320 | 12.0 | 33.8 | 2178 | - |[model](https://paddledet.bj.bcebos.com/models/mot/mcfairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone.pdparams) | [config](./mcfairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone.yml) |
+
+**Notes:**
+ - MOTA is the average MOTA over the 10 object categories in the VisDrone2019 MOT dataset; its value also equals the average MOTA over all evaluated video sequences. The dataset can be downloaded [here](https://bj.bcebos.com/v1/paddledet/data/mot/visdrone_mcmot.zip).
+ - All MCFairMOT models were trained for 30 epochs on 4 GPUs. The per-GPU batch size is 6 for the DLA-34 backbone and 8 for the HRNetV2-W18 backbone.
+
+### MCFairMOT results on the VisDrone Vehicle val-set
+| Backbone | Input Shape | MOTA | IDF1 | IDS | FPS | Download | Config |
+| :--------------| :------- | :----: | :----: | :---: | :------: | :----: |:----: |
+| DLA-34 | 1088x608 | 37.7 | 56.8 | 199 | - |[model](https://paddledet.bj.bcebos.com/models/mot/mcfairmot_dla34_30e_1088x608_visdrone_vehicle_bytetracker.pdparams) | [config](./mcfairmot_dla34_30e_1088x608_visdrone_vehicle_bytetracker.yml) |
+| HRNetV2-W18 | 1088x608 | 35.6 | 56.3 | 190 | - |[model](https://paddledet.bj.bcebos.com/models/mot/mcfairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone_vehicle_bytetracker.pdparams) | [config](./mcfairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone_vehicle_bytetracker.yml) |
+
+**Notes:**
+ - MOTA is the average MOTA over the 4 vehicle categories in the VisDrone Vehicle dataset, which is composed of the 4 vehicle classes extracted from the VisDrone dataset. The dataset can be downloaded [here](https://bj.bcebos.com/v1/paddledet/data/mot/visdrone_mcmot_vehicle.zip).
+ - The tracker used in the MCFairMOT models here is ByteTracker.
+
+### MCFairMOT offline quantization results on the VisDrone Vehicle val-set
+| Backbone | Compression Strategy | Prediction Latency (T4) | Prediction Latency (V100) | Config | Compression Algorithm Config |
+| :--------------| :------- | :------: | :----: | :----: | :----: |
+| DLA-34 | baseline | 41.3 | 21.9 |[config](./mcfairmot_dla34_30e_1088x608_visdrone_vehicle_bytetracker.yml)| - |
+| DLA-34 | offline quantization | 37.8 | 21.2 |[config](./mcfairmot_dla34_30e_1088x608_visdrone_vehicle_bytetracker.yml)|[config](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/configs/slim/post_quant/mcfairmot_ptq.yml)|
+
+## Getting Started
+
+### 1. Training
+Train on 4 GPUs with the following command:
+```bash
+python -m paddle.distributed.launch --log_dir=./mcfairmot_dla34_30e_1088x608_visdrone/ --gpus 0,1,2,3 tools/train.py -c configs/mot/mcfairmot/mcfairmot_dla34_30e_1088x608_visdrone.yml
+```
+
+### 2. Evaluation
+Launch evaluation on a single GPU with the following commands:
+```bash
+# use weights released by PaddleDetection
+CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/mcfairmot/mcfairmot_dla34_30e_1088x608_visdrone.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/mcfairmot_dla34_30e_1088x608_visdrone.pdparams
+
+# use a checkpoint saved during training
+CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/mcfairmot/mcfairmot_dla34_30e_1088x608_visdrone.yml -o weights=output/mcfairmot_dla34_30e_1088x608_visdrone/model_final.pdparams
+```
+**Notes:**
+ - The default evaluation dataset is the VisDrone2019 MOT val-set. To evaluate on a different dataset, modify `configs/datasets/mcmot.yml` as follows:
+ ```
+ EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: your_dataset/images/val
+ keep_ori_im: False # set True if save visualization images or video
+ ```
+ - Multi-class tracking results are saved in `{output_dir}/mot_results/`, one txt file per video sequence; each line of a txt file is `frame,id,x1,y1,w,h,score,cls_id,-1,-1`. You can set `{output_dir}` via `--output_dir`.
+
+### 3. Inference
+Predict a video on a single GPU with the following command and save the result as a video:
+```bash
+# predict a video
+CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/mcfairmot/mcfairmot_dla34_30e_1088x608_visdrone.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/mcfairmot_dla34_30e_1088x608_visdrone.pdparams --video_file={your video name}.mp4 --save_videos
+```
+**Notes:**
+ - Please make sure [ffmpeg](https://ffmpeg.org/ffmpeg.html) is installed first; on Linux (Ubuntu) you can install it directly with: `apt-get update && apt-get install -y ffmpeg`.
+
+### 4. Export the inference model
+```bash
+CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/mcfairmot/mcfairmot_dla34_30e_1088x608_visdrone.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/mcfairmot_dla34_30e_1088x608_visdrone.pdparams
+```
+
+### 5. Use the exported model for Python inference
+```bash
+python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/mcfairmot_dla34_30e_1088x608_visdrone --video_file={your video name}.mp4 --device=GPU --save_mot_txts
+```
+**Notes:**
+ - The tracking model predicts on videos and does not support single-image prediction. The visualization video of the tracking results is saved by default; you can add `--save_mot_txts` to save txt result files, or `--save_images` to save visualization images.
+ - Each line of the multi-class tracking results txt file is `frame,id,x1,y1,w,h,score,cls_id,-1,-1`.
+
+### 6. Offline quantization
+
+Calibrate the offline quantization model on the VisDrone Vehicle val-set; run:
+```bash
+CUDA_VISIBLE_DEVICES=0 python3.7 tools/post_quant.py -c configs/mot/mcfairmot/mcfairmot_dla34_30e_1088x608_visdrone_vehicle_bytetracker.yml --slim_config=configs/slim/post_quant/mcfairmot_ptq.yml
+```
+**Notes:**
+ - Offline quantization uses the VisDrone Vehicle val-set and the 4-class vehicle tracking model by default.
+
+## Citations
+```
+@article{zhang2020fair,
+ title={FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking},
+ author={Zhang, Yifu and Wang, Chunyu and Wang, Xinggang and Zeng, Wenjun and Liu, Wenyu},
+ journal={arXiv preprint arXiv:2004.01888},
+ year={2020}
+}
+
+@ARTICLE{9573394,
+ author={Zhu, Pengfei and Wen, Longyin and Du, Dawei and Bian, Xiao and Fan, Heng and Hu, Qinghua and Ling, Haibin},
+ journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
+ title={Detection and Tracking Meet Drones Challenge},
+ year={2021},
+ volume={},
+ number={},
+ pages={1-1},
+ doi={10.1109/TPAMI.2021.3119563}
+}
+
+@article{zhang2021bytetrack,
+ title={ByteTrack: Multi-Object Tracking by Associating Every Detection Box},
+ author={Zhang, Yifu and Sun, Peize and Jiang, Yi and Yu, Dongdong and Yuan, Zehuan and Luo, Ping and Liu, Wenyu and Wang, Xinggang},
+ journal={arXiv preprint arXiv:2110.06864},
+ year={2021}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/mot/mcfairmot/mcfairmot_dla34_30e_1088x608_visdrone.yml b/PaddleDetection-release-2.6/configs/mot/mcfairmot/mcfairmot_dla34_30e_1088x608_visdrone.yml
new file mode 100644
index 0000000000000000000000000000000000000000..287255fdaf032d2979083d460ef49335409e0b9f
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/mcfairmot/mcfairmot_dla34_30e_1088x608_visdrone.yml
@@ -0,0 +1,42 @@
+_BASE_: [
+ '../fairmot/fairmot_dla34_30e_1088x608.yml',
+ '../../datasets/mcmot.yml'
+]
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/fairmot_dla34_crowdhuman_pretrained.pdparams
+
+FairMOT:
+ detector: CenterNet
+ reid: FairMOTEmbeddingHead
+ loss: FairMOTLoss
+ tracker: JDETracker # multi-class tracker
+
+CenterNetHead:
+ regress_ltrb: False
+
+CenterNetPostProcess:
+ regress_ltrb: False
+ max_per_img: 200
+
+JDETracker:
+ min_box_area: 0
+ vertical_ratio: 0 # no need to filter bboxes according to w/h
+ conf_thres: 0.4
+ tracked_thresh: 0.4
+ metric_type: cosine
+
+weights: output/mcfairmot_dla34_30e_1088x608_visdrone/model_final
+
+epoch: 30
+LearningRate:
+ base_lr: 0.0005
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [10, 20]
+ use_warmup: False
+
+OptimizerBuilder:
+ optimizer:
+ type: Adam
+ regularizer: NULL
diff --git a/PaddleDetection-release-2.6/configs/mot/mcfairmot/mcfairmot_dla34_30e_1088x608_visdrone_vehicle_bytetracker.yml b/PaddleDetection-release-2.6/configs/mot/mcfairmot/mcfairmot_dla34_30e_1088x608_visdrone_vehicle_bytetracker.yml
new file mode 100644
index 0000000000000000000000000000000000000000..99452f5dc55115a4267c9bb4ad4608009a54a16e
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/mcfairmot/mcfairmot_dla34_30e_1088x608_visdrone_vehicle_bytetracker.yml
@@ -0,0 +1,68 @@
+_BASE_: [
+ '../fairmot/fairmot_dla34_30e_1088x608.yml',
+ '../../datasets/mcmot.yml'
+]
+metric: MCMOT
+num_classes: 4
+
+# for MCMOT training
+TrainDataset:
+ !MCMOTDataSet
+ dataset_dir: dataset/mot
+ image_lists: ['visdrone_mcmot_vehicle.train']
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
+ label_list: label_list.txt
+
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: visdrone_mcmot_vehicle/images/val
+ keep_ori_im: False # set True if save visualization images or video, or used in DeepSORT
+ anno_path: dataset/mot/visdrone_mcmot_vehicle/label_list.txt
+
+# for MCMOT video inference
+TestMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ keep_ori_im: True # set True if save visualization images or video
+ anno_path: dataset/mot/visdrone_mcmot_vehicle/label_list.txt
+
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/centernet_dla34_140e_coco.pdparams
+
+FairMOT:
+ detector: CenterNet
+ reid: FairMOTEmbeddingHead
+ loss: FairMOTLoss
+ tracker: JDETracker # multi-class tracker
+
+CenterNetHead:
+ regress_ltrb: False
+
+CenterNetPostProcess:
+ regress_ltrb: False
+ max_per_img: 200
+
+JDETracker:
+ min_box_area: 0
+ vertical_ratio: 0 # no need to filter bboxes according to w/h
+ use_byte: True
+ match_thres: 0.8
+ conf_thres: 0.4
+ low_conf_thres: 0.2
+
+weights: output/mcfairmot_dla34_30e_1088x608_visdrone_vehicle_bytetracker/model_final
+
+epoch: 30
+LearningRate:
+ base_lr: 0.0005
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [10, 20]
+ use_warmup: False
+
+OptimizerBuilder:
+ optimizer:
+ type: Adam
+ regularizer: NULL
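With `use_byte: True`, the tracker follows the ByteTrack association scheme: detections at or above `conf_thres` drive the first matching round, while those between `low_conf_thres` and `conf_thres` are only used to rescue unmatched tracks. A sketch of just the confidence split (illustrative helper, not the JDETracker implementation):

```python
def split_detections(dets, conf_thres=0.4, low_conf_thres=0.2):
    """Split (box, score) detections into the two ByteTrack association pools.

    Thresholds mirror the JDETracker config above: high-score detections feed
    the first association round; low-score ones only rescue unmatched tracks;
    anything below low_conf_thres is discarded.
    """
    high = [d for d in dets if d[1] >= conf_thres]
    low = [d for d in dets if low_conf_thres <= d[1] < conf_thres]
    return high, low

dets = [((0, 0, 10, 10), 0.9), ((5, 5, 10, 10), 0.3), ((9, 9, 2, 2), 0.1)]
high, low = split_detections(dets)
# high keeps the 0.9 detection; low keeps the 0.3; the 0.1 one is discarded.
```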
diff --git a/PaddleDetection-release-2.6/configs/mot/mcfairmot/mcfairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone.yml b/PaddleDetection-release-2.6/configs/mot/mcfairmot/mcfairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone.yml
new file mode 100644
index 0000000000000000000000000000000000000000..cbbb9fa4af630a489fbbfc3f27468c5d801ffd74
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/mcfairmot/mcfairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone.yml
@@ -0,0 +1,47 @@
+_BASE_: [
+ '../fairmot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608.yml',
+ '../../datasets/mcmot.yml'
+]
+
+architecture: FairMOT
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/HRNet_W18_C_pretrained.pdparams
+for_mot: True
+
+FairMOT:
+ detector: CenterNet
+ reid: FairMOTEmbeddingHead
+ loss: FairMOTLoss
+ tracker: JDETracker # multi-class tracker
+
+CenterNetHead:
+ regress_ltrb: False
+
+CenterNetPostProcess:
+ regress_ltrb: False
+ max_per_img: 200
+
+JDETracker:
+ min_box_area: 0
+ vertical_ratio: 0 # no need to filter bboxes according to w/h
+ conf_thres: 0.4
+ tracked_thresh: 0.4
+ metric_type: cosine
+
+weights: output/mcfairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone/model_final
+
+epoch: 30
+LearningRate:
+ base_lr: 0.0005
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [10, 20]
+ use_warmup: False
+
+OptimizerBuilder:
+ optimizer:
+ type: Adam
+ regularizer: NULL
+
+TrainReader:
+ batch_size: 8
diff --git a/PaddleDetection-release-2.6/configs/mot/mcfairmot/mcfairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone_vehicle_bytetracker.yml b/PaddleDetection-release-2.6/configs/mot/mcfairmot/mcfairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone_vehicle_bytetracker.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a1c1de91dc3860b38a1f641cc719cbcfab92d7f3
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/mcfairmot/mcfairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone_vehicle_bytetracker.yml
@@ -0,0 +1,78 @@
+_BASE_: [
+ '../fairmot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608.yml',
+ '../../datasets/mcmot.yml'
+]
+metric: MCMOT
+num_classes: 4
+
+# for MCMOT training
+TrainDataset:
+ !MCMOTDataSet
+ dataset_dir: dataset/mot
+ image_lists: ['visdrone_mcmot_vehicle.train']
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
+ label_list: label_list.txt
+
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: visdrone_mcmot_vehicle/images/val
+ keep_ori_im: False # set True if save visualization images or video, or used in DeepSORT
+ anno_path: dataset/mot/visdrone_mcmot_vehicle/label_list.txt
+
+# for MCMOT video inference
+TestMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ keep_ori_im: True # set True if save visualization images or video
+ anno_path: dataset/mot/visdrone_mcmot_vehicle/label_list.txt
+
+
+architecture: FairMOT
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/HRNet_W18_C_pretrained.pdparams
+for_mot: True
+
+FairMOT:
+ detector: CenterNet
+ reid: FairMOTEmbeddingHead
+ loss: FairMOTLoss
+ tracker: JDETracker # multi-class tracker
+
+CenterNetHead:
+ regress_ltrb: False
+
+CenterNetPostProcess:
+ regress_ltrb: False
+ max_per_img: 200
+
+JDETracker:
+ min_box_area: 0
+ vertical_ratio: 0 # no need to filter bboxes according to w/h
+ use_byte: True
+ match_thres: 0.8
+ conf_thres: 0.4
+ low_conf_thres: 0.2
+
+weights: output/mcfairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone_vehicle_bytetracker/model_final
+
+epoch: 30
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [15, 22]
+ use_warmup: True
+ - !ExpWarmup
+ steps: 1000
+ power: 4
+
+OptimizerBuilder:
+ optimizer:
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
+
+TrainReader:
+ batch_size: 8
diff --git a/PaddleDetection-release-2.6/configs/mot/mcfairmot/mcfairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100k_mcmot.yml b/PaddleDetection-release-2.6/configs/mot/mcfairmot/mcfairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100k_mcmot.yml
new file mode 100644
index 0000000000000000000000000000000000000000..da1170ac53e8b15046dbf21150945e9dd9af0d1a
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/mcfairmot/mcfairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100k_mcmot.yml
@@ -0,0 +1,64 @@
+_BASE_: [
+ '../fairmot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.yml',
+ '../../datasets/mcmot.yml'
+]
+
+metric: MCMOT
+num_classes: 11
+weights: output/mcfairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100k_mcmot/model_final
+
+# for MCMOT training
+TrainDataset:
+ !MCMOTDataSet
+ dataset_dir: dataset/mot
+ image_lists: ['bdd100k_mcmot.train']
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
+ label_list: label_list.txt
+
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: bdd100k_mcmot/images/val
+ keep_ori_im: False
+
+# model config
+architecture: FairMOT
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/HRNet_W18_C_pretrained.pdparams
+for_mot: True
+
+FairMOT:
+ detector: CenterNet
+ reid: FairMOTEmbeddingHead
+ loss: FairMOTLoss
+ tracker: JDETracker # multi-class tracker
+
+CenterNetHead:
+ regress_ltrb: False
+
+CenterNetPostProcess:
+ regress_ltrb: False
+ max_per_img: 200
+
+JDETracker:
+ min_box_area: 0
+ vertical_ratio: 0 # no need to filter bboxes according to w/h
+ conf_thres: 0.4
+ tracked_thresh: 0.4
+ metric_type: cosine
+
+epoch: 30
+LearningRate:
+ base_lr: 0.0005
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [10, 20]
+ use_warmup: False
+
+OptimizerBuilder:
+ optimizer:
+ type: Adam
+ regularizer: NULL
+
+TrainReader:
+ batch_size: 8
diff --git a/PaddleDetection-release-2.6/configs/mot/mcfairmot/mcfairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone.yml b/PaddleDetection-release-2.6/configs/mot/mcfairmot/mcfairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone.yml
new file mode 100644
index 0000000000000000000000000000000000000000..e0fe18381b7490bfb8f263e30599541cd26861b1
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/mcfairmot/mcfairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone.yml
@@ -0,0 +1,47 @@
+_BASE_: [
+ '../fairmot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.yml',
+ '../../datasets/mcmot.yml'
+]
+
+architecture: FairMOT
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/HRNet_W18_C_pretrained.pdparams
+for_mot: True
+
+FairMOT:
+ detector: CenterNet
+ reid: FairMOTEmbeddingHead
+ loss: FairMOTLoss
+ tracker: JDETracker # multi-class tracker
+
+CenterNetHead:
+ regress_ltrb: False
+
+CenterNetPostProcess:
+ regress_ltrb: False
+ max_per_img: 200
+
+JDETracker:
+ min_box_area: 0
+ vertical_ratio: 0 # no need to filter bboxes according to w/h
+ conf_thres: 0.4
+ tracked_thresh: 0.4
+ metric_type: cosine
+
+weights: output/mcfairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone/model_final
+
+epoch: 30
+LearningRate:
+ base_lr: 0.0005
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [10, 20]
+ use_warmup: False
+
+OptimizerBuilder:
+ optimizer:
+ type: Adam
+ regularizer: NULL
+
+TrainReader:
+ batch_size: 8
diff --git a/PaddleDetection-release-2.6/configs/mot/mcfairmot/mcfairmot_hrnetv2_w18_dlafpn_30e_864x480_visdrone.yml b/PaddleDetection-release-2.6/configs/mot/mcfairmot/mcfairmot_hrnetv2_w18_dlafpn_30e_864x480_visdrone.yml
new file mode 100644
index 0000000000000000000000000000000000000000..02d918ddeb0010e6488095bb19658439b7aeebc6
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/mcfairmot/mcfairmot_hrnetv2_w18_dlafpn_30e_864x480_visdrone.yml
@@ -0,0 +1,47 @@
+_BASE_: [
+ '../fairmot/fairmot_hrnetv2_w18_dlafpn_30e_864x480.yml',
+ '../../datasets/mcmot.yml'
+]
+
+architecture: FairMOT
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/HRNet_W18_C_pretrained.pdparams
+for_mot: True
+
+FairMOT:
+ detector: CenterNet
+ reid: FairMOTEmbeddingHead
+ loss: FairMOTLoss
+ tracker: JDETracker # multi-class tracker
+
+CenterNetHead:
+ regress_ltrb: False
+
+CenterNetPostProcess:
+ regress_ltrb: False
+ max_per_img: 200
+
+JDETracker:
+ min_box_area: 0
+ vertical_ratio: 0 # no need to filter bboxes according to w/h
+ conf_thres: 0.4
+ tracked_thresh: 0.4
+ metric_type: cosine
+
+weights: output/mcfairmot_hrnetv2_w18_dlafpn_30e_864x480_visdrone/model_final
+
+epoch: 30
+LearningRate:
+ base_lr: 0.0005
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [10, 20]
+ use_warmup: False
+
+OptimizerBuilder:
+ optimizer:
+ type: Adam
+ regularizer: NULL
+
+TrainReader:
+ batch_size: 8
diff --git a/PaddleDetection-release-2.6/configs/mot/mtmct/README.md b/PaddleDetection-release-2.6/configs/mot/mtmct/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..4015683cfa5969297febc12e7ca1264afabbc0b5
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/mtmct/README.md
@@ -0,0 +1 @@
+README_cn.md
\ No newline at end of file
diff --git a/PaddleDetection-release-2.6/configs/mot/mtmct/README_cn.md b/PaddleDetection-release-2.6/configs/mot/mtmct/README_cn.md
new file mode 100644
index 0000000000000000000000000000000000000000..f4b5d12ee6fc1ca6afbb86243535fb5435539c52
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/mtmct/README_cn.md
@@ -0,0 +1,137 @@
+简体中文 | [English](README.md)
+
+# MTMCT (Multi-Target Multi-Camera Tracking)
+
+## Contents
+- [Introduction](#introduction)
+- [Model Zoo](#model-zoo)
+- [Dataset Preparation](#dataset-preparation)
+- [Getting Started](#getting-started)
+- [Citation](#citation)
+
+## Introduction
+MTMCT (Multi-Target Multi-Camera Tracking) tracks multiple targets across videos captured by different cameras of the same scene. It is an important research topic in the tracking field and plays a key role in security surveillance, autonomous driving, smart cities, and other industries. Since MTMCT operates on videos from different cameras in a single scene, its accuracy depends heavily on scene priors and on camera information such as the number of cameras, their viewing angles, and their topology. PaddleDetection provides a baseline MTMCT implementation with the scene- and camera-specific optimizations removed; to further improve accuracy, post-processing algorithms tailored to the target scene and camera setup need to be designed. The DeepSORT scheme is adopted here for MTMCT: for real-time performance, PaddleDetection's self-developed [PP-YOLOv2](../../ppyolo/) and the lightweight [PP-PicoDet](../../picodet/) serve as detectors, and PaddleClas's lightweight PP-LCNet serves as the ReID model.
+
+MTMCT is an important part of the [PP-Tracking](../../../deploy/pptracking) project. [PP-Tracking](../../../deploy/pptracking/README.md) is the industry's first open-source real-time tracking system based on the PaddlePaddle deep learning framework. Targeting real-world pain points, PP-Tracking provides built-in capabilities such as pedestrian and vehicle tracking, cross-camera tracking, multi-class tracking, small-object tracking, and flow counting, together with a visual development interface. It integrates object detection, lightweight ReID, and multi-object tracking algorithms to improve server-side deployment performance, and supports both Python and C++ deployment on Linux, NVIDIA Jetson, and other platforms. See that directory for details.
+
+### AI Studio Project
+PP-Tracking provides a public AI Studio project; see the tutorial [PP-Tracking: a hands-on guide to multi-object tracking](https://aistudio.baidu.com/aistudio/projectdetail/3022582).
+
+## Model Zoo
+### DeepSORT results on the AIC21 MTMCT (CityFlow) vehicle cross-camera tracking Test set
+
+| Detector | Input Size | ReID | Scene | Tricks | IDF1 | IDP | IDR | Precision | Recall | FPS | Detector Download | ReID Download |
+| :--------- | :--------- | :------- | :----- | :------ |:----- |:------- |:----- |:--------- |:-------- |:----- |:------ | :------ |
+| PP-PicoDet | 640x640 | PP-LCNet | S06 | - | 0.3617 | 0.4417 | 0.3062 | 0.6266 | 0.4343 | - |[Detector](https://paddledet.bj.bcebos.com/models/mot/deepsort/picodet_l_640_aic21mtmct_vehicle.tar) |[ReID](https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_pplcnet_vehicle.tar) |
+| PPYOLOv2 | 640x640 | PP-LCNet | S06 | - | 0.4450 | 0.4611 | 0.4300 | 0.6385 | 0.5954 | - |[Detector](https://paddledet.bj.bcebos.com/models/mot/deepsort/ppyolov2_r50vd_dcn_365e_aic21mtmct_vehicle.tar) |[ReID](https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_pplcnet_vehicle.tar) |
+
+**Notes:**
+ - S06 is the scene name of the AIC21 MTMCT Test set; it contains videos from 6 cameras: c041, c042, c043, c044, c045, c046.
+ - Since only forward (inference) parameters are needed for deployment, the models provided here are already exported; after extraction you will find four files: `infer_cfg.yml`, `model.pdiparams`, `model.pdiparams.info`, and `model.pdmodel`.
+
+
+## Dataset Preparation
+Vehicle cross-camera tracking uses the [AIC21 MTMCT](https://www.aicitychallenge.org) (CityFlow) dataset. A version organized by the PaddleDetection team can be downloaded with `wget https://paddledet.bj.bcebos.com/data/mot/aic21mtmct_vehicle.zip`; testing uses its S06 folder. A very small demo test set extracted from the S01 scene of AIC21 MTMCT is also provided: `wget https://paddledet.bj.bcebos.com/data/mot/demo/mtmct-demo.tar`
+
+The dataset is processed as follows:
+```
+# The original AIC21 MTMCT dataset directory looks like this:
+|——————AIC21_Track3_MTMC_Tracking
+ |——————cam_framenum (Number of frames below each camera)
+ |——————cam_loc (Positional relationship between cameras)
+ |——————cam_timestamp (Time difference between cameras)
+ |——————eval (evaluation function and ground_truth.txt)
+ |——————test (S06 dataset)
+ |——————train (S01,S03,S04 dataset)
+ |——————validation (S02,S05 dataset)
+ |——————DataLicenseAgreement_AICityChallenge_2021.pdf
+ |——————list_cam.txt (List of all camera paths)
+ |——————ReadMe.txt (Dataset description)
+|——————gen_aicity_mtmct_data.py (Camera videos extraction script)
+```
+It needs to be converted into the following format:
+```
+aic21mtmct_vehicle/
+├── S01
+ ├── gt
+ │ ├── gt.txt
+ ├── images
+ ├── c001
+ │ ├── img1
+ │ │ ├── 0000001.jpg
+ │ │ ...
+ │ ├── roi.jpg
+ ├── c002
+ ...
+ ├── c006
+├── S02
+...
+├── S05
+├── S06
+ ├── images
+ ├── c041
+ ├── img1
+ ├── 0000001.jpg
+ ...
+
+ ├── c042
+ ...
+ ├── c046
+ ├── zone (only for test-set S06 when use camera tricks for testing)
+ ├── c041.png
+ ...
+ ├── c046.png
+```
+
+#### Generate validation data for scene S01
+```bash
+python gen_aicity_mtmct_data.py ./AIC21_Track3_MTMC_Tracking/train/S01
+```
+
+**Notes:**
+ - The AIC21 MTMCT dataset covers 6 scenes and 46 cameras in total: S01, S03, and S04 are the training set, S02 and S05 the validation set, and S06 the test set. Scene S06 contains videos from 6 cameras: c041, c042, c043, c044, c045, c046.
+
+## Getting Started
+
+### 1. Download the Exported Models
+Step 1: download the exported detection model
+```bash
+wget https://paddledet.bj.bcebos.com/models/mot/deepsort/picodet_l_640_aic21mtmct_vehicle.tar
+tar -xvf picodet_l_640_aic21mtmct_vehicle.tar
+```
+Step 2: download the exported ReID model
+```bash
+wget https://paddledet.bj.bcebos.com/models/mot/deepsort/deepsort_pplcnet_vehicle.tar
+tar -xvf deepsort_pplcnet_vehicle.tar
+```
+**Notes:**
+ - PP-PicoDet is a lightweight detection model; for training, see [configs/picodet](../../picodet/README.md) and remember to modify the number of classes and the dataset paths.
+ - PP-LCNet is a lightweight ReID model; for training, see [PaddleClas](https://github.com/PaddlePaddle/PaddleClas). The provided weights were trained on the VERI-Wild vehicle re-identification dataset, so it is recommended to use them directly without retraining.
+
+
+### 2. Inference with the Exported Models in Python
+```bash
+# Download the demo test video
+wget https://paddledet.bj.bcebos.com/data/mot/demo/mtmct-demo.tar
+tar -xvf mtmct-demo.tar
+
+# Run Python inference with the exported PicoDet vehicle detection model and the PP-LCNet vehicle ReID model
+python deploy/pptracking/python/mot_sde_infer.py --model_dir=picodet_l_640_aic21mtmct_vehicle/ --reid_model_dir=deepsort_pplcnet_vehicle/ --mtmct_dir=mtmct-demo --mtmct_cfg=mtmct_cfg --device=GPU --scaled=True --save_mot_txts --save_images
+```
+**Notes:**
+ - The tracking model predicts on videos and does not support single-image prediction. By default the visualized tracking result is saved as a video; add `--save_mot_txts` to save one txt per video, or `--save_images` to save the visualized result images.
+ - `--scaled` indicates whether the coordinates in the model output are already scaled back to the original image. Set it to False for a JDE YOLOv3 detection model and True for a general detection model.
+ - `--mtmct_dir` is the folder of one MTMCT scene, containing one image folder per camera video of that scene; there must be at least two.
+ - `--mtmct_cfg` is the config file of that MTMCT scene. It contains switches for trick operations and paths to camera-related settings for the scene; users can change the paths and enable or disable the operations as needed.
+ - MTMCT outputs results as videos and a txt file. One visualized cross-camera tracking video is generated per image folder; unlike single-camera tracking, where results are independent across video folders, the cross-camera results share consistent ids. There is a single result txt, which has one extra leading column compared with single-camera results: each line is `camera_id,frame,id,x1,y1,w,h,-1,-1`.
+ - MTMCT is an important part of the [PP-Tracking](../../../deploy/pptracking) project; see that directory for details.
+
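As a quick sanity check of the cross-camera result format described above, a minimal parser can group the `camera_id,frame,id,x1,y1,w,h,-1,-1` rows by global track id (a sketch for illustration only; `parse_mtmct_txt` is not part of the toolkit):

```python
from collections import defaultdict

def parse_mtmct_txt(lines):
    """Group MTMCT result rows by global track id.

    Each row is 'camera_id,frame,id,x1,y1,w,h,-1,-1' as described in the
    note above; returns {track_id: [(camera_id, frame, box), ...]}.
    """
    tracks = defaultdict(list)
    for line in lines:
        fields = line.strip().split(',')
        cam, frame, tid = int(fields[0]), int(fields[1]), int(fields[2])
        box = tuple(float(v) for v in fields[3:7])  # x1, y1, w, h
        tracks[tid].append((cam, frame, box))
    return dict(tracks)

rows = ["1,1,7,100,200,50,80,-1,-1", "2,3,7,400,180,48,82,-1,-1"]
tracks = parse_mtmct_txt(rows)
# Track 7 appears in cameras 1 and 2, i.e. a cross-camera match.
print(sorted(cam for cam, _, _ in tracks[7]))  # → [1, 2]
```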
+
+## Citation
+```
+@InProceedings{Tang19CityFlow,
+author = {Zheng Tang and Milind Naphade and Ming-Yu Liu and Xiaodong Yang and Stan Birchfield and Shuo Wang and Ratnesh Kumar and David Anastasiu and Jenq-Neng Hwang},
+title = {CityFlow: A City-Scale Benchmark for Multi-Target Multi-Camera Vehicle Tracking and Re-Identification},
+booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
+month = {June},
+year = {2019},
+pages = {8797–8806}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/mot/mtmct/gen_aicity_mtmct_data.py b/PaddleDetection-release-2.6/configs/mot/mtmct/gen_aicity_mtmct_data.py
new file mode 100644
index 0000000000000000000000000000000000000000..00bc952f64bce5565cb537fd8c123c10f33e253a
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/mtmct/gen_aicity_mtmct_data.py
@@ -0,0 +1,62 @@
+# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import os
+import sys
+import cv2
+import glob
+
+
+def video2frames(sourceVdo, dstDir):
+ videoData = cv2.VideoCapture(sourceVdo)
+ count = 0
+ while (videoData.isOpened()):
+ count += 1
+ ret, frame = videoData.read()
+ if ret:
+ cv2.imwrite(f"{dstDir}/{count:07d}.jpg", frame)
+ if count % 20 == 0:
+ print(f"{dstDir}/{count:07d}.jpg")
+ else:
+ break
+ videoData.release()
+
+
+def transSeq(seqs_path, new_root):
+ sonCameras = glob.glob(seqs_path + "/*")
+ sonCameras.sort()
+ for vdoList in sonCameras:
+ Seq = vdoList.split('/')[-2]
+ Camera = vdoList.split('/')[-1]
+ os.system(f"mkdir -p {new_root}/{Seq}/images/{Camera}/img1")
+
+ roi_path = vdoList + '/roi.jpg'
+ new_roi_path = f"{new_root}/{Seq}/images/{Camera}"
+ os.system(f"cp {roi_path} {new_roi_path}")
+
+ video2frames(f"{vdoList}/vdo.avi",
+ f"{new_root}/{Seq}/images/{Camera}/img1")
+
+
+if __name__ == "__main__":
+ seq_path = sys.argv[1]
+ new_root = 'aic21mtmct_vehicle'
+
+ seq_name = seq_path.split('/')[-1]
+ data_path = seq_path.split('/')[-3]
+ os.system(f"mkdir -p {new_root}/{seq_name}/gt")
+ os.system(f"cp {data_path}/eval/ground*.txt {new_root}/{seq_name}/gt")
+
+ # extract video frames
+ transSeq(seq_path, new_root)
diff --git a/PaddleDetection-release-2.6/configs/mot/ocsort/README.md b/PaddleDetection-release-2.6/configs/mot/ocsort/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..4015683cfa5969297febc12e7ca1264afabbc0b5
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/ocsort/README.md
@@ -0,0 +1 @@
+README_cn.md
\ No newline at end of file
diff --git a/PaddleDetection-release-2.6/configs/mot/ocsort/README_cn.md b/PaddleDetection-release-2.6/configs/mot/ocsort/README_cn.md
new file mode 100644
index 0000000000000000000000000000000000000000..b673382dad91b012af786ae1d91e7126bdb4a201
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/ocsort/README_cn.md
@@ -0,0 +1,101 @@
+简体中文 | [English](README.md)
+
+# OC_SORT (Observation-Centric SORT: Rethinking SORT for Robust Multi-Object Tracking)
+
+## Contents
+- [Introduction](#introduction)
+- [Model Zoo](#model-zoo)
+- [Getting Started](#getting-started)
+- [Citation](#citation)
+
+## Introduction
+[OC_SORT](https://arxiv.org/abs/2203.14360) (Observation-Centric SORT: Rethinking SORT for Robust Multi-Object Tracking). Configurations for several commonly used detectors are provided here as references. Differences in training dataset, input size, number of training epochs, NMS threshold, and other settings all lead to differences in accuracy and speed, so please adapt the configs to your own needs.
+
+## Model Zoo
+
+### OC_SORT results on the MOT-17 half Val Set
+
+| Detector Training Dataset | Detector | Input Size | ReID | Detection mAP | MOTA | IDF1 | FPS | Config |
+| :-------- | :----- | :----: | :----:|:------: | :----: |:-----: |:----:|:----: |
+| MOT-17 half train | PP-YOLOE-l | 640x640 | - | 52.9 | 50.1 | 62.6 | - |[Config](./ocsort_ppyoloe.yml) |
+| **mix_mot_ch** | YOLOX-x | 800x1440| - | 61.9 | 75.5 | 77.0 | - |[Config](./ocsort_yolox.yml) |
+
+**Notes:**
+ - The model weight download links are the ```det_weights``` and ```reid_weights``` entries in the config files; they are downloaded automatically when the evaluation command runs. OC_SORT does not need ```reid_weights``` by default.
+ - **MOT17-half train** consists of the images and annotations of the first half of the frames of each of the 7 MOT17 train sequences. For accuracy validation, the **MOT17-half val** set, made of the second half of the frames of each video, can be used; it can be downloaded from [this link](https://bj.bcebos.com/v1/paddledet/data/mot/MOT17.zip) and extracted into the `dataset/mot/` folder.
+ - The **mix_mot_ch** dataset is a joint dataset of MOT17 and CrowdHuman, and **mix_det** is a joint dataset of MOT17, CrowdHuman, Cityscapes, and ETHZ. Their format and directory layout can follow [this link](https://github.com/ifzhang/ByteTrack#data-preparation); place them under `dataset/mot/`. For accuracy validation, the **MOT17-half val** set can be used for both.
+ - OC_SORT training means training the detector alone on the MOT dataset; inference assembles the tracker to evaluate MOT metrics, and the detection model alone can also be evaluated on detection metrics.
+ - For export and deployment, OC_SORT exports the detection model alone and then assembles the tracker at runtime; see [PP-Tracking](../../../deploy/pptracking/python).
+ - OC_SORT is the main tracking solution of pipeline analysis projects such as PP-Human and PP-Vehicle; see [Pipeline](../../../deploy/pipeline) and [MOT](../../../deploy/pipeline/docs/tutorials/pphuman_mot.md) for usage.
+
+
+## Getting Started
+
+### 1. Training
+Start training and evaluation with the following command:
+```bash
+python -m paddle.distributed.launch --log_dir=ppyoloe --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/mot/bytetrack/detector/ppyoloe_crn_l_36e_640x640_mot17half.yml --eval --amp
+```
+
+### 2. Evaluation
+#### 2.1 Evaluate detection
+```bash
+CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/mot/bytetrack/detector/ppyoloe_crn_l_36e_640x640_mot17half.yml
+```
+
+**Notes:**
+ - Use ```tools/eval.py``` to evaluate detection and ```tools/eval_mot.py``` to evaluate tracking.
+
+#### 2.2 Evaluate tracking
+```bash
+CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/ocsort/ocsort_ppyoloe.yml --scaled=True
+# or
+CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/ocsort/ocsort_yolox.yml --scaled=True
+```
+**Notes:**
+ - `--scaled` indicates whether the coordinates in the model output are already scaled back to the original image. Set it to False for a JDE YOLOv3 detection model and True for a general detection model; the default is False.
+ - Tracking results are saved in `{output_dir}/mot_results/`, one txt per video sequence. Each line is `frame,id,x1,y1,w,h,score,-1,-1,-1`; `{output_dir}` can be set via `--output_dir`.
+
+### 3. Prediction
+
+Predict a video with a single GPU and save the result as a video:
+
+```bash
+# Download the demo video
+wget https://bj.bcebos.com/v1/paddledet/data/mot/demo/mot17_demo.mp4
+
+CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/ocsort/ocsort_yolox.yml --video_file=mot17_demo.mp4 --scaled=True --save_videos
+```
+
+**Notes:**
+ - Make sure [ffmpeg](https://ffmpeg.org/ffmpeg.html) is installed first; on Linux (Ubuntu) it can be installed with: `apt-get update && apt-get install -y ffmpeg`.
+ - `--scaled` indicates whether the coordinates in the model output are already scaled back to the original image. Set it to False for a JDE YOLOv3 detection model and True for a general detection model.
+
+
+### 4. Export the Inference Model
+
+Step 1: export the detection model
+```bash
+CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/bytetrack/detector/yolox_x_24e_800x1440_mix_det.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/yolox_x_24e_800x1440_mix_det.pdparams
+```
+
+### 5. Inference with the Exported Model in Python
+
+```bash
+python deploy/pptracking/python/mot_sde_infer.py --model_dir=output_inference/yolox_x_24e_800x1440_mix_det/ --tracker_config=deploy/pptracking/python/tracker_config.yml --video_file=mot17_demo.mp4 --device=GPU --save_mot_txts
+```
+**Notes:**
+ - Before running, manually change the tracker type in `tracker_config.yml` to `type: OCSORTTracker`.
+ - The tracking model predicts on videos and does not support single-image prediction. By default the visualized tracking result is saved as a video; add `--save_mot_txts` (one txt per video) or `--save_mot_txt_per_img` (one txt per image) to save tracking result txt files, or `--save_images` to save the visualized result images.
+ - Each line of a tracking result txt is `frame,id,x1,y1,w,h,score,-1,-1,-1`.
+
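The single-camera txt format above can be post-processed directly. The sketch below (an illustration only, not part of the toolkit; `count_per_frame` is a hypothetical helper) counts tracked objects per frame from `frame,id,x1,y1,w,h,score,-1,-1,-1` rows, skipping low-confidence boxes:

```python
from collections import Counter

def count_per_frame(lines, score_thresh=0.0):
    """Count tracked boxes per frame in MOT result rows
    'frame,id,x1,y1,w,h,score,-1,-1,-1', skipping low scores."""
    counts = Counter()
    for line in lines:
        fields = line.strip().split(',')
        frame, score = int(fields[0]), float(fields[6])
        if score >= score_thresh:
            counts[frame] += 1
    return dict(counts)

rows = [
    "1,1,10,20,30,60,0.9,-1,-1,-1",
    "1,2,50,25,28,55,0.4,-1,-1,-1",
    "2,1,12,22,30,60,0.8,-1,-1,-1",
]
print(count_per_frame(rows, score_thresh=0.5))  # → {1: 1, 2: 1}
```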
+
+## Citation
+```
+@article{cao2022observation,
+ title={Observation-Centric SORT: Rethinking SORT for Robust Multi-Object Tracking},
+ author={Cao, Jinkun and Weng, Xinshuo and Khirodkar, Rawal and Pang, Jiangmiao and Kitani, Kris},
+ journal={arXiv preprint arXiv:2203.14360},
+ year={2022}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/mot/ocsort/ocsort_ppyoloe.yml b/PaddleDetection-release-2.6/configs/mot/ocsort/ocsort_ppyoloe.yml
new file mode 100644
index 0000000000000000000000000000000000000000..ed2881a2138cb3986881accb3173f23f9ab815c0
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/ocsort/ocsort_ppyoloe.yml
@@ -0,0 +1,76 @@
+# This config is an assembled config for OC_SORT MOT, used as eval/infer mode for MOT.
+_BASE_: [
+ '../bytetrack/detector/ppyoloe_crn_l_36e_640x640_mot17half.yml',
+ '../bytetrack/_base_/mot17.yml',
+ '../bytetrack/_base_/ppyoloe_mot_reader_640x640.yml'
+]
+weights: output/ocsort_ppyoloe/model_final
+log_iter: 20
+snapshot_epoch: 2
+
+metric: MOT # eval/infer mode, set 'COCO' can be training mode
+num_classes: 1
+
+architecture: ByteTrack
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/ppyoloe_crn_l_300e_coco.pdparams
+ByteTrack:
+ detector: YOLOv3 # PPYOLOe version
+ reid: None
+ tracker: OCSORTTracker
+det_weights: https://bj.bcebos.com/v1/paddledet/models/mot/ppyoloe_crn_l_36e_640x640_mot17half.pdparams
+reid_weights: None
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+# Tracking requires higher quality boxes, so NMS score_threshold will be higher
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: -1 # 100
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.1 # 0.01 in original detector
+ nms_threshold: 0.4 # 0.6 in original detector
+
+
+OCSORTTracker:
+ det_thresh: 0.4 # 0.6 in yolox ocsort
+ max_age: 30
+ min_hits: 3
+ iou_threshold: 0.3
+ delta_t: 3
+ inertia: 0.2
+ vertical_ratio: 0
+ min_box_area: 0
+ use_byte: False
+ use_angle_cost: False
+
+
+# MOTDataset for MOT evaluation and inference
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: MOT17/images/half
+ keep_ori_im: True # set as True in DeepSORT and ByteTrack
+
+TestMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ keep_ori_im: True # set True if save visualization images or video
diff --git a/PaddleDetection-release-2.6/configs/mot/ocsort/ocsort_yolox.yml b/PaddleDetection-release-2.6/configs/mot/ocsort/ocsort_yolox.yml
new file mode 100644
index 0000000000000000000000000000000000000000..4f05e2d04ce1d83c98e54b35d21217915c5ee8f4
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/ocsort/ocsort_yolox.yml
@@ -0,0 +1,83 @@
+# This config is an assembled config for OC_SORT MOT, used as eval/infer mode for MOT.
+_BASE_: [
+ '../bytetrack/detector/yolox_x_24e_800x1440_mix_det.yml',
+ '../bytetrack/_base_/mix_det.yml',
+ '../bytetrack/_base_/yolox_mot_reader_800x1440.yml'
+]
+weights: output/ocsort_yolox/model_final
+log_iter: 20
+snapshot_epoch: 2
+
+metric: MOT # eval/infer mode
+num_classes: 1
+
+architecture: ByteTrack
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/yolox_x_300e_coco.pdparams
+ByteTrack:
+ detector: YOLOX
+ reid: None
+ tracker: OCSORTTracker
+det_weights: https://bj.bcebos.com/v1/paddledet/models/mot/yolox_x_24e_800x1440_mix_mot_ch.pdparams
+reid_weights: None
+
+depth_mult: 1.33
+width_mult: 1.25
+
+YOLOX:
+ backbone: CSPDarkNet
+ neck: YOLOCSPPAN
+ head: YOLOXHead
+ input_size: [800, 1440]
+ size_stride: 32
+ size_range: [18, 22] # multi-scale range [576*1024 ~ 800*1440], w/h ratio=1.8
+
+CSPDarkNet:
+ arch: "X"
+ return_idx: [2, 3, 4]
+ depthwise: False
+
+YOLOCSPPAN:
+ depthwise: False
+
+# Tracking requires higher quality boxes, so NMS score_threshold will be higher
+YOLOXHead:
+ l1_epoch: 20
+ depthwise: False
+ loss_weight: {cls: 1.0, obj: 1.0, iou: 5.0, l1: 1.0}
+ assigner:
+ name: SimOTAAssigner
+ candidate_topk: 10
+ use_vfl: False
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.1
+ nms_threshold: 0.7
+ # For speed while keep high mAP, you can modify 'nms_top_k' to 1000 and 'keep_top_k' to 100, the mAP will drop about 0.1%.
+ # For high speed demo, you can modify 'score_threshold' to 0.25 and 'nms_threshold' to 0.45, but the mAP will drop a lot.
+
+
+OCSORTTracker:
+ det_thresh: 0.6
+ max_age: 30
+ min_hits: 3
+ iou_threshold: 0.3
+ delta_t: 3
+ inertia: 0.2
+ vertical_ratio: 1.6
+ min_box_area: 100
+ use_byte: False
+
+
+# MOTDataset for MOT evaluation and inference
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: MOT17/images/half
+ keep_ori_im: True # set as True in DeepSORT and ByteTrack
+
+TestMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ keep_ori_im: True # set True if save visualization images or video
diff --git a/PaddleDetection-release-2.6/configs/mot/pedestrian/README.md b/PaddleDetection-release-2.6/configs/mot/pedestrian/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..4015683cfa5969297febc12e7ca1264afabbc0b5
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/pedestrian/README.md
@@ -0,0 +1 @@
+README_cn.md
\ No newline at end of file
diff --git a/PaddleDetection-release-2.6/configs/mot/pedestrian/README_cn.md b/PaddleDetection-release-2.6/configs/mot/pedestrian/README_cn.md
new file mode 100644
index 0000000000000000000000000000000000000000..768733db537c5f752bbb56198bad196c68b28602
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/pedestrian/README_cn.md
@@ -0,0 +1,135 @@
+[English](README.md) | 简体中文
+# Featured Vertical-Domain Tracking Models
+
+## Large-Scale Pedestrian Tracking
+
+One of the main applications of pedestrian tracking is traffic surveillance.
+
+[PathTrack](https://www.trace.ethz.ch/publications/2017/pathtrack/index.html) contains 720 video sequences with the trajectories of more than 15,000 pedestrians, covering street scenes, dancing, sports, interviews, and other scenarios, mostly captured by moving cameras. The dataset has only the Pedestrian class annotated for the tracking task.
+
+[VisDrone](http://aiskyeye.com) is a dataset captured from drone viewpoints, mainly bird's-eye views. It covers different locations (taken from 14 different cities across China, thousands of kilometers apart), different environments (urban and rural), different objects (pedestrians, vehicles, bicycles, etc.), and different densities (sparse and crowded scenes). [VisDrone2019-MOT](https://github.com/VisDrone/VisDrone-Dataset) contains 56 video sequences for training and 7 for validation. Here, the pedestrian and people categories are extracted from the VisDrone2019-MOT multi-object tracking dataset and merged into one large Pedestrian class.
+
+
+## Model Zoo
+
+### FairMOT results for the Pedestrian class on the val-set of each dataset
+
+| Dataset | Backbone | Input Size | MOTA | IDF1 | FPS | Download | Config |
+| :-------------| :-------- | :------- | :----: | :----: | :----: | :-----: |:------: |
+| PathTrack | DLA-34 | 1088x608 | 44.9 | 59.3 | - |[Download](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608_pathtrack.pdparams) | [Config](./fairmot_dla34_30e_1088x608_pathtrack.yml) |
+| VisDrone | DLA-34 | 1088x608 | 49.2 | 63.1 | - | [Download](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608_visdrone_pedestrian.pdparams) | [Config](./fairmot_dla34_30e_1088x608_visdrone_pedestrian.yml) |
+| VisDrone | HRNetv2-W18| 1088x608 | 40.5 | 54.7 | - | [Download](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone_pedestrian.pdparams) | [Config](./fairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone_pedestrian.yml) |
+| VisDrone | HRNetv2-W18| 864x480 | 38.6 | 50.9 | - | [Download](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_864x480_visdrone_pedestrian.pdparams) | [Config](./fairmot_hrnetv2_w18_dlafpn_30e_864x480_visdrone_pedestrian.yml) |
+| VisDrone | HRNetv2-W18| 576x320 | 30.6 | 47.2 | - | [Download](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone_pedestrian.pdparams) | [Config](./fairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone_pedestrian.yml) |
+
+**Notes:**
+ - The FairMOT models are trained for 30 epochs on 4 GPUs with a batch size of 6 per GPU.
+
+
+## Dataset Preparation and Processing
+
+### 1. Dataset processing scripts
+The scripts all live in the tools directory:
+```
+# visdrone
+tools/visdrone/visdrone2mot.py: generates the visdrone_pedestrian dataset
+```
+
+### 2. Process the visdrone_pedestrian dataset
+```
+# Copy tools/visdrone/visdrone2mot.py into the dataset directory
+# Generate visdrone_pedestrian data in MOT format, extracting classes=1,2 (pedestrian, people)
+<<--Directory before generation-->>
+├── VisDrone2019-MOT-val
+│ ├── annotations
+│ ├── sequences
+│ ├── visdrone2mot.py
+<<--Directory after generation-->>
+├── VisDrone2019-MOT-val
+│ ├── annotations
+│ ├── sequences
+│ ├── visdrone2mot.py
+│ ├── visdrone_pedestrian
+│ │ ├── images
+│ │ │ ├── train
+│ │ │ ├── val
+│ │ ├── labels_with_ids
+│ │ │ ├── train
+│ │ │ ├── val
+# Run
+python visdrone2mot.py --transMot=True --data_name=visdrone_pedestrian --phase=val
+python visdrone2mot.py --transMot=True --data_name=visdrone_pedestrian --phase=train
+```
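The class extraction that `visdrone2mot.py` performs can be illustrated with a minimal sketch. Assumptions (not taken from this repo): VisDrone annotation rows are `frame,id,x,y,w,h,score,class,truncation,occlusion`, classes 1 and 2 are merged into a single class id 0, and the real script additionally normalizes boxes and writes per-image `labels_with_ids` files:

```python
def merge_pedestrian_classes(rows, keep_classes=(1, 2)):
    """Keep VisDrone rows whose class is pedestrian (1) or people (2),
    remapping both to a single merged class id 0. Rows are assumed to be
    'frame,id,x,y,w,h,score,class,truncation,occlusion' strings."""
    out = []
    for row in rows:
        fields = row.strip().split(',')
        if int(fields[7]) in keep_classes:
            fields[7] = '0'  # merged Pedestrian class
            out.append(','.join(fields))
    return out

rows = [
    "1,3,684,8,273,116,0,1,0,0",  # pedestrian -> kept
    "1,4,100,8,50,60,0,2,0,0",    # people -> kept, remapped
    "1,5,200,8,80,40,0,4,0,0",    # car -> dropped
]
print(len(merge_pedestrian_classes(rows)))  # → 2
```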
+
+## Getting Started
+
+### 1. Training
+Start training on 2 GPUs with the following command:
+```bash
+python -m paddle.distributed.launch --log_dir=./fairmot_dla34_30e_1088x608_visdrone_pedestrian/ --gpus 0,1 tools/train.py -c configs/mot/pedestrian/fairmot_dla34_30e_1088x608_visdrone_pedestrian.yml
+```
+
+### 2. Evaluation
+Start evaluation on a single GPU with the following command:
+```bash
+# Evaluate with weights released by PaddleDetection
+CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/pedestrian/fairmot_dla34_30e_1088x608_visdrone_pedestrian.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608_visdrone_pedestrian.pdparams
+
+# Evaluate with a checkpoint saved during training
+CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/pedestrian/fairmot_dla34_30e_1088x608_visdrone_pedestrian.yml -o weights=output/fairmot_dla34_30e_1088x608_visdrone_pedestrian/model_final.pdparams
+```
+
+### 3. Prediction
+Predict a video with a single GPU and save the result as a video:
+```bash
+# Predict a video
+CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/pedestrian/fairmot_dla34_30e_1088x608_visdrone_pedestrian.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608_visdrone_pedestrian.pdparams --video_file={your video name}.mp4 --save_videos
+```
+**Notes:**
+ - Make sure [ffmpeg](https://ffmpeg.org/ffmpeg.html) is installed first; on Linux (Ubuntu) it can be installed with: `apt-get update && apt-get install -y ffmpeg`.
+
+### 4. Export the Inference Model
+```bash
+CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/pedestrian/fairmot_dla34_30e_1088x608_visdrone_pedestrian.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608_visdrone_pedestrian.pdparams
+```
+
+### 5. Inference with the Exported Model in Python
+```bash
+python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/fairmot_dla34_30e_1088x608_visdrone_pedestrian --video_file={your video name}.mp4 --device=GPU --save_mot_txts
+```
+**Notes:**
+ - The tracking model predicts on videos and does not support single-image prediction. By default the visualized tracking result is saved as a video; add `--save_mot_txts` to save tracking result txt files, or `--save_images` to save the visualized result images.
+ - Each line of a tracking result txt is `frame,id,x1,y1,w,h,score,-1,-1,-1`.
+
+## Citation
+```
+@article{zhang2020fair,
+ title={FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking},
+ author={Zhang, Yifu and Wang, Chunyu and Wang, Xinggang and Zeng, Wenjun and Liu, Wenyu},
+ journal={arXiv preprint arXiv:2004.01888},
+ year={2020}
+}
+
+@INPROCEEDINGS{8237302,
+author={S. {Manen} and M. {Gygli} and D. {Dai} and L. V. {Gool}},
+booktitle={2017 IEEE International Conference on Computer Vision (ICCV)},
+title={PathTrack: Fast Trajectory Annotation with Path Supervision},
+year={2017},
+volume={},
+number={},
+pages={290-299},
+doi={10.1109/ICCV.2017.40},
+ISSN={2380-7504},
+month={Oct},}
+
+@ARTICLE{9573394,
+ author={Zhu, Pengfei and Wen, Longyin and Du, Dawei and Bian, Xiao and Fan, Heng and Hu, Qinghua and Ling, Haibin},
+ journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
+ title={Detection and Tracking Meet Drones Challenge},
+ year={2021},
+ volume={},
+ number={},
+ pages={1-1},
+ doi={10.1109/TPAMI.2021.3119563}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/mot/pedestrian/fairmot_dla34_30e_1088x608_pathtrack.yml b/PaddleDetection-release-2.6/configs/mot/pedestrian/fairmot_dla34_30e_1088x608_pathtrack.yml
new file mode 100644
index 0000000000000000000000000000000000000000..bc16b074cabefa7f85b7c04705ed0794aedefbc4
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/pedestrian/fairmot_dla34_30e_1088x608_pathtrack.yml
@@ -0,0 +1,26 @@
+_BASE_: [
+ '../fairmot/fairmot_dla34_30e_1088x608.yml'
+]
+
+weights: output/fairmot_dla34_30e_1088x608_pathtrack/model_final
+
+# for MOT training
+TrainDataset:
+ !MOTDataSet
+ dataset_dir: dataset/mot
+ image_lists: ['pathtrack.train']
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
+
+# for MOT evaluation
+# If you want to change the MOT evaluation dataset, please modify 'data_root'
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: pathtrack/images/test
+ keep_ori_im: False # set True if save visualization images or video, or used in DeepSORT
+
+# for MOT video inference
+TestMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+    keep_ori_im: True # set True to save visualization images or videos
diff --git a/PaddleDetection-release-2.6/configs/mot/pedestrian/fairmot_dla34_30e_1088x608_visdrone_pedestrian.yml b/PaddleDetection-release-2.6/configs/mot/pedestrian/fairmot_dla34_30e_1088x608_visdrone_pedestrian.yml
new file mode 100644
index 0000000000000000000000000000000000000000..741c1f45374b923e056019be3f53514df4d93e01
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/pedestrian/fairmot_dla34_30e_1088x608_visdrone_pedestrian.yml
@@ -0,0 +1,26 @@
+_BASE_: [
+ '../fairmot/fairmot_dla34_30e_1088x608.yml'
+]
+
+weights: output/fairmot_dla34_30e_1088x608_visdrone_pedestrian/model_final
+
+# for MOT training
+TrainDataset:
+ !MOTDataSet
+ dataset_dir: dataset/mot
+ image_lists: ['visdrone_pedestrian.train']
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
+
+# for MOT evaluation
+# If you want to change the MOT evaluation dataset, please modify 'data_root'
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: visdrone_pedestrian/images/val
+    keep_ori_im: False # set True to save visualization images/videos, or when used in DeepSORT
+
+# for MOT video inference
+TestMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+    keep_ori_im: True # set True to save visualization images or videos
diff --git a/PaddleDetection-release-2.6/configs/mot/pedestrian/fairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone_pedestrian.yml b/PaddleDetection-release-2.6/configs/mot/pedestrian/fairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone_pedestrian.yml
new file mode 100644
index 0000000000000000000000000000000000000000..aca526dc3d08153dbb529b11e8c62a9b184fb2af
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/pedestrian/fairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone_pedestrian.yml
@@ -0,0 +1,26 @@
+_BASE_: [
+ '../fairmot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608.yml'
+]
+
+weights: output/fairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone_pedestrian/model_final
+
+# for MOT training
+TrainDataset:
+ !MOTDataSet
+ dataset_dir: dataset/mot
+ image_lists: ['visdrone_pedestrian.train']
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
+
+# for MOT evaluation
+# If you want to change the MOT evaluation dataset, please modify 'data_root'
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: visdrone_pedestrian/images/val
+    keep_ori_im: False # set True to save visualization images/videos, or when used in DeepSORT
+
+# for MOT video inference
+TestMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+    keep_ori_im: True # set True to save visualization images or videos
diff --git a/PaddleDetection-release-2.6/configs/mot/pedestrian/fairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone_pedestrian.yml b/PaddleDetection-release-2.6/configs/mot/pedestrian/fairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone_pedestrian.yml
new file mode 100644
index 0000000000000000000000000000000000000000..1daab8fdc9afb8c50d1d051fe9bda8de925d5d88
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/pedestrian/fairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone_pedestrian.yml
@@ -0,0 +1,26 @@
+_BASE_: [
+ '../fairmot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.yml'
+]
+
+weights: output/fairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone_pedestrian/model_final
+
+# for MOT training
+TrainDataset:
+ !MOTDataSet
+ dataset_dir: dataset/mot
+ image_lists: ['visdrone_pedestrian.train']
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
+
+# for MOT evaluation
+# If you want to change the MOT evaluation dataset, please modify 'data_root'
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: visdrone_pedestrian/images/val
+    keep_ori_im: False # set True to save visualization images/videos, or when used in DeepSORT
+
+# for MOT video inference
+TestMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+    keep_ori_im: True # set True to save visualization images or videos
diff --git a/PaddleDetection-release-2.6/configs/mot/pedestrian/fairmot_hrnetv2_w18_dlafpn_30e_864x480_visdrone_pedestrian.yml b/PaddleDetection-release-2.6/configs/mot/pedestrian/fairmot_hrnetv2_w18_dlafpn_30e_864x480_visdrone_pedestrian.yml
new file mode 100644
index 0000000000000000000000000000000000000000..8fd0563511577928a8136a9e841cb230aaa3b69b
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/pedestrian/fairmot_hrnetv2_w18_dlafpn_30e_864x480_visdrone_pedestrian.yml
@@ -0,0 +1,26 @@
+_BASE_: [
+ '../fairmot/fairmot_hrnetv2_w18_dlafpn_30e_864x480.yml'
+]
+
+weights: output/fairmot_hrnetv2_w18_dlafpn_30e_864x480_visdrone_pedestrian/model_final
+
+# for MOT training
+TrainDataset:
+ !MOTDataSet
+ dataset_dir: dataset/mot
+ image_lists: ['visdrone_pedestrian.train']
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
+
+# for MOT evaluation
+# If you want to change the MOT evaluation dataset, please modify 'data_root'
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: visdrone_pedestrian/images/val
+    keep_ori_im: False # set True to save visualization images/videos, or when used in DeepSORT
+
+# for MOT video inference
+TestMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+    keep_ori_im: True # set True to save visualization images or videos
diff --git a/PaddleDetection-release-2.6/configs/mot/pedestrian/tools/visdrone/visdrone2mot.py b/PaddleDetection-release-2.6/configs/mot/pedestrian/tools/visdrone/visdrone2mot.py
new file mode 100644
index 0000000000000000000000000000000000000000..0be2f1eb8fcb080738ccb45d01d6c20671381706
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/pedestrian/tools/visdrone/visdrone2mot.py
@@ -0,0 +1,299 @@
+# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import glob
+import os
+import os.path as osp
+import cv2
+import argparse
+import numpy as np
+import random
+
+# The object category indicates the type of annotated object,
+# (i.e., ignored regions(0), pedestrian(1), people(2), bicycle(3), car(4), van(5), truck(6), tricycle(7), awning-tricycle(8), bus(9), motor(10),others(11))
+
+# Extract single class or multi class
+isExtractMultiClass = False
+# These sequences are excluded because there are too few pedestrians
+exclude_seq = [
+ "uav0000117_02622_v", "uav0000182_00000_v", "uav0000268_05773_v",
+ "uav0000305_00000_v"
+]
+
+
+def mkdir_if_missing(d):
+ if not osp.exists(d):
+ os.makedirs(d)
+
+
+def genGtFile(seqPath, outPath, classes=[]):
+ id_idx = 0
+ old_idx = -1
+ with open(seqPath, 'r') as singleSeqFile:
+ motLine = []
+ allLines = singleSeqFile.readlines()
+ for line in allLines:
+ line = line.replace('\n', '')
+ line = line.split(',')
+            # keep boxes of the target classes whose occlusion flag is not '2' (heavily occluded)
+ if line[-1] != '2' and line[7] in classes:
+ if old_idx != int(line[1]):
+ id_idx += 1
+ old_idx = int(line[1])
+ newLine = line[0:6]
+ newLine[1] = str(id_idx)
+ newLine.append('1')
+ if (len(classes) > 1 and isExtractMultiClass):
+ class_index = str(classes.index(line[7]) + 1)
+ newLine.append(class_index)
+ else:
+                    newLine.append('1')  # use a fixed class id '1'
+ newLine.append('1')
+ motLine.append(newLine)
+ mkdir_if_missing(outPath)
+ gtFilePath = osp.join(outPath, 'gt.txt')
+ with open(gtFilePath, 'w') as gtFile:
+ motLine = list(map(lambda x: str.join(',', x), motLine))
+ motLineStr = str.join('\n', motLine)
+ gtFile.write(motLineStr)
+
+
+def genSeqInfo(img1Path, seqName):
+ imgPaths = glob.glob(img1Path + '/*.jpg')
+ seqLength = len(imgPaths)
+ if seqLength > 0:
+ image1 = cv2.imread(imgPaths[0])
+ imgHeight = image1.shape[0]
+ imgWidth = image1.shape[1]
+ else:
+ imgHeight = 0
+ imgWidth = 0
+ seqInfoStr = f'''[Sequence]\nname={seqName}\nimDir=img1\nframeRate=30\nseqLength={seqLength}\nimWidth={imgWidth}\nimHeight={imgHeight}\nimExt=.jpg'''
+ seqInfoPath = img1Path.replace('/img1', '')
+ with open(seqInfoPath + '/seqinfo.ini', 'w') as seqFile:
+ seqFile.write(seqInfoStr)
+
+
+def copyImg(img1Path, gtTxtPath, outputFileName):
+ with open(gtTxtPath, 'r') as gtFile:
+ allLines = gtFile.readlines()
+ imgList = []
+ for line in allLines:
+ imgIdx = int(line.split(',')[0])
+ if imgIdx not in imgList:
+ imgList.append(imgIdx)
+ seqName = gtTxtPath.replace('./{}/'.format(outputFileName),
+ '').replace('/gt/gt.txt', '')
+ sourceImgPath = osp.join('./sequences', seqName,
+ '{:07d}.jpg'.format(imgIdx))
+ os.system(f'cp {sourceImgPath} {img1Path}')
+
+
+def genMotLabels(datasetPath, outputFileName, classes=['2']):
+ mkdir_if_missing(osp.join(datasetPath, outputFileName))
+ annotationsPath = osp.join(datasetPath, 'annotations')
+ annotationsList = glob.glob(osp.join(annotationsPath, '*.txt'))
+ for annotationPath in annotationsList:
+ seqName = annotationPath.split('/')[-1].replace('.txt', '')
+ if seqName in exclude_seq:
+ continue
+ mkdir_if_missing(osp.join(datasetPath, outputFileName, seqName, 'gt'))
+ mkdir_if_missing(osp.join(datasetPath, outputFileName, seqName, 'img1'))
+ genGtFile(annotationPath,
+ osp.join(datasetPath, outputFileName, seqName, 'gt'), classes)
+ img1Path = osp.join(datasetPath, outputFileName, seqName, 'img1')
+ gtTxtPath = osp.join(datasetPath, outputFileName, seqName, 'gt/gt.txt')
+ copyImg(img1Path, gtTxtPath, outputFileName)
+ genSeqInfo(img1Path, seqName)
+
+
+def deleteFileWhichImg1IsEmpty(mot16Path, dataType='train'):
+ path = mot16Path
+ data_images_train = osp.join(path, 'images', f'{dataType}')
+ data_images_train_seqs = glob.glob(data_images_train + '/*')
+ if (len(data_images_train_seqs) == 0):
+ print('dataset is empty!')
+ for data_images_train_seq in data_images_train_seqs:
+ data_images_train_seq_img1 = osp.join(data_images_train_seq, 'img1')
+ if len(glob.glob(data_images_train_seq_img1 + '/*.jpg')) == 0:
+ print(f"os.system(rm -rf {data_images_train_seq})")
+ os.system(f'rm -rf {data_images_train_seq}')
+
+
+def formatMot16Path(dataPath, pathType='train'):
+ train_path = osp.join(dataPath, 'images', pathType)
+ mkdir_if_missing(train_path)
+ os.system(f'mv {dataPath}/* {train_path}')
+
+
+def VisualGt(dataPath, phase='train'):
+ seqList = sorted(glob.glob(osp.join(dataPath, 'images', phase) + '/*'))
+ seqIndex = random.randint(0, len(seqList) - 1)
+ seqPath = seqList[seqIndex]
+ gt_path = osp.join(seqPath, 'gt', 'gt.txt')
+ img_list_path = sorted(glob.glob(osp.join(seqPath, 'img1', '*.jpg')))
+    imgIndex = random.randint(0, len(img_list_path) - 1)  # randint is inclusive on both ends
+ img_Path = img_list_path[imgIndex]
+ frame_value = int(img_Path.split('/')[-1].replace('.jpg', ''))
+ gt_value = np.loadtxt(gt_path, dtype=int, delimiter=',')
+ gt_value = gt_value[gt_value[:, 0] == frame_value]
+ get_list = gt_value.tolist()
+ img = cv2.imread(img_Path)
+ colors = [[255, 0, 0], [255, 255, 0], [255, 0, 255], [0, 255, 0],
+ [0, 255, 255], [0, 0, 255]]
+ for seq, _id, pl, pt, w, h, _, bbox_class, _ in get_list:
+ cv2.putText(img,
+ str(bbox_class), (pl, pt), cv2.FONT_HERSHEY_PLAIN, 2,
+ colors[bbox_class - 1])
+ cv2.rectangle(
+ img, (pl, pt), (pl + w, pt + h),
+ colors[bbox_class - 1],
+ thickness=2)
+ cv2.imwrite('testGt.jpg', img)
+
+
+def VisualDataset(datasetPath, phase='train', seqName='', frameId=1):
+ trainPath = osp.join(datasetPath, 'labels_with_ids', phase)
+ seq1Paths = osp.join(trainPath, seqName)
+ seq_img1_path = osp.join(seq1Paths, 'img1')
+ label_with_idPath = osp.join(seq_img1_path, '%07d' % frameId) + '.txt'
+ image_path = label_with_idPath.replace('labels_with_ids', 'images').replace(
+ '.txt', '.jpg')
+ seqInfoPath = str.join('/', image_path.split('/')[:-2])
+ seqInfoPath = seqInfoPath + '/seqinfo.ini'
+ seq_info = open(seqInfoPath).read()
+ width = int(seq_info[seq_info.find('imWidth=') + 8:seq_info.find(
+ '\nimHeight')])
+ height = int(seq_info[seq_info.find('imHeight=') + 9:seq_info.find(
+ '\nimExt')])
+
+ with open(label_with_idPath, 'r') as label:
+ allLines = label.readlines()
+ images = cv2.imread(image_path)
+ for line in allLines:
+ line = line.split(' ')
+ line = list(map(lambda x: float(x), line))
+ c1, c2, w, h = line[2:6]
+ x1 = c1 - w / 2
+ x2 = c2 - h / 2
+ x3 = c1 + w / 2
+ x4 = c2 + h / 2
+ cv2.rectangle(
+ images, (int(x1 * width), int(x2 * height)),
+ (int(x3 * width), int(x4 * height)), (255, 0, 0),
+ thickness=2)
+ cv2.imwrite('test.jpg', images)
+
+
+def gen_image_list(dataPath, datType):
+ inputPath = f'{dataPath}/images/{datType}'
+ pathList = glob.glob(inputPath + '/*')
+ pathList = sorted(pathList)
+ allImageList = []
+ for pathSingle in pathList:
+ imgList = sorted(glob.glob(osp.join(pathSingle, 'img1', '*.jpg')))
+ for imgPath in imgList:
+ allImageList.append(imgPath)
+ with open(f'{dataPath}.{datType}', 'w') as image_list_file:
+ allImageListStr = str.join('\n', allImageList)
+ image_list_file.write(allImageListStr)
+
+
+def gen_labels_mot(MOT_data, phase='train'):
+ seq_root = './{}/images/{}'.format(MOT_data, phase)
+ label_root = './{}/labels_with_ids/{}'.format(MOT_data, phase)
+ mkdir_if_missing(label_root)
+ seqs = [s for s in os.listdir(seq_root)]
+ print('seqs => ', seqs)
+ tid_curr = 0
+ tid_last = -1
+ for seq in seqs:
+ seq_info = open(osp.join(seq_root, seq, 'seqinfo.ini')).read()
+ seq_width = int(seq_info[seq_info.find('imWidth=') + 8:seq_info.find(
+ '\nimHeight')])
+ seq_height = int(seq_info[seq_info.find('imHeight=') + 9:seq_info.find(
+ '\nimExt')])
+
+ gt_txt = osp.join(seq_root, seq, 'gt', 'gt.txt')
+ gt = np.loadtxt(gt_txt, dtype=np.float64, delimiter=',')
+
+ seq_label_root = osp.join(label_root, seq, 'img1')
+ mkdir_if_missing(seq_label_root)
+
+ for fid, tid, x, y, w, h, mark, label, _ in gt:
+ # if mark == 0 or not label == 1:
+ # continue
+ fid = int(fid)
+ tid = int(tid)
+ if not tid == tid_last:
+ tid_curr += 1
+ tid_last = tid
+ x += w / 2
+ y += h / 2
+ label_fpath = osp.join(seq_label_root, '{:07d}.txt'.format(fid))
+ label_str = '0 {:d} {:.6f} {:.6f} {:.6f} {:.6f}\n'.format(
+ tid_curr, x / seq_width, y / seq_height, w / seq_width,
+ h / seq_height)
+ with open(label_fpath, 'a') as f:
+ f.write(label_str)
+
+
+def str2bool(v):
+    # argparse's `type=bool` treats any non-empty string (even 'False') as
+    # True, so parse boolean command-line values explicitly
+    return str(v).lower() in ('true', 't', 'yes', '1')
+
+
+def parse_arguments():
+    parser = argparse.ArgumentParser(description='input method')
+    parser.add_argument("--transMot", type=str2bool, default=False)
+    parser.add_argument("--genMot", type=str2bool, default=False)
+    parser.add_argument("--formatMotPath", type=str2bool, default=False)
+    parser.add_argument("--deleteEmpty", type=str2bool, default=False)
+    parser.add_argument("--genLabelsMot", type=str2bool, default=False)
+    parser.add_argument("--genImageList", type=str2bool, default=False)
+    parser.add_argument("--visualImg", type=str2bool, default=False)
+    parser.add_argument("--visualGt", type=str2bool, default=False)
+    parser.add_argument("--data_name", type=str, default='visdrone_pedestrian')
+    parser.add_argument("--phase", type=str, default='train')
+    parser.add_argument(
+        "--classes", type=str, default='1,2')  # pedestrian and people
+    return parser.parse_args()
+
+
+if __name__ == "__main__":
+ args = parse_arguments()
+ classes = args.classes.split(',')
+ datasetPath = './'
+ dataName = args.data_name
+ phase = args.phase
+ if args.transMot:
+ genMotLabels(datasetPath, dataName, classes)
+ formatMot16Path(dataName, pathType=phase)
+ mot16Path = f'./{dataName}'
+ deleteFileWhichImg1IsEmpty(mot16Path, dataType=phase)
+ gen_labels_mot(dataName, phase=phase)
+ gen_image_list(dataName, phase)
+ if args.genMot:
+ genMotLabels(datasetPath, dataName, classes)
+ if args.formatMotPath:
+ formatMot16Path(dataName, pathType=phase)
+ if args.deleteEmpty:
+ mot16Path = f'./{dataName}'
+ deleteFileWhichImg1IsEmpty(mot16Path, dataType=phase)
+ if args.genLabelsMot:
+ gen_labels_mot(dataName, phase=phase)
+ if args.genImageList:
+ gen_image_list(dataName, phase)
+ if args.visualGt:
+ VisualGt(f'./{dataName}', phase)
+ if args.visualImg:
+ seqName = 'uav0000137_00458_v'
+ frameId = 43
+ VisualDataset(
+ f'./{dataName}', phase=phase, seqName=seqName, frameId=frameId)
diff --git a/PaddleDetection-release-2.6/configs/mot/vehicle/README.md b/PaddleDetection-release-2.6/configs/mot/vehicle/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..4015683cfa5969297febc12e7ca1264afabbc0b5
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/vehicle/README.md
@@ -0,0 +1 @@
+README_cn.md
\ No newline at end of file
diff --git a/PaddleDetection-release-2.6/configs/mot/vehicle/README_cn.md b/PaddleDetection-release-2.6/configs/mot/vehicle/README_cn.md
new file mode 100644
index 0000000000000000000000000000000000000000..606c583621431a2036c69f3f626684192422b4bd
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/vehicle/README_cn.md
@@ -0,0 +1,171 @@
+[English](README.md) | Simplified Chinese
+# Featured Vertical-Domain Tracking Models
+
+## Vehicle Tracking
+
+One of the main applications of vehicle tracking is traffic surveillance. In surveillance scenarios, vehicles are mostly captured from the viewpoint of surveillance cameras in public areas, and the captured images are then used for vehicle detection and tracking.
+
+
+[BDD100K](https://www.bdd100k.com) is a driving video dataset released by the Berkeley AI Research lab (BAIR), shot mainly from the driver's viewpoint. Besides multi-class annotations, it is labeled with six weather conditions (sunny, cloudy, etc.), six scene types (residential, highway, etc.), three times of day (daytime, night, etc.), and occlusion/truncation flags. The BDD100K MOT dataset contains 1400 video sequences for training and 200 for validation. Each sequence is about 40 seconds long at 5 frames per second, so each video has roughly 200 frames. Here we extract the car, truck, bus, trailer, and other vehicle categories from the BDD100K MOT dataset and merge them into a single Vehicle category.
+
+[KITTI](http://www.cvlibs.net/datasets/kitti) is a dataset collected in urban, rural, and highway scenes, with up to 15 vehicles and 30 pedestrians per image and various degrees of occlusion and truncation. The [KITTI-Tracking](http://www.cvlibs.net/datasets/kitti/eval_tracking.php) (2D bounding-boxes) dataset contains 50 video sequences in total, 21 for training and 29 for testing; the task is to estimate trajectories for the Car and Pedestrian categories. Here we extract the Car category as the Vehicle category.
+
+[VisDrone](http://aiskyeye.com) is a dataset captured from drones, mainly from a bird's-eye view. It covers different locations (14 cities in China, thousands of kilometers apart), different environments (urban and rural), different objects (pedestrians, vehicles, bicycles, etc.), and different densities (sparse and crowded scenes). [VisDrone2019-MOT](https://github.com/VisDrone/VisDrone-Dataset) contains 56 video sequences for training and 7 for validation. Here we extract the car, van, truck, and bus categories from the VisDrone2019-MOT multi-object tracking dataset and merge them into a single Vehicle category.
+
+
+## Model Zoo
+
+### FairMOT results for the Vehicle category on each dataset's val-set
+
+| Dataset | Backbone | Input Size | MOTA | IDF1 | FPS | Download | Config |
+| :-------------| :-------- | :------- | :----: | :----: | :----: | :-----: |:------: |
+| BDD100K | DLA-34 | 1088x608 | 43.5 | 50.0 | - | [download](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608_bdd100kmot_vehicle.pdparams) | [config](./fairmot_dla34_30e_1088x608_bdd100kmot_vehicle.yml) |
+| BDD100K | HRNetv2-W18| 576x320 | 32.6 | 38.7 | - | [download](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100kmot_vehicle.pdparams) | [config](./fairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100kmot_vehicle.yml) |
+| KITTI | DLA-34 | 1088x608 | 82.7 | - | - |[download](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608_kitti_vehicle.pdparams) | [config](./fairmot_dla34_30e_1088x608_kitti_vehicle.yml) |
+| VisDrone | DLA-34 | 1088x608 | 52.1 | 63.3 | - | [download](https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608_visdrone_vehicle.pdparams) | [config](./fairmot_dla34_30e_1088x608_visdrone_vehicle.yml) |
+| VisDrone | HRNetv2-W18| 1088x608 | 46.0 | 56.8 | - | [download](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone_vehicle.pdparams) | [config](./fairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone_vehicle.yml) |
+| VisDrone | HRNetv2-W18| 864x480 | 43.7 | 56.1 | - | [download](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_864x480_visdrone_vehicle.pdparams) | [config](./fairmot_hrnetv2_w18_dlafpn_30e_864x480_visdrone_vehicle.yml) |
+| VisDrone | HRNetv2-W18| 576x320 | 39.8 | 52.4 | - | [download](https://paddledet.bj.bcebos.com/models/mot/fairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone_vehicle.pdparams) | [config](./fairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone_vehicle.yml) |
+
+**Notes:**
+ - All FairMOT models above are trained on 4 GPUs, with a batch size of 6 per GPU, for 30 epochs.
+
+
+## Dataset Preparation and Processing
+
+### 1. Dataset processing scripts
+All processing scripts live under the tools directory:
+```
+# bdd100kmot
+tools/bdd100kmot/gen_bdd100kmot_vehicle.sh: generate the bdd100kmot_vehicle dataset by running bdd100k2mot.py and gen_labels_MOT.py
+tools/bdd100kmot/bdd100k2mot.py: convert the full bdd100k dataset to MOT format
+tools/bdd100kmot/gen_labels_MOT.py: generate single-class labels_with_ids files
+# visdrone
+tools/visdrone/visdrone2mot.py: generate visdrone_vehicle
+```
+
+### 2. bdd100kmot_vehicle dataset processing
+```
+# Copy the scripts under tools/bdd100kmot into the dataset directory
+# Generate bdd100kmot_vehicle data in MOT format, extracting classes=2,3,4,9,10 (car, truck, bus, trailer, other vehicle)
+<<--directory before-->>
+├── bdd100k
+│   ├── images
+│   ├── labels
+<<--directory after-->>
+├── bdd100k
+│   ├── images
+│   ├── labels
+│   ├── bdd100kmot_vehicle
+│   │   ├── images
+│   │   │   ├── train
+│   │   │   ├── val
+│   │   ├── labels_with_ids
+│   │   │   ├── train
+│   │   │   ├── val
+# Run
+sh gen_bdd100kmot_vehicle.sh
+```
+
+### 3. visdrone_vehicle dataset processing
+```
+# Copy tools/visdrone/visdrone2mot.py into the dataset directory
+# Generate visdrone_vehicle data in MOT format, extracting classes=4,5,6,9 (car, van, truck, bus)
+<<--directory before-->>
+├── VisDrone2019-MOT-val
+│   ├── annotations
+│   ├── sequences
+│   ├── visdrone2mot.py
+<<--directory after-->>
+├── VisDrone2019-MOT-val
+│   ├── annotations
+│   ├── sequences
+│   ├── visdrone2mot.py
+│   ├── visdrone_vehicle
+│   │   ├── images
+│   │   │   ├── train
+│   │   │   ├── val
+│   │   ├── labels_with_ids
+│   │   │   ├── train
+│   │   │   ├── val
+# Run
+python visdrone2mot.py --transMot=True --data_name=visdrone_vehicle --phase=val
+python visdrone2mot.py --transMot=True --data_name=visdrone_vehicle --phase=train
+```
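The label conversion that visdrone2mot.py's `gen_labels_mot` step performs can be illustrated with a minimal sketch (`mot_row_to_label` is a hypothetical helper name; the real script additionally re-indexes track ids across sequences and appends one line per object to a per-frame txt file): a MOT gt row `frame,id,x,y,w,h,...` with top-left corner (x, y) becomes the normalized center-format label `0 <id> <cx/W> <cy/H> <w/W> <h/H>`.

```python
# Minimal sketch (assumption: single class '0', track id kept as-is) of the
# gt.txt -> labels_with_ids conversion done by gen_labels_mot in visdrone2mot.py.
def mot_row_to_label(row, seq_width, seq_height):
    # MOT gt row: frame, track_id, x, y, w, h, ... with (x, y) the top-left corner
    fid, tid, x, y, w, h = [float(v) for v in row.split(',')[:6]]
    cx = x + w / 2  # convert top-left corner to box center
    cy = y + h / 2
    return '0 {:d} {:.6f} {:.6f} {:.6f} {:.6f}'.format(
        int(tid), cx / seq_width, cy / seq_height, w / seq_width,
        h / seq_height)


# A 50x40 box at (100, 200) in a 1000x500 frame:
print(mot_row_to_label('1,3,100,200,50,40,1,1,1', 1000, 500))
# → 0 3 0.125000 0.440000 0.050000 0.080000
```

The resulting coordinates are all in [0, 1], which is what the FairMOT data loader expects in the labels_with_ids files.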
+
+## Getting Started
+
+### 1. Training
+Start training on 2 GPUs with the following command:
+```bash
+python -m paddle.distributed.launch --log_dir=./fairmot_dla34_30e_1088x608_bdd100kmot_vehicle/ --gpus 0,1 tools/train.py -c configs/mot/vehicle/fairmot_dla34_30e_1088x608_bdd100kmot_vehicle.yml
+```
+
+### 2. Evaluation
+Start evaluation on a single GPU with the following command:
+```bash
+# use the weights released by PaddleDetection
+CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/vehicle/fairmot_dla34_30e_1088x608_bdd100kmot_vehicle.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608_bdd100kmot_vehicle.pdparams
+
+# use a checkpoint saved during training
+CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/vehicle/fairmot_dla34_30e_1088x608_bdd100kmot_vehicle.yml -o weights=output/fairmot_dla34_30e_1088x608_bdd100kmot_vehicle/model_final.pdparams
+```
+
+### 3. Inference
+Predict a video on a single GPU and save the result as a video with the following command:
+```bash
+# predict a video
+CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/vehicle/fairmot_dla34_30e_1088x608_bdd100kmot_vehicle.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608_bdd100kmot_vehicle.pdparams --video_file={your video name}.mp4 --save_videos
+```
+**Notes:**
+ - Make sure [ffmpeg](https://ffmpeg.org/ffmpeg.html) is installed first. On Linux (Ubuntu) it can be installed directly with: `apt-get update && apt-get install -y ffmpeg`.
+
+### 4. Export the inference model
+```bash
+CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/vehicle/fairmot_dla34_30e_1088x608_bdd100kmot_vehicle.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/fairmot_dla34_30e_1088x608_bdd100kmot_vehicle.pdparams
+```
+
+### 5. Python inference with the exported model
+```bash
+python deploy/pptracking/python/mot_jde_infer.py --model_dir=output_inference/fairmot_dla34_30e_1088x608_bdd100kmot_vehicle --video_file={your video name}.mp4 --device=GPU --save_mot_txts
+```
+**Notes:**
+ - The tracking model runs on videos; single-image inference is not supported. By default the visualized tracking result is saved as a video; add `--save_mot_txts` to also save the tracking results as txt files, or `--save_images` to save the visualized frames.
+ - Each line of the tracking result txt file is `frame,id,x1,y1,w,h,score,-1,-1,-1`.
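For downstream use, a line in this result format can be parsed with a small sketch (`parse_mot_line` is a hypothetical helper, not part of the deploy/pptracking tooling):

```python
# Minimal sketch: parse one line of the tracking-result txt file, whose fields
# are frame,id,x1,y1,w,h,score,-1,-1,-1, with (x1, y1) the top-left corner.
def parse_mot_line(line):
    frame, tid, x1, y1, w, h, score = line.strip().split(',')[:7]
    return {
        'frame': int(frame),
        'id': int(tid),
        'bbox': (float(x1), float(y1), float(w), float(h)),  # x1, y1, w, h
        'score': float(score),
    }


print(parse_mot_line('1,5,100.5,200.0,40.0,80.0,0.93,-1,-1,-1'))
```

Grouping the parsed rows by `id` then gives the per-object trajectories.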
+
+## Citations
+```
+@article{zhang2020fair,
+ title={FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking},
+ author={Zhang, Yifu and Wang, Chunyu and Wang, Xinggang and Zeng, Wenjun and Liu, Wenyu},
+ journal={arXiv preprint arXiv:2004.01888},
+ year={2020}
+}
+
+@InProceedings{bdd100k,
+ author = {Yu, Fisher and Chen, Haofeng and Wang, Xin and Xian, Wenqi and Chen,
+ Yingying and Liu, Fangchen and Madhavan, Vashisht and Darrell, Trevor},
+ title = {BDD100K: A Diverse Driving Dataset for Heterogeneous Multitask Learning},
+ booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
+ month = {June},
+ year = {2020}
+}
+
+@INPROCEEDINGS{Geiger2012CVPR,
+ author = {Andreas Geiger and Philip Lenz and Raquel Urtasun},
+ title = {Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite},
+ booktitle = {Conference on Computer Vision and Pattern Recognition (CVPR)},
+ year = {2012}
+}
+
+@ARTICLE{9573394,
+ author={Zhu, Pengfei and Wen, Longyin and Du, Dawei and Bian, Xiao and Fan, Heng and Hu, Qinghua and Ling, Haibin},
+ journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
+ title={Detection and Tracking Meet Drones Challenge},
+ year={2021},
+ volume={},
+ number={},
+ pages={1-1},
+ doi={10.1109/TPAMI.2021.3119563}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/mot/vehicle/fairmot_dla34_30e_1088x608_bdd100kmot_vehicle.yml b/PaddleDetection-release-2.6/configs/mot/vehicle/fairmot_dla34_30e_1088x608_bdd100kmot_vehicle.yml
new file mode 100644
index 0000000000000000000000000000000000000000..1e2d3b332777d16ac5355f4fde9710f42375d5ff
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/vehicle/fairmot_dla34_30e_1088x608_bdd100kmot_vehicle.yml
@@ -0,0 +1,40 @@
+_BASE_: [
+ '../fairmot/fairmot_dla34_30e_1088x608.yml'
+]
+
+weights: output/fairmot_dla34_30e_1088x608_bdd100kmot_vehicle/model_final
+
+# for MOT training
+TrainDataset:
+ !MOTDataSet
+ dataset_dir: dataset/mot
+ image_lists: ['bdd100kmot_vehicle.train']
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
+
+# for MOT evaluation
+# If you want to change the MOT evaluation dataset, please modify 'data_root'
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: bdd100kmot_vehicle/images/val
+    keep_ori_im: False # set True to save visualization images/videos, or when used in DeepSORT
+
+# for MOT video inference
+TestMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+    keep_ori_im: True # set True to save visualization images or videos
+
+# model config
+FairMOT:
+ detector: CenterNet
+ reid: FairMOTEmbeddingHead
+ loss: FairMOTLoss
+ tracker: JDETracker
+
+JDETracker:
+ min_box_area: 0
+ vertical_ratio: 0 # no need to filter bboxes according to w/h
+ conf_thres: 0.4
+ tracked_thresh: 0.4
+ metric_type: cosine
diff --git a/PaddleDetection-release-2.6/configs/mot/vehicle/fairmot_dla34_30e_1088x608_kitti_vehicle.yml b/PaddleDetection-release-2.6/configs/mot/vehicle/fairmot_dla34_30e_1088x608_kitti_vehicle.yml
new file mode 100644
index 0000000000000000000000000000000000000000..09dc886a2043369db509b86344913c66d55465a0
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/vehicle/fairmot_dla34_30e_1088x608_kitti_vehicle.yml
@@ -0,0 +1,41 @@
+_BASE_: [
+ '../fairmot/fairmot_dla34_30e_1088x608.yml'
+]
+
+metric: KITTI
+weights: output/fairmot_dla34_30e_1088x608_kitti_vehicle/model_final
+
+# for MOT training
+TrainDataset:
+ !MOTDataSet
+ dataset_dir: dataset/mot
+ image_lists: ['kitti_vehicle.train']
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
+
+# for MOT evaluation
+# If you want to change the MOT evaluation dataset, please modify 'data_root'
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: kitti_vehicle/images/train
+    keep_ori_im: False # set True to save visualization images/videos, or when used in DeepSORT
+
+# for MOT video inference
+TestMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+    keep_ori_im: True # set True to save visualization images or videos
+
+# model config
+FairMOT:
+ detector: CenterNet
+ reid: FairMOTEmbeddingHead
+ loss: FairMOTLoss
+ tracker: JDETracker
+
+JDETracker:
+ min_box_area: 0
+ vertical_ratio: 0 # no need to filter bboxes according to w/h
+ conf_thres: 0.4
+ tracked_thresh: 0.4
+ metric_type: cosine
diff --git a/PaddleDetection-release-2.6/configs/mot/vehicle/fairmot_dla34_30e_1088x608_visdrone_vehicle.yml b/PaddleDetection-release-2.6/configs/mot/vehicle/fairmot_dla34_30e_1088x608_visdrone_vehicle.yml
new file mode 100644
index 0000000000000000000000000000000000000000..6e7e84cb7439950cd3ecfddc3eab29a98354279a
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/vehicle/fairmot_dla34_30e_1088x608_visdrone_vehicle.yml
@@ -0,0 +1,40 @@
+_BASE_: [
+ '../fairmot/fairmot_dla34_30e_1088x608.yml'
+]
+
+weights: output/fairmot_dla34_30e_1088x608_visdrone_vehicle/model_final
+
+# for MOT training
+TrainDataset:
+ !MOTDataSet
+ dataset_dir: dataset/mot
+ image_lists: ['visdrone_vehicle.train']
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
+
+# for MOT evaluation
+# If you want to change the MOT evaluation dataset, please modify 'data_root'
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: visdrone_vehicle/images/val
+    keep_ori_im: False # set True to save visualization images/videos, or when used in DeepSORT
+
+# for MOT video inference
+TestMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+    keep_ori_im: True # set True to save visualization images or videos
+
+# model config
+FairMOT:
+ detector: CenterNet
+ reid: FairMOTEmbeddingHead
+ loss: FairMOTLoss
+ tracker: JDETracker
+
+JDETracker:
+ min_box_area: 0
+ vertical_ratio: 0 # no need to filter bboxes according to w/h
+ conf_thres: 0.4
+ tracked_thresh: 0.4
+ metric_type: cosine
diff --git a/PaddleDetection-release-2.6/configs/mot/vehicle/fairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone_vehicle.yml b/PaddleDetection-release-2.6/configs/mot/vehicle/fairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone_vehicle.yml
new file mode 100644
index 0000000000000000000000000000000000000000..63e79b54213f883156e92c3cc823148e31dc222a
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/vehicle/fairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone_vehicle.yml
@@ -0,0 +1,40 @@
+_BASE_: [
+ '../fairmot/fairmot_hrnetv2_w18_dlafpn_30e_1088x608.yml'
+]
+
+weights: output/fairmot_hrnetv2_w18_dlafpn_30e_1088x608_visdrone_vehicle/model_final
+
+# for MOT training
+TrainDataset:
+ !MOTDataSet
+ dataset_dir: dataset/mot
+ image_lists: ['visdrone_vehicle.train']
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
+
+# for MOT evaluation
+# If you want to change the MOT evaluation dataset, please modify 'data_root'
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: visdrone_vehicle/images/val
+    keep_ori_im: False # set True to save visualization images/videos, or when used in DeepSORT
+
+# for MOT video inference
+TestMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+    keep_ori_im: True # set True to save visualization images or videos
+
+# model config
+FairMOT:
+ detector: CenterNet
+ reid: FairMOTEmbeddingHead
+ loss: FairMOTLoss
+ tracker: JDETracker
+
+JDETracker:
+ min_box_area: 0
+ vertical_ratio: 0 # no need to filter bboxes according to w/h
+ conf_thres: 0.4
+ tracked_thresh: 0.4
+ metric_type: cosine
diff --git a/PaddleDetection-release-2.6/configs/mot/vehicle/fairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100kmot_vehicle.yml b/PaddleDetection-release-2.6/configs/mot/vehicle/fairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100kmot_vehicle.yml
new file mode 100644
index 0000000000000000000000000000000000000000..599536ff6aca317358f3877069c19bd17e954a30
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/vehicle/fairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100kmot_vehicle.yml
@@ -0,0 +1,40 @@
+_BASE_: [
+ '../fairmot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.yml'
+]
+
+weights: output/fairmot_hrnetv2_w18_dlafpn_30e_576x320_bdd100kmot_vehicle/model_final
+
+# for MOT training
+TrainDataset:
+ !MOTDataSet
+ dataset_dir: dataset/mot
+ image_lists: ['bdd100kmot_vehicle.train']
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
+
+# for MOT evaluation
+# If you want to change the MOT evaluation dataset, please modify 'data_root'
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: bdd100kmot_vehicle/images/val
+    keep_ori_im: False # set True to save visualization images/videos, or when used in DeepSORT
+
+# for MOT video inference
+TestMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+    keep_ori_im: True # set True to save visualization images or videos
+
+# model config
+FairMOT:
+ detector: CenterNet
+ reid: FairMOTEmbeddingHead
+ loss: FairMOTLoss
+ tracker: JDETracker
+
+JDETracker:
+ min_box_area: 0
+ vertical_ratio: 0 # no need to filter bboxes according to w/h
+ conf_thres: 0.4
+ tracked_thresh: 0.4
+ metric_type: cosine
diff --git a/PaddleDetection-release-2.6/configs/mot/vehicle/fairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone_vehicle.yml b/PaddleDetection-release-2.6/configs/mot/vehicle/fairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone_vehicle.yml
new file mode 100644
index 0000000000000000000000000000000000000000..7a155f110f24d449bacf725785d935753f36eba3
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/vehicle/fairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone_vehicle.yml
@@ -0,0 +1,40 @@
+_BASE_: [
+ '../fairmot/fairmot_hrnetv2_w18_dlafpn_30e_576x320.yml'
+]
+
+weights: output/fairmot_hrnetv2_w18_dlafpn_30e_576x320_visdrone_vehicle/model_final
+
+# for MOT training
+TrainDataset:
+ !MOTDataSet
+ dataset_dir: dataset/mot
+ image_lists: ['visdrone_vehicle.train']
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
+
+# for MOT evaluation
+# If you want to change the MOT evaluation dataset, please modify 'data_root'
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: visdrone_vehicle/images/val
+ keep_ori_im: False # set True if save visualization images or video, or used in DeepSORT
+
+# for MOT video inference
+TestMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ keep_ori_im: True # set True if save visualization images or video
+
+# model config
+FairMOT:
+ detector: CenterNet
+ reid: FairMOTEmbeddingHead
+ loss: FairMOTLoss
+ tracker: JDETracker
+
+JDETracker:
+ min_box_area: 0
+ vertical_ratio: 0 # no need to filter bboxes according to w/h
+ conf_thres: 0.4
+ tracked_thresh: 0.4
+ metric_type: cosine
diff --git a/PaddleDetection-release-2.6/configs/mot/vehicle/fairmot_hrnetv2_w18_dlafpn_30e_864x480_visdrone_vehicle.yml b/PaddleDetection-release-2.6/configs/mot/vehicle/fairmot_hrnetv2_w18_dlafpn_30e_864x480_visdrone_vehicle.yml
new file mode 100644
index 0000000000000000000000000000000000000000..8dbbce5578c4585384997886994cd11772573d5a
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/vehicle/fairmot_hrnetv2_w18_dlafpn_30e_864x480_visdrone_vehicle.yml
@@ -0,0 +1,40 @@
+_BASE_: [
+ '../fairmot/fairmot_hrnetv2_w18_dlafpn_30e_864x480.yml'
+]
+
+weights: output/fairmot_hrnetv2_w18_dlafpn_30e_864x480_visdrone_vehicle/model_final
+
+# for MOT training
+TrainDataset:
+ !MOTDataSet
+ dataset_dir: dataset/mot
+ image_lists: ['visdrone_vehicle.train']
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'gt_ide']
+
+# for MOT evaluation
+# If you want to change the MOT evaluation dataset, please modify 'data_root'
+EvalMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ data_root: visdrone_vehicle/images/val
+ keep_ori_im: False # set True if save visualization images or video, or used in DeepSORT
+
+# for MOT video inference
+TestMOTDataset:
+ !MOTImageFolder
+ dataset_dir: dataset/mot
+ keep_ori_im: True # set True if save visualization images or video
+
+# model config
+FairMOT:
+ detector: CenterNet
+ reid: FairMOTEmbeddingHead
+ loss: FairMOTLoss
+ tracker: JDETracker
+
+JDETracker:
+ min_box_area: 0
+ vertical_ratio: 0 # no need to filter bboxes according to w/h
+ conf_thres: 0.4
+ tracked_thresh: 0.4
+ metric_type: cosine
diff --git a/PaddleDetection-release-2.6/configs/mot/vehicle/tools/bdd100kmot/bdd100k2mot.py b/PaddleDetection-release-2.6/configs/mot/vehicle/tools/bdd100kmot/bdd100k2mot.py
new file mode 100644
index 0000000000000000000000000000000000000000..82ead01a39111045daa02112df5629f1c82fad14
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/vehicle/tools/bdd100kmot/bdd100k2mot.py
@@ -0,0 +1,386 @@
+# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import glob
+import os
+import os.path as osp
+import cv2
+import random
+import numpy as np
+import argparse
+from tqdm import tqdm
+import json
+
+
+def mkdir_if_missing(d):
+ if not osp.exists(d):
+ os.makedirs(d)
+
+
+def bdd2mot_tracking(img_dir, label_dir, save_img_dir, save_label_dir):
+ label_jsons = os.listdir(label_dir)
+ for label_json in tqdm(label_jsons):
+ with open(os.path.join(label_dir, label_json)) as f:
+ labels_json = json.load(f)
+        for img_anno in labels_json:  # renamed to avoid shadowing the outer label_json
+            img_name = img_anno['name']
+            video_name = img_anno['videoName']
+            labels = img_anno['labels']
+ txt_string = ""
+ for label in labels:
+ category = label['category']
+ x1 = label['box2d']['x1']
+ x2 = label['box2d']['x2']
+ y1 = label['box2d']['y1']
+ y2 = label['box2d']['y2']
+ width = x2 - x1
+ height = y2 - y1
+ x_center = (x1 + x2) / 2. / args.width
+ y_center = (y1 + y2) / 2. / args.height
+ width /= args.width
+ height /= args.height
+ identity = int(label['id'])
+ # [class] [identity] [x_center] [y_center] [width] [height]
+ txt_string += "{} {} {} {} {} {}\n".format(
+ attr_id_dict[category], identity, x_center, y_center,
+ width, height)
+
+ fn_label = os.path.join(save_label_dir, img_name[:-4] + '.txt')
+ source_img = os.path.join(img_dir, video_name, img_name)
+ target_img = os.path.join(save_img_dir, img_name)
+ with open(fn_label, 'w') as f:
+ f.write(txt_string)
+ os.system('cp {} {}'.format(source_img, target_img))
+
+
+def transBbox(bbox):
+ # bbox --> cx cy w h
+ bbox = list(map(lambda x: float(x), bbox))
+ bbox[0] = (bbox[0] - bbox[2] / 2) * 1280
+ bbox[1] = (bbox[1] - bbox[3] / 2) * 720
+ bbox[2] = bbox[2] * 1280
+ bbox[3] = bbox[3] * 720
+
+ bbox = list(map(lambda x: str(x), bbox))
+ return bbox
+
+
+def genSingleImageMot(inputPath, classes=[]):
+ labelPaths = glob.glob(inputPath + '/*.txt')
+ labelPaths = sorted(labelPaths)
+ allLines = []
+ result = {}
+ for labelPath in labelPaths:
+ frame = str(int(labelPath.split('-')[-1].replace('.txt', '')))
+ with open(labelPath, 'r') as labelPathFile:
+ lines = labelPathFile.readlines()
+ for line in lines:
+ line = line.replace('\n', '')
+ lineArray = line.split(' ')
+ if len(classes) > 0:
+ if lineArray[0] in classes:
+ lineArray.append(frame)
+ allLines.append(lineArray)
+ else:
+ lineArray.append(frame)
+ allLines.append(lineArray)
+ resultMap = {}
+ for line in allLines:
+ if line[1] not in resultMap.keys():
+ resultMap[line[1]] = []
+ resultMap[line[1]].append(line)
+ mot_gt = []
+ id_idx = 0
+ for rid in resultMap.keys():
+ id_idx += 1
+ for id_line in resultMap[rid]:
+ mot_line = []
+ mot_line.append(id_line[-1])
+ mot_line.append(str(id_idx))
+ id_line_temp = transBbox(id_line[2:6])
+ mot_line.extend(id_line_temp)
+ mot_line.append('1') # origin class: id_line[0]
+ mot_line.append('1') # permanent class => 1
+ mot_line.append('1')
+ mot_gt.append(mot_line)
+
+ result = list(map(lambda line: str.join(',', line), mot_gt))
+ resultStr = str.join('\n', result)
+ return resultStr
+
+
+def writeGt(inputPath, outPath, classes=[]):
+ singleImageResult = genSingleImageMot(inputPath, classes=classes)
+ outPathFile = outPath + '/gt.txt'
+ mkdir_if_missing(outPath)
+ with open(outPathFile, 'w') as gtFile:
+ gtFile.write(singleImageResult)
+
+
+def genSeqInfo(seqInfoPath):
+ name = seqInfoPath.split('/')[-2]
+ img1Path = osp.join(str.join('/', seqInfoPath.split('/')[0:-1]), 'img1')
+ seqLength = len(glob.glob(img1Path + '/*.jpg'))
+ seqInfoStr = f'''[Sequence]\nname={name}\nimDir=img1\nframeRate=30\nseqLength={seqLength}\nimWidth=1280\nimHeight=720\nimExt=.jpg'''
+ with open(seqInfoPath, 'w') as seqFile:
+ seqFile.write(seqInfoStr)
+
+
+def genMotGt(dataDir, classes=[]):
+ seqLists = sorted(glob.glob(dataDir))
+ for seqList in seqLists:
+ inputPath = osp.join(seqList, 'img1')
+ outputPath = seqList.replace('labels_with_ids', 'images')
+ outputPath = osp.join(outputPath, 'gt')
+ mkdir_if_missing(outputPath)
+ print('processing...', outputPath)
+ writeGt(inputPath, outputPath, classes=classes)
+ seqList = seqList.replace('labels_with_ids', 'images')
+ seqInfoPath = osp.join(seqList, 'seqinfo.ini')
+ genSeqInfo(seqInfoPath)
+
+
+def updateSeqInfo(dataDir, phase):
+ seqPath = osp.join(dataDir, 'labels_with_ids', phase)
+ seqList = glob.glob(seqPath + '/*')
+ for seqName in seqList:
+ print('seqName=>', seqName)
+ seqName_img1_dir = osp.join(seqName, 'img1')
+ txtLength = glob.glob(seqName_img1_dir + '/*.txt')
+ name = seqName.split('/')[-1].replace('.jpg', '').replace('.txt', '')
+ seqLength = len(txtLength)
+ seqInfoStr = f'''[Sequence]\nname={name}\nimDir=img1\nframeRate=30\nseqLength={seqLength}\nimWidth=1280\nimHeight=720\nimExt=.jpg'''
+ seqInfoPath = seqName_img1_dir.replace('labels_with_ids', 'images')
+ seqInfoPath = seqInfoPath.replace('/img1', '')
+ seqInfoPath = seqInfoPath + '/seqinfo.ini'
+ with open(seqInfoPath, 'w') as seqFile:
+ seqFile.write(seqInfoStr)
+
+
+def VisualDataset(datasetPath, phase='train', seqName='', frameId=1):
+ trainPath = osp.join(datasetPath, 'labels_with_ids', phase)
+ seq1Paths = osp.join(trainPath, seqName)
+ seq_img1_path = osp.join(seq1Paths, 'img1')
+ label_with_idPath = osp.join(seq_img1_path, seqName + '-' + '%07d' %
+ frameId) + '.txt'
+ image_path = label_with_idPath.replace('labels_with_ids', 'images').replace(
+ '.txt', '.jpg')
+
+ seqInfoPath = str.join('/', image_path.split('/')[:-2])
+ seqInfoPath = seqInfoPath + '/seqinfo.ini'
+ seq_info = open(seqInfoPath).read()
+ width = int(seq_info[seq_info.find('imWidth=') + 8:seq_info.find(
+ '\nimHeight')])
+ height = int(seq_info[seq_info.find('imHeight=') + 9:seq_info.find(
+ '\nimExt')])
+
+ with open(label_with_idPath, 'r') as label:
+ allLines = label.readlines()
+ images = cv2.imread(image_path)
+ print('image_path => ', image_path)
+ for line in allLines:
+ line = line.split(' ')
+ line = list(map(lambda x: float(x), line))
+        cx, cy, w, h = line[2:6]
+        x1 = cx - w / 2
+        y1 = cy - h / 2
+        x2 = cx + w / 2
+        y2 = cy + h / 2
+        cv2.rectangle(
+            images, (int(x1 * width), int(y1 * height)),
+            (int(x2 * width), int(y2 * height)), (255, 0, 0),
+            thickness=2)
+ cv2.imwrite('test.jpg', images)
+
+
+def VisualGt(dataPath, phase='train'):
+ seqList = sorted(glob.glob(osp.join(dataPath, 'images', phase) + '/*'))
+ seqIndex = random.randint(0, len(seqList) - 1)
+ seqPath = seqList[seqIndex]
+ gt_path = osp.join(seqPath, 'gt', 'gt.txt')
+ img_list_path = sorted(glob.glob(osp.join(seqPath, 'img1', '*.jpg')))
+    imgIndex = random.randint(0, len(img_list_path) - 1)
+ img_Path = img_list_path[imgIndex]
+
+ frame_value = img_Path.split('/')[-1].replace('.jpg', '')
+ frame_value = frame_value.split('-')[-1]
+ frame_value = int(frame_value)
+ seqNameStr = img_Path.split('/')[-1].replace('.jpg', '').replace('img', '')
+ frame_value = int(seqNameStr.split('-')[-1])
+ print('frame_value => ', frame_value)
+ gt_value = np.loadtxt(gt_path, dtype=float, delimiter=',')
+ gt_value = gt_value[gt_value[:, 0] == frame_value]
+
+ get_list = gt_value.tolist()
+ img = cv2.imread(img_Path)
+
+ colors = [[255, 0, 0], [255, 255, 0], [255, 0, 255], [0, 255, 0],
+ [0, 255, 255], [0, 0, 255]]
+ for seq, _id, pl, pt, w, h, _, bbox_class, _ in get_list:
+ pl, pt, w, h = int(pl), int(pt), int(w), int(h)
+ print('pl,pt,w,h => ', pl, pt, w, h)
+ cv2.putText(img,
+ str(bbox_class), (pl, pt), cv2.FONT_HERSHEY_PLAIN, 2,
+ colors[int(bbox_class - 1)])
+ cv2.rectangle(
+ img, (pl, pt), (pl + w, pt + h),
+ colors[int(bbox_class - 1)],
+ thickness=2)
+ cv2.imwrite('testGt.jpg', img)
+ print(seqPath, frame_value)
+ return seqPath.split('/')[-1], frame_value
+
+
+def gen_image_list(dataPath, datType):
+ inputPath = f'{dataPath}/labels_with_ids/{datType}'
+ pathList = sorted(glob.glob(inputPath + '/*'))
+ print(pathList)
+ allImageList = []
+ for pathSingle in pathList:
+ imgList = sorted(glob.glob(osp.join(pathSingle, 'img1', '*.txt')))
+ for imgPath in imgList:
+ imgPath = imgPath.replace('labels_with_ids', 'images').replace(
+ '.txt', '.jpg')
+ allImageList.append(imgPath)
+ with open(f'{dataPath}.{datType}', 'w') as image_list_file:
+ allImageListStr = str.join('\n', allImageList)
+ image_list_file.write(allImageListStr)
+
+
+def formatOrigin(datapath, phase):
+ label_with_idPath = osp.join(datapath, 'labels_with_ids', phase)
+ print(label_with_idPath)
+ for txtList in sorted(glob.glob(label_with_idPath + '/*.txt')):
+ print(txtList)
+ seqName = txtList.split('/')[-1]
+ seqName = str.join('-', seqName.split('-')[0:-1]).replace('.txt', '')
+ seqPath = osp.join(label_with_idPath, seqName, 'img1')
+ mkdir_if_missing(seqPath)
+ os.system(f'mv {txtList} {seqPath}')
+
+
+def copyImg(fromRootPath, toRootPath, phase):
+ fromPath = osp.join(fromRootPath, 'images', phase)
+ toPathSeqPath = osp.join(toRootPath, 'labels_with_ids', phase)
+ seqList = sorted(glob.glob(toPathSeqPath + '/*'))
+ for seqPath in seqList:
+ seqName = seqPath.split('/')[-1]
+ imgTxtList = sorted(glob.glob(osp.join(seqPath, 'img1') + '/*.txt'))
+ img_toPathSeqPath = osp.join(seqPath, 'img1')
+ img_toPathSeqPath = img_toPathSeqPath.replace('labels_with_ids',
+ 'images')
+ mkdir_if_missing(img_toPathSeqPath)
+
+ for imgTxt in imgTxtList:
+ imgName = imgTxt.split('/')[-1].replace('.txt', '.jpg')
+ imgfromPath = osp.join(fromPath, seqName, imgName)
+ print(f'cp {imgfromPath} {img_toPathSeqPath}')
+ os.system(f'cp {imgfromPath} {img_toPathSeqPath}')
+
+
+if __name__ == "__main__":
+ parser = argparse.ArgumentParser(description='BDD100K to MOT format')
+ parser.add_argument("--data_path", default='bdd100k')
+ parser.add_argument("--phase", default='train')
+ parser.add_argument("--classes", default='2,3,4,9,10')
+
+ parser.add_argument("--img_dir", default="bdd100k/images/track/")
+ parser.add_argument("--label_dir", default="bdd100k/labels/box_track_20/")
+ parser.add_argument("--save_path", default="bdd100kmot_vehicle")
+ parser.add_argument("--height", default=720)
+ parser.add_argument("--width", default=1280)
+ args = parser.parse_args()
+
+ attr_dict = dict()
+ attr_dict["categories"] = [{
+ "supercategory": "none",
+ "id": 0,
+ "name": "pedestrian"
+ }, {
+ "supercategory": "none",
+ "id": 1,
+ "name": "rider"
+ }, {
+ "supercategory": "none",
+ "id": 2,
+ "name": "car"
+ }, {
+ "supercategory": "none",
+ "id": 3,
+ "name": "truck"
+ }, {
+ "supercategory": "none",
+ "id": 4,
+ "name": "bus"
+ }, {
+ "supercategory": "none",
+ "id": 5,
+ "name": "train"
+ }, {
+ "supercategory": "none",
+ "id": 6,
+ "name": "motorcycle"
+ }, {
+ "supercategory": "none",
+ "id": 7,
+ "name": "bicycle"
+ }, {
+ "supercategory": "none",
+ "id": 8,
+ "name": "other person"
+ }, {
+ "supercategory": "none",
+ "id": 9,
+ "name": "trailer"
+ }, {
+ "supercategory": "none",
+ "id": 10,
+ "name": "other vehicle"
+ }]
+ attr_id_dict = {i['name']: i['id'] for i in attr_dict['categories']}
+
+ # create bdd100kmot_vehicle training set in MOT format
+ print('Loading and converting training set...')
+ train_img_dir = os.path.join(args.img_dir, 'train')
+ train_label_dir = os.path.join(args.label_dir, 'train')
+ save_img_dir = os.path.join(args.save_path, 'images', 'train')
+ save_label_dir = os.path.join(args.save_path, 'labels_with_ids', 'train')
+ if not os.path.exists(save_img_dir): os.makedirs(save_img_dir)
+ if not os.path.exists(save_label_dir): os.makedirs(save_label_dir)
+ bdd2mot_tracking(train_img_dir, train_label_dir, save_img_dir,
+ save_label_dir)
+
+ # create bdd100kmot_vehicle validation set in MOT format
+ print('Loading and converting validation set...')
+ val_img_dir = os.path.join(args.img_dir, 'val')
+ val_label_dir = os.path.join(args.label_dir, 'val')
+ save_img_dir = os.path.join(args.save_path, 'images', 'val')
+ save_label_dir = os.path.join(args.save_path, 'labels_with_ids', 'val')
+ if not os.path.exists(save_img_dir): os.makedirs(save_img_dir)
+ if not os.path.exists(save_label_dir): os.makedirs(save_label_dir)
+ bdd2mot_tracking(val_img_dir, val_label_dir, save_img_dir, save_label_dir)
+
+ # gen gt file
+ dataPath = args.data_path
+ phase = args.phase
+ classes = args.classes.split(',')
+ formatOrigin(osp.join(dataPath, 'bdd100kmot_vehicle'), phase)
+ dataDir = osp.join(
+ osp.join(dataPath, 'bdd100kmot_vehicle'), 'labels_with_ids',
+ phase) + '/*'
+ genMotGt(dataDir, classes=classes)
+ copyImg(dataPath, osp.join(dataPath, 'bdd100kmot_vehicle'), phase)
+ updateSeqInfo(osp.join(dataPath, 'bdd100kmot_vehicle'), phase)
+ gen_image_list(osp.join(dataPath, 'bdd100kmot_vehicle'), phase)
+ os.system(f'rm -r {dataPath}/bdd100kmot_vehicle/images/' + phase + '/*.jpg')
diff --git a/PaddleDetection-release-2.6/configs/mot/vehicle/tools/bdd100kmot/gen_bdd100kmot_vehicle.sh b/PaddleDetection-release-2.6/configs/mot/vehicle/tools/bdd100kmot/gen_bdd100kmot_vehicle.sh
new file mode 100644
index 0000000000000000000000000000000000000000..b88b25180d9615b5277b1101f321c0d2704c3241
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/vehicle/tools/bdd100kmot/gen_bdd100kmot_vehicle.sh
@@ -0,0 +1,16 @@
+data_path=bdd100k
+img_dir=${data_path}/images/track
+label_dir=${data_path}/labels/box_track_20
+save_path=${data_path}/bdd100kmot_vehicle
+
+phasetrain=train
+phaseval=val
+classes=2,3,4,9,10
+
+# gen mot dataset
+python bdd100k2mot.py --data_path=${data_path} --phase=${phasetrain} --classes=${classes} --img_dir=${img_dir} --label_dir=${label_dir} --save_path=${save_path}
+python bdd100k2mot.py --data_path=${data_path} --phase=${phaseval} --classes=${classes} --img_dir=${img_dir} --label_dir=${label_dir} --save_path=${save_path}
+
+# gen new labels_with_ids
+python gen_labels_MOT.py --mot_data=${data_path} --phase=${phasetrain}
+python gen_labels_MOT.py --mot_data=${data_path} --phase=${phaseval}
diff --git a/PaddleDetection-release-2.6/configs/mot/vehicle/tools/bdd100kmot/gen_labels_MOT.py b/PaddleDetection-release-2.6/configs/mot/vehicle/tools/bdd100kmot/gen_labels_MOT.py
new file mode 100644
index 0000000000000000000000000000000000000000..91aa800c38591ce52c146dad9a73aecd7741fed7
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/vehicle/tools/bdd100kmot/gen_labels_MOT.py
@@ -0,0 +1,72 @@
+# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import os
+import os.path as osp
+import numpy as np
+import argparse
+
+
+def mkdirs(d):
+ if not osp.exists(d):
+ os.makedirs(d)
+
+
+if __name__ == "__main__":
+ parser = argparse.ArgumentParser(description='BDD100K to MOT format')
+ parser.add_argument(
+ "--mot_data", default='./bdd100k')
+ parser.add_argument("--phase", default='train')
+ args = parser.parse_args()
+
+ MOT_data = args.mot_data
+ phase = args.phase
+ seq_root = osp.join(MOT_data, 'bdd100kmot_vehicle', 'images', phase)
+ label_root = osp.join(MOT_data, 'bdd100kmot_vehicle', 'labels_with_ids',
+ phase)
+ mkdirs(label_root)
+ seqs = [s for s in os.listdir(seq_root)]
+ tid_curr = 0
+ tid_last = -1
+
+ os.system(f'rm -r {MOT_data}/bdd100kmot_vehicle/labels_with_ids')
+ for seq in seqs:
+ print('seq => ', seq)
+ seq_info = open(osp.join(seq_root, seq, 'seqinfo.ini')).read()
+ seq_width = int(seq_info[seq_info.find('imWidth=') + 8:seq_info.find(
+ '\nimHeight')])
+ seq_height = int(seq_info[seq_info.find('imHeight=') + 9:seq_info.find(
+ '\nimExt')])
+
+ gt_txt = osp.join(seq_root, seq, 'gt', 'gt.txt')
+ gt = np.loadtxt(gt_txt, dtype=np.float64, delimiter=',')
+
+ seq_label_root = osp.join(label_root, seq, 'img1')
+ mkdirs(seq_label_root)
+
+ for fid, tid, x, y, w, h, mark, label, _ in gt:
+ fid = int(fid)
+ tid = int(tid)
+ if not tid == tid_last:
+ tid_curr += 1
+ tid_last = tid
+ x += w / 2
+ y += h / 2
+ label_fpath = osp.join(seq_label_root,
+ seq + '-' + '{:07d}.txt'.format(fid))
+ label_str = '0 {:d} {:.6f} {:.6f} {:.6f} {:.6f}\n'.format(
+ tid_curr, x / seq_width, y / seq_height, w / seq_width,
+ h / seq_height)
+ with open(label_fpath, 'a') as f:
+ f.write(label_str)
diff --git a/PaddleDetection-release-2.6/configs/mot/vehicle/tools/visdrone/visdrone2mot.py b/PaddleDetection-release-2.6/configs/mot/vehicle/tools/visdrone/visdrone2mot.py
new file mode 100644
index 0000000000000000000000000000000000000000..a2fa200204f5656ce015d371715b0f7c2bf9366d
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/mot/vehicle/tools/visdrone/visdrone2mot.py
@@ -0,0 +1,295 @@
+# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import glob
+import os
+import os.path as osp
+import cv2
+import argparse
+import numpy as np
+import random
+
+# The object category indicates the type of annotated object
+# (i.e., ignored regions(0), pedestrian(1), people(2), bicycle(3), car(4), van(5), truck(6), tricycle(7), awning-tricycle(8), bus(9), motor(10), others(11))
+
+# Extract single class or multi class
+isExtractMultiClass = False
+# This sequence is excluded because it contains too few vehicles
+exclude_seq = ["uav0000086_00000_v"]
+
+
+def mkdir_if_missing(d):
+ if not osp.exists(d):
+ os.makedirs(d)
+
+
+def genGtFile(seqPath, outPath, classes=[]):
+ id_idx = 0
+ old_idx = -1
+ with open(seqPath, 'r') as singleSeqFile:
+ motLine = []
+ allLines = singleSeqFile.readlines()
+ for line in allLines:
+ line = line.replace('\n', '')
+ line = line.split(',')
+ # exclude occlusion!='2'
+ if line[-1] != '2' and line[7] in classes:
+ if old_idx != int(line[1]):
+ id_idx += 1
+ old_idx = int(line[1])
+ newLine = line[0:6]
+ newLine[1] = str(id_idx)
+ newLine.append('1')
+ if (len(classes) > 1 and isExtractMultiClass):
+ class_index = str(classes.index(line[7]) + 1)
+ newLine.append(class_index)
+ else:
+ newLine.append('1') # use permanent class '1'
+ newLine.append('1')
+ motLine.append(newLine)
+ mkdir_if_missing(outPath)
+ gtFilePath = osp.join(outPath, 'gt.txt')
+ with open(gtFilePath, 'w') as gtFile:
+ motLine = list(map(lambda x: str.join(',', x), motLine))
+ motLineStr = str.join('\n', motLine)
+ gtFile.write(motLineStr)
+
+
+def genSeqInfo(img1Path, seqName):
+ imgPaths = glob.glob(img1Path + '/*.jpg')
+ seqLength = len(imgPaths)
+ if seqLength > 0:
+ image1 = cv2.imread(imgPaths[0])
+ imgHeight = image1.shape[0]
+ imgWidth = image1.shape[1]
+ else:
+ imgHeight = 0
+ imgWidth = 0
+ seqInfoStr = f'''[Sequence]\nname={seqName}\nimDir=img1\nframeRate=30\nseqLength={seqLength}\nimWidth={imgWidth}\nimHeight={imgHeight}\nimExt=.jpg'''
+ seqInfoPath = img1Path.replace('/img1', '')
+ with open(seqInfoPath + '/seqinfo.ini', 'w') as seqFile:
+ seqFile.write(seqInfoStr)
+
+
+def copyImg(img1Path, gtTxtPath, outputFileName):
+ with open(gtTxtPath, 'r') as gtFile:
+ allLines = gtFile.readlines()
+ imgList = []
+        for line in allLines:
+            imgIdx = int(line.split(',')[0])
+            if imgIdx not in imgList:
+                imgList.append(imgIdx)
+                seqName = gtTxtPath.replace('./{}/'.format(outputFileName),
+                                            '').replace('/gt/gt.txt', '')
+                sourceImgPath = osp.join('./sequences', seqName,
+                                         '{:07d}.jpg'.format(imgIdx))
+                os.system(f'cp {sourceImgPath} {img1Path}')
+
+
+def genMotLabels(datasetPath, outputFileName, classes=['2']):
+ mkdir_if_missing(osp.join(datasetPath, outputFileName))
+ annotationsPath = osp.join(datasetPath, 'annotations')
+ annotationsList = glob.glob(osp.join(annotationsPath, '*.txt'))
+ for annotationPath in annotationsList:
+ seqName = annotationPath.split('/')[-1].replace('.txt', '')
+ if seqName in exclude_seq:
+ continue
+ mkdir_if_missing(osp.join(datasetPath, outputFileName, seqName, 'gt'))
+ mkdir_if_missing(osp.join(datasetPath, outputFileName, seqName, 'img1'))
+ genGtFile(annotationPath,
+ osp.join(datasetPath, outputFileName, seqName, 'gt'), classes)
+ img1Path = osp.join(datasetPath, outputFileName, seqName, 'img1')
+ gtTxtPath = osp.join(datasetPath, outputFileName, seqName, 'gt/gt.txt')
+ copyImg(img1Path, gtTxtPath, outputFileName)
+ genSeqInfo(img1Path, seqName)
+
+
+def deleteFileWhichImg1IsEmpty(mot16Path, dataType='train'):
+ path = mot16Path
+ data_images_train = osp.join(path, 'images', f'{dataType}')
+ data_images_train_seqs = glob.glob(data_images_train + '/*')
+ if (len(data_images_train_seqs) == 0):
+ print('dataset is empty!')
+ for data_images_train_seq in data_images_train_seqs:
+ data_images_train_seq_img1 = osp.join(data_images_train_seq, 'img1')
+ if len(glob.glob(data_images_train_seq_img1 + '/*.jpg')) == 0:
+ print(f"os.system(rm -rf {data_images_train_seq})")
+ os.system(f'rm -rf {data_images_train_seq}')
+
+
+def formatMot16Path(dataPath, pathType='train'):
+ train_path = osp.join(dataPath, 'images', pathType)
+ mkdir_if_missing(train_path)
+ os.system(f'mv {dataPath}/* {train_path}')
+
+
+def VisualGt(dataPath, phase='train'):
+ seqList = sorted(glob.glob(osp.join(dataPath, 'images', phase) + '/*'))
+ seqIndex = random.randint(0, len(seqList) - 1)
+ seqPath = seqList[seqIndex]
+ gt_path = osp.join(seqPath, 'gt', 'gt.txt')
+ img_list_path = sorted(glob.glob(osp.join(seqPath, 'img1', '*.jpg')))
+    imgIndex = random.randint(0, len(img_list_path) - 1)
+ img_Path = img_list_path[imgIndex]
+ frame_value = int(img_Path.split('/')[-1].replace('.jpg', ''))
+ gt_value = np.loadtxt(gt_path, dtype=int, delimiter=',')
+ gt_value = gt_value[gt_value[:, 0] == frame_value]
+ get_list = gt_value.tolist()
+ img = cv2.imread(img_Path)
+ colors = [[255, 0, 0], [255, 255, 0], [255, 0, 255], [0, 255, 0],
+ [0, 255, 255], [0, 0, 255]]
+ for seq, _id, pl, pt, w, h, _, bbox_class, _ in get_list:
+ cv2.putText(img,
+ str(bbox_class), (pl, pt), cv2.FONT_HERSHEY_PLAIN, 2,
+ colors[bbox_class - 1])
+ cv2.rectangle(
+ img, (pl, pt), (pl + w, pt + h),
+ colors[bbox_class - 1],
+ thickness=2)
+ cv2.imwrite('testGt.jpg', img)
+
+
+def VisualDataset(datasetPath, phase='train', seqName='', frameId=1):
+ trainPath = osp.join(datasetPath, 'labels_with_ids', phase)
+ seq1Paths = osp.join(trainPath, seqName)
+ seq_img1_path = osp.join(seq1Paths, 'img1')
+ label_with_idPath = osp.join(seq_img1_path, '%07d' % frameId) + '.txt'
+ image_path = label_with_idPath.replace('labels_with_ids', 'images').replace(
+ '.txt', '.jpg')
+ seqInfoPath = str.join('/', image_path.split('/')[:-2])
+ seqInfoPath = seqInfoPath + '/seqinfo.ini'
+ seq_info = open(seqInfoPath).read()
+ width = int(seq_info[seq_info.find('imWidth=') + 8:seq_info.find(
+ '\nimHeight')])
+ height = int(seq_info[seq_info.find('imHeight=') + 9:seq_info.find(
+ '\nimExt')])
+
+ with open(label_with_idPath, 'r') as label:
+ allLines = label.readlines()
+ images = cv2.imread(image_path)
+ for line in allLines:
+ line = line.split(' ')
+ line = list(map(lambda x: float(x), line))
+        cx, cy, w, h = line[2:6]
+        x1 = cx - w / 2
+        y1 = cy - h / 2
+        x2 = cx + w / 2
+        y2 = cy + h / 2
+        cv2.rectangle(
+            images, (int(x1 * width), int(y1 * height)),
+            (int(x2 * width), int(y2 * height)), (255, 0, 0),
+            thickness=2)
+ cv2.imwrite('test.jpg', images)
+
+
+def gen_image_list(dataPath, datType):
+ inputPath = f'{dataPath}/images/{datType}'
+ pathList = glob.glob(inputPath + '/*')
+ pathList = sorted(pathList)
+ allImageList = []
+ for pathSingle in pathList:
+ imgList = sorted(glob.glob(osp.join(pathSingle, 'img1', '*.jpg')))
+ for imgPath in imgList:
+ allImageList.append(imgPath)
+ with open(f'{dataPath}.{datType}', 'w') as image_list_file:
+ allImageListStr = str.join('\n', allImageList)
+ image_list_file.write(allImageListStr)
+
+
+def gen_labels_mot(MOT_data, phase='train'):
+ seq_root = './{}/images/{}'.format(MOT_data, phase)
+ label_root = './{}/labels_with_ids/{}'.format(MOT_data, phase)
+ mkdir_if_missing(label_root)
+ seqs = [s for s in os.listdir(seq_root)]
+ print('seqs => ', seqs)
+ tid_curr = 0
+ tid_last = -1
+ for seq in seqs:
+ seq_info = open(osp.join(seq_root, seq, 'seqinfo.ini')).read()
+ seq_width = int(seq_info[seq_info.find('imWidth=') + 8:seq_info.find(
+ '\nimHeight')])
+ seq_height = int(seq_info[seq_info.find('imHeight=') + 9:seq_info.find(
+ '\nimExt')])
+
+ gt_txt = osp.join(seq_root, seq, 'gt', 'gt.txt')
+ gt = np.loadtxt(gt_txt, dtype=np.float64, delimiter=',')
+
+ seq_label_root = osp.join(label_root, seq, 'img1')
+ mkdir_if_missing(seq_label_root)
+
+ for fid, tid, x, y, w, h, mark, label, _ in gt:
+ # if mark == 0 or not label == 1:
+ # continue
+ fid = int(fid)
+ tid = int(tid)
+ if not tid == tid_last:
+ tid_curr += 1
+ tid_last = tid
+ x += w / 2
+ y += h / 2
+ label_fpath = osp.join(seq_label_root, '{:07d}.txt'.format(fid))
+ label_str = '0 {:d} {:.6f} {:.6f} {:.6f} {:.6f}\n'.format(
+ tid_curr, x / seq_width, y / seq_height, w / seq_width,
+ h / seq_height)
+ with open(label_fpath, 'a') as f:
+ f.write(label_str)
+
+
+def parse_arguments():
+ parser = argparse.ArgumentParser(description='input method')
+ parser.add_argument("--transMot", type=bool, default=False)
+ parser.add_argument("--genMot", type=bool, default=False)
+ parser.add_argument("--formatMotPath", type=bool, default=False)
+ parser.add_argument("--deleteEmpty", type=bool, default=False)
+ parser.add_argument("--genLabelsMot", type=bool, default=False)
+ parser.add_argument("--genImageList", type=bool, default=False)
+ parser.add_argument("--visualImg", type=bool, default=False)
+ parser.add_argument("--visualGt", type=bool, default=False)
+ parser.add_argument("--data_name", type=str, default='visdrone_vehicle')
+ parser.add_argument("--phase", type=str, default='train')
+ parser.add_argument("--classes", type=str, default='4,5,6,9')
+ return parser.parse_args()
+
+
+if __name__ == "__main__":
+ args = parse_arguments()
+ classes = args.classes.split(',')
+ datasetPath = './'
+ dataName = args.data_name
+ phase = args.phase
+ if args.transMot:
+ genMotLabels(datasetPath, dataName, classes)
+ formatMot16Path(dataName, pathType=phase)
+ mot16Path = f'./{dataName}'
+ deleteFileWhichImg1IsEmpty(mot16Path, dataType=phase)
+ gen_labels_mot(dataName, phase=phase)
+ gen_image_list(dataName, phase)
+ if args.genMot:
+ genMotLabels(datasetPath, dataName, classes)
+ if args.formatMotPath:
+ formatMot16Path(dataName, pathType=phase)
+ if args.deleteEmpty:
+ mot16Path = f'./{dataName}'
+ deleteFileWhichImg1IsEmpty(mot16Path, dataType=phase)
+ if args.genLabelsMot:
+ gen_labels_mot(dataName, phase=phase)
+ if args.genImageList:
+ gen_image_list(dataName, phase)
+ if args.visualGt:
+ VisualGt(f'./{dataName}', phase)
+ if args.visualImg:
+ seqName = 'uav0000137_00458_v'
+ frameId = 43
+ VisualDataset(
+ f'./{dataName}', phase=phase, seqName=seqName, frameId=frameId)
diff --git a/PaddleDetection-release-2.6/configs/picodet/FULL_QUANTIZATION.md b/PaddleDetection-release-2.6/configs/picodet/FULL_QUANTIZATION.md
new file mode 100644
index 0000000000000000000000000000000000000000..422ae07fe082849af3305e532677115ef822aabc
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/FULL_QUANTIZATION.md
@@ -0,0 +1,163 @@
+# PP-PicoDet Full Quantization Example
+
+Contents:
+
+- [1. Introduction](#1-introduction)
+- [2. Benchmark](#2-benchmark)
+- [3. Full Quantization Workflow](#3-full-quantization-workflow)
+  - [3.1 Prepare the Environment](#31-prepare-the-environment)
+  - [3.2 Prepare the Dataset](#32-prepare-the-dataset)
+  - [3.3 Train the Full-Precision Model](#33-train-the-full-precision-model)
+  - [3.4 Export the Inference Model](#34-export-the-inference-model)
+  - [3.5 Quantize and Produce the Model](#35-quantize-and-produce-the-model)
+- [4. Deployment](#4-deployment)
+- [5. FAQ](#5-faq)
+
+## 1. Introduction
+
+Taking PicoDet as an example, this guide covers the full pipeline from model training and full quantization to deployment on NPU hardware.
+
+* The [Benchmark](#2-benchmark) table already provides fully quantized models built from the COCO-pretrained model.
+
+* Verified NPU hardware:
+
+  - Rockchip boards: Rockchip RV1109, Rockchip RV1126, Rockchip RK1808
+
+  - Amlogic boards: Amlogic A311D, Amlogic S905D3, Amlogic C308X
+
+  - NXP boards: NXP i.MX 8M Plus
+
+* Deployment on unverified hardware:
+  - "Unverified" means the hardware is not yet supported by Paddle Lite inference. You can instead export the model with Paddle2ONNX and deploy it with the hardware vendor's own inference engine, provided the hardware supports fully quantized ONNX models.
+
+## 2. Benchmark
+
+### PicoDet-S-NPU
+
+| Model | Strategy | mAP | FP32 | INT8 | Config | Model |
+|:------------- |:-------- |:----:|:----:|:----:|:---------------------------------------------------------------------------------------------------------------------------------:|:-----------------------------------------------------------------------------------:|
+| PicoDet-S-NPU | Baseline | 30.1 | - | - | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/picodet/picodet_s_416_coco_npu.yml) | [Model](https://bj.bcebos.com/v1/paddle-slim-models/act/picodet_s_416_coco_npu.tar) |
+| PicoDet-S-NPU | Quantization-aware training | 29.7 | - | - | [config](https://github.com/PaddlePaddle/PaddleSlim/tree/develop/demo/full_quantization/detection/configs/picodet_s_qat_dis.yaml) | [Model](https://bj.bcebos.com/v1/paddle-slim-models/act/picodet_s_npu_quant.tar) |
+
+- All mAP values are evaluated on the COCO val2017 dataset with IoU=0.5:0.95.
+
+## 3. Full Quantization Workflow
+
+To quantize a model trained on your own data, follow the steps below.
+
+### 3.1 Prepare the Environment
+
+- PaddlePaddle >= 2.3 (install from the [PaddlePaddle website](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/pip/linux-pip.html))
+- PaddleSlim >= 2.3
+- PaddleDet >= 2.4
+
+Install paddlepaddle:
+
+```shell
+# CPU
+pip install paddlepaddle
+# GPU
+pip install paddlepaddle-gpu
+```
+
+Install paddleslim:
+
+```shell
+pip install paddleslim
+```
+
+Install paddledet:
+
+```shell
+pip install paddledet
+```
+
+### 3.2 Prepare the Dataset
+
+This example runs full quantization on COCO data by default. If you use custom data, prepare it in COCO format; for other custom formats, see the [PaddleDetection data preparation guide](../../docs/tutorials/data/PrepareDataSet.md).
+
+Taking the PicoDet-S-NPU model as an example, if your dataset is ready, simply set the `dataset_dir` field under `EvalDataset` in [picodet_reader.yml](./configs/picodet_reader.yml) to your own dataset path.
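For illustration, the `EvalDataset` edit might look like the sketch below. The field names follow PaddleDetection's usual COCO dataset convention; all paths are placeholders for your own layout, not values taken from the shipped config:

```yaml
# Hypothetical excerpt of picodet_reader.yml; adjust paths to your dataset.
EvalDataset:
  !COCODataSet
  image_dir: val2017
  anno_path: annotations/instances_val2017.json
  dataset_dir: /path/to/your/coco_style_dataset
```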
+
+### 3.3 Train the Full-Precision Model
+
+Full quantization requires a trained full-precision model; if you already have one, skip this step.
+
+- Training on a single GPU:
+
+```shell
+# training on single-GPU
+export CUDA_VISIBLE_DEVICES=0
+python tools/train.py -c configs/picodet/picodet_s_416_coco_npu.yml --eval
+```
+
+**Note:** If training runs out of GPU memory, reduce `batch_size` in `TrainReader` and scale down `base_lr` in `LearningRate` by the same ratio. All released configs were trained with 4 GPUs; if you switch to 1 GPU, divide `base_lr` by 4.
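The linear scaling rule in this note can be sketched as follows (the baseline numbers below are illustrative placeholders, not values from a released config):

```python
def scale_lr(base_lr, base_total_batch, new_total_batch):
    """Scale the learning rate linearly with the total (global) batch size."""
    return base_lr * new_total_batch / base_total_batch

# Illustrative: a config tuned for 4 GPUs x 80 images per GPU. Dropping to
# 1 GPU with the same per-GPU batch size shrinks the global batch 4x,
# so base_lr shrinks 4x as well.
print(scale_lr(0.32, base_total_batch=4 * 80, new_total_batch=1 * 80))
```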
+
+- Training on multiple GPUs:
+
+```shell
+# training on multi-GPU
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/picodet/picodet_s_416_coco_npu.yml --eval
+```
+
+**Note:** All PicoDet models were trained with 4 GPUs; if you change the number of training GPUs, scale `base_lr` linearly.
+
+- Evaluation:
+
+```shell
+python tools/eval.py -c configs/picodet/picodet_s_416_coco_npu.yml \
+ -o weights=https://paddledet.bj.bcebos.com/models/picodet_s_416_coco_npu.pdparams
+```
+
+### 3.4 Export the Inference Model
+
+Use the following command to export the inference model used for full-quantization training. The exported model is saved in the `output_inference` folder by default and consists of the `*.pdmodel` and `*.pdiparams` files used for full quantization.
+
+* Argument description:
+  - `-c`: the yaml config file used for training in [3.3 Train the Full-Precision Model](#33-train-the-full-precision-model).
+  - `-o weights`: the model weights; this guide directly uses the model trained on COCO.
+
+```shell
+python tools/export_model.py \
+ -c configs/picodet/picodet_s_416_coco_npu.yml \
+    -o weights=https://paddledet.bj.bcebos.com/models/picodet_s_416_coco_npu.pdparams
+```
+
+### 3.5 Quantize and Produce the Model
+
+- Enter the PaddleSlim auto-compression demo folder:
+
+  ```shell
+  cd deploy/auto_compression/
+  ```
+
+The full-quantization example is launched via the `run.py` script, which uses the `paddleslim.auto_compression.AutoCompression` API to fully quantize the model. Set the model path, distillation, quantization, and training parameters in the config file; once configured, the model can be quantized and distilled. The commands are:
+
+- Single-GPU quantization training:
+
+  ```shell
+  export CUDA_VISIBLE_DEVICES=0
+  python run.py --config_path=./configs/picodet_s_qat_dis.yaml --save_dir='./output/'
+  ```
+
+- Multi-GPU quantization training:
+
+  ```shell
+  export CUDA_VISIBLE_DEVICES=0,1,2,3
+  python -m paddle.distributed.launch --log_dir=log --gpus 0,1,2,3 run.py \
+        --config_path=./configs/picodet_s_qat_dis.yaml --save_dir='./output/'
+  ```
+
+- The final model is saved under the `output` folder by default. After training completes, evaluate the accuracy of the fully quantized model.
+
+Set the path of the model to be tested in the `model_dir` field of the config file, then use the `eval.py` script to obtain the model's mAP:
+
+```shell
+export CUDA_VISIBLE_DEVICES=0
+python eval.py --config_path=./configs/picodet_s_qat_dis.yaml
+```
+
+## 4. Deployment
+
+For on-device deployment, use PicoDet's [Paddle Lite full-quantization demo](https://github.com/PaddlePaddle/Paddle-Lite-Demo/tree/develop/object_detection/linux/picodet_detection) directly.
+
+## 5. FAQ
diff --git a/PaddleDetection-release-2.6/configs/picodet/README.md b/PaddleDetection-release-2.6/configs/picodet/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..02a22c4c2f35ce99199ecd02ca1e6c5b428c3b8a
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/README.md
@@ -0,0 +1,355 @@
+简体中文 | [English](README_en.md)
+
+# PP-PicoDet
+
+
+
+## News
+
+- Released the PicoDet-NPU model with support for full-quantization deployment. For details, see the [PicoDet full-quantization example](./FULL_QUANTIZATION.md). **(2022.08.10)**
+
+- Released a new series of PP-PicoDet models: **(2022.03.20)**
+  - (1) Introduced the TAL and ETA Head and optimized structures such as PAN, improving accuracy by more than 2 points;
+  - (2) Optimized CPU inference speed and doubled training speed;
+  - (3) The exported model includes post-processing in the network and directly outputs boxes, so no secondary development is needed; migration cost is lower and end-to-end inference speed improves by 10%-20%.
+
+## Legacy Models
+
+- For details, see the [PicoDet 2021.10 release](./legacy_model/)
+
+## Introduction
+
+PaddleDetection proposes a new family of lightweight models, `PP-PicoDet`, which delivers excellent performance on mobile devices and sets a new SOTA for lightweight detectors. For technical details, see our [arXiv report](https://arxiv.org/abs/2111.00902).
+
+PP-PicoDet models have the following features:
+
+- 🌟 Higher mAP: the first detector to exceed **30+** `mAP(0.5:0.95)` within 1M parameters (with 416 input size).
+- 🚀 Faster inference: up to 150 FPS on ARM CPU.
+- 😊 Deployment friendly: supports PaddleLite/MNN/NCNN/OpenVINO inference libraries, supports ONNX export, and provides C++/Python/Android demos.
+- 😍 Advanced algorithms: innovations on existing SOTA algorithms, including ESNet, CSP-PAN, SimOTA, and more.
+
+
+
+
+
+
+## Benchmark
+
+| Model | Input Size | mAP<sup>val</sup><br>0.5:0.95 | mAP<sup>val</sup><br>0.5 | Params<br>(M) | FLOPS<br>(G) | Latency<sup>[CPU](#latency)</sup><br>(ms) | Latency<sup>[Lite](#latency)</sup><br>(ms) | Weight | Log | Config | Exported Model(w/ postprocess) | Exported Model(w/o postprocess) |
+| :-------- | :--------: | :---------------------: | :----------------: | :----------------: | :---------------: | :-----------------------------: | :-----------------------------: | :----------------: | :----------------: | :----------------: | :----------------: | :----------------: |
+| PicoDet-XS | 320*320 | 23.5 | 36.1 | 0.70 | 0.67 | 3.9ms | 7.81ms | [model](https://paddledet.bj.bcebos.com/models/picodet_xs_320_coco_lcnet.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_picodet_xs_320_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/picodet/picodet_xs_320_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_xs_320_coco_lcnet.tar) | [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_xs_320_coco_lcnet_non_postprocess.tar) |
+| PicoDet-XS | 416*416 | 26.2 | 39.3 | 0.70 | 1.13 | 6.1ms | 12.38ms | [model](https://paddledet.bj.bcebos.com/models/picodet_xs_416_coco_lcnet.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_picodet_xs_416_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/picodet/picodet_xs_416_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_xs_416_coco_lcnet.tar) | [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_xs_416_coco_lcnet_non_postprocess.tar) |
+| PicoDet-S | 320*320 | 29.1 | 43.4 | 1.18 | 0.97 | 4.8ms | 9.56ms | [model](https://paddledet.bj.bcebos.com/models/picodet_s_320_coco_lcnet.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_picodet_s_320_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/picodet/picodet_s_320_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_s_320_coco_lcnet.tar) | [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_s_320_coco_lcnet_non_postprocess.tar) |
+| PicoDet-S | 416*416 | 32.5 | 47.6 | 1.18 | 1.65 | 6.6ms | 15.20ms | [model](https://paddledet.bj.bcebos.com/models/picodet_s_416_coco_lcnet.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_picodet_s_416_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/picodet/picodet_s_416_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_s_416_coco_lcnet.tar) | [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_s_416_coco_lcnet_non_postprocess.tar) |
+| PicoDet-M | 320*320 | 34.4 | 50.0 | 3.46 | 2.57 | 8.2ms | 17.68ms | [model](https://paddledet.bj.bcebos.com/models/picodet_m_320_coco_lcnet.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_picodet_m_320_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/picodet/picodet_m_320_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_m_320_coco_lcnet.tar) | [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_m_320_coco_lcnet_non_postprocess.tar) |
+| PicoDet-M | 416*416 | 37.5 | 53.4 | 3.46 | 4.34 | 12.7ms | 28.39ms | [model](https://paddledet.bj.bcebos.com/models/picodet_m_416_coco_lcnet.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_picodet_m_416_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/picodet/picodet_m_416_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_m_416_coco_lcnet.tar) | [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_m_416_coco_lcnet_non_postprocess.tar) |
+| PicoDet-L | 320*320 | 36.1 | 52.0 | 5.80 | 4.20 | 11.5ms | 25.21ms | [model](https://paddledet.bj.bcebos.com/models/picodet_l_320_coco_lcnet.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_picodet_l_320_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/picodet/picodet_l_320_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_320_coco_lcnet.tar) | [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_320_coco_lcnet_non_postprocess.tar) |
+| PicoDet-L | 416*416 | 39.4 | 55.7 | 5.80 | 7.10 | 20.7ms | 42.23ms | [model](https://paddledet.bj.bcebos.com/models/picodet_l_416_coco_lcnet.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_picodet_l_416_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/picodet/picodet_l_416_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_416_coco_lcnet.tar) | [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_416_coco_lcnet_non_postprocess.tar) |
+| PicoDet-L | 640*640 | 42.6 | 59.2 | 5.80 | 16.81 | 62.5ms | 108.1ms | [model](https://paddledet.bj.bcebos.com/models/picodet_l_640_coco_lcnet.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_picodet_l_640_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/picodet/picodet_l_640_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_640_coco_lcnet.tar) | [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_640_coco_lcnet_non_postprocess.tar) |
+
+- Featured model
+
+| Model | Input Size | mAP<sup>val</sup><br>0.5:0.95 | mAP<sup>val</sup><br>0.5 | Params<br>(M) | FLOPS<br>(G) | Latency<sup>[CPU](#latency)</sup><br>(ms) | Latency<sup>[Lite](#latency)</sup><br>(ms) | Weight | Log | Config |
+| :-------- | :--------: | :---------------------: | :----------------: | :----------------: | :---------------: | :-----------------------------: | :-----------------------------: | :----------------: | :----------------: | :----------------: |
+| PicoDet-S-NPU | 416*416 | 30.1 | 44.2 | - | - | - | - | [model](https://paddledet.bj.bcebos.com/models/picodet_s_416_coco_npu.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_picodet_s_416_coco_npu.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/picodet/picodet_s_416_coco_npu.yml) |
+
+
+
+Notes:
+
+- Latency tests: all models were tested on an `Intel Core i7-10750H` CPU and a `Snapdragon 865 (4xA77+4xA55)` ARM CPU (4 threads, FP16 inference). In the table above, entries marked `CPU` were tested with OpenVINO, and entries marked `Lite` with [Paddle Lite](https://github.com/PaddlePaddle/Paddle-Lite).
+- PicoDet is trained on COCO train2017 and validated on COCO val2017, using 4 GPUs; all pretrained models above were trained with the released default configs.
+- Benchmark tests: when benchmarking speed, post-processing is not included in the exported model; set `-o export.benchmark=True` or manually modify [runtime.yml](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/configs/runtime.yml#L12).
+
+
+
+#### Benchmark of Other Models
+
+| Model | Input Size | mAP<sup>val</sup><br>0.5:0.95 | mAP<sup>val</sup><br>0.5 | Params<br>(M) | FLOPS<br>(G) | Latency<sup>[NCNN](#latency)</sup><br>(ms) |
+| :-------- | :--------: | :---------------------: | :----------------: | :----------------: | :---------------: | :-----------------------------: |
+| YOLOv3-Tiny | 416*416 | 16.6 | 33.1 | 8.86 | 5.62 | 25.42 |
+| YOLOv4-Tiny | 416*416 | 21.7 | 40.2 | 6.06 | 6.96 | 23.69 |
+| PP-YOLO-Tiny | 320*320 | 20.6 | - | 1.08 | 0.58 | 6.75 |
+| PP-YOLO-Tiny | 416*416 | 22.7 | - | 1.08 | 1.02 | 10.48 |
+| Nanodet-M | 320*320 | 20.6 | - | 0.95 | 0.72 | 8.71 |
+| Nanodet-M | 416*416 | 23.5 | - | 0.95 | 1.2 | 13.35 |
+| Nanodet-M 1.5x | 416*416 | 26.8 | - | 2.08 | 2.42 | 15.83 |
+| YOLOX-Nano | 416*416 | 25.8 | - | 0.91 | 1.08 | 19.23 |
+| YOLOX-Tiny | 416*416 | 32.8 | - | 5.06 | 6.45 | 32.77 |
+| YOLOv5n | 640*640 | 28.4 | 46.0 | 1.9 | 4.5 | 40.35 |
+| YOLOv5s | 640*640 | 37.2 | 56.0 | 7.2 | 16.5 | 78.05 |
+
+- The ARM benchmark script comes from [MobileDetBenchmark](https://github.com/JiweiMaster/MobileDetBenchmark).
+
+## Quick Start
+
+Requirements:
+
+- PaddlePaddle == 2.2.2
+
+Installation:
+
+- [Installation guide](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/docs/tutorials/INSTALL.md)
+- [Prepare dataset](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/docs/tutorials/data/PrepareDataSet_en.md)
+
+
+
+
+Training & Evaluation
+
+- 单卡GPU上训练:
+
+```shell
+# training on single-GPU
+export CUDA_VISIBLE_DEVICES=0
+python tools/train.py -c configs/picodet/picodet_s_320_coco_lcnet.yml --eval
+```
+
+**Note:** If training runs out of GPU memory, reduce `batch_size` in `TrainReader` and scale down `base_lr` in `LearningRate` by the same ratio. All released configs were trained with 4 GPUs; if you switch to 1 GPU, divide `base_lr` by 4.
+
+- Training on multiple GPUs:
+
+
+```shell
+# training on multi-GPU
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/picodet/picodet_s_320_coco_lcnet.yml --eval
+```
+
+**Note:** All PicoDet models were trained with 4 GPUs; if you change the number of training GPUs, scale `base_lr` linearly.
+
+- Evaluation:
+
+```shell
+python tools/eval.py -c configs/picodet/picodet_s_320_coco_lcnet.yml \
+ -o weights=https://paddledet.bj.bcebos.com/models/picodet_s_320_coco_lcnet.pdparams
+```
+
+- Inference:
+
+```shell
+python tools/infer.py -c configs/picodet/picodet_s_320_coco_lcnet.yml \
+ -o weights=https://paddledet.bj.bcebos.com/models/picodet_s_320_coco_lcnet.pdparams
+```
+
+For details, see the [Getting Started guide](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/docs/tutorials/GETTING_STARTED.md).
+
+
+
+
+## Deployment
+
+### Export and Convert Models
+
+
+1. Export the model
+
+```shell
+cd PaddleDetection
+python tools/export_model.py -c configs/picodet/picodet_s_320_coco_lcnet.yml \
+ -o weights=https://paddledet.bj.bcebos.com/models/picodet_s_320_coco_lcnet.pdparams \
+ --output_dir=output_inference
+```
+
+- To export without post-processing, specify `-o export.benchmark=True` (if `-o` already appears in the command, drop the extra `-o`) or manually modify the corresponding field in [runtime.yml](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/configs/runtime.yml).
+- To export without NMS, specify `-o export.nms=False` or manually modify the corresponding field in [runtime.yml](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/configs/runtime.yml). Many ONNX export scenarios only support a single input and fixed-shape outputs, so when exporting to ONNX it is recommended not to export NMS.
+
+
+
+
+2. Convert the model to Paddle Lite
+
+- Install Paddle Lite >= 2.10:
+
+```shell
+pip install paddlelite
+```
+
+- Convert the model to Paddle Lite format:
+
+```shell
+# FP32
+paddle_lite_opt --model_dir=output_inference/picodet_s_320_coco_lcnet --valid_targets=arm --optimize_out=picodet_s_320_coco_fp32
+# FP16
+paddle_lite_opt --model_dir=output_inference/picodet_s_320_coco_lcnet --valid_targets=arm --optimize_out=picodet_s_320_coco_fp16 --enable_fp16=true
+```
+
+
+
+
+3. Convert the model to ONNX
+
+- Install [Paddle2ONNX](https://github.com/PaddlePaddle/Paddle2ONNX) >= 0.7 and ONNX > 1.10.1; for details, see the [ONNX model export tutorial](../../deploy/EXPORT_ONNX_MODEL.md)
+
+```shell
+pip install onnx
+pip install paddle2onnx==0.9.2
+```
+
+- Convert the model:
+
+```shell
+paddle2onnx --model_dir output_inference/picodet_s_320_coco_lcnet/ \
+ --model_filename model.pdmodel \
+ --params_filename model.pdiparams \
+ --opset_version 11 \
+ --save_file picodet_s_320_coco.onnx
+```
+
+- Simplify the ONNX model: use the `onnx-simplifier` package to simplify the exported ONNX model.
+
+  - Install onnxsim >= 0.4.1:
+    ```shell
+    pip install onnxsim
+    ```
+  - Simplify the ONNX model:
+    ```shell
+    onnxsim picodet_s_320_coco.onnx picodet_s_processed.onnx
+    ```
+
+
+
+- Models for deployment
+
+| Model | Input Size | ONNX(w/ postprocess) | ONNX(w/o postprocess) | Paddle Lite(fp32) | Paddle Lite(fp16) |
+| :-------- | :--------: | :---------------------: | :----------------: | :----------------: | :----------------: |
+| PicoDet-XS | 320*320 | [(w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_xs_320_lcnet_postprocessed.onnx) | [(w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_xs_320_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_xs_320_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_xs_320_coco_lcnet_fp16.tar) |
+| PicoDet-XS | 416*416 | [(w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_xs_416_lcnet_postprocessed.onnx) | [(w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_xs_416_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_xs_416_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_xs_416_coco_lcnet_fp16.tar) |
+| PicoDet-S | 320*320 | [(w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_s_320_lcnet_postprocessed.onnx) | [(w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_s_320_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_320_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_320_coco_lcnet_fp16.tar) |
+| PicoDet-S | 416*416 | [(w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_s_416_lcnet_postprocessed.onnx) | [(w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_s_416_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_416_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_416_coco_lcnet_fp16.tar) |
+| PicoDet-M | 320*320 | [(w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_m_320_lcnet_postprocessed.onnx) | [(w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_m_320_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_m_320_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_m_320_coco_lcnet_fp16.tar) |
+| PicoDet-M | 416*416 | [(w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_m_416_lcnet_postprocessed.onnx) | [(w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_m_416_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_m_416_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_m_416_coco_lcnet_fp16.tar) |
+| PicoDet-L | 320*320 | [(w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_320_lcnet_postprocessed.onnx) | [(w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_320_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_320_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_320_coco_lcnet_fp16.tar) |
+| PicoDet-L | 416*416 | [(w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_416_lcnet_postprocessed.onnx) | [(w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_416_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_416_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_416_coco_lcnet_fp16.tar) |
+| PicoDet-L | 640*640 | [(w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_640_lcnet_postprocessed.onnx) | [(w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_640_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_640_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_640_coco_lcnet_fp16.tar) |
+
+### Deployment
+
+| Inference Engine | Python | C++ | Inference with Postprocess |
+| :-------- | :--------: | :---------------------: | :----------------: |
+| OpenVINO | [Python](../../deploy/third_engine/demo_openvino/python) | [C++](../../deploy/third_engine/demo_openvino) (postprocess support in progress) | ✔︎ |
+| Paddle Lite | - | [C++](../../deploy/lite) | ✔︎ |
+| Android Demo | - | [Paddle Lite](https://github.com/PaddlePaddle/Paddle-Lite-Demo/tree/develop/object_detection/android/app/cxx/picodet_detection_demo) | ✔︎ |
+| PaddleInference | [Python](../../deploy/python) | [C++](../../deploy/cpp) | ✔︎ |
+| ONNXRuntime | [Python](../../deploy/third_engine/demo_onnxruntime) | Coming soon | ✔︎ |
+| NCNN | Coming soon | [C++](../../deploy/third_engine/demo_ncnn) | ✘ |
+| MNN | Coming soon | [C++](../../deploy/third_engine/demo_mnn) | ✘ |
+
+
+
+Android demo visualization:
+
+
+
+## Quantization
+
+
+Requirements:
+
+- PaddlePaddle >= 2.2.2
+- PaddleSlim >= 2.2.2
+
+**Installation:**
+
+```shell
+pip install paddleslim==2.2.2
+```
+
+
+
+
+Quantization-aware Training
+
+Start quantization-aware training:
+
+```shell
+python tools/train.py -c configs/picodet/picodet_s_416_coco_lcnet.yml \
+ --slim_config configs/slim/quant/picodet_s_416_lcnet_quant.yml --eval
+```
+
+- For more details, see the [slim documentation](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/slim)
+
+
+
+- Quantization-aware training model zoo:
+
+| Quantized Model | Input Size | mAP<sup>val</sup><br>0.5:0.95 | Config | Slim Config | Weight | Inference Model(w/ postprocess) | Inference Model(w/o postprocess) | Paddle Lite(INT8, w/ postprocess) | Paddle Lite(INT8, w/o postprocess) |
+| :-------- | :--------: | :--------------------: | :-------: | :----------------: | :----------------: | :----------------: | :----------------: | :----------------: | :----------------: |
+| PicoDet-S | 416*416 | 31.5 | [config](./picodet_s_416_coco_lcnet.yml) | [slim config](../slim/quant/picodet_s_416_lcnet_quant.yml) | [model](https://paddledet.bj.bcebos.com/models/picodet_s_416_coco_lcnet_quant.pdparams) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_s_416_coco_lcnet_quant.tar) | [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_s_416_coco_lcnet_quant_non_postprocess.tar) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_416_coco_lcnet_quant.nb) | [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_416_coco_lcnet_quant_non_postprocess.nb) |
+
+## Unstructured Pruning
+
+
+Tutorial:
+
+For training and deployment details, see the [unstructured pruning documentation](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/picodet/legacy_model/pruner/README.md).
+
+
+
+## Applications
+
+- **Pedestrian detection:** for the `PicoDet-S-Pedestrian` pedestrian detection model, see [PP-TinyPose](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/keypoint/tiny_pose#%E8%A1%8C%E4%BA%BA%E6%A3%80%E6%B5%8B%E6%A8%A1%E5%9E%8B)
+
+- **Mainbody detection:** for the `PicoDet-L-Mainbody` mainbody detection model, see the [mainbody detection documentation](./legacy_model/application/mainbody_detection/README.md)
+
+## FAQ
+
+
+Out of memory error
+
+Reduce `batch_size` under `TrainReader` in the config file.
+
+
+
+
+Transfer learning
+
+Reset the `pretrain_weights` field in the config file; for example, to continue training on your own data from a COCO-trained model:
+```yaml
+pretrain_weights: https://paddledet.bj.bcebos.com/models/picodet_l_640_coco_lcnet.pdparams
+```
+
+
+
+
+The `transpose` operator is slow on some hardware
+
+Use the `PicoDet-LCNet` model, which contains fewer `transpose` operators.
+
+
+
+
+
+How to count model parameters
+
+Insert the following code into [trainer.py](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/ppdet/engine/trainer.py#L141) to count the parameters.
+
+```python
+params = sum([
+    p.numel() for n, p in self.model.named_parameters()
+    if all([x not in n for x in ['_mean', '_variance']])
+])  # exclude BatchNorm running statistics
+print('params: ', params)
+```
+
+
+
+## Citing PP-PicoDet
+If you use PP-PicoDet in your research, please cite our technical report:
+```
+@misc{yu2021pppicodet,
+ title={PP-PicoDet: A Better Real-Time Object Detector on Mobile Devices},
+ author={Guanghua Yu and Qinyao Chang and Wenyu Lv and Chang Xu and Cheng Cui and Wei Ji and Qingqing Dang and Kaipeng Deng and Guanzhong Wang and Yuning Du and Baohua Lai and Qiwen Liu and Xiaoguang Hu and Dianhai Yu and Yanjun Ma},
+ year={2021},
+ eprint={2111.00902},
+ archivePrefix={arXiv},
+ primaryClass={cs.CV}
+}
+
+```
diff --git a/PaddleDetection-release-2.6/configs/picodet/README_en.md b/PaddleDetection-release-2.6/configs/picodet/README_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..e5c63b7d1b44f0772304f19d00f3581a36f5600f
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/README_en.md
@@ -0,0 +1,342 @@
+English | [简体中文](README.md)
+
+# PP-PicoDet
+
+
+
+## News
+
+- Released a new series of PP-PicoDet models: **(2022.03.20)**
+  - (1) Adopted the TAL/ETA Head and optimized PAN, which greatly improves accuracy;
+  - (2) Optimized CPU inference speed; training speed is also greatly improved;
+  - (3) The exported model includes post-processing, so inference directly outputs the final result without secondary development, lowering migration cost.
+
+### Legacy Model
+
+- Please refer to: [PicoDet 2021.10](./legacy_model/)
+
+## Introduction
+
+We developed a series of lightweight models, named `PP-PicoDet`. Because of the excellent performance, our models are very suitable for deployment on mobile or CPU. For more details, please refer to our [report on arXiv](https://arxiv.org/abs/2111.00902).
+
+- 🌟 Higher mAP: the **first** object detector that surpasses mAP(0.5:0.95) **30+** within 1M parameters when the input size is 416.
+- 🚀 Faster latency: 150FPS on mobile ARM CPU.
+- 😊 Deploy friendly: support PaddleLite/MNN/NCNN/OpenVINO and provide C++/Python/Android implementation.
+- 😍 Advanced algorithm: use the most advanced algorithms and offer innovation, such as ESNet, CSP-PAN, SimOTA with VFL, etc.
+
+
+
+
+
+
+## Benchmark
+
+| Model | Input size | mAP<sup>val</sup><br>0.5:0.95 | mAP<sup>val</sup><br>0.5 | Params<br>(M) | FLOPS<br>(G) | Latency<sup>[CPU](#latency)</sup><br>(ms) | Latency<sup>[Lite](#latency)</sup><br>(ms) | Weight | Log | Config | Inference Model(w/ postprocess) | Inference Model(w/o postprocess) |
+| :-------- | :--------: | :---------------------: | :----------------: | :----------------: | :---------------: | :-----------------------------: | :-----------------------------: | :----------------: | :----------------: | :----------------: | :----------------: | :----------------: |
+| PicoDet-XS | 320*320 | 23.5 | 36.1 | 0.70 | 0.67 | 3.9ms | 7.81ms | [model](https://paddledet.bj.bcebos.com/models/picodet_xs_320_coco_lcnet.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_picodet_xs_320_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/picodet/picodet_xs_320_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_xs_320_coco_lcnet.tar) | [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_xs_320_coco_lcnet_non_postprocess.tar) |
+| PicoDet-XS | 416*416 | 26.2 | 39.3 | 0.70 | 1.13 | 6.1ms | 12.38ms | [model](https://paddledet.bj.bcebos.com/models/picodet_xs_416_coco_lcnet.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_picodet_xs_416_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/picodet/picodet_xs_416_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_xs_416_coco_lcnet.tar) | [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_xs_416_coco_lcnet_non_postprocess.tar) |
+| PicoDet-S | 320*320 | 29.1 | 43.4 | 1.18 | 0.97 | 4.8ms | 9.56ms | [model](https://paddledet.bj.bcebos.com/models/picodet_s_320_coco_lcnet.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_picodet_s_320_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/picodet/picodet_s_320_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_s_320_coco_lcnet.tar) | [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_s_320_coco_lcnet_non_postprocess.tar) |
+| PicoDet-S | 416*416 | 32.5 | 47.6 | 1.18 | 1.65 | 6.6ms | 15.20ms | [model](https://paddledet.bj.bcebos.com/models/picodet_s_416_coco_lcnet.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_picodet_s_416_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/picodet/picodet_s_416_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_s_416_coco_lcnet.tar) | [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_s_416_coco_lcnet_non_postprocess.tar) |
+| PicoDet-M | 320*320 | 34.4 | 50.0 | 3.46 | 2.57 | 8.2ms | 17.68ms | [model](https://paddledet.bj.bcebos.com/models/picodet_m_320_coco_lcnet.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_picodet_m_320_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/picodet/picodet_m_320_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_m_320_coco_lcnet.tar) | [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_m_320_coco_lcnet_non_postprocess.tar) |
+| PicoDet-M | 416*416 | 37.5 | 53.4 | 3.46 | 4.34 | 12.7ms | 28.39ms | [model](https://paddledet.bj.bcebos.com/models/picodet_m_416_coco_lcnet.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_picodet_m_416_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/picodet/picodet_m_416_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_m_416_coco_lcnet.tar) | [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_m_416_coco_lcnet_non_postprocess.tar) |
+| PicoDet-L | 320*320 | 36.1 | 52.0 | 5.80 | 4.20 | 11.5ms | 25.21ms | [model](https://paddledet.bj.bcebos.com/models/picodet_l_320_coco_lcnet.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_picodet_l_320_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/picodet/picodet_l_320_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_320_coco_lcnet.tar) | [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_320_coco_lcnet_non_postprocess.tar) |
+| PicoDet-L | 416*416 | 39.4 | 55.7 | 5.80 | 7.10 | 20.7ms | 42.23ms | [model](https://paddledet.bj.bcebos.com/models/picodet_l_416_coco_lcnet.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_picodet_l_416_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/picodet/picodet_l_416_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_416_coco_lcnet.tar) | [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_416_coco_lcnet_non_postprocess.tar) |
+| PicoDet-L | 640*640 | 42.6 | 59.2 | 5.80 | 16.81 | 62.5ms | 108.1ms | [model](https://paddledet.bj.bcebos.com/models/picodet_l_640_coco_lcnet.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_picodet_l_640_coco_lcnet.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/picodet/picodet_l_640_coco_lcnet.yml) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_640_coco_lcnet.tar) | [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_l_640_coco_lcnet_non_postprocess.tar) |
+
+
+Table Notes:
+
+- Latency: All our models test on `Intel core i7 10750H` CPU with MKLDNN by 12 threads and `Qualcomm Snapdragon 865(4xA77+4xA55)` with 4 threads by arm8 and with FP16. In the above table, test CPU latency on Paddle-Inference and testing Mobile latency with `Lite`->[Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite).
+- PicoDet is trained on COCO train2017 dataset and evaluated on COCO val2017. And PicoDet used 4 GPUs for training and all checkpoints are trained with default settings and hyperparameters.
+- Benchmark test: for the speed benchmark, post-processing is not included in the exported model; set `-o export.benchmark=True` or manually modify [runtime.yml](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/configs/runtime.yml#L12).
+
+
+
+#### Benchmark of Other Models
+
+| Model | Input size | mAP<sup>val<br>0.5:0.95</sup> | mAP<sup>val<br>0.5</sup> | Params<br>(M) | FLOPS<br>(G) | Latency [NCNN](#latency)<br>(ms) |
+| :-------- | :--------: | :---------------------: | :----------------: | :----------------: | :---------------: | :-----------------------------: |
+| YOLOv3-Tiny | 416*416 | 16.6 | 33.1 | 8.86 | 5.62 | 25.42 |
+| YOLOv4-Tiny | 416*416 | 21.7 | 40.2 | 6.06 | 6.96 | 23.69 |
+| PP-YOLO-Tiny | 320*320 | 20.6 | - | 1.08 | 0.58 | 6.75 |
+| PP-YOLO-Tiny | 416*416 | 22.7 | - | 1.08 | 1.02 | 10.48 |
+| Nanodet-M | 320*320 | 20.6 | - | 0.95 | 0.72 | 8.71 |
+| Nanodet-M | 416*416 | 23.5 | - | 0.95 | 1.2 | 13.35 |
+| Nanodet-M 1.5x | 416*416 | 26.8 | - | 2.08 | 2.42 | 15.83 |
+| YOLOX-Nano | 416*416 | 25.8 | - | 0.91 | 1.08 | 19.23 |
+| YOLOX-Tiny | 416*416 | 32.8 | - | 5.06 | 6.45 | 32.77 |
+| YOLOv5n | 640*640 | 28.4 | 46.0 | 1.9 | 4.5 | 40.35 |
+| YOLOv5s | 640*640 | 37.2 | 56.0 | 7.2 | 16.5 | 78.05 |
+
+- Mobile latency is tested with this code: [MobileDetBenchmark](https://github.com/JiweiMaster/MobileDetBenchmark).
+
+## Quick Start
+
+
+Requirements:
+
+- PaddlePaddle >= 2.2.2
+
+
+
+
+Installation
+
+- [Installation guide](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/docs/tutorials/INSTALL.md)
+- [Prepare dataset](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/docs/tutorials/data/PrepareDataSet_en.md)
+
+
+
+
+Training and Evaluation
+
+- Training model on single-GPU:
+
+```shell
+# training on single-GPU
+export CUDA_VISIBLE_DEVICES=0
+python tools/train.py -c configs/picodet/picodet_s_320_coco_lcnet.yml --eval
+```
+If the GPU runs out of memory during training, reduce `batch_size` in `TrainReader` and reduce `base_lr` in `LearningRate` proportionally. Note that the published configs are all trained with 4 GPUs; if you train with a single GPU, `base_lr` needs to be reduced by a factor of 4.
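The linear scaling rule above can be sketched as a small helper (a hypothetical illustration, not part of PaddleDetection):

```python
# Hypothetical helper illustrating the linear scaling rule:
# the published configs assume 4 GPUs, so scale base_lr by ngpus / 4.
def scaled_base_lr(default_base_lr: float, default_gpus: int, ngpus: int) -> float:
    """Scale the learning rate in proportion to the number of GPUs used."""
    return default_base_lr * ngpus / default_gpus

# The PicoDet configs ship with base_lr=0.32 for 4 GPUs;
# training on a single GPU gives 0.32 / 4 = 0.08.
print(scaled_base_lr(0.32, 4, 1))  # 0.08
```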
+
+- Training model on multi-GPU:
+
+
+```shell
+# training on multi-GPU
+export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
+python -m paddle.distributed.launch --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/picodet/picodet_s_320_coco_lcnet.yml --eval
+```
+
+- Evaluation:
+
+```shell
+python tools/eval.py -c configs/picodet/picodet_s_320_coco_lcnet.yml \
+ -o weights=https://paddledet.bj.bcebos.com/models/picodet_s_320_coco_lcnet.pdparams
+```
+
+- Infer:
+
+```shell
+python tools/infer.py -c configs/picodet/picodet_s_320_coco_lcnet.yml \
+ -o weights=https://paddledet.bj.bcebos.com/models/picodet_s_320_coco_lcnet.pdparams
+```
+
+For more details, refer to the [Quick start guide](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/docs/tutorials/GETTING_STARTED.md).
+
+
+
+
+## Deployment
+
+### Export and Convert Model
+
+
+1. Export model
+
+```shell
+cd PaddleDetection
+python tools/export_model.py -c configs/picodet/picodet_s_320_coco_lcnet.yml \
+ -o weights=https://paddledet.bj.bcebos.com/models/picodet_s_320_coco_lcnet.pdparams \
+ --output_dir=output_inference
+```
+
+- If no post-processing is required, specify `-o export.benchmark=True` (if `-o` already appears in the command, append `export.benchmark=True` to it instead of repeating `-o`), or manually modify the corresponding fields in [runtime.yml](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/configs/runtime.yml).
+- If no NMS is required, specify `-o export.nms=False`, or manually modify the corresponding fields in [runtime.yml](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/configs/runtime.yml). Many inference engines only support a single input and fixed-shape outputs when consuming ONNX, so when exporting to ONNX it is recommended not to export NMS.
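The corresponding `export` fields in `runtime.yml` look like the following (the field names and defaults below mirror the config files shipped in this repository):

```yaml
# Export-time switches in runtime.yml:
export:
  post_process: True   # include post-processing in the exported network
  nms: True            # include NMS in the exported network
  benchmark: False     # True drops post-processing and NMS for speed tests
```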
+
+
+
+
+
+2. Convert to Paddle Lite
+
+- Install Paddle Lite >= 2.10:
+
+```shell
+pip install paddlelite
+```
+
+- Convert model:
+
+```shell
+# FP32
+paddle_lite_opt --model_dir=output_inference/picodet_s_320_coco_lcnet --valid_targets=arm --optimize_out=picodet_s_320_coco_fp32
+# FP16
+paddle_lite_opt --model_dir=output_inference/picodet_s_320_coco_lcnet --valid_targets=arm --optimize_out=picodet_s_320_coco_fp16 --enable_fp16=true
+```
+
+
+
+
+3. Convert to ONNX
+
+- Install [Paddle2ONNX](https://github.com/PaddlePaddle/Paddle2ONNX) >= 0.7 and ONNX > 1.10.1. For details, refer to the [tutorial on exporting ONNX models](../../deploy/EXPORT_ONNX_MODEL.md).
+
+```shell
+pip install onnx
+pip install paddle2onnx==0.9.2
+```
+
+- Convert model:
+
+```shell
+paddle2onnx --model_dir output_inference/picodet_s_320_coco_lcnet/ \
+ --model_filename model.pdmodel \
+ --params_filename model.pdiparams \
+ --opset_version 11 \
+ --save_file picodet_s_320_coco.onnx
+```
+
+- Simplify the ONNX model using onnx-simplifier:
+
+ - Install onnxsim >= 0.4.1:
+ ```shell
+ pip install onnxsim
+ ```
+  - Simplify the model:
+ ```shell
+ onnxsim picodet_s_320_coco.onnx picodet_s_processed.onnx
+ ```
+
+
+
+- Deploy models
+
+| Model | Input size | ONNX (w/ postprocess) | ONNX (w/o postprocess) | Paddle Lite (fp32) | Paddle Lite (fp16) |
+| :-------- | :--------: | :---------------------: | :---------------------: | :----------------: | :----------------: |
+| PicoDet-XS | 320*320 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_xs_320_lcnet_postprocessed.onnx) | [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_xs_320_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_xs_320_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_xs_320_coco_lcnet_fp16.tar) |
+| PicoDet-XS | 416*416 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_xs_416_lcnet_postprocessed.onnx) | [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_xs_416_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_xs_416_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_xs_416_coco_lcnet_fp16.tar) |
+| PicoDet-S | 320*320 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_s_320_lcnet_postprocessed.onnx) | [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_s_320_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_320_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_320_coco_lcnet_fp16.tar) |
+| PicoDet-S | 416*416 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_s_416_lcnet_postprocessed.onnx) | [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_s_416_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_416_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_416_coco_lcnet_fp16.tar) |
+| PicoDet-M | 320*320 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_m_320_lcnet_postprocessed.onnx) | [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_m_320_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_m_320_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_m_320_coco_lcnet_fp16.tar) |
+| PicoDet-M | 416*416 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_m_416_lcnet_postprocessed.onnx) | [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_m_416_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_m_416_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_m_416_coco_lcnet_fp16.tar) |
+| PicoDet-L | 320*320 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_320_lcnet_postprocessed.onnx) | [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_320_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_320_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_320_coco_lcnet_fp16.tar) |
+| PicoDet-L | 416*416 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_416_lcnet_postprocessed.onnx) | [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_416_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_416_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_416_coco_lcnet_fp16.tar) |
+| PicoDet-L | 640*640 | [( w/ postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_640_lcnet_postprocessed.onnx) | [( w/o postprocess)](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_640_coco_lcnet.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_640_coco_lcnet.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_640_coco_lcnet_fp16.tar) |
+
+
+### Deploy
+
+| Infer Engine | Python | C++ | Predict With Postprocess |
+| :-------- | :--------: | :---------------------: | :----------------: |
+| OpenVINO | [Python](../../deploy/third_engine/demo_openvino/python) | [C++](../../deploy/third_engine/demo_openvino)(postprocess coming soon) | ✔︎ |
+| Paddle Lite | - | [C++](../../deploy/lite) | ✔︎ |
+| Android Demo | - | [Paddle Lite](https://github.com/PaddlePaddle/Paddle-Lite-Demo/tree/develop/object_detection/android/app/cxx/picodet_detection_demo) | ✔︎ |
+| PaddleInference | [Python](../../deploy/python) | [C++](../../deploy/cpp) | ✔︎ |
+| ONNXRuntime | [Python](../../deploy/third_engine/demo_onnxruntime) | Coming soon | ✔︎ |
+| NCNN | Coming soon | [C++](../../deploy/third_engine/demo_ncnn) | ✘ |
+| MNN | Coming soon | [C++](../../deploy/third_engine/demo_mnn) | ✘ |
+
+
+Android demo visualization:
+
+
+
+## Quantization
+
+
+Requirements:
+
+- PaddlePaddle >= 2.2.2
+- PaddleSlim >= 2.2.2
+
+**Install:**
+
+```shell
+pip install paddleslim==2.2.2
+```
+
+
+
+
+Quant aware
+
+Configure the quant config and start training:
+
+```shell
+python tools/train.py -c configs/picodet/picodet_s_416_coco_lcnet.yml \
+ --slim_config configs/slim/quant/picodet_s_416_lcnet_quant.yml --eval
+```
+
+- For more details, refer to the [slim documentation](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/slim)
+
+
+
+- Quant Aware Model ZOO:
+
+| Quant Model | Input size | mAP<sup>val<br>0.5:0.95</sup> | Configs | Weight | Inference Model | Paddle Lite (INT8) |
+| :-------- | :--------: | :--------------------: | :-------: | :----------------: | :----------------: | :----------------: |
+| PicoDet-S | 416*416 | 31.5 | [config](./picodet_s_416_coco_lcnet.yml) \| [slim config](../slim/quant/picodet_s_416_lcnet_quant.yml) | [model](https://paddledet.bj.bcebos.com/models/picodet_s_416_coco_lcnet_quant.pdparams) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_s_416_coco_lcnet_quant.tar) \| [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/Inference/picodet_s_416_coco_lcnet_quant_non_postprocess.tar) | [w/ postprocess](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_416_coco_lcnet_quant.nb) \| [w/o postprocess](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_416_coco_lcnet_quant_non_postprocess.nb) |
+
+## Unstructured Pruning
+
+
+Tutorial:
+
+Please refer to this [documentation](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/picodet/legacy_model/pruner/README.md) for details on requirements, training, and deployment.
+
+
+
+## Application
+
+- **Pedestrian detection:** for the `PicoDet-S-Pedestrian` model zoo, please refer to [PP-TinyPose](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/keypoint/tiny_pose#%E8%A1%8C%E4%BA%BA%E6%A3%80%E6%B5%8B%E6%A8%A1%E5%9E%8B)
+
+- **Mainbody detection:** for the `PicoDet-L-Mainbody` model zoo, please refer to [mainbody detection](./legacy_model/application/mainbody_detection/README.md)
+
+## FAQ
+
+
+Out of memory error.
+
+Please reduce the `batch_size` of `TrainReader` in the config.
+
+
+
+
+How to do transfer learning.
+
+Reset `pretrain_weights` in the config to a model trained on COCO, for example:
+```yaml
+pretrain_weights: https://paddledet.bj.bcebos.com/models/picodet_l_640_coco_lcnet.pdparams
+```
+
+
+
+
+The transpose operator is time-consuming on some hardware.
+
+Please use a `PicoDet-LCNet` model, which has fewer `transpose` operators.
+
+
+
+
+
+How to count model parameters.
+
+You can insert the code below at [this line](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/ppdet/engine/trainer.py#L141) to count the learnable parameters.
+
+```python
+params = sum([
+    p.numel() for n, p in self.model.named_parameters()
+    if all([x not in n for x in ['_mean', '_variance']])
+])  # exclude BatchNorm running statistics
+print('params: ', params)
+```
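The filtering logic above can be exercised standalone with mock parameters (the names and sizes below are made up for illustration):

```python
# Mock (name, numel) pairs standing in for self.model.named_parameters().
mock_params = [
    ("backbone.conv1.weight", 864),
    ("backbone.bn1._mean", 32),        # BatchNorm running mean: excluded
    ("backbone.bn1._variance", 32),    # BatchNorm running variance: excluded
    ("head.fc.weight", 1280),
]

# Same exclusion filter as above: skip BatchNorm running statistics.
params = sum(
    numel for name, numel in mock_params
    if all(x not in name for x in ("_mean", "_variance"))
)
print("params:", params)  # 864 + 1280 = 2144
```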
+
+
+
+## Cite PP-PicoDet
+If you use PicoDet in your research, please cite our work with the following BibTeX entry:
+```
+@misc{yu2021pppicodet,
+ title={PP-PicoDet: A Better Real-Time Object Detector on Mobile Devices},
+ author={Guanghua Yu and Qinyao Chang and Wenyu Lv and Chang Xu and Cheng Cui and Wei Ji and Qingqing Dang and Kaipeng Deng and Guanzhong Wang and Yuning Du and Baohua Lai and Qiwen Liu and Xiaoguang Hu and Dianhai Yu and Yanjun Ma},
+ year={2021},
+ eprint={2111.00902},
+ archivePrefix={arXiv},
+ primaryClass={cs.CV}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/picodet/_base_/optimizer_300e.yml b/PaddleDetection-release-2.6/configs/picodet/_base_/optimizer_300e.yml
new file mode 100644
index 0000000000000000000000000000000000000000..113707a03f6fd63dc075d0426c1a10b15d998140
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/_base_/optimizer_300e.yml
@@ -0,0 +1,18 @@
+epoch: 300
+
+LearningRate:
+ base_lr: 0.32
+ schedulers:
+ - name: CosineDecay
+ max_epochs: 300
+ - name: LinearWarmup
+ start_factor: 0.1
+ steps: 300
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.00004
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/picodet/_base_/picodet_320_reader.yml b/PaddleDetection-release-2.6/configs/picodet/_base_/picodet_320_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..7d6500679dba0f06c6238aa8bed4f2fd0ad8bd5b
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/_base_/picodet_320_reader.yml
@@ -0,0 +1,42 @@
+worker_num: 6
+eval_height: &eval_height 320
+eval_width: &eval_width 320
+eval_size: &eval_size [*eval_height, *eval_width]
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomCrop: {}
+ - RandomFlip: {prob: 0.5}
+ - RandomDistort: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [256, 288, 320, 352, 384], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 64
+ shuffle: true
+ drop_last: true
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 8
+ shuffle: false
+
+
+TestReader:
+ inputs_def:
+ image_shape: [1, 3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/picodet/_base_/picodet_416_reader.yml b/PaddleDetection-release-2.6/configs/picodet/_base_/picodet_416_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..ee4ae98865f7eb58994c0a79964d24e41c697373
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/_base_/picodet_416_reader.yml
@@ -0,0 +1,42 @@
+worker_num: 6
+eval_height: &eval_height 416
+eval_width: &eval_width 416
+eval_size: &eval_size [*eval_height, *eval_width]
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomCrop: {}
+ - RandomFlip: {prob: 0.5}
+ - RandomDistort: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [352, 384, 416, 448, 480], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 64
+ shuffle: true
+ drop_last: true
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 8
+ shuffle: false
+
+
+TestReader:
+ inputs_def:
+ image_shape: [1, 3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/picodet/_base_/picodet_640_reader.yml b/PaddleDetection-release-2.6/configs/picodet/_base_/picodet_640_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..5502026af8b1d0762405db17e655b2b6628dea04
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/_base_/picodet_640_reader.yml
@@ -0,0 +1,42 @@
+worker_num: 6
+eval_height: &eval_height 640
+eval_width: &eval_width 640
+eval_size: &eval_size [*eval_height, *eval_width]
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomCrop: {}
+ - RandomFlip: {prob: 0.5}
+ - RandomDistort: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [576, 608, 640, 672, 704], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 32
+ shuffle: true
+ drop_last: true
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 8
+ shuffle: false
+
+
+TestReader:
+ inputs_def:
+ image_shape: [1, 3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/picodet/_base_/picodet_v2.yml b/PaddleDetection-release-2.6/configs/picodet/_base_/picodet_v2.yml
new file mode 100644
index 0000000000000000000000000000000000000000..24e92b95cb32e3fd26e819bbe49795c6121cb2b2
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/_base_/picodet_v2.yml
@@ -0,0 +1,61 @@
+architecture: PicoDet
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/LCNet_x1_5_pretrained.pdparams
+
+PicoDet:
+ backbone: LCNet
+ neck: LCPAN
+ head: PicoHeadV2
+
+LCNet:
+ scale: 1.5
+ feature_maps: [3, 4, 5]
+
+LCPAN:
+ out_channels: 128
+ use_depthwise: True
+ num_features: 4
+
+PicoHeadV2:
+ conv_feat:
+ name: PicoFeat
+ feat_in: 128
+ feat_out: 128
+ num_convs: 4
+ num_fpn_stride: 4
+ norm_type: bn
+ share_cls_reg: True
+ use_se: True
+ fpn_stride: [8, 16, 32, 64]
+ feat_in_chan: 128
+ prior_prob: 0.01
+ reg_max: 7
+ cell_offset: 0.5
+ grid_cell_scale: 5.0
+ static_assigner_epoch: 100
+ use_align_head: True
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ force_gt_matching: False
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ loss_class:
+ name: VarifocalLoss
+ use_sigmoid: False
+ iou_weighted: True
+ loss_weight: 1.0
+ loss_dfl:
+ name: DistributionFocalLoss
+ loss_weight: 0.5
+ loss_bbox:
+ name: GIoULoss
+ loss_weight: 2.5
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.025
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/picodet/application/pedestrian_detection/picodet_s_192_lcnet_pedestrian.yml b/PaddleDetection-release-2.6/configs/picodet/application/pedestrian_detection/picodet_s_192_lcnet_pedestrian.yml
new file mode 100644
index 0000000000000000000000000000000000000000..bb3d2e9bc923f0ee41c12fbbb7d1a7b91b97d339
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/application/pedestrian_detection/picodet_s_192_lcnet_pedestrian.yml
@@ -0,0 +1,161 @@
+use_gpu: true
+use_xpu: false
+log_iter: 20
+save_dir: output
+snapshot_epoch: 1
+print_flops: false
+
+# Exporting the model
+export:
+  post_process: True  # Whether post-processing is included in the network when exporting the model.
+  nms: True  # Whether NMS is included in the network when exporting the model.
+  benchmark: False  # Used for testing model performance; if set to `True`, post-processing and NMS will not be exported.
+
+metric: COCO
+num_classes: 1
+
+architecture: PicoDet
+pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_75_pretrained.pdparams
+weights: output/picodet_s_192_lcnet_pedestrian/best_model
+find_unused_parameters: True
+use_ema: true
+epoch: 300
+snapshot_epoch: 10
+
+PicoDet:
+ backbone: LCNet
+ neck: LCPAN
+ head: PicoHeadV2
+
+LCNet:
+ scale: 0.75
+ feature_maps: [3, 4, 5]
+
+LCPAN:
+ out_channels: 96
+ use_depthwise: True
+ num_features: 4
+
+PicoHeadV2:
+ conv_feat:
+ name: PicoFeat
+ feat_in: 96
+ feat_out: 96
+ num_convs: 2
+ num_fpn_stride: 4
+ norm_type: bn
+ share_cls_reg: True
+ use_se: True
+ feat_in_chan: 96
+ fpn_stride: [8, 16, 32, 64]
+ prior_prob: 0.01
+ reg_max: 7
+ cell_offset: 0.5
+ grid_cell_scale: 5.0
+ static_assigner_epoch: 100
+ use_align_head: True
+ static_assigner:
+ name: ATSSAssigner
+ topk: 4
+ force_gt_matching: False
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ loss_class:
+ name: VarifocalLoss
+ use_sigmoid: False
+ iou_weighted: True
+ loss_weight: 1.0
+ loss_dfl:
+ name: DistributionFocalLoss
+ loss_weight: 0.5
+ loss_bbox:
+ name: GIoULoss
+ loss_weight: 2.5
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.025
+ nms_threshold: 0.6
+
+LearningRate:
+ base_lr: 0.32
+ schedulers:
+ - !CosineDecay
+ max_epochs: 300
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 300
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.00004
+ type: L2
+
+worker_num: 6
+eval_height: &eval_height 192
+eval_width: &eval_width 192
+eval_size: &eval_size [*eval_height, *eval_width]
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomCrop: {}
+ - RandomFlip: {prob: 0.5}
+ - RandomDistort: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [128, 160, 192, 224, 256], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 64
+ shuffle: true
+ drop_last: true
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 8
+ shuffle: false
+
+
+TestReader:
+ inputs_def:
+ image_shape: [1, 3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_size: 1
+
+
+TrainDataset:
+ !COCODataSet
+ image_dir: ""
+ anno_path: aic_coco_train_cocoformat.json
+ dataset_dir: dataset
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: val2017
+ anno_path: annotations/instances_val2017.json
+ dataset_dir: dataset/coco
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/instances_val2017.json # also support txt (like VOC's label_list.txt)
+ dataset_dir: dataset/coco # if set, anno_path will be 'dataset_dir/anno_path'
diff --git a/PaddleDetection-release-2.6/configs/picodet/application/pedestrian_detection/picodet_s_320_lcnet_pedestrian.yml b/PaddleDetection-release-2.6/configs/picodet/application/pedestrian_detection/picodet_s_320_lcnet_pedestrian.yml
new file mode 100644
index 0000000000000000000000000000000000000000..91402ba5e6cf8edb587566260c1bb7a202d3be61
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/application/pedestrian_detection/picodet_s_320_lcnet_pedestrian.yml
@@ -0,0 +1,160 @@
+use_gpu: true
+use_xpu: false
+log_iter: 20
+save_dir: output
+snapshot_epoch: 1
+print_flops: false
+
+# Exporting the model
+export:
+  post_process: True  # Whether post-processing is included in the network when exporting the model.
+  nms: True  # Whether NMS is included in the network when exporting the model.
+  benchmark: False  # Used for testing model performance; if set to `True`, post-processing and NMS will not be exported.
+
+metric: COCO
+num_classes: 1
+
+architecture: PicoDet
+pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_75_pretrained.pdparams
+weights: output/picodet_s_320_lcnet_pedestrian/best_model
+find_unused_parameters: True
+use_ema: true
+epoch: 300
+snapshot_epoch: 10
+
+PicoDet:
+ backbone: LCNet
+ neck: LCPAN
+ head: PicoHeadV2
+
+LCNet:
+ scale: 0.75
+ feature_maps: [3, 4, 5]
+
+LCPAN:
+ out_channels: 96
+ use_depthwise: True
+ num_features: 4
+
+PicoHeadV2:
+ conv_feat:
+ name: PicoFeat
+ feat_in: 96
+ feat_out: 96
+ num_convs: 2
+ num_fpn_stride: 4
+ norm_type: bn
+ share_cls_reg: True
+ use_se: True
+ feat_in_chan: 96
+ fpn_stride: [8, 16, 32, 64]
+ prior_prob: 0.01
+ reg_max: 7
+ cell_offset: 0.5
+ grid_cell_scale: 5.0
+ static_assigner_epoch: 100
+ use_align_head: True
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ force_gt_matching: False
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ loss_class:
+ name: VarifocalLoss
+ use_sigmoid: False
+ iou_weighted: True
+ loss_weight: 1.0
+ loss_dfl:
+ name: DistributionFocalLoss
+ loss_weight: 0.5
+ loss_bbox:
+ name: GIoULoss
+ loss_weight: 2.5
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.025
+ nms_threshold: 0.6
+
+LearningRate:
+ base_lr: 0.32
+ schedulers:
+ - !CosineDecay
+ max_epochs: 300
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 300
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.00004
+ type: L2
+
+worker_num: 6
+eval_height: &eval_height 320
+eval_width: &eval_width 320
+eval_size: &eval_size [*eval_height, *eval_width]
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomCrop: {}
+ - RandomFlip: {prob: 0.5}
+ - RandomDistort: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [256, 288, 320, 352, 384], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 64
+ shuffle: true
+ drop_last: true
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 8
+ shuffle: false
+
+
+TestReader:
+ inputs_def:
+ image_shape: [1, 3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_size: 1
+
+TrainDataset:
+ !COCODataSet
+ image_dir: ""
+ anno_path: aic_coco_train_cocoformat.json
+ dataset_dir: dataset
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: val2017
+ anno_path: annotations/instances_val2017.json
+ dataset_dir: dataset/coco
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/instances_val2017.json # also support txt (like VOC's label_list.txt)
+ dataset_dir: dataset/coco # if set, anno_path will be 'dataset_dir/anno_path'
diff --git a/PaddleDetection-release-2.6/configs/picodet/legacy_model/README.md b/PaddleDetection-release-2.6/configs/picodet/legacy_model/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..19236e28258d7b4036e380b874fc5f7943acb4cb
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/legacy_model/README.md
@@ -0,0 +1,60 @@
+# PP-PicoDet Legacy Model-ZOO (2021.10)
+
+| Model | Input size | mAP<sup>val<br>0.5:0.95</sup> | mAP<sup>val<br>0.5</sup> | Params<br>(M) | FLOPS<br>(G) | Latency [NCNN](#latency)<br>(ms) | Latency [Lite](#latency)<br>(ms) | Download | Log | Config |
+| :-------- | :--------: | :---------------------: | :----------------: | :----------------: | :---------------: | :-----------------------------: | :-----------------------------: | :----------------: | :----------------: | :--------------------------------------- |
+| PicoDet-S | 320*320 | 27.1 | 41.4 | 0.99 | 0.73 | 8.13 | **6.65** | [model](https://paddledet.bj.bcebos.com/models/picodet_s_320_coco.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_picodet_s_320_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/picodet/picodet_s_320_coco.yml) |
+| PicoDet-S | 416*416 | 30.7 | 45.8 | 0.99 | 1.24 | 12.37 | **9.82** | [model](https://paddledet.bj.bcebos.com/models/picodet_s_416_coco.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_picodet_s_416_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/picodet/picodet_s_416_coco.yml) |
+| PicoDet-M | 320*320 | 30.9 | 45.7 | 2.15 | 1.48 | 11.27 | **9.61** | [model](https://paddledet.bj.bcebos.com/models/picodet_m_320_coco.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_picodet_m_320_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/picodet/picodet_m_320_coco.yml) |
+| PicoDet-M | 416*416 | 34.8 | 50.5 | 2.15 | 2.50 | 17.39 | **15.88** | [model](https://paddledet.bj.bcebos.com/models/picodet_m_416_coco.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_picodet_m_416_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/picodet/picodet_m_416_coco.yml) |
+| PicoDet-L | 320*320 | 32.9 | 48.2 | 3.30 | 2.23 | 15.26 | **13.42** | [model](https://paddledet.bj.bcebos.com/models/picodet_l_320_coco.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_picodet_l_320_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/picodet/picodet_l_320_coco.yml) |
+| PicoDet-L | 416*416 | 36.6 | 52.5 | 3.30 | 3.76 | 23.36 | **21.85** | [model](https://paddledet.bj.bcebos.com/models/picodet_l_416_coco.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_picodet_l_416_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/picodet/picodet_l_416_coco.yml) |
+| PicoDet-L | 640*640 | 40.9 | 57.6 | 3.30 | 8.91 | 54.11 | **50.55** | [model](https://paddledet.bj.bcebos.com/models/picodet_l_640_coco.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_picodet_l_640_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/picodet/picodet_l_640_coco.yml) |
+
+#### More Configs
+
+| Model | Input size | mAP<sup>val<br>0.5:0.95</sup> | mAP<sup>val<br>0.5</sup> | Params<br><sup>(M)</sup> | FLOPS<br><sup>(G)</sup> | Latency [NCNN](#latency)<br><sup>(ms)</sup> | Latency [Lite](#latency)<br><sup>(ms)</sup> | Download | Config |
+| :--------------------------- | :--------: | :---------------------: | :----------------: | :----------------: | :---------------: | :-----------------------------: | :-----------------------------: | :----------------------------------------: | :--------------------------------------- |
+| PicoDet-Shufflenetv2 1x | 416*416 | 30.0 | 44.6 | 1.17 | 1.53 | 15.06 | **10.63** | [model](https://paddledet.bj.bcebos.com/models/picodet_shufflenetv2_1x_416_coco.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_picodet_shufflenetv2_1x_416_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/picodet/more_config/picodet_shufflenetv2_1x_416_coco.yml) |
+| PicoDet-MobileNetv3-large 1x | 416*416 | 35.6 | 52.0 | 3.55 | 2.80 | 20.71 | **17.88** | [model](https://paddledet.bj.bcebos.com/models/picodet_mobilenetv3_large_1x_416_coco.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_picodet_mobilenetv3_large_1x_416_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/picodet/more_config/picodet_mobilenetv3_large_1x_416_coco.yml) |
+| PicoDet-LCNet 1.5x | 416*416 | 36.3 | 52.2 | 3.10 | 3.85 | 21.29 | **20.8** | [model](https://paddledet.bj.bcebos.com/models/picodet_lcnet_1_5x_416_coco.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_picodet_lcnet_1_5x_416_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/picodet/more_config/picodet_lcnet_1_5x_416_coco.yml) |
+| PicoDet-LCNet 1.5x | 640*640 | 40.6 | 57.4 | 3.10 | - | - | - | [model](https://paddledet.bj.bcebos.com/models/picodet_lcnet_1_5x_640_coco.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_picodet_lcnet_1_5x_640_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/picodet/more_config/picodet_lcnet_1_5x_640_coco.yml) |
+| PicoDet-R18 | 640*640 | 40.7 | 57.2 | 11.10 | - | - | - | [model](https://paddledet.bj.bcebos.com/models/picodet_r18_640_coco.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_picodet_r18_640_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/picodet/more_config/picodet_r18_640_coco.yml) |
+
+
+Table Notes:
+
+- Latency: all models are tested on a `Qualcomm Snapdragon 865 (4xA77 + 4xA55)` with 4 threads, armv8, and FP16. Latency is reported for [NCNN](https://github.com/Tencent/ncnn) and for `Lite`, i.e. [Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite), measured with [MobileDetBenchmark](https://github.com/JiweiMaster/MobileDetBenchmark).
+- PicoDet is trained on the COCO train2017 dataset and evaluated on COCO val2017.
+- PicoDet models are trained on 4 or 8 GPUs, and all checkpoints use the default settings and hyperparameters.
+
+
+
+- Deploy models
+
+| Model | Input size | ONNX | Paddle Lite(fp32) | Paddle Lite(fp16) |
+| :-------- | :--------: | :---------------------: | :----------------: | :----------------: |
+| PicoDet-S | 320*320 | [model](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_s_320_coco.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_320.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_320_fp16.tar) |
+| PicoDet-S | 416*416 | [model](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_s_416_coco.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_416.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_s_416_fp16.tar) |
+| PicoDet-M | 320*320 | [model](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_m_320_coco.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_m_320.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_m_320_fp16.tar) |
+| PicoDet-M | 416*416 | [model](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_m_416_coco.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_m_416.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_m_416_fp16.tar) |
+| PicoDet-L | 320*320 | [model](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_320_coco.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_320.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_320_fp16.tar) |
+| PicoDet-L | 416*416 | [model](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_416_coco.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_416.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_416_fp16.tar) |
+| PicoDet-L | 640*640 | [model](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_l_640_coco.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_640.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_l_640_fp16.tar) |
+| PicoDet-Shufflenetv2 1x | 416*416 | [model](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_shufflenetv2_1x_416_coco.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_shufflenetv2_1x.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_shufflenetv2_1x_fp16.tar) |
+| PicoDet-MobileNetv3-large 1x | 416*416 | [model](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_mobilenetv3_large_1x_416_coco.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_mobilenetv3_large_1x.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_mobilenetv3_large_1x_fp16.tar) |
+| PicoDet-LCNet 1.5x | 416*416 | [model](https://paddledet.bj.bcebos.com/deploy/third_engine/picodet_lcnet_1_5x_416_coco.onnx) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_lcnet_1_5x.tar) | [model](https://paddledet.bj.bcebos.com/deploy/paddlelite/picodet_lcnet_1_5x_fp16.tar) |
+
+
+
+## Cite PP-PicoDet
+```
+@misc{yu2021pppicodet,
+ title={PP-PicoDet: A Better Real-Time Object Detector on Mobile Devices},
+ author={Guanghua Yu and Qinyao Chang and Wenyu Lv and Chang Xu and Cheng Cui and Wei Ji and Qingqing Dang and Kaipeng Deng and Guanzhong Wang and Yuning Du and Baohua Lai and Qiwen Liu and Xiaoguang Hu and Dianhai Yu and Yanjun Ma},
+ year={2021},
+ eprint={2111.00902},
+ archivePrefix={arXiv},
+ primaryClass={cs.CV}
+}
+
+```
diff --git a/PaddleDetection-release-2.6/configs/picodet/legacy_model/_base_/optimizer_100e.yml b/PaddleDetection-release-2.6/configs/picodet/legacy_model/_base_/optimizer_100e.yml
new file mode 100644
index 0000000000000000000000000000000000000000..c866b39985cbd3dfd80220798d90e6995299f4f2
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/legacy_model/_base_/optimizer_100e.yml
@@ -0,0 +1,18 @@
+epoch: 100
+
+LearningRate:
+ base_lr: 0.4
+ schedulers:
+ - name: CosineDecay
+ max_epochs: 100
+ - name: LinearWarmup
+ start_factor: 0.1
+ steps: 300
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.00004
+ type: L2
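The schedule above combines linear warmup (over the first `steps` iterations) with per-epoch cosine decay. A minimal sketch of the resulting learning-rate curve, assuming the usual semantics of `LinearWarmup` and `CosineDecay` (this is an illustration, not the actual PaddleDetection implementation):

```python
import math

BASE_LR = 0.4       # base_lr
START_FACTOR = 0.1  # LinearWarmup start_factor
WARMUP_STEPS = 300  # LinearWarmup steps
MAX_EPOCHS = 100    # CosineDecay max_epochs

def warmup_lr(step):
    """Linear warmup from START_FACTOR * BASE_LR up to BASE_LR."""
    alpha = min(step / WARMUP_STEPS, 1.0)
    return BASE_LR * (START_FACTOR * (1 - alpha) + alpha)

def cosine_lr(epoch):
    """Cosine decay from BASE_LR down to 0 over MAX_EPOCHS."""
    return BASE_LR * 0.5 * (1 + math.cos(math.pi * epoch / MAX_EPOCHS))

print(round(warmup_lr(0), 4))    # 0.04  (start_factor * base_lr)
print(round(warmup_lr(300), 4))  # 0.4   (warmup finished)
print(round(cosine_lr(50), 4))   # 0.2   (half of base_lr at mid-training)
print(round(cosine_lr(100), 4))  # 0.0
```

The `optimizer_300e.yml` variant below uses the same shape, only stretched to 300 epochs.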
diff --git a/PaddleDetection-release-2.6/configs/picodet/legacy_model/_base_/optimizer_300e.yml b/PaddleDetection-release-2.6/configs/picodet/legacy_model/_base_/optimizer_300e.yml
new file mode 100644
index 0000000000000000000000000000000000000000..fa4c9094a23d39fcce343a529cacac9beb74a675
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/legacy_model/_base_/optimizer_300e.yml
@@ -0,0 +1,18 @@
+epoch: 300
+
+LearningRate:
+ base_lr: 0.4
+ schedulers:
+ - name: CosineDecay
+ max_epochs: 300
+ - name: LinearWarmup
+ start_factor: 0.1
+ steps: 300
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.00004
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/picodet/legacy_model/_base_/picodet_320_reader.yml b/PaddleDetection-release-2.6/configs/picodet/legacy_model/_base_/picodet_320_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..4d3f0cbd8648bf2d8ef44cdbf1d2422865a22c94
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/legacy_model/_base_/picodet_320_reader.yml
@@ -0,0 +1,42 @@
+worker_num: 6
+eval_height: &eval_height 320
+eval_width: &eval_width 320
+eval_size: &eval_size [*eval_height, *eval_width]
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomCrop: {}
+ - RandomFlip: {prob: 0.5}
+ - RandomDistort: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [256, 288, 320, 352, 384], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_size: 128
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 8
+ shuffle: false
+
+
+TestReader:
+ inputs_def:
+ image_shape: [1, 3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_size: 1
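The eval/test pipeline above decodes, resizes, normalizes with ImageNet statistics, and permutes to CHW. A minimal numpy sketch of the `NormalizeImage` + `Permute` steps (resizing omitted; assumes an already-decoded HxWx3 uint8 RGB image):

```python
import numpy as np

MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess(img_hwc_uint8):
    # NormalizeImage with is_scale: true -> scale pixels to [0, 1] first
    img = img_hwc_uint8.astype(np.float32) / 255.0
    img = (img - MEAN) / STD
    # Permute: HWC -> CHW, then add the batch dim (image_shape: [1, 3, H, W])
    return img.transpose(2, 0, 1)[None, ...]

dummy = np.zeros((320, 320, 3), dtype=np.uint8)
batch = preprocess(dummy)
print(batch.shape)  # (1, 3, 320, 320)
```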
diff --git a/PaddleDetection-release-2.6/configs/picodet/legacy_model/_base_/picodet_416_reader.yml b/PaddleDetection-release-2.6/configs/picodet/legacy_model/_base_/picodet_416_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..59433c64534163a454ad7e5a07b71d011119913c
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/legacy_model/_base_/picodet_416_reader.yml
@@ -0,0 +1,42 @@
+worker_num: 6
+eval_height: &eval_height 416
+eval_width: &eval_width 416
+eval_size: &eval_size [*eval_height, *eval_width]
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomCrop: {}
+ - RandomFlip: {prob: 0.5}
+ - RandomDistort: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [352, 384, 416, 448, 480], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_size: 80
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 8
+ shuffle: false
+
+
+TestReader:
+ inputs_def:
+ image_shape: [1, 3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/picodet/legacy_model/_base_/picodet_640_reader.yml b/PaddleDetection-release-2.6/configs/picodet/legacy_model/_base_/picodet_640_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..60904fb6ba77c858a50f1e743e637961c38ccd1f
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/legacy_model/_base_/picodet_640_reader.yml
@@ -0,0 +1,42 @@
+worker_num: 6
+eval_height: &eval_height 640
+eval_width: &eval_width 640
+eval_size: &eval_size [*eval_height, *eval_width]
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomCrop: {}
+ - RandomFlip: {prob: 0.5}
+ - RandomDistort: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [576, 608, 640, 672, 704], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_size: 56
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 8
+ shuffle: false
+
+
+TestReader:
+ inputs_def:
+ image_shape: [1, 3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/picodet/legacy_model/_base_/picodet_esnet.yml b/PaddleDetection-release-2.6/configs/picodet/legacy_model/_base_/picodet_esnet.yml
new file mode 100644
index 0000000000000000000000000000000000000000..aa099fca12122282641dc456eeb7f232338d447f
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/legacy_model/_base_/picodet_esnet.yml
@@ -0,0 +1,55 @@
+architecture: PicoDet
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ESNet_x1_0_pretrained.pdparams
+
+PicoDet:
+ backbone: ESNet
+ neck: CSPPAN
+ head: PicoHead
+
+ESNet:
+ scale: 1.0
+ feature_maps: [4, 11, 14]
+ act: hard_swish
+ channel_ratio: [0.875, 0.5, 1.0, 0.625, 0.5, 0.75, 0.625, 0.625, 0.5, 0.625, 1.0, 0.625, 0.75]
+
+CSPPAN:
+ out_channels: 128
+ use_depthwise: True
+ num_csp_blocks: 1
+ num_features: 4
+
+PicoHead:
+ conv_feat:
+ name: PicoFeat
+ feat_in: 128
+ feat_out: 128
+ num_convs: 4
+ num_fpn_stride: 4
+ norm_type: bn
+ share_cls_reg: True
+ fpn_stride: [8, 16, 32, 64]
+ feat_in_chan: 128
+ prior_prob: 0.01
+ reg_max: 7
+ cell_offset: 0.5
+ loss_class:
+ name: VarifocalLoss
+ use_sigmoid: True
+ iou_weighted: True
+ loss_weight: 1.0
+ loss_dfl:
+ name: DistributionFocalLoss
+ loss_weight: 0.25
+ loss_bbox:
+ name: GIoULoss
+ loss_weight: 2.0
+ assigner:
+ name: SimOTAAssigner
+ candidate_topk: 10
+ iou_weight: 6
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.025
+ nms_threshold: 0.6
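`reg_max: 7` in the head config reflects PicoDet's GFL-style regression: each box side is predicted as a discrete distribution over `reg_max + 1` bins, trained with Distribution Focal Loss, and the regressed distance is the expectation of that distribution (in units of the FPN stride). A hedged sketch of the decoding step:

```python
import math

REG_MAX = 7  # reg_max from the PicoHead config -> 8 bins per box side

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def decode_distance(logits, stride):
    """Expectation of the per-side bin distribution, scaled by the stride."""
    probs = softmax(logits)
    expectation = sum(i * p for i, p in enumerate(probs))
    return expectation * stride

# A distribution sharply peaked at bin 4 decodes to roughly 4 * stride.
logits = [0, 0, 0, 0, 10, 0, 0, 0]  # len == REG_MAX + 1
print(decode_distance(logits, stride=8))  # ~ 32.0
```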
diff --git a/PaddleDetection-release-2.6/configs/picodet/legacy_model/application/layout_analysis/README.md b/PaddleDetection-release-2.6/configs/picodet/legacy_model/application/layout_analysis/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..2d15c97b213c4ada85f76b6a7f86cbf181398f00
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/legacy_model/application/layout_analysis/README.md
@@ -0,0 +1,56 @@
+# More Applications
+
+
+## 1. Layout Analysis
+
+Layout analysis divides a document image into regions and locates the key areas in it, such as text, titles, tables, and figures. An example is shown below.
+
+
+![Layout analysis example](images/layout_demo.png)
+
+### 1.1 Dataset
+
+The English document layout model is trained on [PubLayNet](https://github.com/ibm-aur-nlp/PubLayNet), a dataset of English academic papers. It consists of a training set (333,703 annotated images), a validation set (11,245 annotated images), and a test set (11,405 images), covering 5 classes: Table, Figure, Title, Text, and List. More datasets are listed under [layout analysis datasets](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/ppstructure/layout/README.md#32).
+
+### 1.2 Model Zoo
+
+The PicoDet models below are trained on the PubLayNet dataset, optionally with FGD distillation. Pretrained models:
+
+| Model | Input size | mAP<sup>val<br>0.5</sup> | Trained model | Inference model | Config |
+| :-------- | :--------: | :---: | :---: | :---: | :---: |
+| PicoDet-LCNet_x1_0 | 800*608 | 93.5% | [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_layout.pdparams) | [inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_layout_infer.tar) | [config](./picodet_lcnet_x1_0_layout.yml) |
+| PicoDet-LCNet_x1_0 + FGD | 800*608 | 94.0% | [trained model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout.pdparams) | [inference model](https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_infer.tar) | [teacher config](./picodet_lcnet_x2_5_layout.yml), [student config](./picodet_lcnet_x1_0_layout.yml) |
+
+See the [introduction to FGD distillation](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/configs/slim/distill/README.md).
+
+### 1.3 Model Inference
+
+For the full layout-analysis workflow (data preparation, model training, evaluation, etc.), see [layout analysis](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/ppstructure/layout/README.md); only model inference is shown here. First, download the inference model from the model zoo above:
+
+```bash
+mkdir inference_model
+cd inference_model
+# Download and extract the PubLayNet inference model
+wget https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_infer.tar && tar xf picodet_lcnet_x1_0_fgd_layout_infer.tar
+cd ..
+```
+
+To run layout-analysis inference, execute:
+
+```bash
+python3 deploy/python/infer.py \
+ --model_dir=inference_model/picodet_lcnet_x1_0_fgd_layout_infer/ \
+ --image_file=docs/images/layout.jpg \
+ --device=CPU
+```
+
+The visualized layout result is shown below:
+
+
+![Layout analysis result](images/layout_res.jpg)
+
+## 2. Reference
+
+[1] Zhong X, Tang J, Yepes A J. Publaynet: largest dataset ever for document layout analysis[C]//2019 International Conference on Document Analysis and Recognition (ICDAR). IEEE, 2019: 1015-1022.
diff --git a/PaddleDetection-release-2.6/configs/picodet/legacy_model/application/layout_analysis/images/layout_demo.png b/PaddleDetection-release-2.6/configs/picodet/legacy_model/application/layout_analysis/images/layout_demo.png
new file mode 100644
index 0000000000000000000000000000000000000000..da9640e245e34659771353e328bf97da129bd622
Binary files /dev/null and b/PaddleDetection-release-2.6/configs/picodet/legacy_model/application/layout_analysis/images/layout_demo.png differ
diff --git a/PaddleDetection-release-2.6/configs/picodet/legacy_model/application/layout_analysis/images/layout_res.jpg b/PaddleDetection-release-2.6/configs/picodet/legacy_model/application/layout_analysis/images/layout_res.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..93b3a8bef3bfc9f5c80a9505239af05d526b45a7
Binary files /dev/null and b/PaddleDetection-release-2.6/configs/picodet/legacy_model/application/layout_analysis/images/layout_res.jpg differ
diff --git a/PaddleDetection-release-2.6/configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml b/PaddleDetection-release-2.6/configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml
new file mode 100644
index 0000000000000000000000000000000000000000..46e7e235f704e75f6b73b5497f694ba726a16143
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x1_0_layout.yml
@@ -0,0 +1,88 @@
+_BASE_: [
+ '../../../../runtime.yml',
+ '../../_base_/picodet_esnet.yml',
+ '../../_base_/optimizer_100e.yml',
+ '../../_base_/picodet_640_reader.yml',
+]
+
+weights: output/picodet_lcnet_x1_0_layout/model_final
+find_unused_parameters: True
+use_ema: true
+cycle_epoch: 10
+snapshot_epoch: 1
+epoch: 100
+
+PicoDet:
+ backbone: LCNet
+ neck: CSPPAN
+ head: PicoHead
+
+LCNet:
+ scale: 1.0
+ feature_maps: [3, 4, 5]
+
+metric: COCO
+num_classes: 5
+
+TrainDataset:
+ !COCODataSet
+ image_dir: train
+ anno_path: train.json
+ dataset_dir: ./dataset/publaynet/
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: val
+ anno_path: val.json
+ dataset_dir: ./dataset/publaynet/
+
+TestDataset:
+ !ImageFolder
+ anno_path: ./dataset/publaynet/val.json
+
+
+worker_num: 8
+eval_height: &eval_height 800
+eval_width: &eval_width 608
+eval_size: &eval_size [*eval_height, *eval_width]
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomCrop: {}
+ - RandomFlip: {prob: 0.5}
+ - RandomDistort: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [[768, 576], [800, 608], [832, 640]], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_size: 24
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 608], keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 8
+ shuffle: false
+
+
+TestReader:
+ inputs_def:
+ image_shape: [1, 3, 800, 608]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 608], keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
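The config above relies on `_BASE_` inheritance: the listed base files are loaded first, then keys in this file (e.g. `epoch`, the `LCNet` backbone, the reader sections) override them. A simplified sketch of that merge behavior, assuming a plain recursive dict merge (the real loader lives in `ppdet.core.workspace`):

```python
def merge(base, child):
    """Child keys override base keys; nested dicts are merged recursively."""
    out = dict(base)
    for k, v in child.items():
        if isinstance(v, dict) and isinstance(out.get(k), dict):
            out[k] = merge(out[k], v)
        else:
            out[k] = v
    return out

# Illustrative values modeled on the configs in this directory.
base = {"epoch": 300, "LCNet": {"scale": 1.0, "feature_maps": [3, 4, 5]}}
child = {"epoch": 100, "LCNet": {"scale": 2.5}}
merged = merge(base, child)
print(merged)  # {'epoch': 100, 'LCNet': {'scale': 2.5, 'feature_maps': [3, 4, 5]}}
```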
diff --git a/PaddleDetection-release-2.6/configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x2_5_layout.yml b/PaddleDetection-release-2.6/configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x2_5_layout.yml
new file mode 100644
index 0000000000000000000000000000000000000000..cb7f2157dc3fab588c25c569b3d504a7cb58a9ed
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/legacy_model/application/layout_analysis/picodet_lcnet_x2_5_layout.yml
@@ -0,0 +1,32 @@
+_BASE_: [
+ '../../_base_/picodet_esnet.yml',
+]
+
+weights: output/picodet_lcnet_x2_5_layout/model_final
+find_unused_parameters: True
+
+PicoDet:
+ backbone: LCNet
+ neck: CSPPAN
+ head: PicoHead
+
+LCNet:
+ scale: 2.5
+ feature_maps: [3, 4, 5]
+
+CSPPAN:
+ spatial_scales: [0.125, 0.0625, 0.03125]
+
+slim: Distill
+slim_method: FGD
+distill_loss: FGDFeatureLoss
+distill_loss_name: ['neck_f_3', 'neck_f_2', 'neck_f_1', 'neck_f_0']
+
+FGDFeatureLoss:
+ student_channels: 128
+ teacher_channels: 128
+ temp: 0.5
+ alpha_fgd: 0.001
+ beta_fgd: 0.0005
+ gamma_fgd: 0.0005
+ lambda_fgd: 0.000005
diff --git a/PaddleDetection-release-2.6/configs/picodet/legacy_model/application/mainbody_detection/README.md b/PaddleDetection-release-2.6/configs/picodet/legacy_model/application/mainbody_detection/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..0408587e62a81dbd97ae9128f59497287da26f5f
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/legacy_model/application/mainbody_detection/README.md
@@ -0,0 +1,30 @@
+# More Applications
+
+
+## 1. Mainbody Detection
+
+Mainbody detection is a widely used detection technique: it locates one or more main subjects in an image, crops the corresponding regions, and hands them to a recognizer, completing the full recognition pipeline. As the step that precedes recognition, it can substantially improve recognition accuracy.
+
+Mainbody detection is used in PaddleClas's PP-ShiTu image recognition system, where the detector is based on PP-PicoDet. For more on PP-ShiTu, see [PP-ShiTu](https://github.com/PaddlePaddle/PaddleClas).
+
+
+### 1.1 Datasets
+
+The following datasets are used to train the mainbody detection model for the PP-ShiTu image recognition task.
+
+| Dataset | Size | Images used for mainbody detection | Scenario | Link |
+| :------------: | :-------------: | :-------: | :-------: | :--------: |
+| Objects365 | 1700K | 173k | General | [link](https://www.objects365.org/overview.html) |
+| COCO2017 | 118K | 118k | General | [link](https://cocodataset.org/) |
+| iCartoonFace | 48k | 48k | Cartoon face detection | [link](https://github.com/luxiangju-PersonAI/iCartoonFace) |
+| LogoDet-3k | 155k | 155k | Logo detection | [link](https://github.com/Wangjing1551/LogoDet-3K-Dataset) |
+| RPC | 54k | 54k | Product detection | [link](https://rpc-dataset.github.io/) |
+
+For training, all of the datasets are mixed together. Since the task is mainbody detection, every annotated box is relabeled to a single "foreground" class, so the merged dataset contains exactly 1 class. See [picodet_lcnet_x2_5_640_mainbody.yml](./picodet_lcnet_x2_5_640_mainbody.yml) for the dataset configuration.
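Relabeling every box to the single foreground class can be sketched as a small transformation over a COCO-format annotation dict (file names and category ids below are illustrative, not the actual training data):

```python
def to_mainbody(coco_dict):
    """Map every category in a COCO-format dict to one 'foreground' class (id 1)."""
    out = dict(coco_dict)
    out["categories"] = [{"id": 1, "name": "foreground"}]
    out["annotations"] = [
        {**ann, "category_id": 1} for ann in coco_dict["annotations"]
    ]
    return out

# Toy COCO-format sample with two original classes.
sample = {
    "images": [{"id": 0, "file_name": "a.jpg"}],
    "annotations": [
        {"id": 0, "image_id": 0, "category_id": 17, "bbox": [0, 0, 10, 10]},
        {"id": 1, "image_id": 0, "category_id": 3, "bbox": [5, 5, 20, 20]},
    ],
    "categories": [{"id": 17, "name": "cat"}, {"id": 3, "name": "car"}],
}
merged = to_mainbody(sample)
print({a["category_id"] for a in merged["annotations"]})  # {1}
```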
+
+
+### 1.2 Model Zoo
+
+| Model | Input size | mAP<sup>val<br>0.5:0.95</sup> | mAP<sup>val<br>0.5</sup> | Trained model | Inference model | Log | Config |
+| :-------- | :--------: | :---: | :---: | :---: | :---: | :---: | :---: |
+| PicoDet-LCNet_x2_5 | 640*640 | 41.5 | 62.0 | [trained model](https://paddledet.bj.bcebos.com/models/picodet_lcnet_x2_5_640_mainbody.pdparams) | [inference model](https://paddledet.bj.bcebos.com/models/picodet_lcnet_x2_5_640_mainbody_infer.tar) | [log](https://paddledet.bj.bcebos.com/logs/train_picodet_lcnet_x2_5_640_mainbody.log) | [config](./picodet_lcnet_x2_5_640_mainbody.yml) |
diff --git a/PaddleDetection-release-2.6/configs/picodet/legacy_model/application/mainbody_detection/picodet_lcnet_x2_5_640_mainbody.yml b/PaddleDetection-release-2.6/configs/picodet/legacy_model/application/mainbody_detection/picodet_lcnet_x2_5_640_mainbody.yml
new file mode 100644
index 0000000000000000000000000000000000000000..cc06c1c7a58cd089c698ab6a35175fdbc317540f
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/legacy_model/application/mainbody_detection/picodet_lcnet_x2_5_640_mainbody.yml
@@ -0,0 +1,41 @@
+_BASE_: [
+ '../../../../runtime.yml',
+ '../../_base_/picodet_esnet.yml',
+ '../../_base_/optimizer_100e.yml',
+ '../../_base_/picodet_640_reader.yml',
+]
+
+weights: output/picodet_lcnet_x2_5_640_mainbody/model_final
+find_unused_parameters: True
+use_ema: true
+cycle_epoch: 20
+snapshot_epoch: 2
+
+PicoDet:
+ backbone: LCNet
+ neck: CSPPAN
+ head: PicoHead
+
+LCNet:
+ scale: 2.5
+ feature_maps: [3, 4, 5]
+
+metric: COCO
+num_classes: 1
+
+TrainDataset:
+ !COCODataSet
+ image_dir: ./
+ anno_path: train.json
+ dataset_dir: dataset/mainbody/
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: ./
+ anno_path: val.json
+ dataset_dir: dataset/mainbody/
+
+TestDataset:
+ !ImageFolder
+ anno_path: ./dataset/mainbody/val.json
diff --git a/PaddleDetection-release-2.6/configs/picodet/legacy_model/application/pedestrian_detection/picodet_s_192_pedestrian.yml b/PaddleDetection-release-2.6/configs/picodet/legacy_model/application/pedestrian_detection/picodet_s_192_pedestrian.yml
new file mode 100644
index 0000000000000000000000000000000000000000..47f50425d138d174fce50f5e82e8be06382c4ade
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/legacy_model/application/pedestrian_detection/picodet_s_192_pedestrian.yml
@@ -0,0 +1,148 @@
+use_gpu: true
+log_iter: 20
+save_dir: output
+snapshot_epoch: 1
+print_flops: false
+weights: output/picodet_s_192_pedestrian/model_final
+find_unused_parameters: True
+use_ema: true
+cycle_epoch: 40
+snapshot_epoch: 10
+epoch: 300
+metric: COCO
+num_classes: 1
+# Exporting the model
+export:
+ post_process: False # Whether post-processing is included in the network when exporting the model.
+ nms: False # Whether NMS is included in the network when exporting the model.
+ benchmark: False # Used to benchmark model performance; if set `True`, post-processing and NMS will not be exported.
+
+architecture: PicoDet
+
+PicoDet:
+ backbone: ESNet
+ neck: CSPPAN
+ head: PicoHead
+
+ESNet:
+ scale: 0.75
+ feature_maps: [4, 11, 14]
+ act: hard_swish
+ channel_ratio: [0.875, 0.5, 0.5, 0.5, 0.625, 0.5, 0.625, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
+
+CSPPAN:
+ out_channels: 96
+ use_depthwise: True
+ num_csp_blocks: 1
+ num_features: 4
+
+PicoHead:
+ conv_feat:
+ name: PicoFeat
+ feat_in: 96
+ feat_out: 96
+ num_convs: 2
+ num_fpn_stride: 4
+ norm_type: bn
+ share_cls_reg: True
+ fpn_stride: [8, 16, 32, 64]
+ feat_in_chan: 96
+ prior_prob: 0.01
+ reg_max: 7
+ cell_offset: 0.5
+ loss_class:
+ name: VarifocalLoss
+ use_sigmoid: True
+ iou_weighted: True
+ loss_weight: 1.0
+ loss_dfl:
+ name: DistributionFocalLoss
+ loss_weight: 0.25
+ loss_bbox:
+ name: GIoULoss
+ loss_weight: 2.0
+ assigner:
+ name: SimOTAAssigner
+ candidate_topk: 10
+ iou_weight: 6
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.025
+ nms_threshold: 0.6
+
+LearningRate:
+ base_lr: 0.4
+ schedulers:
+ - !CosineDecay
+ max_epochs: 300
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 300
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.00004
+ type: L2
+
+TrainDataset:
+ !COCODataSet
+ image_dir: ""
+ anno_path: aic_coco_train_cocoformat.json
+ dataset_dir: dataset
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: val2017
+ anno_path: annotations/instances_val2017.json
+ dataset_dir: dataset/coco
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/instances_val2017.json
+
+worker_num: 8
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomCrop: {}
+ - RandomFlip: {prob: 0.5}
+ - RandomDistort: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [128, 160, 192, 224, 256], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_size: 128
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [192, 192], keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 8
+ shuffle: false
+
+TestReader:
+ inputs_def:
+ image_shape: [1, 3, 192, 192]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [192, 192], keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+ fuse_normalize: true
diff --git a/PaddleDetection-release-2.6/configs/picodet/legacy_model/application/pedestrian_detection/picodet_s_320_pedestrian.yml b/PaddleDetection-release-2.6/configs/picodet/legacy_model/application/pedestrian_detection/picodet_s_320_pedestrian.yml
new file mode 100644
index 0000000000000000000000000000000000000000..cf78ea61986201aaf58c78e4b8d0f6bdcd361464
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/legacy_model/application/pedestrian_detection/picodet_s_320_pedestrian.yml
@@ -0,0 +1,147 @@
+use_gpu: true
+log_iter: 20
+save_dir: output
+snapshot_epoch: 1
+print_flops: false
+weights: output/picodet_s_320_pedestrian/model_final
+find_unused_parameters: True
+use_ema: true
+cycle_epoch: 40
+snapshot_epoch: 10
+epoch: 300
+metric: COCO
+num_classes: 1
+# Exporting the model
+export:
+ post_process: False # Whether post-processing is included in the network when exporting the model.
+ nms: False # Whether NMS is included in the network when exporting the model.
+ benchmark: False # Used to benchmark model performance; if set `True`, post-processing and NMS will not be exported.
+
+architecture: PicoDet
+
+PicoDet:
+ backbone: ESNet
+ neck: CSPPAN
+ head: PicoHead
+
+ESNet:
+ scale: 0.75
+ feature_maps: [4, 11, 14]
+ act: hard_swish
+ channel_ratio: [0.875, 0.5, 0.5, 0.5, 0.625, 0.5, 0.625, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
+
+CSPPAN:
+ out_channels: 96
+ use_depthwise: True
+ num_csp_blocks: 1
+ num_features: 4
+
+PicoHead:
+ conv_feat:
+ name: PicoFeat
+ feat_in: 96
+ feat_out: 96
+ num_convs: 2
+ num_fpn_stride: 4
+ norm_type: bn
+ share_cls_reg: True
+ fpn_stride: [8, 16, 32, 64]
+ feat_in_chan: 96
+ prior_prob: 0.01
+ reg_max: 7
+ cell_offset: 0.5
+ loss_class:
+ name: VarifocalLoss
+ use_sigmoid: True
+ iou_weighted: True
+ loss_weight: 1.0
+ loss_dfl:
+ name: DistributionFocalLoss
+ loss_weight: 0.25
+ loss_bbox:
+ name: GIoULoss
+ loss_weight: 2.0
+ assigner:
+ name: SimOTAAssigner
+ candidate_topk: 10
+ iou_weight: 6
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.025
+ nms_threshold: 0.6
+
+LearningRate:
+ base_lr: 0.4
+ schedulers:
+ - !CosineDecay
+ max_epochs: 300
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 300
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.00004
+ type: L2
+
+TrainDataset:
+ !COCODataSet
+ image_dir: ""
+ anno_path: aic_coco_train_cocoformat.json
+ dataset_dir: dataset
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: val2017
+ anno_path: annotations/instances_val2017.json
+ dataset_dir: dataset/coco
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/instances_val2017.json
+
+worker_num: 8
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomCrop: {}
+ - RandomFlip: {prob: 0.5}
+ - RandomDistort: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [256, 288, 320, 352, 384], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_size: 128
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [320, 320], keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 8
+ shuffle: false
+
+TestReader:
+ inputs_def:
+ image_shape: [1, 3, 320, 320]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [320, 320], keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
diff --git a/PaddleDetection-release-2.6/configs/picodet/legacy_model/more_config/picodet_lcnet_1_5x_416_coco.yml b/PaddleDetection-release-2.6/configs/picodet/legacy_model/more_config/picodet_lcnet_1_5x_416_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..aaa1a807c70a4a028f2195ae8633e0cce2cd0bde
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/legacy_model/more_config/picodet_lcnet_1_5x_416_coco.yml
@@ -0,0 +1,23 @@
+_BASE_: [
+ '../../../datasets/coco_detection.yml',
+ '../../../runtime.yml',
+ '../_base_/picodet_esnet.yml',
+ '../_base_/optimizer_300e.yml',
+ '../_base_/picodet_416_reader.yml',
+]
+
+
+weights: output/picodet_lcnet_1_5x_416_coco/model_final
+find_unused_parameters: True
+use_ema: true
+cycle_epoch: 40
+snapshot_epoch: 10
+
+PicoDet:
+ backbone: LCNet
+ neck: CSPPAN
+ head: PicoHead
+
+LCNet:
+ scale: 1.5
+ feature_maps: [3, 4, 5]
diff --git a/PaddleDetection-release-2.6/configs/picodet/legacy_model/more_config/picodet_lcnet_1_5x_640_coco.yml b/PaddleDetection-release-2.6/configs/picodet/legacy_model/more_config/picodet_lcnet_1_5x_640_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..7622d749cb267168ccd5d481bbe8bb4fbbc25054
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/legacy_model/more_config/picodet_lcnet_1_5x_640_coco.yml
@@ -0,0 +1,49 @@
+_BASE_: [
+ '../../../datasets/coco_detection.yml',
+ '../../../runtime.yml',
+ '../_base_/picodet_esnet.yml',
+ '../_base_/optimizer_300e.yml',
+ '../_base_/picodet_640_reader.yml',
+]
+
+
+weights: output/picodet_lcnet_1_5x_640_coco/model_final
+find_unused_parameters: True
+use_ema: true
+cycle_epoch: 40
+snapshot_epoch: 10
+
+PicoDet:
+ backbone: LCNet
+ neck: CSPPAN
+ head: PicoHead
+
+LCNet:
+ scale: 1.5
+ feature_maps: [3, 4, 5]
+
+CSPPAN:
+ out_channels: 160
+
+PicoHead:
+ conv_feat:
+ name: PicoFeat
+ feat_in: 160
+ feat_out: 160
+ num_convs: 4
+ num_fpn_stride: 4
+ norm_type: bn
+ share_cls_reg: True
+ feat_in_chan: 160
+
+TrainReader:
+ batch_size: 24
+
+LearningRate:
+ base_lr: 0.2
+ schedulers:
+ - !CosineDecay
+ max_epochs: 300
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 300
diff --git a/PaddleDetection-release-2.6/configs/picodet/legacy_model/more_config/picodet_mobilenetv3_large_1x_416_coco.yml b/PaddleDetection-release-2.6/configs/picodet/legacy_model/more_config/picodet_mobilenetv3_large_1x_416_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..375bff97b7677bbe256a8f93b8d10218dc4cc2bf
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/legacy_model/more_config/picodet_mobilenetv3_large_1x_416_coco.yml
@@ -0,0 +1,39 @@
+_BASE_: [
+ '../../../datasets/coco_detection.yml',
+ '../../../runtime.yml',
+ '../_base_/picodet_esnet.yml',
+ '../_base_/optimizer_300e.yml',
+ '../_base_/picodet_416_reader.yml',
+]
+
+
+weights: output/picodet_mobilenetv3_large_1x_416_coco/model_final
+find_unused_parameters: True
+use_ema: true
+cycle_epoch: 40
+snapshot_epoch: 10
+epoch: 180
+
+PicoDet:
+ backbone: MobileNetV3
+ neck: CSPPAN
+ head: PicoHead
+
+MobileNetV3:
+ model_name: large
+ scale: 1.0
+ with_extra_blocks: false
+ extra_block_filters: []
+ feature_maps: [7, 13, 16]
+
+TrainReader:
+ batch_size: 56
+
+LearningRate:
+ base_lr: 0.3
+ schedulers:
+ - !CosineDecay
+ max_epochs: 300
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 300
diff --git a/PaddleDetection-release-2.6/configs/picodet/legacy_model/more_config/picodet_r18_640_coco.yml b/PaddleDetection-release-2.6/configs/picodet/legacy_model/more_config/picodet_r18_640_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a1f60d4f01cec6ff1629f32538f1f783496a3825
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/legacy_model/more_config/picodet_r18_640_coco.yml
@@ -0,0 +1,39 @@
+_BASE_: [
+ '../../../datasets/coco_detection.yml',
+ '../../../runtime.yml',
+ '../_base_/picodet_esnet.yml',
+ '../_base_/optimizer_300e.yml',
+ '../_base_/picodet_640_reader.yml',
+]
+
+
+weights: output/picodet_r18_640_coco/model_final
+find_unused_parameters: True
+use_ema: true
+cycle_epoch: 40
+snapshot_epoch: 10
+
+PicoDet:
+ backbone: ResNet
+ neck: CSPPAN
+ head: PicoHead
+
+ResNet:
+ depth: 18
+ variant: d
+ return_idx: [1, 2, 3]
+ freeze_at: -1
+ freeze_norm: false
+ norm_decay: 0.
+
+TrainReader:
+ batch_size: 56
+
+LearningRate:
+ base_lr: 0.3
+ schedulers:
+ - !CosineDecay
+ max_epochs: 300
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 300
diff --git a/PaddleDetection-release-2.6/configs/picodet/legacy_model/more_config/picodet_shufflenetv2_1x_416_coco.yml b/PaddleDetection-release-2.6/configs/picodet/legacy_model/more_config/picodet_shufflenetv2_1x_416_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..15f62bcec5cc503c5ab0329dc868e789e87b2fe3
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/legacy_model/more_config/picodet_shufflenetv2_1x_416_coco.yml
@@ -0,0 +1,37 @@
+_BASE_: [
+ '../../../datasets/coco_detection.yml',
+ '../../../runtime.yml',
+ '../_base_/picodet_esnet.yml',
+ '../_base_/optimizer_300e.yml',
+ '../_base_/picodet_416_reader.yml',
+]
+
+weights: output/picodet_shufflenetv2_1x_416_coco/model_final
+find_unused_parameters: True
+use_ema: true
+cycle_epoch: 40
+snapshot_epoch: 10
+
+PicoDet:
+ backbone: ShuffleNetV2
+ neck: CSPPAN
+ head: PicoHead
+
+ShuffleNetV2:
+ scale: 1.0
+ feature_maps: [5, 13, 17]
+ act: leaky_relu
+
+CSPPAN:
+ out_channels: 96
+
+PicoHead:
+ conv_feat:
+ name: PicoFeat
+ feat_in: 96
+ feat_out: 96
+ num_convs: 2
+ num_fpn_stride: 4
+ norm_type: bn
+ share_cls_reg: True
+ feat_in_chan: 96
diff --git a/PaddleDetection-release-2.6/configs/picodet/legacy_model/picodet_l_320_coco.yml b/PaddleDetection-release-2.6/configs/picodet/legacy_model/picodet_l_320_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a41d5823e3bc3872dd90220408fd73ef7bcf8f3d
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/legacy_model/picodet_l_320_coco.yml
@@ -0,0 +1,47 @@
+_BASE_: [
+ '../../datasets/coco_detection.yml',
+ '../../runtime.yml',
+ '_base_/picodet_esnet.yml',
+ '_base_/optimizer_300e.yml',
+ '_base_/picodet_320_reader.yml',
+]
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ESNet_x1_25_pretrained.pdparams
+weights: output/picodet_l_320_coco/model_final
+find_unused_parameters: True
+use_ema: true
+cycle_epoch: 40
+snapshot_epoch: 10
+epoch: 250
+
+ESNet:
+ scale: 1.25
+ feature_maps: [4, 11, 14]
+ act: hard_swish
+ channel_ratio: [0.875, 0.5, 1.0, 0.625, 0.5, 0.75, 0.625, 0.625, 0.5, 0.625, 1.0, 0.625, 0.75]
+
+CSPPAN:
+ out_channels: 160
+
+PicoHead:
+ conv_feat:
+ name: PicoFeat
+ feat_in: 160
+ feat_out: 160
+ num_convs: 4
+ num_fpn_stride: 4
+ norm_type: bn
+ share_cls_reg: True
+ feat_in_chan: 160
+
+TrainReader:
+ batch_size: 56
+
+LearningRate:
+ base_lr: 0.3
+ schedulers:
+ - !CosineDecay
+ max_epochs: 300
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 300
diff --git a/PaddleDetection-release-2.6/configs/picodet/legacy_model/picodet_l_416_coco.yml b/PaddleDetection-release-2.6/configs/picodet/legacy_model/picodet_l_416_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..fcee20c3eff8fef97247ca3b4cfb5db95124114b
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/legacy_model/picodet_l_416_coco.yml
@@ -0,0 +1,47 @@
+_BASE_: [
+ '../../datasets/coco_detection.yml',
+ '../../runtime.yml',
+ '_base_/picodet_esnet.yml',
+ '_base_/optimizer_300e.yml',
+ '_base_/picodet_416_reader.yml',
+]
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ESNet_x1_25_pretrained.pdparams
+weights: output/picodet_l_416_coco/model_final
+find_unused_parameters: True
+use_ema: true
+cycle_epoch: 40
+snapshot_epoch: 10
+epoch: 250
+
+ESNet:
+ scale: 1.25
+ feature_maps: [4, 11, 14]
+ act: hard_swish
+ channel_ratio: [0.875, 0.5, 1.0, 0.625, 0.5, 0.75, 0.625, 0.625, 0.5, 0.625, 1.0, 0.625, 0.75]
+
+CSPPAN:
+ out_channels: 160
+
+PicoHead:
+ conv_feat:
+ name: PicoFeat
+ feat_in: 160
+ feat_out: 160
+ num_convs: 4
+ num_fpn_stride: 4
+ norm_type: bn
+ share_cls_reg: True
+ feat_in_chan: 160
+
+TrainReader:
+ batch_size: 48
+
+LearningRate:
+ base_lr: 0.3
+ schedulers:
+ - !CosineDecay
+ max_epochs: 300
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 300
diff --git a/PaddleDetection-release-2.6/configs/picodet/legacy_model/picodet_l_640_coco.yml b/PaddleDetection-release-2.6/configs/picodet/legacy_model/picodet_l_640_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..990db111dfe704f8dda661d09a2e7eb6474aa262
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/legacy_model/picodet_l_640_coco.yml
@@ -0,0 +1,47 @@
+_BASE_: [
+ '../../datasets/coco_detection.yml',
+ '../../runtime.yml',
+ '_base_/picodet_esnet.yml',
+ '_base_/optimizer_300e.yml',
+ '_base_/picodet_640_reader.yml',
+]
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ESNet_x1_25_pretrained.pdparams
+weights: output/picodet_l_640_coco/model_final
+find_unused_parameters: True
+use_ema: true
+cycle_epoch: 40
+snapshot_epoch: 10
+epoch: 250
+
+ESNet:
+ scale: 1.25
+ feature_maps: [4, 11, 14]
+ act: hard_swish
+ channel_ratio: [0.875, 0.5, 1.0, 0.625, 0.5, 0.75, 0.625, 0.625, 0.5, 0.625, 1.0, 0.625, 0.75]
+
+CSPPAN:
+ out_channels: 160
+
+PicoHead:
+ conv_feat:
+ name: PicoFeat
+ feat_in: 160
+ feat_out: 160
+ num_convs: 4
+ num_fpn_stride: 4
+ norm_type: bn
+ share_cls_reg: True
+ feat_in_chan: 160
+
+TrainReader:
+ batch_size: 32
+
+LearningRate:
+ base_lr: 0.3
+ schedulers:
+ - !CosineDecay
+ max_epochs: 300
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 300
diff --git a/PaddleDetection-release-2.6/configs/picodet/legacy_model/picodet_m_320_coco.yml b/PaddleDetection-release-2.6/configs/picodet/legacy_model/picodet_m_320_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..5b9f1ce7aa8f1b4cf1e9eea42b618c284b92c98f
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/legacy_model/picodet_m_320_coco.yml
@@ -0,0 +1,13 @@
+_BASE_: [
+ '../../datasets/coco_detection.yml',
+ '../../runtime.yml',
+ '_base_/picodet_esnet.yml',
+ '_base_/optimizer_300e.yml',
+ '_base_/picodet_320_reader.yml',
+]
+
+weights: output/picodet_m_320_coco/model_final
+find_unused_parameters: True
+use_ema: true
+cycle_epoch: 40
+snapshot_epoch: 10
diff --git a/PaddleDetection-release-2.6/configs/picodet/legacy_model/picodet_m_416_coco.yml b/PaddleDetection-release-2.6/configs/picodet/legacy_model/picodet_m_416_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..8c52f72ead3f523435a091b9ffaade66929e9645
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/legacy_model/picodet_m_416_coco.yml
@@ -0,0 +1,13 @@
+_BASE_: [
+ '../../datasets/coco_detection.yml',
+ '../../runtime.yml',
+ '_base_/picodet_esnet.yml',
+ '_base_/optimizer_300e.yml',
+ '_base_/picodet_416_reader.yml',
+]
+
+weights: output/picodet_m_416_coco/model_final
+find_unused_parameters: True
+use_ema: true
+cycle_epoch: 40
+snapshot_epoch: 10
diff --git a/PaddleDetection-release-2.6/configs/picodet/legacy_model/picodet_s_320_coco.yml b/PaddleDetection-release-2.6/configs/picodet/legacy_model/picodet_s_320_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..9945e3db13967ec875e050619cb66cac7827aa3e
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/legacy_model/picodet_s_320_coco.yml
@@ -0,0 +1,34 @@
+_BASE_: [
+ '../../datasets/coco_detection.yml',
+ '../../runtime.yml',
+ '_base_/picodet_esnet.yml',
+ '_base_/optimizer_300e.yml',
+ '_base_/picodet_320_reader.yml',
+]
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ESNet_x0_75_pretrained.pdparams
+weights: output/picodet_s_320_coco/model_final
+find_unused_parameters: True
+use_ema: true
+cycle_epoch: 40
+snapshot_epoch: 10
+
+ESNet:
+ scale: 0.75
+ feature_maps: [4, 11, 14]
+ act: hard_swish
+ channel_ratio: [0.875, 0.5, 0.5, 0.5, 0.625, 0.5, 0.625, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
+
+CSPPAN:
+ out_channels: 96
+
+PicoHead:
+ conv_feat:
+ name: PicoFeat
+ feat_in: 96
+ feat_out: 96
+ num_convs: 2
+ num_fpn_stride: 4
+ norm_type: bn
+ share_cls_reg: True
+ feat_in_chan: 96
diff --git a/PaddleDetection-release-2.6/configs/picodet/legacy_model/picodet_s_320_voc.yml b/PaddleDetection-release-2.6/configs/picodet/legacy_model/picodet_s_320_voc.yml
new file mode 100644
index 0000000000000000000000000000000000000000..0be56616dcede4d60e4a54fb09cc66c45c63ebdf
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/legacy_model/picodet_s_320_voc.yml
@@ -0,0 +1,37 @@
+_BASE_: [
+ '../../datasets/voc.yml',
+ '../../runtime.yml',
+ '_base_/picodet_esnet.yml',
+ '_base_/optimizer_300e.yml',
+ '_base_/picodet_320_reader.yml',
+]
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ESNet_x0_75_pretrained.pdparams
+weights: output/picodet_s_320_coco/model_final
+find_unused_parameters: True
+use_ema: true
+cycle_epoch: 40
+snapshot_epoch: 10
+
+ESNet:
+ scale: 0.75
+ feature_maps: [4, 11, 14]
+ act: hard_swish
+ channel_ratio: [0.875, 0.5, 0.5, 0.5, 0.625, 0.5, 0.625, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
+
+CSPPAN:
+ out_channels: 96
+
+PicoHead:
+ conv_feat:
+ name: PicoFeat
+ feat_in: 96
+ feat_out: 96
+ num_convs: 2
+ num_fpn_stride: 4
+ norm_type: bn
+ share_cls_reg: True
+ feat_in_chan: 96
+
+EvalReader:
+ collate_batch: false
diff --git a/PaddleDetection-release-2.6/configs/picodet/legacy_model/picodet_s_416_coco.yml b/PaddleDetection-release-2.6/configs/picodet/legacy_model/picodet_s_416_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..3764b6e4f26cc328b5e7a19815f88bbadf6e4013
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/legacy_model/picodet_s_416_coco.yml
@@ -0,0 +1,34 @@
+_BASE_: [
+ '../../datasets/coco_detection.yml',
+ '../../runtime.yml',
+ '_base_/picodet_esnet.yml',
+ '_base_/optimizer_300e.yml',
+ '_base_/picodet_416_reader.yml',
+]
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ESNet_x0_75_pretrained.pdparams
+weights: output/picodet_s_416_coco/model_final
+find_unused_parameters: True
+use_ema: true
+cycle_epoch: 40
+snapshot_epoch: 10
+
+ESNet:
+ scale: 0.75
+ feature_maps: [4, 11, 14]
+ act: hard_swish
+ channel_ratio: [0.875, 0.5, 0.5, 0.5, 0.625, 0.5, 0.625, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
+
+CSPPAN:
+ out_channels: 96
+
+PicoHead:
+ conv_feat:
+ name: PicoFeat
+ feat_in: 96
+ feat_out: 96
+ num_convs: 2
+ num_fpn_stride: 4
+ norm_type: bn
+ share_cls_reg: True
+ feat_in_chan: 96
diff --git a/PaddleDetection-release-2.6/configs/picodet/legacy_model/pruner/README.md b/PaddleDetection-release-2.6/configs/picodet/legacy_model/pruner/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..ce51e36862827f16c1aac54531e15128eab79a87
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/legacy_model/pruner/README.md
@@ -0,0 +1,135 @@
+# Applying Unstructured Sparsity to PicoDet
+
+## 1. Introduction
+In model compression, the two common forms of sparsity are structured and unstructured sparsity. The former prunes convolutions and matrix multiplications along a specific dimension (feature channels, convolution kernels, etc.) and produces a smaller model structure, so existing dense convolution and matmul kernels can be reused and no special inference operators are required. The latter sparsifies the model parameter by parameter, without changing the shape of the parameter matrices, so it depends more on the inference library's and hardware's ability to accelerate sparse matrix operations. We applied unstructured sparsity to the PP-PicoDet (hereafter PicoDet) model and obtained a significant inference speed-up on ARM CPUs at a small accuracy cost. This document describes how to train PicoDet with unstructured sparsity; for more about unstructured sparsity, see [here](https://github.com/PaddlePaddle/PaddleSlim/tree/develop/demo/dygraph/unstructured_pruning).
+
+## 2. Requirements
+```bash
+PaddlePaddle >= 2.1.2
+PaddleSlim develop branch (pip install paddleslim -i https://pypi.tuna.tsinghua.edu.cn/simple)
+```
+
+## 3. Data Preparation
+Same as for PicoDet.
+
+## 4. Pretrained Model
+For unstructured sparsity training, the pretrained model is required to be a fully converged set of weights, so it must be declared explicitly in the corresponding config file.
+
+Config file that declares the pretrained weights: ./configs/picodet/pruner/picodet_m_320_coco_pruner.yml
+For pretrained model URLs, see the PicoDet docs: ./configs/picodet/README.md
+
+## 5. Customizing the Sparsification Scope
+For the best inference speed-up, we recommend sparsifying only the 1x1 convolution layers and keeping the parameters of all other layers dense. In addition, some layers have a large impact on accuracy (for example, the last few layers of the head and several layers in the se-blocks), and we likewise recommend excluding them from sparsification. Developers can pass in a custom function to conveniently specify which layers should be skipped. For example, for the picodet_m_320 model we skip the last 4 convolution layers and the convolutions in 6 se-block layers; the custom function is as follows:
+
+```python
+NORMS_ALL = [ 'BatchNorm', 'GroupNorm', 'LayerNorm', 'SpectralNorm', 'BatchNorm1D',
+ 'BatchNorm2D', 'BatchNorm3D', 'InstanceNorm1D', 'InstanceNorm2D',
+ 'InstanceNorm3D', 'SyncBatchNorm', 'LocalResponseNorm' ]
+
+def skip_params_self(model):
+ skip_params = set()
+ for _, sub_layer in model.named_sublayers():
+ if type(sub_layer).__name__.split('.')[-1] in NORMS_ALL:
+ skip_params.add(sub_layer.full_name())
+ for param in sub_layer.parameters(include_sublayers=False):
+ cond_is_conv1x1 = len(param.shape) == 4 and param.shape[2] == 1 and param.shape[3] == 1
+ cond_is_head_m = cond_is_conv1x1 and param.shape[0] == 112 and param.shape[1] == 128
+ cond_is_se_block_m = param.name.split('.')[0] in ['conv2d_17', 'conv2d_18', 'conv2d_56', 'conv2d_57', 'conv2d_75', 'conv2d_76']
+ if not cond_is_conv1x1 or cond_is_head_m or cond_is_se_block_m:
+ skip_params.add(param.name)
+ return skip_params
+```
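As a quick sanity check of the 1x1-convolution condition used above, here is a standalone sketch (the helper name `is_conv1x1` is ours, not part of the PaddleSlim API) restating `cond_is_conv1x1`: only 4-D weights whose two spatial dimensions are both 1 qualify for sparsification.

```python
# Minimal re-statement of cond_is_conv1x1 above: a parameter is a 1x1
# convolution weight iff it is 4-D and both spatial dims equal 1.
def is_conv1x1(shape):
    return len(shape) == 4 and shape[2] == 1 and shape[3] == 1

print(is_conv1x1([96, 96, 1, 1]))  # True  -> sparsified
print(is_conv1x1([96, 96, 3, 3]))  # False -> kept dense
print(is_conv1x1([96, 96]))        # False -> linear weight, kept dense
```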
+
+## 6. Training
+The core unstructured-sparsity functionality is embedded into training through API calls, so if you have no finer-grained requirements you can start training directly with the command in 6.1. To help you modify and adapt the code to your own needs, a more detailed walkthrough is provided in 6.2.
+
+### 6.1 Direct Use
+```bash
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python3.7 -m paddle.distributed.launch --log_dir=log_test --gpus 0,1,2,3 tools/train.py -c configs/picodet/pruner/picodet_m_320_coco_pruner.yml --slim_config configs/slim/prune/picodet_m_unstructured_prune_75.yml --eval
+```
+
+### 6.2 Detailed Walkthrough
+- Customizing the sparsification scope: see Section 5 of this tutorial.
+- How to add the 4 lines of code required for sparsity training:
+
+```python
+# after constructing model and before training
+
+# Pruner Step1: configs
+configs = {
+ 'pruning_strategy': 'gmp',
+ 'stable_iterations': self.stable_epochs * steps_per_epoch,
+ 'pruning_iterations': self.pruning_epochs * steps_per_epoch,
+ 'tunning_iterations': self.tunning_epochs * steps_per_epoch,
+ 'resume_iteration': 0,
+ 'pruning_steps': self.pruning_steps,
+ 'initial_ratio': self.initial_ratio,
+}
+
+# Pruner Step2: construct a pruner object
+self.pruner = GMPUnstructuredPruner(
+ model,
+ ratio=self.cfg.ratio,
+ skip_params_func=skip_params_self, # Pass this only when you define your own skip_params function; the following argument (skip_params_type) is then ignored.
+ skip_params_type=self.cfg.skip_params_type,
+ local_sparsity=True,
+ configs=configs)
+
+# training
+for epoch_id in range(self.start_epoch, self.cfg.epoch):
+ model.train()
+ for step_id, data in enumerate(self.loader):
+ # model forward
+ outputs = model(data)
+ loss = outputs['loss']
+ # model backward
+ loss.backward()
+ self.optimizer.step()
+ self.optimizer.clear_grad() # reset gradients before the next iteration
+
+ # Pruner Step3: step during training
+ self.pruner.step()
+
+ # Pruner Step4: save the sparse model
+ self.pruner.update_params()
+ # model-saving API
+```
+
+## 7. Evaluation and Inference Deployment
+This part is largely the same as in the PicoDet docs; the only difference is that when converting to a PaddleLite model, you need to add one input argument (sparse_model):
+
+```bash
+paddle_lite_opt --model_dir=inference_model/picodet_m_320_coco --valid_targets=arm --optimize_out=picodet_m_320_coco_fp32_sparse --sparse_model=True
+```
+
+**Note:** sparse inference currently supports PaddleLite FP32 and INT8 models only, so do not enable the FP16 switch when running the command above.
+
+## 8. Sparsity Results
+We trained FP32 PicoDet-m models at 75% and 85% sparsity and measured inference latency on a Snapdragon 835 device; the results are listed in the table below. In particular:
+- For the m model at 75% sparsity, mAP drops by 1.5 while inference speeds up by 34%-58%.
+- Also for the m model (at 85% sparsity), single-thread latency, mAP, and model size all beat the s model, while 4-thread latency is roughly on par.
+
+
+| Model | Input size | Sparsity | mAP<sup>val<br>0.5:0.95</sup> | Size<br>(MB) | Latency single-thread<br>[Lite](#latency) (ms) | Speed-up single-thread | Latency 4-thread<br>[Lite](#latency) (ms) | Speed-up 4-thread | Download | Log | Config |
+| :-------- | :--------: | :--------: | :---------------------: | :----------------: | :----------------: | :----------------: | :---------------: | :---------------: | :-----------------------------: | :-----------------------------: | :----------------------------------------: |
+| PicoDet-m-1.0 | 320*320 | 0 | 30.9 | 8.9 | 127 | 0 | 43 | 0 | [model](https://paddledet.bj.bcebos.com/models/picodet_m_320_coco.pdparams)| [log](https://paddledet.bj.bcebos.com/logs/train_picodet_m_320_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.3/configs/picodet/picodet_m_320_coco.yml)|
+| PicoDet-m-1.0 | 320*320 | 75% | 29.4 | 5.6 | **80** | 58% | **32** | 34% | [model](https://paddledet.bj.bcebos.com/models/slim/picodet_m_320__coco_sparse_75.pdparams)| [log](https://paddledet.bj.bcebos.com/logs/train_picodet_m_320__coco_sparse_75.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/configs/slim/prune/picodet_m_unstructured_prune_75.yml)|
+| PicoDet-s-1.0 | 320*320 | 0 | 27.1 | 4.6 | 68 | 0 | 26 | 0 | [model](https://paddledet.bj.bcebos.com/models/picodet_s_320_coco.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_picodet_s_320_coco.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.3/configs/picodet/picodet_s_320_coco.yml)|
+| PicoDet-m-1.0 | 320*320 | 85% | 27.6 | 4.1 | **65** | 96% | **27** | 59% | [model](https://paddledet.bj.bcebos.com/models/slim/picodet_m_320__coco_sparse_85.pdparams) | [log](https://paddledet.bj.bcebos.com/logs/train_picodet_m_320__coco_sparse_85.log) | [config](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/configs/slim/prune/picodet_m_unstructured_prune_85.yml)|
+
+**Note:**
+- The model sizes above are **deployment model sizes**, i.e. the sizes of the *.nb files produced by PaddleLite conversion.
+- The speed-up columns are computed as the percentage increase in FPS, i.e.: $(dense\_latency - sparse\_latency) / sparse\_latency$
+- During the sparsity training above, we added one extra data augmentation to _base_/picodet_320_reader.yml, shown below. Without it, the expected mAP drop is negligible (<0.1), and it has no effect on speed or model size.
+```yaml
+worker_num: 6
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomCrop: {}
+ - RandomFlip: {prob: 0.5}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomDistort: {}
+ batch_transforms:
+etc.
+```
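The speed-up definition in the note above can be made concrete with a small sketch (plain Python; the latency values are taken from the PicoDet-m 75%-sparsity rows of the table, and the helper name is ours):

```python
# Speed-up as defined above: (dense_latency - sparse_latency) / sparse_latency,
# reported as a truncated percentage, matching the table.
def speed_up_pct(dense_ms, sparse_ms):
    return int((dense_ms - sparse_ms) / sparse_ms * 100)

print(speed_up_pct(127, 80))  # 58  (single-thread, 75% sparsity)
print(speed_up_pct(43, 32))   # 34  (4-thread, 75% sparsity)
```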
diff --git a/PaddleDetection-release-2.6/configs/picodet/legacy_model/pruner/optimizer_300e_pruner.yml b/PaddleDetection-release-2.6/configs/picodet/legacy_model/pruner/optimizer_300e_pruner.yml
new file mode 100644
index 0000000000000000000000000000000000000000..064d5623372bc8a7122bfc073bd289edc2a0b1b5
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/legacy_model/pruner/optimizer_300e_pruner.yml
@@ -0,0 +1,18 @@
+epoch: 300
+
+LearningRate:
+ base_lr: 0.15
+ schedulers:
+ - !CosineDecay
+ max_epochs: 300
+ - !LinearWarmup
+ start_factor: 1.0
+ steps: 34350
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.00004
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/picodet/legacy_model/pruner/picodet_m_320_coco_pruner.yml b/PaddleDetection-release-2.6/configs/picodet/legacy_model/pruner/picodet_m_320_coco_pruner.yml
new file mode 100644
index 0000000000000000000000000000000000000000..cf55a882ead0739c5859c7f335cbeb0d20f6415c
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/legacy_model/pruner/picodet_m_320_coco_pruner.yml
@@ -0,0 +1,13 @@
+_BASE_: [
+ '../../../datasets/coco_detection.yml',
+ '../../../runtime.yml',
+ '../_base_/picodet_esnet.yml',
+ './optimizer_300e_pruner.yml',
+ '../_base_/picodet_320_reader.yml',
+]
+
+weights: output/picodet_m_320_coco/model_final
+find_unused_parameters: True
+use_ema: true
+cycle_epoch: 40
+snapshot_epoch: 10
diff --git a/PaddleDetection-release-2.6/configs/picodet/picodet_l_320_coco_lcnet.yml b/PaddleDetection-release-2.6/configs/picodet/picodet_l_320_coco_lcnet.yml
new file mode 100644
index 0000000000000000000000000000000000000000..c9225ff30f563877c8870dfaefdefb46f50effd7
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/picodet_l_320_coco_lcnet.yml
@@ -0,0 +1,45 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/picodet_v2.yml',
+ '_base_/optimizer_300e.yml',
+ '_base_/picodet_320_reader.yml',
+]
+
+pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x2_0_pretrained.pdparams
+weights: output/picodet_l_320_coco/best_model
+find_unused_parameters: True
+use_ema: true
+epoch: 250
+snapshot_epoch: 10
+
+LCNet:
+ scale: 2.0
+ feature_maps: [3, 4, 5]
+
+LCPAN:
+ out_channels: 160
+
+PicoHeadV2:
+ conv_feat:
+ name: PicoFeat
+ feat_in: 160
+ feat_out: 160
+ num_convs: 4
+ num_fpn_stride: 4
+ norm_type: bn
+ share_cls_reg: True
+ use_se: True
+ feat_in_chan: 160
+
+LearningRate:
+ base_lr: 0.12
+ schedulers:
+ - !CosineDecay
+ max_epochs: 300
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 300
+
+TrainReader:
+ batch_size: 24
diff --git a/PaddleDetection-release-2.6/configs/picodet/picodet_l_416_coco_lcnet.yml b/PaddleDetection-release-2.6/configs/picodet/picodet_l_416_coco_lcnet.yml
new file mode 100644
index 0000000000000000000000000000000000000000..f508e21d7ea3ddc518b4618873d78b56625bb93f
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/picodet_l_416_coco_lcnet.yml
@@ -0,0 +1,45 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/picodet_v2.yml',
+ '_base_/optimizer_300e.yml',
+ '_base_/picodet_416_reader.yml',
+]
+
+pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x2_0_pretrained.pdparams
+weights: output/picodet_l_320_coco/best_model
+find_unused_parameters: True
+use_ema: true
+epoch: 250
+snapshot_epoch: 10
+
+LCNet:
+ scale: 2.0
+ feature_maps: [3, 4, 5]
+
+LCPAN:
+ out_channels: 160
+
+PicoHeadV2:
+ conv_feat:
+ name: PicoFeat
+ feat_in: 160
+ feat_out: 160
+ num_convs: 4
+ num_fpn_stride: 4
+ norm_type: bn
+ share_cls_reg: True
+ use_se: True
+ feat_in_chan: 160
+
+LearningRate:
+ base_lr: 0.12
+ schedulers:
+ - name: CosineDecay
+ max_epochs: 300
+ - name: LinearWarmup
+ start_factor: 0.1
+ steps: 300
+
+TrainReader:
+ batch_size: 24
diff --git a/PaddleDetection-release-2.6/configs/picodet/picodet_l_640_coco_lcnet.yml b/PaddleDetection-release-2.6/configs/picodet/picodet_l_640_coco_lcnet.yml
new file mode 100644
index 0000000000000000000000000000000000000000..2fadd6a25b9ebe1e598cf10cbf01af23eefc14d4
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/picodet_l_640_coco_lcnet.yml
@@ -0,0 +1,45 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/picodet_v2.yml',
+ '_base_/optimizer_300e.yml',
+ '_base_/picodet_640_reader.yml',
+]
+
+pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x2_0_pretrained.pdparams
+weights: output/picodet_l_320_coco/best_model
+find_unused_parameters: True
+use_ema: true
+epoch: 200
+snapshot_epoch: 10
+
+LCNet:
+ scale: 2.0
+ feature_maps: [3, 4, 5]
+
+LCPAN:
+ out_channels: 160
+
+PicoHeadV2:
+ conv_feat:
+ name: PicoFeat
+ feat_in: 160
+ feat_out: 160
+ num_convs: 4
+ num_fpn_stride: 4
+ norm_type: bn
+ share_cls_reg: True
+ use_se: True
+ feat_in_chan: 160
+
+LearningRate:
+ base_lr: 0.06
+ schedulers:
+ - !CosineDecay
+ max_epochs: 300
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 300
+
+TrainReader:
+ batch_size: 12
diff --git a/PaddleDetection-release-2.6/configs/picodet/picodet_m_320_coco_lcnet.yml b/PaddleDetection-release-2.6/configs/picodet/picodet_m_320_coco_lcnet.yml
new file mode 100644
index 0000000000000000000000000000000000000000..bd188c2188f73400e2423629aa8856137aa5082c
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/picodet_m_320_coco_lcnet.yml
@@ -0,0 +1,25 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/picodet_v2.yml',
+ '_base_/optimizer_300e.yml',
+ '_base_/picodet_320_reader.yml',
+]
+
+weights: output/picodet_m_320_coco/best_model
+find_unused_parameters: True
+use_ema: true
+epoch: 300
+snapshot_epoch: 10
+
+TrainReader:
+ batch_size: 48
+
+LearningRate:
+ base_lr: 0.24
+ schedulers:
+ - name: CosineDecay
+ max_epochs: 300
+ - name: LinearWarmup
+ start_factor: 0.1
+ steps: 300
diff --git a/PaddleDetection-release-2.6/configs/picodet/picodet_m_416_coco_lcnet.yml b/PaddleDetection-release-2.6/configs/picodet/picodet_m_416_coco_lcnet.yml
new file mode 100644
index 0000000000000000000000000000000000000000..c224f4e0975f04a7e76c0d80c511b730c02175d4
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/picodet_m_416_coco_lcnet.yml
@@ -0,0 +1,25 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/picodet_v2.yml',
+ '_base_/optimizer_300e.yml',
+ '_base_/picodet_416_reader.yml',
+]
+
+weights: output/picodet_m_416_coco/best_model
+find_unused_parameters: True
+use_ema: true
+epoch: 250
+snapshot_epoch: 10
+
+TrainReader:
+ batch_size: 48
+
+LearningRate:
+ base_lr: 0.24
+ schedulers:
+ - name: CosineDecay
+ max_epochs: 300
+ - name: LinearWarmup
+ start_factor: 0.1
+ steps: 300
diff --git a/PaddleDetection-release-2.6/configs/picodet/picodet_s_320_coco_lcnet.yml b/PaddleDetection-release-2.6/configs/picodet/picodet_s_320_coco_lcnet.yml
new file mode 100644
index 0000000000000000000000000000000000000000..c9fb52f320239cccf30257fe695e82fb5bb26121
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/picodet_s_320_coco_lcnet.yml
@@ -0,0 +1,45 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/picodet_v2.yml',
+ '_base_/optimizer_300e.yml',
+ '_base_/picodet_320_reader.yml',
+]
+
+pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_75_pretrained.pdparams
+weights: output/picodet_s_320_coco/best_model
+find_unused_parameters: True
+use_ema: true
+epoch: 300
+snapshot_epoch: 10
+
+LCNet:
+ scale: 0.75
+ feature_maps: [3, 4, 5]
+
+LCPAN:
+ out_channels: 96
+
+PicoHeadV2:
+ conv_feat:
+ name: PicoFeat
+ feat_in: 96
+ feat_out: 96
+ num_convs: 2
+ num_fpn_stride: 4
+ norm_type: bn
+ share_cls_reg: True
+ use_se: True
+ feat_in_chan: 96
+
+TrainReader:
+ batch_size: 64
+
+LearningRate:
+ base_lr: 0.32
+ schedulers:
+ - !CosineDecay
+ max_epochs: 300
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 300
diff --git a/PaddleDetection-release-2.6/configs/picodet/picodet_s_416_coco_lcnet.yml b/PaddleDetection-release-2.6/configs/picodet/picodet_s_416_coco_lcnet.yml
new file mode 100644
index 0000000000000000000000000000000000000000..ed00b479b3729e4202e190e945d2c5ddce0f7f4a
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/picodet_s_416_coco_lcnet.yml
@@ -0,0 +1,45 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/picodet_v2.yml',
+ '_base_/optimizer_300e.yml',
+ '_base_/picodet_416_reader.yml',
+]
+
+pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_75_pretrained.pdparams
+weights: output/picodet_s_416_coco/best_model
+find_unused_parameters: True
+use_ema: true
+epoch: 300
+snapshot_epoch: 10
+
+LCNet:
+ scale: 0.75
+ feature_maps: [3, 4, 5]
+
+LCPAN:
+ out_channels: 96
+
+PicoHeadV2:
+ conv_feat:
+ name: PicoFeat
+ feat_in: 96
+ feat_out: 96
+ num_convs: 2
+ num_fpn_stride: 4
+ norm_type: bn
+ share_cls_reg: True
+ use_se: True
+ feat_in_chan: 96
+
+TrainReader:
+ batch_size: 48
+
+LearningRate:
+ base_lr: 0.24
+ schedulers:
+ - !CosineDecay
+ max_epochs: 300
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 300
diff --git a/PaddleDetection-release-2.6/configs/picodet/picodet_s_416_coco_npu.yml b/PaddleDetection-release-2.6/configs/picodet/picodet_s_416_coco_npu.yml
new file mode 100644
index 0000000000000000000000000000000000000000..761cfde11334b42c993682d981c4ea28c44da092
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/picodet_s_416_coco_npu.yml
@@ -0,0 +1,106 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/picodet_v2.yml',
+ '_base_/optimizer_300e.yml',
+]
+
+pretrain_weights: https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/legendary_models/PPLCNet_x0_75_pretrained.pdparams
+weights: output/picodet_s_416_coco/best_model
+find_unused_parameters: True
+keep_best_weight: True
+use_ema: True
+epoch: 300
+snapshot_epoch: 10
+
+PicoDet:
+ backbone: LCNet
+ neck: CSPPAN
+ head: PicoHeadV2
+
+LCNet:
+ scale: 0.75
+ feature_maps: [3, 4, 5]
+ act: relu6
+
+CSPPAN:
+ out_channels: 96
+ use_depthwise: True
+ num_csp_blocks: 1
+ num_features: 4
+ act: relu6
+
+PicoHeadV2:
+ conv_feat:
+ name: PicoFeat
+ feat_in: 96
+ feat_out: 96
+ num_convs: 4
+ num_fpn_stride: 4
+ norm_type: bn
+ share_cls_reg: True
+ use_se: True
+ act: relu6
+ feat_in_chan: 96
+ act: relu6
+
+LearningRate:
+ base_lr: 0.2
+ schedulers:
+ - !CosineDecay
+ max_epochs: 300
+ min_lr_ratio: 0.08
+ last_plateau_epochs: 30
+ - !ExpWarmup
+ epochs: 2
+
+worker_num: 6
+eval_height: &eval_height 416
+eval_width: &eval_width 416
+eval_size: &eval_size [*eval_height, *eval_width]
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - Mosaic:
+ prob: 0.6
+ input_dim: [640, 640]
+ degrees: [-10, 10]
+ scale: [0.1, 2.0]
+ shear: [-2, 2]
+ translate: [-0.1, 0.1]
+ enable_mixup: True
+ - AugmentHSV: {is_bgr: False, hgain: 5, sgain: 30, vgain: 30}
+ - RandomFlip: {prob: 0.5}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [320, 352, 384, 416, 448, 480, 512], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1], is_scale: True}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 40
+ shuffle: true
+ drop_last: true
+ mosaic_epoch: 180
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
+ - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1], is_scale: True}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 8
+ shuffle: false
+
+
+TestReader:
+ inputs_def:
+ image_shape: [1, 3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
+ - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1], is_scale: True}
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/picodet/picodet_xs_320_coco_lcnet.yml b/PaddleDetection-release-2.6/configs/picodet/picodet_xs_320_coco_lcnet.yml
new file mode 100644
index 0000000000000000000000000000000000000000..3b1b75313daa2b0e1bb72474dbf2e2f1ace5ff52
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/picodet_xs_320_coco_lcnet.yml
@@ -0,0 +1,45 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/picodet_v2.yml',
+ '_base_/optimizer_300e.yml',
+ '_base_/picodet_320_reader.yml',
+]
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/LCNet_x0_35_pretrained.pdparams
+weights: output/picodet_xs_320_coco/best_model
+find_unused_parameters: True
+use_ema: true
+epoch: 300
+snapshot_epoch: 10
+
+LCNet:
+ scale: 0.35
+ feature_maps: [3, 4, 5]
+
+LCPAN:
+ out_channels: 96
+
+PicoHeadV2:
+ conv_feat:
+ name: PicoFeat
+ feat_in: 96
+ feat_out: 96
+ num_convs: 2
+ num_fpn_stride: 4
+ norm_type: bn
+ share_cls_reg: True
+ use_se: True
+ feat_in_chan: 96
+
+TrainReader:
+ batch_size: 64
+
+LearningRate:
+ base_lr: 0.32
+ schedulers:
+ - !CosineDecay
+ max_epochs: 300
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 300
diff --git a/PaddleDetection-release-2.6/configs/picodet/picodet_xs_416_coco_lcnet.yml b/PaddleDetection-release-2.6/configs/picodet/picodet_xs_416_coco_lcnet.yml
new file mode 100644
index 0000000000000000000000000000000000000000..8ca47d23a9c6541e9d02aac74fa43d31b8469ed9
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/picodet/picodet_xs_416_coco_lcnet.yml
@@ -0,0 +1,45 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/picodet_v2.yml',
+ '_base_/optimizer_300e.yml',
+ '_base_/picodet_416_reader.yml',
+]
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/LCNet_x0_35_pretrained.pdparams
+weights: output/picodet_xs_416_coco/best_model
+find_unused_parameters: True
+use_ema: true
+epoch: 300
+snapshot_epoch: 10
+
+LCNet:
+ scale: 0.35
+ feature_maps: [3, 4, 5]
+
+LCPAN:
+ out_channels: 96
+
+PicoHeadV2:
+ conv_feat:
+ name: PicoFeat
+ feat_in: 96
+ feat_out: 96
+ num_convs: 2
+ num_fpn_stride: 4
+ norm_type: bn
+ share_cls_reg: True
+ use_se: True
+ feat_in_chan: 96
+
+TrainReader:
+ batch_size: 56
+
+LearningRate:
+ base_lr: 0.28
+ schedulers:
+ - !CosineDecay
+ max_epochs: 300
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 300
diff --git a/PaddleDetection-release-2.6/configs/pose3d/README.md b/PaddleDetection-release-2.6/configs/pose3d/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..0b9dec7e9c58740bede2de59a5163fa59528094b
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/pose3d/README.md
@@ -0,0 +1,157 @@
+简体中文
+
+# 3D Pose Models
+
+## Contents
+
+- [Introduction](#introduction)
+- [Recommended Models](#recommended-models)
+- [Quick Start](#quick-start)
+  - [Installation](#1-installation)
+  - [Data Preparation](#2-data-preparation)
+  - [Training and Evaluation](#3-training-and-evaluation)
+    - [Training](#training)
+    - [Evaluation](#evaluation)
+    - [Inference](#inference)
+  - [Usage Notes](#4-usage-notes)
+
+## Introduction
+
+PaddleDetection provides two 3D pose (sparse keypoint) algorithms: Metro3D, a large model for server-side deployment, and TinyPose3D for mobile devices. Metro3D is a sparse-keypoint adaptation of [End-to-End Human Pose and Mesh Reconstruction with Transformers](https://arxiv.org/abs/2012.09760), while TinyPose3D modifies TinyPose to output 3D keypoints.
+
+## Recommended Models
+
+| Model | Scenario | Human3.6M accuracy (14 keypoints) | Human3.6M accuracy (17 keypoints) | Download |
+|:--:|:--:|:--:|:--:|:--:|
+| Metro3D | Server | 56.014 | 46.619 | [metro3d_24kpts.pdparams](https://bj.bcebos.com/v1/paddledet/models/pose3d/metro3d_24kpts.pdparams) |
+| TinyPose3D | Mobile | 86.381 | 71.223 | [tinypose3d_human36m.pdparams](https://bj.bcebos.com/v1/paddledet/models/pose3d/tinypose3d_human36M.pdparams) |
+
+Notes:
+1. The training data is based on the training data used in [MeshTransformer](https://github.com/microsoft/MeshTransformer).
+2. Accuracy is evaluated on 14 keypoints, following MeshTransformer.
+
+## Quick Start
+
+### 1. Installation
+
+Follow the PaddleDetection [installation guide](../../docs/tutorials/INSTALL_cn.md) to install PaddlePaddle and PaddleDetection.
+
+### 2. Data Preparation
+
+Our training data is composed of COCO, Human3.6M, hr-lspet, PoseTrack3D, and MPII.
+
+2.1 Download links for the training data:
+
+[coco](https://bj.bcebos.com/v1/paddledet/data/coco.tar)
+
+[human3.6m](https://bj.bcebos.com/v1/paddledet/data/pose3d/human3.6m.tar.gz)
+
+[lspet+posetrack+mpii](https://bj.bcebos.com/v1/paddledet/data/pose3d/pose3d_others.tar.gz)
+
+[annotation files](https://bj.bcebos.com/v1/paddledet/data/pose3d/pose3d.tar.gz)
+
+2.2 After downloading, place the data under the repo directory with the following structure:
+
+```
+${REPO_DIR}
+|-- dataset
+| |-- traindata
+| |-- coco
+| |-- hr-lspet
+| |-- human3.6m
+| |-- mpii
+| |-- posetrack3d
+| \-- pose3d
+| |-- COCO2014-All-ver01.json
+| |-- COCO2014-Part-ver01.json
+| |-- COCO2014-Val-ver10.json
+| |-- Human3.6m_train.json
+| |-- Human3.6m_valid.json
+| |-- LSPet_train_ver10.json
+| |-- LSPet_test_ver10.json
+| |-- MPII_ver01.json
+| |-- PoseTrack_ver01.json
+|-- ppdet
+|-- deploy
+|-- demo
+|-- README_cn.md
+|-- README_en.md
+|-- ...
+```
+
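Before launching training, the layout above can be sanity-checked with a few lines of Python (a hedged sketch; `check_layout` is an illustrative helper, not part of PaddleDetection):

```python
import os

def check_layout(root, required):
    """Return the required paths (relative to root) that are missing."""
    return [p for p in required if not os.path.exists(os.path.join(root, p))]

# A few of the annotation files and image directories from the tree above.
required = [
    "dataset/traindata/pose3d/Human3.6m_train.json",
    "dataset/traindata/pose3d/Human3.6m_valid.json",
    "dataset/traindata/human3.6m",
    "dataset/traindata/coco",
]
print(check_layout(".", required))  # empty list means the layout is complete
```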
+
+### 3. Training and Evaluation
+
+#### Training
+
+```shell
+# single-GPU training
+CUDA_VISIBLE_DEVICES=0 python3 tools/train.py -c configs/pose3d/metro3d_24kpts.yml
+
+# multi-GPU training
+CUDA_VISIBLE_DEVICES=0,1,2,3 python3 -m paddle.distributed.launch tools/train.py -c configs/pose3d/metro3d_24kpts.yml
+```
+
+#### Evaluation
+
+```shell
+# single-GPU evaluation
+CUDA_VISIBLE_DEVICES=0 python3 tools/eval.py -c configs/pose3d/metro3d_24kpts.yml -o weights=output/metro3d_24kpts/best_model.pdparams
+
+# to save only the evaluation predictions, add --save_prediction_only; results are written to output/keypoints_results.json by default
+CUDA_VISIBLE_DEVICES=0 python3 tools/eval.py -c configs/pose3d/metro3d_24kpts.yml -o weights=output/metro3d_24kpts/best_model.pdparams --save_prediction_only
+
+# multi-GPU evaluation
+CUDA_VISIBLE_DEVICES=0,1,2,3 python3 -m paddle.distributed.launch tools/eval.py -c configs/pose3d/metro3d_24kpts.yml -o weights=output/metro3d_24kpts/best_model.pdparams
+```
+
+#### Inference
+
+```shell
+# render three views of the predicted 3D pose for an input image
+CUDA_VISIBLE_DEVICES=0 python3 tools/infer.py -c configs/pose3d/metro3d_24kpts.yml -o weights=./output/metro3d_24kpts/best_model.pdparams --infer_img=./demo/hrnet_demo.jpg --draw_threshold=0.5
+```
+
+### 4. Usage Notes
+
+3D pose is harder to apply in practice than 2D pose, mainly for two reasons:
+
+- 1) annotating training data is expensive;
+
+- 2) a single image is inherently ambiguous about depth.
+
+Because of (1), training data usually covers only a small set of motions, which limits model generalization. Because of (2), the prediction error along the depth (z) axis is usually larger than along x and y, which easily causes noticeable jitter across frames; the larger the annotation error, the more pronounced this problem becomes.
+
+Solving these two problems creates two conflicting requirements: 1) better generalization needs more annotated data; 2) lower prediction error needs higher-precision annotations. Since 3D pose annotation is inherently difficult, higher precision means higher cost and therefore fewer labels.
+
+Our solution is therefore:
+
+- 1) Use automatic fitting to generate a large amount of low-precision data, and train a first-version model with broad generalization on it.
+
+- 2) Annotate a small amount of high-precision data for the target motions and fine-tune the first-version model, obtaining a model that is accurate on those motions while largely retaining the first model's generalization.
+
+Our released training data includes a large amount of automatically generated low-precision data. Users can start from a model trained on it, annotate their own high-precision target-motion data, and fine-tune to obtain a stable, well-performing model.
+
+A demo of the model trained on high-precision medical rehabilitation data is shown in this [HD video](https://user-images.githubusercontent.com/31800336/218949226-22e6ab25-facb-4cc6-8eca-38d4bfd973e5.mp4)
+
+
+## Citation
+
+```
+@inproceedings{lin2021end-to-end,
+author = {Lin, Kevin and Wang, Lijuan and Liu, Zicheng},
+title = {End-to-End Human Pose and Mesh Reconstruction with Transformers},
+booktitle = {CVPR},
+year = {2021},
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/pose3d/metro3d_24kpts.yml b/PaddleDetection-release-2.6/configs/pose3d/metro3d_24kpts.yml
new file mode 100644
index 0000000000000000000000000000000000000000..b8ea08a230b2b0e25b7d7859c02377dfc149411f
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/pose3d/metro3d_24kpts.yml
@@ -0,0 +1,144 @@
+use_gpu: True
+log_iter: 20
+save_dir: output
+snapshot_epoch: 3
+weights: output/metro_modified/model_final
+epoch: 50
+metric: Pose3DEval
+num_classes: 1
+train_height: &train_height 224
+train_width: &train_width 224
+trainsize: &trainsize [*train_width, *train_height]
+num_joints: &num_joints 24
+
+#####model
+architecture: METRO_Body
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/Trunc_HRNet_W32_C_pretrained.pdparams
+
+METRO_Body:
+ backbone: HRNet
+ trans_encoder: TransEncoder
+ num_joints: *num_joints
+ loss: Pose3DLoss
+
+HRNet:
+ width: 32
+ freeze_at: -1
+ freeze_norm: False
+ norm_momentum: 0.1
+ downsample: True
+
+TransEncoder:
+ vocab_size: 30522
+ num_hidden_layers: 4
+ num_attention_heads: 4
+ position_embeddings_size: 512
+ intermediate_size: 3072
+ input_feat_dim: [2048, 512, 128]
+ hidden_feat_dim: [1024, 256, 128]
+ attention_probs_dropout_prob: 0.1
+ fc_dropout_prob: 0.1
+ act_fn: 'gelu'
+ output_attentions: False
+ output_hidden_feats: False
+
+Pose3DLoss:
+ weight_3d: 1.0
+ weight_2d: 0.0
+
+#####optimizer
+LearningRate:
+ base_lr: 0.0001
+ schedulers:
+ - !CosineDecay
+ max_epochs: 52
+ - !LinearWarmup
+ start_factor: 0.01
+ steps: 2000
+
+
+OptimizerBuilder:
+ clip_grad_by_norm: 0.2
+ optimizer:
+ type: Adam
+ regularizer:
+ factor: 0.0
+ type: L2
+
+
+#####data
+TrainDataset:
+ !Pose3DDataset
+ dataset_dir: dataset/traindata/
+ image_dirs: ["human3.6m", "posetrack3d", "hr-lspet", "hr-lspet", "mpii/images", "coco/train2017"]
+ anno_list: ["pose3d/Human3.6m_train.json", "pose3d/PoseTrack_ver01.json", "pose3d/LSPet_train_ver10.json", "pose3d/LSPet_test_ver10.json", "pose3d/MPII_ver01.json", "pose3d/COCO2014-All-ver01.json"]
+ num_joints: *num_joints
+ test_mode: False
+
+EvalDataset:
+ !Pose3DDataset
+ dataset_dir: dataset/traindata/
+ image_dirs: ["human3.6m"]
+ anno_list: ["pose3d/Human3.6m_valid.json"]
+ num_joints: *num_joints
+ test_mode: True
+
+TestDataset:
+ !ImageFolder
+ anno_path: dataset/traindata/coco/keypoint_imagelist.txt
+
+worker_num: 4
+global_mean: &global_mean [0.485, 0.456, 0.406]
+global_std: &global_std [0.229, 0.224, 0.225]
+TrainReader:
+ sample_transforms:
+ - SinglePoseAffine:
+ trainsize: *trainsize
+ rotate: [1.0, 30] #[prob, rotate range]
+ scale: [1.0, 0.25] #[prob, scale range]
+ - FlipPose:
+ flip_prob: 0.5
+ img_res: *train_width
+ num_joints: *num_joints
+ - NoiseJitter:
+ noise_factor: 0.4
+ batch_transforms:
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 64
+ shuffle: true
+ drop_last: true
+
+EvalReader:
+ sample_transforms:
+ - SinglePoseAffine:
+ trainsize: *trainsize
+ rotate: [0., 30]
+ scale: [0., 0.25]
+ batch_transforms:
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 16
+ shuffle: false
+ drop_last: false
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *train_height, *train_width]
+ sample_transforms:
+ - Decode: {}
+ - TopDownEvalAffine:
+ trainsize: *trainsize
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 1
+ fuse_normalize: false # whether to fuse the normalize layer into the model when exporting
diff --git a/PaddleDetection-release-2.6/configs/pose3d/tinypose3d_human36M.yml b/PaddleDetection-release-2.6/configs/pose3d/tinypose3d_human36M.yml
new file mode 100644
index 0000000000000000000000000000000000000000..05c6656d145a7bb4af14bcc0a1781cf54de552b1
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/pose3d/tinypose3d_human36M.yml
@@ -0,0 +1,122 @@
+use_gpu: true
+log_iter: 5
+save_dir: output
+snapshot_epoch: 1
+weights: output/tinypose3d_human36M/model_final
+epoch: 220
+num_joints: &num_joints 24
+pixel_std: &pixel_std 200
+metric: Pose3DEval
+num_classes: 1
+train_height: &train_height 128
+train_width: &train_width 128
+trainsize: &trainsize [*train_width, *train_height]
+
+#####model
+architecture: TinyPose3DHRHeatmapNet
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/keypoint/tinypose_128x96.pdparams
+
+TinyPose3DHRHeatmapNet:
+ backbone: LiteHRNet
+ post_process: HR3DNetPostProcess
+ num_joints: *num_joints
+ width: &width 40
+ loss: Pose3DLoss
+
+LiteHRNet:
+ network_type: wider_naive
+ freeze_at: -1
+ freeze_norm: false
+ return_idx: [0]
+
+Pose3DLoss:
+ weight_3d: 1.0
+ weight_2d: 0.0
+
+#####optimizer
+LearningRate:
+ base_lr: 0.0001
+ schedulers:
+ - !PiecewiseDecay
+ milestones: [17, 21]
+ gamma: 0.1
+ - !LinearWarmup
+ start_factor: 0.01
+ steps: 1000
+
+OptimizerBuilder:
+ optimizer:
+ type: Adam
+ regularizer:
+ factor: 0.0
+ type: L2
+
+
+#####data
+TrainDataset:
+ !Pose3DDataset
+ dataset_dir: dataset/traindata/
+ image_dirs: ["human3.6m"]
+ anno_list: ['pose3d/Human3.6m_train.json']
+ num_joints: *num_joints
+ test_mode: False
+
+EvalDataset:
+ !Pose3DDataset
+ dataset_dir: dataset/traindata/
+ image_dirs: ["human3.6m"]
+ anno_list: ['pose3d/Human3.6m_valid.json']
+ num_joints: *num_joints
+ test_mode: True
+
+TestDataset:
+ !ImageFolder
+ anno_path: dataset/coco/keypoint_imagelist.txt
+
+worker_num: 4
+global_mean: &global_mean [0.485, 0.456, 0.406]
+global_std: &global_std [0.229, 0.224, 0.225]
+TrainReader:
+ sample_transforms:
+ - SinglePoseAffine:
+ trainsize: *trainsize
+ rotate: [0.5, 30] #[prob, rotate range]
+ scale: [0.5, 0.25] #[prob, scale range]
+ batch_transforms:
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 128
+ shuffle: true
+ drop_last: true
+
+EvalReader:
+ sample_transforms:
+ - SinglePoseAffine:
+ trainsize: *trainsize
+ rotate: [0., 30]
+ scale: [0., 0.25]
+ batch_transforms:
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 128
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *train_height, *train_width]
+ sample_transforms:
+ - Decode: {}
+ - TopDownEvalAffine:
+ trainsize: *trainsize
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 1
+ fuse_normalize: false
diff --git a/PaddleDetection-release-2.6/configs/pose3d/tinypose3d_medical_multi_frames.yml b/PaddleDetection-release-2.6/configs/pose3d/tinypose3d_medical_multi_frames.yml
new file mode 100644
index 0000000000000000000000000000000000000000..aad7a405571b4a6fa89714c00e6e39664483799a
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/pose3d/tinypose3d_medical_multi_frames.yml
@@ -0,0 +1,138 @@
+use_gpu: true
+log_iter: 5
+save_dir: output
+snapshot_epoch: 1
+weights: output/tinypose_3D_multi_frames/model_final
+epoch: 420
+num_joints: &num_joints 24
+pixel_std: &pixel_std 200
+metric: Pose3DEval
+num_classes: 1
+train_height: &train_height 128
+train_width: &train_width 96
+trainsize: &trainsize [*train_width, *train_height]
+hmsize: &hmsize [24, 32]
+flip_perm: &flip_perm [[1, 2], [4, 5], [7, 8], [10, 11], [13, 14], [16, 17], [18, 19], [20, 21], [22, 23]]
+
+
+#####model
+architecture: TinyPose3DHRNet
+pretrain_weights: medical_multi_frames_best_model.pdparams
+
+TinyPose3DHRNet:
+ backbone: LiteHRNet
+ post_process: TinyPose3DPostProcess
+ num_joints: *num_joints
+ width: &width 40
+ loss: KeyPointRegressionMSELoss
+
+LiteHRNet:
+ network_type: wider_naive
+ freeze_at: -1
+ freeze_norm: false
+ return_idx: [0]
+
+KeyPointRegressionMSELoss:
+ reduction: 'mean'
+
+#####optimizer
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !PiecewiseDecay
+ milestones: [17, 21]
+ gamma: 0.1
+ - !LinearWarmup
+ start_factor: 0.01
+ steps: 1000
+
+OptimizerBuilder:
+ optimizer:
+ type: Adam
+ regularizer:
+ factor: 0.0
+ type: L2
+
+#####data
+TrainDataset:
+ !Keypoint3DMultiFramesDataset
+ dataset_dir: "data/medical/multi_frames/train"
+ image_dir: "images"
+ p3d_dir: "joint_pc/player_0"
+ json_path: "json_results/player_0/player_0.json"
+ img_size: *trainsize # w,h
+ num_frames: 6
+
+
+EvalDataset:
+ !Keypoint3DMultiFramesDataset
+ dataset_dir: "data/medical/multi_frames/val"
+ image_dir: "images"
+ p3d_dir: "joint_pc/player_0"
+ json_path: "json_results/player_0/player_0.json"
+ img_size: *trainsize # w,h
+ num_frames: 6
+
+TestDataset:
+ !Keypoint3DMultiFramesDataset
+ dataset_dir: "data/medical/multi_frames/val"
+ image_dir: "images"
+ p3d_dir: "joint_pc/player_0"
+ json_path: "json_results/player_0/player_0.json"
+ img_size: *trainsize # w,h
+ num_frames: 6
+
+worker_num: 4
+global_mean: &global_mean [0.485, 0.456, 0.406]
+global_std: &global_std [0.229, 0.224, 0.225]
+TrainReader:
+ sample_transforms:
+ - CropAndFlipImages:
+ crop_range: [556, 1366]
+ - RandomFlipHalfBody3DTransformImages:
+ scale: 0.25
+ rot: 30
+ num_joints_half_body: 9
+ prob_half_body: 0.3
+ pixel_std: *pixel_std
+ trainsize: *trainsize
+ upper_body_ids: [0, 3, 6, 9, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]
+ flip_pairs: *flip_perm
+ do_occlusion: true
+ - Resize: {interp: 2, target_size: [*train_height,*train_width], keep_ratio: false}
+ batch_transforms:
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - PermuteImages: {}
+ batch_size: 32
+ shuffle: true
+ drop_last: false
+
+EvalReader:
+ sample_transforms:
+ - CropAndFlipImages:
+ crop_range: [556, 1366]
+ - Resize: {interp: 2, target_size: [*train_height,*train_width], keep_ratio: false}
+ batch_transforms:
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - PermuteImages: {}
+ batch_size: 32
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *train_height, *train_width]
+ sample_transforms:
+ - Decode: {}
+ - LetterBoxResize: { target_size: [*train_height,*train_width]}
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 1
+ fuse_normalize: false
diff --git a/PaddleDetection-release-2.6/configs/pose3d/tinypose3d_multi_frames_heatmap.yml b/PaddleDetection-release-2.6/configs/pose3d/tinypose3d_multi_frames_heatmap.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a5893ec9b9135a359a82af7897714b234a79c47c
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/pose3d/tinypose3d_multi_frames_heatmap.yml
@@ -0,0 +1,138 @@
+use_gpu: true
+log_iter: 5
+save_dir: output
+snapshot_epoch: 1
+weights: output/tinypose3d_multi_frames_heatmap/model_final
+epoch: 420
+num_joints: &num_joints 24
+pixel_std: &pixel_std 200
+metric: Pose3DEval
+num_classes: 1
+train_height: &train_height 128
+train_width: &train_width 128
+trainsize: &trainsize [*train_width, *train_height]
+hmsize: &hmsize [24, 32]
+flip_perm: &flip_perm [[1, 2], [4, 5], [7, 8], [10, 11], [13, 14], [16, 17], [18, 19], [20, 21], [22, 23]]
+
+#####model
+architecture: TinyPose3DHRHeatmapNet
+pretrain_weights: medical_multi_frames_best_model.pdparams
+
+TinyPose3DHRHeatmapNet:
+ backbone: LiteHRNet
+ post_process: TinyPosePostProcess
+ num_joints: *num_joints
+ width: &width 40
+ loss: KeyPointRegressionMSELoss
+
+LiteHRNet:
+ network_type: wider_naive
+ freeze_at: -1
+ freeze_norm: false
+ return_idx: [0]
+
+KeyPointRegressionMSELoss:
+ reduction: 'mean'
+
+#####optimizer
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !PiecewiseDecay
+ milestones: [17, 21]
+ gamma: 0.1
+ - !LinearWarmup
+ start_factor: 0.01
+ steps: 1000
+
+OptimizerBuilder:
+ optimizer:
+ type: Adam
+ regularizer:
+ factor: 0.0
+ type: L2
+
+#####data
+TrainDataset:
+ !Keypoint3DMultiFramesDataset
+ dataset_dir: "data/medical/multi_frames/train"
+ image_dir: "images"
+ p3d_dir: "joint_pc/player_0"
+ json_path: "json_results/player_0/player_0.json"
+ img_size: *trainsize # w,h
+ num_frames: 6
+
+
+EvalDataset:
+ !Keypoint3DMultiFramesDataset
+ dataset_dir: "data/medical/multi_frames/val"
+ image_dir: "images"
+ p3d_dir: "joint_pc/player_0"
+ json_path: "json_results/player_0/player_0.json"
+ img_size: *trainsize # w,h
+ num_frames: 6
+
+TestDataset:
+ !Keypoint3DMultiFramesDataset
+ dataset_dir: "data/medical/multi_frames/val"
+ image_dir: "images"
+ p3d_dir: "joint_pc/player_0"
+ json_path: "json_results/player_0/player_0.json"
+ img_size: *trainsize # w,h
+ num_frames: 6
+
+worker_num: 4
+global_mean: &global_mean [0.485, 0.456, 0.406]
+global_std: &global_std [0.229, 0.224, 0.225]
+TrainReader:
+ sample_transforms:
+ - CropAndFlipImages:
+ crop_range: [556, 1366] # crop the black padding on the left/right of the original image while preserving the train_height/train_width aspect ratio
+ - RandomFlipHalfBody3DTransformImages:
+ scale: 0.25
+ rot: 30
+ num_joints_half_body: 9
+ prob_half_body: 0.3
+ pixel_std: *pixel_std
+ trainsize: *trainsize
+ upper_body_ids: [0, 3, 6, 9, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]
+ flip_pairs: *flip_perm
+ do_occlusion: true
+ - Resize: {interp: 2, target_size: [*train_height,*train_width], keep_ratio: false}
+ batch_transforms:
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - PermuteImages: {}
+ batch_size: 1 #32
+ shuffle: true
+ drop_last: false
+
+EvalReader:
+ sample_transforms:
+ - CropAndFlipImages:
+ crop_range: [556, 1366]
+ - Resize: {interp: 2, target_size: [*train_height,*train_width], keep_ratio: false}
+ #- OriginPointTranslationImages: {}
+ batch_transforms:
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - PermuteImages: {}
+ batch_size: 32
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *train_height, *train_width]
+ sample_transforms:
+ - Decode: {}
+ - LetterBoxResize: { target_size: [*train_height,*train_width]}
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 1
+ fuse_normalize: false
diff --git a/PaddleDetection-release-2.6/configs/pphuman/README.md b/PaddleDetection-release-2.6/configs/pphuman/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..a568f120d45549215ad4a4105845b0d50d9f5106
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/pphuman/README.md
@@ -0,0 +1,84 @@
+简体中文 | [English](README.md)
+
+# PP-YOLOE Human Detection Models
+
+The PaddleDetection team provides PP-YOLOE based pedestrian detection models that users can download and use directly. The models used in PP-Human are trained on an in-house business dataset; we also provide CrowdHuman training configs so models can be trained on open data.
+A COCO-format version of the CrowdHuman dataset is available for [download](https://bj.bcebos.com/v1/paddledet/data/crowdhuman.zip), with a single detection category `pedestrian(1)`; the original dataset can be downloaded [here](http://www.crowdhuman.org/download.html).
+
+The deployment versions of these models are used in the [PP-Human](../../deploy/pipeline/) pipeline.
+
+| Model | Dataset | mAP<sup>val<br>0.5:0.95</sup> | mAP<sup>val<br>0.5</sup> | Download | Config |
+|:---------|:-------:|:------:|:------:| :----: | :------:|
+|PP-YOLOE-s| CrowdHuman | 42.5 | 77.9 | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_s_36e_crowdhuman.pdparams) | [config](./ppyoloe_crn_s_36e_crowdhuman.yml) |
+|PP-YOLOE-l| CrowdHuman | 48.0 | 81.9 | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_36e_crowdhuman.pdparams) | [config](./ppyoloe_crn_l_36e_crowdhuman.yml) |
+|PP-YOLOE-s| Business dataset | 53.2 | - | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_s_36e_pphuman.pdparams) | [config](./ppyoloe_crn_s_36e_pphuman.yml) |
+|PP-YOLOE-l| Business dataset | 57.8 | - | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_36e_pphuman.pdparams) | [config](./ppyoloe_crn_l_36e_pphuman.yml) |
+|PP-YOLOE+_t-aux(320)| Business dataset | 45.7 | 81.2 | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_t_auxhead_320_60e_pphuman.pdparams) | [config](./ppyoloe_plus_crn_t_auxhead_320_60e_pphuman.yml) |
+
+
+**Note:**
+- The PP-YOLOE models are trained on 8 GPUs with mixed precision. If the **number of GPUs** or the **batch size** changes, adjust the learning rate following **lr<sub>new</sub> = lr<sub>default</sub> * (batch_size<sub>new</sub> * GPU_number<sub>new</sub>) / (batch_size<sub>default</sub> * GPU_number<sub>default</sub>)**.
+- For detailed usage, refer to [ppyoloe](../ppyoloe#getting-start).
+
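The scaling rule in the note above can be expressed directly (a sketch; `scaled_lr` is an illustrative helper, and the numbers shown are examples rather than values taken from a specific config):

```python
def scaled_lr(lr_default, bs_default, gpus_default, bs_new, gpus_new):
    # lr_new = lr_default * (batch_size_new * GPU_number_new) / (batch_size_default * GPU_number_default)
    return lr_default * (bs_new * gpus_new) / (bs_default * gpus_default)

# Illustrative only: a default recipe of 8 GPUs x batch 8, scaled down to 4 GPUs x batch 8,
# halves the learning rate.
print(scaled_lr(0.001, 8, 8, 8, 4))  # -> 0.0005
```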
+# YOLOv3 Human Detection Model
+
+Please refer to the [Human_YOLOv3 page](./pedestrian_yolov3/README_cn.md).
+
+# PP-YOLOE Cigarette Detection Model
+The PP-YOLOE based cigarette detection model is part of the detection-based action recognition solution in PP-Human; see the [PP-Human action recognition module](../../deploy/pipeline/docs/tutorials/pphuman_action.md) for how to use it for smoking recognition. The model detects a single category, cigarette. Due to restrictions on the data source, the training data cannot currently be released. To improve accuracy, the model is initialized from weights trained on the small-object dataset VisDrone (see [visdrone](../visdrone)).
+
+| Model | Dataset | mAP<sup>val<br>0.5:0.95</sup> | mAP<sup>val<br>0.5</sup> | Download | Config |
+|:---------|:-------:|:------:|:------:| :----: | :------:|
+| PP-YOLOE-s | Cigarette business dataset | 39.7 | 79.5 |[download](https://bj.bcebos.com/v1/paddledet/models/pipeline/ppyoloe_crn_s_80e_smoking_visdrone.pdparams) | [config](./ppyoloe_crn_s_80e_smoking_visdrone.yml) |
+
+# PP-HGNet Phone-Calling Recognition Model
+Phone-calling action recognition is implemented with PP-HGNet; for details, see the [PP-Human action recognition module](../../deploy/pipeline/docs/tutorials/pphuman_action.md). The model is trained with the [PaddleClas](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/models/PP-HGNet.md#3.3) suite. Inference model download:
+
+| Model | Dataset | Acc | Download | Config |
+|:---------|:-------:|:------:| :----: | :------:|
+| PP-HGNet | Business dataset | 86.85 |[download](https://bj.bcebos.com/v1/paddledet/models/pipeline/PPHGNet_tiny_calling_halfbody.zip) | - |
+
+# HRNet Human Keypoint Model
+Together with the ST-GCN model, the human keypoint model completes the [skeleton-based action recognition](../../deploy/pipeline/docs/tutorials/pphuman_action.md) solution. The keypoint model uses HRNet; for details, see the keypoint page [KeyPoint](../keypoint/README.md). Trained model download:
+
+| Model | Dataset | AP<sup>val<br>0.5:0.95</sup> | Download | Config |
+|:---------|:-------:|:------:| :----: | :------:|
+| HRNet | Business dataset | 87.1 |[download](https://bj.bcebos.com/v1/paddledet/models/pipeline/dark_hrnet_w32_256x192.pdparams) | [config](./hrnet_w32_256x192.yml) |
+
+
+# ST-GCN Skeleton-Based Action Recognition Model
+Together with the human keypoint model, the [ST-GCN](https://arxiv.org/abs/1801.07455) model completes the [skeleton-based action recognition](../../deploy/pipeline/docs/tutorials/pphuman_action.md) solution.
+The ST-GCN model is trained with [PaddleVideo](https://github.com/PaddlePaddle/PaddleVideo/blob/develop/applications/PPHuman).
+Inference model download:
+
+| Model | Dataset | AP<sup>val<br>0.5:0.95</sup> | Download | Config |
+|:---------|:-------:|:------:| :----: | :------:|
+| ST-GCN | Business dataset | 87.1 |[download](https://bj.bcebos.com/v1/paddledet/models/pipeline/STGCN.zip) | [config](https://github.com/PaddlePaddle/PaddleVideo/blob/develop/applications/PPHuman/configs/stgcn_pphuman.yaml) |
+
+# PP-TSM Video Classification Model
+The [video-classification based action recognition](../../deploy/pipeline/docs/tutorials/pphuman_action.md) solution is built on `PP-TSM`.
+The PP-TSM model is trained with [PaddleVideo](https://github.com/PaddlePaddle/PaddleVideo/tree/develop/applications/FightRecognition).
+Inference model download:
+
+| Model | Dataset | Acc | Download | Config |
+|:---------|:-------:|:------:| :----: | :------:|
+| PP-TSM | Combined open-source datasets | 89.06 |[download](https://videotag.bj.bcebos.com/PaddleVideo-release2.3/ppTSM_fight.zip) | [config](https://github.com/PaddlePaddle/PaddleVideo/tree/develop/applications/FightRecognition/pptsm_fight_frames_dense.yaml) |
+
+# PP-HGNet and PP-LCNet Attribute Recognition Models
+Pedestrian attribute recognition is implemented with PP-HGNet and PP-LCNet; for details, see the [PP-Human attribute recognition module](../../deploy/pipeline/docs/tutorials/pphuman_attribute.md). The models are trained with the [PaddleClas](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/models/PP-LCNet.md) suite. Inference model downloads:
+
+| Model | Dataset | mA | Download | Config |
+|:---------|:-------:|:------:| :----: | :------:|
+| PP-HGNet_small | Business dataset | 95.4 |[download](https://bj.bcebos.com/v1/paddledet/models/pipeline/PPHGNet_small_person_attribute_954_infer.zip) | - |
+| PP-LCNet | Business dataset | 94.5 |[download](https://bj.bcebos.com/v1/paddledet/models/pipeline/PPLCNet_x1_0_person_attribute_945_infer.zip) | [config](https://github.com/PaddlePaddle/PaddleClas/blob/develop/ppcls/configs/PULC/person_attribute/PPLCNet_x1_0.yaml) |
+
+
+## Citation
+```
+@article{shao2018crowdhuman,
+ title={CrowdHuman: A Benchmark for Detecting Human in a Crowd},
+ author={Shao, Shuai and Zhao, Zijian and Li, Boxun and Xiao, Tete and Yu, Gang and Zhang, Xiangyu and Sun, Jian},
+ journal={arXiv preprint arXiv:1805.00123},
+ year={2018}
+ }
+```
diff --git a/PaddleDetection-release-2.6/configs/pphuman/dark_hrnet_w32_256x192.yml b/PaddleDetection-release-2.6/configs/pphuman/dark_hrnet_w32_256x192.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a759c121a1e891e510f802cfbf53962c98a368be
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/pphuman/dark_hrnet_w32_256x192.yml
@@ -0,0 +1,141 @@
+use_gpu: true
+log_iter: 5
+save_dir: output
+snapshot_epoch: 10
+weights: output/hrnet_w32_256x192/model_final
+epoch: 210
+num_joints: &num_joints 17
+pixel_std: &pixel_std 200
+metric: KeyPointTopDownCOCOEval
+num_classes: 1
+train_height: &train_height 256
+train_width: &train_width 192
+trainsize: &trainsize [*train_width, *train_height]
+hmsize: &hmsize [48, 64]
+flip_perm: &flip_perm [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12], [13, 14], [15, 16]]
+
+
+#####model
+architecture: TopDownHRNet
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/Trunc_HRNet_W32_C_pretrained.pdparams
+
+TopDownHRNet:
+ backbone: HRNet
+ post_process: HRNetPostProcess
+ flip_perm: *flip_perm
+ num_joints: *num_joints
+ width: &width 32
+ loss: KeyPointMSELoss
+
+HRNet:
+ width: *width
+ freeze_at: -1
+ freeze_norm: false
+ return_idx: [0]
+
+KeyPointMSELoss:
+ use_target_weight: true
+
+
+#####optimizer
+LearningRate:
+ base_lr: 0.0005
+ schedulers:
+ - !PiecewiseDecay
+ milestones: [170, 200]
+ gamma: 0.1
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 1000
+
+OptimizerBuilder:
+ optimizer:
+ type: Adam
+ regularizer:
+ factor: 0.0
+ type: L2
+
+
+#####data
+TrainDataset:
+ !KeypointTopDownCocoDataset
+ image_dir: train2017
+ anno_path: annotations/person_keypoints_train2017.json
+ dataset_dir: dataset/coco
+ num_joints: *num_joints
+ trainsize: *trainsize
+ pixel_std: *pixel_std
+ use_gt_bbox: True
+
+
+EvalDataset:
+ !KeypointTopDownCocoDataset
+ image_dir: val2017
+ anno_path: annotations/person_keypoints_val2017.json
+ dataset_dir: dataset/coco
+ bbox_file: bbox.json
+ num_joints: *num_joints
+ trainsize: *trainsize
+ pixel_std: *pixel_std
+ use_gt_bbox: True
+ image_thre: 0.0
+
+
+TestDataset:
+ !ImageFolder
+ anno_path: dataset/coco/keypoint_imagelist.txt
+
+worker_num: 2
+global_mean: &global_mean [0.485, 0.456, 0.406]
+global_std: &global_std [0.229, 0.224, 0.225]
+TrainReader:
+ sample_transforms:
+ - RandomFlipHalfBodyTransform:
+ scale: 0.5
+ rot: 40
+ num_joints_half_body: 8
+ prob_half_body: 0.3
+ pixel_std: *pixel_std
+ trainsize: *trainsize
+ upper_body_ids: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
+ flip_pairs: *flip_perm
+ - TopDownAffine:
+ trainsize: *trainsize
+ - ToHeatmapsTopDown_DARK:
+ hmsize: *hmsize
+ sigma: 2
+ batch_transforms:
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 64
+ shuffle: true
+ drop_last: false
+
+EvalReader:
+ sample_transforms:
+ - TopDownAffine:
+ trainsize: *trainsize
+ batch_transforms:
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 16
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *train_height, *train_width]
+ sample_transforms:
+ - Decode: {}
+ - TopDownEvalAffine:
+ trainsize: *trainsize
+ - NormalizeImage:
+ mean: *global_mean
+ std: *global_std
+ is_scale: true
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/pphuman/pedestrian_yolov3/README.md b/PaddleDetection-release-2.6/configs/pphuman/pedestrian_yolov3/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..825eedb6ea50bf3893297e0342367cca6517d0f1
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/pphuman/pedestrian_yolov3/README.md
@@ -0,0 +1,50 @@
+English | [简体中文](README_cn.md)
+# PaddleDetection applied for specific scenarios
+
+We provide models implemented with PaddlePaddle to detect objects in specific scenarios; users can download the models and use them in these scenarios.
+
+| Task | Algorithm | Box AP | Download | Configs |
+|:---------------------|:---------:|:------:| :-------------------------------------------------------------------------------------: |:------:|
+| Pedestrian Detection | YOLOv3 | 51.8 | [model](https://paddledet.bj.bcebos.com/models/pedestrian_yolov3_darknet.pdparams) | [config](./pedestrian_yolov3_darknet.yml) |
+
+## Pedestrian Detection
+
+The main application of pedestrian detection is intelligent surveillance. In this scenario, photos of pedestrians are taken by surveillance cameras in public areas, and pedestrian detection is then performed on these photos.
+
+### 1. Network
+
+The network used for detecting pedestrians is YOLOv3, the backbone of which is Darknet53.
+
+### 2. Configuration for training
+
+PaddleDetection provides users with a configuration file [yolov3_darknet53_270e_coco.yml](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/configs/yolov3/yolov3_darknet53_270e_coco.yml) to train YOLOv3 on the COCO dataset. Compared with that file, we modified the following parameters for pedestrian detection:
+
+* num_classes: 1
+* dataset_dir: dataset/pedestrian
+
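+As a minimal sketch, the override file therefore looks like the fragment below (the COCO-style annotation file names under `dataset/pedestrian` are assumed):
+
+```
+num_classes: 1
+
+TrainDataset:
+  !COCODataSet
+    dataset_dir: dataset/pedestrian
+    anno_path: annotations/instances_train2017.json
+    image_dir: train2017
+```
+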
+### 3. Accuracy
+
+The accuracy of the model trained and evaluated on our private data is as follows:
+
+AP at IoU=.50:.05:.95 is 0.518.
+
+AP at IoU=.50 is 0.792.
+
+### 4. Inference
+
+Users can employ the model to run inference:
+
+```
+export CUDA_VISIBLE_DEVICES=0
+python -u tools/infer.py -c configs/pphuman/pedestrian_yolov3/pedestrian_yolov3_darknet.yml \
+ -o weights=https://paddledet.bj.bcebos.com/models/pedestrian_yolov3_darknet.pdparams \
+ --infer_dir configs/pphuman/pedestrian_yolov3/demo \
+ --draw_threshold 0.3 \
+ --output_dir configs/pphuman/pedestrian_yolov3/demo/output
+```
+
+Some inference results are visualized below:
+
+
+
+
diff --git a/PaddleDetection-release-2.6/configs/pphuman/pedestrian_yolov3/README_cn.md b/PaddleDetection-release-2.6/configs/pphuman/pedestrian_yolov3/README_cn.md
new file mode 100644
index 0000000000000000000000000000000000000000..50a183f8384686760befb2ce17f7617a4547a97b
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/pphuman/pedestrian_yolov3/README_cn.md
@@ -0,0 +1,51 @@
+[English](README.md) | 简体中文
+# PaddleDetection applied for specific scenarios
+
+We provide PaddlePaddle-based detection models for different scenarios; users can download the models and use them directly.
+
+| Task | Algorithm | Box AP | Download | Config |
+|:---------------------|:---------:|:------:| :---------------------------------------------------------------------------------: | :------:|
+| Pedestrian Detection | YOLOv3 | 51.8 | [model](https://paddledet.bj.bcebos.com/models/pedestrian_yolov3_darknet.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/pphuman/pedestrian_yolov3/pedestrian_yolov3_darknet.yml) |
+
+## Pedestrian Detection
+
+The main application of pedestrian detection is intelligent surveillance: pedestrians are captured by surveillance cameras in public areas, and detection is then performed on the captured images.
+
+### 1. Network
+
+YOLOv3 with a Darknet53 backbone.
+
+
+### 2. Configuration for training
+
+PaddleDetection provides the configuration file [yolov3_darknet53_270e_coco.yml](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/configs/yolov3/yolov3_darknet53_270e_coco.yml) for training YOLOv3 on the COCO dataset. Compared with that file, we modified the following parameters for pedestrian detection:
+
+* num_classes: 1
+* dataset_dir: dataset/pedestrian
+
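+These overrides can be sketched as a config fragment (the COCO-style annotation file names under `dataset/pedestrian` are assumed):
+
+```
+num_classes: 1
+
+TrainDataset:
+  !COCODataSet
+    dataset_dir: dataset/pedestrian
+    anno_path: annotations/instances_train2017.json
+    image_dir: train2017
+```
+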
+### 3. Accuracy
+
+The accuracy of the model on our internal surveillance data is:
+
+AP at IoU=.50 is 0.792.
+
+AP at IoU=.50:.05:.95 is 0.518.
+
+### 4. Inference
+
+Users can run pedestrian detection with our trained model:
+
+```
+export CUDA_VISIBLE_DEVICES=0
+python -u tools/infer.py -c configs/pphuman/pedestrian_yolov3/pedestrian_yolov3_darknet.yml \
+ -o weights=https://paddledet.bj.bcebos.com/models/pedestrian_yolov3_darknet.pdparams \
+ --infer_dir configs/pphuman/pedestrian_yolov3/demo \
+ --draw_threshold 0.3 \
+ --output_dir configs/pphuman/pedestrian_yolov3/demo/output
+```
+
+Example inference results:
+
+
+
+
diff --git a/PaddleDetection-release-2.6/configs/pphuman/pedestrian_yolov3/demo/001.png b/PaddleDetection-release-2.6/configs/pphuman/pedestrian_yolov3/demo/001.png
new file mode 100644
index 0000000000000000000000000000000000000000..63ae9167fd03e8a95756fe5f6195fc8d741b9cfa
Binary files /dev/null and b/PaddleDetection-release-2.6/configs/pphuman/pedestrian_yolov3/demo/001.png differ
diff --git a/PaddleDetection-release-2.6/configs/pphuman/pedestrian_yolov3/demo/002.png b/PaddleDetection-release-2.6/configs/pphuman/pedestrian_yolov3/demo/002.png
new file mode 100644
index 0000000000000000000000000000000000000000..0de905cf55e6b02487ee1b8220810df8eaa24c2c
Binary files /dev/null and b/PaddleDetection-release-2.6/configs/pphuman/pedestrian_yolov3/demo/002.png differ
diff --git a/PaddleDetection-release-2.6/configs/pphuman/pedestrian_yolov3/demo/003.png b/PaddleDetection-release-2.6/configs/pphuman/pedestrian_yolov3/demo/003.png
new file mode 100644
index 0000000000000000000000000000000000000000..e9026e099df42d4267be07a71401eb5426b47745
Binary files /dev/null and b/PaddleDetection-release-2.6/configs/pphuman/pedestrian_yolov3/demo/003.png differ
diff --git a/PaddleDetection-release-2.6/configs/pphuman/pedestrian_yolov3/demo/004.png b/PaddleDetection-release-2.6/configs/pphuman/pedestrian_yolov3/demo/004.png
new file mode 100644
index 0000000000000000000000000000000000000000..d8118ec3e0ef63bc74e825b5e7638a1886580604
Binary files /dev/null and b/PaddleDetection-release-2.6/configs/pphuman/pedestrian_yolov3/demo/004.png differ
diff --git a/PaddleDetection-release-2.6/configs/pphuman/pedestrian_yolov3/pedestrian.json b/PaddleDetection-release-2.6/configs/pphuman/pedestrian_yolov3/pedestrian.json
new file mode 100644
index 0000000000000000000000000000000000000000..f72fe6dc65209ab3506d18556fb8b83b6ec832a9
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/pphuman/pedestrian_yolov3/pedestrian.json
@@ -0,0 +1,11 @@
+{
+ "images": [],
+ "annotations": [],
+ "categories": [
+ {
+ "supercategory": "component",
+ "id": 1,
+ "name": "pedestrian"
+ }
+ ]
+}
diff --git a/PaddleDetection-release-2.6/configs/pphuman/pedestrian_yolov3/pedestrian_yolov3_darknet.yml b/PaddleDetection-release-2.6/configs/pphuman/pedestrian_yolov3/pedestrian_yolov3_darknet.yml
new file mode 100644
index 0000000000000000000000000000000000000000..febd30643db87a1856b3776b5c2c6014f1587576
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/pphuman/pedestrian_yolov3/pedestrian_yolov3_darknet.yml
@@ -0,0 +1,29 @@
+_BASE_: [
+ '../../datasets/coco_detection.yml',
+ '../../runtime.yml',
+ '../../yolov3/_base_/optimizer_270e.yml',
+ '../../yolov3/_base_/yolov3_darknet53.yml',
+ '../../yolov3/_base_/yolov3_reader.yml',
+]
+
+snapshot_epoch: 5
+weights: https://paddledet.bj.bcebos.com/models/pedestrian_yolov3_darknet.pdparams
+
+num_classes: 1
+
+TrainDataset:
+ !COCODataSet
+ dataset_dir: dataset/pedestrian
+ anno_path: annotations/instances_train2017.json
+ image_dir: train2017
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ dataset_dir: dataset/pedestrian
+ anno_path: annotations/instances_val2017.json
+ image_dir: val2017
+
+TestDataset:
+ !ImageFolder
+ anno_path: configs/pphuman/pedestrian_yolov3/pedestrian.json
diff --git a/PaddleDetection-release-2.6/configs/pphuman/ppyoloe_crn_l_36e_crowdhuman.yml b/PaddleDetection-release-2.6/configs/pphuman/ppyoloe_crn_l_36e_crowdhuman.yml
new file mode 100644
index 0000000000000000000000000000000000000000..445fefdc5c1a86c307a5c11b471df1aa95aafe7d
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/pphuman/ppyoloe_crn_l_36e_crowdhuman.yml
@@ -0,0 +1,55 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '../ppyoloe/_base_/optimizer_300e.yml',
+ '../ppyoloe/_base_/ppyoloe_crn.yml',
+ '../ppyoloe/_base_/ppyoloe_reader.yml',
+]
+log_iter: 100
+snapshot_epoch: 4
+weights: output/ppyoloe_crn_l_36e_crowdhuman/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams
+depth_mult: 1.0
+width_mult: 1.0
+
+num_classes: 1
+TrainDataset:
+ !COCODataSet
+ image_dir: ""
+ anno_path: annotations/train.json
+ dataset_dir: dataset/crowdhuman
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: ""
+ anno_path: annotations/val.json
+ dataset_dir: dataset/crowdhuman
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/val.json
+ dataset_dir: dataset/crowdhuman
+
+TrainReader:
+ batch_size: 8
+
+epoch: 36
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !CosineDecay
+ max_epochs: 43
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 1
+
+PPYOLOEHead:
+ static_assigner_epoch: -1
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/pphuman/ppyoloe_crn_l_36e_pphuman.yml b/PaddleDetection-release-2.6/configs/pphuman/ppyoloe_crn_l_36e_pphuman.yml
new file mode 100644
index 0000000000000000000000000000000000000000..c1ac43ede159ad8f6086abc18ca83aac3c2ff4a2
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/pphuman/ppyoloe_crn_l_36e_pphuman.yml
@@ -0,0 +1,55 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '../ppyoloe/_base_/optimizer_300e.yml',
+ '../ppyoloe/_base_/ppyoloe_crn.yml',
+ '../ppyoloe/_base_/ppyoloe_reader.yml',
+]
+log_iter: 100
+snapshot_epoch: 4
+weights: output/ppyoloe_crn_l_36e_pphuman/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams
+depth_mult: 1.0
+width_mult: 1.0
+
+num_classes: 1
+TrainDataset:
+ !COCODataSet
+ image_dir: ""
+ anno_path: annotations/train.json
+ dataset_dir: dataset/pphuman
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: ""
+ anno_path: annotations/val.json
+ dataset_dir: dataset/pphuman
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/val.json
+ dataset_dir: dataset/pphuman
+
+TrainReader:
+ batch_size: 8
+
+epoch: 36
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !CosineDecay
+ max_epochs: 43
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 1
+
+PPYOLOEHead:
+ static_assigner_epoch: -1
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/pphuman/ppyoloe_crn_s_36e_crowdhuman.yml b/PaddleDetection-release-2.6/configs/pphuman/ppyoloe_crn_s_36e_crowdhuman.yml
new file mode 100644
index 0000000000000000000000000000000000000000..7be5fe7e72e28c1d1fc9f1d517a95caa796fee76
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/pphuman/ppyoloe_crn_s_36e_crowdhuman.yml
@@ -0,0 +1,55 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '../ppyoloe/_base_/optimizer_300e.yml',
+ '../ppyoloe/_base_/ppyoloe_crn.yml',
+ '../ppyoloe/_base_/ppyoloe_reader.yml',
+]
+log_iter: 100
+snapshot_epoch: 4
+weights: output/ppyoloe_crn_s_36e_crowdhuman/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_s_300e_coco.pdparams
+depth_mult: 0.33
+width_mult: 0.50
+
+num_classes: 1
+TrainDataset:
+ !COCODataSet
+ image_dir: ""
+ anno_path: annotations/train.json
+ dataset_dir: dataset/crowdhuman
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: ""
+ anno_path: annotations/val.json
+ dataset_dir: dataset/crowdhuman
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/val.json
+ dataset_dir: dataset/crowdhuman
+
+TrainReader:
+ batch_size: 8
+
+epoch: 36
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !CosineDecay
+ max_epochs: 43
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 1
+
+PPYOLOEHead:
+ static_assigner_epoch: -1
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/pphuman/ppyoloe_crn_s_36e_pphuman.yml b/PaddleDetection-release-2.6/configs/pphuman/ppyoloe_crn_s_36e_pphuman.yml
new file mode 100644
index 0000000000000000000000000000000000000000..34911e2fe96cf7278f8dde9029f3028d4adf900c
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/pphuman/ppyoloe_crn_s_36e_pphuman.yml
@@ -0,0 +1,55 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '../ppyoloe/_base_/optimizer_300e.yml',
+ '../ppyoloe/_base_/ppyoloe_crn.yml',
+ '../ppyoloe/_base_/ppyoloe_reader.yml',
+]
+log_iter: 100
+snapshot_epoch: 4
+weights: output/ppyoloe_crn_s_36e_pphuman/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_s_300e_coco.pdparams
+depth_mult: 0.33
+width_mult: 0.50
+
+num_classes: 1
+TrainDataset:
+ !COCODataSet
+ image_dir: ""
+ anno_path: annotations/train.json
+ dataset_dir: dataset/pphuman
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: ""
+ anno_path: annotations/val.json
+ dataset_dir: dataset/pphuman
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/val.json
+ dataset_dir: dataset/pphuman
+
+TrainReader:
+ batch_size: 8
+
+epoch: 36
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !CosineDecay
+ max_epochs: 43
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 1
+
+PPYOLOEHead:
+ static_assigner_epoch: -1
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/pphuman/ppyoloe_crn_s_80e_smoking_visdrone.yml b/PaddleDetection-release-2.6/configs/pphuman/ppyoloe_crn_s_80e_smoking_visdrone.yml
new file mode 100644
index 0000000000000000000000000000000000000000..40a731d4dece54b02948c58e9bbaef60d1d6d9ce
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/pphuman/ppyoloe_crn_s_80e_smoking_visdrone.yml
@@ -0,0 +1,54 @@
+_BASE_: [
+ '../runtime.yml',
+ '../ppyoloe/_base_/optimizer_300e.yml',
+ '../ppyoloe/_base_/ppyoloe_crn.yml',
+ '../ppyoloe/_base_/ppyoloe_reader.yml',
+]
+
+log_iter: 100
+snapshot_epoch: 10
+weights: output/ppyoloe_crn_s_80e_smoking_visdrone/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_s_80e_visdrone.pdparams
+depth_mult: 0.33
+width_mult: 0.50
+
+TrainReader:
+ batch_size: 16
+
+epoch: 80
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !CosineDecay
+ max_epochs: 80
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 1
+
+PPYOLOEHead:
+ static_assigner_epoch: -1
+
+metric: COCO
+num_classes: 1
+
+TrainDataset:
+ !COCODataSet
+ image_dir: ""
+ anno_path: smoking_train_cocoformat.json
+ dataset_dir: dataset/smoking
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: ""
+ anno_path: smoking_test_cocoformat.json
+ dataset_dir: dataset/smoking
+
+TestDataset:
+ !ImageFolder
+ anno_path: smoking_test_cocoformat.json
+ dataset_dir: dataset/smoking
diff --git a/PaddleDetection-release-2.6/configs/pphuman/ppyoloe_plus_crn_t_auxhead_320_60e_pphuman.yml b/PaddleDetection-release-2.6/configs/pphuman/ppyoloe_plus_crn_t_auxhead_320_60e_pphuman.yml
new file mode 100644
index 0000000000000000000000000000000000000000..9d542fe6b4f20d6950edcc3a775839028ef702fb
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/pphuman/ppyoloe_plus_crn_t_auxhead_320_60e_pphuman.yml
@@ -0,0 +1,60 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '../ppyoloe/_base_/optimizer_300e.yml',
+ '../ppyoloe/_base_/ppyoloe_plus_crn_tiny_auxhead.yml',
+ '../ppyoloe/_base_/ppyoloe_plus_reader_320.yml',
+]
+
+log_iter: 100
+snapshot_epoch: 4
+weights: output/ppyoloe_plus_crn_t_auxhead_320_60e_pphuman/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_t_auxhead_300e_coco.pdparams # 640*640 COCO mAP 39.7
+depth_mult: 0.33
+width_mult: 0.375
+
+
+num_classes: 1
+TrainDataset:
+ !COCODataSet
+ image_dir: ""
+ anno_path: annotations/train.json
+ dataset_dir: dataset/pphuman
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: ""
+ anno_path: annotations/val.json
+ dataset_dir: dataset/pphuman
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/val.json
+ dataset_dir: dataset/pphuman
+
+
+TrainReader:
+ batch_size: 8
+
+
+epoch: 60
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !CosineDecay
+ max_epochs: 72
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 1
+
+
+PPYOLOEHead:
+ static_assigner_epoch: -1
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 300
+ score_threshold: 0.01
+ nms_threshold: 0.7
diff --git a/PaddleDetection-release-2.6/configs/ppvehicle/README.md b/PaddleDetection-release-2.6/configs/ppvehicle/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..de4b783799e96cf132f9e1f7e49c317161f18d20
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppvehicle/README.md
@@ -0,0 +1,81 @@
+简体中文 | [English](README.md)
+
+## PP-YOLOE Vehicle Detection Models
+
+The PaddleDetection team provides PP-YOLOE-based detection models for autonomous-driving scenarios, covering five datasets (BDD100K-DET, BDD100K-MOT, UA-DETRAC, PPVehicle9cls, and PPVehicle). The first three are public datasets; the last two are merged datasets.
+- BDD100K-DET has 10 classes: `pedestrian(1), rider(2), car(3), truck(4), bus(5), train(6), motorcycle(7), bicycle(8), traffic light(9), traffic sign(10)`.
+- BDD100K-MOT has 8 classes: `pedestrian(1), rider(2), car(3), truck(4), bus(5), train(6), motorcycle(7), bicycle(8)`, but the dataset is larger than BDD100K-DET.
+- UA-DETRAC has 4 classes: `car(1), bus(2), van(3), others(4)`.
+- PPVehicle9cls merges BDD100K-MOT and UA-DETRAC into 9 classes: `pedestrian(1), rider(2), car(3), truck(4), bus(5), van(6), motorcycle(7), bicycle(8), others(9)`.
+- PPVehicle also merges BDD100K-MOT and UA-DETRAC, combining `car, truck, bus, van` from BDD100K-MOT and `car, bus, van` from UA-DETRAC into the single class `vehicle(1)`.
+
+The deployment versions of these models are used in the [PP-Vehicle](../../deploy/pipeline/) project.
+
+| Model | Dataset | Classes | mAP<sup>val<br>0.5:0.95</sup> | Download | Config |
+|:---------|:---------------:|:------:|:-----------------------:|:---------:| :-----: |
+|PP-YOLOE-l| BDD100K-DET | 10 | 35.6 | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_36e_bdd100kdet.pdparams) | [config](./ppyoloe_crn_l_36e_bdd100kdet.yml) |
+|PP-YOLOE-l| BDD100K-MOT | 8 | 33.7 | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_36e_bdd100kmot.pdparams) | [config](./ppyoloe_crn_l_36e_bdd100kmot.yml) |
+|PP-YOLOE-l| UA-DETRAC | 4 | 51.4 | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_36e_uadetrac.pdparams) | [config](./ppyoloe_crn_l_36e_uadetrac.yml) |
+|PP-YOLOE-l| PPVehicle9cls | 9 | 40.0 | [download](https://paddledet.bj.bcebos.com/models/mot_ppyoloe_l_36e_ppvehicle9cls.pdparams) | [config](./mot_ppyoloe_l_36e_ppvehicle9cls.yml) |
+|PP-YOLOE-s| PPVehicle9cls | 9 | 35.3 | [download](https://paddledet.bj.bcebos.com/models/mot_ppyoloe_s_36e_ppvehicle9cls.pdparams) | [config](./mot_ppyoloe_s_36e_ppvehicle9cls.yml) |
+|PP-YOLOE-l| PPVehicle | 1 | 63.9 | [download](https://paddledet.bj.bcebos.com/models/mot_ppyoloe_l_36e_ppvehicle.pdparams) | [config](./mot_ppyoloe_l_36e_ppvehicle.yml) |
+|PP-YOLOE-s| PPVehicle | 1 | 61.3 | [download](https://paddledet.bj.bcebos.com/models/mot_ppyoloe_s_36e_ppvehicle.pdparams) | [config](./mot_ppyoloe_s_36e_ppvehicle.yml) |
+|PP-YOLOE+_t-aux(320)| PPVehicle | 1 | 53.5 | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_t_auxhead_320_60e_ppvehicle.pdparams) | [config](./ppyoloe_plus_crn_t_auxhead_320_60e_ppvehicle.yml) |
+
+
+**Notes:**
+- PP-YOLOE models are trained with mixed precision on 8 GPUs. If the **number of GPUs** or the **batch size** changes, adjust the learning rate according to **lr<sub>new</sub> = lr<sub>default</sub> * (batch_size<sub>new</sub> * GPU_number<sub>new</sub>) / (batch_size<sub>default</sub> * GPU_number<sub>default</sub>)**.
+- For detailed usage, see [ppyoloe](../ppyoloe#getting-start).
+- To predict the corresponding class names, modify or add a label_list.txt file (one class per line) and point anno_path in TestDataset to it, e.g.:
+```
+TestDataset:
+  !ImageFolder
+    anno_path: label_list.txt # if dataset_dir is not set, anno_path is relative to the PaddleDetection root directory
+    # dataset_dir: dataset/ppvehicle # if dataset_dir is set, dataset_dir/anno_path becomes the new anno_path
+```
+Each line of label_list.txt records one class, as shown below:
+```
+vehicle
+```
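+
+As a worked example of the learning-rate rule above (assuming the defaults used by these configs: 8 GPUs, per-GPU batch size 8, base_lr 0.001), training on 4 GPUs with the per-GPU batch size unchanged gives lr_new = 0.001 * (8 * 4) / (8 * 8) = 0.0005:
+```
+LearningRate:
+  base_lr: 0.0005 # 0.001 * (8 * 4) / (8 * 8)
+```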
+
+## YOLOv3 Vehicle Detection Model
+
+See the [Vehicle_YOLOv3 page](./vehicle_yolov3/README_cn.md).
+
+## PP-OCRv3 License Plate Recognition Models
+
+License plate recognition uses Paddle's self-developed ultra-lightweight models PP-OCRv3_det and PP-OCRv3_rec, fine-tuned on the [CCPD dataset](https://github.com/detectRecog/CCPD) (the CCPD2019 + CCPD2020 license plate datasets). Training is based on [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/applications/%E8%BD%BB%E9%87%8F%E7%BA%A7%E8%BD%A6%E7%89%8C%E8%AF%86%E5%88%AB.md); we provide the inference models for download:
+
+| Model | Dataset | Accuracy | Download | Config |
+|:---------|:-------:|:------:| :----: | :------:|
+| PP-OCRv3_det | Combined CCPD dataset | hmean: 0.979 |[download](https://bj.bcebos.com/v1/paddledet/models/pipeline/ch_PP-OCRv3_det_infer.tar.gz) | [config](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/configs/det/ch_PP-OCRv3/ch_PP-OCRv3_det_cml.yml) |
+| PP-OCRv3_rec | Combined CCPD dataset | acc: 0.773 |[download](https://bj.bcebos.com/v1/paddledet/models/pipeline/ch_PP-OCRv3_rec_infer.tar.gz) | [config](https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/configs/rec/PP-OCRv3/ch_PP-OCRv3_rec_distillation.yml) |
+
+## PP-LCNet Vehicle Attribute Model
+
+Vehicle attribute recognition uses Paddle's self-developed ultra-lightweight model PP-LCNet, trained on the [VeRi dataset](https://www.v7labs.com/open-datasets/veri-dataset). Training is based on [PaddleClas](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.4/docs/en/PULC/PULC_vehicle_attribute_en.md); we provide the inference model for download:
+
+| Model | Dataset | Accuracy | Download | Config |
+|:---------|:-------:|:------:| :----: | :------:|
+| PP-LCNet_x1_0 | VeRi dataset | 90.81 |[download](https://bj.bcebos.com/v1/paddledet/models/pipeline/vehicle_attribute_model.zip) | [config](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.4/ppcls/configs/PULC/vehicle_attribute/PPLCNet_x1_0.yaml) |
+
+
+## Citations
+```
+@InProceedings{bdd100k,
+ author = {Yu, Fisher and Chen, Haofeng and Wang, Xin and Xian, Wenqi and Chen,
+ Yingying and Liu, Fangchen and Madhavan, Vashisht and Darrell, Trevor},
+ title = {BDD100K: A Diverse Driving Dataset for Heterogeneous Multitask Learning},
+ booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
+ month = {June},
+ year = {2020}
+}
+
+@article{CVIU_UA-DETRAC,
+ author = {Longyin Wen and Dawei Du and Zhaowei Cai and Zhen Lei and Ming{-}Ching Chang and
+ Honggang Qi and Jongwoo Lim and Ming{-}Hsuan Yang and Siwei Lyu},
+ title = {{UA-DETRAC:} {A} New Benchmark and Protocol for Multi-Object Detection and Tracking},
+ journal = {Computer Vision and Image Understanding},
+ year = {2020}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/ppvehicle/mot_ppyoloe_l_36e_ppvehicle.yml b/PaddleDetection-release-2.6/configs/ppvehicle/mot_ppyoloe_l_36e_ppvehicle.yml
new file mode 100644
index 0000000000000000000000000000000000000000..61df2fcc4b55820d6dca9e4f57ecc1fc02484777
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppvehicle/mot_ppyoloe_l_36e_ppvehicle.yml
@@ -0,0 +1,57 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '../ppyoloe/_base_/optimizer_300e.yml',
+ '../ppyoloe/_base_/ppyoloe_crn.yml',
+ '../ppyoloe/_base_/ppyoloe_reader.yml',
+]
+log_iter: 100
+snapshot_epoch: 4
+weights: output/mot_ppyoloe_l_36e_ppvehicle/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams
+depth_mult: 1.0
+width_mult: 1.0
+
+num_classes: 1
+
+TrainDataset:
+ !COCODataSet
+ image_dir: ""
+ anno_path: annotations/train_all.json
+ dataset_dir: dataset/ppvehicle
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+ allow_empty: true
+
+EvalDataset:
+ !COCODataSet
+ image_dir: ""
+ anno_path: annotations/val_all.json
+ dataset_dir: dataset/ppvehicle
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/val_all.json
+ dataset_dir: dataset/ppvehicle
+
+TrainReader:
+ batch_size: 8
+
+epoch: 36
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !CosineDecay
+ max_epochs: 43
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 1
+
+PPYOLOEHead:
+ static_assigner_epoch: -1
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/ppvehicle/mot_ppyoloe_l_36e_ppvehicle9cls.yml b/PaddleDetection-release-2.6/configs/ppvehicle/mot_ppyoloe_l_36e_ppvehicle9cls.yml
new file mode 100644
index 0000000000000000000000000000000000000000..4cd73b7e244a47fcdb3f64663df8995e1dde3e55
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppvehicle/mot_ppyoloe_l_36e_ppvehicle9cls.yml
@@ -0,0 +1,56 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '../ppyoloe/_base_/optimizer_300e.yml',
+ '../ppyoloe/_base_/ppyoloe_crn.yml',
+ '../ppyoloe/_base_/ppyoloe_reader.yml',
+]
+log_iter: 100
+snapshot_epoch: 4
+weights: output/mot_ppyoloe_l_36e_ppvehicle9cls/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams
+depth_mult: 1.0
+width_mult: 1.0
+
+num_classes: 9
+
+TrainDataset:
+ !COCODataSet
+ image_dir: ""
+ anno_path: annotations/train_all_9cls.json
+ dataset_dir: dataset/ppvehicle
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: ""
+ anno_path: annotations/val_all_9cls.json
+ dataset_dir: dataset/ppvehicle
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/val_all_9cls.json
+ dataset_dir: dataset/ppvehicle
+
+TrainReader:
+ batch_size: 8
+
+epoch: 36
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !CosineDecay
+ max_epochs: 43
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 1
+
+PPYOLOEHead:
+ static_assigner_epoch: -1
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/ppvehicle/mot_ppyoloe_s_36e_ppvehicle.yml b/PaddleDetection-release-2.6/configs/ppvehicle/mot_ppyoloe_s_36e_ppvehicle.yml
new file mode 100644
index 0000000000000000000000000000000000000000..f4f384584c12ae0eff897ebc0fb7f233463ea708
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppvehicle/mot_ppyoloe_s_36e_ppvehicle.yml
@@ -0,0 +1,57 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '../ppyoloe/_base_/optimizer_300e.yml',
+ '../ppyoloe/_base_/ppyoloe_crn.yml',
+ '../ppyoloe/_base_/ppyoloe_reader.yml',
+]
+log_iter: 100
+snapshot_epoch: 4
+weights: output/mot_ppyoloe_s_36e_ppvehicle/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_s_300e_coco.pdparams
+depth_mult: 0.33
+width_mult: 0.50
+
+num_classes: 1
+
+TrainDataset:
+ !COCODataSet
+ image_dir: ""
+ anno_path: annotations/train_all.json
+ dataset_dir: dataset/ppvehicle
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+ allow_empty: true
+
+EvalDataset:
+ !COCODataSet
+ image_dir: ""
+ anno_path: annotations/val_all.json
+ dataset_dir: dataset/ppvehicle
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/val_all.json
+ dataset_dir: dataset/ppvehicle
+
+TrainReader:
+ batch_size: 8
+
+epoch: 36
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !CosineDecay
+ max_epochs: 43
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 1
+
+PPYOLOEHead:
+ static_assigner_epoch: -1
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/ppvehicle/mot_ppyoloe_s_36e_ppvehicle9cls.yml b/PaddleDetection-release-2.6/configs/ppvehicle/mot_ppyoloe_s_36e_ppvehicle9cls.yml
new file mode 100644
index 0000000000000000000000000000000000000000..653ff1a75822f965bfb0a8134f5fa78a309d52b9
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppvehicle/mot_ppyoloe_s_36e_ppvehicle9cls.yml
@@ -0,0 +1,56 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '../ppyoloe/_base_/optimizer_300e.yml',
+ '../ppyoloe/_base_/ppyoloe_crn.yml',
+ '../ppyoloe/_base_/ppyoloe_reader.yml',
+]
+log_iter: 100
+snapshot_epoch: 4
+weights: output/mot_ppyoloe_s_36e_ppvehicle9cls/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_s_300e_coco.pdparams
+depth_mult: 0.33
+width_mult: 0.50
+
+num_classes: 9
+
+TrainDataset:
+ !COCODataSet
+ image_dir: ""
+ anno_path: annotations/train_all_9cls.json
+ dataset_dir: dataset/ppvehicle
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: ""
+ anno_path: annotations/val_all_9cls.json
+ dataset_dir: dataset/ppvehicle
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/val_all_9cls.json
+ dataset_dir: dataset/ppvehicle
+
+TrainReader:
+ batch_size: 8
+
+epoch: 36
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !CosineDecay
+ max_epochs: 43
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 1
+
+PPYOLOEHead:
+ static_assigner_epoch: -1
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/ppvehicle/ppyoloe_crn_l_36e_bdd100kdet.yml b/PaddleDetection-release-2.6/configs/ppvehicle/ppyoloe_crn_l_36e_bdd100kdet.yml
new file mode 100644
index 0000000000000000000000000000000000000000..921d8b33f17a2a6850cf292769bf51b00a7b1d92
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppvehicle/ppyoloe_crn_l_36e_bdd100kdet.yml
@@ -0,0 +1,56 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '../ppyoloe/_base_/optimizer_300e.yml',
+ '../ppyoloe/_base_/ppyoloe_crn.yml',
+ '../ppyoloe/_base_/ppyoloe_reader.yml',
+]
+log_iter: 100
+snapshot_epoch: 4
+weights: output/ppyoloe_crn_l_36e_bdd100kdet/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams
+depth_mult: 1.0
+width_mult: 1.0
+
+num_classes: 10
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images/100k/train
+ anno_path: labels/det_20/det_train_cocofmt.json
+ dataset_dir: dataset/bdd100k
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images/100k/val
+ anno_path: labels/det_20/det_val_cocofmt.json
+ dataset_dir: dataset/bdd100k
+
+TestDataset:
+ !ImageFolder
+ anno_path: labels/det_20/det_val_cocofmt.json
+ dataset_dir: dataset/bdd100k
+
+TrainReader:
+ batch_size: 8
+
+epoch: 36
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !CosineDecay
+ max_epochs: 43
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 1
+
+PPYOLOEHead:
+ static_assigner_epoch: -1
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/ppvehicle/ppyoloe_crn_l_36e_bdd100kmot.yml b/PaddleDetection-release-2.6/configs/ppvehicle/ppyoloe_crn_l_36e_bdd100kmot.yml
new file mode 100644
index 0000000000000000000000000000000000000000..b9d32be10d6cb415f22257bd778aab412420fa8a
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppvehicle/ppyoloe_crn_l_36e_bdd100kmot.yml
@@ -0,0 +1,56 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '../ppyoloe/_base_/optimizer_300e.yml',
+ '../ppyoloe/_base_/ppyoloe_crn.yml',
+ '../ppyoloe/_base_/ppyoloe_reader.yml',
+]
+log_iter: 100
+snapshot_epoch: 4
+weights: output/ppyoloe_crn_l_36e_bdd100kmot/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams
+depth_mult: 1.0
+width_mult: 1.0
+
+num_classes: 8
+
+TrainDataset:
+ !COCODataSet
+ image_dir: ""
+ anno_path: annotations/train.json
+ dataset_dir: dataset/bdd100k
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: ""
+ anno_path: annotations/val.json
+ dataset_dir: dataset/bdd100k
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/val.json
+ dataset_dir: dataset/bdd100k
+
+TrainReader:
+ batch_size: 8
+
+epoch: 36
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !CosineDecay
+ max_epochs: 43
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 1
+
+PPYOLOEHead:
+ static_assigner_epoch: -1
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/ppvehicle/ppyoloe_crn_l_36e_uadetrac.yml b/PaddleDetection-release-2.6/configs/ppvehicle/ppyoloe_crn_l_36e_uadetrac.yml
new file mode 100644
index 0000000000000000000000000000000000000000..5f3dd59cd9ef9d0c5e2947608f264187f433983c
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppvehicle/ppyoloe_crn_l_36e_uadetrac.yml
@@ -0,0 +1,56 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '../ppyoloe/_base_/optimizer_300e.yml',
+ '../ppyoloe/_base_/ppyoloe_crn.yml',
+ '../ppyoloe/_base_/ppyoloe_reader.yml',
+]
+log_iter: 100
+snapshot_epoch: 4
+weights: output/ppyoloe_crn_l_36e_uadetrac/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams
+depth_mult: 1.0
+width_mult: 1.0
+
+num_classes: 4
+
+TrainDataset:
+ !COCODataSet
+ image_dir: train
+ anno_path: annotations/train.json
+ dataset_dir: dataset/uadetrac
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: val
+ anno_path: annotations/test.json
+ dataset_dir: dataset/uadetrac
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/test.json
+ dataset_dir: dataset/uadetrac
+
+TrainReader:
+ batch_size: 8
+
+epoch: 36
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !CosineDecay
+ max_epochs: 43
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 1
+
+PPYOLOEHead:
+ static_assigner_epoch: -1
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.6
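For reference, the `MultiClassNMS` settings above mean: keep at most `nms_top_k` candidates per class by score, drop candidates below `score_threshold`, greedily suppress boxes whose IoU with a kept box exceeds `nms_threshold`, and return at most `keep_top_k` detections. A sketch of that greedy procedure in NumPy (illustrative only, not the Paddle operator):

```python
import numpy as np

def box_area(b):
    return (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])

def greedy_nms(boxes, scores, score_threshold=0.01, nms_threshold=0.6,
               nms_top_k=1000, keep_top_k=100):
    """boxes: (N, 4) as x1, y1, x2, y2; scores: (N,). Returns kept indices."""
    order = np.argsort(-scores)[:nms_top_k]          # pre-NMS top-k by score
    order = order[scores[order] >= score_threshold]  # confidence filter
    keep = []
    while order.size > 0 and len(keep) < keep_top_k:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # IoU of the kept box against the remaining candidates
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        iou = inter / (box_area(boxes[i:i + 1])[0] + box_area(boxes[rest]) - inter)
        order = rest[iou <= nms_threshold]           # suppress heavy overlaps
    return keep
```

In the real head this runs per class on decoded predictions; the Paddle operator additionally handles the class dimension and background filtering.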
diff --git a/PaddleDetection-release-2.6/configs/ppvehicle/ppyoloe_plus_crn_t_auxhead_320_60e_ppvehicle.yml b/PaddleDetection-release-2.6/configs/ppvehicle/ppyoloe_plus_crn_t_auxhead_320_60e_ppvehicle.yml
new file mode 100644
index 0000000000000000000000000000000000000000..7ed888d7a4e0fe04fe1c68f5e8282457506597bc
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppvehicle/ppyoloe_plus_crn_t_auxhead_320_60e_ppvehicle.yml
@@ -0,0 +1,61 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '../ppyoloe/_base_/optimizer_300e.yml',
+ '../ppyoloe/_base_/ppyoloe_plus_crn_tiny_auxhead.yml',
+ '../ppyoloe/_base_/ppyoloe_plus_reader_320.yml',
+]
+
+log_iter: 100
+snapshot_epoch: 4
+weights: output/ppyoloe_plus_crn_t_auxhead_320_60e_ppvehicle/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_t_auxhead_300e_coco.pdparams # 640*640 COCO mAP 39.7
+depth_mult: 0.33
+width_mult: 0.375
+
+
+num_classes: 1
+TrainDataset:
+ !COCODataSet
+ image_dir: ""
+ anno_path: annotations/train_all.json
+ dataset_dir: dataset/ppvehicle
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+ allow_empty: true
+
+EvalDataset:
+ !COCODataSet
+ image_dir: ""
+ anno_path: annotations/val_all.json
+ dataset_dir: dataset/ppvehicle
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/val_all.json
+ dataset_dir: dataset/ppvehicle
+
+
+TrainReader:
+ batch_size: 8
+
+
+epoch: 60
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !CosineDecay
+ max_epochs: 72
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 1
+
+
+PPYOLOEHead:
+ static_assigner_epoch: -1
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 300
+ score_threshold: 0.01
+ nms_threshold: 0.7
diff --git a/PaddleDetection-release-2.6/configs/ppvehicle/vehicle_yolov3/README.md b/PaddleDetection-release-2.6/configs/ppvehicle/vehicle_yolov3/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..464472ff164404f2cd601adffcc7163ca34ae894
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppvehicle/vehicle_yolov3/README.md
@@ -0,0 +1,53 @@
+English | [简体中文](README_cn.md)
+# PaddleDetection applied for specific scenarios
+
+We provide several models implemented in PaddlePaddle for object detection in specific scenarios; users can download and use them directly.
+
+| Task | Algorithm | Box AP | Download | Configs |
+|:---------------------|:---------:|:------:| :-------------------------------------------------------------------------------------: |:------:|
+| Vehicle Detection | YOLOv3 | 54.5 | [model](https://paddledet.bj.bcebos.com/models/vehicle_yolov3_darknet.pdparams) | [config](./vehicle_yolov3_darknet.yml) |
+
+## Vehicle Detection
+
+One of the major applications of vehicle detection is traffic monitoring. In this scenario, the vehicles to be detected are mostly captured by cameras mounted on top of traffic light poles.
+
+### 1. Network
+
+The network for detecting vehicles is YOLOv3 with a Darknet53 backbone.
+
+### 2. Configuration for training
+
+PaddleDetection provides a configuration file [yolov3_darknet53_270e_coco.yml](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/configs/yolov3/yolov3_darknet53_270e_coco.yml) for training YOLOv3 on the COCO dataset. Compared with that file, we modified the following parameters for vehicle detection training:
+
+* num_classes: 6
+* anchors: [[8, 9], [10, 23], [19, 15], [23, 33], [40, 25], [54, 50], [101, 80], [139, 145], [253, 224]]
+* nms/nms_top_k: 400
+* nms/score_threshold: 0.005
+* dataset_dir: dataset/vehicle
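The anchors listed above are width-height pairs clustered from the vehicle data (three per detection scale); during training each ground-truth box is matched to the anchor of most similar shape. A sketch of that shape matching (plain Python; function names are illustrative):

```python
def wh_iou(w1, h1, w2, h2):
    """IoU of two boxes aligned at a common corner (compares shapes only)."""
    inter = min(w1, w2) * min(h1, h2)
    return inter / (w1 * h1 + w2 * h2 - inter)

ANCHORS = [[8, 9], [10, 23], [19, 15], [23, 33], [40, 25], [54, 50],
           [101, 80], [139, 145], [253, 224]]

def best_anchor(gt_w, gt_h):
    """Index of the anchor with the highest width-height IoU."""
    ious = [wh_iou(gt_w, gt_h, aw, ah) for aw, ah in ANCHORS]
    return max(range(len(ANCHORS)), key=ious.__getitem__)

print(best_anchor(100, 80))  # → 6, a car-sized box matches anchor [101, 80]
```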
+
+### 3. Accuracy
+
+The accuracy of the model trained and evaluated on our private data is as follows:
+
+AP at IoU=.50:.05:.95 is 0.545.
+
+AP at IoU=.50 is 0.764.
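The two numbers above are related: `AP at IoU=.50:.05:.95` is the COCO primary metric, the mean of AP computed at ten IoU thresholds from 0.50 to 0.95 in steps of 0.05, of which `AP at IoU=.50` (0.764) is just the first term. A sketch of the averaging (the per-threshold values other than 0.764 are made up for illustration):

```python
import numpy as np

iou_thresholds = np.arange(0.50, 1.00, 0.05)   # 0.50, 0.55, ..., 0.95
# Only the first value (AP@0.50 = 0.764) comes from the text; the rest are
# hypothetical, decreasing as the IoU requirement gets stricter.
ap_per_iou = np.linspace(0.764, 0.30, num=len(iou_thresholds))
coco_ap = float(ap_per_iou.mean())
print(len(iou_thresholds), round(coco_ap, 3))
```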
+
+### 4. Inference
+
+Users can employ the model to conduct inference:
+
+```bash
+export CUDA_VISIBLE_DEVICES=0
+python -u tools/infer.py -c configs/ppvehicle/vehicle_yolov3/vehicle_yolov3_darknet.yml \
+ -o weights=https://paddledet.bj.bcebos.com/models/vehicle_yolov3_darknet.pdparams \
+ --infer_dir configs/ppvehicle/vehicle_yolov3/demo \
+ --draw_threshold 0.2 \
+ --output_dir configs/ppvehicle/vehicle_yolov3/demo/output
+```
+
+Some inference results are visualized below:
+
+
+
+
diff --git a/PaddleDetection-release-2.6/configs/ppvehicle/vehicle_yolov3/README_cn.md b/PaddleDetection-release-2.6/configs/ppvehicle/vehicle_yolov3/README_cn.md
new file mode 100644
index 0000000000000000000000000000000000000000..94657c866e5904c4729047bf9868ad2805015fe8
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppvehicle/vehicle_yolov3/README_cn.md
@@ -0,0 +1,54 @@
+[English](README.md) | 简体中文
+# 特色垂类检测模型
+
+我们提供了针对不同场景的基于PaddlePaddle的检测模型,用户可以下载模型进行使用。
+
+| 任务 | 算法 | 精度(Box AP) | 下载 | 配置文件 |
+|:---------------------|:---------:|:------:| :---------------------------------------------------------------------------------: | :------:|
+| 车辆检测 | YOLOv3 | 54.5 | [下载链接](https://paddledet.bj.bcebos.com/models/vehicle_yolov3_darknet.pdparams) | [配置文件](./vehicle_yolov3_darknet.yml) |
+
+
+## 车辆检测(Vehicle Detection)
+
+车辆检测的主要应用之一是交通监控。在这样的监控场景中,待检测的车辆多为道路红绿灯柱上的摄像头拍摄所得。
+
+### 1. 模型结构
+
+Backbone为Darknet53的YOLOv3。
+
+### 2. 训练参数配置
+
+PaddleDetection提供了使用COCO数据集对YOLOv3进行训练的参数配置文件[yolov3_darknet53_270e_coco.yml](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/configs/yolov3/yolov3_darknet53_270e_coco.yml),与之相比,在进行车辆检测的模型训练时,我们对以下参数进行了修改:
+
+* num_classes: 6
+* anchors: [[8, 9], [10, 23], [19, 15], [23, 33], [40, 25], [54, 50], [101, 80], [139, 145], [253, 224]]
+* nms/nms_top_k: 400
+* nms/score_threshold: 0.005
+* dataset_dir: dataset/vehicle
+
+### 3. 精度指标
+
+模型在我们内部数据上的精度指标为:
+
+IOU=.50:.05:.95时的AP为 0.545。
+
+IOU=.5时的AP为 0.764。
+
+### 4. 预测
+
+用户可以使用我们训练好的模型进行车辆检测:
+
+```bash
+export CUDA_VISIBLE_DEVICES=0
+python -u tools/infer.py -c configs/ppvehicle/vehicle_yolov3/vehicle_yolov3_darknet.yml \
+ -o weights=https://paddledet.bj.bcebos.com/models/vehicle_yolov3_darknet.pdparams \
+ --infer_dir configs/ppvehicle/vehicle_yolov3/demo \
+ --draw_threshold 0.2 \
+ --output_dir configs/ppvehicle/vehicle_yolov3/demo/output
+```
+
+预测结果示例:
+
+
+
+
diff --git a/PaddleDetection-release-2.6/configs/ppvehicle/vehicle_yolov3/demo/001.jpeg b/PaddleDetection-release-2.6/configs/ppvehicle/vehicle_yolov3/demo/001.jpeg
new file mode 100644
index 0000000000000000000000000000000000000000..8786db5eb6773931c363358bb39462b33db55369
Binary files /dev/null and b/PaddleDetection-release-2.6/configs/ppvehicle/vehicle_yolov3/demo/001.jpeg differ
diff --git a/PaddleDetection-release-2.6/configs/ppvehicle/vehicle_yolov3/demo/003.png b/PaddleDetection-release-2.6/configs/ppvehicle/vehicle_yolov3/demo/003.png
new file mode 100644
index 0000000000000000000000000000000000000000..c01ab4ce769fb3b1c8863093a35d27da0ab10efd
Binary files /dev/null and b/PaddleDetection-release-2.6/configs/ppvehicle/vehicle_yolov3/demo/003.png differ
diff --git a/PaddleDetection-release-2.6/configs/ppvehicle/vehicle_yolov3/demo/004.png b/PaddleDetection-release-2.6/configs/ppvehicle/vehicle_yolov3/demo/004.png
new file mode 100644
index 0000000000000000000000000000000000000000..8907eb8d4d9b82e08ca214509c9fb41ca889db2a
Binary files /dev/null and b/PaddleDetection-release-2.6/configs/ppvehicle/vehicle_yolov3/demo/004.png differ
diff --git a/PaddleDetection-release-2.6/configs/ppvehicle/vehicle_yolov3/demo/005.png b/PaddleDetection-release-2.6/configs/ppvehicle/vehicle_yolov3/demo/005.png
new file mode 100644
index 0000000000000000000000000000000000000000..bf17712809c2fe6fa8e7d4f093ec4ac94523537c
Binary files /dev/null and b/PaddleDetection-release-2.6/configs/ppvehicle/vehicle_yolov3/demo/005.png differ
diff --git a/PaddleDetection-release-2.6/configs/ppvehicle/vehicle_yolov3/vehicle.json b/PaddleDetection-release-2.6/configs/ppvehicle/vehicle_yolov3/vehicle.json
new file mode 100644
index 0000000000000000000000000000000000000000..5863a9a8c9e0d8b4daeff31e7fe7869e084d3fb4
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppvehicle/vehicle_yolov3/vehicle.json
@@ -0,0 +1,36 @@
+{
+ "images": [],
+ "annotations": [],
+ "categories": [
+ {
+ "supercategory": "component",
+ "id": 1,
+ "name": "car"
+ },
+ {
+ "supercategory": "component",
+ "id": 2,
+ "name": "truck"
+ },
+ {
+ "supercategory": "component",
+ "id": 3,
+ "name": "bus"
+ },
+ {
+ "supercategory": "component",
+ "id": 4,
+ "name": "motorbike"
+ },
+ {
+ "supercategory": "component",
+ "id": 5,
+ "name": "tricycle"
+ },
+ {
+ "supercategory": "component",
+ "id": 6,
+ "name": "carplate"
+ }
+ ]
+}
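The file above is a COCO-format stub: empty `images` and `annotations` lists plus the six category definitions, which `TestDataset` uses only to map predicted class ids back to label names. Reading that mapping with the standard library (the JSON is inlined here so the snippet is self-contained):

```python
import json

# Category block copied from vehicle.json above.
VEHICLE_JSON = '''{"images": [], "annotations": [], "categories": [
  {"supercategory": "component", "id": 1, "name": "car"},
  {"supercategory": "component", "id": 2, "name": "truck"},
  {"supercategory": "component", "id": 3, "name": "bus"},
  {"supercategory": "component", "id": 4, "name": "motorbike"},
  {"supercategory": "component", "id": 5, "name": "tricycle"},
  {"supercategory": "component", "id": 6, "name": "carplate"}]}'''

meta = json.loads(VEHICLE_JSON)
id_to_name = {c["id"]: c["name"] for c in meta["categories"]}
print(id_to_name[1])  # → car
```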
diff --git a/PaddleDetection-release-2.6/configs/ppvehicle/vehicle_yolov3/vehicle_yolov3_darknet.yml b/PaddleDetection-release-2.6/configs/ppvehicle/vehicle_yolov3/vehicle_yolov3_darknet.yml
new file mode 100644
index 0000000000000000000000000000000000000000..feb32952b00265cbac4ba0c3f17a72862b12c4e9
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppvehicle/vehicle_yolov3/vehicle_yolov3_darknet.yml
@@ -0,0 +1,42 @@
+_BASE_: [
+ '../../datasets/coco_detection.yml',
+ '../../runtime.yml',
+ '../../yolov3/_base_/optimizer_270e.yml',
+ '../../yolov3/_base_/yolov3_darknet53.yml',
+ '../../yolov3/_base_/yolov3_reader.yml',
+]
+
+snapshot_epoch: 5
+weights: https://paddledet.bj.bcebos.com/models/vehicle_yolov3_darknet.pdparams
+
+YOLOv3Head:
+ anchors: [[8, 9], [10, 23], [19, 15],
+ [23, 33], [40, 25], [54, 50],
+ [101, 80], [139, 145], [253, 224]]
+
+BBoxPostProcess:
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.005
+ nms_threshold: 0.45
+ nms_top_k: 400
+
+num_classes: 6
+
+TrainDataset:
+ !COCODataSet
+ dataset_dir: dataset/vehicle
+ anno_path: annotations/instances_train2017.json
+ image_dir: train2017
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ dataset_dir: dataset/vehicle
+ anno_path: annotations/instances_val2017.json
+ image_dir: val2017
+
+TestDataset:
+ !ImageFolder
+ anno_path: configs/ppvehicle/vehicle_yolov3/vehicle.json
diff --git a/PaddleDetection-release-2.6/configs/ppyolo/README.md b/PaddleDetection-release-2.6/configs/ppyolo/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..a50b5f9deb4ecc4784d8fe4b65c071940ac14063
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyolo/README.md
@@ -0,0 +1,240 @@
+English | [简体中文](README_cn.md)
+
+# PP-YOLO
+
+## Table of Contents
+- [Introduction](#Introduction)
+- [Model Zoo](#Model_Zoo)
+- [Getting Started](#Getting_Start)
+- [Future Work](#Future_Work)
+- [Appendix](#Appendix)
+
+## Introduction
+
+[PP-YOLO](https://arxiv.org/abs/2007.12099) is an optimized model based on YOLOv3 in PaddleDetection, whose performance (mAP on COCO) and inference speed are both better than [YOLOv4](https://arxiv.org/abs/2004.10934). PaddlePaddle 2.0.2 (available on pip now) or a [daily version](https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/install/Tables.html#whl-develop) is required to run PP-YOLO.
+
+PP-YOLO reached mAP(IoU=0.5:0.95) of 45.9% on the COCO test-dev2017 dataset; FP32 inference speed on a single V100 is 72.9 FPS, and FP16 inference speed with TensorRT on a single V100 is 155.6 FPS.
+
+
+
+
+
+PP-YOLO and PP-YOLOv2 improved the performance and speed of YOLOv3 with the following methods:
+
+- Better backbone: ResNet50vd-DCN
+- Larger training batch size: 8 GPUs and mini-batch size as 24 on each GPU
+- [Drop Block](https://arxiv.org/abs/1810.12890)
+- [Exponential Moving Average](https://www.investopedia.com/terms/e/ema.asp)
+- [IoU Loss](https://arxiv.org/pdf/1902.09630.pdf)
+- [Grid Sensitive](https://arxiv.org/abs/2004.10934)
+- [Matrix NMS](https://arxiv.org/pdf/2003.10152.pdf)
+- [CoordConv](https://arxiv.org/abs/1807.03247)
+- [Spatial Pyramid Pooling](https://arxiv.org/abs/1406.4729)
+- Better ImageNet pretrain weights
+- [PAN](https://arxiv.org/abs/1803.01534)
+- IoU Aware Loss
+- Larger input size
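Of the tricks above, Exponential Moving Average is applied to the model weights rather than the data: a shadow copy is updated as `shadow = decay * shadow + (1 - decay) * weight` after each optimizer step, and evaluation uses the shadow weights. A toy sketch with plain Python floats standing in for tensors (the `decay` value here is illustrative, not necessarily what PP-YOLO uses):

```python
class EMA:
    """Shadow copy of model weights, updated after every optimizer step."""
    def __init__(self, params, decay=0.9998):
        self.decay = decay
        self.shadow = dict(params)  # copy of {name: value}

    def update(self, params):
        d = self.decay
        for k, v in params.items():
            self.shadow[k] = d * self.shadow[k] + (1 - d) * v

params = {"w": 0.0}
ema = EMA(params, decay=0.9)
for step in range(1, 4):
    params["w"] = float(step)      # pretend the optimizer moved the weight
    ema.update(params)
print(round(ema.shadow["w"], 3))   # → 0.561, lagging behind the raw 3.0
```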
+
+## Model Zoo
+
+### PP-YOLO
+
+| Model | GPU number | images/GPU | backbone | input shape | Box APval | Box APtest | V100 FP32(FPS) | V100 TensorRT FP16(FPS) | download | config |
+|:------------------------:|:-------:|:-------------:|:----------:| :-------:| :------------------: | :-------------------: | :------------: | :---------------------: | :------: | :------: |
+| PP-YOLO | 8 | 24 | ResNet50vd | 608 | 44.8 | 45.2 | 72.9 | 155.6 | [model](https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml) |
+| PP-YOLO | 8 | 24 | ResNet50vd | 512 | 43.9 | 44.4 | 89.9 | 188.4 | [model](https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml) |
+| PP-YOLO | 8 | 24 | ResNet50vd | 416 | 42.1 | 42.5 | 109.1 | 215.4 | [model](https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml) |
+| PP-YOLO | 8 | 24 | ResNet50vd | 320 | 38.9 | 39.3 | 132.2 | 242.2 | [model](https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml) |
+| PP-YOLO_2x | 8 | 24 | ResNet50vd | 608 | 45.3 | 45.9 | 72.9 | 155.6 | [model](https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_2x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r50vd_dcn_2x_coco.yml) |
+| PP-YOLO_2x | 8 | 24 | ResNet50vd | 512 | 44.4 | 45.0 | 89.9 | 188.4 | [model](https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_2x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r50vd_dcn_2x_coco.yml) |
+| PP-YOLO_2x | 8 | 24 | ResNet50vd | 416 | 42.7 | 43.2 | 109.1 | 215.4 | [model](https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_2x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r50vd_dcn_2x_coco.yml) |
+| PP-YOLO_2x | 8 | 24 | ResNet50vd | 320 | 39.5 | 40.1 | 132.2 | 242.2 | [model](https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_2x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r50vd_dcn_2x_coco.yml) |
+| PP-YOLO | 4 | 32 | ResNet18vd | 512 | 29.2 | 29.5 | 357.1 | 657.9 | [model](https://paddledet.bj.bcebos.com/models/ppyolo_r18vd_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r18vd_coco.yml) |
+| PP-YOLO | 4 | 32 | ResNet18vd | 416 | 28.6 | 28.9 | 409.8 | 719.4 | [model](https://paddledet.bj.bcebos.com/models/ppyolo_r18vd_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r18vd_coco.yml) |
+| PP-YOLO | 4 | 32 | ResNet18vd | 320 | 26.2 | 26.4 | 480.7 | 763.4 | [model](https://paddledet.bj.bcebos.com/models/ppyolo_r18vd_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r18vd_coco.yml) |
+| PP-YOLOv2 | 8 | 12 | ResNet50vd | 640 | 49.1 | 49.5 | 68.9 | 106.5 | [model](https://paddledet.bj.bcebos.com/models/ppyolov2_r50vd_dcn_365e_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml) |
+| PP-YOLOv2 | 8 | 12 | ResNet101vd | 640 | 49.7 | 50.3 | 49.5 | 87.0 | [model](https://paddledet.bj.bcebos.com/models/ppyolov2_r101vd_dcn_365e_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolov2_r101vd_dcn_365e_coco.yml) |
+
+
+**Notes:**
+
+- PP-YOLO is trained on the COCO train2017 dataset and evaluated on the val2017 & test-dev2017 datasets. Box APtest is the evaluation result of `mAP(IoU=0.5:0.95)`.
+- PP-YOLO is trained on 8 GPUs with a mini-batch size of 24 on each GPU. If the number of GPUs or the mini-batch size is changed, the learning rate and number of iterations should be adjusted according to the [FAQ](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/docs/tutorials/FAQ).
+- PP-YOLO inference speed is tested on a single Tesla V100 with batch size 1, CUDA 10.2, CUDNN 7.5.1, and TensorRT 5.1.2.2 in TensorRT mode.
+- PP-YOLO FP32 inference speed testing uses the inference model exported by `tools/export_model.py`, benchmarked by running `deploy/python/infer.py` with `--run_benchmark`. All results exclude the time cost of data reading and post-processing (NMS), the same testing method as [YOLOv4(AlexyAB)](https://github.com/AlexeyAB/darknet).
+- TensorRT FP16 inference speed testing additionally excludes the time cost of bounding-box decoding (`yolo_box`) compared with the FP32 testing above, i.e. data reading, bounding-box decoding, and post-processing (NMS) are all excluded (same testing method as [YOLOv4(AlexyAB)](https://github.com/AlexeyAB/darknet)).
+- If you set `--run_benchmark=True`, install the dependencies first: `pip install pynvml psutil GPUtil`.
+
+### PP-YOLO for mobile
+
+| Model | GPU number | images/GPU | Model Size | input shape | Box APval | Box AP50val | Kirin 990 1xCore(FPS) | download | config |
+|:----------------------------:|:-------:|:-------------:|:----------:| :-------:| :------------------: | :--------------------: | :--------------------: | :------: | :------: |
+| PP-YOLO_MobileNetV3_large | 4 | 32 | 28MB | 320 | 23.2 | 42.6 | 14.1 | [model](https://paddledet.bj.bcebos.com/models/ppyolo_mbv3_large_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_mbv3_large_coco.yml) |
+| PP-YOLO_MobileNetV3_small | 4 | 32 | 16MB | 320 | 17.2 | 33.8 | 21.5 | [model](https://paddledet.bj.bcebos.com/models/ppyolo_mbv3_small_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_mbv3_small_coco.yml) |
+
+**Notes:**
+
+- PP-YOLO_MobileNetV3 is trained on the COCO train2017 dataset and evaluated on the val2017 dataset. Box APval is the evaluation result of `mAP(IoU=0.5:0.95)`, and Box AP50val is the evaluation result of `mAP(IoU=0.5)`.
+- PP-YOLO_MobileNetV3 is trained on 4 GPUs with a mini-batch size of 32 on each GPU. If the number of GPUs or the mini-batch size is changed, the learning rate and number of iterations should be adjusted according to the [FAQ](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/docs/tutorials/FAQ).
+- PP-YOLO_MobileNetV3 inference speed is tested on Kirin 990 with 1 thread.
+
+### PP-YOLO tiny
+
+| Model | GPU number | images/GPU | Model Size | Post Quant Model Size | input shape | Box APval | Kirin 990 4xCore(FPS) | download | config | post quant model |
+|:----------------------------:|:-------:|:-------------:|:----------:| :-------------------: | :---------: | :------------------: | :-------------------: | :------: | :----: | :--------------: |
+| PP-YOLO tiny | 8 | 32 | 4.2MB | **1.3M** | 320 | 20.6 | 92.3 | [model](https://paddledet.bj.bcebos.com/models/ppyolo_tiny_650e_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_tiny_650e_coco.yml) | [inference model](https://paddledet.bj.bcebos.com/models/ppyolo_tiny_quant.tar) |
+| PP-YOLO tiny | 8 | 32 | 4.2MB | **1.3M** | 416 | 22.7 | 65.4 | [model](https://paddledet.bj.bcebos.com/models/ppyolo_tiny_650e_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_tiny_650e_coco.yml) | [inference model](https://paddledet.bj.bcebos.com/models/ppyolo_tiny_quant.tar) |
+
+**Notes:**
+
+- PP-YOLO-tiny is trained on the COCO train2017 dataset and evaluated on the val2017 dataset. Box APval is the evaluation result of `mAP(IoU=0.5:0.95)`.
+- PP-YOLO-tiny is trained on 8 GPUs with a mini-batch size of 32 on each GPU. If the number of GPUs or the mini-batch size is changed, the learning rate and number of iterations should be adjusted according to the [FAQ](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/docs/tutorials/FAQ/README.md).
+- PP-YOLO-tiny inference speed is tested on Kirin 990 with 4 threads under ARMv8.
+- We also provide a PP-YOLO-tiny post-quantization inference model, which compresses the model to **1.3MB** with nearly no loss in inference speed or accuracy.
+
+### PP-YOLO on Pascal VOC
+
+PP-YOLO models trained on the Pascal VOC dataset are as follows:
+
+| Model | GPU number | images/GPU | backbone | input shape | Box AP50val | download | config |
+|:------------------:|:----------:|:----------:|:----------:| :----------:| :--------------------: | :------: | :-----: |
+| PP-YOLO | 8 | 12 | ResNet50vd | 608 | 84.9 | [model](https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_voc.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r50vd_dcn_voc.yml) |
+| PP-YOLO | 8 | 12 | ResNet50vd | 416 | 84.3 | [model](https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_voc.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r50vd_dcn_voc.yml) |
+| PP-YOLO | 8 | 12 | ResNet50vd | 320 | 82.2 | [model](https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_voc.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r50vd_dcn_voc.yml) |
+
+## Getting Started
+
+### 1. Training
+
+Train PP-YOLO on 8 GPUs with the following command (all commands should be run from the PaddleDetection root directory by default):
+
+```bash
+python -m paddle.distributed.launch --log_dir=./ppyolo_dygraph/ --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml &>ppyolo_dygraph.log 2>&1 &
+```
+
+Optional: run `tools/anchor_cluster.py` to get anchors suitable for your dataset, then modify the anchor settings in the model and reader configuration files, such as `configs/ppyolo/_base_/ppyolo_tiny.yml` and `configs/ppyolo/_base_/ppyolo_tiny_reader.yml`.
+
+```bash
+python tools/anchor_cluster.py -c configs/ppyolo/ppyolo_tiny_650e_coco.yml -n 9 -s 320 -m v2 -i 1000
+```
+
+### 2. Evaluation
+
+Evaluate PP-YOLO on the COCO val2017 dataset on a single GPU with the following commands:
+
+```bash
+# use weights released in PaddleDetection model zoo
+CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams
+
+# use saved checkpoint in training
+CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml -o weights=output/ppyolo_r50vd_dcn_1x_coco/model_final
+```
+
+For evaluation on the COCO test-dev2017 dataset, `configs/ppyolo/ppyolo_test.yml` should be used. Please download the COCO test-dev2017 dataset from [COCO dataset download](https://cocodataset.org/#download), decompress it to the paths configured by `EvalReader.dataset` in `configs/ppyolo/ppyolo_test.yml`, and run evaluation with the following command:
+
+```bash
+# use weights released in PaddleDetection model zoo
+CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/ppyolo/ppyolo_test.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams
+
+# use saved checkpoint in training
+CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/ppyolo/ppyolo_test.yml -o weights=output/ppyolo_r50vd_dcn_1x_coco/model_final
+```
+
+Evaluation results will be saved in `bbox.json`; compress it into a `zip` package and upload it to the [COCO dataset evaluation](https://competitions.codalab.org/competitions/20794#participate) server for evaluation.
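Packaging the results can be done with the standard library. The sketch below writes a dummy `bbox.json` (in practice it is produced by the evaluation run) and zips it; the archive member name follows the common COCO test-dev convention, which is an assumption here, so check the server's instructions:

```python
import json
import zipfile

# Stand-in detection results; the real bbox.json is written by tools/eval.py.
with open("bbox.json", "w") as f:
    json.dump([{"image_id": 1, "category_id": 1,
                "bbox": [10.0, 10.0, 50.0, 30.0], "score": 0.9}], f)

with zipfile.ZipFile("bbox.zip", "w", zipfile.ZIP_DEFLATED) as z:
    # assumed naming: detections_test-dev2017_<description>_results.json
    z.write("bbox.json", arcname="detections_test-dev2017_ppyolo_results.json")
```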
+
+**NOTE 1:** `configs/ppyolo/ppyolo_test.yml` is only used for evaluation on the COCO test-dev2017 dataset; it cannot be used for training or for evaluation on COCO val2017.
+
+**NOTE 2:** Due to the overall upgrade of the dynamic graph framework, the following weights published by PaddleDetection need to be evaluated with the `--bias` flag added, for example:
+
+```bash
+# use weights released in PaddleDetection model zoo
+CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams --bias
+```
+These models are:
+
+1. ppyolo_r50vd_dcn_1x_coco
+2. ppyolo_r50vd_dcn_voc
+3. ppyolo_r18vd_coco
+4. ppyolo_mbv3_large_coco
+5. ppyolo_mbv3_small_coco
+6. ppyolo_tiny_650e_coco
+
+### 3. Inference
+
+Run inference on images with a single GPU using the following commands: use `--infer_img` to infer a single image and `--infer_dir` to infer all images in a directory.
+
+```bash
+# inference single image
+CUDA_VISIBLE_DEVICES=0 python tools/infer.py -c configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams --infer_img=demo/000000014439_640x640.jpg
+
+# inference all images in the directory
+CUDA_VISIBLE_DEVICES=0 python tools/infer.py -c configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams --infer_dir=demo
+```
+
+### 4. Inference deployment
+
+For inference deployment or benchmarking, the model exported with `tools/export_model.py` should be used; perform inference with the Paddle Inference library using the following commands:
+
+```bash
+# export the model, which will be saved in output_inference/ppyolo_r50vd_dcn_1x_coco by default
+python tools/export_model.py -c configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams
+
+# inference with Paddle Inference library
+CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyolo_r50vd_dcn_1x_coco --image_file=demo/000000014439_640x640.jpg --device=GPU
+```
+
+
+## Appendix
+
+Optimization methods and ablation experiments of PP-YOLO compared with YOLOv3:
+
+| NO. | Model | Box APval | Box APtest | Params(M) | FLOPs(G) | V100 FP32 FPS |
+| :--: | :--------------------------- | :------------------: |:--------------------: | :-------: | :------: | :-----------: |
+| A | YOLOv3-DarkNet53 | 38.9 | - | 59.13 | 65.52 | 58.2 |
+| B | YOLOv3-ResNet50vd-DCN | 39.1 | - | 43.89 | 44.71 | 79.2 |
+| C | B + LB + EMA + DropBlock | 41.4 | - | 43.89 | 44.71 | 79.2 |
+| D | C + IoU Loss | 41.9 | - | 43.89 | 44.71 | 79.2 |
+| E | D + IoU Aware | 42.5 | - | 43.90 | 44.71 | 74.9 |
+| F | E + Grid Sensitive | 42.8 | - | 43.90 | 44.71 | 74.8 |
+| G | F + Matrix NMS | 43.5 | - | 43.90 | 44.71 | 74.8 |
+| H | G + CoordConv | 44.0 | - | 43.93 | 44.76 | 74.1 |
+| I | H + SPP | 44.3 | 45.2 | 44.93 | 45.12 | 72.9 |
+| J | I + Better ImageNet Pretrain | 44.8 | 45.2 | 44.93 | 45.12 | 72.9 |
+| K | J + 2x Scheduler | 45.3 | 45.9 | 44.93 | 45.12 | 72.9 |
+
+**Notes:**
+
+- Performance and inference speed are measured with input shape 608.
+- All models are trained on the COCO train2017 dataset and evaluated on the val2017 & test-dev2017 datasets. `Box AP` is the evaluation result of `mAP(IoU=0.5:0.95)`.
+- Inference speed is tested on single Tesla V100 with batch size as 1 following test method and environment configuration in benchmark above.
+- [YOLOv3-DarkNet53](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/yolov3/yolov3_darknet53_270e_coco.yml) with 39.0 mAP is the optimized YOLOv3 model in PaddleDetection; see [YOLOv3](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/configs/yolov3/README.md) for details.
+
+## Citation
+
+```
+@article{huang2021pp,
+ title={PP-YOLOv2: A Practical Object Detector},
+ author={Huang, Xin and Wang, Xinxin and Lv, Wenyu and Bai, Xiaying and Long, Xiang and Deng, Kaipeng and Dang, Qingqing and Han, Shumin and Liu, Qiwen and Hu, Xiaoguang and others},
+ journal={arXiv preprint arXiv:2104.10419},
+ year={2021}
+}
+@misc{long2020ppyolo,
+title={PP-YOLO: An Effective and Efficient Implementation of Object Detector},
+author={Xiang Long and Kaipeng Deng and Guanzhong Wang and Yang Zhang and Qingqing Dang and Yuan Gao and Hui Shen and Jianguo Ren and Shumin Han and Errui Ding and Shilei Wen},
+year={2020},
+eprint={2007.12099},
+archivePrefix={arXiv},
+primaryClass={cs.CV}
+}
+@misc{ppdet2019,
+title={PaddleDetection, Object detection and instance segmentation toolkit based on PaddlePaddle.},
+author={PaddlePaddle Authors},
+howpublished = {\url{https://github.com/PaddlePaddle/PaddleDetection}},
+year={2019}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/ppyolo/README_cn.md b/PaddleDetection-release-2.6/configs/ppyolo/README_cn.md
new file mode 100644
index 0000000000000000000000000000000000000000..5463f96eed11af300714175435a49730914f91cc
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyolo/README_cn.md
@@ -0,0 +1,233 @@
+简体中文 | [English](README.md)
+
+# PP-YOLO 模型
+
+## 内容
+- [简介](#简介)
+- [模型库与基线](#模型库与基线)
+- [使用说明](#使用说明)
+- [未来工作](#未来工作)
+- [附录](#附录)
+
+## 简介
+
+[PP-YOLO](https://arxiv.org/abs/2007.12099)是PaddleDetection优化和改进的YOLOv3模型,其精度(COCO数据集mAP)和推理速度均优于[YOLOv4](https://arxiv.org/abs/2004.10934)模型,要求使用PaddlePaddle 2.0.2(可使用pip安装)或适当的[develop版本](https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/install/Tables.html#whl-develop)。
+
+PP-YOLO在[COCO](http://cocodataset.org) test-dev2017数据集上精度达到45.9%,在单卡V100上FP32推理速度为72.9 FPS, V100上开启TensorRT下FP16推理速度为155.6 FPS。
+
+
+
+
+
+PP-YOLO和PP-YOLOv2从如下方面优化和提升YOLOv3模型的精度和速度:
+
+- 更优的骨干网络: ResNet50vd-DCN
+- 更大的训练batch size: 8 GPUs,每GPU batch_size=24,对应调整学习率和迭代轮数
+- [Drop Block](https://arxiv.org/abs/1810.12890)
+- [Exponential Moving Average](https://www.investopedia.com/terms/e/ema.asp)
+- [IoU Loss](https://arxiv.org/pdf/1902.09630.pdf)
+- [Grid Sensitive](https://arxiv.org/abs/2004.10934)
+- [Matrix NMS](https://arxiv.org/pdf/2003.10152.pdf)
+- [CoordConv](https://arxiv.org/abs/1807.03247)
+- [Spatial Pyramid Pooling](https://arxiv.org/abs/1406.4729)
+- 更优的预训练模型
+- [PAN](https://arxiv.org/abs/1803.01534)
+- IoU Aware Loss
+- 更大的输入尺寸
+
+## 模型库
+
+### PP-YOLO模型
+
+| 模型 | GPU个数 | 每GPU图片个数 | 骨干网络 | 输入尺寸 | Box APval | Box APtest | V100 FP32(FPS) | V100 TensorRT FP16(FPS) | 模型下载 | 配置文件 |
+|:------------------------:|:-------:|:-------------:|:----------:| :-------:| :------------------: | :-------------------: | :------------: | :---------------------: | :------: | :------: |
+| PP-YOLO | 8 | 24 | ResNet50vd | 608 | 44.8 | 45.2 | 72.9 | 155.6 | [model](https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml) |
+| PP-YOLO | 8 | 24 | ResNet50vd | 512 | 43.9 | 44.4 | 89.9 | 188.4 | [model](https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml) |
+| PP-YOLO | 8 | 24 | ResNet50vd | 416 | 42.1 | 42.5 | 109.1 | 215.4 | [model](https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml) |
+| PP-YOLO | 8 | 24 | ResNet50vd | 320 | 38.9 | 39.3 | 132.2 | 242.2 | [model](https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml) |
+| PP-YOLO_2x | 8 | 24 | ResNet50vd | 608 | 45.3 | 45.9 | 72.9 | 155.6 | [model](https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_2x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r50vd_dcn_2x_coco.yml) |
+| PP-YOLO_2x | 8 | 24 | ResNet50vd | 512 | 44.4 | 45.0 | 89.9 | 188.4 | [model](https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_2x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r50vd_dcn_2x_coco.yml) |
+| PP-YOLO_2x | 8 | 24 | ResNet50vd | 416 | 42.7 | 43.2 | 109.1 | 215.4 | [model](https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_2x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r50vd_dcn_2x_coco.yml) |
+| PP-YOLO_2x | 8 | 24 | ResNet50vd | 320 | 39.5 | 40.1 | 132.2 | 242.2 | [model](https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_2x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r50vd_dcn_2x_coco.yml) |
+| PP-YOLO | 4 | 32 | ResNet18vd | 512 | 29.2 | 29.5 | 357.1 | 657.9 | [model](https://paddledet.bj.bcebos.com/models/ppyolo_r18vd_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r18vd_coco.yml) |
+| PP-YOLO | 4 | 32 | ResNet18vd | 416 | 28.6 | 28.9 | 409.8 | 719.4 | [model](https://paddledet.bj.bcebos.com/models/ppyolo_r18vd_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r18vd_coco.yml) |
+| PP-YOLO | 4 | 32 | ResNet18vd | 320 | 26.2 | 26.4 | 480.7 | 763.4 | [model](https://paddledet.bj.bcebos.com/models/ppyolo_r18vd_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r18vd_coco.yml) |
+| PP-YOLOv2 | 8 | 12 | ResNet50vd | 640 | 49.1 | 49.5 | 68.9 | 106.5 | [model](https://paddledet.bj.bcebos.com/models/ppyolov2_r50vd_dcn_365e_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml) |
+| PP-YOLOv2 | 8 | 12 | ResNet101vd | 640 | 49.7 | 50.3 | 49.5 | 87.0 | [model](https://paddledet.bj.bcebos.com/models/ppyolov2_r101vd_dcn_365e_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolov2_r101vd_dcn_365e_coco.yml) |
+
+**Notes:**
+
+- PP-YOLO models are trained on the train2017 split of the COCO dataset and evaluated on val2017 and test-dev2017. Box AP<sup>test</sup> is the `mAP(IoU=0.5:0.95)` evaluation result.
+- PP-YOLO models are trained on 8 GPUs with a per-GPU batch size of 24. If you train with a different number of GPUs or batch size, adjust the learning rate and number of iterations as described in the [FAQ](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/docs/tutorials/FAQ).
+- PP-YOLO inference speed is measured on a single V100 with batch size 1, using CUDA 10.2 and CUDNN 7.5.1; TensorRT inference speed tests use TensorRT 5.1.2.2.
+- FP32 inference speed is benchmarked by exporting the model with `tools/export_model.py` and running `deploy/python/infer.py` with the `--run_benchmark` flag using the Paddle inference library; the reported numbers exclude data preprocessing and model output post-processing (NMS), consistent with the [YOLOv4(AlexeyAB)](https://github.com/AlexeyAB/darknet) test method.
+- Relative to FP32, the TensorRT FP16 speed test additionally excludes the `yolo_box` (bbox decoding) step, i.e. it covers neither data preprocessing, bbox decoding, nor NMS (consistent with the [YOLOv4(AlexeyAB)](https://github.com/AlexeyAB/darknet) test method).
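As a quick illustration of the learning-rate adjustment mentioned in the notes above, the FAQ's linear scaling rule scales `base_lr` with the total batch size. The numbers below are a hypothetical example (4 GPUs with batch_size=12 against the default 8 GPUs x 24 and base_lr=0.01 of `optimizer_1x.yml`), not a recommended setting:

```shell
# Default PP-YOLO schedule: 8 GPUs x batch_size 24, base_lr 0.01.
DEFAULT_TOTAL_BS=$((8 * 24))
# Hypothetical smaller setup: 4 GPUs x batch_size 12.
NEW_TOTAL_BS=$((4 * 12))
# Linear scaling rule: scale base_lr by the ratio of total batch sizes.
NEW_LR=$(awk -v n="$NEW_TOTAL_BS" -v d="$DEFAULT_TOTAL_BS" 'BEGIN { printf "%.4f", 0.01 * n / d }')
echo "scaled base_lr: $NEW_LR"   # scaled base_lr: 0.0025
```

The number of training iterations typically needs the inverse adjustment; see the FAQ for details.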
+
+### PP-YOLO Lightweight Models
+
+| Model | GPUs | images/GPU | Model size | Input size | Box AP<sup>val</sup> | Box AP50<sup>val</sup> | Kirin 990 1xCore (FPS) | Download | Config |
+|:----------------------------:|:-------:|:-------------:|:----------:| :-------:| :------------------: | :--------------------: | :--------------------: | :------: | :------: |
+| PP-YOLO_MobileNetV3_large | 4 | 32 | 28MB | 320 | 23.2 | 42.6 | 14.1 | [model](https://paddledet.bj.bcebos.com/models/ppyolo_mbv3_large_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_mbv3_large_coco.yml) |
+| PP-YOLO_MobileNetV3_small | 4 | 32 | 16MB | 320 | 17.2 | 33.8 | 21.5 | [model](https://paddledet.bj.bcebos.com/models/ppyolo_mbv3_small_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_mbv3_small_coco.yml) |
+
+- PP-YOLO_MobileNetV3 models are trained on the train2017 split of the COCO dataset and evaluated on val2017. Box AP<sup>val</sup> is the `mAP(IoU=0.5:0.95)` evaluation result and Box AP50<sup>val</sup> is the `mAP(IoU=0.5)` evaluation result.
+- PP-YOLO_MobileNetV3 models are trained on 4 GPUs with a per-GPU batch size of 32. If you train with a different number of GPUs or batch size, adjust the learning rate and number of iterations as described in the [FAQ](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/docs/tutorials/FAQ).
+- PP-YOLO_MobileNetV3 inference speed is measured on a Kirin 990 chip with a single thread.
+
+### PP-YOLO Tiny Models
+
+| Model | GPUs | images/GPU | Model size | Post-quant model size | Input size | Box AP<sup>val</sup> | Kirin 990 1xCore (FPS) | Download | Config | Quantized model |
+|:----------------------------:|:----------:|:-------------:| :--------: | :------------: | :----------:| :------------------: | :--------------------: | :------: | :------: | :--------: |
+| PP-YOLO tiny | 8 | 32 | 4.2MB | **1.3M** | 320 | 20.6 | 92.3 | [model](https://paddledet.bj.bcebos.com/models/ppyolo_tiny_650e_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_tiny_650e_coco.yml) | [inference model](https://paddledet.bj.bcebos.com/models/ppyolo_tiny_quant.tar) |
+| PP-YOLO tiny | 8 | 32 | 4.2MB | **1.3M** | 416 | 22.7 | 65.4 | [model](https://paddledet.bj.bcebos.com/models/ppyolo_tiny_650e_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_tiny_650e_coco.yml) | [inference model](https://paddledet.bj.bcebos.com/models/ppyolo_tiny_quant.tar) |
+
+- PP-YOLO tiny models are trained on the train2017 split of the COCO dataset and evaluated on val2017. Box AP<sup>val</sup> is the `mAP(IoU=0.5:0.95)` evaluation result.
+- PP-YOLO tiny models are trained on 8 GPUs with a per-GPU batch size of 32. If you train with a different number of GPUs or batch size, adjust the learning rate and number of iterations as described in the [FAQ](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/docs/tutorials/FAQ/README.md).
+- PP-YOLO tiny inference speed is measured on a Kirin 990 chip with 4 threads on the arm8 architecture.
+- We also provide a post-quantization compressed model for PP-YOLO tiny, which reduces the model size to **1.3M** with almost no impact on accuracy and inference speed.
+
+### PP-YOLO on Pascal VOC
+
+PP-YOLO models trained on the Pascal VOC dataset:
+
+| Model | GPUs | images/GPU | Backbone | Input size | Box AP50<sup>val</sup> | Download | Config |
+|:------------------:|:-------:|:-------------:|:----------:| :----------:| :--------------------: | :------: | :-----: |
+| PP-YOLO | 8 | 12 | ResNet50vd | 608 | 84.9 | [model](https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_voc.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r50vd_dcn_voc.yml) |
+| PP-YOLO | 8 | 12 | ResNet50vd | 416 | 84.3 | [model](https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_voc.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r50vd_dcn_voc.yml) |
+| PP-YOLO | 8 | 12 | ResNet50vd | 320 | 82.2 | [model](https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_voc.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r50vd_dcn_voc.yml) |
+
+## Getting Started
+
+### 1. Training
+
+Training can be started on 8 GPUs with the following command (all commands below are assumed to run from the PaddleDetection root directory). Use the `--eval` flag to enable alternating evaluation during training.
+
+```bash
+python -m paddle.distributed.launch --log_dir=./ppyolo_dygraph/ --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml &>ppyolo_dygraph.log 2>&1 &
+```
+
+Optional: before training, use `tools/anchor_cluster.py` to generate anchors suited to your dataset, then update the anchor settings in the model config file and the reader config file accordingly, e.g. in `configs/ppyolo/_base_/ppyolo_tiny.yml` and `configs/ppyolo/_base_/ppyolo_tiny_reader.yml`.
+```bash
+python tools/anchor_cluster.py -c configs/ppyolo/ppyolo_tiny_650e_coco.yml -n 9 -s 320 -m v2 -i 1000
+```
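After clustering, the anchors must be kept consistent between the model config and the reader config. As a sketch of where they live (the values below are the defaults from `configs/ppyolo/_base_/ppyolo_tiny.yml`; replace them with your clustered anchors):

```yaml
YOLOv3Head:
  anchors: [[10, 15], [24, 36], [72, 42],
            [35, 87], [102, 96], [60, 170],
            [220, 125], [128, 222], [264, 266]]
  anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
```

The same `anchors` list also appears in the `Gt2YoloTarget` batch transform of `ppyolo_tiny_reader.yml` and must match.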
+
+### 2. Evaluation
+
+Evaluate the model on the COCO val2017 dataset on a single GPU with the following commands:
+
+```bash
+# evaluate with weights released by PaddleDetection
+CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams
+
+# evaluate with a checkpoint saved during training
+CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml -o weights=output/ppyolo_r50vd_dcn_1x_coco/model_final
+```
+
+We provide `configs/ppyolo/ppyolo_test.yml` for evaluation on the COCO test-dev2017 dataset. To evaluate on test-dev2017, first download the dataset from the [COCO dataset download page](https://cocodataset.org/#download), extract it to the path configured in `EvalReader.dataset` of `configs/ppyolo/ppyolo_test.yml`, and then run:
+
+```bash
+# evaluate with weights released by PaddleDetection
+CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/ppyolo/ppyolo_test.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams
+
+# evaluate with a checkpoint saved during training
+CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/ppyolo/ppyolo_test.yml -o weights=output/ppyolo_r50vd_dcn_1x_coco/model_final
+```
+
+The evaluation results are saved in `bbox.json`. Compress it into a zip archive and submit it for evaluation on the [COCO dataset evaluation page](https://competitions.codalab.org/competitions/20794#participate).
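The packaging step can be done with Python's built-in `zipfile` command-line interface, for example. The placeholder `bbox.json` written below only stands in for the real file produced by `tools/eval.py`:

```shell
# Placeholder result file; tools/eval.py writes the real bbox.json.
echo '[]' > bbox.json
# Create the zip archive to upload to the evaluation server.
python -m zipfile -c bbox.zip bbox.json
# List the archive contents to verify.
python -m zipfile -l bbox.zip
```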
+
+**Note 1:** `configs/ppyolo/ppyolo_test.yml` is only used for evaluation on the COCO test-dev dataset; it is not used for training or for evaluation on COCO val2017.
+
+**Note 2:** Due to the overall upgrade of the dynamic graph framework, the following weights released by PaddleDetection require the `--bias` flag during evaluation, e.g.
+
+```bash
+# evaluate with weights released by PaddleDetection
+CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams --bias
+```
+These models are:
+
+1. ppyolo_r50vd_dcn_1x_coco
+
+2. ppyolo_r50vd_dcn_voc
+
+3. ppyolo_r18vd_coco
+
+4. ppyolo_mbv3_large_coco
+
+5. ppyolo_mbv3_small_coco
+
+6. ppyolo_tiny_650e_coco
+
+### 3. Inference
+
+Run inference on images on a single GPU with the following commands. Specify an image path with `--infer_img`, or a directory with `--infer_dir` to run inference on all images in it.
+
+```bash
+# inference on a single image
+CUDA_VISIBLE_DEVICES=0 python tools/infer.py -c configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams --infer_img=demo/000000014439_640x640.jpg
+
+# inference on all images in a directory
+CUDA_VISIBLE_DEVICES=0 python tools/infer.py -c configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams --infer_dir=demo
+```
+
+### 4. Deployment and Inference
+
+For deployment and inference benchmarking, PP-YOLO models must first be exported with `tools/export_model.py` and then deployed with the Paddle inference library. Run the following commands:
+
+```bash
+# export the model, saved to the output_inference directory by default
+python tools/export_model.py -c configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams
+
+# inference with the Paddle inference library
+CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyolo_r50vd_dcn_1x_coco --image_file=demo/000000014439_640x640.jpg --device=GPU
+```
+
+
+## Appendix
+
+Ablation experiments of the optimizations of PP-YOLO relative to YOLOv3 are listed in the table below.
+
+| No. | Model | Box AP<sup>val</sup> | Box AP<sup>test</sup> | Params(M) | FLOPs(G) | V100 FP32 FPS |
+| :--: | :--------------------------- | :------------------: | :-------------------: | :-------: | :------: | :-----------: |
+| A | YOLOv3-DarkNet53 | 38.9 | - | 59.13 | 65.52 | 58.2 |
+| B | YOLOv3-ResNet50vd-DCN | 39.1 | - | 43.89 | 44.71 | 79.2 |
+| C | B + LB + EMA + DropBlock | 41.4 | - | 43.89 | 44.71 | 79.2 |
+| D | C + IoU Loss | 41.9 | - | 43.89 | 44.71 | 79.2 |
+| E | D + IoU Aware | 42.5 | - | 43.90 | 44.71 | 74.9 |
+| F | E + Grid Sensitive | 42.8 | - | 43.90 | 44.71 | 74.8 |
+| G | F + Matrix NMS | 43.5 | - | 43.90 | 44.71 | 74.8 |
+| H | G + CoordConv | 44.0 | - | 43.93 | 44.76 | 74.1 |
+| I | H + SPP | 44.3 | 45.2 | 44.93 | 45.12 | 72.9 |
+| J | I + Better ImageNet Pretrain | 44.8 | 45.2 | 44.93 | 45.12 | 72.9 |
+| K | J + 2x Scheduler | 45.3 | 45.9 | 44.93 | 45.12 | 72.9 |
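Reading the table end to end, the accumulated tricks lift val mAP from 38.9 (row A, the YOLOv3-DarkNet53 baseline) to 45.3 (row K), a total gain of 6.4 points:

```shell
# Val mAP of the baseline (row A) and the final model (row K) from the table above.
BASELINE=38.9
FINAL=45.3
# Print the total accuracy gain across all optimizations.
awk -v a="$BASELINE" -v b="$FINAL" 'BEGIN { printf "total gain: %.1f mAP\n", b - a }'
```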
+
+**Notes:**
+
+- Accuracy and inference speed are both measured with an input image size of 608.
+- Box AP is the `mAP(IoU=0.5:0.95)` result of models trained on COCO train2017 and evaluated on val2017 and test-dev2017.
+- Inference speed is measured on a single V100 with batch size 1 following the benchmark method described above, with CUDA 10.2 and CUDNN 7.5.1.
+- The 38.9 mAP of [YOLOv3-DarkNet53](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/yolov3/yolov3_darknet53_270e_coco.yml) is from the YOLOv3 model optimized by PaddleDetection; see [YOLOv3](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/configs/yolov3/README.md) for details.
+
+## Citation
+
+```
+@article{huang2021pp,
+ title={PP-YOLOv2: A Practical Object Detector},
+ author={Huang, Xin and Wang, Xinxin and Lv, Wenyu and Bai, Xiaying and Long, Xiang and Deng, Kaipeng and Dang, Qingqing and Han, Shumin and Liu, Qiwen and Hu, Xiaoguang and others},
+ journal={arXiv preprint arXiv:2104.10419},
+ year={2021}
+}
+@misc{long2020ppyolo,
+  title={PP-YOLO: An Effective and Efficient Implementation of Object Detector},
+  author={Xiang Long and Kaipeng Deng and Guanzhong Wang and Yang Zhang and Qingqing Dang and Yuan Gao and Hui Shen and Jianguo Ren and Shumin Han and Errui Ding and Shilei Wen},
+  year={2020},
+  eprint={2007.12099},
+  archivePrefix={arXiv},
+  primaryClass={cs.CV}
+}
+@misc{ppdet2019,
+  title={PaddleDetection, Object detection and instance segmentation toolkit based on PaddlePaddle.},
+  author={PaddlePaddle Authors},
+  howpublished = {\url{https://github.com/PaddlePaddle/PaddleDetection}},
+  year={2019}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/ppyolo/_base_/optimizer_1x.yml b/PaddleDetection-release-2.6/configs/ppyolo/_base_/optimizer_1x.yml
new file mode 100644
index 0000000000000000000000000000000000000000..fe51b296c72e4c663bf4c611d80a1173ff69f6a9
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyolo/_base_/optimizer_1x.yml
@@ -0,0 +1,22 @@
+epoch: 405
+
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 243
+ - 324
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 4000
+
+OptimizerBuilder:
+ clip_grad_by_norm: 35.
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/ppyolo/_base_/optimizer_2x.yml b/PaddleDetection-release-2.6/configs/ppyolo/_base_/optimizer_2x.yml
new file mode 100644
index 0000000000000000000000000000000000000000..c601a18601c7a0d8a79049cb0d1b9a87f41900f4
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyolo/_base_/optimizer_2x.yml
@@ -0,0 +1,22 @@
+epoch: 811
+
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 649
+ - 730
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 4000
+
+OptimizerBuilder:
+ clip_grad_by_norm: 35.
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/ppyolo/_base_/optimizer_365e.yml b/PaddleDetection-release-2.6/configs/ppyolo/_base_/optimizer_365e.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d834a4ce0547a77a236964f7dc6ce52c217be2d5
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyolo/_base_/optimizer_365e.yml
@@ -0,0 +1,21 @@
+epoch: 365
+
+LearningRate:
+ base_lr: 0.005
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 243
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 4000
+
+OptimizerBuilder:
+ clip_grad_by_norm: 35.
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/ppyolo/_base_/optimizer_650e.yml b/PaddleDetection-release-2.6/configs/ppyolo/_base_/optimizer_650e.yml
new file mode 100644
index 0000000000000000000000000000000000000000..79a1f98eacb86cf8ae8ac34ce0c1e601cce78322
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyolo/_base_/optimizer_650e.yml
@@ -0,0 +1,22 @@
+epoch: 650
+
+LearningRate:
+ base_lr: 0.005
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 430
+ - 540
+ - 610
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 4000
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/ppyolo/_base_/ppyolo_mbv3_large.yml b/PaddleDetection-release-2.6/configs/ppyolo/_base_/ppyolo_mbv3_large.yml
new file mode 100644
index 0000000000000000000000000000000000000000..0faaa9a9a3bb1d94abe183ed385558852d0fbc20
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyolo/_base_/ppyolo_mbv3_large.yml
@@ -0,0 +1,56 @@
+architecture: YOLOv3
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/MobileNetV3_large_x1_0_ssld_pretrained.pdparams
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+YOLOv3:
+ backbone: MobileNetV3
+ neck: PPYOLOFPN
+ yolo_head: YOLOv3Head
+ post_process: BBoxPostProcess
+
+MobileNetV3:
+ model_name: large
+ scale: 1.
+ with_extra_blocks: false
+ extra_block_filters: []
+ feature_maps: [13, 16]
+
+PPYOLOFPN:
+ in_channels: [160, 368]
+ coord_conv: true
+ conv_block_num: 0
+ spp: true
+ drop_block: true
+
+YOLOv3Head:
+ anchors: [[11, 18], [34, 47], [51, 126],
+ [115, 71], [120, 195], [254, 235]]
+ anchor_masks: [[3, 4, 5], [0, 1, 2]]
+ loss: YOLOv3Loss
+
+YOLOv3Loss:
+ ignore_thresh: 0.5
+ downsample: [32, 16]
+ label_smooth: false
+ scale_x_y: 1.05
+ iou_loss: IouLoss
+
+IouLoss:
+ loss_weight: 2.5
+ loss_square: true
+
+BBoxPostProcess:
+ decode:
+ name: YOLOBox
+ conf_thresh: 0.005
+ downsample_ratio: 32
+ clip_bbox: true
+ scale_x_y: 1.05
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ nms_threshold: 0.45
+ nms_top_k: 1000
+ score_threshold: 0.005
diff --git a/PaddleDetection-release-2.6/configs/ppyolo/_base_/ppyolo_mbv3_small.yml b/PaddleDetection-release-2.6/configs/ppyolo/_base_/ppyolo_mbv3_small.yml
new file mode 100644
index 0000000000000000000000000000000000000000..dda938298f2c1b65652405b808c6df14ed049c77
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyolo/_base_/ppyolo_mbv3_small.yml
@@ -0,0 +1,56 @@
+architecture: YOLOv3
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/MobileNetV3_small_x1_0_ssld_pretrained.pdparams
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+YOLOv3:
+ backbone: MobileNetV3
+ neck: PPYOLOFPN
+ yolo_head: YOLOv3Head
+ post_process: BBoxPostProcess
+
+MobileNetV3:
+ model_name: small
+ scale: 1.
+ with_extra_blocks: false
+ extra_block_filters: []
+ feature_maps: [9, 12]
+
+PPYOLOFPN:
+ in_channels: [96, 304]
+ coord_conv: true
+ conv_block_num: 0
+ spp: true
+ drop_block: true
+
+YOLOv3Head:
+ anchors: [[11, 18], [34, 47], [51, 126],
+ [115, 71], [120, 195], [254, 235]]
+ anchor_masks: [[3, 4, 5], [0, 1, 2]]
+ loss: YOLOv3Loss
+
+YOLOv3Loss:
+ ignore_thresh: 0.5
+ downsample: [32, 16]
+ label_smooth: false
+ scale_x_y: 1.05
+ iou_loss: IouLoss
+
+IouLoss:
+ loss_weight: 2.5
+ loss_square: true
+
+BBoxPostProcess:
+ decode:
+ name: YOLOBox
+ conf_thresh: 0.005
+ downsample_ratio: 32
+ clip_bbox: true
+ scale_x_y: 1.05
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ nms_threshold: 0.45
+ nms_top_k: 1000
+ score_threshold: 0.005
diff --git a/PaddleDetection-release-2.6/configs/ppyolo/_base_/ppyolo_r18vd.yml b/PaddleDetection-release-2.6/configs/ppyolo/_base_/ppyolo_r18vd.yml
new file mode 100644
index 0000000000000000000000000000000000000000..56a34838574f277b4b43dd536449ee39b7c4e0c1
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyolo/_base_/ppyolo_r18vd.yml
@@ -0,0 +1,57 @@
+architecture: YOLOv3
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet18_vd_pretrained.pdparams
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+YOLOv3:
+ backbone: ResNet
+ neck: PPYOLOFPN
+ yolo_head: YOLOv3Head
+ post_process: BBoxPostProcess
+
+ResNet:
+ depth: 18
+ variant: d
+ return_idx: [2, 3]
+ freeze_at: -1
+ freeze_norm: false
+ norm_decay: 0.
+
+PPYOLOFPN:
+ drop_block: true
+ block_size: 3
+ keep_prob: 0.9
+ conv_block_num: 0
+
+YOLOv3Head:
+ anchor_masks: [[3, 4, 5], [0, 1, 2]]
+ anchors: [[10, 14], [23, 27], [37, 58],
+ [81, 82], [135, 169], [344, 319]]
+ loss: YOLOv3Loss
+
+YOLOv3Loss:
+ ignore_thresh: 0.7
+ downsample: [32, 16]
+ label_smooth: false
+ scale_x_y: 1.05
+ iou_loss: IouLoss
+
+IouLoss:
+ loss_weight: 2.5
+ loss_square: true
+
+BBoxPostProcess:
+ decode:
+ name: YOLOBox
+ conf_thresh: 0.01
+ downsample_ratio: 32
+ clip_bbox: true
+ scale_x_y: 1.05
+ nms:
+ name: MatrixNMS
+ keep_top_k: 100
+ score_threshold: 0.01
+ post_threshold: 0.01
+ nms_top_k: -1
+ background_label: -1
diff --git a/PaddleDetection-release-2.6/configs/ppyolo/_base_/ppyolo_r50vd_dcn.yml b/PaddleDetection-release-2.6/configs/ppyolo/_base_/ppyolo_r50vd_dcn.yml
new file mode 100644
index 0000000000000000000000000000000000000000..22cad952379161a58dd298b98c1ab36999dae28d
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyolo/_base_/ppyolo_r50vd_dcn.yml
@@ -0,0 +1,66 @@
+architecture: YOLOv3
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_pretrained.pdparams
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+YOLOv3:
+ backbone: ResNet
+ neck: PPYOLOFPN
+ yolo_head: YOLOv3Head
+ post_process: BBoxPostProcess
+
+ResNet:
+ depth: 50
+ variant: d
+ return_idx: [1, 2, 3]
+ dcn_v2_stages: [3]
+ freeze_at: -1
+ freeze_norm: false
+ norm_decay: 0.
+
+PPYOLOFPN:
+ coord_conv: true
+ drop_block: true
+ block_size: 3
+ keep_prob: 0.9
+ spp: true
+
+YOLOv3Head:
+ anchors: [[10, 13], [16, 30], [33, 23],
+ [30, 61], [62, 45], [59, 119],
+ [116, 90], [156, 198], [373, 326]]
+ anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
+ loss: YOLOv3Loss
+ iou_aware: true
+ iou_aware_factor: 0.4
+
+YOLOv3Loss:
+ ignore_thresh: 0.7
+ downsample: [32, 16, 8]
+ label_smooth: false
+ scale_x_y: 1.05
+ iou_loss: IouLoss
+ iou_aware_loss: IouAwareLoss
+
+IouLoss:
+ loss_weight: 2.5
+ loss_square: true
+
+IouAwareLoss:
+ loss_weight: 1.0
+
+BBoxPostProcess:
+ decode:
+ name: YOLOBox
+ conf_thresh: 0.01
+ downsample_ratio: 32
+ clip_bbox: true
+ scale_x_y: 1.05
+ nms:
+ name: MatrixNMS
+ keep_top_k: 100
+ score_threshold: 0.01
+ post_threshold: 0.01
+ nms_top_k: -1
+ background_label: -1
diff --git a/PaddleDetection-release-2.6/configs/ppyolo/_base_/ppyolo_reader.yml b/PaddleDetection-release-2.6/configs/ppyolo/_base_/ppyolo_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..1698539afc0b63bf002831a3a6cd0c63a1828db9
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyolo/_base_/ppyolo_reader.yml
@@ -0,0 +1,42 @@
+worker_num: 2
+TrainReader:
+ inputs_def:
+ num_max_boxes: 50
+ sample_transforms:
+ - Decode: {}
+ - Mixup: {alpha: 1.5, beta: 1.5}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [320, 352, 384, 416, 448, 480, 512, 544, 576, 608], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeBox: {}
+ - PadBox: {num_max_boxes: 50}
+ - BboxXYXY2XYWH: {}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - Gt2YoloTarget: {anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]], anchors: [[10, 13], [16, 30], [33, 23], [30, 61], [62, 45], [59, 119], [116, 90], [156, 198], [373, 326]], downsample_ratios: [32, 16, 8]}
+ batch_size: 24
+ shuffle: true
+ drop_last: true
+ mixup_epoch: 25000
+ use_shared_memory: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [608, 608], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 8
+
+TestReader:
+ inputs_def:
+ image_shape: [3, 608, 608]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [608, 608], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/ppyolo/_base_/ppyolo_tiny.yml b/PaddleDetection-release-2.6/configs/ppyolo/_base_/ppyolo_tiny.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d03e2bb86a494d07b785ede5bf93db7886fe40cc
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyolo/_base_/ppyolo_tiny.yml
@@ -0,0 +1,55 @@
+architecture: YOLOv3
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/MobileNetV3_large_x0_5_pretrained.pdparams
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+YOLOv3:
+ backbone: MobileNetV3
+ neck: PPYOLOTinyFPN
+ yolo_head: YOLOv3Head
+ post_process: BBoxPostProcess
+
+MobileNetV3:
+ model_name: large
+ scale: .5
+ with_extra_blocks: false
+ extra_block_filters: []
+ feature_maps: [7, 13, 16]
+
+PPYOLOTinyFPN:
+ detection_block_channels: [160, 128, 96]
+ spp: true
+ drop_block: true
+
+YOLOv3Head:
+ anchors: [[10, 15], [24, 36], [72, 42],
+ [35, 87], [102, 96], [60, 170],
+ [220, 125], [128, 222], [264, 266]]
+ anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
+ loss: YOLOv3Loss
+
+YOLOv3Loss:
+ ignore_thresh: 0.5
+ downsample: [32, 16, 8]
+ label_smooth: false
+ scale_x_y: 1.05
+ iou_loss: IouLoss
+
+IouLoss:
+ loss_weight: 2.5
+ loss_square: true
+
+BBoxPostProcess:
+ decode:
+ name: YOLOBox
+ conf_thresh: 0.005
+ downsample_ratio: 32
+ clip_bbox: true
+ scale_x_y: 1.05
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ nms_threshold: 0.45
+ nms_top_k: 1000
+ score_threshold: 0.005
diff --git a/PaddleDetection-release-2.6/configs/ppyolo/_base_/ppyolo_tiny_reader.yml b/PaddleDetection-release-2.6/configs/ppyolo/_base_/ppyolo_tiny_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..14c8a7f5aab0fca7d9c5dfccce4d8b590c9ab2ef
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyolo/_base_/ppyolo_tiny_reader.yml
@@ -0,0 +1,42 @@
+worker_num: 4
+TrainReader:
+ inputs_def:
+ num_max_boxes: 100
+ sample_transforms:
+ - Decode: {}
+ - Mixup: {alpha: 1.5, beta: 1.5}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [192, 224, 256, 288, 320, 352, 384, 416, 448, 480, 512], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeBox: {}
+ - PadBox: {num_max_boxes: 100}
+ - BboxXYXY2XYWH: {}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - Gt2YoloTarget: {anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]], anchors: [[10, 15], [24, 36], [72, 42], [35, 87], [102, 96], [60, 170], [220, 125], [128, 222], [264, 266]], downsample_ratios: [32, 16, 8]}
+ batch_size: 32
+ shuffle: true
+ drop_last: true
+ mixup_epoch: 500
+ use_shared_memory: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [320, 320], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 8
+
+TestReader:
+ inputs_def:
+ image_shape: [3, 320, 320]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [320, 320], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/ppyolo/_base_/ppyolov2_r50vd_dcn.yml b/PaddleDetection-release-2.6/configs/ppyolo/_base_/ppyolov2_r50vd_dcn.yml
new file mode 100644
index 0000000000000000000000000000000000000000..6288adeed8a4b057261f98132456f71b724fc45d
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyolo/_base_/ppyolov2_r50vd_dcn.yml
@@ -0,0 +1,65 @@
+architecture: YOLOv3
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_pretrained.pdparams
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+YOLOv3:
+ backbone: ResNet
+ neck: PPYOLOPAN
+ yolo_head: YOLOv3Head
+ post_process: BBoxPostProcess
+
+ResNet:
+ depth: 50
+ variant: d
+ return_idx: [1, 2, 3]
+ dcn_v2_stages: [3]
+ freeze_at: -1
+ freeze_norm: false
+ norm_decay: 0.
+
+PPYOLOPAN:
+ drop_block: true
+ block_size: 3
+ keep_prob: 0.9
+ spp: true
+
+YOLOv3Head:
+ anchors: [[10, 13], [16, 30], [33, 23],
+ [30, 61], [62, 45], [59, 119],
+ [116, 90], [156, 198], [373, 326]]
+ anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
+ loss: YOLOv3Loss
+ iou_aware: true
+ iou_aware_factor: 0.5
+
+YOLOv3Loss:
+ ignore_thresh: 0.7
+ downsample: [32, 16, 8]
+ label_smooth: false
+ scale_x_y: 1.05
+ iou_loss: IouLoss
+ iou_aware_loss: IouAwareLoss
+
+IouLoss:
+ loss_weight: 2.5
+ loss_square: true
+
+IouAwareLoss:
+ loss_weight: 1.0
+
+BBoxPostProcess:
+ decode:
+ name: YOLOBox
+ conf_thresh: 0.01
+ downsample_ratio: 32
+ clip_bbox: true
+ scale_x_y: 1.05
+ nms:
+ name: MatrixNMS
+ keep_top_k: 100
+ score_threshold: 0.01
+ post_threshold: 0.01
+ nms_top_k: -1
+ background_label: -1
diff --git a/PaddleDetection-release-2.6/configs/ppyolo/_base_/ppyolov2_reader.yml b/PaddleDetection-release-2.6/configs/ppyolo/_base_/ppyolov2_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..f0dfd9f62207676c95988331a2d6ba8a07a0b2b1
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyolo/_base_/ppyolov2_reader.yml
@@ -0,0 +1,42 @@
+worker_num: 2
+TrainReader:
+ inputs_def:
+ num_max_boxes: 100
+ sample_transforms:
+ - Decode: {}
+ - Mixup: {alpha: 1.5, beta: 1.5}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [320, 352, 384, 416, 448, 480, 512, 544, 576, 608, 640, 672, 704, 736, 768], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeBox: {}
+ - PadBox: {num_max_boxes: 100}
+ - BboxXYXY2XYWH: {}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - Gt2YoloTarget: {anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]], anchors: [[10, 13], [16, 30], [33, 23], [30, 61], [62, 45], [59, 119], [116, 90], [156, 198], [373, 326]], downsample_ratios: [32, 16, 8]}
+ batch_size: 12
+ shuffle: true
+ drop_last: true
+ mixup_epoch: 25000
+ use_shared_memory: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 8
+
+TestReader:
+ inputs_def:
+ image_shape: [3, 640, 640]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/ppyolo/ppyolo_mbv3_large_coco.yml b/PaddleDetection-release-2.6/configs/ppyolo/ppyolo_mbv3_large_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..01558786e5f75658a023883bca9c6accd3ef23a2
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyolo/ppyolo_mbv3_large_coco.yml
@@ -0,0 +1,81 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ './_base_/ppyolo_mbv3_large.yml',
+ './_base_/optimizer_1x.yml',
+ './_base_/ppyolo_reader.yml',
+]
+
+snapshot_epoch: 10
+weights: output/ppyolo_mbv3_large_coco/model_final
+
+TrainReader:
+ inputs_def:
+ num_max_boxes: 90
+ sample_transforms:
+ - Decode: {}
+ - Mixup: {alpha: 1.5, beta: 1.5}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize:
+ target_size: [224, 256, 288, 320, 352, 384, 416, 448, 480, 512]
+ random_size: True
+ random_interp: True
+ keep_ratio: False
+ - NormalizeBox: {}
+ - PadBox: {num_max_boxes: 90}
+ - BboxXYXY2XYWH: {}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - Gt2YoloTarget:
+ anchor_masks: [[3, 4, 5], [0, 1, 2]]
+ anchors: [[11, 18], [34, 47], [51, 126], [115, 71], [120, 195], [254, 235]]
+ downsample_ratios: [32, 16]
+ iou_thresh: 0.25
+ num_classes: 80
+ batch_size: 32
+ mixup_epoch: 200
+ shuffle: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [320, 320], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 8
+
+TestReader:
+ inputs_def:
+ image_shape: [3, 320, 320]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [320, 320], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+epoch: 270
+
+LearningRate:
+ base_lr: 0.005
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 162
+ - 216
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 4000
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
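
The `LearningRate` section above combines a `!LinearWarmup` over 4000 steps with a `!PiecewiseDecay` (gamma 0.1) at epoch milestones 162 and 216. A minimal Python sketch of the resulting schedule, assuming for illustration that the trainer converts epoch milestones into step counts (the `steps_per_epoch=1000` value here is a hypothetical placeholder, not a config default):

```python
def lr_at_step(step, base_lr=0.005, warmup_steps=4000, start_factor=0.0,
               milestones=(162, 216), steps_per_epoch=1000, gamma=0.1):
    """Linear warmup followed by piecewise decay, mirroring the YAML schedulers."""
    if step < warmup_steps:
        # LinearWarmup: ramp from start_factor * base_lr up to base_lr.
        frac = step / warmup_steps
        return base_lr * (start_factor + (1.0 - start_factor) * frac)
    lr = base_lr
    for m in milestones:
        # PiecewiseDecay: multiply by gamma at each milestone already passed.
        if step >= m * steps_per_epoch:
            lr *= gamma
    return lr

print(lr_at_step(0))        # 0.0 at the start of warmup
print(lr_at_step(2000))     # 0.0025, halfway through warmup
print(lr_at_step(100_000))  # 0.005, the base learning rate
print(lr_at_step(200_000))  # ~0.0005 after the first milestone
```
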
diff --git a/PaddleDetection-release-2.6/configs/ppyolo/ppyolo_mbv3_small_coco.yml b/PaddleDetection-release-2.6/configs/ppyolo/ppyolo_mbv3_small_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..53554c40ccba90cbb8019b23f2b7a64ce3c35bc7
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyolo/ppyolo_mbv3_small_coco.yml
@@ -0,0 +1,81 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ './_base_/ppyolo_mbv3_small.yml',
+ './_base_/optimizer_1x.yml',
+ './_base_/ppyolo_reader.yml',
+]
+
+snapshot_epoch: 10
+weights: output/ppyolo_mbv3_small_coco/model_final
+
+TrainReader:
+ inputs_def:
+ num_max_boxes: 90
+ sample_transforms:
+ - Decode: {}
+ - Mixup: {alpha: 1.5, beta: 1.5}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize:
+ target_size: [224, 256, 288, 320, 352, 384, 416, 448, 480, 512]
+ random_size: True
+ random_interp: True
+ keep_ratio: False
+ - NormalizeBox: {}
+ - PadBox: {num_max_boxes: 90}
+ - BboxXYXY2XYWH: {}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - Gt2YoloTarget:
+ anchor_masks: [[3, 4, 5], [0, 1, 2]]
+ anchors: [[11, 18], [34, 47], [51, 126], [115, 71], [120, 195], [254, 235]]
+ downsample_ratios: [32, 16]
+ iou_thresh: 0.25
+ num_classes: 80
+ batch_size: 32
+ mixup_epoch: 200
+ shuffle: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [320, 320], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 8
+
+TestReader:
+ inputs_def:
+ image_shape: [3, 320, 320]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [320, 320], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+epoch: 270
+
+LearningRate:
+ base_lr: 0.005
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 162
+ - 216
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 4000
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
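
In the `Gt2YoloTarget` transform above, `anchor_masks` selects which entries of the flat `anchors` list belong to each output level, paired positionally with `downsample_ratios`. A small sketch of that indexing (anchor values copied from the config; the loop itself is illustrative, not PaddleDetection code):

```python
anchors = [[11, 18], [34, 47], [51, 126], [115, 71], [120, 195], [254, 235]]
anchor_masks = [[3, 4, 5], [0, 1, 2]]   # one mask per output level
downsample_ratios = [32, 16]            # stride of each level

# The largest anchors are assigned to the coarsest (stride-32) feature map.
for mask, stride in zip(anchor_masks, downsample_ratios):
    level_anchors = [anchors[i] for i in mask]
    print(f"stride {stride}: {level_anchors}")
# stride 32: [[115, 71], [120, 195], [254, 235]]
# stride 16: [[11, 18], [34, 47], [51, 126]]
```
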
diff --git a/PaddleDetection-release-2.6/configs/ppyolo/ppyolo_r18vd_coco.yml b/PaddleDetection-release-2.6/configs/ppyolo/ppyolo_r18vd_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..311e3f16f7932bf493cddf21bcf05db9e8dd20cc
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyolo/ppyolo_r18vd_coco.yml
@@ -0,0 +1,81 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ './_base_/ppyolo_r18vd.yml',
+ './_base_/optimizer_1x.yml',
+ './_base_/ppyolo_reader.yml',
+]
+
+snapshot_epoch: 10
+weights: output/ppyolo_r18vd_coco/model_final
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - Mixup: {alpha: 1.5, beta: 1.5}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize:
+ target_size: [320, 352, 384, 416, 448, 480, 512, 544, 576, 608]
+ random_size: True
+ random_interp: True
+ keep_ratio: False
+ - NormalizeBox: {}
+ - PadBox: {num_max_boxes: 50}
+ - BboxXYXY2XYWH: {}
+ - NormalizeImage:
+ mean: [0.485, 0.456, 0.406]
+ std: [0.229, 0.224, 0.225]
+ is_scale: True
+ - Permute: {}
+ - Gt2YoloTarget:
+ anchor_masks: [[3, 4, 5], [0, 1, 2]]
+ anchors: [[10, 14], [23, 27], [37, 58], [81, 82], [135, 169], [344, 319]]
+ downsample_ratios: [32, 16]
+
+ batch_size: 32
+ mixup_epoch: 500
+ shuffle: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [512, 512], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 8
+
+TestReader:
+ inputs_def:
+ image_shape: [3, 512, 512]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [512, 512], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+epoch: 270
+
+LearningRate:
+ base_lr: 0.004
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 162
+ - 216
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 4000
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
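
The `EvalReader` pipeline above runs Decode → Resize(512, keep_ratio=False) → NormalizeImage(ImageNet mean/std, is_scale=True) → Permute. A rough NumPy re-implementation for intuition only, not PaddleDetection's actual transform code (nearest-neighbor resize for brevity; the real `Resize` uses `interp: 2`, i.e. cubic interpolation):

```python
import numpy as np

IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess(img_hwc_uint8, target_size=(512, 512)):
    """Resize + NormalizeImage(is_scale=True) + Permute, as in EvalReader."""
    h, w = img_hwc_uint8.shape[:2]
    th, tw = target_size
    # keep_ratio=False: scale both axes independently to the target size.
    ys = (np.arange(th) * h // th).clip(0, h - 1)
    xs = (np.arange(tw) * w // tw).clip(0, w - 1)
    img = img_hwc_uint8[ys][:, xs].astype(np.float32)
    img /= 255.0                                   # is_scale: True
    img = (img - IMAGENET_MEAN) / IMAGENET_STD     # NormalizeImage
    return img.transpose(2, 0, 1)                  # Permute: HWC -> CHW

out = preprocess(np.zeros((480, 640, 3), dtype=np.uint8))
print(out.shape)  # (3, 512, 512)
```
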
diff --git a/PaddleDetection-release-2.6/configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml b/PaddleDetection-release-2.6/configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..918f3401e79a34c6859d594603b322e833e263c0
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml
@@ -0,0 +1,10 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ './_base_/ppyolo_r50vd_dcn.yml',
+ './_base_/optimizer_1x.yml',
+ './_base_/ppyolo_reader.yml',
+]
+
+snapshot_epoch: 16
+weights: output/ppyolo_r50vd_dcn_1x_coco/model_final
diff --git a/PaddleDetection-release-2.6/configs/ppyolo/ppyolo_r50vd_dcn_1x_minicoco.yml b/PaddleDetection-release-2.6/configs/ppyolo/ppyolo_r50vd_dcn_1x_minicoco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..87b976b99640dbf66c92ba5b1180a80e696ba195
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyolo/ppyolo_r50vd_dcn_1x_minicoco.yml
@@ -0,0 +1,44 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ './_base_/ppyolo_r50vd_dcn.yml',
+ './_base_/optimizer_1x.yml',
+ './_base_/ppyolo_reader.yml',
+]
+
+snapshot_epoch: 8
+use_ema: true
+weights: output/ppyolo_r50vd_dcn_1x_minicoco/model_final
+
+TrainReader:
+ batch_size: 12
+
+TrainDataset:
+ !COCODataSet
+ image_dir: train2017
+ # refer to https://github.com/giddyyupp/coco-minitrain
+ anno_path: annotations/instances_minitrain2017.json
+ dataset_dir: dataset/coco
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+epoch: 192
+
+LearningRate:
+ base_lr: 0.005
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 153
+ - 173
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 4000
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/ppyolo/ppyolo_r50vd_dcn_2x_coco.yml b/PaddleDetection-release-2.6/configs/ppyolo/ppyolo_r50vd_dcn_2x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..ac6531fe78ae85ec56fdaf6eed17b38dd807b805
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyolo/ppyolo_r50vd_dcn_2x_coco.yml
@@ -0,0 +1,10 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ './_base_/ppyolo_r50vd_dcn.yml',
+ './_base_/optimizer_2x.yml',
+ './_base_/ppyolo_reader.yml',
+]
+
+snapshot_epoch: 16
+weights: output/ppyolo_r50vd_dcn_2x_coco/model_final
diff --git a/PaddleDetection-release-2.6/configs/ppyolo/ppyolo_r50vd_dcn_voc.yml b/PaddleDetection-release-2.6/configs/ppyolo/ppyolo_r50vd_dcn_voc.yml
new file mode 100644
index 0000000000000000000000000000000000000000..5349d6b1ed381705218f32daf17bff92a233d89e
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyolo/ppyolo_r50vd_dcn_voc.yml
@@ -0,0 +1,42 @@
+_BASE_: [
+ '../datasets/voc.yml',
+ '../runtime.yml',
+ './_base_/ppyolo_r50vd_dcn.yml',
+ './_base_/optimizer_1x.yml',
+ './_base_/ppyolo_reader.yml',
+]
+
+snapshot_epoch: 83
+weights: output/ppyolo_r50vd_dcn_voc/model_final
+
+TrainReader:
+ mixup_epoch: 350
+ batch_size: 12
+
+# Set collate_batch to false because ground-truth info is needed
+# for the VOC dataset, and data should not be collated into a batch
+# when the batch size is larger than 1.
+EvalReader:
+ collate_batch: false
+
+epoch: 583
+
+LearningRate:
+ base_lr: 0.00333
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 466
+ - 516
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 4000
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/ppyolo/ppyolo_test.yml b/PaddleDetection-release-2.6/configs/ppyolo/ppyolo_test.yml
new file mode 100644
index 0000000000000000000000000000000000000000..7c2ca0b5355c73f964bc950d3ab2d42629c9d82b
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyolo/ppyolo_test.yml
@@ -0,0 +1,15 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ './_base_/ppyolo_r50vd_dcn.yml',
+ './_base_/optimizer_1x.yml',
+ './_base_/ppyolo_reader.yml',
+]
+
+snapshot_epoch: 16
+
+EvalDataset:
+ !COCODataSet
+ image_dir: test2017
+ anno_path: annotations/image_info_test-dev2017.json
+ dataset_dir: dataset/coco
diff --git a/PaddleDetection-release-2.6/configs/ppyolo/ppyolo_tiny_650e_coco.yml b/PaddleDetection-release-2.6/configs/ppyolo/ppyolo_tiny_650e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..288a0eba8063864877762dfecf9b22373121fe2a
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyolo/ppyolo_tiny_650e_coco.yml
@@ -0,0 +1,10 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ './_base_/ppyolo_tiny.yml',
+ './_base_/optimizer_650e.yml',
+ './_base_/ppyolo_tiny_reader.yml',
+]
+
+snapshot_epoch: 1
+weights: output/ppyolo_tiny_650e_coco/model_final
diff --git a/PaddleDetection-release-2.6/configs/ppyolo/ppyolov2_r101vd_dcn_365e_coco.yml b/PaddleDetection-release-2.6/configs/ppyolo/ppyolov2_r101vd_dcn_365e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..0f1aee746e4fd58ed060c83213c3306aea57e83e
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyolo/ppyolov2_r101vd_dcn_365e_coco.yml
@@ -0,0 +1,20 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ './_base_/ppyolov2_r50vd_dcn.yml',
+ './_base_/optimizer_365e.yml',
+ './_base_/ppyolov2_reader.yml',
+]
+
+snapshot_epoch: 8
+weights: output/ppyolov2_r101vd_dcn_365e_coco/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet101_vd_ssld_pretrained.pdparams
+
+ResNet:
+ depth: 101
+ variant: d
+ return_idx: [1, 2, 3]
+ dcn_v2_stages: [3]
+ freeze_at: -1
+ freeze_norm: false
+ norm_decay: 0.
diff --git a/PaddleDetection-release-2.6/configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml b/PaddleDetection-release-2.6/configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a5e1bc33560f882594156a6deb03798ea5553e7f
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml
@@ -0,0 +1,10 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ './_base_/ppyolov2_r50vd_dcn.yml',
+ './_base_/optimizer_365e.yml',
+ './_base_/ppyolov2_reader.yml',
+]
+
+snapshot_epoch: 8
+weights: output/ppyolov2_r50vd_dcn_365e_coco/model_final
diff --git a/PaddleDetection-release-2.6/configs/ppyolo/ppyolov2_r50vd_dcn_voc.yml b/PaddleDetection-release-2.6/configs/ppyolo/ppyolov2_r50vd_dcn_voc.yml
new file mode 100644
index 0000000000000000000000000000000000000000..cb4d3451a57d2363850fb697ff3c21ac50e6c648
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyolo/ppyolov2_r50vd_dcn_voc.yml
@@ -0,0 +1,42 @@
+_BASE_: [
+ '../datasets/voc.yml',
+ '../runtime.yml',
+ './_base_/ppyolov2_r50vd_dcn.yml',
+ './_base_/optimizer_365e.yml',
+ './_base_/ppyolov2_reader.yml',
+]
+
+snapshot_epoch: 83
+weights: output/ppyolov2_r50vd_dcn_voc/model_final
+
+TrainReader:
+ mixup_epoch: 350
+ batch_size: 12
+
+# Set collate_batch to false because ground-truth info is needed
+# for the VOC dataset, and data should not be collated into a batch
+# when the batch size is larger than 1.
+EvalReader:
+ collate_batch: false
+
+epoch: 583
+
+LearningRate:
+ base_lr: 0.00333
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 466
+ - 516
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 4000
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/README.md b/PaddleDetection-release-2.6/configs/ppyoloe/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..1c90e8ad6915e70182e45fe8dff7ed6e7ff7ba5f
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/README.md
@@ -0,0 +1,317 @@
+English | [简体中文](README_cn.md)
+
+# PP-YOLOE
+
+## Latest News
+- Released the PP-YOLOE+ model **(2022.08)**:
+  - Pretrained on the large-scale Objects365 dataset
+  - Added an alpha parameter to the block branches of the backbone
+  - Optimized end-to-end inference speed and improved training convergence speed
+
+## Legacy model
+- Please refer to [PP-YOLOE 2022.03](./README_legacy.md) for details
+
+## Table of Contents
+- [Introduction](#Introduction)
+- [Model Zoo](#Model-Zoo)
+- [Getting Start](#Getting-Start)
+- [Appendix](#Appendix)
+
+## Introduction
+PP-YOLOE is an excellent single-stage anchor-free model based on PP-YOLOv2, surpassing a variety of popular YOLO models. PP-YOLOE comes in a series of models, namely s/m/l/x, configured through a width multiplier and a depth multiplier. PP-YOLOE avoids special operators, such as Deformable Convolution or Matrix NMS, so that it can be deployed easily on a wide range of hardware. For more details, please refer to our [report](https://arxiv.org/abs/2203.16250).
+
+
+

+
+
+PP-YOLOE+_l achieves 53.3 mAP on the COCO test-dev2017 dataset with 78.1 FPS on a Tesla V100. When using TensorRT FP16, PP-YOLOE+_l can be further accelerated to 149.2 FPS. PP-YOLOE+_s/m/x also deliver excellent accuracy and speed, as shown in the [Model Zoo](#Model-Zoo).
+
+PP-YOLOE is composed of the following methods:
+- Scalable backbone and neck
+- [Task Alignment Learning](https://arxiv.org/abs/2108.07755)
+- Efficient Task-aligned head with [DFL](https://arxiv.org/abs/2006.04388) and [VFL](https://arxiv.org/abs/2008.13367)
+- [SiLU(Swish) activation function](https://arxiv.org/abs/1710.05941)
+
+## Model Zoo
+
+### Model Zoo on COCO
+
+| Model | Epoch | GPU number | images/GPU | backbone | input shape | Box AP<sup>val</sup><br>0.5:0.95 | Box AP<sup>test</sup><br>0.5:0.95 | Params(M) | FLOPs(G) | V100 FP32(FPS) | V100 TensorRT FP16(FPS) | download | config |
+|:--------------:|:-----:|:-------:|:----------:|:----------:| :-------:|:--------------------------:|:---------------------------:|:---------:|:--------:|:---------------:| :---------------------: |:------------------------------------------------------------------------------------:|:-------------------------------------------:|
+| PP-YOLOE+_s | 80 | 8 | 8 | cspresnet-s | 640 | 43.7 | 43.9 | 7.93 | 17.36 | 208.3 | 333.3 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_s_80e_coco.pdparams) | [config](./ppyoloe_plus_crn_s_80e_coco.yml) |
+| PP-YOLOE+_m | 80 | 8 | 8 | cspresnet-m | 640 | 49.8 | 50.0 | 23.43 | 49.91 | 123.4 | 208.3 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_m_80e_coco.pdparams) | [config](./ppyoloe_plus_crn_m_80e_coco.yml) |
+| PP-YOLOE+_l | 80 | 8 | 8 | cspresnet-l | 640 | 52.9 | 53.3 | 52.20 | 110.07 | 78.1 | 149.2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams) | [config](./ppyoloe_plus_crn_l_80e_coco.yml) |
+| PP-YOLOE+_x | 80 | 8 | 8 | cspresnet-x | 640 | 54.7 | 54.9 | 98.42 | 206.59 | 45.0 | 95.2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_x_80e_coco.pdparams) | [config](./ppyoloe_plus_crn_x_80e_coco.yml) |
+
+
+#### Tiny model
+
+| Model | Epoch | GPU number | images/GPU | backbone | input shape | Box AP<sup>val</sup><br>0.5:0.95 | Box AP<sup>test</sup><br>0.5:0.95 | Params(M) | FLOPs(G) | T4 TensorRT FP16(FPS) | download | config |
+|:--------:|:-----:|:----------:|:----------:|:----------:|:-----------:|:--------------------------:|:---------------------------:|:---------:|:--------:|:---------------------:| :------: |:--------:|
+| PP-YOLOE+_t-aux(640) | 300 | 8 | 8 | cspresnet-t | 640 | 39.7 | 56.4 | 4.85 | 19.15 | 344.8 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_t_auxhead_300e_coco.pdparams) | [config](./ppyoloe_plus_crn_t_auxhead_300e_coco.yml) |
+| PP-YOLOE+_t-aux(640)-relu | 300 | 8 | 8 | cspresnet-t | 640 | 36.4 | 53.0 | 3.60 | 12.17 | 476.2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_t_auxhead_relu_300e_coco.pdparams) | [config](./ppyoloe_plus_crn_t_auxhead_relu_300e_coco.yml) |
+| PP-YOLOE+_t-aux(320) | 300 | 8 | 8 | cspresnet-t | 320 | 33.3 | 48.5 | 4.85 | 4.80 | 729.9 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_t_auxhead_320_300e_coco.pdparams) | [config](./ppyoloe_plus_crn_t_auxhead_320_300e_coco.yml) |
+| PP-YOLOE+_t-aux(320)-relu | 300 | 8 | 8 | cspresnet-t | 320 | 29.5 | 43.7 | 3.60 | 3.04 | 984.8 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_t_auxhead_relu_320_300e_coco.pdparams) | [config](./ppyoloe_plus_crn_t_auxhead_relu_320_300e_coco.yml) |
+
+
+### Comprehensive Metrics
+| Model | Epoch | AP<sup>0.5:0.95</sup> | AP<sup>0.5</sup> | AP<sup>0.75</sup> | AP<sup>small</sup> | AP<sup>medium</sup> | AP<sup>large</sup> | AR<sup>small</sup> | AR<sup>medium</sup> | AR<sup>large</sup> |
+|:------------------------:|:-----:|:---------------:|:----------:|:------------:|:------------:| :-----------: |:------------:|:------------:|:-------------:|:------------:|
+| PP-YOLOE+_s | 80 | 43.7 | 60.6 | 47.9 | 26.5 | 47.5 | 59.0 | 46.7 | 71.4 | 81.7 |
+| PP-YOLOE+_m | 80 | 49.8 | 67.1 | 54.5 | 31.8 | 53.9 | 66.2 | 53.3 | 75.0 | 84.6 |
+| PP-YOLOE+_l | 80 | 52.9 | 70.1 | 57.9 | 35.2 | 57.5 | 69.1 | 56.0 | 77.9 | 86.9 |
+| PP-YOLOE+_x | 80 | 54.7 | 72.0 | 59.9 | 37.9 | 59.3 | 70.4 | 57.0 | 78.7 | 87.2 |
+
+
+### End-to-end Speed
+| Model | AP<sup>0.5:0.95</sup> | TRT-FP32(fps) | TRT-FP16(fps) |
+|:-----------:|:---------------:|:-------------:|:-------------:|
+| PP-YOLOE+_s | 43.7 | 44.44 | 47.85 |
+| PP-YOLOE+_m | 49.8 | 39.06 | 43.86 |
+| PP-YOLOE+_l | 52.9 | 34.01 | 42.02 |
+| PP-YOLOE+_x | 54.7 | 26.88 | 36.76 |
+
+**Notes:**
+
+- PP-YOLOE is trained on COCO train2017 dataset and evaluated on val2017 & test-dev2017 dataset.
+- The model weights in the Comprehensive Metrics table are **the same as** those in the Model Zoo above, evaluated on **val2017**.
+- PP-YOLOE used 8 GPUs for mixed precision training. If the **GPU number** or **mini-batch size** is changed, the **learning rate** should be adjusted according to the formula **lr<sub>new</sub> = lr<sub>default</sub> * (batch_size<sub>new</sub> * GPU_number<sub>new</sub>) / (batch_size<sub>default</sub> * GPU_number<sub>default</sub>)**.
+- PP-YOLOE inference speed is tested on a single Tesla V100 with batch size 1, **CUDA 10.2**, **CUDNN 7.6.5**, **TensorRT 6.0.1.8** in TensorRT mode.
+- Refer to [Speed testing](#Speed-testing) to reproduce the speed testing results of PP-YOLOE.
+- If you set `--run_benchmark=True`, you should first install these dependencies: `pip install pynvml psutil GPUtil`.
+- End-to-end speed test includes pre-processing + inference + post-processing and NMS time, using **Intel(R) Xeon(R) Gold 5117 CPU @ 2.00GHz**, **single Tesla V100**, **CUDA 11.2**, **CUDNN 8.2.0**, **TensorRT 8.0.1.6**.
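
The learning-rate scaling rule from the notes above can be written out directly (the `0.001` base value below is just an illustrative placeholder, not a config default):

```python
def scaled_lr(lr_default, batch_size_new, gpu_number_new,
              batch_size_default=8, gpu_number_default=8):
    """lr_new = lr_default * (bs_new * gpus_new) / (bs_default * gpus_default)."""
    return lr_default * (batch_size_new * gpu_number_new) / (
        batch_size_default * gpu_number_default)

# Halving the GPU count (8 -> 4) at the same per-GPU batch size halves the LR.
print(scaled_lr(0.001, batch_size_new=8, gpu_number_new=4))  # 0.0005
```
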
+
+### Model Zoo on Objects365
+| Model | Epoch | Machine number | GPU number | images/GPU | backbone | input shape | Box AP<sup>0.5</sup> | Params(M) | FLOPs(G) | V100 FP32(FPS) | V100 TensorRT FP16(FPS) | download | config |
+|:---------------:|:-----:|:-----------:|:-----------:|:-----------:|:---------:|:----------:|:--------------:|:---------:|:---------:|:-------------:|:-----------------------:| :--------:|:--------:|
+| PP-YOLOE+_s | 60 | 3 | 8 | 8 | cspresnet-s | 640 | 18.1 | 7.93 | 17.36 | 208.3 | 333.3 | [model](https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_s_obj365_pretrained.pdparams) | [config](./objects365/ppyoloe_plus_crn_s_60e_objects365.yml) |
+| PP-YOLOE+_m | 60 | 4 | 8 | 8 | cspresnet-m | 640 | 25.0 | 23.43 | 49.91 | 123.4 | 208.3 | [model](https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_m_obj365_pretrained.pdparams) | [config](./objects365/ppyoloe_plus_crn_m_60e_objects365.yml) |
+| PP-YOLOE+_l | 60 | 3 | 8 | 8 | cspresnet-l | 640 | 30.8 | 52.20 | 110.07 | 78.1 | 149.2 | [model](https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_l_obj365_pretrained.pdparams) | [config](./objects365/ppyoloe_plus_crn_l_60e_objects365.yml) |
+| PP-YOLOE+_x | 60 | 4 | 8 | 8 | cspresnet-x | 640 | 32.7 | 98.42 | 206.59 | 45.0 | 95.2 | [model](https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_x_obj365_pretrained.pdparams) | [config](./objects365/ppyoloe_plus_crn_x_60e_objects365.yml) |
+
+
+**Notes:**
+- For details of multi-machine and multi-GPU training, see [DistributedTraining](../../docs/tutorials/DistributedTraining_en.md)
+
+
+### Model Zoo on VOC
+
+| Model | Epoch | GPU number | images/GPU | backbone | input shape | Box AP<sup>0.5</sup> | Params(M) | FLOPs(G) | V100 FP32(FPS) | V100 TensorRT FP16(FPS) | download | config |
+|:---------------:|:-----:|:-----------:|:-----------:|:---------:|:----------:|:--------------:|:---------:|:---------:|:-------------:|:-----------------------:| :-------: |:--------:|
+| PP-YOLOE+_s | 30 | 8 | 8 | cspresnet-s | 640 | 86.7 | 7.93 | 17.36 | 208.3 | 333.3 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_s_30e_voc.pdparams) | [config](./voc/ppyoloe_plus_crn_s_30e_voc.yml) |
+| PP-YOLOE+_l | 30 | 8 | 8 | cspresnet-l | 640 | 89.0 | 52.20 | 110.07 | 78.1 | 149.2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_30e_voc.pdparams) | [config](./voc/ppyoloe_plus_crn_l_30e_voc.yml) |
+
+
+### Feature Models
+
+The PaddleDetection team provides configs and weights of various feature detection models based on PP-YOLOE, which users can download for use:
+
+|Scenarios | Related Datasets | Links|
+| :--------: | :---------: | :------: |
+|Pedestrian Detection | CrowdHuman | [pphuman](../pphuman) |
+|Vehicle Detection | BDD100K, UA-DETRAC | [ppvehicle](../ppvehicle) |
+|Small Object Detection | VisDrone, DOTA, xView | [smalldet](../smalldet) |
+|Densely Packed Object Detection | SKU110k | [application](./application) |
+|Rotated Object Detection | DOTA | [PP-YOLOE-R](../rotate/ppyoloe_r/) |
+
+
+## Getting Start
+
+### Datasets and Metrics
+
+The PaddleDetection team provides the **COCO and VOC datasets**; download, decompress and place them under `PaddleDetection/dataset/`:
+
+```bash
+wget https://bj.bcebos.com/v1/paddledet/data/coco.tar
+# tar -xvf coco.tar
+
+wget https://bj.bcebos.com/v1/paddledet/data/voc.zip
+# unzip voc.zip
+```
+
+**Note:**
+ - For the format of COCO style dataset, please refer to [format-data](https://cocodataset.org/#format-data) and [format-results](https://cocodataset.org/#format-results).
+ - For the evaluation metric of COCO, please refer to [detection-eval](https://cocodataset.org/#detection-eval), and install [cocoapi](https://github.com/cocodataset/cocoapi) at first.
+ - For the evaluation metric of VOC, please refer to [VOC2012](http://host.robots.ox.ac.uk/pascal/VOC/voc2012/index.html).
+
+### Custom dataset
+
+1. For annotating a custom dataset, please refer to [DetAnnoTools](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.5/docs/tutorials/data/DetAnnoTools_en.md);
+
+2. For training preparation with a custom dataset, please refer to [PrepareDataSet](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.5/docs/tutorials/data/PrepareDetDataSet_en.md).
+
+
+### Training
+
+Train PP-YOLOE+ on 8 GPUs with the following command:
+
+```bash
+python -m paddle.distributed.launch --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml --eval --amp
+```
+
+**Notes:**
+- If you need to evaluate while training, please add `--eval`.
+- PP-YOLOE+ supports mixed precision training; enable it with `--amp`.
+- PaddleDetection supports multi-machine distributed training, you can refer to [DistributedTraining tutorial](../../docs/tutorials/DistributedTraining_en.md).
+
+
+### Evaluation
+
+Evaluate PP-YOLOE+ on the COCO val2017 dataset on a single GPU with the following command:
+
+```bash
+CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams
+```
+
+To evaluate on the COCO test-dev2017 dataset, download it from [COCO dataset download](https://cocodataset.org/#download), decompress it into the COCO dataset directory, and configure `EvalDataset` as in `configs/ppyolo/ppyolo_test.yml`.
+
+### Inference
+
+Run inference on a single GPU with the following commands; use `--infer_img` to run inference on a single image and `--infer_dir` to run inference on all images in a directory.
+
+```bash
+# inference single image
+CUDA_VISIBLE_DEVICES=0 python tools/infer.py -c configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams --infer_img=demo/000000014439_640x640.jpg
+
+# inference all images in the directory
+CUDA_VISIBLE_DEVICES=0 python tools/infer.py -c configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams --infer_dir=demo
+```
+
+### Exporting models
+
+For deployment on GPU or for speed testing, the model should first be exported to an inference model using `tools/export_model.py`.
+
+**To export PP-YOLOE+ for Paddle Inference without TensorRT**, use the following command:
+
+```bash
+python tools/export_model.py -c configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams
+```
+
+**To export PP-YOLOE+ for Paddle Inference with TensorRT** for better performance, use the following command with the extra `-o trt=True` setting:
+
+```bash
+python tools/export_model.py -c configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams trt=True
+```
+
+If you want to export the PP-YOLOE model to **ONNX format**, use the following commands; also refer to the [PaddleDetection Model Export as ONNX Format Tutorial](../../deploy/EXPORT_ONNX_MODEL_en.md).
+
+```bash
+# export inference model
+python tools/export_model.py -c configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml --output_dir=output_inference -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams trt=True
+
+# install paddle2onnx
+pip install paddle2onnx
+
+# convert to onnx
+paddle2onnx --model_dir output_inference/ppyoloe_plus_crn_l_80e_coco --model_filename model.pdmodel --params_filename model.pdiparams --opset_version 11 --save_file ppyoloe_plus_crn_l_80e_coco.onnx
+
+```
+
+**Notes:** The exported ONNX model only supports batch_size=1 for now.
+
+### Speed testing
+
+For a fair comparison, the speeds in [Model Zoo](#Model-Zoo) do not include the time cost of data reading and post-processing (NMS), which matches the testing method of [YOLOv4(AlexyAB)](https://github.com/AlexeyAB/darknet). Thus, you should export the model with the extra `-o exclude_nms=True` setting.
+
+**Using Paddle Inference without TensorRT**, test speed by running the following commands:
+
+```bash
+# export inference model
+python tools/export_model.py -c configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams exclude_nms=True
+
+# speed testing with run_benchmark=True
+CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyoloe_plus_crn_l_80e_coco --image_file=demo/000000014439_640x640.jpg --run_mode=paddle --device=gpu --run_benchmark=True
+```
+
+**Using Paddle Inference with TensorRT**, test speed by running the following commands:
+
+```bash
+# export inference model with trt=True
+python tools/export_model.py -c configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams exclude_nms=True trt=True
+
+# speed testing with run_benchmark=True,run_mode=trt_fp32/trt_fp16
+CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyoloe_plus_crn_l_80e_coco --image_file=demo/000000014439_640x640.jpg --run_mode=trt_fp16 --device=gpu --run_benchmark=True
+
+```
+
+**Using TensorRT inference with ONNX**, test speed by running the following commands:
+
+```bash
+# export inference model with trt=True
+python tools/export_model.py -c configs/ppyoloe/ppyoloe_plus_crn_s_80e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_s_80e_coco.pdparams exclude_nms=True trt=True
+
+# convert to onnx
+paddle2onnx --model_dir output_inference/ppyoloe_plus_crn_s_80e_coco --model_filename model.pdmodel --params_filename model.pdiparams --opset_version 12 --save_file ppyoloe_plus_crn_s_80e_coco.onnx
+
+# trt inference using fp16 and batch_size=1
+trtexec --onnx=./ppyoloe_plus_crn_s_80e_coco.onnx --saveEngine=./ppyoloe_s_bs1.engine --workspace=1024 --avgRuns=1000 --shapes=image:1x3x640x640,scale_factor:1x2 --fp16
+
+# trt inference using fp16 and batch_size=32
+trtexec --onnx=./ppyoloe_plus_crn_s_80e_coco.onnx --saveEngine=./ppyoloe_s_bs32.engine --workspace=1024 --avgRuns=1000 --shapes=image:32x3x640x640,scale_factor:32x2 --fp16
+
+# Using the above script on a T4 machine with TensorRT 7.2, the speed of the PP-YOLOE-s model is as follows:
+
+# batch_size=1, 2.80ms, 357fps
+# batch_size=32, 67.69ms, 472fps
+
+```
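
The fps numbers in the comments above are simply `batch_size` divided by the average batch latency; a quick sanity check:

```python
def throughput_fps(batch_size, avg_latency_ms):
    """Images per second given the average latency of one batch in milliseconds."""
    return batch_size * 1000.0 / avg_latency_ms

print(int(throughput_fps(1, 2.80)))    # 357
print(int(throughput_fps(32, 67.69)))  # 472
```
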
+
+
+### Deployment
+
+PP-YOLOE can be deployed by the following approaches:
+ - Paddle Inference [Python](../../deploy/python) & [C++](../../deploy/cpp)
+ - [Paddle-TensorRT](../../deploy/TENSOR_RT.md)
+ - [PaddleServing](https://github.com/PaddlePaddle/Serving)
+ - [PaddleSlim](../slim)
+
+Next, we will introduce how to use Paddle Inference to deploy PP-YOLOE models in TensorRT FP16 mode.
+
+First, refer to the [Paddle Inference Docs](https://www.paddlepaddle.org.cn/inference/master/user_guides/download_lib.html#python), then download and install the packages corresponding to your CUDA, CUDNN and TensorRT versions.
+
+Then, export PP-YOLOE for Paddle Inference **with TensorRT** using the following command:
+
+```bash
+python tools/export_model.py -c configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams trt=True
+```
+
+Finally, run inference in TensorRT FP16 mode:
+
+```bash
+# inference single image
+CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyoloe_plus_crn_l_80e_coco --image_file=demo/000000014439_640x640.jpg --device=gpu --run_mode=trt_fp16
+
+# inference all images in the directory
+CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyoloe_plus_crn_l_80e_coco --image_dir=demo/ --device=gpu --run_mode=trt_fp16
+
+```
+
+**Notes:**
+- TensorRT performs optimization for the current hardware platform according to the definition of the network, generates an inference engine and serializes it into a file. This inference engine is only applicable to the current hardware platform. If your hardware and software platform has not changed, you can set `use_static=True` in [enable_tensorrt_engine](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.4/deploy/python/infer.py#L660). The serialized file will then be saved in the `output_inference` folder and loaded the next time TensorRT is executed.
+- PaddleDetection release/2.4 and later versions will support NMS calling TensorRT, which requires PaddlePaddle release/2.3 and later versions.
+
+### Other Datasets
+
+Model | AP | AP50
+---|---|---
+[YOLOX](https://github.com/Megvii-BaseDetection/YOLOX) | 22.6 | 37.5
+[YOLOv5](https://github.com/ultralytics/yolov5) | 26.0 | 42.7
+**PP-YOLOE** | **30.5** | **46.4**
+
+**Notes**
+- Here we use the [VisDrone](https://github.com/VisDrone/VisDrone-Dataset) dataset to detect 9 classes of objects: `person, bicycles, car, van, truck, tricycle, awning-tricycle, bus, motor`.
+- The models above are trained with the official default configs and initialized from COCO-pretrained weights.
+- *Due to limited time, more verification results will be supplemented in the future. You are also welcome to contribute to PP-YOLOE.*
+
+
+## Appendix
+
+Ablation experiments of PP-YOLOE.
+
+| NO. | Model | Box AP<sup>val</sup> | Params(M) | FLOPs(G) | V100 FP32 FPS |
+| :--: | :---------------------------: | :------------------: | :-------: | :------: | :-----------: |
+| A | PP-YOLOv2 | 49.1 | 54.58 | 115.77 | 68.9 |
+| B | A + Anchor-free | 48.8 | 54.27 | 114.78 | 69.8 |
+| C | B + CSPRepResNet | 49.5 | 47.42 | 101.87 | 85.5 |
+| D | C + TAL | 50.4 | 48.32 | 104.75 | 84.0 |
+| E | D + ET-Head | 50.9 | 52.20 | 110.07 | 78.1 |
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/README_cn.md b/PaddleDetection-release-2.6/configs/ppyoloe/README_cn.md
new file mode 100644
index 0000000000000000000000000000000000000000..6f0288d126def4891363a5e3d51c76b51073135b
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/README_cn.md
@@ -0,0 +1,314 @@
+Simplified Chinese | [English](README.md)
+
+# PP-YOLOE
+
+## Latest News
+- Released PP-YOLOE+ models: **(2022.08)**
+ - Pretrained the models on the large-scale obj365 dataset
+ - Added an alpha parameter to the block branches in the backbone
+ - Optimized end-to-end inference speed and improved training convergence
+
+## Legacy Models
+- For details, please refer to the [PP-YOLOE 2022.03 release](./README_legacy.md)
+
+## Contents
+- [Introduction](#introduction)
+- [Model Zoo](#model-zoo)
+- [Getting Started](#getting-started)
+- [Appendix](#appendix)
+
+## Introduction
+PP-YOLOE is an excellent single-stage anchor-free model based on PP-YOLOv2, surpassing a variety of popular YOLO models. PP-YOLOE comes in a series of models, i.e. s/m/l/x, which can be configured through the width multiplier and depth multiplier. PP-YOLOE avoids special operators such as Deformable Convolution or Matrix NMS so that it can be easily deployed on a wide range of hardware. For more details, please refer to our [report](https://arxiv.org/abs/2203.16250).
+
+
+
+
+
+PP-YOLOE+_l achieves 53.3 mAP on the COCO test-dev2017 dataset with 78.1 FPS on Tesla V100. PP-YOLOE+_s/m/x also have excellent accuracy-speed trade-offs; their accuracy and speed can be found in the [Model Zoo](#model-zoo).
+
+PP-YOLOE is composed of the following methods:
+- Scalable backbone and neck
+- [Task Alignment Learning](https://arxiv.org/abs/2108.07755)
+- Efficient Task-aligned head with [DFL](https://arxiv.org/abs/2006.04388) and [VFL](https://arxiv.org/abs/2008.13367)
+- [SiLU (Swish) activation function](https://arxiv.org/abs/1710.05941)
+
+## Model Zoo
+
+### COCO Model Zoo
+
+| Model | Epoch | GPU number | images/GPU | backbone | input shape | Box AP<sup>val</sup><br>0.5:0.95 | Box AP<sup>test</sup><br>0.5:0.95 | Params(M) | FLOPs(G) | V100 FP32(FPS) | V100 TensorRT FP16(FPS) | download | config |
+|:---------------:|:-----:|:---------:|:--------:|:----------:|:----------:|:--------------------------:|:---------------------------:|:---------:|:--------:|:---------------:| :---------------------: |:------------------------------------------------------------------------------------:|:-------------------------------------------:|
+| PP-YOLOE+_s | 80 | 8 | 8 | cspresnet-s | 640 | 43.7 | 43.9 | 7.93 | 17.36 | 208.3 | 333.3 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_s_80e_coco.pdparams) | [config](./ppyoloe_plus_crn_s_80e_coco.yml) |
+| PP-YOLOE+_m | 80 | 8 | 8 | cspresnet-m | 640 | 49.8 | 50.0 | 23.43 | 49.91 | 123.4 | 208.3 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_m_80e_coco.pdparams) | [config](./ppyoloe_plus_crn_m_80e_coco.yml) |
+| PP-YOLOE+_l | 80 | 8 | 8 | cspresnet-l | 640 | 52.9 | 53.3 | 52.20 | 110.07 | 78.1 | 149.2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams) | [config](./ppyoloe_plus_crn_l_80e_coco.yml) |
+| PP-YOLOE+_x | 80 | 8 | 8 | cspresnet-x | 640 | 54.7 | 54.9 | 98.42 | 206.59 | 45.0 | 95.2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_x_80e_coco.pdparams) | [config](./ppyoloe_plus_crn_x_80e_coco.yml) |
+
+#### Tiny Models
+
+| Model | Epoch | GPU number | images/GPU | backbone | input shape | Box AP<sup>val</sup><br>0.5:0.95 | Box AP<sup>test</sup><br>0.5:0.95 | Params(M) | FLOPs(G) | T4 TensorRT FP16(FPS) | download | config |
+|:----------:|:-----:|:--------:|:-----------:|:---------:|:--------:|:--------------------------:|:---------------------------:|:---------:|:--------:|:---------------------:| :------: |:--------:|
+| PP-YOLOE+_t-aux(640) | 300 | 8 | 8 | cspresnet-t | 640 | 39.7 | 56.4 | 4.85 | 19.15 | 344.8 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_t_auxhead_300e_coco.pdparams) | [config](./ppyoloe_plus_crn_t_auxhead_300e_coco.yml) |
+| PP-YOLOE+_t-aux(640)-relu | 300 | 8 | 8 | cspresnet-t | 640 | 36.4 | 53.0 | 3.60 | 12.17 | 476.2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_t_auxhead_relu_300e_coco.pdparams) | [config](./ppyoloe_plus_crn_t_auxhead_relu_300e_coco.yml) |
+| PP-YOLOE+_t-aux(320) | 300 | 8 | 8 | cspresnet-t | 320 | 33.3 | 48.5 | 4.85 | 4.80 | 729.9 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_t_auxhead_320_300e_coco.pdparams) | [config](./ppyoloe_plus_crn_t_auxhead_320_300e_coco.yml) |
+| PP-YOLOE+_t-aux(320)-relu | 300 | 8 | 8 | cspresnet-t | 320 | 29.5 | 43.7 | 3.60 | 3.04 | 984.8 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_t_auxhead_relu_320_300e_coco.pdparams) | [config](./ppyoloe_plus_crn_t_auxhead_relu_320_300e_coco.yml) |
+
+
+### Comprehensive Metrics
+| Model | Epoch | AP<sup>0.5:0.95</sup> | AP<sup>0.5</sup> | AP<sup>0.75</sup> | AP<sup>small</sup> | AP<sup>medium</sup> | AP<sup>large</sup> | AR<sup>small</sup> | AR<sup>medium</sup> | AR<sup>large</sup> |
+|:------------------------:|:-----:|:---------------:|:----------:|:-----------:|:------------:|:-------------:|:------------:|:------------:|:-------------:|:------------:|
+| PP-YOLOE+_s | 80 | 43.7 | 60.6 | 47.9 | 26.5 | 47.5 | 59.0 | 46.7 | 71.4 | 81.7 |
+| PP-YOLOE+_m | 80 | 49.8 | 67.1 | 54.5 | 31.8 | 53.9 | 66.2 | 53.3 | 75.0 | 84.6 |
+| PP-YOLOE+_l | 80 | 52.9 | 70.1 | 57.9 | 35.2 | 57.5 | 69.1 | 56.0 | 77.9 | 86.9 |
+| PP-YOLOE+_x | 80 | 54.7 | 72.0 | 59.9 | 37.9 | 59.3 | 70.4 | 57.0 | 78.7 | 87.2 |
+
+
+### End-to-end Speed
+| Model | AP<sup>0.5:0.95</sup> | TRT-FP32(fps) | TRT-FP16(fps) |
+|:------------------------:|:---------------:|:-------------:|:-------------:|
+| PP-YOLOE+_s | 43.7 | 44.44 | 47.85 |
+| PP-YOLOE+_m | 49.8 | 39.06 | 43.86 |
+| PP-YOLOE+_l | 52.9 | 34.01 | 42.02 |
+| PP-YOLOE+_x | 54.7 | 26.88 | 36.76 |
+
+**Notes:**
+
+- PP-YOLOE models are trained on COCO train2017 and evaluated on val2017 and test-dev2017.
+- The model weights in the Comprehensive Metrics table are **the same weights** as those in the Model Zoo table; the comprehensive metrics are evaluated on **val2017**.
+- PP-YOLOE models are trained with mixed precision on 8 GPUs. If the **number of GPUs** or the **batch size** changes, adjust the learning rate according to the formula **lr<sub>new</sub> = lr<sub>default</sub> * (batch_size<sub>new</sub> * GPU_number<sub>new</sub>) / (batch_size<sub>default</sub> * GPU_number<sub>default</sub>)**.
+- Inference speed is tested on a single V100 with batch size 1, using **CUDA 10.2** and **CUDNN 7.6.5**; TensorRT speed is tested with **TensorRT 6.0.1.8**.
+- Refer to [Speed Testing](#speed-testing) to reproduce the PP-YOLOE speed-testing results.
+- If you set `--run_benchmark=True`, install the dependencies first: `pip install pynvml psutil GPUtil`.
+- End-to-end speed includes model preprocessing + model inference + model postprocessing and NMS time, tested with an **Intel(R) Xeon(R) Gold 5117 CPU @ 2.00GHz**, a single **V100**, **CUDA 11.2**, **CUDNN 8.2.0** and **TensorRT 8.0.1.6**.
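The linear scaling rule for the learning rate can be sanity-checked in a few lines (a sketch; the defaults of 8 images/GPU on 8 GPUs match the COCO configs in the Model Zoo above):

```python
def scale_lr(lr_default, batch_size_new, gpu_number_new,
             batch_size_default=8, gpu_number_default=8):
    """lr_new = lr_default * (bs_new * gpus_new) / (bs_default * gpus_default)."""
    return lr_default * (batch_size_new * gpu_number_new) / (batch_size_default * gpu_number_default)

# e.g. keeping 8 images/GPU but training on 4 GPUs instead of 8 halves the LR:
print(scale_lr(0.001, 8, 4))  # 0.0005
```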
+
+### Objects365 Model Zoo
+| Model | Epoch | Machines | GPUs | images/GPU | backbone | input shape | Box AP<sup>0.5</sup> | Params(M) | FLOPs(G) | V100 FP32(FPS) | V100 TensorRT FP16(FPS) | download | config |
+|:---------------:|:-----:|:-----------:|:-----------:|:-----------:|:---------:|:----------:|:--------------:|:---------:|:---------:|:-------------:|:-----------------------:| :--------:|:--------:|
+| PP-YOLOE+_s | 60 | 3 | 8 | 8 | cspresnet-s | 640 | 18.1 | 7.93 | 17.36 | 208.3 | 333.3 | [model](https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_s_obj365_pretrained.pdparams) | [config](./objects365/ppyoloe_plus_crn_s_60e_objects365.yml) |
+| PP-YOLOE+_m | 60 | 4 | 8 | 8 | cspresnet-m | 640 | 25.0 | 23.43 | 49.91 | 123.4 | 208.3 | [model](https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_m_obj365_pretrained.pdparams) | [config](./objects365/ppyoloe_plus_crn_m_60e_objects365.yml) |
+| PP-YOLOE+_l | 60 | 3 | 8 | 8 | cspresnet-l | 640 | 30.8 | 52.20 | 110.07 | 78.1 | 149.2 | [model](https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_l_obj365_pretrained.pdparams) | [config](./objects365/ppyoloe_plus_crn_l_60e_objects365.yml) |
+| PP-YOLOE+_x | 60 | 4 | 8 | 8 | cspresnet-x | 640 | 32.7 | 98.42 | 206.59 | 45.0 | 95.2 | [model](https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_x_obj365_pretrained.pdparams) | [config](./objects365/ppyoloe_plus_crn_x_60e_objects365.yml) |
+
+
+**Notes:**
+- For details on multi-machine training, see the [documentation](../../docs/tutorials/DistributedTraining_cn.md)
+
+
+### VOC Model Zoo
+| Model | Epoch | GPU number | images/GPU | backbone | input shape | Box AP<sup>0.5</sup> | Params(M) | FLOPs(G) | V100 FP32(FPS) | V100 TensorRT FP16(FPS) | download | config |
+|:---------------:|:-----:|:-----------:|:-----------:|:---------:|:----------:|:--------------:|:---------:|:---------:|:-------------:|:-----------------------:| :-------: |:--------:|
+| PP-YOLOE+_s | 30 | 8 | 8 | cspresnet-s | 640 | 86.7 | 7.93 | 17.36 | 208.3 | 333.3 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_s_30e_voc.pdparams) | [config](./voc/ppyoloe_plus_crn_s_30e_voc.yml) |
+| PP-YOLOE+_l | 30 | 8 | 8 | cspresnet-l | 640 | 89.0 | 52.20 | 110.07 | 78.1 | 149.2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_30e_voc.pdparams) | [config](./voc/ppyoloe_plus_crn_l_30e_voc.yml) |
+
+
+### Vertical Application Models
+
+The PaddleDetection team provides configs and weights for a variety of PP-YOLOE-based vertical detection models, which users can download and use:
+
+| Scenario | Related datasets | Link |
+| :--------: | :---------: | :------: |
+| Pedestrian detection | CrowdHuman | [pphuman](../pphuman) |
+| Vehicle detection | BDD100K, UA-DETRAC | [ppvehicle](../ppvehicle) |
+| Small object detection | VisDrone, DOTA, xView | [smalldet](../smalldet) |
+| Densely packed object detection | SKU110k | [application](./application) |
+| Rotated object detection | DOTA | [PP-YOLOE-R](../rotate/ppyoloe_r/) |
+
+
+## Getting Started
+
+### Datasets and Metrics
+
+Download the **COCO and VOC datasets** provided by the PaddleDetection team and extract them under `PaddleDetection/dataset/`:
+
+```bash
+wget https://bj.bcebos.com/v1/paddledet/data/coco.tar
+# tar -xvf coco.tar
+
+wget https://bj.bcebos.com/v1/paddledet/data/voc.zip
+# unzip voc.zip
+```
+
+**Notes:**
+ - For the COCO-style format, please refer to [format-data](https://cocodataset.org/#format-data) and [format-results](https://cocodataset.org/#format-results).
+ - For COCO-style metrics, please refer to [detection-eval](https://cocodataset.org/#detection-eval) and install [cocoapi](https://github.com/cocodataset/cocoapi) first.
+ - For the VOC-style format and metrics, please refer to [VOC2012](http://host.robots.ox.ac.uk/pascal/VOC/voc2012/index.html).
+
+### Custom Datasets
+
+1. For annotating custom datasets, please refer to [DetAnnoTools](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.5/docs/tutorials/data/DetAnnoTools.md);
+2. For preparing custom datasets for training, please refer to [PrepareDataSet](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.5/docs/tutorials/data/PrepareDetDataSet.md).
+
+
+### Training
+
+Run the following command to train PP-YOLOE+:
+
+```bash
+python -m paddle.distributed.launch --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml --eval --amp
+```
+**Notes:**
+- To evaluate while training, add `--eval`.
+- PP-YOLOE+ supports mixed-precision training; add `--amp`.
+- PaddleDetection supports multi-machine training; see the [distributed training tutorial](../../docs/tutorials/DistributedTraining_cn.md).
+
+### Evaluation
+
+Run the following command to evaluate on the COCO val2017 dataset with a single GPU:
+
+```bash
+CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams
+```
+
+To evaluate on COCO test-dev2017, first download test-dev2017 from the [COCO dataset download page](https://cocodataset.org/#download), then extract it into the COCO dataset folder and configure `EvalDataset` like `configs/ppyolo/ppyolo_test.yml`.
+
+### Inference
+
+Use the following commands to run inference on a single GPU: `--infer_img` infers a single image and `--infer_dir` infers all images in a directory.
+
+```bash
+# infer a single image
+CUDA_VISIBLE_DEVICES=0 python tools/infer.py -c configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams --infer_img=demo/000000014439_640x640.jpg
+
+# infer all images in the directory
+CUDA_VISIBLE_DEVICES=0 python tools/infer.py -c configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams --infer_dir=demo
+```
+
+### Exporting the Model
+
+Deploying PP-YOLOE+ on GPU or running speed tests requires exporting the model via `tools/export_model.py`.
+
+When you use **Paddle Inference without TensorRT**, export the model with the following command:
+
+```bash
+python tools/export_model.py -c configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams
+```
+
+When you use **Paddle Inference with TensorRT**, specify `-o trt=True` when exporting:
+
+```bash
+python tools/export_model.py -c configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams trt=True
+```
+
+If you want to export the PP-YOLOE model to **ONNX format**, refer to the [PaddleDetection ONNX export tutorial](../../deploy/EXPORT_ONNX_MODEL.md) and run the following commands:
+
+```bash
+# export the inference model
+python tools/export_model.py -c configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml --output_dir=output_inference -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams trt=True
+
+# install paddle2onnx
+pip install paddle2onnx
+
+# convert to ONNX format
+paddle2onnx --model_dir output_inference/ppyoloe_plus_crn_l_80e_coco --model_filename model.pdmodel --params_filename model.pdiparams --opset_version 11 --save_file ppyoloe_plus_crn_l_80e_coco.onnx
+```
+
+**Note:** the ONNX model currently only supports batch_size=1.
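The exported model takes two inputs, `image` (1x3x640x640) and `scale_factor` (1x2). A minimal NumPy sketch of building them from a raw HWC image; the divide-by-255 normalization and nearest-neighbor resize here are illustrative assumptions, since the exact transforms come from the `TestReader` section of the config:

```python
import numpy as np

def preprocess(img, size=640):
    """Build the image and scale_factor inputs for the exported model (sketch)."""
    h, w = img.shape[:2]
    # nearest-neighbor resize to size x size (stand-in for the Resize transform)
    ys = (np.arange(size) * h / size).astype(int)
    xs = (np.arange(size) * w / size).astype(int)
    resized = img[ys][:, xs]
    # HWC uint8 -> NCHW float32 in [0, 1] (stand-in for NormalizeImage + Permute)
    image = resized.transpose(2, 0, 1)[None].astype(np.float32) / 255.0
    # ratio of resized to original size, [scale_y, scale_x]
    scale_factor = np.array([[size / h, size / w]], dtype=np.float32)
    return image, scale_factor

img = np.random.randint(0, 255, (480, 320, 3), dtype=np.uint8)
image, scale_factor = preprocess(img)
print(image.shape, scale_factor.shape)  # (1, 3, 640, 640) (1, 2)
```

These two arrays can then be passed as the `image` and `scale_factor` feeds of the ONNX session.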
+
+### Speed Testing
+
+For fairness, the speed-testing results in the [Model Zoo](#model-zoo) exclude data preprocessing and model output postprocessing (NMS), consistent with the [YOLOv4(AlexyAB)](https://github.com/AlexeyAB/darknet) testing method; this requires specifying `-o exclude_nms=True` when exporting the model.
+
+To measure speed **with Paddle Inference but without TensorRT**, run the following commands:
+
+```bash
+# export the model
+python tools/export_model.py -c configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams exclude_nms=True
+
+# speed test with run_benchmark=True
+CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyoloe_plus_crn_l_80e_coco --image_file=demo/000000014439_640x640.jpg --run_mode=paddle --device=gpu --run_benchmark=True
+```
+
+To measure speed **with Paddle Inference and TensorRT**, run the following commands:
+
+```bash
+# export the model with trt=True
+python tools/export_model.py -c configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams exclude_nms=True trt=True
+
+# speed test with run_benchmark=True, run_mode=trt_fp32/trt_fp16
+CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyoloe_plus_crn_l_80e_coco --image_file=demo/000000014439_640x640.jpg --run_mode=trt_fp16 --device=gpu --run_benchmark=True
+```
+
+
+To measure speed **with ONNX and TensorRT**, run the following commands:
+
+```bash
+# export the model
+python tools/export_model.py -c configs/ppyoloe/ppyoloe_plus_crn_s_80e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_s_80e_coco.pdparams exclude_nms=True trt=True
+
+# convert to ONNX format
+paddle2onnx --model_dir output_inference/ppyoloe_plus_crn_s_80e_coco --model_filename model.pdmodel --params_filename model.pdiparams --opset_version 12 --save_file ppyoloe_plus_crn_s_80e_coco.onnx
+
+# speed test, FP16, batch_size=1
+trtexec --onnx=./ppyoloe_plus_crn_s_80e_coco.onnx --saveEngine=./ppyoloe_s_bs1.engine --workspace=1024 --avgRuns=1000 --shapes=image:1x3x640x640,scale_factor:1x2 --fp16
+
+# speed test, FP16, batch_size=32
+trtexec --onnx=./ppyoloe_plus_crn_s_80e_coco.onnx --saveEngine=./ppyoloe_s_bs32.engine --workspace=1024 --avgRuns=1000 --shapes=image:32x3x640x640,scale_factor:32x2 --fp16
+
+# with the scripts above, on T4 with TensorRT 7.2, PP-YOLOE-plus-s runs at:
+# batch_size=1, 2.80ms, 357fps
+# batch_size=32, 67.69ms, 472fps
+```
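The fps figures in those comments follow from the reported per-batch latency, since trtexec measures latency for the whole batch:

```python
def throughput_fps(batch_size, latency_ms):
    # images per second given per-batch latency in milliseconds
    return batch_size * 1000.0 / latency_ms

print(int(throughput_fps(1, 2.80)))    # 357
print(int(throughput_fps(32, 67.69)))  # 472
```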
+
+
+
+### Deployment
+
+PP-YOLOE can be deployed by the following approaches:
+ - Paddle Inference [Python](../../deploy/python) & [C++](../../deploy/cpp)
+ - [Paddle-TensorRT](../../deploy/TENSOR_RT.md)
+ - [PaddleServing](https://github.com/PaddlePaddle/Serving)
+ - [PaddleSlim quantization](../slim)
+
+Next, we will introduce how to deploy PP-YOLOE with Paddle Inference in TensorRT FP16 mode.
+
+First, refer to the [Paddle Inference docs](https://www.paddlepaddle.org.cn/inference/master/user_guides/download_lib.html#python) and download and install the wheel package matching your CUDA, CUDNN and TensorRT versions.
+
+Then, run the following command to export the model:
+
+```bash
+python tools/export_model.py -c configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams trt=True
+```
+
+Finally, run inference in TensorRT FP16 mode:
+
+```bash
+# infer a single image
+CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyoloe_plus_crn_l_80e_coco --image_file=demo/000000014439_640x640.jpg --device=gpu --run_mode=trt_fp16
+
+# infer all images in the directory
+CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyoloe_plus_crn_l_80e_coco --image_dir=demo/ --device=gpu --run_mode=trt_fp16
+
+```
+
+**Notes:**
+- TensorRT performs optimizations for the current hardware platform according to the network definition, generates an inference engine and serializes it into a file. This engine is only applicable to the current hardware and software platform. If your platform has not changed, you can set `use_static=True` in [enable_tensorrt_engine](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.4/deploy/python/infer.py#L660); the serialized file will then be saved in the `output_inference` folder and loaded the next time TensorRT runs.
+- PaddleDetection release/2.4 and later versions support calling TensorRT for NMS, which requires PaddlePaddle release/2.3 or later.
+
+### Generalization Verification
+
+Model | AP | AP50
+---|---|---
+[YOLOX](https://github.com/Megvii-BaseDetection/YOLOX) | 22.6 | 37.5
+[YOLOv5](https://github.com/ultralytics/yolov5) | 26.0 | 42.7
+**PP-YOLOE** | **30.5** | **46.4**
+
+**Notes**
+- The experiments use the [VisDrone](https://github.com/VisDrone/VisDrone-Dataset) dataset and detect 9 of its classes: `person, bicycles, car, van, truck, tricycle, awning-tricycle, bus, motor`.
+- The models above are trained with the official default configs and load COCO-pretrained parameters.
+- *Due to limited time and manpower, more verification results will be added over time; contributions from open-source users to jointly improve PP-YOLOE are welcome.*
+
+
+## Appendix
+
+Ablation experiments of PP-YOLOE.
+
+| NO. | Model | Box AP<sup>val</sup> | Params(M) | FLOPs(G) | V100 FP32 FPS |
+| :--: | :---------------------------: | :-------------------: | :-------: | :------: | :-----------: |
+| A | PP-YOLOv2 | 49.1 | 54.58 | 115.77 | 68.9 |
+| B | A + Anchor-free | 48.8 | 54.27 | 114.78 | 69.8 |
+| C | B + CSPRepResNet | 49.5 | 47.42 | 101.87 | 85.5 |
+| D | C + TAL | 50.4 | 48.32 | 104.75 | 84.0 |
+| E | D + ET-Head | 50.9 | 52.20 | 110.07 | 78.1 |
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/README_legacy.md b/PaddleDetection-release-2.6/configs/ppyoloe/README_legacy.md
new file mode 100644
index 0000000000000000000000000000000000000000..3daab44766fe8a07adf9a93fd30c9cf47aa38fac
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/README_legacy.md
@@ -0,0 +1,39 @@
+# PP-YOLOE Legacy Model Zoo (2022.03)
+
+## Legacy Model Zoo
+| Model | Epoch | GPU number | images/GPU | backbone | input shape | Box AP<sup>val</sup><br>0.5:0.95 | Box AP<sup>test</sup><br>0.5:0.95 | Params(M) | FLOPs(G) | V100 FP32(FPS) | V100 TensorRT FP16(FPS) | download | config |
+|:------------------------:|:-------:|:-------:|:--------:|:----------:| :-------:| :------------------: | :-------------------: |:---------:|:--------:|:---------------:| :---------------------: | :------: | :------: |
+| PP-YOLOE-s | 400 | 8 | 32 | cspresnet-s | 640 | 43.4 | 43.6 | 7.93 | 17.36 | 208.3 | 333.3 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_s_400e_coco.pdparams) | [config](./ppyoloe_crn_s_400e_coco.yml) |
+| PP-YOLOE-s | 300 | 8 | 32 | cspresnet-s | 640 | 43.0 | 43.2 | 7.93 | 17.36 | 208.3 | 333.3 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_s_300e_coco.pdparams) | [config](./ppyoloe_crn_s_300e_coco.yml) |
+| PP-YOLOE-m | 300 | 8 | 28 | cspresnet-m | 640 | 49.0 | 49.1 | 23.43 | 49.91 | 123.4 | 208.3 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_m_300e_coco.pdparams) | [config](./ppyoloe_crn_m_300e_coco.yml) |
+| PP-YOLOE-l | 300 | 8 | 20 | cspresnet-l | 640 | 51.4 | 51.6 | 52.20 | 110.07 | 78.1 | 149.2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams) | [config](./ppyoloe_crn_l_300e_coco.yml) |
+| PP-YOLOE-x | 300 | 8 | 16 | cspresnet-x | 640 | 52.3 | 52.4 | 98.42 | 206.59 | 45.0 | 95.2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_x_300e_coco.pdparams) | [config](./ppyoloe_crn_x_300e_coco.yml) |
+
+### Comprehensive Metrics
+| Model | Epoch | AP<sup>0.5:0.95</sup> | AP<sup>0.5</sup> | AP<sup>0.75</sup> | AP<sup>small</sup> | AP<sup>medium</sup> | AP<sup>large</sup> | AR<sup>small</sup> | AR<sup>medium</sup> | AR<sup>large</sup> | download | config |
+|:----------------------:|:-----:|:---------------:|:----------:|:-------------:| :------------:| :-----------: | :----------: |:------------:|:-------------:|:------------:| :-----: | :-----: |
+| PP-YOLOE-s | 400 | 43.4 | 60.0 | 47.5 | 25.7 | 47.8 | 59.2 | 43.9 | 70.8 | 81.9 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_s_400e_coco.pdparams) | [config](./ppyoloe_crn_s_400e_coco.yml)|
+| PP-YOLOE-s | 300 | 43.0 | 59.6 | 47.2 | 26.0 | 47.4 | 58.7 | 45.1 | 70.6 | 81.4 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_s_300e_coco.pdparams) | [config](./ppyoloe_crn_s_300e_coco.yml)|
+| PP-YOLOE-m | 300 | 49.0 | 65.9 | 53.8 | 30.9 | 53.5 | 65.3 | 50.9 | 74.4 | 84.7 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_m_300e_coco.pdparams) | [config](./ppyoloe_crn_m_300e_coco.yml)|
+| PP-YOLOE-l | 300 | 51.4 | 68.6 | 56.2 | 34.8 | 56.1 | 68.0 | 53.1 | 76.8 | 85.6 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams) | [config](./ppyoloe_crn_l_300e_coco.yml)|
+| PP-YOLOE-x | 300 | 52.3 | 69.5 | 56.8 | 35.1 | 57.0 | 68.6 | 55.5 | 76.9 | 85.7 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_x_300e_coco.pdparams) | [config](./ppyoloe_crn_x_300e_coco.yml)|
+
+
+**Notes:**
+
+- PP-YOLOE is trained on the COCO train2017 dataset and evaluated on the val2017 & test-dev2017 datasets.
+- The model weights in the Comprehensive Metrics table are **the same as** those in the Model Zoo above, and are evaluated on **val2017**.
+- PP-YOLOE used 8 GPUs for training. If the **GPU number** or **mini-batch size** changes, the **learning rate** should be adjusted according to the formula **lr<sub>new</sub> = lr<sub>default</sub> * (batch_size<sub>new</sub> * GPU_number<sub>new</sub>) / (batch_size<sub>default</sub> * GPU_number<sub>default</sub>)**.
+- PP-YOLOE inference speed is tested on a single Tesla V100 with batch size 1, **CUDA 10.2**, **CUDNN 7.6.5**, and **TensorRT 6.0.1.8** in TensorRT mode.
+
+## Appendix
+
+Ablation experiments of PP-YOLOE.
+
+| NO. | Model | Box AP<sup>val</sup> | Params(M) | FLOPs(G) | V100 FP32 FPS |
+| :--: | :---------------------------: | :------------------: | :-------: | :------: | :-----------: |
+| A | PP-YOLOv2 | 49.1 | 54.58 | 115.77 | 68.9 |
+| B | A + Anchor-free | 48.8 | 54.27 | 114.78 | 69.8 |
+| C | B + CSPRepResNet | 49.5 | 47.42 | 101.87 | 85.5 |
+| D | C + TAL | 50.4 | 48.32 | 104.75 | 84.0 |
+| E | D + ET-Head | 50.9 | 52.20 | 110.07 | 78.1 |
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/_base_/optimizer_300e.yml b/PaddleDetection-release-2.6/configs/ppyoloe/_base_/optimizer_300e.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d07bf4e53ef03571a04bda6353f798eabe24dfcd
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/_base_/optimizer_300e.yml
@@ -0,0 +1,18 @@
+epoch: 300
+
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - name: CosineDecay
+ max_epochs: 360
+ - name: LinearWarmup
+ start_factor: 0.
+ epochs: 5
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
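Note that `max_epochs: 360` of the CosineDecay scheduler deliberately exceeds `epoch: 300`, so the learning rate never decays all the way to zero during training. A sketch of the resulting curve, assuming the warmup linearly ramps the scheduled value from `start_factor` (an approximation of PaddleDetection's scheduler composition, not its exact implementation):

```python
import math

def lr_at(epoch, base_lr=0.01, warmup_epochs=5, max_epochs=360):
    """Approximate LinearWarmup (start_factor=0.) + CosineDecay from the config above."""
    cosine = 0.5 * base_lr * (1 + math.cos(math.pi * epoch / max_epochs))
    if epoch < warmup_epochs:
        # linear ramp from 0 toward the scheduled value over the warmup epochs
        return cosine * epoch / warmup_epochs
    return cosine

print(lr_at(0))  # 0.0
# at the last training epoch (300) the LR is still positive, because max_epochs > epoch
print(lr_at(300) > 0)  # True
```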
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/_base_/optimizer_36e_xpu.yml b/PaddleDetection-release-2.6/configs/ppyoloe/_base_/optimizer_36e_xpu.yml
new file mode 100644
index 0000000000000000000000000000000000000000..951938468bd767369a41a8318306d8301e5a62fb
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/_base_/optimizer_36e_xpu.yml
@@ -0,0 +1,18 @@
+epoch: 36
+
+LearningRate:
+ base_lr: 0.00125
+ schedulers:
+ - name: CosineDecay
+ max_epochs: 43
+ - name: LinearWarmup
+ start_factor: 0.001
+ steps: 2000
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/_base_/optimizer_400e.yml b/PaddleDetection-release-2.6/configs/ppyoloe/_base_/optimizer_400e.yml
new file mode 100644
index 0000000000000000000000000000000000000000..0a8a5a6c377d2886e3c8e53b3d8fd03d7fba1146
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/_base_/optimizer_400e.yml
@@ -0,0 +1,18 @@
+epoch: 400
+
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - name: CosineDecay
+ max_epochs: 480
+ - name: LinearWarmup
+ start_factor: 0.
+ epochs: 5
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/_base_/optimizer_60e.yml b/PaddleDetection-release-2.6/configs/ppyoloe/_base_/optimizer_60e.yml
new file mode 100644
index 0000000000000000000000000000000000000000..b261003db3aa56122022234bb0332b4db811ae63
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/_base_/optimizer_60e.yml
@@ -0,0 +1,18 @@
+epoch: 60
+
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - name: CosineDecay
+ max_epochs: 72
+ - name: LinearWarmup
+ start_factor: 0.
+ epochs: 1
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/_base_/optimizer_80e.yml b/PaddleDetection-release-2.6/configs/ppyoloe/_base_/optimizer_80e.yml
new file mode 100644
index 0000000000000000000000000000000000000000..b6ba4ec31a9703c56d2e470b646354cfdfdb7ddc
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/_base_/optimizer_80e.yml
@@ -0,0 +1,18 @@
+epoch: 80
+
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - name: CosineDecay
+ max_epochs: 96
+ - name: LinearWarmup
+ start_factor: 0.
+ epochs: 5
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/_base_/ppyoloe_crn.yml b/PaddleDetection-release-2.6/configs/ppyoloe/_base_/ppyoloe_crn.yml
new file mode 100644
index 0000000000000000000000000000000000000000..118db7ee19d423a39ba7310a28dc806479128866
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/_base_/ppyoloe_crn.yml
@@ -0,0 +1,47 @@
+architecture: YOLOv3
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+ema_black_list: ['proj_conv.weight']
+custom_black_list: ['reduce_mean']
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192]
+ stage_num: 1
+ block_num: 3
+ act: 'swish'
+ spp: true
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: 100
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 300
+ score_threshold: 0.01
+ nms_threshold: 0.7
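The `nms` block above configures MultiClassNMS with `score_threshold: 0.01`, `nms_threshold: 0.7` and `keep_top_k: 300`. A minimal single-class hard-NMS sketch of what those thresholds do (illustrative only, not the MultiClassNMS op, which additionally caps per-class candidates at `nms_top_k` and handles many classes):

```python
import numpy as np

def nms(boxes, scores, score_threshold=0.01, nms_threshold=0.7, keep_top_k=300):
    """Greedy hard NMS over [x1, y1, x2, y2] boxes; returns kept indices."""
    keep = []
    order = np.argsort(-scores)
    order = order[scores[order] > score_threshold]  # drop low-confidence boxes
    while order.size and len(keep) < keep_top_k:
        i = order[0]
        keep.append(int(i))
        # IoU of the current best box against the remaining candidates
        x1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        y1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        x2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        y2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[order[1:], 2] - boxes[order[1:], 0]) * (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + area_r - inter)
        order = order[1:][iou <= nms_threshold]  # suppress heavy overlaps
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 10, 10], [20, 20, 30, 30.]])
scores = np.array([0.9, 0.8, 0.7])
print(nms(boxes, scores))  # [0, 2]
```

The second box overlaps the first with IoU 0.81 > 0.7, so it is suppressed, while the disjoint third box survives.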
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/_base_/ppyoloe_plus_crn.yml b/PaddleDetection-release-2.6/configs/ppyoloe/_base_/ppyoloe_plus_crn.yml
new file mode 100644
index 0000000000000000000000000000000000000000..c8e6191fdd4515b79596f4bd9ecb48731523a83b
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/_base_/ppyoloe_plus_crn.yml
@@ -0,0 +1,48 @@
+architecture: YOLOv3
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+ema_black_list: ['proj_conv.weight']
+custom_black_list: ['reduce_mean']
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
+ use_alpha: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192]
+ stage_num: 1
+ block_num: 3
+ act: 'swish'
+ spp: true
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: 30
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 300
+ score_threshold: 0.01
+ nms_threshold: 0.7
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/_base_/ppyoloe_plus_crn_tiny_auxhead.yml b/PaddleDetection-release-2.6/configs/ppyoloe/_base_/ppyoloe_plus_crn_tiny_auxhead.yml
new file mode 100644
index 0000000000000000000000000000000000000000..8aea82150dfaef11a9c7e7362642fdd8e5e951d9
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/_base_/ppyoloe_plus_crn_tiny_auxhead.yml
@@ -0,0 +1,60 @@
+architecture: PPYOLOEWithAuxHead
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+ema_black_list: ['proj_conv.weight']
+custom_black_list: ['reduce_mean']
+
+PPYOLOEWithAuxHead:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ aux_head: SimpleConvHead
+ post_process: ~
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
+ use_alpha: True
+
+CustomCSPPAN:
+ out_channels: [384, 384, 384]
+ stage_num: 1
+ block_num: 3
+ act: 'swish'
+ spp: true
+
+SimpleConvHead:
+ feat_in: 288
+ feat_out: 288
+ num_convs: 1
+ fpn_strides: [32, 16, 8]
+ norm_type: 'gn'
+ act: 'LeakyReLU'
+ reg_max: 16
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: 100
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ attn_conv: 'repvgg' #
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ is_close_gt: True #
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 300
+ score_threshold: 0.01
+ nms_threshold: 0.7
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/_base_/ppyoloe_plus_reader.yml b/PaddleDetection-release-2.6/configs/ppyoloe/_base_/ppyoloe_plus_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..cd9cdeff8b9d46e41a4e6fb518339168dfd4b154
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/_base_/ppyoloe_plus_reader.yml
@@ -0,0 +1,40 @@
+worker_num: 4
+eval_height: &eval_height 640
+eval_width: &eval_width 640
+eval_size: &eval_size [*eval_height, *eval_width]
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [320, 352, 384, 416, 448, 480, 512, 544, 576, 608, 640, 672, 704, 736, 768], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 8
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ batch_size: 2
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/_base_/ppyoloe_plus_reader_320.yml b/PaddleDetection-release-2.6/configs/ppyoloe/_base_/ppyoloe_plus_reader_320.yml
new file mode 100644
index 0000000000000000000000000000000000000000..2b7be58daf8e208c4875cff6be9ea48dbf0073e5
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/_base_/ppyoloe_plus_reader_320.yml
@@ -0,0 +1,40 @@
+worker_num: 4
+eval_height: &eval_height 320
+eval_width: &eval_width 320
+eval_size: &eval_size [*eval_height, *eval_width]
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [224, 256, 288, 320, 352, 384, 416, 448, 480, 512, 544], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 8
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ batch_size: 2
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/_base_/ppyoloe_reader.yml b/PaddleDetection-release-2.6/configs/ppyoloe/_base_/ppyoloe_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..9f99713e5c106321025842db1f61361a82364e77
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/_base_/ppyoloe_reader.yml
@@ -0,0 +1,40 @@
+worker_num: 4
+eval_height: &eval_height 640
+eval_width: &eval_width 640
+eval_size: &eval_size [*eval_height, *eval_width]
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [320, 352, 384, 416, 448, 480, 512, 544, 576, 608, 640, 672, 704, 736, 768], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 8
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 2
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/application/README.md b/PaddleDetection-release-2.6/configs/ppyoloe/application/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..41bf34f5bece3831539462535d376f8ad367ee3b
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/application/README.md
@@ -0,0 +1,69 @@
+# PP-YOLOE+ Downstream Tasks
+
+We have verified the strong generalization ability of the PP-YOLOE+ models: detection accuracy improves consistently on downstream tasks in diverse scenarios such as agriculture, low-light imagery, and industrial inspection.
+
+The agriculture dataset is [Embrapa WGISD](https://github.com/thsant/wgisd), built for image-based monitoring and field robotics in viticulture, providing field instances from 5 different grape varieties.
+After conversion to COCO format, it contains 242 training images, 58 test images, and 5 classes. [Download Embrapa WGISD in COCO format](https://bj.bcebos.com/v1/paddledet/data/wgisd.zip).
+
+The low-light dataset is [ExDark](https://github.com/cs-chan/Exclusively-Dark-Image-Dataset/tree/master/Dataset), collected specifically for object detection in low-light environments, covering 10 lighting conditions from extremely low light to twilight.
+After conversion to COCO format, it contains 5,891 training images, 1,472 test images, and 12 classes. [Download ExDark in COCO format](https://bj.bcebos.com/v1/paddledet/data/Exdark.zip).
+
+The industrial dataset is [PKU-Market-PCB](https://robotics.pkusz.edu.cn/resources/dataset/), built for defect detection on printed circuit boards (PCB), covering 6 common PCB defect types.
+After conversion to COCO format, it contains 555 training images, 138 test images, and 6 classes. [Download PKU-Market-PCB in COCO format](https://bj.bcebos.com/v1/paddledet/data/PCB_coco.zip).
+
+The retail dataset [SKU110k](https://github.com/eg4000/SKU110K_CVPR19) is a dense object detection dataset of supermarket scenes, containing 11,762 images and over 1.7 million instances: 8,233 images for training, 588 for validation, and 2,941 for testing.
+
+
+## Results
+
+| Model | Dataset | mAP<sup>val<br>0.5:0.95</sup> | Download | Config |
+|:---------|:---------------:|:-----------------------:|:---------:| :-----: |
+|PP-YOLOE_m| Embrapa WGISD | 52.7 | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_m_80e_wgisd.pdparams) | [config](./ppyoloe_crn_m_80e_wgisd.yml) |
+|PP-YOLOE+_m<br>(obj365_pretrained)| Embrapa WGISD | 60.8(+8.1) | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_m_80e_obj365_pretrained_wgisd.pdparams) | [config](./ppyoloe_plus_crn_m_80e_obj365_pretrained_wgisd.yml) |
+|PP-YOLOE+_m<br>(coco_pretrained)| Embrapa WGISD | 59.7(+7.0) | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_m_80e_coco_pretrained_wgisd.pdparams) | [config](./ppyoloe_plus_crn_m_80e_coco_pretrained_wgisd.yml) |
+|PP-YOLOE_m| ExDark | 56.4 | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_m_80e_exdark.pdparams) | [config](./ppyoloe_crn_m_80e_exdark.yml) |
+|PP-YOLOE+_m<br>(obj365_pretrained)| ExDark | 57.7(+1.3) | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_m_80e_obj365_pretrained_exdark.pdparams) | [config](./ppyoloe_plus_crn_m_80e_obj365_pretrained_exdark.yml) |
+|PP-YOLOE+_m<br>(coco_pretrained)| ExDark | 58.1(+1.7) | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_m_80e_coco_pretrained_exdark.pdparams) | [config](./ppyoloe_plus_crn_m_80e_coco_pretrained_exdark.yml) |
+|PP-YOLOE_m| PKU-Market-PCB | 50.8 | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_m_80e_pcb.pdparams) | [config](./ppyoloe_crn_m_80e_pcb.yml) |
+|PP-YOLOE+_m<br>(obj365_pretrained)| PKU-Market-PCB | 52.7(+1.9) | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_m_80e_obj365_pretrained_pcb.pdparams) | [config](./ppyoloe_plus_crn_m_80e_obj365_pretrained_pcb.yml) |
+|PP-YOLOE+_m<br>(coco_pretrained)| PKU-Market-PCB | 52.4(+1.6) | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_m_80e_coco_pretrained_pcb.pdparams) | [config](./ppyoloe_plus_crn_m_80e_coco_pretrained_pcb.yml) |
+
+**Note:**
+- PP-YOLOE models are trained on 8 GPUs. If the **number of GPUs** or the **batch size** changes, adjust the learning rate with the linear scaling rule **lr<sub>new</sub> = lr<sub>default</sub> * (batch_size<sub>new</sub> * GPU_number<sub>new</sub>) / (batch_size<sub>default</sub> * GPU_number<sub>default</sub>)**.
+- For detailed usage, please refer to [ppyoloe](../ppyoloe#getting-start).
+
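+The linear scaling rule above can be sketched as a one-line helper (a hypothetical snippet, not part of PaddleDetection; the defaults assume the 8-GPU, batch-size-8 setup used by these configs):

```python
# Linear learning-rate scaling:
# lr_new = lr_default * (bs_new * gpus_new) / (bs_default * gpus_default)
def scale_lr(lr_default, batch_size_new, gpus_new,
             batch_size_default=8, gpus_default=8):
    return lr_default * (batch_size_new * gpus_new) / (batch_size_default * gpus_default)

# e.g. fine-tuning on a single GPU with batch size 8 keeps 1/8 of the default LR:
print(scale_lr(0.001, batch_size_new=8, gpus_new=1))  # 0.000125
```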
+
+## SKU110k Model ZOO
+| Model | Epoch | GPU number | images/GPU | backbone | input shape | Box AP<sup>val<br>0.5:0.95 (maxDets=300)</sup> | Box AP<sup>test<br>0.5:0.95 (maxDets=300)</sup> | download | config |
+|:--------------:|:-----:|:-------:|:----------:|:----------:| :-------:|:-------------------------:|:---------------------------:|:---------:|:------:|
+| PP-YOLOE+_s | 80 | 8 | 8 | cspresnet-s | 960 | 57.4 | 58.8 | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_s_80e_sku110k.pdparams) | [config](./ppyoloe_plus_crn_s_80e_sku110k.yml) |
+| PP-YOLOE+_m | 80 | 8 | 8 | cspresnet-m | 960 | 58.2 | 59.7 | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_m_80e_sku110k.pdparams) | [config](./ppyoloe_plus_crn_m_80e_sku110k.yml) |
+| PP-YOLOE+_l | 80 | 8 | 4 | cspresnet-l | 960 | 58.8 | 60.2 | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_sku110k.pdparams) | [config](./ppyoloe_plus_crn_l_80e_sku110k.yml) |
+| PP-YOLOE+_x | 80 | 8 | 4 | cspresnet-x | 960 | 59.0 | 60.3 | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_x_80e_sku110k.pdparams) | [config](./ppyoloe_plus_crn_x_80e_sku110k.yml) |
+
+
+**Note:**
+- SKU110k models are trained on 8 GPUs. If the **number of GPUs** or the **batch size** changes, adjust the learning rate with the linear scaling rule **lr<sub>new</sub> = lr<sub>default</sub> * (batch_size<sub>new</sub> * GPU_number<sub>new</sub>) / (batch_size<sub>default</sub> * GPU_number<sub>default</sub>)**.
+- SKU110k models are evaluated with mAP at **maxDets=300**.
+- For detailed usage, please refer to [ppyoloe](../ppyoloe#getting-start).
+
+
+## Citation
+```
+@inproceedings{goldman2019dense,
+ author = {Eran Goldman and Roei Herzig and Aviv Eisenschtat and Jacob Goldberger and Tal Hassner},
+ title = {Precise Detection in Densely Packed Scenes},
+ booktitle = {Proc. Conf. Comput. Vision Pattern Recognition (CVPR)},
+ year = {2019}
+}
+
+@article{Exdark,
+  title={Getting to Know Low-light Images with The Exclusively Dark Dataset},
+  author={Loh, Yuen Peng and Chan, Chee Seng},
+  journal={Computer Vision and Image Understanding},
+  volume={178},
+  pages={30-42},
+  year={2019},
+  doi={10.1016/j.cviu.2018.10.010}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/application/_base_/exdark_detection.yml b/PaddleDetection-release-2.6/configs/ppyoloe/application/_base_/exdark_detection.yml
new file mode 100644
index 0000000000000000000000000000000000000000..07585bc5ddcea460aedaf5797b6720ceab988814
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/application/_base_/exdark_detection.yml
@@ -0,0 +1,20 @@
+metric: COCO
+num_classes: 12
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: coco_annotations/train.json
+ dataset_dir: dataset/Exdark/
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: coco_annotations/val.json
+ dataset_dir: dataset/Exdark/
+
+TestDataset:
+ !ImageFolder
+ anno_path: coco_annotations/val.json # also support txt (like VOC's label_list.txt)
+ dataset_dir: dataset/Exdark/ # if set, anno_path will be 'dataset_dir/anno_path'
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/application/_base_/pcb_detection.yml b/PaddleDetection-release-2.6/configs/ppyoloe/application/_base_/pcb_detection.yml
new file mode 100644
index 0000000000000000000000000000000000000000..53f5f3744c5aa029ed80b7a5ab911ea831d2f78e
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/application/_base_/pcb_detection.yml
@@ -0,0 +1,20 @@
+metric: COCO
+num_classes: 6
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: pcb_cocoanno/train.json
+ dataset_dir: dataset/PCB_coco/
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: pcb_cocoanno/val.json
+ dataset_dir: dataset/PCB_coco/
+
+TestDataset:
+ !ImageFolder
+ anno_path: pcb_cocoanno/val.json # also support txt (like VOC's label_list.txt)
+ dataset_dir: dataset/PCB_coco/ # if set, anno_path will be 'dataset_dir/anno_path'
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/application/_base_/sku110k.yml b/PaddleDetection-release-2.6/configs/ppyoloe/application/_base_/sku110k.yml
new file mode 100644
index 0000000000000000000000000000000000000000..664ce2f25c354d1fe5e85642e7a6ae348b59a032
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/application/_base_/sku110k.yml
@@ -0,0 +1,21 @@
+metric: COCO
+num_classes: 1
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/annotations_train.json
+ dataset_dir: dataset/SKU110K_fixed
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'difficult']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/annotations_val.json
+ dataset_dir: dataset/SKU110K_fixed
+ allow_empty: true
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/annotations_test.json
+ dataset_dir: dataset/SKU110K_fixed
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/application/_base_/wgisd_detection.yml b/PaddleDetection-release-2.6/configs/ppyoloe/application/_base_/wgisd_detection.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a2721bbd193c91884c512294eb73978eddd3bb9a
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/application/_base_/wgisd_detection.yml
@@ -0,0 +1,20 @@
+metric: COCO
+num_classes: 5
+
+TrainDataset:
+ !COCODataSet
+ image_dir: data
+ anno_path: coco_annotations/new_train_bbox_instances.json
+ dataset_dir: dataset/wgisd/
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: data
+ anno_path: coco_annotations/new_test_bbox_instances.json
+ dataset_dir: dataset/wgisd/
+
+TestDataset:
+ !ImageFolder
+ anno_path: coco_annotations/new_test_bbox_instances.json # also support txt (like VOC's label_list.txt)
+ dataset_dir: dataset/wgisd/ # if set, anno_path will be 'dataset_dir/anno_path'
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_crn_m_80e_exdark.yml b/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_crn_m_80e_exdark.yml
new file mode 100644
index 0000000000000000000000000000000000000000..6f9914dce90d39062a01579dac3dc0dc6da56430
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_crn_m_80e_exdark.yml
@@ -0,0 +1,15 @@
+_BASE_: [
+ './_base_/exdark_detection.yml',
+ '../../runtime.yml',
+ '../_base_/optimizer_80e.yml',
+ '../_base_/ppyoloe_crn.yml',
+ '../_base_/ppyoloe_reader.yml',
+]
+
+log_iter: 100
+snapshot_epoch: 5
+weights: output/ppyoloe_crn_m_80e_exdark/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_m_300e_coco.pdparams
+depth_mult: 0.67
+width_mult: 0.75
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_crn_m_80e_pcb.yml b/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_crn_m_80e_pcb.yml
new file mode 100644
index 0000000000000000000000000000000000000000..7e7de1cf8e7461aa8f1d2a1acded2c5babb37c2e
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_crn_m_80e_pcb.yml
@@ -0,0 +1,15 @@
+_BASE_: [
+ './_base_/pcb_detection.yml',
+ '../../runtime.yml',
+ '../_base_/optimizer_80e.yml',
+ '../_base_/ppyoloe_crn.yml',
+ '../_base_/ppyoloe_reader.yml',
+]
+
+log_iter: 100
+snapshot_epoch: 5
+weights: output/ppyoloe_crn_m_80e_pcb/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_m_300e_coco.pdparams
+depth_mult: 0.67
+width_mult: 0.75
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_crn_m_80e_wgisd.yml b/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_crn_m_80e_wgisd.yml
new file mode 100644
index 0000000000000000000000000000000000000000..c8658b0d6cfec4e3aba3c4ed865e1e02e55a60ca
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_crn_m_80e_wgisd.yml
@@ -0,0 +1,15 @@
+_BASE_: [
+ './_base_/wgisd_detection.yml',
+ '../../runtime.yml',
+ '../_base_/optimizer_80e.yml',
+ '../_base_/ppyoloe_crn.yml',
+ '../_base_/ppyoloe_reader.yml',
+]
+
+log_iter: 100
+snapshot_epoch: 5
+weights: output/ppyoloe_crn_m_80e_wgisd/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_m_300e_coco.pdparams
+depth_mult: 0.67
+width_mult: 0.75
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_plus_crn_l_80e_sku110k.yml b/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_plus_crn_l_80e_sku110k.yml
new file mode 100644
index 0000000000000000000000000000000000000000..858bf5f4a0ff4ad0141df000f13de0b56804b460
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_plus_crn_l_80e_sku110k.yml
@@ -0,0 +1,127 @@
+_BASE_: [
+ './_base_/sku110k.yml',
+ '../../runtime.yml'
+]
+
+log_iter: 10
+snapshot_epoch: 20
+weights: output/ppyoloe_plus_crn_l_80e_sku110k/model_final
+
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_l_obj365_pretrained.pdparams
+depth_mult: 1.0
+width_mult: 1.0
+
+
+# arch
+architecture: YOLOv3
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+custom_black_list: ['reduce_mean']
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
+ use_alpha: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192]
+ stage_num: 1
+ block_num: 3
+ act: 'swish'
+ spp: true
+ use_alpha: True
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: -1
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 3000
+ keep_top_k: 1000
+ score_threshold: 0.01
+ nms_threshold: 0.7
+
+
+# reader
+worker_num: 8
+eval_height: &eval_height 960
+eval_width: &eval_width 960
+eval_size: &eval_size [*eval_height, *eval_width]
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [3000, 1800], keep_ratio: True, interp: 2}
+ - RandomDistort: {}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800, 832, 864, 896, 928, 960, 992, 1024, 1056, 1088, 1120, 1152], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ batch_size: 2
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ batch_size: 1
+
+
+# optimizer
+epoch: 80
+
+LearningRate:
+ base_lr: 0.002
+ schedulers:
+ - !CosineDecay
+ max_epochs: 96
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 5
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_plus_crn_m_80e_coco_pretrained_exdark.yml b/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_plus_crn_m_80e_coco_pretrained_exdark.yml
new file mode 100644
index 0000000000000000000000000000000000000000..66fc8b52d4b7246fccbd12aa1ace9bee65be6229
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_plus_crn_m_80e_coco_pretrained_exdark.yml
@@ -0,0 +1,15 @@
+_BASE_: [
+ './_base_/exdark_detection.yml',
+ '../../runtime.yml',
+ '../_base_/optimizer_80e.yml',
+ '../_base_/ppyoloe_plus_crn.yml',
+ '../_base_/ppyoloe_plus_reader.yml',
+]
+
+log_iter: 100
+snapshot_epoch: 5
+weights: output/ppyoloe_plus_crn_m_80e_coco_pretrained_exdark/model_final
+
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/ppyoloe_plus_crn_m_80e_coco.pdparams
+depth_mult: 0.67
+width_mult: 0.75
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_plus_crn_m_80e_coco_pretrained_pcb.yml b/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_plus_crn_m_80e_coco_pretrained_pcb.yml
new file mode 100644
index 0000000000000000000000000000000000000000..7b0e3abd8fc30e4d5097392b8367f4b11cc14f5d
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_plus_crn_m_80e_coco_pretrained_pcb.yml
@@ -0,0 +1,15 @@
+_BASE_: [
+ './_base_/pcb_detection.yml',
+ '../../runtime.yml',
+ '../_base_/optimizer_80e.yml',
+ '../_base_/ppyoloe_plus_crn.yml',
+ '../_base_/ppyoloe_plus_reader.yml',
+]
+
+log_iter: 100
+snapshot_epoch: 5
+weights: output/ppyoloe_plus_crn_m_80e_coco_pretrained_pcb/model_final
+
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/ppyoloe_plus_crn_m_80e_coco.pdparams
+depth_mult: 0.67
+width_mult: 0.75
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_plus_crn_m_80e_coco_pretrained_wgisd.yml b/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_plus_crn_m_80e_coco_pretrained_wgisd.yml
new file mode 100644
index 0000000000000000000000000000000000000000..3e813cb09773beff735131544581b42480bebccf
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_plus_crn_m_80e_coco_pretrained_wgisd.yml
@@ -0,0 +1,15 @@
+_BASE_: [
+ './_base_/wgisd_detection.yml',
+ '../../runtime.yml',
+ '../_base_/optimizer_80e.yml',
+ '../_base_/ppyoloe_plus_crn.yml',
+ '../_base_/ppyoloe_plus_reader.yml',
+]
+
+log_iter: 100
+snapshot_epoch: 5
+weights: output/ppyoloe_plus_crn_m_80e_coco_pretrained_wgisd/model_final
+
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/ppyoloe_plus_crn_m_80e_coco.pdparams
+depth_mult: 0.67
+width_mult: 0.75
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_plus_crn_m_80e_obj365_pretrained_exdark.yml b/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_plus_crn_m_80e_obj365_pretrained_exdark.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d97f2a115ac8aac0a4f31b629bc7a2a4d5388810
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_plus_crn_m_80e_obj365_pretrained_exdark.yml
@@ -0,0 +1,15 @@
+_BASE_: [
+ './_base_/exdark_detection.yml',
+ '../../runtime.yml',
+ '../_base_/optimizer_80e.yml',
+ '../_base_/ppyoloe_plus_crn.yml',
+ '../_base_/ppyoloe_plus_reader.yml',
+]
+
+log_iter: 100
+snapshot_epoch: 5
+weights: output/ppyoloe_plus_crn_m_80e_obj365_pretrained_exdark/model_final
+
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_m_obj365_pretrained.pdparams
+depth_mult: 0.67
+width_mult: 0.75
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_plus_crn_m_80e_obj365_pretrained_pcb.yml b/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_plus_crn_m_80e_obj365_pretrained_pcb.yml
new file mode 100644
index 0000000000000000000000000000000000000000..72d5620c22e28c901a69eb358c7f82e067fa4986
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_plus_crn_m_80e_obj365_pretrained_pcb.yml
@@ -0,0 +1,15 @@
+_BASE_: [
+ './_base_/pcb_detection.yml',
+ '../../runtime.yml',
+ '../_base_/optimizer_80e.yml',
+ '../_base_/ppyoloe_plus_crn.yml',
+ '../_base_/ppyoloe_plus_reader.yml',
+]
+
+log_iter: 100
+snapshot_epoch: 5
+weights: output/ppyoloe_plus_crn_m_80e_obj365_pretrained_pcb/model_final
+
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_m_obj365_pretrained.pdparams
+depth_mult: 0.67
+width_mult: 0.75
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_plus_crn_m_80e_obj365_pretrained_wgisd.yml b/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_plus_crn_m_80e_obj365_pretrained_wgisd.yml
new file mode 100644
index 0000000000000000000000000000000000000000..6cebc6d47d85a95194873ee17885c0691bf40883
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_plus_crn_m_80e_obj365_pretrained_wgisd.yml
@@ -0,0 +1,15 @@
+_BASE_: [
+ './_base_/wgisd_detection.yml',
+ '../../runtime.yml',
+ '../_base_/optimizer_80e.yml',
+ '../_base_/ppyoloe_plus_crn.yml',
+ '../_base_/ppyoloe_plus_reader.yml',
+]
+
+log_iter: 100
+snapshot_epoch: 5
+weights: output/ppyoloe_plus_crn_m_80e_obj365_pretrained_wgisd/model_final
+
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_m_obj365_pretrained.pdparams
+depth_mult: 0.67
+width_mult: 0.75
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_plus_crn_m_80e_sku110k.yml b/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_plus_crn_m_80e_sku110k.yml
new file mode 100644
index 0000000000000000000000000000000000000000..cd7a4431cd9eec6a40d04410e4c160557d4e9be1
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_plus_crn_m_80e_sku110k.yml
@@ -0,0 +1,127 @@
+_BASE_: [
+ './_base_/sku110k.yml',
+ '../../runtime.yml'
+]
+
+log_iter: 10
+snapshot_epoch: 20
+weights: output/ppyoloe_plus_crn_m_80e_sku110k/model_final
+
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_m_obj365_pretrained.pdparams
+depth_mult: 0.67
+width_mult: 0.75
+
+
+# arch
+architecture: YOLOv3
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+custom_black_list: ['reduce_mean']
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
+ use_alpha: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192]
+ stage_num: 1
+ block_num: 3
+ act: 'swish'
+ spp: true
+ use_alpha: True
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: -1
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 3000
+ keep_top_k: 1000
+ score_threshold: 0.01
+ nms_threshold: 0.7
+
+
+# reader
+worker_num: 8
+eval_height: &eval_height 960
+eval_width: &eval_width 960
+eval_size: &eval_size [*eval_height, *eval_width]
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [3000, 1800], keep_ratio: True, interp: 2}
+ - RandomDistort: {}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800, 832, 864, 896, 928, 960, 992, 1024, 1056, 1088, 1120, 1152], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 8
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ batch_size: 2
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ batch_size: 1
+
+
+# optimizer
+epoch: 80
+
+LearningRate:
+ base_lr: 0.004
+ schedulers:
+ - !CosineDecay
+ max_epochs: 96
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 5
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_plus_crn_s_80e_sku110k.yml b/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_plus_crn_s_80e_sku110k.yml
new file mode 100644
index 0000000000000000000000000000000000000000..e196a6845a4be8f06bb623c965924157c5f206e2
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_plus_crn_s_80e_sku110k.yml
@@ -0,0 +1,127 @@
+_BASE_: [
+ './_base_/sku110k.yml',
+ '../../runtime.yml'
+]
+
+log_iter: 10
+snapshot_epoch: 20
+weights: output/ppyoloe_plus_crn_s_80e_sku110k/model_final
+
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_s_obj365_pretrained.pdparams
+depth_mult: 0.33
+width_mult: 0.50
+
+
+# arch
+architecture: YOLOv3
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+custom_black_list: ['reduce_mean']
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
+ use_alpha: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192]
+ stage_num: 1
+ block_num: 3
+ act: 'swish'
+ spp: true
+ use_alpha: True
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: -1
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 3000
+ keep_top_k: 1000
+ score_threshold: 0.01
+ nms_threshold: 0.7
+
+
+# reader
+worker_num: 8
+eval_height: &eval_height 960
+eval_width: &eval_width 960
+eval_size: &eval_size [*eval_height, *eval_width]
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [3000, 1800], keep_ratio: True, interp: 2}
+ - RandomDistort: {}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800, 832, 864, 896, 928, 960, 992, 1024, 1056, 1088, 1120, 1152], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 8
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ batch_size: 2
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ batch_size: 1
+
+
+# optimizer
+epoch: 80
+
+LearningRate:
+ base_lr: 0.004
+ schedulers:
+ - !CosineDecay
+ max_epochs: 96
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 5
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_plus_crn_x_80e_sku110k.yml b/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_plus_crn_x_80e_sku110k.yml
new file mode 100644
index 0000000000000000000000000000000000000000..da465662cf1797b8ab70cab3171c99f7627e96da
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/application/ppyoloe_plus_crn_x_80e_sku110k.yml
@@ -0,0 +1,127 @@
+_BASE_: [
+ './_base_/sku110k.yml',
+ '../../runtime.yml'
+]
+
+log_iter: 10
+snapshot_epoch: 20
+weights: output/ppyoloe_plus_crn_x_80e_sku110k/model_final
+
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_x_obj365_pretrained.pdparams
+depth_mult: 1.33
+width_mult: 1.25
+
+
+# arch
+architecture: YOLOv3
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+custom_black_list: ['reduce_mean']
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
+ use_alpha: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192]
+ stage_num: 1
+ block_num: 3
+ act: 'swish'
+ spp: true
+ use_alpha: True
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: -1
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 3000
+ keep_top_k: 1000
+ score_threshold: 0.01
+ nms_threshold: 0.7
+
+
+# reader
+worker_num: 8
+eval_height: &eval_height 960
+eval_width: &eval_width 960
+eval_size: &eval_size [*eval_height, *eval_width]
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [3000, 1800], keep_ratio: True, interp: 2}
+ - RandomDistort: {}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800, 832, 864, 896, 928, 960, 992, 1024, 1056, 1088, 1120, 1152], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ batch_size: 2
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ batch_size: 1
+
+
+# optimizer
+epoch: 80
+
+LearningRate:
+ base_lr: 0.002
+ schedulers:
+ - !CosineDecay
+ max_epochs: 96
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 5
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
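As an aside on the `LearningRate` block above (`CosineDecay` over `max_epochs: 96` with a 5-epoch `LinearWarmup` from `start_factor: 0.` on `base_lr: 0.002`), the resulting curve can be sketched with the standard warmup and half-cosine formulas. This is an illustrative approximation, not PaddleDetection's exact scheduler code:

```python
import math

def lr_at_epoch(epoch, base_lr=0.002, max_epochs=96,
                warmup_epochs=5, start_factor=0.0):
    """Sketch of LinearWarmup followed by CosineDecay (approximation)."""
    if epoch < warmup_epochs:
        # Linear ramp from start_factor * base_lr up to base_lr.
        alpha = epoch / warmup_epochs
        return base_lr * (start_factor + (1.0 - start_factor) * alpha)
    # Half-cosine decay toward zero at max_epochs; since this config trains
    # for only 80 epochs, the LR never actually reaches zero.
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * epoch / max_epochs))

# LR starts at 0, peaks near base_lr after warmup, then decays smoothly.
print(lr_at_epoch(0), lr_at_epoch(5), lr_at_epoch(80))
```

Setting `max_epochs` in `CosineDecay` above the actual `epoch` count (96 vs. 80 here) is a common trick to keep the final learning rate from collapsing all the way to zero.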
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/distill/README.md b/PaddleDetection-release-2.6/configs/ppyoloe/distill/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..868d70b88805dca01e63bd56dff7c08c06a2f5cb
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/distill/README.md
@@ -0,0 +1,46 @@
+# PP-YOLOE+ Distillation
+
+PaddleDetection provides a model distillation recipe for PP-YOLOE+ that combines logits distillation and feature distillation. For more distillation options, see [slim/distill](../../slim/distill/).
+
+## Model Zoo
+
+| Model | Scheme | Input Size | Epochs | Box mAP | Config | Download |
+| ----------------- | ----------- | ------ | :----: | :-----------: | :--------------: | :------------: |
+| PP-YOLOE+_x | teacher | 640 | 80e | 54.7 | [config](../ppyoloe_plus_crn_x_80e_coco.yml) | [model](https://bj.bcebos.com/v1/paddledet/models/ppyoloe_plus_crn_x_80e_coco.pdparams) |
+| PP-YOLOE+_l | student | 640 | 80e | 52.9 | [config](../ppyoloe_plus_crn_l_80e_coco.yml) | [model](https://bj.bcebos.com/v1/paddledet/models/ppyoloe_plus_crn_l_80e_coco.pdparams) |
+| PP-YOLOE+_l       | distill    | 640   |  80e  | **54.0(+1.1)** | [config](./ppyoloe_plus_crn_l_80e_coco_distill.yml), [slim_config](../../slim/distill/ppyoloe_plus_distill_x_distill_l.yml) | [model](https://bj.bcebos.com/v1/paddledet/models/ppyoloe_plus_crn_l_80e_coco_distill.pdparams) |
+| PP-YOLOE+_l | teacher | 640 | 80e | 52.9 | [config](../ppyoloe_plus_crn_l_80e_coco.yml) | [model](https://bj.bcebos.com/v1/paddledet/models/ppyoloe_plus_crn_l_80e_coco.pdparams) |
+| PP-YOLOE+_m | student | 640 | 80e | 49.8 | [config](../ppyoloe_plus_crn_m_80e_coco.yml) | [model](https://bj.bcebos.com/v1/paddledet/models/ppyoloe_plus_crn_m_80e_coco.pdparams) |
+| PP-YOLOE+_m       | distill    | 640   |  80e  | **51.0(+1.2)** | [config](./ppyoloe_plus_crn_m_80e_coco_distill.yml), [slim_config](../../slim/distill/ppyoloe_plus_distill_l_distill_m.yml) | [model](https://bj.bcebos.com/v1/paddledet/models/ppyoloe_plus_crn_m_80e_coco_distill.pdparams) |
+
+## Getting Started
+
+### Training
+```shell
+# single-GPU training
+python tools/train.py -c configs/ppyoloe/distill/ppyoloe_plus_crn_l_80e_coco_distill.yml --slim_config configs/slim/distill/ppyoloe_plus_distill_x_distill_l.yml
+# multi-GPU training
+python -m paddle.distributed.launch --log_dir=ppyoloe_plus_distill_x_distill_l/ --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/ppyoloe/distill/ppyoloe_plus_crn_l_80e_coco_distill.yml --slim_config configs/slim/distill/ppyoloe_plus_distill_x_distill_l.yml
+```
+
+- `-c`: the model config file, which is also the student config.
+- `--slim_config`: the compression-strategy config file, which is also the teacher config.
+
+### Evaluation
+```shell
+python tools/eval.py -c configs/ppyoloe/distill/ppyoloe_plus_crn_l_80e_coco_distill.yml -o weights=output/ppyoloe_plus_crn_l_80e_coco_distill/model_final.pdparams
+```
+
+- `-c`: the model config file, which is also the student config.
+- `--slim_config`: the compression-strategy config file, which is also the teacher config.
+- `-o weights`: path to the weights trained with the compression algorithm.
+
+### Inference
+```shell
+python tools/infer.py -c configs/ppyoloe/distill/ppyoloe_plus_crn_l_80e_coco_distill.yml -o weights=output/ppyoloe_plus_crn_l_80e_coco_distill/model_final.pdparams --infer_img=demo/000000014439_640x640.jpg
+```
+
+- `-c`: the model config file.
+- `--slim_config`: the compression-strategy config file.
+- `-o weights`: path to the weights trained with the compression algorithm.
+- `--infer_img`: path to the test image.
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/distill/ppyoloe_plus_crn_l_80e_coco_distill.yml b/PaddleDetection-release-2.6/configs/ppyoloe/distill/ppyoloe_plus_crn_l_80e_coco_distill.yml
new file mode 100644
index 0000000000000000000000000000000000000000..c000a4898012afd0cb832d36a9716130ad68ae48
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/distill/ppyoloe_plus_crn_l_80e_coco_distill.yml
@@ -0,0 +1,39 @@
+_BASE_: [
+ '../ppyoloe_plus_crn_l_80e_coco.yml',
+]
+for_distill: True
+architecture: PPYOLOE
+PPYOLOE:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+
+worker_num: 4
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [640], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 8
+ shuffle: True
+ drop_last: True
+ use_shared_memory: True
+ collate_batch: True
+
+
+log_iter: 100
+snapshot_epoch: 5
+weights: output/ppyoloe_plus_crn_l_80e_coco_distill/model_final
+
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_l_obj365_pretrained.pdparams
+depth_mult: 1.0
+width_mult: 1.0
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/distill/ppyoloe_plus_crn_m_80e_coco_distill.yml b/PaddleDetection-release-2.6/configs/ppyoloe/distill/ppyoloe_plus_crn_m_80e_coco_distill.yml
new file mode 100644
index 0000000000000000000000000000000000000000..ef2f38510bcedb7ed5ab0859c893322299b7e0d9
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/distill/ppyoloe_plus_crn_m_80e_coco_distill.yml
@@ -0,0 +1,39 @@
+_BASE_: [
+ '../ppyoloe_plus_crn_m_80e_coco.yml',
+]
+for_distill: True
+architecture: PPYOLOE
+PPYOLOE:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+
+worker_num: 4
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [640], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 8
+ shuffle: True
+ drop_last: True
+ use_shared_memory: True
+ collate_batch: True
+
+
+log_iter: 100
+snapshot_epoch: 5
+weights: output/ppyoloe_plus_crn_m_80e_coco_distill/model_final
+
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_m_obj365_pretrained.pdparams
+depth_mult: 0.67
+width_mult: 0.75
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/distill/ppyoloe_plus_crn_s_80e_coco_distill.yml b/PaddleDetection-release-2.6/configs/ppyoloe/distill/ppyoloe_plus_crn_s_80e_coco_distill.yml
new file mode 100644
index 0000000000000000000000000000000000000000..95ac5d0caef531ffca8109c348a06dc408410a18
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/distill/ppyoloe_plus_crn_s_80e_coco_distill.yml
@@ -0,0 +1,39 @@
+_BASE_: [
+ '../ppyoloe_plus_crn_s_80e_coco.yml',
+]
+for_distill: True
+architecture: PPYOLOE
+PPYOLOE:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+
+worker_num: 4
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [640], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 8
+ shuffle: True
+ drop_last: True
+ use_shared_memory: True
+ collate_batch: True
+
+
+log_iter: 100
+snapshot_epoch: 5
+weights: output/ppyoloe_plus_crn_s_80e_coco_distill/model_final
+
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_s_obj365_pretrained.pdparams
+depth_mult: 0.33
+width_mult: 0.50
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/objects365/README_cn.md b/PaddleDetection-release-2.6/configs/ppyoloe/objects365/README_cn.md
new file mode 100644
index 0000000000000000000000000000000000000000..8018d03c62d77514a13f2c45340fe3c23ce6fdec
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/objects365/README_cn.md
@@ -0,0 +1,15 @@
+# PP-YOLOE
+
+## Model Zoo
+
+### Objects365 Model Zoo
+| Model | Epoch | Machines | GPUs | Images/GPU | Backbone | Input Size | Box AP0.5 | Params(M) | FLOPs(G) | V100 FP32(FPS) | V100 TensorRT FP16(FPS) | Download | Config |
+|:---------------:|:-----:|:-----------:|:-----------:|:-----------:|:---------:|:----------:|:--------------:|:---------:|:---------:|:-------------:|:-----------------------:| :--------:|:--------:|
+| PP-YOLOE+_s | 60 | 3 | 8 | 8 | cspresnet-s | 640 | 18.1 | 7.93 | 17.36 | 208.3 | 333.3 | [model](https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_s_obj365_pretrained.pdparams) | [config](./ppyoloe_plus_crn_s_60e_objects365.yml) |
+| PP-YOLOE+_m | 60 | 4 | 8 | 8 | cspresnet-m | 640 | 25.0 | 23.43 | 49.91 | 123.4 | 208.3 | [model](https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_m_obj365_pretrained.pdparams) | [config](./ppyoloe_plus_crn_m_60e_objects365.yml) |
+| PP-YOLOE+_l | 60 | 3 | 8 | 8 | cspresnet-l | 640 | 30.8 | 52.20 | 110.07 | 78.1 | 149.2 | [model](https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_l_obj365_pretrained.pdparams) | [config](./ppyoloe_plus_crn_l_60e_objects365.yml) |
+| PP-YOLOE+_x | 60 | 4 | 8 | 8 | cspresnet-x | 640 | 32.7 | 98.42 | 206.59 | 45.0 | 95.2 | [model](https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_x_obj365_pretrained.pdparams) | [config](./ppyoloe_plus_crn_x_60e_objects365.yml) |
+
+
+**Note:**
+- For multi-machine training details, see the [documentation](../../../docs/tutorials/DistributedTraining_cn.md)
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/objects365/ppyoloe_plus_crn_l_60e_objects365.yml b/PaddleDetection-release-2.6/configs/ppyoloe/objects365/ppyoloe_plus_crn_l_60e_objects365.yml
new file mode 100644
index 0000000000000000000000000000000000000000..ca283394fe24e23ed1395637ebe120da00fc49b6
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/objects365/ppyoloe_plus_crn_l_60e_objects365.yml
@@ -0,0 +1,21 @@
+_BASE_: [
+ '../../datasets/objects365_detection.yml',
+ '../../runtime.yml',
+ '../_base_/optimizer_60e.yml',
+ '../_base_/ppyoloe_plus_crn.yml',
+ '../_base_/ppyoloe_plus_reader.yml',
+]
+
+log_iter: 100
+snapshot_epoch: 5
+weights: output/ppyoloe_plus_crn_l_60e_objects365/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/CSPResNetb_l_pretrained.pdparams
+
+CSPResNet:
+ use_alpha: False
+
+PPYOLOEHead:
+ static_assigner_epoch: 20
+
+depth_mult: 1.0
+width_mult: 1.0
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/objects365/ppyoloe_plus_crn_m_60e_objects365.yml b/PaddleDetection-release-2.6/configs/ppyoloe/objects365/ppyoloe_plus_crn_m_60e_objects365.yml
new file mode 100644
index 0000000000000000000000000000000000000000..0877b5275a95e44e4275cc873d920aabb6a266cb
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/objects365/ppyoloe_plus_crn_m_60e_objects365.yml
@@ -0,0 +1,21 @@
+_BASE_: [
+ '../../datasets/objects365_detection.yml',
+ '../../runtime.yml',
+ '../_base_/optimizer_60e.yml',
+ '../_base_/ppyoloe_plus_crn.yml',
+ '../_base_/ppyoloe_plus_reader.yml',
+]
+
+log_iter: 100
+snapshot_epoch: 5
+weights: output/ppyoloe_plus_crn_m_60e_objects365/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/CSPResNetb_m_pretrained.pdparams
+
+CSPResNet:
+ use_alpha: False
+
+PPYOLOEHead:
+ static_assigner_epoch: 20
+
+depth_mult: 0.67
+width_mult: 0.75
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/objects365/ppyoloe_plus_crn_s_60e_objects365.yml b/PaddleDetection-release-2.6/configs/ppyoloe/objects365/ppyoloe_plus_crn_s_60e_objects365.yml
new file mode 100644
index 0000000000000000000000000000000000000000..0023af93f17faf61ab071823f557c649e3155c67
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/objects365/ppyoloe_plus_crn_s_60e_objects365.yml
@@ -0,0 +1,21 @@
+_BASE_: [
+ '../../datasets/objects365_detection.yml',
+ '../../runtime.yml',
+ '../_base_/optimizer_60e.yml',
+ '../_base_/ppyoloe_plus_crn.yml',
+ '../_base_/ppyoloe_plus_reader.yml',
+]
+
+log_iter: 100
+snapshot_epoch: 5
+weights: output/ppyoloe_plus_crn_s_60e_objects365/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/CSPResNetb_s_pretrained.pdparams
+
+CSPResNet:
+ use_alpha: False
+
+PPYOLOEHead:
+ static_assigner_epoch: 20
+
+depth_mult: 0.33
+width_mult: 0.50
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/objects365/ppyoloe_plus_crn_x_60e_objects365.yml b/PaddleDetection-release-2.6/configs/ppyoloe/objects365/ppyoloe_plus_crn_x_60e_objects365.yml
new file mode 100644
index 0000000000000000000000000000000000000000..0c5fe97150c40a282783de40f9b8d6f5cd6a1be4
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/objects365/ppyoloe_plus_crn_x_60e_objects365.yml
@@ -0,0 +1,21 @@
+_BASE_: [
+ '../../datasets/objects365_detection.yml',
+ '../../runtime.yml',
+ '../_base_/optimizer_60e.yml',
+ '../_base_/ppyoloe_plus_crn.yml',
+ '../_base_/ppyoloe_plus_reader.yml',
+]
+
+log_iter: 100
+snapshot_epoch: 5
+weights: output/ppyoloe_plus_crn_x_60e_objects365/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/CSPResNetb_x_pretrained.pdparams
+
+CSPResNet:
+ use_alpha: False
+
+PPYOLOEHead:
+ static_assigner_epoch: 20
+
+depth_mult: 1.33
+width_mult: 1.25
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_crn_l_300e_coco.yml b/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_crn_l_300e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..ef3422815b4376fdd921516e235ec59af28681f7
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_crn_l_300e_coco.yml
@@ -0,0 +1,15 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ './_base_/optimizer_300e.yml',
+ './_base_/ppyoloe_crn.yml',
+ './_base_/ppyoloe_reader.yml',
+]
+
+log_iter: 100
+snapshot_epoch: 10
+weights: output/ppyoloe_crn_l_300e_coco/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/CSPResNetb_l_pretrained.pdparams
+depth_mult: 1.0
+width_mult: 1.0
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_crn_l_36e_coco_xpu.yml b/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_crn_l_36e_coco_xpu.yml
new file mode 100644
index 0000000000000000000000000000000000000000..21af7774c7260ca7d3db01b64c92c92ef0e2d882
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_crn_l_36e_coco_xpu.yml
@@ -0,0 +1,71 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ './_base_/optimizer_36e_xpu.yml',
+ './_base_/ppyoloe_reader.yml',
+]
+
+# note: these are the default values (use_gpu = true and use_xpu = false) for CI.
+# set use_gpu = false and use_xpu = true for XPU training.
+use_gpu: true
+use_xpu: false
+
+log_iter: 100
+snapshot_epoch: 1
+weights: output/ppyoloe_crn_l_36e_coco/model_final
+find_unused_parameters: True
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/CSPResNetb_l_pretrained.pdparams
+depth_mult: 1.0
+width_mult: 1.0
+
+TrainReader:
+ batch_size: 8
+
+architecture: YOLOv3
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+ema_black_list: ['proj_conv.weight']
+custom_black_list: ['reduce_mean']
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192]
+ stage_num: 1
+ block_num: 3
+ act: 'swish'
+ spp: true
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: 4
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 300
+ score_threshold: 0.01
+ nms_threshold: 0.7
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_crn_m_300e_coco.yml b/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_crn_m_300e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..f6c2a4ab3df171904714a9b7edb17fd588f3d5fc
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_crn_m_300e_coco.yml
@@ -0,0 +1,15 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ './_base_/optimizer_300e.yml',
+ './_base_/ppyoloe_crn.yml',
+ './_base_/ppyoloe_reader.yml',
+]
+
+log_iter: 100
+snapshot_epoch: 10
+weights: output/ppyoloe_crn_m_300e_coco/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/CSPResNetb_m_pretrained.pdparams
+depth_mult: 0.67
+width_mult: 0.75
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_crn_s_300e_coco.yml b/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_crn_s_300e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..0afba55c4e3339e7f91070e524f9a9e4d37e4cd7
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_crn_s_300e_coco.yml
@@ -0,0 +1,15 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ './_base_/optimizer_300e.yml',
+ './_base_/ppyoloe_crn.yml',
+ './_base_/ppyoloe_reader.yml',
+]
+
+log_iter: 100
+snapshot_epoch: 10
+weights: output/ppyoloe_crn_s_300e_coco/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/CSPResNetb_s_pretrained.pdparams
+depth_mult: 0.33
+width_mult: 0.50
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_crn_s_400e_coco.yml b/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_crn_s_400e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..bc60cf6b6cc414b6fc6f20f86b7dd09aa1699d40
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_crn_s_400e_coco.yml
@@ -0,0 +1,18 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ './_base_/optimizer_400e.yml',
+ './_base_/ppyoloe_crn.yml',
+ './_base_/ppyoloe_reader.yml',
+]
+
+log_iter: 100
+snapshot_epoch: 10
+weights: output/ppyoloe_crn_s_400e_coco/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/CSPResNetb_s_pretrained.pdparams
+depth_mult: 0.33
+width_mult: 0.50
+
+PPYOLOEHead:
+ static_assigner_epoch: 133
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_crn_x_300e_coco.yml b/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_crn_x_300e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..fc388e9416b567ca431c30b39280d09a9ebf04ab
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_crn_x_300e_coco.yml
@@ -0,0 +1,15 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ './_base_/optimizer_300e.yml',
+ './_base_/ppyoloe_crn.yml',
+ './_base_/ppyoloe_reader.yml',
+]
+
+log_iter: 100
+snapshot_epoch: 10
+weights: output/ppyoloe_crn_x_300e_coco/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/CSPResNetb_x_pretrained.pdparams
+depth_mult: 1.33
+width_mult: 1.25
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml b/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..626cc2810510a908cb361bf63e4c1ae087adcba7
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml
@@ -0,0 +1,15 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ './_base_/optimizer_80e.yml',
+ './_base_/ppyoloe_plus_crn.yml',
+ './_base_/ppyoloe_plus_reader.yml',
+]
+
+log_iter: 100
+snapshot_epoch: 5
+weights: output/ppyoloe_plus_crn_l_80e_coco/model_final
+
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_l_obj365_pretrained.pdparams
+depth_mult: 1.0
+width_mult: 1.0
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_plus_crn_m_80e_coco.yml b/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_plus_crn_m_80e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..3209bef6b91c03c12b09bb8038adc82b7d1de8e0
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_plus_crn_m_80e_coco.yml
@@ -0,0 +1,15 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ './_base_/optimizer_80e.yml',
+ './_base_/ppyoloe_plus_crn.yml',
+ './_base_/ppyoloe_plus_reader.yml',
+]
+
+log_iter: 100
+snapshot_epoch: 5
+weights: output/ppyoloe_plus_crn_m_80e_coco/model_final
+
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_m_obj365_pretrained.pdparams
+depth_mult: 0.67
+width_mult: 0.75
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_plus_crn_s_80e_coco.yml b/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_plus_crn_s_80e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..862f322c48ec6fd7f7bc669f4be8b436746046e7
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_plus_crn_s_80e_coco.yml
@@ -0,0 +1,15 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ './_base_/optimizer_80e.yml',
+ './_base_/ppyoloe_plus_crn.yml',
+ './_base_/ppyoloe_plus_reader.yml',
+]
+
+log_iter: 100
+snapshot_epoch: 5
+weights: output/ppyoloe_plus_crn_s_80e_coco/model_final
+
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_s_obj365_pretrained.pdparams
+depth_mult: 0.33
+width_mult: 0.50
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_plus_crn_t_auxhead_300e_coco.yml b/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_plus_crn_t_auxhead_300e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..5884cc0f7af9a3ed85afaa2cd4b89362b224482b
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_plus_crn_t_auxhead_300e_coco.yml
@@ -0,0 +1,15 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ './_base_/optimizer_300e.yml',
+ './_base_/ppyoloe_plus_crn_tiny_auxhead.yml',
+ './_base_/ppyoloe_plus_reader.yml',
+]
+
+log_iter: 100
+snapshot_epoch: 10
+weights: output/ppyoloe_plus_crn_t_auxhead_300e_coco/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/CSPResNetb_t_pretrained.pdparams
+depth_mult: 0.33
+width_mult: 0.375
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_plus_crn_t_auxhead_320_300e_coco.yml b/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_plus_crn_t_auxhead_320_300e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..010a4f610c8c1c1a2f62e6c1b49541072fbea578
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_plus_crn_t_auxhead_320_300e_coco.yml
@@ -0,0 +1,15 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ './_base_/optimizer_300e.yml',
+ './_base_/ppyoloe_plus_crn_tiny_auxhead.yml',
+ './_base_/ppyoloe_plus_reader_320.yml',
+]
+
+log_iter: 100
+snapshot_epoch: 10
+weights: output/ppyoloe_plus_crn_t_auxhead_320_300e_coco/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/CSPResNetb_t_pretrained.pdparams
+depth_mult: 0.33
+width_mult: 0.375
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_plus_crn_t_auxhead_relu_300e_coco.yml b/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_plus_crn_t_auxhead_relu_300e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..6822f188685ccab3c7887cb39d14c5e182362f12
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_plus_crn_t_auxhead_relu_300e_coco.yml
@@ -0,0 +1,26 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ './_base_/optimizer_300e.yml',
+ './_base_/ppyoloe_plus_crn_tiny_auxhead.yml',
+ './_base_/ppyoloe_plus_reader.yml',
+]
+
+log_iter: 100
+snapshot_epoch: 10
+weights: output/ppyoloe_plus_crn_t_auxhead_relu_300e_coco/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/CSPResNetb_t_pretrained.pdparams
+depth_mult: 0.33
+width_mult: 0.375
+
+
+CSPResNet:
+ act: 'relu'
+
+CustomCSPPAN:
+ act: 'relu'
+
+PPYOLOEHead:
+ act: 'relu'
+ attn_conv: None
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_plus_crn_t_auxhead_relu_320_300e_coco.yml b/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_plus_crn_t_auxhead_relu_320_300e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..ad7642881ae6055340ece761ae97775d314f0b13
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_plus_crn_t_auxhead_relu_320_300e_coco.yml
@@ -0,0 +1,26 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ './_base_/optimizer_300e.yml',
+ './_base_/ppyoloe_plus_crn_tiny_auxhead.yml',
+ './_base_/ppyoloe_plus_reader_320.yml',
+]
+
+log_iter: 100
+snapshot_epoch: 10
+weights: output/ppyoloe_plus_crn_t_auxhead_relu_320_300e_coco/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/CSPResNetb_t_pretrained.pdparams
+depth_mult: 0.33
+width_mult: 0.375
+
+
+CSPResNet:
+ act: 'relu'
+
+CustomCSPPAN:
+ act: 'relu'
+
+PPYOLOEHead:
+ act: 'relu'
+ attn_conv: None
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_plus_crn_x_80e_coco.yml b/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_plus_crn_x_80e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..cd41814f972cff4b1193c0c7813d22764b1f565d
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/ppyoloe_plus_crn_x_80e_coco.yml
@@ -0,0 +1,15 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ './_base_/optimizer_80e.yml',
+ './_base_/ppyoloe_plus_crn.yml',
+ './_base_/ppyoloe_plus_reader.yml',
+]
+
+log_iter: 100
+snapshot_epoch: 5
+weights: output/ppyoloe_plus_crn_x_80e_coco/model_final
+
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_x_obj365_pretrained.pdparams
+depth_mult: 1.33
+width_mult: 1.25
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/voc/README_cn.md b/PaddleDetection-release-2.6/configs/ppyoloe/voc/README_cn.md
new file mode 100644
index 0000000000000000000000000000000000000000..8bfc61d9c8c16cfcad8f9e2a89442345608ce757
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/voc/README_cn.md
@@ -0,0 +1,9 @@
+# PP-YOLOE
+
+## Model Zoo
+
+### VOC Model Zoo
+| Model | Epoch | GPUs | Images/GPU | Backbone | Input Size | Box AP0.5 | Params(M) | FLOPs(G) | V100 FP32(FPS) | V100 TensorRT FP16(FPS) | Download | Config |
+|:---------------:|:-----:|:-----------:|:-----------:|:---------:|:----------:|:--------------:|:---------:|:---------:|:-------------:|:-----------------------:| :-------: |:--------:|
+| PP-YOLOE+_s | 30 | 8 | 8 | cspresnet-s | 640 | 86.7 | 7.93 | 17.36 | 208.3 | 333.3 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_s_30e_voc.pdparams) | [config](./ppyoloe_plus_crn_s_30e_voc.yml) |
+| PP-YOLOE+_l | 30 | 8 | 8 | cspresnet-l | 640 | 89.0 | 52.20 | 110.07 | 78.1 | 149.2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_30e_voc.pdparams) | [config](./ppyoloe_plus_crn_l_30e_voc.yml) |
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/voc/ppyoloe_plus_crn_l_30e_voc.yml b/PaddleDetection-release-2.6/configs/ppyoloe/voc/ppyoloe_plus_crn_l_30e_voc.yml
new file mode 100644
index 0000000000000000000000000000000000000000..217e37f274c443f0d905ad162ede72674c6f9092
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/voc/ppyoloe_plus_crn_l_30e_voc.yml
@@ -0,0 +1,43 @@
+_BASE_: [
+ '../../datasets/voc.yml',
+ '../../runtime.yml',
+ '../_base_/optimizer_80e.yml',
+ '../_base_/ppyoloe_plus_crn.yml',
+ '../_base_/ppyoloe_plus_reader.yml',
+]
+
+log_iter: 100
+snapshot_epoch: 5
+weights: output/ppyoloe_plus_crn_l_30e_voc/model_final
+
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/ppyoloe_plus_crn_l_80e_coco.pdparams
+depth_mult: 1.0
+width_mult: 1.0
+
+
+TrainReader:
+ batch_size: 8 # default 8 gpus, total bs = 64
+
+EvalReader:
+ batch_size: 4
+
+
+epoch: 30
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !CosineDecay
+ max_epochs: 36
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 1
+
+
+PPYOLOEHead:
+ static_assigner_epoch: -1
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 300
+ score_threshold: 0.01
+ nms_threshold: 0.7
diff --git a/PaddleDetection-release-2.6/configs/ppyoloe/voc/ppyoloe_plus_crn_s_30e_voc.yml b/PaddleDetection-release-2.6/configs/ppyoloe/voc/ppyoloe_plus_crn_s_30e_voc.yml
new file mode 100644
index 0000000000000000000000000000000000000000..080bcdd808d971bfc214e8e10a34ef26fd6700ca
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ppyoloe/voc/ppyoloe_plus_crn_s_30e_voc.yml
@@ -0,0 +1,43 @@
+_BASE_: [
+ '../../datasets/voc.yml',
+ '../../runtime.yml',
+ '../_base_/optimizer_80e.yml',
+ '../_base_/ppyoloe_plus_crn.yml',
+ '../_base_/ppyoloe_plus_reader.yml',
+]
+
+log_iter: 100
+snapshot_epoch: 5
+weights: output/ppyoloe_plus_crn_s_30e_voc/model_final
+
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/ppyoloe_plus_crn_s_80e_coco.pdparams
+depth_mult: 0.33
+width_mult: 0.50
+
+
+TrainReader:
+ batch_size: 8 # default 8 gpus, total bs = 64
+
+EvalReader:
+ batch_size: 4
+
+
+epoch: 30
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !CosineDecay
+ max_epochs: 36
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 1
+
+
+PPYOLOEHead:
+ static_assigner_epoch: -1
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 300
+ score_threshold: 0.01
+ nms_threshold: 0.7
diff --git a/PaddleDetection-release-2.6/configs/queryinst/README.md b/PaddleDetection-release-2.6/configs/queryinst/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..568135328ba43780a3829977b839169126fe0b10
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/queryinst/README.md
@@ -0,0 +1,41 @@
+# QueryInst: Instances as Queries
+
+## Introduction
+
+QueryInst is a multi-stage, end-to-end system that treats instances of interest as learnable queries, enabling
+query-based object detectors, e.g., Sparse R-CNN, to achieve strong instance segmentation performance. The attributes
+of instances, such as categories, bounding boxes, instance masks, and instance association embeddings, are represented
+by queries in a unified manner. In QueryInst, a query is shared by both detection and segmentation via dynamic
+convolutions and driven by parallel-supervised multi-stage learning.
+
+## Model Zoo
+
+| Backbone | Lr schd | Proposals | MultiScale | RandomCrop | bbox AP | mask AP | Download | Config |
+|:------------:|:-------:|:---------:|:----------:|:----------:|:-------:|:-------:|------------------------------------------------------------------------------------------------------|----------------------------------------------------------|
+| ResNet50-FPN | 1x | 100 | × | × | 42.1 | 37.8 | [model](https://bj.bcebos.com/v1/paddledet/models/queryinst_r50_fpn_1x_pro100_coco.pdparams) | [config](./queryinst_r50_fpn_1x_pro100_coco.yml) |
+| ResNet50-FPN | 3x | 300 | √ | √ | 47.9 | 42.1 | [model](https://bj.bcebos.com/v1/paddledet/models/queryinst_r50_fpn_ms_crop_3x_pro300_coco.pdparams) | [config](./queryinst_r50_fpn_ms_crop_3x_pro300_coco.yml) |
+
+- COCO val-set evaluation results.
+- These configurations assume 4-GPU training.
+
+If your setup differs, adjust these parameters as appropriate:
+
+```yaml
+worker_num: 4
+TrainReader:
+ use_shared_memory: true
+find_unused_parameters: true
+```
+
+## Citations
+
+```
+@InProceedings{Fang_2021_ICCV,
+ author = {Fang, Yuxin and Yang, Shusheng and Wang, Xinggang and Li, Yu and Fang, Chen and Shan, Ying and Feng, Bin and Liu, Wenyu},
+ title = {Instances As Queries},
+ booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
+ month = {October},
+ year = {2021},
+ pages = {6910-6919}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/queryinst/_base_/optimizer_1x.yml b/PaddleDetection-release-2.6/configs/queryinst/_base_/optimizer_1x.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a7c0f5cb16311f046adec9e11f7cd0cc4a93e3d9
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/queryinst/_base_/optimizer_1x.yml
@@ -0,0 +1,17 @@
+epoch: 12
+
+LearningRate:
+ base_lr: 0.0001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [8, 11]
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 1000
+
+OptimizerBuilder:
+ clip_grad_by_norm: 0.1
+ optimizer:
+ type: AdamW
+ weight_decay: 0.0001
diff --git a/PaddleDetection-release-2.6/configs/queryinst/_base_/queryinst_r50_fpn.yml b/PaddleDetection-release-2.6/configs/queryinst/_base_/queryinst_r50_fpn.yml
new file mode 100644
index 0000000000000000000000000000000000000000..05ab1c02f8a02308cfd47d441697c4a548c32f1a
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/queryinst/_base_/queryinst_r50_fpn.yml
@@ -0,0 +1,74 @@
+num_proposals: &num_proposals 100
+proposal_embedding_dim: &proposal_embedding_dim 256
+bbox_resolution: &bbox_resolution 7
+mask_resolution: &mask_resolution 14
+
+architecture: QueryInst
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_cos_pretrained.pdparams
+
+QueryInst:
+ backbone: ResNet
+ neck: FPN
+ rpn_head: EmbeddingRPNHead
+ roi_head: SparseRoIHead
+ post_process: SparsePostProcess
+
+ResNet:
+ depth: 50
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [ 0, 1, 2, 3 ]
+ num_stages: 4
+ lr_mult_list: [ 0.1, 0.1, 0.1, 0.1 ]
+
+FPN:
+ out_channel: *proposal_embedding_dim
+ extra_stage: 0
+
+EmbeddingRPNHead:
+ num_proposals: *num_proposals
+
+SparseRoIHead:
+ num_stages: 6
+ bbox_roi_extractor:
+ resolution: *bbox_resolution
+ sampling_ratio: 2
+ aligned: True
+ mask_roi_extractor:
+ resolution: *mask_resolution
+ sampling_ratio: 2
+ aligned: True
+ bbox_head: DIIHead
+ mask_head: DynamicMaskHead
+ loss_func: QueryInstLoss
+
+DIIHead:
+ feedforward_channels: 2048
+ dynamic_feature_channels: 64
+ roi_resolution: *bbox_resolution
+ num_attn_heads: 8
+ dropout: 0.0
+ num_ffn_fcs: 2
+ num_cls_fcs: 1
+ num_reg_fcs: 3
+
+DynamicMaskHead:
+ dynamic_feature_channels: 64
+ roi_resolution: *mask_resolution
+ num_convs: 4
+ conv_kernel_size: 3
+ conv_channels: 256
+ upsample_method: 'deconv'
+ upsample_scale_factor: 2
+
+QueryInstLoss:
+ focal_loss_alpha: 0.25
+ focal_loss_gamma: 2.0
+ class_weight: 2.0
+ l1_weight: 5.0
+ giou_weight: 2.0
+ mask_weight: 8.0
+
+SparsePostProcess:
+ num_proposals: *num_proposals
+ binary_thresh: 0.5
diff --git a/PaddleDetection-release-2.6/configs/queryinst/_base_/queryinst_reader.yml b/PaddleDetection-release-2.6/configs/queryinst/_base_/queryinst_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..e867cc27454efaf321e40bd09dc674d4f32c3a8d
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/queryinst/_base_/queryinst_reader.yml
@@ -0,0 +1,43 @@
+worker_num: 4
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - Poly2Mask: {del_poly: True}
+ - Resize: {interp: 1, target_size: [800, 1333], keep_ratio: True}
+ - RandomFlip: {prob: 0.5}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ - Gt2SparseTarget: {}
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+ use_shared_memory: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 1, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ - Gt2SparseTarget: {}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 1, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ - Gt2SparseTarget: {}
+ batch_size: 1
+ shuffle: false
diff --git a/PaddleDetection-release-2.6/configs/queryinst/queryinst_r50_fpn_1x_pro100_coco.yml b/PaddleDetection-release-2.6/configs/queryinst/queryinst_r50_fpn_1x_pro100_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..1e61252b71d3373c2fc062207ef2b88d699d8a0b
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/queryinst/queryinst_r50_fpn_1x_pro100_coco.yml
@@ -0,0 +1,12 @@
+_BASE_: [
+ '../datasets/coco_instance.yml',
+ '../runtime.yml',
+ '_base_/optimizer_1x.yml',
+ '_base_/queryinst_r50_fpn.yml',
+ '_base_/queryinst_reader.yml',
+]
+
+log_iter: 50
+find_unused_parameters: true
+
+weights: output/queryinst_r50_fpn_1x_pro100_coco/model_final
diff --git a/PaddleDetection-release-2.6/configs/queryinst/queryinst_r50_fpn_ms_crop_3x_pro300_coco.yml b/PaddleDetection-release-2.6/configs/queryinst/queryinst_r50_fpn_ms_crop_3x_pro300_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..7dfa8997e3f71c0941b3b7626ad256333af2161a
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/queryinst/queryinst_r50_fpn_ms_crop_3x_pro300_coco.yml
@@ -0,0 +1,45 @@
+_BASE_: [
+ './queryinst_r50_fpn_1x_pro100_coco.yml',
+]
+
+weights: output/queryinst_r50_fpn_ms_crop_3x_pro300_coco/model_final
+
+EmbeddingRPNHead:
+ num_proposals: 300
+
+SparsePostProcess:
+  num_proposals: 300
+
+epoch: 36
+
+LearningRate:
+ base_lr: 0.0001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [27, 33]
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 1000
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - Poly2Mask: {del_poly: True}
+ - RandomFlip: {prob: 0.5}
+ - RandomSelect: { transforms1: [ RandomShortSideResize: { short_side_sizes: [ 480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800 ], max_size: 1333 } ],
+ transforms2: [
+ RandomShortSideResize: { short_side_sizes: [ 400, 500, 600 ], max_size: 1333 },
+ RandomSizeCrop: { min_size: 384, max_size: 600, keep_empty: true },
+ RandomShortSideResize: { short_side_sizes: [ 480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800 ], max_size: 1333 } ]
+ }
+ - NormalizeImage: { is_scale: true, mean: [ 0.485,0.456,0.406 ], std: [ 0.229, 0.224,0.225 ] }
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ - Gt2SparseTarget: {}
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+ use_shared_memory: true
diff --git a/PaddleDetection-release-2.6/configs/rcnn_enhance/README.md b/PaddleDetection-release-2.6/configs/rcnn_enhance/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..378f4d83c627c847ef5f5c48472710401fec6124
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rcnn_enhance/README.md
@@ -0,0 +1,12 @@
+## Practical Server-Side Detection
+
+### Introduction
+
+* In recent years, object detection in images has drawn wide attention from both academia and industry. Building on the ResNet50_vd pretrained model trained with the SSLD distillation scheme in [PaddleClas](https://github.com/PaddlePaddle/PaddleClas) (82.39% Top-1 accuracy on the ImageNet-1k validation set), combined with the rich operators in PaddleDetection, PaddlePaddle provides a practical server-side detection solution, PSS-DET (Practical Server Side Detection). On the COCO2017 object detection dataset, it reaches 41.2% COCO mAP at a single-V100 prediction speed of 61 FPS.
+
+
+### Model Zoo
+
+| Backbone | Network type | Images per GPU | Learning rate schedule | Inference time (fps) | Box AP | Mask AP | Download | Config |
+| :---------------------- | :-------------: | :-------: | :-----: | :------------: | :----: | :-----: | :-------------: | :-----: |
+| ResNet50-vd-FPN-Dcnv2 | Faster | 2 | 3x | 61.425 | 41.5 | - | [model](https://paddledet.bj.bcebos.com/models/faster_rcnn_enhance_3x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rcnn_enhance/faster_rcnn_enhance_3x_coco.yml) |
diff --git a/PaddleDetection-release-2.6/configs/rcnn_enhance/README_en.md b/PaddleDetection-release-2.6/configs/rcnn_enhance/README_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..bf768a294cdb3b51b0d3be57be7952b78ce6c91f
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rcnn_enhance/README_en.md
@@ -0,0 +1,12 @@
+## Practical Server Side Detection
+
+### Introduction
+
+* In recent years, object detection in images has drawn wide attention from both academia and industry. Building on the ResNet50_vd pretrained model trained with the SSLD distillation scheme in [PaddleClas](https://github.com/PaddlePaddle/PaddleClas) (82.39% Top-1 accuracy on the ImageNet-1k validation set), combined with the rich operators in PaddleDetection, PaddlePaddle provides a practical server-side detection solution, PSS-DET (Practical Server Side Detection). On the COCO2017 object detection dataset, it reaches 41.2% COCO mAP at a single-V100 prediction speed of 61 FPS.
+
+
+### Model Zoo
+
+| Backbone | Network type | Images per GPU | Learning rate schedule | Inference time (fps) | Box AP | Mask AP | Download | Configuration File |
+| :-------------------- | :----------: | :----------------------: | :--------------------: | :-----------------: | :----: | :-----: | :---------------------------------------------------------------------------------: | :-------------------------------------------------------------------------------------------------------------------------------------: |
+| ResNet50-vd-FPN-Dcnv2 | Faster | 2 | 3x | 61.425 | 41.5 | - | [link](https://paddledet.bj.bcebos.com/models/faster_rcnn_enhance_3x_coco.pdparams) | [Configuration File](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rcnn_enhance/faster_rcnn_enhance_3x_coco.yml) |
diff --git a/PaddleDetection-release-2.6/configs/rcnn_enhance/_base_/faster_rcnn_enhance.yml b/PaddleDetection-release-2.6/configs/rcnn_enhance/_base_/faster_rcnn_enhance.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d47fd2c98ce28ab3e75f56e981a2be70326a8bbd
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rcnn_enhance/_base_/faster_rcnn_enhance.yml
@@ -0,0 +1,81 @@
+architecture: FasterRCNN
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_v2_pretrained.pdparams
+
+FasterRCNN:
+ backbone: ResNet
+ neck: FPN
+ rpn_head: RPNHead
+ bbox_head: BBoxHead
+ # post process
+ bbox_post_process: BBoxPostProcess
+
+
+ResNet:
+ # index 0 stands for res2
+ depth: 50
+ norm_type: bn
+ variant: d
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ dcn_v2_stages: [1,2,3]
+ lr_mult_list: [0.05, 0.05, 0.1, 0.15]
+
+FPN:
+ in_channels: [256, 512, 1024, 2048]
+ out_channel: 64
+
+RPNHead:
+ anchor_generator:
+ aspect_ratios: [0.5, 1.0, 2.0]
+ anchor_sizes: [[32], [64], [128], [256], [512]]
+ strides: [4, 8, 16, 32, 64]
+ rpn_target_assign:
+ batch_size_per_im: 256
+ fg_fraction: 0.5
+ negative_overlap: 0.3
+ positive_overlap: 0.7
+ use_random: True
+ train_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 2000
+ post_nms_top_n: 2000
+ topk_after_collect: True
+ test_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 500
+ post_nms_top_n: 300
+
+
+BBoxHead:
+ head: TwoFCHead
+ roi_extractor:
+ resolution: 7
+ sampling_ratio: 0
+ aligned: True
+ bbox_assigner: BBoxLibraAssigner
+ bbox_loss: DIouLoss
+
+TwoFCHead:
+ out_channel: 1024
+
+BBoxLibraAssigner:
+ batch_size_per_im: 512
+ bg_thresh: 0.5
+ fg_thresh: 0.5
+ fg_fraction: 0.25
+ use_random: True
+
+DIouLoss:
+ loss_weight: 10.0
+ use_complete_iou_loss: true
+
+BBoxPostProcess:
+ decode: RCNNBox
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.05
+ nms_threshold: 0.5
diff --git a/PaddleDetection-release-2.6/configs/rcnn_enhance/_base_/faster_rcnn_enhance_reader.yml b/PaddleDetection-release-2.6/configs/rcnn_enhance/_base_/faster_rcnn_enhance_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..f1a7c998d4e332661491024ca17a1a0d996b589d
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rcnn_enhance/_base_/faster_rcnn_enhance_reader.yml
@@ -0,0 +1,42 @@
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - AutoAugment: {autoaug_type: v1}
+ - RandomResize: {target_size: [[384,1000], [416,1000], [448,1000], [480,1000], [512,1000], [544,1000], [576,1000], [608,1000], [640,1000], [672,1000]], interp: 2, keep_ratio: True}
+ - RandomFlip: {prob: 0.5}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 2
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+ use_shared_memory: true
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [640, 640], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [640, 640], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
diff --git a/PaddleDetection-release-2.6/configs/rcnn_enhance/_base_/optimizer_3x.yml b/PaddleDetection-release-2.6/configs/rcnn_enhance/_base_/optimizer_3x.yml
new file mode 100644
index 0000000000000000000000000000000000000000..8bd85fae359c552952bdfc7cec4cbb5ff1198e85
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rcnn_enhance/_base_/optimizer_3x.yml
@@ -0,0 +1,19 @@
+epoch: 36
+
+LearningRate:
+ base_lr: 0.02
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [24, 33]
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 1000
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/rcnn_enhance/faster_rcnn_enhance_3x_coco.yml b/PaddleDetection-release-2.6/configs/rcnn_enhance/faster_rcnn_enhance_3x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a49f245f22dbcaf80cb9a8ca382c35f549858b18
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rcnn_enhance/faster_rcnn_enhance_3x_coco.yml
@@ -0,0 +1,8 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/optimizer_3x.yml',
+ '_base_/faster_rcnn_enhance.yml',
+ '_base_/faster_rcnn_enhance_reader.yml',
+]
+weights: output/faster_rcnn_enhance_r50_3x_coco/model_final
diff --git a/PaddleDetection-release-2.6/configs/res2net/README.md b/PaddleDetection-release-2.6/configs/res2net/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..a51654d809ec9b0d11822946a4d9ef620b2e053b
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/res2net/README.md
@@ -0,0 +1,37 @@
+# Res2Net
+
+## Introduction
+
+- Res2Net: A New Multi-scale Backbone Architecture: [https://arxiv.org/abs/1904.01169](https://arxiv.org/abs/1904.01169)
+
+```
+@article{DBLP:journals/corr/abs-1904-01169,
+ author = {Shanghua Gao and
+ Ming{-}Ming Cheng and
+ Kai Zhao and
+ Xinyu Zhang and
+ Ming{-}Hsuan Yang and
+ Philip H. S. Torr},
+ title = {Res2Net: {A} New Multi-scale Backbone Architecture},
+ journal = {CoRR},
+ volume = {abs/1904.01169},
+ year = {2019},
+ url = {http://arxiv.org/abs/1904.01169},
+ archivePrefix = {arXiv},
+ eprint = {1904.01169},
+ timestamp = {Thu, 25 Apr 2019 10:24:54 +0200},
+ biburl = {https://dblp.org/rec/bib/journals/corr/abs-1904-01169},
+ bibsource = {dblp computer science bibliography, https://dblp.org}
+}
+```
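
The `width` and `scales` parameters in the configs below (e.g. `width: 26`, `scales: 4`) correspond to Res2Net's channel split: the input feature is divided into `scales` groups that are processed hierarchically, each group's output being added to the next group's input. A minimal NumPy sketch of this connectivity (a simple `tanh` stands in for the per-split 3x3 convs; names are illustrative):

```python
import numpy as np

def res2net_block(x, scales=4):
    """Hierarchical residual connectivity of a Res2Net block.

    x has shape (channels, h, w). A real block applies a 3x3 conv K_i to
    each split; tanh stands in for K_i here to keep the sketch short.
    """
    splits = np.split(x, scales, axis=0)
    conv = np.tanh                      # stand-in for the per-split conv K_i
    outs = [splits[0]]                  # y1 = x1: identity branch
    prev = None
    for xi in splits[1:]:
        inp = xi if prev is None else xi + prev  # feed previous output forward
        prev = conv(inp)
        outs.append(prev)
    return np.concatenate(outs, axis=0)

# width=26, scales=4 as in the configs below -> 4 splits of 26 channels each.
x = np.random.default_rng(0).standard_normal((4 * 26, 7, 7))
y = res2net_block(x, scales=4)
print(y.shape)  # (104, 7, 7)
```

The cascading additions give later splits progressively larger receptive fields within a single block, which is the "multi-scale" property the paper's title refers to.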
+
+
+## Model Zoo
+
+| Backbone | Type | Images/GPU | Lr schd | Inf time (fps) | Box AP | Mask AP | Download | Configs |
+| :---------------------- | :------------- | :-------: | :-----: | :------------: | :----: | :-----: | :----------------------------------------------------------: | :-----: |
+| Res2Net50-FPN | Faster | 2 | 1x | - | 40.6 | - | [model](https://paddledet.bj.bcebos.com/models/faster_rcnn_res2net50_vb_26w_4s_fpn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/configs/res2net/faster_rcnn_res2net50_vb_26w_4s_fpn_1x_coco.yml) |
+| Res2Net50-FPN | Mask | 2 | 2x | - | 42.4 | 38.1 | [model](https://paddledet.bj.bcebos.com/models/mask_rcnn_res2net50_vb_26w_4s_fpn_2x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/configs/res2net/mask_rcnn_res2net50_vb_26w_4s_fpn_2x_coco.yml) |
+| Res2Net50-vd-FPN | Mask | 2 | 2x | - | 42.6 | 38.1 | [model](https://paddledet.bj.bcebos.com/models/mask_rcnn_res2net50_vd_26w_4s_fpn_2x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/configs/res2net/mask_rcnn_res2net50_vd_26w_4s_fpn_2x_coco.yml) |
+
+Note: all the above models are trained with 8 GPUs.
diff --git a/PaddleDetection-release-2.6/configs/res2net/faster_rcnn_res2net50_vb_26w_4s_fpn_1x_coco.yml b/PaddleDetection-release-2.6/configs/res2net/faster_rcnn_res2net50_vb_26w_4s_fpn_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..1fbdc9d73ff4cedf8498a7d5fdb76d7b6b454bbb
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/res2net/faster_rcnn_res2net50_vb_26w_4s_fpn_1x_coco.yml
@@ -0,0 +1,33 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '../faster_rcnn/_base_/optimizer_1x.yml',
+ '../faster_rcnn/_base_/faster_rcnn_r50_fpn.yml',
+ '../faster_rcnn/_base_/faster_fpn_reader.yml',
+]
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/Res2Net50_26w_4s_pretrained.pdparams
+weights: output/faster_rcnn_res2net50_vb_26w_4s_fpn_1x_coco/model_final
+
+FasterRCNN:
+ backbone: Res2Net
+ neck: FPN
+ rpn_head: RPNHead
+ bbox_head: BBoxHead
+ # post process
+ bbox_post_process: BBoxPostProcess
+
+
+Res2Net:
+ # index 0 stands for res2
+ depth: 50
+ width: 26
+ scales: 4
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ variant: b
+
+
+TrainReader:
+ batch_size: 2
diff --git a/PaddleDetection-release-2.6/configs/res2net/mask_rcnn_res2net50_vb_26w_4s_fpn_2x_coco.yml b/PaddleDetection-release-2.6/configs/res2net/mask_rcnn_res2net50_vb_26w_4s_fpn_2x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..02970d1f0c659b5dac0dc3c53fa5f2a750272520
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/res2net/mask_rcnn_res2net50_vb_26w_4s_fpn_2x_coco.yml
@@ -0,0 +1,47 @@
+_BASE_: [
+ '../datasets/coco_instance.yml',
+ '../runtime.yml',
+ '../mask_rcnn/_base_/optimizer_1x.yml',
+ '../mask_rcnn/_base_/mask_rcnn_r50_fpn.yml',
+ '../mask_rcnn/_base_/mask_fpn_reader.yml',
+]
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/Res2Net50_26w_4s_pretrained.pdparams
+weights: output/mask_rcnn_res2net50_vb_26w_4s_fpn_2x_coco/model_final
+
+MaskRCNN:
+ backbone: Res2Net
+ neck: FPN
+ rpn_head: RPNHead
+ bbox_head: BBoxHead
+ mask_head: MaskHead
+ # post process
+ bbox_post_process: BBoxPostProcess
+ mask_post_process: MaskPostProcess
+
+
+Res2Net:
+ # index 0 stands for res2
+ depth: 50
+ width: 26
+ scales: 4
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ variant: b
+
+
+epoch: 24
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.3333333333333333
+ steps: 500
+
+
+TrainReader:
+ batch_size: 2
diff --git a/PaddleDetection-release-2.6/configs/res2net/mask_rcnn_res2net50_vd_26w_4s_fpn_2x_coco.yml b/PaddleDetection-release-2.6/configs/res2net/mask_rcnn_res2net50_vd_26w_4s_fpn_2x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..549e1f79128c6f33f28b468c4c946ac99e495a6f
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/res2net/mask_rcnn_res2net50_vd_26w_4s_fpn_2x_coco.yml
@@ -0,0 +1,47 @@
+_BASE_: [
+ '../datasets/coco_instance.yml',
+ '../runtime.yml',
+ '../mask_rcnn/_base_/optimizer_1x.yml',
+ '../mask_rcnn/_base_/mask_rcnn_r50_fpn.yml',
+ '../mask_rcnn/_base_/mask_fpn_reader.yml',
+]
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/Res2Net50_vd_26w_4s_pretrained.pdparams
+weights: output/mask_rcnn_res2net50_vd_26w_4s_fpn_2x_coco/model_final
+
+MaskRCNN:
+ backbone: Res2Net
+ neck: FPN
+ rpn_head: RPNHead
+ bbox_head: BBoxHead
+ mask_head: MaskHead
+ # post process
+ bbox_post_process: BBoxPostProcess
+ mask_post_process: MaskPostProcess
+
+
+Res2Net:
+ # index 0 stands for res2
+ depth: 50
+ width: 26
+ scales: 4
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ variant: d
+
+
+epoch: 24
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.3333333333333333
+ steps: 500
+
+
+TrainReader:
+ batch_size: 2
diff --git a/PaddleDetection-release-2.6/configs/retinanet/README.md b/PaddleDetection-release-2.6/configs/retinanet/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..1259d47dddf5eb52e1499c7c63ae913d2f806c7f
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/retinanet/README.md
@@ -0,0 +1,28 @@
+# RetinaNet (Focal Loss for Dense Object Detection)
+
+## Model Zoo
+
+| Backbone | Model | imgs/GPU | lr schedule | FPS | Box AP | download | config |
+| ------------ | --------- | -------- | ----------- | --- | ------ | ---------- | ----------- |
+| ResNet50-FPN | RetinaNet | 2 | 1x | --- | 37.5 | [model](https://bj.bcebos.com/v1/paddledet/models/retinanet_r50_fpn_1x_coco.pdparams) | [config](./retinanet_r50_fpn_1x_coco.yml) |
+| ResNet50-FPN | RetinaNet | 2 | 2x | --- | 39.1 | [model](https://bj.bcebos.com/v1/paddledet/models/retinanet_r50_fpn_2x_coco.pdparams) | [config](./retinanet_r50_fpn_2x_coco.yml) |
+| ResNet101-FPN| RetinaNet | 2 | 2x | --- | 40.6 | [model](https://paddledet.bj.bcebos.com/models/retinanet_r101_fpn_2x_coco.pdparams) | [config](./retinanet_r101_fpn_2x_coco.yml) |
+| ResNet50-FPN | RetinaNet + [FGD](../slim/distill/README.md) | 2 | 2x | --- | 40.8 | [model](https://bj.bcebos.com/v1/paddledet/models/retinanet_r101_distill_r50_2x_coco.pdparams) | [config](./retinanet_r50_fpn_2x_coco.yml)/[slim_config](../slim/distill/retinanet_resnet101_coco_distill.yml) |
+
+
+**Notes:**
+
+- The ResNet50-FPN models are trained on COCO train2017 with 8 GPUs. Both ResNet101-FPN and ResNet50-FPN with [FGD](../slim/distill/README.md) are trained on COCO train2017 with 4 GPUs.
+- All the above models are evaluated on val2017. Box AP=`mAP(IoU=0.5:0.95)`.
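
The focal loss that gives RetinaNet its title (configured in `_base_/retinanet_r50_fpn.yml` with `gamma: 2.0`, `alpha: 0.25`) can be sketched in a few lines of NumPy. This is an illustrative binary form, not PaddleDetection's implementation:

```python
import numpy as np

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss: down-weights easy examples via (1 - p_t)**gamma."""
    p = np.clip(p, 1e-7, 1 - 1e-7)
    p_t = np.where(y == 1, p, 1 - p)          # prob assigned to the true class
    alpha_t = np.where(y == 1, alpha, 1 - alpha)
    return -alpha_t * (1 - p_t) ** gamma * np.log(p_t)

probs = np.array([0.9, 0.6, 0.1])
labels = np.array([1, 1, 0])
losses = focal_loss(probs, labels)
# A confident correct prediction (p=0.9) contributes far less loss
# than an uncertain one (p=0.6), which keeps the dense negatives
# from dominating training.
print(losses)
```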
+
+
+## Citation
+
+```latex
+@inproceedings{lin2017focal,
+ title={Focal loss for dense object detection},
+ author={Lin, Tsung-Yi and Goyal, Priya and Girshick, Ross and He, Kaiming and Doll{\'a}r, Piotr},
+ booktitle={Proceedings of the IEEE international conference on computer vision},
+ year={2017}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/retinanet/_base_/optimizer_1x.yml b/PaddleDetection-release-2.6/configs/retinanet/_base_/optimizer_1x.yml
new file mode 100644
index 0000000000000000000000000000000000000000..39c54ac805031619debf9b31119afa86b3ead857
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/retinanet/_base_/optimizer_1x.yml
@@ -0,0 +1,19 @@
+epoch: 12
+
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [8, 11]
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 500
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/retinanet/_base_/optimizer_2x.yml b/PaddleDetection-release-2.6/configs/retinanet/_base_/optimizer_2x.yml
new file mode 100644
index 0000000000000000000000000000000000000000..61841433417b9fcc6f29a6c71a72ba23406b55ad
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/retinanet/_base_/optimizer_2x.yml
@@ -0,0 +1,19 @@
+epoch: 24
+
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 500
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/retinanet/_base_/retinanet_r50_fpn.yml b/PaddleDetection-release-2.6/configs/retinanet/_base_/retinanet_r50_fpn.yml
new file mode 100644
index 0000000000000000000000000000000000000000..fb2d767aed5bd383f312ce79e4e39e3710c3cb9c
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/retinanet/_base_/retinanet_r50_fpn.yml
@@ -0,0 +1,57 @@
+architecture: RetinaNet
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_cos_pretrained.pdparams
+
+RetinaNet:
+ backbone: ResNet
+ neck: FPN
+ head: RetinaHead
+
+ResNet:
+ depth: 50
+ variant: b
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [1,2,3]
+ num_stages: 4
+
+FPN:
+ out_channel: 256
+ spatial_scales: [0.125, 0.0625, 0.03125]
+ extra_stage: 2
+ has_extra_convs: true
+ use_c5: false
+
+RetinaHead:
+ conv_feat:
+ name: RetinaFeat
+ feat_in: 256
+ feat_out: 256
+ num_convs: 4
+ norm_type: null
+ use_dcn: false
+ anchor_generator:
+ name: RetinaAnchorGenerator
+ octave_base_scale: 4
+ scales_per_octave: 3
+ aspect_ratios: [0.5, 1.0, 2.0]
+ strides: [8.0, 16.0, 32.0, 64.0, 128.0]
+ bbox_assigner:
+ name: MaxIoUAssigner
+ positive_overlap: 0.5
+ negative_overlap: 0.4
+ allow_low_quality: true
+ loss_class:
+ name: FocalLoss
+ gamma: 2.0
+ alpha: 0.25
+ loss_weight: 1.0
+ loss_bbox:
+ name: SmoothL1Loss
+ beta: 0.0
+ loss_weight: 1.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.05
+ nms_threshold: 0.5
diff --git a/PaddleDetection-release-2.6/configs/retinanet/_base_/retinanet_reader.yml b/PaddleDetection-release-2.6/configs/retinanet/_base_/retinanet_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..1f686b4d7f06f143106491e9b8fe3957a40927c2
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/retinanet/_base_/retinanet_reader.yml
@@ -0,0 +1,36 @@
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], keep_ratio: True, interp: 1}
+ - RandomFlip: {}
+ - NormalizeImage: {is_scale: True, mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 2
+ shuffle: True
+ drop_last: True
+ collate_batch: False
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [800, 1333], keep_ratio: True, interp: 1}
+ - NormalizeImage: {is_scale: True, mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 8
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [800, 1333], keep_ratio: True, interp: 1}
+ - NormalizeImage: {is_scale: True, mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/retinanet/retinanet_r101_distill_r50_2x_coco.yml b/PaddleDetection-release-2.6/configs/retinanet/retinanet_r101_distill_r50_2x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..bb72cda8e99ac6a597ea5fc9b113378f7954bac3
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/retinanet/retinanet_r101_distill_r50_2x_coco.yml
@@ -0,0 +1,9 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/retinanet_r50_fpn.yml',
+ '_base_/optimizer_2x.yml',
+ '_base_/retinanet_reader.yml'
+]
+
+weights: https://paddledet.bj.bcebos.com/models/retinanet_r101_distill_r50_2x_coco.pdparams
diff --git a/PaddleDetection-release-2.6/configs/retinanet/retinanet_r101_fpn_2x_coco.yml b/PaddleDetection-release-2.6/configs/retinanet/retinanet_r101_fpn_2x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..0518dd30fd5597b58e5756a022af187add56e221
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/retinanet/retinanet_r101_fpn_2x_coco.yml
@@ -0,0 +1,18 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/retinanet_r50_fpn.yml',
+ '_base_/optimizer_2x.yml',
+ '_base_/retinanet_reader.yml'
+]
+
+weights: output/retinanet_r101_fpn_2x_coco/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet101_pretrained.pdparams
+
+ResNet:
+ depth: 101
+ variant: b
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [1, 2, 3]
+ num_stages: 4
diff --git a/PaddleDetection-release-2.6/configs/retinanet/retinanet_r50_fpn_1x_coco.yml b/PaddleDetection-release-2.6/configs/retinanet/retinanet_r50_fpn_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..cb6d342baeb428547d42f417acda02e8c90e39da
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/retinanet/retinanet_r50_fpn_1x_coco.yml
@@ -0,0 +1,9 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/retinanet_r50_fpn.yml',
+ '_base_/optimizer_1x.yml',
+ '_base_/retinanet_reader.yml'
+]
+
+weights: output/retinanet_r50_fpn_1x_coco/model_final
diff --git a/PaddleDetection-release-2.6/configs/retinanet/retinanet_r50_fpn_2x_coco.yml b/PaddleDetection-release-2.6/configs/retinanet/retinanet_r50_fpn_2x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..b25a5cfe1acf7fb872dd0bf289b06ceec59925ed
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/retinanet/retinanet_r50_fpn_2x_coco.yml
@@ -0,0 +1,9 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/retinanet_r50_fpn.yml',
+ '_base_/optimizer_2x.yml',
+ '_base_/retinanet_reader.yml'
+]
+
+weights: output/retinanet_r50_fpn_2x_coco/model_final
diff --git a/PaddleDetection-release-2.6/configs/rotate/README.md b/PaddleDetection-release-2.6/configs/rotate/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..bbc9fca205895f610ef8097901ab0b1e91533367
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rotate/README.md
@@ -0,0 +1,132 @@
+简体中文 | [English](README_en.md)
+
+# Rotated Box Detection
+
+## Contents
+- [Introduction](#introduction)
+- [Model Zoo](#model-zoo)
+- [Data Preparation](#data-preparation)
+- [Installing Dependencies](#installing-dependencies)
+
+## Introduction
+Rotated boxes are used to detect rectangular boxes that carry angle information, i.e. whose width and height are no longer parallel to the image axes. Compared with horizontal boxes, rotated boxes generally enclose less background. Rotated box detection is widely used in remote sensing and similar scenarios.
+
+## Model Zoo
+
+| Model | mAP | Lr schedule | Angle representation | Data augmentation | GPUs | Images/GPU | Download | Config |
+|:---:|:----:|:---------:|:-----:|:--------:|:-----:|:------------:|:-------:|:------:|
+| [S2ANet](./s2anet/README.md) | 73.84 | 2x | le135 | - | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/s2anet_alignconv_2x_dota.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/s2anet/s2anet_alignconv_2x_dota.yml) |
+| [FCOSR](./fcosr/README.md) | 76.62 | 3x | oc | RR | 4 | 4 | [model](https://paddledet.bj.bcebos.com/models/fcosr_x50_3x_dota.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/fcosr/fcosr_x50_3x_dota.yml) |
+| [PP-YOLOE-R-s](./ppyoloe_r/README.md) | 73.82 | 3x | oc | RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_s_3x_dota.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_s_3x_dota.yml) |
+| [PP-YOLOE-R-s](./ppyoloe_r/README.md) | 79.42 | 3x | oc | MS+RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_s_3x_dota_ms.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_s_3x_dota_ms.yml) |
+| [PP-YOLOE-R-m](./ppyoloe_r/README.md) | 77.64 | 3x | oc | RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_m_3x_dota.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_m_3x_dota.yml) |
+| [PP-YOLOE-R-m](./ppyoloe_r/README.md) | 79.71 | 3x | oc | MS+RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_m_3x_dota_ms.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_m_3x_dota_ms.yml) |
+| [PP-YOLOE-R-l](./ppyoloe_r/README.md) | 78.14 | 3x | oc | RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml) |
+| [PP-YOLOE-R-l](./ppyoloe_r/README.md) | 80.02 | 3x | oc | MS+RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota_ms.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota_ms.yml) |
+| [PP-YOLOE-R-x](./ppyoloe_r/README.md) | 78.28 | 3x | oc | RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_x_3x_dota.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_x_3x_dota.yml) |
+| [PP-YOLOE-R-x](./ppyoloe_r/README.md) | 80.73 | 3x | oc | MS+RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_x_3x_dota_ms.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_x_3x_dota_ms.yml) |
+
+**注意:**
+
+- 如果**GPU卡数**或者**batch size**发生了改变,你需要按照公式 **lr_new = lr_default * (batch_size_new * GPU_number_new) / (batch_size_default * GPU_number_default)** 调整学习率。
+- 模型库中的模型默认使用单尺度训练单尺度测试。如果数据增广一栏标明MS,意味着使用多尺度训练和多尺度测试。如果数据增广一栏标明RR,意味着使用RandomRotate数据增广进行训练。
+
+## 数据准备
+### DOTA数据准备
+DOTA数据集是一个大规模的遥感图像数据集,包含旋转框和水平框的标注。可以从[DOTA数据集官网](https://captain-whu.github.io/DOTA/)下载数据集并解压,解压后的数据集目录结构如下所示:
+```
+${DOTA_ROOT}
+├── test
+│ └── images
+├── train
+│ ├── images
+│ └── labelTxt
+└── val
+ ├── images
+ └── labelTxt
+```
+
+对于有标注的数据,每一张图片会对应一个同名的txt文件,文件中每一行为一个旋转框的标注,其格式如下:
+```
+x1 y1 x2 y2 x3 y3 x4 y4 class_name difficult
+```
+
+#### 单尺度切图
+DOTA数据集分辨率较高,因此一般在训练和测试之前对图像进行离线切图,使用单尺度进行切图可以使用以下命令:
+``` bash
+# 对于有标注的数据进行切图
+python configs/rotate/tools/prepare_data.py \
+ --input_dirs ${DOTA_ROOT}/train/ ${DOTA_ROOT}/val/ \
+ --output_dir ${OUTPUT_DIR}/trainval1024/ \
+ --coco_json_file DOTA_trainval1024.json \
+ --subsize 1024 \
+ --gap 200 \
+ --rates 1.0
+
+# 对于无标注的数据进行切图需要设置--image_only
+python configs/rotate/tools/prepare_data.py \
+ --input_dirs ${DOTA_ROOT}/test/ \
+ --output_dir ${OUTPUT_DIR}/test1024/ \
+ --coco_json_file DOTA_test1024.json \
+ --subsize 1024 \
+ --gap 200 \
+ --rates 1.0 \
+ --image_only
+```
+
+#### 多尺度切图
+使用多尺度进行切图可以使用以下命令:
+``` bash
+# 对于有标注的数据进行切图
+python configs/rotate/tools/prepare_data.py \
+ --input_dirs ${DOTA_ROOT}/train/ ${DOTA_ROOT}/val/ \
+ --output_dir ${OUTPUT_DIR}/trainval/ \
+ --coco_json_file DOTA_trainval1024.json \
+ --subsize 1024 \
+ --gap 500 \
+ --rates 0.5 1.0 1.5
+
+# 对于无标注的数据进行切图需要设置--image_only
+python configs/rotate/tools/prepare_data.py \
+ --input_dirs ${DOTA_ROOT}/test/ \
+ --output_dir ${OUTPUT_DIR}/test1024/ \
+ --coco_json_file DOTA_test1024.json \
+ --subsize 1024 \
+ --gap 500 \
+ --rates 0.5 1.0 1.5 \
+ --image_only
+```
+
+### 自定义数据集
+旋转框使用标准COCO数据格式,你可以将你的数据集转换成COCO格式以训练模型。COCO标准数据格式的标注信息中包含以下信息:
+``` python
+'annotations': [
+ {
+ 'id': 2083, 'category_id': 9, 'image_id': 9008,
+ 'bbox': [x, y, w, h], # 水平框标注
+ 'segmentation': [[x1, y1, x2, y2, x3, y3, x4, y4]], # 旋转框标注
+ ...
+ }
+ ...
+]
+```
+**需要注意的是`bbox`的标注是水平框标注,`segmentation`为旋转框四个点的标注(顺时针或逆时针均可)。在旋转框训练时`bbox`可以缺省,一般推荐根据旋转框标注`segmentation`生成。** 在PaddleDetection 2.4及之前的版本,`bbox`为旋转框标注[x, y, w, h, angle],`segmentation`缺省,**目前该格式已不再支持,请下载最新数据集或者转换成标准COCO格式**。
+
+## 安装依赖
+旋转框检测模型需要依赖外部算子进行训练,评估等。Linux环境下,你可以执行以下命令进行编译安装
+```
+cd ppdet/ext_op
+python setup.py install
+```
+Windows环境请按照如下步骤安装:
+
+(1)准备Visual Studio (版本需要>=Visual Studio 2015 update3),这里以VS2017为例;
+
+(2)点击开始-->Visual Studio 2017-->适用于 VS 2017 的x64本机工具命令提示;
+
+(3)设置环境变量:`set DISTUTILS_USE_SDK=1`
+
+(4)进入`PaddleDetection/ppdet/ext_op`目录,通过`python setup.py install`命令进行安装。
+
+安装完成后,可以执行`ppdet/ext_op/unittest`下的单测验证外部op是否正确安装
diff --git a/PaddleDetection-release-2.6/configs/rotate/README_en.md b/PaddleDetection-release-2.6/configs/rotate/README_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..6cdc111cdb984afcefedac58f2687270c7151e7d
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rotate/README_en.md
@@ -0,0 +1,129 @@
+English | [简体中文](README.md)
+
+# Rotated Object Detection
+
+## Table of Contents
+- [Introduction](#Introduction)
+- [Model Zoo](#Model-Zoo)
+- [Data Preparation](#Data-Preparation)
+- [Installation](#Installation)
+
+## Introduction
+Rotated object detection is used to detect rectangular bounding boxes with angle information, that is, the long and short sides of the rectangular bounding box are no longer parallel to the image coordinate axes. Oriented bounding boxes generally contain less background information than horizontal bounding boxes. Rotated object detection is often used in remote sensing scenarios.
+
+## Model Zoo
+| Model | mAP | Lr Scheduler | Angle | Aug | GPU Number | images/GPU | download | config |
+|:---:|:----:|:---------:|:-----:|:--------:|:-----:|:------------:|:-------:|:------:|
+| [S2ANet](./s2anet/README_en.md) | 73.84 | 2x | le135 | - | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/s2anet_alignconv_2x_dota.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/s2anet/s2anet_alignconv_2x_dota.yml) |
+| [FCOSR](./fcosr/README_en.md) | 76.62 | 3x | oc | RR | 4 | 4 | [model](https://paddledet.bj.bcebos.com/models/fcosr_x50_3x_dota.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/fcosr/fcosr_x50_3x_dota.yml) |
+| [PP-YOLOE-R-s](./ppyoloe_r/README_en.md) | 73.82 | 3x | oc | RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_s_3x_dota.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_s_3x_dota.yml) |
+| [PP-YOLOE-R-s](./ppyoloe_r/README_en.md) | 79.42 | 3x | oc | MS+RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_s_3x_dota_ms.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_s_3x_dota_ms.yml) |
+| [PP-YOLOE-R-m](./ppyoloe_r/README_en.md) | 77.64 | 3x | oc | RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_m_3x_dota.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_m_3x_dota.yml) |
+| [PP-YOLOE-R-m](./ppyoloe_r/README_en.md) | 79.71 | 3x | oc | MS+RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_m_3x_dota_ms.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_m_3x_dota_ms.yml) |
+| [PP-YOLOE-R-l](./ppyoloe_r/README_en.md) | 78.14 | 3x | oc | RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml) |
+| [PP-YOLOE-R-l](./ppyoloe_r/README_en.md) | 80.02 | 3x | oc | MS+RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota_ms.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota_ms.yml) |
+| [PP-YOLOE-R-x](./ppyoloe_r/README_en.md) | 78.28 | 3x | oc | RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_x_3x_dota.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_x_3x_dota.yml) |
+| [PP-YOLOE-R-x](./ppyoloe_r/README_en.md) | 80.73 | 3x | oc | MS+RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_x_3x_dota_ms.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_x_3x_dota_ms.yml) |
+
+**Notes:**
+
+- If the **GPU number** or **mini-batch size** is changed, the **learning rate** should be adjusted according to the formula **lr_new = lr_default * (batch_size_new * GPU_number_new) / (batch_size_default * GPU_number_default)**.
+- Models in the model zoo are trained and tested at a single scale by default. If `MS` is indicated in the data augmentation column, multi-scale training and multi-scale testing are used. If `RR` is indicated in the data augmentation column, RandomRotate data augmentation is used for training.
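The linear scaling rule above can be sketched as a small helper; the numbers in the example are purely illustrative, not defaults taken from any shipped config:

```python
def scale_lr(lr_default, batch_size_default, gpu_default, batch_size_new, gpu_new):
    """Linear scaling rule: the learning rate grows with the total batch size."""
    return lr_default * (batch_size_new * gpu_new) / (batch_size_default * gpu_default)

# Illustrative numbers: a config tuned for 4 GPUs x 2 images/GPU at lr 0.008,
# moved to 8 GPUs x 2 images/GPU, should double the learning rate.
print(scale_lr(0.008, 2, 4, 2, 8))
```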
+
+## Data Preparation
+### DOTA Dataset preparation
+The DOTA dataset is a large-scale remote sensing image dataset containing annotations of oriented and horizontal bounding boxes. The dataset can be downloaded from the [Official Website of DOTA Dataset](https://captain-whu.github.io/DOTA/). After decompression, its directory structure is as follows.
+```
+${DOTA_ROOT}
+├── test
+│ └── images
+├── train
+│ ├── images
+│ └── labelTxt
+└── val
+ ├── images
+ └── labelTxt
+```
+
+For labeled data, each image corresponds to a txt file with the same name, and each row in the txt file represents one rotated bounding box. The format is as follows:
+
+```
+x1 y1 x2 y2 x3 y3 x4 y4 class_name difficult
+```
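Each such line can be split into the four corner points, the class name and the difficult flag with a few lines of Python. This is a hedged sketch of the format, not a tool shipped with PaddleDetection:

```python
def parse_dota_line(line):
    """Parse one DOTA annotation line into corner points, class name, difficult flag."""
    fields = line.split()
    coords = [float(v) for v in fields[:8]]
    # Pair up the 8 numbers into 4 (x, y) corner points.
    points = [(coords[i], coords[i + 1]) for i in range(0, 8, 2)]
    return points, fields[8], int(fields[9])

points, class_name, difficult = parse_dota_line(
    "100 100 300 100 300 200 100 200 plane 0")
```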
+
+#### Slicing data with single scale
+The image resolution of the DOTA dataset is relatively high, so the images are usually sliced offline before training and testing. To slice the images at a single scale, you can use the command below
+``` bash
+# slicing labeled data
+python configs/rotate/tools/prepare_data.py \
+ --input_dirs ${DOTA_ROOT}/train/ ${DOTA_ROOT}/val/ \
+ --output_dir ${OUTPUT_DIR}/trainval1024/ \
+ --coco_json_file DOTA_trainval1024.json \
+ --subsize 1024 \
+ --gap 200 \
+ --rates 1.0
+# slicing unlabeled data by setting --image_only
+python configs/rotate/tools/prepare_data.py \
+ --input_dirs ${DOTA_ROOT}/test/ \
+ --output_dir ${OUTPUT_DIR}/test1024/ \
+ --coco_json_file DOTA_test1024.json \
+ --subsize 1024 \
+ --gap 200 \
+ --rates 1.0 \
+ --image_only
+```
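The `--subsize` and `--gap` flags imply a sliding-window stride of `subsize - gap` pixels. The sketch below shows how the window origins along one axis can be derived; it is illustrative and not the exact logic of `prepare_data.py`:

```python
def slice_origins(length, subsize=1024, gap=200):
    """Top-left origins of sliding windows along one image axis:
    windows of `subsize` pixels advance by `subsize - gap`, and the
    last window is clamped so it stays inside the image."""
    stride = subsize - gap
    if length <= subsize:
        return [0]
    origins = list(range(0, length - subsize, stride))
    origins.append(length - subsize)  # clamp the final window to the border
    return origins

print(slice_origins(2500))  # windows starting at 0, 824 and 1476
```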
+
+#### Slicing data with multiple scales
+To slice the images at multiple scales, you can use the command below
+``` bash
+# slicing labeled data
+python configs/rotate/tools/prepare_data.py \
+ --input_dirs ${DOTA_ROOT}/train/ ${DOTA_ROOT}/val/ \
+ --output_dir ${OUTPUT_DIR}/trainval/ \
+ --coco_json_file DOTA_trainval1024.json \
+ --subsize 1024 \
+ --gap 500 \
+ --rates 0.5 1.0 1.5
+# slicing unlabeled data by setting --image_only
+python configs/rotate/tools/prepare_data.py \
+ --input_dirs ${DOTA_ROOT}/test/ \
+ --output_dir ${OUTPUT_DIR}/test1024/ \
+ --coco_json_file DOTA_test1024.json \
+ --subsize 1024 \
+ --gap 500 \
+ --rates 0.5 1.0 1.5 \
+ --image_only
+```
+
+### Custom Dataset
+Rotated object detection uses the standard COCO data format, and you can convert your dataset to COCO format to train the model. Annotations in the standard COCO format contain the following information
+``` python
+'annotations': [
+ {
+ 'id': 2083, 'category_id': 9, 'image_id': 9008,
+ 'bbox': [x, y, w, h], # horizontal bounding box
+ 'segmentation': [[x1, y1, x2, y2, x3, y3, x4, y4]], # rotated bounding box
+ ...
+ }
+ ...
+]
+```
+**It should be noted that `bbox` is the horizontal bounding box and `segmentation` contains the four corner points of the rotated bounding box (in clockwise or counterclockwise order). The `bbox` field can be omitted when training a rotated object detector, and it is generally recommended to generate `bbox` from the `segmentation` annotation**. In PaddleDetection 2.4 and earlier versions, `bbox` represented the rotated bounding box [x, y, w, h, angle] and `segmentation` was empty. **This format is no longer supported since PaddleDetection 2.5, so please download the latest dataset or convert yours to the standard COCO format**.
+
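Generating the horizontal `bbox` from the `segmentation` polygon amounts to taking the axis-aligned extent of the four corners; a minimal sketch:

```python
def poly_to_bbox(segmentation):
    """Derive a COCO-style horizontal bbox [x, y, w, h] from a rotated-box
    polygon given as a flat list [x1, y1, x2, y2, x3, y3, x4, y4]."""
    xs, ys = segmentation[0::2], segmentation[1::2]
    x, y = min(xs), min(ys)
    return [x, y, max(xs) - x, max(ys) - y]

print(poly_to_bbox([10, 20, 50, 10, 60, 40, 20, 50]))  # -> [10, 10, 50, 40]
```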
+## Installation
+Models for rotated object detection depend on external operators for training, evaluation, etc. In a Linux environment, you can execute the following commands to compile and install them.
+```
+cd ppdet/ext_op
+python setup.py install
+```
+In a Windows environment, perform the following steps to install them:
+
+(1) Prepare Visual Studio (Visual Studio 2015 Update3 or later is required); VS2017 is used as the example below;
+
+(2) Go to Start --> Visual Studio 2017 --> x64 Native Tools Command Prompt for VS 2017;
+
+(3) Set the environment variable: `set DISTUTILS_USE_SDK=1`
+
+(4) Enter the `ppdet/ext_op` directory and run `python setup.py install` to install.
+
+After the installation, you can run the unit tests under `ppdet/ext_op/unittest` to verify that the external operators are installed correctly.
diff --git a/PaddleDetection-release-2.6/configs/rotate/fcosr/README.md b/PaddleDetection-release-2.6/configs/rotate/fcosr/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..1d93449d96916fe752df11fe605d35729cddb1f3
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rotate/fcosr/README.md
@@ -0,0 +1,91 @@
+简体中文 | [English](README_en.md)
+
+# FCOSR
+
+## 内容
+- [简介](#简介)
+- [模型库](#模型库)
+- [使用说明](#使用说明)
+- [预测部署](#预测部署)
+- [引用](#引用)
+
+## 简介
+
+[FCOSR](https://arxiv.org/abs/2111.10780)是基于[FCOS](https://arxiv.org/abs/1904.01355)的单阶段Anchor-Free的旋转框检测算法。FCOSR主要聚焦于旋转框的标签匹配策略,提出了椭圆中心采样和模糊样本标签匹配的方法。在loss方面,FCOSR使用了[ProbIoU](https://arxiv.org/abs/2106.06072)避免边界不连续性问题。
+
+## 模型库
+
+| 模型 | Backbone | mAP | 学习率策略 | 角度表示 | 数据增广 | GPU数目 | 每GPU图片数目 | 模型下载 | 配置文件 |
+|:---:|:--------:|:----:|:---------:|:-----:|:--------:|:-----:|:------------:|:-------:|:------:|
+| FCOSR-M | ResNeXt-50 | 76.62 | 3x | oc | RR | 4 | 4 | [model](https://paddledet.bj.bcebos.com/models/fcosr_x50_3x_dota.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/fcosr/fcosr_x50_3x_dota.yml) |
+
+**注意:**
+
+- 如果**GPU卡数**或者**batch size**发生了改变,你需要按照公式 **lr_new = lr_default * (batch_size_new * GPU_number_new) / (batch_size_default * GPU_number_default)** 调整学习率。
+- 模型库中的模型默认使用单尺度训练单尺度测试。如果数据增广一栏标明MS,意味着使用多尺度训练和多尺度测试。如果数据增广一栏标明RR,意味着使用RandomRotate数据增广进行训练。
+
+## 使用说明
+
+参考[数据准备](../README.md#数据准备)准备数据。
+
+### 训练
+
+GPU单卡训练
+``` bash
+CUDA_VISIBLE_DEVICES=0 python tools/train.py -c configs/rotate/fcosr/fcosr_x50_3x_dota.yml
+```
+
+GPU多卡训练
+``` bash
+CUDA_VISIBLE_DEVICES=0,1,2,3 python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/rotate/fcosr/fcosr_x50_3x_dota.yml
+```
+
+### 预测
+
+执行以下命令预测单张图片,图片预测结果会默认保存在`output`文件夹下面
+``` bash
+python tools/infer.py -c configs/rotate/fcosr/fcosr_x50_3x_dota.yml -o weights=https://paddledet.bj.bcebos.com/models/fcosr_x50_3x_dota.pdparams --infer_img=demo/P0861__1.0__1154___824.png --draw_threshold=0.5
+```
+
+### DOTA数据集评估
+
+参考[DOTA Task](https://captain-whu.github.io/DOTA/tasks.html), 评估DOTA数据集需要生成一个包含所有检测结果的zip文件,每一类的检测结果储存在一个txt文件中,txt文件中每行格式为:`image_name score x1 y1 x2 y2 x3 y3 x4 y4`。将生成的zip文件提交到[DOTA Evaluation](https://captain-whu.github.io/DOTA/evaluation.html)的Task1进行评估。你可以执行以下命令得到test数据集的预测结果:
+``` bash
+python tools/infer.py -c configs/rotate/fcosr/fcosr_x50_3x_dota.yml -o weights=https://paddledet.bj.bcebos.com/models/fcosr_x50_3x_dota.pdparams --infer_dir=/path/to/test/images --output_dir=output_fcosr --visualize=False --save_results=True
+```
+将预测结果处理成官网评估所需要的格式:
+``` bash
+python configs/rotate/tools/generate_result.py --pred_txt_dir=output_fcosr/ --output_dir=submit/ --data_type=dota10
+
+zip -r submit.zip submit
+```
+
+## 预测部署
+
+部署教程请参考[预测部署](../../../deploy/README.md)
+
+## 引用
+
+```
+@article{li2021fcosr,
+ title={Fcosr: A simple anchor-free rotated detector for aerial object detection},
+ author={Li, Zhonghua and Hou, Biao and Wu, Zitong and Jiao, Licheng and Ren, Bo and Yang, Chen},
+ journal={arXiv preprint arXiv:2111.10780},
+ year={2021}
+}
+
+@inproceedings{tian2019fcos,
+ title={Fcos: Fully convolutional one-stage object detection},
+ author={Tian, Zhi and Shen, Chunhua and Chen, Hao and He, Tong},
+ booktitle={Proceedings of the IEEE/CVF international conference on computer vision},
+ pages={9627--9636},
+ year={2019}
+}
+
+@article{llerena2021gaussian,
+ title={Gaussian Bounding Boxes and Probabilistic Intersection-over-Union for Object Detection},
+ author={Llerena, Jeffri M and Zeni, Luis Felipe and Kristen, Lucas N and Jung, Claudio},
+ journal={arXiv preprint arXiv:2106.06072},
+ year={2021}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/rotate/fcosr/README_en.md b/PaddleDetection-release-2.6/configs/rotate/fcosr/README_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..8d7621b339b3adb5d309c195754fd013f191b7d5
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rotate/fcosr/README_en.md
@@ -0,0 +1,92 @@
+English | [简体中文](README.md)
+
+# FCOSR
+
+## Content
+- [Introduction](#Introduction)
+- [Model Zoo](#Model-Zoo)
+- [Getting Started](#Getting-Started)
+- [Deployment](#Deployment)
+- [Citations](#Citations)
+
+## Introduction
+
+[FCOSR](https://arxiv.org/abs/2111.10780) is a one-stage anchor-free model based on [FCOS](https://arxiv.org/abs/1904.01355). FCOSR focuses on the label assignment strategy for oriented bounding boxes and proposes an ellipse center sampling method and a fuzzy sample assignment strategy. In terms of loss, FCOSR uses [ProbIoU](https://arxiv.org/abs/2106.06072) to avoid the boundary discontinuity problem.
+
+## Model Zoo
+
+| Model | Backbone | mAP | Lr Scheduler | Angle | Aug | GPU Number | images/GPU | download | config |
+|:---:|:--------:|:----:|:---------:|:-----:|:--------:|:-----:|:------------:|:-------:|:------:|
+| FCOSR-M | ResNeXt-50 | 76.62 | 3x | oc | RR | 4 | 4 | [model](https://paddledet.bj.bcebos.com/models/fcosr_x50_3x_dota.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/fcosr/fcosr_x50_3x_dota.yml) |
+
+**Notes:**
+
+- If the **GPU number** or **mini-batch size** is changed, the **learning rate** should be adjusted according to the formula **lr_new = lr_default * (batch_size_new * GPU_number_new) / (batch_size_default * GPU_number_default)**.
+- Models in the model zoo are trained and tested at a single scale by default. If `MS` is indicated in the data augmentation column, multi-scale training and multi-scale testing are used. If `RR` is indicated in the data augmentation column, RandomRotate data augmentation is used for training.
+
+## Getting Started
+
+Refer to [Data-Preparation](../README_en.md#Data-Preparation) to prepare data.
+
+### Training
+
+Single GPU Training
+``` bash
+CUDA_VISIBLE_DEVICES=0 python tools/train.py -c configs/rotate/fcosr/fcosr_x50_3x_dota.yml
+```
+
+Multi-GPU Training
+``` bash
+CUDA_VISIBLE_DEVICES=0,1,2,3 python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/rotate/fcosr/fcosr_x50_3x_dota.yml
+```
+
+### Inference
+
+Run the following command to run inference on a single image. The result will be saved in the `output` directory by default.
+
+``` bash
+python tools/infer.py -c configs/rotate/fcosr/fcosr_x50_3x_dota.yml -o weights=https://paddledet.bj.bcebos.com/models/fcosr_x50_3x_dota.pdparams --infer_img=demo/P0861__1.0__1154___824.png --draw_threshold=0.5
+```
+
+### Evaluation on DOTA Dataset
+Referring to [DOTA Task](https://captain-whu.github.io/DOTA/tasks.html), you need to submit a zip file containing the results for all test images for evaluation. The detection results of each category are stored in a txt file, each line of which is in the following format:
+`image_name score x1 y1 x2 y2 x3 y3 x4 y4`. To evaluate, submit the generated zip file to Task1 of [DOTA Evaluation](https://captain-whu.github.io/DOTA/evaluation.html). You can run the following command to get the inference results on the test dataset:
+``` bash
+python tools/infer.py -c configs/rotate/fcosr/fcosr_x50_3x_dota.yml -o weights=https://paddledet.bj.bcebos.com/models/fcosr_x50_3x_dota.pdparams --infer_dir=/path/to/test/images --output_dir=output_fcosr --visualize=False --save_results=True
+```
+Process the prediction results into the format required for the official website evaluation:
+``` bash
+python configs/rotate/tools/generate_result.py --pred_txt_dir=output_fcosr/ --output_dir=submit/ --data_type=dota10
+
+zip -r submit.zip submit
+```
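`generate_result.py` performs the conversion; conceptually it groups detections by category and emits one line per box in the Task1 format described above. A hedged sketch of that grouping step (the function and variable names here are illustrative, not part of the tool):

```python
from collections import defaultdict

def group_by_class(detections):
    """Group (image_name, class_name, score, 8-coord polygon) tuples into
    per-class submission lines: image_name score x1 y1 x2 y2 x3 y3 x4 y4."""
    per_class = defaultdict(list)
    for image_name, class_name, score, poly in detections:
        coords = " ".join(f"{v:.1f}" for v in poly)
        per_class[class_name].append(f"{image_name} {score:.3f} {coords}")
    return per_class  # each value list is then written to Task1_<class_name>.txt

lines = group_by_class([
    ("P0006", "plane", 0.98, [10, 10, 60, 10, 60, 40, 10, 40]),
])
```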
+
+## Deployment
+
+Please refer to the deployment tutorial: [Deployment](../../../deploy/README_en.md)
+
+## Citations
+
+```
+@article{li2021fcosr,
+ title={Fcosr: A simple anchor-free rotated detector for aerial object detection},
+ author={Li, Zhonghua and Hou, Biao and Wu, Zitong and Jiao, Licheng and Ren, Bo and Yang, Chen},
+ journal={arXiv preprint arXiv:2111.10780},
+ year={2021}
+}
+
+@inproceedings{tian2019fcos,
+ title={Fcos: Fully convolutional one-stage object detection},
+ author={Tian, Zhi and Shen, Chunhua and Chen, Hao and He, Tong},
+ booktitle={Proceedings of the IEEE/CVF international conference on computer vision},
+ pages={9627--9636},
+ year={2019}
+}
+
+@article{llerena2021gaussian,
+ title={Gaussian Bounding Boxes and Probabilistic Intersection-over-Union for Object Detection},
+ author={Llerena, Jeffri M and Zeni, Luis Felipe and Kristen, Lucas N and Jung, Claudio},
+ journal={arXiv preprint arXiv:2106.06072},
+ year={2021}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/rotate/fcosr/_base_/fcosr_reader.yml b/PaddleDetection-release-2.6/configs/rotate/fcosr/_base_/fcosr_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..854436dc1e109289d9d460390a299dfad0d988e0
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rotate/fcosr/_base_/fcosr_reader.yml
@@ -0,0 +1,46 @@
+worker_num: 4
+image_height: &image_height 1024
+image_width: &image_width 1024
+image_size: &image_size [*image_height, *image_width]
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - Poly2Array: {}
+ - RandomRFlip: {}
+ - RandomRRotate: {angle_mode: 'value', angle: [0, 90, 180, -90]}
+ - RandomRRotate: {angle_mode: 'value', angle: [30, 60], rotate_prob: 0.5}
+ - RResize: {target_size: *image_size, keep_ratio: True, interp: 2}
+ - Poly2RBox: {filter_threshold: 2, filter_mode: 'edge', rbox_type: 'oc'}
+ batch_transforms:
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - PadRGT: {}
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Poly2Array: {}
+ - RResize: {target_size: *image_size, keep_ratio: True, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 2
+ collate_batch: false
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *image_size, keep_ratio: True, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 2
diff --git a/PaddleDetection-release-2.6/configs/rotate/fcosr/_base_/fcosr_x50.yml b/PaddleDetection-release-2.6/configs/rotate/fcosr/_base_/fcosr_x50.yml
new file mode 100644
index 0000000000000000000000000000000000000000..77a4d8a2ff0594aa9f948111092fd6c625d13234
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rotate/fcosr/_base_/fcosr_x50.yml
@@ -0,0 +1,44 @@
+architecture: YOLOv3
+snapshot_epoch: 1
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNeXt50_32x4d_pretrained.pdparams
+
+YOLOv3:
+ backbone: ResNet
+ neck: FPN
+ yolo_head: FCOSRHead
+ post_process: ~
+
+ResNet:
+ depth: 50
+ groups: 32
+ base_width: 4
+ variant: b
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [1,2,3]
+ num_stages: 4
+
+FPN:
+ out_channel: 256
+ extra_stage: 2
+ has_extra_convs: true
+ use_c5: false
+ relu_before_extra_convs: true
+
+FCOSRHead:
+ feat_channels: 256
+ fpn_strides: [8, 16, 32, 64, 128]
+ stacked_convs: 4
+ loss_weight: {class: 1.0, probiou: 1.0}
+ assigner:
+ name: FCOSRAssigner
+ factor: 12
+ threshold: 0.23
+ boundary: [[-1, 64], [64, 128], [128, 256], [256, 512], [512, 100000000.0]]
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 2000
+ keep_top_k: -1
+ score_threshold: 0.1
+ nms_threshold: 0.1
+ normalized: False
diff --git a/PaddleDetection-release-2.6/configs/rotate/fcosr/_base_/optimizer_3x.yml b/PaddleDetection-release-2.6/configs/rotate/fcosr/_base_/optimizer_3x.yml
new file mode 100644
index 0000000000000000000000000000000000000000..859db126bed27471f6d8dcd02761299395ce9468
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rotate/fcosr/_base_/optimizer_3x.yml
@@ -0,0 +1,20 @@
+epoch: 36
+
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [24, 33]
+ - !LinearWarmup
+ start_factor: 0.3333333
+ steps: 500
+
+OptimizerBuilder:
+ clip_grad_by_norm: 35.
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/rotate/fcosr/fcosr_x50_3x_dota.yml b/PaddleDetection-release-2.6/configs/rotate/fcosr/fcosr_x50_3x_dota.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d9554d30896ca5e2a3a5eb03725f1f6bb97a7dfc
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rotate/fcosr/fcosr_x50_3x_dota.yml
@@ -0,0 +1,9 @@
+_BASE_: [
+ '../../datasets/dota.yml',
+ '../../runtime.yml',
+ '_base_/optimizer_3x.yml',
+ '_base_/fcosr_reader.yml',
+ '_base_/fcosr_x50.yml'
+]
+
+weights: output/fcosr_x50_3x_dota/model_final
diff --git a/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/README.md b/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..5efb248e5ba37168c74ce156a7a76ace53cec9a0
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/README.md
@@ -0,0 +1,178 @@
+简体中文 | [English](README_en.md)
+
+# PP-YOLOE-R
+
+## 内容
+- [简介](#简介)
+- [模型库](#模型库)
+- [使用说明](#使用说明)
+- [预测部署](#预测部署)
+- [附录](#附录)
+- [引用](#引用)
+
+## 简介
+PP-YOLOE-R是一个高效的单阶段Anchor-free旋转框检测模型。基于PP-YOLOE, PP-YOLOE-R以极少的参数量和计算量为代价,引入了一系列有用的设计来提升检测精度。在DOTA 1.0数据集上,PP-YOLOE-R-l和PP-YOLOE-R-x在单尺度训练和测试的情况下分别达到了78.14和78.28 mAP,这超越了几乎所有的旋转框检测模型。通过多尺度训练和测试,PP-YOLOE-R-l和PP-YOLOE-R-x的检测精度进一步提升至80.02和80.73 mAP。在这种情况下,PP-YOLOE-R-x超越了所有的anchor-free方法并且和最先进的anchor-based的两阶段模型精度几乎相当。此外,PP-YOLOE-R-s和PP-YOLOE-R-m通过多尺度训练和测试可以达到79.42和79.71 mAP。考虑到这两个模型的参数量和计算量,其性能也非常卓越。在保持高精度的同时,PP-YOLOE-R避免使用特殊的算子,例如Deformable Convolution或Rotated RoI Align,以使其能轻松地部署在多种多样的硬件上。在1024x1024的输入分辨率下,PP-YOLOE-R-s/m/l/x在RTX 2080 Ti上使用TensorRT FP16分别能达到69.8/55.1/48.3/37.1 FPS,在Tesla V100上分别能达到114.5/86.8/69.7/50.7 FPS。更多细节可以参考我们的[**技术报告**](https://arxiv.org/abs/2211.02386)。
+
+
+
+
+PP-YOLOE-R相较于PP-YOLOE做了以下几点改动:
+- Rotated Task Alignment Learning
+- 解耦的角度预测头
+- 使用DFL进行角度预测
+- 可学习的门控单元
+- [ProbIoU损失函数](https://arxiv.org/abs/2106.06072)
+
+## 模型库
+
+| 模型 | Backbone | mAP | V100 TRT FP16 (FPS) | RTX 2080 Ti TRT FP16 (FPS) | Params (M) | FLOPs (G) | 学习率策略 | 角度表示 | 数据增广 | GPU数目 | 每GPU图片数目 | 模型下载 | 配置文件 |
+|:---:|:--------:|:----:|:--------------------:|:------------------------:|:----------:|:---------:|:--------:|:----------:|:-------:|:------:|:-----------:|:--------:|:------:|
+| PP-YOLOE-R-s | CRN-s | 73.82 | 114.5 | 69.8 | 8.09 | 43.46 | 3x | oc | RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_s_3x_dota.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_s_3x_dota.yml) |
+| PP-YOLOE-R-s | CRN-s | 79.42 | 114.5 | 69.8 | 8.09 | 43.46 | 3x | oc | MS+RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_s_3x_dota_ms.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_s_3x_dota_ms.yml) |
+| PP-YOLOE-R-m | CRN-m | 77.64 | 86.8 | 55.1 | 23.96 |127.00 | 3x | oc | RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_m_3x_dota.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_m_3x_dota.yml) |
+| PP-YOLOE-R-m | CRN-m | 79.71 | 86.8 | 55.1 | 23.96 |127.00 | 3x | oc | MS+RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_m_3x_dota_ms.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_m_3x_dota_ms.yml) |
+| PP-YOLOE-R-l | CRN-l | 78.14 | 69.7 | 48.3 | 53.29 |281.65 | 3x | oc | RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml) |
+| PP-YOLOE-R-l | CRN-l | 80.02 | 69.7 | 48.3 | 53.29 |281.65 | 3x | oc | MS+RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota_ms.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota_ms.yml) |
+| PP-YOLOE-R-x | CRN-x | 78.28 | 50.7 | 37.1 | 100.27|529.82 | 3x | oc | RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_x_3x_dota.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_x_3x_dota.yml) |
+| PP-YOLOE-R-x | CRN-x | 80.73 | 50.7 | 37.1 | 100.27|529.82 | 3x | oc | MS+RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_x_3x_dota_ms.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_x_3x_dota_ms.yml) |
+
+**注意:**
+
+- 如果**GPU卡数**或者**batch size**发生了改变,你需要按照公式 **lr_new = lr_default * (batch_size_new * GPU_number_new) / (batch_size_default * GPU_number_default)** 调整学习率。
+- 模型库中的模型默认使用单尺度训练单尺度测试。如果数据增广一栏标明MS,意味着使用多尺度训练和多尺度测试。如果数据增广一栏标明RR,意味着使用RandomRotate数据增广进行训练。
+- CRN表示在PP-YOLOE中提出的CSPRepResNet
+- PP-YOLOE-R的参数量和计算量是在重参数化之后计算得到,输入图像的分辨率为1024x1024
+- 速度测试使用TensorRT 8.2.3在DOTA测试集上对2000张图片测试并取平均值得到。复现方法参考[速度测试](#速度测试)一节。
+
+## 使用说明
+
+参考[数据准备](../README.md#数据准备)准备数据。
+
+### 训练
+
+GPU单卡训练
+``` bash
+CUDA_VISIBLE_DEVICES=0 python tools/train.py -c configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml
+```
+
+GPU多卡训练
+``` bash
+CUDA_VISIBLE_DEVICES=0,1,2,3 python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml
+```
+
+### 预测
+
+执行以下命令预测单张图片,图片预测结果会默认保存在`output`文件夹下面
+``` bash
+python tools/infer.py -c configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota.pdparams --infer_img=demo/P0861__1.0__1154___824.png --draw_threshold=0.5
+```
+
+### DOTA数据集评估
+
+参考[DOTA Task](https://captain-whu.github.io/DOTA/tasks.html), 评估DOTA数据集需要生成一个包含所有检测结果的zip文件,每一类的检测结果储存在一个txt文件中,txt文件中每行格式为:`image_name score x1 y1 x2 y2 x3 y3 x4 y4`。将生成的zip文件提交到[DOTA Evaluation](https://captain-whu.github.io/DOTA/evaluation.html)的Task1进行评估。你可以执行以下命令得到test数据集的预测结果:
+``` bash
+python tools/infer.py -c configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota.pdparams --infer_dir=/path/to/test/images --output_dir=output_ppyoloe_r --visualize=False --save_results=True
+```
+将预测结果处理成官网评估所需要的格式:
+``` bash
+python configs/rotate/tools/generate_result.py --pred_txt_dir=output_ppyoloe_r/ --output_dir=submit/ --data_type=dota10
+
+zip -r submit.zip submit
+```
+
+### 速度测试
+可以使用Paddle模式或者Paddle-TRT模式进行测速。当使用Paddle-TRT模式测速时,需要确保**TensorRT版本大于8.2, PaddlePaddle版本为develop版本**。使用Paddle-TRT进行测速,可以执行以下命令:
+
+``` bash
+# 导出模型
+python tools/export_model.py -c configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota.pdparams trt=True
+
+# 速度测试
+CUDA_VISIBLE_DEVICES=0 python configs/rotate/tools/inference_benchmark.py --model_dir output_inference/ppyoloe_r_crn_l_3x_dota/ --image_dir /path/to/dota/test/dir --run_mode trt_fp16
+```
+当只使用Paddle进行测速,可以执行以下命令:
+``` bash
+# 导出模型
+python tools/export_model.py -c configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota.pdparams
+
+# speed test
+CUDA_VISIBLE_DEVICES=0 python configs/rotate/tools/inference_benchmark.py --model_dir output_inference/ppyoloe_r_crn_l_3x_dota/ --image_dir /path/to/dota/test/dir --run_mode paddle
+```
+
+## Deployment
+
+To deploy **with Paddle**, run the following commands:
+``` bash
+# export inference model
+python tools/export_model.py -c configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota.pdparams
+
+# run inference on an image
+python deploy/python/infer.py --image_file demo/P0072__1.0__0___0.png --model_dir=output_inference/ppyoloe_r_crn_l_3x_dota --run_mode=paddle --device=gpu
+```
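For reference, `--model_dir` points at the directory written by `tools/export_model.py`. A rough sketch of its expected contents, simulated here with empty files (the file names come from the paddle2onnx and ONNX commands later in this README; a real export may contain additional files):

```bash
# Simulate the layout of an exported inference model directory.
mkdir -p output_inference/ppyoloe_r_crn_l_3x_dota
touch output_inference/ppyoloe_r_crn_l_3x_dota/infer_cfg.yml    # preprocessing config
touch output_inference/ppyoloe_r_crn_l_3x_dota/model.pdmodel    # model structure
touch output_inference/ppyoloe_r_crn_l_3x_dota/model.pdiparams  # model weights
ls output_inference/ppyoloe_r_crn_l_3x_dota
```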
+
+To deploy **with Paddle-TRT**, run the following commands:
+``` bash
+# export inference model
+python tools/export_model.py -c configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota.pdparams trt=True
+
+# run inference on an image
+python deploy/python/infer.py --image_file demo/P0072__1.0__0___0.png --model_dir=output_inference/ppyoloe_r_crn_l_3x_dota --run_mode=trt_fp16 --device=gpu
+```
+
+**Note:**
+- When using Paddle-TRT, make sure that **PaddlePaddle is the develop version and the TensorRT version is higher than 8.2**.
+
+To deploy **with ONNX Runtime**, run the following commands:
+``` bash
+# export inference model
+python tools/export_model.py -c configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota.pdparams export_onnx=True
+
+# install paddle2onnx
+pip install paddle2onnx
+
+# convert to onnx model
+paddle2onnx --model_dir output_inference/ppyoloe_r_crn_l_3x_dota --model_filename model.pdmodel --params_filename model.pdiparams --opset_version 11 --save_file ppyoloe_r_crn_l_3x_dota.onnx
+
+# run inference on an image
+python configs/rotate/tools/onnx_infer.py --infer_cfg output_inference/ppyoloe_r_crn_l_3x_dota/infer_cfg.yml --onnx_file ppyoloe_r_crn_l_3x_dota.onnx --image_file demo/P0072__1.0__0___0.png
+
+```
+
+## Appendix
+
+Ablation study of PP-YOLOE-R
+
+| Model | mAP | Params (M) | FLOPs (G) |
+| :-: | :-: | :------: | :------: |
+| Baseline | 75.61 | 50.65 | 269.09 |
+| +Rotated Task Alignment Learning | 77.24 | 50.65 | 269.09 |
+| +Decoupled Angle Prediction Head | 77.78 | 52.20 | 272.72 |
+| +Angle Prediction with DFL | 78.01 | 53.29 | 281.65 |
+| +Learnable Gating Unit for RepVGG | 78.14 | 53.29 | 281.65 |
+
+
+## Citations
+
+```
+@article{wang2022pp,
+ title={PP-YOLOE-R: An Efficient Anchor-Free Rotated Object Detector},
+ author={Wang, Xinxin and Wang, Guanzhong and Dang, Qingqing and Liu, Yi and Hu, Xiaoguang and Yu, Dianhai},
+ journal={arXiv preprint arXiv:2211.02386},
+ year={2022}
+}
+
+@article{xu2022pp,
+ title={PP-YOLOE: An evolved version of YOLO},
+ author={Xu, Shangliang and Wang, Xinxin and Lv, Wenyu and Chang, Qinyao and Cui, Cheng and Deng, Kaipeng and Wang, Guanzhong and Dang, Qingqing and Wei, Shengyu and Du, Yuning and others},
+ journal={arXiv preprint arXiv:2203.16250},
+ year={2022}
+}
+
+@article{llerena2021gaussian,
+ title={Gaussian Bounding Boxes and Probabilistic Intersection-over-Union for Object Detection},
+ author={Llerena, Jeffri M and Zeni, Luis Felipe and Kristen, Lucas N and Jung, Claudio},
+ journal={arXiv preprint arXiv:2106.06072},
+ year={2021}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/README_en.md b/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/README_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..6a37ed5fbd0904f90eba7f4e52b8f7ddd7c3ac3a
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/README_en.md
@@ -0,0 +1,180 @@
+English | [简体中文](README.md)
+
+# PP-YOLOE-R
+
+## Content
+- [Introduction](#Introduction)
+- [Model Zoo](#Model-Zoo)
+- [Getting Start](#Getting-Start)
+- [Deployment](#Deployment)
+- [Appendix](#Appendix)
+- [Citations](#Citations)
+
+## Introduction
+PP-YOLOE-R is an efficient anchor-free rotated object detector. Based on PP-YOLOE, PP-YOLOE-R introduces a bag of useful tricks that improve detection precision at the expense of marginal extra parameters and computation. PP-YOLOE-R-l and PP-YOLOE-R-x achieve 78.14 and 78.27 mAP respectively on the DOTA 1.0 dataset with single-scale training and testing, outperforming almost all other rotated object detectors. With multi-scale training and testing, the detection precision of PP-YOLOE-R-l and PP-YOLOE-R-x further improves to 80.02 and 80.73 mAP. In this setting, PP-YOLOE-R-x surpasses all anchor-free methods and is competitive with state-of-the-art anchor-based two-stage models. Moreover, PP-YOLOE-R-s and PP-YOLOE-R-m achieve 79.42 and 79.71 mAP with multi-scale training and testing, an excellent result considering the parameters and GFLOPs of these two models. While maintaining high precision, PP-YOLOE-R avoids special operators such as Deformable Convolution or Rotated RoI Align, so it can be deployed easily on a wide range of hardware. At an input resolution of 1024x1024, PP-YOLOE-R-s/m/l/x reach 69.8/55.1/48.3/37.1 FPS on an RTX 2080 Ti and 114.5/86.8/69.7/50.7 FPS on a Tesla V100 GPU with TensorRT and FP16 precision. For more details, please refer to our [**technical report**](https://arxiv.org/abs/2211.02386).
+
+
+

+
+
+Compared with PP-YOLOE, PP-YOLOE-R has made the following changes:
+- Rotated Task Alignment Learning
+- Decoupled Angle Prediction Head
+- Angle Prediction with DFL
+- Learnable Gating Unit for RepVGG
+- [ProbIoU Loss](https://arxiv.org/abs/2106.06072)
+
+## Model Zoo
+| Model | Backbone | mAP | V100 TRT FP16 (FPS) | RTX 2080 Ti TRT FP16 (FPS) | Params (M) | FLOPs (G) | Lr Scheduler | Angle | Aug | GPU Number | images/GPU | download | config |
+|:-----:|:--------:|:----:|:-------------------:|:--------------------------:|:-----------:|:---------:|:--------:|:-----:|:---:|:----------:|:----------:|:--------:|:------:|
+| PP-YOLOE-R-s | CRN-s | 73.82 | 114.5 | 69.8 | 8.09 | 43.46 | 3x | oc | RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_s_3x_dota.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_s_3x_dota.yml) |
+| PP-YOLOE-R-s | CRN-s | 79.42 | 114.5 | 69.8 | 8.09 | 43.46 | 3x | oc | MS+RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_s_3x_dota_ms.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_s_3x_dota_ms.yml) |
+| PP-YOLOE-R-m | CRN-m | 77.64 | 86.8 | 55.1 | 23.96 |127.00 | 3x | oc | RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_m_3x_dota.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_m_3x_dota.yml) |
+| PP-YOLOE-R-m | CRN-m | 79.71 | 86.8 | 55.1 | 23.96 |127.00 | 3x | oc | MS+RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_m_3x_dota_ms.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_m_3x_dota_ms.yml) |
+| PP-YOLOE-R-l | CRN-l | 78.14 | 69.7 | 48.3 | 53.29 |281.65 | 3x | oc | RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml) |
+| PP-YOLOE-R-l | CRN-l | 80.02 | 69.7 | 48.3 | 53.29 |281.65 | 3x | oc | MS+RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota_ms.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota_ms.yml) |
+| PP-YOLOE-R-x | CRN-x | 78.28 | 50.7 | 37.1 | 100.27|529.82 | 3x | oc | RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_x_3x_dota.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_x_3x_dota.yml) |
+| PP-YOLOE-R-x | CRN-x | 80.73 | 50.7 | 37.1 | 100.27|529.82 | 3x | oc | MS+RR | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_x_3x_dota_ms.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_x_3x_dota_ms.yml) |
+
+**Notes:**
+
+- If the **GPU number** or **mini-batch size** is changed, the **learning rate** should be adjusted according to the formula **lr<sub>new</sub> = lr<sub>default</sub> * (batch_size<sub>new</sub> * GPU_number<sub>new</sub>) / (batch_size<sub>default</sub> * GPU_number<sub>default</sub>)**.
+- Models in the model zoo are trained and tested at a single scale by default. `MS` in the data augmentation column means multi-scale training and multi-scale testing are used; `RR` means RandomRotate data augmentation is used for training.
+- CRN denotes the CSPRepResNet backbone proposed in PP-YOLOE.
+- The parameters and FLOPs of PP-YOLOE-R are calculated after re-parameterization, with an input image resolution of 1024x1024.
+- Speed is averaged over 2000 images from the DOTA test dataset. Refer to [Speed testing](#Speed-testing) to reproduce the results.
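As a worked example of the learning-rate rule above (the change of GPU count is hypothetical; the defaults — `base_lr: 0.008`, 4 GPUs, 2 images/GPU — come from `_base_/optimizer_3x.yml` and the model zoo table):

```bash
# lr_new = lr_default * (batch_new * gpus_new) / (batch_default * gpus_default)
python3 -c "
lr_default = 0.008               # base_lr in _base_/optimizer_3x.yml
batch_default, gpus_default = 2, 4
batch_new, gpus_new = 2, 8       # hypothetical: same per-GPU batch, 8 GPUs
print(lr_default * (batch_new * gpus_new) / (batch_default * gpus_default))
"
# prints 0.016
```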
+
+## Getting Start
+
+Refer to [Data-Preparation](../README_en.md#Data-Preparation) to prepare data.
+
+### Training
+
+Single GPU Training
+``` bash
+CUDA_VISIBLE_DEVICES=0 python tools/train.py -c configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml
+```
+
+Multi-GPU Training
+``` bash
+CUDA_VISIBLE_DEVICES=0,1,2,3 python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml
+```
+
+### Inference
+
+Run the following command to run inference on a single image. The result is saved in the `output` directory by default.
+
+``` bash
+python tools/infer.py -c configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota.pdparams --infer_img=demo/P0861__1.0__1154___824.png --draw_threshold=0.5
+```
+
+### Evaluation on the DOTA Dataset
+Following the [DOTA Task](https://captain-whu.github.io/DOTA/tasks.html), you need to submit a zip file containing the results for all test images. The detection results for each category are stored in a separate txt file, each line of which has the format `image_id score x1 y1 x2 y2 x3 y3 x4 y4`. Submit the generated zip file to Task1 of the [DOTA Evaluation](https://captain-whu.github.io/DOTA/evaluation.html) server. You can run the following command to get the predictions on the test dataset:
+``` bash
+python tools/infer.py -c configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota.pdparams --infer_dir=/path/to/test/images --output_dir=output_ppyoloe_r --visualize=False --save_results=True
+```
+Convert the predictions into the format required by the official evaluation server:
+``` bash
+python configs/rotate/tools/generate_result.py --pred_txt_dir=output_ppyoloe_r/ --output_dir=submit/ --data_type=dota10
+
+zip -r submit.zip submit
+```
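A quick sanity check of the result-line format before zipping; the sample line below uses made-up values:

```bash
# Each result line must have exactly 10 whitespace-separated fields:
# image_id score x1 y1 x2 y2 x3 y3 x4 y4
echo "P0006 0.95 553.0 326.0 620.0 326.0 620.0 361.0 553.0 361.0" \
  | awk 'NF != 10 { bad = 1 } END { exit bad }' && echo "format ok"
```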
+
+### Speed testing
+
+You can benchmark in either Paddle mode or Paddle-TRT mode. When benchmarking with Paddle-TRT, make sure that **the TensorRT version is higher than 8.2 and PaddlePaddle is the develop version**. To benchmark with Paddle-TRT, run the following commands:
+
+``` bash
+# export inference model
+python tools/export_model.py -c configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota.pdparams trt=True
+
+# speed testing
+CUDA_VISIBLE_DEVICES=0 python configs/rotate/tools/inference_benchmark.py --model_dir output_inference/ppyoloe_r_crn_l_3x_dota/ --image_dir /path/to/dota/test/dir --run_mode trt_fp16
+```
+To benchmark with Paddle only, run the following commands:
+``` bash
+# export inference model
+python tools/export_model.py -c configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota.pdparams
+
+# speed testing
+CUDA_VISIBLE_DEVICES=0 python configs/rotate/tools/inference_benchmark.py --model_dir output_inference/ppyoloe_r_crn_l_3x_dota/ --image_dir /path/to/dota/test/dir --run_mode paddle
+
+```
+
+## Deployment
+
+To deploy **using Paddle**, run the following commands:
+
+``` bash
+# export inference model
+python tools/export_model.py -c configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota.pdparams
+
+# inference single image
+python deploy/python/infer.py --image_file demo/P0072__1.0__0___0.png --model_dir=output_inference/ppyoloe_r_crn_l_3x_dota --run_mode=paddle --device=gpu
+```
+
+To deploy **using Paddle-TRT**, run the following commands:
+
+``` bash
+# export inference model
+python tools/export_model.py -c configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota.pdparams trt=True
+
+# inference single image
+python deploy/python/infer.py --image_file demo/P0072__1.0__0___0.png --model_dir=output_inference/ppyoloe_r_crn_l_3x_dota --run_mode=trt_fp16 --device=gpu
+```
+**Note:**
+- When using Paddle-TRT, make sure that **the TensorRT version is higher than 8.2 and PaddlePaddle is the develop version**.
+
+To deploy **using ONNX Runtime**, run the following commands:
+
+``` bash
+# export inference model
+python tools/export_model.py -c configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_r_crn_l_3x_dota.pdparams export_onnx=True
+
+# install paddle2onnx
+pip install paddle2onnx
+
+# convert to onnx model
+paddle2onnx --model_dir output_inference/ppyoloe_r_crn_l_3x_dota --model_filename model.pdmodel --params_filename model.pdiparams --opset_version 11 --save_file ppyoloe_r_crn_l_3x_dota.onnx
+
+# inference single image
+python configs/rotate/tools/onnx_infer.py --infer_cfg output_inference/ppyoloe_r_crn_l_3x_dota/infer_cfg.yml --onnx_file ppyoloe_r_crn_l_3x_dota.onnx --image_file demo/P0072__1.0__0___0.png
+```
+
+## Appendix
+
+Ablation experiments of PP-YOLOE-R
+
+| Model | mAP | Params(M) | FLOPs(G) |
+| :-: | :-: | :------: | :------: |
+| Baseline | 75.61 | 50.65 | 269.09 |
+| +Rotated Task Alignment Learning | 77.24 | 50.65 | 269.09 |
+| +Decoupled Angle Prediction Head | 77.78 | 52.20 | 272.72 |
+| +Angle Prediction with DFL | 78.01 | 53.29 | 281.65 |
+| +Learnable Gating Unit for RepVGG | 78.14 | 53.29 | 281.65 |
+
+## Citations
+
+```
+@article{wang2022pp,
+ title={PP-YOLOE-R: An Efficient Anchor-Free Rotated Object Detector},
+ author={Wang, Xinxin and Wang, Guanzhong and Dang, Qingqing and Liu, Yi and Hu, Xiaoguang and Yu, Dianhai},
+ journal={arXiv preprint arXiv:2211.02386},
+ year={2022}
+}
+
+@article{xu2022pp,
+ title={PP-YOLOE: An evolved version of YOLO},
+ author={Xu, Shangliang and Wang, Xinxin and Lv, Wenyu and Chang, Qinyao and Cui, Cheng and Deng, Kaipeng and Wang, Guanzhong and Dang, Qingqing and Wei, Shengyu and Du, Yuning and others},
+ journal={arXiv preprint arXiv:2203.16250},
+ year={2022}
+}
+
+@article{llerena2021gaussian,
+ title={Gaussian Bounding Boxes and Probabilistic Intersection-over-Union for Object Detection},
+ author={Llerena, Jeffri M and Zeni, Luis Felipe and Kristen, Lucas N and Jung, Claudio},
+ journal={arXiv preprint arXiv:2106.06072},
+ year={2021}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/_base_/optimizer_3x.yml b/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/_base_/optimizer_3x.yml
new file mode 100644
index 0000000000000000000000000000000000000000..1cdad4beb093deeef0b6918b88b81fc5964e95ce
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/_base_/optimizer_3x.yml
@@ -0,0 +1,19 @@
+epoch: 36
+
+LearningRate:
+ base_lr: 0.008
+ schedulers:
+ - !CosineDecay
+ max_epochs: 44
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 1000
+
+OptimizerBuilder:
+ clip_grad_by_norm: 35.
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/_base_/ppyoloe_r_crn.yml b/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/_base_/ppyoloe_r_crn.yml
new file mode 100644
index 0000000000000000000000000000000000000000..ab5bdb50aa731e3af664b68aa52b3c7293d715e8
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/_base_/ppyoloe_r_crn.yml
@@ -0,0 +1,49 @@
+architecture: YOLOv3
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOERHead
+ post_process: ~
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
+ use_alpha: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192]
+ stage_num: 1
+ block_num: 3
+ act: 'swish'
+ spp: true
+ use_alpha: True
+
+PPYOLOERHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_offset: 0.5
+ use_varifocal_loss: true
+ static_assigner_epoch: -1
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.05}
+ static_assigner:
+ name: FCOSRAssigner
+ factor: 12
+ threshold: 0.23
+ boundary: [[512, 10000], [256, 512], [-1, 256]]
+ assigner:
+ name: RotatedTaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 2000
+ keep_top_k: -1
+ score_threshold: 0.1
+ nms_threshold: 0.1
+ normalized: False
diff --git a/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/_base_/ppyoloe_r_reader.yml b/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/_base_/ppyoloe_r_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..c429c6ea07c4efaebe97aa62a7029e2951f68dea
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/_base_/ppyoloe_r_reader.yml
@@ -0,0 +1,46 @@
+worker_num: 4
+image_height: &image_height 1024
+image_width: &image_width 1024
+image_size: &image_size [*image_height, *image_width]
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - Poly2Array: {}
+ - RandomRFlip: {}
+ - RandomRRotate: {angle_mode: 'value', angle: [0, 90, 180, -90]}
+ - RandomRRotate: {angle_mode: 'value', angle: [30, 60], rotate_prob: 0.5}
+ - RResize: {target_size: *image_size, keep_ratio: True, interp: 2}
+ - Poly2RBox: {filter_threshold: 2, filter_mode: 'edge', rbox_type: 'oc'}
+ batch_transforms:
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - PadRGT: {}
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 2
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Poly2Array: {}
+ - RResize: {target_size: *image_size, keep_ratio: True, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 2
+ collate_batch: false
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *image_size, keep_ratio: True, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 2
diff --git a/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml b/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml
new file mode 100644
index 0000000000000000000000000000000000000000..b019d736c19b35423cb536eea0cf0e55036c2af7
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota.yml
@@ -0,0 +1,15 @@
+_BASE_: [
+ '../../datasets/dota.yml',
+ '../../runtime.yml',
+ '_base_/optimizer_3x.yml',
+ '_base_/ppyoloe_r_reader.yml',
+ '_base_/ppyoloe_r_crn.yml'
+]
+
+log_iter: 50
+snapshot_epoch: 1
+weights: output/ppyoloe_r_crn_l_3x_dota/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/CSPResNetb_l_pretrained.pdparams
+depth_mult: 1.0
+width_mult: 1.0
diff --git a/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota_ms.yml b/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota_ms.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a1411a3153dfae89d722d4895039b15370094c45
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_l_3x_dota_ms.yml
@@ -0,0 +1,15 @@
+_BASE_: [
+ '../../datasets/dota_ms.yml',
+ '../../runtime.yml',
+ '_base_/optimizer_3x.yml',
+ '_base_/ppyoloe_r_reader.yml',
+ '_base_/ppyoloe_r_crn.yml'
+]
+
+log_iter: 50
+snapshot_epoch: 1
+weights: output/ppyoloe_r_crn_l_3x_dota_ms/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/CSPResNetb_l_pretrained.pdparams
+depth_mult: 1.0
+width_mult: 1.0
diff --git a/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_m_3x_dota.yml b/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_m_3x_dota.yml
new file mode 100644
index 0000000000000000000000000000000000000000..755cf3f4e5bb93072779cf83344124c6d28cb925
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_m_3x_dota.yml
@@ -0,0 +1,15 @@
+_BASE_: [
+ '../../datasets/dota.yml',
+ '../../runtime.yml',
+ '_base_/optimizer_3x.yml',
+ '_base_/ppyoloe_r_reader.yml',
+ '_base_/ppyoloe_r_crn.yml'
+]
+
+log_iter: 50
+snapshot_epoch: 1
+weights: output/ppyoloe_r_crn_m_3x_dota/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/CSPResNetb_m_pretrained.pdparams
+depth_mult: 0.67
+width_mult: 0.75
diff --git a/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_m_3x_dota_ms.yml b/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_m_3x_dota_ms.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d885b459ff61f5ab7b3dcdcf55b80f1d6a3d6a4f
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_m_3x_dota_ms.yml
@@ -0,0 +1,15 @@
+_BASE_: [
+ '../../datasets/dota_ms.yml',
+ '../../runtime.yml',
+ '_base_/optimizer_3x.yml',
+ '_base_/ppyoloe_r_reader.yml',
+ '_base_/ppyoloe_r_crn.yml'
+]
+
+log_iter: 50
+snapshot_epoch: 1
+weights: output/ppyoloe_r_crn_m_3x_dota_ms/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/CSPResNetb_m_pretrained.pdparams
+depth_mult: 0.67
+width_mult: 0.75
diff --git a/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_s_3x_dota.yml b/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_s_3x_dota.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a227f18ac2ddb93e7af79d2452ea7e043cfe3eb0
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_s_3x_dota.yml
@@ -0,0 +1,15 @@
+_BASE_: [
+ '../../datasets/dota.yml',
+ '../../runtime.yml',
+ '_base_/optimizer_3x.yml',
+ '_base_/ppyoloe_r_reader.yml',
+ '_base_/ppyoloe_r_crn.yml'
+]
+
+log_iter: 50
+snapshot_epoch: 1
+weights: output/ppyoloe_r_crn_s_3x_dota/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/CSPResNetb_s_pretrained.pdparams
+depth_mult: 0.33
+width_mult: 0.50
diff --git a/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_s_3x_dota_ms.yml b/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_s_3x_dota_ms.yml
new file mode 100644
index 0000000000000000000000000000000000000000..921a9d571b730d3f57865e51baca6d37080d42a1
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_s_3x_dota_ms.yml
@@ -0,0 +1,15 @@
+_BASE_: [
+ '../../datasets/dota_ms.yml',
+ '../../runtime.yml',
+ '_base_/optimizer_3x.yml',
+ '_base_/ppyoloe_r_reader.yml',
+ '_base_/ppyoloe_r_crn.yml'
+]
+
+log_iter: 50
+snapshot_epoch: 1
+weights: output/ppyoloe_r_crn_s_3x_dota_ms/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/CSPResNetb_s_pretrained.pdparams
+depth_mult: 0.33
+width_mult: 0.50
diff --git a/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_x_3x_dota.yml b/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_x_3x_dota.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d81b5ef9861fcef9e044c792894f671886037182
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_x_3x_dota.yml
@@ -0,0 +1,15 @@
+_BASE_: [
+ '../../datasets/dota.yml',
+ '../../runtime.yml',
+ '_base_/optimizer_3x.yml',
+ '_base_/ppyoloe_r_reader.yml',
+ '_base_/ppyoloe_r_crn.yml'
+]
+
+log_iter: 50
+snapshot_epoch: 1
+weights: output/ppyoloe_r_crn_x_3x_dota/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/CSPResNetb_x_pretrained.pdparams
+depth_mult: 1.33
+width_mult: 1.25
diff --git a/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_x_3x_dota_ms.yml b/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_x_3x_dota_ms.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d99cdb0787109cdd88054d15967ddf4bfbb2b52f
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rotate/ppyoloe_r/ppyoloe_r_crn_x_3x_dota_ms.yml
@@ -0,0 +1,15 @@
+_BASE_: [
+ '../../datasets/dota_ms.yml',
+ '../../runtime.yml',
+ '_base_/optimizer_3x.yml',
+ '_base_/ppyoloe_r_reader.yml',
+ '_base_/ppyoloe_r_crn.yml'
+]
+
+log_iter: 50
+snapshot_epoch: 1
+weights: output/ppyoloe_r_crn_x_3x_dota_ms/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/CSPResNetb_x_pretrained.pdparams
+depth_mult: 1.33
+width_mult: 1.25
diff --git a/PaddleDetection-release-2.6/configs/rotate/s2anet/README.md b/PaddleDetection-release-2.6/configs/rotate/s2anet/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..270f7cb6884e815f14e06c8186a47ed200941bb3
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rotate/s2anet/README.md
@@ -0,0 +1,104 @@
+简体中文 | [English](README_en.md)
+
+# S2ANet
+
+## Contents
+- [Introduction](#Introduction)
+- [Model Zoo](#Model-Zoo)
+- [Getting Started](#Getting-Started)
+- [Deployment](#Deployment)
+- [Citations](#Citations)
+
+## Introduction
+
+[S2ANet](https://arxiv.org/pdf/2008.09397.pdf) is a model for oriented (rotated) bounding box detection.
+
+## Model Zoo
+
+| Model | Conv Type | mAP | Lr Scheduler | Angle | Aug | GPU Number | images/GPU | download | config |
+|:---:|:------:|:----:|:---------:|:-----:|:--------:|:-----:|:------------:|:-------:|:------:|
+| S2ANet | Conv | 71.45 | 2x | le135 | - | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/s2anet_conv_2x_dota.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/s2anet/s2anet_conv_2x_dota.yml) |
+| S2ANet | AlignConv | 73.84 | 2x | le135 | - | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/s2anet_alignconv_2x_dota.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/s2anet/s2anet_alignconv_2x_dota.yml) |
+
+**Notes:**
+
+- If the **GPU number** or **mini-batch size** is changed, the **learning rate** should be adjusted according to the formula **lr<sub>new</sub> = lr<sub>default</sub> * (batch_size<sub>new</sub> * GPU_number<sub>new</sub>) / (batch_size<sub>default</sub> * GPU_number<sub>default</sub>)**.
+- Models in the model zoo are trained and tested at a single scale by default. `MS` in the data augmentation column means multi-scale training and multi-scale testing are used; `RR` means RandomRotate data augmentation is used for training.
+- `multiclass_nms` is used here, which differs slightly from the NMS used by the original authors.
+
+
+## Getting Started
+
+Refer to [Data Preparation](../README.md#数据准备) to prepare the data.
+
+### 1. Training
+
+Single-GPU training
+```bash
+export CUDA_VISIBLE_DEVICES=0
+python tools/train.py -c configs/rotate/s2anet/s2anet_1x_spine.yml
+```
+
+Multi-GPU training
+```bash
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/rotate/s2anet/s2anet_1x_spine.yml
+```
+
+You can enable evaluation during training with `--eval`.
+
+### 2. Evaluation
+```bash
+python tools/eval.py -c configs/rotate/s2anet/s2anet_1x_spine.yml -o weights=output/s2anet_1x_spine/model_final.pdparams
+
+# evaluate with the provided trained model
+python tools/eval.py -c configs/rotate/s2anet/s2anet_1x_spine.yml -o weights=https://paddledet.bj.bcebos.com/models/s2anet_1x_spine.pdparams
+```
+
+### 3. Inference
+Run the following command to save the image prediction results to the `output` folder.
+```bash
+python tools/infer.py -c configs/rotate/s2anet/s2anet_1x_spine.yml -o weights=output/s2anet_1x_spine/model_final.pdparams --infer_img=demo/39006.jpg --draw_threshold=0.3
+```
+Run inference with the provided trained model:
+```bash
+python tools/infer.py -c configs/rotate/s2anet/s2anet_1x_spine.yml -o weights=https://paddledet.bj.bcebos.com/models/s2anet_1x_spine.pdparams --infer_img=demo/39006.jpg --draw_threshold=0.3
+```
+
+### 4. Evaluation on the DOTA Dataset
+Run the following command to save the prediction results of each image as a txt file of the same name under the `output` folder.
+```bash
+python tools/infer.py -c configs/rotate/s2anet/s2anet_alignconv_2x_dota.yml -o weights=https://paddledet.bj.bcebos.com/models/s2anet_alignconv_2x_dota.pdparams --infer_dir=/path/to/test/images --output_dir=output --visualize=False --save_results=True
+```
+Following the [DOTA Task](https://captain-whu.github.io/DOTA/tasks.html), evaluating on DOTA requires generating a zip file that contains all detection results. The results for each category are stored in a separate txt file, and each line has the format `image_name score x1 y1 x2 y2 x3 y3 x4 y4`. Submit the generated zip file to Task1 of the [DOTA Evaluation](https://captain-whu.github.io/DOTA/evaluation.html) server. You can run the following command to generate the submission files:
+```bash
+python configs/rotate/tools/generate_result.py --pred_txt_dir=output/ --output_dir=submit/ --data_type=dota10
+
+zip -r submit.zip submit
+```
+
+## Deployment
+
+The `multiclass_nms` operator in Paddle accepts quadrilateral inputs, so deployment does not need to depend on a rotated-box IoU operator.
+
+For a deployment tutorial, please refer to [Deployment](../../../deploy/README.md).
+
+
+## Citations
+```
+@article{han2021align,
+ author={J. {Han} and J. {Ding} and J. {Li} and G. -S. {Xia}},
+ journal={IEEE Transactions on Geoscience and Remote Sensing},
+ title={Align Deep Features for Oriented Object Detection},
+ year={2021},
+ pages={1-11},
+ doi={10.1109/TGRS.2021.3062048}}
+
+@inproceedings{xia2018dota,
+ title={DOTA: A large-scale dataset for object detection in aerial images},
+ author={Xia, Gui-Song and Bai, Xiang and Ding, Jian and Zhu, Zhen and Belongie, Serge and Luo, Jiebo and Datcu, Mihai and Pelillo, Marcello and Zhang, Liangpei},
+ booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
+ pages={3974--3983},
+ year={2018}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/rotate/s2anet/README_en.md b/PaddleDetection-release-2.6/configs/rotate/s2anet/README_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..aecda9a3f2c46522f186152d146497d9fb41833e
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rotate/s2anet/README_en.md
@@ -0,0 +1,102 @@
+English | [简体中文](README.md)
+
+# S2ANet
+
+## Content
+- [Introduction](#Introduction)
+- [Model Zoo](#Model-Zoo)
+- [Getting Start](#Getting-Start)
+- [Deployment](#Deployment)
+- [Citations](#Citations)
+
+## Introduction
+
+[S2ANet](https://arxiv.org/pdf/2008.09397.pdf) is a model for oriented (rotated) object detection.
+
+## Model Zoo
+| Model | Conv Type | mAP | Lr Scheduler | Angle | Aug | GPU Number | images/GPU | download | config |
+|:---:|:------:|:----:|:---------:|:-----:|:--------:|:-----:|:------------:|:-------:|:------:|
+| S2ANet | Conv | 71.45 | 2x | le135 | - | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/s2anet_conv_2x_dota.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/s2anet/s2anet_conv_2x_dota.yml) |
+| S2ANet | AlignConv | 73.84 | 2x | le135 | - | 4 | 2 | [model](https://paddledet.bj.bcebos.com/models/s2anet_alignconv_2x_dota.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/rotate/s2anet/s2anet_alignconv_2x_dota.yml) |
+
+**Notes:**
+- If the **GPU number** or **mini-batch size** is changed, the **learning rate** should be adjusted according to the formula **lr<sub>new</sub> = lr<sub>default</sub> * (batch_size<sub>new</sub> * GPU_number<sub>new</sub>) / (batch_size<sub>default</sub> * GPU_number<sub>default</sub>)**.
+- Models in the model zoo are trained and tested at a single scale by default. `MS` in the data augmentation column means multi-scale training and multi-scale testing are used; `RR` means RandomRotate data augmentation is used for training.
+- `multiclass_nms` is used here, which differs slightly from the NMS used by the original authors.
+
+## Getting Start
+
+Refer to [Data-Preparation](../README_en.md#Data-Preparation) to prepare data.
+
+### 1. Training
+
+Single GPU Training
+```bash
+export CUDA_VISIBLE_DEVICES=0
+python tools/train.py -c configs/rotate/s2anet/s2anet_1x_spine.yml
+```
+
+Multiple GPUs Training
+```bash
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/rotate/s2anet/s2anet_1x_spine.yml
+```
+
+You can use `--eval` to enable evaluation during training.
+
+### 2. Evaluation
+```bash
+python tools/eval.py -c configs/rotate/s2anet/s2anet_1x_spine.yml -o weights=output/s2anet_1x_spine/model_final.pdparams
+
+# Use a trained model to evaluate
+python tools/eval.py -c configs/rotate/s2anet/s2anet_1x_spine.yml -o weights=https://paddledet.bj.bcebos.com/models/s2anet_1x_spine.pdparams
+```
+
+### 3. Prediction
+Executing the following command will save the image prediction results to the `output` folder.
+```bash
+python tools/infer.py -c configs/rotate/s2anet/s2anet_1x_spine.yml -o weights=output/s2anet_1x_spine/model_final.pdparams --infer_img=demo/39006.jpg --draw_threshold=0.3
+```
+To run prediction with a released pretrained model:
+```bash
+python tools/infer.py -c configs/rotate/s2anet/s2anet_1x_spine.yml -o weights=https://paddledet.bj.bcebos.com/models/s2anet_1x_spine.pdparams --infer_img=demo/39006.jpg --draw_threshold=0.3
+```
+
+### 4. DOTA Data evaluation
+Executing the following command saves the prediction results of each image to the `output` folder as a txt file with the same name as the image.
+```bash
+python tools/infer.py -c configs/rotate/s2anet/s2anet_alignconv_2x_dota.yml -o weights=https://paddledet.bj.bcebos.com/models/s2anet_alignconv_2x_dota.pdparams --infer_dir=/path/to/test/images --output_dir=output --visualize=False --save_results=True
+```
+Referring to the [DOTA Task](https://captain-whu.github.io/DOTA/tasks.html), you need to submit a zip file containing the results for all test images for evaluation. The detection results of each category are stored in a txt file, each line of which has the format
+`image_id score x1 y1 x2 y2 x3 y3 x4 y4`. To evaluate, submit the generated zip file to Task1 of the [DOTA Evaluation](https://captain-whu.github.io/DOTA/evaluation.html) server. You can generate the submission file with the following commands:
+```bash
+python configs/rotate/tools/generate_result.py --pred_txt_dir=output/ --output_dir=submit/ --data_type=dota10
+
+zip -r submit.zip submit
+```
+
+## Deployment
+
+The `multiclass_nms` operator in Paddle supports quadrilateral inputs, so deployment does not need to rely on a rotated-box IoU operator.
+
+For deployment, please refer to the tutorial [Predict deployment](../../../deploy/README_en.md).
+
+
+## Citations
+```
+@article{han2021align,
+ author={J. {Han} and J. {Ding} and J. {Li} and G. -S. {Xia}},
+ journal={IEEE Transactions on Geoscience and Remote Sensing},
+ title={Align Deep Features for Oriented Object Detection},
+ year={2021},
+ pages={1-11},
+ doi={10.1109/TGRS.2021.3062048}}
+
+@inproceedings{xia2018dota,
+ title={DOTA: A large-scale dataset for object detection in aerial images},
+ author={Xia, Gui-Song and Bai, Xiang and Ding, Jian and Zhu, Zhen and Belongie, Serge and Luo, Jiebo and Datcu, Mihai and Pelillo, Marcello and Zhang, Liangpei},
+ booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
+ pages={3974--3983},
+ year={2018}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/rotate/s2anet/_base_/s2anet.yml b/PaddleDetection-release-2.6/configs/rotate/s2anet/_base_/s2anet.yml
new file mode 100644
index 0000000000000000000000000000000000000000..fc8b2e25836b616e102676c39735b4debffcf435
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rotate/s2anet/_base_/s2anet.yml
@@ -0,0 +1,52 @@
+architecture: S2ANet
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_v2_pretrained.pdparams
+weights: output/s2anet_r50_fpn_1x_dota/model_final.pdparams
+
+
+# Model Architecture
+S2ANet:
+ backbone: ResNet
+ neck: FPN
+ head: S2ANetHead
+
+ResNet:
+ depth: 50
+ variant: d
+ norm_type: bn
+ return_idx: [1,2,3]
+ num_stages: 4
+
+FPN:
+ in_channels: [256, 512, 1024]
+ out_channel: 256
+ spatial_scales: [0.25, 0.125, 0.0625]
+ has_extra_convs: True
+ extra_stage: 2
+ relu_before_extra_convs: False
+
+S2ANetHead:
+ anchor_strides: [8, 16, 32, 64, 128]
+ anchor_scales: [4]
+ anchor_ratios: [1.0]
+ anchor_assign: RBoxAssigner
+ stacked_convs: 2
+ feat_in: 256
+ feat_out: 256
+  align_conv_type: 'AlignConv' # choose from: AlignConv, Conv
+ align_conv_size: 3
+ use_sigmoid_cls: True
+ reg_loss_weight: [1.0, 1.0, 1.0, 1.0, 1.1]
+ cls_loss_weight: [1.1, 1.05]
+ nms_pre: 2000
+ nms:
+ name: MultiClassNMS
+ keep_top_k: -1
+ score_threshold: 0.05
+ nms_threshold: 0.1
+ normalized: False
+
+RBoxAssigner:
+ pos_iou_thr: 0.5
+ neg_iou_thr: 0.4
+ min_iou_thr: 0.0
+ ignore_iof_thr: -2
diff --git a/PaddleDetection-release-2.6/configs/rotate/s2anet/_base_/s2anet_optimizer_1x.yml b/PaddleDetection-release-2.6/configs/rotate/s2anet/_base_/s2anet_optimizer_1x.yml
new file mode 100644
index 0000000000000000000000000000000000000000..65f794dc34c55f5d597b94eb1b305b28a28707f7
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rotate/s2anet/_base_/s2anet_optimizer_1x.yml
@@ -0,0 +1,20 @@
+epoch: 12
+
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [7, 10]
+ - !LinearWarmup
+ start_factor: 0.3333333333333333
+ steps: 500
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
+ clip_grad_by_norm: 35
diff --git a/PaddleDetection-release-2.6/configs/rotate/s2anet/_base_/s2anet_optimizer_2x.yml b/PaddleDetection-release-2.6/configs/rotate/s2anet/_base_/s2anet_optimizer_2x.yml
new file mode 100644
index 0000000000000000000000000000000000000000..54e73ce64634ce9a479d07bbde1c3de385d2a7d5
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rotate/s2anet/_base_/s2anet_optimizer_2x.yml
@@ -0,0 +1,20 @@
+epoch: 24
+
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [14, 20]
+ - !LinearWarmup
+ start_factor: 0.3333333333333333
+ steps: 1000
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
+ clip_grad_by_norm: 35
diff --git a/PaddleDetection-release-2.6/configs/rotate/s2anet/_base_/s2anet_reader.yml b/PaddleDetection-release-2.6/configs/rotate/s2anet/_base_/s2anet_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..7d0fc15e002f8fe0772a7feea241418f9a2ada42
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rotate/s2anet/_base_/s2anet_reader.yml
@@ -0,0 +1,44 @@
+worker_num: 4
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - Poly2Array: {}
+ - RandomRFlip: {}
+ - RResize: {target_size: [1024, 1024], keep_ratio: True, interp: 2}
+ - Poly2RBox: {rbox_type: 'le135'}
+ batch_transforms:
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - PadRGT: {}
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 2
+ shuffle: true
+ drop_last: true
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Poly2Array: {}
+ - RResize: {target_size: [1024, 1024], keep_ratio: True, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 2
+ shuffle: false
+ drop_last: false
+ collate_batch: false
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [1024, 1024], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
diff --git a/PaddleDetection-release-2.6/configs/rotate/s2anet/s2anet_1x_spine.yml b/PaddleDetection-release-2.6/configs/rotate/s2anet/s2anet_1x_spine.yml
new file mode 100644
index 0000000000000000000000000000000000000000..550586f45ce293b2edd082d6fe700b97c53c35f3
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rotate/s2anet/s2anet_1x_spine.yml
@@ -0,0 +1,25 @@
+_BASE_: [
+ '../../datasets/spine_coco.yml',
+ '../../runtime.yml',
+ '_base_/s2anet_optimizer_1x.yml',
+ '_base_/s2anet.yml',
+ '_base_/s2anet_reader.yml',
+]
+
+weights: output/s2anet_1x_spine/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/s2anet_alignconv_2x_dota.pdparams
+
+# for 4 card
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [7, 10]
+ - !LinearWarmup
+ start_factor: 0.3333333333333333
+ epochs: 5
+
+S2ANetHead:
+ reg_loss_weight: [1.0, 1.0, 1.0, 1.0, 1.05]
+ cls_loss_weight: [1.05, 1.0]
diff --git a/PaddleDetection-release-2.6/configs/rotate/s2anet/s2anet_alignconv_2x_dota.yml b/PaddleDetection-release-2.6/configs/rotate/s2anet/s2anet_alignconv_2x_dota.yml
new file mode 100644
index 0000000000000000000000000000000000000000..1b3e9eb4636dc56e2cb97142e2a9b3f4c16bb84d
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rotate/s2anet/s2anet_alignconv_2x_dota.yml
@@ -0,0 +1,10 @@
+_BASE_: [
+ '../../datasets/dota.yml',
+ '../../runtime.yml',
+ '_base_/s2anet_optimizer_2x.yml',
+ '_base_/s2anet.yml',
+ '_base_/s2anet_reader.yml',
+]
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_v2_pretrained.pdparams
+
+weights: output/s2anet_alignconv_2x_dota/model_final
diff --git a/PaddleDetection-release-2.6/configs/rotate/s2anet/s2anet_conv_2x_dota.yml b/PaddleDetection-release-2.6/configs/rotate/s2anet/s2anet_conv_2x_dota.yml
new file mode 100644
index 0000000000000000000000000000000000000000..34d136d865b5c4692f69356a6a22835248efe970
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rotate/s2anet/s2anet_conv_2x_dota.yml
@@ -0,0 +1,19 @@
+_BASE_: [
+ '../../datasets/dota.yml',
+ '../../runtime.yml',
+ '_base_/s2anet_optimizer_2x.yml',
+ '_base_/s2anet.yml',
+ '_base_/s2anet_reader.yml',
+]
+weights: output/s2anet_conv_1x_dota/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_cos_pretrained.pdparams
+
+ResNet:
+ depth: 50
+ variant: b
+ norm_type: bn
+ return_idx: [1,2,3]
+ num_stages: 4
+
+S2ANetHead:
+ align_conv_type: 'Conv'
diff --git a/PaddleDetection-release-2.6/configs/rotate/tools/convert.py b/PaddleDetection-release-2.6/configs/rotate/tools/convert.py
new file mode 100644
index 0000000000000000000000000000000000000000..cf5bdd01f9ed024f64df10658ff3e5b91efd82ad
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rotate/tools/convert.py
@@ -0,0 +1,163 @@
+# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+# Reference: https://github.com/CAPTAIN-WHU/DOTA_devkit
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import os
+import json
+import cv2
+from tqdm import tqdm
+from multiprocessing import Pool
+
+
+def load_dota_info(image_dir, anno_dir, file_name, ext=None):
+ base_name, extension = os.path.splitext(file_name)
+ if ext and (extension != ext and extension not in ext):
+ return None
+ info = {'image_file': os.path.join(image_dir, file_name), 'annotation': []}
+ anno_file = os.path.join(anno_dir, base_name + '.txt')
+ if not os.path.exists(anno_file):
+ return info
+ with open(anno_file, 'r') as f:
+ for line in f:
+ items = line.strip().split()
+ if (len(items) < 9):
+ continue
+
+ anno = {
+ 'poly': list(map(float, items[:8])),
+ 'name': items[8],
+ 'difficult': '0' if len(items) == 9 else items[9],
+ }
+ info['annotation'].append(anno)
+
+ return info
+
+
+def load_dota_infos(root_dir, num_process=8, ext=None):
+ image_dir = os.path.join(root_dir, 'images')
+ anno_dir = os.path.join(root_dir, 'labelTxt')
+ data_infos = []
+ if num_process > 1:
+ pool = Pool(num_process)
+ results = []
+ for file_name in os.listdir(image_dir):
+ results.append(
+ pool.apply_async(load_dota_info, (image_dir, anno_dir,
+ file_name, ext)))
+
+ pool.close()
+ pool.join()
+
+ for result in results:
+ info = result.get()
+ if info:
+ data_infos.append(info)
+
+ else:
+ for file_name in os.listdir(image_dir):
+ info = load_dota_info(image_dir, anno_dir, file_name, ext)
+ if info:
+ data_infos.append(info)
+
+ return data_infos
+
+
+def process_single_sample(info, image_id, class_names):
+ image_file = info['image_file']
+ single_image = dict()
+ single_image['file_name'] = os.path.split(image_file)[-1]
+ single_image['id'] = image_id
+ image = cv2.imread(image_file)
+ height, width, _ = image.shape
+ single_image['width'] = width
+ single_image['height'] = height
+
+ # process annotation field
+ single_objs = []
+ objects = info['annotation']
+ for obj in objects:
+ poly, name, difficult = obj['poly'], obj['name'], obj['difficult']
+ if difficult == '2':
+ continue
+
+ single_obj = dict()
+ single_obj['category_id'] = class_names.index(name) + 1
+ single_obj['segmentation'] = [poly]
+ single_obj['iscrowd'] = 0
+ xmin, ymin, xmax, ymax = min(poly[0::2]), min(poly[1::2]), max(poly[
+ 0::2]), max(poly[1::2])
+ width, height = xmax - xmin, ymax - ymin
+ single_obj['bbox'] = [xmin, ymin, width, height]
+ single_obj['area'] = height * width
+ single_obj['image_id'] = image_id
+ single_objs.append(single_obj)
+
+ return (single_image, single_objs)
+
+
+def data_to_coco(infos, output_path, class_names, num_process):
+ data_dict = dict()
+ data_dict['categories'] = []
+
+ for i, name in enumerate(class_names):
+ data_dict['categories'].append({
+ 'id': i + 1,
+ 'name': name,
+ 'supercategory': name
+ })
+
+ pbar = tqdm(total=len(infos), desc='data to coco')
+ images, annotations = [], []
+ if num_process > 1:
+ pool = Pool(num_process)
+ results = []
+ for i, info in enumerate(infos):
+ image_id = i + 1
+ results.append(
+ pool.apply_async(
+ process_single_sample, (info, image_id, class_names),
+ callback=lambda x: pbar.update()))
+
+ pool.close()
+ pool.join()
+
+ for result in results:
+ single_image, single_anno = result.get()
+ images.append(single_image)
+ annotations += single_anno
+
+ else:
+ for i, info in enumerate(infos):
+ image_id = i + 1
+ single_image, single_anno = process_single_sample(info, image_id,
+ class_names)
+ images.append(single_image)
+ annotations += single_anno
+ pbar.update()
+
+ pbar.close()
+
+ for i, anno in enumerate(annotations):
+ anno['id'] = i + 1
+
+ data_dict['images'] = images
+ data_dict['annotations'] = annotations
+
+ with open(output_path, 'w') as f:
+ json.dump(data_dict, f)
diff --git a/PaddleDetection-release-2.6/configs/rotate/tools/generate_result.py b/PaddleDetection-release-2.6/configs/rotate/tools/generate_result.py
new file mode 100644
index 0000000000000000000000000000000000000000..f8343ee5b368c796ef31b92977653843515bcf2a
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rotate/tools/generate_result.py
@@ -0,0 +1,266 @@
+# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import os
+import re
+import glob
+
+import numpy as np
+from multiprocessing import Pool
+from functools import partial
+from shapely.geometry import Polygon
+import argparse
+
+wordname_15 = [
+ 'plane', 'baseball-diamond', 'bridge', 'ground-track-field',
+ 'small-vehicle', 'large-vehicle', 'ship', 'tennis-court',
+ 'basketball-court', 'storage-tank', 'soccer-ball-field', 'roundabout',
+ 'harbor', 'swimming-pool', 'helicopter'
+]
+
+wordname_16 = wordname_15 + ['container-crane']
+
+wordname_18 = wordname_16 + ['airport', 'helipad']
+
+DATA_CLASSES = {
+ 'dota10': wordname_15,
+ 'dota15': wordname_16,
+ 'dota20': wordname_18
+}
+
+
+def rbox_iou(g, p):
+ """
+ iou of rbox
+ """
+ g = np.array(g)
+ p = np.array(p)
+ g = Polygon(g[:8].reshape((4, 2)))
+ p = Polygon(p[:8].reshape((4, 2)))
+ g = g.buffer(0)
+ p = p.buffer(0)
+ if not g.is_valid or not p.is_valid:
+ return 0
+    inter = g.intersection(p).area
+ union = g.area + p.area - inter
+ if union == 0:
+ return 0
+ else:
+ return inter / union
+
+
+def py_cpu_nms_poly_fast(dets, thresh):
+ """
+ Args:
+ dets: pred results
+ thresh: nms threshold
+
+ Returns: index of keep
+ """
+ obbs = dets[:, 0:-1]
+ x1 = np.min(obbs[:, 0::2], axis=1)
+ y1 = np.min(obbs[:, 1::2], axis=1)
+ x2 = np.max(obbs[:, 0::2], axis=1)
+ y2 = np.max(obbs[:, 1::2], axis=1)
+ scores = dets[:, 8]
+ areas = (x2 - x1 + 1) * (y2 - y1 + 1)
+
+ polys = []
+ for i in range(len(dets)):
+ tm_polygon = [
+ dets[i][0], dets[i][1], dets[i][2], dets[i][3], dets[i][4],
+ dets[i][5], dets[i][6], dets[i][7]
+ ]
+ polys.append(tm_polygon)
+ polys = np.array(polys)
+ order = scores.argsort()[::-1]
+
+ keep = []
+ while order.size > 0:
+ ovr = []
+ i = order[0]
+ keep.append(i)
+
+ xx1 = np.maximum(x1[i], x1[order[1:]])
+ yy1 = np.maximum(y1[i], y1[order[1:]])
+ xx2 = np.minimum(x2[i], x2[order[1:]])
+ yy2 = np.minimum(y2[i], y2[order[1:]])
+ w = np.maximum(0.0, xx2 - xx1)
+ h = np.maximum(0.0, yy2 - yy1)
+ hbb_inter = w * h
+ hbb_ovr = hbb_inter / (areas[i] + areas[order[1:]] - hbb_inter)
+ h_inds = np.where(hbb_ovr > 0)[0]
+ tmp_order = order[h_inds + 1]
+ for j in range(tmp_order.size):
+ iou = rbox_iou(polys[i], polys[tmp_order[j]])
+ hbb_ovr[h_inds[j]] = iou
+
+ inds = np.where(hbb_ovr <= thresh)[0]
+
+ order = order[inds + 1]
+ return keep
+
+
+def poly2origpoly(poly, x, y, rate):
+ origpoly = []
+ for i in range(int(len(poly) / 2)):
+ tmp_x = float(poly[i * 2] + x) / float(rate)
+ tmp_y = float(poly[i * 2 + 1] + y) / float(rate)
+ origpoly.append(tmp_x)
+ origpoly.append(tmp_y)
+ return origpoly
+
+
+def nmsbynamedict(nameboxdict, nms, thresh):
+    """
+    Args:
+        nameboxdict: dict mapping original image name to its detection boxes
+        nms: nms function
+        thresh: nms threshold
+
+    Returns: nms result as dict
+    """
+ nameboxnmsdict = {x: [] for x in nameboxdict}
+ for imgname in nameboxdict:
+ keep = nms(np.array(nameboxdict[imgname]), thresh)
+ outdets = []
+ for index in keep:
+ outdets.append(nameboxdict[imgname][index])
+ nameboxnmsdict[imgname] = outdets
+ return nameboxnmsdict
+
+
+def merge_single(output_dir, nms, nms_thresh, pred_class_lst):
+    """
+    Args:
+        output_dir: directory to write the per-class result file
+        nms: nms function
+        nms_thresh: nms threshold
+        pred_class_lst: tuple of (class_name, list of pred bbox lines)
+    """
+ class_name, pred_bbox_list = pred_class_lst
+ nameboxdict = {}
+ for line in pred_bbox_list:
+ splitline = line.split(' ')
+ subname = splitline[0]
+ splitname = subname.split('__')
+ oriname = splitname[0]
+ pattern1 = re.compile(r'__\d+___\d+')
+ x_y = re.findall(pattern1, subname)
+ x_y_2 = re.findall(r'\d+', x_y[0])
+ x, y = int(x_y_2[0]), int(x_y_2[1])
+
+ pattern2 = re.compile(r'__([\d+\.]+)__\d+___')
+
+ rate = re.findall(pattern2, subname)[0]
+
+ confidence = splitline[1]
+ poly = list(map(float, splitline[2:]))
+ origpoly = poly2origpoly(poly, x, y, rate)
+ det = origpoly
+ det.append(confidence)
+ det = list(map(float, det))
+ if (oriname not in nameboxdict):
+ nameboxdict[oriname] = []
+ nameboxdict[oriname].append(det)
+ nameboxnmsdict = nmsbynamedict(nameboxdict, nms, nms_thresh)
+
+ # write result
+ dstname = os.path.join(output_dir, class_name + '.txt')
+ with open(dstname, 'w') as f_out:
+ for imgname in nameboxnmsdict:
+ for det in nameboxnmsdict[imgname]:
+ confidence = det[-1]
+ bbox = det[0:-1]
+ outline = imgname + ' ' + str(confidence) + ' ' + ' '.join(
+ map(str, bbox))
+ f_out.write(outline + '\n')
+
+
+def generate_result(pred_txt_dir,
+ output_dir='output',
+ class_names=wordname_15,
+ nms_thresh=0.1):
+ """
+ pred_txt_dir: dir of pred txt
+ output_dir: dir of output
+ class_names: class names of data
+ """
+ pred_txt_list = glob.glob("{}/*.txt".format(pred_txt_dir))
+
+ # step1: summary pred bbox
+ pred_classes = {}
+ for class_name in class_names:
+ pred_classes[class_name] = []
+
+ for current_txt in pred_txt_list:
+ img_id = os.path.split(current_txt)[1]
+ img_id = img_id.split('.txt')[0]
+ with open(current_txt) as f:
+ res = f.readlines()
+ for item in res:
+ item = item.split(' ')
+ pred_class = item[0]
+ item[0] = img_id
+ pred_bbox = ' '.join(item)
+ pred_classes[pred_class].append(pred_bbox)
+
+ pred_classes_lst = []
+ for class_name in pred_classes.keys():
+ print('class_name: {}, count: {}'.format(class_name,
+ len(pred_classes[class_name])))
+ pred_classes_lst.append((class_name, pred_classes[class_name]))
+
+ # step2: merge
+ pool = Pool(len(class_names))
+ nms = py_cpu_nms_poly_fast
+ mergesingle_fn = partial(merge_single, output_dir, nms, nms_thresh)
+ pool.map(mergesingle_fn, pred_classes_lst)
+
+
+def parse_args():
+ parser = argparse.ArgumentParser(description='generate test results')
+ parser.add_argument('--pred_txt_dir', type=str, help='path of pred txt dir')
+ parser.add_argument(
+ '--output_dir', type=str, default='output', help='path of output dir')
+ parser.add_argument(
+ '--data_type', type=str, default='dota10', help='data type')
+ parser.add_argument(
+ '--nms_thresh',
+ type=float,
+ default=0.1,
+ help='nms threshold while merging results')
+
+ return parser.parse_args()
+
+
+if __name__ == '__main__':
+ args = parse_args()
+
+ output_dir = args.output_dir
+ if not os.path.exists(output_dir):
+ os.makedirs(output_dir)
+
+ class_names = DATA_CLASSES[args.data_type]
+
+    generate_result(args.pred_txt_dir, output_dir, class_names, args.nms_thresh)
+ print('done!')
diff --git a/PaddleDetection-release-2.6/configs/rotate/tools/inference_benchmark.py b/PaddleDetection-release-2.6/configs/rotate/tools/inference_benchmark.py
new file mode 100644
index 0000000000000000000000000000000000000000..7421e7810c0b93ed4d31f1f22bf175be91a7819b
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rotate/tools/inference_benchmark.py
@@ -0,0 +1,378 @@
+# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import os
+import sys
+import six
+import glob
+import time
+import yaml
+import argparse
+import cv2
+import numpy as np
+
+import paddle
+import paddle.version as paddle_version
+from paddle.inference import Config, create_predictor, PrecisionType, get_trt_runtime_version
+
+TUNED_TRT_DYNAMIC_MODELS = {'DETR'}
+
+
+def check_version(version='2.2'):
+ err = "PaddlePaddle version {} or higher is required, " \
+ "or a suitable develop version is satisfied as well. \n" \
+ "Please make sure the version is good with your code.".format(version)
+
+ version_installed = [
+ paddle_version.major, paddle_version.minor, paddle_version.patch,
+ paddle_version.rc
+ ]
+
+ if version_installed == ['0', '0', '0', '0']:
+ return
+
+ if version == 'develop':
+ raise Exception("PaddlePaddle develop version is required!")
+
+ version_split = version.split('.')
+
+ length = min(len(version_installed), len(version_split))
+ for i in six.moves.range(length):
+ if version_installed[i] > version_split[i]:
+ return
+ if version_installed[i] < version_split[i]:
+ raise Exception(err)
+
+
+def check_trt_version(version='8.2'):
+    err = "TensorRT version {} or higher is required. " \
+          "Please make sure the version is good with your code.".format(version)
+ version_split = list(map(int, version.split('.')))
+ version_installed = get_trt_runtime_version()
+ length = min(len(version_installed), len(version_split))
+ for i in six.moves.range(length):
+ if version_installed[i] > version_split[i]:
+ return
+ if version_installed[i] < version_split[i]:
+ raise Exception(err)
+
+
+# preprocess ops
+def decode_image(im_file, im_info):
+ if isinstance(im_file, str):
+ with open(im_file, 'rb') as f:
+ im_read = f.read()
+ data = np.frombuffer(im_read, dtype='uint8')
+ im = cv2.imdecode(data, 1) # BGR mode, but need RGB mode
+ im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
+ else:
+ im = im_file
+ im_info['im_shape'] = np.array(im.shape[:2], dtype=np.float32)
+ im_info['scale_factor'] = np.array([1., 1.], dtype=np.float32)
+ return im, im_info
+
+
+class Resize(object):
+ def __init__(self, target_size, keep_ratio=True, interp=cv2.INTER_LINEAR):
+ if isinstance(target_size, int):
+ target_size = [target_size, target_size]
+ self.target_size = target_size
+ self.keep_ratio = keep_ratio
+ self.interp = interp
+
+ def __call__(self, im, im_info):
+ assert len(self.target_size) == 2
+ assert self.target_size[0] > 0 and self.target_size[1] > 0
+ im_channel = im.shape[2]
+ im_scale_y, im_scale_x = self.generate_scale(im)
+ im = cv2.resize(
+ im,
+ None,
+ None,
+ fx=im_scale_x,
+ fy=im_scale_y,
+ interpolation=self.interp)
+ im_info['im_shape'] = np.array(im.shape[:2]).astype('float32')
+ im_info['scale_factor'] = np.array(
+ [im_scale_y, im_scale_x]).astype('float32')
+ return im, im_info
+
+ def generate_scale(self, im):
+ origin_shape = im.shape[:2]
+ im_c = im.shape[2]
+ if self.keep_ratio:
+ im_size_min = np.min(origin_shape)
+ im_size_max = np.max(origin_shape)
+ target_size_min = np.min(self.target_size)
+ target_size_max = np.max(self.target_size)
+ im_scale = float(target_size_min) / float(im_size_min)
+ if np.round(im_scale * im_size_max) > target_size_max:
+ im_scale = float(target_size_max) / float(im_size_max)
+ im_scale_x = im_scale
+ im_scale_y = im_scale
+ else:
+ resize_h, resize_w = self.target_size
+ im_scale_y = resize_h / float(origin_shape[0])
+ im_scale_x = resize_w / float(origin_shape[1])
+ return im_scale_y, im_scale_x
+
+
+class Permute(object):
+ def __init__(self, ):
+ super(Permute, self).__init__()
+
+ def __call__(self, im, im_info):
+ im = im.transpose((2, 0, 1))
+ return im, im_info
+
+
+class NormalizeImage(object):
+ def __init__(self, mean, std, is_scale=True, norm_type='mean_std'):
+ self.mean = mean
+ self.std = std
+ self.is_scale = is_scale
+ self.norm_type = norm_type
+
+ def __call__(self, im, im_info):
+ im = im.astype(np.float32, copy=False)
+ if self.is_scale:
+ scale = 1.0 / 255.0
+ im *= scale
+
+ if self.norm_type == 'mean_std':
+ mean = np.array(self.mean)[np.newaxis, np.newaxis, :]
+ std = np.array(self.std)[np.newaxis, np.newaxis, :]
+ im -= mean
+ im /= std
+ return im, im_info
+
+
+class PadStride(object):
+ def __init__(self, stride=0):
+ self.coarsest_stride = stride
+
+ def __call__(self, im, im_info):
+ coarsest_stride = self.coarsest_stride
+ if coarsest_stride <= 0:
+ return im, im_info
+ im_c, im_h, im_w = im.shape
+ pad_h = int(np.ceil(float(im_h) / coarsest_stride) * coarsest_stride)
+ pad_w = int(np.ceil(float(im_w) / coarsest_stride) * coarsest_stride)
+ padding_im = np.zeros((im_c, pad_h, pad_w), dtype=np.float32)
+ padding_im[:, :im_h, :im_w] = im
+ return padding_im, im_info
+
+
+def preprocess(im, preprocess_ops):
+ # process image by preprocess_ops
+ im_info = {
+ 'scale_factor': np.array(
+ [1., 1.], dtype=np.float32),
+ 'im_shape': None,
+ }
+ im, im_info = decode_image(im, im_info)
+ for operator in preprocess_ops:
+ im, im_info = operator(im, im_info)
+ return im, im_info
+
+
+def parse_args():
+ parser = argparse.ArgumentParser()
+ parser.add_argument(
+ '--model_dir', type=str, help='directory of inference model')
+ parser.add_argument(
+ '--run_mode', type=str, default='paddle', help='running mode')
+ parser.add_argument('--batch_size', type=int, default=1, help='batch size')
+ parser.add_argument(
+ '--image_dir',
+ type=str,
+ default='/paddle/data/DOTA_1024_ss/test1024/images',
+ help='directory of test images')
+ parser.add_argument(
+ '--warmup_iter', type=int, default=5, help='num of warmup iters')
+ parser.add_argument(
+ '--total_iter', type=int, default=2000, help='num of total iters')
+ parser.add_argument(
+ '--log_iter', type=int, default=50, help='num of log interval')
+ parser.add_argument(
+ '--tuned_trt_shape_file',
+ type=str,
+ default='shape_range_info.pbtxt',
+ help='dynamic shape range info')
+ args = parser.parse_args()
+ return args
+
+
+def init_predictor(FLAGS):
+ model_dir, run_mode, batch_size = FLAGS.model_dir, FLAGS.run_mode, FLAGS.batch_size
+ yaml_file = os.path.join(model_dir, 'infer_cfg.yml')
+ with open(yaml_file) as f:
+ yml_conf = yaml.safe_load(f)
+
+ config = Config(
+ os.path.join(model_dir, 'model.pdmodel'),
+ os.path.join(model_dir, 'model.pdiparams'))
+
+ # initial GPU memory(M), device ID
+ config.enable_use_gpu(200, 0)
+ # optimize graph and fuse op
+ config.switch_ir_optim(True)
+
+ precision_map = {
+ 'trt_int8': Config.Precision.Int8,
+ 'trt_fp32': Config.Precision.Float32,
+ 'trt_fp16': Config.Precision.Half
+ }
+
+ arch = yml_conf['arch']
+ tuned_trt_shape_file = os.path.join(model_dir, FLAGS.tuned_trt_shape_file)
+
+ if run_mode in precision_map.keys():
+ if arch in TUNED_TRT_DYNAMIC_MODELS and not os.path.exists(
+ tuned_trt_shape_file):
+            print(
+                'dynamic shape range info will be saved in {}. After collection finishes, rerun the code.'.
+                format(tuned_trt_shape_file))
+ config.collect_shape_range_info(tuned_trt_shape_file)
+ config.enable_tensorrt_engine(
+ workspace_size=(1 << 25) * batch_size,
+ max_batch_size=batch_size,
+ min_subgraph_size=yml_conf['min_subgraph_size'],
+ precision_mode=precision_map[run_mode],
+ use_static=True,
+ use_calib_mode=False)
+
+ if yml_conf['use_dynamic_shape']:
+ if arch in TUNED_TRT_DYNAMIC_MODELS and os.path.exists(
+ tuned_trt_shape_file):
+ config.enable_tuned_tensorrt_dynamic_shape(tuned_trt_shape_file,
+ True)
+ else:
+ min_input_shape = {
+ 'image': [batch_size, 3, 640, 640],
+ 'scale_factor': [batch_size, 2]
+ }
+ max_input_shape = {
+ 'image': [batch_size, 3, 1280, 1280],
+ 'scale_factor': [batch_size, 2]
+ }
+ opt_input_shape = {
+ 'image': [batch_size, 3, 1024, 1024],
+ 'scale_factor': [batch_size, 2]
+ }
+ config.set_trt_dynamic_shape_info(
+ min_input_shape, max_input_shape, opt_input_shape)
+
+ # disable print log when predict
+ config.disable_glog_info()
+ # enable shared memory
+ config.enable_memory_optim()
+ # disable feed, fetch OP, needed by zero_copy_run
+ config.switch_use_feed_fetch_ops(False)
+ predictor = create_predictor(config)
+ return predictor, yml_conf
+
+
+def create_preprocess_ops(yml_conf):
+ preprocess_ops = []
+ for op_info in yml_conf['Preprocess']:
+ new_op_info = op_info.copy()
+ op_type = new_op_info.pop('type')
+ preprocess_ops.append(eval(op_type)(**new_op_info))
+ return preprocess_ops
+
+
+def get_test_images(image_dir):
+ images = set()
+ infer_dir = os.path.abspath(image_dir)
+ exts = ['jpg', 'jpeg', 'png', 'bmp']
+ exts += [ext.upper() for ext in exts]
+ for ext in exts:
+ images.update(glob.glob('{}/*.{}'.format(infer_dir, ext)))
+ images = list(images)
+ return images
+
+
+def create_inputs(image_files, preprocess_ops):
+ inputs = dict()
+ im_list, im_info_list = [], []
+ for im_path in image_files:
+ im, im_info = preprocess(im_path, preprocess_ops)
+ im_list.append(im)
+ im_info_list.append(im_info)
+
+ inputs['im_shape'] = np.stack(
+ [e['im_shape'] for e in im_info_list], axis=0).astype('float32')
+ inputs['scale_factor'] = np.stack(
+ [e['scale_factor'] for e in im_info_list], axis=0).astype('float32')
+ inputs['image'] = np.stack(im_list, axis=0).astype('float32')
+ return inputs
+
+
+def measure_speed(FLAGS):
+ predictor, yml_conf = init_predictor(FLAGS)
+ input_names = predictor.get_input_names()
+ preprocess_ops = create_preprocess_ops(yml_conf)
+
+ image_files = get_test_images(FLAGS.image_dir)
+
+ batch_size = FLAGS.batch_size
+ warmup_iter, log_iter, total_iter = FLAGS.warmup_iter, FLAGS.log_iter, FLAGS.total_iter
+
+ total_time = 0
+ fps = 0
+ for i in range(0, total_iter, batch_size):
+ # make data ready
+ inputs = create_inputs(image_files[i:i + batch_size], preprocess_ops)
+ for name in input_names:
+ input_tensor = predictor.get_input_handle(name)
+ input_tensor.copy_from_cpu(inputs[name])
+
+ paddle.device.cuda.synchronize()
+ # start running
+ start_time = time.perf_counter()
+ predictor.run()
+ paddle.device.cuda.synchronize()
+
+ if i >= warmup_iter:
+ total_time += time.perf_counter() - start_time
+ if (i + 1) % log_iter == 0:
+ fps = (i + 1 - warmup_iter) / total_time
+ print(
+ f'Done image [{i + 1:<3}/ {total_iter}], '
+ f'fps: {fps:.1f} img / s, '
+ f'times per image: {1000 / fps:.1f} ms / img',
+ flush=True)
+
+ if (i + 1) == total_iter:
+ fps = (i + 1 - warmup_iter) / total_time
+ print(
+ f'Overall fps: {fps:.1f} img / s, '
+ f'times per image: {1000 / fps:.1f} ms / img',
+ flush=True)
+ break
+
+
+if __name__ == '__main__':
+ FLAGS = parse_args()
+ if 'trt' in FLAGS.run_mode:
+ check_version('develop')
+ check_trt_version('8.2')
+ else:
+ check_version('2.4')
+ measure_speed(FLAGS)
diff --git a/PaddleDetection-release-2.6/configs/rotate/tools/onnx_infer.py b/PaddleDetection-release-2.6/configs/rotate/tools/onnx_infer.py
new file mode 100644
index 0000000000000000000000000000000000000000..fa9b06d2ddd525ed32f99891bdd67f3b6650f0be
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rotate/tools/onnx_infer.py
@@ -0,0 +1,302 @@
+# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import os
+import sys
+import six
+import glob
+import copy
+import yaml
+import argparse
+import cv2
+import numpy as np
+from shapely.geometry import Polygon
+from onnxruntime import InferenceSession
+
+
+# preprocess ops
+def decode_image(img_path):
+ with open(img_path, 'rb') as f:
+ im_read = f.read()
+ data = np.frombuffer(im_read, dtype='uint8')
+ im = cv2.imdecode(data, 1) # BGR mode, but need RGB mode
+ im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
+ img_info = {
+ "im_shape": np.array(
+ im.shape[:2], dtype=np.float32),
+ "scale_factor": np.array(
+ [1., 1.], dtype=np.float32)
+ }
+ return im, img_info
+
+
+class Resize(object):
+ def __init__(self, target_size, keep_ratio=True, interp=cv2.INTER_LINEAR):
+ if isinstance(target_size, int):
+ target_size = [target_size, target_size]
+ self.target_size = target_size
+ self.keep_ratio = keep_ratio
+ self.interp = interp
+
+ def __call__(self, im, im_info):
+ assert len(self.target_size) == 2
+ assert self.target_size[0] > 0 and self.target_size[1] > 0
+ im_channel = im.shape[2]
+ im_scale_y, im_scale_x = self.generate_scale(im)
+ im = cv2.resize(
+ im,
+ None,
+ None,
+ fx=im_scale_x,
+ fy=im_scale_y,
+ interpolation=self.interp)
+ im_info['im_shape'] = np.array(im.shape[:2]).astype('float32')
+ im_info['scale_factor'] = np.array(
+ [im_scale_y, im_scale_x]).astype('float32')
+ return im, im_info
+
+ def generate_scale(self, im):
+ origin_shape = im.shape[:2]
+ im_c = im.shape[2]
+ if self.keep_ratio:
+ im_size_min = np.min(origin_shape)
+ im_size_max = np.max(origin_shape)
+ target_size_min = np.min(self.target_size)
+ target_size_max = np.max(self.target_size)
+ im_scale = float(target_size_min) / float(im_size_min)
+ if np.round(im_scale * im_size_max) > target_size_max:
+ im_scale = float(target_size_max) / float(im_size_max)
+ im_scale_x = im_scale
+ im_scale_y = im_scale
+ else:
+ resize_h, resize_w = self.target_size
+ im_scale_y = resize_h / float(origin_shape[0])
+ im_scale_x = resize_w / float(origin_shape[1])
+ return im_scale_y, im_scale_x
+
+
+class Permute(object):
+ def __init__(self, ):
+ super(Permute, self).__init__()
+
+ def __call__(self, im, im_info):
+ im = im.transpose((2, 0, 1))
+ return im, im_info
+
+
+class NormalizeImage(object):
+ def __init__(self, mean, std, is_scale=True, norm_type='mean_std'):
+ self.mean = mean
+ self.std = std
+ self.is_scale = is_scale
+ self.norm_type = norm_type
+
+ def __call__(self, im, im_info):
+ im = im.astype(np.float32, copy=False)
+ if self.is_scale:
+ scale = 1.0 / 255.0
+ im *= scale
+
+ if self.norm_type == 'mean_std':
+ mean = np.array(self.mean)[np.newaxis, np.newaxis, :]
+ std = np.array(self.std)[np.newaxis, np.newaxis, :]
+ im -= mean
+ im /= std
+ return im, im_info
+
+
+class PadStride(object):
+ def __init__(self, stride=0):
+ self.coarsest_stride = stride
+
+ def __call__(self, im, im_info):
+ coarsest_stride = self.coarsest_stride
+ if coarsest_stride <= 0:
+ return im, im_info
+ im_c, im_h, im_w = im.shape
+ pad_h = int(np.ceil(float(im_h) / coarsest_stride) * coarsest_stride)
+ pad_w = int(np.ceil(float(im_w) / coarsest_stride) * coarsest_stride)
+ padding_im = np.zeros((im_c, pad_h, pad_w), dtype=np.float32)
+ padding_im[:, :im_h, :im_w] = im
+ return padding_im, im_info
+
+
+class Compose:
+ def __init__(self, transforms):
+ self.transforms = []
+ for op_info in transforms:
+ new_op_info = op_info.copy()
+ op_type = new_op_info.pop('type')
+ self.transforms.append(eval(op_type)(**new_op_info))
+
+ def __call__(self, img_path):
+ img, im_info = decode_image(img_path)
+ for t in self.transforms:
+ img, im_info = t(img, im_info)
+ inputs = copy.deepcopy(im_info)
+ inputs['image'] = img
+ return inputs
+
+
+# postprocess
+def rbox_iou(g, p):
+ g = np.array(g)
+ p = np.array(p)
+ g = Polygon(g[:8].reshape((4, 2)))
+ p = Polygon(p[:8].reshape((4, 2)))
+ g = g.buffer(0)
+ p = p.buffer(0)
+ if not g.is_valid or not p.is_valid:
+ return 0
+    inter = g.intersection(p).area
+ union = g.area + p.area - inter
+ if union == 0:
+ return 0
+ else:
+ return inter / union
+
+
+def multiclass_nms_rotated(pred_bboxes,
+                           pred_scores,
+                           iou_threshold=0.1,
+                           score_threshold=0.1):
+    """
+    Args:
+        pred_bboxes (numpy.ndarray): [B, N, 8]
+        pred_scores (numpy.ndarray): [B, C, N]
+
+    Return:
+        bboxes (numpy.ndarray): [N, 10]
+        bbox_num (numpy.ndarray): [B]
+    """
+    bbox_num = []
+    bboxes = []
+    for bbox_per_img, score_per_img in zip(pred_bboxes, pred_scores):
+        num_per_img = 0
+        for cls_id, score_per_cls in enumerate(score_per_img):
+            keep_mask = score_per_cls > score_threshold
+            bbox = bbox_per_img[keep_mask]
+            score = score_per_cls[keep_mask]
+
+            # sort candidates by score in descending order
+            idx = score.argsort()[::-1]
+            bbox = bbox[idx]
+            score = score[idx]
+            keep_idx = []
+            for i, b in enumerate(bbox):
+                suppressed = False
+                for gi in keep_idx:
+                    g = bbox[gi]
+                    if rbox_iou(b, g) > iou_threshold:
+                        suppressed = True
+                        break
+
+                if suppressed:
+                    continue
+
+                keep_idx.append(i)
+
+            keep_box = bbox[keep_idx]
+            keep_score = score[keep_idx]
+            keep_cls_ids = np.ones(len(keep_idx)) * cls_id
+            bboxes.append(
+                np.concatenate(
+                    [keep_cls_ids[:, None], keep_score[:, None], keep_box],
+                    axis=-1))
+            num_per_img += len(keep_idx)
+
+        bbox_num.append(num_per_img)
+
+    if len(bboxes) == 0:
+        return np.zeros((0, 10), dtype=np.float32), np.array(bbox_num)
+    return np.concatenate(bboxes, axis=0), np.array(bbox_num)
+
+
+def get_test_images(infer_dir, infer_img):
+ """
+ Get image path list in TEST mode
+ """
+ assert infer_img is not None or infer_dir is not None, \
+ "--image_file or --image_dir should be set"
+ assert infer_img is None or os.path.isfile(infer_img), \
+ "{} is not a file".format(infer_img)
+ assert infer_dir is None or os.path.isdir(infer_dir), \
+ "{} is not a directory".format(infer_dir)
+
+ # infer_img has a higher priority
+ if infer_img and os.path.isfile(infer_img):
+ return [infer_img]
+
+ images = set()
+ infer_dir = os.path.abspath(infer_dir)
+ assert os.path.isdir(infer_dir), \
+ "infer_dir {} is not a directory".format(infer_dir)
+ exts = ['jpg', 'jpeg', 'png', 'bmp']
+ exts += [ext.upper() for ext in exts]
+ for ext in exts:
+ images.update(glob.glob('{}/*.{}'.format(infer_dir, ext)))
+ images = list(images)
+
+ assert len(images) > 0, "no image found in {}".format(infer_dir)
+ print("Found {} inference images in total.".format(len(images)))
+
+ return images
+
+
+def predict_image(infer_config, predictor, img_list):
+ # load preprocess transforms
+ transforms = Compose(infer_config['Preprocess'])
+ # predict image
+ for img_path in img_list:
+ inputs = transforms(img_path)
+ inputs_name = [var.name for var in predictor.get_inputs()]
+ inputs = {k: inputs[k][None, ] for k in inputs_name}
+
+ outputs = predictor.run(output_names=None, input_feed=inputs)
+
+ bboxes, bbox_num = multiclass_nms_rotated(
+ np.array(outputs[0]), np.array(outputs[1]))
+ print("ONNXRuntime predict: ")
+ for bbox in bboxes:
+ if bbox[0] > -1 and bbox[1] > infer_config['draw_threshold']:
+                print(f"{int(bbox[0])} {bbox[1]} "
+                      f"{bbox[2]} {bbox[3]} {bbox[4]} {bbox[5]} "
+                      f"{bbox[6]} {bbox[7]} {bbox[8]} {bbox[9]}")
+
+
+def parse_args():
+ parser = argparse.ArgumentParser(description=__doc__)
+ parser.add_argument("--infer_cfg", type=str, help="infer_cfg.yml")
+ parser.add_argument(
+ '--onnx_file',
+ type=str,
+ default="model.onnx",
+ help="onnx model file path")
+ parser.add_argument("--image_dir", type=str)
+ parser.add_argument("--image_file", type=str)
+ return parser.parse_args()
+
+
+if __name__ == '__main__':
+ FLAGS = parse_args()
+ # load image list
+ img_list = get_test_images(FLAGS.image_dir, FLAGS.image_file)
+ # load predictor
+ predictor = InferenceSession(FLAGS.onnx_file)
+ # load infer config
+ with open(FLAGS.infer_cfg) as f:
+ infer_config = yaml.safe_load(f)
+
+ predict_image(infer_config, predictor, img_list)
diff --git a/PaddleDetection-release-2.6/configs/rotate/tools/prepare_data.py b/PaddleDetection-release-2.6/configs/rotate/tools/prepare_data.py
new file mode 100644
index 0000000000000000000000000000000000000000..21488e2c7a5a604dad4a508f2c67ec6bf8cea37a
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rotate/tools/prepare_data.py
@@ -0,0 +1,128 @@
+# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import os
+import argparse
+from convert import load_dota_infos, data_to_coco
+from slicebase import SliceBase
+
+wordname_15 = [
+ 'plane', 'baseball-diamond', 'bridge', 'ground-track-field',
+ 'small-vehicle', 'large-vehicle', 'ship', 'tennis-court',
+ 'basketball-court', 'storage-tank', 'soccer-ball-field', 'roundabout',
+ 'harbor', 'swimming-pool', 'helicopter'
+]
+
+wordname_16 = wordname_15 + ['container-crane']
+
+wordname_18 = wordname_16 + ['airport', 'helipad']
+
+DATA_CLASSES = {
+ 'dota10': wordname_15,
+ 'dota15': wordname_16,
+ 'dota20': wordname_18
+}
+
+
+def parse_args():
+ parser = argparse.ArgumentParser('prepare data for training')
+
+ parser.add_argument(
+ '--input_dirs',
+ nargs='+',
+ type=str,
+ default=None,
+ help='input dirs which contain image and labelTxt dir')
+
+ parser.add_argument(
+ '--output_dir',
+ type=str,
+ default=None,
+        help='output dir which will contain the image and labelTxt dirs and the COCO-style json file'
+ )
+
+ parser.add_argument(
+ '--coco_json_file',
+ type=str,
+ default='',
+ help='coco json annotation files')
+
+ parser.add_argument('--subsize', type=int, default=1024, help='patch size')
+
+    parser.add_argument('--gap', type=int, default=200, help='overlap size between adjacent patches')
+
+ parser.add_argument(
+ '--data_type', type=str, default='dota10', help='data type')
+
+ parser.add_argument(
+ '--rates',
+ nargs='+',
+ type=float,
+ default=[1.],
+ help='scales for multi-slice training')
+
+ parser.add_argument(
+        '--nproc', type=int, default=8, help='number of worker processes')
+
+ parser.add_argument(
+ '--iof_thr',
+ type=float,
+ default=0.5,
+        help='the minimal IoF between an object and a window')
+
+ parser.add_argument(
+ '--image_only',
+ action='store_true',
+ default=False,
+        help='only process images, skip annotations')
+
+ args = parser.parse_args()
+ return args
+
+
+def load_dataset(input_dir, nproc, data_type):
+ if 'dota' in data_type.lower():
+ infos = load_dota_infos(input_dir, nproc)
+ else:
+ raise ValueError('only dota dataset is supported now')
+
+ return infos
+
+
+def main():
+ args = parse_args()
+ infos = []
+ for input_dir in args.input_dirs:
+ infos += load_dataset(input_dir, args.nproc, args.data_type)
+
+ slicer = SliceBase(
+ args.gap,
+ args.subsize,
+ args.iof_thr,
+ num_process=args.nproc,
+ image_only=args.image_only)
+ slicer.slice_data(infos, args.rates, args.output_dir)
+ if args.coco_json_file:
+ infos = load_dota_infos(args.output_dir, args.nproc)
+ coco_json_file = os.path.join(args.output_dir, args.coco_json_file)
+ class_names = DATA_CLASSES[args.data_type]
+ data_to_coco(infos, coco_json_file, class_names, args.nproc)
+
+
+if __name__ == '__main__':
+ main()
diff --git a/PaddleDetection-release-2.6/configs/rotate/tools/slicebase.py b/PaddleDetection-release-2.6/configs/rotate/tools/slicebase.py
new file mode 100644
index 0000000000000000000000000000000000000000..5514b7e27c7de4047eab750fd6e1e811728a5139
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/rotate/tools/slicebase.py
@@ -0,0 +1,267 @@
+# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+# Reference: https://github.com/CAPTAIN-WHU/DOTA_devkit
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import os
+import math
+import copy
+from numbers import Number
+from multiprocessing import Pool
+
+import cv2
+import numpy as np
+from tqdm import tqdm
+import shapely.geometry as shgeo
+
+
+def choose_best_pointorder_fit_another(poly1, poly2):
+ """
+ To make the two polygons best fit with each point
+ """
+ x1, y1, x2, y2, x3, y3, x4, y4 = poly1
+ combinate = [
+ np.array([x1, y1, x2, y2, x3, y3, x4, y4]),
+ np.array([x2, y2, x3, y3, x4, y4, x1, y1]),
+ np.array([x3, y3, x4, y4, x1, y1, x2, y2]),
+ np.array([x4, y4, x1, y1, x2, y2, x3, y3])
+ ]
+ dst_coordinate = np.array(poly2)
+ distances = np.array(
+ [np.sum((coord - dst_coordinate)**2) for coord in combinate])
+    order = distances.argsort()
+    return combinate[order[0]]
+
+
+def cal_line_length(point1, point2):
+ return math.sqrt(
+ math.pow(point1[0] - point2[0], 2) + math.pow(point1[1] - point2[1], 2))
+
+
+class SliceBase(object):
+ def __init__(self,
+ gap=512,
+ subsize=1024,
+ thresh=0.7,
+ choosebestpoint=True,
+ ext='.png',
+ padding=True,
+ num_process=8,
+ image_only=False):
+ self.gap = gap
+ self.subsize = subsize
+ self.slide = subsize - gap
+ self.thresh = thresh
+ self.choosebestpoint = choosebestpoint
+ self.ext = ext
+ self.padding = padding
+ self.num_process = num_process
+ self.image_only = image_only
+
+ def get_windows(self, height, width):
+ windows = []
+ left, up = 0, 0
+ while (left < width):
+ if (left + self.subsize >= width):
+ left = max(width - self.subsize, 0)
+ up = 0
+ while (up < height):
+ if (up + self.subsize >= height):
+ up = max(height - self.subsize, 0)
+ right = min(left + self.subsize, width - 1)
+ down = min(up + self.subsize, height - 1)
+ windows.append((left, up, right, down))
+ if (up + self.subsize >= height):
+ break
+ else:
+ up = up + self.slide
+ if (left + self.subsize >= width):
+ break
+ else:
+ left = left + self.slide
+
+ return windows
+
+ def slice_image_single(self, image, windows, output_dir, output_name):
+ image_dir = os.path.join(output_dir, 'images')
+ for (left, up, right, down) in windows:
+ image_name = output_name + str(left) + '___' + str(up) + self.ext
+ subimg = copy.deepcopy(image[up:up + self.subsize, left:left +
+ self.subsize])
+ h, w, c = subimg.shape
+ if (self.padding):
+ outimg = np.zeros((self.subsize, self.subsize, 3))
+ outimg[0:h, 0:w, :] = subimg
+ cv2.imwrite(os.path.join(image_dir, image_name), outimg)
+ else:
+ cv2.imwrite(os.path.join(image_dir, image_name), subimg)
+
+ def iof(self, poly1, poly2):
+ inter_poly = poly1.intersection(poly2)
+ inter_area = inter_poly.area
+ poly1_area = poly1.area
+ half_iou = inter_area / poly1_area
+ return inter_poly, half_iou
+
+ def translate(self, poly, left, up):
+ n = len(poly)
+ out_poly = np.zeros(n)
+ for i in range(n // 2):
+ out_poly[i * 2] = int(poly[i * 2] - left)
+ out_poly[i * 2 + 1] = int(poly[i * 2 + 1] - up)
+ return out_poly
+
+ def get_poly4_from_poly5(self, poly):
+ distances = [
+ cal_line_length((poly[i * 2], poly[i * 2 + 1]),
+ (poly[(i + 1) * 2], poly[(i + 1) * 2 + 1]))
+ for i in range(int(len(poly) / 2 - 1))
+ ]
+ distances.append(
+ cal_line_length((poly[0], poly[1]), (poly[8], poly[9])))
+ pos = np.array(distances).argsort()[0]
+ count = 0
+ out_poly = []
+ while count < 5:
+ if (count == pos):
+ out_poly.append(
+ (poly[count * 2] + poly[(count * 2 + 2) % 10]) / 2)
+ out_poly.append(
+ (poly[(count * 2 + 1) % 10] + poly[(count * 2 + 3) % 10]) /
+ 2)
+ count = count + 1
+ elif (count == (pos + 1) % 5):
+ count = count + 1
+ continue
+
+ else:
+ out_poly.append(poly[count * 2])
+ out_poly.append(poly[count * 2 + 1])
+ count = count + 1
+ return out_poly
+
+ def slice_anno_single(self, annos, windows, output_dir, output_name):
+ anno_dir = os.path.join(output_dir, 'labelTxt')
+ for (left, up, right, down) in windows:
+ image_poly = shgeo.Polygon(
+ [(left, up), (right, up), (right, down), (left, down)])
+ anno_file = output_name + str(left) + '___' + str(up) + '.txt'
+ with open(os.path.join(anno_dir, anno_file), 'w') as f:
+ for anno in annos:
+ gt_poly = shgeo.Polygon(
+ [(anno['poly'][0], anno['poly'][1]),
+ (anno['poly'][2], anno['poly'][3]),
+ (anno['poly'][4], anno['poly'][5]),
+ (anno['poly'][6], anno['poly'][7])])
+ if gt_poly.area <= 0:
+ continue
+ inter_poly, iof = self.iof(gt_poly, image_poly)
+ if iof == 1:
+ final_poly = self.translate(anno['poly'], left, up)
+ elif iof > 0:
+ inter_poly = shgeo.polygon.orient(inter_poly, sign=1)
+ out_poly = list(inter_poly.exterior.coords)[0:-1]
+ if len(out_poly) < 4 or len(out_poly) > 5:
+ continue
+
+ final_poly = []
+ for p in out_poly:
+ final_poly.append(p[0])
+ final_poly.append(p[1])
+
+ if len(out_poly) == 5:
+ final_poly = self.get_poly4_from_poly5(final_poly)
+
+ if self.choosebestpoint:
+ final_poly = choose_best_pointorder_fit_another(
+ final_poly, anno['poly'])
+
+ final_poly = self.translate(final_poly, left, up)
+ final_poly = np.clip(final_poly, 1, self.subsize)
+ else:
+ continue
+ outline = ' '.join(list(map(str, final_poly)))
+ if iof >= self.thresh:
+ outline = outline + ' ' + anno['name'] + ' ' + str(anno[
+ 'difficult'])
+ else:
+ outline = outline + ' ' + anno['name'] + ' ' + '2'
+
+ f.write(outline + '\n')
+
+ def slice_data_single(self, info, rate, output_dir):
+ file_name = info['image_file']
+ base_name = os.path.splitext(os.path.split(file_name)[-1])[0]
+ base_name = base_name + '__' + str(rate) + '__'
+        img = cv2.imread(file_name)
+        if img is None:
+            return
+
+ if (rate != 1):
+ resize_img = cv2.resize(
+ img, None, fx=rate, fy=rate, interpolation=cv2.INTER_CUBIC)
+ else:
+ resize_img = img
+
+ height, width, _ = resize_img.shape
+ windows = self.get_windows(height, width)
+ self.slice_image_single(resize_img, windows, output_dir, base_name)
+ if not self.image_only:
+ annos = info['annotation']
+ for anno in annos:
+ anno['poly'] = list(map(lambda x: rate * x, anno['poly']))
+ self.slice_anno_single(annos, windows, output_dir, base_name)
+
+ def check_or_mkdirs(self, path):
+ if not os.path.exists(path):
+ os.makedirs(path, exist_ok=True)
+
+ def slice_data(self, infos, rates, output_dir):
+ """
+ Args:
+ infos (list[dict]): data_infos
+ rates (float, list): scale rates
+ output_dir (str): output directory
+ """
+ if isinstance(rates, Number):
+ rates = [rates, ]
+
+ self.check_or_mkdirs(output_dir)
+ self.check_or_mkdirs(os.path.join(output_dir, 'images'))
+ if not self.image_only:
+ self.check_or_mkdirs(os.path.join(output_dir, 'labelTxt'))
+
+ pbar = tqdm(total=len(rates) * len(infos), desc='slicing data')
+
+ if self.num_process <= 1:
+ for rate in rates:
+ for info in infos:
+ self.slice_data_single(info, rate, output_dir)
+ pbar.update()
+ else:
+ pool = Pool(self.num_process)
+ for rate in rates:
+ for info in infos:
+ pool.apply_async(
+ self.slice_data_single, (info, rate, output_dir),
+ callback=lambda x: pbar.update())
+
+ pool.close()
+ pool.join()
+
+ pbar.close()
diff --git a/PaddleDetection-release-2.6/configs/runtime.yml b/PaddleDetection-release-2.6/configs/runtime.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a58b171ce774e045f4db2e0894a6781a25e0ec03
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/runtime.yml
@@ -0,0 +1,16 @@
+use_gpu: true
+use_xpu: false
+use_mlu: false
+use_npu: false
+log_iter: 20
+save_dir: output
+snapshot_epoch: 1
+print_flops: false
+print_params: false
+
+# Exporting the model
+export:
+  post_process: True  # Whether post-processing is included in the network when exporting the model.
+  nms: True  # Whether NMS is included in the network when exporting the model.
+  benchmark: False  # Used for testing model performance; if set to `True`, post-processing and NMS will not be exported.
+ fuse_conv_bn: False
diff --git a/PaddleDetection-release-2.6/configs/semi_det/README.md b/PaddleDetection-release-2.6/configs/semi_det/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..996a1decfec0328420654d2d39d930ea2c7fdc0f
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/semi_det/README.md
@@ -0,0 +1,417 @@
+简体中文 | [English](README_en.md)
+
+# Semi-Supervised Detection (Semi DET)
+
+## Contents
+- [Introduction](#introduction)
+- [Model Zoo](#model-zoo)
+  - [Baseline](#baseline)
+  - [DenseTeacher](#denseteacher)
+- [Semi-Supervised Dataset Preparation](#semi-supervised-dataset-preparation)
+- [Semi-Supervised Detection Configuration](#semi-supervised-detection-configuration)
+  - [Training Set Configuration](#training-set-configuration)
+  - [Pretraining Configuration](#pretraining-configuration)
+  - [Global Configuration](#global-configuration)
+  - [Model Configuration](#model-configuration)
+  - [Data Augmentation Configuration](#data-augmentation-configuration)
+  - [Other Configuration](#other-configuration)
+- [Usage](#usage)
+  - [Training](#training)
+  - [Evaluation](#evaluation)
+  - [Prediction](#prediction)
+  - [Deployment](#deployment)
+- [Citation](#citation)
+
+## Introduction
+Semi-supervised object detection (Semi DET) trains a detector on **labeled and unlabeled data at the same time**. It can greatly reduce annotation cost while exploiting unlabeled data to further improve detection accuracy. The PaddleDetection team has reproduced the [DenseTeacher](denseteacher) semi-supervised detection algorithm, which users can download and use.
+
+## Model Zoo
+
+### [Baseline](baseline)
+
+For the training and model zoo of **fully supervised** baseline models, see [Baseline](baseline).
+
+
+### [DenseTeacher](denseteacher)
+
+| Model | Supervised Data Ratio | Sup Baseline | Sup Epochs (Iters) | Sup mAP<sup>val</sup><br>0.5:0.95 | Semi mAP<sup>val</sup><br>0.5:0.95 | Semi Epochs (Iters) | Download | Config |
+| :------------: | :---------: | :---------------------: | :---------------------: |:---------------------------: |:----------------------------: | :------------------: |:--------: |:----------: |
+| DenseTeacher-FCOS | 5% | [sup_config](./baseline/fcos_r50_fpn_2x_coco_sup005.yml) | 24 (8712) | 21.3 | **30.6** | 240 (87120) | [download](https://paddledet.bj.bcebos.com/models/denseteacher_fcos_r50_fpn_coco_semi005.pdparams) | [config](denseteacher/denseteacher_fcos_r50_fpn_coco_semi005.yml) |
+| DenseTeacher-FCOS | 10% | [sup_config](./baseline/fcos_r50_fpn_2x_coco_sup010.yml) | 24 (17424) | 26.3 | **35.1** | 240 (174240) | [download](https://paddledet.bj.bcebos.com/models/denseteacher_fcos_r50_fpn_coco_semi010.pdparams) | [config](denseteacher/denseteacher_fcos_r50_fpn_coco_semi010.yml) |
+| DenseTeacher-FCOS(LSJ)| 10% | [sup_config](./baseline/fcos_r50_fpn_2x_coco_sup010.yml) | 24 (17424) | 26.3 | **37.1(LSJ)** | 240 (174240) | [download](https://paddledet.bj.bcebos.com/models/denseteacher_fcos_r50_fpn_coco_semi010_lsj.pdparams) | [config](denseteacher/denseteacher_fcos_r50_fpn_coco_semi010_lsj.yml) |
+| DenseTeacher-FCOS | 100% (full) | [sup_config](./../fcos/fcos_r50_fpn_iou_multiscale_2x_coco.yml) | 24 (175896) | 42.6 | **44.2** | 24 (175896) | [download](https://paddledet.bj.bcebos.com/models/denseteacher_fcos_r50_fpn_coco_full.pdparams) | [config](denseteacher/denseteacher_fcos_r50_fpn_coco_full.yml) |
+
+
+## Semi-Supervised Dataset Preparation
+
+Semi-supervised object detection requires **both labeled and unlabeled data**, and the amount of unlabeled data is usually **much larger than the amount of labeled data**.
+For the COCO dataset, there are two common settings:
+
+(1) Sample part of the original training set `train2017` as labeled data, with the rest as unlabeled data.
+
+A fixed percentage (1%, 2%, 5%, 10%, etc.) is sampled from `train2017`. Since the sampling strategy strongly affects the semi-supervised training results, five-fold cross validation is used for evaluation. Run the dataset-splitting script as follows:
+```bash
+python tools/gen_semi_coco.py
+```
+It splits the full `train2017` set at supervision ratios of 1%, 2%, 5% and 10%. For cross validation, each split is randomly repeated 5 times, producing the following semi-supervised annotation files:
+- labeled set annotations: `instances_train2017.{fold}@{percent}.json`
+- unlabeled set annotations: `instances_train2017.{fold}@{percent}-unlabeled.json`
+
+Here `fold` denotes the cross-validation fold and `percent` denotes the percentage of labeled data.
+
+Note that if you generate the split from a `txt_file`, you need to download `COCO_supervision.txt` first:
+```shell
+wget https://bj.bcebos.com/v1/paddledet/data/coco/COCO_supervision.txt
+```
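The percent-based split above can be sketched in plain Python. The snippet below is an illustrative sketch, not the actual `tools/gen_semi_coco.py` (which additionally handles the five folds and `COCO_supervision.txt`): it samples a fixed percentage of image ids from a COCO-style annotation dict and produces a labeled subset plus an unlabeled subset whose annotation list is left empty.

```python
import random

def split_coco(anns, percent, seed=1):
    """Split a COCO-style annotation dict into a labeled part and an
    unlabeled part by sampling `percent`% of the images."""
    rng = random.Random(seed)
    image_ids = [img['id'] for img in anns['images']]
    num_labeled = max(1, int(len(image_ids) * percent / 100))
    labeled_ids = set(rng.sample(image_ids, num_labeled))

    def subset(keep_labeled):
        ids = labeled_ids if keep_labeled else set(image_ids) - labeled_ids
        return {
            'images': [im for im in anns['images'] if im['id'] in ids],
            # the unlabeled json keeps an empty annotation list
            'annotations': [a for a in anns['annotations']
                            if keep_labeled and a['image_id'] in ids],
            'categories': anns['categories'],
        }

    return subset(True), subset(False)

# toy stand-in for annotations/instances_train2017.json
toy_anns = {
    'images': [{'id': i} for i in range(100)],
    'annotations': [{'id': i, 'image_id': i} for i in range(100)],
    'categories': [{'id': 1, 'name': 'plane'}],
}
labeled, unlabeled = split_coco(toy_anns, percent=10)
print(len(labeled['images']), len(unlabeled['images']))  # 10 90
```

The two resulting dicts could then be written out with `json.dump` under names following the convention above, e.g. `instances_train2017.1@10.json` and `instances_train2017.1@10-unlabeled.json`.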
+
+(2) Use the full original training set `train2017` as labeled data and the full original unlabeled image set `unlabeled2017` as unlabeled data.
+
+
+### Download Links
+
+The PaddleDetection team provides all the annotation files for the COCO dataset. Please download, extract and place them in the corresponding directories:
+
+```shell
+# download the full COCO dataset images and annotations,
+# including train2017, val2017 and annotations
+wget https://bj.bcebos.com/v1/paddledet/data/coco.tar
+
+# download the partial-ratio COCO annotation files prepared by the PaddleDetection team
+wget https://bj.bcebos.com/v1/paddledet/data/coco/semi_annotations.zip
+
+# unlabeled2017 is optional; if you do not train the 'full' setting, there is no need to download it
+# download the full COCO unlabeled dataset
+wget https://bj.bcebos.com/v1/paddledet/data/coco/unlabeled2017.zip
+wget https://bj.bcebos.com/v1/paddledet/data/coco/image_info_unlabeled2017.zip
+# download the converted unlabeled2017 json file
+wget https://bj.bcebos.com/v1/paddledet/data/coco/instances_unlabeled2017.zip
+```
+
+If you need the full COCO unlabeled set, convert the original `image_info_unlabeled2017.json` by running the following code:
+
+
+ COCO unlabeled annotation conversion code:
+
+```python
+import json
+anns_train = json.load(open('annotations/instances_train2017.json', 'r'))
+anns_unlabeled = json.load(open('annotations/image_info_unlabeled2017.json', 'r'))
+unlabeled_json = {
+ 'images': anns_unlabeled['images'],
+ 'annotations': [],
+ 'categories': anns_train['categories'],
+}
+path = 'annotations/instances_unlabeled2017.json'
+with open(path, 'w') as f:
+ json.dump(unlabeled_json, f)
+```
+
+
+
+
+
+ The extracted dataset directory looks as follows:
+
+```
+PaddleDetection
+├── dataset
+│ ├── coco
+│ │ ├── annotations
+│ │ │ ├── instances_train2017.json
+│ │ │ ├── instances_unlabeled2017.json
+│ │ │ ├── instances_val2017.json
+│ │ ├── semi_annotations
+│ │ │ ├── instances_train2017.1@1.json
+│ │ │ ├── instances_train2017.1@1-unlabeled.json
+│ │ │ ├── instances_train2017.1@2.json
+│ │ │ ├── instances_train2017.1@2-unlabeled.json
+│ │ │ ├── instances_train2017.1@5.json
+│ │ │ ├── instances_train2017.1@5-unlabeled.json
+│ │ │ ├── instances_train2017.1@10.json
+│ │ │ ├── instances_train2017.1@10-unlabeled.json
+│ │ ├── train2017
+│ │ ├── unlabeled2017
+│ │ ├── val2017
+```
+
+
+
+## Semi-Supervised Detection Configuration
+
+To configure semi-supervised detection, start from the config file of the chosen **base detector**, e.g.:
+
+```python
+_BASE_: [
+ '../../fcos/fcos_r50_fpn_iou_multiscale_2x_coco.yml',
+ '../_base_/coco_detection_percent_10.yml',
+]
+log_iter: 50
+snapshot_epoch: 5
+epochs: &epochs 240
+weights: output/denseteacher_fcos_r50_fpn_coco_semi010/model_final
+```
+and then make the following changes in turn:
+
+### Training Set Configuration
+
+First, you can directly reference an already-configured semi-supervised training set, e.g.:
+
+```python
+_BASE_: [
+ '../_base_/coco_detection_percent_10.yml',
+]
+```
+
+Specifically, building a semi-supervised dataset requires configuring the paths of both the supervised dataset `TrainDataset` and the unsupervised dataset `UnsupTrainDataset`. **Note that the `SemiCOCODataSet` class must be used instead of `COCODataSet`**, as shown below:
+
+**COCO-train2017 partial-ratio dataset**:
+
+```python
+# partial labeled COCO, use `SemiCOCODataSet` rather than `COCODataSet`
+TrainDataset:
+ !SemiCOCODataSet
+ image_dir: train2017
+ anno_path: semi_annotations/instances_train2017.1@10.json
+ dataset_dir: dataset/coco
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+# partial unlabeled COCO, use `SemiCOCODataSet` rather than `COCODataSet`
+UnsupTrainDataset:
+ !SemiCOCODataSet
+ image_dir: train2017
+ anno_path: semi_annotations/instances_train2017.1@10-unlabeled.json
+ dataset_dir: dataset/coco
+ data_fields: ['image']
+ supervised: False
+```
+
+or the **full COCO-train2017 dataset**:
+
+```python
+# full labeled COCO, use `SemiCOCODataSet` rather than `COCODataSet`
+TrainDataset:
+ !SemiCOCODataSet
+ image_dir: train2017
+ anno_path: annotations/instances_train2017.json
+ dataset_dir: dataset/coco
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+# full unlabeled COCO, use `SemiCOCODataSet` rather than `COCODataSet`
+UnsupTrainDataset:
+ !SemiCOCODataSet
+ image_dir: unlabeled2017
+ anno_path: annotations/instances_unlabeled2017.json
+ dataset_dir: dataset/coco
+ data_fields: ['image']
+ supervised: False
+```
+
+The configurations of the validation set `EvalDataset` and the test set `TestDataset` **do not need to be changed**, and they still use the `COCODataSet` class.
+
+
+### Pretraining Configuration
+
+```python
+### pretrain and warmup config, choose one and comment out the other
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_cos_pretrained.pdparams
+semi_start_iters: 5000
+ema_start_iters: 3000
+use_warmup: &use_warmup True
+```
+
+**Note:**
+ - The original `Dense Teacher` paper uses `R50-va-caffe` pretraining, while PaddleDetection uses `R50-vb` pretraining by default. Using an `R50-vd` pretrained model combined with [SSLD](../../../docs/feature_models/SSLD_PRETRAINED_MODEL.md) can further significantly improve detection accuracy, in which case the backbone config also needs to be changed accordingly, e.g.:
+ ```python
+ pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_v2_pretrained.pdparams
+ ResNet:
+ depth: 50
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [1, 2, 3]
+ num_stages: 4
+ lr_mult_list: [0.05, 0.05, 0.1, 0.15]
+```
+
+### Global Configuration
+
+Add the following global configuration to the config file, and note that the DenseTeacher model must use `use_simple_ema: True` rather than `use_ema: True`:
+
+```python
+### global config
+use_simple_ema: True
+ema_decay: 0.9996
+ssod_method: DenseTeacher
+DenseTeacher:
+ train_cfg:
+ sup_weight: 1.0
+ unsup_weight: 1.0
+ loss_weight: {distill_loss_cls: 4.0, distill_loss_box: 1.0, distill_loss_quality: 1.0}
+ concat_sup_data: True
+ suppress: linear
+ ratio: 0.01
+ gamma: 2.0
+ test_cfg:
+ inference_on: teacher
+```
+
+### Model Configuration
+
+If there are no special changes, the model configuration is inherited directly from the base detector's config.
+Taking `DenseTeacher` as an example, `fcos_r50_fpn_iou_multiscale_2x_coco.yml` is chosen as the **base detector** for semi-supervised training. **The teacher network and the student network both use the base detector's architecture, and the two architectures are identical.**
+
+```python
+_BASE_: [
+ '../../fcos/fcos_r50_fpn_iou_multiscale_2x_coco.yml',
+]
+```
+
+### 数据增强配置
+
+To build the reader for the semi-supervised training set, extend the original `TrainReader` with `weak_aug`, `strong_aug`, `sup_batch_transforms`, and `unsup_batch_transforms`, and note:
+- if `NormalizeImage` is present, move it out of `sample_transforms` and place it in both `weak_aug` and `strong_aug`;
+- `sample_transforms` are the **shared base augmentations**;
+- the full weak augmentation is `sample_transforms + weak_aug`, and the full strong augmentation is `sample_transforms + strong_aug`.
+
+For example:
+
+The original fully supervised `TrainReader`:
+```python
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], keep_ratio: True, interp: 1}
+ - RandomFlip: {}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ batch_transforms:
+ - Permute: {}
+ - PadBatch: {pad_to_stride: 32}
+ - Gt2FCOSTarget:
+ object_sizes_boundary: [64, 128, 256, 512]
+ center_sampling_radius: 1.5
+ downsample_ratios: [8, 16, 32, 64, 128]
+ norm_reg_targets: True
+ batch_size: 2
+ shuffle: True
+ drop_last: True
+```
+
+The modified semi-supervised `SemiTrainReader`:
+
+```python
+### reader config
+SemiTrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], keep_ratio: True, interp: 1}
+ - RandomFlip: {}
+ weak_aug:
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: true}
+ strong_aug:
+ - StrongAugImage: {transforms: [
+ RandomColorJitter: {prob: 0.8, brightness: 0.4, contrast: 0.4, saturation: 0.4, hue: 0.1},
+ RandomErasingCrop: {},
+ RandomGaussianBlur: {prob: 0.5, sigma: [0.1, 2.0]},
+ RandomGrayscale: {prob: 0.2},
+ ]}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: true}
+ sup_batch_transforms:
+ - Permute: {}
+ - PadBatch: {pad_to_stride: 32}
+ - Gt2FCOSTarget:
+ object_sizes_boundary: [64, 128, 256, 512]
+ center_sampling_radius: 1.5
+ downsample_ratios: [8, 16, 32, 64, 128]
+ norm_reg_targets: True
+ unsup_batch_transforms:
+ - Permute: {}
+ - PadBatch: {pad_to_stride: 32}
+ sup_batch_size: 2
+ unsup_batch_size: 2
+ shuffle: True
+ drop_last: True
+```
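The composition rule above (shared `sample_transforms`, then a branch-specific tail) can be sketched as follows; `compose` and the lambda transforms are hypothetical stand-ins for the real transform objects:

```python
def compose(*stages):
    """Chain lists of transforms into a single callable pipeline."""
    steps = [t for stage in stages for t in stage]
    def pipeline(sample):
        for t in steps:
            sample = t(sample)
        return sample
    return pipeline

# The shared sample_transforms run first; only the tail differs per branch:
#   weak_pipeline   = compose(sample_transforms, weak_aug)
#   strong_pipeline = compose(sample_transforms, strong_aug)
```

Both branches therefore see the same geometric transforms (resize, flip), so teacher and student predictions stay spatially aligned.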
+
+### Other Configuration
+
+Set the number of training epochs so that the total iteration count matches full-data training. For example, if full-data training runs 24 epochs (about 180k iterations), then semi-supervised training with 10% supervised data needs roughly 240 epochs (again about 180k iterations). Example:
+
+```python
+### other config
+epoch: 240
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: 240
+ use_warmup: True
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 1000
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
+ clip_grad_by_value: 1.0
+```
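The epoch-to-iteration conversion in this section is plain bookkeeping: with `drop_last: True`, one epoch is `num_images // total_batch_size` iterations. A tiny helper (a sketch; plug in your own split size and batch size):

```python
def epochs_to_iters(num_epochs, num_images, total_batch_size):
    """Total training iterations, assuming drop_last=True (partial batches dropped)."""
    iters_per_epoch = num_images // total_batch_size
    return num_epochs * iters_per_epoch
```

A split ten times smaller therefore needs roughly ten times the epochs to reach the same total iteration count.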
+
+
+## Usage
+
+The semi-supervised detection config file is only required for training; evaluation, inference, and deployment can also be run with the base detector's config file.
+
+### Training
+
+```bash
+# single-GPU training (not recommended; adjust the learning rate linearly to match)
+CUDA_VISIBLE_DEVICES=0 python tools/train.py -c configs/semi_det/denseteacher/denseteacher_fcos_r50_fpn_coco_semi010.yml --eval
+
+# multi-GPU training
+python -m paddle.distributed.launch --log_dir=denseteacher_fcos_semi010/ --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/semi_det/denseteacher/denseteacher_fcos_r50_fpn_coco_semi010.yml --eval
+```
+
+### Evaluation
+
+```bash
+CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/semi_det/denseteacher/denseteacher_fcos_r50_fpn_coco_semi010.yml -o weights=output/denseteacher_fcos_r50_fpn_coco_semi010/model_final.pdparams
+```
+
+### Inference
+
+```bash
+CUDA_VISIBLE_DEVICES=0 python tools/infer.py -c configs/semi_det/denseteacher/denseteacher_fcos_r50_fpn_coco_semi010.yml -o weights=output/denseteacher_fcos_r50_fpn_coco_semi010/model_final.pdparams --infer_img=demo/000000014439.jpg
+```
+
+### Deployment
+
+Deployment can use either the semi-supervised detection config file or the base detector's config file.
+
+```bash
+# export the model
+CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/semi_det/denseteacher/denseteacher_fcos_r50_fpn_coco_semi010.yml -o weights=https://paddledet.bj.bcebos.com/models/denseteacher_fcos_r50_fpn_coco_semi010.pdparams
+
+# run inference with the exported model
+CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/denseteacher_fcos_r50_fpn_coco_semi010 --image_file=demo/000000014439_640x640.jpg --device=GPU
+
+# benchmark deployment speed
+CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/denseteacher_fcos_r50_fpn_coco_semi010 --image_file=demo/000000014439_640x640.jpg --device=GPU --run_benchmark=True # --run_mode=trt_fp16
+
+# export to ONNX
+paddle2onnx --model_dir output_inference/denseteacher_fcos_r50_fpn_coco_semi010/ --model_filename model.pdmodel --params_filename model.pdiparams --opset_version 12 --save_file denseteacher_fcos_r50_fpn_coco_semi010.onnx
+```
+
+
+## Citation
+
+```
+@article{denseteacher2022,
+  title={Dense Teacher: Dense Pseudo-Labels for Semi-supervised Object Detection},
+  author={Hongyu Zhou and Zheng Ge and Songtao Liu and Weixin Mao and Zeming Li and Haiyan Yu and Jian Sun},
+  journal={arXiv preprint arXiv:2207.02541},
+  year={2022}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/semi_det/_base_/coco_detection_full.yml b/PaddleDetection-release-2.6/configs/semi_det/_base_/coco_detection_full.yml
new file mode 100644
index 0000000000000000000000000000000000000000..2805f88c879b8a4b0616fcf587878799fbb42b43
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/semi_det/_base_/coco_detection_full.yml
@@ -0,0 +1,31 @@
+metric: COCO
+num_classes: 80
+
+# full labeled COCO, use `SemiCOCODataSet` rather than `COCODataSet`
+TrainDataset:
+ !SemiCOCODataSet
+ image_dir: train2017
+ anno_path: annotations/instances_train2017.json
+ dataset_dir: dataset/coco
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+# full unlabeled COCO, use `SemiCOCODataSet` rather than `COCODataSet`
+UnsupTrainDataset:
+ !SemiCOCODataSet
+ image_dir: unlabeled2017
+ anno_path: annotations/instances_unlabeled2017.json
+ dataset_dir: dataset/coco
+ data_fields: ['image']
+ supervised: False
+
+EvalDataset:
+ !COCODataSet
+ image_dir: val2017
+ anno_path: annotations/instances_val2017.json
+ dataset_dir: dataset/coco
+ allow_empty: true
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/instances_val2017.json # also support txt (like VOC's label_list.txt)
+ dataset_dir: dataset/coco # if set, anno_path will be 'dataset_dir/anno_path'
diff --git a/PaddleDetection-release-2.6/configs/semi_det/_base_/coco_detection_percent_1.yml b/PaddleDetection-release-2.6/configs/semi_det/_base_/coco_detection_percent_1.yml
new file mode 100644
index 0000000000000000000000000000000000000000..569b8e9dc922b9ba290b96bc734e35840d74f551
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/semi_det/_base_/coco_detection_percent_1.yml
@@ -0,0 +1,31 @@
+metric: COCO
+num_classes: 80
+
+# partial labeled COCO, use `SemiCOCODataSet` rather than `COCODataSet`
+TrainDataset:
+ !SemiCOCODataSet
+ image_dir: train2017
+ anno_path: semi_annotations/instances_train2017.1@1.json
+ dataset_dir: dataset/coco
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+# partial unlabeled COCO, use `SemiCOCODataSet` rather than `COCODataSet`
+UnsupTrainDataset:
+ !SemiCOCODataSet
+ image_dir: train2017
+ anno_path: semi_annotations/instances_train2017.1@1-unlabeled.json
+ dataset_dir: dataset/coco
+ data_fields: ['image']
+ supervised: False
+
+EvalDataset:
+ !COCODataSet
+ image_dir: val2017
+ anno_path: annotations/instances_val2017.json
+ dataset_dir: dataset/coco
+ allow_empty: true
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/instances_val2017.json # also support txt (like VOC's label_list.txt)
+ dataset_dir: dataset/coco # if set, anno_path will be 'dataset_dir/anno_path'
diff --git a/PaddleDetection-release-2.6/configs/semi_det/_base_/coco_detection_percent_10.yml b/PaddleDetection-release-2.6/configs/semi_det/_base_/coco_detection_percent_10.yml
new file mode 100644
index 0000000000000000000000000000000000000000..58746017866851b72b0a10ca0069de30e3e88440
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/semi_det/_base_/coco_detection_percent_10.yml
@@ -0,0 +1,31 @@
+metric: COCO
+num_classes: 80
+
+# partial labeled COCO, use `SemiCOCODataSet` rather than `COCODataSet`
+TrainDataset:
+ !SemiCOCODataSet
+ image_dir: train2017
+ anno_path: semi_annotations/instances_train2017.1@10.json
+ dataset_dir: dataset/coco
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+# partial unlabeled COCO, use `SemiCOCODataSet` rather than `COCODataSet`
+UnsupTrainDataset:
+ !SemiCOCODataSet
+ image_dir: train2017
+ anno_path: semi_annotations/instances_train2017.1@10-unlabeled.json
+ dataset_dir: dataset/coco
+ data_fields: ['image']
+ supervised: False
+
+EvalDataset:
+ !COCODataSet
+ image_dir: val2017
+ anno_path: annotations/instances_val2017.json
+ dataset_dir: dataset/coco
+ allow_empty: true
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/instances_val2017.json # also support txt (like VOC's label_list.txt)
+ dataset_dir: dataset/coco # if set, anno_path will be 'dataset_dir/anno_path'
diff --git a/PaddleDetection-release-2.6/configs/semi_det/_base_/coco_detection_percent_5.yml b/PaddleDetection-release-2.6/configs/semi_det/_base_/coco_detection_percent_5.yml
new file mode 100644
index 0000000000000000000000000000000000000000..01d5fde1b22ef51d5a41ebfb83f10cd38683c7cd
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/semi_det/_base_/coco_detection_percent_5.yml
@@ -0,0 +1,31 @@
+metric: COCO
+num_classes: 80
+
+# partial labeled COCO, use `SemiCOCODataSet` rather than `COCODataSet`
+TrainDataset:
+ !SemiCOCODataSet
+ image_dir: train2017
+ anno_path: semi_annotations/instances_train2017.1@5.json
+ dataset_dir: dataset/coco
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+# partial unlabeled COCO, use `SemiCOCODataSet` rather than `COCODataSet`
+UnsupTrainDataset:
+ !SemiCOCODataSet
+ image_dir: train2017
+ anno_path: semi_annotations/instances_train2017.1@5-unlabeled.json
+ dataset_dir: dataset/coco
+ data_fields: ['image']
+ supervised: False
+
+EvalDataset:
+ !COCODataSet
+ image_dir: val2017
+ anno_path: annotations/instances_val2017.json
+ dataset_dir: dataset/coco
+ allow_empty: true
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/instances_val2017.json # also support txt (like VOC's label_list.txt)
+ dataset_dir: dataset/coco # if set, anno_path will be 'dataset_dir/anno_path'
diff --git a/PaddleDetection-release-2.6/configs/semi_det/baseline/README.md b/PaddleDetection-release-2.6/configs/semi_det/baseline/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..457ad7f7cdba66b83b55c1974b6867d9982dff86
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/semi_det/baseline/README.md
@@ -0,0 +1,81 @@
+# Supervised Baselines (Fully Supervised Models)
+
+## COCO Model Zoo
+
+### [FCOS](../../fcos)
+
+| Base Model | Supervised Data Ratio | Epochs (Iters) | mAP<sup>val</sup><br>0.5:0.95 | Download | Config |
+| :---------------: | :-------------: | :---------------: |:---------------------: |:--------: | :---------: |
+| FCOS ResNet50-FPN | 5% | 24 (8712) | 21.3 | [download](https://paddledet.bj.bcebos.com/models/fcos_r50_fpn_2x_coco_sup005.pdparams) | [config](fcos_r50_fpn_2x_coco_sup005.yml) |
+| FCOS ResNet50-FPN | 10% | 24 (17424) | 26.3 | [download](https://paddledet.bj.bcebos.com/models/fcos_r50_fpn_2x_coco_sup010.pdparams) | [config](fcos_r50_fpn_2x_coco_sup010.yml) |
+| FCOS ResNet50-FPN | full | 24 (175896) | 42.6 | [download](https://paddledet.bj.bcebos.com/models/fcos_r50_fpn_iou_multiscale_2x_coco.pdparams) | [config](../../fcos/fcos_r50_fpn_iou_multiscale_2x_coco.yml) |
+
+**Note:**
+ - The models above are trained with 8 GPUs by default, a total batch_size of 16, and an initial learning rate of 0.01. If you change the total batch_size, scale the learning rate linearly.
+
+
+### [PP-YOLOE+](../../ppyoloe)
+
+| Base Model | Supervised Data Ratio | Epochs (Iters) | mAP<sup>val</sup><br>0.5:0.95 | Download | Config |
+| :---------------: | :-------------: | :---------------: | :---------------------: |:--------: | :---------: |
+| PP-YOLOE+_s | 5% | 80 (7200) | 32.8 | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_s_80e_coco_sup005.pdparams) | [config](ppyoloe_plus_crn_s_80e_coco_sup005.yml) |
+| PP-YOLOE+_s | 10% | 80 (14480) | 35.3 | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_s_80e_coco_sup010.pdparams) | [config](ppyoloe_plus_crn_s_80e_coco_sup010.yml) |
+| PP-YOLOE+_s | full | 80 (146560) | 43.7 | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_s_80e_coco.pdparams) | [config](../../ppyoloe/ppyoloe_plus_crn_s_80e_coco.yml) |
+| PP-YOLOE+_l | 5% | 80 (7200) | 42.9 | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco_sup005.pdparams) | [config](ppyoloe_plus_crn_l_80e_coco_sup005.yml) |
+| PP-YOLOE+_l | 10% | 80 (14480) | 45.7 | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco_sup010.pdparams) | [config](ppyoloe_plus_crn_l_80e_coco_sup010.yml) |
+| PP-YOLOE+_l | full | 80 (146560) | 49.8 | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams) | [config](../../ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml) |
+
+**Note:**
+ - The models above are trained with 8 GPUs by default, a total batch_size of 64, and an initial learning rate of 0.001. If you change the total batch_size, scale the learning rate linearly.
+
+
+### [Faster R-CNN](../../faster_rcnn)
+
+| Base Model | Supervised Data Ratio | Epochs (Iters) | mAP<sup>val</sup><br>0.5:0.95 | Download | Config |
+| :---------------: | :-------------: | :---------------: | :---------------------: |:--------: | :---------: |
+| Faster R-CNN ResNet50-FPN | 5% | 24 (8712) | 20.7 | [download](https://paddledet.bj.bcebos.com/models/faster_rcnn_r50_fpn_2x_coco_sup005.pdparams) | [config](faster_rcnn_r50_fpn_2x_coco_sup005.yml) |
+| Faster R-CNN ResNet50-FPN | 10% | 24 (17424) | 25.6 | [download](https://paddledet.bj.bcebos.com/models/faster_rcnn_r50_fpn_2x_coco_sup010.pdparams) | [config](faster_rcnn_r50_fpn_2x_coco_sup010.yml) |
+| Faster R-CNN ResNet50-FPN | full | 24 (175896) | 40.0 | [download](https://paddledet.bj.bcebos.com/models/faster_rcnn_r50_fpn_2x_coco.pdparams) | [config](../../faster_rcnn/faster_rcnn_r50_fpn_2x_coco.yml) |
+
+**Note:**
+ - The models above are trained with 8 GPUs by default, a total batch_size of 16, and an initial learning rate of 0.02. If you change the total batch_size, scale the learning rate linearly.
+
+
+### [RetinaNet](../../retinanet)
+
+| Base Model | Supervised Data Ratio | Epochs (Iters) | mAP<sup>val</sup><br>0.5:0.95 | Download | Config |
+| :---------------: | :-------------: | :---------------: | :---------------------: |:--------: | :---------: |
+| RetinaNet ResNet50-FPN | 5% | 24 (8712) | 13.9 | [download](https://paddledet.bj.bcebos.com/models/retinanet_r50_fpn_2x_coco_sup005.pdparams) | [config](retinanet_r50_fpn_2x_coco_sup005.yml) |
+| RetinaNet ResNet50-FPN | 10% | 24 (17424) | 23.6 | [download](https://paddledet.bj.bcebos.com/models/retinanet_r50_fpn_2x_coco_sup010.pdparams) | [config](retinanet_r50_fpn_2x_coco_sup010.yml) |
+| RetinaNet ResNet50-FPN | full | 24 (175896) | 39.1 | [download](https://paddledet.bj.bcebos.com/models/retinanet_r50_fpn_2x_coco.pdparams) | [config](../../retinanet/retinanet_r50_fpn_2x_coco.yml) |
+
+**Note:**
+ - The models above are trained with 8 GPUs by default, a total batch_size of 16, and an initial learning rate of 0.01. If you change the total batch_size, scale the learning rate linearly.
+
+
+### Notes
+ - For the partially supervised COCO subsets, follow [Dataset Preparation](../README.md) to download and prepare the data. Each ratio's training set is a **subset sampled from train2017** at the given percentage, using the split with `fold` number 1 by default; `sup010` means training on 10% of the supervised data, `sup005` means 5%, and `full` means all of train2017. The validation set is always the full val2017;
+ - Sampling the supervised subset differently, or using a different `fold` number, can shift accuracy by up to about 0.5 mAP;
+ - PP-YOLOE+ uses Objects365 pretraining; all other models use ImageNet pretraining;
+ - Scale the learning rate linearly, following `lr_new = lr_default * (batch_size_new * GPU_number_new) / (batch_size_default * GPU_number_default)`.
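That linear-scaling rule can be written as a one-line helper (a sketch; `bs` here is the per-GPU batch size, matching the formula above):

```python
def scale_lr(lr_default, bs_default, gpus_default, bs_new, gpus_new):
    """Linear learning-rate scaling: lr scales with total batch size (bs * gpus)."""
    return lr_default * (bs_new * gpus_new) / (bs_default * gpus_default)
```

For example, halving the GPU count at the same per-GPU batch size halves the learning rate.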
+
+
+## Tutorial
+
+Write the following commands into a script file such as ```run.sh``` and run everything at once with ```sh run.sh```, or run the commands one by one:
+
+```bash
+model_type=semi_det/baseline
+job_name=ppyoloe_plus_crn_s_80e_coco_sup010 # change as needed, e.g. fcos_r50_fpn_2x_coco_sup010
+
+config=configs/${model_type}/${job_name}.yml
+log_dir=log_dir/${job_name}
+weights=output/${job_name}/model_final.pdparams
+
+# 1.training
+# CUDA_VISIBLE_DEVICES=0 python tools/train.py -c ${config}
+python -m paddle.distributed.launch --log_dir=${log_dir} --gpus 0,1,2,3,4,5,6,7 tools/train.py -c ${config} --eval --amp
+
+# 2.eval
+CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c ${config} -o weights=${weights}
+```
diff --git a/PaddleDetection-release-2.6/configs/semi_det/baseline/faster_rcnn_r50_fpn_2x_coco_sup005.yml b/PaddleDetection-release-2.6/configs/semi_det/baseline/faster_rcnn_r50_fpn_2x_coco_sup005.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d0e4cf7022b643ee45303efef553f0a6ba71c8a5
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/semi_det/baseline/faster_rcnn_r50_fpn_2x_coco_sup005.yml
@@ -0,0 +1,42 @@
+_BASE_: [
+ '../../faster_rcnn/faster_rcnn_r50_fpn_2x_coco.yml',
+]
+log_iter: 50
+snapshot_epoch: 2
+weights: output/faster_rcnn_r50_fpn_2x_coco_sup005/model_final
+
+
+TrainDataset:
+ !COCODataSet
+ image_dir: train2017
+ anno_path: semi_annotations/instances_train2017.1@5.json
+ dataset_dir: dataset/coco
+ data_fields: ['image', 'gt_bbox', 'gt_class']
+
+
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], interp: 2, keep_ratio: True}
+ - RandomFlip: {}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 2
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+
+
+epoch: 24
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.1
+ epochs: 1
diff --git a/PaddleDetection-release-2.6/configs/semi_det/baseline/faster_rcnn_r50_fpn_2x_coco_sup010.yml b/PaddleDetection-release-2.6/configs/semi_det/baseline/faster_rcnn_r50_fpn_2x_coco_sup010.yml
new file mode 100644
index 0000000000000000000000000000000000000000..80136304b9beaef77c995298824da06d2204d7eb
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/semi_det/baseline/faster_rcnn_r50_fpn_2x_coco_sup010.yml
@@ -0,0 +1,42 @@
+_BASE_: [
+ '../../faster_rcnn/faster_rcnn_r50_fpn_2x_coco.yml',
+]
+log_iter: 50
+snapshot_epoch: 2
+weights: output/faster_rcnn_r50_fpn_2x_coco_sup010/model_final
+
+
+TrainDataset:
+ !COCODataSet
+ image_dir: train2017
+ anno_path: semi_annotations/instances_train2017.1@10.json
+ dataset_dir: dataset/coco
+ data_fields: ['image', 'gt_bbox', 'gt_class']
+
+
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], interp: 2, keep_ratio: True}
+ - RandomFlip: {}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 2
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+
+
+epoch: 24
+LearningRate:
+ base_lr: 0.02
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.1
+ epochs: 1
diff --git a/PaddleDetection-release-2.6/configs/semi_det/baseline/fcos_r50_fpn_2x_coco_sup005.yml b/PaddleDetection-release-2.6/configs/semi_det/baseline/fcos_r50_fpn_2x_coco_sup005.yml
new file mode 100644
index 0000000000000000000000000000000000000000..de9982a8c3a1c17dc69ed17cc7f9f4099cf58285
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/semi_det/baseline/fcos_r50_fpn_2x_coco_sup005.yml
@@ -0,0 +1,26 @@
+_BASE_: [
+ '../../fcos/fcos_r50_fpn_iou_multiscale_2x_coco.yml',
+]
+log_iter: 50
+snapshot_epoch: 2
+weights: output/fcos_r50_fpn_2x_coco_sup005/model_final
+
+
+TrainDataset:
+ !COCODataSet
+ image_dir: train2017
+ anno_path: semi_annotations/instances_train2017.1@5.json
+ dataset_dir: dataset/coco
+ data_fields: ['image', 'gt_bbox', 'gt_class']
+
+
+epoch: 24
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.001
+ epochs: 1
diff --git a/PaddleDetection-release-2.6/configs/semi_det/baseline/fcos_r50_fpn_2x_coco_sup010.yml b/PaddleDetection-release-2.6/configs/semi_det/baseline/fcos_r50_fpn_2x_coco_sup010.yml
new file mode 100644
index 0000000000000000000000000000000000000000..3636ae8bbc9eafcd8edcc6d0f6ad3262b34ebb8f
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/semi_det/baseline/fcos_r50_fpn_2x_coco_sup010.yml
@@ -0,0 +1,26 @@
+_BASE_: [
+ '../../fcos/fcos_r50_fpn_iou_multiscale_2x_coco.yml',
+]
+log_iter: 50
+snapshot_epoch: 2
+weights: output/fcos_r50_fpn_2x_coco_sup010/model_final
+
+
+TrainDataset:
+ !COCODataSet
+ image_dir: train2017
+ anno_path: semi_annotations/instances_train2017.1@10.json
+ dataset_dir: dataset/coco
+ data_fields: ['image', 'gt_bbox', 'gt_class']
+
+
+epoch: 24
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.001
+ epochs: 1
diff --git a/PaddleDetection-release-2.6/configs/semi_det/baseline/ppyoloe_plus_crn_l_80e_coco_sup005.yml b/PaddleDetection-release-2.6/configs/semi_det/baseline/ppyoloe_plus_crn_l_80e_coco_sup005.yml
new file mode 100644
index 0000000000000000000000000000000000000000..4dd4a898e4afaed66c0f3bf27b1991316d965999
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/semi_det/baseline/ppyoloe_plus_crn_l_80e_coco_sup005.yml
@@ -0,0 +1,29 @@
+_BASE_: [
+ '../../ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml',
+]
+log_iter: 50
+snapshot_epoch: 5
+weights: output/ppyoloe_plus_crn_l_80e_coco_sup005/model_final
+
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_l_obj365_pretrained.pdparams
+depth_mult: 1.0
+width_mult: 1.0
+
+
+TrainDataset:
+ !COCODataSet
+ image_dir: train2017
+ anno_path: semi_annotations/instances_train2017.1@5.json
+ dataset_dir: dataset/coco
+ data_fields: ['image', 'gt_bbox', 'gt_class']
+
+
+epoch: 80
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !CosineDecay
+ max_epochs: 96
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 5
diff --git a/PaddleDetection-release-2.6/configs/semi_det/baseline/ppyoloe_plus_crn_l_80e_coco_sup010.yml b/PaddleDetection-release-2.6/configs/semi_det/baseline/ppyoloe_plus_crn_l_80e_coco_sup010.yml
new file mode 100644
index 0000000000000000000000000000000000000000..647252175cc3aaa9dbb8edb6c5b48b7d4568cd5b
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/semi_det/baseline/ppyoloe_plus_crn_l_80e_coco_sup010.yml
@@ -0,0 +1,29 @@
+_BASE_: [
+ '../../ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml',
+]
+log_iter: 50
+snapshot_epoch: 5
+weights: output/ppyoloe_plus_crn_l_80e_coco_sup010/model_final
+
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_l_obj365_pretrained.pdparams
+depth_mult: 1.0
+width_mult: 1.0
+
+
+TrainDataset:
+ !COCODataSet
+ image_dir: train2017
+ anno_path: semi_annotations/instances_train2017.1@10.json
+ dataset_dir: dataset/coco
+ data_fields: ['image', 'gt_bbox', 'gt_class']
+
+
+epoch: 80
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !CosineDecay
+ max_epochs: 96
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 5
diff --git a/PaddleDetection-release-2.6/configs/semi_det/baseline/ppyoloe_plus_crn_s_80e_coco_sup005.yml b/PaddleDetection-release-2.6/configs/semi_det/baseline/ppyoloe_plus_crn_s_80e_coco_sup005.yml
new file mode 100644
index 0000000000000000000000000000000000000000..88de96dcc44b7e5c2e7bee42c14ab6358ff308d1
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/semi_det/baseline/ppyoloe_plus_crn_s_80e_coco_sup005.yml
@@ -0,0 +1,29 @@
+_BASE_: [
+ '../../ppyoloe/ppyoloe_plus_crn_s_80e_coco.yml',
+]
+log_iter: 50
+snapshot_epoch: 5
+weights: output/ppyoloe_plus_crn_s_80e_coco_sup005/model_final
+
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_s_obj365_pretrained.pdparams
+depth_mult: 0.33
+width_mult: 0.50
+
+
+TrainDataset:
+ !COCODataSet
+ image_dir: train2017
+ anno_path: semi_annotations/instances_train2017.1@5.json
+ dataset_dir: dataset/coco
+ data_fields: ['image', 'gt_bbox', 'gt_class']
+
+
+epoch: 80
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !CosineDecay
+ max_epochs: 96
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 5
diff --git a/PaddleDetection-release-2.6/configs/semi_det/baseline/ppyoloe_plus_crn_s_80e_coco_sup010.yml b/PaddleDetection-release-2.6/configs/semi_det/baseline/ppyoloe_plus_crn_s_80e_coco_sup010.yml
new file mode 100644
index 0000000000000000000000000000000000000000..aeb9435a0fee9ab502185f81a6b3710443471c89
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/semi_det/baseline/ppyoloe_plus_crn_s_80e_coco_sup010.yml
@@ -0,0 +1,29 @@
+_BASE_: [
+ '../../ppyoloe/ppyoloe_plus_crn_s_80e_coco.yml',
+]
+log_iter: 50
+snapshot_epoch: 5
+weights: output/ppyoloe_plus_crn_s_80e_coco_sup010/model_final
+
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_s_obj365_pretrained.pdparams
+depth_mult: 0.33
+width_mult: 0.50
+
+
+TrainDataset:
+ !COCODataSet
+ image_dir: train2017
+ anno_path: semi_annotations/instances_train2017.1@10.json
+ dataset_dir: dataset/coco
+ data_fields: ['image', 'gt_bbox', 'gt_class']
+
+
+epoch: 80
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !CosineDecay
+ max_epochs: 96
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 5
diff --git a/PaddleDetection-release-2.6/configs/semi_det/baseline/retinanet_r50_fpn_2x_coco_sup005.yml b/PaddleDetection-release-2.6/configs/semi_det/baseline/retinanet_r50_fpn_2x_coco_sup005.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d901ea26e9c811395b35f447a74664de9eb72c91
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/semi_det/baseline/retinanet_r50_fpn_2x_coco_sup005.yml
@@ -0,0 +1,26 @@
+_BASE_: [
+ '../../retinanet/retinanet_r50_fpn_2x_coco.yml',
+]
+log_iter: 50
+snapshot_epoch: 2
+weights: output/retinanet_r50_fpn_2x_coco_sup005/model_final
+
+
+TrainDataset:
+ !COCODataSet
+ image_dir: train2017
+ anno_path: semi_annotations/instances_train2017.1@5.json
+ dataset_dir: dataset/coco
+ data_fields: ['image', 'gt_bbox', 'gt_class']
+
+
+epoch: 24
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.001
+ epochs: 1
diff --git a/PaddleDetection-release-2.6/configs/semi_det/baseline/retinanet_r50_fpn_2x_coco_sup010.yml b/PaddleDetection-release-2.6/configs/semi_det/baseline/retinanet_r50_fpn_2x_coco_sup010.yml
new file mode 100644
index 0000000000000000000000000000000000000000..5480f3c57549f758f94fea5f5cccfc53142aa663
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/semi_det/baseline/retinanet_r50_fpn_2x_coco_sup010.yml
@@ -0,0 +1,26 @@
+_BASE_: [
+ '../../retinanet/retinanet_r50_fpn_2x_coco.yml',
+]
+log_iter: 50
+snapshot_epoch: 2
+weights: output/retinanet_r50_fpn_2x_coco_sup010/model_final
+
+
+TrainDataset:
+ !COCODataSet
+ image_dir: train2017
+ anno_path: semi_annotations/instances_train2017.1@10.json
+ dataset_dir: dataset/coco
+ data_fields: ['image', 'gt_bbox', 'gt_class']
+
+
+epoch: 24
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.001
+ epochs: 1
diff --git a/PaddleDetection-release-2.6/configs/semi_det/denseteacher/README.md b/PaddleDetection-release-2.6/configs/semi_det/denseteacher/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..7c629cc7c7c45cc8e23dc7bce6e3074f28abe585
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/semi_det/denseteacher/README.md
@@ -0,0 +1,101 @@
+简体中文 | [English](README_en.md)
+
+# Dense Teacher: Dense Pseudo-Labels for Semi-supervised Object Detection
+
+## FCOS Model Zoo
+
+| Model | Supervised Data Ratio | Sup Baseline | Sup Epochs (Iters) | Sup mAP<sup>val</sup><br>0.5:0.95 | Semi mAP<sup>val</sup><br>0.5:0.95 | Semi Epochs (Iters) | Download | Config |
+| :------------: | :---------: | :---------------------: | :---------------------: |:---------------------------: |:----------------------------: | :------------------: |:--------: |:----------: |
+| DenseTeacher-FCOS | 5% | [sup_config](../baseline/fcos_r50_fpn_2x_coco_sup005.yml) | 24 (8712) | 21.3 | **30.6** | 240 (87120) | [download](https://paddledet.bj.bcebos.com/models/denseteacher_fcos_r50_fpn_coco_semi005.pdparams) | [config](./denseteacher_fcos_r50_fpn_coco_semi005.yml) |
+| DenseTeacher-FCOS | 10% | [sup_config](../baseline/fcos_r50_fpn_2x_coco_sup010.yml) | 24 (17424) | 26.3 | **35.1** | 240 (174240) | [download](https://paddledet.bj.bcebos.com/models/denseteacher_fcos_r50_fpn_coco_semi010.pdparams) | [config](./denseteacher_fcos_r50_fpn_coco_semi010.yml) |
+| DenseTeacher-FCOS(LSJ)| 10% | [sup_config](../baseline/fcos_r50_fpn_2x_coco_sup010.yml) | 24 (17424) | 26.3 | **37.1(LSJ)** | 240 (174240) | [download](https://paddledet.bj.bcebos.com/models/denseteacher_fcos_r50_fpn_coco_semi010_lsj.pdparams) | [config](./denseteacher_fcos_r50_fpn_coco_semi010_lsj.yml) |
+| DenseTeacher-FCOS | 100% (full) | [sup_config](../../fcos/fcos_r50_fpn_iou_multiscale_2x_coco.yml) | 24 (175896) | 42.6 | **44.2** | 24 (175896) | [download](https://paddledet.bj.bcebos.com/models/denseteacher_fcos_r50_fpn_coco_full.pdparams) | [config](./denseteacher_fcos_r50_fpn_coco_full.yml) |
+
+
+**Note:**
+ - The models above are trained with 8 GPUs by default, with a total batch_size of 16 for supervised data and likewise 16 for unsupervised data, and an initial learning rate of 0.01. If you change the total batch_size, scale the learning rate linearly;
+ - The **supervised data ratio** is the percentage of the full COCO train2017 training set used as labeled data. The unlabeled COCO data is generally the same proportion, and its images do not overlap with the labeled images;
+ - `Semi Epochs (Iters)` are the epochs (iterations) of the **semi-supervised** model. With a **custom dataset**, convert iterations to epochs for your data accordingly, ideally keeping the total iterations close to the COCO settings;
+ - `Sup mAP` is the accuracy of the model trained **with supervised data only**; see the **base detector's config file** and [baseline](../baseline);
+ - `Semi mAP` is the accuracy of the **semi-supervised** model; the download and config links both point to the **semi-supervised model**;
+ - `LSJ` stands for **large-scale jittering**, i.e. multi-scale training over a wider range of scales, which can further improve accuracy but slows training down;
+ - For an explanation of the semi-supervised detection configs, see the [documentation](../README.md/#半监督检测配置);
+ - The original `Dense Teacher` paper uses `R50-va-caffe` pretraining, while PaddleDetection defaults to `R50-vb`. Using an `R50-vd` model pretrained with [SSLD](../../../docs/feature_models/SSLD_PRETRAINED_MODEL.md) can further boost detection accuracy significantly; the backbone section of the config must then be changed accordingly, e.g.:
+ ```python
+ pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_v2_pretrained.pdparams
+ ResNet:
+ depth: 50
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [1, 2, 3]
+ num_stages: 4
+ lr_mult_list: [0.05, 0.05, 0.1, 0.15]
+ ```
+
+
+## PP-YOLOE+ Model Zoo
+
+| Model | Supervised Data Ratio | Sup Baseline | Sup Epochs (Iters) | Sup mAP<sup>val</sup><br>0.5:0.95 | Semi mAP<sup>val</sup><br>0.5:0.95 | Semi Epochs (Iters) | Download | Config |
+| :------------: | :---------: | :---------------------: | :---------------------: |:---------------------------: |:----------------------------: | :------------------: |:--------: |:----------: |
+| DenseTeacher-PPYOLOE+_s | 5% | [sup_config](../baseline/ppyoloe_plus_crn_s_80e_coco_sup005.yml) | 80 (7200) | 32.8 | **34.0** | 200 (36200) | [download](https://paddledet.bj.bcebos.com/models/denseteacher_ppyoloe_plus_crn_s_coco_semi005.pdparams) | [config](./denseteacher_ppyoloe_plus_crn_s_coco_semi005.yml) |
+| DenseTeacher-PPYOLOE+_s | 10% | [sup_config](../baseline/ppyoloe_plus_crn_s_80e_coco_sup010.yml) | 80 (14480) | 35.3 | **37.5** | 200 (36200) | [download](https://paddledet.bj.bcebos.com/models/denseteacher_ppyoloe_plus_crn_s_coco_semi010.pdparams) | [config](./denseteacher_ppyoloe_plus_crn_s_coco_semi010.yml) |
+| DenseTeacher-PPYOLOE+_l | 5% | [sup_config](../baseline/ppyoloe_plus_crn_l_80e_coco_sup005.yml) | 80 (7200) | 42.9 | **45.4** | 200 (36200) | [download](https://paddledet.bj.bcebos.com/models/denseteacher_ppyoloe_plus_crn_l_coco_semi005.pdparams) | [config](./denseteacher_ppyoloe_plus_crn_l_coco_semi005.yml) |
+| DenseTeacher-PPYOLOE+_l | 10% | [sup_config](../baseline/ppyoloe_plus_crn_l_80e_coco_sup010.yml) | 80 (14480) | 45.7 | **47.4** | 200 (36200) | [download](https://paddledet.bj.bcebos.com/models/denseteacher_ppyoloe_plus_crn_l_coco_semi010.pdparams) | [config](./denseteacher_ppyoloe_plus_crn_l_coco_semi010.yml) |
+
+
+## Usage
+
+The semi-supervised detection config file is only required for training; evaluation, inference, and deployment can also be run with the base detector's config file.
+
+### Training
+
+```bash
+# single-GPU training (not recommended; adjust the learning rate linearly to match)
+CUDA_VISIBLE_DEVICES=0 python tools/train.py -c configs/semi_det/denseteacher/denseteacher_fcos_r50_fpn_coco_semi010.yml --eval
+
+# multi-GPU training
+python -m paddle.distributed.launch --log_dir=denseteacher_fcos_semi010/ --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/semi_det/denseteacher/denseteacher_fcos_r50_fpn_coco_semi010.yml --eval
+```
+
+### Evaluation
+
+```bash
+CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/semi_det/denseteacher/denseteacher_fcos_r50_fpn_coco_semi010.yml -o weights=output/denseteacher_fcos_r50_fpn_coco_semi010/model_final.pdparams
+```
+
+### Inference
+
+```bash
+CUDA_VISIBLE_DEVICES=0 python tools/infer.py -c configs/semi_det/denseteacher/denseteacher_fcos_r50_fpn_coco_semi010.yml -o weights=output/denseteacher_fcos_r50_fpn_coco_semi010/model_final.pdparams --infer_img=demo/000000014439.jpg
+```
+
+### Deployment
+
+Deployment can use either the semi-supervised detection config file or the base detector's config file.
+
+```bash
+# Export the model
+CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/semi_det/denseteacher/denseteacher_fcos_r50_fpn_coco_semi010.yml -o weights=https://paddledet.bj.bcebos.com/models/denseteacher_fcos_r50_fpn_coco_semi010.pdparams
+
+# Run inference with the exported model
+CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/denseteacher_fcos_r50_fpn_coco_semi010 --image_file=demo/000000014439_640x640.jpg --device=GPU
+
+# Benchmark inference speed
+CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/denseteacher_fcos_r50_fpn_coco_semi010 --image_file=demo/000000014439_640x640.jpg --device=GPU --run_benchmark=True # --run_mode=trt_fp16
+
+# Export to ONNX
+paddle2onnx --model_dir output_inference/denseteacher_fcos_r50_fpn_coco_semi010/ --model_filename model.pdmodel --params_filename model.pdiparams --opset_version 12 --save_file denseteacher_fcos_r50_fpn_coco_semi010.onnx
+```
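The exported model expects inputs preprocessed the same way as the `EvalReader` in these configs: resize with `keep_ratio` to `[800, 1333]`, ImageNet normalization, HWC→CHW permute, then pad to a stride-32 multiple. A dependency-light sketch of that pipeline in NumPy (nearest-neighbor resize stands in for the reader's interpolation; all names here are illustrative):

```python
import numpy as np

def preprocess(img, target_size=(800, 1333), stride=32):
    """Mirror EvalReader: keep-ratio resize, ImageNet-normalize, HWC->CHW, pad to stride."""
    h, w = img.shape[:2]
    # Short side toward 800, long side capped at 1333 (keep_ratio behavior).
    scale = min(target_size[0] / min(h, w), target_size[1] / max(h, w))
    nh, nw = int(round(h * scale)), int(round(w * scale))
    # Nearest-neighbor resize keeps the sketch dependency-free.
    ys = (np.arange(nh) / scale).astype(int).clip(0, h - 1)
    xs = (np.arange(nw) / scale).astype(int).clip(0, w - 1)
    img = img[ys][:, xs]
    mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
    std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
    img = (img.astype(np.float32) / 255.0 - mean) / std
    img = img.transpose(2, 0, 1)  # HWC -> CHW
    # PadBatch: pad spatial dims up to a multiple of the FPN stride.
    ph = (nh + stride - 1) // stride * stride
    pw = (nw + stride - 1) // stride * stride
    out = np.zeros((3, ph, pw), dtype=np.float32)
    out[:, :nh, :nw] = img
    return out[None]  # add batch dim

x = preprocess(np.zeros((720, 1280, 3), dtype=np.uint8))
print(x.shape)  # (1, 3, 768, 1344)
```

The resulting array can be fed to the ONNX model exported above; the real deploy pipeline also passes a `scale_factor` input, which this sketch omits.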
+
+
+## Citation
+
+```
+@article{denseteacher2022,
+  title={Dense Teacher: Dense Pseudo-Labels for Semi-supervised Object Detection},
+  author={Zhou, Hongyu and Ge, Zheng and Liu, Songtao and Mao, Weixin and Li, Zeming and Yu, Haiyan and Sun, Jian},
+  journal={arXiv preprint arXiv:2207.02541},
+  year={2022}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/semi_det/denseteacher/denseteacher_fcos_r50_fpn_coco_full.yml b/PaddleDetection-release-2.6/configs/semi_det/denseteacher/denseteacher_fcos_r50_fpn_coco_full.yml
new file mode 100644
index 0000000000000000000000000000000000000000..1b15b222387dfcd94e2f933c34d6810ddace4f45
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/semi_det/denseteacher/denseteacher_fcos_r50_fpn_coco_full.yml
@@ -0,0 +1,166 @@
+_BASE_: [
+ 'denseteacher_fcos_r50_fpn_coco_semi010.yml',
+ '../_base_/coco_detection_full.yml',
+]
+log_iter: 100
+snapshot_epoch: 2
+epochs: &epochs 24
+weights: output/denseteacher_fcos_r50_fpn_coco_full/model_final
+
+
+### pretrain and warmup config; choose one and comment out the other
+# pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/fcos_r50_fpn_iou_multiscale_2x_coco.pdparams # mAP=42.6
+# semi_start_iters: 0
+# ema_start_iters: 0
+# use_warmup: &use_warmup False
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_cos_pretrained.pdparams
+semi_start_iters: 5000
+ema_start_iters: 3000
+use_warmup: &use_warmup True
+
+
+### global config
+use_simple_ema: True
+ema_decay: 0.9996
+ssod_method: DenseTeacher
+DenseTeacher:
+ train_cfg:
+ sup_weight: 1.0
+ unsup_weight: 1.0
+ loss_weight: {distill_loss_cls: 2.0, distill_loss_box: 1.0, distill_loss_quality: 1.0}
+ concat_sup_data: True
+ suppress: linear
+ ratio: 0.01
+ gamma: 2.0
+ test_cfg:
+ inference_on: teacher
+
+
+### reader config
+worker_num: 2
+SemiTrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], keep_ratio: True, interp: 1}
+ - RandomFlip: {}
+ weak_aug:
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: true}
+ strong_aug:
+ - StrongAugImage: {transforms: [
+ RandomColorJitter: {prob: 0.8, brightness: 0.4, contrast: 0.4, saturation: 0.4, hue: 0.1},
+ RandomErasingCrop: {},
+ RandomGaussianBlur: {prob: 0.5, sigma: [0.1, 2.0]},
+ RandomGrayscale: {prob: 0.2},
+ ]}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: true}
+ sup_batch_transforms:
+ - Permute: {}
+ - PadBatch: {pad_to_stride: 32}
+ - Gt2FCOSTarget:
+ object_sizes_boundary: [64, 128, 256, 512]
+ center_sampling_radius: 1.5
+ downsample_ratios: [8, 16, 32, 64, 128]
+ num_shift: 0.5
+ norm_reg_targets: True
+ unsup_batch_transforms:
+ - Permute: {}
+ - PadBatch: {pad_to_stride: 32}
+ sup_batch_size: 2
+ unsup_batch_size: 2
+ shuffle: True
+ drop_last: True
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [800, 1333], keep_ratio: True, interp: 1}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [800, 1333], keep_ratio: True, interp: 1}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ fuse_normalize: True
+
+
+### model config
+architecture: FCOS
+FCOS:
+ backbone: ResNet
+ neck: FPN
+ fcos_head: FCOSHead
+
+ResNet:
+ depth: 50
+ variant: 'b'
+ norm_type: bn
+ freeze_at: 0 # res2
+ return_idx: [1, 2, 3]
+ num_stages: 4
+
+FPN:
+ out_channel: 256
+ spatial_scales: [0.125, 0.0625, 0.03125]
+ extra_stage: 2
+ has_extra_convs: True
+ use_c5: False
+
+FCOSHead:
+ fcos_feat:
+ name: FCOSFeat
+ feat_in: 256
+ feat_out: 256
+ num_convs: 4
+ norm_type: "gn"
+ use_dcn: False
+ fpn_stride: [8, 16, 32, 64, 128]
+ prior_prob: 0.01
+ norm_reg_targets: True
+ centerness_on_reg: True
+ num_shift: 0.5
+ fcos_loss:
+ name: FCOSLoss
+ loss_alpha: 0.25
+ loss_gamma: 2.0
+ iou_loss_type: "giou"
+ reg_weights: 1.0
+ quality: "iou"
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.025
+ nms_threshold: 0.6
+
+
+### other config
+epoch: *epochs
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [*epochs]
+ use_warmup: *use_warmup
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 1000
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
+ clip_grad_by_value: 1.0
diff --git a/PaddleDetection-release-2.6/configs/semi_det/denseteacher/denseteacher_fcos_r50_fpn_coco_semi005.yml b/PaddleDetection-release-2.6/configs/semi_det/denseteacher/denseteacher_fcos_r50_fpn_coco_semi005.yml
new file mode 100644
index 0000000000000000000000000000000000000000..3efa1a04b82351673cd72a68415a7115e9759b38
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/semi_det/denseteacher/denseteacher_fcos_r50_fpn_coco_semi005.yml
@@ -0,0 +1,164 @@
+_BASE_: [
+ '../../fcos/fcos_r50_fpn_iou_multiscale_2x_coco.yml',
+ '../_base_/coco_detection_percent_5.yml',
+]
+log_iter: 20
+snapshot_epoch: 5
+epochs: &epochs 240 # 480 gives better results
+weights: output/denseteacher_fcos_r50_fpn_coco_semi005/model_final
+
+
+### pretrain and warmup config; choose one and comment out the other
+# pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/fcos_r50_fpn_2x_coco_sup005.pdparams # mAP=21.3
+# semi_start_iters: 0
+# ema_start_iters: 0
+# use_warmup: &use_warmup False
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_cos_pretrained.pdparams
+semi_start_iters: 5000
+ema_start_iters: 3000
+use_warmup: &use_warmup True
+
+
+### global config
+use_simple_ema: True
+ema_decay: 0.9996
+ssod_method: DenseTeacher
+DenseTeacher:
+ train_cfg:
+ sup_weight: 1.0
+ unsup_weight: 1.0
+ loss_weight: {distill_loss_cls: 4.0, distill_loss_box: 1.0, distill_loss_quality: 1.0}
+ concat_sup_data: True
+ suppress: linear
+ ratio: 0.01
+ gamma: 2.0
+ test_cfg:
+ inference_on: teacher
+
+
+### reader config
+worker_num: 2
+SemiTrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], keep_ratio: True, interp: 1}
+ - RandomFlip: {}
+ weak_aug:
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: true}
+ strong_aug:
+ - StrongAugImage: {transforms: [
+ RandomColorJitter: {prob: 0.8, brightness: 0.4, contrast: 0.4, saturation: 0.4, hue: 0.1},
+ RandomErasingCrop: {},
+ RandomGaussianBlur: {prob: 0.5, sigma: [0.1, 2.0]},
+ RandomGrayscale: {prob: 0.2},
+ ]}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: true}
+ sup_batch_transforms:
+ - Permute: {}
+ - PadBatch: {pad_to_stride: 32}
+ - Gt2FCOSTarget:
+ object_sizes_boundary: [64, 128, 256, 512]
+ center_sampling_radius: 1.5
+ downsample_ratios: [8, 16, 32, 64, 128]
+ norm_reg_targets: True
+ unsup_batch_transforms:
+ - Permute: {}
+ - PadBatch: {pad_to_stride: 32}
+ sup_batch_size: 2
+ unsup_batch_size: 2
+ shuffle: True
+ drop_last: True
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [800, 1333], keep_ratio: True, interp: 1}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [800, 1333], keep_ratio: True, interp: 1}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ fuse_normalize: True
+
+
+### model config
+architecture: FCOS
+FCOS:
+ backbone: ResNet
+ neck: FPN
+ fcos_head: FCOSHead
+
+ResNet:
+ depth: 50
+ variant: 'b'
+ norm_type: bn
+ freeze_at: 0 # res2
+ return_idx: [1, 2, 3]
+ num_stages: 4
+
+FPN:
+ out_channel: 256
+ spatial_scales: [0.125, 0.0625, 0.03125]
+ extra_stage: 2
+ has_extra_convs: True
+ use_c5: False
+
+FCOSHead:
+ fcos_feat:
+ name: FCOSFeat
+ feat_in: 256
+ feat_out: 256
+ num_convs: 4
+ norm_type: "gn"
+ use_dcn: False
+ fpn_stride: [8, 16, 32, 64, 128]
+ prior_prob: 0.01
+ norm_reg_targets: True
+ centerness_on_reg: True
+ fcos_loss:
+ name: FCOSLoss
+ loss_alpha: 0.25
+ loss_gamma: 2.0
+ iou_loss_type: "giou"
+ reg_weights: 1.0
+ quality: "iou"
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.025
+ nms_threshold: 0.6
+
+
+### other config
+epoch: *epochs
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [*epochs]
+ use_warmup: *use_warmup
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 1000
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
+ clip_grad_by_value: 1.0
diff --git a/PaddleDetection-release-2.6/configs/semi_det/denseteacher/denseteacher_fcos_r50_fpn_coco_semi010.yml b/PaddleDetection-release-2.6/configs/semi_det/denseteacher/denseteacher_fcos_r50_fpn_coco_semi010.yml
new file mode 100644
index 0000000000000000000000000000000000000000..76d884ca20fb3cd819c5dc5aed954df4cfad0848
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/semi_det/denseteacher/denseteacher_fcos_r50_fpn_coco_semi010.yml
@@ -0,0 +1,169 @@
+_BASE_: [
+ '../../fcos/fcos_r50_fpn_iou_multiscale_2x_coco.yml',
+ '../_base_/coco_detection_percent_10.yml',
+]
+log_iter: 50
+snapshot_epoch: 5
+epochs: &epochs 240
+weights: output/denseteacher_fcos_r50_fpn_coco_semi010/model_final
+
+
+### pretrain and warmup config; choose one and comment out the other
+# pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/fcos_r50_fpn_2x_coco_sup010.pdparams # mAP=26.3
+# semi_start_iters: 0
+# ema_start_iters: 0
+# use_warmup: &use_warmup False
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_cos_pretrained.pdparams
+semi_start_iters: 5000
+ema_start_iters: 3000
+use_warmup: &use_warmup True
+
+
+### global config
+use_simple_ema: True
+ema_decay: 0.9996
+ssod_method: DenseTeacher
+DenseTeacher:
+ train_cfg:
+ sup_weight: 1.0
+ unsup_weight: 1.0
+ loss_weight: {distill_loss_cls: 4.0, distill_loss_box: 1.0, distill_loss_quality: 1.0}
+ concat_sup_data: True
+ suppress: linear
+ ratio: 0.01
+ gamma: 2.0
+ test_cfg:
+ inference_on: teacher
+
+
+### reader config
+worker_num: 2
+SemiTrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], keep_ratio: True, interp: 1}
+ - RandomFlip: {}
+ weak_aug:
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: true}
+ strong_aug:
+ - StrongAugImage: {transforms: [
+ RandomColorJitter: {prob: 0.8, brightness: 0.4, contrast: 0.4, saturation: 0.4, hue: 0.1},
+ RandomErasingCrop: {},
+ RandomGaussianBlur: {prob: 0.5, sigma: [0.1, 2.0]},
+ RandomGrayscale: {prob: 0.2},
+ ]}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: true}
+ sup_batch_transforms:
+ - Permute: {}
+ - PadBatch: {pad_to_stride: 32}
+ - Gt2FCOSTarget:
+ object_sizes_boundary: [64, 128, 256, 512]
+ center_sampling_radius: 1.5
+ downsample_ratios: [8, 16, 32, 64, 128]
+ num_shift: 0. # default 0.5
+ multiply_strides_reg_targets: False
+ norm_reg_targets: True
+ unsup_batch_transforms:
+ - Permute: {}
+ - PadBatch: {pad_to_stride: 32}
+ sup_batch_size: 2
+ unsup_batch_size: 2
+ shuffle: True
+ drop_last: True
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [800, 1333], keep_ratio: True, interp: 1}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [800, 1333], keep_ratio: True, interp: 1}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ fuse_normalize: True
+
+
+### model config
+architecture: FCOS
+FCOS:
+ backbone: ResNet
+ neck: FPN
+ fcos_head: FCOSHead
+
+ResNet:
+ depth: 50
+ variant: 'b'
+ norm_type: bn
+ freeze_at: 0 # res2
+ return_idx: [1, 2, 3]
+ num_stages: 4
+
+FPN:
+ out_channel: 256
+ spatial_scales: [0.125, 0.0625, 0.03125]
+ extra_stage: 2
+ has_extra_convs: True
+ use_c5: False
+
+FCOSHead:
+ fcos_feat:
+ name: FCOSFeat
+ feat_in: 256
+ feat_out: 256
+ num_convs: 4
+ norm_type: "gn"
+ use_dcn: False
+ fpn_stride: [8, 16, 32, 64, 128]
+ prior_prob: 0.01
+ norm_reg_targets: True
+ centerness_on_reg: True
+ num_shift: 0. # default 0.5
+ multiply_strides_reg_targets: False
+ sqrt_score: False
+ fcos_loss:
+ name: FCOSLoss
+ loss_alpha: 0.25
+ loss_gamma: 2.0
+ iou_loss_type: "giou"
+ reg_weights: 1.0
+ quality: "iou"
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.025
+ nms_threshold: 0.6
+
+
+### other config
+epoch: *epochs
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [*epochs]
+ use_warmup: *use_warmup
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 1000
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
+ clip_grad_by_value: 1.0
diff --git a/PaddleDetection-release-2.6/configs/semi_det/denseteacher/denseteacher_fcos_r50_fpn_coco_semi010_lsj.yml b/PaddleDetection-release-2.6/configs/semi_det/denseteacher/denseteacher_fcos_r50_fpn_coco_semi010_lsj.yml
new file mode 100644
index 0000000000000000000000000000000000000000..32107c93f86dec016880ec4e6ba53adff21e47a1
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/semi_det/denseteacher/denseteacher_fcos_r50_fpn_coco_semi010_lsj.yml
@@ -0,0 +1,44 @@
+_BASE_: [
+ 'denseteacher_fcos_r50_fpn_coco_semi010.yml',
+]
+log_iter: 50
+snapshot_epoch: 5
+epochs: &epochs 240
+weights: output/denseteacher_fcos_r50_fpn_coco_semi010_lsj/model_final
+
+
+### reader config
+worker_num: 2
+SemiTrainReader:
+ sample_transforms:
+ - Decode: {}
+ # large-scale jittering
+ - RandomResize: {target_size: [[400, 1333], [1200, 1333]], keep_ratio: True, interp: 1, random_range: True}
+ - RandomFlip: {}
+ weak_aug:
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: true}
+ strong_aug:
+ - StrongAugImage: {transforms: [
+ RandomColorJitter: {prob: 0.8, brightness: 0.4, contrast: 0.4, saturation: 0.4, hue: 0.1},
+ RandomErasingCrop: {},
+ RandomGaussianBlur: {prob: 0.5, sigma: [0.1, 2.0]},
+ RandomGrayscale: {prob: 0.2},
+ ]}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: true}
+ sup_batch_transforms:
+ - Permute: {}
+ - PadBatch: {pad_to_stride: 32}
+ - Gt2FCOSTarget:
+ object_sizes_boundary: [64, 128, 256, 512]
+ center_sampling_radius: 1.5
+ downsample_ratios: [8, 16, 32, 64, 128]
+ num_shift: 0. # default 0.5
+ multiply_strides_reg_targets: False
+ norm_reg_targets: True
+ unsup_batch_transforms:
+ - Permute: {}
+ - PadBatch: {pad_to_stride: 32}
+ sup_batch_size: 2
+ unsup_batch_size: 2
+ shuffle: True
+ drop_last: True
diff --git a/PaddleDetection-release-2.6/configs/semi_det/denseteacher/denseteacher_ppyoloe_plus_crn_l_coco_semi005.yml b/PaddleDetection-release-2.6/configs/semi_det/denseteacher/denseteacher_ppyoloe_plus_crn_l_coco_semi005.yml
new file mode 100644
index 0000000000000000000000000000000000000000..920613fd9e092f3c53783f71571e93b2413a388f
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/semi_det/denseteacher/denseteacher_ppyoloe_plus_crn_l_coco_semi005.yml
@@ -0,0 +1,151 @@
+_BASE_: [
+ '../../ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml',
+ '../_base_/coco_detection_percent_5.yml',
+]
+log_iter: 50
+snapshot_epoch: 5
+weights: output/denseteacher_ppyoloe_plus_crn_l_coco_semi005/model_final
+
+epochs: &epochs 200
+cosine_epochs: &cosine_epochs 240
+
+
+### pretrain and warmup config; choose one and comment out the other
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/ppyoloe_plus_crn_l_80e_coco_sup005.pdparams # mAP=42.9
+semi_start_iters: 0
+ema_start_iters: 0
+use_warmup: &use_warmup False
+
+# pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_l_obj365_pretrained.pdparams
+# semi_start_iters: 5000
+# ema_start_iters: 3000
+# use_warmup: &use_warmup True
+
+
+### global config
+use_simple_ema: True
+ema_decay: 0.9996
+ssod_method: DenseTeacher
+DenseTeacher:
+ train_cfg:
+ sup_weight: 1.0
+ unsup_weight: 1.0
+ loss_weight: {distill_loss_cls: 1.0, distill_loss_iou: 2.5, distill_loss_dfl: 0., distill_loss_contrast: 0.1}
+ contrast_loss:
+ temperature: 0.2
+ alpha: 0.9
+ smooth_iter: 100
+ concat_sup_data: True
+ suppress: linear
+ ratio: 0.01
+ test_cfg:
+ inference_on: teacher
+
+
+### reader config
+batch_size: &batch_size 8
+worker_num: 2
+SemiTrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomFlip: {}
+ - RandomCrop: {} # unsup will be fake gt_boxes
+ weak_aug:
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], is_scale: true, norm_type: none}
+ strong_aug:
+ - StrongAugImage: {transforms: [
+ RandomColorJitter: {prob: 0.8, brightness: 0.4, contrast: 0.4, saturation: 0.4, hue: 0.1},
+ RandomErasingCrop: {},
+ RandomGaussianBlur: {prob: 0.5, sigma: [0.1, 2.0]},
+ RandomGrayscale: {prob: 0.2},
+ ]}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], is_scale: true, norm_type: none}
+ sup_batch_transforms:
+ - BatchRandomResize: {target_size: [640], random_size: True, random_interp: True, keep_ratio: False}
+ - Permute: {}
+ - PadGT: {}
+ unsup_batch_transforms:
+ - BatchRandomResize: {target_size: [640], random_size: True, random_interp: True, keep_ratio: False}
+ - Permute: {}
+ sup_batch_size: *batch_size
+ unsup_batch_size: *batch_size
+ shuffle: True
+ drop_last: True
+ collate_batch: True
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ batch_size: 2
+
+TestReader:
+ inputs_def:
+ image_shape: [3, 640, 640]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ batch_size: 1
+
+
+### model config
+architecture: PPYOLOE
+norm_type: sync_bn
+ema_black_list: ['proj_conv.weight']
+custom_black_list: ['reduce_mean']
+PPYOLOE:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+eval_size: ~ # means None, but not str 'None'
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: -1 #
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 300
+ score_threshold: 0.01
+ nms_threshold: 0.7
+
+
+### other config
+epoch: *epochs
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !CosineDecay
+ max_epochs: *cosine_epochs
+ use_warmup: *use_warmup
+ - !LinearWarmup
+ start_factor: 0.001
+ epochs: 3
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005 # dt-fcos 0.0001
+ type: L2
+ clip_grad_by_norm: 1.0 # dt-fcos clip_grad_by_value
diff --git a/PaddleDetection-release-2.6/configs/semi_det/denseteacher/denseteacher_ppyoloe_plus_crn_l_coco_semi010.yml b/PaddleDetection-release-2.6/configs/semi_det/denseteacher/denseteacher_ppyoloe_plus_crn_l_coco_semi010.yml
new file mode 100644
index 0000000000000000000000000000000000000000..253a8c18ca773f9216aad9f32025261c3976ba38
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/semi_det/denseteacher/denseteacher_ppyoloe_plus_crn_l_coco_semi010.yml
@@ -0,0 +1,151 @@
+_BASE_: [
+ '../../ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml',
+ '../_base_/coco_detection_percent_10.yml',
+]
+log_iter: 50
+snapshot_epoch: 5
+weights: output/denseteacher_ppyoloe_plus_crn_l_coco_semi010/model_final
+
+epochs: &epochs 200
+cosine_epochs: &cosine_epochs 240
+
+
+### pretrain and warmup config; choose one and comment out the other
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/ppyoloe_plus_crn_l_80e_coco_sup010.pdparams # mAP=45.7
+semi_start_iters: 0
+ema_start_iters: 0
+use_warmup: &use_warmup False
+
+# pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_l_obj365_pretrained.pdparams
+# semi_start_iters: 5000
+# ema_start_iters: 3000
+# use_warmup: &use_warmup True
+
+
+### global config
+use_simple_ema: True
+ema_decay: 0.9996
+ssod_method: DenseTeacher
+DenseTeacher:
+ train_cfg:
+ sup_weight: 1.0
+ unsup_weight: 1.0
+ loss_weight: {distill_loss_cls: 1.0, distill_loss_iou: 2.5, distill_loss_dfl: 0., distill_loss_contrast: 0.1}
+ contrast_loss:
+ temperature: 0.2
+ alpha: 0.9
+ smooth_iter: 100
+ concat_sup_data: True
+ suppress: linear
+ ratio: 0.01
+ test_cfg:
+ inference_on: teacher
+
+
+### reader config
+batch_size: &batch_size 8
+worker_num: 2
+SemiTrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomFlip: {}
+ - RandomCrop: {} # unsup will be fake gt_boxes
+ weak_aug:
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], is_scale: true, norm_type: none}
+ strong_aug:
+ - StrongAugImage: {transforms: [
+ RandomColorJitter: {prob: 0.8, brightness: 0.4, contrast: 0.4, saturation: 0.4, hue: 0.1},
+ RandomErasingCrop: {},
+ RandomGaussianBlur: {prob: 0.5, sigma: [0.1, 2.0]},
+ RandomGrayscale: {prob: 0.2},
+ ]}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], is_scale: true, norm_type: none}
+ sup_batch_transforms:
+ - BatchRandomResize: {target_size: [640], random_size: True, random_interp: True, keep_ratio: False}
+ - Permute: {}
+ - PadGT: {}
+ unsup_batch_transforms:
+ - BatchRandomResize: {target_size: [640], random_size: True, random_interp: True, keep_ratio: False}
+ - Permute: {}
+ sup_batch_size: *batch_size
+ unsup_batch_size: *batch_size
+ shuffle: True
+ drop_last: True
+ collate_batch: True
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ batch_size: 2
+
+TestReader:
+ inputs_def:
+ image_shape: [3, 640, 640]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ batch_size: 1
+
+
+### model config
+architecture: PPYOLOE
+norm_type: sync_bn
+ema_black_list: ['proj_conv.weight']
+custom_black_list: ['reduce_mean']
+PPYOLOE:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+eval_size: ~ # means None, but not str 'None'
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: -1 #
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 300
+ score_threshold: 0.01
+ nms_threshold: 0.7
+
+
+### other config
+epoch: *epochs
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !CosineDecay
+ max_epochs: *cosine_epochs
+ use_warmup: *use_warmup
+ - !LinearWarmup
+ start_factor: 0.001
+ epochs: 3
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005 # dt-fcos 0.0001
+ type: L2
+ clip_grad_by_norm: 1.0 # dt-fcos clip_grad_by_value
diff --git a/PaddleDetection-release-2.6/configs/semi_det/denseteacher/denseteacher_ppyoloe_plus_crn_s_coco_semi005.yml b/PaddleDetection-release-2.6/configs/semi_det/denseteacher/denseteacher_ppyoloe_plus_crn_s_coco_semi005.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d3482e5e9d18e4b7459a4457dd78043fc56fb7db
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/semi_det/denseteacher/denseteacher_ppyoloe_plus_crn_s_coco_semi005.yml
@@ -0,0 +1,151 @@
+_BASE_: [
+ '../../ppyoloe/ppyoloe_plus_crn_s_80e_coco.yml',
+ '../_base_/coco_detection_percent_5.yml',
+]
+log_iter: 50
+snapshot_epoch: 5
+weights: output/denseteacher_ppyoloe_plus_crn_s_coco_semi005/model_final
+
+epochs: &epochs 200
+cosine_epochs: &cosine_epochs 240
+
+
+### pretrain and warmup config; choose one and comment out the other
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/ppyoloe_plus_crn_s_80e_coco_sup005.pdparams # mAP=32.8
+semi_start_iters: 0
+ema_start_iters: 0
+use_warmup: &use_warmup False
+
+# pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_s_obj365_pretrained.pdparams
+# semi_start_iters: 5000
+# ema_start_iters: 3000
+# use_warmup: &use_warmup True
+
+
+### global config
+use_simple_ema: True
+ema_decay: 0.9996
+ssod_method: DenseTeacher
+DenseTeacher:
+ train_cfg:
+ sup_weight: 1.0
+ unsup_weight: 1.0
+ loss_weight: {distill_loss_cls: 1.0, distill_loss_iou: 2.5, distill_loss_dfl: 0., distill_loss_contrast: 0.1}
+ contrast_loss:
+ temperature: 0.2
+ alpha: 0.9
+ smooth_iter: 100
+ concat_sup_data: True
+ suppress: linear
+ ratio: 0.01
+ test_cfg:
+ inference_on: teacher
+
+
+### reader config
+batch_size: &batch_size 8
+worker_num: 2
+SemiTrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomFlip: {}
+ - RandomCrop: {} # unsup will be fake gt_boxes
+ weak_aug:
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], is_scale: true, norm_type: none}
+ strong_aug:
+ - StrongAugImage: {transforms: [
+ RandomColorJitter: {prob: 0.8, brightness: 0.4, contrast: 0.4, saturation: 0.4, hue: 0.1},
+ RandomErasingCrop: {},
+ RandomGaussianBlur: {prob: 0.5, sigma: [0.1, 2.0]},
+ RandomGrayscale: {prob: 0.2},
+ ]}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], is_scale: true, norm_type: none}
+ sup_batch_transforms:
+ - BatchRandomResize: {target_size: [640], random_size: True, random_interp: True, keep_ratio: False}
+ - Permute: {}
+ - PadGT: {}
+ unsup_batch_transforms:
+ - BatchRandomResize: {target_size: [640], random_size: True, random_interp: True, keep_ratio: False}
+ - Permute: {}
+ sup_batch_size: *batch_size
+ unsup_batch_size: *batch_size
+ shuffle: True
+ drop_last: True
+ collate_batch: True
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ batch_size: 2
+
+TestReader:
+ inputs_def:
+ image_shape: [3, 640, 640]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ batch_size: 1
+
+
+### model config
+architecture: PPYOLOE
+norm_type: sync_bn
+ema_black_list: ['proj_conv.weight']
+custom_black_list: ['reduce_mean']
+PPYOLOE:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+eval_size: ~ # means None, but not str 'None'
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: -1 #
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 300
+ score_threshold: 0.01
+ nms_threshold: 0.7
+
+
+### other config
+epoch: *epochs
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !CosineDecay
+ max_epochs: *cosine_epochs
+ use_warmup: *use_warmup
+ - !LinearWarmup
+ start_factor: 0.001
+ epochs: 3
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005 # dt-fcos 0.0001
+ type: L2
+ clip_grad_by_norm: 1.0 # dt-fcos clip_grad_by_value
diff --git a/PaddleDetection-release-2.6/configs/semi_det/denseteacher/denseteacher_ppyoloe_plus_crn_s_coco_semi010.yml b/PaddleDetection-release-2.6/configs/semi_det/denseteacher/denseteacher_ppyoloe_plus_crn_s_coco_semi010.yml
new file mode 100644
index 0000000000000000000000000000000000000000..e8b0aad3aff745ac9b3a62d8e18d470f4fe6698a
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/semi_det/denseteacher/denseteacher_ppyoloe_plus_crn_s_coco_semi010.yml
@@ -0,0 +1,151 @@
+_BASE_: [
+ '../../ppyoloe/ppyoloe_plus_crn_s_80e_coco.yml',
+ '../_base_/coco_detection_percent_10.yml',
+]
+log_iter: 50
+snapshot_epoch: 5
+weights: output/denseteacher_ppyoloe_plus_crn_s_coco_semi010/model_final
+
+epochs: &epochs 200
+cosine_epochs: &cosine_epochs 240
+
+
+### pretrain and warmup config; choose one and comment out the other
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/ppyoloe_plus_crn_s_80e_coco_sup010.pdparams # mAP=35.3
+semi_start_iters: 0
+ema_start_iters: 0
+use_warmup: &use_warmup False
+
+# pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_s_obj365_pretrained.pdparams
+# semi_start_iters: 5000
+# ema_start_iters: 3000
+# use_warmup: &use_warmup True
+
+
+### global config
+use_simple_ema: True
+ema_decay: 0.9996
+ssod_method: DenseTeacher
+DenseTeacher:
+ train_cfg:
+ sup_weight: 1.0
+ unsup_weight: 1.0
+ loss_weight: {distill_loss_cls: 1.0, distill_loss_iou: 2.5, distill_loss_dfl: 0., distill_loss_contrast: 0.1}
+ contrast_loss:
+ temperature: 0.2
+ alpha: 0.9
+ smooth_iter: 100
+ concat_sup_data: True
+ suppress: linear
+ ratio: 0.01
+ test_cfg:
+ inference_on: teacher
+
+
+### reader config
+batch_size: &batch_size 8
+worker_num: 2
+SemiTrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomFlip: {}
+ - RandomCrop: {} # unsup will be fake gt_boxes
+ weak_aug:
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], is_scale: true, norm_type: none}
+ strong_aug:
+ - StrongAugImage: {transforms: [
+ RandomColorJitter: {prob: 0.8, brightness: 0.4, contrast: 0.4, saturation: 0.4, hue: 0.1},
+ RandomErasingCrop: {},
+ RandomGaussianBlur: {prob: 0.5, sigma: [0.1, 2.0]},
+ RandomGrayscale: {prob: 0.2},
+ ]}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], is_scale: true, norm_type: none}
+ sup_batch_transforms:
+ - BatchRandomResize: {target_size: [640], random_size: True, random_interp: True, keep_ratio: False}
+ - Permute: {}
+ - PadGT: {}
+ unsup_batch_transforms:
+ - BatchRandomResize: {target_size: [640], random_size: True, random_interp: True, keep_ratio: False}
+ - Permute: {}
+ sup_batch_size: *batch_size
+ unsup_batch_size: *batch_size
+ shuffle: True
+ drop_last: True
+ collate_batch: True
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ batch_size: 2
+
+TestReader:
+ inputs_def:
+ image_shape: [3, 640, 640]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ batch_size: 1
+
+
+### model config
+architecture: PPYOLOE
+norm_type: sync_bn
+ema_black_list: ['proj_conv.weight']
+custom_black_list: ['reduce_mean']
+PPYOLOE:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+eval_size: ~ # means None, but not str 'None'
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+  static_assigner_epoch: -1
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 300
+ score_threshold: 0.01
+ nms_threshold: 0.7
+
+
+### other config
+epoch: *epochs
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !CosineDecay
+ max_epochs: *cosine_epochs
+ use_warmup: *use_warmup
+ - !LinearWarmup
+ start_factor: 0.001
+ epochs: 3
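The schedule above combines `CosineDecay` over 240 epochs with a 3-epoch `LinearWarmup` starting at `0.001 × base_lr` (the warmup only applies when `use_warmup` is True, i.e. in the Objects365-pretrained branch). A rough epoch-granularity sketch of the resulting learning rate; PaddleDetection actually steps this per iteration:

```python
import math

def learning_rate(epoch, base_lr=0.01, cosine_epochs=240,
                  warmup_epochs=3, start_factor=0.001, use_warmup=True):
    """Epoch-granularity sketch of LinearWarmup followed by CosineDecay."""
    if use_warmup and epoch < warmup_epochs:
        # Ramp linearly from start_factor * base_lr up to base_lr.
        alpha = epoch / warmup_epochs
        return base_lr * (start_factor + (1.0 - start_factor) * alpha)
    # Half-cosine decay from base_lr toward 0 over cosine_epochs.
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * epoch / cosine_epochs))

print(learning_rate(0))    # warmup start: base_lr * start_factor
print(learning_rate(120))  # halfway through the cosine curve
print(learning_rate(200))  # final epoch: still non-zero, since 200 < 240
```

Because `cosine_epochs` (240) is deliberately larger than `epoch` (200), the cosine curve is cut off before reaching zero, leaving a small non-zero learning rate at the end of training.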
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005 # dt-fcos 0.0001
+ type: L2
+ clip_grad_by_norm: 1.0 # dt-fcos clip_grad_by_value
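The `use_simple_ema: True` / `ema_decay: 0.9996` settings keep the teacher network as an exponential moving average of the student, while parameters named in `ema_black_list` (here `proj_conv.weight`) are copied verbatim instead of averaged. A minimal sketch with plain dicts standing in for state dicts (illustrative only, not PaddleDetection's implementation):

```python
def ema_update(teacher, student, decay=0.9996,
               black_list=("proj_conv.weight",)):
    """EMA update of teacher weights from the student; black-listed
    parameters are copied directly, mirroring ema_black_list above."""
    for name, w in student.items():
        if name in black_list:
            teacher[name] = w  # no averaging for black-listed params
        else:
            teacher[name] = decay * teacher[name] + (1.0 - decay) * w
    return teacher

teacher = {"conv.weight": 1.0, "proj_conv.weight": 5.0}
student = {"conv.weight": 0.0, "proj_conv.weight": 7.0}
ema_update(teacher, student, decay=0.9)
print(teacher)  # {'conv.weight': 0.9, 'proj_conv.weight': 7.0}
```

With the real decay of 0.9996, the teacher changes very slowly, which is what makes its pseudo-labels stable enough to supervise the student.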
diff --git a/PaddleDetection-release-2.6/configs/slim/README.md b/PaddleDetection-release-2.6/configs/slim/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..6d67b37cb27e93ce71745eb442b1b3e1ceeb370e
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/README.md
@@ -0,0 +1,182 @@
+# 模型压缩
+
+在PaddleDetection中, 提供了基于[PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim)进行模型压缩的完整教程和benchmark。目前支持的方法:
+
+- [剪裁](prune)
+- [量化](quant)
+- [离线量化](post_quant)
+- [蒸馏](distill)
+- [联合策略](extensions)
+
+推荐您使用剪裁和蒸馏联合训练,或者使用剪裁、量化训练和离线量化,进行检测模型压缩。 下面以YOLOv3为例,进行剪裁、蒸馏和量化实验。
+
+## 实验环境
+
+- Python 3.7+
+- PaddlePaddle >= 2.1.0
+- PaddleSlim >= 2.1.0
+- CUDA 10.1+
+- cuDNN >=7.6.5
+
+**PaddleDetection、 PaddlePaddle与PaddleSlim 版本关系:**
+| PaddleDetection版本 | PaddlePaddle版本 | PaddleSlim版本 | 备注 |
+| :------------------: | :---------------: | :-------: |:---------------: |
+| release/2.3 | >= 2.1 | 2.1 | 离线量化依赖Paddle 2.2及PaddleSlim 2.2 |
+| release/2.1 | >= 2.1.0 | 2.1 | 量化模型导出依赖最新Paddle develop分支,可在[PaddlePaddle每日版本](https://www.paddlepaddle.org.cn/documentation/docs/zh/install/Tables.html#whl-dev)中下载安装 |
+| release/2.0 | >= 2.0.1 | 2.0 | 量化依赖Paddle 2.1及PaddleSlim 2.1 |
+
+
+#### 安装PaddleSlim
+- 方法一:直接安装:
+```
+pip install paddleslim -i https://pypi.tuna.tsinghua.edu.cn/simple
+```
+- 方法二:编译安装:
+```
+git clone https://github.com/PaddlePaddle/PaddleSlim.git
+cd PaddleSlim
+python setup.py install
+```
+
+## 快速开始
+
+### 训练
+
+```shell
+python tools/train.py -c configs/{MODEL.yml} --slim_config configs/slim/{SLIM_CONFIG.yml}
+```
+
+- `-c`: 指定模型配置文件。
+- `--slim_config`: 指定压缩策略配置文件。
+
+
+### 评估
+
+```shell
+python tools/eval.py -c configs/{MODEL.yml} --slim_config configs/slim/{SLIM_CONFIG.yml} -o weights=output/{SLIM_CONFIG}/model_final
+```
+
+- `-c`: 指定模型配置文件。
+- `--slim_config`: 指定压缩策略配置文件。
+- `-o weights`: 指定压缩算法训好的模型路径。
+
+### 测试
+
+```shell
+python tools/infer.py -c configs/{MODEL.yml} --slim_config configs/slim/{SLIM_CONFIG.yml} \
+                      -o weights=output/{SLIM_CONFIG}/model_final \
+                      --infer_img={IMAGE_PATH}
+```
+
+- `-c`: 指定模型配置文件。
+- `--slim_config`: 指定压缩策略配置文件。
+- `-o weights`: 指定压缩算法训好的模型路径。
+- `--infer_img`: 指定测试图像路径。
+
+
+## 全链条部署
+
+### 动转静导出模型
+
+```shell
+python tools/export_model.py -c configs/{MODEL.yml} --slim_config configs/slim/{SLIM_CONFIG.yml} -o weights=output/{SLIM_CONFIG}/model_final
+```
+
+- `-c`: 指定模型配置文件。
+- `--slim_config`: 指定压缩策略配置文件。
+- `-o weights`: 指定压缩算法训好的模型路径。
+
+### 部署预测
+
+- Paddle-Inference预测:
+ - [Python部署](../../deploy/python/README.md)
+ - [C++部署](../../deploy/cpp/README.md)
+ - [TensorRT预测部署教程](../../deploy/TENSOR_RT.md)
+- 服务器端部署:使用[PaddleServing](../../deploy/serving/README.md)部署。
+- 手机移动端部署:使用[Paddle-Lite](../../deploy/lite/README.md) 在手机移动端部署。
+
+## Benchmark
+
+### 剪裁
+
+#### Pascal VOC上benchmark
+
+| 模型 | 压缩策略 | GFLOPs | 模型体积(MB) | 输入尺寸 | 预测时延(SD855) | Box AP | 下载 | 模型配置文件 | 压缩算法配置文件 |
+| :---------: | :-------: | :------------: |:-------------: | :------: | :-------------: | :------: | :-----------------------------------------------------: |:-------------: | :------: |
+| YOLOv3-MobileNetV1 | baseline | 24.13 | 93 | 608 | 332.0ms | 75.1 | [下载链接](https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v1_270e_voc.pdparams) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/yolov3/yolov3_mobilenet_v1_270e_voc.yml) | - |
+| YOLOv3-MobileNetV1 | 剪裁-l1_norm(sensitivity) | 15.78(-34.49%) | 66(-29%) | 608 | - | 78.4(+3.3) | [下载链接](https://paddledet.bj.bcebos.com/models/slim/yolov3_mobilenet_v1_voc_prune_l1_norm.pdparams) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/yolov3/yolov3_mobilenet_v1_270e_voc.yml) | [slim配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/slim/prune/yolov3_prune_l1_norm.yml) |
+
+#### COCO上benchmark
+| 模型 | 压缩策略 | GFLOPs | 模型体积(MB) | 输入尺寸 | 预测时延(SD855) | Box AP | 下载 | 模型配置文件 | 压缩算法配置文件 |
+| :---------: | :-------: | :------------: |:-------------: | :------: | :-------------: | :------: | :-----------------------------------------------------: |:-------------: | :------: |
+| PP-YOLO-MobileNetV3_large | baseline | -- | 18.5 | 608 | 25.1ms | 23.2 | [下载链接](https://paddledet.bj.bcebos.com/models/ppyolo_mbv3_large_coco.pdparams) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_mbv3_large_coco.yml) | - |
+| PP-YOLO-MobileNetV3_large | 剪裁-FPGM | -37% | 12.6 | 608 | - | 22.3 | [下载链接](https://paddledet.bj.bcebos.com/models/slim/ppyolo_mbv3_large_prune_fpgm.pdparams) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_mbv3_large_coco.yml) | [slim配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/slim/prune/ppyolo_mbv3_large_prune_fpgm.yml) |
+| YOLOv3-DarkNet53 | baseline | -- | 238.2 | 608 | - | 39.0 | [下载链接](https://paddledet.bj.bcebos.com/models/yolov3_darknet53_270e_coco.pdparams) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/yolov3/yolov3_darknet53_270e_coco.yml) | - |
+| YOLOv3-DarkNet53 | 剪裁-FPGM | -24% | - | 608 | - | 37.6 | [下载链接](https://paddledet.bj.bcebos.com/models/slim/yolov3_darknet_prune_fpgm.pdparams) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/yolov3/yolov3_darknet53_270e_coco.yml) | [slim配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/slim/prune/yolov3_darknet_prune_fpgm.yml) |
+| PP-YOLO_R50vd | baseline | -- | 183.3 | 608 | - | 44.8 | [下载链接](https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml) | - |
+| PP-YOLO_R50vd | 剪裁-FPGM | -35% | - | 608 | - | 42.1 | [下载链接](https://paddledet.bj.bcebos.com/models/slim/ppyolo_r50vd_prune_fpgm.pdparams) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml) | [slim配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/slim/prune/ppyolo_r50vd_prune_fpgm.yml) |
+
+说明:
+- 目前剪裁除RCNN系列模型外,其余模型均已支持。
+- SD855预测时延为使用PaddleLite部署,使用arm8架构并使用4线程(4 Threads)推理时延。
+
+### 量化
+
+#### COCO上benchmark
+
+| 模型 | 压缩策略 | 输入尺寸 | 模型体积(MB) | 预测时延(V100) | 预测时延(SD855) | Box AP | 下载 | Inference模型下载 | 模型配置文件 | 压缩算法配置文件 |
+| ------------------ | ------------ | -------- | :---------: | :---------: |:---------: | :---------: | :----------------------------------------------: | :----------------------------------------------: |:------------------------------------------: | :------------------------------------: |
+| PP-YOLOE-l | baseline | 640 | - | 11.2ms(trt_fp32)/7.7ms(trt_fp16) | -- | 50.9 | [下载链接](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams) | - | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyoloe/ppyoloe_crn_l_300e_coco.yml) | - |
+| PP-YOLOE-l | 普通在线量化 | 640 | - | 6.7ms(trt_int8) | -- | 48.8 | [下载链接](https://paddledet.bj.bcebos.com/models/slim/ppyoloe_l_coco_qat.pdparams) | - | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyoloe/ppyoloe_crn_l_300e_coco.yml) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/slim/quant/ppyoloe_l_qat.yml) |
+| PP-YOLOv2_R50vd | baseline | 640 | 208.6 | 19.1ms | -- | 49.1 | [下载链接](https://paddledet.bj.bcebos.com/models/ppyolov2_r50vd_dcn_365e_coco.pdparams) | [下载链接](https://paddledet.bj.bcebos.com/models/slim/ppyolov2_r50vd_dcn_365e_coco.tar) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml) | - |
+| PP-YOLOv2_R50vd | PACT在线量化 | 640 | -- | 17.3ms | -- | 48.1 | [下载链接](https://paddledet.bj.bcebos.com/models/slim/ppyolov2_r50vd_dcn_qat.pdparams) | [下载链接](https://paddledet.bj.bcebos.com/models/slim/ppyolov2_r50vd_dcn_qat.tar) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/slim/quant/ppyolov2_r50vd_dcn_qat.yml) |
+| PP-YOLO_R50vd | baseline | 608 | 183.3 | 17.4ms | -- | 44.8 | [下载链接](https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams) | [下载链接](https://paddledet.bj.bcebos.com/models/slim/ppyolo_r50vd_dcn_1x_coco.tar) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml) | - |
+| PP-YOLO_R50vd | PACT在线量化 | 608 | 67.3 | 13.8ms | -- | 44.3 | [下载链接](https://paddledet.bj.bcebos.com/models/slim/ppyolo_r50vd_qat_pact.pdparams) | [下载链接](https://paddledet.bj.bcebos.com/models/slim/ppyolo_r50vd_qat_pact.tar) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/slim/quant/ppyolo_r50vd_qat_pact.yml) |
+| PP-YOLO-MobileNetV3_large | baseline | 320 | 18.5 | 2.7ms | 27.9ms | 23.2 | [下载链接](https://paddledet.bj.bcebos.com/models/ppyolo_mbv3_large_coco.pdparams) | [下载链接](https://paddledet.bj.bcebos.com/models/slim/ppyolo_mbv3_large_coco.tar) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_mbv3_large_coco.yml) | - |
+| PP-YOLO-MobileNetV3_large | 普通在线量化 | 320 | 5.6 | -- | 25.1ms | 24.3 | [下载链接](https://paddledet.bj.bcebos.com/models/slim/ppyolo_mbv3_large_qat.pdparams) | [下载链接](https://paddledet.bj.bcebos.com/models/slim/ppyolo_mbv3_large_qat.tar) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_mbv3_large_coco.yml) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/slim/quant/ppyolo_mbv3_large_qat.yml) |
+| YOLOv3-MobileNetV1 | baseline | 608 | 94.2 | 8.9ms | 332ms | 29.4 | [下载链接](https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v1_270e_coco.pdparams) | [下载链接](https://paddledet.bj.bcebos.com/models/slim/yolov3_mobilenet_v1_270e_coco.tar) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/yolov3/yolov3_mobilenet_v1_270e_coco.yml) | - |
+| YOLOv3-MobileNetV1 | 普通在线量化 | 608 | 25.4 | 6.6ms | 248ms | 30.5 | [下载链接](https://paddledet.bj.bcebos.com/models/slim/yolov3_mobilenet_v1_coco_qat.pdparams) | [下载链接](https://paddledet.bj.bcebos.com/models/slim/yolov3_mobilenet_v1_coco_qat.tar) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/yolov3/yolov3_mobilenet_v1_270e_coco.yml) | [slim配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/slim/quant/yolov3_mobilenet_v1_qat.yml) |
+| YOLOv3-MobileNetV3 | baseline | 608 | 90.3 | 9.4ms | 367.2ms | 31.4 | [下载链接](https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v3_large_270e_coco.pdparams) | [下载链接](https://paddledet.bj.bcebos.com/models/slim/yolov3_mobilenet_v3_large_270e_coco.tar) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/yolov3/yolov3_mobilenet_v3_large_270e_coco.yml) | - |
+| YOLOv3-MobileNetV3 | PACT在线量化 | 608 | 24.4 | 8.0ms | 280.0ms | 31.1 | [下载链接](https://paddledet.bj.bcebos.com/models/slim/yolov3_mobilenet_v3_coco_qat.pdparams) | [下载链接](https://paddledet.bj.bcebos.com/models/slim/yolov3_mobilenet_v3_coco_qat.tar) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/yolov3/yolov3_mobilenet_v3_large_270e_coco.yml) | [slim配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/slim/quant/yolov3_mobilenet_v3_qat.yml) |
+| YOLOv3-DarkNet53 | baseline | 608 | 238.2 | 16.0ms | -- | 39.0 | [下载链接](https://paddledet.bj.bcebos.com/models/yolov3_darknet53_270e_coco.pdparams) | [下载链接](https://paddledet.bj.bcebos.com/models/slim/yolov3_darknet53_270e_coco.tar) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/yolov3/yolov3_darknet53_270e_coco.yml) | - |
+| YOLOv3-DarkNet53 | 普通在线量化 | 608 | 78.8 | 12.4ms | -- | 38.8 | [下载链接](https://paddledet.bj.bcebos.com/models/slim/yolov3_darknet_coco_qat.pdparams) | [下载链接](https://paddledet.bj.bcebos.com/models/slim/yolov3_darknet_coco_qat.tar) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/yolov3/yolov3_darknet53_270e_coco.yml) | [slim配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/slim/quant/yolov3_darknet_qat.yml) |
+| SSD-MobileNet_v1 | baseline | 300 | 22.5 | 4.4ms | 26.6ms | 73.8 | [下载链接](https://paddledet.bj.bcebos.com/models/ssd_mobilenet_v1_300_120e_voc.pdparams) | [下载链接](https://paddledet.bj.bcebos.com/models/slim/ssd_mobilenet_v1_300_120e_voc.tar) |[配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ssd/ssd_mobilenet_v1_300_120e_voc.yml) | - |
+| SSD-MobileNet_v1 | 普通在线量化 | 300 | 7.1 | -- | 21.5ms | 72.9 | [下载链接](https://paddledet.bj.bcebos.com/models/slim/ssd_mobilenet_v1_300_voc_qat.pdparams) | [下载链接](https://paddledet.bj.bcebos.com/models/slim/ssd_mobilenet_v1_300_voc_qat.tar) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ssd/ssd_mobilenet_v1_300_120e_voc.yml) | [slim配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/slim/quant/ssd_mobilenet_v1_qat.yml) |
+| Mask-ResNet50-FPN | baseline | (800, 1333) | 174.1 | 359.5ms | -- | 39.2/35.6 | [下载链接](https://paddledet.bj.bcebos.com/models/mask_rcnn_r50_fpn_1x_coco.pdparams) | [下载链接](https://paddledet.bj.bcebos.com/models/slim/mask_rcnn_r50_fpn_1x_coco.tar) |[配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.yml) | - |
+| Mask-ResNet50-FPN | 普通在线量化 | (800, 1333) | -- | -- | -- | 39.7(+0.5)/35.9(+0.3) | [下载链接](https://paddledet.bj.bcebos.com/models/slim/mask_rcnn_r50_fpn_1x_qat.pdparams) | [下载链接](https://paddledet.bj.bcebos.com/models/slim/mask_rcnn_r50_fpn_1x_qat.tar) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.yml) | [slim配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/slim/quant/mask_rcnn_r50_fpn_1x_qat.yml) |
+
+说明:
+- 上述V100预测时延非量化模型均是使用TensorRT-FP32测试,量化模型均使用TensorRT-INT8测试,并且都包含NMS耗时。
+- SD855预测时延为使用PaddleLite部署,使用arm8架构并使用4线程(4 Threads)推理时延。
+- 上述PP-YOLOE模型均在V100,开启TensorRT环境中测速,不包含NMS。(导出模型时指定:-o trt=True exclude_nms=True)
+
+### 离线量化
+需要准备val集,用来对离线量化模型进行校准,运行方式:
+```shell
+python tools/post_quant.py -c configs/{MODEL.yml} --slim_config configs/slim/post_quant/{SLIM_CONFIG.yml}
+```
+例如:
+```shell
+python3.7 tools/post_quant.py -c configs/ppyolo/ppyolo_mbv3_large_coco.yml --slim_config=configs/slim/post_quant/ppyolo_mbv3_large_ptq.yml
+```
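Offline (post-training) quantization only needs the val set to calibrate activation ranges; no gradients are involved. The core idea can be shown as a toy symmetric int8 abs-max calibration (an illustration of the principle, not PaddleSlim's implementation):

```python
import numpy as np

def calibrate_scale(activations):
    """Symmetric int8 scale from calibration activations (abs-max method)."""
    return float(np.abs(activations).max()) / 127.0

def quantize(x, scale):
    return np.clip(np.round(x / scale), -128, 127).astype(np.int8)

def dequantize(q, scale):
    return q.astype(np.float32) * scale

# A handful of "validation" activations stand in for the calibration set.
acts = np.array([-2.54, 0.3, 1.27, 0.9], dtype=np.float32)
scale = calibrate_scale(acts)
q = quantize(acts, scale)
recovered = dequantize(q, scale)
# Round-trip error is bounded by half a quantization step (scale / 2).
print(np.max(np.abs(recovered - acts)) <= scale / 2 + 1e-7)
```

PaddleSlim's PTQ configs additionally choose among calibration statistics (abs-max, KL, etc.) and decide which layers to skip, but the scale-from-calibration-data step above is what the val set is for.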
+
+### 蒸馏
+
+#### COCO上benchmark
+
+| 模型 | 压缩策略 | 输入尺寸 | Box AP | 下载 | 模型配置文件 | 压缩算法配置文件 |
+| ------------------ | ------------ | -------- | :---------: | :----------------------------------------------------------: | :----------------------------------------------------------: | :----------------------------------------------------------: |
+| YOLOv3-MobileNetV1 | baseline | 608 | 29.4 | [下载链接](https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v1_270e_coco.pdparams) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/yolov3/yolov3_mobilenet_v1_270e_coco.yml) | - |
+| YOLOv3-MobileNetV1 | 蒸馏 | 608 | 31.0(+1.6) | [下载链接](https://paddledet.bj.bcebos.com/models/slim/yolov3_mobilenet_v1_coco_distill.pdparams) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/yolov3/yolov3_mobilenet_v1_270e_coco.yml) | [slim配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/slim/distill/yolov3_mobilenet_v1_coco_distill.yml) |
+
+- 具体蒸馏方法请参考[蒸馏策略文档](distill/README.md)
+
+### 蒸馏剪裁联合策略
+
+#### COCO上benchmark
+
+| 模型 | 压缩策略 | 输入尺寸 | GFLOPs | 模型体积(MB) | 预测时延(SD855) | Box AP | 下载 | 模型配置文件 | 压缩算法配置文件 |
+| ------------------ | ------------ | -------- | :---------: |:---------: |:---------: | :---------: |:----------------------------------------------------------: | :----------------------------------------------------------: | :----------------------------------------------------------: |
+| YOLOv3-MobileNetV1 | baseline | 608 | 24.65 | 94.2 | 332.0ms | 29.4 | [下载链接](https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v1_270e_coco.pdparams) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/yolov3/yolov3_mobilenet_v1_270e_coco.yml) | - |
+| YOLOv3-MobileNetV1 | 蒸馏+剪裁 | 608 | 7.54(-69.4%) | 30.9(-67.2%) | 166.1ms | 28.4(-1.0) | [下载链接](https://paddledet.bj.bcebos.com/models/slim/yolov3_mobilenet_v1_coco_distill_prune.pdparams) | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/yolov3/yolov3_mobilenet_v1_270e_coco.yml) | [slim配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/slim/extensions/yolov3_mobilenet_v1_coco_distill_prune.yml) |
+| YOLOv3-MobileNetV1 | 剪裁+量化 | 608 | - | - | - | - | - | [配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/yolov3/yolov3_mobilenet_v1_270e_voc.yml) | [slim配置文件](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/slim/extensions/yolov3_mobilenetv1_prune_qat.yml) |
diff --git a/PaddleDetection-release-2.6/configs/slim/README_en.md b/PaddleDetection-release-2.6/configs/slim/README_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..1ccdc86e0b58ef6031929006fa17ffc6220e561e
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/README_en.md
@@ -0,0 +1,168 @@
+# Model Compression
+
+In PaddleDetection, a complete tutorial and benchmarks for model compression based on [PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim) are provided. Currently supported methods:
+
+- [pruning](prune)
+- [quantization](quant)
+- [post-training quantization](post_quant)
+- [distillation](distill)
+- [joint strategies](extensions)
+
+We recommend combining pruning with distillation training, or using pruning, quantization-aware training, and post-training quantization, to compress detection models. The following takes YOLOv3 as an example for pruning, distillation, and quantization experiments.
+
+## Experimental Environment
+
+- Python 3.7+
+- PaddlePaddle >= 2.1.0
+- PaddleSlim >= 2.1.0
+- CUDA 10.1+
+- cuDNN >=7.6.5
+
+**Version relationship between PaddleDetection, PaddlePaddle, and PaddleSlim:**
+| PaddleDetection Version | PaddlePaddle Version | PaddleSlim Version | Note |
+| :---------------------: | :------------------: | :----------------: | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| release/2.3 | >= 2.1 | 2.1 | Post-training quantization depends on Paddle 2.2 and PaddleSlim 2.2 |
+| release/2.1 | >= 2.1.0 | 2.1 | Quantized model export relies on the latest Paddle develop branch, available in [PaddlePaddle daily builds](https://www.paddlepaddle.org.cn/documentation/docs/zh/install/Tables.html#whl-dev) |
+| release/2.0 | >= 2.0.1 | 2.0 | Quantization depends on Paddle 2.1 and PaddleSlim 2.1 |
+
+
+#### Install PaddleSlim
+- Method 1: Install it directly:
+```
+pip install paddleslim -i https://pypi.tuna.tsinghua.edu.cn/simple
+```
+- Method 2: Compile and install:
+```
+git clone https://github.com/PaddlePaddle/PaddleSlim.git
+cd PaddleSlim
+python setup.py install
+```
+
+## Quick Start
+
+### Train
+
+```shell
+python tools/train.py -c configs/{MODEL.yml} --slim_config configs/slim/{SLIM_CONFIG.yml}
+```
+
+- `-c`: Specify the model configuration file.
+- `--slim_config`: Specify the compression policy profile.
+
+
+### Evaluation
+
+```shell
+python tools/eval.py -c configs/{MODEL.yml} --slim_config configs/slim/{SLIM_CONFIG.yml} -o weights=output/{SLIM_CONFIG}/model_final
+```
+
+- `-c`: Specify the model configuration file.
+- `--slim_config`: Specify the compression policy profile.
+- `-o weights`: Specifies the path of the model trained by the compression algorithm.
+
+### Test
+
+```shell
+python tools/infer.py -c configs/{MODEL.yml} --slim_config configs/slim/{SLIM_CONFIG.yml} \
+                      -o weights=output/{SLIM_CONFIG}/model_final \
+                      --infer_img={IMAGE_PATH}
+```
+
+- `-c`: Specify the model configuration file.
+- `--slim_config`: Specify the compression policy profile.
+- `-o weights`: Specifies the path of the model trained by the compression algorithm.
+- `--infer_img`: Specifies the test image path.
+
+
+## Full Chain Deployment
+
+### Dynamic-to-Static Model Export
+
+```shell
+python tools/export_model.py -c configs/{MODEL.yml} --slim_config configs/slim/{SLIM_CONFIG.yml} -o weights=output/{SLIM_CONFIG}/model_final
+```
+
+- `-c`: Specify the model configuration file.
+- `--slim_config`: Specify the compression policy profile.
+- `-o weights`: Specifies the path of the model trained by the compression algorithm.
+
+### Prediction and Deployment
+
+- Paddle-Inference Prediction:
+ - [Python Deployment](../../deploy/python/README.md)
+ - [C++ Deployment](../../deploy/cpp/README.md)
+ - [TensorRT Predictive Deployment Tutorial](../../deploy/TENSOR_RT.md)
+- Server-side deployment: use [PaddleServing](../../deploy/serving/README.md).
+- Mobile deployment: use [Paddle-Lite](../../deploy/lite/README.md) to deploy on mobile devices.
+
+## Benchmark
+
+### Pruning
+
+#### Pascal VOC Benchmark
+
+| Model | Compression Strategy | GFLOPs | Model Volume(MB) | Input Size | Predict Delay(SD855) | Box AP | Download | Model Configuration File | Compression Algorithm Configuration File |
+| :----------------: | :-------------------: | :------------: | :--------------: | :--------: | :------------------: | :--------: | :------------------------------------------------------------------------------------------------: | :--------------------------------------------------------------------------------------------------------------------------------: | :---------------------------------------------------------------------------------------------------------------------------------: |
+| YOLOv3-MobileNetV1 | baseline | 24.13 | 93 | 608 | 332.0ms | 75.1 | [link](https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v1_270e_voc.pdparams) | [configuration file](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/yolov3/yolov3_mobilenet_v1_270e_voc.yml) | - |
+| YOLOv3-MobileNetV1 | Prune-l1_norm(sensitivity) | 15.78(-34.49%) | 66(-29%) | 608 | - | 78.4(+3.3) | [link](https://paddledet.bj.bcebos.com/models/slim/yolov3_mobilenet_v1_voc_prune_l1_norm.pdparams) | [configuration file](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/yolov3/yolov3_mobilenet_v1_270e_voc.yml) | [slim configuration file](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/slim/prune/yolov3_prune_l1_norm.yml) |
+
+#### COCO Benchmark
+| Model | Compression Strategy | GFLOPs | Model Volume(MB) | Input Size | Predict Delay(SD855) | Box AP | Download | Model Configuration File | Compression Algorithm Configuration File |
+| :-----------------------: | :------------------: | :----: | :--------------: | :--------: | :------------------: | :----: | :---------------------------------------------------------------------------------------: | :------------------------------------------------------------------------------------------------------------------------------: | :-----------------------------------------------------------------------------------------------------------------------------------------: |
+| PP-YOLO-MobileNetV3_large | baseline | -- | 18.5 | 608 | 25.1ms | 23.2 | [link](https://paddledet.bj.bcebos.com/models/ppyolo_mbv3_large_coco.pdparams) | [configuration file](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_mbv3_large_coco.yml) | - |
+| PP-YOLO-MobileNetV3_large | Prune-FPGM | -37% | 12.6 | 608 | - | 22.3 | [link](https://paddledet.bj.bcebos.com/models/slim/ppyolo_mbv3_large_prune_fpgm.pdparams) | [configuration file](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_mbv3_large_coco.yml) | [slim configuration file](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/slim/prune/ppyolo_mbv3_large_prune_fpgm.yml) |
+| YOLOv3-DarkNet53 | baseline | -- | 238.2 | 608 | - | 39.0 | [link](https://paddledet.bj.bcebos.com/models/yolov3_darknet53_270e_coco.pdparams) | [configuration file](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/yolov3/yolov3_darknet53_270e_coco.yml) | - |
+| YOLOv3-DarkNet53 | Prune-FPGM | -24% | - | 608 | - | 37.6 | [link](https://paddledet.bj.bcebos.com/models/slim/yolov3_darknet_prune_fpgm.pdparams) | [configuration file](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/yolov3/yolov3_darknet53_270e_coco.yml) | [slim configuration file](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/slim/prune/yolov3_darknet_prune_fpgm.yml) |
+| PP-YOLO_R50vd | baseline | -- | 183.3 | 608 | - | 44.8 | [link](https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams) | [configuration file](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml) | - |
+| PP-YOLO_R50vd | Prune-FPGM | -35% | - | 608 | - | 42.1 | [link](https://paddledet.bj.bcebos.com/models/slim/ppyolo_r50vd_prune_fpgm.pdparams) | [configuration file](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml) | [slim configuration file](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/slim/prune/ppyolo_r50vd_prune_fpgm.yml) |
+
+Description:
+- Currently, pruning is supported for all models except the RCNN series.
+- SD855 latency is measured with Paddle Lite deployment on the armv8 architecture using 4 threads.
+
+### Quantization
+
+#### COCO Benchmark
+
+| Model | Compression Strategy | Input Size | Model Volume(MB) | Prediction Delay(V100) | Prediction Delay(SD855) | Box AP | Download | Download of Inference Model | Model Configuration File | Compression Algorithm Configuration File |
+| ------------------------- | -------------------------- | ----------- | :--------------: | :--------------------: | :---------------------: | :-------------------: | :-----------------------------------------------------------------------------------------: | :-----------------------------------------------------------------------------------------: | :----------------------------------------------------------------------------------------------------------------------------------------: | :--------------------------------------------------------------------------------------------------------------------------------------: |
+| PP-YOLOE-l | baseline | 640 | - | 11.2ms(trt_fp32)/7.7ms(trt_fp16) | -- | 50.9 | [link](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams) | - | [configuration file](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyoloe/ppyoloe_crn_l_300e_coco.yml) | - |
+| PP-YOLOE-l | Online quantization | 640 | - | 6.7ms(trt_int8) | -- | 48.8 | [link](https://paddledet.bj.bcebos.com/models/slim/ppyoloe_l_coco_qat.pdparams) | - | [configuration file](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyoloe/ppyoloe_crn_l_300e_coco.yml) | [configuration file](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/slim/quant/ppyoloe_l_qat.yml) |
+| PP-YOLOv2_R50vd | baseline | 640 | 208.6 | 19.1ms | -- | 49.1 | [link](https://paddledet.bj.bcebos.com/models/ppyolov2_r50vd_dcn_365e_coco.pdparams) | [link](https://paddledet.bj.bcebos.com/models/slim/ppyolov2_r50vd_dcn_365e_coco.tar) | [configuration file](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml) | - |
+| PP-YOLOv2_R50vd | PACT online quantization | 640 | -- | 17.3ms | -- | 48.1 | [link](https://paddledet.bj.bcebos.com/models/slim/ppyolov2_r50vd_dcn_qat.pdparams) | [link](https://paddledet.bj.bcebos.com/models/slim/ppyolov2_r50vd_dcn_qat.tar) | [configuration file](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml) | [configuration file](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/slim/quant/ppyolov2_r50vd_dcn_qat.yml) |
+| PP-YOLO_R50vd | baseline | 608 | 183.3 | 17.4ms | -- | 44.8 | [link](https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams) | [link](https://paddledet.bj.bcebos.com/models/slim/ppyolo_r50vd_dcn_1x_coco.tar) | [configuration file](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml) | - |
+| PP-YOLO_R50vd | PACT online quantization | 608 | 67.3 | 13.8ms | -- | 44.3 | [link](https://paddledet.bj.bcebos.com/models/slim/ppyolo_r50vd_qat_pact.pdparams) | [link](https://paddledet.bj.bcebos.com/models/slim/ppyolo_r50vd_qat_pact.tar) | [configuration file](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml) | [configuration file](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/slim/quant/ppyolo_r50vd_qat_pact.yml) |
+| PP-YOLO-MobileNetV3_large | baseline | 320 | 18.5 | 2.7ms | 27.9ms | 23.2 | [link](https://paddledet.bj.bcebos.com/models/ppyolo_mbv3_large_coco.pdparams) | [link](https://paddledet.bj.bcebos.com/models/slim/ppyolo_mbv3_large_coco.tar) | [configuration file](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_mbv3_large_coco.yml) | - |
+| PP-YOLO-MobileNetV3_large | Online quantization | 320 | 5.6 | -- | 25.1ms | 24.3 | [link](https://paddledet.bj.bcebos.com/models/slim/ppyolo_mbv3_large_qat.pdparams) | [link](https://paddledet.bj.bcebos.com/models/slim/ppyolo_mbv3_large_qat.tar) | [configuration file](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyolo/ppyolo_mbv3_large_coco.yml) | [configuration file](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/slim/quant/ppyolo_mbv3_large_qat.yml) |
+| YOLOv3-MobileNetV1 | baseline | 608 | 94.2 | 8.9ms | 332ms | 29.4 | [link](https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v1_270e_coco.pdparams) | [link](https://paddledet.bj.bcebos.com/models/slim/yolov3_mobilenet_v1_270e_coco.tar) | [configuration file](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/yolov3/yolov3_mobilenet_v1_270e_coco.yml) | - |
+| YOLOv3-MobileNetV1 | Online quantization | 608 | 25.4 | 6.6ms | 248ms | 30.5 | [link](https://paddledet.bj.bcebos.com/models/slim/yolov3_mobilenet_v1_coco_qat.pdparams) | [link](https://paddledet.bj.bcebos.com/models/slim/yolov3_mobilenet_v1_coco_qat.tar) | [configuration file](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/yolov3/yolov3_mobilenet_v1_270e_coco.yml) | [slim configuration file](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/slim/quant/yolov3_mobilenet_v1_qat.yml) |
+| YOLOv3-MobileNetV3 | baseline | 608 | 90.3 | 9.4ms | 367.2ms | 31.4 | [link](https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v3_large_270e_coco.pdparams) | [link](https://paddledet.bj.bcebos.com/models/slim/yolov3_mobilenet_v3_large_270e_coco.tar) | [configuration file](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/yolov3/yolov3_mobilenet_v3_large_270e_coco.yml) | - |
+| YOLOv3-MobileNetV3 | PACT online quantization | 608 | 24.4 | 8.0ms | 280.0ms | 31.1 | [link](https://paddledet.bj.bcebos.com/models/slim/yolov3_mobilenet_v3_coco_qat.pdparams) | [link](https://paddledet.bj.bcebos.com/models/slim/yolov3_mobilenet_v3_coco_qat.tar) | [configuration file](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/yolov3/yolov3_mobilenet_v3_large_270e_coco.yml) | [slim configuration file](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/slim/quant/yolov3_mobilenet_v3_qat.yml) |
+| YOLOv3-DarkNet53 | baseline | 608 | 238.2 | 16.0ms | -- | 39.0 | [link](https://paddledet.bj.bcebos.com/models/yolov3_darknet53_270e_coco.pdparams) | [link](https://paddledet.bj.bcebos.com/models/slim/yolov3_darknet53_270e_coco.tar) | [Configuration File ](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/yolov3/yolov3_darknet53_270e_coco.yml) | - |
+| YOLOv3-DarkNet53 | Common Online quantitative | 608 | 78.8 | 12.4ms | -- | 38.8 | [link](https://paddledet.bj.bcebos.com/models/slim/yolov3_darknet_coco_qat.pdparams) | [link](https://paddledet.bj.bcebos.com/models/slim/yolov3_darknet_coco_qat.tar) | [Configuration File ](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/yolov3/yolov3_darknet53_270e_coco.yml) | [slim Configuration File ](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/slim/quant/yolov3_darknet_qat.yml) |
+| SSD-MobileNet_v1 | baseline | 300 | 22.5 | 4.4ms | 26.6ms | 73.8 | [link](https://paddledet.bj.bcebos.com/models/ssd_mobilenet_v1_300_120e_voc.pdparams) | [link](https://paddledet.bj.bcebos.com/models/slim/ssd_mobilenet_v1_300_120e_voc.tar) | [Configuration File ](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ssd/ssd_mobilenet_v1_300_120e_voc.yml) | - |
+| SSD-MobileNet_v1 | Common Online quantitative | 300 | 7.1 | -- | 21.5ms | 72.9 | [link](https://paddledet.bj.bcebos.com/models/slim/ssd_mobilenet_v1_300_voc_qat.pdparams) | [link](https://paddledet.bj.bcebos.com/models/slim/ssd_mobilenet_v1_300_voc_qat.tar) | [Configuration File ](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ssd/ssd_mobilenet_v1_300_120e_voc.yml) | [slim Configuration File ](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/slim/quant/ssd_mobilenet_v1_qat.yml) |
+| Mask-ResNet50-FPN | baseline | (800, 1333) | 174.1 | 359.5ms | -- | 39.2/35.6 | [link](https://paddledet.bj.bcebos.com/models/mask_rcnn_r50_fpn_1x_coco.pdparams) | [link](https://paddledet.bj.bcebos.com/models/slim/mask_rcnn_r50_fpn_1x_coco.tar) | [Configuration File ](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.yml) | - |
+| Mask-ResNet50-FPN | Common Online quantitative | (800, 1333) | -- | -- | -- | 39.7(+0.5)/35.9(+0.3) | [link](https://paddledet.bj.bcebos.com/models/slim/mask_rcnn_r50_fpn_1x_qat.pdparams) | [link](https://paddledet.bj.bcebos.com/models/slim/mask_rcnn_r50_fpn_1x_qat.tar) | [Configuration File ](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.yml) | [slim Configuration File ](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/slim/quant/mask_rcnn_r50_fpn_1x_qat.yml) |
+
+Notes:
+- For the V100 latency above, non-quantized models are tested with TensorRT FP32 and quantized models with TensorRT INT8; both include NMS time.
+- The SD855 latency is measured with a Paddle Lite deployment on the ARMv8 architecture, using 4 threads for inference.
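As a quick sanity check of the table above, the YOLOv3-MobileNetV1 rows shrink from 94.2 MB to 25.4 MB after online quantization, and the SD855 latency drops from 332 ms to 248 ms; the resulting compression ratio and speedup (numbers copied from the table) are:

```python
# numbers taken from the YOLOv3-MobileNetV1 rows of the quantization table above
size_fp32, size_int8 = 94.2, 25.4   # model size in MB
lat_fp32, lat_int8 = 332.0, 248.0   # SD855 latency in ms

size_reduction = 1 - size_int8 / size_fp32
speedup = lat_fp32 / lat_int8
print(f"size reduction: {size_reduction:.1%}, speedup: {speedup:.2f}x")
# → size reduction: 73.0%, speedup: 1.34x
```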
+
+### Distillation
+
+#### COCO Benchmark
+
+| Model | Compression Strategy | Input Size | Box AP | Download | Model Configuration File | Compression Strategy Configuration File |
+| ------------------ | -------------------- | ---------- | :--------: | :-------------------------------------------------------------------------------------------: | :----------------------------------------------------------------------------------------------------------------------------------: | :-----------------------------------------------------------------------------------------------------------------------------------------------: |
+| YOLOv3-MobileNetV1 | baseline | 608 | 29.4 | [link](https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v1_270e_coco.pdparams) | [Configuration File](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/yolov3/yolov3_mobilenet_v1_270e_coco.yml) | - |
+| YOLOv3-MobileNetV1 | Distillation | 608 | 31.0(+1.6) | [link](https://paddledet.bj.bcebos.com/models/slim/yolov3_mobilenet_v1_coco_distill.pdparams) | [Configuration File](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/yolov3/yolov3_mobilenet_v1_270e_coco.yml) | [slim Configuration File](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/slim/distill/yolov3_mobilenet_v1_coco_distill.yml) |
+
+- For details of the distillation method, please refer to the [Distillation Policy Document](distill/README.md)
+
+### Combined Distillation and Pruning Strategy
+
+#### COCO Benchmark
+
+| Model | Compression Strategy | Input Size | GFLOPs | Model Size (MB) | Inference Latency (SD855) | Box AP | Download | Model Configuration File | Compression Algorithm Configuration File |
+| ------------------ | ------------------------ | ---------- | :----------: | :--------------: | :---------------------: | :--------: | :-------------------------------------------------------------------------------------------------: | :----------------------------------------------------------------------------------------------------------------------------------: | :--------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| YOLOv3-MobileNetV1 | baseline | 608 | 24.65 | 94.2 | 332.0ms | 29.4 | [link](https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v1_270e_coco.pdparams) | [Configuration File](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/yolov3/yolov3_mobilenet_v1_270e_coco.yml) | - |
+| YOLOv3-MobileNetV1 | Distillation + Pruning | 608 | 7.54(-69.4%) | 30.9(-67.2%) | 166.1ms | 28.4(-1.0) | [link](https://paddledet.bj.bcebos.com/models/slim/yolov3_mobilenet_v1_coco_distill_prune.pdparams) | [Configuration File](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/yolov3/yolov3_mobilenet_v1_270e_coco.yml) | [slim Configuration File](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/slim/extensions/yolov3_mobilenet_v1_coco_distill_prune.yml) |
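The percentage reductions reported in the row above follow directly from the baseline values; a quick arithmetic check:

```python
# baseline vs distill+prune values from the table above
gflops = (24.65, 7.54)
size_mb = (94.2, 30.9)
latency_ms = (332.0, 166.1)

def reduction(base, slim):
    # percentage reduction relative to the baseline
    return round(100 * (1 - slim / base), 1)

print(reduction(*gflops), reduction(*size_mb))  # → 69.4 67.2
print(round(latency_ms[0] / latency_ms[1], 2))  # → 2.0
```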
diff --git a/PaddleDetection-release-2.6/configs/slim/distill/README.md b/PaddleDetection-release-2.6/configs/slim/distill/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..97c93fcc42d3f7d233e5e8794144bfeb8c1cd5b0
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/distill/README.md
@@ -0,0 +1,212 @@
+# Distillation
+
+## Contents
+- [YOLOv3 Model Distillation](#yolov3-model-distillation)
+- [FGD Model Distillation](#fgd-model-distillation)
+- [CWD Model Distillation](#cwd-model-distillation)
+- [LD Model Distillation](#ld-model-distillation)
+- [PPYOLOE Model Distillation](#ppyoloe-model-distillation)
+- [Citations](#citations)
+
+## YOLOv3 Model Distillation
+
+Taking YOLOv3-MobileNetV1 as an example, YOLOv3-ResNet34 is used as the teacher network to distill a student network with the YOLOv3-MobileNetV1 architecture.
+As a detection training target, the COCO dataset is harder, which means the teacher network predicts many more background bboxes; directly using the teacher's predictions as the student's `soft label` therefore causes a severe class-imbalance problem. Solving it requires a dedicated method; for the background, see the paper [Object detection at 200 Frames Per Second](https://arxiv.org/abs/1805.06361).
+To determine what to distill, we first locate the `x,y,w,h,cls,objectness` tensors produced by the student and teacher networks, and use the teacher's outputs to guide the student's training. See the [code](../../../ppdet/slim/distill_loss.py) for the concrete implementation.
+
+| Model | Scheme | Input Size | epochs | Box mAP | Config | Download |
+| :---------------: | :---------: | :----: | :----: |:-----------: | :--------------: | :------------: |
+| YOLOv3-ResNet34 | teacher | 608 | 270e | 36.2 | [config](../../yolov3/yolov3_r34_270e_coco.yml) | [download](https://paddledet.bj.bcebos.com/models/yolov3_r34_270e_coco.pdparams) |
+| YOLOv3-MobileNetV1 | student | 608 | 270e | 29.4 | [config](../../yolov3/yolov3_mobilenet_v1_270e_coco.yml) | [download](https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v1_270e_coco.pdparams) |
+| YOLOv3-MobileNetV1 | distill | 608 | 270e | 31.0(+1.6) | [config](../../yolov3/yolov3_mobilenet_v1_270e_coco.yml),[slim_config](./yolov3_mobilenet_v1_coco_distill.yml) | [download](https://paddledet.bj.bcebos.com/models/slim/yolov3_mobilenet_v1_coco_distill.pdparams) |
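The class-imbalance fix from the 200-FPS paper is, at its core, objectness scaling: the teacher's objectness down-weights distillation on the (mostly background) boxes. A minimal NumPy sketch of that idea, not the actual `distill_loss.py` implementation (names are illustrative):

```python
import numpy as np

def objectness_scaled_distill(t_obj, t_cls, s_cls):
    """Weight per-box classification distillation by teacher objectness,
    so the many background boxes (low objectness) contribute little."""
    per_box = ((t_cls - s_cls) ** 2).mean(axis=1)   # (N,) MSE per box
    return float((t_obj * per_box).sum() / max(t_obj.sum(), 1e-6))
```

With a teacher objectness of 0 for a background box, that box drops out of the loss entirely.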
+
+
+**Quick Start**
+
+```shell
+# Single-GPU training (not recommended)
+python tools/train.py -c configs/yolov3/yolov3_mobilenet_v1_270e_coco.yml --slim_config configs/slim/distill/yolov3_mobilenet_v1_coco_distill.yml
+# Multi-GPU training
+python -m paddle.distributed.launch --log_dir=logs/ --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/yolov3/yolov3_mobilenet_v1_270e_coco.yml --slim_config configs/slim/distill/yolov3_mobilenet_v1_coco_distill.yml
+# Evaluation
+python tools/eval.py -c configs/yolov3/yolov3_mobilenet_v1_270e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/slim/yolov3_mobilenet_v1_coco_distill.pdparams
+# Inference
+python tools/infer.py -c configs/yolov3/yolov3_mobilenet_v1_270e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/slim/yolov3_mobilenet_v1_coco_distill.pdparams --infer_img=demo/000000014439_640x640.jpg
+```
+
+- `-c`: the model config file, which is also the student config file.
+- `--slim_config`: the compression strategy config file, which is also the teacher config file.
+
+
+
+
+## FGD Model Distillation
+
+FGD, short for [Focal and Global Knowledge Distillation for Detectors](https://arxiv.org/abs/2111.11837v1), is a distillation method for object detection with two parts, `Focal` and `Global`. `Focal` distillation separates the foreground and background of the image, letting the student attend to the key pixels of the teacher's foreground and background features separately; `Global` distillation reconstructs the relations between different pixels and transfers them from teacher to student, compensating for the global information lost in `Focal` distillation. Experiments show that FGD effectively improves accuracy for both anchor-based and anchor-free methods.
+In PaddleDetection we implement the FGD algorithm and validate it on RetinaNet; the experimental results are as follows:
+
+| Model | Scheme | Input Size | epochs | Box mAP | Config | Download |
+| ----------------- | ----------- | ------ | :----: | :-----------: | :--------------: | :------------: |
+| RetinaNet-ResNet101| teacher | 1333x800 | 2x | 40.6 | [config](../../retinanet/retinanet_r101_fpn_2x_coco.yml) | [download](https://paddledet.bj.bcebos.com/models/retinanet_r101_fpn_2x_coco.pdparams) |
+| RetinaNet-ResNet50 | student | 1333x800 | 2x | 39.1 | [config](../../retinanet/retinanet_r50_fpn_2x_coco.yml) | [download](https://paddledet.bj.bcebos.com/models/retinanet_r50_fpn_2x_coco.pdparams) |
+| RetinaNet-ResNet50 | FGD | 1333x800 | 2x | 40.8(+1.7) | [config](../../retinanet/retinanet_r50_fpn_2x_coco.yml),[slim_config](./retinanet_resnet101_coco_distill.yml) | [download](https://paddledet.bj.bcebos.com/models/retinanet_r101_distill_r50_2x_coco.pdparams) |
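The `Focal` part above can be sketched as masking the feature-imitation MSE into foreground and background terms with separate weights (`alpha_fgd`/`beta_fgd` in the slim config). A simplified NumPy illustration, not the full FGD loss (which also uses spatial/channel attention):

```python
import numpy as np

def focal_feat_loss(t_feat, s_feat, fg_mask, alpha=0.001, beta=0.0005):
    """Foreground/background-separated feature imitation (simplified).
    fg_mask is 1 inside ground-truth boxes, 0 elsewhere."""
    diff = (t_feat - s_feat) ** 2
    fg = (diff * fg_mask).sum() / max(fg_mask.sum(), 1.0)
    bg = (diff * (1.0 - fg_mask)).sum() / max((1.0 - fg_mask).sum(), 1.0)
    return float(alpha * fg + beta * bg)
```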
+
+
+**Quick Start**
+
+```shell
+# Single-GPU training (not recommended)
+python tools/train.py -c configs/retinanet/retinanet_r50_fpn_2x_coco.yml --slim_config configs/slim/distill/retinanet_resnet101_coco_distill.yml
+# Multi-GPU training
+python -m paddle.distributed.launch --log_dir=logs/ --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/retinanet/retinanet_r50_fpn_2x_coco.yml --slim_config configs/slim/distill/retinanet_resnet101_coco_distill.yml
+# Evaluation
+python tools/eval.py -c configs/retinanet/retinanet_r50_fpn_2x_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/retinanet_r101_distill_r50_2x_coco.pdparams
+# Inference
+python tools/infer.py -c configs/retinanet/retinanet_r50_fpn_2x_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/retinanet_r101_distill_r50_2x_coco.pdparams --infer_img=demo/000000014439_640x640.jpg
+```
+
+- `-c`: the model config file, which is also the student config file.
+- `--slim_config`: the compression strategy config file, which is also the teacher config file.
+
+
+
+
+## CWD Model Distillation
+
+CWD, short for [Channel-wise Knowledge Distillation for Dense Prediction](https://arxiv.org/pdf/2011.13256.pdf), minimizes the Kullback-Leibler (KL) divergence between the channel-wise probability maps of the teacher and student networks, so that distillation focuses on the most salient regions of each channel, improving accuracy on tasks such as text detection and image segmentation. In PaddleDetection we implement the CWD algorithm and validate it on the GFL and RetinaNet models; the experimental results are as follows:
+
+| Model | Scheme | Input Size | epochs | Box mAP | Config | Download |
+| ----------------- | ----------- | ------ | :----: | :-----------: | :--------------: | :------------: |
+| RetinaNet-ResNet101| teacher | 1333x800 | 2x | 40.6 | [config](../../retinanet/retinanet_r101_fpn_2x_coco.yml) | [download](https://paddledet.bj.bcebos.com/models/retinanet_r101_fpn_2x_coco.pdparams) |
+| RetinaNet-ResNet50 | student | 1333x800 | 2x | 39.1 | [config](../../retinanet/retinanet_r50_fpn_2x_coco.yml) | [download](https://paddledet.bj.bcebos.com/models/retinanet_r50_fpn_2x_coco.pdparams) |
+| RetinaNet-ResNet50 | CWD | 1333x800 | 2x | 40.5(+1.4) | [config](../../retinanet/retinanet_r50_fpn_2x_coco.yml),[slim_config](./retinanet_resnet101_coco_distill_cwd.yml) | [download](https://paddledet.bj.bcebos.com/models/retinanet_r50_fpn_2x_coco_cwd.pdparams) |
+| GFL_ResNet101-vd| teacher | 1333x800 | 2x | 46.8 | [config](../../gfl/gfl_r101vd_fpn_mstrain_2x_coco.yml) | [download](https://paddledet.bj.bcebos.com/models/gfl_r101vd_fpn_mstrain_2x_coco.pdparams) |
+| GFL_ResNet50 | student | 1333x800 | 1x | 41.0 | [config](../../gfl/gfl_r50_fpn_1x_coco.yml) | [download](https://paddledet.bj.bcebos.com/models/gfl_r50_fpn_1x_coco.pdparams) |
+| GFL_ResNet50 | CWD | 1333x800 | 2x | 44.0(+3.0) | [config](../../gfl/gfl_r50_fpn_1x_coco.yml),[slim_config](./gfl_r101vd_fpn_coco_distill_cwd.yml) | [download](https://bj.bcebos.com/v1/paddledet/models/gfl_r50_fpn_2x_coco_cwd.pdparams) |
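Channel-wise distillation treats each channel's activation map as a probability distribution over spatial locations and matches teacher and student with KL divergence. A minimal NumPy sketch, assuming `tau` and `weight` as in the `CWDFeatureLoss` config (not the library implementation):

```python
import numpy as np

def cwd_loss(t_feat, s_feat, tau=1.0, weight=5.0):
    """KL between per-channel spatial softmax maps; feats are (C, H*W)."""
    def spatial_softmax(x):
        z = x / tau
        e = np.exp(z - z.max(axis=1, keepdims=True))
        return e / e.sum(axis=1, keepdims=True)
    p_t = spatial_softmax(t_feat)
    p_s = spatial_softmax(s_feat)
    kl = (p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12))).sum(axis=1)
    return float(weight * tau ** 2 * kl.mean())
```

Identical features give zero loss; any mismatch makes the KL term positive.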
+
+
+**Quick Start**
+
+```shell
+# Single-GPU training (not recommended)
+python tools/train.py -c configs/retinanet/retinanet_r50_fpn_2x_coco.yml --slim_config configs/slim/distill/retinanet_resnet101_coco_distill_cwd.yml
+# Multi-GPU training
+python -m paddle.distributed.launch --log_dir=logs/ --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/retinanet/retinanet_r50_fpn_2x_coco.yml --slim_config configs/slim/distill/retinanet_resnet101_coco_distill_cwd.yml
+# Evaluation
+python tools/eval.py -c configs/retinanet/retinanet_r50_fpn_2x_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/retinanet_r50_fpn_2x_coco_cwd.pdparams
+# Inference
+python tools/infer.py -c configs/retinanet/retinanet_r50_fpn_2x_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/retinanet_r50_fpn_2x_coco_cwd.pdparams --infer_img=demo/000000014439_640x640.jpg
+
+# Single-GPU training (not recommended)
+python tools/train.py -c configs/gfl/gfl_r50_fpn_1x_coco.yml --slim_config configs/slim/distill/gfl_r101vd_fpn_coco_distill_cwd.yml
+# Multi-GPU training
+python -m paddle.distributed.launch --log_dir=logs/ --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/gfl/gfl_r50_fpn_1x_coco.yml --slim_config configs/slim/distill/gfl_r101vd_fpn_coco_distill_cwd.yml
+# Evaluation
+python tools/eval.py -c configs/gfl/gfl_r50_fpn_1x_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/gfl_r50_fpn_2x_coco_cwd.pdparams
+# Inference
+python tools/infer.py -c configs/gfl/gfl_r50_fpn_1x_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/gfl_r50_fpn_2x_coco_cwd.pdparams --infer_img=demo/000000014439_640x640.jpg
+```
+
+- `-c`: the model config file, which is also the student config file.
+- `--slim_config`: the compression strategy config file, which is also the teacher config file.
+
+
+
+
+## LD Model Distillation
+
+LD, short for [Localization Distillation for Dense Object Detection](https://arxiv.org/abs/2102.12252), represents the regression box as a probability distribution and applies classification-style KD to the localization task; with a region-adaptive, divide-and-conquer strategy, it learns classification knowledge and localization knowledge in different regions. In PaddleDetection we implement the LD algorithm and validate it on the GFL model; the experimental results are as follows:
+
+| Model | Scheme | Input Size | epochs | Box mAP | Config | Download |
+| ----------------- | ----------- | ------ | :----: | :-----------: | :--------------: | :------------: |
+| GFL_ResNet101-vd| teacher | 1333x800 | 2x | 46.8 | [config](../../gfl/gfl_r101vd_fpn_mstrain_2x_coco.yml) | [download](https://paddledet.bj.bcebos.com/models/gfl_r101vd_fpn_mstrain_2x_coco.pdparams) |
+| GFL_ResNet18-vd | student | 1333x800 | 1x | 36.6 | [config](../../gfl/gfl_r18vd_1x_coco.yml) | [download](https://paddledet.bj.bcebos.com/models/gfl_r18vd_1x_coco.pdparams) |
+| GFL_ResNet18-vd | LD | 1333x800 | 1x | 38.2(+1.6) | [config](../../gfl/gfl_slim_ld_r18vd_1x_coco.yml),[slim_config](./gfl_ld_distill.yml) | [download](https://bj.bcebos.com/v1/paddledet/models/gfl_slim_ld_r18vd_1x_coco.pdparams) |
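LD reuses classification-style KD on GFL's discretized box representation: each box edge is a distribution over n bins, and the student matches the teacher's temperature-softened distribution. A minimal NumPy sketch (the temperature name `T` and helper are illustrative, not the library API):

```python
import numpy as np

def ld_edge_loss(t_logits, s_logits, T=10.0):
    """KL between softened per-edge bin distributions; logits are (..., n_bins)."""
    def softmax(x):
        e = np.exp(x - x.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)
    p_t = softmax(t_logits / T)
    p_s = softmax(s_logits / T)
    kl = (p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12))).sum(axis=-1)
    return float((T ** 2) * kl.mean())
```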
+
+
+**Quick Start**
+
+```shell
+# Single-GPU training (not recommended)
+python tools/train.py -c configs/gfl/gfl_slim_ld_r18vd_1x_coco.yml --slim_config configs/slim/distill/gfl_ld_distill.yml
+# Multi-GPU training
+python -m paddle.distributed.launch --log_dir=logs/ --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/gfl/gfl_slim_ld_r18vd_1x_coco.yml --slim_config configs/slim/distill/gfl_ld_distill.yml
+# Evaluation
+python tools/eval.py -c configs/gfl/gfl_slim_ld_r18vd_1x_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/gfl_slim_ld_r18vd_1x_coco.pdparams
+# Inference
+python tools/infer.py -c configs/gfl/gfl_slim_ld_r18vd_1x_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/gfl_slim_ld_r18vd_1x_coco.pdparams --infer_img=demo/000000014439_640x640.jpg
+```
+
+- `-c`: the model config file, which is also the student config file.
+- `--slim_config`: the compression strategy config file, which is also the teacher config file.
+
+
+
+
+## PPYOLOE Model Distillation
+
+PaddleDetection provides a model distillation scheme for PP-YOLOE+ that combines logits distillation and feature distillation.
+
+| Model | Scheme | Input Size | epochs | Box mAP | Config | Download |
+| ----------------- | ----------- | ------ | :----: | :-----------: | :--------------: | :------------: |
+| PP-YOLOE+_x | teacher | 640 | 80e | 54.7 | [config](../../ppyoloe/ppyoloe_plus_crn_x_80e_coco.yml) | [model](https://bj.bcebos.com/v1/paddledet/models/ppyoloe_plus_crn_x_80e_coco.pdparams) |
+| PP-YOLOE+_l | student | 640 | 80e | 52.9 | [config](../../ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml) | [model](https://bj.bcebos.com/v1/paddledet/models/ppyoloe_plus_crn_l_80e_coco.pdparams) |
+| PP-YOLOE+_l | distill | 640 | 80e | **54.0(+1.1)** | [config](../../ppyoloe/distill/ppyoloe_plus_crn_l_80e_coco_distill.yml),[slim_config](./ppyoloe_plus_distill_x_distill_l.yml) | [model](https://bj.bcebos.com/v1/paddledet/models/ppyoloe_plus_crn_l_80e_coco_distill.pdparams) |
+| PP-YOLOE+_l | teacher | 640 | 80e | 52.9 | [config](../../ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml) | [model](https://bj.bcebos.com/v1/paddledet/models/ppyoloe_plus_crn_l_80e_coco.pdparams) |
+| PP-YOLOE+_m | student | 640 | 80e | 49.8 | [config](../../ppyoloe/ppyoloe_plus_crn_m_80e_coco.yml) | [model](https://bj.bcebos.com/v1/paddledet/models/ppyoloe_plus_crn_m_80e_coco.pdparams) |
+| PP-YOLOE+_m | distill | 640 | 80e | **51.0(+1.2)** | [config](../../ppyoloe/distill/ppyoloe_plus_crn_m_80e_coco_distill.yml),[slim_config](./ppyoloe_plus_distill_l_distill_m.yml) | [model](https://bj.bcebos.com/v1/paddledet/models/ppyoloe_plus_crn_m_80e_coco_distill.pdparams) |
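With the weights used in the slim configs (`loss_weight: {'logits': 4.0, 'feat': 1.0}` and `logits_loss_weight: {'class': 1.0, 'iou': 2.5, 'dfl': 0.5}`), the total distillation loss is a weighted sum of the sub-losses. A sketch, assuming simple scalar sub-losses (function name is illustrative):

```python
def total_distill_loss(cls, iou, dfl, feat,
                       logits_w=(1.0, 2.5, 0.5), loss_w=(4.0, 1.0)):
    """Combine logits distillation (class/iou/dfl) with feature distillation."""
    logits = logits_w[0] * cls + logits_w[1] * iou + logits_w[2] * dfl
    return loss_w[0] * logits + loss_w[1] * feat

print(total_distill_loss(1.0, 1.0, 1.0, 1.0))  # → 17.0
```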
+
+
+**Quick Start**
+
+```shell
+# Single-GPU training (not recommended)
+python tools/train.py -c configs/ppyoloe/distill/ppyoloe_plus_crn_l_80e_coco_distill.yml --slim_config configs/slim/distill/ppyoloe_plus_distill_x_distill_l.yml
+# Multi-GPU training
+python -m paddle.distributed.launch --log_dir=logs/ --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/ppyoloe/distill/ppyoloe_plus_crn_l_80e_coco_distill.yml --slim_config configs/slim/distill/ppyoloe_plus_distill_x_distill_l.yml
+# Evaluation
+python tools/eval.py -c configs/ppyoloe/distill/ppyoloe_plus_crn_l_80e_coco_distill.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco_distill.pdparams
+# Inference
+python tools/infer.py -c configs/ppyoloe/distill/ppyoloe_plus_crn_l_80e_coco_distill.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco_distill.pdparams --infer_img=demo/000000014439_640x640.jpg
+```
+
+- `-c`: the model config file, which is also the student config file.
+- `--slim_config`: the compression strategy config file, which is also the teacher config file.
+
+
+
+
+## Citations
+```
+@article{mehta2018object,
+ title={Object detection at 200 Frames Per Second},
+ author={Rakesh Mehta and Cemalettin Ozturk},
+ year={2018},
+ eprint={1805.06361},
+ archivePrefix={arXiv},
+ primaryClass={cs.CV}
+}
+
+@inproceedings{yang2022focal,
+ title={Focal and global knowledge distillation for detectors},
+ author={Yang, Zhendong and Li, Zhe and Jiang, Xiaohu and Gong, Yuan and Yuan, Zehuan and Zhao, Danpei and Yuan, Chun},
+ booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
+ pages={4643--4652},
+ year={2022}
+}
+
+@Inproceedings{zheng2022LD,
+ title={Localization Distillation for Dense Object Detection},
+ author= {Zheng, Zhaohui and Ye, Rongguang and Wang, Ping and Ren, Dongwei and Zuo, Wangmeng and Hou, Qibin and Cheng, Mingming},
+ booktitle={CVPR},
+ year={2022}
+}
+
+@inproceedings{shu2021channel,
+ title={Channel-wise knowledge distillation for dense prediction},
+ author={Shu, Changyong and Liu, Yifan and Gao, Jianfei and Yan, Zheng and Shen, Chunhua},
+ booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
+ pages={5311--5320},
+ year={2021}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/slim/distill/gfl_ld_distill.yml b/PaddleDetection-release-2.6/configs/slim/distill/gfl_ld_distill.yml
new file mode 100644
index 0000000000000000000000000000000000000000..2601e99f319e089d34caf912495c87a8fe0fd98c
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/distill/gfl_ld_distill.yml
@@ -0,0 +1,25 @@
+_BASE_: [
+ '../../gfl/gfl_r18vd_1x_coco.yml',
+]
+
+# teacher pretrain model
+pretrain_weights: https://paddledet.bj.bcebos.com/models/gfl_r101vd_fpn_mstrain_2x_coco.pdparams
+
+slim: Distill
+slim_method: LD
+
+ResNet:
+ depth: 101
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [1,2,3]
+ num_stages: 4
+
+TrainReader:
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ - Gt2GFLTarget:
+ downsample_ratios: [8, 16, 32, 64, 128]
+ grid_cell_scale: 8
+ compute_vlr_region: True
\ No newline at end of file
diff --git a/PaddleDetection-release-2.6/configs/slim/distill/gfl_r101vd_fpn_coco_distill_cwd.yml b/PaddleDetection-release-2.6/configs/slim/distill/gfl_r101vd_fpn_coco_distill_cwd.yml
new file mode 100644
index 0000000000000000000000000000000000000000..3af5ac17f2c1a4d4f4445944435879ed486301d0
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/distill/gfl_r101vd_fpn_coco_distill_cwd.yml
@@ -0,0 +1,16 @@
+_BASE_: [
+ '../../gfl/gfl_r101vd_fpn_mstrain_2x_coco.yml',
+]
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/gfl_r101vd_fpn_mstrain_2x_coco.pdparams
+
+slim: Distill
+slim_method: CWD
+distill_loss: CWDFeatureLoss
+distill_loss_name: ['cls_f_4', 'cls_f_3', 'cls_f_2', 'cls_f_1', 'cls_f_0']
+
+CWDFeatureLoss:
+ student_channels: 80
+ teacher_channels: 80
+ tau: 1.0
+ weight: 5.0
diff --git a/PaddleDetection-release-2.6/configs/slim/distill/ppyoloe_plus_distill_l_distill_m.yml b/PaddleDetection-release-2.6/configs/slim/distill/ppyoloe_plus_distill_l_distill_m.yml
new file mode 100644
index 0000000000000000000000000000000000000000..0a5bfcd29cc352bb67f48a33a28f4531728543e5
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/distill/ppyoloe_plus_distill_l_distill_m.yml
@@ -0,0 +1,53 @@
+# teacher and slim config
+_BASE_: [
+ '../../ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml',
+]
+depth_mult: 1.0
+width_mult: 1.0
+for_distill: True
+architecture: PPYOLOE
+PPYOLOE:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams
+find_unused_parameters: True
+
+worker_num: 4
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [640], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 8
+ shuffle: True
+ drop_last: True
+ use_shared_memory: True
+ collate_batch: True
+
+
+slim: Distill
+slim_method: PPYOLOEDistill
+distill_loss: DistillPPYOLOELoss
+
+DistillPPYOLOELoss: # L -> M
+ loss_weight: {'logits': 4.0, 'feat': 1.0}
+ logits_distill: True
+ logits_loss_weight: {'class': 1.0, 'iou': 2.5, 'dfl': 0.5}
+ logits_ld_distill: True
+ logits_ld_params: {'weight': 20000, 'T': 10}
+ feat_distill: True
+ feat_distiller: 'fgd' # ['cwd', 'fgd', 'pkd', 'mgd', 'mimic']
+ feat_distill_place: 'neck_feats'
+ teacher_width_mult: 1.0 # L
+ student_width_mult: 0.75 # M
+  feat_out_channels: [768, 384, 192] # actual channels are these base values multiplied by width_mult
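As the comment above notes, `feat_out_channels` holds base values that are scaled by each model's `width_mult`; the effective teacher (L) and student (M) neck channels can be checked as:

```python
base = [768, 384, 192]                   # feat_out_channels from the config
teacher = [int(c * 1.0) for c in base]   # teacher_width_mult 1.0 (L)
student = [int(c * 0.75) for c in base]  # student_width_mult 0.75 (M)
print(teacher, student)  # → [768, 384, 192] [576, 288, 144]
```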
diff --git a/PaddleDetection-release-2.6/configs/slim/distill/ppyoloe_plus_distill_m_distill_s.yml b/PaddleDetection-release-2.6/configs/slim/distill/ppyoloe_plus_distill_m_distill_s.yml
new file mode 100644
index 0000000000000000000000000000000000000000..8ee944e9b91394b7bcfbf89a9610c302d803c9bb
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/distill/ppyoloe_plus_distill_m_distill_s.yml
@@ -0,0 +1,53 @@
+# teacher and slim config
+_BASE_: [
+ '../../ppyoloe/ppyoloe_plus_crn_m_80e_coco.yml',
+]
+depth_mult: 0.67
+width_mult: 0.75
+for_distill: True
+architecture: PPYOLOE
+PPYOLOE:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_m_80e_coco.pdparams
+find_unused_parameters: True
+
+worker_num: 4
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [640], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 8
+ shuffle: True
+ drop_last: True
+ use_shared_memory: True
+ collate_batch: True
+
+
+slim: Distill
+slim_method: PPYOLOEDistill
+distill_loss: DistillPPYOLOELoss
+
+DistillPPYOLOELoss: # M -> S
+ loss_weight: {'logits': 4.0, 'feat': 1.0}
+ logits_distill: True
+ logits_loss_weight: {'class': 1.0, 'iou': 2.5, 'dfl': 0.5}
+ logits_ld_distill: True
+ logits_ld_params: {'weight': 20000, 'T': 10}
+ feat_distill: True
+ feat_distiller: 'fgd' # ['cwd', 'fgd', 'pkd', 'mgd', 'mimic']
+ feat_distill_place: 'neck_feats'
+ teacher_width_mult: 0.75 # M
+ student_width_mult: 0.5 # S
+  feat_out_channels: [768, 384, 192] # actual channels are these base values multiplied by width_mult
diff --git a/PaddleDetection-release-2.6/configs/slim/distill/ppyoloe_plus_distill_x_distill_l.yml b/PaddleDetection-release-2.6/configs/slim/distill/ppyoloe_plus_distill_x_distill_l.yml
new file mode 100644
index 0000000000000000000000000000000000000000..55d3c4c9f08ae373e764062c3820f3823710e1d7
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/distill/ppyoloe_plus_distill_x_distill_l.yml
@@ -0,0 +1,53 @@
+# teacher and slim config
+_BASE_: [
+ '../../ppyoloe/ppyoloe_plus_crn_x_80e_coco.yml',
+]
+depth_mult: 1.33
+width_mult: 1.25
+for_distill: True
+architecture: PPYOLOE
+PPYOLOE:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_x_80e_coco.pdparams
+find_unused_parameters: True
+
+worker_num: 4
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [640], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 8
+ shuffle: True
+ drop_last: True
+ use_shared_memory: True
+ collate_batch: True
+
+
+slim: Distill
+slim_method: PPYOLOEDistill
+distill_loss: DistillPPYOLOELoss
+
+DistillPPYOLOELoss: # X -> L
+ loss_weight: {'logits': 4.0, 'feat': 1.0}
+ logits_distill: True
+ logits_loss_weight: {'class': 1.0, 'iou': 2.5, 'dfl': 0.5}
+ logits_ld_distill: True
+ logits_ld_params: {'weight': 20000, 'T': 10}
+ feat_distill: True
+ feat_distiller: 'fgd' # ['cwd', 'fgd', 'pkd', 'mgd', 'mimic']
+ feat_distill_place: 'neck_feats'
+ teacher_width_mult: 1.25 # X
+ student_width_mult: 1.0 # L
+  feat_out_channels: [768, 384, 192] # actual channels are these base values multiplied by width_mult
diff --git a/PaddleDetection-release-2.6/configs/slim/distill/retinanet_resnet101_coco_distill.yml b/PaddleDetection-release-2.6/configs/slim/distill/retinanet_resnet101_coco_distill.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d4793c02063d8159e5277e705a58cc0b423d94ea
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/distill/retinanet_resnet101_coco_distill.yml
@@ -0,0 +1,19 @@
+_BASE_: [
+ '../../retinanet/retinanet_r101_fpn_2x_coco.yml',
+]
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/retinanet_r101_fpn_2x_coco.pdparams
+
+slim: Distill
+slim_method: FGD
+distill_loss: FGDFeatureLoss
+distill_loss_name: ['neck_f_4', 'neck_f_3', 'neck_f_2', 'neck_f_1', 'neck_f_0']
+
+FGDFeatureLoss:
+ student_channels: 256
+ teacher_channels: 256
+ temp: 0.5
+ alpha_fgd: 0.001
+ beta_fgd: 0.0005
+ gamma_fgd: 0.0005
+ lambda_fgd: 0.000005
diff --git a/PaddleDetection-release-2.6/configs/slim/distill/retinanet_resnet101_coco_distill_cwd.yml b/PaddleDetection-release-2.6/configs/slim/distill/retinanet_resnet101_coco_distill_cwd.yml
new file mode 100644
index 0000000000000000000000000000000000000000..7087b85d040cc32ca366701663b29416c7547d01
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/distill/retinanet_resnet101_coco_distill_cwd.yml
@@ -0,0 +1,17 @@
+_BASE_: [
+ '../../retinanet/retinanet_r101_fpn_2x_coco.yml',
+]
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/retinanet_r101_fpn_2x_coco.pdparams
+
+
+slim: Distill
+slim_method: CWD
+distill_loss: CWDFeatureLoss
+distill_loss_name: ['cls_f_4', 'cls_f_3', 'cls_f_2', 'cls_f_1', 'cls_f_0']
+
+CWDFeatureLoss:
+ student_channels: 80
+ teacher_channels: 80
+ tau: 1.0
+ weight: 5.0
diff --git a/PaddleDetection-release-2.6/configs/slim/distill/yolov3_mobilenet_v1_coco_distill.yml b/PaddleDetection-release-2.6/configs/slim/distill/yolov3_mobilenet_v1_coco_distill.yml
new file mode 100644
index 0000000000000000000000000000000000000000..9998dec5620adac38fd8a487f7ad1ec6aeb055dd
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/distill/yolov3_mobilenet_v1_coco_distill.yml
@@ -0,0 +1,12 @@
+_BASE_: [
+ '../../yolov3/yolov3_r34_270e_coco.yml',
+]
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/yolov3_r34_270e_coco.pdparams
+
+
+slim: Distill
+distill_loss: DistillYOLOv3Loss
+
+DistillYOLOv3Loss:
+ weight: 1000
diff --git a/PaddleDetection-release-2.6/configs/slim/extensions/yolov3_mobilenet_v1_coco_distill_prune.yml b/PaddleDetection-release-2.6/configs/slim/extensions/yolov3_mobilenet_v1_coco_distill_prune.yml
new file mode 100644
index 0000000000000000000000000000000000000000..f86fac5e9ed0f291c5b3f9b6266ac5755807422c
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/extensions/yolov3_mobilenet_v1_coco_distill_prune.yml
@@ -0,0 +1,24 @@
+_BASE_: [
+ '../../yolov3/yolov3_r34_270e_coco.yml',
+]
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/yolov3_r34_270e_coco.pdparams
+
+slim: DistillPrune
+
+distill_loss: DistillYOLOv3Loss
+
+DistillYOLOv3Loss:
+ weight: 1000
+
+pruner: Pruner
+
+Pruner:
+ criterion: l1_norm
+ pruned_params: ['conv2d_27.w_0', 'conv2d_28.w_0', 'conv2d_29.w_0',
+ 'conv2d_30.w_0', 'conv2d_31.w_0', 'conv2d_32.w_0',
+ 'conv2d_34.w_0', 'conv2d_35.w_0', 'conv2d_36.w_0',
+ 'conv2d_37.w_0', 'conv2d_38.w_0', 'conv2d_39.w_0',
+ 'conv2d_41.w_0', 'conv2d_42.w_0', 'conv2d_43.w_0',
+ 'conv2d_44.w_0', 'conv2d_45.w_0', 'conv2d_46.w_0']
+ pruned_ratios: [0.5,0.5,0.5,0.5,0.5,0.5,0.7,0.7,0.7,0.7,0.7,0.7,0.8,0.8,0.8,0.8,0.8,0.8]
diff --git a/PaddleDetection-release-2.6/configs/slim/extensions/yolov3_mobilenetv1_prune_qat.yml b/PaddleDetection-release-2.6/configs/slim/extensions/yolov3_mobilenetv1_prune_qat.yml
new file mode 100644
index 0000000000000000000000000000000000000000..ff17ea0b4126d934b851df60cda2db2e17fbbae2
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/extensions/yolov3_mobilenetv1_prune_qat.yml
@@ -0,0 +1,19 @@
+# Weights of yolov3_mobilenet_v1_voc
+pretrain_weights: https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v1_270e_voc.pdparams
+slim: PrunerQAT
+
+PrunerQAT:
+ criterion: fpgm
+ pruned_params: ['conv2d_27.w_0', 'conv2d_28.w_0', 'conv2d_29.w_0',
+ 'conv2d_30.w_0', 'conv2d_31.w_0', 'conv2d_32.w_0',
+ 'conv2d_34.w_0', 'conv2d_35.w_0', 'conv2d_36.w_0',
+ 'conv2d_37.w_0', 'conv2d_38.w_0', 'conv2d_39.w_0',
+ 'conv2d_41.w_0', 'conv2d_42.w_0', 'conv2d_43.w_0',
+ 'conv2d_44.w_0', 'conv2d_45.w_0', 'conv2d_46.w_0']
+ pruned_ratios: [0.1,0.2,0.2,0.2,0.2,0.1,0.2,0.3,0.3,0.3,0.2,0.1,0.3,0.4,0.4,0.4,0.4,0.3]
+ print_prune_params: False
+ quant_config: {
+ 'weight_quantize_type': 'channel_wise_abs_max', 'activation_quantize_type': 'moving_average_abs_max',
+ 'weight_bits': 8, 'activation_bits': 8, 'dtype': 'int8', 'window_size': 10000, 'moving_rate': 0.9,
+ 'quantizable_layer_type': ['Conv2D', 'Linear']}
+ print_qat_model: True
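The `pruned_params` and `pruned_ratios` lists in these configs are paired element-wise: each ratio is the fraction of output filters removed from the correspondingly named conv weight. A small sketch of that bookkeeping (the filter counts here are hypothetical; real shapes come from the model):

```python
def remaining_filters(params, ratios, shapes):
    """Each ratio in pruned_ratios is the fraction of output filters removed
    from the conv weight named at the same index of pruned_params."""
    return {p: int(shapes[p] * (1 - r)) for p, r in zip(params, ratios)}

# Hypothetical filter counts -- the real shapes come from the model itself.
orig_filters = {'conv2d_27.w_0': 512, 'conv2d_28.w_0': 256, 'conv2d_34.w_0': 512}
kept = remaining_filters(['conv2d_27.w_0', 'conv2d_28.w_0', 'conv2d_34.w_0'],
                         [0.1, 0.2, 0.2], orig_filters)
print(kept)  # {'conv2d_27.w_0': 460, 'conv2d_28.w_0': 204, 'conv2d_34.w_0': 409}
```

This is why the two lists must have equal length: a mismatch would silently drop the trailing entries.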
diff --git a/PaddleDetection-release-2.6/configs/slim/ofa/ofa_picodet_demo.yml b/PaddleDetection-release-2.6/configs/slim/ofa/ofa_picodet_demo.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a5ade9e3168bb0e6ecb68fabc47d98d789d4ae7d
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/ofa/ofa_picodet_demo.yml
@@ -0,0 +1,85 @@
+weights: https://paddledet.bj.bcebos.com/models/pretrained/ESNet_x1_0_pretrained.pdparams
+slim: OFA
+OFA:
+ ofa_config:
+ task: expand_ratio
+ expand_ratio: [0.5, 1]
+
+ skip_neck: True
+ skip_head: True
+
+ RunConfig:
+ # Skip the output layer of each block by layer name
+ skip_layers: ['backbone._conv1._conv','backbone.2_1._conv_linear_1._conv',
+ 'backbone.2_1._conv_linear_2._conv', 'backbone.2_1._conv_dw_mv1._conv',
+ 'backbone.2_1._conv_pw_mv1._conv', 'backbone.2_2._conv_linear._conv',
+ 'backbone.2_3._conv_linear._conv', 'backbone.3_1._conv_linear_1._conv',
+ 'backbone.3_1._conv_linear_2._conv', 'backbone.3_1._conv_dw_mv1._conv',
+ 'backbone.3_1._conv_pw_mv1._conv', 'backbone.3_2._conv_linear._conv',
+ 'backbone.3_3._conv_linear._conv', 'backbone.3_4._conv_linear._conv',
+ 'backbone.3_5._conv_linear._conv', 'backbone.3_6._conv_linear._conv',
+ 'backbone.3_7._conv_linear._conv', 'backbone.4_1._conv_linear_1._conv',
+ 'backbone.4_1._conv_linear_2._conv', 'backbone.4_1._conv_dw_mv1._conv',
+ 'backbone.4_1._conv_pw_mv1._conv', 'backbone.4_2._conv_linear._conv',
+ 'backbone.4_3._conv_linear._conv']
+
+ # For block-wise search, make layers in each block in the same search space
+ same_search_space: [
+ ['backbone.2_1._conv_dw_1._conv', 'backbone.2_1._conv_pw_2._conv',
+ 'backbone.2_1._conv_dw_2._conv', 'backbone.2_1._se.conv1', 'backbone.2_1._se.conv2'],
+ ['backbone.2_2._conv_pw._conv', 'backbone.2_2._conv_dw._conv',
+ 'backbone.2_2._se.conv1', 'backbone.2_2._se.conv2'],
+ ['backbone.2_3._conv_pw._conv', 'backbone.2_3._conv_dw._conv',
+ 'backbone.2_3._se.conv1', 'backbone.2_3._se.conv2'],
+ ['backbone.3_1._conv_dw_1._conv', 'backbone.3_1._conv_pw_2._conv',
+ 'backbone.3_1._conv_dw_2._conv', 'backbone.3_1._se.conv1', 'backbone.3_1._se.conv2'],
+ ['backbone.3_2._conv_pw._conv', 'backbone.3_2._conv_dw._conv',
+ 'backbone.3_2._se.conv1', 'backbone.3_2._se.conv2'],
+ ['backbone.3_3._conv_pw._conv', 'backbone.3_3._conv_dw._conv',
+ 'backbone.3_3._se.conv1', 'backbone.3_3._se.conv2'],
+ ['backbone.3_4._conv_pw._conv', 'backbone.3_4._conv_dw._conv',
+ 'backbone.3_4._se.conv1', 'backbone.3_4._se.conv2'],
+ ['backbone.3_5._conv_pw._conv', 'backbone.3_5._conv_dw._conv',
+ 'backbone.3_5._se.conv1', 'backbone.3_5._se.conv2'],
+ ['backbone.3_6._conv_pw._conv', 'backbone.3_6._conv_dw._conv',
+ 'backbone.3_6._se.conv1', 'backbone.3_6._se.conv2'],
+ ['backbone.3_7._conv_pw._conv', 'backbone.3_7._conv_dw._conv',
+ 'backbone.3_7._se.conv1', 'backbone.3_7._se.conv2'],
+ ['backbone.4_1._conv_dw_1._conv', 'backbone.4_1._conv_pw_2._conv',
+ 'backbone.4_1._conv_dw_2._conv', 'backbone.4_1._se.conv1', 'backbone.4_1._se.conv2'],
+ ['backbone.4_2._conv_pw._conv', 'backbone.4_2._conv_dw._conv',
+ 'backbone.4_2._se.conv1', 'backbone.4_2._se.conv2'],
+ ['backbone.4_3._conv_pw._conv', 'backbone.4_3._conv_dw._conv',
+ 'backbone.4_3._se.conv1', 'backbone.4_3._se.conv2']]
+
+ # demo expand ratio
+ # Generally, for expand ratio, float in (0, 1] is available.
+ # But please be careful if the model is complicated.
+ # For picodet, there are many split and concat, the choice of channel number is important.
+ ofa_layers:
+ 'backbone.2_1._conv_dw_1._conv':
+ 'expand_ratio': [0.5, 1]
+ 'backbone.2_2._conv_pw._conv':
+ 'expand_ratio': [0.5, 1]
+ 'backbone.2_3._conv_pw._conv':
+ 'expand_ratio': [0.5, 1]
+ 'backbone.3_1._conv_dw_1._conv':
+ 'expand_ratio': [0.5, 1]
+ 'backbone.3_2._conv_pw._conv':
+ 'expand_ratio': [0.5, 1]
+ 'backbone.3_3._conv_pw._conv':
+ 'expand_ratio': [0.5, 1]
+ 'backbone.3_4._conv_pw._conv':
+ 'expand_ratio': [0.5, 1]
+ 'backbone.3_5._conv_pw._conv':
+ 'expand_ratio': [0.5, 1]
+ 'backbone.3_6._conv_pw._conv':
+ 'expand_ratio': [0.5, 1]
+ 'backbone.3_7._conv_pw._conv':
+ 'expand_ratio': [0.5, 1]
+ 'backbone.4_1._conv_dw_1._conv':
+ 'expand_ratio': [0.5, 1]
+ 'backbone.4_2._conv_pw._conv':
+ 'expand_ratio': [0.5, 1]
+ 'backbone.4_3._conv_pw._conv':
+ 'expand_ratio': [0.5, 1]
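Each `expand_ratio` entry above lists the channel multipliers a layer may adopt during the once-for-all (OFA) supernet search. A minimal sketch of how those ratios map to candidate channel counts (illustrative only; the real search space is built by PaddleSlim):

```python
def channel_candidates(base_channels, expand_ratios):
    """Candidate output-channel counts a searched layer may adopt."""
    return sorted({max(1, int(base_channels * r)) for r in expand_ratios})

# With 'expand_ratio': [0.5, 1], a hypothetical 96-channel conv can run
# at either 48 or all 96 channels during the supernet search.
print(channel_candidates(96, [0.5, 1]))  # [48, 96]
```

As the comments in the config warn, layers feeding splits and concats (common in PicoDet) must keep compatible channel counts, which is what `same_search_space` enforces.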
diff --git a/PaddleDetection-release-2.6/configs/slim/post_quant/mask_rcnn_r50_fpn_1x_coco_ptq.yml b/PaddleDetection-release-2.6/configs/slim/post_quant/mask_rcnn_r50_fpn_1x_coco_ptq.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d715aedffe2dd5e15bdb222a74aa35bc273d2240
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/post_quant/mask_rcnn_r50_fpn_1x_coco_ptq.yml
@@ -0,0 +1,10 @@
+weights: https://paddledet.bj.bcebos.com/models/mask_rcnn_r50_fpn_1x_coco.pdparams
+slim: PTQ
+
+PTQ:
+ ptq_config: {
+ 'activation_quantizer': 'HistQuantizer',
+ 'upsample_bins': 127,
+ 'hist_percent': 0.999}
+ quant_batch_num: 10
+ fuse: True
diff --git a/PaddleDetection-release-2.6/configs/slim/post_quant/mcfairmot_ptq.yml b/PaddleDetection-release-2.6/configs/slim/post_quant/mcfairmot_ptq.yml
new file mode 100644
index 0000000000000000000000000000000000000000..7ab8e38b9715aa10e5d38a84fa15a033c9ee919f
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/post_quant/mcfairmot_ptq.yml
@@ -0,0 +1,10 @@
+weights: https://paddledet.bj.bcebos.com/models/mot/mcfairmot_dla34_30e_1088x608_visdrone_vehicle_bytetracker.pdparams
+slim: PTQ
+
+PTQ:
+ ptq_config: {
+ 'activation_quantizer': 'HistQuantizer',
+ 'upsample_bins': 127,
+ 'hist_percent': 0.999}
+ quant_batch_num: 10
+ fuse: True
diff --git a/PaddleDetection-release-2.6/configs/slim/post_quant/picodet_s_ptq.yml b/PaddleDetection-release-2.6/configs/slim/post_quant/picodet_s_ptq.yml
new file mode 100644
index 0000000000000000000000000000000000000000..e1cf3ca6ab23accabf91b0d7294c0ab48accf693
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/post_quant/picodet_s_ptq.yml
@@ -0,0 +1,10 @@
+weights: https://paddledet.bj.bcebos.com/models/picodet_s_320_coco.pdparams
+slim: PTQ
+
+PTQ:
+ ptq_config: {
+ 'activation_quantizer': 'HistQuantizer',
+ 'upsample_bins': 127,
+ 'hist_percent': 0.999}
+ quant_batch_num: 10
+ fuse: True
diff --git a/PaddleDetection-release-2.6/configs/slim/post_quant/ppyolo_mbv3_large_ptq.yml b/PaddleDetection-release-2.6/configs/slim/post_quant/ppyolo_mbv3_large_ptq.yml
new file mode 100644
index 0000000000000000000000000000000000000000..712651fa58f6eca6907d4530caac2c0a2dde3551
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/post_quant/ppyolo_mbv3_large_ptq.yml
@@ -0,0 +1,10 @@
+weights: https://paddledet.bj.bcebos.com/models/ppyolo_mbv3_large_coco.pdparams
+slim: PTQ
+
+PTQ:
+ ptq_config: {
+ 'activation_quantizer': 'HistQuantizer',
+ 'upsample_bins': 127,
+ 'hist_percent': 0.999}
+ quant_batch_num: 10
+ fuse: True
diff --git a/PaddleDetection-release-2.6/configs/slim/post_quant/ppyolo_r50vd_dcn_ptq.yml b/PaddleDetection-release-2.6/configs/slim/post_quant/ppyolo_r50vd_dcn_ptq.yml
new file mode 100644
index 0000000000000000000000000000000000000000..e829d271598b3cf4243bbd724a7955c6544253e9
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/post_quant/ppyolo_r50vd_dcn_ptq.yml
@@ -0,0 +1,10 @@
+weights: https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams
+slim: PTQ
+
+PTQ:
+ ptq_config: {
+ 'activation_quantizer': 'HistQuantizer',
+ 'upsample_bins': 127,
+ 'hist_percent': 0.999}
+ quant_batch_num: 10
+ fuse: True
diff --git a/PaddleDetection-release-2.6/configs/slim/post_quant/ppyoloe_crn_s_300e_coco_ptq.yml b/PaddleDetection-release-2.6/configs/slim/post_quant/ppyoloe_crn_s_300e_coco_ptq.yml
new file mode 100644
index 0000000000000000000000000000000000000000..dfa793d528a63255fce62c6d1c94a594fee58853
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/post_quant/ppyoloe_crn_s_300e_coco_ptq.yml
@@ -0,0 +1,10 @@
+weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_s_300e_coco.pdparams
+slim: PTQ
+
+PTQ:
+ ptq_config: {
+ 'activation_quantizer': 'HistQuantizer',
+ 'upsample_bins': 127,
+ 'hist_percent': 0.999}
+ quant_batch_num: 10
+ fuse: True
diff --git a/PaddleDetection-release-2.6/configs/slim/post_quant/tinypose_128x96_ptq.yml b/PaddleDetection-release-2.6/configs/slim/post_quant/tinypose_128x96_ptq.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a3bd64761fac679d83bbbdb4011ea3ab327ad3f9
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/post_quant/tinypose_128x96_ptq.yml
@@ -0,0 +1,10 @@
+weights: https://paddledet.bj.bcebos.com/models/keypoint/tinypose_128x96.pdparams
+slim: PTQ
+
+PTQ:
+ ptq_config: {
+ 'activation_quantizer': 'HistQuantizer',
+ 'upsample_bins': 127,
+ 'hist_percent': 0.999}
+ quant_batch_num: 10
+ fuse: True
diff --git a/PaddleDetection-release-2.6/configs/slim/post_quant/yolov3_darknet53_ptq.yml b/PaddleDetection-release-2.6/configs/slim/post_quant/yolov3_darknet53_ptq.yml
new file mode 100644
index 0000000000000000000000000000000000000000..7715b082a171b42dc8efe624be54eac74a003e68
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/post_quant/yolov3_darknet53_ptq.yml
@@ -0,0 +1,10 @@
+weights: https://paddledet.bj.bcebos.com/models/yolov3_darknet53_270e_coco.pdparams
+slim: PTQ
+
+PTQ:
+ ptq_config: {
+ 'activation_quantizer': 'HistQuantizer',
+ 'upsample_bins': 127,
+ 'hist_percent': 0.999}
+ quant_batch_num: 10
+ fuse: True
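All the PTQ configs above share the same `HistQuantizer` settings: activations are collected into a histogram and the clipping threshold is chosen so that `hist_percent` (here 99.9%) of the absolute-value mass is covered, keeping rare outliers from inflating the int8 scale. A simplified sketch of that idea, with sorting standing in for the histogram (not the PaddleSlim implementation):

```python
def hist_clip_threshold(activations, hist_percent=0.999):
    """Choose a clipping threshold covering `hist_percent` of the absolute
    activation values, so rare outliers do not inflate the int8 scale."""
    mags = sorted(abs(a) for a in activations)
    idx = min(int(hist_percent * len(mags)), len(mags) - 1)
    return mags[idx]

acts = [0.01 * i for i in range(1000)] + [100.0]  # one extreme outlier
print(hist_clip_threshold(acts) < 100.0)  # True -- the outlier is clipped away
```

With a naive abs-max quantizer, the single `100.0` outlier would stretch the scale 10x and waste most of the int8 range; the percentile threshold avoids that.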
diff --git a/PaddleDetection-release-2.6/configs/slim/prune/faster_rcnn_r50_fpn_prune_fpgm.yml b/PaddleDetection-release-2.6/configs/slim/prune/faster_rcnn_r50_fpn_prune_fpgm.yml
new file mode 100644
index 0000000000000000000000000000000000000000..e86c17f04c5f7510ba95c1b09e51832dc49224bb
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/prune/faster_rcnn_r50_fpn_prune_fpgm.yml
@@ -0,0 +1,16 @@
+pretrain_weights: https://paddledet.bj.bcebos.com/models/faster_rcnn_r50_fpn_1x_coco.pdparams
+slim: Pruner
+
+Pruner:
+ criterion: fpgm
+ pruned_params: ['conv2d_27.w_0', 'conv2d_28.w_0', 'conv2d_29.w_0',
+ 'conv2d_30.w_0', 'conv2d_31.w_0', 'conv2d_32.w_0',
+ 'conv2d_33.w_0', 'conv2d_34.w_0', 'conv2d_35.w_0',
+ 'conv2d_36.w_0', 'conv2d_37.w_0', 'conv2d_38.w_0',
+ 'conv2d_39.w_0', 'conv2d_40.w_0', 'conv2d_41.w_0',
+ 'conv2d_42.w_0', 'conv2d_43.w_0', 'conv2d_44.w_0',
+ 'conv2d_45.w_0', 'conv2d_46.w_0', 'conv2d_47.w_0',
+ 'conv2d_48.w_0', 'conv2d_49.w_0', 'conv2d_50.w_0',
+ 'conv2d_51.w_0', 'conv2d_52.w_0']
+ pruned_ratios: [0.1,0.2,0.2,0.2,0.2,0.1,0.2,0.3,0.3,0.3,0.2,0.1,0.3,0.4,0.4,0.4,0.4,0.3,0.4,0.4,0.4,0.4,0.4,0.4,0.4,0.4]
+ print_params: False
diff --git a/PaddleDetection-release-2.6/configs/slim/prune/picodet_m_unstructured_prune_75.yml b/PaddleDetection-release-2.6/configs/slim/prune/picodet_m_unstructured_prune_75.yml
new file mode 100644
index 0000000000000000000000000000000000000000..94345b4e8839d347d0a9ae3eae0337af41f8add3
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/prune/picodet_m_unstructured_prune_75.yml
@@ -0,0 +1,11 @@
+pretrain_weights: https://paddledet.bj.bcebos.com/models/picodet_m_320_coco.pdparams
+slim: UnstructuredPruner
+
+UnstructuredPruner:
+ stable_epochs: 0
+ pruning_epochs: 150
+ tunning_epochs: 150
+ pruning_steps: 300
+ ratio: 0.75
+ initial_ratio: 0.15
+ prune_params_type: conv1x1_only
diff --git a/PaddleDetection-release-2.6/configs/slim/prune/picodet_m_unstructured_prune_85.yml b/PaddleDetection-release-2.6/configs/slim/prune/picodet_m_unstructured_prune_85.yml
new file mode 100644
index 0000000000000000000000000000000000000000..db0af7e1087a63cf9891e9ab142f2c331e1443e0
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/prune/picodet_m_unstructured_prune_85.yml
@@ -0,0 +1,11 @@
+pretrain_weights: https://paddledet.bj.bcebos.com/models/picodet_m_320_coco.pdparams
+slim: UnstructuredPruner
+
+UnstructuredPruner:
+ stable_epochs: 0
+ pruning_epochs: 150
+ tunning_epochs: 150
+ pruning_steps: 300
+ ratio: 0.85
+ initial_ratio: 0.20
+ prune_params_type: conv1x1_only
diff --git a/PaddleDetection-release-2.6/configs/slim/prune/ppyolo_mbv3_large_prune_fpgm.yml b/PaddleDetection-release-2.6/configs/slim/prune/ppyolo_mbv3_large_prune_fpgm.yml
new file mode 100644
index 0000000000000000000000000000000000000000..b9cecb3163f978b33123499f241ceb88fd05a688
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/prune/ppyolo_mbv3_large_prune_fpgm.yml
@@ -0,0 +1,9 @@
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyolo_mbv3_large_coco.pdparams
+slim: Pruner
+
+Pruner:
+ criterion: fpgm
+ pruned_params: ['conv2d_62.w_0', 'conv2d_63.w_0', 'conv2d_64.w_0',
+ 'conv2d_65.w_0', 'conv2d_66.w_0', 'conv2d_67.w_0']
+ pruned_ratios: [0.75, 0.75, 0.75, 0.75, 0.75, 0.75]
+ print_params: True
diff --git a/PaddleDetection-release-2.6/configs/slim/prune/ppyolo_r50vd_prune_fpgm.yml b/PaddleDetection-release-2.6/configs/slim/prune/ppyolo_r50vd_prune_fpgm.yml
new file mode 100644
index 0000000000000000000000000000000000000000..00dc57ae9759bb32f774a1852b629cdcac2c1b4a
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/prune/ppyolo_r50vd_prune_fpgm.yml
@@ -0,0 +1,13 @@
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams
+slim: Pruner
+
+Pruner:
+ criterion: fpgm
+ pruned_params: ['conv2d_56.w_0', 'conv2d_57.w_0', 'conv2d_58.w_0',
+ 'conv2d_59.w_0', 'conv2d_60.w_0', 'conv2d_61.w_0',
+ 'conv2d_63.w_0', 'conv2d_64.w_0', 'conv2d_65.w_0',
+ 'conv2d_66.w_0', 'conv2d_67.w_0', 'conv2d_68.w_0',
+ 'conv2d_70.w_0', 'conv2d_71.w_0', 'conv2d_72.w_0',
+ 'conv2d_73.w_0', 'conv2d_74.w_0', 'conv2d_75.w_0']
+ pruned_ratios: [0.75,0.75,0.75,0.75,0.75,0.75,0.75,0.75,0.75,0.75,0.75,0.75,0.875,0.875,0.875,0.875,0.875,0.875]
+ print_params: False
diff --git a/PaddleDetection-release-2.6/configs/slim/prune/yolov3_darknet_prune_fpgm.yml b/PaddleDetection-release-2.6/configs/slim/prune/yolov3_darknet_prune_fpgm.yml
new file mode 100644
index 0000000000000000000000000000000000000000..850fefb956431cbb15fc20f58fd868171722ff3c
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/prune/yolov3_darknet_prune_fpgm.yml
@@ -0,0 +1,13 @@
+pretrain_weights: https://paddledet.bj.bcebos.com/models/yolov3_darknet53_270e_coco.pdparams
+slim: Pruner
+
+Pruner:
+ criterion: fpgm
+ pruned_params: ['conv2d_52.w_0', 'conv2d_53.w_0', 'conv2d_54.w_0',
+ 'conv2d_55.w_0', 'conv2d_56.w_0', 'conv2d_57.w_0',
+ 'conv2d_59.w_0', 'conv2d_60.w_0', 'conv2d_61.w_0',
+ 'conv2d_62.w_0', 'conv2d_63.w_0', 'conv2d_64.w_0',
+ 'conv2d_66.w_0', 'conv2d_67.w_0', 'conv2d_68.w_0',
+ 'conv2d_69.w_0', 'conv2d_70.w_0', 'conv2d_71.w_0']
+ pruned_ratios: [0.75,0.75,0.75,0.75,0.75,0.75,0.75,0.75,0.75,0.75,0.75,0.75,0.875,0.875,0.875,0.875,0.875,0.875]
+ print_params: True
diff --git a/PaddleDetection-release-2.6/configs/slim/prune/yolov3_prune_fpgm.yml b/PaddleDetection-release-2.6/configs/slim/prune/yolov3_prune_fpgm.yml
new file mode 100644
index 0000000000000000000000000000000000000000..f3745386823a45a970d077d3201baffa3665490b
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/prune/yolov3_prune_fpgm.yml
@@ -0,0 +1,14 @@
+# Weights of yolov3_mobilenet_v1_voc
+pretrain_weights: https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v1_270e_voc.pdparams
+slim: Pruner
+
+Pruner:
+ criterion: fpgm
+ pruned_params: ['conv2d_27.w_0', 'conv2d_28.w_0', 'conv2d_29.w_0',
+ 'conv2d_30.w_0', 'conv2d_31.w_0', 'conv2d_32.w_0',
+ 'conv2d_34.w_0', 'conv2d_35.w_0', 'conv2d_36.w_0',
+ 'conv2d_37.w_0', 'conv2d_38.w_0', 'conv2d_39.w_0',
+ 'conv2d_41.w_0', 'conv2d_42.w_0', 'conv2d_43.w_0',
+ 'conv2d_44.w_0', 'conv2d_45.w_0', 'conv2d_46.w_0']
+ pruned_ratios: [0.1,0.2,0.2,0.2,0.2,0.1,0.2,0.3,0.3,0.3,0.2,0.1,0.3,0.4,0.4,0.4,0.4,0.3]
+ print_params: False
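The `criterion: fpgm` entries above select filters via the Filter Pruning via Geometric Median heuristic: a filter whose summed distance to the other filters in its layer is small lies near their geometric median, so it is considered most replaceable. A toy sketch of the ranking (illustrative only):

```python
import math

def fpgm_rank(filters):
    """FPGM: a filter whose total distance to all other filters is small sits
    near their geometric median, i.e. it is most redundant. Returns indices
    sorted most-prunable first."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    scores = [sum(dist(f, g) for g in filters) for f in filters]
    return sorted(range(len(filters)), key=lambda i: scores[i])

# The middle filter is closest to the others, so it is ranked first for pruning.
flts = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
print(fpgm_rank(flts)[0])  # 1
```

Unlike `l1_norm`, which prunes small-magnitude filters, FPGM can prune a large filter if its information is well covered by its neighbors.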
diff --git a/PaddleDetection-release-2.6/configs/slim/prune/yolov3_prune_l1_norm.yml b/PaddleDetection-release-2.6/configs/slim/prune/yolov3_prune_l1_norm.yml
new file mode 100644
index 0000000000000000000000000000000000000000..5b4f4667f2285cd73907df12aa1bd0f446a0f5c0
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/prune/yolov3_prune_l1_norm.yml
@@ -0,0 +1,14 @@
+# Weights of yolov3_mobilenet_v1_voc
+pretrain_weights: https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v1_270e_voc.pdparams
+slim: Pruner
+
+Pruner:
+ criterion: l1_norm
+ pruned_params: ['conv2d_27.w_0', 'conv2d_28.w_0', 'conv2d_29.w_0',
+ 'conv2d_30.w_0', 'conv2d_31.w_0', 'conv2d_32.w_0',
+ 'conv2d_34.w_0', 'conv2d_35.w_0', 'conv2d_36.w_0',
+ 'conv2d_37.w_0', 'conv2d_38.w_0', 'conv2d_39.w_0',
+ 'conv2d_41.w_0', 'conv2d_42.w_0', 'conv2d_43.w_0',
+ 'conv2d_44.w_0', 'conv2d_45.w_0', 'conv2d_46.w_0']
+ pruned_ratios: [0.1,0.2,0.2,0.2,0.2,0.1,0.2,0.3,0.3,0.3,0.2,0.1,0.3,0.4,0.4,0.4,0.4,0.3]
+ print_params: False
diff --git a/PaddleDetection-release-2.6/configs/slim/quant/mask_rcnn_r50_fpn_1x_qat.yml b/PaddleDetection-release-2.6/configs/slim/quant/mask_rcnn_r50_fpn_1x_qat.yml
new file mode 100644
index 0000000000000000000000000000000000000000..7363b4e55245024d5534a805be66301ca8b720fb
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/quant/mask_rcnn_r50_fpn_1x_qat.yml
@@ -0,0 +1,22 @@
+pretrain_weights: https://paddledet.bj.bcebos.com/models/mask_rcnn_r50_fpn_1x_coco.pdparams
+slim: QAT
+
+QAT:
+ quant_config: {
+ 'weight_quantize_type': 'channel_wise_abs_max', 'activation_quantize_type': 'moving_average_abs_max',
+ 'weight_bits': 8, 'activation_bits': 8, 'dtype': 'int8', 'window_size': 10000, 'moving_rate': 0.9,
+ 'quantizable_layer_type': ['Conv2D', 'Linear']}
+ print_model: True
+
+
+epoch: 5
+
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [3, 4]
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 100
diff --git a/PaddleDetection-release-2.6/configs/slim/quant/picodet_s_416_lcnet_quant.yml b/PaddleDetection-release-2.6/configs/slim/quant/picodet_s_416_lcnet_quant.yml
new file mode 100644
index 0000000000000000000000000000000000000000..000807ab6b138ca8f28440f97b44809e75a9ac3d
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/quant/picodet_s_416_lcnet_quant.yml
@@ -0,0 +1,22 @@
+pretrain_weights: https://paddledet.bj.bcebos.com/models/picodet_s_416_coco_lcnet.pdparams
+slim: QAT
+
+QAT:
+ quant_config: {
+ 'activation_preprocess_type': 'PACT',
+ 'weight_quantize_type': 'channel_wise_abs_max', 'activation_quantize_type': 'moving_average_abs_max',
+ 'weight_bits': 8, 'activation_bits': 8, 'dtype': 'int8', 'window_size': 10000, 'moving_rate': 0.9,
+ 'quantizable_layer_type': ['Conv2D', 'Linear']}
+ print_model: False
+
+TrainReader:
+ batch_size: 48
+
+LearningRate:
+ base_lr: 0.024
+ schedulers:
+ - !CosineDecay
+ max_epochs: 300
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 300
diff --git a/PaddleDetection-release-2.6/configs/slim/quant/picodet_s_quant.yml b/PaddleDetection-release-2.6/configs/slim/quant/picodet_s_quant.yml
new file mode 100644
index 0000000000000000000000000000000000000000..099532ffc5c3791644ceda25db8c1f4581762d61
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/quant/picodet_s_quant.yml
@@ -0,0 +1,26 @@
+pretrain_weights: https://paddledet.bj.bcebos.com/models/picodet_s_320_coco.pdparams
+slim: QAT
+
+QAT:
+ quant_config: {
+ 'activation_preprocess_type': 'PACT',
+ 'weight_quantize_type': 'channel_wise_abs_max', 'activation_quantize_type': 'moving_average_abs_max',
+ 'weight_bits': 8, 'activation_bits': 8, 'dtype': 'int8', 'window_size': 10000, 'moving_rate': 0.9,
+ 'quantizable_layer_type': ['Conv2D', 'Linear']}
+ print_model: False
+
+epoch: 50
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 30
+ - 40
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 100
+
+TrainReader:
+ batch_size: 96
diff --git a/PaddleDetection-release-2.6/configs/slim/quant/ppyolo_mbv3_large_qat.yml b/PaddleDetection-release-2.6/configs/slim/quant/ppyolo_mbv3_large_qat.yml
new file mode 100644
index 0000000000000000000000000000000000000000..2b2ddf90e7b36d60fbbd75f1b7beb6e7ffac9685
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/quant/ppyolo_mbv3_large_qat.yml
@@ -0,0 +1,16 @@
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyolo_mbv3_large_coco.pdparams
+slim: QAT
+
+QAT:
+ quant_config: {
+ 'weight_quantize_type': 'channel_wise_abs_max', 'activation_quantize_type': 'moving_average_abs_max',
+ 'weight_bits': 8, 'activation_bits': 8, 'dtype': 'int8', 'window_size': 10000, 'moving_rate': 0.99,
+ 'quantizable_layer_type': ['Conv2D', 'Linear']}
+ print_model: True
+
+PPYOLOFPN:
+ in_channels: [160, 368]
+ coord_conv: true
+ conv_block_num: 0
+ spp: true
+ drop_block: false
diff --git a/PaddleDetection-release-2.6/configs/slim/quant/ppyolo_r50vd_qat_pact.yml b/PaddleDetection-release-2.6/configs/slim/quant/ppyolo_r50vd_qat_pact.yml
new file mode 100644
index 0000000000000000000000000000000000000000..fb6d98841040a13221ac8cba3acecc6236a6cb03
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/quant/ppyolo_r50vd_qat_pact.yml
@@ -0,0 +1,39 @@
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams
+slim: QAT
+
+QAT:
+ quant_config: {
+ 'activation_preprocess_type': 'PACT',
+ 'weight_quantize_type': 'channel_wise_abs_max', 'activation_quantize_type': 'moving_average_abs_max',
+ 'weight_bits': 8, 'activation_bits': 8, 'dtype': 'int8', 'window_size': 10000, 'moving_rate': 0.9,
+ 'quantizable_layer_type': ['Conv2D', 'Linear']}
+ print_model: True
+
+epoch: 50
+
+LearningRate:
+ base_lr: 0.0005
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 30
+ - 45
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 1000
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+PPYOLOFPN:
+ coord_conv: true
+ block_size: 3
+ keep_prob: 0.9
+ spp: true
+ drop_block: false
diff --git a/PaddleDetection-release-2.6/configs/slim/quant/ppyoloe_l_qat.yml b/PaddleDetection-release-2.6/configs/slim/quant/ppyoloe_l_qat.yml
new file mode 100644
index 0000000000000000000000000000000000000000..4c0e94003a6ed0b7dde95ecd1f2361b87c61b4c8
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/quant/ppyoloe_l_qat.yml
@@ -0,0 +1,26 @@
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams
+slim: QAT
+
+QAT:
+ quant_config: {
+ 'weight_quantize_type': 'channel_wise_abs_max', 'activation_quantize_type': 'moving_average_abs_max',
+ 'weight_bits': 8, 'activation_bits': 8, 'dtype': 'int8', 'window_size': 10000, 'moving_rate': 0.9,
+ 'quantizable_layer_type': ['Conv2D', 'Linear']}
+ print_model: True
+
+epoch: 30
+snapshot_epoch: 5
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 10
+ - 20
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 100
+
+TrainReader:
+ batch_size: 8
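The QAT configs above combine `PiecewiseDecay` (epoch-based step decay) with `LinearWarmup` (step-based ramp). A simplified sketch of how the two schedulers compose, using the `ppyoloe_l_qat` values (`base_lr: 0.001`, `milestones: [10, 20]`, warmup over 100 steps); the real schedulers live in PaddleDetection's optimizer code:

```python
def lr_at(step, epoch, base_lr=0.001, gamma=0.1, milestones=(10, 20),
          warmup_steps=100, start_factor=0.0):
    """Piecewise decay by epoch, multiplied by a linear warmup over the
    first `warmup_steps` optimizer steps."""
    lr = base_lr * gamma ** sum(epoch >= m for m in milestones)
    if step < warmup_steps:  # ramp linearly from start_factor*lr up to lr
        alpha = step / warmup_steps
        lr *= start_factor * (1 - alpha) + alpha
    return lr

print(lr_at(step=50, epoch=0))     # 0.0005 -- halfway through warmup
print(lr_at(step=5000, epoch=15))  # ~0.0001 -- after the first milestone
```

`start_factor: 0.` means the QAT fine-tune starts from a zero learning rate, which helps avoid disturbing the pretrained weights before the quantization observers have settled.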
diff --git a/PaddleDetection-release-2.6/configs/slim/quant/ppyolov2_r50vd_dcn_qat.yml b/PaddleDetection-release-2.6/configs/slim/quant/ppyolov2_r50vd_dcn_qat.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d218e1edcbdb3597f43650467b98839c7d5e28c2
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/quant/ppyolov2_r50vd_dcn_qat.yml
@@ -0,0 +1,33 @@
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyolov2_r50vd_dcn_365e_coco.pdparams
+slim: QAT
+
+QAT:
+ quant_config: {
+ 'activation_preprocess_type': 'PACT',
+ 'weight_quantize_type': 'channel_wise_abs_max', 'activation_quantize_type': 'moving_average_abs_max',
+ 'weight_bits': 8, 'activation_bits': 8, 'dtype': 'int8', 'window_size': 10000, 'moving_rate': 0.9,
+ 'quantizable_layer_type': ['Conv2D', 'Linear']}
+ print_model: True
+
+epoch: 50
+snapshot_epoch: 8
+LearningRate:
+ base_lr: 0.0005
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 30
+ - 45
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 2000
+
+TrainReader:
+ batch_size: 8
+
+PPYOLOPAN:
+ drop_block: false
+ block_size: 3
+ keep_prob: 0.9
+ spp: true
diff --git a/PaddleDetection-release-2.6/configs/slim/quant/ssd_mobilenet_v1_qat.yml b/PaddleDetection-release-2.6/configs/slim/quant/ssd_mobilenet_v1_qat.yml
new file mode 100644
index 0000000000000000000000000000000000000000..05e068368fced56bdd3298323cf901dbbe29f925
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/quant/ssd_mobilenet_v1_qat.yml
@@ -0,0 +1,9 @@
+pretrain_weights: https://paddlemodels.bj.bcebos.com/object_detection/dygraph/ssd_mobilenet_v1_300_120e_voc.pdparams
+slim: QAT
+
+QAT:
+ quant_config: {
+ 'weight_quantize_type': 'channel_wise_abs_max', 'activation_quantize_type': 'moving_average_abs_max',
+ 'weight_bits': 8, 'activation_bits': 8, 'dtype': 'int8', 'window_size': 10000, 'moving_rate': 0.9,
+ 'quantizable_layer_type': ['Conv2D', 'Linear']}
+ print_model: True
diff --git a/PaddleDetection-release-2.6/configs/slim/quant/tinypose_qat.yml b/PaddleDetection-release-2.6/configs/slim/quant/tinypose_qat.yml
new file mode 100644
index 0000000000000000000000000000000000000000..3b85dfe55d226d2514bf11c530abb8df1abf8664
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/quant/tinypose_qat.yml
@@ -0,0 +1,26 @@
+pretrain_weights: https://paddledet.bj.bcebos.com/models/keypoint/tinypose_128x96.pdparams
+slim: QAT
+
+QAT:
+ quant_config: {
+ 'activation_preprocess_type': 'PACT',
+ 'weight_quantize_type': 'channel_wise_abs_max', 'activation_quantize_type': 'moving_average_abs_max',
+ 'weight_bits': 8, 'activation_bits': 8, 'dtype': 'int8', 'window_size': 10000, 'moving_rate': 0.9,
+ 'quantizable_layer_type': ['Conv2D', 'Linear']}
+ print_model: False
+
+epoch: 50
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 30
+ - 40
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 100
+
+TrainReader:
+ batch_size: 256
diff --git a/PaddleDetection-release-2.6/configs/slim/quant/yolov3_darknet_qat.yml b/PaddleDetection-release-2.6/configs/slim/quant/yolov3_darknet_qat.yml
new file mode 100644
index 0000000000000000000000000000000000000000..281b53418c215751470082794ef4c8d8b0d529e7
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/quant/yolov3_darknet_qat.yml
@@ -0,0 +1,31 @@
+pretrain_weights: https://paddledet.bj.bcebos.com/models/yolov3_darknet53_270e_coco.pdparams
+slim: QAT
+
+QAT:
+ quant_config: {
+ 'weight_quantize_type': 'channel_wise_abs_max', 'activation_quantize_type': 'moving_average_abs_max',
+ 'weight_bits': 8, 'activation_bits': 8, 'dtype': 'int8', 'window_size': 10000, 'moving_rate': 0.9,
+ 'quantizable_layer_type': ['Conv2D', 'Linear']}
+ print_model: True
+
+epoch: 50
+
+LearningRate:
+ base_lr: 0.0001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 30
+ - 45
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 1000
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/slim/quant/yolov3_mobilenet_v1_qat.yml b/PaddleDetection-release-2.6/configs/slim/quant/yolov3_mobilenet_v1_qat.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d1452082983ced70d1709343cd42017d8a19d361
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/quant/yolov3_mobilenet_v1_qat.yml
@@ -0,0 +1,10 @@
+# Weights of yolov3_mobilenet_v1_coco
+pretrain_weights: https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v1_270e_coco.pdparams
+slim: QAT
+
+QAT:
+ quant_config: {
+ 'weight_quantize_type': 'channel_wise_abs_max', 'activation_quantize_type': 'moving_average_abs_max',
+ 'weight_bits': 8, 'activation_bits': 8, 'dtype': 'int8', 'window_size': 10000, 'moving_rate': 0.9,
+ 'quantizable_layer_type': ['Conv2D', 'Linear']}
+ print_model: True
diff --git a/PaddleDetection-release-2.6/configs/slim/quant/yolov3_mobilenet_v3_qat.yml b/PaddleDetection-release-2.6/configs/slim/quant/yolov3_mobilenet_v3_qat.yml
new file mode 100644
index 0000000000000000000000000000000000000000..8e83f27aa92788a5a1ef1e0caa17ee9cc143bd4c
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/slim/quant/yolov3_mobilenet_v3_qat.yml
@@ -0,0 +1,24 @@
+# Weights of yolov3_mobilenet_v3_coco
+pretrain_weights: https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v3_large_270e_coco.pdparams
+slim: QAT
+
+QAT:
+ quant_config: {
+ 'activation_preprocess_type': 'PACT',
+ 'weight_quantize_type': 'channel_wise_abs_max', 'activation_quantize_type': 'moving_average_abs_max',
+ 'weight_bits': 8, 'activation_bits': 8, 'dtype': 'int8', 'window_size': 10000, 'moving_rate': 0.9,
+ 'quantizable_layer_type': ['Conv2D', 'Linear']}
+ print_model: True
+
+epoch: 50
+LearningRate:
+ base_lr: 0.0001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 35
+ - 45
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 1000
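`weight_quantize_type: channel_wise_abs_max`, used throughout the QAT configs above, gives every output channel its own abs-max scale, which suits conv weights whose per-channel ranges differ widely. A simplified sketch of the scheme (not the PaddleSlim implementation):

```python
def channel_wise_abs_max_quant(weight, bits=8):
    """Per-output-channel abs-max quantization: each channel row gets its
    own scale before rounding to signed `bits`-bit integers."""
    qmax = 2 ** (bits - 1) - 1  # 127 for int8
    quantized, scales = [], []
    for channel in weight:
        scale = max(abs(w) for w in channel) or 1.0  # avoid dividing by zero
        scales.append(scale)
        quantized.append([round(w / scale * qmax) for w in channel])
    return quantized, scales

# Two channels with very different ranges each use the full int8 range.
q, s = channel_wise_abs_max_quant([[0.5, -1.0], [0.01, 0.02]])
print(q)  # [[64, -127], [64, 127]]
print(s)  # [1.0, 0.02]
```

With a single layer-wide scale, the second channel's tiny weights would collapse to one or two quantization levels; per-channel scales preserve their resolution.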
diff --git a/PaddleDetection-release-2.6/configs/smalldet/DataDownload.md b/PaddleDetection-release-2.6/configs/smalldet/DataDownload.md
new file mode 100644
index 0000000000000000000000000000000000000000..73189056ea15b39e20fec31dbb0968b0ce4730e7
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smalldet/DataDownload.md
@@ -0,0 +1,99 @@
+# Small Object Dataset Download Summary
+
+## Contents
+- [Dataset Preparation](#dataset-preparation)
+  - [VisDrone-DET](#visdrone-det)
+  - [DOTA Horizontal Boxes](#dota-horizontal-boxes)
+  - [Xview](#xview)
+  - [Custom Datasets](#custom-datasets)
+
+## Dataset Preparation
+
+### VisDrone-DET
+
+VisDrone-DET is a small-object detection dataset of drone-captured aerial scenes. A COCO-format version is available at this [download link](https://bj.bcebos.com/v1/paddledet/data/smalldet/visdrone.zip), and a sliced COCO-format version at this [download link](https://bj.bcebos.com/v1/paddledet/data/smalldet/visdrone_sliced.zip). Detection covers the following **10 classes**: `pedestrian(1), people(2), bicycle(3), car(4), van(5), truck(6), tricycle(7), awning-tricycle(8), bus(9), motor(10)`. The original dataset is available at this [download link](https://github.com/VisDrone/VisDrone-Dataset).
+For usage and download details, see [visdrone](../visdrone).
+
+### DOTA Horizontal Boxes
+
+DOTA is a large public remote-sensing imagery dataset; the **DOTA-v1.0** horizontal-box annotations are used here. The sliced COCO-format DOTA horizontal-box dataset is available at this [download link](https://bj.bcebos.com/v1/paddledet/data/smalldet/dota_sliced.zip). Detection covers the following **15 classes**:
+`plane(0), baseball-diamond(1), bridge(2), ground-track-field(3), small-vehicle(4), large-vehicle(5), ship(6), tennis-court(7), basketball-court(8), storage-tank(9), soccer-ball-field(10), roundabout(11), harbor(12), swimming-pool(13), helicopter(14)`.
+The images and original dataset are available at this [download link](https://captain-whu.github.io/DOAI2019/dataset.html).
+
+### Xview
+
+Xview是一个大型的航拍遥感检测数据集,目标极小极多,切图后整理的COCO格式数据集[下载链接](https://bj.bcebos.com/v1/paddledet/data/smalldet/xview_sliced.zip),检测其中的**60类**,
+具体类别为:
+
+
+
+`Fixed-wing Aircraft(0),
+Small Aircraft(1),
+Cargo Plane(2),
+Helicopter(3),
+Passenger Vehicle(4),
+Small Car(5),
+Bus(6),
+Pickup Truck(7),
+Utility Truck(8),
+Truck(9),
+Cargo Truck(10),
+Truck w/Box(11),
+Truck Tractor(12),
+Trailer(13),
+Truck w/Flatbed(14),
+Truck w/Liquid(15),
+Crane Truck(16),
+Railway Vehicle(17),
+Passenger Car(18),
+Cargo Car(19),
+Flat Car(20),
+Tank car(21),
+Locomotive(22),
+Maritime Vessel(23),
+Motorboat(24),
+Sailboat(25),
+Tugboat(26),
+Barge(27),
+Fishing Vessel(28),
+Ferry(29),
+Yacht(30),
+Container Ship(31),
+Oil Tanker(32),
+Engineering Vehicle(33),
+Tower crane(34),
+Container Crane(35),
+Reach Stacker(36),
+Straddle Carrier(37),
+Mobile Crane(38),
+Dump Truck(39),
+Haul Truck(40),
+Scraper/Tractor(41),
+Front loader/Bulldozer(42),
+Excavator(43),
+Cement Mixer(44),
+Ground Grader(45),
+Hut/Tent(46),
+Shed(47),
+Building(48),
+Aircraft Hangar(49),
+Damaged Building(50),
+Facility(51),
+Construction Site(52),
+Vehicle Lot(53),
+Helipad(54),
+Storage Tank(55),
+Shipping container lot(56),
+Shipping Container(57),
+Pylon(58),
+Tower(59)
+`
+
+The original dataset is here: [download link](https://challenge.xviewdataset.org/).
+
+
+### Custom datasets
+
+To prepare a custom dataset, see the [DET annotation tools](../../docs/tutorials/data/DetAnnoTools.md) and the [DET dataset preparation tutorial](../../docs/tutorials/data/PrepareDetDataSet.md).
diff --git a/PaddleDetection-release-2.6/configs/smalldet/README.md b/PaddleDetection-release-2.6/configs/smalldet/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..db9c6366c210014a4e5c678bb2aca9d4365b47b4
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smalldet/README.md
@@ -0,0 +1,424 @@
+# PP-YOLOE-SOD (PP-YOLOE for Small Object Detection)
+
+
+
+## Contents
+- [Introduction](#introduction)
+- [Slicing Guide](#slicing-guide)
+  - [Small-object dataset downloads](#small-object-dataset-downloads)
+  - [Analyzing the dataset distribution](#analyzing-the-dataset-distribution)
+  - [Slicing with SAHI](#slicing-with-sahi)
+- [Model Zoo](#model-zoo)
+  - [VisDrone models](#visdrone-models)
+  - [COCO models](#coco-models)
+  - [Sliced-image models](#sliced-image-models)
+  - [Stitching models](#stitching-models)
+  - [Notes](#notes)
+- [Model Zoo Usage](#model-zoo-usage)
+  - [Training](#training)
+  - [Evaluation](#evaluation)
+  - [Inference](#inference)
+  - [Deployment](#deployment)
+- [Citations](#citations)
+
+
+## Introduction
+PaddleDetection provides PP-YOLOE-SOD, a family of detection models improved from PP-YOLOE for small-object datasets such as VisDrone-DET, DOTA (horizontal boxes) and Xview, together with a slicing-and-stitching workflow built on the [SAHI](https://github.com/obss/sahi) (Slicing Aided Hyper Inference) tool.
+
+ - PP-YOLOE-SOD is the PaddleDetection team's small-object detection model. It uses a **vector-based DFL algorithm tied to the dataset distribution** and a **center-prior assignment strategy optimized for small objects**, adds a **Transformer module to the Neck (FPN)**, and combines further tricks such as an extra P2 level and large input sizes, reaching high accuracy on several small-object benchmarks.
+
+ - The slicing-and-stitching workflow **works with any detection model**; for best results we recommend **combining it with PP-YOLOE-SOD**.
+
+ - For the official AI Studio tutorial, see [基于PP-YOLOE-SOD的无人机航拍图像检测案例全流程实操](https://aistudio.baidu.com/aistudio/projectdetail/5036782).
+
+ - Third-party AI Studio tutorials: [PPYOLOE:遥感场景下的小目标检测与部署(切图版)](https://aistudio.baidu.com/aistudio/projectdetail/4493701) and [涨分神器!基于PPYOLOE的切图和拼图解决方案](https://aistudio.baidu.com/aistudio/projectdetail/4438275).
+
+**Notes:**
+ - To train, evaluate and predict **directly on original (or pre-sliced) images without slicing and stitching**, we recommend the PP-YOLOE-SOD models; see [COCO models](#coco-models) and [VisDrone models](./visdrone) for details and ablations.
+ - Before deciding whether to slice images and **train** on the sub-images, analyze the dataset first as described in [Analyzing the dataset distribution](#analyzing-the-dataset-distribution) under the [Slicing Guide](#slicing-guide); slicing is generally recommended when **nearly all objects in the dataset are extremely small**.
+ - For **inference** on slices: if the model was trained on slices, using the **same slicing strategy and parameters** at inference time works best. Even a model trained without slicing can run sliced inference; just append **`--slice_infer` and the slice parameters** to the usual inference command.
+ - For **evaluation** on slices: first make sure a proper sliced validation set (with correct boxes) has been generated, **change the validation-set (`EvalDataset`) settings in the config** as described in [Model Zoo Usage - Evaluation](#evaluation), then append **`--slice_infer` and the slice parameters** to the usual evaluation command.
+ - With `--slice_infer`, PaddleDetection by default **merges the sub-image predictions and stitches them back onto the original image**, returning boxes in original-image coordinates. This works with **any trained detection model**, whether or not it was trained on slices.
+
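The stitching step behind `--slice_infer` can be pictured with a minimal sketch (a hypothetical helper, not PaddleDetection's actual implementation): each slice's boxes are shifted by the slice's top-left offset back into original-image coordinates before the overlapping results are deduplicated.

```python
def shift_to_full_image(dets, offset_x, offset_y):
    """Map boxes predicted on one slice back into original-image
    coordinates by adding the slice's top-left offset."""
    return [(x1 + offset_x, y1 + offset_y, x2 + offset_x, y2 + offset_y, score, cls)
            for (x1, y1, x2, y2, score, cls) in dets]
```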
+
+## Slicing Guide
+
+### Small-object dataset downloads
+Download links for the VisDrone-DET, DOTA horizontal-box and Xview small-object datasets prepared by the PaddleDetection team are listed in [DataDownload.md](./DataDownload.md).
+
+### Analyzing the dataset distribution
+
+For the dataset to be trained (assumed to already be in COCO format; see [COCO dataset preparation](../../docs/tutorials/data/PrepareDetDataSet.md#用户数据转成COCO数据)), first compute the distribution of the **ratio of the mean annotation-box width/height to the true image width/height**.
+
+Taking the train split of the DOTA horizontal-box dataset as an example:
+
+```bash
+python tools/box_distribution.py --json_path dataset/DOTA/annotations/train.json --out_img box_distribution.jpg --eval_size 640 --small_stride 8
+```
+ - `--json_path`: path to the COCO-format json annotation file of the dataset to analyze
+ - `--out_img`: output path of the distribution plot
+ - `--eval_size`: inference scale (default 640)
+ - `--small_stride`: smallest stride of the model (default 8)
+
+The printed statistics look like:
+```bash
+Suggested reg_range[1] is 13  # recommended value for the DFL algorithm; set it in the head of the PP-YOLOE-SOD config for best results
+Mean of all img_w is 2304.3981547196595  # mean original image width
+Mean of all img_h is 2180.9354151880766  # mean original image height
+Median of ratio_w is 0.03799439775910364  # median ratio of box width to image width
+Median of ratio_h is 0.04074914637387802  # median ratio of box height to image height
+all_img with box: 1409  # total number of images (excluding images without boxes or with empty annotations)
+all_ann: 98905  # total number of annotation boxes
+Distribution saved as box_distribution.jpg
+```
+
+**Notes:**
+- As a rule of thumb, over the annotated images of the original dataset, slicing for training is recommended when **the mean image width/height exceeds 1500 pixels and, for more than half of the images, the mean box-to-image width/height ratio is below 0.04** (read off from the printed medians).
+- `Suggested reg_range[1]` is the recommended upper bound of `reg_range` for the improved DFL algorithm on this dataset, i.e. `reg_max + 1`; set this value in the head of the PP-YOLOE-SOD config.
+
+
+### Slicing with SAHI
+
+For datasets that need slicing, use the [SAHI](https://github.com/obss/sahi) library:
+
+#### Install SAHI
+
+Install with `pip install sahi`; see [SAHI installation](https://github.com/obss/sahi/blob/main/README.md#installation) for details.
+
+#### Slice with SAHI
+
+Taking the train split of the DOTA horizontal-box dataset as an example, the **sub-image folder** and the **sub-image json annotation file** are both saved under `dota_sliced`, named `train_images_500_025` and `train_500_025.json` respectively:
+
+```bash
+python tools/slice_image.py --image_dir dataset/DOTA/train/ --json_path dataset/DOTA/annotations/train.json --output_dir dataset/dota_sliced --slice_size 500 --overlap_ratio 0.25
+```
+ - `--image_dir`: path to the image folder of the original dataset
+ - `--json_path`: path to the COCO-format json annotation file of the original dataset
+ - `--output_dir`: output path for the sub-images and their json annotation file
+ - `--slice_size`: side length of each sub-image after slicing (sub-images are square by default)
+ - `--overlap_ratio`: overlap ratio between adjacent sub-images
+
+**Notes:**
+- To **train** on sub-images, slicing must be done **offline**: slice first and save the sub-images to disk.
+- To **evaluate or predict** on sub-images, slicing can be done either **offline** or **online**; PaddleDetection can slice on the fly and automatically stitch the results back onto the original image.
+
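The slice grid produced by `--slice_size` and `--overlap_ratio` can be sketched as follows (a simplified illustration of SAHI-style windowing, not SAHI's actual code): windows advance by `slice_size * (1 - overlap_ratio)`, and the last window in each row and column is pushed back inside the image so every slice keeps the full size.

```python
def slice_windows(img_w, img_h, slice_size=500, overlap_ratio=0.25):
    """Return (x1, y1, x2, y2) crop windows covering the image with the
    given size and overlap; edge windows are shifted back inside the image."""
    step = int(slice_size * (1 - overlap_ratio))
    windows = []
    y = 0
    while True:
        y2 = min(y + slice_size, img_h)
        x = 0
        while True:
            x2 = min(x + slice_size, img_w)
            windows.append((max(0, x2 - slice_size), max(0, y2 - slice_size), x2, y2))
            if x2 >= img_w:
                break
            x += step
        if y2 >= img_h:
            break
        y += step
    return windows
```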
+
+## Model Zoo
+
+### [VisDrone models](visdrone/)
+
+| Model | COCOAPI mAP<sup>val<br>0.5:0.95</sup> | COCOAPI mAP<sup>val<br>0.5</sup> | COCOAPI mAP<sup>test_dev<br>0.5:0.95</sup> | COCOAPI mAP<sup>test_dev<br>0.5</sup> | MatlabAPI mAP<sup>test_dev<br>0.5:0.95</sup> | MatlabAPI mAP<sup>test_dev<br>0.5</sup> | Download | Config |
+|:---------|:------:|:------:| :----: | :------:| :------: | :------:| :----: | :------:|
+|PP-YOLOE-s| 23.5 | 39.9 | 19.4 | 33.6 | 23.68 | 40.66 | [下载链接](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_s_80e_visdrone.pdparams) | [配置文件](visdrone/ppyoloe_crn_s_80e_visdrone.yml) |
+|PP-YOLOE-P2-Alpha-s| 24.4 | 41.6 | 20.1 | 34.7 | 24.55 | 42.19 | [下载链接](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_s_p2_alpha_80e_visdrone.pdparams) | [配置文件](visdrone/ppyoloe_crn_s_p2_alpha_80e_visdrone.yml) |
+|**PP-YOLOE+_SOD-s**| **25.1** | **42.8** | **20.7** | **36.2** | **25.16** | **43.86** | [下载链接](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_sod_crn_s_80e_visdrone.pdparams) | [配置文件](visdrone/ppyoloe_plus_sod_crn_s_80e_visdrone.yml) |
+|PP-YOLOE-l| 29.2 | 47.3 | 23.5 | 39.1 | 28.00 | 46.20 | [下载链接](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_80e_visdrone.pdparams) | [配置文件](visdrone/ppyoloe_crn_l_80e_visdrone.yml) |
+|PP-YOLOE-P2-Alpha-l| 30.1 | 48.9 | 24.3 | 40.8 | 28.47 | 48.16 | [下载链接](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_p2_alpha_80e_visdrone.pdparams) | [配置文件](visdrone/ppyoloe_crn_l_p2_alpha_80e_visdrone.yml) |
+|**PP-YOLOE+_SOD-l**| **31.9** | **52.1** | **25.6** | **43.5** | **30.25** | **51.18** | [下载链接](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_sod_crn_l_80e_visdrone.pdparams) | [配置文件](visdrone/ppyoloe_plus_sod_crn_l_80e_visdrone.yml) |
+|PP-YOLOE-Alpha-largesize-l| 41.9 | 65.0 | 32.3 | 53.0 | 37.13 | 61.15 | [下载链接](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_alpha_largesize_80e_visdrone.pdparams) | [配置文件](visdrone/ppyoloe_crn_l_alpha_largesize_80e_visdrone.yml) |
+|PP-YOLOE-P2-Alpha-largesize-l| 41.3 | 64.5 | 32.4 | 53.1 | 37.49 | 51.54 | [下载链接](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_p2_alpha_largesize_80e_visdrone.pdparams) | [配置文件](visdrone/ppyoloe_crn_l_p2_alpha_largesize_80e_visdrone.yml) |
+|PP-YOLOE+_largesize-l | 43.3 | 66.7 | 33.5 | 54.7 | 38.24 | 62.76 | [下载链接](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_largesize_80e_visdrone.pdparams) | [配置文件](visdrone/ppyoloe_plus_crn_l_largesize_80e_visdrone.yml) |
+|**PP-YOLOE+_SOD-largesize-l** | 42.7 | 65.9 | **33.6** | **55.1** | **38.4** | **63.07** | [下载链接](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_sod_crn_l_largesize_80e_visdrone.pdparams) | [配置文件](visdrone/ppyoloe_plus_sod_crn_l_largesize_80e_visdrone.yml) |
+
+**Notes:**
+ - All models above are **trained on original images** and **evaluated/predicted on original images**; AP numbers are measured on the **original-image validation set**.
+ - VisDrone-DET **can be trained on original images or on slices**; based on the dataset distribution analysis we recommend **training on original images** with the **SOD** configs for training, evaluation and deployment, falling back to sliced training when GPU resources are limited.
+ - All numbers above use the VisDrone-DET train split for training and the val and test_dev splits for validation.
+ - **SOD** means the **vector-based DFL algorithm** plus the **center-prior strategy for small objects**, with a **Transformer module added to the Neck**.
+ - **P2** means adding the P2 (1/4 downsampling) feature level, giving 4 PPYOLOEHead outputs in total.
+ - **Alpha** means adding a learnable weight parameter Alpha to the CSPResNet backbone.
+ - **largesize** means **multi-scale training based on a 1600 scale** and **1920-scale inference**, with a correspondingly smaller training batch_size, trading speed for accuracy.
+ - MatlabAPI numbers use the official evaluation toolkit [VisDrone2018-DET-toolkit](https://github.com/VisDrone/VisDrone2018-DET-toolkit).
+
+
+**Quick Start**
+
+```shell
+# train
+python -m paddle.distributed.launch --log_dir=logs/ --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/smalldet/visdrone/ppyoloe_plus_sod_crn_l_80e_visdrone.yml --amp --eval
+# evaluate
+python tools/eval.py -c configs/smalldet/visdrone/ppyoloe_plus_sod_crn_l_80e_visdrone.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_sod_crn_l_80e_visdrone.pdparams
+# predict
+python tools/infer.py -c configs/smalldet/visdrone/ppyoloe_plus_sod_crn_l_80e_visdrone.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_sod_crn_l_80e_visdrone.pdparams --infer_img=demo/visdrone_0000315_01601_d_0000509.jpg --draw_threshold=0.25
+```
+
+
+
+
+### COCO models
+
+| Model | mAP<sup>val<br>0.5:0.95</sup> | AP<sup>0.5</sup> | AP<sup>0.75</sup> | AP<sup>small</sup> | AP<sup>medium</sup> | AP<sup>large</sup> | AR<sup>small</sup> | AR<sup>medium</sup> | AR<sup>large</sup> | Download | Config |
+|:--------:|:-----------------------:|:----------:|:-----------:|:------------:|:-------------:|:-----------:|:------------:|:-------------:|:------------:|:-------:|:-------:|
+|PP-YOLOE+_l| 52.9 | 70.1 | 57.9 | 35.2 | 57.5 | 69.1 | 56.0 | 77.9 | 86.9 | [下载链接](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams) | [配置文件](../ppyoloe/ppyoloe_plus_crn_l_80e_coco.yml) |
+|**PP-YOLOE+_SOD-l**| 53.0 | **70.4** | 57.7 | **37.1** | 57.5 | 69.0 | **56.5** | 77.5 | 86.7 | [下载链接](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_sod_crn_l_80e_coco.pdparams) | [配置文件](./ppyoloe_plus_sod_crn_l_80e_coco.yml) |
+
+**Notes:**
+ - Both models above are **trained and evaluated/predicted on original images** with a 640x640 input, using COCO train2017 for training and val2017 for validation, trained for 80 epochs on 8 GPUs with a total batch size of 64.
+ - **SOD** means the **vector-based DFL algorithm** plus the **center-prior strategy for small objects**, with a **Transformer module added to the Neck**, improving AP<sup>small</sup> by 1.9.
+
+
+**Quick Start**
+
+```shell
+# train
+python -m paddle.distributed.launch --log_dir=logs/ --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/smalldet/ppyoloe_plus_sod_crn_l_80e_coco.yml --amp --eval
+# evaluate
+python tools/eval.py -c configs/smalldet/ppyoloe_plus_sod_crn_l_80e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_sod_crn_l_80e_coco.pdparams
+# predict
+python tools/infer.py -c configs/smalldet/ppyoloe_plus_sod_crn_l_80e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_sod_crn_l_80e_coco.pdparams --infer_img=demo/000000014439_640x640.jpg --draw_threshold=0.25
+```
+
+
+
+
+### Sliced-image models
+
+| Model | Dataset | SLICE_SIZE | OVERLAP_RATIO | Classes | mAP<sup>val<br>0.5:0.95</sup> | AP<sup>val<br>0.5</sup> | Download | Config |
+|:---------|:---------------:|:---------------:|:---------------:|:------:|:-----------------------:|:-------------------:|:---------:| :-----: |
+|PP-YOLOE-P2-l| DOTA | 500 | 0.25 | 15 | 53.9 | 78.6 | [下载链接](https://bj.bcebos.com/v1/paddledet/models/ppyoloe_p2_crn_l_80e_sliced_DOTA_500_025.pdparams) | [配置文件](./ppyoloe_p2_crn_l_80e_sliced_DOTA_500_025.yml) |
+|PP-YOLOE-P2-l| Xview | 400 | 0.25 | 60 | 14.9 | 27.0 | [下载链接](https://bj.bcebos.com/v1/paddledet/models/ppyoloe_p2_crn_l_80e_sliced_xview_400_025.pdparams) | [配置文件](./ppyoloe_p2_crn_l_80e_sliced_xview_400_025.yml) |
+|PP-YOLOE-l| VisDrone-DET| 640 | 0.25 | 10 | 38.5 | 60.2 | [下载链接](https://bj.bcebos.com/v1/paddledet/models/ppyoloe_crn_l_80e_sliced_visdrone_640_025.pdparams) | [配置文件](./ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml) |
+
+**Notes:**
+ - All models above are **trained on sliced sub-images** and **evaluated/predicted on sliced sub-images**; AP numbers are measured on the **sub-image validation set**.
+ - **SLICE_SIZE** is the side length of the SAHI sub-images; **OVERLAP_RATIO** is the overlap ratio between sub-images.
+ - The VisDrone-DET model shares **the same weights** as the VisDrone-DET entry in the [Stitching models](#stitching-models) table, but the AP here is measured on the **sliced sub-image validation set**.
+
+
+**Quick Start**
+
+```shell
+# train
+python -m paddle.distributed.launch --log_dir=logs/ --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml --amp --eval
+# evaluate directly on sub-images
+python tools/eval.py -c configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_80e_sliced_visdrone_640_025.pdparams
+# predict directly on sub-images
+python tools/infer.py -c configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_80e_sliced_visdrone_640_025.pdparams --infer_img=demo/visdrone_0000315_01601_d_0000509.jpg --draw_threshold=0.25
+```
+
+
+
+
+### Stitching models
+
+| Model | Dataset | SLICE_SIZE | OVERLAP_RATIO | Classes | mAP<sup>val<br>0.5:0.95</sup> | AP<sup>val<br>0.5</sup> | Download | Config |
+|:---------|:---------------:|:---------------:|:---------------:|:------:|:-----------------------:|:-------------------:|:---------:| :-----: |
+|PP-YOLOE-l (direct evaluation on original images)| VisDrone-DET| 640 | 0.25 | 10 | 29.7 | 48.5 | [下载链接](https://bj.bcebos.com/v1/paddledet/models/ppyoloe_crn_l_80e_sliced_visdrone_640_025.pdparams) | [配置文件](./ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml) |
+|PP-YOLOE-l (slice-and-stitch evaluation)| VisDrone-DET| 640 | 0.25 | 10 | 37.3 | 59.5 | [下载链接](https://bj.bcebos.com/v1/paddledet/models/ppyoloe_crn_l_80e_sliced_visdrone_640_025.pdparams) | [配置文件](./ppyoloe_crn_l_80e_sliced_visdrone_640_025_slice_infer.yml) |
+
+**Notes:**
+ - All models above are **trained on sliced sub-images**. Evaluation/prediction comes in two flavors, **directly on the original images** or **with sub-images automatically stitched back to the original image**; AP numbers are measured on the **original-image validation set**.
+ - **SLICE_SIZE** is the side length of the SAHI sub-images; **OVERLAP_RATIO** is the overlap ratio between sub-images.
+ - The VisDrone-DET model shares **the same weights** as the VisDrone-DET entry in the [Sliced-image models](#sliced-image-models) table, but the AP here is measured on the **original-image validation set**. Beforehand, change the default sliced validation set in `EvalDataset` of `ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml` to the following **original-image validation set**:
+  ```
+  EvalDataset:
+    !COCODataSet
+    image_dir: VisDrone2019-DET-val
+    anno_path: val.json
+    dataset_dir: dataset/visdrone
+  ```
+
+
+**Quick Start**
+
+```shell
+# train
+python -m paddle.distributed.launch --log_dir=logs/ --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml --amp --eval
+# evaluate directly on original images; note: first change the default sliced val set in `EvalDataset` of this yml to the original-image val set
+python tools/eval.py -c configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_80e_sliced_visdrone_640_025.pdparams
+# slice-and-stitch evaluation: add --slice_infer, and note that the yml config with the _slice_infer suffix is used
+python tools/eval.py -c configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025_slice_infer.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_80e_sliced_visdrone_640_025.pdparams --slice_infer
+# slice-and-stitch prediction: add --slice_infer
+python tools/infer.py -c configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_80e_sliced_visdrone_640_025.pdparams --infer_img=demo/visdrone_0000315_01601_d_0000509.jpg --draw_threshold=0.25 --slice_infer
+```
+
+
+
+
+### Notes
+
+- Slicing and stitching require the [SAHI](https://github.com/obss/sahi) slicing tool; install it first with `pip install sahi`, see [installation](https://github.com/obss/sahi/blob/main/README.md#installation).
+- The DOTA horizontal-box and Xview models are both **trained on slices**; their AP is measured **on the sliced sub-image val set**.
+- For VisDrone-DET see [visdrone](./visdrone); it **can be trained on original images or on slices**. The numbers in the tables above use the VisDrone-DET val split for validation, not test_dev.
+- PP-YOLOE models are trained with mixed precision on 8 GPUs. If the **number of GPUs** or the **batch size** changes, adjust the learning rate as **lr<sub>new</sub> = lr<sub>default</sub> * (batch_size<sub>new</sub> * GPU_number<sub>new</sub>) / (batch_size<sub>default</sub> * GPU_number<sub>default</sub>)**.
+- For the usual training/validation/deployment steps see [ppyoloe](../ppyoloe#getting-start).
+- Automatic slice-and-stitch inference requires `--slice_infer`; see [Inference](#inference) and [Deployment](#deployment) under [Model Zoo Usage](#model-zoo-usage).
+- For the automatic slicing-and-stitching procedure, see [2.3 Sub-image stitching evaluation](#evaluation).
+
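The linear learning-rate scaling rule above can be written as a tiny helper (a hypothetical convenience function, shown for illustration):

```python
def scaled_lr(lr_default, batch_size_default, gpus_default, batch_size_new, gpus_new):
    """Linear scaling rule:
    lr_new = lr_default * (bs_new * gpus_new) / (bs_default * gpus_default)."""
    return lr_default * (batch_size_new * gpus_new) / (batch_size_default * gpus_default)
```

For example, halving the GPU count from 8 to 4 at the same per-GPU batch size halves the learning rate.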
+
+## Model Zoo Usage
+
+### Training
+
+#### 1.1 Training on original images
+First convert the dataset to COCO format, then follow PaddleDetection's usual training workflow.
+
+Run the following command to train on the COCO dataset with mixed precision:
+
+```bash
+python -m paddle.distributed.launch --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/smalldet/ppyoloe_plus_sod_crn_l_80e_coco.yml --amp --eval
+```
+
+**Note:**
+- With the default config, set `--amp` to avoid running out of GPU memory; `--eval` evaluates during training and automatically saves the best-accuracy weights.
+
+#### 1.2 Training on sliced images
+First convert the dataset to COCO format, then slice it **offline** with the SAHI tool and train on the saved sub-images following the **usual detection training workflow**.
+Alternatively, download the pre-sliced VisDrone-DET, DOTA horizontal-box or Xview datasets provided by the PaddleDetection team.
+
+Run the following command to train on the sliced VisDrone dataset with mixed precision:
+
+```bash
+python -m paddle.distributed.launch --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml --amp --eval
+```
+
+
+### Evaluation
+
+#### 2.1 Sub-image evaluation
+**Sub-image evaluation is the default**; the validation set of the sliced dataset is configured as:
+```
+EvalDataset:
+ !COCODataSet
+ image_dir: val_images_640_025
+ anno_path: val_640_025.json
+ dataset_dir: dataset/visdrone_sliced
+```
+Following the usual detection evaluation workflow, this measures accuracy on the pre-sliced, saved sub-images:
+```bash
+CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_80e_sliced_visdrone_640_025.pdparams
+```
+
+#### 2.2 Original-image evaluation
+Change the validation annotation path to the **original-image annotation file**:
+```
+EvalDataset:
+ !COCODataSet
+ image_dir: VisDrone2019-DET-val
+ anno_path: val.json
+ dataset_dir: dataset/visdrone
+```
+This evaluates accuracy directly on the original images:
+```bash
+CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_80e_sliced_visdrone_640_025.pdparams
+```
+
+#### 2.3 Sub-image stitching evaluation
+Change the validation annotation path to the **original-image annotation file**:
+```
+# very slow; preferably evaluate with fixed weights (xx.pdparams)
+# to evaluate during training, change SlicedCOCODataSet to COCODataSet and delete sliced_size and overlap_ratio
+EvalDataset:
+ !SlicedCOCODataSet
+ image_dir: VisDrone2019-DET-val
+ anno_path: val.json
+ dataset_dir: dataset/visdrone
+ sliced_size: [640, 640]
+ overlap_ratio: [0.25, 0.25]
+```
+During evaluation the original images are sliced automatically, and the sub-image results are stitched and fused back to measure accuracy on the original images:
+```bash
+CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025_slice_infer.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_80e_sliced_visdrone_640_025.pdparams --slice_infer --combine_method=nms --match_threshold=0.6 --match_metric=ios
+```
+
+**Notes:**
+- `--slice_infer` enables sliced prediction with stitched, recombined results; omit it otherwise. Make sure `EvalDataset` uses `SlicedCOCODataSet` rather than `COCODataSet`;
+- `--slice_size` sets the sub-image size and `--overlap_ratio` sets the overlap between sub-images; pick values of `sliced_size` and `overlap_ratio` that suit your data, e.g.:
+```
+EvalDataset:
+ !SlicedCOCODataSet
+ image_dir: VisDrone2019-DET-val
+ anno_path: val.json
+ dataset_dir: dataset/visdrone
+ sliced_size: [480, 480]
+ overlap_ratio: [0.2, 0.2]
+```
+- `--combine_method` sets how overlapping sub-image results are deduplicated when recombined; default `nms`;
+- `--match_threshold` sets the deduplication threshold; default 0.6;
+- `--match_metric` sets the deduplication metric; the default `ios` (intersection over the smaller box's area) can be replaced by `iou` (intersection over union). Accuracy varies by dataset, but `ios` is slightly faster;
+
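The difference between the two `--match_metric` options can be made concrete with a small sketch (an illustration of the definitions above, not the deployment code):

```python
def overlap(box_a, box_b, metric="ios"):
    """Overlap between two (x1, y1, x2, y2) boxes: 'ios' divides the
    intersection area by the smaller box's area, 'iou' by the union's area."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    if metric == "ios":
        return inter / min(area_a, area_b)
    return inter / (area_a + area_b - inter)
```

A small box fully inside a large one scores 1.0 under `ios` but much less under `iou`, which is why `ios` is the more aggressive deduplicator for boxes split across overlapping slices.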
+
+### Inference
+
+#### 3.1 Direct inference on sub-images or original images
+Mirroring the evaluation workflow, you can predict on the pre-sliced, saved sub-images or directly on the original images, e.g.:
+```bash
+CUDA_VISIBLE_DEVICES=0 python tools/infer.py -c configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_80e_sliced_visdrone_640_025.pdparams --infer_img=demo/visdrone_0000315_01601_d_0000509.jpg --draw_threshold=0.25
+```
+
+#### 3.2 Automatic slice-and-stitch inference on original images
+Original images can also be sliced automatically, predicted per slice, and the results stitched back onto the original image, e.g.:
+```bash
+# single image
+CUDA_VISIBLE_DEVICES=0 python tools/infer.py -c configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_80e_sliced_visdrone_640_025.pdparams --infer_img=demo/visdrone_0000315_01601_d_0000509.jpg --draw_threshold=0.25 --slice_infer --slice_size 640 640 --overlap_ratio 0.25 0.25 --combine_method=nms --match_threshold=0.6 --match_metric=ios --save_results=True
+# or an image folder
+CUDA_VISIBLE_DEVICES=0 python tools/infer.py -c configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_80e_sliced_visdrone_640_025.pdparams --infer_dir=demo/ --draw_threshold=0.25 --slice_infer --slice_size 640 640 --overlap_ratio 0.25 0.25 --combine_method=nms --match_threshold=0.6 --match_metric=ios
+```
+- `--slice_infer` enables sliced prediction with stitched, recombined results; omit it otherwise;
+- `--slice_size` sets the sub-image size and `--overlap_ratio` sets the overlap between sub-images;
+- `--combine_method` sets how overlapping sub-image results are deduplicated when recombined; default `nms`;
+- `--match_threshold` sets the deduplication threshold; default 0.6;
+- `--match_metric` sets the deduplication metric; the default `ios` (intersection over the smaller box's area) can be replaced by `iou` (intersection over union). Accuracy varies by dataset, but `ios` is slightly faster;
+- `--save_results` saves the results as a json file, typically only used for single-image prediction;
+
+
+### Deployment
+
+#### 4.1 Export the model
+```bash
+# export model
+CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_80e_sliced_visdrone_640_025.pdparams
+```
+
+#### 4.2 Direct inference on original images or sub-images
+```bash
+# deploy infer
+CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyoloe_crn_l_80e_sliced_visdrone_640_025 --image_file=demo/visdrone_0000315_01601_d_0000509.jpg --device=GPU --save_images --threshold=0.25
+```
+
+#### 4.3 Automatic slice-and-stitch inference on original images
+```bash
+# deploy slice infer
+# single image
+CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyoloe_crn_l_80e_sliced_visdrone_640_025 --image_file=demo/visdrone_0000315_01601_d_0000509.jpg --device=GPU --save_images --threshold=0.25 --slice_infer --slice_size 640 640 --overlap_ratio 0.25 0.25 --combine_method=nms --match_threshold=0.6 --match_metric=ios --save_results=True
+# or an image folder
+CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyoloe_crn_l_80e_sliced_visdrone_640_025 --image_dir=demo/ --device=GPU --save_images --threshold=0.25 --slice_infer --slice_size 640 640 --overlap_ratio 0.25 0.25 --combine_method=nms --match_threshold=0.6 --match_metric=ios
+```
+- `--slice_infer` enables sliced prediction with stitched, recombined results; omit it otherwise;
+- `--slice_size` sets the sub-image size and `--overlap_ratio` sets the overlap between sub-images;
+- `--combine_method` sets how overlapping sub-image results are deduplicated when recombined; default `nms`;
+- `--match_threshold` sets the deduplication threshold; default 0.6;
+- `--match_metric` sets the deduplication metric; the default `ios` (intersection over the smaller box's area) can be replaced by `iou` (intersection over union). Accuracy varies by dataset, but `ios` is slightly faster;
+- `--save_results` saves the results as a json file, typically only used for single-image prediction;
+
+
+## Citations
+```
+@article{akyon2022sahi,
+ title={Slicing Aided Hyper Inference and Fine-tuning for Small Object Detection},
+ author={Akyon, Fatih Cagatay and Altinuc, Sinan Onur and Temizel, Alptekin},
+ journal={2022 IEEE International Conference on Image Processing (ICIP)},
+ doi={10.1109/ICIP46576.2022.9897990},
+ pages={966-970},
+ year={2022}
+}
+
+@inproceedings{xia2018dota,
+ title={DOTA: A large-scale dataset for object detection in aerial images},
+ author={Xia, Gui-Song and Bai, Xiang and Ding, Jian and Zhu, Zhen and Belongie, Serge and Luo, Jiebo and Datcu, Mihai and Pelillo, Marcello and Zhang, Liangpei},
+ booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
+ pages={3974--3983},
+ year={2018}
+}
+
+@ARTICLE{9573394,
+ author={Zhu, Pengfei and Wen, Longyin and Du, Dawei and Bian, Xiao and Fan, Heng and Hu, Qinghua and Ling, Haibin},
+ journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
+ title={Detection and Tracking Meet Drones Challenge},
+ year={2021},
+ volume={},
+ number={},
+ pages={1-1},
+ doi={10.1109/TPAMI.2021.3119563}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/smalldet/_base_/DOTA_sliced_500_025_detection.yml b/PaddleDetection-release-2.6/configs/smalldet/_base_/DOTA_sliced_500_025_detection.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d0fc0c389f6ed6e0af1bb9e52406cd2c80205c2c
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smalldet/_base_/DOTA_sliced_500_025_detection.yml
@@ -0,0 +1,20 @@
+metric: COCO
+num_classes: 15
+
+TrainDataset:
+ !COCODataSet
+ image_dir: train_images_500_025
+ anno_path: train_500_025.json
+ dataset_dir: dataset/dota_sliced
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: val_images_500_025
+ anno_path: val_500_025.json
+ dataset_dir: dataset/dota_sliced
+
+TestDataset:
+ !ImageFolder
+ anno_path: val_500_025.json
+ dataset_dir: dataset/dota_sliced
diff --git a/PaddleDetection-release-2.6/configs/smalldet/_base_/visdrone_sliced_640_025_detection.yml b/PaddleDetection-release-2.6/configs/smalldet/_base_/visdrone_sliced_640_025_detection.yml
new file mode 100644
index 0000000000000000000000000000000000000000..03848ca17549e159d5dab0886c9f83d461c4fdd7
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smalldet/_base_/visdrone_sliced_640_025_detection.yml
@@ -0,0 +1,20 @@
+metric: COCO
+num_classes: 10
+
+TrainDataset:
+ !COCODataSet
+ image_dir: train_images_640_025
+ anno_path: train_640_025.json
+ dataset_dir: dataset/visdrone_sliced
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: val_images_640_025
+ anno_path: val_640_025.json
+ dataset_dir: dataset/visdrone_sliced
+
+TestDataset:
+ !ImageFolder
+ anno_path: val_640_025.json
+ dataset_dir: dataset/visdrone_sliced
diff --git a/PaddleDetection-release-2.6/configs/smalldet/_base_/xview_sliced_400_025_detection.yml b/PaddleDetection-release-2.6/configs/smalldet/_base_/xview_sliced_400_025_detection.yml
new file mode 100644
index 0000000000000000000000000000000000000000..c80f545bd7e280b7d97f8ff9e7db25e86162bdf5
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smalldet/_base_/xview_sliced_400_025_detection.yml
@@ -0,0 +1,20 @@
+metric: COCO
+num_classes: 60
+
+TrainDataset:
+ !COCODataSet
+ image_dir: train_images_400_025
+ anno_path: train_400_025.json
+ dataset_dir: dataset/xview_sliced
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: val_images_400_025
+ anno_path: val_400_025.json
+ dataset_dir: dataset/xview_sliced
+
+TestDataset:
+ !ImageFolder
+ anno_path: val_400_025.json
+ dataset_dir: dataset/xview_sliced
diff --git a/PaddleDetection-release-2.6/configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml b/PaddleDetection-release-2.6/configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml
new file mode 100644
index 0000000000000000000000000000000000000000..26275899ff3b5125b46b86db402e0543e7780036
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml
@@ -0,0 +1,58 @@
+_BASE_: [
+ './_base_/visdrone_sliced_640_025_detection.yml',
+ '../runtime.yml',
+ '../ppyoloe/_base_/optimizer_300e.yml',
+ '../ppyoloe/_base_/ppyoloe_crn.yml',
+ '../ppyoloe/_base_/ppyoloe_reader.yml',
+]
+log_iter: 100
+snapshot_epoch: 10
+weights: output/ppyoloe_crn_l_80e_sliced_visdrone_640_025/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams
+depth_mult: 1.0
+width_mult: 1.0
+
+
+TrainReader:
+ batch_size: 8
+
+EvalReader:
+ batch_size: 1
+
+TestReader:
+ batch_size: 1
+ fuse_normalize: True
+
+
+epoch: 80
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !CosineDecay
+ max_epochs: 96
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 1
+
+PPYOLOEHead:
+ static_assigner_epoch: -1
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 10000
+ keep_top_k: 500
+ score_threshold: 0.01
+ nms_threshold: 0.6
+
+
+EvalDataset:
+ !COCODataSet
+ image_dir: val_images_640_025
+ anno_path: val_640_025.json
+ dataset_dir: dataset/visdrone_sliced
+
+# EvalDataset:
+# !COCODataSet
+# image_dir: VisDrone2019-DET-val
+# anno_path: val.json
+# dataset_dir: dataset/visdrone
diff --git a/PaddleDetection-release-2.6/configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025_slice_infer.yml b/PaddleDetection-release-2.6/configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025_slice_infer.yml
new file mode 100644
index 0000000000000000000000000000000000000000..91e45ff54c31b1092a2863b015a2d944ba3b678e
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025_slice_infer.yml
@@ -0,0 +1,15 @@
+_BASE_: [
+ 'ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml',
+]
+weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_80e_sliced_visdrone_640_025.pdparams
+
+
+# very slow; better to evaluate separately with fixed weights (xx.pdparams) rather than evaluating during training
+# to evaluate during training, change SlicedCOCODataSet to COCODataSet, then delete sliced_size and overlap_ratio
+EvalDataset:
+ !SlicedCOCODataSet
+ image_dir: VisDrone2019-DET-val
+ anno_path: val.json
+ dataset_dir: dataset/visdrone
+ sliced_size: [640, 640]
+ overlap_ratio: [0.25, 0.25]
diff --git a/PaddleDetection-release-2.6/configs/smalldet/ppyoloe_p2_crn_l_80e_sliced_DOTA_500_025.yml b/PaddleDetection-release-2.6/configs/smalldet/ppyoloe_p2_crn_l_80e_sliced_DOTA_500_025.yml
new file mode 100644
index 0000000000000000000000000000000000000000..4e47f2c88d3a1c230dcc6461cab70eeb68f53419
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smalldet/ppyoloe_p2_crn_l_80e_sliced_DOTA_500_025.yml
@@ -0,0 +1,54 @@
+_BASE_: [
+ './_base_/DOTA_sliced_500_025_detection.yml',
+ '../runtime.yml',
+ '../ppyoloe/_base_/optimizer_300e.yml',
+ '../ppyoloe/_base_/ppyoloe_crn.yml',
+ '../ppyoloe/_base_/ppyoloe_reader.yml',
+]
+log_iter: 100
+snapshot_epoch: 10
+weights: output/ppyoloe_p2_crn_l_80e_sliced_DOTA_500_025/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams
+depth_mult: 1.0
+width_mult: 1.0
+
+
+CSPResNet:
+ return_idx: [0, 1, 2, 3]
+ use_alpha: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192, 64]
+
+
+TrainReader:
+ batch_size: 4
+
+EvalReader:
+ batch_size: 1
+
+TestReader:
+ batch_size: 1
+ fuse_normalize: True
+
+
+epoch: 80
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !CosineDecay
+ max_epochs: 96
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 1
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8, 4]
+ static_assigner_epoch: -1
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 10000
+ keep_top_k: 500
+ score_threshold: 0.01
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/smalldet/ppyoloe_p2_crn_l_80e_sliced_xview_400_025.yml b/PaddleDetection-release-2.6/configs/smalldet/ppyoloe_p2_crn_l_80e_sliced_xview_400_025.yml
new file mode 100644
index 0000000000000000000000000000000000000000..e94d799b2727e5097a7b5e90f7b7c1935bed0df8
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smalldet/ppyoloe_p2_crn_l_80e_sliced_xview_400_025.yml
@@ -0,0 +1,54 @@
+_BASE_: [
+ './_base_/xview_sliced_400_025_detection.yml',
+ '../runtime.yml',
+ '../ppyoloe/_base_/optimizer_300e.yml',
+ '../ppyoloe/_base_/ppyoloe_crn.yml',
+ '../ppyoloe/_base_/ppyoloe_reader.yml',
+]
+log_iter: 100
+snapshot_epoch: 10
+weights: output/ppyoloe_p2_crn_l_80e_sliced_xview_400_025/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams
+depth_mult: 1.0
+width_mult: 1.0
+
+
+CSPResNet:
+ return_idx: [0, 1, 2, 3]
+ use_alpha: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192, 64]
+
+
+TrainReader:
+ batch_size: 4
+
+EvalReader:
+ batch_size: 1
+
+TestReader:
+ batch_size: 1
+ fuse_normalize: True
+
+
+epoch: 80
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !CosineDecay
+ max_epochs: 96
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 1
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8, 4]
+ static_assigner_epoch: -1
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 10000
+ keep_top_k: 500
+ score_threshold: 0.01
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/smalldet/ppyoloe_plus_sod_crn_l_80e_coco.yml b/PaddleDetection-release-2.6/configs/smalldet/ppyoloe_plus_sod_crn_l_80e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..ad4c52eac56fad3303169d02c6f6578abbdcf106
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smalldet/ppyoloe_plus_sod_crn_l_80e_coco.yml
@@ -0,0 +1,31 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '../ppyoloe/_base_/optimizer_80e.yml',
+ '../ppyoloe/_base_/ppyoloe_plus_crn.yml',
+ '../ppyoloe/_base_/ppyoloe_plus_reader.yml',
+]
+log_iter: 100
+snapshot_epoch: 5
+weights: output/ppyoloe_plus_sod_crn_l_80e_coco/model_final
+
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/ppyoloe_crn_l_obj365_pretrained.pdparams
+depth_mult: 1.0
+width_mult: 1.0
+
+CustomCSPPAN:
+ num_layers: 4
+ use_trans: True
+
+PPYOLOEHead:
+ reg_range: [-2, 17]
+ static_assigner_epoch: -1
+ assigner:
+ name: TaskAlignedAssigner_CR
+ center_radius: 1
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 300
+ score_threshold: 0.01
+ nms_threshold: 0.7
diff --git a/PaddleDetection-release-2.6/configs/smalldet/visdrone/README.md b/PaddleDetection-release-2.6/configs/smalldet/visdrone/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..fbe4ad82224ec6b4b3ee8bb1d7d30fd8c2913791
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smalldet/visdrone/README.md
@@ -0,0 +1,186 @@
+# VisDrone-DET Small Object Detection Models
+
+The PaddleDetection team provides PP-YOLOE based detection models for the VisDrone-DET small-object aerial scenario, available for direct download. The VisDrone-DET dataset converted to COCO format can be downloaded [here](https://bj.bcebos.com/v1/paddledet/data/smalldet/visdrone.zip). Ten classes are detected: `pedestrian(1), people(2), bicycle(3), car(4), van(5), truck(6), tricycle(7), awning-tricycle(8), bus(9), motor(10)`. The original dataset is available [here](https://github.com/VisDrone/VisDrone-Dataset). For other small-object datasets, see [DataDownload.md](../DataDownload.md).
+
+**Notes:**
+- VisDrone-DET contains a **train set of 6471 images, a val set of 548 images, and a test_dev set of 1610 images**, plus a test-challenge set of 1580 images whose box annotations are not released; the first three all have released box annotations.
+- All models are **trained on the train set only**, and evaluated separately on the val and test_dev sets; test_dev has more images, so its accuracy figures are a more reliable reference.
+
+
+## Training on Full Images, Evaluating on Full Images
+
+| Model | COCOAPI mAP<sup>val<br>0.5:0.95</sup> | COCOAPI mAP<sup>val<br>0.5</sup> | COCOAPI mAP<sup>test_dev<br>0.5:0.95</sup> | COCOAPI mAP<sup>test_dev<br>0.5</sup> | MatlabAPI mAP<sup>test_dev<br>0.5:0.95</sup> | MatlabAPI mAP<sup>test_dev<br>0.5</sup> | Download | Config |
+|:---------|:------:|:------:| :----: | :------:| :------: | :------:| :----: | :------:|
+|PP-YOLOE-s| 23.5 | 39.9 | 19.4 | 33.6 | 23.68 | 40.66 | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_s_80e_visdrone.pdparams) | [config](./ppyoloe_crn_s_80e_visdrone.yml) |
+|PP-YOLOE-P2-Alpha-s| 24.4 | 41.6 | 20.1 | 34.7 | 24.55 | 42.19 | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_s_p2_alpha_80e_visdrone.pdparams) | [config](./ppyoloe_crn_s_p2_alpha_80e_visdrone.yml) |
+|**PP-YOLOE+_SOD-s**| **25.1** | **42.8** | **20.7** | **36.2** | **25.16** | **43.86** | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_sod_crn_s_80e_visdrone.pdparams) | [config](./ppyoloe_plus_sod_crn_s_80e_visdrone.yml) |
+|PP-YOLOE-l| 29.2 | 47.3 | 23.5 | 39.1 | 28.00 | 46.20 | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_80e_visdrone.pdparams) | [config](./ppyoloe_crn_l_80e_visdrone.yml) |
+|PP-YOLOE-P2-Alpha-l| 30.1 | 48.9 | 24.3 | 40.8 | 28.47 | 48.16 | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_p2_alpha_80e_visdrone.pdparams) | [config](./ppyoloe_crn_l_p2_alpha_80e_visdrone.yml) |
+|**PP-YOLOE+_SOD-l**| **31.9** | **52.1** | **25.6** | **43.5** | **30.25** | **51.18** | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_sod_crn_l_80e_visdrone.pdparams) | [config](./ppyoloe_plus_sod_crn_l_80e_visdrone.yml) |
+|PP-YOLOE-Alpha-largesize-l| 41.9 | 65.0 | 32.3 | 53.0 | 37.13 | 61.15 | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_alpha_largesize_80e_visdrone.pdparams) | [config](./ppyoloe_crn_l_alpha_largesize_80e_visdrone.yml) |
+|PP-YOLOE-P2-Alpha-largesize-l| 41.3 | 64.5 | 32.4 | 53.1 | 37.49 | 51.54 | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_p2_alpha_largesize_80e_visdrone.pdparams) | [config](./ppyoloe_crn_l_p2_alpha_largesize_80e_visdrone.yml) |
+|PP-YOLOE+_largesize-l | 43.3 | 66.7 | 33.5 | 54.7 | 38.24 | 62.76 | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_largesize_80e_visdrone.pdparams) | [config](./ppyoloe_plus_crn_l_largesize_80e_visdrone.yml) |
+|**PP-YOLOE+_SOD-largesize-l** | 42.7 | 65.9 | **33.6** | **55.1** | **38.4** | **63.07** | [download](https://paddledet.bj.bcebos.com/models/ppyoloe_plus_sod_crn_l_largesize_80e_visdrone.pdparams) | [config](./ppyoloe_plus_sod_crn_l_largesize_80e_visdrone.yml) |
+
+**Notes:**
+ - All models above are **trained on full images** and also **evaluated and run on full images**; AP figures are measured on the **full-image validation sets**.
+ - VisDrone-DET **can be trained on either full images or sliced images**. Based on analysis of the dataset's statistical distribution, we recommend **training on full images** and using the **SOD** config files directly for training, evaluation, inference, and deployment; sliced-image training is an option when GPU compute is limited.
+ - All metrics above use the VisDrone-DET train subset as the training set and the val and test_dev subsets as validation sets.
+ - **SOD** means using the **vector-based DFL algorithm** and a **center-prior optimization strategy** for small objects, plus a **transformer added to the model's Neck**.
+ - **P2** means adding features from the P2 level (1/4 downsampling), yielding 4 PPYOLOEHead outputs in total.
+ - **Alpha** means adding a learnable weight parameter Alpha to the CSPResNet backbone during training.
+ - **largesize** means **multi-scale training based on a 1600 scale** with **prediction at a 1920 scale**, along with a correspondingly smaller training batch_size, trading speed for higher accuracy.
+ - MatlabAPI results use the official evaluation toolkit [VisDrone2018-DET-toolkit](https://github.com/VisDrone/VisDrone2018-DET-toolkit).
+
+
+### Quick Start
+
+```shell
+# Train
+python -m paddle.distributed.launch --log_dir=logs/ --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/smalldet/visdrone/ppyoloe_plus_sod_crn_l_80e_visdrone.yml --amp --eval
+# Evaluate
+python tools/eval.py -c configs/smalldet/visdrone/ppyoloe_plus_sod_crn_l_80e_visdrone.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_sod_crn_l_80e_visdrone.pdparams
+# Inference
+python tools/infer.py -c configs/smalldet/visdrone/ppyoloe_plus_sod_crn_l_80e_visdrone.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_sod_crn_l_80e_visdrone.pdparams --infer_img=demo/visdrone_0000315_01601_d_0000509.jpg --draw_threshold=0.25
+```
+
+
+
+
+## Training on Sliced Images, Evaluating on Full and Stitched Images
+
+| Model | Dataset | SLICE_SIZE | OVERLAP_RATIO | Classes | mAP<sup>val<br>0.5:0.95</sup> | AP<sup>val<br>0.5</sup> | Download | Config |
+|:---------|:---------------:|:---------------:|:---------------:|:------:|:-----------------------:|:-------------------:|:---------:| :-----: |
+|PP-YOLOE-l (eval directly on sub-images)| VisDrone-DET| 640 | 0.25 | 10 | 38.5 (sub-image val) | 60.2 | [download](https://bj.bcebos.com/v1/paddledet/models/ppyoloe_crn_l_80e_sliced_visdrone_640_025.pdparams) | [config](./ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml) |
+|PP-YOLOE-l (eval directly on full images)| VisDrone-DET| 640 | 0.25 | 10 | 29.7 (full-image val) | 48.5 | [download](https://bj.bcebos.com/v1/paddledet/models/ppyoloe_crn_l_80e_sliced_visdrone_640_025.pdparams) | [config](../ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml) |
+|PP-YOLOE-l (slice-and-stitch eval)| VisDrone-DET| 640 | 0.25 | 10 | 37.3 (full-image val) | 59.5 | [download](https://bj.bcebos.com/v1/paddledet/models/ppyoloe_crn_l_80e_sliced_visdrone_640_025.pdparams) | [config](../ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml) |
+
+**Notes:**
+ - The models above are trained on **sliced sub-images**. Evaluation and inference come in two flavors: **directly on full images**, or with **sub-images automatically stitched back into full images**; AP figures are measured on the **full-image validation set**.
+ - **SLICE_SIZE** is the side length of the sub-images produced by the SAHI slicing tool; **OVERLAP_RATIO** is the overlap ratio between neighboring sub-images.
+ - The VisDrone-DET model here shares the **same weights** as the VisDrone-DET entry in the [sliced models](../README.md#切图模型) table, but the AP here is evaluated on the **full-image validation set**. You must first change the default sub-image validation set path in the `EvalDataset` of `ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml` to the following **full-image validation set path**:
+ ```
+ EvalDataset:
+ !COCODataSet
+ image_dir: VisDrone2019-DET-val
+ anno_path: val.json
+ dataset_dir: dataset/visdrone
+ ```
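As a rough illustration of how SLICE_SIZE and OVERLAP_RATIO interact, the sketch below computes the start offsets of overlapping slices along one image axis. This is a simplified sketch of the slicing arithmetic only, written by us; SAHI's actual implementation handles edge cases and parameters differently.

```python
def slice_starts(length, slice_size=640, overlap_ratio=0.25):
    """Start offsets of overlapping slices along one image axis.

    Consecutive slices advance by slice_size * (1 - overlap_ratio);
    a final slice flush with the image border is added if needed.
    """
    step = int(slice_size * (1 - overlap_ratio))
    starts = list(range(0, max(length - slice_size, 0) + 1, step))
    if starts[-1] + slice_size < length:
        starts.append(length - slice_size)
    return starts

# A 1400-pixel-wide image sliced at 640 with 25% overlap:
print(slice_starts(1400))  # [0, 480, 760]
```

With the default 640/0.25 settings, adjacent slices share 160 pixels, which keeps small objects near slice borders fully visible in at least one sub-image.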
+
+
+### Quick Start
+
+```shell
+# Train
+python -m paddle.distributed.launch --log_dir=logs/ --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml --amp --eval
+# Evaluate directly on sub-images
+python tools/eval.py -c configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_80e_sliced_visdrone_640_025.pdparams
+# Evaluate directly on full images; remember to first change the default sub-image validation set path in this yml's `EvalDataset` to the full-image validation set path:
+python tools/eval.py -c configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_80e_sliced_visdrone_640_025.pdparams
+# Slice-and-stitch evaluation with --slice_infer; note this uses the config file with the _slice_infer suffix
+python tools/eval.py -c configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025_slice_infer.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_80e_sliced_visdrone_640_025.pdparams --slice_infer
+# Slice-and-stitch inference with --slice_infer
+python tools/infer.py -c configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_80e_sliced_visdrone_640_025.pdparams --infer_img=demo/visdrone_0000315_01601_d_0000509.jpg --draw_threshold=0.25 --slice_infer
+```
+
+
+
+
+## Notes:
+ - PP-YOLOE models are trained with mixed precision on 8 GPUs. If the **number of GPUs** or the **batch size** changes, adjust the learning rate according to the rule **lr<sub>new</sub> = lr<sub>default</sub> * (batch_size<sub>new</sub> * GPU_number<sub>new</sub>) / (batch_size<sub>default</sub> * GPU_number<sub>default</sub>)**.
+ - For detailed usage, see [ppyoloe](../../ppyoloe#getting-start).
+ - MatlabAPI results use the official evaluation toolkit [VisDrone2018-DET-toolkit](https://github.com/VisDrone/VisDrone2018-DET-toolkit).
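The linear scaling rule above can be sketched in a few lines of Python. This is a hypothetical helper written by us, not part of the repo; the defaults assume the 8-GPU, batch_size-8 reference setup used in these configs.

```python
def scale_lr(base_lr, batch_size_new, gpu_num_new,
             batch_size_default=8, gpu_num_default=8):
    """Linear scaling rule: lr is proportional to the total batch size."""
    total_new = batch_size_new * gpu_num_new
    total_default = batch_size_default * gpu_num_default
    return base_lr * total_new / total_default

# E.g. moving from 8 GPUs x batch_size 8 to 4 GPUs x batch_size 4:
print(scale_lr(0.01, 4, 4))  # 0.0025
```

This is why the largesize configs, which cut batch_size, also lower `base_lr` proportionally.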
+
+
+## PP-YOLOE+_SOD Deployment Models
+
+| Model | Input Size | Exported weights (w/ NMS) | Exported weights (w/o NMS) | ONNX (w/ NMS) | ONNX (w/o NMS) |
+| :-------- | :--------: | :---------------------: | :----------------: | :----------------: | :----------------: |
+| PP-YOLOE+_SOD-s | 640 | [( w/ nms)](https://paddledet.bj.bcebos.com/deploy/smalldet/ppyoloe_plus_sod_crn_s_80e_visdrone_w_nms.zip) | [( w/o nms)](https://paddledet.bj.bcebos.com/deploy/smalldet/ppyoloe_plus_sod_crn_s_80e_visdrone_wo_nms.zip) | [( w/ nms)](https://paddledet.bj.bcebos.com/deploy/smalldet/ppyoloe_plus_sod_crn_s_80e_visdrone_w_nms.onnx) | [( w/o nms)](https://paddledet.bj.bcebos.com/deploy/smalldet/ppyoloe_plus_sod_crn_s_80e_visdrone_wo_nms.onnx) |
+| PP-YOLOE+_SOD-l | 640 | [( w/ nms)](https://paddledet.bj.bcebos.com/deploy/smalldet/ppyoloe_plus_sod_crn_l_80e_visdrone_w_nms.zip) | [( w/o nms)](https://paddledet.bj.bcebos.com/deploy/smalldet/ppyoloe_plus_sod_crn_l_80e_visdrone_wo_nms.zip) | [( w/ nms)](https://paddledet.bj.bcebos.com/deploy/smalldet/ppyoloe_plus_sod_crn_l_80e_visdrone_w_nms.onnx) | [( w/o nms)](https://paddledet.bj.bcebos.com/deploy/smalldet/ppyoloe_plus_sod_crn_l_80e_visdrone_wo_nms.onnx) |
+| PP-YOLOE+_SOD-largesize-l | 1920 | [( w/ nms)](https://paddledet.bj.bcebos.com/deploy/smalldet/ppyoloe_plus_sod_crn_l_largesize_80e_visdrone_w_nms.zip) | [( w/o nms)](https://paddledet.bj.bcebos.com/deploy/smalldet/ppyoloe_plus_sod_crn_l_largesize_80e_visdrone_wo_nms.zip) | [( w/ nms)](https://paddledet.bj.bcebos.com/deploy/smalldet/ppyoloe_plus_sod_crn_l_largesize_80e_visdrone_w_nms.onnx) | [( w/o nms)](https://paddledet.bj.bcebos.com/deploy/smalldet/ppyoloe_plus_sod_crn_l_largesize_80e_visdrone_wo_nms.onnx) |
+
+
+## Speed Benchmarking
+
+1. Following the [Paddle Inference docs](https://www.paddlepaddle.org.cn/inference/master/user_guides/download_lib.html#python), download and install the wheel package matching your CUDA, cuDNN, and TensorRT versions.
+Benchmarking requires setting `--run_benchmark=True` and installing the dependencies `pip install pynvml psutil GPUtil`.
+Exporting ONNX requires installing `pip install paddle2onnx`.
+
+2. Run the following commands to export the **model and ONNX with NMS**, then run inference and benchmark with TensorRT FP16.
+
+### Note:
+
+- NMS parameters strongly affect speed. For deployment benchmarking you can lower `keep_top_k` and `nms_top_k` to speed up prediction at a cost of only about 0.1 mAP; the same settings can be used when exporting the model:
+ ```
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000 # 10000
+ keep_top_k: 100 # 500
+ score_threshold: 0.01
+ nms_threshold: 0.6
+ ```
+
+```bash
+# Export model with NMS
+python tools/export_model.py -c configs/smalldet/visdrone/ppyoloe_plus_sod_crn_l_largesize_80e_visdrone.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_sod_crn_l_largesize_80e_visdrone.pdparams trt=True
+
+# Export ONNX with NMS
+paddle2onnx --model_dir output_inference/ppyoloe_plus_sod_crn_l_largesize_80e_visdrone --model_filename model.pdmodel --params_filename model.pdiparams --opset_version 12 --save_file ppyoloe_plus_sod_crn_l_largesize_80e_visdrone.onnx
+
+# Run inference on a single image
+CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyoloe_plus_sod_crn_l_largesize_80e_visdrone --image_file=demo/visdrone_0000315_01601_d_0000509.jpg --device=gpu --run_mode=trt_fp16
+
+# Run inference on all images in a folder
+CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyoloe_plus_sod_crn_l_largesize_80e_visdrone --image_dir=demo/ --device=gpu --run_mode=trt_fp16
+
+# Benchmark a single image (plain)
+CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyoloe_plus_sod_crn_l_largesize_80e_visdrone --image_file=demo/visdrone_0000315_01601_d_0000509.jpg --device=gpu --run_benchmark=True
+
+# Benchmark a single image with TensorRT FP16
+CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyoloe_plus_sod_crn_l_largesize_80e_visdrone --image_file=demo/visdrone_0000315_01601_d_0000509.jpg --device=gpu --run_benchmark=True --run_mode=trt_fp16
+```
+
+3. Run the following commands to export the **model and ONNX without NMS**, then run inference and benchmark with TensorRT FP16, plus **ONNX FP16 benchmarking**.
+
+```bash
+# Export model without NMS
+python tools/export_model.py -c configs/smalldet/visdrone/ppyoloe_plus_sod_crn_l_largesize_80e_visdrone.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_plus_sod_crn_l_largesize_80e_visdrone.pdparams trt=True exclude_nms=True
+
+# Export ONNX without NMS
+paddle2onnx --model_dir output_inference/ppyoloe_plus_sod_crn_l_largesize_80e_visdrone --model_filename model.pdmodel --params_filename model.pdiparams --opset_version 12 --save_file ppyoloe_plus_sod_crn_l_largesize_80e_visdrone.onnx
+
+# Run inference on a single image
+CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyoloe_plus_sod_crn_l_largesize_80e_visdrone --image_file=demo/visdrone_0000315_01601_d_0000509.jpg --device=gpu --run_mode=trt_fp16
+
+# Run inference on all images in a folder
+CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyoloe_plus_sod_crn_l_largesize_80e_visdrone --image_dir=demo/ --device=gpu --run_mode=trt_fp16
+
+# Benchmark a single image (plain)
+CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyoloe_plus_sod_crn_l_largesize_80e_visdrone --image_file=demo/visdrone_0000315_01601_d_0000509.jpg --device=gpu --run_benchmark=True
+
+# Benchmark a single image with TensorRT FP16
+CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyoloe_plus_sod_crn_l_largesize_80e_visdrone --image_file=demo/visdrone_0000315_01601_d_0000509.jpg --device=gpu --run_benchmark=True --run_mode=trt_fp16
+
+# Benchmark a single image with ONNX under TensorRT FP16
+/usr/local/TensorRT-8.0.3.4/bin/trtexec --onnx=ppyoloe_plus_sod_crn_l_largesize_80e_visdrone.onnx --workspace=4096 --avgRuns=10 --shapes=input:1x3x1920x1920 --fp16
+```
+
+**Notes:**
+- TensorRT optimizes the network for the current hardware platform, generates an inference engine, and serializes it to a file; that engine is only valid for the platform it was built on. If your software/hardware platform does not change, you can set `use_static=True` in [enable_tensorrt_engine](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.5/deploy/python/infer.py#L857); the serialized file is then saved under `output_inference` and loaded on subsequent TensorRT runs.
+- Running NMS inside TensorRT is supported from PaddleDetection release/2.4 onward and requires PaddlePaddle release/2.3 or later.
+
+
+# Citation
+```
+@ARTICLE{9573394,
+ author={Zhu, Pengfei and Wen, Longyin and Du, Dawei and Bian, Xiao and Fan, Heng and Hu, Qinghua and Ling, Haibin},
+ journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
+ title={Detection and Tracking Meet Drones Challenge},
+ year={2021},
+ volume={},
+ number={},
+ pages={1-1},
+ doi={10.1109/TPAMI.2021.3119563}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/smalldet/visdrone/ppyoloe_crn_l_80e_visdrone.yml b/PaddleDetection-release-2.6/configs/smalldet/visdrone/ppyoloe_crn_l_80e_visdrone.yml
new file mode 100644
index 0000000000000000000000000000000000000000..2eaa3164309a77ad441e9087ff65822d61ab278b
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smalldet/visdrone/ppyoloe_crn_l_80e_visdrone.yml
@@ -0,0 +1,45 @@
+_BASE_: [
+ '../../datasets/visdrone_detection.yml',
+ '../../runtime.yml',
+ '../../ppyoloe/_base_/optimizer_300e.yml',
+ '../../ppyoloe/_base_/ppyoloe_crn.yml',
+ '../../ppyoloe/_base_/ppyoloe_reader.yml',
+]
+log_iter: 100
+snapshot_epoch: 10
+weights: output/ppyoloe_crn_l_80e_visdrone/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams
+depth_mult: 1.0
+width_mult: 1.0
+
+
+TrainReader:
+ batch_size: 8
+
+EvalReader:
+ batch_size: 1
+
+TestReader:
+ batch_size: 1
+ fuse_normalize: True
+
+
+epoch: 80
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !CosineDecay
+ max_epochs: 96
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 1
+
+PPYOLOEHead:
+ static_assigner_epoch: -1
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 10000
+ keep_top_k: 500
+ score_threshold: 0.01
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/smalldet/visdrone/ppyoloe_crn_l_alpha_largesize_80e_visdrone.yml b/PaddleDetection-release-2.6/configs/smalldet/visdrone/ppyoloe_crn_l_alpha_largesize_80e_visdrone.yml
new file mode 100644
index 0000000000000000000000000000000000000000..c16f2116a7015be193d052bc7c9811d2b4133de3
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smalldet/visdrone/ppyoloe_crn_l_alpha_largesize_80e_visdrone.yml
@@ -0,0 +1,58 @@
+_BASE_: [
+ 'ppyoloe_crn_l_80e_visdrone.yml',
+]
+log_iter: 100
+snapshot_epoch: 10
+weights: output/ppyoloe_crn_l_alpha_largesize_80e_visdrone/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams
+
+
+CSPResNet:
+ use_alpha: True
+
+
+LearningRate:
+ base_lr: 0.0025
+
+
+worker_num: 2
+eval_height: &eval_height 1920
+eval_width: &eval_width 1920
+eval_size: &eval_size [*eval_height, *eval_width]
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [1024, 1088, 1152, 1216, 1280, 1344, 1408, 1472, 1536, 1600, 1664, 1728, 1792, 1856, 1920], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 2
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+ fuse_normalize: True
diff --git a/PaddleDetection-release-2.6/configs/smalldet/visdrone/ppyoloe_crn_l_p2_alpha_80e_visdrone.yml b/PaddleDetection-release-2.6/configs/smalldet/visdrone/ppyoloe_crn_l_p2_alpha_80e_visdrone.yml
new file mode 100644
index 0000000000000000000000000000000000000000..f0093beb0fdf66079483f3bec07a94cc5afec617
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smalldet/visdrone/ppyoloe_crn_l_p2_alpha_80e_visdrone.yml
@@ -0,0 +1,33 @@
+_BASE_: [
+ 'ppyoloe_crn_l_80e_visdrone.yml',
+]
+log_iter: 100
+snapshot_epoch: 10
+weights: output/ppyoloe_crn_l_p2_alpha_80e_visdrone/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams
+
+
+TrainReader:
+ batch_size: 4
+
+EvalReader:
+ batch_size: 1
+
+TestReader:
+ batch_size: 1
+ fuse_normalize: True
+
+
+LearningRate:
+ base_lr: 0.005
+
+
+CSPResNet:
+ return_idx: [0, 1, 2, 3]
+ use_alpha: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192, 64]
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8, 4]
diff --git a/PaddleDetection-release-2.6/configs/smalldet/visdrone/ppyoloe_crn_l_p2_alpha_largesize_80e_visdrone.yml b/PaddleDetection-release-2.6/configs/smalldet/visdrone/ppyoloe_crn_l_p2_alpha_largesize_80e_visdrone.yml
new file mode 100644
index 0000000000000000000000000000000000000000..53b4c9f0d66b74e20897e0ae509176d1ab4beceb
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smalldet/visdrone/ppyoloe_crn_l_p2_alpha_largesize_80e_visdrone.yml
@@ -0,0 +1,65 @@
+_BASE_: [
+ 'ppyoloe_crn_l_80e_visdrone.yml',
+]
+log_iter: 100
+snapshot_epoch: 10
+weights: output/ppyoloe_crn_l_p2_alpha_largesize_80e_visdrone/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams
+
+
+LearningRate:
+ base_lr: 0.005
+
+
+CSPResNet:
+ return_idx: [0, 1, 2, 3]
+ use_alpha: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192, 64]
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8, 4]
+
+
+worker_num: 2
+eval_height: &eval_height 1920
+eval_width: &eval_width 1920
+eval_size: &eval_size [*eval_height, *eval_width]
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [1024, 1088, 1152, 1216, 1280, 1344, 1408, 1472, 1536, 1600, 1664, 1728, 1792, 1856, 1920, 1984, 2048], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 1
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+ fuse_normalize: True
diff --git a/PaddleDetection-release-2.6/configs/smalldet/visdrone/ppyoloe_crn_s_80e_visdrone.yml b/PaddleDetection-release-2.6/configs/smalldet/visdrone/ppyoloe_crn_s_80e_visdrone.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d5fefe6a8ae2db471d15575c0fd4bc342141a480
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smalldet/visdrone/ppyoloe_crn_s_80e_visdrone.yml
@@ -0,0 +1,45 @@
+_BASE_: [
+ '../../datasets/visdrone_detection.yml',
+ '../../runtime.yml',
+ '../../ppyoloe/_base_/optimizer_300e.yml',
+ '../../ppyoloe/_base_/ppyoloe_crn.yml',
+ '../../ppyoloe/_base_/ppyoloe_reader.yml',
+]
+log_iter: 100
+snapshot_epoch: 10
+weights: output/ppyoloe_crn_s_80e_visdrone/model_final
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_s_300e_coco.pdparams
+depth_mult: 0.33
+width_mult: 0.50
+
+
+TrainReader:
+ batch_size: 8
+
+EvalReader:
+ batch_size: 1
+
+TestReader:
+ batch_size: 1
+ fuse_normalize: True
+
+
+epoch: 80
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !CosineDecay
+ max_epochs: 96
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 1
+
+PPYOLOEHead:
+ static_assigner_epoch: -1
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 10000
+ keep_top_k: 500
+ score_threshold: 0.01
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/smalldet/visdrone/ppyoloe_crn_s_p2_alpha_80e_visdrone.yml b/PaddleDetection-release-2.6/configs/smalldet/visdrone/ppyoloe_crn_s_p2_alpha_80e_visdrone.yml
new file mode 100644
index 0000000000000000000000000000000000000000..70e34cd05a872e168f88e7d59858d69305559e29
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smalldet/visdrone/ppyoloe_crn_s_p2_alpha_80e_visdrone.yml
@@ -0,0 +1,32 @@
+_BASE_: [
+ 'ppyoloe_crn_s_80e_visdrone.yml',
+]
+log_iter: 100
+snapshot_epoch: 10
+weights: output/ppyoloe_crn_s_p2_alpha_80e_visdrone/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_s_300e_coco.pdparams
+
+
+TrainReader:
+ batch_size: 4
+
+EvalReader:
+ batch_size: 1
+
+TestReader:
+ batch_size: 1
+ fuse_normalize: True
+
+
+LearningRate:
+ base_lr: 0.005
+
+CSPResNet:
+ return_idx: [0, 1, 2, 3]
+ use_alpha: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192, 64]
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8, 4]
diff --git a/PaddleDetection-release-2.6/configs/smalldet/visdrone/ppyoloe_plus_crn_l_largesize_80e_visdrone.yml b/PaddleDetection-release-2.6/configs/smalldet/visdrone/ppyoloe_plus_crn_l_largesize_80e_visdrone.yml
new file mode 100644
index 0000000000000000000000000000000000000000..fddfe46a1afe02ff3955d0b22f41efd511cb7722
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smalldet/visdrone/ppyoloe_plus_crn_l_largesize_80e_visdrone.yml
@@ -0,0 +1,58 @@
+_BASE_: [
+ 'ppyoloe_crn_l_80e_visdrone.yml',
+]
+log_iter: 100
+snapshot_epoch: 10
+weights: output/ppyoloe_plus_crn_l_largesize_80e_visdrone/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams
+
+
+CSPResNet:
+ use_alpha: True
+
+
+LearningRate:
+ base_lr: 0.0025
+
+
+worker_num: 2
+eval_height: &eval_height 1920
+eval_width: &eval_width 1920
+eval_size: &eval_size [*eval_height, *eval_width]
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [1024, 1088, 1152, 1216, 1280, 1344, 1408, 1472, 1536, 1600, 1664, 1728, 1792, 1856, 1920], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 2
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ batch_size: 1
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ batch_size: 1
+ fuse_normalize: True
diff --git a/PaddleDetection-release-2.6/configs/smalldet/visdrone/ppyoloe_plus_sod_crn_l_80e_visdrone.yml b/PaddleDetection-release-2.6/configs/smalldet/visdrone/ppyoloe_plus_sod_crn_l_80e_visdrone.yml
new file mode 100644
index 0000000000000000000000000000000000000000..e0c6e41d3dc30ff78cae57596bca86521a008099
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smalldet/visdrone/ppyoloe_plus_sod_crn_l_80e_visdrone.yml
@@ -0,0 +1,58 @@
+_BASE_: [
+ '../../datasets/visdrone_detection.yml',
+ '../../runtime.yml',
+ '../../ppyoloe/_base_/optimizer_80e.yml',
+ '../../ppyoloe/_base_/ppyoloe_plus_crn.yml',
+ '../../ppyoloe/_base_/ppyoloe_plus_reader.yml'
+]
+log_iter: 100
+snapshot_epoch: 10
+weights: output/ppyoloe_plus_sod_crn_l_80e_visdrone/model_final
+
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/ppyoloe_plus_crn_l_80e_coco.pdparams
+depth_mult: 1.0
+width_mult: 1.0
+
+TrainReader:
+ batch_size: 8
+
+EvalReader:
+ batch_size: 1
+
+TestReader:
+ batch_size: 1
+ fuse_normalize: True
+
+
+epoch: 80
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !CosineDecay
+ max_epochs: 96
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 1
+
+CustomCSPPAN:
+ num_layers: 4
+ use_trans: True
+
+PPYOLOEHead:
+ reg_range: [-2,8]
+ static_assigner_epoch: -1
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner_CR
+ center_radius: 1
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 10000
+ keep_top_k: 500
+ score_threshold: 0.01
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/smalldet/visdrone/ppyoloe_plus_sod_crn_l_largesize_80e_visdrone.yml b/PaddleDetection-release-2.6/configs/smalldet/visdrone/ppyoloe_plus_sod_crn_l_largesize_80e_visdrone.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d7d0865132b10fdf872af4bb70526d17333e0d71
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smalldet/visdrone/ppyoloe_plus_sod_crn_l_largesize_80e_visdrone.yml
@@ -0,0 +1,56 @@
+_BASE_: [
+ 'ppyoloe_plus_sod_crn_l_80e_visdrone.yml',
+]
+log_iter: 100
+snapshot_epoch: 10
+weights: output/ppyoloe_plus_sod_crn_l_largesize_80e_visdrone/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_l_80e_coco.pdparams
+
+PPYOLOEHead:
+ reg_range: [-2,20]
+ static_assigner_epoch: -1
+
+LearningRate:
+ base_lr: 0.00125
+
+worker_num: 2
+eval_height: &eval_height 1920
+eval_width: &eval_width 1920
+eval_size: &eval_size [*eval_height, *eval_width]
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [1024, 1088, 1152, 1216, 1280, 1344, 1408, 1472, 1536, 1600, 1664, 1728, 1792, 1856, 1920], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 1
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ batch_size: 1
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ batch_size: 1
+ fuse_normalize: True
diff --git a/PaddleDetection-release-2.6/configs/smalldet/visdrone/ppyoloe_plus_sod_crn_s_80e_visdrone.yml b/PaddleDetection-release-2.6/configs/smalldet/visdrone/ppyoloe_plus_sod_crn_s_80e_visdrone.yml
new file mode 100644
index 0000000000000000000000000000000000000000..fd444eab6f238e266839caaf91f65d65be159ef8
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smalldet/visdrone/ppyoloe_plus_sod_crn_s_80e_visdrone.yml
@@ -0,0 +1,58 @@
+_BASE_: [
+ '../../datasets/visdrone_detection.yml',
+ '../../runtime.yml',
+ '../../ppyoloe/_base_/optimizer_80e.yml',
+ '../../ppyoloe/_base_/ppyoloe_plus_crn.yml',
+ '../../ppyoloe/_base_/ppyoloe_plus_reader.yml'
+]
+log_iter: 100
+snapshot_epoch: 10
+weights: output/ppyoloe_plus_sod_crn_s_80e_visdrone/model_final
+
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/ppyoloe_plus_crn_s_80e_coco.pdparams
+depth_mult: 0.33
+width_mult: 0.50
+
+TrainReader:
+ batch_size: 8
+
+EvalReader:
+ batch_size: 1
+
+TestReader:
+ batch_size: 1
+ fuse_normalize: True
+
+
+epoch: 80
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !CosineDecay
+ max_epochs: 96
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 1
+
+CustomCSPPAN:
+ num_layers: 4
+ use_trans: True
+
+PPYOLOEHead:
+ reg_range: [-2,8]
+ static_assigner_epoch: -1
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner_CR
+ center_radius: 1
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 10000
+ keep_top_k: 500
+ score_threshold: 0.01
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/smrt/DataAnalysis.md b/PaddleDetection-release-2.6/configs/smrt/DataAnalysis.md
new file mode 100644
index 0000000000000000000000000000000000000000..66da22f43d9ba9494c35ee0fa0285aa45099399f
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/DataAnalysis.md
@@ -0,0 +1,68 @@
+# Data Analysis Feature
+
+To better help users analyze their data and recommend a more suitable model, we provide a **data analysis** feature: users do not need to upload original images, only their annotation files, to have the characteristics of their data analyzed.
+
+Currently supported formats:
+* LabelMe annotation format
+* JingLing (精灵标注) annotation format
+* LabelImg annotation format
+* VOC format
+* COCO format
+* Seg format
+
+## LabelMe Annotation Format
+
+1. Select a zip archive containing the annotation files. The archive must contain an annotations folder holding one json file per annotated image, each named after its image (except for the extension).
+2. Detection and segmentation tasks are supported. If the annotations do not match the selected task type, an error is reported.
+3. Detection requires rectangle-type annotations; segmentation requires polygon-type annotations.
+
+

+
+
+## JingLing (精灵标注) Annotation Format
+
+1. Select a zip archive containing the annotation files. The archive must contain an annotations folder holding one json file per annotated image, each named after its image (except for the extension).
+2. Detection and segmentation tasks are supported. If the annotations do not match the selected task type, an error is reported.
+3. Detection requires bndbox- or polygon-type annotations; segmentation requires polygon-type annotations.
+
+

+
+
+## LabelImg Annotation Format
+
+1. Select a zip archive containing the annotation files. The archive must contain an annotations folder holding one xml file per annotated image, each named after its image (except for the extension).
+2. Only detection tasks are supported.
+3. The annotation files must provide a bndbox field; the segmentation field is optional.
+
+
+

+
+
+## VOC Format
+
+1. Select a zip archive containing the annotation files. The archive must contain an `annotations` folder holding one xml file per annotated image; each xml file shares its name (minus the extension) with the corresponding image.
+2. Only detection tasks are supported.
+3. The annotation files must provide the bndbox field; the segmentation field is optional.
+
+
+
+
+
+## COCO Format
+
+1. Select a zip archive containing the annotation files. The archive must contain an `annotations` folder holding exactly one file named annotation.json.
+2. Detection and segmentation tasks are supported. If the provided annotations do not match the selected task type, an error is reported.
+3. For detection tasks, the annotation file must contain the bbox field (segmentation is optional); for segmentation tasks, it must contain the segmentation field.
+
+
+
+
+
+
+## Seg Format
+
+1. Select a zip archive containing the annotation files. The archive must contain an `annotations` folder holding one png file per annotated image; each png file shares its name (minus the extension) with the corresponding image.
+2. Only segmentation tasks are supported.
+3. The annotation file must correspond to the original image pixel for pixel and must be png (suffix .png or .PNG). Each pixel value is an integer ID in [0, 255]; apart from 255, the label IDs must increase consecutively from 0 with no jumps. The value 255 marks pixels to ignore, and 0 marks the background class.
+
+
+
+
diff --git a/PaddleDetection-release-2.6/configs/smrt/README.md b/PaddleDetection-release-2.6/configs/smrt/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..d9ffbcc1275dc5cb55d07bbe88f030defc55ddf5
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/README.md
@@ -0,0 +1,216 @@
+# PaddleSMRT: PaddlePaddle's Industrial Model Selection Tool
+
+## 1. Introduction
+
+PaddleSMRT (Paddle Sense Model Recommend Tool) is an industrial model selection tool built on PaddlePaddle's experience with production deployments. Users describe their actual requirements and receive matching recommendations for algorithm models, deployment hardware, and tutorial documentation. To make the recommendations more precise, a data analysis feature has been added: users upload their annotation files, and the system automatically analyzes the data characteristics, such as class imbalance, small objects, and dense objects, and then suggests better-fitting models and optimization strategies for the scenario.
+
+Try it on the PaddlePaddle website: [link](https://www.paddlepaddle.org.cn/smrt)
+
+This document explains how PaddleSMRT recommends detection models and how to use the recommended models. For the segmentation models, please refer to [this document](https://github.com/PaddlePaddle/PaddleSeg/tree/release/2.5/configs/smrt)
+
+## 2. Datasets
+
+PaddleSMRT recommends the most suitable model by comparing detection algorithms on real industrial data. It currently covers two scenarios, industrial quality inspection and urban security. The datasets used for the algorithm comparison are described below.
+
+### 1. New-Energy Battery Quality Inspection Dataset
+
+A quality-inspection dataset of new-energy battery components: 15,021 images with 22,045 annotated boxes covering 45 defect types, such as glue loss, cracks, and scratches.
+
+Sample images from the battery dataset:
+
+
+
+
+
+
+
+
+Dataset characteristics:
+
+1. Balanced class distribution
+2. Small objects
+3. Non-dense
+
+### 2. Aluminum Part Quality Inspection Dataset
+
+A quality-inspection dataset from aluminum part production: 11,293 images with 43,157 annotated boxes covering 5 defect types, such as scratches, press marks, and peeling.
+
+Sample images from the aluminum part dataset:
+
+
+
+
+
+
+
+
+
+Dataset characteristics:
+
+1. Imbalanced class distribution
+2. Small objects
+3. Non-dense
+
+
+### 3. Person-Vehicle Dataset
+
+The dataset contains 2,600 images with manually annotated two-point anchor-box labels covering 22 person and vehicle classes:
+pedestrians include ordinary pedestrians, 3D dummies, seated people, and cyclists; vehicles include hatchbacks, sedans, minibuses, minivans, pickups, light trucks, box trucks, tractor units, cement trucks, construction vehicles, school buses, small and medium buses, large single-deck buses, small electric vehicles, motorcycles, bicycles, tricycles, and other special vehicles.
+
+Sample images from the person-vehicle dataset:
+
+
+
+
+
+
+
+
+
+Dataset characteristics:
+
+1. Imbalanced class distribution
+2. Small objects
+3. Non-dense
+
+**Note:**
+
+The dataset characteristics are judged as follows:
+
+- Imbalanced class distribution: sample 1,000 images; the standard deviation of per-class sample counts exceeds 400
+- Small-object dataset: samples with relative size below 0.1 or absolute size below 32 pixels make up more than 30% of all samples
+- Dense dataset:
+
+```
+  Dense object: more than 2 surrounding objects lie within twice the object's own size;
+
+  Dense image: dense objects make up more than 50% of the objects in the image;
+
+  Dense dataset: dense images make up more than 30% of all images
+
+```
+
+To further help users choose a model, we also provide rich data analysis features: upload the annotation files alone (no original images needed) to see the distribution of data characteristics and model optimization suggestions
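As a concrete reading of these definitions, the density heuristics can be sketched in a few lines; the (x, y, w, h) box format, the "longer side" size measure, and the center-distance metric are our own assumptions, since the text does not pin them down:

```python
def box_size(box):
    """Characteristic size of an (x, y, w, h) box: its longer side."""
    return max(box[2], box[3])

def center(box):
    return (box[0] + box[2] / 2.0, box[1] + box[3] / 2.0)

def is_dense_object(box, others):
    """Dense object: more than 2 other boxes lie within 2x its own size."""
    cx, cy = center(box)
    near = 0
    for other in others:
        ox, oy = center(other)
        dist = ((cx - ox) ** 2 + (cy - oy) ** 2) ** 0.5
        if dist < 2.0 * box_size(box):
            near += 1
    return near > 2

def is_dense_image(boxes):
    """Dense image: dense objects are at least 50% of all objects."""
    if not boxes:
        return False
    dense = sum(
        is_dense_object(b, boxes[:i] + boxes[i + 1:]) for i, b in enumerate(boxes)
    )
    return dense / len(boxes) >= 0.5

def is_dense_dataset(images):
    """Dense dataset: dense images are at least 30% of all images."""
    if not images:
        return False
    return sum(map(is_dense_image, images)) / len(images) >= 0.3

# Four tightly clustered 10x10 boxes form a dense image...
crowded = [[0, 0, 10, 10], [5, 0, 10, 10], [0, 5, 10, 10], [5, 5, 10, 10]]
# ...while well-separated boxes do not.
sparse = [[0, 0, 10, 10], [200, 0, 10, 10], [0, 200, 10, 10]]
print(is_dense_image(crowded), is_dense_image(sparse))  # → True False
```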
+
+
+

+
+
+## 3. Using a Recommended Model End to End
+
+The model selection tool produces a detection model configuration matched to your scenario and data characteristics, for example [PP-YOLOE](./ppyoloe/ppyoloe_crn_m_300e_battery_1024.yml)
+
+The configuration file is used as follows
+
+### 1. Environment Setup
+
+First install PaddlePaddle
+
+```bash
+# CUDA10.2
+pip install paddlepaddle-gpu==2.2.2 -i https://mirror.baidu.com/pypi/simple
+
+# CPU
+pip install paddlepaddle==2.2.2 -i https://mirror.baidu.com/pypi/simple
+```
+
+Then install PaddleDetection and its dependencies
+
+```bash
+# Clone the PaddleDetection repository
+cd
+git clone https://github.com/PaddlePaddle/PaddleDetection.git
+
+# Install the remaining dependencies
+cd PaddleDetection
+pip install -r requirements.txt
+```
+
+For detailed installation instructions, see the [installation guide](../../docs/tutorials/INSTALL_cn.md)
+
+### 2. Data Preparation
+
+Prepare your training dataset, preferably with COCO-format annotations. If your annotations are in labelme or VOC format, first convert them to COCO with the [conversion script](../../tools/x2coco.py); for details on data preparation, see the [data guide](../../docs/tutorials/data/PrepareDataSet.md)
+
+This document uses a subset of the new-energy battery quality inspection dataset as an example; download it [here](https://bj.bcebos.com/v1/paddle-smrt/data/battery_mini.zip)
+
+The data is laid out as follows:
+
+```
+battery_mini
+├── annotations
+│ ├── test.json
+│ └── train.json
+└── images
+ ├── Board_daowen_101.png
+ ├── Board_daowen_109.png
+ ├── Board_daowen_117.png
+ ...
+```
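Before launching training, it can be worth sanity-checking the annotation file. The sketch below counts images and per-category boxes from a COCO-style dict; `summarize_coco` is a hypothetical helper, and the tiny inline dict stands in for `annotations/train.json`:

```python
import json
from collections import Counter

def summarize_coco(coco):
    """Return (num images, num boxes, per-category box counts) for a COCO dict."""
    id2name = {c["id"]: c["name"] for c in coco["categories"]}
    per_class = Counter(id2name[a["category_id"]] for a in coco["annotations"])
    return len(coco["images"]), len(coco["annotations"]), per_class

# Tiny inline stand-in for annotations/train.json.
coco = {
    "images": [{"id": 1, "file_name": "Board_daowen_101.png"}],
    "categories": [{"id": 1, "name": "daowen"}, {"id": 2, "name": "diaojiao"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1, "bbox": [10, 10, 50, 40]},
        {"id": 2, "image_id": 1, "category_id": 1, "bbox": [80, 20, 30, 30]},
    ],
}
n_img, n_box, per_class = summarize_coco(coco)
print(n_img, n_box, dict(per_class))  # → 1 2 {'daowen': 2}
```

For the real dataset, the same function can be applied to `json.load(open("dataset/battery_mini/annotations/train.json"))`.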
+
+
+
+### 3. Training / Evaluation / Prediction
+
+Train with the model recommended by the selection tool. All recommended models currently use **single-GPU training**; evaluation can run during training, and models are saved under `./output` by default
+
+```bash
+python tools/train.py -c configs/smrt/ppyoloe/ppyoloe_crn_m_300e_battery_1024.yml --eval
+```
+
+If training is interrupted, resume it with the `-r` flag
+
+```bash
+python tools/train.py -c configs/smrt/ppyoloe/ppyoloe_crn_m_300e_battery_1024.yml --eval -r output/ppyoloe_crn_m_300e_battery_1024/9.pdparams
+```
+
+To evaluate the trained model's accuracy separately, use `tools/eval.py`
+
+```bash
+python tools/eval.py -c configs/smrt/ppyoloe/ppyoloe_crn_m_300e_battery_1024.yml -o weights=output/ppyoloe_crn_m_300e_battery_1024/model_final.pdparams
+```
+
+After training, visualize the results with `tools/infer.py`
+
+```bash
+python tools/infer.py -c configs/smrt/ppyoloe/ppyoloe_crn_m_300e_battery_1024.yml -o weights=output/ppyoloe_crn_m_300e_battery_1024/model_final.pdparams --infer_img=images/Board_diaojiao_1591.png
+```
+
+For more training options, see the [getting started guide](../../docs/tutorials/GETTING_STARTED_cn.md)
+
+### 4. Model Export and Deployment
+
+After training, the model can be deployed to a 1080Ti, 2080Ti, or other server device, using Paddle Inference for C++ deployment
+
+First export the model and the configuration file used at deployment time
+
+```bash
+python tools/export_model.py -c configs/smrt/ppyoloe/ppyoloe_crn_m_300e_battery_1024.yml -o weights=output/ppyoloe_crn_m_300e_battery_1024/model_final.pdparams
+```
+
+Then use the deployment code in PaddleDetection for C++ deployment; for detailed steps, see the [C++ deployment guide](../../deploy/cpp/README.md)
+
+If you prefer deploying through a graphical interface, see the section below.
+
+## 4. Deployment Demos
+
+To make deployment easier, we also provide complete visual deployment demos; feel free to try them
+
+* [Windows demo download](https://github.com/PaddlePaddle/PaddleX/tree/develop/deploy/cpp/docs/csharp_deploy)
+
+
+
+
+
+
+* [Linux demo download](https://github.com/cjh3020889729/The-PaddleX-QT-Visualize-GUI)
+
+
+
+
+
+
+## 5. Industry Examples
+
+To make industrial adoption easier, PaddleSMRT also provides detailed application examples; feel free to use them.
+
+* Industrial vision
+ * [Industrial defect detection](https://aistudio.baidu.com/aistudio/projectdetail/2598319)
+ * [Meter reading](https://aistudio.baidu.com/aistudio/projectdetail/2598327)
+ * [Rebar counting](https://aistudio.baidu.com/aistudio/projectdetail/2404188)
+* City
+ * [Pedestrian counting](https://aistudio.baidu.com/aistudio/projectdetail/2421822)
+ * [Vehicle counting](https://aistudio.baidu.com/aistudio/projectdetail/3391734?contributionType=1)
+ * [Safety helmet detection](https://aistudio.baidu.com/aistudio/projectdetail/3944737?contributionType=1)
diff --git a/PaddleDetection-release-2.6/configs/smrt/images/00362.jpg b/PaddleDetection-release-2.6/configs/smrt/images/00362.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..da4ab37d5cb5501e3c1471b30a3f465dd9b0a88f
Binary files /dev/null and b/PaddleDetection-release-2.6/configs/smrt/images/00362.jpg differ
diff --git a/PaddleDetection-release-2.6/configs/smrt/images/Board_diaojiao_1591.png b/PaddleDetection-release-2.6/configs/smrt/images/Board_diaojiao_1591.png
new file mode 100644
index 0000000000000000000000000000000000000000..0ec35b9450209fba4b9579fcc325e70fc5f63ddd
Binary files /dev/null and b/PaddleDetection-release-2.6/configs/smrt/images/Board_diaojiao_1591.png differ
diff --git a/PaddleDetection-release-2.6/configs/smrt/images/UpCoa_liewen_163.png b/PaddleDetection-release-2.6/configs/smrt/images/UpCoa_liewen_163.png
new file mode 100644
index 0000000000000000000000000000000000000000..294c29b4ed04c81d672cbe72ddaa4ccb3e301f67
Binary files /dev/null and b/PaddleDetection-release-2.6/configs/smrt/images/UpCoa_liewen_163.png differ
diff --git a/PaddleDetection-release-2.6/configs/smrt/images/lvjian1_0.jpg b/PaddleDetection-release-2.6/configs/smrt/images/lvjian1_0.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..dbf0dfaa769f9a72bd7825f8589fffa5aca3ac6e
Binary files /dev/null and b/PaddleDetection-release-2.6/configs/smrt/images/lvjian1_0.jpg differ
diff --git a/PaddleDetection-release-2.6/configs/smrt/images/lvjian1_10.jpg b/PaddleDetection-release-2.6/configs/smrt/images/lvjian1_10.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..25467e8174b27df7bf43b33795eb7ea1af605813
Binary files /dev/null and b/PaddleDetection-release-2.6/configs/smrt/images/lvjian1_10.jpg differ
diff --git a/PaddleDetection-release-2.6/configs/smrt/images/renche_00002.jpg b/PaddleDetection-release-2.6/configs/smrt/images/renche_00002.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..9446db44df96cf18ef7871c345a8010cdfec49df
Binary files /dev/null and b/PaddleDetection-release-2.6/configs/smrt/images/renche_00002.jpg differ
diff --git a/PaddleDetection-release-2.6/configs/smrt/images/renche_00204.jpg b/PaddleDetection-release-2.6/configs/smrt/images/renche_00204.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..2c46e933b970411eca850195b59f1c477d5d2a5e
Binary files /dev/null and b/PaddleDetection-release-2.6/configs/smrt/images/renche_00204.jpg differ
diff --git a/PaddleDetection-release-2.6/configs/smrt/picodet/picodet_l_1024_coco_lcnet_battery.yml b/PaddleDetection-release-2.6/configs/smrt/picodet/picodet_l_1024_coco_lcnet_battery.yml
new file mode 100644
index 0000000000000000000000000000000000000000..601dd1915ee65fb452340c783be8ca1cab905ce1
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/picodet/picodet_l_1024_coco_lcnet_battery.yml
@@ -0,0 +1,162 @@
+weights: output/picodet_l_1024_coco_lcnet_battery/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/picodet_l_640_coco_lcnet.pdparams
+
+worker_num: 2
+eval_height: &eval_height 1024
+eval_width: &eval_width 1024
+eval_size: &eval_size [*eval_height, *eval_width]
+
+metric: COCO
+num_classes: 45
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/train.json
+ dataset_dir: dataset/battery_mini
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/test.json
+ dataset_dir: dataset/battery_mini
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/test.json
+ dataset_dir: dataset/battery_mini
+
+epoch: 50
+LearningRate:
+ base_lr: 0.006
+ schedulers:
+ - !CosineDecay
+ max_epochs: 50
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 300
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomCrop: {}
+ - RandomFlip: {prob: 0.5}
+ - RandomDistort: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [960, 992, 1024, 1056, 1088], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 8
+ shuffle: true
+ drop_last: true
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 8
+ shuffle: false
+
+
+TestReader:
+ inputs_def:
+ image_shape: [1, 3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_size: 1
+
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 10
+print_flops: false
+find_unused_parameters: True
+use_ema: true
+
+
+# Exporting the model
+export:
+ post_process: True # Whether post-processing is included in the network when exporting the model.
+ nms: True # Whether NMS is included in the network when exporting the model.
+ benchmark: False # Used to test model performance; if set `True`, post-processing and NMS will not be exported.
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.00004
+ type: L2
+
+architecture: PicoDet
+
+PicoDet:
+ backbone: LCNet
+ neck: LCPAN
+ head: PicoHeadV2
+
+LCNet:
+ scale: 2.0
+ feature_maps: [3, 4, 5]
+
+LCPAN:
+ out_channels: 160
+ use_depthwise: True
+ num_features: 4
+
+PicoHeadV2:
+ conv_feat:
+ name: PicoFeat
+ feat_in: 160
+ feat_out: 160
+ num_convs: 4
+ num_fpn_stride: 4
+ norm_type: bn
+ share_cls_reg: True
+ use_se: True
+ fpn_stride: [8, 16, 32, 64]
+ feat_in_chan: 160
+ prior_prob: 0.01
+ reg_max: 7
+ cell_offset: 0.5
+ grid_cell_scale: 5.0
+ static_assigner_epoch: 100
+ use_align_head: True
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ force_gt_matching: False
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ loss_class:
+ name: VarifocalLoss
+ use_sigmoid: False
+ iou_weighted: True
+ loss_weight: 1.0
+ loss_dfl:
+ name: DistributionFocalLoss
+ loss_weight: 0.5
+ loss_bbox:
+ name: GIoULoss
+ loss_weight: 2.5
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.025
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/smrt/picodet/picodet_l_1024_coco_lcnet_lvjian1.yml b/PaddleDetection-release-2.6/configs/smrt/picodet/picodet_l_1024_coco_lcnet_lvjian1.yml
new file mode 100644
index 0000000000000000000000000000000000000000..734f1bee70ce4c2708f846af4d10e350fa6a329f
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/picodet/picodet_l_1024_coco_lcnet_lvjian1.yml
@@ -0,0 +1,162 @@
+weights: output/picodet_l_1024_coco_lcnet_lvjian1/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/picodet_l_640_coco_lcnet.pdparams
+
+worker_num: 2
+eval_height: &eval_height 1024
+eval_width: &eval_width 1024
+eval_size: &eval_size [*eval_height, *eval_width]
+
+metric: COCO
+num_classes: 5
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: train.json
+ dataset_dir: dataset/slice_lvjian1_data/train
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+TestDataset:
+ !ImageFolder
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+epoch: 50
+LearningRate:
+ base_lr: 0.006
+ schedulers:
+ - !CosineDecay
+ max_epochs: 50
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 300
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomCrop: {}
+ - RandomFlip: {prob: 0.5}
+ - RandomDistort: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [960, 992, 1024, 1056, 1088], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 8
+ shuffle: true
+ drop_last: true
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 8
+ shuffle: false
+
+
+TestReader:
+ inputs_def:
+ image_shape: [1, 3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_size: 1
+
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 10
+print_flops: false
+find_unused_parameters: True
+use_ema: true
+
+
+# Exporting the model
+export:
+ post_process: True # Whether post-processing is included in the network when exporting the model.
+ nms: True # Whether NMS is included in the network when exporting the model.
+ benchmark: False # Used to test model performance; if set `True`, post-processing and NMS will not be exported.
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.00004
+ type: L2
+
+architecture: PicoDet
+
+PicoDet:
+ backbone: LCNet
+ neck: LCPAN
+ head: PicoHeadV2
+
+LCNet:
+ scale: 2.0
+ feature_maps: [3, 4, 5]
+
+LCPAN:
+ out_channels: 160
+ use_depthwise: True
+ num_features: 4
+
+PicoHeadV2:
+ conv_feat:
+ name: PicoFeat
+ feat_in: 160
+ feat_out: 160
+ num_convs: 4
+ num_fpn_stride: 4
+ norm_type: bn
+ share_cls_reg: True
+ use_se: True
+ fpn_stride: [8, 16, 32, 64]
+ feat_in_chan: 160
+ prior_prob: 0.01
+ reg_max: 7
+ cell_offset: 0.5
+ grid_cell_scale: 5.0
+ static_assigner_epoch: 100
+ use_align_head: True
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ force_gt_matching: False
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ loss_class:
+ name: VarifocalLoss
+ use_sigmoid: False
+ iou_weighted: True
+ loss_weight: 1.0
+ loss_dfl:
+ name: DistributionFocalLoss
+ loss_weight: 0.5
+ loss_bbox:
+ name: GIoULoss
+ loss_weight: 2.5
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.025
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/smrt/picodet/picodet_l_1024_coco_lcnet_renche.yml b/PaddleDetection-release-2.6/configs/smrt/picodet/picodet_l_1024_coco_lcnet_renche.yml
new file mode 100644
index 0000000000000000000000000000000000000000..cdebd4ba4ae55e40c940b230bd61528a39fe0fcf
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/picodet/picodet_l_1024_coco_lcnet_renche.yml
@@ -0,0 +1,162 @@
+weights: output/picodet_l_1024_coco_lcnet_renche/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/picodet_l_640_coco_lcnet.pdparams
+
+worker_num: 2
+eval_height: &eval_height 1024
+eval_width: &eval_width 1024
+eval_size: &eval_size [*eval_height, *eval_width]
+
+metric: COCO
+num_classes: 22
+
+TrainDataset:
+ !COCODataSet
+ image_dir: train_images
+ anno_path: train.json
+ dataset_dir: dataset/renche
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: train_images
+ anno_path: test.json
+ dataset_dir: dataset/renche
+
+TestDataset:
+ !ImageFolder
+ anno_path: test.json
+ dataset_dir: dataset/renche
+
+epoch: 50
+LearningRate:
+ base_lr: 0.006
+ schedulers:
+ - !CosineDecay
+ max_epochs: 50
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 300
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomCrop: {}
+ - RandomFlip: {prob: 0.5}
+ - RandomDistort: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [960, 992, 1024, 1056, 1088], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 8
+ shuffle: true
+ drop_last: true
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 8
+ shuffle: false
+
+
+TestReader:
+ inputs_def:
+ image_shape: [1, 3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_size: 1
+
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 10
+print_flops: false
+find_unused_parameters: True
+use_ema: true
+
+
+# Exporting the model
+export:
+ post_process: True # Whether post-processing is included in the network when exporting the model.
+ nms: True # Whether NMS is included in the network when exporting the model.
+ benchmark: False # Used to test model performance; if set `True`, post-processing and NMS will not be exported.
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.00004
+ type: L2
+
+architecture: PicoDet
+
+PicoDet:
+ backbone: LCNet
+ neck: LCPAN
+ head: PicoHeadV2
+
+LCNet:
+ scale: 2.0
+ feature_maps: [3, 4, 5]
+
+LCPAN:
+ out_channels: 160
+ use_depthwise: True
+ num_features: 4
+
+PicoHeadV2:
+ conv_feat:
+ name: PicoFeat
+ feat_in: 160
+ feat_out: 160
+ num_convs: 4
+ num_fpn_stride: 4
+ norm_type: bn
+ share_cls_reg: True
+ use_se: True
+ fpn_stride: [8, 16, 32, 64]
+ feat_in_chan: 160
+ prior_prob: 0.01
+ reg_max: 7
+ cell_offset: 0.5
+ grid_cell_scale: 5.0
+ static_assigner_epoch: 100
+ use_align_head: True
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ force_gt_matching: False
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ loss_class:
+ name: VarifocalLoss
+ use_sigmoid: False
+ iou_weighted: True
+ loss_weight: 1.0
+ loss_dfl:
+ name: DistributionFocalLoss
+ loss_weight: 0.5
+ loss_bbox:
+ name: GIoULoss
+ loss_weight: 2.5
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.025
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/smrt/picodet/picodet_l_640_coco_lcnet_battery.yml b/PaddleDetection-release-2.6/configs/smrt/picodet/picodet_l_640_coco_lcnet_battery.yml
new file mode 100644
index 0000000000000000000000000000000000000000..8200439dc928fea3d0c091d98acee30117a33be1
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/picodet/picodet_l_640_coco_lcnet_battery.yml
@@ -0,0 +1,162 @@
+weights: output/picodet_l_640_coco_lcnet_battery/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/picodet_l_640_coco_lcnet.pdparams
+
+worker_num: 2
+eval_height: &eval_height 640
+eval_width: &eval_width 640
+eval_size: &eval_size [*eval_height, *eval_width]
+
+metric: COCO
+num_classes: 45
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/train.json
+ dataset_dir: dataset/battery_mini
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/test.json
+ dataset_dir: dataset/battery_mini
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/test.json
+ dataset_dir: dataset/battery_mini
+
+epoch: 50
+LearningRate:
+ base_lr: 0.006
+ schedulers:
+ - !CosineDecay
+ max_epochs: 50
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 300
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomCrop: {}
+ - RandomFlip: {prob: 0.5}
+ - RandomDistort: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [576, 608, 640, 672, 704], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 8
+ shuffle: true
+ drop_last: true
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 8
+ shuffle: false
+
+
+TestReader:
+ inputs_def:
+ image_shape: [1, 3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_size: 1
+
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 10
+print_flops: false
+find_unused_parameters: True
+use_ema: true
+
+
+# Exporting the model
+export:
+ post_process: True # Whether post-processing is included in the network when exporting the model.
+ nms: True # Whether NMS is included in the network when exporting the model.
+ benchmark: False # Used to test model performance; if set `True`, post-processing and NMS will not be exported.
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.00004
+ type: L2
+
+architecture: PicoDet
+
+PicoDet:
+ backbone: LCNet
+ neck: LCPAN
+ head: PicoHeadV2
+
+LCNet:
+ scale: 2.0
+ feature_maps: [3, 4, 5]
+
+LCPAN:
+ out_channels: 160
+ use_depthwise: True
+ num_features: 4
+
+PicoHeadV2:
+ conv_feat:
+ name: PicoFeat
+ feat_in: 160
+ feat_out: 160
+ num_convs: 4
+ num_fpn_stride: 4
+ norm_type: bn
+ share_cls_reg: True
+ use_se: True
+ fpn_stride: [8, 16, 32, 64]
+ feat_in_chan: 160
+ prior_prob: 0.01
+ reg_max: 7
+ cell_offset: 0.5
+ grid_cell_scale: 5.0
+ static_assigner_epoch: 100
+ use_align_head: True
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ force_gt_matching: False
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ loss_class:
+ name: VarifocalLoss
+ use_sigmoid: False
+ iou_weighted: True
+ loss_weight: 1.0
+ loss_dfl:
+ name: DistributionFocalLoss
+ loss_weight: 0.5
+ loss_bbox:
+ name: GIoULoss
+ loss_weight: 2.5
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.025
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/smrt/picodet/picodet_l_640_coco_lcnet_lvjian1.yml b/PaddleDetection-release-2.6/configs/smrt/picodet/picodet_l_640_coco_lcnet_lvjian1.yml
new file mode 100644
index 0000000000000000000000000000000000000000..6000902f03363a5763e7c26fd38565f85dcb2388
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/picodet/picodet_l_640_coco_lcnet_lvjian1.yml
@@ -0,0 +1,162 @@
+weights: output/picodet_l_640_coco_lcnet_lvjian1/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/picodet_l_640_coco_lcnet.pdparams
+
+worker_num: 2
+eval_height: &eval_height 640
+eval_width: &eval_width 640
+eval_size: &eval_size [*eval_height, *eval_width]
+
+metric: COCO
+num_classes: 5
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: train.json
+ dataset_dir: dataset/slice_lvjian1_data/train
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+TestDataset:
+ !ImageFolder
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+epoch: 50
+LearningRate:
+ base_lr: 0.006
+ schedulers:
+ - !CosineDecay
+ max_epochs: 50
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 300
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomCrop: {}
+ - RandomFlip: {prob: 0.5}
+ - RandomDistort: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [576, 608, 640, 672, 704], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 8
+ shuffle: true
+ drop_last: true
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 8
+ shuffle: false
+
+
+TestReader:
+ inputs_def:
+ image_shape: [1, 3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_size: 1
+
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 10
+print_flops: false
+find_unused_parameters: True
+use_ema: true
+
+
+# Exporting the model
+export:
+ post_process: True # Whether post-processing is included in the network when exporting the model.
+ nms: True # Whether NMS is included in the network when exporting the model.
+ benchmark: False # Used to test model performance; if set `True`, post-processing and NMS will not be exported.
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.00004
+ type: L2
+
+architecture: PicoDet
+
+PicoDet:
+ backbone: LCNet
+ neck: LCPAN
+ head: PicoHeadV2
+
+LCNet:
+ scale: 2.0
+ feature_maps: [3, 4, 5]
+
+LCPAN:
+ out_channels: 160
+ use_depthwise: True
+ num_features: 4
+
+PicoHeadV2:
+ conv_feat:
+ name: PicoFeat
+ feat_in: 160
+ feat_out: 160
+ num_convs: 4
+ num_fpn_stride: 4
+ norm_type: bn
+ share_cls_reg: True
+ use_se: True
+ fpn_stride: [8, 16, 32, 64]
+ feat_in_chan: 160
+ prior_prob: 0.01
+ reg_max: 7
+ cell_offset: 0.5
+ grid_cell_scale: 5.0
+ static_assigner_epoch: 100
+ use_align_head: True
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ force_gt_matching: False
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ loss_class:
+ name: VarifocalLoss
+ use_sigmoid: False
+ iou_weighted: True
+ loss_weight: 1.0
+ loss_dfl:
+ name: DistributionFocalLoss
+ loss_weight: 0.5
+ loss_bbox:
+ name: GIoULoss
+ loss_weight: 2.5
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.025
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/smrt/picodet/picodet_l_640_coco_lcnet_renche.yml b/PaddleDetection-release-2.6/configs/smrt/picodet/picodet_l_640_coco_lcnet_renche.yml
new file mode 100644
index 0000000000000000000000000000000000000000..fc1ce195ea4c5dce2bb60358f0509eb5e01a50c4
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/picodet/picodet_l_640_coco_lcnet_renche.yml
@@ -0,0 +1,162 @@
+weights: output/picodet_l_640_coco_lcnet_renche/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/picodet_l_640_coco_lcnet.pdparams
+
+worker_num: 2
+eval_height: &eval_height 640
+eval_width: &eval_width 640
+eval_size: &eval_size [*eval_height, *eval_width]
+
+metric: COCO
+num_classes: 22
+
+TrainDataset:
+ !COCODataSet
+ image_dir: train_images
+ anno_path: train.json
+ dataset_dir: dataset/renche
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: train_images
+ anno_path: test.json
+ dataset_dir: dataset/renche
+
+TestDataset:
+ !ImageFolder
+ anno_path: test.json
+ dataset_dir: dataset/renche
+
+epoch: 50
+LearningRate:
+ base_lr: 0.006
+ schedulers:
+ - !CosineDecay
+ max_epochs: 50
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 300
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomCrop: {}
+ - RandomFlip: {prob: 0.5}
+ - RandomDistort: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [576, 608, 640, 672, 704], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 8
+ shuffle: true
+ drop_last: true
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 8
+ shuffle: false
+
+
+TestReader:
+ inputs_def:
+ image_shape: [1, 3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_size: 1
+
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 10
+print_flops: false
+find_unused_parameters: True
+use_ema: true
+
+
+# Exporting the model
+export:
+ post_process: True # Whether post-processing is included in the network when exporting the model.
+ nms: True # Whether NMS is included in the network when exporting the model.
+ benchmark: False # Used to test model performance; if set `True`, post-processing and NMS will not be exported.
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.00004
+ type: L2
+
+architecture: PicoDet
+
+PicoDet:
+ backbone: LCNet
+ neck: LCPAN
+ head: PicoHeadV2
+
+LCNet:
+ scale: 2.0
+ feature_maps: [3, 4, 5]
+
+LCPAN:
+ out_channels: 160
+ use_depthwise: True
+ num_features: 4
+
+PicoHeadV2:
+ conv_feat:
+ name: PicoFeat
+ feat_in: 160
+ feat_out: 160
+ num_convs: 4
+ num_fpn_stride: 4
+ norm_type: bn
+ share_cls_reg: True
+ use_se: True
+ fpn_stride: [8, 16, 32, 64]
+ feat_in_chan: 160
+ prior_prob: 0.01
+ reg_max: 7
+ cell_offset: 0.5
+ grid_cell_scale: 5.0
+ static_assigner_epoch: 100
+ use_align_head: True
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ force_gt_matching: False
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ loss_class:
+ name: VarifocalLoss
+ use_sigmoid: False
+ iou_weighted: True
+ loss_weight: 1.0
+ loss_dfl:
+ name: DistributionFocalLoss
+ loss_weight: 0.5
+ loss_bbox:
+ name: GIoULoss
+ loss_weight: 2.5
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.025
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/smrt/ppyolo/ppyolov2_r101vd_dcn_365e_battery.yml b/PaddleDetection-release-2.6/configs/smrt/ppyolo/ppyolov2_r101vd_dcn_365e_battery.yml
new file mode 100644
index 0000000000000000000000000000000000000000..507d0088cbc343e7ca281f8c8f54aa169e135a43
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/ppyolo/ppyolov2_r101vd_dcn_365e_battery.yml
@@ -0,0 +1,154 @@
+architecture: YOLOv3
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyolov2_r101vd_dcn_365e_coco.pdparams
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+
+metric: COCO
+num_classes: 45
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/train.json
+ dataset_dir: dataset/battery_mini
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/test.json
+ dataset_dir: dataset/battery_mini
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/test.json # also supports txt (like VOC's label_list.txt)
+ dataset_dir: dataset/battery_mini # if set, anno_path will be 'dataset_dir/anno_path'
+
+
+epoch: 40
+LearningRate:
+ base_lr: 0.0001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 80
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 1000
+
+snapshot_epoch: 5
+worker_num: 2
+TrainReader:
+ inputs_def:
+ num_max_boxes: 100
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [576, 608, 640, 672, 704], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeBox: {}
+ - PadBox: {num_max_boxes: 100}
+ - BboxXYXY2XYWH: {}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - Gt2YoloTarget: {anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]], anchors: [[10, 13], [16, 30], [33, 23], [30, 61], [62, 45], [59, 119], [116, 90], [156, 198], [373, 326]], downsample_ratios: [32, 16, 8]}
+ batch_size: 8
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 8
+
+TestReader:
+ inputs_def:
+ image_shape: [3, 640, 640]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+
+OptimizerBuilder:
+ clip_grad_by_norm: 35.
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+
+YOLOv3:
+ backbone: ResNet
+ neck: PPYOLOPAN
+ yolo_head: YOLOv3Head
+ post_process: BBoxPostProcess
+
+ResNet:
+ depth: 101
+ variant: d
+ return_idx: [1, 2, 3]
+ dcn_v2_stages: [3]
+ freeze_at: -1
+ freeze_norm: false
+ norm_decay: 0.
+
+PPYOLOPAN:
+ drop_block: true
+ block_size: 3
+ keep_prob: 0.9
+ spp: true
+
+YOLOv3Head:
+ anchors: [[10, 13], [16, 30], [33, 23],
+ [30, 61], [62, 45], [59, 119],
+ [116, 90], [156, 198], [373, 326]]
+ anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
+ loss: YOLOv3Loss
+ iou_aware: true
+ iou_aware_factor: 0.4
+
+YOLOv3Loss:
+ ignore_thresh: 0.7
+ downsample: [32, 16, 8]
+ label_smooth: false
+ scale_x_y: 1.05
+ iou_loss: IouLoss
+ iou_aware_loss: IouAwareLoss
+
+IouLoss:
+ loss_weight: 2.5
+ loss_square: true
+
+IouAwareLoss:
+ loss_weight: 1.0
+
+BBoxPostProcess:
+ decode:
+ name: YOLOBox
+ conf_thresh: 0.01
+ downsample_ratio: 32
+ clip_bbox: true
+ scale_x_y: 1.05
+ nms:
+ name: MatrixNMS
+ keep_top_k: 100
+ score_threshold: 0.01
+ post_threshold: 0.01
+ nms_top_k: -1
+ background_label: -1
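
The LearningRate block above combines a PiecewiseDecay over epochs with a LinearWarmup over steps. A sketch of that schedule (illustrative, not Paddle's scheduler classes); note that with `epoch: 40` the milestone at 80 is never reached, so only the warmup is effective in this config:

```python
def learning_rate(step, epoch, base_lr=0.0001, warmup_steps=1000,
                  start_factor=0.0, milestones=(80,), gamma=0.1):
    # epoch-level piecewise decay: multiply by gamma past each milestone
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    # step-level linear warmup from start_factor * lr up to the decayed lr
    if step < warmup_steps:
        frac = step / warmup_steps
        lr = lr * (start_factor + (1.0 - start_factor) * frac)
    return lr
```

Halfway through warmup the rate is half of `base_lr`; after warmup it holds at `base_lr` until a milestone epoch is passed.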
diff --git a/PaddleDetection-release-2.6/configs/smrt/ppyolo/ppyolov2_r101vd_dcn_365e_battery_1024.yml b/PaddleDetection-release-2.6/configs/smrt/ppyolo/ppyolov2_r101vd_dcn_365e_battery_1024.yml
new file mode 100644
index 0000000000000000000000000000000000000000..cd8b6d49f2d5b3a2de4a72e8bfe2064bcfcfc4a7
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/ppyolo/ppyolov2_r101vd_dcn_365e_battery_1024.yml
@@ -0,0 +1,154 @@
+architecture: YOLOv3
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyolov2_r101vd_dcn_365e_coco.pdparams
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+
+metric: COCO
+num_classes: 45
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/train.json
+ dataset_dir: dataset/battery_mini
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/test.json
+ dataset_dir: dataset/battery_mini
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/test.json # also supports txt (like VOC's label_list.txt)
+ dataset_dir: dataset/battery_mini # if set, anno_path will be 'dataset_dir/anno_path'
+
+
+epoch: 40
+LearningRate:
+ base_lr: 0.0001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 80
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 1000
+
+snapshot_epoch: 5
+worker_num: 2
+TrainReader:
+ inputs_def:
+ num_max_boxes: 100
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [960, 992, 1024, 1056, 1088], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeBox: {}
+ - PadBox: {num_max_boxes: 100}
+ - BboxXYXY2XYWH: {}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - Gt2YoloTarget: {anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]], anchors: [[10, 13], [16, 30], [33, 23], [30, 61], [62, 45], [59, 119], [116, 90], [156, 198], [373, 326]], downsample_ratios: [32, 16, 8]}
+ batch_size: 2
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [1024, 1024], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 8
+
+TestReader:
+ inputs_def:
+ image_shape: [3, 1024, 1024]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [1024, 1024], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+
+OptimizerBuilder:
+ clip_grad_by_norm: 35.
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+
+YOLOv3:
+ backbone: ResNet
+ neck: PPYOLOPAN
+ yolo_head: YOLOv3Head
+ post_process: BBoxPostProcess
+
+ResNet:
+ depth: 101
+ variant: d
+ return_idx: [1, 2, 3]
+ dcn_v2_stages: [3]
+ freeze_at: -1
+ freeze_norm: false
+ norm_decay: 0.
+
+PPYOLOPAN:
+ drop_block: true
+ block_size: 3
+ keep_prob: 0.9
+ spp: true
+
+YOLOv3Head:
+ anchors: [[10, 13], [16, 30], [33, 23],
+ [30, 61], [62, 45], [59, 119],
+ [116, 90], [156, 198], [373, 326]]
+ anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
+ loss: YOLOv3Loss
+ iou_aware: true
+ iou_aware_factor: 0.4
+
+YOLOv3Loss:
+ ignore_thresh: 0.7
+ downsample: [32, 16, 8]
+ label_smooth: false
+ scale_x_y: 1.05
+ iou_loss: IouLoss
+ iou_aware_loss: IouAwareLoss
+
+IouLoss:
+ loss_weight: 2.5
+ loss_square: true
+
+IouAwareLoss:
+ loss_weight: 1.0
+
+BBoxPostProcess:
+ decode:
+ name: YOLOBox
+ conf_thresh: 0.01
+ downsample_ratio: 32
+ clip_bbox: true
+ scale_x_y: 1.05
+ nms:
+ name: MatrixNMS
+ keep_top_k: 100
+ score_threshold: 0.01
+ post_threshold: 0.01
+ nms_top_k: -1
+ background_label: -1
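
The Gt2YoloTarget transform above routes each ground-truth box to one of the three output strides via `anchor_masks`: the largest anchors (indices 6-8) belong to the stride-32 map, the smallest (0-2) to stride 8. A sketch of that routing, assuming the usual best-shape-IoU rule (function names are illustrative):

```python
ANCHORS = [[10, 13], [16, 30], [33, 23], [30, 61], [62, 45],
           [59, 119], [116, 90], [156, 198], [373, 326]]
ANCHOR_MASKS = [[6, 7, 8], [3, 4, 5], [0, 1, 2]]   # anchor indices per level
DOWNSAMPLE = [32, 16, 8]                            # stride per level

def wh_iou(wh1, wh2):
    # IoU of two boxes that share a common center (width/height only)
    inter = min(wh1[0], wh2[0]) * min(wh1[1], wh2[1])
    union = wh1[0] * wh1[1] + wh2[0] * wh2[1] - inter
    return inter / union

def assign_level(gt_w, gt_h):
    """Return (stride, anchor index) of the best-matching anchor,
    i.e. which output level owns this ground-truth box."""
    best = max(range(len(ANCHORS)),
               key=lambda i: wh_iou((gt_w, gt_h), ANCHORS[i]))
    for mask, stride in zip(ANCHOR_MASKS, DOWNSAMPLE):
        if best in mask:
            return stride, best
    raise ValueError("anchor not covered by any mask")
```

A 370x320 box matches the largest anchor and lands on the stride-32 map; a 12x14 box matches the smallest anchor and lands on the stride-8 map.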
diff --git a/PaddleDetection-release-2.6/configs/smrt/ppyolo/ppyolov2_r101vd_dcn_365e_lvjian1_1024.yml b/PaddleDetection-release-2.6/configs/smrt/ppyolo/ppyolov2_r101vd_dcn_365e_lvjian1_1024.yml
new file mode 100644
index 0000000000000000000000000000000000000000..0c09049a60423464a8c14666f7c86098564482d8
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/ppyolo/ppyolov2_r101vd_dcn_365e_lvjian1_1024.yml
@@ -0,0 +1,155 @@
+architecture: YOLOv3
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyolov2_r101vd_dcn_365e_coco.pdparams
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+
+metric: COCO
+num_classes: 5
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: train.json
+ dataset_dir: dataset/slice_lvjian1_data/train
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+TestDataset:
+ !ImageFolder
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+
+epoch: 20
+LearningRate:
+ base_lr: 0.0002
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 80
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 1000
+
+
+snapshot_epoch: 3
+worker_num: 2
+TrainReader:
+ inputs_def:
+ num_max_boxes: 100
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [960, 992, 1024, 1056, 1088], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeBox: {}
+ - PadBox: {num_max_boxes: 100}
+ - BboxXYXY2XYWH: {}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - Gt2YoloTarget: {anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]], anchors: [[8, 7], [24, 12], [14, 25], [37, 35], [30, 140], [89, 52], [93, 189], [226, 99], [264, 352]], downsample_ratios: [32, 16, 8]}
+ batch_size: 2
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [1024, 1024], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 8
+
+TestReader:
+ inputs_def:
+ image_shape: [3, 1024, 1024]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [1024, 1024], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+OptimizerBuilder:
+ clip_grad_by_norm: 35.
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+YOLOv3:
+ backbone: ResNet
+ neck: PPYOLOPAN
+ yolo_head: YOLOv3Head
+ post_process: BBoxPostProcess
+
+ResNet:
+ depth: 101
+ variant: d
+ return_idx: [1, 2, 3]
+ dcn_v2_stages: [3]
+ freeze_at: -1
+ freeze_norm: false
+ norm_decay: 0.
+
+PPYOLOPAN:
+ drop_block: true
+ block_size: 3
+ keep_prob: 0.9
+ spp: true
+
+YOLOv3Head:
+ anchors: [[8, 7], [24, 12], [14, 25],
+ [37, 35], [30, 140], [89, 52],
+ [93, 189], [226, 99], [264, 352]]
+ anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
+ loss: YOLOv3Loss
+ iou_aware: true
+ iou_aware_factor: 0.5
+
+YOLOv3Loss:
+ ignore_thresh: 0.7
+ downsample: [32, 16, 8]
+ label_smooth: false
+ scale_x_y: 1.05
+ iou_loss: IouLoss
+ iou_aware_loss: IouAwareLoss
+
+IouLoss:
+ loss_weight: 2.5
+ loss_square: true
+
+IouAwareLoss:
+ loss_weight: 1.0
+
+BBoxPostProcess:
+ decode:
+ name: YOLOBox
+ conf_thresh: 0.01
+ downsample_ratio: 32
+ clip_bbox: true
+ scale_x_y: 1.05
+ nms:
+ name: MatrixNMS
+ keep_top_k: 100
+ score_threshold: 0.01
+ post_threshold: 0.01
+ nms_top_k: -1
+ background_label: -1
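
The NormalizeImage and Permute transforms that close every reader pipeline above do two fixed things: scale uint8 pixels to [0, 1] and standardize per channel with the ImageNet mean/std, then transpose HWC to CHW. A pure-Python sketch on nested lists (the real transforms operate on arrays):

```python
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]

def normalize_image(hwc):
    # is_scale: True -> divide by 255 first, then (x - mean) / std per channel
    return [[[(px[c] / 255.0 - MEAN[c]) / STD[c] for c in range(3)]
             for px in row] for row in hwc]

def permute(hwc):
    # HWC -> CHW, the layout Paddle's conv layers expect
    h, w = len(hwc), len(hwc[0])
    return [[[hwc[y][x][c] for x in range(w)] for y in range(h)]
            for c in range(3)]
```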
diff --git a/PaddleDetection-release-2.6/configs/smrt/ppyolo/ppyolov2_r101vd_dcn_365e_lvjian1_640.yml b/PaddleDetection-release-2.6/configs/smrt/ppyolo/ppyolov2_r101vd_dcn_365e_lvjian1_640.yml
new file mode 100644
index 0000000000000000000000000000000000000000..f7dc75f975efdbd08ae4f96ce2f54cc55a4569f4
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/ppyolo/ppyolov2_r101vd_dcn_365e_lvjian1_640.yml
@@ -0,0 +1,155 @@
+architecture: YOLOv3
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyolov2_r101vd_dcn_365e_coco.pdparams
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+
+metric: COCO
+num_classes: 5
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: train.json
+ dataset_dir: dataset/slice_lvjian1_data/train
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+TestDataset:
+ !ImageFolder
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+
+epoch: 20
+LearningRate:
+ base_lr: 0.0002
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 80
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 1000
+
+
+snapshot_epoch: 3
+worker_num: 2
+TrainReader:
+ inputs_def:
+ num_max_boxes: 100
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [576, 608, 640, 672, 704], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeBox: {}
+ - PadBox: {num_max_boxes: 100}
+ - BboxXYXY2XYWH: {}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - Gt2YoloTarget: {anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]], anchors: [[8, 7], [24, 12], [14, 25], [37, 35], [30, 140], [89, 52], [93, 189], [226, 99], [264, 352]], downsample_ratios: [32, 16, 8]}
+ batch_size: 2
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 8
+
+TestReader:
+ inputs_def:
+ image_shape: [3, 640, 640]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+OptimizerBuilder:
+ clip_grad_by_norm: 35.
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+YOLOv3:
+ backbone: ResNet
+ neck: PPYOLOPAN
+ yolo_head: YOLOv3Head
+ post_process: BBoxPostProcess
+
+ResNet:
+ depth: 101
+ variant: d
+ return_idx: [1, 2, 3]
+ dcn_v2_stages: [3]
+ freeze_at: -1
+ freeze_norm: false
+ norm_decay: 0.
+
+PPYOLOPAN:
+ drop_block: true
+ block_size: 3
+ keep_prob: 0.9
+ spp: true
+
+YOLOv3Head:
+ anchors: [[8, 7], [24, 12], [14, 25],
+ [37, 35], [30, 140], [89, 52],
+ [93, 189], [226, 99], [264, 352]]
+ anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
+ loss: YOLOv3Loss
+ iou_aware: true
+ iou_aware_factor: 0.5
+
+YOLOv3Loss:
+ ignore_thresh: 0.7
+ downsample: [32, 16, 8]
+ label_smooth: false
+ scale_x_y: 1.05
+ iou_loss: IouLoss
+ iou_aware_loss: IouAwareLoss
+
+IouLoss:
+ loss_weight: 2.5
+ loss_square: true
+
+IouAwareLoss:
+ loss_weight: 1.0
+
+BBoxPostProcess:
+ decode:
+ name: YOLOBox
+ conf_thresh: 0.01
+ downsample_ratio: 32
+ clip_bbox: true
+ scale_x_y: 1.05
+ nms:
+ name: MatrixNMS
+ keep_top_k: 100
+ score_threshold: 0.01
+ post_threshold: 0.01
+ nms_top_k: -1
+ background_label: -1
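
The `scale_x_y: 1.05` option in YOLOBox (and YOLOv3Loss) stretches the sigmoid of the center offsets so predictions can actually reach the borders of a grid cell. A sketch of that center decode under the published grid-sensitivity formula (not Paddle's kernel):

```python
import math

def decode_center(tx, ty, cx, cy, grid_w, grid_h, scale_x_y=1.05):
    # stretch sigmoid output by s and re-center by 0.5 * (s - 1),
    # then add the cell index and normalize by the grid size
    s = scale_x_y
    sx = 1.0 / (1.0 + math.exp(-tx))
    sy = 1.0 / (1.0 + math.exp(-ty))
    bx = (s * sx - 0.5 * (s - 1.0) + cx) / grid_w   # normalized to [0, 1]
    by = (s * sy - 0.5 * (s - 1.0) + cy) / grid_h
    return bx, by
```

At `tx = 0` the sigmoid is 0.5 and the stretch cancels, so the center sits exactly mid-cell, as with plain YOLO decoding.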
diff --git a/PaddleDetection-release-2.6/configs/smrt/ppyolo/ppyolov2_r101vd_dcn_365e_renche_1024.yml b/PaddleDetection-release-2.6/configs/smrt/ppyolo/ppyolov2_r101vd_dcn_365e_renche_1024.yml
new file mode 100644
index 0000000000000000000000000000000000000000..96ab192171798f7947ee857b8291152e5933c57a
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/ppyolo/ppyolov2_r101vd_dcn_365e_renche_1024.yml
@@ -0,0 +1,156 @@
+architecture: YOLOv3
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyolov2_r101vd_dcn_365e_coco.pdparams
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+
+metric: COCO
+num_classes: 22
+
+TrainDataset:
+ !COCODataSet
+ image_dir: train_images
+ anno_path: train.json
+ dataset_dir: dataset/renche
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: train_images
+ anno_path: test.json
+ dataset_dir: dataset/renche
+
+TestDataset:
+ !ImageFolder
+ anno_path: dataset/renche/test.json
+
+
+epoch: 100
+LearningRate:
+ base_lr: 0.0002
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 80
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 1000
+
+
+snapshot_epoch: 3
+worker_num: 8
+TrainReader:
+ inputs_def:
+ num_max_boxes: 100
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [960, 992, 1024, 1056, 1088], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeBox: {}
+ - PadBox: {num_max_boxes: 100}
+ - BboxXYXY2XYWH: {}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - Gt2YoloTarget: {anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]], anchors: [[10, 13], [16, 30], [33, 23], [30, 61], [62, 45], [59, 119], [116, 90], [156, 198], [373, 326]], downsample_ratios: [32, 16, 8]}
+ batch_size: 2
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [1024, 1024], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 8
+
+TestReader:
+ inputs_def:
+ image_shape: [3, 1024, 1024]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [1024, 1024], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+
+OptimizerBuilder:
+ clip_grad_by_norm: 35.
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+
+YOLOv3:
+ backbone: ResNet
+ neck: PPYOLOPAN
+ yolo_head: YOLOv3Head
+ post_process: BBoxPostProcess
+
+ResNet:
+ depth: 101
+ variant: d
+ return_idx: [1, 2, 3]
+ dcn_v2_stages: [3]
+ freeze_at: -1
+ freeze_norm: false
+ norm_decay: 0.
+
+PPYOLOPAN:
+ drop_block: true
+ block_size: 3
+ keep_prob: 0.9
+ spp: true
+
+YOLOv3Head:
+ anchors: [[10, 13], [16, 30], [33, 23],
+ [30, 61], [62, 45], [59, 119],
+ [116, 90], [156, 198], [373, 326]]
+ anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
+ loss: YOLOv3Loss
+ iou_aware: true
+ iou_aware_factor: 0.5
+
+YOLOv3Loss:
+ ignore_thresh: 0.7
+ downsample: [32, 16, 8]
+ label_smooth: false
+ scale_x_y: 1.05
+ iou_loss: IouLoss
+ iou_aware_loss: IouAwareLoss
+
+IouLoss:
+ loss_weight: 2.5
+ loss_square: true
+
+IouAwareLoss:
+ loss_weight: 1.0
+
+BBoxPostProcess:
+ decode:
+ name: YOLOBox
+ conf_thresh: 0.01
+ downsample_ratio: 32
+ clip_bbox: true
+ scale_x_y: 1.05
+ nms:
+ name: MatrixNMS
+ keep_top_k: 100
+ score_threshold: 0.01
+ post_threshold: 0.01
+ nms_top_k: -1
+ background_label: -1
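
Every config above sets `use_ema: true` with `ema_decay: 0.9998`: evaluation and export use an exponential moving average of the weights rather than the raw training weights. A minimal scalar sketch of the update rule (illustrative only):

```python
class EMA:
    """Exponential moving average of a scalar weight,
    shadow <- decay * shadow + (1 - decay) * value."""
    def __init__(self, value, decay=0.9998):
        self.decay = decay
        self.shadow = value

    def update(self, value):
        self.shadow = self.decay * self.shadow + (1.0 - self.decay) * value
        return self.shadow
```

With a decay this close to 1, the shadow weights average over thousands of steps, which smooths out the noise from the small per-GPU batch sizes used here.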
diff --git a/PaddleDetection-release-2.6/configs/smrt/ppyolo/ppyolov2_r101vd_dcn_365e_renche_640.yml b/PaddleDetection-release-2.6/configs/smrt/ppyolo/ppyolov2_r101vd_dcn_365e_renche_640.yml
new file mode 100644
index 0000000000000000000000000000000000000000..ccc2162de1c995fdede25ccfa337d6136d14b3df
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/ppyolo/ppyolov2_r101vd_dcn_365e_renche_640.yml
@@ -0,0 +1,156 @@
+architecture: YOLOv3
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyolov2_r101vd_dcn_365e_coco.pdparams
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+
+metric: COCO
+num_classes: 22
+
+TrainDataset:
+ !COCODataSet
+ image_dir: train_images
+ anno_path: train.json
+ dataset_dir: dataset/renche
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: train_images
+ anno_path: test.json
+ dataset_dir: dataset/renche
+
+TestDataset:
+ !ImageFolder
+ anno_path: dataset/renche/test.json
+
+
+epoch: 100
+LearningRate:
+ base_lr: 0.0002
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 80
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 1000
+
+
+snapshot_epoch: 3
+worker_num: 8
+TrainReader:
+ inputs_def:
+ num_max_boxes: 100
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [576, 608, 640, 672, 704], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeBox: {}
+ - PadBox: {num_max_boxes: 100}
+ - BboxXYXY2XYWH: {}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - Gt2YoloTarget: {anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]], anchors: [[10, 13], [16, 30], [33, 23], [30, 61], [62, 45], [59, 119], [116, 90], [156, 198], [373, 326]], downsample_ratios: [32, 16, 8]}
+ batch_size: 2
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 8
+
+TestReader:
+ inputs_def:
+ image_shape: [3, 640, 640]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+
+OptimizerBuilder:
+ clip_grad_by_norm: 35.
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+
+YOLOv3:
+ backbone: ResNet
+ neck: PPYOLOPAN
+ yolo_head: YOLOv3Head
+ post_process: BBoxPostProcess
+
+ResNet:
+ depth: 101
+ variant: d
+ return_idx: [1, 2, 3]
+ dcn_v2_stages: [3]
+ freeze_at: -1
+ freeze_norm: false
+ norm_decay: 0.
+
+PPYOLOPAN:
+ drop_block: true
+ block_size: 3
+ keep_prob: 0.9
+ spp: true
+
+YOLOv3Head:
+ anchors: [[10, 13], [16, 30], [33, 23],
+ [30, 61], [62, 45], [59, 119],
+ [116, 90], [156, 198], [373, 326]]
+ anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
+ loss: YOLOv3Loss
+ iou_aware: true
+ iou_aware_factor: 0.5
+
+YOLOv3Loss:
+ ignore_thresh: 0.7
+ downsample: [32, 16, 8]
+ label_smooth: false
+ scale_x_y: 1.05
+ iou_loss: IouLoss
+ iou_aware_loss: IouAwareLoss
+
+IouLoss:
+ loss_weight: 2.5
+ loss_square: true
+
+IouAwareLoss:
+ loss_weight: 1.0
+
+BBoxPostProcess:
+ decode:
+ name: YOLOBox
+ conf_thresh: 0.01
+ downsample_ratio: 32
+ clip_bbox: true
+ scale_x_y: 1.05
+ nms:
+ name: MatrixNMS
+ keep_top_k: 100
+ score_threshold: 0.01
+ post_threshold: 0.01
+ nms_top_k: -1
+ background_label: -1
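
The `iou_aware: true` / `iou_aware_factor` pair in YOLOv3Head adds a predicted-IoU branch and fuses it into the final confidence. A sketch of the PP-YOLO fusion formula, score = cls^(1-a) * iou^a, with a the iou_aware_factor (not Paddle's implementation):

```python
def iou_aware_score(cls_score, iou_pred, alpha=0.5):
    # geometric interpolation between classification confidence
    # and the predicted localization quality
    return (cls_score ** (1.0 - alpha)) * (iou_pred ** alpha)
```

A confidently classified but poorly localized box (high cls, low predicted IoU) is down-weighted before NMS, which is the point of the branch.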
diff --git a/PaddleDetection-release-2.6/configs/smrt/ppyolo/ppyolov2_r50vd_dcn_365e_battery.yml b/PaddleDetection-release-2.6/configs/smrt/ppyolo/ppyolov2_r50vd_dcn_365e_battery.yml
new file mode 100644
index 0000000000000000000000000000000000000000..e886dd6c10bd03bcb2ef64f1a5f54ba0e923efcc
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/ppyolo/ppyolov2_r50vd_dcn_365e_battery.yml
@@ -0,0 +1,154 @@
+architecture: YOLOv3
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyolov2_r50vd_dcn_365e_coco.pdparams
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+
+metric: COCO
+num_classes: 45
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/train.json
+ dataset_dir: dataset/battery_mini
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/test.json
+ dataset_dir: dataset/battery_mini
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/test.json # also supports txt (like VOC's label_list.txt)
+ dataset_dir: dataset/battery_mini # if set, anno_path will be 'dataset_dir/anno_path'
+
+
+epoch: 40
+LearningRate:
+ base_lr: 0.0001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 80
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 1000
+
+snapshot_epoch: 5
+worker_num: 2
+TrainReader:
+ inputs_def:
+ num_max_boxes: 100
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [576, 608, 640, 672, 704], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeBox: {}
+ - PadBox: {num_max_boxes: 100}
+ - BboxXYXY2XYWH: {}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - Gt2YoloTarget: {anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]], anchors: [[10, 13], [16, 30], [33, 23], [30, 61], [62, 45], [59, 119], [116, 90], [156, 198], [373, 326]], downsample_ratios: [32, 16, 8]}
+ batch_size: 8
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 8
+
+TestReader:
+ inputs_def:
+ image_shape: [3, 640, 640]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+
+OptimizerBuilder:
+ clip_grad_by_norm: 35.
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+
+YOLOv3:
+ backbone: ResNet
+ neck: PPYOLOPAN
+ yolo_head: YOLOv3Head
+ post_process: BBoxPostProcess
+
+ResNet:
+ depth: 50
+ variant: d
+ return_idx: [1, 2, 3]
+ dcn_v2_stages: [3]
+ freeze_at: -1
+ freeze_norm: false
+ norm_decay: 0.
+
+PPYOLOPAN:
+ drop_block: true
+ block_size: 3
+ keep_prob: 0.9
+ spp: true
+
+YOLOv3Head:
+ anchors: [[10, 13], [16, 30], [33, 23],
+ [30, 61], [62, 45], [59, 119],
+ [116, 90], [156, 198], [373, 326]]
+ anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
+ loss: YOLOv3Loss
+ iou_aware: true
+ iou_aware_factor: 0.4
+
+YOLOv3Loss:
+ ignore_thresh: 0.7
+ downsample: [32, 16, 8]
+ label_smooth: false
+ scale_x_y: 1.05
+ iou_loss: IouLoss
+ iou_aware_loss: IouAwareLoss
+
+IouLoss:
+ loss_weight: 2.5
+ loss_square: true
+
+IouAwareLoss:
+ loss_weight: 1.0
+
+BBoxPostProcess:
+ decode:
+ name: YOLOBox
+ conf_thresh: 0.01
+ downsample_ratio: 32
+ clip_bbox: true
+ scale_x_y: 1.05
+ nms:
+ name: MatrixNMS
+ keep_top_k: 100
+ score_threshold: 0.01
+ post_threshold: 0.01
+ nms_top_k: -1
+ background_label: -1
diff --git a/PaddleDetection-release-2.6/configs/smrt/ppyolo/ppyolov2_r50vd_dcn_365e_battery_1024.yml b/PaddleDetection-release-2.6/configs/smrt/ppyolo/ppyolov2_r50vd_dcn_365e_battery_1024.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d3b7ac28fc68dfb85c0bd0d67f61b77b844bd034
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/ppyolo/ppyolov2_r50vd_dcn_365e_battery_1024.yml
@@ -0,0 +1,154 @@
+architecture: YOLOv3
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyolov2_r50vd_dcn_365e_coco.pdparams
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+
+metric: COCO
+num_classes: 45
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/train.json
+ dataset_dir: dataset/battery_mini
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/test.json
+ dataset_dir: dataset/battery_mini
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/test.json # also supports txt (like VOC's label_list.txt)
+ dataset_dir: dataset/battery_mini # if set, anno_path will be 'dataset_dir/anno_path'
+
+
+epoch: 40
+LearningRate:
+ base_lr: 0.0001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 80
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 1000
+
+snapshot_epoch: 5
+worker_num: 2
+TrainReader:
+ inputs_def:
+ num_max_boxes: 100
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [960, 992, 1024, 1056, 1088], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeBox: {}
+ - PadBox: {num_max_boxes: 100}
+ - BboxXYXY2XYWH: {}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - Gt2YoloTarget: {anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]], anchors: [[10, 13], [16, 30], [33, 23], [30, 61], [62, 45], [59, 119], [116, 90], [156, 198], [373, 326]], downsample_ratios: [32, 16, 8]}
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [1024, 1024], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 8
+
+TestReader:
+ inputs_def:
+ image_shape: [3, 1024, 1024]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [1024, 1024], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+
+OptimizerBuilder:
+ clip_grad_by_norm: 35.
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+
+YOLOv3:
+ backbone: ResNet
+ neck: PPYOLOPAN
+ yolo_head: YOLOv3Head
+ post_process: BBoxPostProcess
+
+ResNet:
+ depth: 50
+ variant: d
+ return_idx: [1, 2, 3]
+ dcn_v2_stages: [3]
+ freeze_at: -1
+ freeze_norm: false
+ norm_decay: 0.
+
+PPYOLOPAN:
+ drop_block: true
+ block_size: 3
+ keep_prob: 0.9
+ spp: true
+
+YOLOv3Head:
+ anchors: [[10, 13], [16, 30], [33, 23],
+ [30, 61], [62, 45], [59, 119],
+ [116, 90], [156, 198], [373, 326]]
+ anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
+ loss: YOLOv3Loss
+ iou_aware: true
+ iou_aware_factor: 0.4
+
+YOLOv3Loss:
+ ignore_thresh: 0.7
+ downsample: [32, 16, 8]
+ label_smooth: false
+ scale_x_y: 1.05
+ iou_loss: IouLoss
+ iou_aware_loss: IouAwareLoss
+
+IouLoss:
+ loss_weight: 2.5
+ loss_square: true
+
+IouAwareLoss:
+ loss_weight: 1.0
+
+BBoxPostProcess:
+ decode:
+ name: YOLOBox
+ conf_thresh: 0.01
+ downsample_ratio: 32
+ clip_bbox: true
+ scale_x_y: 1.05
+ nms:
+ name: MatrixNMS
+ keep_top_k: 100
+ score_threshold: 0.01
+ post_threshold: 0.01
+ nms_top_k: -1
+ background_label: -1
diff --git a/PaddleDetection-release-2.6/configs/smrt/ppyolo/ppyolov2_r50vd_dcn_365e_lvjian1_1024.yml b/PaddleDetection-release-2.6/configs/smrt/ppyolo/ppyolov2_r50vd_dcn_365e_lvjian1_1024.yml
new file mode 100644
index 0000000000000000000000000000000000000000..6138e875e83a9a91708159b7f99e692c901f4f1b
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/ppyolo/ppyolov2_r50vd_dcn_365e_lvjian1_1024.yml
@@ -0,0 +1,155 @@
+architecture: YOLOv3
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyolov2_r50vd_dcn_365e_coco.pdparams
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+
+metric: COCO
+num_classes: 5
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: train.json
+ dataset_dir: dataset/slice_lvjian1_data/train
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+TestDataset:
+ !ImageFolder
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+
+epoch: 20
+LearningRate:
+ base_lr: 0.0002
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 80
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 1000
+
+
+snapshot_epoch: 3
+worker_num: 2
+TrainReader:
+ inputs_def:
+ num_max_boxes: 100
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [960, 992, 1024, 1056, 1088], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeBox: {}
+ - PadBox: {num_max_boxes: 100}
+ - BboxXYXY2XYWH: {}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - Gt2YoloTarget: {anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]], anchors: [[8, 7], [24, 12], [14, 25], [37, 35], [30, 140], [89, 52], [93, 189], [226, 99], [264, 352]], downsample_ratios: [32, 16, 8]}
+ batch_size: 2
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [1024, 1024], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 8
+
+TestReader:
+ inputs_def:
+ image_shape: [3, 1024, 1024]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [1024, 1024], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+OptimizerBuilder:
+ clip_grad_by_norm: 35.
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+YOLOv3:
+ backbone: ResNet
+ neck: PPYOLOPAN
+ yolo_head: YOLOv3Head
+ post_process: BBoxPostProcess
+
+ResNet:
+ depth: 50
+ variant: d
+ return_idx: [1, 2, 3]
+ dcn_v2_stages: [3]
+ freeze_at: -1
+ freeze_norm: false
+ norm_decay: 0.
+
+PPYOLOPAN:
+ drop_block: true
+ block_size: 3
+ keep_prob: 0.9
+ spp: true
+
+YOLOv3Head:
+ anchors: [[8, 7], [24, 12], [14, 25],
+ [37, 35], [30, 140], [89, 52],
+ [93, 189], [226, 99], [264, 352]]
+ anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
+ loss: YOLOv3Loss
+ iou_aware: true
+ iou_aware_factor: 0.5
+
+YOLOv3Loss:
+ ignore_thresh: 0.7
+ downsample: [32, 16, 8]
+ label_smooth: false
+ scale_x_y: 1.05
+ iou_loss: IouLoss
+ iou_aware_loss: IouAwareLoss
+
+IouLoss:
+ loss_weight: 2.5
+ loss_square: true
+
+IouAwareLoss:
+ loss_weight: 1.0
+
+BBoxPostProcess:
+ decode:
+ name: YOLOBox
+ conf_thresh: 0.01
+ downsample_ratio: 32
+ clip_bbox: true
+ scale_x_y: 1.05
+ nms:
+ name: MatrixNMS
+ keep_top_k: 100
+ score_threshold: 0.01
+ post_threshold: 0.01
+ nms_top_k: -1
+ background_label: -1
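The `anchor_masks` in `Gt2YoloTarget` and `YOLOv3Head` above select which of the nine anchors each output scale is responsible for; mask `[6, 7, 8]` pairs the three largest anchors with the coarsest stride (32). A minimal pure-Python sketch of that grouping, using the lvjian1 anchors from this config (the `per_scale` name is illustrative, not a PaddleDetection API):

```python
# Group the nine anchors by anchor_masks, pairing each group with
# its downsample ratio, mirroring the config above.
anchors = [[8, 7], [24, 12], [14, 25],
           [37, 35], [30, 140], [89, 52],
           [93, 189], [226, 99], [264, 352]]
anchor_masks = [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
downsample_ratios = [32, 16, 8]

per_scale = {
    stride: [anchors[i] for i in mask]
    for stride, mask in zip(downsample_ratios, anchor_masks)
}
# The stride-32 head is assigned the three largest anchors.
print(per_scale[32])  # [[93, 189], [226, 99], [264, 352]]
```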
diff --git a/PaddleDetection-release-2.6/configs/smrt/ppyolo/ppyolov2_r50vd_dcn_365e_lvjian1_640.yml b/PaddleDetection-release-2.6/configs/smrt/ppyolo/ppyolov2_r50vd_dcn_365e_lvjian1_640.yml
new file mode 100644
index 0000000000000000000000000000000000000000..5da1090006a2e1ba967c0870ba7311306ec1a164
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/ppyolo/ppyolov2_r50vd_dcn_365e_lvjian1_640.yml
@@ -0,0 +1,155 @@
+architecture: YOLOv3
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyolov2_r50vd_dcn_365e_coco.pdparams
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+
+metric: COCO
+num_classes: 5
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: train.json
+ dataset_dir: dataset/slice_lvjian1_data/train
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+TestDataset:
+ !ImageFolder
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+
+epoch: 20
+LearningRate:
+ base_lr: 0.0002
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 80
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 1000
+
+
+snapshot_epoch: 3
+worker_num: 2
+TrainReader:
+ inputs_def:
+ num_max_boxes: 100
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [576, 608, 640, 672, 704], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeBox: {}
+ - PadBox: {num_max_boxes: 100}
+ - BboxXYXY2XYWH: {}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - Gt2YoloTarget: {anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]], anchors: [[8, 7], [24, 12], [14, 25], [37, 35], [30, 140], [89, 52], [93, 189], [226, 99], [264, 352]], downsample_ratios: [32, 16, 8]}
+ batch_size: 2
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 8
+
+TestReader:
+ inputs_def:
+ image_shape: [3, 640, 640]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+OptimizerBuilder:
+ clip_grad_by_norm: 35.
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+YOLOv3:
+ backbone: ResNet
+ neck: PPYOLOPAN
+ yolo_head: YOLOv3Head
+ post_process: BBoxPostProcess
+
+ResNet:
+ depth: 50
+ variant: d
+ return_idx: [1, 2, 3]
+ dcn_v2_stages: [3]
+ freeze_at: -1
+ freeze_norm: false
+ norm_decay: 0.
+
+PPYOLOPAN:
+ drop_block: true
+ block_size: 3
+ keep_prob: 0.9
+ spp: true
+
+YOLOv3Head:
+ anchors: [[8, 7], [24, 12], [14, 25],
+ [37, 35], [30, 140], [89, 52],
+ [93, 189], [226, 99], [264, 352]]
+ anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
+ loss: YOLOv3Loss
+ iou_aware: true
+ iou_aware_factor: 0.5
+
+YOLOv3Loss:
+ ignore_thresh: 0.7
+ downsample: [32, 16, 8]
+ label_smooth: false
+ scale_x_y: 1.05
+ iou_loss: IouLoss
+ iou_aware_loss: IouAwareLoss
+
+IouLoss:
+ loss_weight: 2.5
+ loss_square: true
+
+IouAwareLoss:
+ loss_weight: 1.0
+
+BBoxPostProcess:
+ decode:
+ name: YOLOBox
+ conf_thresh: 0.01
+ downsample_ratio: 32
+ clip_bbox: true
+ scale_x_y: 1.05
+ nms:
+ name: MatrixNMS
+ keep_top_k: 100
+ score_threshold: 0.01
+ post_threshold: 0.01
+ nms_top_k: -1
+ background_label: -1
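The `LearningRate` block in these ppyolo configs combines a 1000-step linear warmup (from `start_factor: 0.`) with a piecewise decay by `gamma: 0.1` at epoch 80. Note that with `epoch: 20` the milestone is never reached, so the decay step does not fire during this run. A simplified per-step sketch under those assumptions (the `lr_at` helper is illustrative; PaddleDetection builds the boundary list internally):

```python
def lr_at(step, epoch, base_lr=0.0002, warmup_steps=1000,
          start_factor=0.0, gamma=0.1, milestones=(80,)):
    """Approximate PiecewiseDecay + LinearWarmup as configured above:
    linear warmup over the first warmup_steps, then a step decay by
    gamma at each milestone epoch."""
    decayed = base_lr * gamma ** sum(epoch >= m for m in milestones)
    if step < warmup_steps:
        alpha = step / warmup_steps
        return decayed * (start_factor + (1 - start_factor) * alpha)
    return decayed
```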
diff --git a/PaddleDetection-release-2.6/configs/smrt/ppyolo/ppyolov2_r50vd_dcn_365e_renche_1024.yml b/PaddleDetection-release-2.6/configs/smrt/ppyolo/ppyolov2_r50vd_dcn_365e_renche_1024.yml
new file mode 100644
index 0000000000000000000000000000000000000000..611ea34a6b6bed018698f7c2fc7f1e6cf6528988
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/ppyolo/ppyolov2_r50vd_dcn_365e_renche_1024.yml
@@ -0,0 +1,156 @@
+architecture: YOLOv3
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyolov2_r50vd_dcn_365e_coco.pdparams
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+
+metric: COCO
+num_classes: 22
+
+TrainDataset:
+ !COCODataSet
+ image_dir: train_images
+ anno_path: train.json
+ dataset_dir: dataset/renche
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: train_images
+ anno_path: test.json
+ dataset_dir: dataset/renche
+
+TestDataset:
+ !ImageFolder
+ anno_path: dataset/renche/test.json
+
+
+epoch: 100
+LearningRate:
+ base_lr: 0.0002
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 80
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 1000
+
+
+snapshot_epoch: 3
+worker_num: 8
+TrainReader:
+ inputs_def:
+ num_max_boxes: 100
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [960, 992, 1024, 1056, 1088], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeBox: {}
+ - PadBox: {num_max_boxes: 100}
+ - BboxXYXY2XYWH: {}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - Gt2YoloTarget: {anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]], anchors: [[10, 13], [16, 30], [33, 23], [30, 61], [62, 45], [59, 119], [116, 90], [156, 198], [373, 326]], downsample_ratios: [32, 16, 8]}
+ batch_size: 2
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [1024, 1024], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 8
+
+TestReader:
+ inputs_def:
+ image_shape: [3, 1024, 1024]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [1024, 1024], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+
+OptimizerBuilder:
+ clip_grad_by_norm: 35.
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+
+YOLOv3:
+ backbone: ResNet
+ neck: PPYOLOPAN
+ yolo_head: YOLOv3Head
+ post_process: BBoxPostProcess
+
+ResNet:
+ depth: 50
+ variant: d
+ return_idx: [1, 2, 3]
+ dcn_v2_stages: [3]
+ freeze_at: -1
+ freeze_norm: false
+ norm_decay: 0.
+
+PPYOLOPAN:
+ drop_block: true
+ block_size: 3
+ keep_prob: 0.9
+ spp: true
+
+YOLOv3Head:
+ anchors: [[10, 13], [16, 30], [33, 23],
+ [30, 61], [62, 45], [59, 119],
+ [116, 90], [156, 198], [373, 326]]
+ anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
+ loss: YOLOv3Loss
+ iou_aware: true
+ iou_aware_factor: 0.5
+
+YOLOv3Loss:
+ ignore_thresh: 0.7
+ downsample: [32, 16, 8]
+ label_smooth: false
+ scale_x_y: 1.05
+ iou_loss: IouLoss
+ iou_aware_loss: IouAwareLoss
+
+IouLoss:
+ loss_weight: 2.5
+ loss_square: true
+
+IouAwareLoss:
+ loss_weight: 1.0
+
+BBoxPostProcess:
+ decode:
+ name: YOLOBox
+ conf_thresh: 0.01
+ downsample_ratio: 32
+ clip_bbox: true
+ scale_x_y: 1.05
+ nms:
+ name: MatrixNMS
+ keep_top_k: 100
+ score_threshold: 0.01
+ post_threshold: 0.01
+ nms_top_k: -1
+ background_label: -1
diff --git a/PaddleDetection-release-2.6/configs/smrt/ppyolo/ppyolov2_r50vd_dcn_365e_renche_640.yml b/PaddleDetection-release-2.6/configs/smrt/ppyolo/ppyolov2_r50vd_dcn_365e_renche_640.yml
new file mode 100644
index 0000000000000000000000000000000000000000..37fb675f0acc4585f5ded137db46473b57c517c0
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/ppyolo/ppyolov2_r50vd_dcn_365e_renche_640.yml
@@ -0,0 +1,156 @@
+architecture: YOLOv3
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyolov2_r50vd_dcn_365e_coco.pdparams
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+
+metric: COCO
+num_classes: 22
+
+TrainDataset:
+ !COCODataSet
+ image_dir: train_images
+ anno_path: train.json
+ dataset_dir: dataset/coco/renche
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: train_images
+ anno_path: test.json
+ dataset_dir: dataset/coco/renche
+
+TestDataset:
+ !ImageFolder
+ anno_path: dataset/coco/renche/test.json
+
+
+epoch: 100
+LearningRate:
+ base_lr: 0.0002
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 80
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 1000
+
+
+snapshot_epoch: 3
+worker_num: 8
+TrainReader:
+ inputs_def:
+ num_max_boxes: 100
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [576, 608, 640, 672, 704], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeBox: {}
+ - PadBox: {num_max_boxes: 100}
+ - BboxXYXY2XYWH: {}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - Gt2YoloTarget: {anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]], anchors: [[10, 13], [16, 30], [33, 23], [30, 61], [62, 45], [59, 119], [116, 90], [156, 198], [373, 326]], downsample_ratios: [32, 16, 8]}
+ batch_size: 2
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 8
+
+TestReader:
+ inputs_def:
+ image_shape: [3, 640, 640]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+
+OptimizerBuilder:
+ clip_grad_by_norm: 35.
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+
+YOLOv3:
+ backbone: ResNet
+ neck: PPYOLOPAN
+ yolo_head: YOLOv3Head
+ post_process: BBoxPostProcess
+
+ResNet:
+ depth: 50
+ variant: d
+ return_idx: [1, 2, 3]
+ dcn_v2_stages: [3]
+ freeze_at: -1
+ freeze_norm: false
+ norm_decay: 0.
+
+PPYOLOPAN:
+ drop_block: true
+ block_size: 3
+ keep_prob: 0.9
+ spp: true
+
+YOLOv3Head:
+ anchors: [[10, 13], [16, 30], [33, 23],
+ [30, 61], [62, 45], [59, 119],
+ [116, 90], [156, 198], [373, 326]]
+ anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
+ loss: YOLOv3Loss
+ iou_aware: true
+ iou_aware_factor: 0.5
+
+YOLOv3Loss:
+ ignore_thresh: 0.7
+ downsample: [32, 16, 8]
+ label_smooth: false
+ scale_x_y: 1.05
+ iou_loss: IouLoss
+ iou_aware_loss: IouAwareLoss
+
+IouLoss:
+ loss_weight: 2.5
+ loss_square: true
+
+IouAwareLoss:
+ loss_weight: 1.0
+
+BBoxPostProcess:
+ decode:
+ name: YOLOBox
+ conf_thresh: 0.01
+ downsample_ratio: 32
+ clip_bbox: true
+ scale_x_y: 1.05
+ nms:
+ name: MatrixNMS
+ keep_top_k: 100
+ score_threshold: 0.01
+ post_threshold: 0.01
+ nms_top_k: -1
+ background_label: -1
diff --git a/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_l_300e_battery.yml b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_l_300e_battery.yml
new file mode 100644
index 0000000000000000000000000000000000000000..bc58d999cabfdfb8f2252ca0e34c73e118ba70e9
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_l_300e_battery.yml
@@ -0,0 +1,140 @@
+weights: output/ppyoloe_crn_l_300e_battery/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams
+depth_mult: 1.0
+width_mult: 1.0
+
+worker_num: 4
+eval_height: &eval_height 640
+eval_width: &eval_width 640
+eval_size: &eval_size [*eval_height, *eval_width]
+
+metric: COCO
+num_classes: 45
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/train.json
+ dataset_dir: dataset/battery_mini
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/test.json
+ dataset_dir: dataset/battery_mini
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/test.json
+ dataset_dir: dataset/battery_mini
+
+epoch: 30
+LearningRate:
+ base_lr: 0.0005
+ schedulers:
+ - !CosineDecay
+ max_epochs: 36
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 3
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [512, 544, 576, 608, 640, 672, 704, 736, 768], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 5
+print_flops: false
+
+# Exporting the model
+export:
+ post_process: True # Whether post-processing is included in the network when exporting the model.
+ nms: True # Whether NMS is included in the network when exporting the model.
+ benchmark: False # Used to benchmark model performance; if set `True`, post-processing and NMS will not be exported.
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+architecture: YOLOv3
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192]
+ stage_num: 1
+ block_num: 3
+ act: 'swish'
+ spp: true
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: 100
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.6
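The `NormalizeImage` transform used throughout these configs, with `is_scale: True`, first scales raw pixel values to `[0, 1]` and then standardizes with the ImageNet mean/std. A small arithmetic sketch (`normalize_pixel` is an illustrative helper, not a PaddleDetection function):

```python
# ImageNet statistics from the configs above.
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

def normalize_pixel(value, channel):
    """Apply NormalizeImage semantics to one pixel value (0-255)
    in the given channel: scale to [0, 1], then standardize."""
    scaled = value / 255.0            # is_scale: True
    return (scaled - mean[channel]) / std[channel]
```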
diff --git a/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_l_300e_battery_1024.yml b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_l_300e_battery_1024.yml
new file mode 100644
index 0000000000000000000000000000000000000000..027e38e202eaff50e69ac0d3204541d5ae7a08a6
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_l_300e_battery_1024.yml
@@ -0,0 +1,140 @@
+weights: output/ppyoloe_crn_l_300e_battery_1024/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams
+depth_mult: 1.0
+width_mult: 1.0
+
+worker_num: 4
+eval_height: &eval_height 1024
+eval_width: &eval_width 1024
+eval_size: &eval_size [*eval_height, *eval_width]
+
+metric: COCO
+num_classes: 45
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/train.json
+ dataset_dir: dataset/battery_mini
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/test.json
+ dataset_dir: dataset/battery_mini
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/test.json
+ dataset_dir: dataset/battery_mini
+
+epoch: 30
+LearningRate:
+ base_lr: 0.0005
+ schedulers:
+ - !CosineDecay
+ max_epochs: 36
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 3
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [960, 992, 1024, 1056, 1088], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 5
+print_flops: false
+
+# Exporting the model
+export:
+ post_process: True # Whether post-processing is included in the network when exporting the model.
+ nms: True # Whether NMS is included in the network when exporting the model.
+ benchmark: False # Used to benchmark model performance; if set `True`, post-processing and NMS will not be exported.
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+architecture: YOLOv3
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192]
+ stage_num: 1
+ block_num: 3
+ act: 'swish'
+ spp: true
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: 100
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_l_300e_lvjian1.yml b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_l_300e_lvjian1.yml
new file mode 100644
index 0000000000000000000000000000000000000000..272caf679a296cb4375e3628aa070fd71cec9931
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_l_300e_lvjian1.yml
@@ -0,0 +1,140 @@
+weights: output/ppyoloe_crn_l_300e_lvjian1/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams
+depth_mult: 1.0
+width_mult: 1.0
+
+worker_num: 4
+eval_height: &eval_height 640
+eval_width: &eval_width 640
+eval_size: &eval_size [*eval_height, *eval_width]
+
+metric: COCO
+num_classes: 5
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: train.json
+ dataset_dir: dataset/slice_lvjian1_data/train
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+TestDataset:
+ !ImageFolder
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+epoch: 30
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !CosineDecay
+ max_epochs: 36
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 3
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [640, 672, 704, 736, 768], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 8
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 2
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 1
+print_flops: false
+
+# Exporting the model
+export:
+ post_process: True # Whether post-processing is included in the network when exporting the model.
+ nms: True # Whether NMS is included in the network when exporting the model.
+ benchmark: False # Used to benchmark model performance; if set `True`, post-processing and NMS will not be exported.
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+architecture: YOLOv3
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192]
+ stage_num: 1
+ block_num: 3
+ act: 'swish'
+ spp: true
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: 100
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.6
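The ppyoloe configs swap the piecewise schedule for `CosineDecay` with `max_epochs: 36` plus a 3-epoch linear warmup; since training stops at `epoch: 30`, the cosine curve never quite reaches zero. A simplified per-epoch sketch under those assumptions (`cosine_lr` is an illustrative name):

```python
import math

def cosine_lr(epoch, base_lr=0.001, max_epochs=36,
              warmup_epochs=3, start_factor=0.0):
    """Approximate CosineDecay + LinearWarmup as configured above,
    at per-epoch granularity."""
    if epoch < warmup_epochs:
        alpha = epoch / warmup_epochs
        return base_lr * (start_factor + (1 - start_factor) * alpha)
    return base_lr * 0.5 * (1 + math.cos(math.pi * epoch / max_epochs))
```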
diff --git a/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_l_300e_lvjian1_1024.yml b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_l_300e_lvjian1_1024.yml
new file mode 100644
index 0000000000000000000000000000000000000000..38a14259f54dd7a515aa68e5a5f7a79909f5a40b
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_l_300e_lvjian1_1024.yml
@@ -0,0 +1,140 @@
+weights: output/ppyoloe_crn_l_300e_lvjian1_1024/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams
+depth_mult: 1.0
+width_mult: 1.0
+
+worker_num: 4
+eval_height: &eval_height 1024
+eval_width: &eval_width 1024
+eval_size: &eval_size [*eval_height, *eval_width]
+
+metric: COCO
+num_classes: 5
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: train.json
+ dataset_dir: dataset/slice_lvjian1_data/train
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+TestDataset:
+ !ImageFolder
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+epoch: 30
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !CosineDecay
+ max_epochs: 36
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 3
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [960, 992, 1024, 1056, 1088], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 8
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 2
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 1
+print_flops: false
+
+# Exporting the model
+export:
+ post_process: True # Whether post-processing is included in the network when exporting the model.
+ nms: True # Whether NMS is included in the network when exporting the model.
+ benchmark: False # Used to benchmark model performance; if set `True`, post-processing and NMS will not be exported.
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+architecture: YOLOv3
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192]
+ stage_num: 1
+ block_num: 3
+ act: 'swish'
+ spp: true
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: 100
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_l_300e_renche.yml b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_l_300e_renche.yml
new file mode 100644
index 0000000000000000000000000000000000000000..80c7bac76453e407d743a4e677257ebd4e2505b3
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_l_300e_renche.yml
@@ -0,0 +1,140 @@
+weights: output/ppyoloe_crn_l_300e_renche/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams
+depth_mult: 1.0
+width_mult: 1.0
+
+worker_num: 4
+eval_height: &eval_height 640
+eval_width: &eval_width 640
+eval_size: &eval_size [*eval_height, *eval_width]
+
+metric: COCO
+num_classes: 22
+
+TrainDataset:
+ !COCODataSet
+ image_dir: train_images
+ anno_path: train.json
+ dataset_dir: dataset/renche
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: train_images
+ anno_path: test.json
+ dataset_dir: dataset/renche
+
+TestDataset:
+ !ImageFolder
+ anno_path: test.json
+ dataset_dir: dataset/renche
+
+epoch: 30
+LearningRate:
+ base_lr: 0.0005
+ schedulers:
+ - !CosineDecay
+ max_epochs: 36
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 3
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [512, 544, 576, 608, 640, 672, 704, 736, 768], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 2
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 5
+print_flops: false
+
+# Exporting the model
+export:
+ post_process: True # Whether post-processing is included in the network when exporting the model.
+ nms: True # Whether NMS is included in the network when exporting the model.
+ benchmark: False # Used to benchmark model performance; if set `True`, post-processing and NMS will not be exported.
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+architecture: YOLOv3
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192]
+ stage_num: 1
+ block_num: 3
+ act: 'swish'
+ spp: true
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: 100
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_l_300e_renche_1024.yml b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_l_300e_renche_1024.yml
new file mode 100644
index 0000000000000000000000000000000000000000..2151ecf711c0f52560f9318085f0fee2de7b8a85
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_l_300e_renche_1024.yml
@@ -0,0 +1,140 @@
+weights: output/ppyoloe_crn_l_300e_renche_1024/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams
+depth_mult: 1.0
+width_mult: 1.0
+
+worker_num: 4
+eval_height: &eval_height 1024
+eval_width: &eval_width 1024
+eval_size: &eval_size [*eval_height, *eval_width]
+
+metric: COCO
+num_classes: 22
+
+TrainDataset:
+ !COCODataSet
+ image_dir: train_images
+ anno_path: train.json
+ dataset_dir: dataset/renche
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: train_images
+ anno_path: test.json
+ dataset_dir: dataset/renche
+
+TestDataset:
+ !ImageFolder
+ anno_path: test.json
+ dataset_dir: dataset/renche
+
+epoch: 30
+LearningRate:
+ base_lr: 0.0005
+ schedulers:
+ - !CosineDecay
+ max_epochs: 36
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 3
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [960, 992, 1024, 1056, 1088], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 2
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 5
+print_flops: false
+
+# Exporting the model
+export:
+  post_process: True  # Whether to include post-processing in the network when exporting the model.
+  nms: True  # Whether to include NMS in the network when exporting the model.
+  benchmark: False  # Used to test model performance; if set `True`, post-processing and NMS will not be exported.
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+architecture: YOLOv3
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192]
+ stage_num: 1
+ block_num: 3
+ act: 'swish'
+ spp: true
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: 100
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_m_300e_battery.yml b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_m_300e_battery.yml
new file mode 100644
index 0000000000000000000000000000000000000000..8902f32ec42da89643b85f0743799555c3abc8ec
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_m_300e_battery.yml
@@ -0,0 +1,140 @@
+weights: output/ppyoloe_crn_m_300e_battery/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_m_300e_coco.pdparams
+depth_mult: 0.67
+width_mult: 0.75
+
+worker_num: 4
+eval_height: &eval_height 640
+eval_width: &eval_width 640
+eval_size: &eval_size [*eval_height, *eval_width]
+
+metric: COCO
+num_classes: 45
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/train.json
+ dataset_dir: dataset/battery_mini
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/test.json
+ dataset_dir: dataset/battery_mini
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/test.json
+ dataset_dir: dataset/battery_mini
+
+epoch: 30
+LearningRate:
+ base_lr: 0.0005
+ schedulers:
+ - !CosineDecay
+ max_epochs: 36
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 3
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [640, 672, 704, 736, 768], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 5
+print_flops: false
+
+# Exporting the model
+export:
+  post_process: True  # Whether to include post-processing in the network when exporting the model.
+  nms: True  # Whether to include NMS in the network when exporting the model.
+  benchmark: False  # Used to test model performance; if set `True`, post-processing and NMS will not be exported.
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+architecture: YOLOv3
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192]
+ stage_num: 1
+ block_num: 3
+ act: 'swish'
+ spp: true
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: 100
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_m_300e_battery_1024.yml b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_m_300e_battery_1024.yml
new file mode 100644
index 0000000000000000000000000000000000000000..f244c1dd13381d360440a1c7705c8f5f81abf576
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_m_300e_battery_1024.yml
@@ -0,0 +1,140 @@
+weights: output/ppyoloe_crn_m_300e_battery_1024/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_m_300e_coco.pdparams
+depth_mult: 0.67
+width_mult: 0.75
+
+worker_num: 4
+eval_height: &eval_height 1024
+eval_width: &eval_width 1024
+eval_size: &eval_size [*eval_height, *eval_width]
+
+metric: COCO
+num_classes: 45
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/train.json
+ dataset_dir: dataset/battery_mini
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/test.json
+ dataset_dir: dataset/battery_mini
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/test.json
+ dataset_dir: dataset/battery_mini
+
+epoch: 30
+LearningRate:
+ base_lr: 0.0005
+ schedulers:
+ - !CosineDecay
+ max_epochs: 36
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 3
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [960, 992, 1024, 1056, 1088], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 5
+print_flops: false
+
+# Exporting the model
+export:
+  post_process: True  # Whether to include post-processing in the network when exporting the model.
+  nms: True  # Whether to include NMS in the network when exporting the model.
+  benchmark: False  # Used to test model performance; if set `True`, post-processing and NMS will not be exported.
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+architecture: YOLOv3
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192]
+ stage_num: 1
+ block_num: 3
+ act: 'swish'
+ spp: true
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: 100
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_m_300e_lvjian1.yml b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_m_300e_lvjian1.yml
new file mode 100644
index 0000000000000000000000000000000000000000..7563756955b97f722a9c099dfb8ce57a90b6c6f7
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_m_300e_lvjian1.yml
@@ -0,0 +1,140 @@
+weights: output/ppyoloe_crn_m_300e_lvjian1/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_m_300e_coco.pdparams
+depth_mult: 0.67
+width_mult: 0.75
+
+worker_num: 4
+eval_height: &eval_height 640
+eval_width: &eval_width 640
+eval_size: &eval_size [*eval_height, *eval_width]
+
+metric: COCO
+num_classes: 5
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: train.json
+ dataset_dir: dataset/slice_lvjian1_data/train
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+TestDataset:
+ !ImageFolder
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+epoch: 30
+LearningRate:
+ base_lr: 0.002
+ schedulers:
+ - !CosineDecay
+ max_epochs: 36
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 3
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [640, 672, 704, 736, 768], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 16
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 2
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 1
+print_flops: false
+
+# Exporting the model
+export:
+  post_process: True  # Whether to include post-processing in the network when exporting the model.
+  nms: True  # Whether to include NMS in the network when exporting the model.
+  benchmark: False  # Used to test model performance; if set `True`, post-processing and NMS will not be exported.
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+architecture: YOLOv3
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192]
+ stage_num: 1
+ block_num: 3
+ act: 'swish'
+ spp: true
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: 100
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_m_300e_lvjian1_1024.yml b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_m_300e_lvjian1_1024.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d15e07f8e88cd1f9d592296e71cc587a6e6892ef
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_m_300e_lvjian1_1024.yml
@@ -0,0 +1,140 @@
+weights: output/ppyoloe_crn_m_300e_lvjian1_1024/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_m_300e_coco.pdparams
+depth_mult: 0.67
+width_mult: 0.75
+
+worker_num: 2
+eval_height: &eval_height 1024
+eval_width: &eval_width 1024
+eval_size: &eval_size [*eval_height, *eval_width]
+
+metric: COCO
+num_classes: 5
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: train.json
+ dataset_dir: dataset/slice_lvjian1_data/train
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+TestDataset:
+ !ImageFolder
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+epoch: 30
+LearningRate:
+ base_lr: 0.0015
+ schedulers:
+ - !CosineDecay
+ max_epochs: 36
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 3
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [960, 992, 1024, 1056, 1088], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 8
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 2
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 2
+print_flops: false
+
+# Exporting the model
+export:
+  post_process: True  # Whether to include post-processing in the network when exporting the model.
+  nms: True  # Whether to include NMS in the network when exporting the model.
+  benchmark: False  # Used to test model performance; if set `True`, post-processing and NMS will not be exported.
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+architecture: YOLOv3
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192]
+ stage_num: 1
+ block_num: 3
+ act: 'swish'
+ spp: true
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: 100
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_m_300e_renche.yml b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_m_300e_renche.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a65cbdf540bd9e48800610516e0978d9f51b2c41
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_m_300e_renche.yml
@@ -0,0 +1,140 @@
+weights: output/ppyoloe_crn_m_300e_renche/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_m_300e_coco.pdparams
+depth_mult: 0.67
+width_mult: 0.75
+
+worker_num: 4
+eval_height: &eval_height 640
+eval_width: &eval_width 640
+eval_size: &eval_size [*eval_height, *eval_width]
+
+metric: COCO
+num_classes: 22
+
+TrainDataset:
+ !COCODataSet
+ image_dir: train_images
+ anno_path: train.json
+ dataset_dir: dataset/renche
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: train_images
+ anno_path: test.json
+ dataset_dir: dataset/renche
+
+TestDataset:
+ !ImageFolder
+ anno_path: test.json
+ dataset_dir: dataset/renche
+
+epoch: 30
+LearningRate:
+ base_lr: 0.0005
+ schedulers:
+ - !CosineDecay
+ max_epochs: 36
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 3
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [640, 672, 704, 736, 768], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 2
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 5
+print_flops: false
+
+# Exporting the model
+export:
+  post_process: True  # Whether to include post-processing in the network when exporting the model.
+  nms: True  # Whether to include NMS in the network when exporting the model.
+  benchmark: False  # Used to test model performance; if set `True`, post-processing and NMS will not be exported.
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+architecture: YOLOv3
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192]
+ stage_num: 1
+ block_num: 3
+ act: 'swish'
+ spp: true
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: 100
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_m_300e_renche_1024.yml b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_m_300e_renche_1024.yml
new file mode 100644
index 0000000000000000000000000000000000000000..0427b81d4f8eeca71f6245a583f0f0a2d99f3569
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_m_300e_renche_1024.yml
@@ -0,0 +1,140 @@
+weights: output/ppyoloe_crn_m_300e_renche_1024/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_m_300e_coco.pdparams
+depth_mult: 0.67
+width_mult: 0.75
+
+worker_num: 4
+eval_height: &eval_height 1024
+eval_width: &eval_width 1024
+eval_size: &eval_size [*eval_height, *eval_width]
+
+metric: COCO
+num_classes: 22
+
+TrainDataset:
+  !COCODataSet
+    image_dir: train_images
+    anno_path: train.json
+    dataset_dir: dataset/renche
+    data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+  !COCODataSet
+    image_dir: train_images
+    anno_path: test.json
+    dataset_dir: dataset/renche
+
+TestDataset:
+  !ImageFolder
+    anno_path: test.json
+    dataset_dir: dataset/renche
+
+epoch: 30
+LearningRate:
+ base_lr: 0.0005
+ schedulers:
+ - !CosineDecay
+ max_epochs: 36
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 3
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [960, 992, 1024, 1056, 1088], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 2
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 5
+print_flops: false
+
+# Exporting the model
+export:
+  post_process: True  # Whether to include post-processing in the network when exporting the model.
+  nms: True  # Whether to include NMS in the network when exporting the model.
+  benchmark: False  # Used to test model performance; if set `True`, post-processing and NMS will not be exported.
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+architecture: YOLOv3
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192]
+ stage_num: 1
+ block_num: 3
+ act: 'swish'
+ spp: true
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: 100
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_s_300e_battery.yml b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_s_300e_battery.yml
new file mode 100644
index 0000000000000000000000000000000000000000..1ef01cfc633414a9e4f71bbfc656a116c76fc7bf
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_s_300e_battery.yml
@@ -0,0 +1,140 @@
+weights: output/ppyoloe_crn_s_300e_battery/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_s_300e_coco.pdparams
+depth_mult: 0.33
+width_mult: 0.50
+
+worker_num: 4
+eval_height: &eval_height 640
+eval_width: &eval_width 640
+eval_size: &eval_size [*eval_height, *eval_width]
+
+metric: COCO
+num_classes: 45
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/train.json
+ dataset_dir: dataset/battery_mini
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/test.json
+ dataset_dir: dataset/battery_mini
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/test.json
+ dataset_dir: dataset/battery_mini
+
+epoch: 30
+LearningRate:
+ base_lr: 0.0005
+ schedulers:
+ - !CosineDecay
+ max_epochs: 36
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 3
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [512, 544, 576, 608, 640, 672, 704, 736, 768], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 5
+print_flops: false
+
+# Exporting the model
+export:
+  post_process: True  # Whether to include post-processing in the network when exporting the model.
+  nms: True  # Whether to include NMS in the network when exporting the model.
+  benchmark: False  # Used to test model performance; if set `True`, post-processing and NMS will not be exported.
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+architecture: YOLOv3
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192]
+ stage_num: 1
+ block_num: 3
+ act: 'swish'
+ spp: true
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: 100
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_s_300e_battery_1024.yml b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_s_300e_battery_1024.yml
new file mode 100644
index 0000000000000000000000000000000000000000..42d30e00ff940b49b778306fa45562cf87f36396
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_s_300e_battery_1024.yml
@@ -0,0 +1,140 @@
+weights: output/ppyoloe_crn_s_300e_battery_1024/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_s_300e_coco.pdparams
+depth_mult: 0.33
+width_mult: 0.50
+
+worker_num: 4
+eval_height: &eval_height 1024
+eval_width: &eval_width 1024
+eval_size: &eval_size [*eval_height, *eval_width]
+
+metric: COCO
+num_classes: 45
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/train.json
+ dataset_dir: dataset/battery_mini
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/test.json
+ dataset_dir: dataset/battery_mini
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/test.json
+ dataset_dir: dataset/battery_mini
+
+epoch: 30
+LearningRate:
+ base_lr: 0.0005
+ schedulers:
+ - !CosineDecay
+ max_epochs: 36
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 3
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [960, 992, 1024, 1056, 1088], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 5
+print_flops: false
+
+# Exporting the model
+export:
+  post_process: True  # Whether to include post-processing in the network when exporting the model.
+  nms: True  # Whether to include NMS in the network when exporting the model.
+  benchmark: False  # Used to test model performance; if set `True`, post-processing and NMS will not be exported.
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+architecture: YOLOv3
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192]
+ stage_num: 1
+ block_num: 3
+ act: 'swish'
+ spp: true
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: 100
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.6
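A note on the schedule in the `LearningRate` blocks of these configs: a `LinearWarmup` (3 epochs, `start_factor: 0.`) feeds into a `CosineDecay` with `max_epochs: 36`, while training stops at `epoch: 30` — so the rate never fully decays to zero. A rough per-epoch sketch of the combined schedule (function name and epoch-level granularity are illustrative; PaddleDetection applies this per training step):

```python
import math

def lr_at(epoch, base_lr=0.0005, warmup_epochs=3, start_factor=0.0, max_epochs=36):
    # Linear warmup from start_factor*base_lr up to base_lr over warmup_epochs,
    # then half-cosine decay toward 0 at max_epochs.
    if epoch < warmup_epochs:
        frac = epoch / warmup_epochs
        return base_lr * (start_factor + (1.0 - start_factor) * frac)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * epoch / max_epochs))

# lr_at(0) -> 0.0 (warmup start); lr_at(30) is still nonzero since max_epochs > epoch.
```

With `epoch: 30` and `max_epochs: 36`, `lr_at(30)` lands at roughly `0.067 * base_lr`, which is the intended behavior of letting cosine decay overshoot the training horizon.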
diff --git a/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_s_300e_lvjian1.yml b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_s_300e_lvjian1.yml
new file mode 100644
index 0000000000000000000000000000000000000000..b6155305fc4233b1c754dae4f2bb6cc368aa55f8
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_s_300e_lvjian1.yml
@@ -0,0 +1,140 @@
+weights: output/ppyoloe_crn_s_300e_lvjian1/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_s_300e_coco.pdparams
+depth_mult: 0.33
+width_mult: 0.50
+
+worker_num: 4
+eval_height: &eval_height 640
+eval_width: &eval_width 640
+eval_size: &eval_size [*eval_height, *eval_width]
+
+metric: COCO
+num_classes: 5
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: train.json
+ dataset_dir: dataset/slice_lvjian1_data/train
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+TestDataset:
+ !ImageFolder
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+epoch: 30
+LearningRate:
+ base_lr: 0.002
+ schedulers:
+ - !CosineDecay
+ max_epochs: 36
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 3
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [640, 672, 704, 736, 768], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 16
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 2
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 1
+print_flops: false
+
+# Exporting the model
+export:
+ post_process: True # Whether post-processing is included in the network when exporting the model.
+ nms: True # Whether NMS is included in the network when exporting the model.
+ benchmark: False # Used to test model performance; if set to `True`, post-processing and NMS will not be exported.
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+architecture: YOLOv3
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192]
+ stage_num: 1
+ block_num: 3
+ act: 'swish'
+ spp: true
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: 100
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_s_300e_lvjian_1024.yml b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_s_300e_lvjian_1024.yml
new file mode 100644
index 0000000000000000000000000000000000000000..72a184127f10d32176a90bd0045d20a6d88457fa
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_s_300e_lvjian_1024.yml
@@ -0,0 +1,140 @@
+weights: output/ppyoloe_crn_s_300e_lvjian1_1024/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_s_300e_coco.pdparams
+depth_mult: 0.33
+width_mult: 0.50
+
+worker_num: 2
+eval_height: &eval_height 1024
+eval_width: &eval_width 1024
+eval_size: &eval_size [*eval_height, *eval_width]
+
+metric: COCO
+num_classes: 5
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: train.json
+ dataset_dir: dataset/slice_lvjian1_data/train
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+TestDataset:
+ !ImageFolder
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+epoch: 30
+LearningRate:
+ base_lr: 0.003
+ schedulers:
+ - !CosineDecay
+ max_epochs: 36
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 3
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [960, 992, 1024, 1056, 1088], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 16
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 2
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 2
+print_flops: false
+
+# Exporting the model
+export:
+ post_process: True # Whether post-processing is included in the network when exporting the model.
+ nms: True # Whether NMS is included in the network when exporting the model.
+ benchmark: False # Used to test model performance; if set to `True`, post-processing and NMS will not be exported.
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+architecture: YOLOv3
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192]
+ stage_num: 1
+ block_num: 3
+ act: 'swish'
+ spp: true
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: 100
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.6
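The `MultiClassNMS` block that closes each head config drops detections below `score_threshold: 0.01`, greedily suppresses overlaps above `nms_threshold: 0.6`, and keeps at most `keep_top_k: 100` boxes. A minimal single-class sketch of that greedy procedure (a simplification — the exported operator runs per class and also applies `nms_top_k` pre-filtering):

```python
def iou(a, b):
    # a, b: [x1, y1, x2, y2]; intersection-over-union of two axis-aligned boxes.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def nms(boxes, scores, score_threshold=0.01, nms_threshold=0.6, keep_top_k=100):
    # Greedy NMS: visit boxes in descending score order, keep a box only if its
    # IoU with every already-kept box stays at or below nms_threshold.
    order = sorted((i for i, s in enumerate(scores) if s >= score_threshold),
                   key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= nms_threshold for j in keep):
            keep.append(i)
        if len(keep) >= keep_top_k:
            break
    return keep
```

For example, two heavily overlapping boxes collapse to the higher-scoring one, while a distant box survives: `nms([[0,0,10,10],[1,1,11,11],[20,20,30,30]], [0.9,0.8,0.7])` keeps indices 0 and 2.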
diff --git a/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_s_300e_renche.yml b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_s_300e_renche.yml
new file mode 100644
index 0000000000000000000000000000000000000000..df1939153b2672222fd9f3589da89ac3aa1a5a93
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_s_300e_renche.yml
@@ -0,0 +1,140 @@
+weights: output/ppyoloe_crn_s_300e_renche/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_s_300e_coco.pdparams
+depth_mult: 0.33
+width_mult: 0.50
+
+worker_num: 4
+eval_height: &eval_height 640
+eval_width: &eval_width 640
+eval_size: &eval_size [*eval_height, *eval_width]
+
+metric: COCO
+num_classes: 22
+
+TrainDataset:
+ !COCODataSet
+ image_dir: train_images
+ anno_path: train.json
+ dataset_dir: dataset/renche
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: train_images
+ anno_path: test.json
+ dataset_dir: dataset/renche
+
+TestDataset:
+ !ImageFolder
+ anno_path: test.json
+ dataset_dir: dataset/renche
+
+epoch: 30
+LearningRate:
+ base_lr: 0.0005
+ schedulers:
+ - !CosineDecay
+ max_epochs: 36
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 3
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [512, 544, 576, 608, 640, 672, 704, 736, 768], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 2
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 5
+print_flops: false
+
+# Exporting the model
+export:
+ post_process: True # Whether post-processing is included in the network when exporting the model.
+ nms: True # Whether NMS is included in the network when exporting the model.
+ benchmark: False # Used to test model performance; if set to `True`, post-processing and NMS will not be exported.
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+architecture: YOLOv3
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192]
+ stage_num: 1
+ block_num: 3
+ act: 'swish'
+ spp: true
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: 100
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_s_300e_renche_1024.yml b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_s_300e_renche_1024.yml
new file mode 100644
index 0000000000000000000000000000000000000000..07310a067794e789bd58172381cfecf37a1b3f03
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_s_300e_renche_1024.yml
@@ -0,0 +1,140 @@
+weights: output/ppyoloe_crn_s_300e_renche_1024/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_s_300e_coco.pdparams
+depth_mult: 0.33
+width_mult: 0.50
+
+worker_num: 4
+eval_height: &eval_height 1024
+eval_width: &eval_width 1024
+eval_size: &eval_size [*eval_height, *eval_width]
+
+metric: COCO
+num_classes: 22
+
+TrainDataset:
+ !COCODataSet
+ image_dir: train_images
+ anno_path: train.json
+ dataset_dir: dataset/renche
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: train_images
+ anno_path: test.json
+ dataset_dir: dataset/renche
+
+TestDataset:
+ !ImageFolder
+ anno_path: test.json
+ dataset_dir: dataset/renche
+
+epoch: 30
+LearningRate:
+ base_lr: 0.0005
+ schedulers:
+ - !CosineDecay
+ max_epochs: 36
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 3
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [960, 992, 1024, 1056, 1088], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 2
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 5
+print_flops: false
+
+# Exporting the model
+export:
+ post_process: True # Whether post-processing is included in the network when exporting the model.
+ nms: True # Whether NMS is included in the network when exporting the model.
+ benchmark: False # Used to test model performance; if set to `True`, post-processing and NMS will not be exported.
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+architecture: YOLOv3
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192]
+ stage_num: 1
+ block_num: 3
+ act: 'swish'
+ spp: true
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: 100
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.6
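All of the readers in these configs normalize with ImageNet statistics: `is_scale: True` first maps pixel values from 0–255 into [0, 1], and then each channel is standardized with the given `mean`/`std`. A minimal per-pixel sketch of that arithmetic (the actual `NormalizeImage` transform operates on whole image arrays, and the function name here is illustrative):

```python
def normalize_pixel(value, mean, std, is_scale=True):
    # Mirrors NormalizeImage semantics: optionally scale 0-255 -> 0-1,
    # then standardize with the per-channel mean and std.
    x = value / 255.0 if is_scale else float(value)
    return (x - mean) / std

# Red channel of a mid-gray pixel with the config's values (mean 0.485, std 0.229):
r = normalize_pixel(128, mean=0.485, std=0.229)
```

After this step, `Permute` reorders HWC arrays to the CHW layout the network expects.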
diff --git a/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_x_300e_battery.yml b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_x_300e_battery.yml
new file mode 100644
index 0000000000000000000000000000000000000000..ba94ad254319fa8fa2ca1cb3b982c7f4b5508c5f
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_x_300e_battery.yml
@@ -0,0 +1,140 @@
+weights: output/ppyoloe_crn_x_300e_battery/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_x_300e_coco.pdparams
+depth_mult: 1.33
+width_mult: 1.25
+
+worker_num: 4
+eval_height: &eval_height 640
+eval_width: &eval_width 640
+eval_size: &eval_size [*eval_height, *eval_width]
+
+metric: COCO
+num_classes: 45
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/train.json
+ dataset_dir: dataset/battery_mini
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/test.json
+ dataset_dir: dataset/battery_mini
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/test.json
+ dataset_dir: dataset/battery_mini
+
+epoch: 30
+LearningRate:
+ base_lr: 0.0005
+ schedulers:
+ - !CosineDecay
+ max_epochs: 36
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 3
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [512, 544, 576, 608, 640, 672, 704, 736, 768], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 5
+print_flops: false
+
+# Exporting the model
+export:
+ post_process: True # Whether post-processing is included in the network when exporting the model.
+ nms: True # Whether NMS is included in the network when exporting the model.
+ benchmark: False # Used to test model performance; if set to `True`, post-processing and NMS will not be exported.
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+architecture: YOLOv3
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192]
+ stage_num: 1
+ block_num: 3
+ act: 'swish'
+ spp: true
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: 100
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_x_300e_battery_1024.yml b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_x_300e_battery_1024.yml
new file mode 100644
index 0000000000000000000000000000000000000000..961d7823a32e8ee377274f1bf65399ab21b5a321
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_x_300e_battery_1024.yml
@@ -0,0 +1,140 @@
+weights: output/ppyoloe_crn_x_300e_battery_1024/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_x_300e_coco.pdparams
+depth_mult: 1.33
+width_mult: 1.25
+
+worker_num: 4
+eval_height: &eval_height 1024
+eval_width: &eval_width 1024
+eval_size: &eval_size [*eval_height, *eval_width]
+
+metric: COCO
+num_classes: 45
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/train.json
+ dataset_dir: dataset/battery_mini
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/test.json
+ dataset_dir: dataset/battery_mini
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/test.json
+ dataset_dir: dataset/battery_mini
+
+epoch: 30
+LearningRate:
+ base_lr: 0.0005
+ schedulers:
+ - !CosineDecay
+ max_epochs: 36
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 3
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [960, 992, 1024, 1056, 1088], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 5
+print_flops: false
+
+# Exporting the model
+export:
+ post_process: True # Whether post-processing is included in the network when exporting the model.
+ nms: True # Whether NMS is included in the network when exporting the model.
+ benchmark: False # Used to test model performance; if set to `True`, post-processing and NMS will not be exported.
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+architecture: YOLOv3
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192]
+ stage_num: 1
+ block_num: 3
+ act: 'swish'
+ spp: true
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: 100
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_x_300e_lvjian1.yml b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_x_300e_lvjian1.yml
new file mode 100644
index 0000000000000000000000000000000000000000..7a47aded5e8cea1ded2d916509f54d53157dd7be
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_x_300e_lvjian1.yml
@@ -0,0 +1,140 @@
+weights: output/ppyoloe_crn_x_300e_lvjian1/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_x_300e_coco.pdparams
+depth_mult: 1.33
+width_mult: 1.25
+
+worker_num: 4
+eval_height: &eval_height 640
+eval_width: &eval_width 640
+eval_size: &eval_size [*eval_height, *eval_width]
+
+metric: COCO
+num_classes: 5
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: train.json
+ dataset_dir: dataset/slice_lvjian1_data/train
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+TestDataset:
+ !ImageFolder
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+epoch: 30
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !CosineDecay
+ max_epochs: 36
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 3
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [640, 672, 704, 736, 768], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 8
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 2
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 1
+print_flops: false
+
+# Exporting the model
+export:
+ post_process: True # Whether post-processing is included in the network when exporting the model.
+ nms: True # Whether NMS is included in the network when exporting the model.
+ benchmark: False # Used to test model performance; if set to `True`, post-processing and NMS will not be exported.
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+architecture: YOLOv3
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192]
+ stage_num: 1
+ block_num: 3
+ act: 'swish'
+ spp: true
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: 100
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_x_300e_lvjian1_1024.yml b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_x_300e_lvjian1_1024.yml
new file mode 100644
index 0000000000000000000000000000000000000000..c1e70d2198f2af380c5cc9ab80704a9861f11c00
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_x_300e_lvjian1_1024.yml
@@ -0,0 +1,140 @@
+weights: output/ppyoloe_crn_x_300e_lvjian1_1024/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_x_300e_coco.pdparams
+depth_mult: 1.33
+width_mult: 1.25
+
+worker_num: 2
+eval_height: &eval_height 1024
+eval_width: &eval_width 1024
+eval_size: &eval_size [*eval_height, *eval_width]
+
+metric: COCO
+num_classes: 5
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: train.json
+ dataset_dir: dataset/slice_lvjian1_data/train
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+TestDataset:
+ !ImageFolder
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+epoch: 30
+LearningRate:
+ base_lr: 0.0005
+ schedulers:
+ - !CosineDecay
+ max_epochs: 36
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 3
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [960, 992, 1024, 1056, 1088], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 2
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 2
+print_flops: false
+
+# Exporting the model
+export:
+ post_process: True # Whether post-processing is included in the network when exporting the model.
+ nms: True # Whether NMS is included in the network when exporting the model.
+ benchmark: False # Used to test model performance; if set to `True`, post-processing and NMS will not be exported.
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+architecture: YOLOv3
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192]
+ stage_num: 1
+ block_num: 3
+ act: 'swish'
+ spp: true
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: 100
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_x_300e_renche.yml b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_x_300e_renche.yml
new file mode 100644
index 0000000000000000000000000000000000000000..be3f79044af32b12bba0e5aa13059585fd65d9ab
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_x_300e_renche.yml
@@ -0,0 +1,140 @@
+weights: output/ppyoloe_crn_x_300e_renche/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_x_300e_coco.pdparams
+depth_mult: 1.33
+width_mult: 1.25
+
+worker_num: 4
+eval_height: &eval_height 640
+eval_width: &eval_width 640
+eval_size: &eval_size [*eval_height, *eval_width]
+
+metric: COCO
+num_classes: 22
+
+TrainDataset:
+ !COCODataSet
+ image_dir: train_images
+ anno_path: train.json
+ dataset_dir: dataset/renche
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: train_images
+ anno_path: test.json
+ dataset_dir: dataset/renche
+
+TestDataset:
+ !ImageFolder
+ anno_path: test.json
+ dataset_dir: dataset/renche
+
+epoch: 30
+LearningRate:
+ base_lr: 0.0005
+ schedulers:
+ - !CosineDecay
+ max_epochs: 36
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 3
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [512, 544, 576, 608, 640, 672, 704, 736, 768], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 2
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 5
+print_flops: false
+
+# Exporting the model
+export:
+  post_process: True  # Whether post-processing is included in the network when exporting the model.
+  nms: True           # Whether NMS is included in the network when exporting the model.
+  benchmark: False    # Used for benchmarking model performance; if set `True`, post-processing and NMS will not be exported.
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+architecture: YOLOv3
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192]
+ stage_num: 1
+ block_num: 3
+ act: 'swish'
+ spp: true
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: 100
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_x_300e_renche_1024.yml b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_x_300e_renche_1024.yml
new file mode 100644
index 0000000000000000000000000000000000000000..250251c32504ced54291d2b5449e1ffdafb8b3ea
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/ppyoloe/ppyoloe_crn_x_300e_renche_1024.yml
@@ -0,0 +1,140 @@
+weights: output/ppyoloe_crn_x_300e_renche_1024/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/ppyoloe_crn_x_300e_coco.pdparams
+depth_mult: 1.33
+width_mult: 1.25
+
+worker_num: 4
+eval_height: &eval_height 1024
+eval_width: &eval_width 1024
+eval_size: &eval_size [*eval_height, *eval_width]
+
+metric: COCO
+num_classes: 22
+
+TrainDataset:
+ !COCODataSet
+ image_dir: train_images
+ anno_path: train.json
+ dataset_dir: dataset/renche
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: train_images
+ anno_path: test.json
+ dataset_dir: dataset/renche
+
+TestDataset:
+ !ImageFolder
+ anno_path: test.json
+ dataset_dir: dataset/renche
+
+epoch: 30
+LearningRate:
+ base_lr: 0.0005
+ schedulers:
+ - !CosineDecay
+ max_epochs: 36
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 3
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [960, 992, 1024, 1056, 1088], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 2
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 5
+print_flops: false
+
+# Exporting the model
+export:
+  post_process: True  # Whether post-processing is included in the network when exporting the model.
+  nms: True           # Whether NMS is included in the network when exporting the model.
+  benchmark: False    # Used for benchmarking model performance; if set `True`, post-processing and NMS will not be exported.
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+architecture: YOLOv3
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+YOLOv3:
+ backbone: CSPResNet
+ neck: CustomCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
+
+CustomCSPPAN:
+ out_channels: [768, 384, 192]
+ stage_num: 1
+ block_num: 3
+ act: 'swish'
+ spp: true
+
+PPYOLOEHead:
+ fpn_strides: [32, 16, 8]
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ static_assigner_epoch: 100
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/smrt/rcnn/cascade_rcnn_r50_vd_fpn_ssld_2x_1500_battery.yml b/PaddleDetection-release-2.6/configs/smrt/rcnn/cascade_rcnn_r50_vd_fpn_ssld_2x_1500_battery.yml
new file mode 100644
index 0000000000000000000000000000000000000000..128328bf3853bff327b47bb1945908c338b3dcb8
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/rcnn/cascade_rcnn_r50_vd_fpn_ssld_2x_1500_battery.yml
@@ -0,0 +1,168 @@
+weights: output/cascade_rcnn_r50_vd_fpn_ssld_2x_1500_battery/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/cascade_rcnn_r50_vd_fpn_ssld_2x_coco.pdparams
+
+metric: COCO
+num_classes: 45
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/train.json
+ dataset_dir: dataset/battery_mini
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/test.json
+ dataset_dir: dataset/battery_mini
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/test.json
+ dataset_dir: dataset/battery_mini
+
+epoch: 24
+LearningRate:
+ base_lr: 0.00025
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [12, 22]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
+
+
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResize: {target_size: [1350,1425,1500,1575,1650], interp: 2, keep_ratio: True}
+ - RandomFlip: {prob: 0.5}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [1500, 1500], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [1500, 1500], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 5
+print_flops: false
+find_unused_parameters: True
+
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
+
+architecture: CascadeRCNN
+
+CascadeRCNN:
+ backbone: ResNet
+ neck: FPN
+ rpn_head: RPNHead
+ bbox_head: CascadeHead
+ # post process
+ bbox_post_process: BBoxPostProcess
+
+ResNet:
+ depth: 50
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ lr_mult_list: [0.05, 0.05, 0.1, 0.15]
+
+FPN:
+ out_channel: 256
+
+RPNHead:
+ anchor_generator:
+ aspect_ratios: [0.5, 1.0, 2.0]
+ anchor_sizes: [[32], [64], [128], [256], [512]]
+ strides: [4, 8, 16, 32, 64]
+ rpn_target_assign:
+ batch_size_per_im: 256
+ fg_fraction: 0.5
+ negative_overlap: 0.3
+ positive_overlap: 0.7
+ use_random: True
+ train_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 2000
+ post_nms_top_n: 2000
+ topk_after_collect: True
+ test_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 1000
+ post_nms_top_n: 1000
+
+
+CascadeHead:
+ head: CascadeTwoFCHead
+ roi_extractor:
+ resolution: 7
+ sampling_ratio: 0
+ aligned: True
+ bbox_assigner: BBoxAssigner
+
+BBoxAssigner:
+ batch_size_per_im: 512
+ bg_thresh: 0.5
+ fg_thresh: 0.5
+ fg_fraction: 0.25
+ cascade_iou: [0.5, 0.6, 0.7]
+ use_random: True
+
+CascadeTwoFCHead:
+ out_channel: 1024
+
+BBoxPostProcess:
+ decode:
+ name: RCNNBox
+ prior_box_var: [30.0, 30.0, 15.0, 15.0]
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.05
+ nms_threshold: 0.5
diff --git a/PaddleDetection-release-2.6/configs/smrt/rcnn/cascade_rcnn_r50_vd_fpn_ssld_2x_1500_lvjian1.yml b/PaddleDetection-release-2.6/configs/smrt/rcnn/cascade_rcnn_r50_vd_fpn_ssld_2x_1500_lvjian1.yml
new file mode 100644
index 0000000000000000000000000000000000000000..c6b4b8ce5c6ef099f9ba3ef9e603ddc4e273e413
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/rcnn/cascade_rcnn_r50_vd_fpn_ssld_2x_1500_lvjian1.yml
@@ -0,0 +1,168 @@
+weights: output/cascade_rcnn_r50_vd_fpn_ssld_2x_1500_lvjian1/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/cascade_rcnn_r50_vd_fpn_ssld_2x_coco.pdparams
+
+metric: COCO
+num_classes: 5
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: train.json
+ dataset_dir: dataset/slice_lvjian1_data/train/
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+TestDataset:
+ !ImageFolder
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+epoch: 24
+LearningRate:
+ base_lr: 0.00025
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [12, 22]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
+
+
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResize: {target_size: [1350,1425,1500,1575,1650], interp: 2, keep_ratio: True}
+ - RandomFlip: {prob: 0.5}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [1500, 1500], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [1500, 1500], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 5
+print_flops: false
+find_unused_parameters: True
+
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
+
+architecture: CascadeRCNN
+
+CascadeRCNN:
+ backbone: ResNet
+ neck: FPN
+ rpn_head: RPNHead
+ bbox_head: CascadeHead
+ # post process
+ bbox_post_process: BBoxPostProcess
+
+ResNet:
+ depth: 50
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ lr_mult_list: [0.05, 0.05, 0.1, 0.15]
+
+FPN:
+ out_channel: 256
+
+RPNHead:
+ anchor_generator:
+ aspect_ratios: [0.5, 1.0, 2.0]
+ anchor_sizes: [[32], [64], [128], [256], [512]]
+ strides: [4, 8, 16, 32, 64]
+ rpn_target_assign:
+ batch_size_per_im: 256
+ fg_fraction: 0.5
+ negative_overlap: 0.3
+ positive_overlap: 0.7
+ use_random: True
+ train_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 2000
+ post_nms_top_n: 2000
+ topk_after_collect: True
+ test_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 1000
+ post_nms_top_n: 1000
+
+
+CascadeHead:
+ head: CascadeTwoFCHead
+ roi_extractor:
+ resolution: 7
+ sampling_ratio: 0
+ aligned: True
+ bbox_assigner: BBoxAssigner
+
+BBoxAssigner:
+ batch_size_per_im: 512
+ bg_thresh: 0.5
+ fg_thresh: 0.5
+ fg_fraction: 0.25
+ cascade_iou: [0.5, 0.6, 0.7]
+ use_random: True
+
+CascadeTwoFCHead:
+ out_channel: 1024
+
+BBoxPostProcess:
+ decode:
+ name: RCNNBox
+ prior_box_var: [30.0, 30.0, 15.0, 15.0]
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.05
+ nms_threshold: 0.5
diff --git a/PaddleDetection-release-2.6/configs/smrt/rcnn/cascade_rcnn_r50_vd_fpn_ssld_2x_1500_renche.yml b/PaddleDetection-release-2.6/configs/smrt/rcnn/cascade_rcnn_r50_vd_fpn_ssld_2x_1500_renche.yml
new file mode 100644
index 0000000000000000000000000000000000000000..ef11461339080740eb3ac2414eda709f10b00ddb
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/rcnn/cascade_rcnn_r50_vd_fpn_ssld_2x_1500_renche.yml
@@ -0,0 +1,168 @@
+weights: output/cascade_rcnn_r50_vd_fpn_ssld_2x_1500_renche/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/cascade_rcnn_r50_vd_fpn_ssld_2x_coco.pdparams
+
+metric: COCO
+num_classes: 22
+
+TrainDataset:
+ !COCODataSet
+ image_dir: train_images
+ anno_path: train.json
+ dataset_dir: dataset/renche
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: train_images
+ anno_path: test.json
+ dataset_dir: dataset/renche
+
+TestDataset:
+ !ImageFolder
+ anno_path: test.json
+ dataset_dir: dataset/renche
+
+epoch: 24
+LearningRate:
+ base_lr: 0.00025
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [12, 22]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
+
+
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResize: {target_size: [1350,1425,1500,1575,1650], interp: 2, keep_ratio: True}
+ - RandomFlip: {prob: 0.5}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [1500, 1500], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [1500, 1500], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 5
+print_flops: false
+find_unused_parameters: True
+
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
+
+architecture: CascadeRCNN
+
+CascadeRCNN:
+ backbone: ResNet
+ neck: FPN
+ rpn_head: RPNHead
+ bbox_head: CascadeHead
+ # post process
+ bbox_post_process: BBoxPostProcess
+
+ResNet:
+ depth: 50
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ lr_mult_list: [0.05, 0.05, 0.1, 0.15]
+
+FPN:
+ out_channel: 256
+
+RPNHead:
+ anchor_generator:
+ aspect_ratios: [0.5, 1.0, 2.0]
+ anchor_sizes: [[32], [64], [128], [256], [512]]
+ strides: [4, 8, 16, 32, 64]
+ rpn_target_assign:
+ batch_size_per_im: 256
+ fg_fraction: 0.5
+ negative_overlap: 0.3
+ positive_overlap: 0.7
+ use_random: True
+ train_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 2000
+ post_nms_top_n: 2000
+ topk_after_collect: True
+ test_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 1000
+ post_nms_top_n: 1000
+
+
+CascadeHead:
+ head: CascadeTwoFCHead
+ roi_extractor:
+ resolution: 7
+ sampling_ratio: 0
+ aligned: True
+ bbox_assigner: BBoxAssigner
+
+BBoxAssigner:
+ batch_size_per_im: 512
+ bg_thresh: 0.5
+ fg_thresh: 0.5
+ fg_fraction: 0.25
+ cascade_iou: [0.5, 0.6, 0.7]
+ use_random: True
+
+CascadeTwoFCHead:
+ out_channel: 1024
+
+BBoxPostProcess:
+ decode:
+ name: RCNNBox
+ prior_box_var: [30.0, 30.0, 15.0, 15.0]
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.05
+ nms_threshold: 0.5
diff --git a/PaddleDetection-release-2.6/configs/smrt/rcnn/cascade_rcnn_r50_vd_fpn_ssld_2x_800_battery.yml b/PaddleDetection-release-2.6/configs/smrt/rcnn/cascade_rcnn_r50_vd_fpn_ssld_2x_800_battery.yml
new file mode 100644
index 0000000000000000000000000000000000000000..20025b07da573fbb7cff5936c50509358b85aa99
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/rcnn/cascade_rcnn_r50_vd_fpn_ssld_2x_800_battery.yml
@@ -0,0 +1,168 @@
+weights: output/cascade_rcnn_r50_vd_fpn_ssld_2x_800_battery/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/cascade_rcnn_r50_vd_fpn_ssld_2x_coco.pdparams
+
+metric: COCO
+num_classes: 45
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/train.json
+ dataset_dir: dataset/battery_mini
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/test.json
+ dataset_dir: dataset/battery_mini
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/test.json
+ dataset_dir: dataset/battery_mini
+
+epoch: 24
+LearningRate:
+ base_lr: 0.00025
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [12, 22]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
+
+
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], interp: 2, keep_ratio: True}
+ - RandomFlip: {prob: 0.5}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 5
+print_flops: false
+find_unused_parameters: True
+
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
+
+architecture: CascadeRCNN
+
+CascadeRCNN:
+ backbone: ResNet
+ neck: FPN
+ rpn_head: RPNHead
+ bbox_head: CascadeHead
+ # post process
+ bbox_post_process: BBoxPostProcess
+
+ResNet:
+ depth: 50
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ lr_mult_list: [0.05, 0.05, 0.1, 0.15]
+
+FPN:
+ out_channel: 256
+
+RPNHead:
+ anchor_generator:
+ aspect_ratios: [0.5, 1.0, 2.0]
+ anchor_sizes: [[32], [64], [128], [256], [512]]
+ strides: [4, 8, 16, 32, 64]
+ rpn_target_assign:
+ batch_size_per_im: 256
+ fg_fraction: 0.5
+ negative_overlap: 0.3
+ positive_overlap: 0.7
+ use_random: True
+ train_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 2000
+ post_nms_top_n: 2000
+ topk_after_collect: True
+ test_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 1000
+ post_nms_top_n: 1000
+
+
+CascadeHead:
+ head: CascadeTwoFCHead
+ roi_extractor:
+ resolution: 7
+ sampling_ratio: 0
+ aligned: True
+ bbox_assigner: BBoxAssigner
+
+BBoxAssigner:
+ batch_size_per_im: 512
+ bg_thresh: 0.5
+ fg_thresh: 0.5
+ fg_fraction: 0.25
+ cascade_iou: [0.5, 0.6, 0.7]
+ use_random: True
+
+CascadeTwoFCHead:
+ out_channel: 1024
+
+BBoxPostProcess:
+ decode:
+ name: RCNNBox
+ prior_box_var: [30.0, 30.0, 15.0, 15.0]
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.05
+ nms_threshold: 0.5
diff --git a/PaddleDetection-release-2.6/configs/smrt/rcnn/cascade_rcnn_r50_vd_fpn_ssld_2x_800_lvjian1.yml b/PaddleDetection-release-2.6/configs/smrt/rcnn/cascade_rcnn_r50_vd_fpn_ssld_2x_800_lvjian1.yml
new file mode 100644
index 0000000000000000000000000000000000000000..6e0352a1952c58a3d168787364f0b2b77fede322
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/rcnn/cascade_rcnn_r50_vd_fpn_ssld_2x_800_lvjian1.yml
@@ -0,0 +1,168 @@
+weights: output/cascade_rcnn_r50_vd_fpn_ssld_2x_800_lvjian1/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/cascade_rcnn_r50_vd_fpn_ssld_2x_coco.pdparams
+
+metric: COCO
+num_classes: 5
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: train.json
+ dataset_dir: dataset/slice_lvjian1_data/train/
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+TestDataset:
+ !ImageFolder
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+epoch: 24
+LearningRate:
+ base_lr: 0.00025
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [12, 22]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
+
+
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], interp: 2, keep_ratio: True}
+ - RandomFlip: {prob: 0.5}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 5
+print_flops: false
+find_unused_parameters: True
+
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
+
+architecture: CascadeRCNN
+
+CascadeRCNN:
+ backbone: ResNet
+ neck: FPN
+ rpn_head: RPNHead
+ bbox_head: CascadeHead
+ # post process
+ bbox_post_process: BBoxPostProcess
+
+ResNet:
+ depth: 50
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ lr_mult_list: [0.05, 0.05, 0.1, 0.15]
+
+FPN:
+ out_channel: 256
+
+RPNHead:
+ anchor_generator:
+ aspect_ratios: [0.5, 1.0, 2.0]
+ anchor_sizes: [[32], [64], [128], [256], [512]]
+ strides: [4, 8, 16, 32, 64]
+ rpn_target_assign:
+ batch_size_per_im: 256
+ fg_fraction: 0.5
+ negative_overlap: 0.3
+ positive_overlap: 0.7
+ use_random: True
+ train_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 2000
+ post_nms_top_n: 2000
+ topk_after_collect: True
+ test_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 1000
+ post_nms_top_n: 1000
+
+
+CascadeHead:
+ head: CascadeTwoFCHead
+ roi_extractor:
+ resolution: 7
+ sampling_ratio: 0
+ aligned: True
+ bbox_assigner: BBoxAssigner
+
+BBoxAssigner:
+ batch_size_per_im: 512
+ bg_thresh: 0.5
+ fg_thresh: 0.5
+ fg_fraction: 0.25
+ cascade_iou: [0.5, 0.6, 0.7]
+ use_random: True
+
+CascadeTwoFCHead:
+ out_channel: 1024
+
+BBoxPostProcess:
+ decode:
+ name: RCNNBox
+ prior_box_var: [30.0, 30.0, 15.0, 15.0]
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.05
+ nms_threshold: 0.5
diff --git a/PaddleDetection-release-2.6/configs/smrt/rcnn/cascade_rcnn_r50_vd_fpn_ssld_2x_800_renche.yml b/PaddleDetection-release-2.6/configs/smrt/rcnn/cascade_rcnn_r50_vd_fpn_ssld_2x_800_renche.yml
new file mode 100644
index 0000000000000000000000000000000000000000..448b65db663322476f7f0db79fcd5e6a52982720
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/rcnn/cascade_rcnn_r50_vd_fpn_ssld_2x_800_renche.yml
@@ -0,0 +1,168 @@
+weights: output/cascade_rcnn_r50_vd_fpn_ssld_2x_800_renche/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/cascade_rcnn_r50_vd_fpn_ssld_2x_coco.pdparams
+
+metric: COCO
+num_classes: 22
+
+TrainDataset:
+ !COCODataSet
+ image_dir: train_images
+ anno_path: train.json
+ dataset_dir: dataset/renche
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: train_images
+ anno_path: test.json
+ dataset_dir: dataset/renche
+
+TestDataset:
+ !ImageFolder
+ anno_path: test.json
+ dataset_dir: dataset/renche
+
+epoch: 24
+LearningRate:
+ base_lr: 0.00025
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [12, 22]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
+
+
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], interp: 2, keep_ratio: True}
+ - RandomFlip: {prob: 0.5}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 5
+print_flops: false
+find_unused_parameters: True
+
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
+
+architecture: CascadeRCNN
+
+CascadeRCNN:
+ backbone: ResNet
+ neck: FPN
+ rpn_head: RPNHead
+ bbox_head: CascadeHead
+ # post process
+ bbox_post_process: BBoxPostProcess
+
+ResNet:
+ depth: 50
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ lr_mult_list: [0.05, 0.05, 0.1, 0.15]
+
+FPN:
+ out_channel: 256
+
+RPNHead:
+ anchor_generator:
+ aspect_ratios: [0.5, 1.0, 2.0]
+ anchor_sizes: [[32], [64], [128], [256], [512]]
+ strides: [4, 8, 16, 32, 64]
+ rpn_target_assign:
+ batch_size_per_im: 256
+ fg_fraction: 0.5
+ negative_overlap: 0.3
+ positive_overlap: 0.7
+ use_random: True
+ train_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 2000
+ post_nms_top_n: 2000
+ topk_after_collect: True
+ test_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 1000
+ post_nms_top_n: 1000
+
+
+CascadeHead:
+ head: CascadeTwoFCHead
+ roi_extractor:
+ resolution: 7
+ sampling_ratio: 0
+ aligned: True
+ bbox_assigner: BBoxAssigner
+
+BBoxAssigner:
+ batch_size_per_im: 512
+ bg_thresh: 0.5
+ fg_thresh: 0.5
+ fg_fraction: 0.25
+ cascade_iou: [0.5, 0.6, 0.7]
+ use_random: True
+
+CascadeTwoFCHead:
+ out_channel: 1024
+
+BBoxPostProcess:
+ decode:
+ name: RCNNBox
+ prior_box_var: [30.0, 30.0, 15.0, 15.0]
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.05
+ nms_threshold: 0.5
diff --git a/PaddleDetection-release-2.6/configs/smrt/rcnn/faster_rcnn_r101_vd_fpn_ssld_2x_1500_battery.yml b/PaddleDetection-release-2.6/configs/smrt/rcnn/faster_rcnn_r101_vd_fpn_ssld_2x_1500_battery.yml
new file mode 100644
index 0000000000000000000000000000000000000000..7e6b8871b9525a0f6775266298872178cf5b49aa
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/rcnn/faster_rcnn_r101_vd_fpn_ssld_2x_1500_battery.yml
@@ -0,0 +1,166 @@
+weights: output/faster_rcnn_r101_vd_fpn_ssld_2x_1500_battery/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/faster_rcnn_r101_vd_fpn_ssld_2x_coco.pdparams
+
+metric: COCO
+num_classes: 45
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/train.json
+ dataset_dir: dataset/battery_mini
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/test.json
+ dataset_dir: dataset/battery_mini
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/test.json
+ dataset_dir: dataset/battery_mini
+
+epoch: 24
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
+
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResize: {target_size: [[800, 800], [900, 900], [1000, 1000], [1200, 1200], [1400, 1400], [1500, 1500]], interp: 2, keep_ratio: True}
+ - RandomFlip: {prob: 0.5}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [1500, 1500], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [1500, 1500], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 2
+print_flops: false
+
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
+
+
+architecture: FasterRCNN
+
+FasterRCNN:
+ backbone: ResNet
+ neck: FPN
+ rpn_head: RPNHead
+ bbox_head: BBoxHead
+ # post process
+ bbox_post_process: BBoxPostProcess
+
+
+ResNet:
+ # index 0 stands for res2
+ depth: 101
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ lr_mult_list: [0.05, 0.05, 0.1, 0.15]
+
+FPN:
+ out_channel: 256
+
+RPNHead:
+ anchor_generator:
+ aspect_ratios: [0.5, 1.0, 2.0]
+ anchor_sizes: [[32], [64], [128], [256], [512]]
+ strides: [4, 8, 16, 32, 64]
+ rpn_target_assign:
+ batch_size_per_im: 256
+ fg_fraction: 0.5
+ negative_overlap: 0.3
+ positive_overlap: 0.7
+ use_random: True
+ train_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 2000
+ post_nms_top_n: 1000
+ topk_after_collect: True
+ test_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 1000
+ post_nms_top_n: 1000
+
+
+BBoxHead:
+ head: TwoFCHead
+ roi_extractor:
+ resolution: 7
+ sampling_ratio: 0
+ aligned: True
+ bbox_assigner: BBoxAssigner
+
+BBoxAssigner:
+ batch_size_per_im: 512
+ bg_thresh: 0.5
+ fg_thresh: 0.5
+ fg_fraction: 0.25
+ use_random: True
+
+TwoFCHead:
+ out_channel: 1024
+
+BBoxPostProcess:
+ decode: RCNNBox
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.05
+ nms_threshold: 0.5
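The `LearningRate` block above composes `PiecewiseDecay` (multiply by `gamma` at each epoch milestone) with `LinearWarmup` (ramp from `start_factor * base_lr` over the first `steps` iterations). A minimal sketch of how the two combine, assuming this standard reading; the function name and the `steps_per_epoch` parameter are illustrative, not part of the config:

```python
def learning_rate(step, steps_per_epoch, base_lr=0.001,
                  warmup_steps=1000, start_factor=0.1,
                  milestones=(16, 22), gamma=0.1):
    # PiecewiseDecay: one factor of gamma per milestone (in epochs) already passed.
    epoch = step / steps_per_epoch
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    # LinearWarmup: ramp the factor from start_factor up to 1 over warmup_steps.
    if step < warmup_steps:
        alpha = step / warmup_steps
        lr *= start_factor * (1 - alpha) + alpha
    return lr
```

With these values the LR starts at 1e-4, reaches 1e-3 at step 1000, then drops to 1e-4 at epoch 16 and 1e-5 at epoch 22.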
diff --git a/PaddleDetection-release-2.6/configs/smrt/rcnn/faster_rcnn_r101_vd_fpn_ssld_2x_1500_lvjian1.yml b/PaddleDetection-release-2.6/configs/smrt/rcnn/faster_rcnn_r101_vd_fpn_ssld_2x_1500_lvjian1.yml
new file mode 100644
index 0000000000000000000000000000000000000000..190ed8fa183656127445602792df861b8018e938
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/rcnn/faster_rcnn_r101_vd_fpn_ssld_2x_1500_lvjian1.yml
@@ -0,0 +1,166 @@
+weights: output/faster_rcnn_r101_vd_fpn_ssld_2x_1500_lvjian1/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/faster_rcnn_r101_vd_fpn_ssld_2x_coco.pdparams
+
+metric: COCO
+num_classes: 5
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: train.json
+ dataset_dir: dataset/slice_lvjian1_data/train/
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+TestDataset:
+ !ImageFolder
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+epoch: 24
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
+
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResize: {target_size: [[800, 800], [900, 900], [1000, 1000], [1200, 1200], [1400, 1400], [1500, 1500]], interp: 2, keep_ratio: True}
+ - RandomFlip: {prob: 0.5}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [1500, 1500], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [1500, 1500], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 2
+print_flops: false
+
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
+
+
+architecture: FasterRCNN
+
+FasterRCNN:
+ backbone: ResNet
+ neck: FPN
+ rpn_head: RPNHead
+ bbox_head: BBoxHead
+ # post process
+ bbox_post_process: BBoxPostProcess
+
+
+ResNet:
+ # index 0 stands for res2
+ depth: 101
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ lr_mult_list: [0.05, 0.05, 0.1, 0.15]
+
+FPN:
+ out_channel: 256
+
+RPNHead:
+ anchor_generator:
+ aspect_ratios: [0.5, 1.0, 2.0]
+ anchor_sizes: [[32], [64], [128], [256], [512]]
+ strides: [4, 8, 16, 32, 64]
+ rpn_target_assign:
+ batch_size_per_im: 256
+ fg_fraction: 0.5
+ negative_overlap: 0.3
+ positive_overlap: 0.7
+ use_random: True
+ train_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 2000
+ post_nms_top_n: 1000
+ topk_after_collect: True
+ test_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 1000
+ post_nms_top_n: 1000
+
+
+BBoxHead:
+ head: TwoFCHead
+ roi_extractor:
+ resolution: 7
+ sampling_ratio: 0
+ aligned: True
+ bbox_assigner: BBoxAssigner
+
+BBoxAssigner:
+ batch_size_per_im: 512
+ bg_thresh: 0.5
+ fg_thresh: 0.5
+ fg_fraction: 0.25
+ use_random: True
+
+TwoFCHead:
+ out_channel: 1024
+
+BBoxPostProcess:
+ decode: RCNNBox
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.05
+ nms_threshold: 0.5
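The `NormalizeImage` and `Permute` transforms used in every reader above amount to ImageNet-style normalization followed by an HWC-to-CHW layout change. A rough NumPy equivalent (a sketch of the semantics, not PaddleDetection's actual implementation):

```python
import numpy as np

def normalize_and_permute(img, mean=(0.485, 0.456, 0.406),
                          std=(0.229, 0.224, 0.225)):
    # is_scale: true -- bring uint8 pixels into [0, 1] before normalizing.
    x = img.astype(np.float32) / 255.0
    x = (x - np.asarray(mean, np.float32)) / np.asarray(std, np.float32)
    # Permute -- HWC to CHW, the layout the backbone consumes.
    return x.transpose(2, 0, 1)
```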
diff --git a/PaddleDetection-release-2.6/configs/smrt/rcnn/faster_rcnn_r101_vd_fpn_ssld_2x_1500_renche.yml b/PaddleDetection-release-2.6/configs/smrt/rcnn/faster_rcnn_r101_vd_fpn_ssld_2x_1500_renche.yml
new file mode 100644
index 0000000000000000000000000000000000000000..947c6e43bc6ff42f150566b7ef1e9713cd749926
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/rcnn/faster_rcnn_r101_vd_fpn_ssld_2x_1500_renche.yml
@@ -0,0 +1,166 @@
+weights: output/faster_rcnn_r101_vd_fpn_ssld_2x_1500_renche/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/faster_rcnn_r101_vd_fpn_ssld_2x_coco.pdparams
+
+metric: COCO
+num_classes: 22
+
+TrainDataset:
+ !COCODataSet
+ image_dir: train_images
+ anno_path: train.json
+ dataset_dir: dataset/renche
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: train_images
+ anno_path: test.json
+ dataset_dir: dataset/renche
+
+TestDataset:
+ !ImageFolder
+ anno_path: test.json
+ dataset_dir: dataset/renche
+
+epoch: 24
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
+
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResize: {target_size: [[800, 800], [900, 900], [1000, 1000], [1200, 1200], [1400, 1400], [1500, 1500]], interp: 2, keep_ratio: True}
+ - RandomFlip: {prob: 0.5}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [1500, 1500], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [1500, 1500], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 2
+print_flops: false
+
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
+
+
+architecture: FasterRCNN
+
+FasterRCNN:
+ backbone: ResNet
+ neck: FPN
+ rpn_head: RPNHead
+ bbox_head: BBoxHead
+ # post process
+ bbox_post_process: BBoxPostProcess
+
+
+ResNet:
+ # index 0 stands for res2
+ depth: 101
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ lr_mult_list: [0.05, 0.05, 0.1, 0.15]
+
+FPN:
+ out_channel: 256
+
+RPNHead:
+ anchor_generator:
+ aspect_ratios: [0.5, 1.0, 2.0]
+ anchor_sizes: [[32], [64], [128], [256], [512]]
+ strides: [4, 8, 16, 32, 64]
+ rpn_target_assign:
+ batch_size_per_im: 256
+ fg_fraction: 0.5
+ negative_overlap: 0.3
+ positive_overlap: 0.7
+ use_random: True
+ train_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 2000
+ post_nms_top_n: 1000
+ topk_after_collect: True
+ test_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 1000
+ post_nms_top_n: 1000
+
+
+BBoxHead:
+ head: TwoFCHead
+ roi_extractor:
+ resolution: 7
+ sampling_ratio: 0
+ aligned: True
+ bbox_assigner: BBoxAssigner
+
+BBoxAssigner:
+ batch_size_per_im: 512
+ bg_thresh: 0.5
+ fg_thresh: 0.5
+ fg_fraction: 0.25
+ use_random: True
+
+TwoFCHead:
+ out_channel: 1024
+
+BBoxPostProcess:
+ decode: RCNNBox
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.05
+ nms_threshold: 0.5
diff --git a/PaddleDetection-release-2.6/configs/smrt/rcnn/faster_rcnn_r101_vd_fpn_ssld_2x_800_battery.yml b/PaddleDetection-release-2.6/configs/smrt/rcnn/faster_rcnn_r101_vd_fpn_ssld_2x_800_battery.yml
new file mode 100644
index 0000000000000000000000000000000000000000..148a0459e8e8f5aea9b74d1e943852c82f524127
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/rcnn/faster_rcnn_r101_vd_fpn_ssld_2x_800_battery.yml
@@ -0,0 +1,167 @@
+weights: output/faster_rcnn_r101_vd_fpn_ssld_2x_800_battery/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/faster_rcnn_r101_vd_fpn_ssld_2x_coco.pdparams
+
+metric: COCO
+num_classes: 45
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/train.json
+ dataset_dir: dataset/battery_mini
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/test.json
+ dataset_dir: dataset/battery_mini
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/test.json
+ dataset_dir: dataset/battery_mini
+
+epoch: 24
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
+
+
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], interp: 2, keep_ratio: True}
+ - RandomFlip: {prob: 0.5}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 2
+print_flops: false
+
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
+
+
+architecture: FasterRCNN
+
+FasterRCNN:
+ backbone: ResNet
+ neck: FPN
+ rpn_head: RPNHead
+ bbox_head: BBoxHead
+ # post process
+ bbox_post_process: BBoxPostProcess
+
+
+ResNet:
+ # index 0 stands for res2
+ depth: 101
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ lr_mult_list: [0.05, 0.05, 0.1, 0.15]
+
+FPN:
+ out_channel: 256
+
+RPNHead:
+ anchor_generator:
+ aspect_ratios: [0.5, 1.0, 2.0]
+ anchor_sizes: [[32], [64], [128], [256], [512]]
+ strides: [4, 8, 16, 32, 64]
+ rpn_target_assign:
+ batch_size_per_im: 256
+ fg_fraction: 0.5
+ negative_overlap: 0.3
+ positive_overlap: 0.7
+ use_random: True
+ train_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 2000
+ post_nms_top_n: 1000
+ topk_after_collect: True
+ test_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 1000
+ post_nms_top_n: 1000
+
+
+BBoxHead:
+ head: TwoFCHead
+ roi_extractor:
+ resolution: 7
+ sampling_ratio: 0
+ aligned: True
+ bbox_assigner: BBoxAssigner
+
+BBoxAssigner:
+ batch_size_per_im: 512
+ bg_thresh: 0.5
+ fg_thresh: 0.5
+ fg_fraction: 0.25
+ use_random: True
+
+TwoFCHead:
+ out_channel: 1024
+
+BBoxPostProcess:
+ decode: RCNNBox
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.05
+ nms_threshold: 0.5
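In the 800-pixel configs, `Resize` with `keep_ratio: True` and `target_size: [800, 1333]` scales the short side toward 800 while capping the long side at 1333. A small sketch of that scale selection (the helper name is ours, assuming the usual short-side/long-side convention):

```python
def keep_ratio_size(h, w, target=(800, 1333)):
    # Scale so the short side reaches target[0], but never let the
    # long side exceed target[1]; one uniform scale preserves aspect ratio.
    short_t, long_t = min(target), max(target)
    scale = min(short_t / min(h, w), long_t / max(h, w))
    return round(h * scale), round(w * scale)
```

For a 1080x1920 frame the long-side cap wins, giving roughly 750x1333 rather than 800x1422.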
diff --git a/PaddleDetection-release-2.6/configs/smrt/rcnn/faster_rcnn_r101_vd_fpn_ssld_2x_800_lvjian1.yml b/PaddleDetection-release-2.6/configs/smrt/rcnn/faster_rcnn_r101_vd_fpn_ssld_2x_800_lvjian1.yml
new file mode 100644
index 0000000000000000000000000000000000000000..9362638d3f05b30c2274e199410e7fa509e0eb10
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/rcnn/faster_rcnn_r101_vd_fpn_ssld_2x_800_lvjian1.yml
@@ -0,0 +1,167 @@
+weights: output/faster_rcnn_r101_vd_fpn_ssld_2x_800_lvjian1/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/faster_rcnn_r101_vd_fpn_ssld_2x_coco.pdparams
+
+metric: COCO
+num_classes: 5
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: train.json
+ dataset_dir: dataset/slice_lvjian1_data/train/
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+TestDataset:
+ !ImageFolder
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+epoch: 24
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
+
+
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], interp: 2, keep_ratio: True}
+ - RandomFlip: {prob: 0.5}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 2
+print_flops: false
+
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
+
+
+architecture: FasterRCNN
+
+FasterRCNN:
+ backbone: ResNet
+ neck: FPN
+ rpn_head: RPNHead
+ bbox_head: BBoxHead
+ # post process
+ bbox_post_process: BBoxPostProcess
+
+
+ResNet:
+ # index 0 stands for res2
+ depth: 101
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ lr_mult_list: [0.05, 0.05, 0.1, 0.15]
+
+FPN:
+ out_channel: 256
+
+RPNHead:
+ anchor_generator:
+ aspect_ratios: [0.5, 1.0, 2.0]
+ anchor_sizes: [[32], [64], [128], [256], [512]]
+ strides: [4, 8, 16, 32, 64]
+ rpn_target_assign:
+ batch_size_per_im: 256
+ fg_fraction: 0.5
+ negative_overlap: 0.3
+ positive_overlap: 0.7
+ use_random: True
+ train_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 2000
+ post_nms_top_n: 1000
+ topk_after_collect: True
+ test_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 1000
+ post_nms_top_n: 1000
+
+
+BBoxHead:
+ head: TwoFCHead
+ roi_extractor:
+ resolution: 7
+ sampling_ratio: 0
+ aligned: True
+ bbox_assigner: BBoxAssigner
+
+BBoxAssigner:
+ batch_size_per_im: 512
+ bg_thresh: 0.5
+ fg_thresh: 0.5
+ fg_fraction: 0.25
+ use_random: True
+
+TwoFCHead:
+ out_channel: 1024
+
+BBoxPostProcess:
+ decode: RCNNBox
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.05
+ nms_threshold: 0.5
diff --git a/PaddleDetection-release-2.6/configs/smrt/rcnn/faster_rcnn_r101_vd_fpn_ssld_2x_800_renche.yml b/PaddleDetection-release-2.6/configs/smrt/rcnn/faster_rcnn_r101_vd_fpn_ssld_2x_800_renche.yml
new file mode 100644
index 0000000000000000000000000000000000000000..bf881d55a0808df85739784270373e1ada4d9f3a
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/rcnn/faster_rcnn_r101_vd_fpn_ssld_2x_800_renche.yml
@@ -0,0 +1,167 @@
+weights: output/faster_rcnn_r101_vd_fpn_ssld_2x_800_renche/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/faster_rcnn_r101_vd_fpn_ssld_2x_coco.pdparams
+
+metric: COCO
+num_classes: 22
+
+TrainDataset:
+ !COCODataSet
+ image_dir: train_images
+ anno_path: train.json
+ dataset_dir: dataset/renche
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: train_images
+ anno_path: test.json
+ dataset_dir: dataset/renche
+
+TestDataset:
+ !ImageFolder
+ anno_path: test.json
+ dataset_dir: dataset/renche
+
+epoch: 24
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
+
+
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], interp: 2, keep_ratio: True}
+ - RandomFlip: {prob: 0.5}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 2
+print_flops: false
+
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
+
+
+architecture: FasterRCNN
+
+FasterRCNN:
+ backbone: ResNet
+ neck: FPN
+ rpn_head: RPNHead
+ bbox_head: BBoxHead
+ # post process
+ bbox_post_process: BBoxPostProcess
+
+
+ResNet:
+ # index 0 stands for res2
+ depth: 101
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ lr_mult_list: [0.05, 0.05, 0.1, 0.15]
+
+FPN:
+ out_channel: 256
+
+RPNHead:
+ anchor_generator:
+ aspect_ratios: [0.5, 1.0, 2.0]
+ anchor_sizes: [[32], [64], [128], [256], [512]]
+ strides: [4, 8, 16, 32, 64]
+ rpn_target_assign:
+ batch_size_per_im: 256
+ fg_fraction: 0.5
+ negative_overlap: 0.3
+ positive_overlap: 0.7
+ use_random: True
+ train_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 2000
+ post_nms_top_n: 1000
+ topk_after_collect: True
+ test_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 1000
+ post_nms_top_n: 1000
+
+
+BBoxHead:
+ head: TwoFCHead
+ roi_extractor:
+ resolution: 7
+ sampling_ratio: 0
+ aligned: True
+ bbox_assigner: BBoxAssigner
+
+BBoxAssigner:
+ batch_size_per_im: 512
+ bg_thresh: 0.5
+ fg_thresh: 0.5
+ fg_fraction: 0.25
+ use_random: True
+
+TwoFCHead:
+ out_channel: 1024
+
+BBoxPostProcess:
+ decode: RCNNBox
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.05
+ nms_threshold: 0.5
diff --git a/PaddleDetection-release-2.6/configs/smrt/rcnn/faster_rcnn_r50_vd_fpn_ssld_2x_1500_battery.yml b/PaddleDetection-release-2.6/configs/smrt/rcnn/faster_rcnn_r50_vd_fpn_ssld_2x_1500_battery.yml
new file mode 100644
index 0000000000000000000000000000000000000000..688ea9bfdf6160715343d18c5b9ea83a27b6bc8e
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/rcnn/faster_rcnn_r50_vd_fpn_ssld_2x_1500_battery.yml
@@ -0,0 +1,166 @@
+weights: output/faster_rcnn_r50_vd_fpn_ssld_2x_1500_battery/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/faster_rcnn_r50_vd_fpn_ssld_2x_coco.pdparams
+
+metric: COCO
+num_classes: 45
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/train.json
+ dataset_dir: dataset/battery_mini
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/test.json
+ dataset_dir: dataset/battery_mini
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/test.json
+ dataset_dir: dataset/battery_mini
+
+epoch: 24
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
+
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResize: {target_size: [[800, 800], [900, 900], [1000, 1000], [1200, 1200], [1400, 1400], [1500, 1500]], interp: 2, keep_ratio: True}
+ - RandomFlip: {prob: 0.5}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [1500, 1500], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [1500, 1500], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 2
+print_flops: false
+
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
+
+
+architecture: FasterRCNN
+
+FasterRCNN:
+ backbone: ResNet
+ neck: FPN
+ rpn_head: RPNHead
+ bbox_head: BBoxHead
+ # post process
+ bbox_post_process: BBoxPostProcess
+
+
+ResNet:
+ # index 0 stands for res2
+ depth: 50
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ lr_mult_list: [0.05, 0.05, 0.1, 0.15]
+
+FPN:
+ out_channel: 256
+
+RPNHead:
+ anchor_generator:
+ aspect_ratios: [0.5, 1.0, 2.0]
+ anchor_sizes: [[32], [64], [128], [256], [512]]
+ strides: [4, 8, 16, 32, 64]
+ rpn_target_assign:
+ batch_size_per_im: 256
+ fg_fraction: 0.5
+ negative_overlap: 0.3
+ positive_overlap: 0.7
+ use_random: True
+ train_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 2000
+ post_nms_top_n: 1000
+ topk_after_collect: True
+ test_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 1000
+ post_nms_top_n: 1000
+
+
+BBoxHead:
+ head: TwoFCHead
+ roi_extractor:
+ resolution: 7
+ sampling_ratio: 0
+ aligned: True
+ bbox_assigner: BBoxAssigner
+
+BBoxAssigner:
+ batch_size_per_im: 512
+ bg_thresh: 0.5
+ fg_thresh: 0.5
+ fg_fraction: 0.25
+ use_random: True
+
+TwoFCHead:
+ out_channel: 1024
+
+BBoxPostProcess:
+ decode: RCNNBox
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.05
+ nms_threshold: 0.5
diff --git a/PaddleDetection-release-2.6/configs/smrt/rcnn/faster_rcnn_r50_vd_fpn_ssld_2x_1500_lvjian1.yml b/PaddleDetection-release-2.6/configs/smrt/rcnn/faster_rcnn_r50_vd_fpn_ssld_2x_1500_lvjian1.yml
new file mode 100644
index 0000000000000000000000000000000000000000..4b7d8e7d85a3cf61aadf2bfb276f1d325a712808
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/rcnn/faster_rcnn_r50_vd_fpn_ssld_2x_1500_lvjian1.yml
@@ -0,0 +1,166 @@
+weights: output/faster_rcnn_r50_vd_fpn_ssld_2x_1500_lvjian1/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/faster_rcnn_r50_vd_fpn_ssld_2x_coco.pdparams
+
+metric: COCO
+num_classes: 5
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: train.json
+ dataset_dir: dataset/slice_lvjian1_data/train/
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+TestDataset:
+ !ImageFolder
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+epoch: 24
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
+
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResize: {target_size: [[800, 800], [900, 900], [1000, 1000], [1200, 1200], [1400, 1400], [1500, 1500]], interp: 2, keep_ratio: True}
+ - RandomFlip: {prob: 0.5}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [1500, 1500], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [1500, 1500], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 2
+print_flops: false
+
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
+
+
+architecture: FasterRCNN
+
+FasterRCNN:
+ backbone: ResNet
+ neck: FPN
+ rpn_head: RPNHead
+ bbox_head: BBoxHead
+ # post process
+ bbox_post_process: BBoxPostProcess
+
+
+ResNet:
+ # index 0 stands for res2
+ depth: 50
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ lr_mult_list: [0.05, 0.05, 0.1, 0.15]
+
+FPN:
+ out_channel: 256
+
+RPNHead:
+ anchor_generator:
+ aspect_ratios: [0.5, 1.0, 2.0]
+ anchor_sizes: [[32], [64], [128], [256], [512]]
+ strides: [4, 8, 16, 32, 64]
+ rpn_target_assign:
+ batch_size_per_im: 256
+ fg_fraction: 0.5
+ negative_overlap: 0.3
+ positive_overlap: 0.7
+ use_random: True
+ train_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 2000
+ post_nms_top_n: 1000
+ topk_after_collect: True
+ test_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 1000
+ post_nms_top_n: 1000
+
+
+BBoxHead:
+ head: TwoFCHead
+ roi_extractor:
+ resolution: 7
+ sampling_ratio: 0
+ aligned: True
+ bbox_assigner: BBoxAssigner
+
+BBoxAssigner:
+ batch_size_per_im: 512
+ bg_thresh: 0.5
+ fg_thresh: 0.5
+ fg_fraction: 0.25
+ use_random: True
+
+TwoFCHead:
+ out_channel: 1024
+
+BBoxPostProcess:
+ decode: RCNNBox
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.05
+ nms_threshold: 0.5
diff --git a/PaddleDetection-release-2.6/configs/smrt/rcnn/faster_rcnn_r50_vd_fpn_ssld_2x_1500_renche.yml b/PaddleDetection-release-2.6/configs/smrt/rcnn/faster_rcnn_r50_vd_fpn_ssld_2x_1500_renche.yml
new file mode 100644
index 0000000000000000000000000000000000000000..39eca1f8ee87f21026ee483dc6c69e6f30ac9bf7
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/rcnn/faster_rcnn_r50_vd_fpn_ssld_2x_1500_renche.yml
@@ -0,0 +1,166 @@
+weights: output/faster_rcnn_r50_vd_fpn_ssld_2x_1500_renche/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/faster_rcnn_r50_vd_fpn_ssld_2x_coco.pdparams
+
+metric: COCO
+num_classes: 22
+
+TrainDataset:
+ !COCODataSet
+ image_dir: train_images
+ anno_path: train.json
+ dataset_dir: dataset/renche
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: train_images
+ anno_path: test.json
+ dataset_dir: dataset/renche
+
+TestDataset:
+ !ImageFolder
+ anno_path: test.json
+ dataset_dir: dataset/renche
+
+epoch: 24
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
+
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResize: {target_size: [[800, 800], [900, 900], [1000, 1000], [1200, 1200], [1400, 1400], [1500, 1500]], interp: 2, keep_ratio: True}
+ - RandomFlip: {prob: 0.5}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [1500, 1500], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [1500, 1500], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 2
+print_flops: false
+
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
+
+
+architecture: FasterRCNN
+
+FasterRCNN:
+ backbone: ResNet
+ neck: FPN
+ rpn_head: RPNHead
+ bbox_head: BBoxHead
+ # post process
+ bbox_post_process: BBoxPostProcess
+
+
+ResNet:
+ # index 0 stands for res2
+ depth: 50
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ lr_mult_list: [0.05, 0.05, 0.1, 0.15]
+
+FPN:
+ out_channel: 256
+
+RPNHead:
+ anchor_generator:
+ aspect_ratios: [0.5, 1.0, 2.0]
+ anchor_sizes: [[32], [64], [128], [256], [512]]
+ strides: [4, 8, 16, 32, 64]
+ rpn_target_assign:
+ batch_size_per_im: 256
+ fg_fraction: 0.5
+ negative_overlap: 0.3
+ positive_overlap: 0.7
+ use_random: True
+ train_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 2000
+ post_nms_top_n: 1000
+ topk_after_collect: True
+ test_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 1000
+ post_nms_top_n: 1000
+
+
+BBoxHead:
+ head: TwoFCHead
+ roi_extractor:
+ resolution: 7
+ sampling_ratio: 0
+ aligned: True
+ bbox_assigner: BBoxAssigner
+
+BBoxAssigner:
+ batch_size_per_im: 512
+ bg_thresh: 0.5
+ fg_thresh: 0.5
+ fg_fraction: 0.25
+ use_random: True
+
+TwoFCHead:
+ out_channel: 1024
+
+BBoxPostProcess:
+ decode: RCNNBox
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.05
+ nms_threshold: 0.5
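`PadBatch` with `pad_to_stride: -1`, used in every `batch_transforms` list here, zero-pads all images in a batch to the batch's maximum height and width; a positive value would additionally round that size up to a stride multiple. A NumPy sketch under that reading:

```python
import numpy as np

def pad_batch(images, pad_to_stride=-1):
    # Pad every CHW image to the largest H and W present in the batch.
    max_h = max(im.shape[1] for im in images)
    max_w = max(im.shape[2] for im in images)
    if pad_to_stride > 0:
        # Round the target size up to a multiple of the stride.
        max_h = int(np.ceil(max_h / pad_to_stride) * pad_to_stride)
        max_w = int(np.ceil(max_w / pad_to_stride) * pad_to_stride)
    out = []
    for im in images:
        canvas = np.zeros((im.shape[0], max_h, max_w), im.dtype)
        canvas[:, :im.shape[1], :im.shape[2]] = im
        out.append(canvas)
    return np.stack(out)
```

With FPN producing its own stride-aligned features, these configs leave `pad_to_stride` at -1 and only equalize shapes within the batch.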
diff --git a/PaddleDetection-release-2.6/configs/smrt/rcnn/faster_rcnn_r50_vd_fpn_ssld_2x_800_battery.yml b/PaddleDetection-release-2.6/configs/smrt/rcnn/faster_rcnn_r50_vd_fpn_ssld_2x_800_battery.yml
new file mode 100644
index 0000000000000000000000000000000000000000..7a982c06df9f32675c3de251f96e0b6477ea0943
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/rcnn/faster_rcnn_r50_vd_fpn_ssld_2x_800_battery.yml
@@ -0,0 +1,167 @@
+weights: output/faster_rcnn_r50_vd_fpn_ssld_2x_800_battery/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/faster_rcnn_r50_vd_fpn_ssld_2x_coco.pdparams
+
+metric: COCO
+num_classes: 45
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/train.json
+ dataset_dir: dataset/battery_mini
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: annotations/test.json
+ dataset_dir: dataset/battery_mini
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/test.json
+ dataset_dir: dataset/battery_mini
+
+epoch: 24
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
+
+
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], interp: 2, keep_ratio: True}
+ - RandomFlip: {prob: 0.5}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 2
+print_flops: false
+
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
+
+
+architecture: FasterRCNN
+
+FasterRCNN:
+ backbone: ResNet
+ neck: FPN
+ rpn_head: RPNHead
+ bbox_head: BBoxHead
+ # post process
+ bbox_post_process: BBoxPostProcess
+
+
+ResNet:
+ # index 0 stands for res2
+ depth: 50
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ lr_mult_list: [0.05, 0.05, 0.1, 0.15]
+
+FPN:
+ out_channel: 256
+
+RPNHead:
+ anchor_generator:
+ aspect_ratios: [0.5, 1.0, 2.0]
+ anchor_sizes: [[32], [64], [128], [256], [512]]
+ strides: [4, 8, 16, 32, 64]
+ rpn_target_assign:
+ batch_size_per_im: 256
+ fg_fraction: 0.5
+ negative_overlap: 0.3
+ positive_overlap: 0.7
+ use_random: True
+ train_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 2000
+ post_nms_top_n: 1000
+ topk_after_collect: True
+ test_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 1000
+ post_nms_top_n: 1000
+
+
+BBoxHead:
+ head: TwoFCHead
+ roi_extractor:
+ resolution: 7
+ sampling_ratio: 0
+ aligned: True
+ bbox_assigner: BBoxAssigner
+
+BBoxAssigner:
+ batch_size_per_im: 512
+ bg_thresh: 0.5
+ fg_thresh: 0.5
+ fg_fraction: 0.25
+ use_random: True
+
+TwoFCHead:
+ out_channel: 1024
+
+BBoxPostProcess:
+ decode: RCNNBox
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.05
+ nms_threshold: 0.5
diff --git a/PaddleDetection-release-2.6/configs/smrt/rcnn/faster_rcnn_r50_vd_fpn_ssld_2x_800_lvjian1.yml b/PaddleDetection-release-2.6/configs/smrt/rcnn/faster_rcnn_r50_vd_fpn_ssld_2x_800_lvjian1.yml
new file mode 100644
index 0000000000000000000000000000000000000000..39020c77e8ef1d47e9b3df08417f7f4c6a765249
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/rcnn/faster_rcnn_r50_vd_fpn_ssld_2x_800_lvjian1.yml
@@ -0,0 +1,167 @@
+weights: output/faster_rcnn_r50_vd_fpn_ssld_2x_800_lvjian1/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/faster_rcnn_r50_vd_fpn_ssld_2x_coco.pdparams
+
+metric: COCO
+num_classes: 5
+
+TrainDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: train.json
+ dataset_dir: dataset/slice_lvjian1_data/train/
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: images
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+TestDataset:
+ !ImageFolder
+ anno_path: val.json
+ dataset_dir: dataset/slice_lvjian1_data/eval
+
+epoch: 24
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
+
+
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], interp: 2, keep_ratio: True}
+ - RandomFlip: {prob: 0.5}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 2
+print_flops: false
+
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
+
+
+architecture: FasterRCNN
+
+FasterRCNN:
+ backbone: ResNet
+ neck: FPN
+ rpn_head: RPNHead
+ bbox_head: BBoxHead
+ # post process
+ bbox_post_process: BBoxPostProcess
+
+
+ResNet:
+ # index 0 stands for res2
+ depth: 50
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ lr_mult_list: [0.05, 0.05, 0.1, 0.15]
+
+FPN:
+ out_channel: 256
+
+RPNHead:
+ anchor_generator:
+ aspect_ratios: [0.5, 1.0, 2.0]
+ anchor_sizes: [[32], [64], [128], [256], [512]]
+ strides: [4, 8, 16, 32, 64]
+ rpn_target_assign:
+ batch_size_per_im: 256
+ fg_fraction: 0.5
+ negative_overlap: 0.3
+ positive_overlap: 0.7
+ use_random: True
+ train_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 2000
+ post_nms_top_n: 1000
+ topk_after_collect: True
+ test_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 1000
+ post_nms_top_n: 1000
+
+
+BBoxHead:
+ head: TwoFCHead
+ roi_extractor:
+ resolution: 7
+ sampling_ratio: 0
+ aligned: True
+ bbox_assigner: BBoxAssigner
+
+BBoxAssigner:
+ batch_size_per_im: 512
+ bg_thresh: 0.5
+ fg_thresh: 0.5
+ fg_fraction: 0.25
+ use_random: True
+
+TwoFCHead:
+ out_channel: 1024
+
+BBoxPostProcess:
+ decode: RCNNBox
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.05
+ nms_threshold: 0.5
diff --git a/PaddleDetection-release-2.6/configs/smrt/rcnn/faster_rcnn_r50_vd_fpn_ssld_2x_800_renche.yml b/PaddleDetection-release-2.6/configs/smrt/rcnn/faster_rcnn_r50_vd_fpn_ssld_2x_800_renche.yml
new file mode 100644
index 0000000000000000000000000000000000000000..e27315c3572f3c89f1f98fc250e50a3d23661250
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/smrt/rcnn/faster_rcnn_r50_vd_fpn_ssld_2x_800_renche.yml
@@ -0,0 +1,167 @@
+weights: output/faster_rcnn_r50_vd_fpn_ssld_2x_800_renche/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/faster_rcnn_r50_vd_fpn_ssld_2x_coco.pdparams
+
+metric: COCO
+num_classes: 22
+
+TrainDataset:
+ !COCODataSet
+ image_dir: train_images
+ anno_path: train.json
+ dataset_dir: dataset/renche
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: train_images
+ anno_path: test.json
+ dataset_dir: dataset/renche
+
+TestDataset:
+ !ImageFolder
+ anno_path: test.json
+ dataset_dir: dataset/renche
+
+epoch: 24
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [16, 22]
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 1000
+
+
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], interp: 2, keep_ratio: True}
+ - RandomFlip: {prob: 0.5}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+use_gpu: true
+use_xpu: false
+log_iter: 100
+save_dir: output
+snapshot_epoch: 2
+print_flops: false
+
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
+
+
+architecture: FasterRCNN
+
+FasterRCNN:
+ backbone: ResNet
+ neck: FPN
+ rpn_head: RPNHead
+ bbox_head: BBoxHead
+ # post process
+ bbox_post_process: BBoxPostProcess
+
+
+ResNet:
+ # index 0 stands for res2
+ depth: 50
+ variant: d
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+ lr_mult_list: [0.05, 0.05, 0.1, 0.15]
+
+FPN:
+ out_channel: 256
+
+RPNHead:
+ anchor_generator:
+ aspect_ratios: [0.5, 1.0, 2.0]
+ anchor_sizes: [[32], [64], [128], [256], [512]]
+ strides: [4, 8, 16, 32, 64]
+ rpn_target_assign:
+ batch_size_per_im: 256
+ fg_fraction: 0.5
+ negative_overlap: 0.3
+ positive_overlap: 0.7
+ use_random: True
+ train_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 2000
+ post_nms_top_n: 1000
+ topk_after_collect: True
+ test_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 1000
+ post_nms_top_n: 1000
+
+
+BBoxHead:
+ head: TwoFCHead
+ roi_extractor:
+ resolution: 7
+ sampling_ratio: 0
+ aligned: True
+ bbox_assigner: BBoxAssigner
+
+BBoxAssigner:
+ batch_size_per_im: 512
+ bg_thresh: 0.5
+ fg_thresh: 0.5
+ fg_fraction: 0.25
+ use_random: True
+
+TwoFCHead:
+ out_channel: 1024
+
+BBoxPostProcess:
+ decode: RCNNBox
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.05
+ nms_threshold: 0.5
diff --git a/PaddleDetection-release-2.6/configs/sniper/README.md b/PaddleDetection-release-2.6/configs/sniper/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..3aadee560ee6d7cd3691075db016a94fec7e0ea3
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/sniper/README.md
@@ -0,0 +1,67 @@
+English | [简体中文](README_cn.md)
+
+# SNIPER: Efficient Multi-Scale Training
+
+## Model Zoo
+
+| Sniper | GPU number | images/GPU | Backbone | Dataset | Lr schd | Box AP | Download | Config |
+| :---------------- | :-------------------: | :------------------: | :-----: | :-----: | :------------: | :-----: | :-----------------------------------------------------: | :-----: |
+| w/o | 4 | 1 | ResNet-r50-FPN | [VisDrone](https://github.com/VisDrone/VisDrone-Dataset) | 1x | 23.3 | [Download Link](https://bj.bcebos.com/v1/paddledet/models/faster_rcnn_r50_fpn_1x_visdrone.pdparams) | [config](./faster_rcnn_r50_fpn_1x_visdrone.yml) |
+| w/ | 4 | 1 | ResNet-r50-FPN | [VisDrone](https://github.com/VisDrone/VisDrone-Dataset) | 1x | 29.7 | [Download Link](https://bj.bcebos.com/v1/paddledet/models/faster_rcnn_r50_fpn_1x_sniper_visdrone.pdparams) | [config](./faster_rcnn_r50_fpn_1x_sniper_visdrone.yml) |
+
+### Notes
+- We use the VisDrone dataset and detect 9 object categories: `person, bicycle, car, van, truck, tricycle, awning-tricycle, bus, motor`.
+- Deployment is not supported yet because of the crop behavior of the SNIPER dataset.
+
+## Getting Started
+### 1. Training
+a. Optional: run `tools/sniper_params_stats.py` to get `image_target_sizes`, `valid_box_ratio_ranges`, `chip_target_size`, and `chip_target_stride`, then modify these parameters in `configs/datasets/sniper_coco_detection.yml`.
+```bash
+python tools/sniper_params_stats.py FasterRCNN annotations/instances_train2017.json
+```
+b. Optional: train a detector to generate negative proposals.
+```bash
+python -m paddle.distributed.launch --log_dir=./sniper/ --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/sniper/faster_rcnn_r50_fpn_1x_sniper_visdrone.yml --save_proposals --proposals_path=./proposals.json &>sniper.log 2>&1 &
+```
+c. Train the model
+```bash
+python -m paddle.distributed.launch --log_dir=./sniper/ --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/sniper/faster_rcnn_r50_fpn_1x_sniper_visdrone.yml --eval &>sniper.log 2>&1 &
+```
+
+### 2. Evaluation
+Evaluate SNIPER on a custom dataset with a single GPU using the following command:
+```bash
+# use the checkpoint saved during training
+CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/sniper/faster_rcnn_r50_fpn_1x_sniper_visdrone.yml -o weights=output/faster_rcnn_r50_fpn_1x_sniper_visdrone/model_final
+```
+
+### 3. Inference
+Run inference on a single GPU with the following commands; use `--infer_img` to run inference on a single image and `--infer_dir` to run inference on all images in a directory.
+
+```bash
+# inference on a single image
+CUDA_VISIBLE_DEVICES=0 python tools/infer.py -c configs/sniper/faster_rcnn_r50_fpn_1x_sniper_visdrone.yml -o weights=output/faster_rcnn_r50_fpn_1x_sniper_visdrone/model_final --infer_img=demo/P0861__1.0__1154___824.png
+
+# inference on all images in the directory
+CUDA_VISIBLE_DEVICES=0 python tools/infer.py -c configs/sniper/faster_rcnn_r50_fpn_1x_sniper_visdrone.yml -o weights=output/faster_rcnn_r50_fpn_1x_sniper_visdrone/model_final --infer_dir=demo
+```
+
+## Citations
+```
+@misc{1805.09300,
+Author = {Bharat Singh and Mahyar Najibi and Larry S. Davis},
+Title = {SNIPER: Efficient Multi-Scale Training},
+Year = {2018},
+Eprint = {arXiv:1805.09300},
+}
+
+@ARTICLE{9573394,
+ author={Zhu, Pengfei and Wen, Longyin and Du, Dawei and Bian, Xiao and Fan, Heng and Hu, Qinghua and Ling, Haibin},
+ journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
+ title={Detection and Tracking Meet Drones Challenge},
+ year={2021},
+ volume={},
+ number={},
+ pages={1-1},
+ doi={10.1109/TPAMI.2021.3119563}}
+```
diff --git a/PaddleDetection-release-2.6/configs/sniper/README_cn.md b/PaddleDetection-release-2.6/configs/sniper/README_cn.md
new file mode 100644
index 0000000000000000000000000000000000000000..a01a3a928c56518ec93f50d3d6645ea086c33321
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/sniper/README_cn.md
@@ -0,0 +1,68 @@
+Simplified Chinese | [English](README.md)
+
+# SNIPER: Efficient Multi-Scale Training
+
+## Model Zoo
+| Sniper | GPU number | images/GPU | Backbone | Dataset | Lr schd | Box AP | Download | Config |
+| :---------------- | :-------------------: | :------------------: | :-----: | :-----: | :------------: | :-----: | :-----------------------------------------------------: | :-----: |
+| w/o sniper | 4 | 1 | ResNet-r50-FPN | [VisDrone](https://github.com/VisDrone/VisDrone-Dataset) | 1x | 23.3 | [Download link](https://bj.bcebos.com/v1/paddledet/models/faster_rcnn_r50_fpn_1x_visdrone.pdparams) | [config](./faster_rcnn_r50_fpn_1x_visdrone.yml) |
+| w/ sniper | 4 | 1 | ResNet-r50-FPN | [VisDrone](https://github.com/VisDrone/VisDrone-Dataset) | 1x | 29.7 | [Download link](https://bj.bcebos.com/v1/paddledet/models/faster_rcnn_r50_fpn_1x_sniper_visdrone.pdparams) | [config](./faster_rcnn_r50_fpn_1x_sniper_visdrone.yml) |
+
+
+### Notes
+- We use the `VisDrone` dataset and detect 9 categories: `person, bicycle, car, van, truck, tricycle, awning-tricycle, bus, motor`.
+- Export and inference deployment (deploy) are not supported yet.
+
+
+## Getting Started
+### 1. Training
+a. Optional: run statistics on the dataset to obtain the image target sizes, valid box ratio ranges, chip size and stride, then modify the corresponding parameters in configs/datasets/sniper_coco_detection.yml
+```bash
+python tools/sniper_params_stats.py FasterRCNN annotations/instances_train2017.json
+```
+b. Optional: train a detector to generate negative proposals
+```bash
+python -m paddle.distributed.launch --log_dir=./sniper/ --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/sniper/faster_rcnn_r50_fpn_1x_sniper_visdrone.yml --save_proposals --proposals_path=./proposals.json &>sniper.log 2>&1 &
+```
+c. Train the model
+```bash
+python -m paddle.distributed.launch --log_dir=./sniper/ --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/sniper/faster_rcnn_r50_fpn_1x_sniper_visdrone.yml --eval &>sniper.log 2>&1 &
+```
+
+### 2. Evaluation
+Evaluate the model with a single GPU using the following command:
+```bash
+# use the checkpoint saved during training
+CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/sniper/faster_rcnn_r50_fpn_1x_sniper_visdrone.yml -o weights=output/faster_rcnn_r50_fpn_1x_sniper_visdrone/model_final
+```
+
+### 3. Inference
+Run inference on images with a single GPU using the following commands; use `--infer_img` to specify an image path, or `--infer_dir` to specify a directory and run inference on all images in it.
+
+```bash
+# inference on a single image
+CUDA_VISIBLE_DEVICES=0 python tools/infer.py -c configs/sniper/faster_rcnn_r50_fpn_1x_sniper_visdrone.yml -o weights=output/faster_rcnn_r50_fpn_1x_sniper_visdrone/model_final --infer_img=demo/P0861__1.0__1154___824.png
+
+# inference on all images in the directory
+CUDA_VISIBLE_DEVICES=0 python tools/infer.py -c configs/sniper/faster_rcnn_r50_fpn_1x_sniper_visdrone.yml -o weights=output/faster_rcnn_r50_fpn_1x_sniper_visdrone/model_final --infer_dir=demo
+```
+
+## Citations
+```
+@misc{1805.09300,
+Author = {Bharat Singh and Mahyar Najibi and Larry S. Davis},
+Title = {SNIPER: Efficient Multi-Scale Training},
+Year = {2018},
+Eprint = {arXiv:1805.09300},
+}
+
+@ARTICLE{9573394,
+ author={Zhu, Pengfei and Wen, Longyin and Du, Dawei and Bian, Xiao and Fan, Heng and Hu, Qinghua and Ling, Haibin},
+ journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
+ title={Detection and Tracking Meet Drones Challenge},
+ year={2021},
+ volume={},
+ number={},
+ pages={1-1},
+ doi={10.1109/TPAMI.2021.3119563}}
+```
diff --git a/PaddleDetection-release-2.6/configs/sniper/_base_/faster_fpn_reader.yml b/PaddleDetection-release-2.6/configs/sniper/_base_/faster_fpn_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..363ca4664b9effb1317e6661732f99113b7d1bff
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/sniper/_base_/faster_fpn_reader.yml
@@ -0,0 +1,40 @@
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - SniperDecodeCrop: {}
+ - RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], interp: 2, keep_ratio: True}
+ - RandomFlip: {prob: 0.5}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+
+
+EvalReader:
+ sample_transforms:
+ - SniperDecodeCrop: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+
+TestReader:
+ sample_transforms:
+ - SniperDecodeCrop: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
diff --git a/PaddleDetection-release-2.6/configs/sniper/_base_/faster_reader.yml b/PaddleDetection-release-2.6/configs/sniper/_base_/faster_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..5c3b348024e8d48289db85706ddd6454f40c0815
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/sniper/_base_/faster_reader.yml
@@ -0,0 +1,41 @@
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - SniperDecodeCrop: {}
+ - RandomResize: {target_size: [[800, 1333]], interp: 2, keep_ratio: True}
+ - RandomFlip: {prob: 0.5}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 1
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+ use_shared_memory: true
+
+
+EvalReader:
+ sample_transforms:
+ - SniperDecodeCrop: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+
+TestReader:
+ sample_transforms:
+ - SniperDecodeCrop: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: -1}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
diff --git a/PaddleDetection-release-2.6/configs/sniper/_base_/ppyolo_reader.yml b/PaddleDetection-release-2.6/configs/sniper/_base_/ppyolo_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..f88e908c903b256bc08fa209f0f2368e3d58596b
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/sniper/_base_/ppyolo_reader.yml
@@ -0,0 +1,40 @@
+worker_num: 2
+TrainReader:
+ inputs_def:
+ num_max_boxes: 50
+ sample_transforms:
+ - SniperDecodeCrop: {}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [320, 352, 384, 416, 448, 480, 512, 544, 576, 608], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeBox: {}
+ - PadBox: {num_max_boxes: 50}
+ - BboxXYXY2XYWH: {}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - Gt2YoloTarget: {anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]], anchors: [[10, 13], [16, 30], [33, 23], [30, 61], [62, 45], [59, 119], [116, 90], [156, 198], [373, 326]], downsample_ratios: [32, 16, 8]}
+ batch_size: 8
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+
+EvalReader:
+ sample_transforms:
+ - SniperDecodeCrop: {}
+ - Resize: {target_size: [608, 608], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 8
+
+TestReader:
+ inputs_def:
+ image_shape: [3, 608, 608]
+ sample_transforms:
+ - SniperDecodeCrop: {}
+ - Resize: {target_size: [608, 608], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/sniper/faster_rcnn_r50_fpn_1x_sniper_visdrone.yml b/PaddleDetection-release-2.6/configs/sniper/faster_rcnn_r50_fpn_1x_sniper_visdrone.yml
new file mode 100644
index 0000000000000000000000000000000000000000..08039e98dbd5e3e85812c780ffb0dd9dcc555a07
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/sniper/faster_rcnn_r50_fpn_1x_sniper_visdrone.yml
@@ -0,0 +1,9 @@
+_BASE_: [
+ '../datasets/sniper_visdrone_detection.yml',
+ '../runtime.yml',
+ '../faster_rcnn/_base_/faster_rcnn_r50_fpn.yml',
+ '../faster_rcnn/_base_/optimizer_1x.yml',
+ '_base_/faster_fpn_reader.yml',
+]
+weights: output/faster_rcnn_r50_fpn_1x_sniper_visdrone/model_final
+find_unused_parameters: true
diff --git a/PaddleDetection-release-2.6/configs/sniper/faster_rcnn_r50_fpn_1x_visdrone.yml b/PaddleDetection-release-2.6/configs/sniper/faster_rcnn_r50_fpn_1x_visdrone.yml
new file mode 100644
index 0000000000000000000000000000000000000000..b6a449328e6d5c90c05dd087b5e0074fc292c38a
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/sniper/faster_rcnn_r50_fpn_1x_visdrone.yml
@@ -0,0 +1,29 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '../faster_rcnn/_base_/optimizer_1x.yml',
+ '../faster_rcnn/_base_/faster_rcnn_r50_fpn.yml',
+ '../faster_rcnn/_base_/faster_fpn_reader.yml',
+]
+weights: output/faster_rcnn_r50_fpn_1x_visdrone/model_final
+
+
+metric: COCO
+num_classes: 9
+
+TrainDataset:
+ !COCODataSet
+ image_dir: train
+ anno_path: annotations/train.json
+ dataset_dir: dataset/VisDrone2019_coco
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: val
+ anno_path: annotations/val.json
+ dataset_dir: dataset/VisDrone2019_coco
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/val.json
diff --git a/PaddleDetection-release-2.6/configs/sniper/ppyolo_r50vd_dcn_1x_sniper_visdrone.yml b/PaddleDetection-release-2.6/configs/sniper/ppyolo_r50vd_dcn_1x_sniper_visdrone.yml
new file mode 100644
index 0000000000000000000000000000000000000000..c615c7732c51a73c4c4618240a000cf6ce351a80
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/sniper/ppyolo_r50vd_dcn_1x_sniper_visdrone.yml
@@ -0,0 +1,33 @@
+_BASE_: [
+ '../datasets/sniper_visdrone_detection.yml',
+ '../runtime.yml',
+ '../ppyolo/_base_/ppyolo_r50vd_dcn.yml',
+ '../ppyolo/_base_/optimizer_1x.yml',
+ './_base_/ppyolo_reader.yml',
+]
+
+snapshot_epoch: 8
+use_ema: true
+weights: output/ppyolo_r50vd_dcn_1x_sniper_visdrone/model_final
+
+
+
+LearningRate:
+ base_lr: 0.005
+ schedulers:
+ - !PiecewiseDecay
+    gamma: 0.1
+ milestones:
+ - 153
+ - 173
+ - !LinearWarmup
+ start_factor: 0.1
+ steps: 4000
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/sniper/ppyolo_r50vd_dcn_1x_visdrone.yml b/PaddleDetection-release-2.6/configs/sniper/ppyolo_r50vd_dcn_1x_visdrone.yml
new file mode 100644
index 0000000000000000000000000000000000000000..dd1db0e872de109b70840f9799de07e80c9bb950
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/sniper/ppyolo_r50vd_dcn_1x_visdrone.yml
@@ -0,0 +1,54 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '../ppyolo/_base_/ppyolo_r50vd_dcn.yml',
+ '../ppyolo/_base_/optimizer_1x.yml',
+ '../ppyolo/_base_/ppyolo_reader.yml',
+]
+
+snapshot_epoch: 8
+use_ema: true
+weights: output/ppyolo_r50vd_dcn_1x_visdrone/model_final
+
+epoch: 192
+
+LearningRate:
+ base_lr: 0.005
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 153
+ - 173
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 4000
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
+
+
+metric: COCO
+num_classes: 9
+
+TrainDataset:
+ !COCODataSet
+ image_dir: train
+ anno_path: annotations/train.json
+ dataset_dir: dataset/VisDrone2019_coco
+ data_fields: ['image', 'gt_bbox', 'gt_class', 'is_crowd']
+
+EvalDataset:
+ !COCODataSet
+ image_dir: val
+ anno_path: annotations/val.json
+ dataset_dir: dataset/VisDrone2019_coco
+
+TestDataset:
+ !ImageFolder
+ anno_path: annotations/val.json
diff --git a/PaddleDetection-release-2.6/configs/solov2/README.md b/PaddleDetection-release-2.6/configs/solov2/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..1cc378f847d27fbe0add5d7cd89883b020e6d646
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/solov2/README.md
@@ -0,0 +1,52 @@
+# SOLOv2 for instance segmentation
+
+## Introduction
+
+SOLOv2 (Segmenting Objects by Locations) is a fast instance segmentation framework with strong performance. We reproduce the model from the paper and further optimize the accuracy and speed of SOLOv2.
+
+**Highlights:**
+
+- Training time: training `solov2_r50_fpn_1x` on 8 Tesla V100 GPUs takes only 10 hours.
+
+## Model Zoo
+
+| Detector | Backbone | Multi-scale training | Lr schd | Mask AP (val) | V100 FP32(FPS) | GPU | Download | Configs |
+| :-------: | :---------------------: | :-------------------: | :-----: | :--------------------: | :-------------: | :-----: | :---------: | :------------------------: |
+| YOLACT++ | R50-FPN | False | 800k iter | 34.1 (test-dev) | 33.5 | Xp | - | - |
+| CenterMask | R50-FPN | True | 2x | 36.4 | 13.9 | Xp | - | - |
+| CenterMask | V2-99-FPN | True | 3x | 40.2 | 8.9 | Xp | - | - |
+| PolarMask | R50-FPN | True | 2x | 30.5 | 9.4 | V100 | - | - |
+| BlendMask | R50-FPN | True | 3x | 37.8 | 13.5 | V100 | - | - |
+| SOLOv2 (Paper) | R50-FPN | False | 1x | 34.8 | 18.5 | V100 | - | - |
+| SOLOv2 (Paper) | X101-DCN-FPN | True | 3x | 42.4 | 5.9 | V100 | - | - |
+| SOLOv2 | R50-FPN | False | 1x | 35.5 | 21.9 | V100 | [model](https://paddledet.bj.bcebos.com/models/solov2_r50_fpn_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/solov2/solov2_r50_fpn_1x_coco.yml) |
+| SOLOv2 | R50-FPN | True | 3x | 38.0 | 21.9 | V100 | [model](https://paddledet.bj.bcebos.com/models/solov2_r50_fpn_3x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/solov2/solov2_r50_fpn_3x_coco.yml) |
+| SOLOv2 | R101vd-FPN | True | 3x | 42.7 | 12.1 | V100 | [model](https://paddledet.bj.bcebos.com/models/solov2_r101_vd_fpn_3x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/solov2/solov2_r101_vd_fpn_3x_coco.yml) |
+
+**Notes:**
+
+- SOLOv2 is trained on the COCO train2017 dataset and evaluated on val2017; the reported metric is `mAP(IoU=0.5:0.95)`.
+
+## Enhanced model
+| Backbone | Input size | Lr schd | V100 FP32(FPS) | Mask AP (val) | Download | Configs |
+| :---------------------: | :-------------------: | :-----: | :------------: | :-----: | :---------: | :------------------------: |
+| Light-R50-VD-DCN-FPN | 512 | 3x | 38.6 | 39.0 | [model](https://paddledet.bj.bcebos.com/models/solov2_r50_enhance_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/solov2/solov2_r50_enhance_coco.yml) |
+
+**Optimizing method of enhanced model:**
+- Better backbone network: ResNet50vd-DCN
+- A better pre-training model for knowledge distillation
+- [Exponential Moving Average](https://www.investopedia.com/terms/e/ema.asp)
+- Synchronized Batch Normalization
+- Multi-scale training
+- More data augmentation methods
+- DropBlock
+
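Of these, Exponential Moving Average keeps a shadow copy of the weights that is blended toward the live weights each step and used for evaluation. A minimal sketch; the decay value here is purely illustrative, not taken from this config:

```python
# Weight EMA: shadow = decay * shadow + (1 - decay) * weights.
# decay=0.5 is illustrative only; real configs use values close to 1.
def ema_update(shadow, weights, decay):
    """Blend the live weights into the shadow (EMA) copy, key by key."""
    return {k: decay * shadow[k] + (1.0 - decay) * weights[k] for k in shadow}

shadow = {"w": 1.0}
for _ in range(3):
    live = {"w": 0.0}  # pretend the optimizer drove the weight to 0
    shadow = ema_update(shadow, live, decay=0.5)
print(shadow["w"])  # 1.0 -> 0.5 -> 0.25 -> 0.125
```

The shadow weights lag the live weights, which smooths out noisy late-training updates when the EMA copy is used for evaluation checkpoints.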
+## Citations
+```
+@article{wang2020solov2,
+ title={SOLOv2: Dynamic, Faster and Stronger},
+ author={Wang, Xinlong and Zhang, Rufeng and Kong, Tao and Li, Lei and Shen, Chunhua},
+ journal={arXiv preprint arXiv:2003.10152},
+ year={2020}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/solov2/_base_/optimizer_1x.yml b/PaddleDetection-release-2.6/configs/solov2/_base_/optimizer_1x.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d034482d1e007c4e07fc9b1323b86e04588710bb
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/solov2/_base_/optimizer_1x.yml
@@ -0,0 +1,19 @@
+epoch: 12
+
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [8, 11]
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 1000
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/solov2/_base_/solov2_light_reader.yml b/PaddleDetection-release-2.6/configs/solov2/_base_/solov2_light_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..901049c13d35251558c9235058cad80d8e5ea1be
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/solov2/_base_/solov2_light_reader.yml
@@ -0,0 +1,47 @@
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - Poly2Mask: {}
+ - RandomDistort: {}
+ - RandomCrop: {}
+ - RandomResize: {interp: 1,
+ target_size: [[352, 852], [384, 852], [416, 852], [448, 852], [480, 852], [512, 852]],
+ keep_ratio: True}
+ - RandomFlip: {}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ - Gt2Solov2Target: {num_grids: [40, 36, 24, 16, 12],
+ scale_ranges: [[1, 96], [48, 192], [96, 384], [192, 768], [384, 2048]],
+ coord_sigma: 0.2}
+ batch_size: 2
+ shuffle: true
+ drop_last: true
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Resize: {interp: 1, target_size: [512, 852], keep_ratio: True}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Resize: {interp: 1, target_size: [512, 852], keep_ratio: True}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
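The `scale_ranges` passed to `Gt2Solov2Target` above assign each ground-truth instance to FPN levels by its scale; because the ranges overlap, one object can land on two adjacent levels. A sketch of that matching rule, based on the SOLOv2 paper (the actual target builder additionally shrinks the center region by `coord_sigma`):

```python
import math

SCALE_RANGES = [(1, 96), (48, 192), (96, 384), (192, 768), (384, 2048)]

def assigned_levels(mask_area):
    # Scale of an instance is taken as the square root of its mask area.
    scale = math.sqrt(mask_area)
    return [i for i, (lo, hi) in enumerate(SCALE_RANGES) if lo <= scale <= hi]
```

A 64-pixel-wide object falls in both the first and second ranges, so it supervises grids at two levels.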
diff --git a/PaddleDetection-release-2.6/configs/solov2/_base_/solov2_r50_fpn.yml b/PaddleDetection-release-2.6/configs/solov2/_base_/solov2_r50_fpn.yml
new file mode 100644
index 0000000000000000000000000000000000000000..93a6892698a879c6ff60e731f617e6d0649072a9
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/solov2/_base_/solov2_r50_fpn.yml
@@ -0,0 +1,40 @@
+architecture: SOLOv2
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_cos_pretrained.pdparams
+
+SOLOv2:
+ backbone: ResNet
+ neck: FPN
+ solov2_head: SOLOv2Head
+ mask_head: SOLOv2MaskHead
+
+ResNet:
+ depth: 50
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+
+FPN:
+ out_channel: 256
+
+SOLOv2Head:
+ seg_feat_channels: 512
+ stacked_convs: 4
+ num_grids: [40, 36, 24, 16, 12]
+ kernel_out_channels: 256
+ solov2_loss: SOLOv2Loss
+ mask_nms: MaskMatrixNMS
+
+SOLOv2MaskHead:
+ mid_channels: 128
+ out_channels: 256
+ start_level: 0
+ end_level: 3
+
+SOLOv2Loss:
+ ins_loss_weight: 3.0
+ focal_loss_gamma: 2.0
+ focal_loss_alpha: 0.25
+
+MaskMatrixNMS:
+ pre_nms_top_n: 500
+ post_nms_top_n: 100
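`MaskMatrixNMS` replaces hard suppression with score decay. A toy sketch of the linear-kernel decay described in the SOLOv2 paper, with masks as pixel sets (illustrative, not Paddle's vectorized implementation):

```python
def matrix_nms_linear(masks, scores):
    # Sort by descending score; masks are sets of pixel indices.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    masks = [masks[i] for i in order]
    scores = [scores[i] for i in order]
    n = len(masks)
    iou = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            inter = len(masks[i] & masks[j])
            union = len(masks[i] | masks[j])
            iou[i][j] = inter / union if union else 0.0
    decayed = []
    for j in range(n):
        decay = 1.0
        for i in range(j):
            # Discount by how suppressed mask i already is itself.
            iou_max_i = max((iou[k][i] for k in range(i)), default=0.0)
            decay = min(decay, (1.0 - iou[i][j]) / (1.0 - iou_max_i))
        decayed.append(scores[j] * decay)
    return decayed

out = matrix_nms_linear([frozenset(range(10)), frozenset(range(2, 12))],
                        [0.9, 0.8])
```

All scores are decayed in one pass, then `post_nms_top_n` keeps the highest survivors, which is what makes Matrix NMS parallelizable.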
diff --git a/PaddleDetection-release-2.6/configs/solov2/_base_/solov2_reader.yml b/PaddleDetection-release-2.6/configs/solov2/_base_/solov2_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..f21516235f46a1822387d48c73e7026763e7fc4c
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/solov2/_base_/solov2_reader.yml
@@ -0,0 +1,43 @@
+worker_num: 8
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - Poly2Mask: {}
+ - Resize: {interp: 1, target_size: [800, 1333], keep_ratio: True}
+ - RandomFlip: {}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ - Gt2Solov2Target: {num_grids: [40, 36, 24, 16, 12],
+ scale_ranges: [[1, 96], [48, 192], [96, 384], [192, 768], [384, 2048]],
+ coord_sigma: 0.2}
+ batch_size: 2
+ shuffle: true
+ drop_last: true
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Resize: {interp: 1, target_size: [800, 1333], keep_ratio: True}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Resize: {interp: 1, target_size: [800, 1333], keep_ratio: True}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
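The `NormalizeImage` + `Permute` steps used by all three readers above amount to: scale to [0, 1], standardize with ImageNet statistics, then reorder HWC to CHW. An illustrative NumPy equivalent:

```python
import numpy as np

def normalize_permute(img_hwc_uint8):
    mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
    std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
    img = img_hwc_uint8.astype(np.float32) / 255.0   # is_scale: true
    img = (img - mean) / std                          # NormalizeImage
    return img.transpose(2, 0, 1)                     # Permute: HWC -> CHW

out = normalize_permute(np.zeros((4, 6, 3), dtype=np.uint8))
```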
diff --git a/PaddleDetection-release-2.6/configs/solov2/solov2_r101_vd_fpn_3x_coco.yml b/PaddleDetection-release-2.6/configs/solov2/solov2_r101_vd_fpn_3x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..db29c9ad19edd5396562e1b9e3f8400ae1a3367c
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/solov2/solov2_r101_vd_fpn_3x_coco.yml
@@ -0,0 +1,66 @@
+_BASE_: [
+ '../datasets/coco_instance.yml',
+ '../runtime.yml',
+ '_base_/solov2_r50_fpn.yml',
+ '_base_/optimizer_1x.yml',
+ '_base_/solov2_reader.yml',
+]
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet101_vd_pretrained.pdparams
+weights: output/solov2_r101_vd_fpn_3x_coco/model_final
+epoch: 36
+use_ema: true
+ema_decay: 0.9998
+
+ResNet:
+ depth: 101
+ variant: d
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ dcn_v2_stages: [1,2,3]
+ num_stages: 4
+
+SOLOv2Head:
+ seg_feat_channels: 512
+ stacked_convs: 4
+ num_grids: [40, 36, 24, 16, 12]
+ kernel_out_channels: 256
+ solov2_loss: SOLOv2Loss
+ mask_nms: MaskMatrixNMS
+ dcn_v2_stages: [0, 1, 2, 3]
+
+SOLOv2MaskHead:
+ mid_channels: 128
+ out_channels: 256
+ start_level: 0
+ end_level: 3
+ use_dcn_in_tower: True
+
+
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [24, 33]
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 2000
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - Poly2Mask: {}
+ - RandomResize: {interp: 1,
+ target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]],
+ keep_ratio: True}
+ - RandomFlip: {}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ - Gt2Solov2Target: {num_grids: [40, 36, 24, 16, 12],
+ scale_ranges: [[1, 96], [48, 192], [96, 384], [192, 768], [384, 2048]],
+ coord_sigma: 0.2}
+ batch_size: 2
+ shuffle: true
+ drop_last: true
diff --git a/PaddleDetection-release-2.6/configs/solov2/solov2_r50_enhance_coco.yml b/PaddleDetection-release-2.6/configs/solov2/solov2_r50_enhance_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..0cadd8783a3c45efc4c20f96fcd3241a0df8c02a
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/solov2/solov2_r50_enhance_coco.yml
@@ -0,0 +1,50 @@
+_BASE_: [
+ '../datasets/coco_instance.yml',
+ '../runtime.yml',
+ '_base_/solov2_r50_fpn.yml',
+ '_base_/optimizer_1x.yml',
+ '_base_/solov2_light_reader.yml',
+]
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_v2_pretrained.pdparams
+weights: output/solov2_r50_enhance_coco/model_final
+epoch: 36
+use_ema: true
+ema_decay: 0.9998
+
+ResNet:
+ depth: 50
+ variant: d
+ freeze_at: 0
+ freeze_norm: false
+ norm_type: sync_bn
+ return_idx: [0,1,2,3]
+ dcn_v2_stages: [1,2,3]
+ lr_mult_list: [0.05, 0.05, 0.1, 0.15]
+ num_stages: 4
+
+SOLOv2Head:
+ seg_feat_channels: 256
+ stacked_convs: 3
+ num_grids: [40, 36, 24, 16, 12]
+ kernel_out_channels: 128
+ solov2_loss: SOLOv2Loss
+ mask_nms: MaskMatrixNMS
+ dcn_v2_stages: [2]
+ drop_block: True
+
+SOLOv2MaskHead:
+ mid_channels: 128
+ out_channels: 128
+ start_level: 0
+ end_level: 3
+ use_dcn_in_tower: True
+
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [24, 33]
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 1000
diff --git a/PaddleDetection-release-2.6/configs/solov2/solov2_r50_fpn_1x_coco.yml b/PaddleDetection-release-2.6/configs/solov2/solov2_r50_fpn_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..e5f548d53a80937be526f9c927fa8f6cdb6e7e9c
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/solov2/solov2_r50_fpn_1x_coco.yml
@@ -0,0 +1,8 @@
+_BASE_: [
+ '../datasets/coco_instance.yml',
+ '../runtime.yml',
+ '_base_/solov2_r50_fpn.yml',
+ '_base_/optimizer_1x.yml',
+ '_base_/solov2_reader.yml',
+]
+weights: output/solov2_r50_fpn_1x_coco/model_final
diff --git a/PaddleDetection-release-2.6/configs/solov2/solov2_r50_fpn_3x_coco.yml b/PaddleDetection-release-2.6/configs/solov2/solov2_r50_fpn_3x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..6ffff46bbfd6806036ac602091da747d06eb8bd7
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/solov2/solov2_r50_fpn_3x_coco.yml
@@ -0,0 +1,38 @@
+_BASE_: [
+ '../datasets/coco_instance.yml',
+ '../runtime.yml',
+ '_base_/solov2_r50_fpn.yml',
+ '_base_/optimizer_1x.yml',
+ '_base_/solov2_reader.yml',
+]
+weights: output/solov2_r50_fpn_3x_coco/model_final
+epoch: 36
+
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [24, 33]
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 1000
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - Poly2Mask: {}
+ - RandomResize: {interp: 1,
+ target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]],
+ keep_ratio: True}
+ - RandomFlip: {}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ - Gt2Solov2Target: {num_grids: [40, 36, 24, 16, 12],
+ scale_ranges: [[1, 96], [48, 192], [96, 384], [192, 768], [384, 2048]],
+ coord_sigma: 0.2}
+ batch_size: 2
+ shuffle: true
+ drop_last: true
diff --git a/PaddleDetection-release-2.6/configs/sparse_rcnn/README.md b/PaddleDetection-release-2.6/configs/sparse_rcnn/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..5443b037f247938fc9a72194fff62c9a27cedc50
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/sparse_rcnn/README.md
@@ -0,0 +1,25 @@
+# Sparse R-CNN: End-to-End Object Detection with Learnable Proposals
+
+
+## Introduction
+Sparse R-CNN is a purely sparse method for object detection: it replaces the dense anchor or proposal candidates of conventional detectors with a small, fixed set of learnable proposal boxes.
+
+
+## Model Zoo
+
+| Backbone | Proposals | lr schedule | Box AP | download | config |
+| :-------------- | :-----: | :------------: | :-----: | :-----: | :-----: |
+| ResNet50-FPN | 100 | 3x | 43.0 | [download](https://paddledet.bj.bcebos.com/models/sparse_rcnn_r50_fpn_3x_pro100_coco.pdparams) | [config](./sparse_rcnn_r50_fpn_3x_pro100_coco.yml) |
+| ResNet50-FPN | 300 | 3x | 44.6 | [download](https://paddledet.bj.bcebos.com/models/sparse_rcnn_r50_fpn_3x_pro300_coco.pdparams) | [config](./sparse_rcnn_r50_fpn_3x_pro300_coco.yml) |
+
+## Citations
+```
+@misc{sun2021sparse,
+ title={Sparse R-CNN: End-to-End Object Detection with Learnable Proposals},
+ author={Peize Sun and Rufeng Zhang and Yi Jiang and Tao Kong and Chenfeng Xu and Wei Zhan and Masayoshi Tomizuka and Lei Li and Zehuan Yuan and Changhu Wang and Ping Luo},
+ year={2021},
+ eprint={2011.12450},
+ archivePrefix={arXiv},
+ primaryClass={cs.CV}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/sparse_rcnn/_base_/optimizer_3x.yml b/PaddleDetection-release-2.6/configs/sparse_rcnn/_base_/optimizer_3x.yml
new file mode 100644
index 0000000000000000000000000000000000000000..19e1037130158909632a4d6515f6adf53cf5ad3c
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/sparse_rcnn/_base_/optimizer_3x.yml
@@ -0,0 +1,17 @@
+epoch: 36
+
+LearningRate:
+ base_lr: 0.000025
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [28, 34]
+ - !LinearWarmup
+ start_factor: 0.01
+ steps: 1000
+
+OptimizerBuilder:
+ clip_grad_by_norm: 1.0
+ optimizer:
+ type: AdamW
+ weight_decay: 0.0001
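`clip_grad_by_norm: 1.0` above caps gradient magnitude before the AdamW update. A sketch of norm-based clipping (shown per-tensor; the exact granularity is an assumption, not taken from Paddle's code):

```python
import numpy as np

def clip_by_norm(grad, max_norm=1.0):
    # Rescale `grad` so its L2 norm is at most `max_norm`.
    norm = float(np.sqrt(np.sum(grad * grad)))
    if norm <= max_norm or norm == 0.0:
        return grad
    return grad * (max_norm / norm)

g = clip_by_norm(np.array([3.0, 4.0]))   # norm 5.0 -> rescaled to norm 1.0
```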
diff --git a/PaddleDetection-release-2.6/configs/sparse_rcnn/_base_/sparse_rcnn_r50_fpn.yml b/PaddleDetection-release-2.6/configs/sparse_rcnn/_base_/sparse_rcnn_r50_fpn.yml
new file mode 100644
index 0000000000000000000000000000000000000000..9f7516fcd8652c866ad660f2f0afc9e36f1a6033
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/sparse_rcnn/_base_/sparse_rcnn_r50_fpn.yml
@@ -0,0 +1,44 @@
+architecture: SparseRCNN
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_cos_pretrained.pdparams
+
+SparseRCNN:
+ backbone: ResNet
+ neck: FPN
+ head: SparseRCNNHead
+ postprocess: SparsePostProcess
+
+ResNet:
+ # index 0 stands for res2
+ depth: 50
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [0,1,2,3]
+ num_stages: 4
+
+FPN:
+ out_channel: 256
+
+SparseRCNNHead:
+ head_hidden_dim: 256
+ head_dim_feedforward: 2048
+ nhead: 8
+ head_dropout: 0.0
+ head_cls: 1
+ head_reg: 3
+ head_dim_dynamic: 64
+ head_num_dynamic: 2
+ head_num_heads: 6
+ deep_supervision: true
+ num_proposals: 100
+ loss_func: SparseRCNNLoss
+
+SparseRCNNLoss:
+ losses: ["labels", "boxes"]
+ focal_loss_alpha: 0.25
+ focal_loss_gamma: 2.0
+ class_weight: 2.0
+ l1_weight: 5.0
+ giou_weight: 2.0
+
+SparsePostProcess:
+ num_proposals: 100
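The `giou_weight` above scales a generalized-IoU term in Sparse R-CNN's set-prediction loss. A minimal GIoU between two `(x1, y1, x2, y2)` boxes, for illustration (not PaddleDetection's implementation):

```python
def giou(a, b):
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union
    # Area of the smallest enclosing box.
    c = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    return iou - (c - union) / c
```

Unlike plain IoU, GIoU stays informative (and negative) for disjoint boxes, which gives the regression loss a gradient even before boxes overlap.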
diff --git a/PaddleDetection-release-2.6/configs/sparse_rcnn/_base_/sparse_rcnn_reader.yml b/PaddleDetection-release-2.6/configs/sparse_rcnn/_base_/sparse_rcnn_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..b9544b31c44159b66cd63df23a6d6a79aeb081bd
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/sparse_rcnn/_base_/sparse_rcnn_reader.yml
@@ -0,0 +1,41 @@
+worker_num: 4
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResize: {target_size: [[480, 1333], [512, 1333], [544, 1333], [576, 1333], [608, 1333], [640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], keep_ratio: true, interp: 1}
+ - RandomFlip: {prob: 0.5}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ - Gt2SparseTarget: {use_padding_shape: True}
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 1, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ - Gt2SparseTarget: {use_padding_shape: True}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 1, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ - Gt2SparseTarget: {use_padding_shape: True}
+ batch_size: 1
+ shuffle: false
diff --git a/PaddleDetection-release-2.6/configs/sparse_rcnn/sparse_rcnn_r50_fpn_3x_pro100_coco.yml b/PaddleDetection-release-2.6/configs/sparse_rcnn/sparse_rcnn_r50_fpn_3x_pro100_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..6123f149df1fd72c7446eec4e8702eb3df592441
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/sparse_rcnn/sparse_rcnn_r50_fpn_3x_pro100_coco.yml
@@ -0,0 +1,10 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/sparse_rcnn_r50_fpn.yml',
+ '_base_/optimizer_3x.yml',
+ '_base_/sparse_rcnn_reader.yml',
+]
+
+num_classes: 80
+weights: output/sparse_rcnn_r50_fpn_3x_pro100_coco/model_final
diff --git a/PaddleDetection-release-2.6/configs/sparse_rcnn/sparse_rcnn_r50_fpn_3x_pro300_coco.yml b/PaddleDetection-release-2.6/configs/sparse_rcnn/sparse_rcnn_r50_fpn_3x_pro300_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..6cb3187829cd021753d5c699a1389abfd5048764
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/sparse_rcnn/sparse_rcnn_r50_fpn_3x_pro300_coco.yml
@@ -0,0 +1,19 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/sparse_rcnn_r50_fpn.yml',
+ '_base_/optimizer_3x.yml',
+ '_base_/sparse_rcnn_reader.yml',
+]
+
+num_classes: 80
+weights: output/sparse_rcnn_r50_fpn_3x_pro300_coco/model_final
+
+snapshot_epoch: 1
+
+
+SparseRCNNHead:
+ num_proposals: 300
+
+SparsePostProcess:
+ num_proposals: 300
diff --git a/PaddleDetection-release-2.6/configs/ssd/README.md b/PaddleDetection-release-2.6/configs/ssd/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..1b8a82d0960ca579acfd89a90c24740250d2bf59
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ssd/README.md
@@ -0,0 +1,22 @@
+# SSD: Single Shot MultiBox Detector
+
+## Model Zoo
+
+### SSD on Pascal VOC
+
+| Backbone | Model | Images/GPU | Lr schedule | Inf time (fps) | Box AP | Download | Config |
+| :-------------- | :------------- | :-----: | :-----: | :------------: | :-----: | :-----------------------------------------------------: | :-----: |
+| VGG | SSD | 8 | 240e | ---- | 77.8 | [model](https://paddledet.bj.bcebos.com/models/ssd_vgg16_300_240e_voc.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ssd/ssd_vgg16_300_240e_voc.yml) |
+| MobileNet v1 | SSD | 32 | 120e | ---- | 73.8 | [model](https://paddledet.bj.bcebos.com/models/ssd_mobilenet_v1_300_120e_voc.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ssd/ssd_mobilenet_v1_300_120e_voc.yml) |
+
+**Note:** SSD-VGG is trained for 240 epochs on 4 GPUs with a total batch size of 32; SSD-MobileNetV1 is trained for 120 epochs on 2 GPUs with a total batch size of 64.
+
+## Citations
+```
+@article{Liu_2016,
+ title={SSD: Single Shot MultiBox Detector},
+ journal={ECCV},
+ author={Liu, Wei and Anguelov, Dragomir and Erhan, Dumitru and Szegedy, Christian and Reed, Scott and Fu, Cheng-Yang and Berg, Alexander C.},
+ year={2016},
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/ssd/_base_/optimizer_120e.yml b/PaddleDetection-release-2.6/configs/ssd/_base_/optimizer_120e.yml
new file mode 100644
index 0000000000000000000000000000000000000000..6fb65a906245bcf13106d60eea24110f3c62c70b
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ssd/_base_/optimizer_120e.yml
@@ -0,0 +1,17 @@
+epoch: 120
+
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !PiecewiseDecay
+ milestones: [40, 60, 80, 100]
+ gamma: [0.5, 0.5, 0.4, 0.1]
+ use_warmup: false
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.0
+ type: RMSProp
+ regularizer:
+ factor: 0.00005
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/ssd/_base_/optimizer_1700e.yml b/PaddleDetection-release-2.6/configs/ssd/_base_/optimizer_1700e.yml
new file mode 100644
index 0000000000000000000000000000000000000000..fe5fedc7cd33855ef103325359adda131587fe64
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ssd/_base_/optimizer_1700e.yml
@@ -0,0 +1,18 @@
+epoch: 1700
+
+LearningRate:
+ base_lr: 0.4
+ schedulers:
+ - !CosineDecay
+ max_epochs: 1700
+ - !LinearWarmup
+ start_factor: 0.3333333333333333
+ steps: 2000
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
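After the 2000-step warmup, the schedule above follows lr(e) = base_lr * 0.5 * (1 + cos(pi * e / max_epochs)). A sketch of the post-warmup curve (illustrative semantics only):

```python
import math

def cosine_lr(epoch, base_lr=0.4, max_epochs=1700):
    # Decays smoothly from base_lr at epoch 0 to 0 at max_epochs.
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * epoch / max_epochs))
```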
diff --git a/PaddleDetection-release-2.6/configs/ssd/_base_/optimizer_240e.yml b/PaddleDetection-release-2.6/configs/ssd/_base_/optimizer_240e.yml
new file mode 100644
index 0000000000000000000000000000000000000000..de31eac3d22c97b2b72083a79342b880f4be9b8a
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ssd/_base_/optimizer_240e.yml
@@ -0,0 +1,21 @@
+epoch: 240
+
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 160
+ - 200
+ - !LinearWarmup
+ start_factor: 0.3333333333333333
+ steps: 500
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/ssd/_base_/optimizer_70e.yml b/PaddleDetection-release-2.6/configs/ssd/_base_/optimizer_70e.yml
new file mode 100644
index 0000000000000000000000000000000000000000..7cf56fee5f57e9b885265cf8a266af9acab3af8f
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ssd/_base_/optimizer_70e.yml
@@ -0,0 +1,17 @@
+epoch: 70
+
+LearningRate:
+ base_lr: 0.05
+ schedulers:
+ - !PiecewiseDecay
+ milestones: [48, 60]
+ gamma: [0.1, 0.1]
+ use_warmup: false
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/ssd/_base_/ssd_mobilenet_reader.yml b/PaddleDetection-release-2.6/configs/ssd/_base_/ssd_mobilenet_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..2b3e1da90b0d48ebb2b8cdd2a28174c495d698e4
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ssd/_base_/ssd_mobilenet_reader.yml
@@ -0,0 +1,39 @@
+worker_num: 8
+TrainReader:
+ inputs_def:
+ num_max_boxes: 90
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {brightness: [0.5, 1.125, 0.875], random_apply: False}
+ - RandomExpand: {fill_value: [127.5, 127.5, 127.5]}
+ - RandomCrop: {allow_no_crop: False}
+ - RandomFlip: {}
+ - Resize: {target_size: [300, 300], keep_ratio: False, interp: 1}
+ - NormalizeBox: {}
+ - PadBox: {num_max_boxes: 90}
+ batch_transforms:
+ - NormalizeImage: {mean: [127.5, 127.5, 127.5], std: [127.502231, 127.502231, 127.502231], is_scale: false}
+ - Permute: {}
+ batch_size: 32
+ shuffle: true
+ drop_last: true
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [300, 300], keep_ratio: False, interp: 1}
+ - NormalizeImage: {mean: [127.5, 127.5, 127.5], std: [127.502231, 127.502231, 127.502231], is_scale: false}
+ - Permute: {}
+ batch_size: 1
+
+
+TestReader:
+ inputs_def:
+ image_shape: [3, 300, 300]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [300, 300], keep_ratio: False, interp: 1}
+ - NormalizeImage: {mean: [127.5, 127.5, 127.5], std: [127.502231, 127.502231, 127.502231], is_scale: false}
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/ssd/_base_/ssd_mobilenet_v1_300.yml b/PaddleDetection-release-2.6/configs/ssd/_base_/ssd_mobilenet_v1_300.yml
new file mode 100644
index 0000000000000000000000000000000000000000..b8fe6946eeaf43272a7bb5c7e94b7df5b420802e
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ssd/_base_/ssd_mobilenet_v1_300.yml
@@ -0,0 +1,41 @@
+architecture: SSD
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ssd_mobilenet_v1_coco_pretrained.pdparams
+
+SSD:
+ backbone: MobileNet
+ ssd_head: SSDHead
+ post_process: BBoxPostProcess
+
+MobileNet:
+ norm_decay: 0.
+ scale: 1
+ conv_learning_rate: 0.1
+ extra_block_filters: [[256, 512], [128, 256], [128, 256], [64, 128]]
+ with_extra_blocks: true
+ feature_maps: [11, 13, 14, 15, 16, 17]
+
+SSDHead:
+ kernel_size: 1
+ padding: 0
+ anchor_generator:
+ steps: [0, 0, 0, 0, 0, 0]
+ aspect_ratios: [[2.], [2., 3.], [2., 3.], [2., 3.], [2., 3.], [2., 3.]]
+ min_ratio: 20
+ max_ratio: 90
+ base_size: 300
+ min_sizes: [60.0, 105.0, 150.0, 195.0, 240.0, 285.0]
+ max_sizes: [[], 150.0, 195.0, 240.0, 285.0, 300.0]
+ offset: 0.5
+ flip: true
+ min_max_aspect_ratios_order: false
+
+BBoxPostProcess:
+ decode:
+ name: SSDBox
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 200
+ score_threshold: 0.01
+ nms_threshold: 0.45
+ nms_top_k: 400
+ nms_eta: 1.0
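A single-class greedy NMS sketch showing how the `score_threshold`, `nms_threshold`, and `keep_top_k` knobs of `MultiClassNMS` interact (illustrative, not Paddle's implementation):

```python
def nms(boxes, scores, score_threshold=0.01, nms_threshold=0.45, keep_top_k=200):
    # boxes: list of (x1, y1, x2, y2); scores: parallel list of floats.
    def iou(a, b):
        iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
        ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
        inter = iw * ih
        union = ((a[2] - a[0]) * (a[3] - a[1])
                 + (b[2] - b[0]) * (b[3] - b[1]) - inter)
        return inter / union if union > 0 else 0.0

    # Drop low-confidence boxes, then visit the rest in score order.
    order = [i for i in sorted(range(len(scores)), key=lambda i: -scores[i])
             if scores[i] >= score_threshold]
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= nms_threshold for j in keep):
            keep.append(i)
        if len(keep) >= keep_top_k:
            break
    return keep

kept = nms([(0, 0, 10, 10), (1, 1, 10, 10), (20, 20, 30, 30)],
           [0.9, 0.8, 0.7])
```

The second box overlaps the first with IoU 0.81 > 0.45 and is suppressed; the distant third box survives.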
diff --git a/PaddleDetection-release-2.6/configs/ssd/_base_/ssd_r34_300.yml b/PaddleDetection-release-2.6/configs/ssd/_base_/ssd_r34_300.yml
new file mode 100644
index 0000000000000000000000000000000000000000..5b463b718d245205fe4daaf336fbe92d5725bbdf
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ssd/_base_/ssd_r34_300.yml
@@ -0,0 +1,38 @@
+architecture: SSD
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet34_pretrained.pdparams
+
+SSD:
+ backbone: ResNet
+ ssd_head: SSDHead
+ post_process: BBoxPostProcess
+ r34_backbone: True
+
+ResNet:
+ # index 0 stands for res2
+ depth: 34
+ norm_type: bn
+ freeze_norm: False
+ freeze_at: -1
+ return_idx: [2]
+ num_stages: 3
+
+SSDHead:
+ anchor_generator:
+ steps: [8, 16, 32, 64, 100, 300]
+ aspect_ratios: [[2.], [2., 3.], [2., 3.], [2., 3.], [2.], [2.]]
+ min_sizes: [21.0, 45.0, 99.0, 153.0, 207.0, 261.0]
+ max_sizes: [45.0, 99.0, 153.0, 207.0, 261.0, 315.0]
+ offset: 0.5
+ clip: True
+ min_max_aspect_ratios_order: True
+ use_extra_head: True
+
+BBoxPostProcess:
+ decode:
+ name: SSDBox
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 200
+ score_threshold: 0.05
+ nms_threshold: 0.5
+ nms_top_k: 400
diff --git a/PaddleDetection-release-2.6/configs/ssd/_base_/ssd_r34_reader.yml b/PaddleDetection-release-2.6/configs/ssd/_base_/ssd_r34_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..0888b30a0af972279a1dc026a778ec476deb310c
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ssd/_base_/ssd_r34_reader.yml
@@ -0,0 +1,38 @@
+worker_num: 3
+TrainReader:
+ inputs_def:
+ num_max_boxes: 90
+ sample_transforms:
+ - Decode: {}
+ - RandomCrop: {num_attempts: 1}
+ - RandomFlip: {}
+ - Resize: {target_size: [300, 300], keep_ratio: False, interp: 1}
+ - RandomDistort: {brightness: [0.875, 1.125, 0.5], random_apply: False}
+ - NormalizeBox: {}
+ - PadBox: {num_max_boxes: 90}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: true}
+ - Permute: {}
+ batch_size: 64
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [300, 300], keep_ratio: False, interp: 1}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: true}
+ - Permute: {}
+ batch_size: 1
+
+
+TestReader:
+ inputs_def:
+ image_shape: [3, 300, 300]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [300, 300], keep_ratio: False, interp: 1}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: true}
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/ssd/_base_/ssd_reader.yml b/PaddleDetection-release-2.6/configs/ssd/_base_/ssd_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..22f8cc0a3ad2d2f11ffc1c77e3354f34abc431b6
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ssd/_base_/ssd_reader.yml
@@ -0,0 +1,41 @@
+worker_num: 2
+TrainReader:
+ inputs_def:
+ num_max_boxes: 90
+
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {brightness: [0.5, 1.125, 0.875], random_apply: False}
+ - RandomExpand: {fill_value: [104., 117., 123.]}
+ - RandomCrop: {allow_no_crop: true}
+ - RandomFlip: {}
+ - Resize: {target_size: [300, 300], keep_ratio: False, interp: 1}
+ - NormalizeBox: {}
+ - PadBox: {num_max_boxes: 90}
+
+ batch_transforms:
+ - NormalizeImage: {mean: [104., 117., 123.], std: [1., 1., 1.], is_scale: false}
+ - Permute: {}
+
+ batch_size: 8
+ shuffle: true
+ drop_last: true
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [300, 300], keep_ratio: False, interp: 1}
+ - NormalizeImage: {mean: [104., 117., 123.], std: [1., 1., 1.], is_scale: false}
+ - Permute: {}
+ batch_size: 1
+
+TestReader:
+ inputs_def:
+ image_shape: [3, 300, 300]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [300, 300], keep_ratio: False, interp: 1}
+ - NormalizeImage: {mean: [104., 117., 123.], std: [1., 1., 1.], is_scale: false}
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/ssd/_base_/ssd_vgg16_300.yml b/PaddleDetection-release-2.6/configs/ssd/_base_/ssd_vgg16_300.yml
new file mode 100644
index 0000000000000000000000000000000000000000..8d322d9c1f8646a40d5256180546af63eea8a8fb
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ssd/_base_/ssd_vgg16_300.yml
@@ -0,0 +1,37 @@
+architecture: SSD
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/VGG16_caffe_pretrained.pdparams
+
+# Model Architecture
+SSD:
+ # model feat info flow
+ backbone: VGG
+ ssd_head: SSDHead
+ # post process
+ post_process: BBoxPostProcess
+
+VGG:
+ depth: 16
+ normalizations: [20., -1, -1, -1, -1, -1]
+
+SSDHead:
+ anchor_generator:
+ steps: [8, 16, 32, 64, 100, 300]
+ aspect_ratios: [[2.], [2., 3.], [2., 3.], [2., 3.], [2.], [2.]]
+ min_ratio: 20
+ max_ratio: 90
+ min_sizes: [30.0, 60.0, 111.0, 162.0, 213.0, 264.0]
+ max_sizes: [60.0, 111.0, 162.0, 213.0, 264.0, 315.0]
+ offset: 0.5
+ flip: true
+ min_max_aspect_ratios_order: true
+
+BBoxPostProcess:
+ decode:
+ name: SSDBox
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 200
+ score_threshold: 0.01
+ nms_threshold: 0.45
+ nms_top_k: 400
+ nms_eta: 1.0
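The `min_ratio`/`max_ratio` pair expands into the per-layer `min_sizes`/`max_sizes` listed above via the classic SSD rule. A reconstruction that reproduces the values in this config (the exact generator code inside PaddleDetection is an assumption):

```python
def ssd_sizes(min_ratio=20, max_ratio=90, base_size=300, num_layers=6):
    # Evenly spaced scale ratios for layers 2..N; layer 1 uses half of min_ratio.
    step = int((max_ratio - min_ratio) / (num_layers - 2))
    ratios = list(range(min_ratio, max_ratio + 1, step))[:num_layers - 1]
    min_sizes = [base_size * 10 / 100.0] + [base_size * r / 100.0 for r in ratios]
    max_sizes = ([base_size * r / 100.0 for r in ratios]
                 + [base_size * (ratios[-1] + step) / 100.0])
    return min_sizes, max_sizes

mins, maxs = ssd_sizes()
```

Here `step` is 17, so the ratios are 20, 37, 54, 71, 88 percent of the 300-pixel input, matching the sizes written out explicitly in the config.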
diff --git a/PaddleDetection-release-2.6/configs/ssd/_base_/ssdlite300_reader.yml b/PaddleDetection-release-2.6/configs/ssd/_base_/ssdlite300_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..86b69737cbe9c5d35893da15fbbfe44579c91f9f
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ssd/_base_/ssdlite300_reader.yml
@@ -0,0 +1,39 @@
+worker_num: 8
+TrainReader:
+ inputs_def:
+ num_max_boxes: 90
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {brightness: [0.5, 1.125, 0.875], random_apply: False}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomCrop: {allow_no_crop: False}
+ - RandomFlip: {}
+ - Resize: {target_size: [300, 300], keep_ratio: False, interp: 1}
+ - NormalizeBox: {}
+ - PadBox: {num_max_boxes: 90}
+ batch_transforms:
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: true}
+ - Permute: {}
+ batch_size: 64
+ shuffle: true
+ drop_last: true
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [300, 300], keep_ratio: False, interp: 1}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: true}
+ - Permute: {}
+ batch_size: 1
+
+
+TestReader:
+ inputs_def:
+ image_shape: [3, 300, 300]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [300, 300], keep_ratio: False, interp: 1}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: true}
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/ssd/_base_/ssdlite320_reader.yml b/PaddleDetection-release-2.6/configs/ssd/_base_/ssdlite320_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..57eeadc6ebe751e6657e84359ecfaac8cdb67824
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ssd/_base_/ssdlite320_reader.yml
@@ -0,0 +1,39 @@
+worker_num: 8
+TrainReader:
+ inputs_def:
+ num_max_boxes: 90
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {brightness: [0.5, 1.125, 0.875], random_apply: False}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomCrop: {allow_no_crop: False}
+ - RandomFlip: {}
+ - Resize: {target_size: [320, 320], keep_ratio: False, interp: 1}
+ - NormalizeBox: {}
+ - PadBox: {num_max_boxes: 90}
+ batch_transforms:
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: true}
+ - Permute: {}
+ batch_size: 64
+ shuffle: true
+ drop_last: true
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [320, 320], keep_ratio: False, interp: 1}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: true}
+ - Permute: {}
+ batch_size: 1
+
+
+TestReader:
+ inputs_def:
+ image_shape: [3, 320, 320]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [320, 320], keep_ratio: False, interp: 1}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: true}
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/ssd/_base_/ssdlite_ghostnet_320.yml b/PaddleDetection-release-2.6/configs/ssd/_base_/ssdlite_ghostnet_320.yml
new file mode 100644
index 0000000000000000000000000000000000000000..6a9e13b5a1ca30a0dee10388a1931ffba0d412eb
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ssd/_base_/ssdlite_ghostnet_320.yml
@@ -0,0 +1,42 @@
+architecture: SSD
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/GhostNet_x1_3_ssld_pretrained.pdparams
+
+SSD:
+ backbone: GhostNet
+ ssd_head: SSDHead
+ post_process: BBoxPostProcess
+
+GhostNet:
+ scale: 1.3
+ conv_decay: 0.00004
+ with_extra_blocks: true
+ extra_block_filters: [[256, 512], [128, 256], [128, 256], [64, 128]]
+ feature_maps: [13, 18, 19, 20, 21, 22]
+ lr_mult_list: [0.25, 0.25, 0.5, 0.5, 0.75]
+
+SSDHead:
+ use_sepconv: True
+ conv_decay: 0.00004
+ anchor_generator:
+ steps: [16, 32, 64, 107, 160, 320]
+ aspect_ratios: [[2.], [2., 3.], [2., 3.], [2., 3.], [2., 3.], [2., 3.]]
+ min_ratio: 20
+ max_ratio: 95
+ base_size: 320
+ min_sizes: []
+ max_sizes: []
+ offset: 0.5
+ flip: true
+ clip: true
+ min_max_aspect_ratios_order: false
+
+BBoxPostProcess:
+ decode:
+ name: SSDBox
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 200
+ score_threshold: 0.01
+ nms_threshold: 0.45
+ nms_top_k: 400
+ nms_eta: 1.0
diff --git a/PaddleDetection-release-2.6/configs/ssd/_base_/ssdlite_mobilenet_v1_300.yml b/PaddleDetection-release-2.6/configs/ssd/_base_/ssdlite_mobilenet_v1_300.yml
new file mode 100644
index 0000000000000000000000000000000000000000..db811ade9d7b89d2b407d40dec7e313feff11420
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ssd/_base_/ssdlite_mobilenet_v1_300.yml
@@ -0,0 +1,41 @@
+architecture: SSD
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/MobileNetV1_ssld_pretrained.pdparams
+
+SSD:
+ backbone: MobileNet
+ ssd_head: SSDHead
+ post_process: BBoxPostProcess
+
+MobileNet:
+ conv_decay: 0.00004
+ scale: 1
+ extra_block_filters: [[256, 512], [128, 256], [128, 256], [64, 128]]
+ with_extra_blocks: true
+ feature_maps: [11, 13, 14, 15, 16, 17]
+
+SSDHead:
+ use_sepconv: True
+ conv_decay: 0.00004
+ anchor_generator:
+ steps: [16, 32, 64, 100, 150, 300]
+ aspect_ratios: [[2.], [2., 3.], [2., 3.], [2., 3.], [2., 3.], [2., 3.]]
+ min_ratio: 20
+ max_ratio: 95
+ base_size: 300
+ min_sizes: []
+ max_sizes: []
+ offset: 0.5
+ flip: true
+ clip: true
+ min_max_aspect_ratios_order: False
+
+BBoxPostProcess:
+ decode:
+ name: SSDBox
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 200
+ score_threshold: 0.01
+ nms_threshold: 0.45
+ nms_top_k: 400
+ nms_eta: 1.0
diff --git a/PaddleDetection-release-2.6/configs/ssd/_base_/ssdlite_mobilenet_v3_large_320.yml b/PaddleDetection-release-2.6/configs/ssd/_base_/ssdlite_mobilenet_v3_large_320.yml
new file mode 100644
index 0000000000000000000000000000000000000000..cc6e3284a3ed961009128c6b7c51f6abd901f376
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ssd/_base_/ssdlite_mobilenet_v3_large_320.yml
@@ -0,0 +1,44 @@
+architecture: SSD
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/MobileNetV3_large_x1_0_ssld_pretrained.pdparams
+
+SSD:
+ backbone: MobileNetV3
+ ssd_head: SSDHead
+ post_process: BBoxPostProcess
+
+MobileNetV3:
+ scale: 1.0
+ model_name: large
+ conv_decay: 0.00004
+ with_extra_blocks: true
+ extra_block_filters: [[256, 512], [128, 256], [128, 256], [64, 128]]
+ feature_maps: [14, 17, 18, 19, 20, 21]
+ lr_mult_list: [0.25, 0.25, 0.5, 0.5, 0.75]
+ multiplier: 0.5
+
+SSDHead:
+ use_sepconv: True
+ conv_decay: 0.00004
+ anchor_generator:
+ steps: [16, 32, 64, 107, 160, 320]
+ aspect_ratios: [[2.], [2., 3.], [2., 3.], [2., 3.], [2., 3.], [2., 3.]]
+ min_ratio: 20
+ max_ratio: 95
+ base_size: 320
+ min_sizes: []
+ max_sizes: []
+ offset: 0.5
+ flip: true
+ clip: true
+ min_max_aspect_ratios_order: false
+
+BBoxPostProcess:
+ decode:
+ name: SSDBox
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 200
+ score_threshold: 0.01
+ nms_threshold: 0.45
+ nms_top_k: 400
+ nms_eta: 1.0
diff --git a/PaddleDetection-release-2.6/configs/ssd/_base_/ssdlite_mobilenet_v3_small_320.yml b/PaddleDetection-release-2.6/configs/ssd/_base_/ssdlite_mobilenet_v3_small_320.yml
new file mode 100644
index 0000000000000000000000000000000000000000..887f95fa291c772d73d7b133f488e8282d315940
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ssd/_base_/ssdlite_mobilenet_v3_small_320.yml
@@ -0,0 +1,44 @@
+architecture: SSD
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/MobileNetV3_small_x1_0_ssld_pretrained.pdparams
+
+SSD:
+ backbone: MobileNetV3
+ ssd_head: SSDHead
+ post_process: BBoxPostProcess
+
+MobileNetV3:
+ scale: 1.0
+ model_name: small
+ conv_decay: 0.00004
+ with_extra_blocks: true
+ extra_block_filters: [[256, 512], [128, 256], [128, 256], [64, 128]]
+ feature_maps: [10, 13, 14, 15, 16, 17]
+ lr_mult_list: [0.25, 0.25, 0.5, 0.5, 0.75]
+ multiplier: 0.5
+
+SSDHead:
+ use_sepconv: True
+ conv_decay: 0.00004
+ anchor_generator:
+ steps: [16, 32, 64, 107, 160, 320]
+ aspect_ratios: [[2.], [2., 3.], [2., 3.], [2., 3.], [2., 3.], [2., 3.]]
+ min_ratio: 20
+ max_ratio: 95
+ base_size: 320
+ min_sizes: []
+ max_sizes: []
+ offset: 0.5
+ flip: true
+ clip: true
+ min_max_aspect_ratios_order: false
+
+BBoxPostProcess:
+ decode:
+ name: SSDBox
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 200
+ score_threshold: 0.01
+ nms_threshold: 0.45
+ nms_top_k: 400
+ nms_eta: 1.0
diff --git a/PaddleDetection-release-2.6/configs/ssd/ssd_mobilenet_v1_300_120e_voc.yml b/PaddleDetection-release-2.6/configs/ssd/ssd_mobilenet_v1_300_120e_voc.yml
new file mode 100644
index 0000000000000000000000000000000000000000..feaec0c43273aecded4dc1d6c63164ceef50c487
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ssd/ssd_mobilenet_v1_300_120e_voc.yml
@@ -0,0 +1,14 @@
+_BASE_: [
+ '../datasets/voc.yml',
+ '../runtime.yml',
+ '_base_/optimizer_120e.yml',
+ '_base_/ssd_mobilenet_v1_300.yml',
+ '_base_/ssd_mobilenet_reader.yml',
+]
+weights: output/ssd_mobilenet_v1_300_120e_voc/model_final
+
+# Set collate_batch to false: ground-truth info is needed on the VOC
+# dataset, so samples must not be collated into a single batch tensor
+# when the batch size is larger than 1.
+EvalReader:
+ collate_batch: false
diff --git a/PaddleDetection-release-2.6/configs/ssd/ssd_r34_70e_coco.yml b/PaddleDetection-release-2.6/configs/ssd/ssd_r34_70e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..3c5af37f06a37da90f398d681bc623dbed09b7c1
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ssd/ssd_r34_70e_coco.yml
@@ -0,0 +1,11 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/optimizer_70e.yml',
+ '_base_/ssd_r34_300.yml',
+ '_base_/ssd_r34_reader.yml',
+]
+weights: output/ssd_r34_70e_coco/model_final
+
+log_iter: 100
+snapshot_epoch: 5
diff --git a/PaddleDetection-release-2.6/configs/ssd/ssd_vgg16_300_240e_voc.yml b/PaddleDetection-release-2.6/configs/ssd/ssd_vgg16_300_240e_voc.yml
new file mode 100644
index 0000000000000000000000000000000000000000..ff24242a1fb94a8a895b6230684865bb40fff44a
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ssd/ssd_vgg16_300_240e_voc.yml
@@ -0,0 +1,14 @@
+_BASE_: [
+ '../datasets/voc.yml',
+ '../runtime.yml',
+ '_base_/optimizer_240e.yml',
+ '_base_/ssd_vgg16_300.yml',
+ '_base_/ssd_reader.yml',
+]
+weights: output/ssd_vgg16_300_240e_voc/model_final
+
+# Set collate_batch to false: ground-truth info is needed on the VOC
+# dataset, so samples must not be collated into a single batch tensor
+# when the batch size is larger than 1.
+EvalReader:
+ collate_batch: false
diff --git a/PaddleDetection-release-2.6/configs/ssd/ssdlite_ghostnet_320_coco.yml b/PaddleDetection-release-2.6/configs/ssd/ssdlite_ghostnet_320_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..c6eb6c11725bb041f12a1234ebe931cb019bd18e
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ssd/ssdlite_ghostnet_320_coco.yml
@@ -0,0 +1,27 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/optimizer_1700e.yml',
+ '_base_/ssdlite_ghostnet_320.yml',
+ '_base_/ssdlite320_reader.yml',
+]
+weights: output/ssdlite_ghostnet_320_coco/model_final
+
+epoch: 1700
+
+LearningRate:
+ base_lr: 0.2
+ schedulers:
+ - !CosineDecay
+ max_epochs: 1700
+ - !LinearWarmup
+ start_factor: 0.33333
+ steps: 2000
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/ssd/ssdlite_mobilenet_v1_300_coco.yml b/PaddleDetection-release-2.6/configs/ssd/ssdlite_mobilenet_v1_300_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..75cb8a8a20dad059f2f638378bbe4ef418bcfd27
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ssd/ssdlite_mobilenet_v1_300_coco.yml
@@ -0,0 +1,8 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/optimizer_1700e.yml',
+ '_base_/ssdlite_mobilenet_v1_300.yml',
+ '_base_/ssdlite300_reader.yml',
+]
+weights: output/ssdlite_mobilenet_v1_300_coco/model_final
diff --git a/PaddleDetection-release-2.6/configs/ssd/ssdlite_mobilenet_v3_large_320_coco.yml b/PaddleDetection-release-2.6/configs/ssd/ssdlite_mobilenet_v3_large_320_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..78d561aade1cfcf27ea15b56c1225ae2aebdc4da
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ssd/ssdlite_mobilenet_v3_large_320_coco.yml
@@ -0,0 +1,8 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/optimizer_1700e.yml',
+ '_base_/ssdlite_mobilenet_v3_large_320.yml',
+ '_base_/ssdlite320_reader.yml',
+]
+weights: output/ssdlite_mobilenet_v3_large_320_coco/model_final
diff --git a/PaddleDetection-release-2.6/configs/ssd/ssdlite_mobilenet_v3_small_320_coco.yml b/PaddleDetection-release-2.6/configs/ssd/ssdlite_mobilenet_v3_small_320_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..fa0ce5346b2d5a08fd7816ae986c7747a410af8b
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ssd/ssdlite_mobilenet_v3_small_320_coco.yml
@@ -0,0 +1,8 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/optimizer_1700e.yml',
+ '_base_/ssdlite_mobilenet_v3_small_320.yml',
+ '_base_/ssdlite320_reader.yml',
+]
+weights: output/ssdlite_mobilenet_v3_small_320_coco/model_final
diff --git a/PaddleDetection-release-2.6/configs/tood/README.md b/PaddleDetection-release-2.6/configs/tood/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..1eccb73dc50ced15bbe2cac9b11f340fba00ea78
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/tood/README.md
@@ -0,0 +1,35 @@
+# TOOD
+
+## Introduction
+
+[TOOD: Task-aligned One-stage Object Detection](https://arxiv.org/abs/2108.07755)
+
+TOOD is a task-aligned one-stage object detection model; we reproduce the model described in the paper.
+
+
+## Model Zoo
+
+| Backbone | Model | Images/GPU | Inf time (fps) | Box AP | Config | Download |
+|:------:|:--------:|:--------:|:--------------:|:------:|:------:|:--------:|
+| R-50 | TOOD | 4 | --- | 42.5 | [config](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/configs/tood/tood_r50_fpn_1x_coco.yml) | [model](https://paddledet.bj.bcebos.com/models/tood_r50_fpn_1x_coco.pdparams) |
+
+**Notes:**
+
+- TOOD is trained on the COCO train2017 dataset and evaluated on val2017; results are reported as `mAP(IoU=0.5:0.95)`.
+- TOOD is trained for 12 epochs on 8 GPUs.
+
+Multi-GPU training:
+```bash
+export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
+python -m paddle.distributed.launch --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/tood/tood_r50_fpn_1x_coco.yml --fleet
+```
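As a rough illustration (not part of the repo), the 1x learning-rate schedule configured in `_base_/optimizer_1x.yml` (base_lr 0.01, PiecewiseDecay at milestones [8, 11] with gamma 0.1, plus a 500-step LinearWarmup from factor 0.001) can be sketched in plain Python:

```python
def tood_1x_lr(epoch, step, base_lr=0.01, milestones=(8, 11), gamma=0.1,
               warmup_steps=500, start_factor=0.001):
    """Sketch of PiecewiseDecay + LinearWarmup as set in optimizer_1x.yml."""
    if step < warmup_steps:  # linear warmup over the first 500 iterations
        alpha = step / warmup_steps
        return base_lr * (start_factor * (1 - alpha) + alpha)
    # piecewise decay: multiply by gamma at each milestone epoch passed
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr

print(tood_1x_lr(5, 10000))  # after warmup, before milestones: 0.01
print(tood_1x_lr(9, 60000))  # past epoch 8: 0.001
```

This is only a sketch of how the two schedulers compose; the actual stepping is handled by PaddleDetection's `LearningRate` builder.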
+
+## Citations
+```
+@inproceedings{feng2021tood,
+ title={TOOD: Task-aligned One-stage Object Detection},
+ author={Feng, Chengjian and Zhong, Yujie and Gao, Yu and Scott, Matthew R and Huang, Weilin},
+ booktitle={ICCV},
+ year={2021}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/tood/_base_/optimizer_1x.yml b/PaddleDetection-release-2.6/configs/tood/_base_/optimizer_1x.yml
new file mode 100644
index 0000000000000000000000000000000000000000..39c54ac805031619debf9b31119afa86b3ead857
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/tood/_base_/optimizer_1x.yml
@@ -0,0 +1,19 @@
+epoch: 12
+
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [8, 11]
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 500
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/tood/_base_/tood_r50_fpn.yml b/PaddleDetection-release-2.6/configs/tood/_base_/tood_r50_fpn.yml
new file mode 100644
index 0000000000000000000000000000000000000000..0cb8575b09beb8ba4d0e20d2512bdac5b34ecaf1
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/tood/_base_/tood_r50_fpn.yml
@@ -0,0 +1,42 @@
+architecture: TOOD
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_cos_pretrained.pdparams
+
+TOOD:
+ backbone: ResNet
+ neck: FPN
+ head: TOODHead
+
+ResNet:
+ depth: 50
+ variant: b
+ norm_type: bn
+ freeze_at: 0
+ return_idx: [1, 2, 3]
+ num_stages: 4
+
+FPN:
+ out_channel: 256
+ spatial_scales: [0.125, 0.0625, 0.03125]
+ extra_stage: 2
+ has_extra_convs: true
+ use_c5: false
+
+TOODHead:
+ stacked_convs: 6
+ grid_cell_scale: 8
+ static_assigner_epoch: 4
+ loss_weight: { class: 1.0, iou: 2.0 }
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.05
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/tood/_base_/tood_reader.yml b/PaddleDetection-release-2.6/configs/tood/_base_/tood_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..2807a2b81b3e19f73791b90d024dcc03c79d3942
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/tood/_base_/tood_reader.yml
@@ -0,0 +1,40 @@
+worker_num: 4
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomFlip: {prob: 0.5}
+ - Resize: {target_size: [800, 1333], keep_ratio: true}
+ - NormalizeImage: {is_scale: true, mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ - PadGT: {}
+ batch_size: 4
+ shuffle: true
+ drop_last: true
+ collate_batch: true
+ use_shared_memory: true
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
diff --git a/PaddleDetection-release-2.6/configs/tood/tood_r50_fpn_1x_coco.yml b/PaddleDetection-release-2.6/configs/tood/tood_r50_fpn_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..3d05c9884ea12013ea7b599d9c04c81abd709f40
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/tood/tood_r50_fpn_1x_coco.yml
@@ -0,0 +1,11 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/tood_r50_fpn.yml',
+ '_base_/optimizer_1x.yml',
+ '_base_/tood_reader.yml',
+]
+
+weights: output/tood_r50_fpn_1x_coco/model_final
+find_unused_parameters: True
+log_iter: 100
diff --git a/PaddleDetection-release-2.6/configs/ttfnet/README.md b/PaddleDetection-release-2.6/configs/ttfnet/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..63836613b0a8aff5fd2ae0cd4b8d908c3cefdc55
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ttfnet/README.md
@@ -0,0 +1,68 @@
+# 1. TTFNet
+
+## Introduction
+
+TTFNet is a training-time-friendly network for real-time object detection. It addresses the slow convergence of CenterNet by proposing a new method that generates training samples with Gaussian kernels, effectively eliminating the ambiguity in the anchor-free head. Its simple, lightweight structure also makes it easy to extend to other tasks.
+
+**Features:**
+
+- Simple structure: only two heads are needed to predict object location and size, and time-consuming post-processing is removed.
+- Short training time: with a DarkNet53 backbone, 2 hours of training on 8 V100 GPUs is enough to reach good accuracy.
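As a toy illustration of the Gaussian-kernel idea above (an assumption-laden sketch, not the repo's implementation), a ground-truth box center can be turned into a heatmap of soft training samples like this:

```python
import math

def gaussian_heatmap(h, w, cx, cy, sigma_x, sigma_y):
    """Toy TTFNet-style target: each pixel is weighted by a 2D Gaussian
    centered on the box center, so nearby pixels become soft positives."""
    heat = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            heat[y][x] = math.exp(-((x - cx) ** 2 / (2 * sigma_x ** 2)
                                    + (y - cy) ** 2 / (2 * sigma_y ** 2)))
    return heat

heat = gaussian_heatmap(8, 8, cx=4, cy=4, sigma_x=1.5, sigma_y=1.5)
print(round(heat[4][4], 3))  # peak at the center: 1.0
```

In the real model this target generation is performed by the `Gt2TTFTarget` batch transform in the reader configs.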
+
+## Model Zoo
+
+| Backbone | Network | Images/GPU | LR schedule | Inference time (fps) | Box AP | Download | Config |
+| :-------------- | :------------- | :-----: | :-----: | :------------: | :-----: | :-----------------------------------------------------: | :-----: |
+| DarkNet53 | TTFNet | 12 | 1x | ---- | 33.5 | [model](https://paddledet.bj.bcebos.com/models/ttfnet_darknet53_1x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ttfnet/ttfnet_darknet53_1x_coco.yml) |
+
+
+
+
+
+# 2. PAFNet
+
+## Introduction
+
+PAFNet (Paddle Anchor Free) is PaddleDetection's optimized model based on TTFNet. Its accuracy reaches the SOTA level among anchor-free detectors, and a lightweight mobile variant, PAFNet-Lite, is also provided.
+
+The PAFNet series optimizes the TTFNet model in the following aspects:
+
+- [CutMix](https://arxiv.org/abs/1905.04899)
+- A better backbone: ResNet50vd-DCN
+- A larger training batch size: 8 GPUs, batch_size=18 per GPU
+- Synchronized Batch Normalization
+- [Deformable Convolution](https://arxiv.org/abs/1703.06211)
+- [Exponential Moving Average](https://www.investopedia.com/terms/e/ema.asp)
+- Better pretrained weights
+
+
+## Model Zoo
+
+| Backbone | Network | Images/GPU | LR schedule | Inference time (fps) | Box AP | Download | Config |
+| :-------------- | :------------- | :-----: | :-----: | :------------: | :-----: | :-----------------------------------------------------: | :-----: |
+| ResNet50vd | PAFNet | 18 | 10x | ---- | 39.8 | [model](https://paddledet.bj.bcebos.com/models/pafnet_10x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ttfnet/pafnet_10x_coco.yml) |
+
+
+
+### PAFNet-Lite
+
+| Backbone | Network | Images/GPU | LR schedule | Box AP | Kirin 990 latency (ms) | Model size (MB) | Download | Config |
+| :-------------- | :------------- | :-----: | :-----: | :-----: | :------------: | :-----: | :-----------------------------------------------------: | :-----: |
+| MobileNetv3 | PAFNet-Lite | 12 | 20x | 23.9 | 26.00 | 14 | [model](https://paddledet.bj.bcebos.com/models/pafnet_lite_mobilenet_v3_20x_coco.pdparams) | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ttfnet/pafnet_lite_mobilenet_v3_20x_coco.yml) |
+
+**Note:** Due to the overall upgrade of the dynamic-graph framework, evaluating the PAFNet weights released by PaddleDetection requires the extra `--bias` flag, for example
+
+```bash
+# Evaluate with the weights released by PaddleDetection
+CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/ttfnet/pafnet_10x_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/pafnet_10x_coco.pdparams --bias
+```
+
+## Citations
+```
+@article{liu2019training,
+ title = {Training-Time-Friendly Network for Real-Time Object Detection},
+  author = {Zili Liu and Tu Zheng and Guodong Xu and Zheng Yang and Haifeng Liu and Deng Cai},
+ journal = {arXiv preprint arXiv:1909.00700},
+ year = {2019}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/ttfnet/README_en.md b/PaddleDetection-release-2.6/configs/ttfnet/README_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..f08fcfdfa9276194ca954cf6e9ae0deea1989487
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ttfnet/README_en.md
@@ -0,0 +1,69 @@
+# 1. TTFNet
+
+## Introduction
+
+TTFNet is a training-time-friendly network for real-time object detection. It addresses the slow convergence of CenterNet by proposing a new method that generates training samples with Gaussian kernels, which effectively eliminates the ambiguity in the anchor-free head. Its simple and lightweight structure also makes it easy to extend to other tasks.
+
+
+**Characteristics:**
+
+- Simple structure: only two heads are required to predict object position and size, and time-consuming post-processing is removed.
+- Short training time: with a DarkNet53 backbone, 2 hours of training on 8 V100 GPUs is enough to achieve good accuracy.
+
+## Model Zoo
+
+| Backbone | Network type | Number of images per GPU | Learning rate strategy | Inference time (fps) | Box AP | Download | Configuration File |
+| :-------- | :----------- | :----------------------: | :--------------------: | :-----------------: | :----: | :------------------------------------------------------------------------------: | :----------------------------------------------------------------------------------------------------------------------------: |
+| DarkNet53 | TTFNet | 12 | 1x | ---- | 33.5 | [link](https://paddledet.bj.bcebos.com/models/ttfnet_darknet53_1x_coco.pdparams) | [Configuration File](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ttfnet/ttfnet_darknet53_1x_coco.yml) |
+
+
+
+
+
+# 2. PAFNet
+
+## Introduction
+
+PAFNet (Paddle Anchor Free) is an optimized model in PaddleDetection based on TTFNet. Its accuracy reaches the SOTA level in the anchor-free field, and a lightweight mobile model, PAFNet-Lite, is also provided.
+
+PAFNet series models optimize TTFNet model from the following aspects:
+
+- [CutMix](https://arxiv.org/abs/1905.04899)
+- Better backbone network: ResNet50vd-DCN
+- Larger training batch size: 8 GPUs, batch size 18 per GPU
+- Synchronized Batch Normalization
+- [Deformable Convolution](https://arxiv.org/abs/1703.06211)
+- [Exponential Moving Average](https://www.investopedia.com/terms/e/ema.asp)
+- Better pretrained weights
+
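As a quick sketch of the Exponential Moving Average trick listed above (illustrative only; `ema_decay: 0.9998` is the value set in `_base_/pafnet.yml`), EMA keeps a shadow copy of each weight that is nudged toward the live weights every step:

```python
def ema_update(shadow, params, decay=0.9998):
    """One EMA step: shadow <- decay * shadow + (1 - decay) * params."""
    return [decay * s + (1.0 - decay) * p for s, p in zip(shadow, params)]

shadow = [0.0, 0.0]
for _ in range(1000):            # repeatedly nudge toward the live weights
    shadow = ema_update(shadow, [1.0, 2.0])
print(shadow)  # slowly approaches [1.0, 2.0]
```

The shadow weights are what get evaluated and exported, which smooths out the noise of individual optimizer steps.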
+
+## Model library
+
+| Backbone | Net type | Number of images per GPU | Learning rate strategy | Inference time (fps) | Box AP | Download | Configuration File |
+| :--------- | :------- | :----------------------: | :--------------------: | :-----------------: | :----: | :---------------------------------------------------------------------: | :-------------------------------------------------------------------------------------------------------------------: |
+| ResNet50vd | PAFNet | 18 | 10x | ---- | 39.8 | [link](https://paddledet.bj.bcebos.com/models/pafnet_10x_coco.pdparams) | [Configuration File](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ttfnet/pafnet_10x_coco.yml) |
+
+
+
+### PAFNet-Lite
+
+| Backbone | Net type | Number of images per GPU | Learning rate strategy | Box AP | Kirin 990 latency (ms) | Model size (MB) | Download | Configuration File |
+| :---------- | :---------- | :----------------------: | :--------------------: | :----: | :-------------------: | :---------: | :---------------------------------------------------------------------------------------: | :-------------------------------------------------------------------------------------------------------------------------------------: |
+| MobileNetv3 | PAFNet-Lite | 12 | 20x | 23.9 | 26.00 | 14 | [link](https://paddledet.bj.bcebos.com/models/pafnet_lite_mobilenet_v3_20x_coco.pdparams) | [Configuration File](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ttfnet/pafnet_lite_mobilenet_v3_20x_coco.yml) |
+
+**Attention:** Due to the overall upgrade of the dynamic-graph framework, evaluating the PAFNet weights released by PaddleDetection requires adding the `--bias` flag, for example
+
+```bash
+# Evaluate with the weights released by PaddleDetection
+CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/ttfnet/pafnet_10x_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/pafnet_10x_coco.pdparams --bias
+```
+
+## Citations
+```
+@article{liu2019training,
+ title = {Training-Time-Friendly Network for Real-Time Object Detection},
+  author = {Zili Liu and Tu Zheng and Guodong Xu and Zheng Yang and Haifeng Liu and Deng Cai},
+ journal = {arXiv preprint arXiv:1909.00700},
+ year = {2019}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/ttfnet/_base_/optimizer_10x.yml b/PaddleDetection-release-2.6/configs/ttfnet/_base_/optimizer_10x.yml
new file mode 100644
index 0000000000000000000000000000000000000000..dd2c29d966650d76b0636b3f889e13efbbe5d95a
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ttfnet/_base_/optimizer_10x.yml
@@ -0,0 +1,19 @@
+epoch: 120
+
+LearningRate:
+ base_lr: 0.015
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [80, 110]
+ - !LinearWarmup
+ start_factor: 0.2
+ steps: 500
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0004
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/ttfnet/_base_/optimizer_1x.yml b/PaddleDetection-release-2.6/configs/ttfnet/_base_/optimizer_1x.yml
new file mode 100644
index 0000000000000000000000000000000000000000..8457ead9add410c85d75c0427748e6d3d4eb8319
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ttfnet/_base_/optimizer_1x.yml
@@ -0,0 +1,19 @@
+epoch: 12
+
+LearningRate:
+ base_lr: 0.015
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [8, 11]
+ - !LinearWarmup
+ start_factor: 0.2
+ steps: 500
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0004
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/ttfnet/_base_/optimizer_20x.yml b/PaddleDetection-release-2.6/configs/ttfnet/_base_/optimizer_20x.yml
new file mode 100644
index 0000000000000000000000000000000000000000..4dd3492202a3fdf9a612541c0ecd1dc76f1b6519
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ttfnet/_base_/optimizer_20x.yml
@@ -0,0 +1,20 @@
+epoch: 240
+
+LearningRate:
+ base_lr: 0.015
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [160, 220]
+ - !LinearWarmup
+ start_factor: 0.2
+ steps: 1000
+
+OptimizerBuilder:
+ clip_grad_by_norm: 35
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0004
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/ttfnet/_base_/pafnet.yml b/PaddleDetection-release-2.6/configs/ttfnet/_base_/pafnet.yml
new file mode 100644
index 0000000000000000000000000000000000000000..c3b21c5cce9d59f0f382a4d051d58e8f4ecdc0bb
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ttfnet/_base_/pafnet.yml
@@ -0,0 +1,40 @@
+architecture: TTFNet
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_pretrained.pdparams
+norm_type: sync_bn
+use_ema: true
+ema_decay: 0.9998
+
+TTFNet:
+ backbone: ResNet
+ neck: TTFFPN
+ ttf_head: TTFHead
+ post_process: BBoxPostProcess
+
+ResNet:
+ depth: 50
+ variant: d
+ return_idx: [0, 1, 2, 3]
+ freeze_at: -1
+ norm_decay: 0.
+ dcn_v2_stages: [1, 2, 3]
+
+TTFFPN:
+ planes: [256, 128, 64]
+ shortcut_num: [3, 2, 1]
+
+TTFHead:
+ dcn_head: true
+ hm_loss:
+ name: CTFocalLoss
+ loss_weight: 1.
+ wh_loss:
+ name: GIoULoss
+ loss_weight: 5.
+ reduction: sum
+
+BBoxPostProcess:
+ decode:
+ name: TTFBox
+ max_per_img: 100
+ score_thresh: 0.01
+ down_ratio: 4
diff --git a/PaddleDetection-release-2.6/configs/ttfnet/_base_/pafnet_lite.yml b/PaddleDetection-release-2.6/configs/ttfnet/_base_/pafnet_lite.yml
new file mode 100644
index 0000000000000000000000000000000000000000..5ed2fa235b6eb0f35690183a884dabbea43b279e
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ttfnet/_base_/pafnet_lite.yml
@@ -0,0 +1,44 @@
+architecture: TTFNet
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/MobileNetV3_large_x1_0_ssld_pretrained.pdparams
+norm_type: sync_bn
+
+TTFNet:
+ backbone: MobileNetV3
+ neck: TTFFPN
+ ttf_head: TTFHead
+ post_process: BBoxPostProcess
+
+MobileNetV3:
+ scale: 1.0
+ model_name: large
+ feature_maps: [5, 8, 14, 17]
+ with_extra_blocks: true
+ lr_mult_list: [0.25, 0.25, 0.5, 0.5, 0.75]
+ conv_decay: 0.00001
+ norm_decay: 0.0
+ extra_block_filters: []
+
+TTFFPN:
+ planes: [96, 48, 24]
+ shortcut_num: [2, 2, 1]
+ lite_neck: true
+ fusion_method: concat
+
+TTFHead:
+ hm_head_planes: 48
+ wh_head_planes: 24
+ lite_head: true
+ hm_loss:
+ name: CTFocalLoss
+ loss_weight: 1.
+ wh_loss:
+ name: GIoULoss
+ loss_weight: 5.
+ reduction: sum
+
+BBoxPostProcess:
+ decode:
+ name: TTFBox
+ max_per_img: 100
+ score_thresh: 0.01
+ down_ratio: 4
diff --git a/PaddleDetection-release-2.6/configs/ttfnet/_base_/pafnet_lite_reader.yml b/PaddleDetection-release-2.6/configs/ttfnet/_base_/pafnet_lite_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..024792114961e98f764e572efc548fb0bec7a7e5
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ttfnet/_base_/pafnet_lite_reader.yml
@@ -0,0 +1,37 @@
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {brightness: [-32., 32., 0.5], random_apply: False, random_channel: True}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomCrop: {aspect_ratio: NULL, cover_all_box: True}
+ - RandomFlip: {}
+ - GridMask: {upper_iter: 300000}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [320, 352, 384, 416, 448, 480, 512], random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [123.675, 116.28, 103.53], std: [58.395, 57.12, 57.375], is_scale: false}
+ - Permute: {}
+ - Gt2TTFTarget: {down_ratio: 4}
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 12
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 1, target_size: [320, 320], keep_ratio: False}
+ - NormalizeImage: {is_scale: false, mean: [123.675, 116.28, 103.53], std: [58.395, 57.12, 57.375]}
+ - Permute: {}
+ batch_size: 1
+ drop_last: false
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 1, target_size: [320, 320], keep_ratio: False}
+ - NormalizeImage: {is_scale: false, mean: [123.675, 116.28, 103.53], std: [58.395, 57.12, 57.375]}
+ - Permute: {}
+ batch_size: 1
+ drop_last: false
diff --git a/PaddleDetection-release-2.6/configs/ttfnet/_base_/pafnet_reader.yml b/PaddleDetection-release-2.6/configs/ttfnet/_base_/pafnet_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..ccbbdb257b21fca0a9901512d6f2f1962fc5105e
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ttfnet/_base_/pafnet_reader.yml
@@ -0,0 +1,36 @@
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {brightness: [-32., 32., 0.5], random_apply: false, random_channel: true}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomCrop: {aspect_ratio: NULL, cover_all_box: True}
+ - RandomFlip: {prob: 0.5}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [416, 448, 480, 512, 544, 576, 608, 640, 672], keep_ratio: false}
+ - NormalizeImage: {mean: [123.675, 116.28, 103.53], std: [58.395, 57.12, 57.375], is_scale: false}
+ - Permute: {}
+ - Gt2TTFTarget: {down_ratio: 4}
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 18
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 1, target_size: [512, 512], keep_ratio: False}
+ - NormalizeImage: {is_scale: false, mean: [123.675, 116.28, 103.53], std: [58.395, 57.12, 57.375]}
+ - Permute: {}
+ batch_size: 1
+ drop_last: false
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 1, target_size: [512, 512], keep_ratio: False}
+ - NormalizeImage: {is_scale: false, mean: [123.675, 116.28, 103.53], std: [58.395, 57.12, 57.375]}
+ - Permute: {}
+ batch_size: 1
+ drop_last: false
diff --git a/PaddleDetection-release-2.6/configs/ttfnet/_base_/ttfnet_darknet53.yml b/PaddleDetection-release-2.6/configs/ttfnet/_base_/ttfnet_darknet53.yml
new file mode 100644
index 0000000000000000000000000000000000000000..05c7dce6503209c76da2c62613e3e2960ce47cc0
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ttfnet/_base_/ttfnet_darknet53.yml
@@ -0,0 +1,35 @@
+architecture: TTFNet
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/DarkNet53_pretrained.pdparams
+
+TTFNet:
+ backbone: DarkNet
+ neck: TTFFPN
+ ttf_head: TTFHead
+ post_process: BBoxPostProcess
+
+DarkNet:
+ depth: 53
+ freeze_at: 0
+ return_idx: [1, 2, 3, 4]
+ norm_type: bn
+ norm_decay: 0.0004
+
+TTFFPN:
+ planes: [256, 128, 64]
+ shortcut_num: [3, 2, 1]
+
+TTFHead:
+ hm_loss:
+ name: CTFocalLoss
+ loss_weight: 1.
+ wh_loss:
+ name: GIoULoss
+ loss_weight: 5.
+ reduction: sum
+
+BBoxPostProcess:
+ decode:
+ name: TTFBox
+ max_per_img: 100
+ score_thresh: 0.01
+ down_ratio: 4
diff --git a/PaddleDetection-release-2.6/configs/ttfnet/_base_/ttfnet_reader.yml b/PaddleDetection-release-2.6/configs/ttfnet/_base_/ttfnet_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..9c12af727db6e7a199b76ccfda2286bc891a5b22
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ttfnet/_base_/ttfnet_reader.yml
@@ -0,0 +1,33 @@
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomFlip: {prob: 0.5}
+ - Resize: {interp: 1, target_size: [512, 512], keep_ratio: False}
+ - NormalizeImage: {mean: [123.675, 116.28, 103.53], std: [58.395, 57.12, 57.375], is_scale: false}
+ - Permute: {}
+ batch_transforms:
+ - Gt2TTFTarget: {down_ratio: 4}
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 12
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 1, target_size: [512, 512], keep_ratio: False}
+ - NormalizeImage: {is_scale: false, mean: [123.675, 116.28, 103.53], std: [58.395, 57.12, 57.375]}
+ - Permute: {}
+ batch_size: 1
+ drop_last: false
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 1, target_size: [512, 512], keep_ratio: False}
+ - NormalizeImage: {is_scale: false, mean: [123.675, 116.28, 103.53], std: [58.395, 57.12, 57.375]}
+ - Permute: {}
+ batch_size: 1
+ drop_last: false
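These readers normalize raw 0-255 pixels directly (`is_scale: false`), so the mean/std values are the familiar ImageNet statistics in [0, 1] units multiplied by 255. A quick check of that equivalence:

```python
# ImageNet mean/std in [0, 1] units (used with is_scale: true elsewhere)
MEAN_01 = [0.485, 0.456, 0.406]
STD_01 = [0.229, 0.224, 0.225]

# Pixel-unit equivalents used by this reader (is_scale: false)
MEAN_255 = [m * 255 for m in MEAN_01]  # ≈ [123.675, 116.28, 103.53]
STD_255 = [s * 255 for s in STD_01]    # ≈ [58.395, 57.12, 57.375]

# Both parameterizations give the same normalized value for a pixel
px = 200.0
norm_a = (px - MEAN_255[0]) / STD_255[0]        # raw-pixel path
norm_b = (px / 255.0 - MEAN_01[0]) / STD_01[0]  # pre-scaled path
assert abs(norm_a - norm_b) < 1e-6
```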
diff --git a/PaddleDetection-release-2.6/configs/ttfnet/pafnet_10x_coco.yml b/PaddleDetection-release-2.6/configs/ttfnet/pafnet_10x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..b14a2bc912cce4cc4b0edca538cc19c3e51f65a5
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ttfnet/pafnet_10x_coco.yml
@@ -0,0 +1,8 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/optimizer_10x.yml',
+ '_base_/pafnet.yml',
+ '_base_/pafnet_reader.yml',
+]
+weights: output/pafnet_10x_coco/model_final
diff --git a/PaddleDetection-release-2.6/configs/ttfnet/pafnet_lite_mobilenet_v3_20x_coco.yml b/PaddleDetection-release-2.6/configs/ttfnet/pafnet_lite_mobilenet_v3_20x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..577af1635acc3c7778114db78775bd720727a588
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ttfnet/pafnet_lite_mobilenet_v3_20x_coco.yml
@@ -0,0 +1,8 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/optimizer_20x.yml',
+ '_base_/pafnet_lite.yml',
+ '_base_/pafnet_lite_reader.yml',
+]
+weights: output/pafnet_lite_mobilenet_v3_10x_coco/model_final
diff --git a/PaddleDetection-release-2.6/configs/ttfnet/ttfnet_darknet53_1x_coco.yml b/PaddleDetection-release-2.6/configs/ttfnet/ttfnet_darknet53_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..59123921f43742fa1bfc9d98ecd50f3bdb0bfaa7
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/ttfnet/ttfnet_darknet53_1x_coco.yml
@@ -0,0 +1,8 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/optimizer_1x.yml',
+ '_base_/ttfnet_darknet53.yml',
+ '_base_/ttfnet_reader.yml',
+]
+weights: output/ttfnet_darknet53_1x_coco/model_final
diff --git a/PaddleDetection-release-2.6/configs/vitdet/README.md b/PaddleDetection-release-2.6/configs/vitdet/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..e0037858544b671de34c79f32f43baa9525d9db4
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/vitdet/README.md
@@ -0,0 +1,69 @@
+# Vision Transformer Detection
+
+## Introduction
+
+- [Context Autoencoder for Self-Supervised Representation Learning](https://arxiv.org/abs/2202.03026)
+- [Benchmarking Detection Transfer Learning with Vision Transformers](https://arxiv.org/pdf/2111.11429.pdf)
+
+Object detection is a central downstream task used to
+test if pre-trained network parameters confer benefits, such
+as improved accuracy or training speed. The complexity
+of object detection methods can make this benchmarking
+non-trivial when new architectures, such as Vision Transformer (ViT) models, arrive.
+
+## Model Zoo
+
+| Model | Backbone | Pretrained | Scheduler | Images/GPU | Box AP | Mask AP | Config | Download |
+|:------:|:--------:|:--------------:|:--------------:|:--------------:|:--------------:|:------:|:------:|:--------:|
+| Cascade RCNN | ViT-base | CAE | 1x | 1 | 52.7 | - | [config](./cascade_rcnn_vit_base_hrfpn_cae_1x_coco.yml) | [model](https://bj.bcebos.com/v1/paddledet/models/cascade_rcnn_vit_base_hrfpn_cae_1x_coco.pdparams) |
+| Cascade RCNN | ViT-large | CAE | 1x | 1 | 55.7 | - | [config](./cascade_rcnn_vit_large_hrfpn_cae_1x_coco.yml) | [model](https://bj.bcebos.com/v1/paddledet/models/cascade_rcnn_vit_large_hrfpn_cae_1x_coco.pdparams) |
+| PP-YOLOE | ViT-base | CAE | 36e | 2 | 52.2 | - | [config](./ppyoloe_vit_base_csppan_cae_36e_coco.yml) | [model](https://bj.bcebos.com/v1/paddledet/models/ppyoloe_vit_base_csppan_cae_36e_coco.pdparams) |
+| Mask RCNN | ViT-base | CAE | 1x | 1 | 50.6 | 44.9 | [config](./mask_rcnn_vit_base_hrfpn_cae_1x_coco.yml) | [model](https://bj.bcebos.com/v1/paddledet/models/mask_rcnn_vit_base_hrfpn_cae_1x_coco.pdparams) |
+| Mask RCNN | ViT-large | CAE | 1x | 1 | 54.2 | 47.4 | [config](./mask_rcnn_vit_large_hrfpn_cae_1x_coco.yml) | [model](https://bj.bcebos.com/v1/paddledet/models/mask_rcnn_vit_large_hrfpn_cae_1x_coco.pdparams) |
+
+
+**Notes:**
+- Models are trained on the COCO train2017 set and evaluated on val2017; the metric is `mAP (IoU=0.5:0.95)`.
+- Base models are trained on 8x 32G V100 GPUs, large models on 8x 80G A100 GPUs.
+- The `Cascade RCNN` experiments are based on PaddlePaddle 2.2.2.
+
+## Citations
+```
+@article{chen2022context,
+ title={Context autoencoder for self-supervised representation learning},
+ author={Chen, Xiaokang and Ding, Mingyu and Wang, Xiaodi and Xin, Ying and Mo, Shentong and Wang, Yunhao and Han, Shumin and Luo, Ping and Zeng, Gang and Wang, Jingdong},
+ journal={arXiv preprint arXiv:2202.03026},
+ year={2022}
+}
+
+@article{DBLP:journals/corr/abs-2111-11429,
+ author = {Yanghao Li and
+ Saining Xie and
+ Xinlei Chen and
+ Piotr Doll{\'{a}}r and
+ Kaiming He and
+ Ross B. Girshick},
+ title = {Benchmarking Detection Transfer Learning with Vision Transformers},
+ journal = {CoRR},
+ volume = {abs/2111.11429},
+ year = {2021},
+ url = {https://arxiv.org/abs/2111.11429},
+ eprinttype = {arXiv},
+ eprint = {2111.11429},
+ timestamp = {Fri, 26 Nov 2021 13:48:43 +0100},
+ biburl = {https://dblp.org/rec/journals/corr/abs-2111-11429.bib},
+ bibsource = {dblp computer science bibliography, https://dblp.org}
+}
+
+@article{Cai_2019,
+ title={Cascade R-CNN: High Quality Object Detection and Instance Segmentation},
+ ISSN={1939-3539},
+ url={http://dx.doi.org/10.1109/tpami.2019.2956516},
+ DOI={10.1109/tpami.2019.2956516},
+ journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
+ publisher={Institute of Electrical and Electronics Engineers (IEEE)},
+ author={Cai, Zhaowei and Vasconcelos, Nuno},
+ year={2019},
+ pages={1–1}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/vitdet/_base_/faster_rcnn_reader.yml b/PaddleDetection-release-2.6/configs/vitdet/_base_/faster_rcnn_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..e1165cd0a03fd07f41eaea2701526639010cc7e9
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/vitdet/_base_/faster_rcnn_reader.yml
@@ -0,0 +1,41 @@
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomResizeCrop: {resizes: [400, 500, 600], cropsizes: [[384, 600], ], prob: 0.5}
+ - RandomResize: {target_size: [[480, 1333], [512, 1333], [544, 1333], [576, 1333], [608, 1333], [640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], keep_ratio: True, interp: 2}
+ - RandomFlip: {prob: 0.5}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 2
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+
+TestReader:
+ inputs_def:
+ image_shape: [-1, 3, 640, 640]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: 640, keep_ratio: True}
+ - Pad: {size: 640}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
diff --git a/PaddleDetection-release-2.6/configs/vitdet/_base_/mask_rcnn_reader.yml b/PaddleDetection-release-2.6/configs/vitdet/_base_/mask_rcnn_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..83fd376b730ed10767508d3541e778d9663f4555
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/vitdet/_base_/mask_rcnn_reader.yml
@@ -0,0 +1,41 @@
+worker_num: 2
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ # - RandomResizeCrop: {resizes: [400, 500, 600], cropsizes: [[384, 600], ], prob: 0.5}
+ - RandomResize: {target_size: [[640, 1333], [672, 1333], [704, 1333], [736, 1333], [768, 1333], [800, 1333]], interp: 2, keep_ratio: True}
+ - RandomFlip: {prob: 0.5}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: true
+ drop_last: true
+ collate_batch: false
+ use_shared_memory: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: [800, 1333], keep_ratio: True}
+ - NormalizeImage: {is_scale: true, mean: [0.485,0.456,0.406], std: [0.229, 0.224,0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ shuffle: false
+ drop_last: false
diff --git a/PaddleDetection-release-2.6/configs/vitdet/_base_/optimizer_base_1x.yml b/PaddleDetection-release-2.6/configs/vitdet/_base_/optimizer_base_1x.yml
new file mode 100644
index 0000000000000000000000000000000000000000..b822b3bf92a6a12facafe4b569a0ebcad3cf1d3b
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/vitdet/_base_/optimizer_base_1x.yml
@@ -0,0 +1,22 @@
+epoch: 12
+
+LearningRate:
+ base_lr: 0.0001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [9, 11]
+ - !LinearWarmup
+ start_factor: 0.001
+ steps: 1000
+
+OptimizerBuilder:
+ optimizer:
+ type: AdamWDL
+ betas: [0.9, 0.999]
+ layer_decay: 0.75
+ weight_decay: 0.02
+ num_layers: 12
+ filter_bias_and_bn: True
+ skip_decay_names: ['pos_embed', 'cls_token']
+ set_param_lr_func: 'layerwise_lr_decay'
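The `layer_decay: 0.75` entry together with `set_param_lr_func: 'layerwise_lr_decay'` scales each transformer block's learning rate by depth, so early (more generic) layers move less than the detection head. A minimal Python sketch of the usual multiplier schedule (the exact parameter-to-layer grouping in PaddleDetection's `AdamWDL` may differ):

```python
def layerwise_lr_scale(layer_id: int, num_layers: int, layer_decay: float) -> float:
    """Per-layer lr multiplier: later blocks decay less.

    Convention assumed here: layer_id 0 is the patch embedding and
    layer_id == num_layers is the head, which trains at the full base_lr.
    """
    return layer_decay ** (num_layers - layer_id)

num_layers, layer_decay = 12, 0.75
scales = [layerwise_lr_scale(i, num_layers, layer_decay) for i in range(num_layers + 1)]
assert scales[-1] == 1.0                                # head: full base_lr
assert all(a < b for a, b in zip(scales, scales[1:]))   # monotonically increasing
```

With `num_layers: 12`, the patch embedding trains at roughly 0.75^12 ≈ 3% of `base_lr` while the head sees the full rate.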
diff --git a/PaddleDetection-release-2.6/configs/vitdet/_base_/optimizer_base_36e.yml b/PaddleDetection-release-2.6/configs/vitdet/_base_/optimizer_base_36e.yml
new file mode 100644
index 0000000000000000000000000000000000000000..83b8708d046ea2ce57345ba543ed39453389f45d
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/vitdet/_base_/optimizer_base_36e.yml
@@ -0,0 +1,20 @@
+
+epoch: 36
+
+LearningRate:
+ base_lr: 0.0001
+ schedulers:
+ - !CosineDecay
+ max_epochs: 36
+      min_lr_ratio: 0.1
+ - !LinearWarmup
+ start_factor: 0.001
+ epochs: 1
+
+
+OptimizerBuilder:
+ clip_grad_by_norm: 0.1
+ regularizer: false
+ optimizer:
+ type: AdamW
+ weight_decay: 0.0001
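The schedule above ramps the learning rate linearly over the first epoch, then follows a cosine from `base_lr` down to `min_lr_ratio * base_lr`. A rough sketch of that shape (hypothetical helper; PaddleDetection's actual scheduler operates per iteration and may handle the warmup window differently):

```python
import math

def lr_at(epoch_frac, base_lr=1e-4, max_epochs=36, min_lr_ratio=0.1,
          warmup_epochs=1, start_factor=0.001):
    """LinearWarmup followed by CosineDecay with a floor (illustrative only)."""
    if epoch_frac < warmup_epochs:
        # linear ramp from start_factor * base_lr up to base_lr
        alpha = epoch_frac / warmup_epochs
        return base_lr * (start_factor + (1 - start_factor) * alpha)
    # cosine from base_lr down to min_lr_ratio * base_lr
    progress = (epoch_frac - warmup_epochs) / (max_epochs - warmup_epochs)
    cosine = 0.5 * (1 + math.cos(math.pi * progress))
    min_lr = min_lr_ratio * base_lr
    return min_lr + (base_lr - min_lr) * cosine

assert abs(lr_at(1) - 1e-4) < 1e-12   # full lr at warmup end
assert abs(lr_at(36) - 1e-5) < 1e-12  # floor: min_lr_ratio * base_lr
```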
diff --git a/PaddleDetection-release-2.6/configs/vitdet/_base_/ppyoloe_reader.yml b/PaddleDetection-release-2.6/configs/vitdet/_base_/ppyoloe_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a4feaff4a1c1d64556bd787bd36b7ec7c6b08d81
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/vitdet/_base_/ppyoloe_reader.yml
@@ -0,0 +1,40 @@
+worker_num: 4
+eval_height: &eval_height 640
+eval_width: &eval_width 640
+eval_size: &eval_size [*eval_height, *eval_width]
+
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [320, 352, 384, 416, 448, 480, 512, 544, 576, 608, 640, 672, 704, 736, 768], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ - PadGT: {}
+ batch_size: 2
+ shuffle: true
+ drop_last: true
+ use_shared_memory: true
+ collate_batch: true
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ batch_size: 2
+
+TestReader:
+ inputs_def:
+ image_shape: [3, *eval_height, *eval_width]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: *eval_size, keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/vitdet/cascade_rcnn_vit_base_hrfpn_cae_1x_coco.yml b/PaddleDetection-release-2.6/configs/vitdet/cascade_rcnn_vit_base_hrfpn_cae_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..f7808b7cc3ed90be63a984f8552731e3e0e289f7
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/vitdet/cascade_rcnn_vit_base_hrfpn_cae_1x_coco.yml
@@ -0,0 +1,131 @@
+
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ './_base_/faster_rcnn_reader.yml',
+ './_base_/optimizer_base_1x.yml'
+]
+
+weights: output/cascade_rcnn_vit_base_hrfpn_cae_1x_coco/model_final
+
+
+# runtime
+log_iter: 100
+snapshot_epoch: 1
+find_unused_parameters: True
+
+use_gpu: true
+norm_type: sync_bn
+
+
+# reader
+worker_num: 2
+TrainReader:
+ batch_size: 1
+
+
+# model
+architecture: CascadeRCNN
+
+CascadeRCNN:
+ backbone: VisionTransformer
+ neck: HRFPN
+ rpn_head: RPNHead
+ bbox_head: CascadeHead
+ # post process
+ bbox_post_process: BBoxPostProcess
+
+
+VisionTransformer:
+ patch_size: 16
+ embed_dim: 768
+ depth: 12
+ num_heads: 12
+ mlp_ratio: 4
+ qkv_bias: True
+ drop_rate: 0.0
+ drop_path_rate: 0.2
+ init_values: 0.1
+ final_norm: False
+ use_rel_pos_bias: False
+ use_sincos_pos_emb: True
+ epsilon: 0.000001 # 1e-6
+ out_indices: [3, 5, 7, 11]
+ with_fpn: True
+ pretrained: https://bj.bcebos.com/v1/paddledet/models/pretrained/vit_base_cae_pretrained.pdparams
+
+HRFPN:
+ out_channel: 256
+ use_bias: True
+
+RPNHead:
+ anchor_generator:
+ aspect_ratios: [0.5, 1.0, 2.0]
+ anchor_sizes: [[32], [64], [128], [256], [512]]
+ strides: [4, 8, 16, 32, 64]
+ rpn_target_assign:
+ batch_size_per_im: 256
+ fg_fraction: 0.5
+ negative_overlap: 0.3
+ positive_overlap: 0.7
+ use_random: True
+ train_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 2000
+ post_nms_top_n: 2000
+ topk_after_collect: True
+ test_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 1000
+ post_nms_top_n: 1000
+ loss_rpn_bbox: SmoothL1Loss
+
+SmoothL1Loss:
+ beta: 0.1111111111111111
+
+
+CascadeHead:
+ head: CascadeXConvNormHead
+ roi_extractor:
+ resolution: 7
+ sampling_ratio: 0
+ aligned: True
+ bbox_assigner: BBoxAssigner
+ bbox_loss: GIoULoss
+ num_cascade_stages: 3
+ reg_class_agnostic: False
+ stage_loss_weights: [1, 0.5, 0.25]
+ loss_normalize_pos: True
+ add_gt_as_proposals: [True, True, True]
+
+
+BBoxAssigner:
+ batch_size_per_im: 512
+ bg_thresh: 0.5
+ fg_thresh: 0.5
+ fg_fraction: 0.25
+ cascade_iou: [0.5, 0.6, 0.7]
+ use_random: True
+
+
+CascadeXConvNormHead:
+ norm_type: bn
+
+
+GIoULoss:
+ loss_weight: 10.
+ reduction: 'none'
+ eps: 0.000001
+
+
+BBoxPostProcess:
+ decode:
+ name: RCNNBox
+ prior_box_var: [30.0, 30.0, 15.0, 15.0]
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.05
+ nms_threshold: 0.5
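With one `anchor_sizes` entry per FPN stride and three `aspect_ratios`, the RPN above places 3 anchors per location on each level, each preserving the area of its base size. A small sketch (assuming the common `ratio = h / w` convention, which may differ from the exact generator implementation):

```python
import math

def anchors_wh(size, aspect_ratios):
    """Width/height pairs for one base size across aspect ratios.

    Assumes ratio = h / w, so every anchor keeps the base area:
    w * h == size * size.
    """
    out = []
    for r in aspect_ratios:
        h = size * math.sqrt(r)
        w = size / math.sqrt(r)
        out.append((w, h))
    return out

# One size per FPN level (strides 4..64), three ratios -> 3 anchors/location/level
for size in (32, 64, 128, 256, 512):
    for w, h in anchors_wh(size, [0.5, 1.0, 2.0]):
        assert abs(w * h - size * size) < 1e-6  # area preserved
```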
diff --git a/PaddleDetection-release-2.6/configs/vitdet/cascade_rcnn_vit_large_hrfpn_cae_1x_coco.yml b/PaddleDetection-release-2.6/configs/vitdet/cascade_rcnn_vit_large_hrfpn_cae_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..3bd443ef3e3fc0d2dcb623b60e45c51c629c6d2a
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/vitdet/cascade_rcnn_vit_large_hrfpn_cae_1x_coco.yml
@@ -0,0 +1,29 @@
+_BASE_: [
+ './cascade_rcnn_vit_base_hrfpn_cae_1x_coco.yml'
+]
+
+weights: output/cascade_rcnn_vit_large_hrfpn_cae_1x_coco/model_final
+
+
+depth: &depth 24
+dim: &dim 1024
+use_fused_allreduce_gradients: &use_checkpoint True
+
+VisionTransformer:
+ img_size: [800, 1344]
+ embed_dim: *dim
+ depth: *depth
+ num_heads: 16
+ drop_path_rate: 0.25
+ out_indices: [7, 11, 15, 23]
+ use_checkpoint: *use_checkpoint
+ pretrained: https://bj.bcebos.com/v1/paddledet/models/pretrained/vit_large_cae_pretrained.pdparams
+
+HRFPN:
+ in_channels: [*dim, *dim, *dim, *dim]
+
+OptimizerBuilder:
+ optimizer:
+ layer_decay: 0.9
+ weight_decay: 0.02
+ num_layers: *depth
diff --git a/PaddleDetection-release-2.6/configs/vitdet/faster_rcnn_vit_base_fpn_cae_1x_coco.yml b/PaddleDetection-release-2.6/configs/vitdet/faster_rcnn_vit_base_fpn_cae_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..8b693f687fd370231e2bdee47a8e7c719c4d63f2
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/vitdet/faster_rcnn_vit_base_fpn_cae_1x_coco.yml
@@ -0,0 +1,130 @@
+
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ './_base_/faster_rcnn_reader.yml',
+ './_base_/optimizer_base_1x.yml'
+]
+
+weights: output/faster_rcnn_vit_base_fpn_cae_1x_coco/model_final
+
+
+# runtime
+log_iter: 100
+snapshot_epoch: 1
+find_unused_parameters: True
+
+use_gpu: true
+norm_type: sync_bn
+
+OptimizerBuilder:
+ optimizer:
+ weight_decay: 0.05
+
+# reader
+worker_num: 2
+TrainReader:
+ batch_size: 1
+
+
+# model
+architecture: FasterRCNN
+
+FasterRCNN:
+ backbone: VisionTransformer
+ neck: FPN
+ rpn_head: RPNHead
+ bbox_head: BBoxHead
+ bbox_post_process: BBoxPostProcess
+
+VisionTransformer:
+ patch_size: 16
+ embed_dim: 768
+ depth: 12
+ num_heads: 12
+ mlp_ratio: 4
+ qkv_bias: True
+ drop_rate: 0.0
+ drop_path_rate: 0.2
+ init_values: 0.1
+ final_norm: False
+ use_rel_pos_bias: False
+ use_sincos_pos_emb: True
+ epsilon: 0.000001 # 1e-6
+ out_indices: [3, 5, 7, 11]
+ with_fpn: True
+ pretrained: https://bj.bcebos.com/v1/paddledet/models/pretrained/vit_base_cae_pretrained.pdparams
+
+
+FPN:
+ out_channel: 256
+
+RPNHead:
+ anchor_generator:
+ aspect_ratios: [0.5, 1.0, 2.0]
+ anchor_sizes: [[32], [64], [128], [256], [512]]
+ strides: [4, 8, 16, 32, 64]
+ rpn_target_assign:
+ batch_size_per_im: 256
+ fg_fraction: 0.5
+ negative_overlap: 0.3
+ positive_overlap: 0.7
+ use_random: True
+ train_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 2000
+ post_nms_top_n: 1000
+ topk_after_collect: True
+ test_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 1000
+ post_nms_top_n: 1000
+ loss_rpn_bbox: SmoothL1Loss
+
+
+SmoothL1Loss:
+ beta: 0.1111111111111111
+
+
+BBoxHead:
+ # head: TwoFCHead
+ head: XConvNormHead
+ roi_extractor:
+ resolution: 7
+ sampling_ratio: 0
+ aligned: True
+ bbox_assigner: BBoxAssigner
+ loss_normalize_pos: True
+ bbox_loss: GIoULoss
+
+
+GIoULoss:
+ loss_weight: 10.
+ reduction: 'none'
+ eps: 0.000001 # 1e-6
+
+
+BBoxAssigner:
+ batch_size_per_im: 512
+ bg_thresh: 0.5
+ fg_thresh: 0.5
+ fg_fraction: 0.25
+ use_random: True
+
+# TwoFCHead:
+# out_channel: 1024
+
+XConvNormHead:
+ num_convs: 4
+ norm_type: bn
+
+
+BBoxPostProcess:
+ decode: RCNNBox
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.05
+ nms_threshold: 0.5
diff --git a/PaddleDetection-release-2.6/configs/vitdet/mask_rcnn_vit_base_hrfpn_cae_1x_coco.yml b/PaddleDetection-release-2.6/configs/vitdet/mask_rcnn_vit_base_hrfpn_cae_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..c11ce890d64a8709ba31df0a93d24494f7e3aa65
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/vitdet/mask_rcnn_vit_base_hrfpn_cae_1x_coco.yml
@@ -0,0 +1,135 @@
+_BASE_: [
+ '../datasets/coco_instance.yml',
+ '../runtime.yml',
+ './_base_/mask_rcnn_reader.yml',
+ './_base_/optimizer_base_1x.yml'
+]
+
+weights: output/mask_rcnn_vit_base_hrfpn_cae_1x_coco/model_final
+
+
+# runtime
+log_iter: 100
+snapshot_epoch: 1
+norm_type: sync_bn
+use_fused_allreduce_gradients: &use_checkpoint False
+
+
+architecture: MaskRCNN
+MaskRCNN:
+ backbone: VisionTransformer
+ neck: HRFPN
+ rpn_head: RPNHead
+ bbox_head: BBoxHead
+ mask_head: MaskHead
+ # post process
+ bbox_post_process: BBoxPostProcess
+ mask_post_process: MaskPostProcess
+
+VisionTransformer:
+ patch_size: 16
+ embed_dim: 768
+ depth: 12
+ num_heads: 12
+ mlp_ratio: 4
+ qkv_bias: True
+ drop_rate: 0.0
+ drop_path_rate: 0.2
+ init_values: 0.1
+ final_norm: False
+ use_rel_pos_bias: False
+ use_sincos_pos_emb: True
+ epsilon: 0.000001 # 1e-6
+ out_indices: [3, 5, 7, 11]
+ with_fpn: True
+ use_checkpoint: *use_checkpoint
+ pretrained: https://bj.bcebos.com/v1/paddledet/models/pretrained/vit_base_cae_pretrained.pdparams
+
+HRFPN:
+ out_channel: 256
+ use_bias: True
+
+RPNHead:
+ anchor_generator:
+ aspect_ratios: [0.5, 1.0, 2.0]
+ anchor_sizes: [[32], [64], [128], [256], [512]]
+ strides: [4, 8, 16, 32, 64]
+ rpn_target_assign:
+ batch_size_per_im: 256
+ fg_fraction: 0.5
+ negative_overlap: 0.3
+ positive_overlap: 0.7
+ use_random: True
+ train_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 2000
+ post_nms_top_n: 1000
+ topk_after_collect: True
+ test_proposal:
+ min_size: 0.0
+ nms_thresh: 0.7
+ pre_nms_top_n: 1000
+ post_nms_top_n: 1000
+ loss_rpn_bbox: SmoothL1Loss
+
+SmoothL1Loss:
+ beta: 0.1111111111111111
+
+
+BBoxHead:
+ head: XConvNormHead
+ roi_extractor:
+ resolution: 7
+ sampling_ratio: 0
+ aligned: True
+ bbox_assigner: BBoxAssigner
+ loss_normalize_pos: True
+ bbox_loss: GIoULoss
+
+BBoxAssigner:
+ batch_size_per_im: 512
+ bg_thresh: 0.5
+ fg_thresh: 0.5
+ fg_fraction: 0.25
+ use_random: True
+
+
+XConvNormHead:
+ num_convs: 4
+ norm_type: bn
+
+GIoULoss:
+ loss_weight: 10.
+ reduction: 'none'
+ eps: 0.000001
+
+
+
+BBoxPostProcess:
+ decode: RCNNBox
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.05
+ nms_threshold: 0.5
+
+MaskHead:
+ head: MaskFeat
+ roi_extractor:
+ resolution: 14
+ sampling_ratio: 0
+ aligned: True
+ mask_assigner: MaskAssigner
+ share_bbox_feat: False
+
+MaskFeat:
+ num_convs: 4
+ out_channel: 256
+ norm_type: ~
+
+MaskAssigner:
+ mask_resolution: 28
+
+MaskPostProcess:
+ binary_thresh: 0.5
diff --git a/PaddleDetection-release-2.6/configs/vitdet/mask_rcnn_vit_large_hrfpn_cae_1x_coco.yml b/PaddleDetection-release-2.6/configs/vitdet/mask_rcnn_vit_large_hrfpn_cae_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..5884e91e9d146e6ec031e23b6840026e3c39b073
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/vitdet/mask_rcnn_vit_large_hrfpn_cae_1x_coco.yml
@@ -0,0 +1,29 @@
+_BASE_: [
+ './mask_rcnn_vit_base_hrfpn_cae_1x_coco.yml'
+]
+
+weights: output/mask_rcnn_vit_large_hrfpn_cae_1x_coco/model_final
+
+
+depth: &depth 24
+dim: &dim 1024
+use_fused_allreduce_gradients: &use_checkpoint True
+
+VisionTransformer:
+ img_size: [800, 1344]
+ embed_dim: *dim
+ depth: *depth
+ num_heads: 16
+ drop_path_rate: 0.25
+ out_indices: [7, 11, 15, 23]
+ use_checkpoint: *use_checkpoint
+ pretrained: https://bj.bcebos.com/v1/paddledet/models/pretrained/vit_large_cae_pretrained.pdparams
+
+HRFPN:
+ in_channels: [*dim, *dim, *dim, *dim]
+
+OptimizerBuilder:
+ optimizer:
+ layer_decay: 0.9
+ weight_decay: 0.02
+ num_layers: *depth
diff --git a/PaddleDetection-release-2.6/configs/vitdet/ppyoloe_vit_base_csppan_cae_36e_coco.yml b/PaddleDetection-release-2.6/configs/vitdet/ppyoloe_vit_base_csppan_cae_36e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..556f4b49d7a2a1c7c6928b4e917881b175b01384
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/vitdet/ppyoloe_vit_base_csppan_cae_36e_coco.yml
@@ -0,0 +1,78 @@
+
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ './_base_/ppyoloe_reader.yml',
+ './_base_/optimizer_base_36e.yml'
+]
+
+weights: output/ppyoloe_vit_base_csppan_cae_36e_coco/model_final
+
+
+snapshot_epoch: 2
+log_iter: 100
+
+
+use_ema: true
+ema_decay: 0.9999
+ema_skip_names: ['yolo_head.proj_conv.weight', 'backbone.pos_embed']
+custom_black_list: ['reduce_mean']
+use_fused_allreduce_gradients: &use_checkpoint False
+
+
+architecture: YOLOv3
+norm_type: sync_bn
+
+YOLOv3:
+ backbone: VisionTransformer
+ neck: YOLOCSPPAN
+ yolo_head: PPYOLOEHead
+ post_process: ~
+
+VisionTransformer:
+ patch_size: 16
+ embed_dim: 768
+ depth: 12
+ num_heads: 12
+ mlp_ratio: 4
+ qkv_bias: True
+ drop_rate: 0.0
+ drop_path_rate: 0.2
+ init_values: 0.1
+ final_norm: False
+ use_rel_pos_bias: False
+ use_sincos_pos_emb: True
+ epsilon: 0.000001 # 1e-6
+ out_indices: [11, ]
+ with_fpn: True
+ num_fpn_levels: 3
+ out_with_norm: False
+ use_checkpoint: *use_checkpoint
+ pretrained: https://bj.bcebos.com/v1/paddledet/models/pretrained/vit_base_cae_pretrained.pdparams
+
+YOLOCSPPAN:
+ in_channels: [768, 768, 768]
+ act: 'silu'
+
+PPYOLOEHead:
+ fpn_strides: [8, 16, 32]
+ in_channels: [768, 768, 768]
+ static_assigner_epoch: -1
+ grid_cell_scale: 5.0
+ grid_cell_offset: 0.5
+ use_varifocal_loss: True
+ loss_weight: {class: 1.0, iou: 2.5, dfl: 0.5}
+ static_assigner:
+ name: ATSSAssigner
+ topk: 9
+ assigner:
+ name: TaskAlignedAssigner
+ topk: 13
+ alpha: 1.0
+ beta: 6.0
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 300
+ score_threshold: 0.01
+ nms_threshold: 0.7
diff --git a/PaddleDetection-release-2.6/configs/yolof/README.md b/PaddleDetection-release-2.6/configs/yolof/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..84c86bf3b7b31dc16bc3574d3233f3ecd1bf6186
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolof/README.md
@@ -0,0 +1,22 @@
+# YOLOF (You Only Look One-level Feature)
+
+## Model Zoo
+
+| Model | Input Size | Images/GPU | Epochs | Inference Time (ms) | mAP<sup>val</sup><br>0.5:0.95 | Params(M) | FLOPs(G) | Download | Config |
+| :--------------------- | :------- | :-------: | :----: | :----------: | :---------------------: | :----------------: |:---------: | :------: |:---------------: |
+| YOLOF-R_50_C5 (paper) | 800x1333 | 4 | 12 | - | 37.7 | - | - | - | - |
+| YOLOF-R_50_C5 | 800x1333 | 4 | 12 | - | 38.1 | 44.16 | 241.64 | [download](https://paddledet.bj.bcebos.com/models/yolof_r50_c5_1x_coco.pdparams) | [config](./yolof_r50_c5_1x_coco.yml) |
+
+**Notes:**
+ - YOLOF is trained with mixed precision on 8 GPUs by default, with a total batch_size of 32.
+
+
+## Citations
+```
+@inproceedings{chen2021you,
+ title={You Only Look One-level Feature},
+ author={Chen, Qiang and Wang, Yingming and Yang, Tong and Zhang, Xiangyu and Cheng, Jian and Sun, Jian},
+ booktitle={IEEE Conference on Computer Vision and Pattern Recognition},
+ year={2021}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/yolof/_base_/optimizer_1x.yml b/PaddleDetection-release-2.6/configs/yolof/_base_/optimizer_1x.yml
new file mode 100644
index 0000000000000000000000000000000000000000..6951a6c34bbde8bc484cc86d0a440cb2ed909fec
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolof/_base_/optimizer_1x.yml
@@ -0,0 +1,19 @@
+epoch: 12
+
+LearningRate:
+ base_lr: 0.06
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones: [8, 11]
+ - !LinearWarmup
+ start_factor: 0.00066
+ steps: 1500
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0001
+ type: L2
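The two schedulers compose as a step decay modulated by a linear warmup factor over the first 1500 iterations. A small sketch under that reading (hypothetical helpers, not PaddleDetection's API):

```python
def piecewise_lr(epoch, base_lr=0.06, milestones=(8, 11), gamma=0.1):
    """Step decay: multiply the lr by gamma at each milestone epoch."""
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr

def warmup_factor(step, steps=1500, start_factor=0.00066):
    """Linear warmup factor applied on top of the scheduled lr."""
    if step >= steps:
        return 1.0
    return start_factor + (1 - start_factor) * (step / steps)

assert piecewise_lr(7) == 0.06               # before the first milestone
assert abs(piecewise_lr(8) - 0.006) < 1e-12  # after epoch 8: one gamma step
assert warmup_factor(0) == 0.00066
assert warmup_factor(1500) == 1.0
```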
diff --git a/PaddleDetection-release-2.6/configs/yolof/_base_/yolof_r50_c5.yml b/PaddleDetection-release-2.6/configs/yolof/_base_/yolof_r50_c5.yml
new file mode 100644
index 0000000000000000000000000000000000000000..53b2eb972ba4dd8d058c5ae571fd3eafa7fb4b99
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolof/_base_/yolof_r50_c5.yml
@@ -0,0 +1,54 @@
+architecture: YOLOF
+find_unused_parameters: True
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_cos_pretrained.pdparams
+
+YOLOF:
+ backbone: ResNet
+ neck: DilatedEncoder
+ head: YOLOFHead
+
+ResNet:
+ depth: 50
+  variant: b # the paper uses variant a (ResNet-va)
+ freeze_at: 0 # res2
+ return_idx: [3] # only res5 feature
+ lr_mult_list: [0.3333, 0.3333, 0.3333, 0.3333]
+
+DilatedEncoder:
+ in_channels: [2048]
+ out_channels: [512]
+ block_mid_channels: 128
+ num_residual_blocks: 4
+ block_dilations: [2, 4, 6, 8]
+
+YOLOFHead:
+ conv_feat:
+ name: YOLOFFeat
+ feat_in: 512
+ feat_out: 512
+ num_cls_convs: 2
+ num_reg_convs: 4
+ norm_type: bn
+ anchor_generator:
+ name: AnchorGenerator
+ anchor_sizes: [[32, 64, 128, 256, 512]]
+ aspect_ratios: [1.0]
+ strides: [32]
+ bbox_assigner:
+ name: UniformAssigner
+ pos_ignore_thr: 0.15
+ neg_ignore_thr: 0.7
+ match_times: 4
+ loss_class:
+ name: FocalLoss
+ gamma: 2.0
+ alpha: 0.25
+ loss_bbox:
+ name: GIoULoss
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 1000
+ keep_top_k: 100
+ score_threshold: 0.05
+ nms_threshold: 0.6
diff --git a/PaddleDetection-release-2.6/configs/yolof/_base_/yolof_reader.yml b/PaddleDetection-release-2.6/configs/yolof/_base_/yolof_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a0b19c85aec11b1c3917a4ab478af2e78e11f1d9
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolof/_base_/yolof_reader.yml
@@ -0,0 +1,38 @@
+worker_num: 4
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - RandomShift: {prob: 0.5, max_shift: 32}
+ - Resize: {target_size: [800, 1333], keep_ratio: True, interp: 1}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - RandomFlip: {}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 4
+ shuffle: True
+ drop_last: True
+ collate_batch: False
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [800, 1333], keep_ratio: True, interp: 1}
+ - NormalizeImage: {is_scale: True, mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+
+
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [800, 1333], keep_ratio: True, interp: 1}
+ - NormalizeImage: {is_scale: True, mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225]}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 1
+ fuse_normalize: True
diff --git a/PaddleDetection-release-2.6/configs/yolof/yolof_r50_c5_1x_coco.yml b/PaddleDetection-release-2.6/configs/yolof/yolof_r50_c5_1x_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..2bc476f505acf8e4e715fa1f1c3d8f1edf896d70
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolof/yolof_r50_c5_1x_coco.yml
@@ -0,0 +1,10 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ './_base_/optimizer_1x.yml',
+ './_base_/yolof_r50_c5.yml',
+ './_base_/yolof_reader.yml'
+]
+log_iter: 50
+snapshot_epoch: 1
+weights: output/yolof_r50_c5_1x_coco/model_final
diff --git a/PaddleDetection-release-2.6/configs/yolov3/README.md b/PaddleDetection-release-2.6/configs/yolov3/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..16327dd3e3b7e482a42ad1776671e5ead870e062
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolov3/README.md
@@ -0,0 +1,85 @@
+# YOLOv3
+
+## Model Zoo
+
+### YOLOv3 on COCO
+
+| Backbone | Input Size | Images/GPU | LR Schedule | Inference Time (fps) | mAP<sup>val</sup><br>0.5:0.95 | Download | Config |
+| :------------------- | :------- | :-----: | :-----: | :------------: | :-----: | :-----------------------------------------------------: | :-----: |
+| DarkNet53(paper) | 608 | 8 | 270e | - | 33.0 | - | - |
+| DarkNet53(paper) | 416 | 8 | 270e | - | 31.0 | - | - |
+| DarkNet53(paper) | 320 | 8 | 270e | - | 28.2 | - | - |
+| DarkNet53 | 608 | 8 | 270e | - | **39.1** | [download](https://paddledet.bj.bcebos.com/models/yolov3_darknet53_270e_coco.pdparams) | [config](./yolov3_darknet53_270e_coco.yml) |
+| DarkNet53 | 416 | 8 | 270e | - | 37.7 | [download](https://paddledet.bj.bcebos.com/models/yolov3_darknet53_270e_coco.pdparams) | [config](./yolov3_darknet53_270e_coco.yml) |
+| DarkNet53 | 320 | 8 | 270e | - | 34.8 | [download](https://paddledet.bj.bcebos.com/models/yolov3_darknet53_270e_coco.pdparams) | [config](./yolov3_darknet53_270e_coco.yml) |
+| ResNet50_vd-DCN | 608 | 8 | 270e | - | **40.6** | [download](https://paddledet.bj.bcebos.com/models/yolov3_r50vd_dcn_270e_coco.pdparams) | [config](./yolov3_r50vd_dcn_270e_coco.yml) |
+| ResNet50_vd-DCN | 416 | 8 | 270e | - | 38.2 | [download](https://paddledet.bj.bcebos.com/models/yolov3_r50vd_dcn_270e_coco.pdparams) | [config](./yolov3_r50vd_dcn_270e_coco.yml) |
+| ResNet50_vd-DCN | 320 | 8 | 270e | - | 35.1 | [download](https://paddledet.bj.bcebos.com/models/yolov3_r50vd_dcn_270e_coco.pdparams) | [config](./yolov3_r50vd_dcn_270e_coco.yml) |
+| ResNet34 | 608 | 8 | 270e | - | 36.2 | [download](https://paddledet.bj.bcebos.com/models/yolov3_r34_270e_coco.pdparams) | [config](./yolov3_r34_270e_coco.yml) |
+| ResNet34 | 416 | 8 | 270e | - | 34.3 | [download](https://paddledet.bj.bcebos.com/models/yolov3_r34_270e_coco.pdparams) | [config](./yolov3_r34_270e_coco.yml) |
+| ResNet34 | 320 | 8 | 270e | - | 31.2 | [download](https://paddledet.bj.bcebos.com/models/yolov3_r34_270e_coco.pdparams) | [config](./yolov3_r34_270e_coco.yml) |
+| MobileNet-V1 | 608 | 8 | 270e | - | 29.4 | [download](https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v1_270e_coco.pdparams) | [config](./yolov3_mobilenet_v1_270e_coco.yml) |
+| MobileNet-V1 | 416 | 8 | 270e | - | 29.3 | [download](https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v1_270e_coco.pdparams) | [config](./yolov3_mobilenet_v1_270e_coco.yml) |
+| MobileNet-V1 | 320 | 8 | 270e | - | 27.2 | [download](https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v1_270e_coco.pdparams) | [config](./yolov3_mobilenet_v1_270e_coco.yml) |
+| MobileNet-V3 | 608 | 8 | 270e | - | 31.4 | [download](https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v3_large_270e_coco.pdparams) | [config](./yolov3_mobilenet_v3_large_270e_coco.yml) |
+| MobileNet-V3 | 416 | 8 | 270e | - | 29.6 | [download](https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v3_large_270e_coco.pdparams) | [config](./yolov3_mobilenet_v3_large_270e_coco.yml) |
+| MobileNet-V3 | 320 | 8 | 270e | - | 27.1 | [download](https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v3_large_270e_coco.pdparams) | [config](./yolov3_mobilenet_v3_large_270e_coco.yml) |
+| MobileNet-V1-SSLD | 608 | 8 | 270e | - | 31.0 | [download](https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v1_ssld_270e_coco.pdparams) | [config](./yolov3_mobilenet_v1_ssld_270e_coco.yml) |
+| MobileNet-V1-SSLD | 416 | 8 | 270e | - | 30.6 | [download](https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v1_ssld_270e_coco.pdparams) | [config](./yolov3_mobilenet_v1_ssld_270e_coco.yml) |
+| MobileNet-V1-SSLD | 320 | 8 | 270e | - | 28.4 | [download](https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v1_ssld_270e_coco.pdparams) | [config](./yolov3_mobilenet_v1_ssld_270e_coco.yml) |
+
+### YOLOv3 on Pascal VOC
+
+| Backbone | Input size | Images/GPU | Lr schd | Inference time (FPS) | mAP(0.50,11point) | Download | Config |
+| :----------- | :--: | :-----: | :-----: |:------------: |:----: | :-------: | :----: |
+| DarkNet53 | 608 | 8 | 270e | - | **85.4** (56.1 mAP 0.5:0.95) | [download](https://paddledet.bj.bcebos.com/models/yolov3_darknet53_270e_voc.pdparams) | [config](./yolov3_darknet53_270e_voc.yml) |
+| DarkNet53 | 416 | 8 | 270e | - | 85.2 (57.3 mAP 0.5:0.95) | [download](https://paddledet.bj.bcebos.com/models/yolov3_darknet53_270e_voc.pdparams) | [config](./yolov3_darknet53_270e_voc.yml) |
+| DarkNet53 | 320 | 8 | 270e | - | 84.3 (55.2 mAP 0.5:0.95) | [download](https://paddledet.bj.bcebos.com/models/yolov3_darknet53_270e_voc.pdparams) | [config](./yolov3_darknet53_270e_voc.yml) |
+| MobileNet-V1 | 608 | 8 | 270e | - | 75.2 | [download](https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v1_270e_voc.pdparams) | [config](./yolov3_mobilenet_v1_270e_voc.yml) |
+| MobileNet-V1 | 416 | 8 | 270e | - | 76.2 | [download](https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v1_270e_voc.pdparams) | [config](./yolov3_mobilenet_v1_270e_voc.yml) |
+| MobileNet-V1 | 320 | 8 | 270e | - | 74.3 | [download](https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v1_270e_voc.pdparams) | [config](./yolov3_mobilenet_v1_270e_voc.yml) |
+| MobileNet-V3 | 608 | 8 | 270e | - | 79.6 | [download](https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v3_large_270e_voc.pdparams) | [config](./yolov3_mobilenet_v3_large_270e_voc.yml) |
+| MobileNet-V3 | 416 | 8 | 270e | - | 78.6 | [download](https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v3_large_270e_voc.pdparams) | [config](./yolov3_mobilenet_v3_large_270e_voc.yml) |
+| MobileNet-V3 | 320 | 8 | 270e | - | 76.4 | [download](https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v3_large_270e_voc.pdparams) | [config](./yolov3_mobilenet_v3_large_270e_voc.yml) |
+| MobileNet-V1-SSLD | 608 | 8 | 270e | - | 78.3 | [download](https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v1_ssld_270e_voc.pdparams) | [config](./yolov3_mobilenet_v1_ssld_270e_voc.yml) |
+| MobileNet-V1-SSLD | 416 | 8 | 270e | - | 79.6 | [download](https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v1_ssld_270e_voc.pdparams) | [config](./yolov3_mobilenet_v1_ssld_270e_voc.yml) |
+| MobileNet-V1-SSLD | 320 | 8 | 270e | - | 77.3 | [download](https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v1_ssld_270e_voc.pdparams) | [config](./yolov3_mobilenet_v1_ssld_270e_voc.yml) |
+| MobileNet-V3-SSLD | 608 | 8 | 270e | - | 80.4 | [download](https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v3_large_ssld_270e_voc.pdparams) | [config](./yolov3_mobilenet_v3_large_ssld_270e_voc.yml) |
+| MobileNet-V3-SSLD | 416 | 8 | 270e | - | 79.2 | [download](https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v3_large_ssld_270e_voc.pdparams) | [config](./yolov3_mobilenet_v3_large_ssld_270e_voc.yml) |
+| MobileNet-V3-SSLD | 320 | 8 | 270e | - | 77.3 | [download](https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v3_large_ssld_270e_voc.pdparams) | [config](./yolov3_mobilenet_v3_large_ssld_270e_voc.yml) |
+
+
+**Notes:**
+  - YOLOv3 models are trained on 8 GPUs by default with a total batch_size of 64; the default evaluation scale is `608*608`;
+  - For the `416*416` and `320*320` scales, just change the `Resize` parameter of `EvalReader` to the corresponding value; the model does not need to be retrained, e.g.:
+  ```
+  EvalReader:
+    sample_transforms:
+      - Decode: {}
+      - Resize: {target_size: [416, 416], keep_ratio: False, interp: 2} # or [320, 320]
+      - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+      - Permute: {}
+    batch_size: 1
+  ```
+  - The VOC dataset can be downloaded from this [link](https://bj.bcebos.com/v1/paddledet/data/voc.zip). The default evaluation metric is mAP(0.50,11point). To report the COCO-style mAP 0.5:0.95 instead, add the following lines as in [yolov3_darknet53_270e_voc](./yolov3_darknet53_270e_voc.yml) and run eval again:
+  ```
+  metric: COCO
+  EvalDataset:
+    !COCODataSet
+      image_dir: VOCdevkit/VOC2007/JPEGImages
+      anno_path: voc_test.json
+      dataset_dir: dataset/voc
+  ```
+
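For reference, the VOC metric above, mAP(0.50,11point), averages the best attainable precision at 11 evenly spaced recall thresholds. A minimal sketch of that interpolation for a single class (a hypothetical helper, not part of PaddleDetection):

```python
import numpy as np

def voc_ap_11point(recall, precision):
    """11-point interpolated AP (the VOC mAP(0.50,11point) convention):
    average, over recall thresholds 0.0, 0.1, ..., 1.0, of the best
    precision achieved at recall >= threshold."""
    recall = np.asarray(recall, dtype=float)
    precision = np.asarray(precision, dtype=float)
    ap = 0.0
    for t in np.linspace(0.0, 1.0, 11):
        mask = recall >= t
        ap += (precision[mask].max() if mask.any() else 0.0) / 11.0
    return ap

# Toy PR curve: perfect precision up to recall 0.5, nothing beyond.
print(voc_ap_11point([0.1, 0.3, 0.5], [1.0, 1.0, 1.0]))  # ~0.545 (6/11)
```

The COCO metric differs by averaging over 101 recall points and over IoU thresholds 0.5:0.05:0.95, which is why the same model reports very different numbers under the two metrics.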
+
+## Citations
+```
+@misc{redmon2018yolov3,
+ title={YOLOv3: An Incremental Improvement},
+ author={Joseph Redmon and Ali Farhadi},
+ year={2018},
+ eprint={1804.02767},
+ archivePrefix={arXiv},
+ primaryClass={cs.CV}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/yolov3/_base_/optimizer_270e.yml b/PaddleDetection-release-2.6/configs/yolov3/_base_/optimizer_270e.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d92f3df60ca6686d7ada476b7b9f01419f0edb81
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolov3/_base_/optimizer_270e.yml
@@ -0,0 +1,21 @@
+epoch: 270
+
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 216
+ - 243
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 4000
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/yolov3/_base_/optimizer_40e.yml b/PaddleDetection-release-2.6/configs/yolov3/_base_/optimizer_40e.yml
new file mode 100644
index 0000000000000000000000000000000000000000..7cf676d7119162d55dc0a2566c0590457344cfd3
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolov3/_base_/optimizer_40e.yml
@@ -0,0 +1,21 @@
+epoch: 40
+
+LearningRate:
+ base_lr: 0.0001
+ schedulers:
+  - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 32
+ - 36
+  - !LinearWarmup
+ start_factor: 0.3333333333333333
+ steps: 100
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.0005
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/yolov3/_base_/yolov3_darknet53.yml b/PaddleDetection-release-2.6/configs/yolov3/_base_/yolov3_darknet53.yml
new file mode 100644
index 0000000000000000000000000000000000000000..1187f6eac9d4e55eeea4d0b6e0c678ad01d724b0
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolov3/_base_/yolov3_darknet53.yml
@@ -0,0 +1,41 @@
+architecture: YOLOv3
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/DarkNet53_pretrained.pdparams
+norm_type: sync_bn
+
+YOLOv3:
+ backbone: DarkNet
+ neck: YOLOv3FPN
+ yolo_head: YOLOv3Head
+ post_process: BBoxPostProcess
+
+DarkNet:
+ depth: 53
+ return_idx: [2, 3, 4]
+
+# use default config
+# YOLOv3FPN:
+
+YOLOv3Head:
+ anchors: [[10, 13], [16, 30], [33, 23],
+ [30, 61], [62, 45], [59, 119],
+ [116, 90], [156, 198], [373, 326]]
+ anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
+ loss: YOLOv3Loss
+
+YOLOv3Loss:
+ ignore_thresh: 0.7
+ downsample: [32, 16, 8]
+ label_smooth: false
+
+BBoxPostProcess:
+ decode:
+ name: YOLOBox
+ conf_thresh: 0.005
+ downsample_ratio: 32
+ clip_bbox: true
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.45
+ nms_top_k: 1000
diff --git a/PaddleDetection-release-2.6/configs/yolov3/_base_/yolov3_mobilenet_v1.yml b/PaddleDetection-release-2.6/configs/yolov3/_base_/yolov3_mobilenet_v1.yml
new file mode 100644
index 0000000000000000000000000000000000000000..6452b5132b47203379bb3292eb0afc6958d4609f
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolov3/_base_/yolov3_mobilenet_v1.yml
@@ -0,0 +1,43 @@
+architecture: YOLOv3
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/MobileNetV1_pretrained.pdparams
+norm_type: sync_bn
+
+YOLOv3:
+ backbone: MobileNet
+ neck: YOLOv3FPN
+ yolo_head: YOLOv3Head
+ post_process: BBoxPostProcess
+
+MobileNet:
+ scale: 1
+ feature_maps: [4, 6, 13]
+ with_extra_blocks: false
+ extra_block_filters: []
+
+# use default config
+# YOLOv3FPN:
+
+YOLOv3Head:
+ anchors: [[10, 13], [16, 30], [33, 23],
+ [30, 61], [62, 45], [59, 119],
+ [116, 90], [156, 198], [373, 326]]
+ anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
+ loss: YOLOv3Loss
+
+YOLOv3Loss:
+ ignore_thresh: 0.7
+ downsample: [32, 16, 8]
+ label_smooth: false
+
+BBoxPostProcess:
+ decode:
+ name: YOLOBox
+ conf_thresh: 0.005
+ downsample_ratio: 32
+ clip_bbox: true
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.45
+ nms_top_k: 1000
diff --git a/PaddleDetection-release-2.6/configs/yolov3/_base_/yolov3_mobilenet_v3_large.yml b/PaddleDetection-release-2.6/configs/yolov3/_base_/yolov3_mobilenet_v3_large.yml
new file mode 100644
index 0000000000000000000000000000000000000000..94b5dea3ea6b2039ce5aaf8ccab44d651727bd21
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolov3/_base_/yolov3_mobilenet_v3_large.yml
@@ -0,0 +1,44 @@
+architecture: YOLOv3
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/MobileNetV3_large_x1_0_ssld_pretrained.pdparams
+norm_type: sync_bn
+
+YOLOv3:
+ backbone: MobileNetV3
+ neck: YOLOv3FPN
+ yolo_head: YOLOv3Head
+ post_process: BBoxPostProcess
+
+MobileNetV3:
+ model_name: large
+ scale: 1.
+ with_extra_blocks: false
+ extra_block_filters: []
+ feature_maps: [7, 13, 16]
+
+# use default config
+# YOLOv3FPN:
+
+YOLOv3Head:
+ anchors: [[10, 13], [16, 30], [33, 23],
+ [30, 61], [62, 45], [59, 119],
+ [116, 90], [156, 198], [373, 326]]
+ anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
+ loss: YOLOv3Loss
+
+YOLOv3Loss:
+ ignore_thresh: 0.7
+ downsample: [32, 16, 8]
+ label_smooth: false
+
+BBoxPostProcess:
+ decode:
+ name: YOLOBox
+ conf_thresh: 0.005
+ downsample_ratio: 32
+ clip_bbox: true
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.45
+ nms_top_k: 1000
diff --git a/PaddleDetection-release-2.6/configs/yolov3/_base_/yolov3_mobilenet_v3_small.yml b/PaddleDetection-release-2.6/configs/yolov3/_base_/yolov3_mobilenet_v3_small.yml
new file mode 100644
index 0000000000000000000000000000000000000000..f0f144b916c44da6de45de762310e3470179ac5a
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolov3/_base_/yolov3_mobilenet_v3_small.yml
@@ -0,0 +1,44 @@
+architecture: YOLOv3
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/MobileNetV3_small_x1_0_ssld_pretrained.pdparams
+norm_type: sync_bn
+
+YOLOv3:
+ backbone: MobileNetV3
+ neck: YOLOv3FPN
+ yolo_head: YOLOv3Head
+ post_process: BBoxPostProcess
+
+MobileNetV3:
+ model_name: small
+ scale: 1.
+ with_extra_blocks: false
+ extra_block_filters: []
+ feature_maps: [4, 9, 12]
+
+# use default config
+# YOLOv3FPN:
+
+YOLOv3Head:
+ anchors: [[10, 13], [16, 30], [33, 23],
+ [30, 61], [62, 45], [59, 119],
+ [116, 90], [156, 198], [373, 326]]
+ anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
+ loss: YOLOv3Loss
+
+YOLOv3Loss:
+ ignore_thresh: 0.7
+ downsample: [32, 16, 8]
+ label_smooth: false
+
+BBoxPostProcess:
+ decode:
+ name: YOLOBox
+ conf_thresh: 0.005
+ downsample_ratio: 32
+ clip_bbox: true
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.45
+ nms_top_k: 1000
diff --git a/PaddleDetection-release-2.6/configs/yolov3/_base_/yolov3_r34.yml b/PaddleDetection-release-2.6/configs/yolov3/_base_/yolov3_r34.yml
new file mode 100644
index 0000000000000000000000000000000000000000..c2d1489f07ba65240e5b545662b8c1672750b705
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolov3/_base_/yolov3_r34.yml
@@ -0,0 +1,41 @@
+architecture: YOLOv3
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet34_pretrained.pdparams
+norm_type: sync_bn
+
+YOLOv3:
+ backbone: ResNet
+ neck: YOLOv3FPN
+ yolo_head: YOLOv3Head
+ post_process: BBoxPostProcess
+
+ResNet:
+ depth: 34
+ return_idx: [1, 2, 3]
+ freeze_at: -1
+ freeze_norm: false
+ norm_decay: 0.
+
+YOLOv3Head:
+ anchors: [[10, 13], [16, 30], [33, 23],
+ [30, 61], [62, 45], [59, 119],
+ [116, 90], [156, 198], [373, 326]]
+ anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
+ loss: YOLOv3Loss
+
+YOLOv3Loss:
+ ignore_thresh: 0.7
+ downsample: [32, 16, 8]
+ label_smooth: false
+
+BBoxPostProcess:
+ decode:
+ name: YOLOBox
+ conf_thresh: 0.005
+ downsample_ratio: 32
+ clip_bbox: true
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.45
+ nms_top_k: 1000
diff --git a/PaddleDetection-release-2.6/configs/yolov3/_base_/yolov3_r50vd_dcn.yml b/PaddleDetection-release-2.6/configs/yolov3/_base_/yolov3_r50vd_dcn.yml
new file mode 100644
index 0000000000000000000000000000000000000000..0d01148b476e5bbc4c9ae96cc7a215258e7d7042
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolov3/_base_/yolov3_r50vd_dcn.yml
@@ -0,0 +1,45 @@
+architecture: YOLOv3
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_pretrained.pdparams
+norm_type: sync_bn
+
+YOLOv3:
+ backbone: ResNet
+ neck: YOLOv3FPN
+ yolo_head: YOLOv3Head
+ post_process: BBoxPostProcess
+
+ResNet:
+ depth: 50
+ variant: d
+ return_idx: [1, 2, 3]
+ dcn_v2_stages: [3]
+ freeze_at: -1
+ freeze_norm: false
+ norm_decay: 0.
+
+# YOLOv3FPN:
+
+YOLOv3Head:
+ anchors: [[10, 13], [16, 30], [33, 23],
+ [30, 61], [62, 45], [59, 119],
+ [116, 90], [156, 198], [373, 326]]
+ anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
+ loss: YOLOv3Loss
+
+YOLOv3Loss:
+ ignore_thresh: 0.7
+ downsample: [32, 16, 8]
+ label_smooth: false
+
+BBoxPostProcess:
+ decode:
+ name: YOLOBox
+ conf_thresh: 0.005
+ downsample_ratio: 32
+ clip_bbox: true
+ nms:
+ name: MultiClassNMS
+ keep_top_k: 100
+ score_threshold: 0.01
+ nms_threshold: 0.45
+ nms_top_k: 1000
diff --git a/PaddleDetection-release-2.6/configs/yolov3/_base_/yolov3_reader.yml b/PaddleDetection-release-2.6/configs/yolov3/_base_/yolov3_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..5dab6742b120a68ea76599b911567ee753b68253
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolov3/_base_/yolov3_reader.yml
@@ -0,0 +1,44 @@
+worker_num: 2
+TrainReader:
+ inputs_def:
+ num_max_boxes: 50
+ sample_transforms:
+ - Decode: {}
+ - Mixup: {alpha: 1.5, beta: 1.5}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [320, 352, 384, 416, 448, 480, 512, 544, 576, 608], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeBox: {}
+ - PadBox: {num_max_boxes: 50}
+ - BboxXYXY2XYWH: {}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - Gt2YoloTarget: {anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]], anchors: [[10, 13], [16, 30], [33, 23], [30, 61], [62, 45], [59, 119], [116, 90], [156, 198], [373, 326]], downsample_ratios: [32, 16, 8]}
+ batch_size: 8
+ shuffle: true
+ drop_last: true
+ mixup_epoch: 250
+ use_shared_memory: true
+
+EvalReader:
+ inputs_def:
+ num_max_boxes: 50
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [608, 608], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
+
+TestReader:
+ inputs_def:
+ image_shape: [3, 608, 608]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [608, 608], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/yolov3/yolov3_darknet53_270e_coco.yml b/PaddleDetection-release-2.6/configs/yolov3/yolov3_darknet53_270e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..4fbd401d302ea2d9c55a7b51384e36eff790abe2
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolov3/yolov3_darknet53_270e_coco.yml
@@ -0,0 +1,10 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/optimizer_270e.yml',
+ '_base_/yolov3_darknet53.yml',
+ '_base_/yolov3_reader.yml',
+]
+
+snapshot_epoch: 5
+weights: output/yolov3_darknet53_270e_coco/model_final
diff --git a/PaddleDetection-release-2.6/configs/yolov3/yolov3_darknet53_270e_voc.yml b/PaddleDetection-release-2.6/configs/yolov3/yolov3_darknet53_270e_voc.yml
new file mode 100644
index 0000000000000000000000000000000000000000..92631171af748e5c1b4d5e9219dccb140822123f
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolov3/yolov3_darknet53_270e_voc.yml
@@ -0,0 +1,28 @@
+_BASE_: [
+ '../datasets/voc.yml',
+ '../runtime.yml',
+ '_base_/optimizer_270e.yml',
+ '_base_/yolov3_darknet53.yml',
+ '_base_/yolov3_reader.yml',
+]
+
+snapshot_epoch: 5
+weights: output/yolov3_darknet53_270e_voc/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/yolov3_darknet53_270e_coco.pdparams
+
+
+# set collate_batch to false because ground-truth info is needed
+# on voc dataset and should not collate data in batch when batch size
+# is larger than 1.
+EvalReader:
+ collate_batch: false
+
+
+# ### uncomment the lines below and rerun evaluation to get 56.1 mAP(0.5:0.95) under the COCO metric
+# metric: COCO
+# EvalDataset:
+# !COCODataSet
+# image_dir: VOCdevkit/VOC2007/JPEGImages
+# anno_path: voc_test.json
+# dataset_dir: dataset/voc
+# # wget https://bj.bcebos.com/v1/paddledet/data/voc.zip
diff --git a/PaddleDetection-release-2.6/configs/yolov3/yolov3_darknet53_original_270e_coco.yml b/PaddleDetection-release-2.6/configs/yolov3/yolov3_darknet53_original_270e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..f245d3b1c7309cb7af0b52fc2fa2747fe5d443a0
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolov3/yolov3_darknet53_original_270e_coco.yml
@@ -0,0 +1,40 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/optimizer_270e.yml',
+ '_base_/yolov3_darknet53.yml',
+ '_base_/yolov3_reader.yml',
+]
+
+snapshot_epoch: 5
+weights: output/yolov3_darknet53_270e_coco/model_final
+
+norm_type: bn
+
+YOLOv3Loss:
+ ignore_thresh: 0.5
+ downsample: [32, 16, 8]
+ label_smooth: false
+
+TrainReader:
+ inputs_def:
+ num_max_boxes: 50
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53], ratio: 2.0}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [320, 352, 384, 416, 448, 480, 512, 544, 576, 608], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeBox: {}
+ - PadBox: {num_max_boxes: 50}
+ - BboxXYXY2XYWH: {}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - Gt2YoloTarget: {anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]], anchors: [[10, 13], [16, 30], [33, 23], [30, 61], [62, 45], [59, 119], [116, 90], [156, 198], [373, 326]], downsample_ratios: [32, 16, 8], iou_thresh: 0.5}
+ batch_size: 8
+ shuffle: true
+ drop_last: true
+ mixup_epoch: -1
+ use_shared_memory: true
diff --git a/PaddleDetection-release-2.6/configs/yolov3/yolov3_darknet53_original_320e_coco_1p.yml b/PaddleDetection-release-2.6/configs/yolov3/yolov3_darknet53_original_320e_coco_1p.yml
new file mode 100644
index 0000000000000000000000000000000000000000..fded8bbc3a53b24e7ffad28ba61b80b8960eb0ff
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolov3/yolov3_darknet53_original_320e_coco_1p.yml
@@ -0,0 +1,59 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/yolov3_darknet53.yml',
+ '_base_/yolov3_reader.yml',
+]
+
+snapshot_epoch: 5
+weights: output/yolov3_darknet53_270e_coco/model_final
+
+norm_type: bn
+
+YOLOv3Loss:
+ ignore_thresh: 0.5
+ downsample: [32, 16, 8]
+ label_smooth: false
+
+worker_num: 8
+TrainReader:
+ inputs_def:
+ num_max_boxes: 50
+ sample_transforms:
+ - Decode: {}
+ - RandomDistort: {}
+ - RandomExpand: {fill_value: [123.675, 116.28, 103.53], ratio: 2.0}
+ - RandomCrop: {}
+ - RandomFlip: {}
+ batch_transforms:
+ - BatchRandomResize: {target_size: [416], random_size: True, random_interp: True, keep_ratio: False}
+ - NormalizeBox: {}
+ - PadBox: {num_max_boxes: 50}
+ - BboxXYXY2XYWH: {}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ - Gt2YoloTarget: {anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]], anchors: [[10, 13], [16, 30], [33, 23], [30, 61], [62, 45], [59, 119], [116, 90], [156, 198], [373, 326]], downsample_ratios: [32, 16, 8], iou_thresh: 0.5}
+ batch_size: 32
+ shuffle: true
+ drop_last: true
+ mixup_epoch: -1
+ use_shared_memory: true
+
+epoch: 320
+
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !CosineDecay
+ max_epochs: 320
+ - !LinearWarmup
+ start_factor: 0.
+ epochs: 4
+
+OptimizerBuilder:
+ optimizer:
+ momentum: 0.9
+ type: Momentum
+ regularizer:
+ factor: 0.016
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/yolov3/yolov3_mobilenet_v1_270e_coco.yml b/PaddleDetection-release-2.6/configs/yolov3/yolov3_mobilenet_v1_270e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..b9dd33bdb27a539193ee1c003095f45c58b5e368
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolov3/yolov3_mobilenet_v1_270e_coco.yml
@@ -0,0 +1,10 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/optimizer_270e.yml',
+ '_base_/yolov3_mobilenet_v1.yml',
+ '_base_/yolov3_reader.yml',
+]
+
+snapshot_epoch: 5
+weights: output/yolov3_mobilenet_v1_270e_coco/model_final
diff --git a/PaddleDetection-release-2.6/configs/yolov3/yolov3_mobilenet_v1_270e_voc.yml b/PaddleDetection-release-2.6/configs/yolov3/yolov3_mobilenet_v1_270e_voc.yml
new file mode 100644
index 0000000000000000000000000000000000000000..996757af6be052409b5a71f8d543e5da63cb491d
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolov3/yolov3_mobilenet_v1_270e_voc.yml
@@ -0,0 +1,28 @@
+_BASE_: [
+ '../datasets/voc.yml',
+ '../runtime.yml',
+ '_base_/optimizer_270e.yml',
+ '_base_/yolov3_mobilenet_v1.yml',
+ '_base_/yolov3_reader.yml',
+]
+
+snapshot_epoch: 5
+weights: output/yolov3_mobilenet_v1_270e_voc/model_final
+
+# set collate_batch to false because ground-truth info is needed
+# on voc dataset and should not collate data in batch when batch size
+# is larger than 1.
+EvalReader:
+ collate_batch: false
+
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 216
+ - 243
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 1000
diff --git a/PaddleDetection-release-2.6/configs/yolov3/yolov3_mobilenet_v1_roadsign.yml b/PaddleDetection-release-2.6/configs/yolov3/yolov3_mobilenet_v1_roadsign.yml
new file mode 100644
index 0000000000000000000000000000000000000000..e897276c655ad97aa57f0ca195bba4db9900b5a8
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolov3/yolov3_mobilenet_v1_roadsign.yml
@@ -0,0 +1,13 @@
+_BASE_: [
+ '../datasets/roadsign_voc.yml',
+ '../runtime.yml',
+ '_base_/optimizer_40e.yml',
+ '_base_/yolov3_mobilenet_v1.yml',
+ '_base_/yolov3_reader.yml',
+]
+pretrain_weights: https://paddledet.bj.bcebos.com/models/yolov3_mobilenet_v1_270e_coco.pdparams
+weights: output/yolov3_mobilenet_v1_roadsign/model_final
+
+YOLOv3Loss:
+ ignore_thresh: 0.7
+ label_smooth: true
diff --git a/PaddleDetection-release-2.6/configs/yolov3/yolov3_mobilenet_v1_ssld_270e_coco.yml b/PaddleDetection-release-2.6/configs/yolov3/yolov3_mobilenet_v1_ssld_270e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..10cf8166d9e9ab1a63211f873d73a3e8eee4eb91
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolov3/yolov3_mobilenet_v1_ssld_270e_coco.yml
@@ -0,0 +1,11 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/optimizer_270e.yml',
+ '_base_/yolov3_mobilenet_v1.yml',
+ '_base_/yolov3_reader.yml',
+]
+
+snapshot_epoch: 5
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/MobileNetV1_ssld_pretrained.pdparams
+weights: output/yolov3_mobilenet_v1_ssld_270e_coco/model_final
diff --git a/PaddleDetection-release-2.6/configs/yolov3/yolov3_mobilenet_v1_ssld_270e_voc.yml b/PaddleDetection-release-2.6/configs/yolov3/yolov3_mobilenet_v1_ssld_270e_voc.yml
new file mode 100644
index 0000000000000000000000000000000000000000..0f9c85fd981113ccbd1e1080000ea76a0cd680a6
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolov3/yolov3_mobilenet_v1_ssld_270e_voc.yml
@@ -0,0 +1,29 @@
+_BASE_: [
+ '../datasets/voc.yml',
+ '../runtime.yml',
+ '_base_/optimizer_270e.yml',
+ '_base_/yolov3_mobilenet_v1.yml',
+ '_base_/yolov3_reader.yml',
+]
+
+snapshot_epoch: 5
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/MobileNetV1_ssld_pretrained.pdparams
+weights: output/yolov3_mobilenet_v1_ssld_270e_voc/model_final
+
+# set collate_batch to false because ground-truth info is needed
+# on voc dataset and should not collate data in batch when batch size
+# is larger than 1.
+EvalReader:
+ collate_batch: false
+
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 216
+ - 243
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 1000
diff --git a/PaddleDetection-release-2.6/configs/yolov3/yolov3_mobilenet_v3_large_270e_coco.yml b/PaddleDetection-release-2.6/configs/yolov3/yolov3_mobilenet_v3_large_270e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d1b8af566e99310cbde30dead1d6ad3b6ff428a4
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolov3/yolov3_mobilenet_v3_large_270e_coco.yml
@@ -0,0 +1,10 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/optimizer_270e.yml',
+ '_base_/yolov3_mobilenet_v3_large.yml',
+ '_base_/yolov3_reader.yml',
+]
+
+snapshot_epoch: 5
+weights: output/yolov3_mobilenet_v3_large_270e_coco/model_final
diff --git a/PaddleDetection-release-2.6/configs/yolov3/yolov3_mobilenet_v3_large_270e_voc.yml b/PaddleDetection-release-2.6/configs/yolov3/yolov3_mobilenet_v3_large_270e_voc.yml
new file mode 100644
index 0000000000000000000000000000000000000000..e246c8bae484833e7e63034318f150c7fbba93d6
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolov3/yolov3_mobilenet_v3_large_270e_voc.yml
@@ -0,0 +1,28 @@
+_BASE_: [
+ '../datasets/voc.yml',
+ '../runtime.yml',
+ '_base_/optimizer_270e.yml',
+ '_base_/yolov3_mobilenet_v3_large.yml',
+ '_base_/yolov3_reader.yml',
+]
+
+snapshot_epoch: 5
+weights: output/yolov3_mobilenet_v3_large_270e_voc/model_final
+
+# set collate_batch to false because ground-truth info is needed
+# on voc dataset and should not collate data in batch when batch size
+# is larger than 1.
+EvalReader:
+ collate_batch: false
+
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 216
+ - 243
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 1000
diff --git a/PaddleDetection-release-2.6/configs/yolov3/yolov3_mobilenet_v3_large_ssld_270e_voc.yml b/PaddleDetection-release-2.6/configs/yolov3/yolov3_mobilenet_v3_large_ssld_270e_voc.yml
new file mode 100644
index 0000000000000000000000000000000000000000..13a2583397bfda58ab7e06d9b1621edec47f506e
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolov3/yolov3_mobilenet_v3_large_ssld_270e_voc.yml
@@ -0,0 +1,29 @@
+_BASE_: [
+ '../datasets/voc.yml',
+ '../runtime.yml',
+ '_base_/optimizer_270e.yml',
+ '_base_/yolov3_mobilenet_v3_large.yml',
+ '_base_/yolov3_reader.yml',
+]
+
+snapshot_epoch: 5
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/MobileNetV3_large_x1_0_ssld_pretrained.pdparams
+weights: output/yolov3_mobilenet_v3_large_ssld_270e_voc/model_final
+
+# set collate_batch to false because ground-truth info is needed
+# on voc dataset and should not collate data in batch when batch size
+# is larger than 1.
+EvalReader:
+ collate_batch: false
+
+LearningRate:
+ base_lr: 0.001
+ schedulers:
+ - !PiecewiseDecay
+ gamma: 0.1
+ milestones:
+ - 216
+ - 243
+ - !LinearWarmup
+ start_factor: 0.
+ steps: 1000
diff --git a/PaddleDetection-release-2.6/configs/yolov3/yolov3_r34_270e_coco.yml b/PaddleDetection-release-2.6/configs/yolov3/yolov3_r34_270e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..8653b06161b9145dbd23e00878d5c056986db5ec
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolov3/yolov3_r34_270e_coco.yml
@@ -0,0 +1,10 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/optimizer_270e.yml',
+ '_base_/yolov3_r34.yml',
+ '_base_/yolov3_reader.yml',
+]
+
+snapshot_epoch: 5
+weights: output/yolov3_r34_270e_coco/model_final
diff --git a/PaddleDetection-release-2.6/configs/yolov3/yolov3_r50vd_dcn_270e_coco.yml b/PaddleDetection-release-2.6/configs/yolov3/yolov3_r50vd_dcn_270e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a07cbdde1dcfa2caf50ec93ae7f499a7734335ab
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolov3/yolov3_r50vd_dcn_270e_coco.yml
@@ -0,0 +1,10 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ '_base_/optimizer_270e.yml',
+ '_base_/yolov3_r50vd_dcn.yml',
+ '_base_/yolov3_reader.yml',
+]
+
+snapshot_epoch: 5
+weights: output/yolov3_r50vd_dcn_270e_coco/model_final
diff --git a/PaddleDetection-release-2.6/configs/yolox/README.md b/PaddleDetection-release-2.6/configs/yolox/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..057b87f2f8e27adaf76a2e993f1a7d6e395fefd3
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolox/README.md
@@ -0,0 +1,190 @@
+# YOLOX (YOLOX: Exceeding YOLO Series in 2021)
+
+## Contents
+- [Model Zoo](#model-zoo)
+- [Getting Started](#使用说明)
+- [Speed Benchmark](#速度测试)
+- [Citations](#引用)
+
+
+## Model Zoo
+### YOLOX on COCO
+
+| Model | Input size | Images/GPU | Lr schd | Inference time (ms) | mAP<sup>val</sup> 0.5:0.95 | mAP<sup>val</sup> 0.5 | Params(M) | FLOPs(G) | Download | Config |
+| :------------- | :------- | :-------: | :------: | :------------: | :---------------------: | :----------------: |:---------: | :------: |:---------------: |:-----: |
+| YOLOX-nano | 416 | 8 | 300e | 2.3 | 26.1 | 42.0 | 0.91 | 1.08 | [download](https://paddledet.bj.bcebos.com/models/yolox_nano_300e_coco.pdparams) | [config](./yolox_nano_300e_coco.yml) |
+| YOLOX-tiny | 416 | 8 | 300e | 2.8 | 32.9 | 50.4 | 5.06 | 6.45 | [download](https://paddledet.bj.bcebos.com/models/yolox_tiny_300e_coco.pdparams) | [config](./yolox_tiny_300e_coco.yml) |
+| YOLOX-s | 640 | 8 | 300e | 3.0 | 40.4 | 59.6 | 9.0 | 26.8 | [download](https://paddledet.bj.bcebos.com/models/yolox_s_300e_coco.pdparams) | [config](./yolox_s_300e_coco.yml) |
+| YOLOX-m | 640 | 8 | 300e | 5.8 | 46.9 | 65.7 | 25.3 | 73.8 | [download](https://paddledet.bj.bcebos.com/models/yolox_m_300e_coco.pdparams) | [config](./yolox_m_300e_coco.yml) |
+| YOLOX-l | 640 | 8 | 300e | 9.3 | 50.1 | 68.8 | 54.2 | 155.6 | [download](https://paddledet.bj.bcebos.com/models/yolox_l_300e_coco.pdparams) | [config](./yolox_l_300e_coco.yml) |
+| YOLOX-x | 640 | 8 | 300e | 16.6 | **51.8** | **70.6** | 99.1 | 281.9 | [download](https://paddledet.bj.bcebos.com/models/yolox_x_300e_coco.pdparams) | [config](./yolox_x_300e_coco.yml) |
+
+
+| Model | Input Size | Images/GPU | Epoch | Inference Time (ms) | mAP<sup>val</sup><br>0.5:0.95 | mAP<sup>val</sup><br>0.5 | Params(M) | FLOPs(G) | Download | Config |
+| :------------- | :------- | :-------: | :------: | :------------: | :---------------------: | :----------------: |:---------: | :------: |:---------------: |:-----: |
+| YOLOX-cdn-tiny | 416 | 8 | 300e | 1.9 | 32.4 | 50.2 | 5.03 | 6.33 | [Download](https://paddledet.bj.bcebos.com/models/yolox_cdn_tiny_300e_coco.pdparams) | [Config](./yolox_cdn_tiny_300e_coco.yml) |
+| YOLOX-crn-s | 640 | 8 | 300e | 3.0 | 40.4 | 59.6 | 7.7 | 24.69 | [Download](https://paddledet.bj.bcebos.com/models/yolox_crn_s_300e_coco.pdparams) | [Config](./yolox_crn_s_300e_coco.yml) |
+| YOLOX-ConvNeXt-s| 640 | 8 | 36e | - | **44.6** | **65.3** | 36.2 | 27.52 | [Download](https://paddledet.bj.bcebos.com/models/yolox_convnext_s_36e_coco.pdparams) | [Config](../convnext/yolox_convnext_s_36e_coco.yml) |
+
+
+**Notes:**
+  - All YOLOX models are trained on COCO train2017. YOLOX-cdn uses the same backbone as YOLOv5 releases v6.0 and later, YOLOX-crn uses CSPResNet, the same backbone as PP-YOLOE, and YOLOX-ConvNeXt uses ConvNeXt as the backbone.
+  - YOLOX models are trained with mixed precision on 8 GPUs with a batch size of 8 per GPU by default; the default learning rate of 0.01 assumes a total batch size of 64 across 8 GPUs. If the **number of GPUs** or the **per-GPU batch size** changes, adjust the learning rate with **lr_new = lr_default * (batch_size_new * GPU_number_new) / (batch_size_default * GPU_number_default)**.
+  - To speed up inference while keeping a high mAP, set `nms_top_k` to `1000`, `keep_top_k` to `100`, and `score_threshold` to `0.01` in [yolox_cspdarknet.yml](_base_/yolox_cspdarknet.yml); mAP drops by only about 0.1~0.2%.
+  - For a quick demo, set `score_threshold` to `0.25` and `nms_threshold` to `0.45` in [yolox_cspdarknet.yml](_base_/yolox_cspdarknet.yml), at the cost of a larger mAP drop.
+  - Inference speed is measured on a single V100 with batch size 1, using **CUDA 10.2** and **CUDNN 7.6.5**; TensorRT speed tests use **TensorRT 6.0.1.8**.
+  - See [Speed Benchmark](#speed-benchmark) to reproduce these results; the reported speed is the fastest **TensorRT-FP16** timing and **excludes data preprocessing and model postprocessing (NMS)**.
+  - If you set `--run_benchmark=True`, first install the dependencies with `pip install pynvml psutil GPUtil`.
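+The learning-rate scaling rule in the notes above can be sketched as a small helper (a minimal sketch; the function and argument names are illustrative, not part of PaddleDetection):
+
+```python
+def scale_lr(lr_default=0.01, batch_size_default=8, gpu_number_default=8,
+             batch_size_new=8, gpu_number_new=8):
+    """Linear scaling rule: scale the base lr with the total batch size."""
+    return lr_default * (batch_size_new * gpu_number_new) / (
+        batch_size_default * gpu_number_default)
+
+# e.g. training on 4 GPUs with a per-GPU batch size of 8 halves the default lr
+print(scale_lr(batch_size_new=8, gpu_number_new=4))
+```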
+
+## Getting Started
+
+### 1. Training
+Run the following command to train YOLOX with mixed precision:
+```bash
+python -m paddle.distributed.launch --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/yolox/yolox_s_300e_coco.yml --amp --eval
+```
+**Notes:**
+- `--amp` enables mixed-precision training to avoid running out of GPU memory, and `--eval` evaluates during training.
+
+### 2. Evaluation
+Run the following command to evaluate on the COCO val2017 dataset with a single GPU:
+```bash
+CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/yolox/yolox_s_300e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/yolox_s_300e_coco.pdparams
+```
+
+### 3. Inference
+Run the following command to run inference on a single GPU; use `--infer_img` for a single image or `--infer_dir` for all images in a directory.
+```bash
+# inference on a single image
+CUDA_VISIBLE_DEVICES=0 python tools/infer.py -c configs/yolox/yolox_s_300e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/yolox_s_300e_coco.pdparams --infer_img=demo/000000014439_640x640.jpg
+
+# inference on all images in a directory
+CUDA_VISIBLE_DEVICES=0 python tools/infer.py -c configs/yolox/yolox_s_300e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/yolox_s_300e_coco.pdparams --infer_dir=demo
+```
+
+### 4. Exporting the Model
+To deploy YOLOX on GPU with Paddle Inference or to benchmark it, export the model with `tools/export_model.py`.
+
+When **using Paddle Inference without TensorRT**, export the model with:
+
+```bash
+python tools/export_model.py -c configs/yolox/yolox_s_300e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/yolox_s_300e_coco.pdparams
+```
+
+When **using Paddle Inference with TensorRT**, specify `-o trt=True` when exporting:
+
+```bash
+python tools/export_model.py -c configs/yolox/yolox_s_300e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/yolox_s_300e_coco.pdparams trt=True
+```
+
+To export the YOLOX model to **ONNX format**, see the
+[PaddleDetection ONNX export tutorial](../../deploy/EXPORT_ONNX_MODEL.md) and run the following commands:
+
+```bash
+# export the inference model
+python tools/export_model.py -c configs/yolox/yolox_s_300e_coco.yml --output_dir=output_inference -o weights=https://paddledet.bj.bcebos.com/models/yolox_s_300e_coco.pdparams
+
+# install paddle2onnx
+pip install paddle2onnx
+
+# convert to ONNX format
+paddle2onnx --model_dir output_inference/yolox_s_300e_coco --model_filename model.pdmodel --params_filename model.pdiparams --opset_version 11 --save_file yolox_s_300e_coco.onnx
+```
+
+**Note:** the exported ONNX model currently only supports batch_size=1.
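+Since the exported ONNX model only accepts batch_size=1, each image must first be preprocessed as in `TestReader`: keep-ratio resize to 640x640, padding with value 114, then HWC→CHW. Below is a minimal NumPy sketch of that pipeline (nearest-neighbor resize for brevity, whereas the reader config uses `interp: 1`, i.e. bilinear; the function name is an illustrative assumption, not a PaddleDetection API):
+
+```python
+import numpy as np
+
+def preprocess_640(img, target=640, fill=114):
+    """Keep-ratio resize + pad, mirroring the Resize/Pad/Permute ops in yolox_reader.yml."""
+    h, w = img.shape[:2]
+    scale = min(target / h, target / w)
+    nh, nw = int(round(h * scale)), int(round(w * scale))
+    # nearest-neighbor resize via index arrays (the real reader uses bilinear)
+    ys = np.clip((np.arange(nh) / scale).astype(int), 0, h - 1)
+    xs = np.clip((np.arange(nw) / scale).astype(int), 0, w - 1)
+    canvas = np.full((target, target, 3), fill, dtype=np.float32)
+    canvas[:nh, :nw] = img[ys[:, None], xs]
+    return canvas.transpose(2, 0, 1)[None]  # NCHW with batch_size=1
+```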
+
+
+### 5. Deployment
+YOLOX can be deployed with any of the following:
+ - Paddle Inference [Python](../../deploy/python) & [C++](../../deploy/cpp)
+ - [Paddle-TensorRT](../../deploy/TENSOR_RT.md)
+ - [PaddleServing](https://github.com/PaddlePaddle/Serving)
+ - [PaddleSlim模型量化](../slim)
+
+First export the model:
+
+```bash
+python tools/export_model.py -c configs/yolox/yolox_s_300e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/yolox_s_300e_coco.pdparams trt=True
+```
+
+**Notes:**
+- `trt=True` benchmarks **with Paddle Inference and TensorRT**, which is faster; omit it (default False) to benchmark **with Paddle Inference but without TensorRT**.
+- To deploy with Paddle Inference in TensorRT FP16 mode, download and install the wheel matching your CUDA, CUDNN, and TensorRT versions; see the [Paddle Inference docs](https://www.paddlepaddle.org.cn/inference/master/user_guides/download_lib.html#python).
+
+#### 5.1 Python Deployment
+`deploy/python/infer.py` runs inference and benchmarks with the exported Paddle Inference model. If you set `--run_benchmark=True`, first install the dependencies with `pip install pynvml psutil GPUtil`.
+
+```bash
+# Python deployment: inference on a single image
+python deploy/python/infer.py --model_dir=output_inference/yolox_s_300e_coco --image_file=demo/000000014439_640x640.jpg --device=gpu
+
+# inference on all images in a directory
+python deploy/python/infer.py --model_dir=output_inference/yolox_s_300e_coco --image_dir=demo/ --device=gpu
+```
+
+#### 5.2 C++ Deployment
+`deploy/cpp/build/main` runs C++ inference with the exported Paddle Inference model. First set up the build environment following the [docs](../../deploy/cpp/docs).
+```bash
+# C++ deployment: inference on a single image
+./deploy/cpp/build/main --model_dir=output_inference/yolox_s_300e_coco/ --image_file=demo/000000014439_640x640.jpg --run_mode=paddle --device=GPU --threshold=0.5 --output_dir=cpp_infer_output/yolox_s_300e_coco
+```
+
+
+## Speed Benchmark
+
+For a fair comparison, the speed results in the [Model Zoo](#model-zoo) exclude data preprocessing and model postprocessing (NMS), consistent with the [YOLOv4(AlexyAB)](https://github.com/AlexeyAB/darknet) test method, so the model must be exported with `-o exclude_nms=True`. Benchmarking requires `--run_benchmark=True`; first install the dependencies with `pip install pynvml psutil GPUtil`.
+
+To benchmark **with Paddle Inference but without TensorRT**, run:
+
+```bash
+# export the model
+python tools/export_model.py -c configs/yolox/yolox_s_300e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/yolox_s_300e_coco.pdparams exclude_nms=True
+
+# benchmark with run_benchmark=True
+python deploy/python/infer.py --model_dir=output_inference/yolox_s_300e_coco --image_file=demo/000000014439_640x640.jpg --run_mode=paddle --device=gpu --run_benchmark=True
+```
+
+To benchmark **with Paddle Inference and TensorRT**, run:
+
+```bash
+# export the model with trt=True
+python tools/export_model.py -c configs/yolox/yolox_s_300e_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/yolox_s_300e_coco.pdparams exclude_nms=True trt=True
+
+# benchmark with run_benchmark=True
+python deploy/python/infer.py --model_dir=output_inference/yolox_s_300e_coco --image_file=demo/000000014439_640x640.jpg --device=gpu --run_benchmark=True
+
+# TensorRT-FP32 benchmark
+python deploy/python/infer.py --model_dir=output_inference/yolox_s_300e_coco --image_file=demo/000000014439_640x640.jpg --device=gpu --run_benchmark=True --run_mode=trt_fp32
+
+# TensorRT-FP16 benchmark
+python deploy/python/infer.py --model_dir=output_inference/yolox_s_300e_coco --image_file=demo/000000014439_640x640.jpg --device=gpu --run_benchmark=True --run_mode=trt_fp16
+```
+**Notes:**
+- Exporting with `-o exclude_nms=True` is for benchmarking only; the predictions of a model exported this way are not final detection boxes.
+- The speed results in the [Model Zoo](#model-zoo) are the fastest **TensorRT-FP16** timings and **exclude data preprocessing and model postprocessing (NMS)**.
+
+## FAQ
+
+**How to compute the number of model parameters?**
+
+Insert the following code into [trainer.py](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/ppdet/engine/trainer.py#L154) to compute the parameter count:
+```python
+params = sum([
+ p.numel() for n, p in self.model.named_parameters()
+ if all([x not in n for x in ['_mean', '_variance']])
+]) # exclude BatchNorm running status
+print('Params: ', params)
+```
+
+
+
+## Citations
+```
+@article{yolox2021,
+  title={YOLOX: Exceeding YOLO Series in 2021},
+  author={Ge, Zheng and Liu, Songtao and Wang, Feng and Li, Zeming and Sun, Jian},
+  journal={arXiv preprint arXiv:2107.08430},
+  year={2021}
+}
+```
diff --git a/PaddleDetection-release-2.6/configs/yolox/_base_/optimizer_300e.yml b/PaddleDetection-release-2.6/configs/yolox/_base_/optimizer_300e.yml
new file mode 100644
index 0000000000000000000000000000000000000000..1853ad61ff3e8f222388a005db9e60640700c996
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolox/_base_/optimizer_300e.yml
@@ -0,0 +1,20 @@
+epoch: 300
+
+LearningRate:
+ base_lr: 0.01
+ schedulers:
+ - !CosineDecay
+ max_epochs: 300
+ min_lr_ratio: 0.05
+ last_plateau_epochs: 15
+ - !ExpWarmup
+ epochs: 5
+
+OptimizerBuilder:
+ optimizer:
+ type: Momentum
+ momentum: 0.9
+ use_nesterov: True
+ regularizer:
+ factor: 0.0005
+ type: L2
diff --git a/PaddleDetection-release-2.6/configs/yolox/_base_/yolox_cspdarknet.yml b/PaddleDetection-release-2.6/configs/yolox/_base_/yolox_cspdarknet.yml
new file mode 100644
index 0000000000000000000000000000000000000000..24ef370c437e308c3a7e9da973fe3eea439faf17
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolox/_base_/yolox_cspdarknet.yml
@@ -0,0 +1,42 @@
+architecture: YOLOX
+norm_type: sync_bn
+use_ema: True
+ema_decay: 0.9999
+ema_decay_type: "exponential"
+act: silu
+find_unused_parameters: True
+
+depth_mult: 1.0
+width_mult: 1.0
+
+YOLOX:
+ backbone: CSPDarkNet
+ neck: YOLOCSPPAN
+ head: YOLOXHead
+ size_stride: 32
+ size_range: [15, 25] # multi-scale range [480*480 ~ 800*800]
+
+CSPDarkNet:
+ arch: "X"
+ return_idx: [2, 3, 4]
+ depthwise: False
+
+YOLOCSPPAN:
+ depthwise: False
+
+YOLOXHead:
+ l1_epoch: 285
+ depthwise: False
+ loss_weight: {cls: 1.0, obj: 1.0, iou: 5.0, l1: 1.0}
+ assigner:
+ name: SimOTAAssigner
+ candidate_topk: 10
+ use_vfl: False
+ nms:
+ name: MultiClassNMS
+ nms_top_k: 10000
+ keep_top_k: 1000
+ score_threshold: 0.001
+ nms_threshold: 0.65
+ # For speed while keep high mAP, you can modify 'nms_top_k' to 1000 and 'keep_top_k' to 100, the mAP will drop about 0.1%.
+ # For high speed demo, you can modify 'score_threshold' to 0.25 and 'nms_threshold' to 0.45, but the mAP will drop a lot.
diff --git a/PaddleDetection-release-2.6/configs/yolox/_base_/yolox_reader.yml b/PaddleDetection-release-2.6/configs/yolox/_base_/yolox_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..a33b847b159a515248c8556a24bb29e779f1def8
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolox/_base_/yolox_reader.yml
@@ -0,0 +1,44 @@
+worker_num: 4
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - Mosaic:
+ prob: 1.0
+ input_dim: [640, 640]
+ degrees: [-10, 10]
+ scale: [0.1, 2.0]
+ shear: [-2, 2]
+ translate: [-0.1, 0.1]
+ enable_mixup: True
+ mixup_prob: 1.0
+ mixup_scale: [0.5, 1.5]
+ - AugmentHSV: {is_bgr: False, hgain: 5, sgain: 30, vgain: 30}
+ - PadResize: {target_size: 640}
+ - RandomFlip: {}
+ batch_transforms:
+ - Permute: {}
+ batch_size: 8
+ shuffle: True
+ drop_last: True
+ collate_batch: False
+ mosaic_epoch: 285
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [640, 640], keep_ratio: True, interp: 1}
+ - Pad: {size: [640, 640], fill_value: [114., 114., 114.]}
+ - Permute: {}
+ batch_size: 4
+
+
+TestReader:
+ inputs_def:
+ image_shape: [3, 640, 640]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [640, 640], keep_ratio: True, interp: 1}
+ - Pad: {size: [640, 640], fill_value: [114., 114., 114.]}
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/yolox/yolox_cdn_tiny_300e_coco.yml b/PaddleDetection-release-2.6/configs/yolox/yolox_cdn_tiny_300e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..81c6c075d3620caa98dce2ebcd3b45bd694cef8d
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolox/yolox_cdn_tiny_300e_coco.yml
@@ -0,0 +1,14 @@
+_BASE_: [
+ 'yolox_tiny_300e_coco.yml'
+]
+depth_mult: 0.33
+width_mult: 0.375
+
+log_iter: 100
+snapshot_epoch: 10
+weights: output/yolox_cdn_tiny_300e_coco/model_final
+
+CSPDarkNet:
+ arch: "P5" # using the same backbone of YOLOv5 releases v6.0 and later version
+ return_idx: [2, 3, 4]
+ depthwise: False
diff --git a/PaddleDetection-release-2.6/configs/yolox/yolox_crn_s_300e_coco.yml b/PaddleDetection-release-2.6/configs/yolox/yolox_crn_s_300e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..ae463113e5909f76905b70409ae75794a66430d7
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolox/yolox_crn_s_300e_coco.yml
@@ -0,0 +1,28 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ './_base_/optimizer_300e.yml',
+ './_base_/yolox_cspdarknet.yml',
+ './_base_/yolox_reader.yml'
+]
+depth_mult: 0.33
+width_mult: 0.50
+
+log_iter: 100
+snapshot_epoch: 10
+weights: output/yolox_crn_s_300e_coco/model_final
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/CSPResNetb_s_pretrained.pdparams
+
+
+YOLOX:
+ backbone: CSPResNet
+ neck: YOLOCSPPAN
+ head: YOLOXHead
+ size_stride: 32
+ size_range: [15, 25] # multi-scale range [480*480 ~ 800*800]
+
+CSPResNet:
+ layers: [3, 6, 6, 3]
+ channels: [64, 128, 256, 512, 1024]
+ return_idx: [1, 2, 3]
+ use_large_stem: True
diff --git a/PaddleDetection-release-2.6/configs/yolox/yolox_l_300e_coco.yml b/PaddleDetection-release-2.6/configs/yolox/yolox_l_300e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..79cffd5e544b0d2cf4629c6a9f37e75eda4a5a6d
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolox/yolox_l_300e_coco.yml
@@ -0,0 +1,13 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ './_base_/optimizer_300e.yml',
+ './_base_/yolox_cspdarknet.yml',
+ './_base_/yolox_reader.yml'
+]
+depth_mult: 1.0
+width_mult: 1.0
+
+log_iter: 100
+snapshot_epoch: 10
+weights: output/yolox_l_300e_coco/model_final
diff --git a/PaddleDetection-release-2.6/configs/yolox/yolox_m_300e_coco.yml b/PaddleDetection-release-2.6/configs/yolox/yolox_m_300e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..4c25d7e2561cf120b60b712e621ad695debdb61c
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolox/yolox_m_300e_coco.yml
@@ -0,0 +1,13 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ './_base_/optimizer_300e.yml',
+ './_base_/yolox_cspdarknet.yml',
+ './_base_/yolox_reader.yml'
+]
+depth_mult: 0.67
+width_mult: 0.75
+
+log_iter: 100
+snapshot_epoch: 10
+weights: output/yolox_m_300e_coco/model_final
diff --git a/PaddleDetection-release-2.6/configs/yolox/yolox_nano_300e_coco.yml b/PaddleDetection-release-2.6/configs/yolox/yolox_nano_300e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..80b8b5c51fbc200ecce2ff10013b7e9a94300999
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolox/yolox_nano_300e_coco.yml
@@ -0,0 +1,81 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ './_base_/optimizer_300e.yml',
+ './_base_/yolox_cspdarknet.yml',
+ './_base_/yolox_reader.yml'
+]
+depth_mult: 0.33
+width_mult: 0.25
+
+log_iter: 100
+snapshot_epoch: 10
+weights: output/yolox_nano_300e_coco/model_final
+
+
+### model config:
+# Note: YOLOX-nano uses depthwise conv in backbone, neck and head.
+YOLOX:
+ backbone: CSPDarkNet
+ neck: YOLOCSPPAN
+ head: YOLOXHead
+ size_stride: 32
+ size_range: [10, 20] # multi-scale range [320*320 ~ 640*640]
+
+CSPDarkNet:
+ arch: "X"
+ return_idx: [2, 3, 4]
+ depthwise: True
+
+YOLOCSPPAN:
+ depthwise: True
+
+YOLOXHead:
+ depthwise: True
+
+
+### reader config:
+# Note: YOLOX-tiny/nano use 416*416 for evaluation and inference.
+# The multi-scale training setting is in the model config; TrainReader's operators use 640*640 by default.
+worker_num: 4
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - Mosaic:
+ prob: 0.5 # 1.0 in YOLOX-tiny/s/m/l/x
+ input_dim: [640, 640]
+ degrees: [-10, 10]
+ scale: [0.5, 1.5] # [0.1, 2.0] in YOLOX-s/m/l/x
+ shear: [-2, 2]
+ translate: [-0.1, 0.1]
+ enable_mixup: False # True in YOLOX-s/m/l/x
+ - AugmentHSV: {is_bgr: False, hgain: 5, sgain: 30, vgain: 30}
+ - PadResize: {target_size: 640}
+ - RandomFlip: {}
+ batch_transforms:
+ - Permute: {}
+ batch_size: 8
+ shuffle: True
+ drop_last: True
+ collate_batch: False
+ mosaic_epoch: 285
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [416, 416], keep_ratio: True, interp: 1}
+ - Pad: {size: [416, 416], fill_value: [114., 114., 114.]}
+ - Permute: {}
+ batch_size: 8
+
+
+TestReader:
+ inputs_def:
+ image_shape: [3, 416, 416]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [416, 416], keep_ratio: True, interp: 1}
+ - Pad: {size: [416, 416], fill_value: [114., 114., 114.]}
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/yolox/yolox_s_300e_coco.yml b/PaddleDetection-release-2.6/configs/yolox/yolox_s_300e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..9ba6120a93ec1d5c46cc8d8dc88351671ff44349
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolox/yolox_s_300e_coco.yml
@@ -0,0 +1,13 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ './_base_/optimizer_300e.yml',
+ './_base_/yolox_cspdarknet.yml',
+ './_base_/yolox_reader.yml'
+]
+depth_mult: 0.33
+width_mult: 0.50
+
+log_iter: 100
+snapshot_epoch: 10
+weights: output/yolox_s_300e_coco/model_final
diff --git a/PaddleDetection-release-2.6/configs/yolox/yolox_tiny_300e_coco.yml b/PaddleDetection-release-2.6/configs/yolox/yolox_tiny_300e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..c81c172d27982c460bbead78f966158c67de7bc2
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolox/yolox_tiny_300e_coco.yml
@@ -0,0 +1,69 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ './_base_/optimizer_300e.yml',
+ './_base_/yolox_cspdarknet.yml',
+ './_base_/yolox_reader.yml'
+]
+depth_mult: 0.33
+width_mult: 0.375
+
+log_iter: 100
+snapshot_epoch: 10
+weights: output/yolox_tiny_300e_coco/model_final
+
+
+### model config:
+YOLOX:
+ backbone: CSPDarkNet
+ neck: YOLOCSPPAN
+ head: YOLOXHead
+ size_stride: 32
+  size_range: [10, 20] # multi-scale range [320*320 ~ 640*640]
+
+
+### reader config:
+# Note: YOLOX-tiny/nano use 416*416 for evaluation and inference.
+# The multi-scale training setting is in the model config; TrainReader's operators use 640*640 by default.
+worker_num: 4
+TrainReader:
+ sample_transforms:
+ - Decode: {}
+ - Mosaic:
+ prob: 1.0
+ input_dim: [640, 640]
+ degrees: [-10, 10]
+ scale: [0.5, 1.5] # [0.1, 2.0] in YOLOX-s/m/l/x
+ shear: [-2, 2]
+ translate: [-0.1, 0.1]
+ enable_mixup: False # True in YOLOX-s/m/l/x
+ - AugmentHSV: {is_bgr: False, hgain: 5, sgain: 30, vgain: 30}
+ - PadResize: {target_size: 640}
+ - RandomFlip: {}
+ batch_transforms:
+ - Permute: {}
+ batch_size: 8
+ shuffle: True
+ drop_last: True
+ collate_batch: False
+ mosaic_epoch: 285
+
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [416, 416], keep_ratio: True, interp: 1}
+ - Pad: {size: [416, 416], fill_value: [114., 114., 114.]}
+ - Permute: {}
+ batch_size: 8
+
+
+TestReader:
+ inputs_def:
+ image_shape: [3, 416, 416]
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [416, 416], keep_ratio: True, interp: 1}
+ - Pad: {size: [416, 416], fill_value: [114., 114., 114.]}
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/configs/yolox/yolox_x_300e_coco.yml b/PaddleDetection-release-2.6/configs/yolox/yolox_x_300e_coco.yml
new file mode 100644
index 0000000000000000000000000000000000000000..fd8e0d2eb0fbc2d8f052e549b71f9995aa325a05
--- /dev/null
+++ b/PaddleDetection-release-2.6/configs/yolox/yolox_x_300e_coco.yml
@@ -0,0 +1,13 @@
+_BASE_: [
+ '../datasets/coco_detection.yml',
+ '../runtime.yml',
+ './_base_/optimizer_300e.yml',
+ './_base_/yolox_cspdarknet.yml',
+ './_base_/yolox_reader.yml'
+]
+depth_mult: 1.33
+width_mult: 1.25
+
+log_iter: 100
+snapshot_epoch: 10
+weights: output/yolox_x_300e_coco/model_final
diff --git a/PaddleDetection-release-2.6/dataset/coco/download_coco.py b/PaddleDetection-release-2.6/dataset/coco/download_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..993218fff32b9d78eab43e4a37264b031338f496
--- /dev/null
+++ b/PaddleDetection-release-2.6/dataset/coco/download_coco.py
@@ -0,0 +1,28 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import sys
+import os.path as osp
+import logging
+# add python path of PaddleDetection to sys.path
+parent_path = osp.abspath(osp.join(__file__, *(['..'] * 3)))
+if parent_path not in sys.path:
+ sys.path.append(parent_path)
+
+from ppdet.utils.download import download_dataset
+
+logging.basicConfig(level=logging.INFO)
+
+download_path = osp.split(osp.realpath(sys.argv[0]))[0]
+download_dataset(download_path, 'coco')
diff --git a/PaddleDetection-release-2.6/dataset/dota/.gitignore b/PaddleDetection-release-2.6/dataset/dota/.gitignore
new file mode 100644
index 0000000000000000000000000000000000000000..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
diff --git a/PaddleDetection-release-2.6/dataset/mot/gen_labels_MOT.py b/PaddleDetection-release-2.6/dataset/mot/gen_labels_MOT.py
new file mode 100644
index 0000000000000000000000000000000000000000..22995aed3aefa392c245cf45b1206b18b7d7119f
--- /dev/null
+++ b/PaddleDetection-release-2.6/dataset/mot/gen_labels_MOT.py
@@ -0,0 +1,65 @@
+# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import os.path as osp
+import os
+import numpy as np
+
+MOT_data = 'MOT16'
+
+# choose a data in ['MOT15', 'MOT16', 'MOT17', 'MOT20']
+# or your custom data (prepare it following the 'docs/tutorials/PrepareMOTDataSet.md')
+
+
+def mkdirs(d):
+ if not osp.exists(d):
+ os.makedirs(d)
+
+
+seq_root = './{}/images/train'.format(MOT_data)
+label_root = './{}/labels_with_ids/train'.format(MOT_data)
+mkdirs(label_root)
+seqs = [s for s in os.listdir(seq_root)]
+
+tid_curr = 0
+tid_last = -1
+for seq in seqs:
+ seq_info = open(osp.join(seq_root, seq, 'seqinfo.ini')).read()
+ seq_width = int(seq_info[seq_info.find('imWidth=') + 8:seq_info.find(
+ '\nimHeight')])
+ seq_height = int(seq_info[seq_info.find('imHeight=') + 9:seq_info.find(
+ '\nimExt')])
+
+ gt_txt = osp.join(seq_root, seq, 'gt', 'gt.txt')
+ gt = np.loadtxt(gt_txt, dtype=np.float64, delimiter=',')
+
+ seq_label_root = osp.join(label_root, seq, 'img1')
+ mkdirs(seq_label_root)
+
+ for fid, tid, x, y, w, h, mark, label, _ in gt:
+ if mark == 0 or not label == 1:
+ continue
+ fid = int(fid)
+ tid = int(tid)
+ if not tid == tid_last:
+ tid_curr += 1
+ tid_last = tid
+ x += w / 2
+ y += h / 2
+ label_fpath = osp.join(seq_label_root, '{:06d}.txt'.format(fid))
+ label_str = '0 {:d} {:.6f} {:.6f} {:.6f} {:.6f}\n'.format(
+ tid_curr, x / seq_width, y / seq_height, w / seq_width,
+ h / seq_height)
+ with open(label_fpath, 'a') as f:
+ f.write(label_str)
diff --git a/PaddleDetection-release-2.6/dataset/roadsign_voc/download_roadsign_voc.py b/PaddleDetection-release-2.6/dataset/roadsign_voc/download_roadsign_voc.py
new file mode 100644
index 0000000000000000000000000000000000000000..7d8ef2252f3d8b91f9c0c30e6be5ad186a00c18f
--- /dev/null
+++ b/PaddleDetection-release-2.6/dataset/roadsign_voc/download_roadsign_voc.py
@@ -0,0 +1,28 @@
+# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import sys
+import os.path as osp
+import logging
+# add python path of PaddleDetection to sys.path
+parent_path = osp.abspath(osp.join(__file__, *(['..'] * 3)))
+if parent_path not in sys.path:
+ sys.path.append(parent_path)
+
+from ppdet.utils.download import download_dataset
+
+logging.basicConfig(level=logging.INFO)
+
+download_path = osp.split(osp.realpath(sys.argv[0]))[0]
+download_dataset(download_path, 'roadsign_voc')
diff --git a/PaddleDetection-release-2.6/dataset/roadsign_voc/label_list.txt b/PaddleDetection-release-2.6/dataset/roadsign_voc/label_list.txt
new file mode 100644
index 0000000000000000000000000000000000000000..1be460f457a2fdbec91d3a69377c232ae4a6beb0
--- /dev/null
+++ b/PaddleDetection-release-2.6/dataset/roadsign_voc/label_list.txt
@@ -0,0 +1,4 @@
+speedlimit
+crosswalk
+trafficlight
+stop
\ No newline at end of file
diff --git a/PaddleDetection-release-2.6/dataset/spine_coco/download_spine_coco.py b/PaddleDetection-release-2.6/dataset/spine_coco/download_spine_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..2b23dd6387b66a9e42f26b59e5fe6fea7bf81d7d
--- /dev/null
+++ b/PaddleDetection-release-2.6/dataset/spine_coco/download_spine_coco.py
@@ -0,0 +1,28 @@
+# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import sys
+import os.path as osp
+import logging
+# add python path of PaddleDetection to sys.path
+parent_path = osp.abspath(osp.join(__file__, *(['..'] * 3)))
+if parent_path not in sys.path:
+ sys.path.append(parent_path)
+
+from ppdet.utils.download import download_dataset
+
+logging.basicConfig(level=logging.INFO)
+
+download_path = osp.split(osp.realpath(sys.argv[0]))[0]
+download_dataset(download_path, 'spine_coco')
diff --git a/PaddleDetection-release-2.6/dataset/voc/create_list.py b/PaddleDetection-release-2.6/dataset/voc/create_list.py
new file mode 100644
index 0000000000000000000000000000000000000000..7696073448d1dc65e1e0e20919048b69658d5ea1
--- /dev/null
+++ b/PaddleDetection-release-2.6/dataset/voc/create_list.py
@@ -0,0 +1,28 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import sys
+import os.path as osp
+import logging
+# add python path of PaddleDetection to sys.path
+parent_path = osp.abspath(osp.join(__file__, *(['..'] * 3)))
+if parent_path not in sys.path:
+ sys.path.append(parent_path)
+
+from ppdet.utils.download import create_voc_list
+
+logging.basicConfig(level=logging.INFO)
+
+voc_path = osp.split(osp.realpath(sys.argv[0]))[0]
+create_voc_list(voc_path)
diff --git a/PaddleDetection-release-2.6/dataset/voc/download_voc.py b/PaddleDetection-release-2.6/dataset/voc/download_voc.py
new file mode 100644
index 0000000000000000000000000000000000000000..2375fbf3c17c6424763ea5323f4a470f30eff3df
--- /dev/null
+++ b/PaddleDetection-release-2.6/dataset/voc/download_voc.py
@@ -0,0 +1,28 @@
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import sys
+import os.path as osp
+import logging
+# add python path of PaddleDetection to sys.path
+parent_path = osp.abspath(osp.join(__file__, *(['..'] * 3)))
+if parent_path not in sys.path:
+ sys.path.append(parent_path)
+
+from ppdet.utils.download import download_dataset
+
+logging.basicConfig(level=logging.INFO)
+
+download_path = osp.split(osp.realpath(sys.argv[0]))[0]
+download_dataset(download_path, 'voc')
diff --git a/PaddleDetection-release-2.6/dataset/voc/label_list.txt b/PaddleDetection-release-2.6/dataset/voc/label_list.txt
new file mode 100644
index 0000000000000000000000000000000000000000..9513a5faa0f2f75a3f9aa2470ff541a16dc888da
--- /dev/null
+++ b/PaddleDetection-release-2.6/dataset/voc/label_list.txt
@@ -0,0 +1,2 @@
+fall
+nofall
\ No newline at end of file
diff --git a/PaddleDetection-release-2.6/dataset/wider_face/download_wider_face.sh b/PaddleDetection-release-2.6/dataset/wider_face/download_wider_face.sh
new file mode 100644
index 0000000000000000000000000000000000000000..59a2054def3dfa7e27a2ac7ba84b779800a32933
--- /dev/null
+++ b/PaddleDetection-release-2.6/dataset/wider_face/download_wider_face.sh
@@ -0,0 +1,21 @@
+# All rights `PaddleDetection` reserved
+# References:
+# @inproceedings{yang2016wider,
+# Author = {Yang, Shuo and Luo, Ping and Loy, Chen Change and Tang, Xiaoou},
+# Booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
+# Title = {WIDER FACE: A Face Detection Benchmark},
+# Year = {2016}}
+
+DIR="$( cd "$(dirname "$0")" ; pwd -P )"
+cd "$DIR"
+
+# Download the data.
+echo "Downloading..."
+wget https://dataset.bj.bcebos.com/wider_face/WIDER_train.zip
+wget https://dataset.bj.bcebos.com/wider_face/WIDER_val.zip
+wget https://dataset.bj.bcebos.com/wider_face/wider_face_split.zip
+# Extract the data.
+echo "Extracting..."
+unzip -q WIDER_train.zip
+unzip -q WIDER_val.zip
+unzip -q wider_face_split.zip
diff --git a/PaddleDetection-release-2.6/demo/000000014439.jpg b/PaddleDetection-release-2.6/demo/000000014439.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..0abbdab06eb5950b93908cc91adfa640e8a3ac78
Binary files /dev/null and b/PaddleDetection-release-2.6/demo/000000014439.jpg differ
diff --git a/PaddleDetection-release-2.6/demo/000000014439_640x640.jpg b/PaddleDetection-release-2.6/demo/000000014439_640x640.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..58e9d3e228af43c9b55d8d0cb385ce82ebb8b996
Binary files /dev/null and b/PaddleDetection-release-2.6/demo/000000014439_640x640.jpg differ
diff --git a/PaddleDetection-release-2.6/demo/000000087038.jpg b/PaddleDetection-release-2.6/demo/000000087038.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..9f77f5d5f057b6f92dc096da704ecb8dee99bdf5
Binary files /dev/null and b/PaddleDetection-release-2.6/demo/000000087038.jpg differ
diff --git a/PaddleDetection-release-2.6/demo/000000570688.jpg b/PaddleDetection-release-2.6/demo/000000570688.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..cb304bd56c4010c08611a30dcca58ea9140cea54
Binary files /dev/null and b/PaddleDetection-release-2.6/demo/000000570688.jpg differ
diff --git a/PaddleDetection-release-2.6/demo/39006.jpg b/PaddleDetection-release-2.6/demo/39006.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..ce980e366cac812263d5dbe4e660209345997688
Binary files /dev/null and b/PaddleDetection-release-2.6/demo/39006.jpg differ
diff --git a/PaddleDetection-release-2.6/demo/P0072__1.0__0___0.png b/PaddleDetection-release-2.6/demo/P0072__1.0__0___0.png
new file mode 100644
index 0000000000000000000000000000000000000000..d3e307e7eec4b26b824cb717b619ecf2c88fb7f0
--- /dev/null
+++ b/PaddleDetection-release-2.6/demo/P0072__1.0__0___0.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8ec9093d3b3ea72d578453b06497e35c4f7e62e0f0d8443d80f4c44af44d6c77
+size 1680618
diff --git a/PaddleDetection-release-2.6/demo/P0861__1.0__1154___824.png b/PaddleDetection-release-2.6/demo/P0861__1.0__1154___824.png
new file mode 100644
index 0000000000000000000000000000000000000000..56dac9aa07f82f657bb0bbd51926b2ee67d8e37d
--- /dev/null
+++ b/PaddleDetection-release-2.6/demo/P0861__1.0__1154___824.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a58b94a1bc149eae55ed852c742d839b6f5902d81a053cb72868c91254adea32
+size 1256871
diff --git a/PaddleDetection-release-2.6/demo/car.jpg b/PaddleDetection-release-2.6/demo/car.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..486788dd3445cc1dbca7b7b835fc87f999701664
--- /dev/null
+++ b/PaddleDetection-release-2.6/demo/car.jpg
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b6d75247fcd88918054fdbcd09864e3d303de064d28ca18766b8568a31c0d898
+size 2283212
diff --git a/PaddleDetection-release-2.6/demo/hrnet_demo.jpg b/PaddleDetection-release-2.6/demo/hrnet_demo.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..4d0c4ac247670f5941f6fe2115d288e7a5604f0d
Binary files /dev/null and b/PaddleDetection-release-2.6/demo/hrnet_demo.jpg differ
diff --git a/PaddleDetection-release-2.6/demo/orange_71.jpg b/PaddleDetection-release-2.6/demo/orange_71.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..da7974a1a1371298f1ca5f4ef9c82bd3824d7ac3
Binary files /dev/null and b/PaddleDetection-release-2.6/demo/orange_71.jpg differ
diff --git a/PaddleDetection-release-2.6/demo/road554.png b/PaddleDetection-release-2.6/demo/road554.png
new file mode 100644
index 0000000000000000000000000000000000000000..7733e57f922b0fee893775da4f698c202804966f
Binary files /dev/null and b/PaddleDetection-release-2.6/demo/road554.png differ
diff --git a/PaddleDetection-release-2.6/demo/visdrone_0000315_01601_d_0000509.jpg b/PaddleDetection-release-2.6/demo/visdrone_0000315_01601_d_0000509.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..cc7a3602c1c015213ca1f7e27b0d006e827ee935
Binary files /dev/null and b/PaddleDetection-release-2.6/demo/visdrone_0000315_01601_d_0000509.jpg differ
diff --git a/PaddleDetection-release-2.6/deploy/BENCHMARK_INFER.md b/PaddleDetection-release-2.6/deploy/BENCHMARK_INFER.md
new file mode 100644
index 0000000000000000000000000000000000000000..988cf30f6c672195d4b3833fe9a186b497a11c2e
--- /dev/null
+++ b/PaddleDetection-release-2.6/deploy/BENCHMARK_INFER.md
@@ -0,0 +1,60 @@
+# Inference Benchmark
+
+## 1. Environment Setup
+- 1. Test environment:
+  - CUDA 10.1
+  - CUDNN 7.6
+  - TensorRT-6.0.1
+  - PaddlePaddle v2.0.1
+  - GPUs: Tesla V100, GTX 1080Ti, and Jetson AGX Xavier
+- 2. Test method:
+  - To make inference speeds comparable across models, every test uses the same 3x640x640 input image, `demo/000000014439_640x640.jpg`.
+  - Batch size = 1
+  - The first 100 warmup rounds are excluded; the reported time is the average over 100 rounds in ms/image, covering both network computation and copying data back to the CPU.
+  - The Fluid C++ inference engine is used, both plain Fluid C++ and Fluid-TensorRT; Float32 (FP32) and Float16 (FP16) speeds are both measured below.
+
+**Note:** For the difference between fixed and dynamic shapes in TensorRT, see the [TensorRT tutorial](TENSOR_RT.md). Because fixed-shape support for two-stage models is incomplete, the Faster RCNN models are tested with dynamic shapes. Fixed and dynamic shapes do not fuse exactly the same set of OPs, so the same model may perform slightly differently under the two settings.
+
+## 2. Inference Speed
+
+### 2.1 Linux
+#### (1) Tesla V100
+
+| Model | Backbone | Fixed shape | Input size | paddle_inference | trt_fp32 | trt_fp16 |
+|-------------------------------|--------------|--------|----------|------------------|----------|----------|
+| Faster RCNN FPN | ResNet50 | no | 640x640 | 27.99 | 26.15 | 21.92 |
+| Faster RCNN FPN | ResNet50 | no | 800x1312 | 32.49 | 25.54 | 21.70 |
+| YOLOv3 | Mobilenet\_v1 | yes | 608x608 | 9.74 | 8.61 | 6.28 |
+| YOLOv3 | Darknet53 | yes | 608x608 | 17.84 | 15.43 | 9.86 |
+| PPYOLO | ResNet50 | yes | 608x608 | 20.77 | 18.40 | 13.53 |
+| SSD | Mobilenet\_v1 | yes | 300x300 | 5.17 | 4.43 | 4.29 |
+| TTFNet | Darknet53 | yes | 512x512 | 10.14 | 8.71 | 5.55 |
+| FCOS | ResNet50 | yes | 640x640 | 35.47 | 35.02 | 34.24 |
+
+
+#### (2) Jetson AGX Xavier
+
+| Model | Backbone | Fixed shape | Input size | paddle_inference | trt_fp32 | trt_fp16 |
+|-------------------------------|--------------|--------|----------|------------------|----------|----------|
+| Faster RCNN FPN | ResNet50 | no | 640x640 | 169.45 | 158.92 | 119.25 |
+| Faster RCNN FPN | ResNet50 | no | 800x1312 | 228.07 | 156.39 | 117.03 |
+| YOLOv3 | Mobilenet\_v1 | yes | 608x608 | 48.76 | 43.83 | 18.41 |
+| YOLOv3 | Darknet53 | yes | 608x608 | 121.61 | 110.30 | 42.38 |
+| PPYOLO | ResNet50 | yes | 608x608 | 111.80 | 99.40 | 48.05 |
+| SSD | Mobilenet\_v1 | yes | 300x300 | 10.52 | 8.84 | 8.77 |
+| TTFNet | Darknet53 | yes | 512x512 | 73.77 | 64.03 | 31.46 |
+| FCOS | ResNet50 | yes | 640x640 | 217.11 | 214.38 | 205.78 |
+
+### 2.2 Windows
+#### (1) GTX 1080 Ti
+
+| Model | Backbone | Fixed shape | Input size | paddle_inference | trt_fp32 | trt_fp16 |
+|-------------------------------|--------------|--------|----------|------------------|----------|----------|
+| Faster RCNN FPN | ResNet50 | no | 640x640 | 50.74 | 57.17 | 62.08 |
+| Faster RCNN FPN | ResNet50 | no | 800x1312 | 50.31 | 57.61 | 62.05 |
+| YOLOv3 | Mobilenet\_v1 | yes | 608x608 | 14.51 | 11.23 | 11.13 |
+| YOLOv3 | Darknet53 | yes | 608x608 | 30.26 | 23.92 | 24.02 |
+| PPYOLO | ResNet50 | yes | 608x608 | 38.06 | 31.40 | 31.94 |
+| SSD | Mobilenet\_v1 | yes | 300x300 | 16.47 | 13.87 | 13.76 |
+| TTFNet | Darknet53 | yes | 512x512 | 21.83 | 17.14 | 17.09 |
+| FCOS | ResNet50 | yes | 640x640 | 71.88 | 69.93 | 69.52 |
diff --git a/PaddleDetection-release-2.6/deploy/BENCHMARK_INFER_en.md b/PaddleDetection-release-2.6/deploy/BENCHMARK_INFER_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..b0b92b6cc142bb6b07a703ccadc5a017f8080956
--- /dev/null
+++ b/PaddleDetection-release-2.6/deploy/BENCHMARK_INFER_en.md
@@ -0,0 +1,61 @@
+# Inference Benchmark
+
+## 1. Prepare the Environment
+- 1. Test environment:
+  - CUDA 10.1
+  - CUDNN 7.6
+  - TensorRT-6.0.1
+  - PaddlePaddle v2.0.1
+  - GPUs: Tesla V100, GTX 1080 Ti, and Jetson AGX Xavier
+- 2. Test method:
+  - To make inference speeds comparable across models, every test uses the same 3x640x640 input image, `demo/000000014439_640x640.jpg`.
+  - Batch size = 1
+  - The first 100 warmup rounds are excluded; the reported time is the average over 100 rounds in ms/image, covering both network computation and copying data back to the CPU.
+  - The Fluid C++ prediction engine is used, both plain Fluid C++ and Fluid-TensorRT; Float32 (FP32) and Float16 (FP16) speeds are both measured below.
+
+**Attention:** For the difference between fixed and dynamic shapes in TensorRT, please refer to the [TensorRT tutorial](TENSOR_RT.md). Due to the incomplete support for two-stage models under fixed shapes, the Faster RCNN models are tested with dynamic shapes. Fixed and dynamic shapes do not fuse exactly the same set of OPs, so the performance of the same model tested at fixed and dynamic shapes may differ slightly.
+
+
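The measurement protocol above (discard a fixed number of warmup rounds, then average the timed rounds) can be sketched in plain Python. This is a minimal illustration, not part of the benchmark scripts; `run_inference` is a stand-in for a real predictor call:

```python
import time

def benchmark(run_inference, warmup=100, repeats=100):
    """Return the average latency of run_inference() in ms,
    excluding the first `warmup` untimed rounds."""
    for _ in range(warmup):          # warmup rounds, not timed
        run_inference()
    start = time.perf_counter()
    for _ in range(repeats):         # timed rounds
        run_inference()
    elapsed = time.perf_counter() - start
    return elapsed / repeats * 1000.0  # ms per image at batch size 1

# A dummy workload standing in for a real predictor
avg_ms = benchmark(lambda: sum(i * i for i in range(1000)), warmup=10, repeats=20)
```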
+## 2. Inference Speed
+
+### 2.1 Linux
+#### (1) Tesla V100
+
+| Model | Backbone | Fixed shape | Input size | paddle_inference | trt_fp32 | trt_fp16 |
+| --------------- | ------------- | ----------------- | ------------ | ---------------- | -------- | -------- |
+| Faster RCNN FPN | ResNet50 | no | 640x640 | 27.99 | 26.15 | 21.92 |
+| Faster RCNN FPN | ResNet50 | no | 800x1312 | 32.49 | 25.54 | 21.70 |
+| YOLOv3 | Mobilenet\_v1 | yes | 608x608 | 9.74 | 8.61 | 6.28 |
+| YOLOv3 | Darknet53 | yes | 608x608 | 17.84 | 15.43 | 9.86 |
+| PPYOLO | ResNet50 | yes | 608x608 | 20.77 | 18.40 | 13.53 |
+| SSD | Mobilenet\_v1 | yes | 300x300 | 5.17 | 4.43 | 4.29 |
+| TTFNet | Darknet53 | yes | 512x512 | 10.14 | 8.71 | 5.55 |
+| FCOS | ResNet50 | yes | 640x640 | 35.47 | 35.02 | 34.24 |
+
+
+#### (2) Jetson AGX Xavier
+
+| Model | Backbone | Fixed shape | Input size | paddle_inference | trt_fp32 | trt_fp16 |
+| --------------- | ------------- | ----------------- | ------------ | ---------------- | -------- | -------- |
+| Faster RCNN FPN | ResNet50 | no | 640x640 | 169.45 | 158.92 | 119.25 |
+| Faster RCNN FPN | ResNet50 | no | 800x1312 | 228.07 | 156.39 | 117.03 |
+| YOLOv3 | Mobilenet\_v1 | yes | 608x608 | 48.76 | 43.83 | 18.41 |
+| YOLOv3 | Darknet53 | yes | 608x608 | 121.61 | 110.30 | 42.38 |
+| PPYOLO | ResNet50 | yes | 608x608 | 111.80 | 99.40 | 48.05 |
+| SSD | Mobilenet\_v1 | yes | 300x300 | 10.52 | 8.84 | 8.77 |
+| TTFNet | Darknet53 | yes | 512x512 | 73.77 | 64.03 | 31.46 |
+| FCOS | ResNet50 | yes | 640x640 | 217.11 | 214.38 | 205.78 |
+
+### 2.2 Windows
+#### (1) GTX 1080 Ti
+
+| Model | Backbone | Fixed shape | Input size | paddle_inference | trt_fp32 | trt_fp16 |
+| --------------- | ------------- | ----------------- | ------------ | ---------------- | -------- | -------- |
+| Faster RCNN FPN | ResNet50 | no | 640x640 | 50.74 | 57.17 | 62.08 |
+| Faster RCNN FPN | ResNet50 | no | 800x1312 | 50.31 | 57.61 | 62.05 |
+| YOLOv3 | Mobilenet\_v1 | yes | 608x608 | 14.51 | 11.23 | 11.13 |
+| YOLOv3 | Darknet53 | yes | 608x608 | 30.26 | 23.92 | 24.02 |
+| PPYOLO | ResNet50 | yes | 608x608 | 38.06 | 31.40 | 31.94 |
+| SSD | Mobilenet\_v1 | yes | 300x300 | 16.47 | 13.87 | 13.76 |
+| TTFNet | Darknet53 | yes | 512x512 | 21.83 | 17.14 | 17.09 |
+| FCOS | ResNet50 | yes | 640x640 | 71.88 | 69.93 | 69.52 |
diff --git a/PaddleDetection-release-2.6/deploy/EXPORT_MODEL.md b/PaddleDetection-release-2.6/deploy/EXPORT_MODEL.md
new file mode 100644
index 0000000000000000000000000000000000000000..91f34b5860d6384baf773e71a39ffa4ec773dee6
--- /dev/null
+++ b/PaddleDetection-release-2.6/deploy/EXPORT_MODEL.md
@@ -0,0 +1,54 @@
+# PaddleDetection Model Export Tutorial
+
+## 1. Model Export
+This section describes how to export a model with the `tools/export_model.py` script.
+### 1. Exported model inputs and outputs
+- The input variables and their shapes are as follows:
+
+  | Input name | Shape | Meaning |
+  | :---------: | ----------- | ---------- |
+  | image | [None, 3, H, W] | Image fed to the network; None is the batch dimension, and H and W are None when the input size is variable |
+  | im_shape | [None, 2] | Image size after resize, as (H, W); None is the batch dimension |
+  | scale_factor | [None, 2] | Ratio of the network input size to the original image size, as (scale_y, scale_x) |
+
+**Note:** See the TestReader section of the configuration file for the exact preprocessing.
+
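As an illustration of how `im_shape` and `scale_factor` relate to the original image, here is a sketch assuming a plain resize to the network input size (the actual preprocessing is whatever TestReader specifies); `resize_inputs` is a hypothetical helper, not a PaddleDetection API:

```python
def resize_inputs(orig_h, orig_w, target_h, target_w):
    """Compute the im_shape and scale_factor fed to an exported model,
    assuming a plain resize from (orig_h, orig_w) to (target_h, target_w)."""
    scale_y = target_h / orig_h
    scale_x = target_w / orig_w
    im_shape = [float(target_h), float(target_w)]   # size after resize, (H, W)
    scale_factor = [scale_y, scale_x]               # (scale_y, scale_x)
    return im_shape, scale_factor

# A 480x640 photo resized to a 608x608 network input
im_shape, scale_factor = resize_inputs(480, 640, 608, 608)
```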
+
+- The outputs of models exported via PaddleDetection's dynamic-to-static conversion are unified as:
+
+  - bbox: the NMS output with shape [N, 6], where N is the number of predicted boxes and the 6 values are [class_id, score, x1, y1, x2, y2].
+  - bbox\_num: the number of predicted boxes for each image. For example, with batch_size 2 the output is [N1, N2], meaning the first image has N1 boxes and the second has N2; their sum equals the first dimension N of the NMS output
+  - mask: emitted only if the network contains a mask branch
+
+**Note:** Dynamic-to-static export does not support models whose structure contains numpy operations.
+
+
+### 2. Command-line flags
+
+| FLAG | Purpose | Default | Note |
+|:--------------:|:--------------:|:------------:|:-----------------------------------------:|
+| -c | Specify the configuration file | None | |
+| --output_dir | Model save path | `./output_inference` | By default the model is saved under the `output/<config file name>/` path |
+
+### 3. Example
+
+Export a trained model with the following script
+
+```bash
+# Export a YOLOv3 model
+python tools/export_model.py -c configs/yolov3/yolov3_darknet53_270e_coco.yml --output_dir=./inference_model \
+ -o weights=weights/yolov3_darknet53_270e_coco.pdparams
+```
+
+The inference model is exported to `inference_model/yolov3_darknet53_270e_coco`, containing `infer_cfg.yml`, `model.pdiparams`, `model.pdiparams.info`, and `model.pdmodel`.
+
+
+### 4. Setting the input size of the exported model
+
+When predicting with Fluid-TensorRT, versions <= TensorRT 5.1 only support fixed-length input, so the image size of the saved model's `data` layer must match the actual input image size. The Fluid C++ prediction engine has no such limitation. Setting `image_shape` in TestReader changes the input image size saved in the model. For example:
+
+```bash
+# Export a YOLOv3 model with a 3x640x640 input
+python tools/export_model.py -c configs/yolov3/yolov3_darknet53_270e_coco.yml --output_dir=./inference_model \
+ -o weights=weights/yolov3_darknet53_270e_coco.pdparams TestReader.inputs_def.image_shape=[3,640,640]
+```
diff --git a/PaddleDetection-release-2.6/deploy/EXPORT_MODEL_en.md b/PaddleDetection-release-2.6/deploy/EXPORT_MODEL_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..d2828edeb8388b4633b7d8489923a059ef96321c
--- /dev/null
+++ b/PaddleDetection-release-2.6/deploy/EXPORT_MODEL_en.md
@@ -0,0 +1,53 @@
+# PaddleDetection Model Export Tutorial
+
+## 1. Model Export
+This section describes how to use the `tools/export_model.py` script to export models.
+### 1. Exported model inputs and outputs
+- Input variables and input shapes are as follows:
+
+  | Input Name | Input Shape | Meaning |
+  | :----------: | --------------- | ------------------------------------------------------------------------------------------------------------------------- |
+  | image | [None, 3, H, W] | Image fed to the network; None is the batch dimension, and H and W are None when the input size is variable |
+  | im_shape | [None, 2] | Image size after resize, as (H, W); None is the batch dimension |
+  | scale_factor | [None, 2] | Ratio of the network input size to the original image size, as (scale_y, scale_x) |
+
+**Attention:** For details about the preprocessing, see the TestReader section of the configuration file.
+
+
+- The outputs of models exported via PaddleDetection's dynamic-to-static conversion are unified as:
+
+  - bbox: the NMS output with shape [N, 6], where N is the number of prediction boxes and the 6 values are [class_id, score, x1, y1, x2, y2].
+  - bbox\_num: the number of prediction boxes for each image. For example, with batch size 2 the output is [N1, N2], meaning the first image contains N1 prediction boxes and the second contains N2; their sum equals the first dimension N of the NMS output
+  - mask: emitted only if the network contains a mask branch
+
+**Attention:** Dynamic-to-static export does not support cases where numpy operations are included in the model structure.
+
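A minimal sketch of how a batch's concatenated `bbox` output can be split back into per-image boxes using `bbox_num`; the arrays below are hand-made stand-ins for real model outputs, and `split_by_image` is a hypothetical helper, not a PaddleDetection API:

```python
def split_by_image(bbox, bbox_num):
    """Split the concatenated [N, 6] bbox output into one list per image.

    bbox rows are [class_id, score, x1, y1, x2, y2]; bbox_num[i] is the
    number of rows belonging to image i, so sum(bbox_num) == len(bbox).
    """
    per_image, offset = [], 0
    for n in bbox_num:
        per_image.append(bbox[offset:offset + n])
        offset += n
    return per_image

# Fake NMS output for a batch of 2 images: N1=2 boxes, N2=1 box
bbox = [
    [0, 0.92, 10, 10, 50, 60],
    [2, 0.40, 5, 7, 20, 25],
    [1, 0.88, 30, 40, 80, 90],
]
boxes_per_image = split_by_image(bbox, bbox_num=[2, 1])
```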
+
+### 2. Command-line flags
+
+| FLAG | USE | DEFAULT | NOTE |
+| :----------: | :-----------------------------: | :------------------: | :-------------------------------------------------------------------: |
+| -c | Specify the configuration file | None | |
+| --output_dir | Model save path | `./output_inference` | By default the model is saved under the `output/<config file name>/` path |
+
+### 3. Example
+
+Export a trained model with the following script:
+
+```bash
+# Export a YOLOv3 model
+python tools/export_model.py -c configs/yolov3/yolov3_darknet53_270e_coco.yml --output_dir=./inference_model \
+ -o weights=weights/yolov3_darknet53_270e_coco.pdparams
+```
+The prediction model will be exported to the `inference_model/yolov3_darknet53_270e_coco` directory, containing `infer_cfg.yml`, `model.pdiparams`, `model.pdiparams.info`, and `model.pdmodel`.
+
+
+### 4. Setting the input size of the exported model
+When using Fluid-TensorRT for prediction, versions <= TensorRT 5.1 only support fixed-length input, so the image size of the saved model's `data` layer must match the actual input image size. The Fluid C++ prediction engine does not have this limitation. Setting `image_shape` in TestReader changes the size of the input image saved in the model. The following is an example:
+
+
+```bash
+# Export the YOLOv3 model with the input 3x640x640
+python tools/export_model.py -c configs/yolov3/yolov3_darknet53_270e_coco.yml --output_dir=./inference_model \
+ -o weights=weights/yolov3_darknet53_270e_coco.pdparams TestReader.inputs_def.image_shape=[3,640,640]
+```
diff --git a/PaddleDetection-release-2.6/deploy/EXPORT_ONNX_MODEL.md b/PaddleDetection-release-2.6/deploy/EXPORT_ONNX_MODEL.md
new file mode 100644
index 0000000000000000000000000000000000000000..e1f4027833973a9c37fb9f144e77beeead3acb41
--- /dev/null
+++ b/PaddleDetection-release-2.6/deploy/EXPORT_ONNX_MODEL.md
@@ -0,0 +1,112 @@
+# Exporting PaddleDetection Models to ONNX Format
+
+PaddleDetection models can be saved in ONNX format. The currently tested models are listed below:
+
+| Model | Opset Version | Note |
+| :---- | :----- | :--- |
+| YOLOv3 | 11 | Only batch=1 inference is supported; the export shape must be fixed |
+| PP-YOLO | 11 | Only batch=1 inference is supported; MatrixNMS is converted to NMS, with slightly different precision; the export shape must be fixed |
+| PP-YOLOv2 | 11 | Only batch=1 inference is supported; MatrixNMS is converted to NMS, with slightly different precision; the export shape must be fixed |
+| PP-YOLO Tiny | 11 | Only batch=1 inference is supported; the export shape must be fixed |
+| PP-YOLOE | 11 | Only batch=1 inference is supported; the export shape must be fixed |
+| PP-PicoDet | 11 | Only batch=1 inference is supported; the export shape must be fixed |
+| FCOS | 11 | Only batch=1 inference is supported |
+| PAFNet | 11 | - |
+| TTFNet | 11 | - |
+| SSD | 11 | Only batch=1 inference is supported |
+| PP-TinyPose | 11 | - |
+| Faster RCNN | 16 | Only batch=1 inference is supported; requires Paddle2ONNX 0.9.7 or later |
+| Mask RCNN | 16 | Only batch=1 inference is supported; requires Paddle2ONNX 0.9.7 or later |
+| Cascade RCNN | 16 | Only batch=1 inference is supported; requires Paddle2ONNX 0.9.7 or later |
+| Cascade Mask RCNN | 16 | Only batch=1 inference is supported; requires Paddle2ONNX 0.9.7 or later |
+
+ONNX export is provided by [Paddle2ONNX](https://github.com/PaddlePaddle/Paddle2ONNX). If you run into problems during conversion, you can reach the engineers via an [issue](https://github.com/PaddlePaddle/Paddle2ONNX/issues) in the Paddle2ONNX GitHub project.
+
+## Export Tutorial
+
+### Step 1. Export the PaddlePaddle deployment model
+
+
+For the export procedure, see the [PaddleDetection deployment model export tutorial](./EXPORT_MODEL.md). Examples:
+
+- Non-RCNN models, taking YOLOv3 as an example
+```
+cd PaddleDetection
+python tools/export_model.py -c configs/yolov3/yolov3_darknet53_270e_coco.yml \
+ -o weights=https://paddledet.bj.bcebos.com/models/yolov3_darknet53_270e_coco.pdparams \
+ TestReader.inputs_def.image_shape=[3,608,608] \
+ --output_dir inference_model
+```
+The exported model is saved under `inference_model/yolov3_darknet53_270e_coco/` with the following structure
+```
+yolov3_darknet
+ ├── infer_cfg.yml # Model configuration file
+ ├── model.pdiparams # Static-graph model parameters
+ ├── model.pdiparams.info # Extra parameter information, usually not needed
+ └── model.pdmodel # Static-graph model file
+```
+> Note the `TestReader.inputs_def.image_shape` export flag: for YOLO-series models it must be specified at export time, otherwise the conversion fails
+
+- RCNN models, taking Faster RCNN as an example
+
+When exporting an RCNN-series model to ONNX, the control flow in the model must be removed, so the extra field `export_onnx=True` is required
+```
+cd PaddleDetection
+python tools/export_model.py -c configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.yml \
+ -o weights=https://paddledet.bj.bcebos.com/models/faster_rcnn_r50_fpn_1x_coco.pdparams \
+ export_onnx=True \
+ --output_dir inference_model
+```
+
+The exported model is saved under `inference_model/faster_rcnn_r50_fpn_1x_coco/` with the following structure
+```
+faster_rcnn_r50_fpn_1x_coco
+ ├── infer_cfg.yml # Model configuration file
+ ├── model.pdiparams # Static-graph model parameters
+ ├── model.pdiparams.info # Extra parameter information, usually not needed
+ └── model.pdmodel # Static-graph model file
+```
+
+### Step 2. Convert the deployment model to ONNX format
+Install Paddle2ONNX (version 0.9.7 or later)
+```
+pip install paddle2onnx
+```
+Convert with the following commands
+```
+# YOLOv3
+paddle2onnx --model_dir inference_model/yolov3_darknet53_270e_coco \
+ --model_filename model.pdmodel \
+ --params_filename model.pdiparams \
+ --opset_version 11 \
+ --save_file yolov3.onnx
+
+# Faster RCNN
+paddle2onnx --model_dir inference_model/faster_rcnn_r50_fpn_1x_coco \
+ --model_filename model.pdmodel \
+ --params_filename model.pdiparams \
+ --opset_version 16 \
+ --save_file faster_rcnn.onnx
+```
+The converted models are `yolov3.onnx` and `faster_rcnn.onnx` in the current directory
+
+### Step 3. Run inference with onnxruntime
+Install onnxruntime
+```
+pip install onnxruntime
+```
+Sample inference code is under [deploy/third_engine/onnx](./third_engine/onnx)
+
+Run inference with the following commands:
+```
+# YOLOv3
+python deploy/third_engine/onnx/infer.py \
+ --infer_cfg inference_model/yolov3_darknet53_270e_coco/infer_cfg.yml \
+ --onnx_file yolov3.onnx \
+ --image_file demo/000000014439.jpg
+
+# Faster RCNN
+python deploy/third_engine/onnx/infer.py \
+ --infer_cfg inference_model/faster_rcnn_r50_fpn_1x_coco/infer_cfg.yml \
+ --onnx_file faster_rcnn.onnx \
+ --image_file demo/000000014439.jpg
+```
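A model exported this way outputs detections as [N, 6] rows of [class_id, score, x1, y1, x2, y2] (see EXPORT_MODEL.md), which are usually filtered by a score threshold before visualization. A minimal sketch with hand-made rows standing in for real model output; `filter_detections` is a hypothetical helper, not part of `infer.py`:

```python
def filter_detections(bbox, score_threshold=0.5):
    """Keep rows of a [N, 6] detection array [class_id, score, x1, y1, x2, y2]
    whose score is at or above the threshold, as small dicts."""
    kept = []
    for class_id, score, x1, y1, x2, y2 in bbox:
        if score >= score_threshold:
            kept.append({"class_id": int(class_id),
                         "score": score,
                         "box": (x1, y1, x2, y2)})
    return kept

# Fake model output: two confident boxes, one low-score box
dets = filter_detections([
    [0, 0.91, 12, 15, 110, 200],
    [7, 0.12, 40, 40, 60, 60],
    [2, 0.67, 5, 8, 90, 120],
])
```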
diff --git a/PaddleDetection-release-2.6/deploy/EXPORT_ONNX_MODEL_en.md b/PaddleDetection-release-2.6/deploy/EXPORT_ONNX_MODEL_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..750959062dc20cc68600bbd89e9264468c11e4d6
--- /dev/null
+++ b/PaddleDetection-release-2.6/deploy/EXPORT_ONNX_MODEL_en.md
@@ -0,0 +1,110 @@
+# PaddleDetection Model Export as ONNX Format Tutorial
+
+PaddleDetection models can be saved in ONNX format. The currently tested models are listed below:
+
+| Model | Opset Version | NOTE |
+| :---- | :----- | :--- |
+| YOLOv3 | 11 | Only batch=1 inference is supported; model export needs a fixed shape |
+| PP-YOLO | 11 | Only batch=1 inference is supported; MatrixNMS is converted to NMS, with slightly different precision; model export needs a fixed shape |
+| PP-YOLOv2 | 11 | Only batch=1 inference is supported; MatrixNMS is converted to NMS, with slightly different precision; model export needs a fixed shape |
+| PP-YOLO Tiny | 11 | Only batch=1 inference is supported; model export needs a fixed shape |
+| PP-YOLOE | 11 | Only batch=1 inference is supported; model export needs a fixed shape |
+| PP-PicoDet | 11 | Only batch=1 inference is supported; model export needs a fixed shape |
+| FCOS | 11 | Only batch=1 inference is supported |
+| PAFNet | 11 | - |
+| TTFNet | 11 | - |
+| SSD | 11 | Only batch=1 inference is supported |
+| PP-TinyPose | 11 | - |
+| Faster RCNN | 16 | Only batch=1 inference is supported; requires paddle2onnx>=0.9.7 |
+| Mask RCNN | 16 | Only batch=1 inference is supported; requires paddle2onnx>=0.9.7 |
+| Cascade RCNN | 16 | Only batch=1 inference is supported; requires paddle2onnx>=0.9.7 |
+| Cascade Mask RCNN | 16 | Only batch=1 inference is supported; requires paddle2onnx>=0.9.7 |
+
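Several rows above note that MatrixNMS is converted to standard NMS during export, which is where the slight precision change comes from. For reference, greedy NMS boils down to the following minimal sketch (an illustration only, not Paddle2ONNX's actual implementation):

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes that overlap it
    above the threshold, then repeat with the remaining boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) < iou_threshold for j in keep):
            keep.append(i)
    return keep

# Two heavily overlapping boxes and one separate box
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
kept = nms(boxes, scores=[0.9, 0.8, 0.7])
```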
+
+ONNX export is provided by [Paddle2ONNX](https://github.com/PaddlePaddle/Paddle2ONNX). If you run into problems during conversion, you can reach the engineers via an [issue](https://github.com/PaddlePaddle/Paddle2ONNX/issues) in the Paddle2ONNX GitHub project.
+
+## Export Tutorial
+
+### Step 1. Export the Paddle deployment model
+For the export procedure, see the [tutorial on PaddleDetection deployment model export](./EXPORT_MODEL_en.md). Examples:
+
+- Non-RCNN models, taking YOLOv3 as an example
+```
+cd PaddleDetection
+python tools/export_model.py -c configs/yolov3/yolov3_darknet53_270e_coco.yml \
+ -o weights=https://paddledet.bj.bcebos.com/models/yolov3_darknet53_270e_coco.pdparams \
+ TestReader.inputs_def.image_shape=[3,608,608] \
+ --output_dir inference_model
+```
+The exported model is saved under `inference_model/yolov3_darknet53_270e_coco/`, with the following structure
+```
+yolov3_darknet
+ ├── infer_cfg.yml # Model configuration file
+ ├── model.pdiparams # Static-graph model parameters
+ ├── model.pdiparams.info # Extra parameter information, usually not needed
+ └── model.pdmodel # Static-graph model file
+```
+> Note `TestReader.inputs_def.image_shape`: for YOLO-series models this parameter must be specified at export time, otherwise the conversion fails
+
+- RCNN series models, taking Faster RCNN as an example
+
+When exporting an RCNN-series model to ONNX, the control flow in the model must be removed, so add `export_onnx=True` on the command line
+```
+cd PaddleDetection
+python tools/export_model.py -c configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.yml \
+ -o weights=https://paddledet.bj.bcebos.com/models/faster_rcnn_r50_fpn_1x_coco.pdparams \
+ export_onnx=True \
+ --output_dir inference_model
+```
+The exported model is saved under `inference_model/faster_rcnn_r50_fpn_1x_coco/`, with the following structure
+```
+faster_rcnn_r50_fpn_1x_coco
+ ├── infer_cfg.yml # Model configuration file
+ ├── model.pdiparams # Static-graph model parameters
+ ├── model.pdiparams.info # Extra parameter information, usually not needed
+ └── model.pdmodel # Static-graph model file
+```
+
+### Step 2. Convert the deployment model to ONNX format
+Install Paddle2ONNX (version 0.9.7 or higher)
+```
+pip install paddle2onnx
+```
+Use the following command to convert
+```
+# YOLOv3
+paddle2onnx --model_dir inference_model/yolov3_darknet53_270e_coco \
+ --model_filename model.pdmodel \
+ --params_filename model.pdiparams \
+ --opset_version 11 \
+ --save_file yolov3.onnx
+
+# Faster RCNN
+paddle2onnx --model_dir inference_model/faster_rcnn_r50_fpn_1x_coco \
+ --model_filename model.pdmodel \
+ --params_filename model.pdiparams \
+ --opset_version 16 \
+ --save_file faster_rcnn.onnx
+```
+The converted models are `yolov3.onnx` and `faster_rcnn.onnx` in the current directory
+
+### Step 3. Inference with onnxruntime
+Install onnxruntime
+```
+pip install onnxruntime
+```
+Inference code examples are in [deploy/third_engine/onnx](./third_engine/onnx)
+
+Use the following commands for inference:
+```
+# YOLOv3
+python deploy/third_engine/onnx/infer.py \
+ --infer_cfg inference_model/yolov3_darknet53_270e_coco/infer_cfg.yml \
+ --onnx_file yolov3.onnx \
+ --image_file demo/000000014439.jpg
+
+# Faster RCNN
+python deploy/third_engine/onnx/infer.py \
+ --infer_cfg inference_model/faster_rcnn_r50_fpn_1x_coco/infer_cfg.yml \
+ --onnx_file faster_rcnn.onnx \
+ --image_file demo/000000014439.jpg
+```
diff --git a/PaddleDetection-release-2.6/deploy/README.md b/PaddleDetection-release-2.6/deploy/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..ac1ba72f61c760d04376a510af55ed6bd4ac75b7
--- /dev/null
+++ b/PaddleDetection-release-2.6/deploy/README.md
@@ -0,0 +1,84 @@
+# PaddleDetection Inference Deployment
+
+PaddleDetection provides multiple deployment forms (Paddle Inference, Paddle Serving, and Paddle-Lite), supports server, mobile, and embedded platforms, and offers complete Python and C++ deployment solutions.
+
+## Deployment forms supported by PaddleDetection
+|Form|Language|Tutorial|Devices/Platforms|
+|-|-|-|-|
+|Paddle Inference|Python|Complete|Linux (ARM/X86), Windows|
+|Paddle Inference|C++|Complete|Linux (ARM/X86), Windows|
+|Paddle Serving|Python|Complete|Linux (ARM/X86), Windows|
+|Paddle-Lite|C++|Complete|Android, iOS, FPGA, RK...|
+
+
+## 1. Paddle Inference Deployment
+
+### 1.1 Export the model
+
+Use the `tools/export_model.py` script to export the model along with the configuration file used during deployment, named `infer_cfg.yml`. The export script:
+```bash
+# Export a YOLOv3 model
+python tools/export_model.py -c configs/yolov3/yolov3_mobilenet_v1_roadsign.yml -o weights=output/yolov3_mobilenet_v1_roadsign/best_model.pdparams
+```
+The inference model is exported to `output_inference/yolov3_mobilenet_v1_roadsign`, containing `infer_cfg.yml`, `model.pdiparams`, `model.pdiparams.info`, and `model.pdmodel`.
+For details on model export, see the [PaddleDetection model export tutorial](EXPORT_MODEL.md).
+
+### 1.2 Predict with Paddle Inference
+* Python deployment supports `CPU`, `GPU`, and `XPU` environments, Windows and Linux systems, and NV Jetson embedded devices. See the [Python deployment documentation](python/README.md)
+* C++ deployment supports `CPU`, `GPU`, and `XPU` environments, Windows and Linux systems, and NV Jetson embedded devices. See the [C++ deployment documentation](cpp/README.md)
+* PaddleDetection supports TensorRT acceleration; see the [TensorRT inference deployment tutorial](TENSOR_RT.md)
+
+**Note:** The Paddle inference library version must be >= 2.1, and batch_size > 1 is only supported by YOLOv3 and PP-YOLO.
+
+## 2. Paddle Serving Deployment
+### 2.1 Export the model
+
+To export a model in `PaddleServing` format, set `export_serving_model=True`:
+```buildoutcfg
+python tools/export_model.py -c configs/yolov3/yolov3_mobilenet_v1_roadsign.yml -o weights=output/yolov3_mobilenet_v1_roadsign/best_model.pdparams --export_serving_model=True
+```
+The inference model is exported to `output_inference/yolov3_darknet53_270e_coco`, containing `infer_cfg.yml`, `model.pdiparams`, `model.pdiparams.info`, `model.pdmodel`, and the `serving_client/` and `serving_server/` folders.
+
+For details on model export, see the [PaddleDetection model export tutorial](EXPORT_MODEL.md).
+
+### 2.2 Predict with Paddle Serving
+* [Install PaddleServing](https://github.com/PaddlePaddle/Serving/blob/develop/README.md#installation)
+* [Use PaddleServing](./serving/README.md)
+
+
+## 3. Paddle-Lite Deployment
+- [Deploy PaddleDetection models with Paddle-Lite](./lite/README.md)
+- For detailed cases, see [Paddle-Lite-Demo](https://github.com/PaddlePaddle/Paddle-Lite-Demo). For more, see [Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite)
+
+
+## 4. Third-Party Deployment (MNN, NCNN, OpenVINO)
+- Third-party deployment provides PicoDet and TinyPose cases; adapt them for other models
+- Recommended tools for TinyPose deployment: OpenVINO for Intel CPUs, Paddle Inference for GPUs, and Paddle-Lite or MNN for ARM/Android
+
+| Third_Engine | MNN | NCNN | OPENVINO |
+| ------------ | ---- | ----- | ---------- |
+| PicoDet | [PicoDet_MNN](./third_engine/demo_mnn/README.md) | [PicoDet_NCNN](./third_engine/demo_ncnn/README.md) | [PicoDet_OPENVINO](./third_engine/demo_openvino/README.md) |
+| TinyPose | [TinyPose_MNN](./third_engine/demo_mnn_kpts/README.md) | - | [TinyPose_OPENVINO](./third_engine/demo_openvino_kpts/README.md) |
+
+
+
+## 5. Benchmark Test
+- Run the Benchmark batch test script with the exported model:
+```shell
+sh deploy/benchmark/benchmark.sh {model_dir} {model_name}
+```
+**Note:** For a quantized model, use the `deploy/benchmark/benchmark_quant.sh` script instead.
+- Export the test result log to Excel:
+```
+python deploy/benchmark/log_parser_excel.py --log_path=./output_pipeline --output_name=benchmark_excel.xlsx
+```
+
+## 6. FAQ
+- 1. Can a model trained with `Paddle 1.8.4` be deployed with `Paddle 2.0`?
+  Paddle 2.0 is compatible with Paddle 1.8.4, so yes. However, some models (such as SOLOv2) use OPs newly added in Paddle 2.0 and cannot be deployed this way.
+
+- 2. On Windows, the prediction library was built with VS2015; will using VS2017 or VS2019 cause problems?
+  For VS compatibility, see [C++ binary compatibility between Visual Studio 2015, 2017, and 2019](https://docs.microsoft.com/zh-cn/cpp/porting/binary-compat-2015-2017?view=msvc-160)
+
+- 3. Does cuDNN 8.0.4 leak memory under continuous prediction?
+  QA testing found that the cuDNN 8 series leaks memory under continuous prediction and performs worse than cuDNN 7, so deployment with CUDA + cuDNN 7.6.4 is recommended.
diff --git a/PaddleDetection-release-2.6/deploy/README_en.md b/PaddleDetection-release-2.6/deploy/README_en.md
new file mode 100644
index 0000000000000000000000000000000000000000..f587b56b99e7a6b7c7ed31c5ae6307ade6e18126
--- /dev/null
+++ b/PaddleDetection-release-2.6/deploy/README_en.md
@@ -0,0 +1,83 @@
+# PaddleDetection Inference Deployment
+
+PaddleDetection provides multiple deployment forms (Paddle Inference, Paddle Serving, and Paddle-Lite), supports platforms such as server, mobile, and embedded, and offers complete Python and C++ deployment solutions.
+
+## Deployment forms supported by PaddleDetection
+| Form | Language | Tutorial | Devices/Platforms |
+| ---------------- | -------- | ----------- | ------------------------- |
+| Paddle Inference | Python | Complete | Linux (ARM/X86), Windows |
+| Paddle Inference | C++ | Complete | Linux (ARM/X86), Windows |
+| Paddle Serving | Python | Complete | Linux (ARM/X86), Windows |
+| Paddle-Lite | C++ | Complete | Android, iOS, FPGA, RK... |
+
+
+## 1.Paddle Inference Deployment
+
+### 1.1 Export the model
+
+Use the `tools/export_model.py` script to export the model and the configuration file used during deployment. The configuration file name is `infer_cfg.yml`. The model export script is as follows
+
+```bash
+# Export a YOLOv3 model
+python tools/export_model.py -c configs/yolov3/yolov3_mobilenet_v1_roadsign.yml -o weights=output/yolov3_mobilenet_v1_roadsign/best_model.pdparams
+```
+The prediction model will be exported to the `output_inference/yolov3_mobilenet_v1_roadsign` directory, containing `infer_cfg.yml`, `model.pdiparams`, `model.pdiparams.info`, and `model.pdmodel`. For details on model export, please refer to the [tutorial on PaddleDetection model export](./EXPORT_MODEL_en.md).
+
+### 1.2 Use Paddle Inference to Make Predictions
+* Python deployment supports `CPU`, `GPU` and `XPU` environments, Windows and Linux systems, and NV Jetson embedded devices. See the [Python deployment](python/README.md) documentation
+* C++ deployment supports `CPU`, `GPU` and `XPU` environments, Windows and Linux systems, and NV Jetson embedded devices. See the [C++ deployment](cpp/README.md) documentation
+* PaddleDetection supports TensorRT acceleration. Please refer to the documentation for [TensorRT Predictive Deployment Tutorial](TENSOR_RT.md)
+
+**Attention:** The Paddle prediction library version must be >= 2.1, and batch_size > 1 is only supported by YOLOv3 and PP-YOLO.
+
+## 2.PaddleServing Deployment
+### 2.1 Export model
+
+If you want to export the model in `PaddleServing` format, set `export_serving_model=True`:
+```buildoutcfg
+python tools/export_model.py -c configs/yolov3/yolov3_mobilenet_v1_roadsign.yml -o weights=output/yolov3_mobilenet_v1_roadsign/best_model.pdparams --export_serving_model=True
+```
+The prediction model will be exported to the `output_inference/yolov3_darknet53_270e_coco` directory, containing `infer_cfg.yml`, `model.pdiparams`, `model.pdiparams.info`, `model.pdmodel`, and the `serving_client/` and `serving_server/` folders.
+
+For details on model export, please refer to the documentation [Tutorial on Paddle Detection MODEL EXPORT](./EXPORT_MODEL_en.md).
+
+### 2.2 Use Paddle Serving for prediction
+* [Install PaddleServing](https://github.com/PaddlePaddle/Serving/blob/develop/README.md#installation)
+* [Use PaddleServing](./serving/README.md)
+
+
+## 3. PaddleLite Deployment
+- [Deploy the PaddleDetection model using Paddle Lite](./lite/README.md)
+- For details, please refer to the [Paddle-Lite-Demo](https://github.com/PaddlePaddle/Paddle-Lite-Demo) deployment examples. For more information, please refer to [Paddle-Lite](https://github.com/PaddlePaddle/Paddle-Lite)
+
+
+## 4. Third-Engine Deployment (MNN, NCNN, OpenVINO)
+- Third-engine deployment is illustrated with PicoDet and TinyPose; other models are deployed in the same way.
+- Suggestions for TinyPose: OpenVINO is recommended for Intel CPUs, Paddle Inference for NVIDIA GPUs, and Paddle Lite or MNN for ARM/Android.
+
+| Third_Engine | MNN | NCNN | OPENVINO |
+| ------------ | ------------------------------------------------------ | -------------------------------------------------- | ------------------------------------------------------------ |
+| PicoDet | [PicoDet_MNN](./third_engine/demo_mnn/README.md) | [PicoDet_NCNN](./third_engine/demo_ncnn/README.md) | [PicoDet_OPENVINO](./third_engine/demo_openvino/README.md) |
+| TinyPose | [TinyPose_MNN](./third_engine/demo_mnn_kpts/README.md) | - | [TinyPose_OPENVINO](./third_engine/demo_openvino_kpts/README.md) |
+
+
+## 5. Benchmark Test
+- Using the exported model, run the Benchmark batch test script:
+```shell
+sh deploy/benchmark/benchmark.sh {model_dir} {model_name}
+```
+**Attention:** For quantized models, please use the `deploy/benchmark/benchmark_quant.sh` script instead.
+- Export the test result log to Excel:
+```shell
+python deploy/benchmark/log_parser_excel.py --log_path=./output_pipeline --output_name=benchmark_excel.xlsx
+```
+
+## 6. FAQ
+- 1. Can models trained with `Paddle 1.8.4` be deployed with `Paddle 2.0`?
+  Paddle 2.0 is compatible with Paddle 1.8.4, so this generally works. However, some models (such as SOLOv2) use OPs newly added in Paddle 2.0, for which this is not possible.
+
+- 2. The Windows prediction library is compiled with VS2015; is it a problem to use VS2017 or VS2019?
+  For VS compatibility issues, please refer to: [C++ binary compatibility between Visual Studio 2015, 2017 and 2019](https://docs.microsoft.com/zh-cn/cpp/porting/binary-compat-2015-2017?view=msvc-160)
+
+- 3. Does cuDNN 8.0.4 leak memory during continuous prediction?
+  QA tests show that the cuDNN 8 series leaks memory during continuous prediction and performs worse than cuDNN 7. CUDA + cuDNN 7.6.4 is recommended for deployment.
diff --git a/PaddleDetection-release-2.6/deploy/TENSOR_RT.md b/PaddleDetection-release-2.6/deploy/TENSOR_RT.md
new file mode 100644
index 0000000000000000000000000000000000000000..b1dd29789540746cce5f7ea3ce0a783e2178438d
--- /dev/null
+++ b/PaddleDetection-release-2.6/deploy/TENSOR_RT.md
@@ -0,0 +1,98 @@
+# TensorRT Inference Deployment Tutorial
+TensorRT is NVIDIA's acceleration library for unified model deployment. It supports hardware such as V100 and Jetson Xavier and can greatly improve inference speed. For the Paddle TensorRT tutorial, please refer to [Inference with the Paddle-TensorRT library](https://www.paddlepaddle.org.cn/inference/optimize/paddle_trt.html)
+
+## 1. Install the Paddle Inference library
+- For the Python package, download and install a wheel built with TensorRT from [here](https://paddleinference.paddlepaddle.org.cn/user_guides/download_lib.html#python)
+
+- For the C++ inference library, download a build compiled with TensorRT from [here](https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/05_inference_deployment/inference/build_and_install_lib_cn.html)
+
+- If no prebuilt Python package or C++ library with TensorRT is provided, please compile it yourself following the [build-from-source guide](https://www.paddlepaddle.org.cn/documentation/docs/zh/install/compile/linux-compile.html)
+
+**Note:**
+- The TensorRT version on your machine must match the TensorRT version used by the inference library.
+- Deployment with PaddleDetection requires TensorRT version > 6.0.
+
+## 2. Export the model
+For details on model export, please refer to the [PaddleDetection model export tutorial](./EXPORT_MODEL.md).
+
+## 3. Enable TensorRT acceleration
+### 3.1 Configure TensorRT
+When building the predictor config with the Paddle inference library, simply enable the TensorRT engine:
+
+```cpp
+config->EnableUseGpu(100, 0); // allocate 100 MB of GPU memory initially, use GPU ID 0
+config->GpuDeviceId(); // returns the GPU ID currently in use
+// Enable TensorRT inference to improve GPU performance; requires an inference library built with TensorRT
+config->EnableTensorRtEngine(1 << 20 /*workspace_size*/,
+                             batch_size /*max_batch_size*/,
+                             3 /*min_subgraph_size*/,
+                             AnalysisConfig::Precision::kFloat32 /*precision*/,
+                             false /*use_static*/,
+                             false /*use_calib_mode*/);
+```
+**Note:**
+If `--run_benchmark` is set to True, the dependencies `pip install pynvml psutil GPUtil` must be installed.
+
+### 3.2 TensorRT fixed-size inference
+
+For example, set the following in the model Reader configuration file:
+```yaml
+TestReader:
+  inputs_def:
+    image_shape: [3,608,608]
+  ...
+```
+Alternatively, set `-o TestReader.inputs_def.image_shape=[3,608,608]` when exporting the model; the model will then run inference with a fixed input size. For details, please refer to the [PaddleDetection model export tutorial](./EXPORT_MODEL.md).
+
+You can open the `model.pdmodel` file with [visualdl](https://www.paddlepaddle.org.cn/paddle/visualdl/demo/graph) to check whether the first input tensor has a fixed size; if the size is not specified, it is shown as `?`, as in the figure below:
+
+
+
+Note: since TensorRT does not support slice operations along the batch dimension, Faster RCNN and Mask RCNN cannot run fixed-size inference, so the `TestReader.inputs_def.image_shape` field must not be set for them.
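Whether a model runs fixed-size or dynamic-size inference comes down to whether every dimension of `image_shape` is concrete. A minimal sketch of that check (illustrative only; `-1` plays the role of the `?` shown by visualdl):

```python
def is_fixed_shape(image_shape):
    """True when every [C, H, W] entry is a concrete positive size.

    An entry of -1 (displayed as `?` by visualdl) marks a dynamic
    dimension, which requires TensorRT dynamic-size inference instead.
    """
    return all(isinstance(d, int) and d > 0 for d in image_shape)

print(is_fixed_shape([3, 608, 608]))  # fixed-size input
print(is_fixed_shape([3, -1, -1]))    # dynamic-size input
```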
+
+Taking `YOLOv3` as an example, run fixed-size inference with:
+```shell
+python python/infer.py --model_dir=./output_inference/yolov3_darknet53_270e_coco/ --image_file=./demo/000000014439.jpg --device=GPU --run_mode=trt_fp32 --run_benchmark=True
+```
+
+### 3.3 TensorRT dynamic-size inference
+
+With TensorRT >= 6, inference with dynamic-size inputs is supported. If the model Reader configuration file does not set a field such as `TestReader.inputs_def.image_shape=[3,608,608]`, or sets `image_shape=[3,-1,-1]`, the exported model will run inference with dynamic sizes. RCNN-series models generally use dynamic-size inference.
+For dynamic-size inputs in the Paddle inference library, see the description of the `SetTRTDynamicShapeInfo` function in [Paddle C++ inference](https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/05_inference_deployment/inference/native_infer.html).
+
+Dynamic-size input parameters of `python/infer.py`:
+
+- trt_min_shape: the minimum height/width of the TensorRT input image. Default: 1
+
+- trt_max_shape: the maximum height/width of the TensorRT input image. Default: 1280
+
+- trt_opt_shape: the optimal height/width of the TensorRT input image. Default: 640
+
+**Note: dynamic shapes in `TensorRT` are 4-dimensional; only the size of the input image is set here.**
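The three flags above only bound the height/width of the image input, while TensorRT itself expects full 4-D min/opt/max shapes per input. The sketch below (a hypothetical helper, mirroring in spirit what is passed to `SetTRTDynamicShapeInfo`; the input name `image` is an assumption) shows how those shapes could be assembled:

```python
def build_trt_shape_info(trt_min_shape=1, trt_opt_shape=640, trt_max_shape=1280,
                         batch_size=1, channels=3, input_name="image"):
    """Assemble 4-D min/opt/max shape dicts for a single image input.

    Only H and W vary; batch and channel stay fixed. This is an
    illustrative simplification, not the actual deployment code.
    """
    def shape(hw):
        return [batch_size, channels, hw, hw]
    return ({input_name: shape(trt_min_shape)},
            {input_name: shape(trt_opt_shape)},
            {input_name: shape(trt_max_shape)})

min_s, opt_s, max_s = build_trt_shape_info(trt_min_shape=800,
                                           trt_opt_shape=960,
                                           trt_max_shape=1280)
print(min_s, opt_s, max_s)
```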
+
+Taking `Faster RCNN` as an example, run dynamic-size inference with:
+```shell
+python python/infer.py --model_dir=./output_inference/faster_rcnn_r50_fpn_1x_coco/ --image_file=./demo/000000014439.jpg --device=GPU --run_mode=trt_fp16 --run_benchmark=True --trt_max_shape=1280 --trt_min_shape=800 --trt_opt_shape=960
+```
+
+## 4. FAQ
+**Q:** Error: `tensorrt_op` not found
+**A:** Please check that you are using a Paddle Python package or inference library built with TensorRT.
+
+**Q:** Error: `op out of memory`
+**A:** Check whether the GPU is being shared by other users, and try a free GPU.
+
+**Q:** Error: `some trt inputs dynamic shape info not set`
+**A:** `TensorRT` splits the network into multiple subgraphs; only the dynamic sizes of the input data were set, while the inputs of the other subgraphs were not. There are two solutions:
+
+- Option 1: increase `min_subgraph_size` to skip optimizing these subgraphs. Following the error message, set min_subgraph_size larger than the number of OPs in any subgraph whose inputs have no dynamic size set.
+`min_subgraph_size` means that, when the TensorRT engine is loaded, only subgraphs with more than `min_subgraph_size` OPs are optimized, and those OPs must be contiguous and optimizable by TensorRT.
+
+- Option 2: find the inputs of those subgraphs and set their dynamic sizes in the same way as above.
+
+**Q:** How do I enable logging?
+**A:** The inference library enables logging by default; simply comment out `config.disable_glog_info()` to turn logging on.
+
+**Q:** With TensorRT enabled, inference reports "Slice on batch axis is not supported in TensorRT"
+**A:** Please try dynamic-size inputs.
diff --git a/PaddleDetection-release-2.6/deploy/auto_compression/README.md b/PaddleDetection-release-2.6/deploy/auto_compression/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..26e7b808f976867ef734ed2c6e01cdfa0d730883
--- /dev/null
+++ b/PaddleDetection-release-2.6/deploy/auto_compression/README.md
@@ -0,0 +1,155 @@
+# Automatic Compression
+
+Contents:
+- [1. Introduction](#1-introduction)
+- [2. Benchmark](#2-benchmark)
+- [3. Start automatic compression](#3-automatic-compression-workflow)
+  - [3.1 Prepare the environment](#31-prepare-the-environment)
+  - [3.2 Prepare the dataset](#32-prepare-the-dataset)
+  - [3.3 Prepare the inference model](#33-prepare-the-inference-model)
+  - [3.4 Compress and export the model](#34-compress-and-export-the-model)
+  - [3.5 Evaluate model accuracy](#35-evaluate-model-accuracy)
+- [4. Inference deployment](#4-inference-deployment)
+
+## 1. Introduction
+This example applies automatic compression to PaddleDetection Inference models, using quantization-aware training with distillation as the compression strategy.
+
+
+## 2. Benchmark
+
+### PP-YOLOE+
+
+| Model | Base mAP | Offline Quant mAP | ACT Quant mAP | TRT-FP32 | TRT-FP16 | TRT-INT8 | Config | Quant Model |
+| :-------- |:-------- |:--------: | :---------------------: | :----------------: | :----------------: | :---------------: | :----------------------: | :---------------------: |
+| PP-YOLOE+_s | 43.7 | - | 42.9 | - | - | - | [config](./configs/ppyoloe_plus_s_qat_dis.yaml) | [Quant Model](https://bj.bcebos.com/v1/paddledet/deploy/Inference/ppyoloe_plus_s_qat_dis.tar) |
+| PP-YOLOE+_m | 49.8 | - | 49.3 | - | - | - | [config](./configs/ppyoloe_plus_m_qat_dis.yaml) | [Quant Model](https://bj.bcebos.com/v1/paddledet/deploy/Inference/ppyoloe_plus_m_qat_dis.tar) |
+| PP-YOLOE+_l | 52.9 | - | 52.6 | - | - | - | [config](./configs/ppyoloe_plus_l_qat_dis.yaml) | [Quant Model](https://bj.bcebos.com/v1/paddledet/deploy/Inference/ppyoloe_plus_l_qat_dis.tar) |
+| PP-YOLOE+_x | 54.7 | - | 54.4 | - | - | - | [config](./configs/ppyoloe_plus_x_qat_dis.yaml) | [Quant Model](https://bj.bcebos.com/v1/paddledet/deploy/Inference/ppyoloe_plus_x_qat_dis.tar) |
+
+- All mAP values are evaluated on the COCO val2017 dataset with IoU=0.5:0.95.
+
+### YOLOv8
+
+| Model | Base mAP | Offline Quant mAP | ACT Quant mAP | TRT-FP32 | TRT-FP16 | TRT-INT8 | Config | Quant Model |
+| :-------- |:-------- |:--------: | :---------------------: | :----------------: | :----------------: | :---------------: | :----------------------: | :---------------------: |
+| YOLOv8-s | 44.9 | 43.9 | 44.3 | 9.27ms | 4.65ms | **3.78ms** | [config](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/example/auto_compression/detection/configs/yolov8_s_qat_dis.yaml) | [Model](https://bj.bcebos.com/v1/paddle-slim-models/act/yolov8_s_500e_coco_trt_nms_quant.tar) |
+
+**Note:**
+- The YOLOv8 models in the table include NMS and can be deployed directly in TRT; to align with the standard test protocol, the model without NMS needs to be evaluated.
+- All mAP values are evaluated on the COCO val2017 dataset with IoU=0.5:0.95.
+- The performance figures in the table were measured on a Tesla T4 GPU with TensorRT enabled and batch_size=1.
+
+### PP-YOLOE
+
+| Model | Base mAP | Offline Quant mAP | ACT Quant mAP | TRT-FP32 | TRT-FP16 | TRT-INT8 | Config | Quant Model |
+| :-------- |:-------- |:--------: | :---------------------: | :----------------: | :----------------: | :---------------: | :----------------------: | :---------------------: |
+| PP-YOLOE-l | 50.9 | - | 50.6 | 11.2ms | 7.7ms | **6.7ms** | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/deploy/auto_compression/configs/ppyoloe_l_qat_dis.yaml) | [Quant Model](https://bj.bcebos.com/v1/paddle-slim-models/act/ppyoloe_crn_l_300e_coco_quant.tar) |
+
+- All mAP values are evaluated on the COCO val2017 dataset with IoU=0.5:0.95.
+- The PP-YOLOE-l model was tested on a Tesla V100 GPU with TensorRT enabled, batch_size=1, and NMS included; the test script is the [benchmark demo](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.4/deploy/python).
+
+### PP-PicoDet
+
+| Model | Strategy | mAP | FP32 | FP16 | INT8 | Config | Model |
+| :-------- |:-------- |:--------: | :----------------: | :----------------: | :---------------: | :----------------------: | :---------------------: |
+| PicoDet-S-NPU | Baseline | 30.1 | - | - | - | [config](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/picodet/picodet_s_416_coco_npu.yml) | [Model](https://bj.bcebos.com/v1/paddle-slim-models/act/picodet_s_416_coco_npu.tar) |
+| PicoDet-S-NPU | Quant-aware training | 29.7 | - | - | - | [config](https://github.com/PaddlePaddle/PaddleSlim/tree/develop/demo/full_quantization/detection/configs/picodet_s_qat_dis.yaml) | [Model](https://bj.bcebos.com/v1/paddle-slim-models/act/picodet_s_npu_quant.tar) |
+
+- All mAP values are evaluated on the COCO val2017 dataset with IoU=0.5:0.95.
+
+## 3. Automatic compression workflow
+
+#### 3.1 Prepare the environment
+- PaddlePaddle >= 2.4 (download and install from the [Paddle website](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/pip/linux-pip.html))
+- PaddleSlim >= 2.4.1
+- PaddleDet >= 2.5
+- opencv-python
+
+Install paddlepaddle:
+```shell
+# CPU
+pip install paddlepaddle
+# GPU
+pip install paddlepaddle-gpu
+```
+
+Install paddleslim:
+```shell
+pip install paddleslim
+```
+
+Install paddledet:
+```shell
+pip install paddledet
+```
+
+**Note:** Automatic compression of YOLOv8 models requires the latest [develop Paddle](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/develop/install/pip/linux-pip.html) and [develop PaddleSlim](https://github.com/PaddlePaddle/PaddleSlim#%E5%AE%89%E8%A3%85) builds.
+
+#### 3.2 Prepare the dataset
+
+This example runs automatic compression on the COCO dataset by default. For custom COCO-format data or data in other formats, please follow the [data preparation guide](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/docs/tutorials/data/PrepareDataSet.md).
+
+If the dataset is not in COCO format, modify the Dataset fields of the reader configuration files in [configs](./configs).
+
+Taking the PP-YOLOE model as an example, once the dataset is ready, simply set the `dataset_dir` field of `EvalDataset` in `./configs/yolo_reader.yml` to your own dataset path.
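Pointing `dataset_dir` at a custom path is a one-line YAML edit; for scripted setups it can also be done programmatically. Below is a naive line-based sketch (illustrative only; a YAML library would be more robust for nested or commented configs):

```python
def set_dataset_dir(yml_text, new_dir):
    """Rewrite every `dataset_dir:` line in a reader config, keeping indentation."""
    out = []
    for line in yml_text.splitlines():
        stripped = line.lstrip()
        if stripped.startswith("dataset_dir:"):
            indent = line[: len(line) - len(stripped)]
            out.append(indent + "dataset_dir: " + new_dir)
        else:
            out.append(line)
    return "\n".join(out)

cfg = "EvalDataset:\n  !COCODataSet\n    dataset_dir: dataset/coco/"
print(set_dataset_dir(cfg, "/data/my_coco/"))
```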
+
+#### 3.3 Prepare the inference model
+
+The inference model consists of two files: `model.pdmodel` is the model file and `model.pdiparams` is the weights file.
+
+
+Export an Inference model following the [PaddleDetection documentation](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.6/docs/tutorials/GETTING_STARTED_cn.md#8-%E6%A8%A1%E5%9E%8B%E5%AF%BC%E5%87%BA); the PP-YOLOE export example below can be used as a reference:
+- Download the code
+```shell
+git clone https://github.com/PaddlePaddle/PaddleDetection.git
+```
+- Export the inference model
+
+PP-YOLOE-l model with NMS: for a quick start, you can directly download the exported [PP-YOLOE-l inference model](https://bj.bcebos.com/v1/paddle-slim-models/act/ppyoloe_crn_l_300e_coco.tar)
+```shell
+python tools/export_model.py \
+    -c configs/ppyoloe/ppyoloe_crn_l_300e_coco.yml \
+    -o weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams \
+    trt=True
+```
+
+For the YOLOv8-s model with NMS, see the [YOLOv8 model documentation](https://github.com/PaddlePaddle/PaddleYOLO/tree/release/2.5/configs/yolov8), then run:
+```shell
+python tools/export_model.py \
+    -c configs/yolov8/yolov8_s_500e_coco.yml \
+    -o weights=https://paddledet.bj.bcebos.com/models/yolov8_s_500e_coco.pdparams \
+    trt=True
+```
+
+For a quick start, you can directly download the exported [YOLOv8-s inference model](https://bj.bcebos.com/v1/paddle-slim-models/act/yolov8_s_500e_coco_trt_nms.tar)
+
+#### 3.4 Compress and export the model
+
+The quantization-distillation automatic compression example is launched via the run.py script, which uses the `paddleslim.auto_compression.AutoCompression` interface to compress the model. Configure the model path, distillation, quantization, and training parameters in the config file; once configured, quantize and distill the model with the following commands:
+
+- Single-GPU training:
+```shell
+export CUDA_VISIBLE_DEVICES=0
+python run.py --config_path=./configs/ppyoloe_l_qat_dis.yaml --save_dir='./output/'
+```
+
+- Multi-GPU training:
+```shell
+CUDA_VISIBLE_DEVICES=0,1,2,3 python -m paddle.distributed.launch --log_dir=log --gpus 0,1,2,3 run.py \
+          --config_path=./configs/ppyoloe_l_qat_dis.yaml --save_dir='./output/'
+```
+
+#### 3.5 Evaluate model accuracy
+
+Use the eval.py script to obtain the model's mAP:
+```shell
+export CUDA_VISIBLE_DEVICES=0
+python eval.py --config_path=./configs/ppyoloe_l_qat_dis.yaml
+```
+
+**Note:**
+- The path of the model to evaluate can be changed via the `model_dir` field in the configuration file.
+
+## 4. Inference deployment
+
+- See the [PaddleDetection deployment tutorial](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.4/deploy); on GPU, deploy the quantized model with TensorRT enabled and run_mode set to trt_int8.
diff --git a/PaddleDetection-release-2.6/deploy/auto_compression/configs/picodet_reader.yml b/PaddleDetection-release-2.6/deploy/auto_compression/configs/picodet_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..952a978ae32723e5a98bc63989e473d04e480c7c
--- /dev/null
+++ b/PaddleDetection-release-2.6/deploy/auto_compression/configs/picodet_reader.yml
@@ -0,0 +1,32 @@
+metric: COCO
+num_classes: 80
+
+
+# Dataset configuration
+TrainDataset:
+ !COCODataSet
+ image_dir: train2017
+ anno_path: annotations/instances_train2017.json
+ dataset_dir: dataset/coco/
+
+EvalDataset:
+ !COCODataSet
+ image_dir: val2017
+ anno_path: annotations/instances_val2017.json
+ dataset_dir: dataset/coco/
+
+worker_num: 6
+eval_height: &eval_height 416
+eval_width: &eval_width 416
+eval_size: &eval_size [*eval_height, *eval_width]
+
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {interp: 2, target_size: *eval_size, keep_ratio: False}
+ - NormalizeImage: {mean: [0, 0, 0], std: [1, 1, 1], is_scale: True}
+ - Permute: {}
+ batch_transforms:
+ - PadBatch: {pad_to_stride: 32}
+ batch_size: 8
+ shuffle: false
diff --git a/PaddleDetection-release-2.6/deploy/auto_compression/configs/picodet_s_qat_dis.yaml b/PaddleDetection-release-2.6/deploy/auto_compression/configs/picodet_s_qat_dis.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..a5012be15a1e6791b27a9053417709ed96830bb0
--- /dev/null
+++ b/PaddleDetection-release-2.6/deploy/auto_compression/configs/picodet_s_qat_dis.yaml
@@ -0,0 +1,34 @@
+Global:
+ reader_config: ./configs/picodet_reader.yml
+ include_nms: True
+ Evaluation: True
+ model_dir: ./picodet_s_416_coco_npu/
+ model_filename: model.pdmodel
+ params_filename: model.pdiparams
+
+Distillation:
+ alpha: 1.0
+ loss: l2
+
+QuantAware:
+ use_pact: true
+ activation_quantize_type: 'moving_average_abs_max'
+ weight_bits: 8
+ activation_bits: 8
+ quantize_op_types:
+ - conv2d
+ - depthwise_conv2d
+
+TrainConfig:
+ train_iter: 8000
+ eval_iter: 1000
+ learning_rate:
+ type: CosineAnnealingDecay
+ learning_rate: 0.00001
+ T_max: 8000
+ optimizer_builder:
+ optimizer:
+ type: SGD
+ weight_decay: 4.0e-05
+
+
diff --git a/PaddleDetection-release-2.6/deploy/auto_compression/configs/ppyoloe_l_qat_dis.yaml b/PaddleDetection-release-2.6/deploy/auto_compression/configs/ppyoloe_l_qat_dis.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..df346d2b00ec24b351f2d62974a13e33293f431b
--- /dev/null
+++ b/PaddleDetection-release-2.6/deploy/auto_compression/configs/ppyoloe_l_qat_dis.yaml
@@ -0,0 +1,32 @@
+
+Global:
+ reader_config: configs/ppyoloe_reader.yml
+ include_nms: True
+ Evaluation: True
+ model_dir: ./ppyoloe_crn_l_300e_coco
+ model_filename: model.pdmodel
+ params_filename: model.pdiparams
+
+Distillation:
+ alpha: 1.0
+ loss: soft_label
+
+QuantAware:
+ use_pact: true
+ activation_quantize_type: 'moving_average_abs_max'
+ quantize_op_types:
+ - conv2d
+ - depthwise_conv2d
+
+TrainConfig:
+ train_iter: 5000
+ eval_iter: 1000
+ learning_rate:
+ type: CosineAnnealingDecay
+ learning_rate: 0.00003
+ T_max: 6000
+ optimizer_builder:
+ optimizer:
+ type: SGD
+ weight_decay: 4.0e-05
+
diff --git a/PaddleDetection-release-2.6/deploy/auto_compression/configs/ppyoloe_plus_l_qat_dis.yaml b/PaddleDetection-release-2.6/deploy/auto_compression/configs/ppyoloe_plus_l_qat_dis.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..fd03aed09d9a1ed3a67eec3283ef227224e941fb
--- /dev/null
+++ b/PaddleDetection-release-2.6/deploy/auto_compression/configs/ppyoloe_plus_l_qat_dis.yaml
@@ -0,0 +1,32 @@
+
+Global:
+ reader_config: configs/ppyoloe_plus_reader.yml
+ include_nms: True
+ Evaluation: True
+ model_dir: ./ppyoloe_plus_crn_l_80e_coco
+ model_filename: model.pdmodel
+ params_filename: model.pdiparams
+
+Distillation:
+ alpha: 1.0
+ loss: soft_label
+
+QuantAware:
+ use_pact: true
+ activation_quantize_type: 'moving_average_abs_max'
+ quantize_op_types:
+ - conv2d
+ - depthwise_conv2d
+
+TrainConfig:
+ train_iter: 5000
+ eval_iter: 1000
+ learning_rate:
+ type: CosineAnnealingDecay
+ learning_rate: 0.00003
+ T_max: 6000
+ optimizer_builder:
+ optimizer:
+ type: SGD
+ weight_decay: 4.0e-05
+
diff --git a/PaddleDetection-release-2.6/deploy/auto_compression/configs/ppyoloe_plus_m_qat_dis.yaml b/PaddleDetection-release-2.6/deploy/auto_compression/configs/ppyoloe_plus_m_qat_dis.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..4d31332f5e3745604e03c50ad2f9db62376c1373
--- /dev/null
+++ b/PaddleDetection-release-2.6/deploy/auto_compression/configs/ppyoloe_plus_m_qat_dis.yaml
@@ -0,0 +1,32 @@
+
+Global:
+ reader_config: configs/ppyoloe_plus_reader.yml
+ include_nms: True
+ Evaluation: True
+ model_dir: ./ppyoloe_plus_crn_m_80e_coco
+ model_filename: model.pdmodel
+ params_filename: model.pdiparams
+
+Distillation:
+ alpha: 1.0
+ loss: soft_label
+
+QuantAware:
+ use_pact: true
+ activation_quantize_type: 'moving_average_abs_max'
+ quantize_op_types:
+ - conv2d
+ - depthwise_conv2d
+
+TrainConfig:
+ train_iter: 5000
+ eval_iter: 1000
+ learning_rate:
+ type: CosineAnnealingDecay
+ learning_rate: 0.00003
+ T_max: 6000
+ optimizer_builder:
+ optimizer:
+ type: SGD
+ weight_decay: 4.0e-05
+
diff --git a/PaddleDetection-release-2.6/deploy/auto_compression/configs/ppyoloe_plus_reader.yml b/PaddleDetection-release-2.6/deploy/auto_compression/configs/ppyoloe_plus_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..5f3795f29be025e6836a7c88b51dd79ecb04a9f4
--- /dev/null
+++ b/PaddleDetection-release-2.6/deploy/auto_compression/configs/ppyoloe_plus_reader.yml
@@ -0,0 +1,26 @@
+metric: COCO
+num_classes: 80
+
+# Dataset configuration
+TrainDataset:
+ !COCODataSet
+ image_dir: train2017
+ anno_path: annotations/instances_train2017.json
+ dataset_dir: dataset/coco/
+
+EvalDataset:
+ !COCODataSet
+ image_dir: val2017
+ anno_path: annotations/instances_val2017.json
+ dataset_dir: dataset/coco/
+
+worker_num: 0
+
+# preprocess reader in test
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ batch_size: 4
diff --git a/PaddleDetection-release-2.6/deploy/auto_compression/configs/ppyoloe_plus_s_qat_dis.yaml b/PaddleDetection-release-2.6/deploy/auto_compression/configs/ppyoloe_plus_s_qat_dis.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..41bfde1e47855cdd1c543d13292d387781b8c0d6
--- /dev/null
+++ b/PaddleDetection-release-2.6/deploy/auto_compression/configs/ppyoloe_plus_s_qat_dis.yaml
@@ -0,0 +1,32 @@
+
+Global:
+ reader_config: configs/ppyoloe_plus_reader.yml
+ include_nms: True
+ Evaluation: True
+ model_dir: ./ppyoloe_plus_crn_s_80e_coco
+ model_filename: model.pdmodel
+ params_filename: model.pdiparams
+
+Distillation:
+ alpha: 1.0
+ loss: soft_label
+
+QuantAware:
+ use_pact: true
+ activation_quantize_type: 'moving_average_abs_max'
+ quantize_op_types:
+ - conv2d
+ - depthwise_conv2d
+
+TrainConfig:
+ train_iter: 5000
+ eval_iter: 1000
+ learning_rate:
+ type: CosineAnnealingDecay
+ learning_rate: 0.00003
+ T_max: 6000
+ optimizer_builder:
+ optimizer:
+ type: SGD
+ weight_decay: 4.0e-05
+
diff --git a/PaddleDetection-release-2.6/deploy/auto_compression/configs/ppyoloe_plus_x_qat_dis.yaml b/PaddleDetection-release-2.6/deploy/auto_compression/configs/ppyoloe_plus_x_qat_dis.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..ac62e7ca2d22bae19ffcf99f8265a05ea7e1331c
--- /dev/null
+++ b/PaddleDetection-release-2.6/deploy/auto_compression/configs/ppyoloe_plus_x_qat_dis.yaml
@@ -0,0 +1,32 @@
+
+Global:
+ reader_config: configs/ppyoloe_plus_reader.yml
+ include_nms: True
+ Evaluation: True
+ model_dir: ./ppyoloe_plus_crn_x_80e_coco
+ model_filename: model.pdmodel
+ params_filename: model.pdiparams
+
+Distillation:
+ alpha: 1.0
+ loss: soft_label
+
+QuantAware:
+ use_pact: true
+ activation_quantize_type: 'moving_average_abs_max'
+ quantize_op_types:
+ - conv2d
+ - depthwise_conv2d
+
+TrainConfig:
+ train_iter: 5000
+ eval_iter: 1000
+ learning_rate:
+ type: CosineAnnealingDecay
+ learning_rate: 0.00003
+ T_max: 6000
+ optimizer_builder:
+ optimizer:
+ type: SGD
+ weight_decay: 4.0e-05
+
diff --git a/PaddleDetection-release-2.6/deploy/auto_compression/configs/ppyoloe_reader.yml b/PaddleDetection-release-2.6/deploy/auto_compression/configs/ppyoloe_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..d1061453051e8f7408f4e605078956a8b634f13c
--- /dev/null
+++ b/PaddleDetection-release-2.6/deploy/auto_compression/configs/ppyoloe_reader.yml
@@ -0,0 +1,26 @@
+metric: COCO
+num_classes: 80
+
+# Dataset configuration
+TrainDataset:
+ !COCODataSet
+ image_dir: train2017
+ anno_path: annotations/instances_train2017.json
+ dataset_dir: dataset/coco/
+
+EvalDataset:
+ !COCODataSet
+ image_dir: val2017
+ anno_path: annotations/instances_val2017.json
+ dataset_dir: dataset/coco/
+
+worker_num: 0
+
+# preprocess reader in test
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
+ - NormalizeImage: {mean: [0.485, 0.456, 0.406], std: [0.229, 0.224, 0.225], is_scale: True}
+ - Permute: {}
+ batch_size: 4
diff --git a/PaddleDetection-release-2.6/deploy/auto_compression/configs/yolov5_reader.yml b/PaddleDetection-release-2.6/deploy/auto_compression/configs/yolov5_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..6ad321a04d12f822e98facd179d9d72b0d8aa741
--- /dev/null
+++ b/PaddleDetection-release-2.6/deploy/auto_compression/configs/yolov5_reader.yml
@@ -0,0 +1,26 @@
+metric: COCO
+num_classes: 80
+
+# Dataset configuration
+TrainDataset:
+ !COCODataSet
+ image_dir: train2017
+ anno_path: annotations/instances_train2017.json
+ dataset_dir: dataset/coco/
+
+EvalDataset:
+ !COCODataSet
+ image_dir: val2017
+ anno_path: annotations/instances_val2017.json
+ dataset_dir: dataset/coco/
+
+worker_num: 0
+
+# preprocess reader in test
+TestReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [640, 640], keep_ratio: True, interp: 1}
+ - Pad: {size: [640, 640], fill_value: [114., 114., 114.]}
+ - Permute: {}
+ batch_size: 1
diff --git a/PaddleDetection-release-2.6/deploy/auto_compression/configs/yolov5_s_qat_dis.yml b/PaddleDetection-release-2.6/deploy/auto_compression/configs/yolov5_s_qat_dis.yml
new file mode 100644
index 0000000000000000000000000000000000000000..309977ef696ab23cc859fa224486e2ed7e91900e
--- /dev/null
+++ b/PaddleDetection-release-2.6/deploy/auto_compression/configs/yolov5_s_qat_dis.yml
@@ -0,0 +1,29 @@
+
+Global:
+ reader_config: configs/yolov5_reader.yml
+ include_nms: True
+ Evaluation: True
+ model_dir: ./yolov5_s_300e_coco
+ model_filename: model.pdmodel
+ params_filename: model.pdiparams
+
+Distillation:
+ alpha: 1.0
+ loss: soft_label
+
+QuantAware:
+ use_pact: true
+ activation_quantize_type: 'moving_average_abs_max'
+ quantize_op_types:
+ - conv2d
+ - depthwise_conv2d
+
+TrainConfig:
+ train_iter: 3000
+ eval_iter: 1000
+ learning_rate: 0.00001
+ optimizer_builder:
+ optimizer:
+ type: SGD
+ weight_decay: 4.0e-05
+ target_metric: 0.365
diff --git a/PaddleDetection-release-2.6/deploy/auto_compression/configs/yolov6mt_s_qat_dis.yaml b/PaddleDetection-release-2.6/deploy/auto_compression/configs/yolov6mt_s_qat_dis.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..e134494fe2833333f3b2bcf87edb71e0b870a56f
--- /dev/null
+++ b/PaddleDetection-release-2.6/deploy/auto_compression/configs/yolov6mt_s_qat_dis.yaml
@@ -0,0 +1,30 @@
+
+Global:
+ reader_config: configs/yolov5_reader.yml
+ include_nms: True
+ Evaluation: True
+ model_dir: ./yolov6mt_s_400e_coco
+ model_filename: model.pdmodel
+ params_filename: model.pdiparams
+
+Distillation:
+ alpha: 1.0
+ loss: soft_label
+
+QuantAware:
+ activation_quantize_type: 'moving_average_abs_max'
+ quantize_op_types:
+ - conv2d
+ - depthwise_conv2d
+
+TrainConfig:
+ train_iter: 8000
+ eval_iter: 1000
+ learning_rate:
+ type: CosineAnnealingDecay
+ learning_rate: 0.00003
+ T_max: 8000
+ optimizer_builder:
+ optimizer:
+ type: SGD
+ weight_decay: 0.00004
diff --git a/PaddleDetection-release-2.6/deploy/auto_compression/configs/yolov7_l_qat_dis.yaml b/PaddleDetection-release-2.6/deploy/auto_compression/configs/yolov7_l_qat_dis.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..801ccb4057c4f36fe379c281a21965ddc63a2e8b
--- /dev/null
+++ b/PaddleDetection-release-2.6/deploy/auto_compression/configs/yolov7_l_qat_dis.yaml
@@ -0,0 +1,30 @@
+
+Global:
+ reader_config: configs/yolov5_reader.yml
+ include_nms: True
+ Evaluation: True
+ model_dir: ./yolov7_l_300e_coco
+ model_filename: model.pdmodel
+ params_filename: model.pdiparams
+
+Distillation:
+ alpha: 1.0
+ loss: soft_label
+
+QuantAware:
+ activation_quantize_type: 'moving_average_abs_max'
+ quantize_op_types:
+ - conv2d
+ - depthwise_conv2d
+
+TrainConfig:
+ train_iter: 8000
+ eval_iter: 1000
+ learning_rate:
+ type: CosineAnnealingDecay
+ learning_rate: 0.00003
+ T_max: 8000
+ optimizer_builder:
+ optimizer:
+ type: SGD
+ weight_decay: 0.00004
\ No newline at end of file
diff --git a/PaddleDetection-release-2.6/deploy/auto_compression/configs/yolov8_reader.yml b/PaddleDetection-release-2.6/deploy/auto_compression/configs/yolov8_reader.yml
new file mode 100644
index 0000000000000000000000000000000000000000..202a49415572201811ed53fe806c2b31c9051fde
--- /dev/null
+++ b/PaddleDetection-release-2.6/deploy/auto_compression/configs/yolov8_reader.yml
@@ -0,0 +1,27 @@
+metric: COCO
+num_classes: 80
+
+# Dataset configuration
+TrainDataset:
+ !COCODataSet
+ image_dir: train2017
+ anno_path: annotations/instances_train2017.json
+ dataset_dir: dataset/coco/
+
+EvalDataset:
+ !COCODataSet
+ image_dir: val2017
+ anno_path: annotations/instances_val2017.json
+ dataset_dir: dataset/coco/
+
+worker_num: 0
+
+# preprocess reader in test
+EvalReader:
+ sample_transforms:
+ - Decode: {}
+ - Resize: {target_size: [640, 640], keep_ratio: True, interp: 1}
+ - Pad: {size: [640, 640], fill_value: [114., 114., 114.]}
+ - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+ - Permute: {}
+ batch_size: 4
diff --git a/PaddleDetection-release-2.6/deploy/auto_compression/configs/yolov8_s_qat_dis.yaml b/PaddleDetection-release-2.6/deploy/auto_compression/configs/yolov8_s_qat_dis.yaml
new file mode 100644
index 0000000000000000000000000000000000000000..8c93e203e918d798e055e260d73f747a6ef9d5cb
--- /dev/null
+++ b/PaddleDetection-release-2.6/deploy/auto_compression/configs/yolov8_s_qat_dis.yaml
@@ -0,0 +1,32 @@
+
+Global:
+ reader_config: configs/yolov8_reader.yml
+ include_nms: True
+ Evaluation: True
+ model_dir: ./yolov8_s_500e_coco_trt_nms/
+ model_filename: model.pdmodel
+ params_filename: model.pdiparams
+
+Distillation:
+ alpha: 1.0
+ loss: soft_label
+
+QuantAware:
+ onnx_format: true
+ activation_quantize_type: 'moving_average_abs_max'
+ quantize_op_types:
+ - conv2d
+ - depthwise_conv2d
+
+TrainConfig:
+ train_iter: 8000
+ eval_iter: 1000
+ learning_rate:
+ type: CosineAnnealingDecay
+ learning_rate: 0.00003
+ T_max: 10000
+ optimizer_builder:
+ optimizer:
+ type: SGD
+ weight_decay: 4.0e-05
+
diff --git a/PaddleDetection-release-2.6/deploy/auto_compression/eval.py b/PaddleDetection-release-2.6/deploy/auto_compression/eval.py
new file mode 100644
index 0000000000000000000000000000000000000000..6de8aff85ce5f3cffa4119a1a3c26e318101db74
--- /dev/null
+++ b/PaddleDetection-release-2.6/deploy/auto_compression/eval.py
@@ -0,0 +1,163 @@
+# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import os
+import sys
+import numpy as np
+import argparse
+import paddle
+from ppdet.core.workspace import load_config, merge_config
+from ppdet.core.workspace import create
+from ppdet.metrics import COCOMetric, VOCMetric, KeyPointTopDownCOCOEval
+from paddleslim.auto_compression.config_helpers import load_config as load_slim_config
+from post_process import PPYOLOEPostProcess
+
+
+def argsparser():
+ parser = argparse.ArgumentParser(description=__doc__)
+ parser.add_argument(
+ '--config_path',
+ type=str,
+ default=None,
+ help="path of compression strategy config.",
+ required=True)
+ parser.add_argument(
+ '--devices',
+ type=str,
+ default='gpu',
+ help="which device used to compress.")
+
+ return parser
+
+
+def reader_wrapper(reader, input_list):
+ def gen():
+ for data in reader:
+ in_dict = {}
+ if isinstance(input_list, list):
+ for input_name in input_list:
+ in_dict[input_name] = data[input_name]
+ elif isinstance(input_list, dict):
+ for input_name in input_list.keys():
+ in_dict[input_list[input_name]] = data[input_name]
+ yield in_dict
+
+ return gen
+
+
+def convert_numpy_data(data, metric):
+ data_all = {}
+ data_all = {k: np.array(v) for k, v in data.items()}
+ if isinstance(metric, VOCMetric):
+ for k, v in data_all.items():
+ if not isinstance(v[0], np.ndarray):
+ tmp_list = []
+ for t in v:
+ tmp_list.append(np.array(t))
+ data_all[k] = np.array(tmp_list)
+ else:
+ data_all = {k: np.array(v) for k, v in data.items()}
+ return data_all
+
+
+def eval():
+
+ place = paddle.CUDAPlace(0) if FLAGS.devices == 'gpu' else paddle.CPUPlace()
+ exe = paddle.static.Executor(place)
+
+ val_program, feed_target_names, fetch_targets = paddle.static.load_inference_model(
+ global_config["model_dir"].rstrip('/'),
+ exe,
+ model_filename=global_config["model_filename"],
+ params_filename=global_config["params_filename"])
+ print('Loaded model from: {}'.format(global_config["model_dir"]))
+
+ metric = global_config['metric']
+ for batch_id, data in enumerate(val_loader):
+ data_all = convert_numpy_data(data, metric)
+ data_input = {}
+ for k, v in data.items():
+ if isinstance(global_config['input_list'], list):
+ if k in global_config['input_list']:
+ data_input[k] = np.array(v)
+ elif isinstance(global_config['input_list'], dict):
+ if k in global_config['input_list'].keys():
+ data_input[global_config['input_list'][k]] = np.array(v)
+
+ outs = exe.run(val_program,
+ feed=data_input,
+ fetch_list=fetch_targets,
+ return_numpy=False)
+ res = {}
+ if 'arch' in global_config and global_config['arch'] == 'PPYOLOE':
+ postprocess = PPYOLOEPostProcess(
+ score_threshold=0.01, nms_threshold=0.6)
+ res = postprocess(np.array(outs[0]), data_all['scale_factor'])
+ else:
+ for out in outs:
+ v = np.array(out)
+ if len(v.shape) > 1:
+ res['bbox'] = v
+ else:
+ res['bbox_num'] = v
+ metric.update(data_all, res)
+ if batch_id % 100 == 0:
+ print('Eval iter:', batch_id)
+ metric.accumulate()
+ metric.log()
+ metric.reset()
+
+
+def main():
+ global global_config
+ all_config = load_slim_config(FLAGS.config_path)
+ assert "Global" in all_config, "Key 'Global' not found in config file."
+ global_config = all_config["Global"]
+ reader_cfg = load_config(global_config['reader_config'])
+
+    dataset = reader_cfg['EvalDataset']
+    global val_loader
+    val_loader = create('EvalReader')(dataset,
+                                      reader_cfg['worker_num'],
+                                      return_list=True)
+ metric = None
+ if reader_cfg['metric'] == 'COCO':
+ clsid2catid = {v: k for k, v in dataset.catid2clsid.items()}
+ anno_file = dataset.get_anno()
+ metric = COCOMetric(
+ anno_file=anno_file, clsid2catid=clsid2catid, IouType='bbox')
+ elif reader_cfg['metric'] == 'VOC':
+ metric = VOCMetric(
+ label_list=dataset.get_label_list(),
+ class_num=reader_cfg['num_classes'],
+ map_type=reader_cfg['map_type'])
+ elif reader_cfg['metric'] == 'KeyPointTopDownCOCOEval':
+ anno_file = dataset.get_anno()
+ metric = KeyPointTopDownCOCOEval(anno_file,
+ len(dataset), 17, 'output_eval')
+ else:
+        raise ValueError(
+            "metric currently only supports COCO, VOC and KeyPointTopDownCOCOEval.")
+ global_config['metric'] = metric
+
+ eval()
+
+
+if __name__ == '__main__':
+ paddle.enable_static()
+ parser = argsparser()
+ FLAGS = parser.parse_args()
+ assert FLAGS.devices in ['cpu', 'gpu', 'xpu', 'npu']
+ paddle.set_device(FLAGS.devices)
+
+ main()
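`reader_wrapper` is defined identically in `eval.py` and `run.py`: with a list it selects dataloader fields by name, and with a dict it renames dataloader keys to the model's feed names. A standalone sketch of that behavior, using fabricated batch data in place of a real ppdet dataloader:

```python
def reader_wrapper(reader, input_list):
    # Mirrors the helper above: select (list form) or rename (dict form)
    # batch fields into the feed dict the static program expects.
    def gen():
        for data in reader:
            in_dict = {}
            if isinstance(input_list, list):
                for name in input_list:
                    in_dict[name] = data[name]
            elif isinstance(input_list, dict):
                for name in input_list:
                    in_dict[input_list[name]] = data[name]
            yield in_dict
    return gen

# Hypothetical mini-batch standing in for a ppdet dataloader.
batches = [{'image': 'img0', 'scale_factor': 'sf0'}]

# list form: keep only the named fields as-is
assert list(reader_wrapper(batches, ['image'])())[0] == {'image': 'img0'}
# dict form: rename the dataloader key 'image' to the feed name 'x'
assert list(reader_wrapper(batches, {'image': 'x'})())[0] == {'x': 'img0'}
```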
diff --git a/PaddleDetection-release-2.6/deploy/auto_compression/post_process.py b/PaddleDetection-release-2.6/deploy/auto_compression/post_process.py
new file mode 100644
index 0000000000000000000000000000000000000000..eea2f019548ec288a23e37b3bd2faf24f9a98935
--- /dev/null
+++ b/PaddleDetection-release-2.6/deploy/auto_compression/post_process.py
@@ -0,0 +1,157 @@
+# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import numpy as np
+import cv2
+
+
+def hard_nms(box_scores, iou_threshold, top_k=-1, candidate_size=200):
+ """
+ Args:
+ box_scores (N, 5): boxes in corner-form and probabilities.
+ iou_threshold: intersection over union threshold.
+ top_k: keep top_k results. If k <= 0, keep all the results.
+ candidate_size: only consider the candidates with the highest scores.
+    Returns:
+        box_scores (K, 5): the kept boxes with their scores.
+ """
+ scores = box_scores[:, -1]
+ boxes = box_scores[:, :-1]
+ picked = []
+ indexes = np.argsort(scores)
+ indexes = indexes[-candidate_size:]
+ while len(indexes) > 0:
+ current = indexes[-1]
+ picked.append(current)
+ if 0 < top_k == len(picked) or len(indexes) == 1:
+ break
+ current_box = boxes[current, :]
+ indexes = indexes[:-1]
+ rest_boxes = boxes[indexes, :]
+ iou = iou_of(
+ rest_boxes,
+ np.expand_dims(
+ current_box, axis=0), )
+ indexes = indexes[iou <= iou_threshold]
+
+ return box_scores[picked, :]
+
+
+def iou_of(boxes0, boxes1, eps=1e-5):
+ """Return intersection-over-union (Jaccard index) of boxes.
+ Args:
+ boxes0 (N, 4): ground truth boxes.
+ boxes1 (N or 1, 4): predicted boxes.
+ eps: a small number to avoid 0 as denominator.
+ Returns:
+ iou (N): IoU values.
+ """
+ overlap_left_top = np.maximum(boxes0[..., :2], boxes1[..., :2])
+ overlap_right_bottom = np.minimum(boxes0[..., 2:], boxes1[..., 2:])
+
+ overlap_area = area_of(overlap_left_top, overlap_right_bottom)
+ area0 = area_of(boxes0[..., :2], boxes0[..., 2:])
+ area1 = area_of(boxes1[..., :2], boxes1[..., 2:])
+ return overlap_area / (area0 + area1 - overlap_area + eps)
+
+
+def area_of(left_top, right_bottom):
+ """Compute the areas of rectangles given two corners.
+ Args:
+ left_top (N, 2): left top corner.
+ right_bottom (N, 2): right bottom corner.
+ Returns:
+ area (N): return the area.
+ """
+ hw = np.clip(right_bottom - left_top, 0.0, None)
+ return hw[..., 0] * hw[..., 1]
+
+
+class PPYOLOEPostProcess(object):
+ """
+ Args:
+ input_shape (int): network input image size
+ scale_factor (float): scale factor of ori image
+ """
+
+ def __init__(self,
+ score_threshold=0.4,
+ nms_threshold=0.5,
+ nms_top_k=10000,
+ keep_top_k=300):
+ self.score_threshold = score_threshold
+ self.nms_threshold = nms_threshold
+ self.nms_top_k = nms_top_k
+ self.keep_top_k = keep_top_k
+
+ def _non_max_suppression(self, prediction, scale_factor):
+ batch_size = prediction.shape[0]
+ out_boxes_list = []
+ box_num_list = []
+ for batch_id in range(batch_size):
+ bboxes, confidences = prediction[batch_id][..., :4], prediction[
+ batch_id][..., 4:]
+ # nms
+ picked_box_probs = []
+ picked_labels = []
+ for class_index in range(0, confidences.shape[1]):
+ probs = confidences[:, class_index]
+ mask = probs > self.score_threshold
+ probs = probs[mask]
+ if probs.shape[0] == 0:
+ continue
+ subset_boxes = bboxes[mask, :]
+ box_probs = np.concatenate(
+ [subset_boxes, probs.reshape(-1, 1)], axis=1)
+ box_probs = hard_nms(
+ box_probs,
+ iou_threshold=self.nms_threshold,
+ top_k=self.nms_top_k)
+ picked_box_probs.append(box_probs)
+ picked_labels.extend([class_index] * box_probs.shape[0])
+
+            if len(picked_box_probs) == 0:
+                # keep batch alignment: 6 columns (class, score, x1, y1, x2, y2)
+                out_boxes_list.append(np.empty((0, 6)))
+                box_num_list.append(0)
+ else:
+ picked_box_probs = np.concatenate(picked_box_probs)
+ # resize output boxes
+ picked_box_probs[:, 0] /= scale_factor[batch_id][1]
+ picked_box_probs[:, 2] /= scale_factor[batch_id][1]
+ picked_box_probs[:, 1] /= scale_factor[batch_id][0]
+ picked_box_probs[:, 3] /= scale_factor[batch_id][0]
+
+ # clas score box
+ out_box = np.concatenate(
+ [
+ np.expand_dims(
+ np.array(picked_labels), axis=-1), np.expand_dims(
+ picked_box_probs[:, 4], axis=-1),
+ picked_box_probs[:, :4]
+ ],
+ axis=1)
+ if out_box.shape[0] > self.keep_top_k:
+ out_box = out_box[out_box[:, 1].argsort()[::-1]
+ [:self.keep_top_k]]
+ out_boxes_list.append(out_box)
+ box_num_list.append(out_box.shape[0])
+
+ out_boxes_list = np.concatenate(out_boxes_list, axis=0)
+ box_num_list = np.array(box_num_list)
+ return out_boxes_list, box_num_list
+
+ def __call__(self, outs, scale_factor):
+ out_boxes_list, box_num_list = self._non_max_suppression(outs,
+ scale_factor)
+ return {'bbox': out_boxes_list, 'bbox_num': box_num_list}
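The corner-form IoU helpers above can be sanity-checked numerically. This sketch copies `area_of`/`iou_of` verbatim so it runs standalone; the boxes are hypothetical:

```python
import numpy as np


def area_of(left_top, right_bottom):
    # Clamp negative extents to zero so disjoint boxes get zero overlap area.
    hw = np.clip(right_bottom - left_top, 0.0, None)
    return hw[..., 0] * hw[..., 1]


def iou_of(boxes0, boxes1, eps=1e-5):
    # Intersection corners, then IoU = inter / (union + eps).
    overlap_left_top = np.maximum(boxes0[..., :2], boxes1[..., :2])
    overlap_right_bottom = np.minimum(boxes0[..., 2:], boxes1[..., 2:])
    overlap_area = area_of(overlap_left_top, overlap_right_bottom)
    area0 = area_of(boxes0[..., :2], boxes0[..., 2:])
    area1 = area_of(boxes1[..., :2], boxes1[..., 2:])
    return overlap_area / (area0 + area1 - overlap_area + eps)


a = np.array([[0.0, 0.0, 10.0, 10.0]])
b = np.array([[5.0, 0.0, 15.0, 10.0]])  # shifted right by half a box width
iou = iou_of(a, b)  # overlap 50, union 150 -> approximately 1/3
assert abs(float(iou[0]) - 1.0 / 3.0) < 1e-3
```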
diff --git a/PaddleDetection-release-2.6/deploy/auto_compression/run.py b/PaddleDetection-release-2.6/deploy/auto_compression/run.py
new file mode 100644
index 0000000000000000000000000000000000000000..d940307db618c80f015b32637e7610784d1affb9
--- /dev/null
+++ b/PaddleDetection-release-2.6/deploy/auto_compression/run.py
@@ -0,0 +1,191 @@
+# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import os
+import sys
+import numpy as np
+import argparse
+import paddle
+from ppdet.core.workspace import load_config, merge_config
+from ppdet.core.workspace import create
+from ppdet.metrics import COCOMetric, VOCMetric, KeyPointTopDownCOCOEval
+from paddleslim.auto_compression.config_helpers import load_config as load_slim_config
+from paddleslim.auto_compression import AutoCompression
+from post_process import PPYOLOEPostProcess
+from paddleslim.common.dataloader import get_feed_vars
+
+
+def argsparser():
+ parser = argparse.ArgumentParser(description=__doc__)
+ parser.add_argument(
+ '--config_path',
+ type=str,
+ default=None,
+ help="path of compression strategy config.",
+ required=True)
+ parser.add_argument(
+ '--save_dir',
+ type=str,
+ default='output',
+ help="directory to save compressed model.")
+ parser.add_argument(
+ '--devices',
+ type=str,
+ default='gpu',
+ help="which device used to compress.")
+
+ return parser
+
+
+def reader_wrapper(reader, input_list):
+ def gen():
+ for data in reader:
+ in_dict = {}
+ if isinstance(input_list, list):
+ for input_name in input_list:
+ in_dict[input_name] = data[input_name]
+ elif isinstance(input_list, dict):
+ for input_name in input_list.keys():
+ in_dict[input_list[input_name]] = data[input_name]
+ yield in_dict
+
+ return gen
+
+
+def convert_numpy_data(data, metric):
+    data_all = {k: np.array(v) for k, v in data.items()}
+    if isinstance(metric, VOCMetric):
+        # VOC fields may arrive as nested Python lists; convert element-wise.
+        for k, v in data_all.items():
+            if not isinstance(v[0], np.ndarray):
+                data_all[k] = np.array([np.array(t) for t in v])
+    return data_all
+
+
+def eval_function(exe, compiled_test_program, test_feed_names, test_fetch_list):
+ metric = global_config['metric']
+ for batch_id, data in enumerate(val_loader):
+ data_all = convert_numpy_data(data, metric)
+ data_input = {}
+ for k, v in data.items():
+ if isinstance(global_config['input_list'], list):
+ if k in test_feed_names:
+ data_input[k] = np.array(v)
+ elif isinstance(global_config['input_list'], dict):
+ if k in global_config['input_list'].keys():
+ data_input[global_config['input_list'][k]] = np.array(v)
+ outs = exe.run(compiled_test_program,
+ feed=data_input,
+ fetch_list=test_fetch_list,
+ return_numpy=False)
+ res = {}
+        if 'include_nms' in global_config and not global_config['include_nms']:
+            if 'arch' in global_config and global_config['arch'] == 'PPYOLOE':
+                postprocess = PPYOLOEPostProcess(
+                    score_threshold=0.01, nms_threshold=0.6)
+            else:
+                # A bare assert on a non-empty string never fails; raise instead.
+                raise ValueError("Not support arch={} now.".format(
+                    global_config.get('arch')))
+            res = postprocess(np.array(outs[0]), data_all['scale_factor'])
+ else:
+ for out in outs:
+ v = np.array(out)
+ if len(v.shape) > 1:
+ res['bbox'] = v
+ else:
+ res['bbox_num'] = v
+
+ metric.update(data_all, res)
+ if batch_id % 100 == 0:
+ print('Eval iter:', batch_id)
+ metric.accumulate()
+ metric.log()
+ map_res = metric.get_results()
+ metric.reset()
+ map_key = 'keypoint' if 'arch' in global_config and global_config[
+ 'arch'] == 'keypoint' else 'bbox'
+ return map_res[map_key][0]
+
+
+def main():
+ global global_config
+ all_config = load_slim_config(FLAGS.config_path)
+ assert "Global" in all_config, "Key 'Global' not found in config file."
+ global_config = all_config["Global"]
+ reader_cfg = load_config(global_config['reader_config'])
+
+ train_loader = create('EvalReader')(reader_cfg['TrainDataset'],
+ reader_cfg['worker_num'],
+ return_list=True)
+ if global_config.get('input_list') is None:
+ global_config['input_list'] = get_feed_vars(
+ global_config['model_dir'], global_config['model_filename'],
+ global_config['params_filename'])
+ train_loader = reader_wrapper(train_loader, global_config['input_list'])
+
+ if 'Evaluation' in global_config.keys() and global_config[
+ 'Evaluation'] and paddle.distributed.get_rank() == 0:
+ eval_func = eval_function
+ dataset = reader_cfg['EvalDataset']
+ global val_loader
+ _eval_batch_sampler = paddle.io.BatchSampler(
+ dataset, batch_size=reader_cfg['EvalReader']['batch_size'])
+ val_loader = create('EvalReader')(dataset,
+ reader_cfg['worker_num'],
+ batch_sampler=_eval_batch_sampler,
+ return_list=True)
+ metric = None
+ if reader_cfg['metric'] == 'COCO':
+ clsid2catid = {v: k for k, v in dataset.catid2clsid.items()}
+ anno_file = dataset.get_anno()
+ metric = COCOMetric(
+ anno_file=anno_file, clsid2catid=clsid2catid, IouType='bbox')
+ elif reader_cfg['metric'] == 'VOC':
+ metric = VOCMetric(
+ label_list=dataset.get_label_list(),
+ class_num=reader_cfg['num_classes'],
+ map_type=reader_cfg['map_type'])
+ elif reader_cfg['metric'] == 'KeyPointTopDownCOCOEval':
+ anno_file = dataset.get_anno()
+ metric = KeyPointTopDownCOCOEval(anno_file,
+ len(dataset), 17, 'output_eval')
+ else:
+            raise ValueError(
+                "metric currently only supports COCO, VOC and KeyPointTopDownCOCOEval.")
+ global_config['metric'] = metric
+ else:
+ eval_func = None
+
+ ac = AutoCompression(
+ model_dir=global_config["model_dir"],
+ model_filename=global_config["model_filename"],
+ params_filename=global_config["params_filename"],
+ save_dir=FLAGS.save_dir,
+ config=all_config,
+ train_dataloader=train_loader,
+ eval_callback=eval_func)
+ ac.compress()
+
+
+if __name__ == '__main__':
+ paddle.enable_static()
+ parser = argsparser()
+ FLAGS = parser.parse_args()
+ assert FLAGS.devices in ['cpu', 'gpu', 'xpu', 'npu']
+ paddle.set_device(FLAGS.devices)
+
+ main()
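Both `run.py` and `eval.py` read a `Global` section from the YAML passed via `--config_path`. A hypothetical minimal example follows; all paths and values are placeholders, shown only to name the keys the scripts actually consume:

```yaml
Global:
  model_dir: ./inference_model          # exported inference model dir (placeholder)
  model_filename: model.pdmodel
  params_filename: model.pdiparams
  reader_config: configs/example_reader.yml  # ppdet reader config (placeholder)
  arch: PPYOLOE                         # checked when include_nms is False
  include_nms: False                    # False -> PPYOLOEPostProcess runs NMS
  Evaluation: True                      # run eval_function during compression
  # input_list may be omitted; run.py then derives it via get_feed_vars
```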
diff --git a/PaddleDetection-release-2.6/deploy/benchmark/benchmark.sh b/PaddleDetection-release-2.6/deploy/benchmark/benchmark.sh
new file mode 100644
index 0000000000000000000000000000000000000000..e29aaa884d30316237aede0c18b38e2cc520ee4b
--- /dev/null
+++ b/PaddleDetection-release-2.6/deploy/benchmark/benchmark.sh
@@ -0,0 +1,36 @@
+#!/bin/bash
+# All rights `PaddleDetection` reserved
+model_dir=$1
+model_name=$2
+
+export img_dir="demo"
+export log_path="output_pipeline"
+
+
+echo "model_dir : ${model_dir}"
+echo "img_dir: ${img_dir}"
+
+# TODO: support batch size>1
+for use_mkldnn in "True" "False"; do
+ for threads in "1" "6"; do
+ echo "${model_name} ${model_dir}, use_mkldnn: ${use_mkldnn} threads: ${threads}"
+ python deploy/python/infer.py \
+ --model_dir=${model_dir} \
+              --run_benchmark=True \
+ --enable_mkldnn=${use_mkldnn} \
+ --device=CPU \
+ --cpu_threads=${threads} \
+ --image_dir=${img_dir} 2>&1 | tee ${log_path}/${model_name}_cpu_usemkldnn_${use_mkldnn}_cputhreads_${threads}_bs1_infer.log
+ done
+done
+
+for run_mode in "fluid" "trt_fp32" "trt_fp16"; do
+ echo "${model_name} ${model_dir}, run_mode: ${run_mode}"
+ python deploy/python/infer.py \
+ --model_dir=${model_dir} \
+ --run_benchmark=True \
+ --device=GPU \
+ --run_mode=${run_mode} \
+ --image_dir=${img_dir} 2>&1 | tee ${log_path}/${model_name}_gpu_runmode_${run_mode}_bs1_infer.log
+done
+
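The `tee` targets above encode the run configuration into the log file name, which `log_parser_excel.py` later scans. A sketch of the naming convention, with a placeholder model name:

```shell
#!/bin/bash
# Sketch of the log-file naming convention used by benchmark.sh;
# model_name is a placeholder, not a real exported model.
log_path="output_pipeline"
model_name="ppyoloe_demo"
use_mkldnn="True"
threads="6"
log_file="${log_path}/${model_name}_cpu_usemkldnn_${use_mkldnn}_cputhreads_${threads}_bs1_infer.log"
echo "${log_file}"
```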
diff --git a/PaddleDetection-release-2.6/deploy/benchmark/benchmark_quant.sh b/PaddleDetection-release-2.6/deploy/benchmark/benchmark_quant.sh
new file mode 100644
index 0000000000000000000000000000000000000000..a21541dd044bf9bd4a33bb4eb2116b47743e5a8a
--- /dev/null
+++ b/PaddleDetection-release-2.6/deploy/benchmark/benchmark_quant.sh
@@ -0,0 +1,23 @@
+#!/bin/bash
+# All rights `PaddleDetection` reserved
+model_dir=$1
+model_name=$2
+
+export img_dir="demo"
+export log_path="output_pipeline"
+
+
+echo "model_dir : ${model_dir}"
+echo "img_dir: ${img_dir}"
+
+# TODO: support batch size>1
+for run_mode in "trt_int8"; do
+ echo "${model_name} ${model_dir}, run_mode: ${run_mode}"
+ python deploy/python/infer.py \
+ --model_dir=${model_dir} \
+ --run_benchmark=True \
+ --device=GPU \
+ --run_mode=${run_mode} \
+ --image_dir=${img_dir} 2>&1 | tee ${log_path}/${model_name}_gpu_runmode_${run_mode}_bs1_infer.log
+done
+
diff --git a/PaddleDetection-release-2.6/deploy/benchmark/log_parser_excel.py b/PaddleDetection-release-2.6/deploy/benchmark/log_parser_excel.py
new file mode 100644
index 0000000000000000000000000000000000000000..317b3759572c6acef3438fbc654bc5918e8bdd38
--- /dev/null
+++ b/PaddleDetection-release-2.6/deploy/benchmark/log_parser_excel.py
@@ -0,0 +1,300 @@
+# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import os
+import re
+import argparse
+import pandas as pd
+
+
+def parse_args():
+ """
+ parse input args
+ """
+ parser = argparse.ArgumentParser()
+ parser.add_argument(
+ "--log_path",
+ type=str,
+ default="./output_pipeline",
+ help="benchmark log path")
+ parser.add_argument(
+ "--output_name",
+ type=str,
+ default="benchmark_excel.xlsx",
+ help="output excel file name")
+ parser.add_argument(
+ "--analysis_trt", dest="analysis_trt", action='store_true')
+ parser.add_argument(
+ "--analysis_mkl", dest="analysis_mkl", action='store_true')
+ return parser.parse_args()
+
+
+def find_all_logs(path_walk):
+ """
+ find all .log files from target dir
+ """
+ for root, ds, files in os.walk(path_walk):
+ for file_name in files:
+            if re.match(r'.*\.log$', file_name):
+ full_path = os.path.join(root, file_name)
+ yield file_name, full_path
+
+
+def process_log(file_name):
+ """
+ process log to dict
+ """
+ output_dict = {}
+ with open(file_name, 'r') as f:
+ for i, data in enumerate(f.readlines()):
+ if i == 0:
+ continue
+ line_lists = data.split(" ")
+
+ # conf info
+ if "runtime_device:" in line_lists:
+ pos_buf = line_lists.index("runtime_device:")
+ output_dict["runtime_device"] = line_lists[pos_buf + 1].strip()
+ if "ir_optim:" in line_lists:
+ pos_buf = line_lists.index("ir_optim:")
+ output_dict["ir_optim"] = line_lists[pos_buf + 1].strip()
+ if "enable_memory_optim:" in line_lists:
+ pos_buf = line_lists.index("enable_memory_optim:")
+ output_dict["enable_memory_optim"] = line_lists[pos_buf +
+ 1].strip()
+ if "enable_tensorrt:" in line_lists:
+ pos_buf = line_lists.index("enable_tensorrt:")
+ output_dict["enable_tensorrt"] = line_lists[pos_buf + 1].strip()
+ if "precision:" in line_lists:
+ pos_buf = line_lists.index("precision:")
+ output_dict["precision"] = line_lists[pos_buf + 1].strip()
+ if "enable_mkldnn:" in line_lists:
+ pos_buf = line_lists.index("enable_mkldnn:")
+ output_dict["enable_mkldnn"] = line_lists[pos_buf + 1].strip()
+ if "cpu_math_library_num_threads:" in line_lists:
+ pos_buf = line_lists.index("cpu_math_library_num_threads:")
+ output_dict["cpu_math_library_num_threads"] = line_lists[
+ pos_buf + 1].strip()
+
+ # model info
+ if "model_name:" in line_lists:
+ pos_buf = line_lists.index("model_name:")
+ output_dict["model_name"] = list(
+ filter(None, line_lists[pos_buf + 1].strip().split('/')))[
+ -1]
+
+ # data info
+ if "batch_size:" in line_lists:
+ pos_buf = line_lists.index("batch_size:")
+ output_dict["batch_size"] = line_lists[pos_buf + 1].strip()
+ if "input_shape:" in line_lists:
+ pos_buf = line_lists.index("input_shape:")
+ output_dict["input_shape"] = line_lists[pos_buf + 1].strip()
+
+ # perf info
+ if "cpu_rss(MB):" in line_lists:
+ pos_buf = line_lists.index("cpu_rss(MB):")
+ output_dict["cpu_rss(MB)"] = line_lists[pos_buf + 1].strip(
+ ).split(',')[0]
+ if "gpu_rss(MB):" in line_lists:
+ pos_buf = line_lists.index("gpu_rss(MB):")
+ output_dict["gpu_rss(MB)"] = line_lists[pos_buf + 1].strip(
+ ).split(',')[0]
+ if "gpu_util:" in line_lists:
+ pos_buf = line_lists.index("gpu_util:")
+ output_dict["gpu_util"] = line_lists[pos_buf + 1].strip().split(
+ ',')[0]
+ if "preproce_time(ms):" in line_lists:
+ pos_buf = line_lists.index("preproce_time(ms):")
+ output_dict["preproce_time(ms)"] = line_lists[
+ pos_buf + 1].strip().split(',')[0]
+ if "inference_time(ms):" in line_lists:
+ pos_buf = line_lists.index("inference_time(ms):")
+ output_dict["inference_time(ms)"] = line_lists[
+ pos_buf + 1].strip().split(',')[0]
+ if "postprocess_time(ms):" in line_lists:
+ pos_buf = line_lists.index("postprocess_time(ms):")
+ output_dict["postprocess_time(ms)"] = line_lists[
+ pos_buf + 1].strip().split(',')[0]
+ return output_dict
+
+
+def filter_df_merge(cpu_df, filter_column=None):
+ """
+ process cpu data frame, merge by 'model_name', 'batch_size'
+ Args:
+ cpu_df ([type]): [description]
+ """
+ if not filter_column:
+ raise Exception(
+ "please assign filter_column for filter_df_merge function")
+
+ df_lists = []
+ filter_column_lists = []
+ for k, v in cpu_df.groupby(filter_column, dropna=True):
+ filter_column_lists.append(k)
+ df_lists.append(v)
+ final_output_df = df_lists[-1]
+
+ # merge same model
+ for i in range(len(df_lists) - 1):
+ left_suffix = cpu_df[filter_column].unique()[0]
+ right_suffix = df_lists[i][filter_column].unique()[0]
+ print(left_suffix, right_suffix)
+ if not pd.isnull(right_suffix):
+ final_output_df = pd.merge(
+ final_output_df,
+ df_lists[i],
+ how='left',
+ left_on=['model_name', 'batch_size'],
+ right_on=['model_name', 'batch_size'],
+ suffixes=('', '_{0}_{1}'.format(filter_column, right_suffix)))
+
+ # rename default df columns
+ origin_column_names = list(cpu_df.columns.values)
+ origin_column_names.remove(filter_column)
+ suffix = final_output_df[filter_column].unique()[0]
+ for name in origin_column_names:
+ final_output_df.rename(
+ columns={name: "{0}_{1}_{2}".format(name, filter_column, suffix)},
+ inplace=True)
+ final_output_df.rename(
+ columns={
+ filter_column: "{0}_{1}_{2}".format(filter_column, filter_column,
+ suffix)
+ },
+ inplace=True)
+
+ final_output_df.sort_values(
+ by=[
+ "model_name_{0}_{1}".format(filter_column, suffix),
+ "batch_size_{0}_{1}".format(filter_column, suffix)
+ ],
+ inplace=True)
+ return final_output_df
+
+
+def trt_perf_analysis(raw_df):
+    """
+    Separate the raw dataframe into per-precision dataframes and
+    compare TensorRT precision performance.
+    """
+ # filter df by gpu, compare tensorrt and gpu
+ # define default dataframe for gpu performance analysis
+ gpu_df = raw_df.loc[raw_df['runtime_device'] == 'gpu']
+ new_df = filter_df_merge(gpu_df, "precision")
+
+ # calculate qps diff percentile
+ infer_fp32 = "inference_time(ms)_precision_fp32"
+ infer_fp16 = "inference_time(ms)_precision_fp16"
+ infer_int8 = "inference_time(ms)_precision_int8"
+ new_df["fp32_fp16_diff"] = new_df[[infer_fp32, infer_fp16]].apply(
+ lambda x: (float(x[infer_fp16]) - float(x[infer_fp32])) / float(x[infer_fp32]),
+ axis=1)
+ new_df["fp32_gpu_diff"] = new_df[["inference_time(ms)", infer_fp32]].apply(
+ lambda x: (float(x[infer_fp32]) - float(x[infer_fp32])) / float(x["inference_time(ms)"]),
+ axis=1)
+ new_df["fp16_int8_diff"] = new_df[[infer_fp16, infer_int8]].apply(
+ lambda x: (float(x[infer_int8]) - float(x[infer_fp16])) / float(x[infer_fp16]),
+ axis=1)
+
+ return new_df
+
+
+def mkl_perf_analysis(raw_df):
+    """
+    Separate the raw dataframe into a list of dataframes and
+    compare performance with MKL-DNN enabled versus disabled.
+    """
+ # filter df by cpu, compare mkl and cpu
+ # define default dataframe for cpu mkldnn analysis
+ cpu_df = raw_df.loc[raw_df['runtime_device'] == 'cpu']
+ mkl_compare_df = cpu_df.loc[cpu_df['cpu_math_library_num_threads'] == '1']
+ thread_compare_df = cpu_df.loc[cpu_df['enable_mkldnn'] == 'True']
+
+ # define dataframe need to be analyzed
+ output_mkl_df = filter_df_merge(mkl_compare_df, 'enable_mkldnn')
+ output_thread_df = filter_df_merge(thread_compare_df,
+ 'cpu_math_library_num_threads')
+
+ # calculate performance diff percentile
+ # compare mkl performance with cpu
+ enable_mkldnn = "inference_time(ms)_enable_mkldnn_True"
+ disable_mkldnn = "inference_time(ms)_enable_mkldnn_False"
+ output_mkl_df["mkl_infer_diff"] = output_mkl_df[[
+ enable_mkldnn, disable_mkldnn
+ ]].apply(
+ lambda x: (float(x[enable_mkldnn]) - float(x[disable_mkldnn])) / float(x[disable_mkldnn]),
+ axis=1)
+ cpu_enable_mkldnn = "cpu_rss(MB)_enable_mkldnn_True"
+ cpu_disable_mkldnn = "cpu_rss(MB)_enable_mkldnn_False"
+ output_mkl_df["mkl_cpu_rss_diff"] = output_mkl_df[[
+ cpu_enable_mkldnn, cpu_disable_mkldnn
+ ]].apply(
+ lambda x: (float(x[cpu_enable_mkldnn]) - float(x[cpu_disable_mkldnn])) / float(x[cpu_disable_mkldnn]),
+ axis=1)
+
+ # compare cpu_multi_thread performance with cpu
+ num_threads_1 = "inference_time(ms)_cpu_math_library_num_threads_1"
+ num_threads_6 = "inference_time(ms)_cpu_math_library_num_threads_6"
+ output_thread_df["mkl_infer_diff"] = output_thread_df[[
+ num_threads_6, num_threads_1
+ ]].apply(
+ lambda x: (float(x[num_threads_6]) - float(x[num_threads_1])) / float(x[num_threads_1]),
+ axis=1)
+ cpu_num_threads_1 = "cpu_rss(MB)_cpu_math_library_num_threads_1"
+ cpu_num_threads_6 = "cpu_rss(MB)_cpu_math_library_num_threads_6"
+ output_thread_df["mkl_cpu_rss_diff"] = output_thread_df[[
+ cpu_num_threads_6, cpu_num_threads_1
+ ]].apply(
+ lambda x: (float(x[cpu_num_threads_6]) - float(x[cpu_num_threads_1])) / float(x[cpu_num_threads_1]),
+ axis=1)
+
+ return output_mkl_df, output_thread_df
+
+
+def main():
+ """
+ main
+ """
+ args = parse_args()
+ # create empty DataFrame
+ origin_df = pd.DataFrame(columns=[
+ "model_name", "batch_size", "input_shape", "runtime_device", "ir_optim",
+ "enable_memory_optim", "enable_tensorrt", "precision", "enable_mkldnn",
+ "cpu_math_library_num_threads", "preproce_time(ms)",
+ "inference_time(ms)", "postprocess_time(ms)", "cpu_rss(MB)",
+ "gpu_rss(MB)", "gpu_util"
+ ])
+
+ for file_name, full_path in find_all_logs(args.log_path):
+ dict_log = process_log(full_path)
+        # DataFrame.append was removed in pandas 2.0; concat is equivalent here.
+        origin_df = pd.concat(
+            [origin_df, pd.DataFrame([dict_log])], ignore_index=True)
+
+    raw_df = origin_df.sort_values(by=["model_name", "batch_size"])
+ raw_df.to_excel(args.output_name)
+
+ if args.analysis_trt:
+ trt_df = trt_perf_analysis(raw_df)
+ trt_df.to_excel("trt_analysis_{}".format(args.output_name))
+
+ if args.analysis_mkl:
+ mkl_df, thread_df = mkl_perf_analysis(raw_df)
+ mkl_df.to_excel("mkl_enable_analysis_{}".format(args.output_name))
+ thread_df.to_excel("mkl_threads_analysis_{}".format(args.output_name))
+
+
+if __name__ == "__main__":
+ main()
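`process_log` scans each log line for `key:` tokens and takes the following token as the value. A condensed standalone sketch of that pattern, run on a fabricated log line:

```python
# Fabricated benchmark log line for illustration only.
line = "runtime_device: gpu enable_tensorrt: True precision: fp16"
tokens = line.split(" ")

# Same token scan as process_log: find "key:", take the next token.
parsed = {}
for key in ("runtime_device:", "enable_tensorrt:", "precision:"):
    if key in tokens:
        pos = tokens.index(key)
        parsed[key.rstrip(":")] = tokens[pos + 1].strip()

assert parsed == {"runtime_device": "gpu",
                  "enable_tensorrt": "True",
                  "precision": "fp16"}
```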
diff --git a/PaddleDetection-release-2.6/deploy/cpp/CMakeLists.txt b/PaddleDetection-release-2.6/deploy/cpp/CMakeLists.txt
new file mode 100644
index 0000000000000000000000000000000000000000..34f8808d53e085c43048c4955a5715d663e4291e
--- /dev/null
+++ b/PaddleDetection-release-2.6/deploy/cpp/CMakeLists.txt
@@ -0,0 +1,264 @@
+cmake_minimum_required(VERSION 3.0)
+project(PaddleObjectDetector CXX C)
+
+option(WITH_MKL "Compile demo with MKL/OpenBlas support, default use MKL." ON)
+option(WITH_GPU "Compile demo with GPU/CPU, default use CPU." ON)
+option(WITH_TENSORRT "Compile demo with TensorRT." OFF)
+
+option(WITH_KEYPOINT "Whether to Compile KeyPoint detector" OFF)
+option(WITH_MOT "Whether to Compile MOT detector" OFF)
+
+SET(PADDLE_DIR "" CACHE PATH "Location of libraries")
+SET(PADDLE_LIB_NAME "" CACHE STRING "libpaddle_inference")
+SET(OPENCV_DIR "" CACHE PATH "Location of libraries")
+SET(CUDA_LIB "" CACHE PATH "Location of libraries")
+SET(CUDNN_LIB "" CACHE PATH "Location of libraries")
+SET(TENSORRT_INC_DIR "" CACHE PATH "Compile demo with TensorRT")
+SET(TENSORRT_LIB_DIR "" CACHE PATH "Compile demo with TensorRT")
+
+include(cmake/yaml-cpp.cmake)
+
+include_directories("${CMAKE_SOURCE_DIR}/")
+include_directories("${CMAKE_CURRENT_BINARY_DIR}/ext/yaml-cpp/src/ext-yaml-cpp/include")
+link_directories("${CMAKE_CURRENT_BINARY_DIR}/ext/yaml-cpp/lib")
+
+if (WITH_KEYPOINT)
+ set(SRCS src/main_keypoint.cc src/preprocess_op.cc src/object_detector.cc src/picodet_postprocess.cc src/utils.cc src/keypoint_detector.cc src/keypoint_postprocess.cc)
+elseif (WITH_MOT)
+ set(SRCS src/main_jde.cc src/preprocess_op.cc src/object_detector.cc src/jde_detector.cc src/tracker.cc src/trajectory.cc src/lapjv.cpp src/picodet_postprocess.cc src/utils.cc)
+else ()
+ set(SRCS src/main.cc src/preprocess_op.cc src/object_detector.cc src/picodet_postprocess.cc src/utils.cc)
+endif()
+
+macro(safe_set_static_flag)
+ foreach(flag_var
+ CMAKE_CXX_FLAGS CMAKE_CXX_FLAGS_DEBUG CMAKE_CXX_FLAGS_RELEASE
+ CMAKE_CXX_FLAGS_MINSIZEREL CMAKE_CXX_FLAGS_RELWITHDEBINFO)
+ if(${flag_var} MATCHES "/MD")
+ string(REGEX REPLACE "/MD" "/MT" ${flag_var} "${${flag_var}}")
+ endif(${flag_var} MATCHES "/MD")
+ endforeach(flag_var)
+endmacro()
+
+if (WITH_MKL)
+ ADD_DEFINITIONS(-DUSE_MKL)
+endif()
+
+if (NOT DEFINED PADDLE_DIR OR ${PADDLE_DIR} STREQUAL "")
+    message(FATAL_ERROR "please set PADDLE_DIR with -DPADDLE_DIR=/path/paddle_inference_dir")
+endif()
+message("PADDLE_DIR IS:" ${PADDLE_DIR})
+
+if (NOT DEFINED OPENCV_DIR OR ${OPENCV_DIR} STREQUAL "")
+ message(FATAL_ERROR "please set OPENCV_DIR with -DOPENCV_DIR=/path/opencv")
+endif()
+
+include_directories("${CMAKE_SOURCE_DIR}/")
+include_directories("${PADDLE_DIR}/")
+include_directories("${PADDLE_DIR}/third_party/install/protobuf/include")
+include_directories("${PADDLE_DIR}/third_party/install/glog/include")
+include_directories("${PADDLE_DIR}/third_party/install/gflags/include")
+include_directories("${PADDLE_DIR}/third_party/install/xxhash/include")
+if (EXISTS "${PADDLE_DIR}/third_party/install/snappy/include")
+ include_directories("${PADDLE_DIR}/third_party/install/snappy/include")
+endif()
+if(EXISTS "${PADDLE_DIR}/third_party/install/snappystream/include")
+ include_directories("${PADDLE_DIR}/third_party/install/snappystream/include")
+endif()
+include_directories("${PADDLE_DIR}/third_party/boost")
+include_directories("${PADDLE_DIR}/third_party/eigen3")
+
+if (EXISTS "${PADDLE_DIR}/third_party/install/snappy/lib")
+ link_directories("${PADDLE_DIR}/third_party/install/snappy/lib")
+endif()
+if(EXISTS "${PADDLE_DIR}/third_party/install/snappystream/lib")
+ link_directories("${PADDLE_DIR}/third_party/install/snappystream/lib")
+endif()
+
+link_directories("${PADDLE_DIR}/third_party/install/protobuf/lib")
+link_directories("${PADDLE_DIR}/third_party/install/glog/lib")
+link_directories("${PADDLE_DIR}/third_party/install/gflags/lib")
+link_directories("${PADDLE_DIR}/third_party/install/xxhash/lib")
+link_directories("${PADDLE_DIR}/third_party/install/paddle2onnx/lib")
+link_directories("${PADDLE_DIR}/third_party/install/onnxruntime/lib")
+link_directories("${PADDLE_DIR}/paddle/lib/")
+link_directories("${CMAKE_CURRENT_BINARY_DIR}")
+
+
+
+if (WIN32)
+ include_directories("${PADDLE_DIR}/paddle/fluid/inference")
+ include_directories("${PADDLE_DIR}/paddle/include")
+ link_directories("${PADDLE_DIR}/paddle/fluid/inference")
+ find_package(OpenCV REQUIRED PATHS ${OPENCV_DIR}/build/ NO_DEFAULT_PATH)
+
+else ()
+ find_package(OpenCV REQUIRED PATHS ${OPENCV_DIR}/share/OpenCV NO_DEFAULT_PATH)
+ include_directories("${PADDLE_DIR}/paddle/include")
+ link_directories("${PADDLE_DIR}/paddle/lib")
+endif ()
+include_directories(${OpenCV_INCLUDE_DIRS})
+
+if (WIN32)
+ add_definitions("/DGOOGLE_GLOG_DLL_DECL=")
+ set(CMAKE_C_FLAGS_DEBUG "${CMAKE_C_FLAGS_DEBUG} /bigobj /MTd")
+ set(CMAKE_C_FLAGS_RELEASE "${CMAKE_C_FLAGS_RELEASE} /bigobj /MT")
+ set(CMAKE_CXX_FLAGS_DEBUG "${CMAKE_CXX_FLAGS_DEBUG} /bigobj /MTd")
+ set(CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE} /bigobj /MT")
+else()
+  set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -g -O2 -fopenmp -std=c++11")
+ set(CMAKE_STATIC_LIBRARY_PREFIX "")
+endif()
+
+# TODO let users define cuda lib path
+if (WITH_GPU)
+ if (NOT DEFINED CUDA_LIB OR ${CUDA_LIB} STREQUAL "")
+ message(FATAL_ERROR "please set CUDA_LIB with -DCUDA_LIB=/path/cuda-8.0/lib64")
+ endif()
+ if (NOT WIN32)
+ if (NOT DEFINED CUDNN_LIB)
+ message(FATAL_ERROR "please set CUDNN_LIB with -DCUDNN_LIB=/path/cudnn_v7.4/cuda/lib64")
+ endif()
+ endif(NOT WIN32)
+endif()
+
+
+if (NOT WIN32)
+ if (WITH_TENSORRT AND WITH_GPU)
+ include_directories("${TENSORRT_INC_DIR}/")
+ link_directories("${TENSORRT_LIB_DIR}/")
+ endif()
+endif(NOT WIN32)
+
+if (NOT WIN32)
+ set(NGRAPH_PATH "${PADDLE_DIR}/third_party/install/ngraph")
+ if(EXISTS ${NGRAPH_PATH})
+ include(GNUInstallDirs)
+ include_directories("${NGRAPH_PATH}/include")
+ link_directories("${NGRAPH_PATH}/${CMAKE_INSTALL_LIBDIR}")
+ set(NGRAPH_LIB ${NGRAPH_PATH}/${CMAKE_INSTALL_LIBDIR}/libngraph${CMAKE_SHARED_LIBRARY_SUFFIX})
+ endif()
+endif()
+
+if(WITH_MKL)
+ include_directories("${PADDLE_DIR}/third_party/install/mklml/include")
+ if (WIN32)
+ set(MATH_LIB ${PADDLE_DIR}/third_party/install/mklml/lib/mklml.lib
+ ${PADDLE_DIR}/third_party/install/mklml/lib/libiomp5md.lib)
+ else ()
+ set(MATH_LIB ${PADDLE_DIR}/third_party/install/mklml/lib/libmklml_intel${CMAKE_SHARED_LIBRARY_SUFFIX}
+ ${PADDLE_DIR}/third_party/install/mklml/lib/libiomp5${CMAKE_SHARED_LIBRARY_SUFFIX})
+ execute_process(COMMAND cp -r ${PADDLE_DIR}/third_party/install/mklml/lib/libmklml_intel${CMAKE_SHARED_LIBRARY_SUFFIX} /usr/lib)
+ endif ()
+ set(MKLDNN_PATH "${PADDLE_DIR}/third_party/install/mkldnn")
+ if(EXISTS ${MKLDNN_PATH})
+ include_directories("${MKLDNN_PATH}/include")
+ if (WIN32)
+ set(MKLDNN_LIB ${MKLDNN_PATH}/lib/mkldnn.lib)
+ else ()
+ set(MKLDNN_LIB ${MKLDNN_PATH}/lib/libmkldnn.so.0)
+ endif ()
+ endif()
+else()
+ set(MATH_LIB ${PADDLE_DIR}/third_party/install/openblas/lib/libopenblas${CMAKE_STATIC_LIBRARY_SUFFIX})
+endif()
+
+
+if (WIN32)
+  if(EXISTS "${PADDLE_DIR}/paddle/fluid/inference/${PADDLE_LIB_NAME}${CMAKE_STATIC_LIBRARY_SUFFIX}")
+    set(DEPS
+        ${PADDLE_DIR}/paddle/fluid/inference/${PADDLE_LIB_NAME}${CMAKE_STATIC_LIBRARY_SUFFIX})
+  else()
+    set(DEPS
+        ${PADDLE_DIR}/paddle/lib/${PADDLE_LIB_NAME}${CMAKE_STATIC_LIBRARY_SUFFIX})
+  endif()
+else()
+  set(DEPS ${PADDLE_DIR}/paddle/lib/${PADDLE_LIB_NAME}${CMAKE_SHARED_LIBRARY_SUFFIX})
+endif()
+
+message("PADDLE_LIB_NAME:" ${PADDLE_LIB_NAME})
+message("DEPS:" ${DEPS})
+
+if (NOT WIN32)
+ set(DEPS ${DEPS}
+ ${MATH_LIB} ${MKLDNN_LIB}
+ glog gflags protobuf z xxhash yaml-cpp
+ )
+ if(EXISTS "${PADDLE_DIR}/third_party/install/snappystream/lib")
+ set(DEPS ${DEPS} snappystream)
+ endif()
+ if (EXISTS "${PADDLE_DIR}/third_party/install/snappy/lib")
+ set(DEPS ${DEPS} snappy)
+ endif()
+else()
+ set(DEPS ${DEPS}
+ ${MATH_LIB} ${MKLDNN_LIB}
+ glog gflags_static libprotobuf xxhash libyaml-cppmt)
+ set(DEPS ${DEPS} libcmt shlwapi)
+ if (EXISTS "${PADDLE_DIR}/third_party/install/snappy/lib")
+ set(DEPS ${DEPS} snappy)
+ endif()
+ if(EXISTS "${PADDLE_DIR}/third_party/install/snappystream/lib")
+ set(DEPS ${DEPS} snappystream)
+ endif()
+endif(NOT WIN32)
+
+if(WITH_GPU)
+ if(NOT WIN32)
+ if (WITH_TENSORRT)
+ set(DEPS ${DEPS} ${TENSORRT_LIB_DIR}/libnvinfer${CMAKE_SHARED_LIBRARY_SUFFIX})
+ set(DEPS ${DEPS} ${TENSORRT_LIB_DIR}/libnvinfer_plugin${CMAKE_SHARED_LIBRARY_SUFFIX})
+ endif()
+ set(DEPS ${DEPS} ${CUDA_LIB}/libcudart${CMAKE_SHARED_LIBRARY_SUFFIX})
+ set(DEPS ${DEPS} ${CUDNN_LIB}/libcudnn${CMAKE_SHARED_LIBRARY_SUFFIX})
+ else()
+ set(DEPS ${DEPS} ${CUDA_LIB}/cudart${CMAKE_STATIC_LIBRARY_SUFFIX} )
+ set(DEPS ${DEPS} ${CUDA_LIB}/cublas${CMAKE_STATIC_LIBRARY_SUFFIX} )
+ set(DEPS ${DEPS} ${CUDNN_LIB}/cudnn${CMAKE_STATIC_LIBRARY_SUFFIX})
+ endif()
+endif()
+
+if (NOT WIN32)
+ set(EXTERNAL_LIB "-ldl -lrt -lgomp -lz -lm -lpthread")
+ set(DEPS ${DEPS} ${EXTERNAL_LIB})
+endif()
+
+set(DEPS ${DEPS} ${OpenCV_LIBS})
+add_executable(main ${SRCS})
+ADD_DEPENDENCIES(main ext-yaml-cpp)
+message("DEPS:" ${DEPS})
+target_link_libraries(main ${DEPS})
+
+if (WIN32 AND WITH_MKL)
+ add_custom_command(TARGET main POST_BUILD
+ COMMAND ${CMAKE_COMMAND} -E copy_if_different ${PADDLE_DIR}/third_party/install/mklml/lib/mklml.dll ./mklml.dll
+ COMMAND ${CMAKE_COMMAND} -E copy_if_different ${PADDLE_DIR}/third_party/install/mklml/lib/libiomp5md.dll ./libiomp5md.dll
+ COMMAND ${CMAKE_COMMAND} -E copy_if_different ${PADDLE_DIR}/third_party/install/mkldnn/lib/mkldnn.dll ./mkldnn.dll
+ COMMAND ${CMAKE_COMMAND} -E copy_if_different ${PADDLE_DIR}/third_party/install/mklml/lib/mklml.dll ./release/mklml.dll
+ COMMAND ${CMAKE_COMMAND} -E copy_if_different ${PADDLE_DIR}/third_party/install/mklml/lib/libiomp5md.dll ./release/libiomp5md.dll
+ COMMAND ${CMAKE_COMMAND} -E copy_if_different ${PADDLE_DIR}/third_party/install/mkldnn/lib/mkldnn.dll ./release/mkldnn.dll
+ COMMAND ${CMAKE_COMMAND} -E copy_if_different ${PADDLE_DIR}/paddle/lib/${PADDLE_LIB_NAME}.dll ./release/${PADDLE_LIB_NAME}.dll
+ )
+endif()
+
+if (WIN32 AND NOT WITH_MKL)
+ add_custom_command(TARGET main POST_BUILD
+ COMMAND ${CMAKE_COMMAND} -E copy_if_different ${PADDLE_DIR}/third_party/install/openblas/lib/openblas.dll ./openblas.dll
+ COMMAND ${CMAKE_COMMAND} -E copy_if_different ${PADDLE_DIR}/third_party/install/openblas/lib/openblas.dll ./release/openblas.dll
+ )
+endif()
+
+if (WIN32)
+ add_custom_command(TARGET main POST_BUILD
+ COMMAND ${CMAKE_COMMAND} -E copy_if_different ${PADDLE_DIR}/third_party/install/onnxruntime/lib/onnxruntime.dll ./onnxruntime.dll
+ COMMAND ${CMAKE_COMMAND} -E copy_if_different ${PADDLE_DIR}/third_party/install/paddle2onnx/lib/paddle2onnx.dll ./paddle2onnx.dll
+ COMMAND ${CMAKE_COMMAND} -E copy_if_different ${PADDLE_DIR}/third_party/install/onnxruntime/lib/onnxruntime.dll ./release/onnxruntime.dll
+ COMMAND ${CMAKE_COMMAND} -E copy_if_different ${PADDLE_DIR}/third_party/install/paddle2onnx/lib/paddle2onnx.dll ./release/paddle2onnx.dll
+ COMMAND ${CMAKE_COMMAND} -E copy_if_different ${PADDLE_DIR}/paddle/lib/${PADDLE_LIB_NAME}.dll ./release/${PADDLE_LIB_NAME}.dll
+ )
+endif()
diff --git a/PaddleDetection-release-2.6/deploy/cpp/README.md b/PaddleDetection-release-2.6/deploy/cpp/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..ffa5e251e7913b4af30fa6abe9912c9434af996f
--- /dev/null
+++ b/PaddleDetection-release-2.6/deploy/cpp/README.md
@@ -0,0 +1,54 @@
+# C++ Inference Deployment
+
+
+
+## Platform-specific Build Guides
+- [Linux build guide](docs/linux_build.md)
+- [Windows build guide (Visual Studio 2019)](docs/windows_vs2019_build.md)
+- [NV Jetson build guide](docs/Jetson_build.md)
+
+
+## C++ Deployment Overview
+[1. Introduction](#1-introduction)
+
+[2. Directory layout](#2-directory-layout)
+
+
+### 1. Introduction
+
+This directory provides a cross-platform `C++` deployment solution: after exporting a model trained with `PaddleDetection`, you can run it directly with this project, or quickly integrate the code into your own application.
+
+The design has four main goals:
+- Cross-platform: build, extend, and deploy on both `Windows` and `Linux`
+- Extensibility: users can implement custom logic, such as special data pre-processing, for new models
+- High performance: beyond the performance of `PaddlePaddle` itself, key steps are optimized for the characteristics of object detection
+- Support for a variety of detection architectures, including `YOLOv3`/`Faster_RCNN`/`SSD`
+
+### 2. Directory layout
+
+```bash
+deploy/cpp
+|
+├── src
+│   ├── main.cc               # integration example, program entry point
+│   ├── object_detector.cc    # implementation of the model loading / inference class
+│   └── preprocess_op.cc      # implementation of the pre-processing logic
+|
+├── include
+│   ├── config_parser.h       # parser for the exported model's yaml config file
+│   ├── object_detector.h     # model loading / inference class
+│   └── preprocess_op.h       # pre-processing classes
+|
+├── docs
+│   ├── linux_build.md        # Linux build guide
+│   └── windows_vs2019_build.md # Windows VS2019 build guide
+│
+├── build.sh                  # build script
+│
+├── CMakeList.txt             # cmake entry file
+|
+├── CMakeSettings.json        # Visual Studio 2019 CMake project build settings
+│
+└── cmake                     # cmake files for external dependencies (currently only yaml-cpp)
+
+```
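+
+As a quick orientation, the typical end-to-end workflow from the guides above can be sketched as follows (the exported model path and image path are placeholders, not files shipped with this repo):
+
+```bash
+# sketch only - see docs/linux_build.md for the authoritative steps
+git clone https://github.com/PaddlePaddle/PaddleDetection.git
+cd PaddleDetection/deploy/cpp
+# edit scripts/build.sh first: set PADDLE_DIR, OPENCV_DIR, WITH_GPU, ...
+sh ./scripts/build.sh
+# run inference with an exported detection model (placeholder paths)
+./build/main --model_dir=/path/to/exported_model --image_file=/path/to/test.jpg
+```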
diff --git a/PaddleDetection-release-2.6/deploy/cpp/cmake/yaml-cpp.cmake b/PaddleDetection-release-2.6/deploy/cpp/cmake/yaml-cpp.cmake
new file mode 100644
index 0000000000000000000000000000000000000000..7bc7f34d476d69d57336940bcf6c8c55311b8112
--- /dev/null
+++ b/PaddleDetection-release-2.6/deploy/cpp/cmake/yaml-cpp.cmake
@@ -0,0 +1,30 @@
+
+find_package(Git REQUIRED)
+
+include(ExternalProject)
+
+message("${CMAKE_BUILD_TYPE}")
+
+ExternalProject_Add(
+ ext-yaml-cpp
+ URL https://bj.bcebos.com/paddlex/deploy/deps/yaml-cpp.zip
+ URL_MD5 9542d6de397d1fbd649ed468cb5850e6
+ CMAKE_ARGS
+ -DYAML_CPP_BUILD_TESTS=OFF
+ -DYAML_CPP_BUILD_TOOLS=OFF
+ -DYAML_CPP_INSTALL=OFF
+ -DYAML_CPP_BUILD_CONTRIB=OFF
+ -DMSVC_SHARED_RT=OFF
+ -DBUILD_SHARED_LIBS=OFF
+ -DCMAKE_BUILD_TYPE=${CMAKE_BUILD_TYPE}
+ -DCMAKE_CXX_FLAGS=${CMAKE_CXX_FLAGS}
+ -DCMAKE_CXX_FLAGS_DEBUG=${CMAKE_CXX_FLAGS_DEBUG}
+ -DCMAKE_CXX_FLAGS_RELEASE=${CMAKE_CXX_FLAGS_RELEASE}
+ -DCMAKE_LIBRARY_OUTPUT_DIRECTORY=${CMAKE_BINARY_DIR}/ext/yaml-cpp/lib
+ -DCMAKE_ARCHIVE_OUTPUT_DIRECTORY=${CMAKE_BINARY_DIR}/ext/yaml-cpp/lib
+ PREFIX "${CMAKE_BINARY_DIR}/ext/yaml-cpp"
+ # Disable install step
+ INSTALL_COMMAND ""
+    LOG_DOWNLOAD ON
+    LOG_BUILD ON
+)
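+
+For reference, a minimal sketch of how the top-level `CMakeLists.txt` can consume the `ext-yaml-cpp` project defined above (the source/include layout under the `PREFIX` directory is an assumption about ExternalProject's default paths):
+
+```cmake
+# Hypothetical consumption sketch for ext-yaml-cpp:
+# headers come from the unpacked source tree, the static library from the
+# output directories configured above.
+include_directories("${CMAKE_BINARY_DIR}/ext/yaml-cpp/src/ext-yaml-cpp/include")
+link_directories("${CMAKE_BINARY_DIR}/ext/yaml-cpp/lib")
+add_dependencies(main ext-yaml-cpp)   # build yaml-cpp before the main target
+target_link_libraries(main yaml-cpp)  # link the static yaml-cpp library
+```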
diff --git a/PaddleDetection-release-2.6/deploy/cpp/docs/Jetson_build.md b/PaddleDetection-release-2.6/deploy/cpp/docs/Jetson_build.md
new file mode 100644
index 0000000000000000000000000000000000000000..ea9699a438ed3977e118b155a01b533d83bb12f4
--- /dev/null
+++ b/PaddleDetection-release-2.6/deploy/cpp/docs/Jetson_build.md
@@ -0,0 +1,210 @@
+# Jetson Build Guide
+
+## Overview
+`NVIDIA Jetson` devices are embedded devices with an `NVIDIA GPU`, and object detection models can be deployed to them. This document is a tutorial for deploying `PaddleDetection` models on `Jetson` hardware.
+
+It uses `Jetson TX2` hardware with `JetPack 4.3` as the running example.
+
+For the `Jetson` platform development guide, see the [NVIDIA Jetson Linux Developer Guide](https://docs.nvidia.com/jetson/l4t/index.html).
+
+## Setting up the Jetson environment
+To install the `Jetson` system software, see the [NVIDIA Jetson Linux Developer Guide](https://docs.nvidia.com/jetson/l4t/index.html).
+
+* (1) Check the L4T version of your hardware:
+```
+cat /etc/nv_tegra_release
+```
+* (2) Choose a `JetPack` version your hardware supports; for the mapping between hardware and `JetPack` versions, see the [jetpack-archive](https://developer.nvidia.com/embedded/jetpack-archive).
+
+* (3) Download `JetPack` and flash the system image following the `Preparing a Jetson Developer Kit for Use` section of the [NVIDIA Jetson Linux Developer Guide](https://docs.nvidia.com/jetson/l4t/index.html).
+
+**Note**: pick the `JetPack` version matching your hardware from the [jetpack-archive](https://developer.nvidia.com/embedded/jetpack-archive) before flashing.
+
+## Download or build the `Paddle` inference library
+This document uses the `Paddle` inference library pre-built for `JetPack4.3`; choose the version matching your hardware from [Build and install the Linux inference library](https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/05_inference_deployment/inference/build_and_install_lib_cn.html).
+
+Here we use [nv_jetson_cuda10_cudnn7.6_trt6(jetpack4.3)](https://paddle-inference-lib.bj.bcebos.com/2.0.0-nv-jetson-jetpack4.3-all/paddle_inference.tgz): `Paddle` version `2.0.0-rc0`, `CUDA` `10.0`, `CUDNN` `7.6`, `TensorRT` `6`.
+
+To build the `Paddle` library yourself on the `Jetson` platform, see the `NVIDIA Jetson embedded hardware inference library source build` section of [Build and install the Linux inference library](https://www.paddlepaddle.org.cn/documentation/docs/zh/advanced_guide/inference_deployment/inference/build_and_install_lib_cn.html).
+
+### Step1: Download the code
+
+ `git clone https://github.com/PaddlePaddle/PaddleDetection.git`
+
+**Note**: the `C++` inference code lives in `/root/projects/PaddleDetection/deploy/cpp`; that directory does not depend on any other directory under `PaddleDetection`.
+
+
+### Step2: Download the PaddlePaddle C++ inference library paddle_inference
+
+Extract the downloaded [nv_jetson_cuda10_cudnn7.6_trt6(jetpack4.3)](https://paddle-inference-lib.bj.bcebos.com/2.0.1-nv-jetson-jetpack4.3-all/paddle_inference.tgz) archive.
+
+After downloading and extracting, `/root/projects/paddle_inference` contains:
+```
+paddle_inference
+├── paddle # core paddle libraries and headers
+|
+├── third_party # third-party libraries and headers
+|
+└── version.txt # version and build info
+```
+
+**Note:** the pre-built `nv-jetson-cuda10-cudnn7.6-trt6` package is compiled with `GCC 7.5.0`; all other packages are compiled with `GCC 4.8.5`. Using a newer GCC may cause `ABI` compatibility issues; consider downgrading or [building the inference library yourself](https://www.paddlepaddle.org.cn/documentation/docs/zh/advanced_guide/inference_deployment/inference/build_and_install_lib_cn.html).
+
+
+### Step3: Build
+
+The `cmake` build commands live in `scripts/build.sh`; adjust the main parameters for your environment. The key parameters are described below.
+
+Note: on the `TX2` platform, `CUDA` and `CUDNN` must be installed via `JetPack`.
+
+```
+# whether to use the GPU (i.e. whether to use CUDA)
+WITH_GPU=ON
+
+# whether to use MKL or openblas; must be OFF on TX2
+WITH_MKL=OFF
+
+# whether to integrate TensorRT (only effective when WITH_GPU=ON)
+WITH_TENSORRT=ON
+
+# TensorRT include path
+TENSORRT_INC_DIR=/usr/include/aarch64-linux-gnu
+
+# TensorRT lib path
+TENSORRT_LIB_DIR=/usr/lib/aarch64-linux-gnu
+
+# Paddle inference library path
+PADDLE_DIR=/path/to/paddle_inference/
+
+# Paddle inference library name
+PADDLE_LIB_NAME=paddle_inference
+
+# whether to link the Paddle inference library statically
+# when using TensorRT, the Paddle inference library is usually dynamic
+WITH_STATIC_LIB=OFF
+
+# CUDA lib path
+CUDA_LIB=/usr/local/cuda-10.0/lib64
+
+# CUDNN lib path
+CUDNN_LIB=/usr/lib/aarch64-linux-gnu
+
+# whether to enable keypoint model inference
+WITH_KEYPOINT=ON
+
+# OPENCV_DIR path
+# on Linux, download https://bj.bcebos.com/paddleseg/deploy/opencv3.4.6gcc4.8ffmpeg.tar.gz2 and extract it into the deps folder
+# on TX2, download https://paddlemodels.bj.bcebos.com/TX2_JetPack4.3_opencv_3.4.10_gcc7.5.0.zip and extract it into the deps folder
+OPENCV_DIR=/path/to/opencv
+
+# please double-check all paths above
+
+# no changes needed below
+cmake .. \
+    -DWITH_GPU=${WITH_GPU} \
+    -DWITH_MKL=OFF \
+    -DWITH_TENSORRT=${WITH_TENSORRT} \
+    -DTENSORRT_INC_DIR=${TENSORRT_INC_DIR} \
+    -DTENSORRT_LIB_DIR=${TENSORRT_LIB_DIR} \
+    -DPADDLE_DIR=${PADDLE_DIR} \
+    -DWITH_STATIC_LIB=${WITH_STATIC_LIB} \
+    -DCUDA_LIB=${CUDA_LIB} \
+    -DCUDNN_LIB=${CUDNN_LIB} \
+    -DOPENCV_DIR=${OPENCV_DIR} \
+    -DPADDLE_LIB_NAME=${PADDLE_LIB_NAME} \
+    -DWITH_KEYPOINT=${WITH_KEYPOINT}
+make
+```
+
+An example configuration:
+```
+# whether to use the GPU (i.e. whether to use CUDA)
+WITH_GPU=ON
+
+# whether to use MKL or openblas
+WITH_MKL=OFF
+
+# whether to integrate TensorRT (only effective when WITH_GPU=ON)
+WITH_TENSORRT=OFF
+
+# TensorRT include path
+TENSORRT_INC_DIR=/usr/include/aarch64-linux-gnu
+
+# TensorRT lib path
+TENSORRT_LIB_DIR=/usr/lib/aarch64-linux-gnu
+
+# Paddle inference library path
+PADDLE_DIR=/home/nvidia/PaddleDetection_infer/paddle_inference/
+
+# Paddle inference library name
+PADDLE_LIB_NAME=paddle_inference
+
+# whether to link the Paddle inference library statically
+# when using TensorRT, the Paddle inference library is usually dynamic
+WITH_STATIC_LIB=OFF
+
+# CUDA lib path
+CUDA_LIB=/usr/local/cuda-10.0/lib64
+
+# CUDNN lib path
+CUDNN_LIB=/usr/lib/aarch64-linux-gnu/
+
+# whether to enable keypoint model inference
+WITH_KEYPOINT=ON
+```
+
+After setting the parameters, run the `build` script:
+ ```shell
+ sh ./scripts/build.sh
+ ```
+
+### Step4: Run inference and visualize
+After a successful build, the inference entry point is `build/main`. Its main command-line flags are:
+| Flag | Description |
+| ---- | ---- |
+| --model_dir | path to the exported detection model |
+| --model_dir_keypoint | (Optional) path to the exported keypoint model |
+| --image_file | path to the image to predict |
+| --image_dir | path to the folder of images to predict |
+| --video_file | path to the video to predict |
+| --camera_id | (Optional) camera ID used for prediction; default -1 (camera disabled) |
+| --device | device to run on, one of `CPU/GPU/XPU`; default `CPU` |
+| --gpu_id | GPU device id used for inference (default 0) |
+| --run_mode | with GPU, default paddle; one of (paddle/trt_fp32/trt_fp16/trt_int8) |
+| --batch_size | batch size for detection inference; effective when `image_dir` is set |
+| --batch_size_keypoint | batch size for keypoint inference; default 8 |
+| --run_benchmark | whether to run repeated predictions for benchmarking |
+| --output_dir | folder for output images; default output |
+| --use_mkldnn | whether to enable MKLDNN acceleration for CPU inference |
+| --cpu_threads | number of cpu threads; default 1 |
+| --use_dark | whether to apply DarkPose post-processing to keypoint outputs; default true |
+
+**Notes**:
+- Priority order: `camera_id` > `video_file` > `image_dir` > `image_file`.
+- If --run_benchmark is set to True, install the dependencies first: `pip install pynvml psutil GPUtil`.
+
+
+`Example 1`:
+```shell
+# predict the image `/root/projects/images/test.jpeg` without the GPU
+./main --model_dir=/root/projects/models/yolov3_darknet --image_file=/root/projects/images/test.jpeg
+```
+
+The `visualized prediction result` for an image is saved as `output.jpg` in the current directory.
+
+
+`Example 2`:
+```shell
+# predict the video `/root/projects/videos/test.mp4` with the GPU
+./main --model_dir=/root/projects/models/yolov3_darknet --video_file=/root/projects/videos/test.mp4 --device=GPU
+```
+Only `.mp4` videos are currently supported; the `visualized prediction result` is saved as `output.mp4` in the current directory.
+
+`Example 3`:
+```shell
+# joint prediction with the keypoint and detection models, using the GPU
+# people found by the detection model are fed to the keypoint model
+./main --model_dir=/root/projects/models/yolov3_darknet --model_dir_keypoint=/root/projects/models/hrnet_w32_256x192 --image_file=/root/projects/images/test.jpeg --device=GPU
+```
+
+## Benchmark
+For benchmarks, see [BENCHMARK_INFER](../../BENCHMARK_INFER.md)
diff --git a/PaddleDetection-release-2.6/deploy/cpp/docs/linux_build.md b/PaddleDetection-release-2.6/deploy/cpp/docs/linux_build.md
new file mode 100644
index 0000000000000000000000000000000000000000..ee28e73ee56db3ec46a1674a6af0cb3af1012b3e
--- /dev/null
+++ b/PaddleDetection-release-2.6/deploy/cpp/docs/linux_build.md
@@ -0,0 +1,149 @@
+# Linux Build Guide
+
+## Overview
+This document was tested on `Linux` with `GCC 8.2`. To build with another G++ version, the Paddle inference library must be rebuilt; see [Building the Paddle inference library from source](https://paddleinference.paddlepaddle.org.cn/user_guides/source_compile.html). The bundled opencv library was compiled with gcc8.2 on ubuntu 16.04; to build in an environment other than gcc8.2, compile opencv yourself.
+
+## Prerequisites
+* G++ 8.2
+* CUDA 9.0 / CUDA 10.1, cudnn 7+ (only needed with the GPU version of the inference library)
+* CMake 3.0+
+
+Make sure the software above is installed. **All examples below use `/root/projects/` as the working directory.**
+
+### Step1: Download the code
+
+ `git clone https://github.com/PaddlePaddle/PaddleDetection.git`
+
+**Note**: the `C++` inference code lives in `/root/projects/PaddleDetection/deploy/cpp`; that directory does not depend on any other directory under `PaddleDetection`.
+
+
+### Step2: Download the PaddlePaddle C++ inference library paddle_inference
+
+The PaddlePaddle C++ inference library ships pre-built packages for different `CPU` and `CUDA` versions; download the one matching your environment: [C++ inference library download list](https://paddleinference.paddlepaddle.org.cn/user_guides/download_lib.html)
+
+
+After downloading and extracting, `/root/projects/paddle_inference` contains:
+```
+paddle_inference
+├── paddle # core paddle libraries and headers
+|
+├── third_party # third-party libraries and headers
+|
+└── version.txt # version and build info
+```
+
+**Note:** except for `nv-jetson-cuda10-cudnn7.5-trt5`, all pre-built packages are compiled with `GCC 4.8.5`. Using a newer `GCC` may cause `ABI` compatibility issues; consider downgrading or [building the inference library yourself](https://www.paddlepaddle.org.cn/documentation/docs/zh/advanced_guide/inference_deployment/inference/build_and_install_lib_cn.html).
+
+
+### Step3: Build
+
+The `cmake` build commands live in `scripts/build.sh`; adjust the main parameters for your environment. The key parameters are:
+
+```
+# whether to use the GPU (i.e. whether to use CUDA)
+WITH_GPU=OFF
+
+# use MKL or openblas
+WITH_MKL=ON
+
+# whether to integrate TensorRT (only effective when WITH_GPU=ON)
+WITH_TENSORRT=OFF
+
+# TensorRT include path
+TENSORRT_INC_DIR=/path/to/TensorRT/include
+
+# TensorRT lib path
+TENSORRT_LIB_DIR=/path/to/TensorRT/lib
+
+# Paddle inference library path
+PADDLE_DIR=/path/to/paddle_inference
+
+# Paddle inference library name
+PADDLE_LIB_NAME=paddle_inference
+
+# CUDA lib path
+CUDA_LIB=/path/to/cuda/lib
+
+# CUDNN lib path
+CUDNN_LIB=/path/to/cudnn/lib
+
+# OpenCV path
+OPENCV_DIR=/path/to/opencv
+
+# whether to enable keypoint model inference
+WITH_KEYPOINT=ON
+
+# please double-check all paths above
+
+# no changes needed below
+cmake .. \
+    -DWITH_GPU=${WITH_GPU} \
+    -DWITH_MKL=${WITH_MKL} \
+    -DWITH_TENSORRT=${WITH_TENSORRT} \
+    -DTENSORRT_LIB_DIR=${TENSORRT_LIB_DIR} \
+    -DTENSORRT_INC_DIR=${TENSORRT_INC_DIR} \
+    -DPADDLE_DIR=${PADDLE_DIR} \
+    -DCUDA_LIB=${CUDA_LIB} \
+    -DCUDNN_LIB=${CUDNN_LIB} \
+    -DOPENCV_DIR=${OPENCV_DIR} \
+    -DPADDLE_LIB_NAME=${PADDLE_LIB_NAME} \
+    -DWITH_KEYPOINT=${WITH_KEYPOINT}
+make
+
+```
+
+After setting the parameters, run the `build` script:
+ ```shell
+ sh ./scripts/build.sh
+ ```
+
+**Note**: OpenCV depends on OpenBLAS; Ubuntu users should check whether `libopenblas.so` is present on the system. If not, install it with `apt-get install libopenblas-dev`.
+
+### Step4: Run inference and visualize
+After a successful build, the inference entry point is `build/main`. Its main command-line flags are:
+| Flag | Description |
+| ---- | ---- |
+| --model_dir | path to the exported detection model |
+| --model_dir_keypoint | (Optional) path to the exported keypoint model |
+| --image_file | path to the image to predict |
+| --image_dir | path to the folder of images to predict |
+| --video_file | path to the video to predict |
+| --camera_id | (Optional) camera ID used for prediction; default -1 (camera disabled) |
+| --device | device to run on, one of `CPU/GPU/XPU`; default `CPU` |
+| --gpu_id | GPU device id used for inference (default 0) |
+| --run_mode | with GPU, default paddle; one of (paddle/trt_fp32/trt_fp16/trt_int8) |
+| --batch_size | batch size for detection inference; effective when `image_dir` is set |
+| --batch_size_keypoint | batch size for keypoint inference; default 8 |
+| --run_benchmark | whether to run repeated predictions for benchmarking |
+| --output_dir | folder for output images; default output |
+| --use_mkldnn | whether to enable MKLDNN acceleration for CPU inference |
+| --cpu_threads | number of cpu threads; default 1 |
+| --use_dark | whether to apply DarkPose post-processing to keypoint outputs; default true |
+
+**Notes**:
+- Priority order: `camera_id` > `video_file` > `image_dir` > `image_file`.
+- If --run_benchmark is set to True, install the dependencies first: `pip install pynvml psutil GPUtil`.
+
+`Example 1`:
+```shell
+# predict the image `/root/projects/images/test.jpeg` without the GPU
+./build/main --model_dir=/root/projects/models/yolov3_darknet --image_file=/root/projects/images/test.jpeg
+```
+
+The `visualized prediction result` for an image is saved as `output.jpg` in the current directory.
+
+
+`Example 2`:
+```shell
+# predict the video `/root/projects/videos/test.mp4` with the GPU
+./build/main --model_dir=/root/projects/models/yolov3_darknet --video_file=/root/projects/videos/test.mp4 --device=GPU
+```
+Only `.mp4` videos are currently supported; the `visualized prediction result` is saved as `output.mp4` in the current directory.
+
+
+`Example 3`:
+```shell
+# joint prediction with the keypoint and detection models, using the GPU
+# people found by the detection model are fed to the keypoint model
+./build/main --model_dir=/root/projects/models/yolov3_darknet --model_dir_keypoint=/root/projects/models/hrnet_w32_256x192 --image_file=/root/projects/images/test.jpeg --device=GPU
+```
+
+## Benchmark
+For benchmarks, see [BENCHMARK_INFER](../../BENCHMARK_INFER.md)
diff --git a/PaddleDetection-release-2.6/deploy/cpp/docs/windows_vs2019_build.md b/PaddleDetection-release-2.6/deploy/cpp/docs/windows_vs2019_build.md
new file mode 100644
index 0000000000000000000000000000000000000000..1a23cabc7bf640ed548942012354013f500d6be2
--- /dev/null
+++ b/PaddleDetection-release-2.6/deploy/cpp/docs/windows_vs2019_build.md
@@ -0,0 +1,158 @@
+# Visual Studio 2019 Community CMake Build Guide
+
+On Windows we tested with `Visual Studio 2019 Community`. Microsoft has supported managing `CMake` cross-platform build projects directly since `Visual Studio 2017`, but stable and complete support only arrived in `2019`, so if you want to manage the build with CMake we recommend building under `Visual Studio 2019`.
+
+
+## Prerequisites
+* Visual Studio 2019 (choose according to the VS version used to build the Paddle inference library; see [Binary compatibility across Visual Studio versions](https://docs.microsoft.com/zh-cn/cpp/porting/binary-compat-2015-2017?view=vs-2019))
+* CUDA 9.0 / CUDA 10.0, cudnn 7+ / TensorRT (only needed with the GPU version of the inference library)
+* CMake 3.0+ [CMake download](https://cmake.org/download/)
+
+**Important: the TensorRT versions required by the Windows inference libraries are:**
+
+| Inference library version | TensorRT version |
+| ---- | ---- |
+| cuda10.1_cudnn7.6_avx_mkl_trt6 | TensorRT-6.0.1.5 |
+| cuda10.2_cudnn7.6_avx_mkl_trt7 | TensorRT-7.0.0.11 |
+| cuda11.0_cudnn8.0_avx_mkl_trt7 | TensorRT-7.2.1.6 |
+
+Make sure the software above is installed; we used the `VS2019` Community edition.
+
+**All examples below use `D:\projects` as the working directory.**
+
+### Step1: Download the code
+
+Download the source code:
+```shell
+git clone https://github.com/PaddlePaddle/PaddleDetection.git
+```
+
+**Note**: the `C++` inference code lives in `PaddleDetection/deploy/cpp`; that directory does not depend on any other directory under `PaddleDetection`.
+
+
+### Step2: Download the PaddlePaddle C++ inference library paddle_inference
+
+The PaddlePaddle C++ inference library ships pre-built packages for different `CPU` and `CUDA` versions; download the one matching your environment: [C++ inference library download list](https://paddleinference.paddlepaddle.org.cn/user_guides/download_lib.html#windows)
+
+After extracting, `D:\projects\paddle_inference` contains:
+```
+paddle_inference
+├── paddle # core paddle libraries and headers
+|
+├── third_party # third-party libraries and headers
+|
+└── version.txt # version and build info
+```
+
+### Step3: Install and configure OpenCV
+
+1. Download OpenCV 3.4.6 for Windows from the OpenCV site: [download link](https://sourceforge.net/projects/opencvlibrary/files/3.4.6/opencv-3.4.6-vc14_vc15.exe/download)
+2. Run the downloaded executable and extract OpenCV to a directory of your choice, e.g. `D:\projects\opencv`
+3. Configure the environment variable as follows (if you use absolute paths everywhere, this can be skipped):
+   - My Computer -> Properties -> Advanced system settings -> Environment Variables
+   - Find Path among the system variables (create it if missing) and double-click to edit
+   - Add the opencv path, e.g. `D:\projects\opencv\build\x64\vc14\bin`, and save
+
+### Step4: Build
+
+1. Enter the `cpp` folder:
+```
+cd D:\projects\PaddleDetection\deploy\cpp
+```
+
+2. Generate the project files with CMake
+
+The build parameters are described below (entries marked `*` are only needed with the **GPU version** of the inference library; keep the CUDA library versions aligned — **use CUDA 9.0 or 10.0, not 9.2, 10.1, etc.**):
+
+| Parameter | Meaning |
+| ---- | ---- |
+| *CUDA_LIB | CUDA library path |
+| *CUDNN_LIB | CUDNN library path |
+| OPENCV_DIR | OpenCV installation path |
+| PADDLE_DIR | Paddle inference library path |
+| PADDLE_LIB_NAME | Paddle inference library name |
+
+**Notes:**
+
+1. For a CPU-only environment, download the `CPU` inference library and untick `WITH_GPU`
+2. If you use the `openblas` version, untick `WITH_MKL`
+3. If you do not need the keypoint models, untick `WITH_KEYPOINT`
+4. On Windows, `PADDLE_LIB_NAME` must be set to `paddle_inference`
+
+Run the following command to generate the project files:
+```
+cmake . -G "Visual Studio 16 2019" -A x64 -T host=x64 -DWITH_GPU=ON -DWITH_MKL=ON -DCMAKE_BUILD_TYPE=Release -DCUDA_LIB=path_to_cuda_lib -DCUDNN_LIB=path_to_cudnn_lib -DPADDLE_DIR=path_to_paddle_lib -DPADDLE_LIB_NAME=paddle_inference -DOPENCV_DIR=path_to_opencv -DWITH_KEYPOINT=ON
+```
+
+For example:
+```
+cmake . -G "Visual Studio 16 2019" -A x64 -T host=x64 -DWITH_GPU=ON -DWITH_MKL=ON -DCMAKE_BUILD_TYPE=Release -DCUDA_LIB=D:\projects\packages\cuda10_0\lib\x64 -DCUDNN_LIB=D:\projects\packages\cuda10_0\lib\x64 -DPADDLE_DIR=D:\projects\packages\paddle_inference -DPADDLE_LIB_NAME=paddle_inference -DOPENCV_DIR=D:\projects\packages\opencv3_4_6 -DWITH_KEYPOINT=ON
+```
+
+
+
+3. Build
+Open `PaddleObjectDetector.sln` in the `cpp` folder with `Visual Studio 16 2019`, set the build mode to `Release`, and click `Build` -> `Build All`.
+
+
+### Step5: Run inference and visualize
+
+The executable produced by the `Visual Studio 2019` build above is under `out\build\x64-Release`; open `cmd` and switch to that directory:
+
+```
+cd D:\projects\PaddleDetection\deploy\cpp\out\build\x64-Release
+```
+The executable `main` is the sample inference program; its main command-line flags are:
+
+| Flag | Description |
+| ---- | ---- |
+| --model_dir | path to the exported detection model |
+| --model_dir_keypoint | (Optional) path to the exported keypoint model |
+| --image_file | path to the image to predict |
+| --image_dir | path to the folder of images to predict |
+| --video_file | path to the video to predict |
+| --camera_id | (Optional) camera ID used for prediction; default -1 (camera disabled) |
+| --device | device to run on, one of `CPU/GPU/XPU`; default `CPU` |
+| --gpu_id | GPU device id used for inference (default 0) |
+| --run_mode | with GPU, default paddle; one of (paddle/trt_fp32/trt_fp16/trt_int8) |
+| --batch_size | batch size for detection inference; effective when `image_dir` is set |
+| --batch_size_keypoint | batch size for keypoint inference; default 8 |
+| --run_benchmark | whether to run repeated predictions for benchmarking |
+| --output_dir | folder for output images; default output |
+| --use_mkldnn | whether to enable MKLDNN acceleration for CPU inference |
+| --cpu_threads | number of cpu threads; default 1 |
+| --use_dark | whether to apply DarkPose post-processing to keypoint outputs; default true |
+
+**Notes**:
+(1) Priority order: `camera_id` > `video_file` > `image_dir` > `image_file`.
+(2) If `opencv_world346.dll` cannot be found, copy `opencv_world346.dll` from `D:\projects\packages\opencv3_4_6\build\x64\vc14\bin` into the folder containing `main.exe`.
+(3) If --run_benchmark is set to True, install the dependencies first: `pip install pynvml psutil GPUtil`.
+
+
+`Example 1`:
+```shell
+# predict the image `D:\\images\\test.jpeg` without the GPU
+.\main --model_dir=D:\\models\\yolov3_darknet --image_file=D:\\images\\test.jpeg
+```
+
+The `visualized prediction result` for an image is saved as `output.jpg` in the current directory.
+
+
+`Example 2`:
+```shell
+# predict the video `D:\\videos\\test.mp4` with the GPU
+.\main --model_dir=D:\\models\\yolov3_darknet --video_file=D:\\videos\\test.mp4 --device=GPU
+```
+
+Only `.mp4` videos are currently supported; the `visualized prediction result` is saved as `output.mp4` in the current directory.
+
+
+`Example 3`:
+```shell
+# joint prediction with the keypoint and detection models, using the GPU
+# people found by the detection model are fed to the keypoint model
+.\main --model_dir=D:\\models\\yolov3_darknet --model_dir_keypoint=D:\\models\\hrnet_w32_256x192 --image_file=D:\\images\\test.jpeg --device=GPU
+```
+
+## Benchmark
+For benchmarks, see [BENCHMARK_INFER](../../BENCHMARK_INFER.md)
diff --git a/PaddleDetection-release-2.6/deploy/cpp/include/config_parser.h b/PaddleDetection-release-2.6/deploy/cpp/include/config_parser.h
new file mode 100644
index 0000000000000000000000000000000000000000..1f2e381c5284bb7ce16a6b06f858a32e83290f98
--- /dev/null
+++ b/PaddleDetection-release-2.6/deploy/cpp/include/config_parser.h
@@ -0,0 +1,142 @@
+// Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+#pragma once
+
+#include <iostream>
+#include <map>