Spaces:
Runtime error
Runtime error
| ## Reparameterize YOLO-World | |
| The reparameterization incorporates text embeddings as parameters into the model. For example, in the final classification layer, text embeddings are reparameterized into a simple 1x1 convolutional layer. | |
| <div align="center"> | |
| <img width="600" src="../assets/reparameterize.png"> | |
| </div> | |
| ### Key Advantages from Reparameterization | |
| > Reparameterized YOLO-World still has zero-shot ability! | |
| * **Efficiency:** reparameterized YOLO-World has a simple and efficient archtecture, e.g., `conv1x1` is faster than `transpose & matmul`. In addition, it enables further optmization for deployment. | |
| * **Accuracy:** reparameterized YOLO-World supports fine-tuning. Compared to the normal `fine-tuning` or `prompt tuning`, **reparameterized version can optimize the `neck` and `head` independently** since the `neck` and `head` have different parameters and do not depend on `text embeddings` anymore! | |
| For example, fine-tuning the **reparameterized YOLO-World** obtains *46.3 AP* on COCO *val2017* while fine-tuning the normal version obtains *46.1 AP*, with all hyper-parameters kept the same. | |
| ### Getting Started | |
| #### 1. Prepare cutstom text embeddings | |
| You need to generate the text embeddings by [`toos/generate_text_prompts.py`](../tools/generate_text_prompts.py) and save it as a `numpy.array` with shape `NxD`. | |
| #### 2. Reparameterizing | |
| Reparameterizing will generate a new checkpoint with text embeddings! | |
| Check those files first: | |
| * model checkpoint | |
| * text embeddings | |
| We mainly reparameterize two groups of modules: | |
| * head (`YOLOWorldHeadModule`) | |
| * neck (`MaxSigmoidCSPLayerWithTwoConv`) | |
| ```bash | |
| python tools/reparameterize_yoloworld.py \ | |
| --model path/to/checkpoint \ | |
| --out-dir path/to/save/re-parameterized/ \ | |
| --text-embed path/to/text/embeddings \ | |
| --conv-neck | |
| ``` | |
| #### 3. Prepare the model config | |
| Please see the sample config: [`finetune_coco/yolo_world_v2_s_rep_vlpan_bn_2e-4_80e_8gpus_mask-refine_finetune_coco.py`](../configs/finetune_coco/yolo_world_v2_s_rep_vlpan_bn_2e-4_80e_8gpus_mask-refine_finetune_coco.py) for reparameterized training. | |
| * `RepConvMaxSigmoidCSPLayerWithTwoConv`: | |
| ```python | |
| neck=dict(type='YOLOWorldPAFPN', | |
| guide_channels=num_classes, | |
| embed_channels=neck_embed_channels, | |
| num_heads=neck_num_heads, | |
| block_cfg=dict(type='RepConvMaxSigmoidCSPLayerWithTwoConv', | |
| guide_channels=num_classes)), | |
| ``` | |
| * `RepYOLOWorldHeadModule`: | |
| ```python | |
| bbox_head=dict(head_module=dict(type='RepYOLOWorldHeadModule', | |
| embed_dims=text_channels, | |
| num_guide=num_classes, | |
| num_classes=num_classes)), | |
| ``` | |
| #### 4. Reparameterized Training | |
| **Reparameterized YOLO-World** is easier to fine-tune and can be treated as an enhanced and pre-trained YOLOv8! | |
| You can check [`finetune_coco/yolo_world_v2_s_rep_vlpan_bn_2e-4_80e_8gpus_mask-refine_finetune_coco.py`](../configs/finetune_coco/yolo_world_v2_s_rep_vlpan_bn_2e-4_80e_8gpus_mask-refine_finetune_coco.py) for more details. |