# Preparing HVU
## Introduction
<!-- [DATASET] -->
```BibTeX
@article{Diba2019LargeSH,
title={Large Scale Holistic Video Understanding},
author={Ali Diba and M. Fayyaz and Vivek Sharma and Manohar Paluri and Jurgen Gall and R. Stiefelhagen and L. Gool},
journal={arXiv: Computer Vision and Pattern Recognition},
year={2019}
}
```
For basic dataset information, please refer to the official [project](https://github.com/holistic-video-understanding/HVU-Dataset/) and the [paper](https://arxiv.org/abs/1904.11451).
Before we start, please make sure that the directory is located at `$MMACTION2/tools/data/hvu/`.
## Step 1. Prepare Annotations
First of all, you can run the following script to prepare annotations.
```shell
bash download_annotations.sh
```
In addition, run the following command to parse the tag list of HVU.
```shell
python parse_tag_list.py
```
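The tag list maps each HVU tag to one of the six tag categories. As a minimal sketch of what such a parse involves, assuming a hypothetical `tag,category` CSV layout (the actual format consumed by `parse_tag_list.py` may differ):

```python
import csv
from collections import defaultdict

def parse_tag_list(path):
    """Group tags by category from a hypothetical `tag,category` CSV.

    Illustrative only: the real HVU tag list may use a different layout.
    """
    tags_by_category = defaultdict(list)
    with open(path, newline='') as f:
        for tag, category in csv.reader(f):
            tags_by_category[category].append(tag)
    return dict(tags_by_category)
```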
## Step 2. Prepare Videos
Then, you can run the following script to prepare videos.
The code is adapted from the [official crawler](https://github.com/activitynet/ActivityNet/tree/master/Crawler/Kinetics). Note that this might take a long time.
```shell
bash download_videos.sh
```
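Downloaded clips follow the Kinetics-crawler naming convention visible in the folder structure below, e.g. `OLpWTpTC4P8_000570_000670.mp4`: a YouTube id followed by zero-padded start and end times. A hedged helper to split such a name (assuming that convention holds; the time unit is whatever the crawler used and is not verified here):

```python
def parse_clip_name(filename):
    """Split a crawler-style clip name into (youtube_id, start, end).

    Assumes the Kinetics convention `<youtube_id>_<start>_<end>.<ext>`.
    rsplit is used because YouTube ids may themselves contain underscores.
    """
    stem = filename.rsplit('.', 1)[0]
    youtube_id, start, end = stem.rsplit('_', 2)
    return youtube_id, int(start), int(end)
```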
## Step 3. Extract RGB and Flow
This part is **optional** if you only want to use the video loader.
Before extracting, please refer to [install.md](/docs/en/get_started/installation.md) for installing [denseflow](https://github.com/open-mmlab/denseflow).
You can use the following script to extract both RGB and Flow frames.
```shell
bash extract_frames.sh
```
By default, we generate frames with the short edge resized to 256.
More details can be found in [prepare_dataset](/docs/en/user_guides/prepare_dataset.md).
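For reference, resizing so that the short edge becomes 256 keeps the aspect ratio; the resulting frame size can be computed as in this small sketch (`short_edge_resize` is an illustrative helper, not part of the toolkit):

```python
def short_edge_resize(width, height, short_edge=256):
    """Scale (width, height) so the shorter side equals `short_edge`,
    preserving the aspect ratio (rounded to the nearest pixel)."""
    scale = short_edge / min(width, height)
    return round(width * scale), round(height * scale)
```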
## Step 4. Generate File List
You can run the following scripts to generate file lists in video and rawframe formats, respectively.
```shell
bash generate_videos_filelist.sh
# execute the command below when rawframes are ready
bash generate_rawframes_filelist.sh
```
## Step 5. Generate File List for Each Individual Tag Category
This part is **optional** if you don't want to train models on HVU for a specific tag category.
The file lists generated in Step 4 contain labels from all tag categories. They can only be
handled with `HVUDataset` and are intended for multi-task learning across categories. The component
`LoadHVULabel` is needed to load the multi-category tags, and `HVULoss` should be used to train
the model.
If you only want to train a video recognition model for a specific tag category, e.g. a model
on HVU that only handles tags in the `action` category, we recommend using
the following command to generate a file list for that category. The new list, which only
contains tags of that category, can be handled with `VideoDataset` or `RawframeDataset`, and the
recognition model can be trained with `BCELossWithLogits`.
The following command generates a file list for the tag category `${category}`. Note that the tag category you
specify must be one of the 6 tag categories available in HVU: \['action', 'attribute', 'concept', 'event',
'object', 'scene'\].
```shell
python generate_sub_file_list.py path/to/filelist.json ${category}
```
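Conceptually, the script keeps only the tags of the chosen category and drops samples that have none. A hedged sketch of that filtering step, assuming each annotation maps category names to tag lists (a hypothetical schema; the real JSON layout produced in Step 4 may differ):

```python
import json

def generate_sub_file_list(filelist_path, category):
    """Filter a multi-category HVU file list down to one tag category.

    Assumes each entry looks like
    {"filename": ..., "label": {"action": [...], "object": [...], ...}}
    (hypothetical schema). The output name swaps `hvu` for `hvu_<category>`,
    e.g. hvu_train.json -> hvu_action_train.json.
    """
    with open(filelist_path) as f:
        annotations = json.load(f)
    # Keep only samples that carry tags of the requested category,
    # and replace the multi-category label dict with that category's tags.
    sub_list = [
        dict(anno, label=anno['label'][category])
        for anno in annotations if anno['label'].get(category)
    ]
    out_path = filelist_path.replace('hvu', f'hvu_{category}')
    with open(out_path, 'w') as f:
        json.dump(sub_list, f)
    return out_path
```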
The filename of the generated file list for `${category}` is obtained by replacing `hvu` in the original
filename with `hvu_${category}`. For example, if the original filename is `hvu_train.json`, the file
list for `action` is named `hvu_action_train.json`.
## Step 6. Folder Structure
After the whole data pipeline for HVU preparation,
you will have the rawframes (RGB + Flow), videos and annotation files for HVU.
In the context of the whole project (for HVU only), the full folder structure will look like:
```
mmaction2
βββ mmaction
βββ tools
βββ configs
βββ data
β βββ hvu
β β βββ hvu_train_video.json
β β βββ hvu_val_video.json
β β βββ hvu_train.json
β β βββ hvu_val.json
β β βββ annotations
β β βββ videos_train
β β β βββ OLpWTpTC4P8_000570_000670.mp4
β β β βββ xsPKW4tZZBc_002330_002430.mp4
β β β βββ ...
β β βββ videos_val
β β βββ rawframes_train
β β βββ rawframes_val
```
For training and evaluating on HVU, please refer to [Training and Test Tutorial](/docs/en/user_guides/train_test.md).