# Preparing HVU

## Introduction

<!-- [DATASET] -->

```BibTeX
@article{Diba2019LargeSH,
  title={Large Scale Holistic Video Understanding},
  author={Ali Diba and M. Fayyaz and Vivek Sharma and Manohar Paluri and Jurgen Gall and R. Stiefelhagen and L. Gool},
  journal={arXiv: Computer Vision and Pattern Recognition},
  year={2019}
}
```

For basic dataset information, please refer to the official [project](https://github.com/holistic-video-understanding/HVU-Dataset/) and the [paper](https://arxiv.org/abs/1904.11451).
Before we start, please make sure that the directory is located at `$MMACTION2/tools/data/hvu/`.

## Step 1. Prepare Annotations

First of all, you can run the following script to prepare annotations.

```shell
bash download_annotations.sh
```

In addition, run the following command to parse the tag list of HVU.

```shell
python parse_tag_list.py
```
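The parsed tag list groups HVU tags by category. As a quick sanity check after parsing, you could count the tags in each category; the snippet below is a minimal sketch over a made-up in-memory structure (the real output file and its schema come from `parse_tag_list.py`):

```python
# Hypothetical example of a parsed tag list: a mapping from each HVU tag
# category to its list of tags. The tag names here are illustrative only.
tag_list = {
    "action": ["running", "swimming"],
    "scene": ["beach"],
}

def count_tags(tags):
    """Return the number of tags per category and the overall total."""
    per_category = {cat: len(names) for cat, names in tags.items()}
    return per_category, sum(per_category.values())

per_category, total = count_tags(tag_list)
print(per_category, total)  # {'action': 2, 'scene': 1} 3
```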

## Step 2. Prepare Videos

Then, you can run the following script to prepare videos.
The code is adapted from the [official crawler](https://github.com/activitynet/ActivityNet/tree/master/Crawler/Kinetics). Note that this might take a long time.

```shell
bash download_videos.sh
```
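The downloaded clips follow the Kinetics-style naming visible in the folder structure in Step 6: `<youtube_id>_<start>_<end>.mp4`, with the start and end times (in seconds) zero-padded to six digits. A small illustrative helper for that convention (an assumption based on the listed filenames, not part of the crawler):

```python
def clip_filename(youtube_id: str, start: int, end: int) -> str:
    """Build a trimmed-clip filename: <id>_<start>_<end>.mp4,
    with times in seconds zero-padded to 6 digits."""
    return f"{youtube_id}_{start:06d}_{end:06d}.mp4"

print(clip_filename("OLpWTpTC4P8", 570, 670))  # OLpWTpTC4P8_000570_000670.mp4
```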

## Step 3. Extract RGB and Flow

This part is **optional** if you only want to use the video loader.

Before extracting, please refer to [install.md](/docs/en/get_started/installation.md) for installing [denseflow](https://github.com/open-mmlab/denseflow).

You can use the following script to extract both RGB and Flow frames.

```shell
bash extract_frames.sh
```

By default, we generate frames with short edge resized to 256.
More details can be found in [prepare_dataset](/docs/en/user_guides/prepare_dataset.md).
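Resizing so that the short edge becomes 256 preserves the aspect ratio: both sides are scaled by 256 divided by the smaller dimension. A minimal sketch of that computation (illustrative only, not denseflow's code):

```python
def short_edge_resize(width: int, height: int, short_edge: int = 256):
    """Scale (width, height) so the shorter side equals `short_edge`,
    preserving aspect ratio and rounding to whole pixels."""
    scale = short_edge / min(width, height)
    return round(width * scale), round(height * scale)

print(short_edge_resize(1280, 720))  # (455, 256)
```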

## Step 4. Generate File List

You can run the following scripts to generate file lists in the video and rawframe formats, respectively.

```shell
bash generate_videos_filelist.sh
# execute the command below when rawframes are ready
bash generate_rawframes_filelist.sh
```

## Step 5. Generate File List for Each Individual Tag Category

This part is **optional** if you don't want to train models on HVU for a specific tag category.

The file lists generated in step 4 contain labels of different categories. They can only be
handled by `HVUDataset` and used for multi-task learning across tag categories. The component
`LoadHVULabel` is needed to load the multi-category tags, and `HVULoss` should be used to train
the model.

If you only want to train video recognition models for a specific tag category, e.g. a
recognition model on HVU that only handles tags in the category `action`, we recommend using
the following command to generate file lists for that tag category. The new list, which only
contains tags of the specified category, can be handled with `VideoDataset` or `RawframeDataset`. The
recognition models can then be trained with `BCELossWithLogits`.
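`BCELossWithLogits` treats this as multi-label classification: each tag gets an independent binary cross-entropy on the sigmoid of its logit. A plain-Python sketch of that computation (illustrative, not the MMAction2 implementation):

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over all tag positions.

    `targets` is a multi-hot 0/1 vector marking which tags apply."""
    total = 0.0
    for z, t in zip(logits, targets):
        p = sigmoid(z)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(logits)

# A logit of 0 gives probability 0.5, so each term is ln(2) ~ 0.6931.
print(round(bce_with_logits([0.0, 0.0], [1, 0]), 4))  # 0.6931
```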

The following command generates a file list for the tag category `${category}`. Note that the tag category you
specify should be one of the 6 tag categories available in HVU: \['action', 'attribute', 'concept', 'event',
'object', 'scene'\].

```shell
python generate_sub_file_list.py path/to/filelist.json ${category}
```

The filename of the generated file list for ${category} is generated by replacing `hvu` in the original
filename with `hvu_${category}`. For example, if the original filename is `hvu_train.json`, the filename
of the file list for action is `hvu_action_train.json`.
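Conceptually, the script keeps only the chosen category's tags from each annotation and derives the output name by the replacement rule above. A minimal sketch of both ideas, assuming a list of dicts with per-category tag fields (the real schema and logic are defined by `generate_sub_file_list.py`):

```python
def sub_file_list(annotations, category):
    """Keep only entries that have tags for `category`, reduced to that field.
    The annotation structure used here is a simplified assumption."""
    subset = []
    for ann in annotations:
        if ann.get(category):
            subset.append({"filename": ann["filename"], category: ann[category]})
    return subset

def sub_list_name(filename, category):
    """Derive the output filename, e.g. hvu_train.json -> hvu_action_train.json."""
    return filename.replace("hvu", f"hvu_{category}", 1)

anns = [
    {"filename": "a.mp4", "action": ["running"], "scene": ["beach"]},
    {"filename": "b.mp4", "scene": ["street"]},
]
print(sub_file_list(anns, "action"))  # [{'filename': 'a.mp4', 'action': ['running']}]
print(sub_list_name("hvu_train.json", "action"))  # hvu_action_train.json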

## Step 6. Folder Structure

After finishing the whole data pipeline for HVU preparation,
you will get the rawframes (RGB + Flow), videos, and annotation files for HVU.

In the context of the whole project (for HVU only), the full folder structure will look like:

```
mmaction2
β”œβ”€β”€ mmaction
β”œβ”€β”€ tools
β”œβ”€β”€ configs
β”œβ”€β”€ data
β”‚   β”œβ”€β”€ hvu
β”‚   β”‚   β”œβ”€β”€ hvu_train_video.json
β”‚   β”‚   β”œβ”€β”€ hvu_val_video.json
β”‚   β”‚   β”œβ”€β”€ hvu_train.json
β”‚   β”‚   β”œβ”€β”€ hvu_val.json
β”‚   β”‚   β”œβ”€β”€ annotations
β”‚   β”‚   β”œβ”€β”€ videos_train
β”‚   β”‚   β”‚   β”œβ”€β”€ OLpWTpTC4P8_000570_000670.mp4
β”‚   β”‚   β”‚   β”œβ”€β”€ xsPKW4tZZBc_002330_002430.mp4
β”‚   β”‚   β”‚   β”œβ”€β”€ ...
β”‚   β”‚   β”œβ”€β”€ videos_val
β”‚   β”‚   β”œβ”€β”€ rawframes_train
β”‚   β”‚   β”œβ”€β”€ rawframes_val
```

For training and evaluating on HVU, please refer to [Training and Test Tutorial](/docs/en/user_guides/train_test.md).