# Preparing HVU

## Introduction

<!-- [DATASET] -->

```BibTeX
@article{Diba2019LargeSH,
  title={Large Scale Holistic Video Understanding},
  author={Ali Diba and M. Fayyaz and Vivek Sharma and Manohar Paluri and Jurgen Gall and R. Stiefelhagen and L. Gool},
  journal={arXiv: Computer Vision and Pattern Recognition},
  year={2019}
}
```

For basic dataset information, please refer to the official [project](https://github.com/holistic-video-understanding/HVU-Dataset/) and the [paper](https://arxiv.org/abs/1904.11451).

Before we start, please make sure that the directory is located at `$MMACTION2/tools/data/hvu/`.
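
For instance, assuming `$MMACTION2` is set to the root of your mmaction2 checkout, you can switch to that directory with:

```shell
# Assumption: $MMACTION2 points to the root of the mmaction2 repository.
cd $MMACTION2/tools/data/hvu/
```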
## Step 1. Prepare Annotations

First of all, you can run the following script to prepare annotations.

```shell
bash download_annotations.sh
```

In addition, you need to run the following command to parse the tag list of HVU.

```shell
python parse_tag_list.py
```
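
As a quick sanity check, you can list the downloaded annotation files; the `data/hvu/annotations` path is taken from the folder structure in Step 6, and the relative path below assumes you are still in `tools/data/hvu/`:

```shell
# List the downloaded annotations (path relative to tools/data/hvu/).
ls ../../../data/hvu/annotations
```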
## Step 2. Prepare Videos

Then, you can run the following script to prepare videos.
The code is adapted from the [official crawler](https://github.com/activitynet/ActivityNet/tree/master/Crawler/Kinetics). Note that this might take a long time.

```shell
bash download_videos.sh
```
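
Since some YouTube videos become unavailable over time, not every clip will download successfully. A simple way to gauge coverage (the `videos_train`/`videos_val` paths follow the Step 6 folder structure):

```shell
# Count downloaded clips per split (paths relative to tools/data/hvu/).
ls ../../../data/hvu/videos_train | wc -l
ls ../../../data/hvu/videos_val | wc -l
```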
## Step 3. Extract RGB and Flow

This part is **optional** if you only want to use the video loader.

Before extracting, please refer to [install.md](/docs/en/get_started/installation.md) for installing [denseflow](https://github.com/open-mmlab/denseflow).

You can use the following script to extract both RGB and Flow frames.

```shell
bash extract_frames.sh
```
By default, we generate frames with the short edge resized to 256.
More details can be found in [prepare_dataset](/docs/en/user_guides/prepare_dataset.md).
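
Once extraction finishes, you can spot-check a single clip. The `rawframes_train` path follows the Step 6 folder structure; the per-clip subdirectory name below is an assumption, mirroring the video filename without its extension:

```shell
# Inspect the frames extracted for one clip (hypothetical directory name,
# derived from the example video filename in Step 6).
ls ../../../data/hvu/rawframes_train/OLpWTpTC4P8_000570_000670 | head
```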
## Step 4. Generate File List

You can run the following scripts to generate file lists in the video and rawframe formats, respectively.

```shell
bash generate_videos_filelist.sh
# execute the command below when rawframes are ready
bash generate_rawframes_filelist.sh
```
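
If generation succeeds, the JSON file lists named in the Step 6 folder structure should now be present under `data/hvu/`:

```shell
# Verify the generated file lists (path relative to tools/data/hvu/).
ls ../../../data/hvu/hvu_*.json
```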
## Step 5. Generate File List for Each Individual Tag Category

This part is **optional** if you don't want to train models on HVU for a specific tag category.

The file lists generated in Step 4 contain labels of different categories. They can only be handled with `HVUDataset` and used for multi-task learning across tag categories. The component `LoadHVULabel` is needed to load the multi-category tags, and `HVULoss` should be used to train the model.

If you only want to train video recognition models for a specific tag category, e.g. a recognition model on HVU that only handles tags in the category `action`, we recommend using the following command to generate a file list for that tag category. The new list, which only contains tags of a single category, can be handled with `VideoDataset` or `RawframeDataset`, and the recognition models can be trained with `BCELossWithLogits`.

The following command generates the file list for the tag category ${category}. Note that the tag category you specify must be one of the 6 tag categories available in HVU: \['action', 'attribute', 'concept', 'event', 'object', 'scene'\].
```shell
python generate_sub_file_list.py path/to/filelist.json ${category}
```
The filename of the generated file list for ${category} is derived by replacing `hvu` in the original filename with `hvu_${category}`. For example, if the original filename is `hvu_train.json`, the filename of the file list for `action` is `hvu_action_train.json`.
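
For instance, a short loop can generate sub file lists for all 6 tag categories in one go (assuming the training list sits at `data/hvu/hvu_train.json`, as in the folder structure below, and you run it from `tools/data/hvu/`):

```shell
# Produce hvu_action_train.json, hvu_attribute_train.json, etc.
for category in action attribute concept event object scene; do
    python generate_sub_file_list.py ../../../data/hvu/hvu_train.json ${category}
done
```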
## Step 6. Folder Structure

After going through the whole data pipeline for HVU preparation, you can get the rawframes (RGB + Flow), videos, and annotation files for HVU.

In the context of the whole project (for HVU only), the full folder structure will look like:
```
mmaction2
├── mmaction
├── tools
├── configs
├── data
│   ├── hvu
│   │   ├── hvu_train_video.json
│   │   ├── hvu_val_video.json
│   │   ├── hvu_train.json
│   │   ├── hvu_val.json
│   │   ├── annotations
│   │   ├── videos_train
│   │   │   ├── OLpWTpTC4P8_000570_000670.mp4
│   │   │   ├── xsPKW4tZZBc_002330_002430.mp4
│   │   │   ├── ...
│   │   ├── videos_val
│   │   ├── rawframes_train
│   │   ├── rawframes_val
```
For training and evaluating on HVU, please refer to [Training and Test Tutorial](/docs/en/user_guides/train_test.md).