# Preparing Video Retrieval Datasets

## Introduction

<!-- [DATASET] -->

```BibTeX
@inproceedings{xu2016msr,
  title={Msr-vtt: A large video description dataset for bridging video and language},
  author={Xu, Jun and Mei, Tao and Yao, Ting and Rui, Yong},
  booktitle={CVPR},
  pages={5288--5296},
  year={2016}
}
```

```BibTeX
@inproceedings{chen2011collecting,
  title={Collecting highly parallel data for paraphrase evaluation},
  author={Chen, David and Dolan, William B},
  booktitle={ACL},
  pages={190--200},
  year={2011}
}
```

Before you start, make sure your current working directory is `$MMACTION2/tools/data/video_retrieval/`.
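As a small guard against running the scripts from the wrong place, the check can be sketched as a shell function. The name `in_video_retrieval_dir` is hypothetical and not part of MMAction2; it only inspects the path string and does not touch the filesystem.

```shell
# Hypothetical helper (not part of MMAction2): succeeds only if the given path,
# or the current directory when no argument is passed, ends with
# tools/data/video_retrieval.
in_video_retrieval_dir() {
    case "${1:-$PWD}" in
        */tools/data/video_retrieval|*/tools/data/video_retrieval/) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `cd "$MMACTION2/tools/data/video_retrieval/" && in_video_retrieval_dir && echo "ready"` prints `ready` only when the change of directory landed in the expected place.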

## Preparing MSRVTT dataset

For basic dataset information, you can refer to the MSRVTT dataset [website](https://www.microsoft.com/en-us/research/publication/msr-vtt-a-large-video-description-dataset-for-bridging-video-and-language/). Run the following command to prepare the MSRVTT dataset:

```shell
bash prepare_msrvtt.sh
```

After preparation, the folder structure will look like:

```
mmaction2
├── mmaction
├── tools
├── configs
├── data
│   ├── video_retrieval
│   │   └── msrvtt
│   │       ├── train_9k.json
│   │       ├── train_7k.json
│   │       ├── test_JSFUSION.json
│   │       └── videos
│   │           ├── video0.mp4
│   │           ├── video1.mp4
│   │           ├── ...
│   │           └── video9999.mp4
```
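To confirm that preparation produced the layout above, a quick sanity check can be sketched in shell. The function name `check_msrvtt` is hypothetical and not part of MMAction2; it only tests that the three annotation files and the `videos` folder exist and reports how many `.mp4` files it finds. It assumes a `find` that supports `-maxdepth` (GNU and BSD versions do).

```shell
# Hypothetical helper (not part of MMAction2): verify a prepared MSRVTT folder.
# Usage: check_msrvtt <path-to-msrvtt-dir>
check_msrvtt() {
    msrvtt_root="$1"
    msrvtt_status=0
    # Annotation files listed in the folder structure above.
    for f in train_9k.json train_7k.json test_JSFUSION.json; do
        if [ ! -f "$msrvtt_root/$f" ]; then
            echo "missing: $msrvtt_root/$f" >&2
            msrvtt_status=1
        fi
    done
    if [ ! -d "$msrvtt_root/videos" ]; then
        echo "missing: $msrvtt_root/videos" >&2
        return 1
    fi
    # Count clips directly under videos/; the tree above runs from
    # video0.mp4 to video9999.mp4.
    mp4_count=$(find "$msrvtt_root/videos" -maxdepth 1 -type f -name '*.mp4' | wc -l | tr -d ' ')
    echo "found $mp4_count mp4 files"
    return "$msrvtt_status"
}
```

Calling `check_msrvtt "$MMACTION2/data/video_retrieval/msrvtt"` prints the clip count and returns a non-zero status if any annotation file or the `videos` folder is missing.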

## Preparing MSVD dataset

For basic dataset information, you can refer to the MSVD dataset [website](https://www.cs.utexas.edu/users/ml/clamp/videoDescription/). Run the following command to prepare the MSVD dataset:

```shell
bash prepare_msvd.sh
```

After preparation, the folder structure will look like:

```
mmaction2
├── mmaction
├── tools
├── configs
├── data
│   ├── video_retrieval
│   │   └── msvd
│   │       ├── train.json
│   │       ├── test.json
│   │       ├── val.json
│   │       └── videos
│   │           ├── xxx.avi
│   │           ├── xxx.avi
│   │           ├── ...
│   │           └── xxx.avi
```
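A per-split summary of the prepared MSVD folder can be sketched in shell as follows. The function name `summarize_msvd` is hypothetical and not part of MMAction2; it reports whether each of the three split files from the tree above is present and counts the `.avi` clips, without inspecting the JSON contents. It assumes a `find` that supports `-maxdepth`.

```shell
# Hypothetical helper (not part of MMAction2): summarize a prepared MSVD folder.
# Usage: summarize_msvd <path-to-msvd-dir>
summarize_msvd() {
    msvd_root="$1"
    msvd_missing=0
    # One annotation file per split, as listed in the folder structure above.
    for split in train test val; do
        if [ -f "$msvd_root/$split.json" ]; then
            echo "$split.json: ok"
        else
            echo "$split.json: missing"
            msvd_missing=1
        fi
    done
    # Count the .avi clips directly under videos/ (0 if the folder is absent).
    if [ -d "$msvd_root/videos" ]; then
        avi_count=$(find "$msvd_root/videos" -maxdepth 1 -type f -name '*.avi' | wc -l | tr -d ' ')
    else
        avi_count=0
        msvd_missing=1
    fi
    echo "videos: $avi_count avi files"
    return "$msvd_missing"
}
```

Pass the dataset directory shown in the tree above as the only argument. The function returns a non-zero status when a split file or the `videos` folder is missing, so it can gate later training commands with `&&`.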