File size: 6,767 Bytes
663494c
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
# Dataset splits vs. filtered training / test splits

The NAVSIM framework utilizes several dataset splits for standardized training and evaluating agents. 
All of them use the OpenScene dataset that is divided into the dataset splits `mini`,`trainval`,`test`,`private_test_e2e`, which can all be downloaded separately.

It is possible to run trainings and evaluations directly on these sets (see `Standard` in table below). 
Alternatively, you can run trainings and evaluations on training and validation splits that were filtered for challenging scenarios (see `NAVSIM` in table below), which is the recommended option for producing comparable and competitive results efficiently.
In contrast to the dataset splits which refer to a downloadable set of logs, the training / test splits are implemented as scene filters, which define how scenes are extracted from these logs.

The NAVSIM training / test splits subsample the OpenScene dataset splits.
Moreover, the NAVSIM splits include overlapping scenes, while the Standard splits are non-overlapping.
Specifically, `navtrain` is based on the `trainval` data and `navtest` on the `test` data.

As the `trainval` sensor data is very large, we provide a separate download link, which loads only the frames needed for `navtrain`. 
This eases access for users that only want to run the `navtrain` split and not the `trainval` split. If you already downloaded the full `trainval` sensor data, it is **not necessary** to download the `navtrain` frames as well.
The logs are always the complete dataset split.

## Overview
The Table belows offers an overview on the training and test splits supported by NAVSIM. 
In Navsim-v1.1, the training/test split can bet set with a single config parameter given in the table.

<table border="0">
    <tr>
        <th></th>
        <th>Name</th>
        <th>Description</th>
        <th>Logs</th>
        <th>Sensors</th>
        <th>Config parameters</th>
    </tr>
    <tr>
        <td rowspan="3">Standard</td>
        <td>trainval</td>
        <td>Large split for training and validating agents with regular driving recordings. Corresponds to nuPlan and downsampled to 2HZ.</td>
        <td>14GB</td>
        <td>&gt;2000GB</td>
        <td>
        train_test_split=trainval
        </td>
    </tr>
    <tr>
        <td>test</td>
        <td>Small split for testing agents with regular driving recordings. Corresponds to nuPlan and downsampled to 2HZ.</td>
        <td>1GB</td>
        <td>217GB</td>
        <td>
        train_test_split=test
        </td>
    </tr>
    <tr>
        <td>mini</td>
        <td>Demo split for with regular driving recordings. Corresponds to nuPlan and downsampled to 2HZ.</td>
        <td>1GB</td>
        <td>151GB</td>
        <td>
        train_test_split=mini
        </td>
    </tr>
    <tr>
        <td rowspan="2">NAVSIM</td>
        <td>navtrain</td>
        <td>Standard split for training agents in NAVSIM with non-trivial driving scenes. Sensors available separately in <a href="https://github.com/autonomousvision/navsim/blob/main/download/download_navtrain.sh">download_navtrain.sh</a>.</td>
        <td>-</td>
        <td>445GB*</td>
        <td>
        train_test_split=navtrain
        </td>
    </tr>
    <tr>
        <td>navtest</td>
        <td>Standard split for testing agents in NAVSIM with non-trivial driving scenes. Available as a filter for test split.</td>
        <td>-</td>
        <td>-</td>
        <td>
        train_test_split=navtest
        </td>
    </tr>
    <tr>
        <td rowspan="2">Competition</td>
        <td>warmup_test_e2e</td>
        <td>Warmup test split to validate submission on hugging face. Available as a filter for mini split.</td>
        <td>-</td>
        <td>-</td>
        <td>
        train_test_split=warmup_test_e2e
        </td>
    </tr>
    <tr>
        <td>private_test_e2e</td>
        <td>Private test split for the challenge leaderboard on hugging face.</td>
        <td>&lt;1GB</td>
        <td>25GB</td>
        <td>
        train_test_split=private_test_e2e
        </td>
    </tr>
</table>

(*300GB without history)

## Splits

The standard splits `trainval`, `test`, and `mini` are from the OpenScene dataset. Note that the data corresponds to the nuPlan dataset with a lower frequency of 2Hz. You can download all standard splits over Hugging Face with the bash scripts in [download](../download)

NAVSIM provides a subset and filter of the `trainval` split, called `navtrain`. The `navtrain` split facilitates a standardized training scheme and requires significantly less sensor data storage than `travel` (445GB vs. 2100GB). If your agents don't need historical sensor inputs, you can download `navtrain` without history, which requires 300GB of storage. Note that `navtrain` can be downloaded separately via [download_navtrain.sh](https://github.com/autonomousvision/navsim/blob/main/download/download_navtrain.sh) but still requires access to the `trainval` logs. Similarly, the `navtest` split enables a standardized set for testing agents with a provided scene filter. Both `navtrain` and `navtest` are filtered to increase interesting samples in the sets. 

For the challenge on Hugging Face, we provide the `warmup_test_e2e` and `private_test_e2e` for the warm-up and challenge track, respectively. Note that `private_test_e2e` requires you to download the data, while `warmup_test_e2e` is a scene filter for the `mini` split.


## Troubleshooting

As previous users reported missing files when downloading `navtrain`, we provide MD5 checksums for the `.tgz` files to identify corrupted downloads. We recommend to re-download `navtrain` without deleting the `.tgz` files (i.e. removing Line 12 and 22 in [download_navtrain.sh](https://github.com/autonomousvision/navsim/blob/main/download/download_navtrain.sh)) and running:

```bash 
echo "6f92f38d5f03ed852da7872a7122bdd2  navtrain_current_1.tgz" | md5sum -c -
echo "7a72f0a758b5df6cbe4c565920a4869f  navtrain_current_2.tgz" | md5sum -c -
echo "b083fce1428308abb5682a1a150cc1af  navtrain_current_3.tgz" | md5sum -c -
echo "68354ac2c993aa1ebbfac59547fdb840  navtrain_current_4.tgz" | md5sum -c -
echo "dc46ed34d92d5ab9cc1464d67b72fbf6  navtrain_history_1.tgz" | md5sum -c -
echo "fab177bdb79c0c9536da1566d13e5995  navtrain_history_2.tgz" | md5sum -c -
echo "71ed9a2387edc3849921861d7873c7f0  navtrain_history_3.tgz" | md5sum -c -
echo "2cc13aced2f458e50fe4cc2f26d18e07  navtrain_history_4.tgz" | md5sum -c -
```


<details>
<summary>Expected output:</summary>

```bash 
navtrain_current_1.tgz: OK
navtrain_current_2.tgz: OK
navtrain_current_3.tgz: OK
navtrain_current_4.tgz: OK
navtrain_history_1.tgz: OK
navtrain_history_2.tgz: OK
navtrain_history_3.tgz: OK
navtrain_history_4.tgz: OK
```

</details>
</br>