LiuxyIA committed on
Commit b8bee3c · verified · 1 Parent(s): fc90aa3

Upload LlavaQwenForCausalLM

Files changed (4)
  1. README.md +199 -0
  2. config.json +1643 -0
  3. generation_config.json +13 -0
  4. model.safetensors +3 -0
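
The `+3 -0` next to `model.safetensors` most likely means the diff is counting the lines of a Git LFS pointer file rather than the weights themselves, since such a pointer has three lines (`version`, `oid`, `size`). The sketch below parses a pointer of that shape. It assumes the standard LFS pointer layout; the `oid` and `size` values are invented placeholders, not values taken from this commit, and `parse_lfs_pointer` is a hypothetical helper name.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its fields (assumes the v1 layout)."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>", split on the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid line has the form "<hash algorithm>:<hex digest>".
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Placeholder pointer: the oid and size are made up for illustration only.
EXAMPLE_POINTER = (
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:" + "0" * 64 + "\n"
    "size 123456789\n"
)

parsed = parse_lfs_pointer(EXAMPLE_POINTER)
print(parsed["hash_algorithm"], parsed["size_bytes"])
```

If this reading is right, the `size` field of the real pointer gives the checkpoint size in bytes without downloading the weights.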
README.md ADDED
@@ -0,0 +1,199 @@
+ ---
+ library_name: transformers
+ tags: []
+ ---
+
+ # Model Card for Model ID
+
+ <!-- Provide a quick summary of what the model is/does. -->
+
+
+
+ ## Model Details
+
+ ### Model Description
+
+ <!-- Provide a longer summary of what this model is. -->
+
+ This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
+
+ - **Developed by:** [More Information Needed]
+ - **Funded by [optional]:** [More Information Needed]
+ - **Shared by [optional]:** [More Information Needed]
+ - **Model type:** [More Information Needed]
+ - **Language(s) (NLP):** [More Information Needed]
+ - **License:** [More Information Needed]
+ - **Finetuned from model [optional]:** [More Information Needed]
+
+ ### Model Sources [optional]
+
+ <!-- Provide the basic links for the model. -->
+
+ - **Repository:** [More Information Needed]
+ - **Paper [optional]:** [More Information Needed]
+ - **Demo [optional]:** [More Information Needed]
+
+ ## Uses
+
+ <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+
+ ### Direct Use
+
+ <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+
+ [More Information Needed]
+
+ ### Downstream Use [optional]
+
+ <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+
+ [More Information Needed]
+
+ ### Out-of-Scope Use
+
+ <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+
+ [More Information Needed]
+
+ ## Bias, Risks, and Limitations
+
+ <!-- This section is meant to convey both technical and sociotechnical limitations. -->
+
+ [More Information Needed]
+
+ ### Recommendations
+
+ <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+
+ Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+
+ ## How to Get Started with the Model
+
+ Use the code below to get started with the model.
+
+ [More Information Needed]
+
+ ## Training Details
+
+ ### Training Data
+
+ <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+
+ [More Information Needed]
+
+ ### Training Procedure
+
+ <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+
+ #### Preprocessing [optional]
+
+ [More Information Needed]
+
+
+ #### Training Hyperparameters
+
+ - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+
+ #### Speeds, Sizes, Times [optional]
+
+ <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+
+ [More Information Needed]
+
+ ## Evaluation
+
+ <!-- This section describes the evaluation protocols and provides the results. -->
+
+ ### Testing Data, Factors & Metrics
+
+ #### Testing Data
+
+ <!-- This should link to a Dataset Card if possible. -->
+
+ [More Information Needed]
+
+ #### Factors
+
+ <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+
+ [More Information Needed]
+
+ #### Metrics
+
+ <!-- These are the evaluation metrics being used, ideally with a description of why. -->
+
+ [More Information Needed]
+
+ ### Results
+
+ [More Information Needed]
+
+ #### Summary
+
+
+
+ ## Model Examination [optional]
+
+ <!-- Relevant interpretability work for the model goes here -->
+
+ [More Information Needed]
+
+ ## Environmental Impact
+
+ <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+
+ Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+
+ - **Hardware Type:** [More Information Needed]
+ - **Hours used:** [More Information Needed]
+ - **Cloud Provider:** [More Information Needed]
+ - **Compute Region:** [More Information Needed]
+ - **Carbon Emitted:** [More Information Needed]
+
+ ## Technical Specifications [optional]
+
+ ### Model Architecture and Objective
+
+ [More Information Needed]
+
+ ### Compute Infrastructure
+
+ [More Information Needed]
+
+ #### Hardware
+
+ [More Information Needed]
+
+ #### Software
+
+ [More Information Needed]
+
+ ## Citation [optional]
+
+ <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+
+ **BibTeX:**
+
+ [More Information Needed]
+
+ **APA:**
+
+ [More Information Needed]
+
+ ## Glossary [optional]
+
+ <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+
+ [More Information Needed]
+
+ ## More Information [optional]
+
+ [More Information Needed]
+
+ ## Model Card Authors [optional]
+
+ [More Information Needed]
+
+ ## Model Card Contact
+
+ [More Information Needed]
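
The model card above leaves "How to Get Started" and the architecture sections empty, so the config.json added in this commit is the only description of the model here. The following is a minimal sketch, not the author's loading code, that derives a few quantities from a subset of that config. The `summarize` helper is hypothetical, and the tile size of 384 is an assumption inferred from the vision tower name (`siglip-so400m-patch14-384`) rather than read from a config field.

```python
import json

# Subset of fields copied from the config.json added in this commit.
CONFIG_SUBSET = json.loads("""
{
  "architectures": ["LlavaQwenForCausalLM"],
  "model_type": "llava_qwen",
  "hidden_size": 1024,
  "num_attention_heads": 16,
  "num_key_value_heads": 16,
  "num_hidden_layers": 24,
  "image_aspect_ratio": "anyres",
  "image_grid_pinpoints": [[384, 768], [768, 384], [768, 768], [1152, 384], [384, 1152]]
}
""")


def summarize(config, tile=384):
    """Derive a few quantities from raw config fields.

    `tile` is an assumption (see the note above), not a config value.
    """
    head_dim = config["hidden_size"] // config["num_attention_heads"]
    # Equal query and key/value head counts mean no grouped-query attention.
    grouped_query = config["num_key_value_heads"] != config["num_attention_heads"]
    # Express each pinpoint as a count of tiles along each stored axis.
    tile_grids = [(a // tile, b // tile) for a, b in config["image_grid_pinpoints"]]
    return {
        "architecture": config["architectures"][0],
        "head_dim": head_dim,
        "grouped_query_attention": grouped_query,
        "tile_grids": tile_grids,
    }


summary = summarize(CONFIG_SUBSET)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Loading the actual weights presumably requires the code base that defines `LlavaQwenForCausalLM` and the `llava_qwen` model type, which this commit does not include.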
config.json ADDED
@@ -0,0 +1,1643 @@
+ {
+   "_name_or_path": "checkpoints/0.5B_pretrain/checkpoint-24722",
+   "action_loss_weight": 1.0,
+   "architectures": [
+     "LlavaQwenForCausalLM"
+   ],
+   "attention_dropout": 0.0,
+   "bos_token_id": 151643,
+   "centerlines_data_path": "",
+   "eos_token_id": 151645,
+   "freeze_vision_tower": false,
+   "future_frames": 6,
+   "hidden_act": "silu",
+   "hidden_size": 1024,
+   "history_frames": 4,
+   "image_aspect_ratio": "anyres",
+   "image_crop_resolution": 224,
+   "image_folder": null,
+   "image_grid_pinpoints": [
+     [
+       384,
+       768
+     ],
+     [
+       768,
+       384
+     ],
+     [
+       768,
+       768
+     ],
+     [
+       1152,
+       384
+     ],
+     [
+       384,
+       1152
+     ]
+   ],
+   "image_split_resolution": 224,
+   "initializer_range": 0.02,
+   "intermediate_size": 2816,
+   "map_loss_weight": 0.01,
+   "max_position_embeddings": 32768,
+   "max_window_layers": 21,
+   "mm_hidden_size": 1152,
+   "mm_patch_merge_type": "flat",
+   "mm_projector_lr": null,
+   "mm_projector_type": "mlp2x_gelu",
+   "mm_resampler_type": null,
+   "mm_tunable_parts": "mm_mlp_adapter,mm_language_model",
+   "mm_use_im_patch_token": false,
+   "mm_use_im_start_end": false,
+   "mm_vision_select_feature": "patch",
+   "mm_vision_select_layer": -2,
+   "mm_vision_tower": "/data/liuxy/b2d/llava_carla/google/siglip-so400m-patch14-384",
+   "mm_vision_tower_lr": null,
+   "model_type": "llava_qwen",
+   "navi_data_path": null,
+   "num_attention_heads": 16,
+   "num_hidden_layers": 24,
+   "num_key_value_heads": 16,
+   "perception_config": {
+     "NameMapping": {
+       "/Game/Carla/Static/Car/4Wheeled/ParkedVehicles/Charger/SM_ChargerParked.SM_ChargerParked": "car",
+       "/Game/Carla/Static/Car/4Wheeled/ParkedVehicles/FordCrown/SM_FordCrown_parked.SM_FordCrown_parked": "car",
+       "/Game/Carla/Static/Car/4Wheeled/ParkedVehicles/Lincoln/SM_LincolnParked.SM_LincolnParked": "car",
+       "/Game/Carla/Static/Car/4Wheeled/ParkedVehicles/MercedesCCC/SM_MercedesCCC_Parked.SM_MercedesCCC_Parked": "car",
+       "/Game/Carla/Static/Car/4Wheeled/ParkedVehicles/Mini2021/SM_Mini2021_parked.SM_Mini2021_parked": "car",
+       "/Game/Carla/Static/Car/4Wheeled/ParkedVehicles/NissanPatrol2021/SM_NissanPatrol2021_parked.SM_NissanPatrol2021_parked": "car",
+       "/Game/Carla/Static/Car/4Wheeled/ParkedVehicles/TeslaM3/SM_TeslaM3_parked.SM_TeslaM3_parked": "car",
+       "/Game/Carla/Static/Car/4Wheeled/ParkedVehicles/VolkswagenT2/SM_VolkswagenT2_2021_Parked.SM_VolkswagenT2_2021_Parked": "van",
+       "static.prop.constructioncone": "traffic_cone",
+       "static.prop.dirtdebris01": "others",
+       "static.prop.dirtdebris02": "others",
+       "static.prop.trafficwarning": "traffic_cone",
+       "static.prop.warningaccident": "traffic_cone",
+       "static.prop.warningconstruction": "traffic_cone",
+       "traffic.speed_limit.120": "traffic_sign",
+       "traffic.speed_limit.30": "traffic_sign",
+       "traffic.speed_limit.40": "traffic_sign",
+       "traffic.speed_limit.50": "traffic_sign",
+       "traffic.speed_limit.60": "traffic_sign",
+       "traffic.speed_limit.90": "traffic_sign",
+       "traffic.stop": "traffic_sign",
+       "traffic.traffic_light": "traffic_light",
+       "traffic.yield": "traffic_sign",
+       "vehicle.audi.etron": "car",
+       "vehicle.audi.tt": "car",
+       "vehicle.bh.crossbike": "bicycle",
+       "vehicle.carlamotors.firetruck": "truck",
+       "vehicle.chevrolet.impala": "car",
+       "vehicle.diamondback.century": "bicycle",
+       "vehicle.dodge.charger_2020": "car",
+       "vehicle.dodge.charger_police": "car",
+       "vehicle.dodge.charger_police_2020": "car",
+       "vehicle.ford.ambulance": "van",
+       "vehicle.ford.crown": "car",
+       "vehicle.ford.mustang": "car",
+       "vehicle.gazelle.omafiets": "bicycle",
+       "vehicle.lincoln.mkz_2017": "car",
+       "vehicle.lincoln.mkz_2020": "car",
+       "vehicle.mercedes.coupe_2020": "car",
+       "vehicle.mini.cooper_s_2021": "car",
+       "vehicle.nissan.patrol_2021": "car",
+       "vehicle.tesla.model3": "car",
+       "walker.pedestrian.0001": "pedestrian",
+       "walker.pedestrian.0004": "pedestrian",
+       "walker.pedestrian.0005": "pedestrian",
+       "walker.pedestrian.0007": "pedestrian",
+       "walker.pedestrian.0013": "pedestrian",
+       "walker.pedestrian.0014": "pedestrian",
+       "walker.pedestrian.0017": "pedestrian",
+       "walker.pedestrian.0018": "pedestrian",
+       "walker.pedestrian.0019": "pedestrian",
+       "walker.pedestrian.0020": "pedestrian",
+       "walker.pedestrian.0022": "pedestrian",
+       "walker.pedestrian.0025": "pedestrian",
+       "walker.pedestrian.0035": "pedestrian",
+       "walker.pedestrian.0041": "pedestrian",
+       "walker.pedestrian.0046": "pedestrian",
+       "walker.pedestrian.0047": "pedestrian"
+     },
+     "_dim_": 256,
+     "_ffn_dim_": 512,
+     "_num_levels_": 4,
+     "_pos_dim_": 128,
+     "ann_file_test": "/data/liuxy/b2d/llava_carla/Bench2DriveZoo/data/infos/b2d_infos_qa_changelane_road_val.pkl",
+     "ann_file_train": "/data/liuxy/b2d/llava_carla/Bench2DriveZoo/data/infos/b2d_infos_qa_changelane_road_train.pkl",
+     "ann_file_val": "/data/liuxy/b2d/llava_carla/Bench2DriveZoo/data/infos/b2d_infos_qa_changelane_road_val.pkl",
+     "bev_h_": 200,
+     "bev_w_": 200,
+     "checkpoint_config": {
+       "interval": 1,
+       "max_keep_ckpts": 6
+     },
+     "class_names": [
+       "car",
+       "van",
+       "truck",
+       "bicycle",
+       "traffic_sign",
+       "traffic_cone",
+       "traffic_light",
+       "pedestrian",
+       "others"
+     ],
+     "custom_hooks": [
+       {
+         "type": "CustomSetEpochInfoHook"
+       }
+     ],
+     "data": {
+       "nonshuffler_sampler": {
+         "type": "DistributedSampler"
+       },
+       "samples_per_gpu": 1,
+       "shuffler_sampler": {
+         "type": "DistributedGroupSampler"
+       },
+       "test": {
+         "ann_file": "/data/liuxy/b2d/llava_carla/Bench2DriveZoo/data/infos/b2d_infos_qa_changelane_road_val.pkl",
+         "bev_size": [
+           200,
+           200
+         ],
+         "box_type_3d": "LiDAR",
+         "classes": [
+           "car",
+           "van",
+           "truck",
+           "bicycle",
+           "traffic_sign",
+           "traffic_cone",
+           "traffic_light",
+           "pedestrian",
+           "others"
+         ],
+         "data_root": "/data/liuxy/b2d/llava_carla/Bench2DriveZoo/data/bench2drive",
+         "eval_cfg": {
+           "class_names": [
+             "car",
+             "van",
+             "truck",
+             "bicycle",
+             "traffic_sign",
+             "traffic_cone",
+             "traffic_light",
+             "pedestrian"
+           ],
+           "class_range": {
+             "bicycle": [
+               40,
+               40
+             ],
+             "car": [
+               50,
+               50
+             ],
+             "pedestrian": [
+               40,
+               40
+             ],
+             "traffic_cone": [
+               30,
+               30
+             ],
+             "traffic_light": [
+               30,
+               30
+             ],
+             "traffic_sign": [
+               30,
+               30
+             ],
+             "truck": [
+               50,
+               50
+             ],
+             "van": [
+               50,
+               50
+             ]
+           },
+           "dist_th_tp": 2.0,
+           "dist_ths": [
+             0.5,
+             1.0,
+             2.0,
+             4.0
+           ],
+           "err_name_maping": {
+             "attr_err": "mAAE",
+             "orient_err": "mAOE",
+             "scale_err": "mASE",
+             "trans_err": "mATE",
+             "vel_err": "mAVE"
+           },
+           "mean_ap_weight": 5,
+           "min_precision": 0.1,
+           "min_recall": 0.1,
+           "tp_metrics": [
+             "trans_err",
+             "scale_err",
+             "orient_err",
+             "vel_err"
+           ]
+         },
+         "future_frames": 16,
+         "map_file": "/data/liuxy/b2d/llava_carla/Bench2DriveZoo/data/infos/b2d_map_infos.pkl",
+         "map_root": "/data/liuxy/b2d/llava_carla/Bench2DriveZoo/data/maps",
+         "modality": {
+           "use_camera": true,
+           "use_external": true,
+           "use_lidar": false,
+           "use_map": false,
+           "use_radar": false
+         },
+         "name_mapping": {
+           "/Game/Carla/Static/Car/4Wheeled/ParkedVehicles/Charger/SM_ChargerParked.SM_ChargerParked": "car",
+           "/Game/Carla/Static/Car/4Wheeled/ParkedVehicles/FordCrown/SM_FordCrown_parked.SM_FordCrown_parked": "car",
+           "/Game/Carla/Static/Car/4Wheeled/ParkedVehicles/Lincoln/SM_LincolnParked.SM_LincolnParked": "car",
+           "/Game/Carla/Static/Car/4Wheeled/ParkedVehicles/MercedesCCC/SM_MercedesCCC_Parked.SM_MercedesCCC_Parked": "car",
+           "/Game/Carla/Static/Car/4Wheeled/ParkedVehicles/Mini2021/SM_Mini2021_parked.SM_Mini2021_parked": "car",
+           "/Game/Carla/Static/Car/4Wheeled/ParkedVehicles/NissanPatrol2021/SM_NissanPatrol2021_parked.SM_NissanPatrol2021_parked": "car",
+           "/Game/Carla/Static/Car/4Wheeled/ParkedVehicles/TeslaM3/SM_TeslaM3_parked.SM_TeslaM3_parked": "car",
+           "/Game/Carla/Static/Car/4Wheeled/ParkedVehicles/VolkswagenT2/SM_VolkswagenT2_2021_Parked.SM_VolkswagenT2_2021_Parked": "van",
+           "static.prop.constructioncone": "traffic_cone",
+           "static.prop.dirtdebris01": "others",
+           "static.prop.dirtdebris02": "others",
+           "static.prop.trafficwarning": "traffic_cone",
+           "static.prop.warningaccident": "traffic_cone",
+           "static.prop.warningconstruction": "traffic_cone",
+           "traffic.speed_limit.120": "traffic_sign",
+           "traffic.speed_limit.30": "traffic_sign",
+           "traffic.speed_limit.40": "traffic_sign",
+           "traffic.speed_limit.50": "traffic_sign",
+           "traffic.speed_limit.60": "traffic_sign",
+           "traffic.speed_limit.90": "traffic_sign",
+           "traffic.stop": "traffic_sign",
+           "traffic.traffic_light": "traffic_light",
+           "traffic.yield": "traffic_sign",
+           "vehicle.audi.etron": "car",
+           "vehicle.audi.tt": "car",
+           "vehicle.bh.crossbike": "bicycle",
+           "vehicle.carlamotors.firetruck": "truck",
+           "vehicle.chevrolet.impala": "car",
+           "vehicle.diamondback.century": "bicycle",
+           "vehicle.dodge.charger_2020": "car",
+           "vehicle.dodge.charger_police": "car",
+           "vehicle.dodge.charger_police_2020": "car",
+           "vehicle.ford.ambulance": "van",
+           "vehicle.ford.crown": "car",
+           "vehicle.ford.mustang": "car",
+           "vehicle.gazelle.omafiets": "bicycle",
+           "vehicle.lincoln.mkz_2017": "car",
+           "vehicle.lincoln.mkz_2020": "car",
+           "vehicle.mercedes.coupe_2020": "car",
+           "vehicle.mini.cooper_s_2021": "car",
+           "vehicle.nissan.patrol_2021": "car",
+           "vehicle.tesla.model3": "car",
+           "walker.pedestrian.0001": "pedestrian",
+           "walker.pedestrian.0004": "pedestrian",
+           "walker.pedestrian.0005": "pedestrian",
+           "walker.pedestrian.0007": "pedestrian",
+           "walker.pedestrian.0013": "pedestrian",
+           "walker.pedestrian.0014": "pedestrian",
+           "walker.pedestrian.0017": "pedestrian",
+           "walker.pedestrian.0018": "pedestrian",
+           "walker.pedestrian.0019": "pedestrian",
+           "walker.pedestrian.0020": "pedestrian",
+           "walker.pedestrian.0022": "pedestrian",
+           "walker.pedestrian.0025": "pedestrian",
+           "walker.pedestrian.0035": "pedestrian",
+           "walker.pedestrian.0041": "pedestrian",
+           "walker.pedestrian.0046": "pedestrian",
+           "walker.pedestrian.0047": "pedestrian"
+         },
+         "past_frames": 4,
+         "pipeline": [
+           {
+             "to_float32": true,
+             "type": "LoadMultiViewImageFromFiles"
+           },
+           {
+             "class_names": [
+               "car",
+               "van",
+               "truck",
+               "bicycle",
+               "traffic_sign",
+               "traffic_cone",
+               "traffic_light",
+               "pedestrian",
+               "others"
+             ],
+             "type": "VADFormatBundle3D",
+             "with_ego": true
+           },
+           {
+             "keys": [
+               "ego_his_trajs",
+               "ego_fut_trajs",
+               "ego_fut_masks",
+               "ego_fut_cmd",
+               "ego_lcf_feat",
+               "navi_points",
+               "navi_mask",
+               "cam_front_path",
+               "folder",
+               "frame_idx"
+             ],
+             "type": "CustomCollect3D"
+           }
+         ],
+         "point_cloud_range": [
+           -15.0,
+           -30.0,
+           -2.0,
+           15.0,
+           30.0,
+           2.0
+         ],
+         "polyline_points_num": 20,
+         "queue_length": 4,
+         "test_mode": true,
+         "type": "B2D_TextDrive_Dataset"
+       },
+       "train": {
+         "ann_file": "/data/liuxy/b2d/llava_carla/Bench2DriveZoo/data/infos/b2d_infos_qa_changelane_road_train.pkl",
+         "bev_size": [
+           200,
+           200
+         ],
+         "box_type_3d": "LiDAR",
+         "classes": [
+           "car",
+           "van",
+           "truck",
+           "bicycle",
+           "traffic_sign",
+           "traffic_cone",
+           "traffic_light",
+           "pedestrian",
+           "others"
+         ],
+         "data_root": "/data/liuxy/b2d/llava_carla/Bench2DriveZoo/data/bench2drive",
+         "future_frames": 16,
+         "map_file": "/data/liuxy/b2d/llava_carla/Bench2DriveZoo/data/infos/b2d_map_infos.pkl",
+         "map_root": "/data/liuxy/b2d/llava_carla/Bench2DriveZoo/data/maps",
+         "modality": {
+           "use_camera": true,
+           "use_external": true,
+           "use_lidar": false,
+           "use_map": false,
+           "use_radar": false
+         },
+         "name_mapping": {
+           "/Game/Carla/Static/Car/4Wheeled/ParkedVehicles/Charger/SM_ChargerParked.SM_ChargerParked": "car",
+           "/Game/Carla/Static/Car/4Wheeled/ParkedVehicles/FordCrown/SM_FordCrown_parked.SM_FordCrown_parked": "car",
+           "/Game/Carla/Static/Car/4Wheeled/ParkedVehicles/Lincoln/SM_LincolnParked.SM_LincolnParked": "car",
+           "/Game/Carla/Static/Car/4Wheeled/ParkedVehicles/MercedesCCC/SM_MercedesCCC_Parked.SM_MercedesCCC_Parked": "car",
+           "/Game/Carla/Static/Car/4Wheeled/ParkedVehicles/Mini2021/SM_Mini2021_parked.SM_Mini2021_parked": "car",
+           "/Game/Carla/Static/Car/4Wheeled/ParkedVehicles/NissanPatrol2021/SM_NissanPatrol2021_parked.SM_NissanPatrol2021_parked": "car",
+           "/Game/Carla/Static/Car/4Wheeled/ParkedVehicles/TeslaM3/SM_TeslaM3_parked.SM_TeslaM3_parked": "car",
+           "/Game/Carla/Static/Car/4Wheeled/ParkedVehicles/VolkswagenT2/SM_VolkswagenT2_2021_Parked.SM_VolkswagenT2_2021_Parked": "van",
+           "static.prop.constructioncone": "traffic_cone",
+           "static.prop.dirtdebris01": "others",
+           "static.prop.dirtdebris02": "others",
+           "static.prop.trafficwarning": "traffic_cone",
+           "static.prop.warningaccident": "traffic_cone",
+           "static.prop.warningconstruction": "traffic_cone",
+           "traffic.speed_limit.120": "traffic_sign",
+           "traffic.speed_limit.30": "traffic_sign",
+           "traffic.speed_limit.40": "traffic_sign",
+           "traffic.speed_limit.50": "traffic_sign",
+           "traffic.speed_limit.60": "traffic_sign",
+           "traffic.speed_limit.90": "traffic_sign",
+           "traffic.stop": "traffic_sign",
+           "traffic.traffic_light": "traffic_light",
+           "traffic.yield": "traffic_sign",
+           "vehicle.audi.etron": "car",
+           "vehicle.audi.tt": "car",
+           "vehicle.bh.crossbike": "bicycle",
+           "vehicle.carlamotors.firetruck": "truck",
+           "vehicle.chevrolet.impala": "car",
+           "vehicle.diamondback.century": "bicycle",
+           "vehicle.dodge.charger_2020": "car",
+           "vehicle.dodge.charger_police": "car",
+           "vehicle.dodge.charger_police_2020": "car",
+           "vehicle.ford.ambulance": "van",
+           "vehicle.ford.crown": "car",
+           "vehicle.ford.mustang": "car",
+           "vehicle.gazelle.omafiets": "bicycle",
+           "vehicle.lincoln.mkz_2017": "car",
+           "vehicle.lincoln.mkz_2020": "car",
+           "vehicle.mercedes.coupe_2020": "car",
+           "vehicle.mini.cooper_s_2021": "car",
+           "vehicle.nissan.patrol_2021": "car",
+           "vehicle.tesla.model3": "car",
+           "walker.pedestrian.0001": "pedestrian",
+           "walker.pedestrian.0004": "pedestrian",
+           "walker.pedestrian.0005": "pedestrian",
+           "walker.pedestrian.0007": "pedestrian",
+           "walker.pedestrian.0013": "pedestrian",
+           "walker.pedestrian.0014": "pedestrian",
+           "walker.pedestrian.0017": "pedestrian",
+           "walker.pedestrian.0018": "pedestrian",
+           "walker.pedestrian.0019": "pedestrian",
+           "walker.pedestrian.0020": "pedestrian",
+           "walker.pedestrian.0022": "pedestrian",
+           "walker.pedestrian.0025": "pedestrian",
+           "walker.pedestrian.0035": "pedestrian",
+           "walker.pedestrian.0041": "pedestrian",
+           "walker.pedestrian.0046": "pedestrian",
+           "walker.pedestrian.0047": "pedestrian"
+         },
+         "past_frames": 4,
+         "pipeline": [
+           {
+             "to_float32": true,
+             "type": "LoadMultiViewImageFromFiles"
+           },
+           {
+             "class_names": [
+               "car",
+               "van",
+               "truck",
+               "bicycle",
+               "traffic_sign",
+               "traffic_cone",
+               "traffic_light",
+               "pedestrian",
+               "others"
+             ],
+             "type": "VADFormatBundle3D",
+             "with_ego": true
+           },
+           {
+             "keys": [
+               "ego_his_trajs",
+               "ego_fut_trajs",
+               "ego_fut_masks",
+               "ego_fut_cmd",
+               "ego_lcf_feat",
+               "navi_points",
+               "navi_mask",
+               "cam_front_path",
+               "folder",
+               "frame_idx"
+             ],
+             "type": "CustomCollect3D"
+           }
+         ],
+         "point_cloud_range": [
+           -15.0,
+           -30.0,
+           -2.0,
+           15.0,
+           30.0,
+           2.0
+         ],
+         "polyline_points_num": 20,
+         "queue_length": 4,
+         "test_mode": false,
+         "type": "B2D_TextDrive_Dataset"
+       },
+       "val": {
+         "ann_file": "/data/liuxy/b2d/llava_carla/Bench2DriveZoo/data/infos/b2d_infos_qa_changelane_road_val.pkl",
+         "bev_size": [
+           200,
+           200
+         ],
+         "box_type_3d": "LiDAR",
+         "classes": [
+           "car",
+           "van",
+           "truck",
+           "bicycle",
+           "traffic_sign",
+           "traffic_cone",
+           "traffic_light",
+           "pedestrian",
+           "others"
+         ],
+         "data_root": "/data/liuxy/b2d/llava_carla/Bench2DriveZoo/data/bench2drive",
+         "eval_cfg": {
+           "class_names": [
+             "car",
+             "van",
+             "truck",
+             "bicycle",
+             "traffic_sign",
+             "traffic_cone",
+             "traffic_light",
+             "pedestrian"
+           ],
+           "class_range": {
+             "bicycle": [
+               40,
+               40
+             ],
+             "car": [
+               50,
+               50
+             ],
+             "pedestrian": [
+               40,
+               40
+             ],
+             "traffic_cone": [
+               30,
+               30
+             ],
+             "traffic_light": [
+               30,
+               30
+             ],
+             "traffic_sign": [
+               30,
+               30
+             ],
+             "truck": [
+               50,
+               50
+             ],
+             "van": [
+               50,
+               50
+             ]
+           },
+           "dist_th_tp": 2.0,
+           "dist_ths": [
+             0.5,
+             1.0,
+             2.0,
+             4.0
+           ],
+           "err_name_maping": {
+             "attr_err": "mAAE",
+             "orient_err": "mAOE",
+             "scale_err": "mASE",
+             "trans_err": "mATE",
+             "vel_err": "mAVE"
+           },
+           "mean_ap_weight": 5,
+           "min_precision": 0.1,
+           "min_recall": 0.1,
+           "tp_metrics": [
+             "trans_err",
+             "scale_err",
+             "orient_err",
+             "vel_err"
+           ]
+         },
+         "future_frames": 16,
+         "map_file": "/data/liuxy/b2d/llava_carla/Bench2DriveZoo/data/infos/b2d_map_infos.pkl",
+         "map_root": "/data/liuxy/b2d/llava_carla/Bench2DriveZoo/data/maps",
+         "modality": {
+           "use_camera": true,
+           "use_external": true,
+           "use_lidar": false,
+           "use_map": false,
+           "use_radar": false
+         },
+         "name_mapping": {
+           "/Game/Carla/Static/Car/4Wheeled/ParkedVehicles/Charger/SM_ChargerParked.SM_ChargerParked": "car",
+           "/Game/Carla/Static/Car/4Wheeled/ParkedVehicles/FordCrown/SM_FordCrown_parked.SM_FordCrown_parked": "car",
+           "/Game/Carla/Static/Car/4Wheeled/ParkedVehicles/Lincoln/SM_LincolnParked.SM_LincolnParked": "car",
+           "/Game/Carla/Static/Car/4Wheeled/ParkedVehicles/MercedesCCC/SM_MercedesCCC_Parked.SM_MercedesCCC_Parked": "car",
+           "/Game/Carla/Static/Car/4Wheeled/ParkedVehicles/Mini2021/SM_Mini2021_parked.SM_Mini2021_parked": "car",
+           "/Game/Carla/Static/Car/4Wheeled/ParkedVehicles/NissanPatrol2021/SM_NissanPatrol2021_parked.SM_NissanPatrol2021_parked": "car",
+           "/Game/Carla/Static/Car/4Wheeled/ParkedVehicles/TeslaM3/SM_TeslaM3_parked.SM_TeslaM3_parked": "car",
+           "/Game/Carla/Static/Car/4Wheeled/ParkedVehicles/VolkswagenT2/SM_VolkswagenT2_2021_Parked.SM_VolkswagenT2_2021_Parked": "van",
+           "static.prop.constructioncone": "traffic_cone",
+           "static.prop.dirtdebris01": "others",
+           "static.prop.dirtdebris02": "others",
+           "static.prop.trafficwarning": "traffic_cone",
+           "static.prop.warningaccident": "traffic_cone",
+           "static.prop.warningconstruction": "traffic_cone",
+           "traffic.speed_limit.120": "traffic_sign",
+           "traffic.speed_limit.30": "traffic_sign",
+           "traffic.speed_limit.40": "traffic_sign",
+           "traffic.speed_limit.50": "traffic_sign",
+           "traffic.speed_limit.60": "traffic_sign",
+           "traffic.speed_limit.90": "traffic_sign",
+           "traffic.stop": "traffic_sign",
+           "traffic.traffic_light": "traffic_light",
+           "traffic.yield": "traffic_sign",
+           "vehicle.audi.etron": "car",
+           "vehicle.audi.tt": "car",
+           "vehicle.bh.crossbike": "bicycle",
+           "vehicle.carlamotors.firetruck": "truck",
+           "vehicle.chevrolet.impala": "car",
+           "vehicle.diamondback.century": "bicycle",
+           "vehicle.dodge.charger_2020": "car",
+           "vehicle.dodge.charger_police": "car",
+           "vehicle.dodge.charger_police_2020": "car",
+           "vehicle.ford.ambulance": "van",
+           "vehicle.ford.crown": "car",
+           "vehicle.ford.mustang": "car",
+           "vehicle.gazelle.omafiets": "bicycle",
+           "vehicle.lincoln.mkz_2017": "car",
+           "vehicle.lincoln.mkz_2020": "car",
+           "vehicle.mercedes.coupe_2020": "car",
+           "vehicle.mini.cooper_s_2021": "car",
+           "vehicle.nissan.patrol_2021": "car",
+           "vehicle.tesla.model3": "car",
+           "walker.pedestrian.0001": "pedestrian",
+           "walker.pedestrian.0004": "pedestrian",
+           "walker.pedestrian.0005": "pedestrian",
+           "walker.pedestrian.0007": "pedestrian",
+           "walker.pedestrian.0013": "pedestrian",
+           "walker.pedestrian.0014": "pedestrian",
+           "walker.pedestrian.0017": "pedestrian",
+           "walker.pedestrian.0018": "pedestrian",
+           "walker.pedestrian.0019": "pedestrian",
+           "walker.pedestrian.0020": "pedestrian",
+           "walker.pedestrian.0022": "pedestrian",
+           "walker.pedestrian.0025": "pedestrian",
+           "walker.pedestrian.0035": "pedestrian",
+           "walker.pedestrian.0041": "pedestrian",
+           "walker.pedestrian.0046": "pedestrian",
+           "walker.pedestrian.0047": "pedestrian"
+         },
+         "past_frames": 4,
+         "pipeline": [
+           {
+             "to_float32": true,
+             "type": "LoadMultiViewImageFromFiles"
+           },
+           {
+             "class_names": [
+               "car",
+               "van",
+               "truck",
+               "bicycle",
+               "traffic_sign",
+               "traffic_cone",
+               "traffic_light",
+               "pedestrian",
+               "others"
+             ],
+             "type": "VADFormatBundle3D",
+             "with_ego": true
+           },
+           {
+             "keys": [
+               "ego_his_trajs",
+               "ego_fut_trajs",
+               "ego_fut_masks",
+               "ego_fut_cmd",
+               "ego_lcf_feat",
+               "navi_points",
+               "navi_mask",
+               "cam_front_path",
+               "folder",
+               "frame_idx"
+             ],
+             "type": "CustomCollect3D"
+           }
+         ],
+         "point_cloud_range": [
+           -15.0,
+           -30.0,
+           -2.0,
+           15.0,
+           30.0,
+           2.0
+         ],
+         "polyline_points_num": 20,
+         "queue_length": 4,
+         "test_mode": true,
+         "type": "B2D_TextDrive_Dataset"
+       },
717
+ "workers_per_gpu": 6
718
+ },
719
+ "data_root": "/data/liuxy/b2d/llava_carla/Bench2DriveZoo/data/bench2drive",
720
+ "dataset_root": "/data/liuxy/b2d/llava_carla/Bench2DriveZoo/data",
721
+ "dataset_type": "B2D_TextDrive_Dataset",
722
+ "dist_params": {
723
+ "backend": "nccl"
724
+ },
725
+ "eval_cfg": {
726
+ "class_names": [
727
+ "car",
728
+ "van",
729
+ "truck",
730
+ "bicycle",
731
+ "traffic_sign",
732
+ "traffic_cone",
733
+ "traffic_light",
734
+ "pedestrian"
735
+ ],
736
+ "class_range": {
737
+ "bicycle": [
738
+ 40,
739
+ 40
740
+ ],
741
+ "car": [
742
+ 50,
743
+ 50
744
+ ],
745
+ "pedestrian": [
746
+ 40,
747
+ 40
748
+ ],
749
+ "traffic_cone": [
750
+ 30,
751
+ 30
752
+ ],
753
+ "traffic_light": [
754
+ 30,
755
+ 30
756
+ ],
757
+ "traffic_sign": [
758
+ 30,
759
+ 30
760
+ ],
761
+ "truck": [
762
+ 50,
763
+ 50
764
+ ],
765
+ "van": [
766
+ 50,
767
+ 50
768
+ ]
769
+ },
770
+ "dist_th_tp": 2.0,
771
+ "dist_ths": [
772
+ 0.5,
773
+ 1.0,
774
+ 2.0,
775
+ 4.0
776
+ ],
777
+ "err_name_maping": {
778
+ "attr_err": "mAAE",
779
+ "orient_err": "mAOE",
780
+ "scale_err": "mASE",
781
+ "trans_err": "mATE",
782
+ "vel_err": "mAVE"
783
+ },
784
+ "mean_ap_weight": 5,
785
+ "min_precision": 0.1,
786
+ "min_recall": 0.1,
787
+ "tp_metrics": [
788
+ "trans_err",
789
+ "scale_err",
790
+ "orient_err",
791
+ "vel_err"
792
+ ]
793
+ },
794
+ "eval_pipeline": [
795
+ {
796
+ "coord_type": "LIDAR",
797
+ "file_client_args": {
798
+ "backend": "disk"
799
+ },
800
+ "load_dim": 5,
801
+ "type": "LoadPointsFromFile",
802
+ "use_dim": 5
803
+ },
804
+ {
805
+ "file_client_args": {
806
+ "backend": "disk"
807
+ },
808
+ "sweeps_num": 10,
809
+ "type": "LoadPointsFromMultiSweeps"
810
+ },
811
+ {
812
+ "class_names": [
813
+ "car",
814
+ "truck",
815
+ "trailer",
816
+ "bus",
817
+ "construction_vehicle",
818
+ "bicycle",
819
+ "motorcycle",
820
+ "pedestrian",
821
+ "traffic_cone",
822
+ "barrier"
823
+ ],
824
+ "type": "DefaultFormatBundle3D",
825
+ "with_label": false
826
+ },
827
+ {
828
+ "keys": [
829
+ "points"
830
+ ],
831
+ "type": "Collect3D"
832
+ }
833
+ ],
834
+ "evaluation": {
835
+ "interval": 6,
836
+ "map_metric": "chamfer",
837
+ "metric": "bbox",
838
+ "pipeline": [
839
+ {
840
+ "to_float32": true,
841
+ "type": "LoadMultiViewImageFromFiles"
842
+ },
843
+ {
844
+ "flip": false,
845
+ "img_scale": [
846
+ 1600,
847
+ 900
848
+ ],
849
+ "pts_scale_ratio": 1,
850
+ "transforms": [
851
+ {
852
+ "class_names": [
853
+ "car",
854
+ "van",
855
+ "truck",
856
+ "bicycle",
857
+ "traffic_sign",
858
+ "traffic_cone",
859
+ "traffic_light",
860
+ "pedestrian",
861
+ "others"
862
+ ],
863
+ "type": "VADFormatBundle3D",
864
+ "with_ego": true,
865
+ "with_label": false
866
+ },
867
+ {
868
+ "keys": [
869
+ "ego_his_trajs",
870
+ "ego_fut_trajs",
871
+ "ego_fut_masks",
872
+ "ego_fut_cmd",
873
+ "ego_lcf_feat",
874
+ "navi_points",
875
+ "navi_mask",
876
+ "cam_front_path",
877
+ "folder",
878
+ "frame_idx"
879
+ ],
880
+ "type": "CustomCollect3D"
881
+ }
882
+ ],
883
+ "type": "MultiScaleFlipAug3D"
884
+ }
885
+ ]
886
+ },
887
+ "file_client_args": {
888
+ "backend": "disk"
889
+ },
890
+ "future_frames": 16,
891
+ "img_norm_cfg": {
892
+ "mean": [
893
+ 123.675,
894
+ 116.28,
895
+ 103.53
896
+ ],
897
+ "std": [
898
+ 58.395,
899
+ 57.12,
900
+ 57.375
901
+ ],
902
+ "to_rgb": true
903
+ },
904
+ "inference_only_pipeline": [
905
+ {
906
+ "flip": false,
907
+ "img_scale": [
908
+ 1600,
909
+ 900
910
+ ],
911
+ "pts_scale_ratio": 1,
912
+ "transforms": [
913
+ {
914
+ "class_names": [
915
+ "car",
916
+ "van",
917
+ "truck",
918
+ "bicycle",
919
+ "traffic_sign",
920
+ "traffic_cone",
921
+ "traffic_light",
922
+ "pedestrian",
923
+ "others"
924
+ ],
925
+ "type": "VADFormatBundle3D",
926
+ "with_ego": true,
927
+ "with_label": false
928
+ },
929
+ {
930
+ "keys": [
931
+ "ego_his_trajs",
932
+ "ego_fut_trajs",
933
+ "ego_fut_masks",
934
+ "ego_fut_cmd",
935
+ "ego_lcf_feat",
936
+ "navi_points",
937
+ "navi_mask",
938
+ "cam_front_path",
939
+ "folder",
940
+ "frame_idx"
941
+ ],
942
+ "type": "CustomCollect3D"
943
+ }
944
+ ],
945
+ "type": "MultiScaleFlipAug3D"
946
+ }
947
+ ],
948
+ "info_root": "/data/liuxy/b2d/llava_carla/Bench2DriveZoo/data/infos",
949
+ "input_modality": {
950
+ "use_camera": true,
951
+ "use_external": true,
952
+ "use_lidar": false,
953
+ "use_map": false,
954
+ "use_radar": false
955
+ },
956
+ "load_from": null,
957
+ "log_config": {
958
+ "hooks": [
959
+ {
960
+ "type": "TextLoggerHook"
961
+ },
962
+ {
963
+ "type": "TensorboardLoggerHook"
964
+ }
965
+ ],
966
+ "interval": 50
967
+ },
968
+ "log_level": "INFO",
969
+ "lr_config": {
970
+ "by_epoch": false,
971
+ "min_lr_ratio": 0.001,
972
+ "policy": "CosineAnnealing",
973
+ "warmup": "linear",
974
+ "warmup_iters": 500,
975
+ "warmup_ratio": 0.3333333333333333
976
+ },
977
+ "map_classes": [
978
+ "Broken",
979
+ "Solid",
980
+ "SolidSolid",
981
+ "Center",
982
+ "TrafficLight",
983
+ "StopSign"
984
+ ],
985
+ "map_eval_use_same_gt_sample_num_flag": true,
986
+ "map_file": "/data/liuxy/b2d/llava_carla/Bench2DriveZoo/data/infos/b2d_map_infos.pkl",
987
+ "map_fixed_ptsnum_per_gt_line": 20,
988
+ "map_fixed_ptsnum_per_pred_line": 20,
989
+ "map_num_classes": 6,
990
+ "map_num_vec": 100,
991
+ "map_root": "/data/liuxy/b2d/llava_carla/Bench2DriveZoo/data/maps",
992
+ "model": {
993
+ "img_backbone": {
994
+ "depth": 50,
995
+ "frozen_stages": 1,
996
+ "norm_cfg": {
997
+ "requires_grad": false,
998
+ "type": "BN"
999
+ },
1000
+ "norm_eval": true,
1001
+ "num_stages": 4,
1002
+ "out_indices": [
1003
+ 1,
1004
+ 2,
1005
+ 3
1006
+ ],
1007
+ "style": "pytorch",
1008
+ "type": "ResNet"
1009
+ },
1010
+ "img_neck": {
1011
+ "add_extra_convs": "on_output",
1012
+ "in_channels": [
1013
+ 512,
1014
+ 1024,
1015
+ 2048
1016
+ ],
1017
+ "num_outs": 4,
1018
+ "out_channels": 256,
1019
+ "relu_before_extra_convs": true,
1020
+ "start_level": 0,
1021
+ "type": "FPN"
1022
+ },
1023
+ "pretrained": {
1024
+ "img": "ckpts/resnet50-19c8e357.pth"
1025
+ },
1026
+ "pts_bbox_head": {
1027
+ "as_two_stage": false,
1028
+ "bbox_coder": {
1029
+ "max_num": 100,
1030
+ "num_classes": 9,
1031
+ "pc_range": [
1032
+ -15.0,
1033
+ -30.0,
1034
+ -2.0,
1035
+ 15.0,
1036
+ 30.0,
1037
+ 2.0
1038
+ ],
1039
+ "post_center_range": [
1040
+ -20,
1041
+ -35,
1042
+ -10.0,
1043
+ 20,
1044
+ 35,
1045
+ 10.0
1046
+ ],
1047
+ "type": "CustomNMSFreeCoder",
1048
+ "voxel_size": [
1049
+ 0.15,
1050
+ 0.15,
1051
+ 4
1052
+ ]
1053
+ },
1054
+ "bev_h": 200,
1055
+ "bev_w": 200,
1056
+ "dis_thresh": 0.2,
1057
+ "ego_agent_decoder": {
1058
+ "num_layers": 1,
1059
+ "return_intermediate": false,
1060
+ "transformerlayers": {
1061
+ "attn_cfgs": [
1062
+ {
1063
+ "dropout": 0.0,
1064
+ "embed_dims": 256,
1065
+ "num_heads": 8,
1066
+ "type": "MultiheadAttention"
1067
+ }
1068
+ ],
1069
+ "feedforward_channels": 512,
1070
+ "ffn_dropout": 0.0,
1071
+ "operation_order": [
1072
+ "cross_attn",
1073
+ "norm",
1074
+ "ffn",
1075
+ "norm"
1076
+ ],
1077
+ "type": "BaseTransformerLayer"
1078
+ },
1079
+ "type": "CustomTransformerDecoder"
1080
+ },
1081
+ "ego_fut_mode": 6,
1082
+ "ego_his_encoder": null,
1083
+ "ego_lcf_feat_idx": null,
1084
+ "ego_map_decoder": {
1085
+ "num_layers": 1,
1086
+ "return_intermediate": false,
1087
+ "transformerlayers": {
1088
+ "attn_cfgs": [
1089
+ {
1090
+ "dropout": 0.0,
1091
+ "embed_dims": 256,
1092
+ "num_heads": 8,
1093
+ "type": "MultiheadAttention"
1094
+ }
1095
+ ],
1096
+ "feedforward_channels": 512,
1097
+ "ffn_dropout": 0.0,
1098
+ "operation_order": [
1099
+ "cross_attn",
1100
+ "norm",
1101
+ "ffn",
1102
+ "norm"
1103
+ ],
1104
+ "type": "BaseTransformerLayer"
1105
+ },
1106
+ "type": "CustomTransformerDecoder"
1107
+ },
1108
+ "in_channels": 256,
1109
+ "loss_bbox": {
1110
+ "loss_weight": 0.25,
1111
+ "type": "L1Loss"
1112
+ },
1113
+ "loss_cls": {
1114
+ "alpha": 0.25,
1115
+ "gamma": 2.0,
1116
+ "loss_weight": 2.0,
1117
+ "type": "FocalLoss",
1118
+ "use_sigmoid": true
1119
+ },
1120
+ "loss_iou": {
1121
+ "loss_weight": 0.0,
1122
+ "type": "GIoULoss"
1123
+ },
1124
+ "loss_map_bbox": {
1125
+ "loss_weight": 0.0,
1126
+ "type": "L1Loss"
1127
+ },
1128
+ "loss_map_cls": {
1129
+ "alpha": 0.25,
1130
+ "gamma": 2.0,
1131
+ "loss_weight": 2.0,
1132
+ "type": "FocalLoss",
1133
+ "use_sigmoid": true
1134
+ },
1135
+ "loss_map_dir": {
1136
+ "loss_weight": 0.005,
1137
+ "type": "PtsDirCosLoss"
1138
+ },
1139
+ "loss_map_iou": {
1140
+ "loss_weight": 0.0,
1141
+ "type": "GIoULoss"
1142
+ },
1143
+ "loss_map_pts": {
1144
+ "loss_weight": 1.0,
1145
+ "type": "PtsL1Loss"
1146
+ },
1147
+ "loss_plan_bound": {
1148
+ "dis_thresh": 1.0,
1149
+ "loss_weight": 1.0,
1150
+ "type": "PlanMapBoundLoss"
1151
+ },
1152
+ "loss_plan_col": {
1153
+ "loss_weight": 1.0,
1154
+ "type": "PlanCollisionLoss"
1155
+ },
1156
+ "loss_plan_dir": {
1157
+ "loss_weight": 0.5,
1158
+ "type": "PlanMapDirectionLoss"
1159
+ },
1160
+ "loss_plan_reg": {
1161
+ "loss_weight": 1.0,
1162
+ "type": "L1Loss"
1163
+ },
1164
+ "loss_traj": {
1165
+ "loss_weight": 0.2,
1166
+ "type": "L1Loss"
1167
+ },
1168
+ "loss_traj_cls": {
1169
+ "alpha": 0.25,
1170
+ "gamma": 2.0,
1171
+ "loss_weight": 0.2,
1172
+ "type": "FocalLoss",
1173
+ "use_sigmoid": true
1174
+ },
1175
+ "map_bbox_coder": {
1176
+ "max_num": 50,
1177
+ "num_classes": 6,
1178
+ "pc_range": [
1179
+ -15.0,
1180
+ -30.0,
1181
+ -2.0,
1182
+ 15.0,
1183
+ 30.0,
1184
+ 2.0
1185
+ ],
1186
+ "post_center_range": [
1187
+ -20,
1188
+ -35,
1189
+ -20,
1190
+ -35,
1191
+ 20,
1192
+ 35,
1193
+ 20,
1194
+ 35
1195
+ ],
1196
+ "type": "MapNMSFreeCoder",
1197
+ "voxel_size": [
1198
+ 0.15,
1199
+ 0.15,
1200
+ 4
1201
+ ]
1202
+ },
1203
+ "map_code_size": 2,
1204
+ "map_code_weights": [
1205
+ 1.0,
1206
+ 1.0,
1207
+ 1.0,
1208
+ 1.0
1209
+ ],
1210
+ "map_dir_interval": 1,
1211
+ "map_gt_shift_pts_pattern": "v2",
1212
+ "map_num_classes": 6,
1213
+ "map_num_pts_per_gt_vec": 20,
1214
+ "map_num_pts_per_vec": 20,
1215
+ "map_num_vec": 100,
1216
+ "map_query_embed_type": "instance_pts",
1217
+ "map_thresh": 0.5,
1218
+ "map_transform_method": "minmax",
1219
+ "motion_decoder": {
1220
+ "num_layers": 1,
1221
+ "return_intermediate": false,
1222
+ "transformerlayers": {
1223
+ "attn_cfgs": [
1224
+ {
1225
+ "dropout": 0.0,
1226
+ "embed_dims": 256,
1227
+ "num_heads": 8,
1228
+ "type": "MultiheadAttention"
1229
+ }
1230
+ ],
1231
+ "feedforward_channels": 512,
1232
+ "ffn_dropout": 0.0,
1233
+ "operation_order": [
1234
+ "cross_attn",
1235
+ "norm",
1236
+ "ffn",
1237
+ "norm"
1238
+ ],
1239
+ "type": "BaseTransformerLayer"
1240
+ },
1241
+ "type": "CustomTransformerDecoder"
1242
+ },
1243
+ "motion_map_decoder": {
1244
+ "num_layers": 1,
1245
+ "return_intermediate": false,
1246
+ "transformerlayers": {
1247
+ "attn_cfgs": [
1248
+ {
1249
+ "dropout": 0.0,
1250
+ "embed_dims": 256,
1251
+ "num_heads": 8,
1252
+ "type": "MultiheadAttention"
1253
+ }
1254
+ ],
1255
+ "feedforward_channels": 512,
1256
+ "ffn_dropout": 0.0,
1257
+ "operation_order": [
1258
+ "cross_attn",
1259
+ "norm",
1260
+ "ffn",
1261
+ "norm"
1262
+ ],
1263
+ "type": "BaseTransformerLayer"
1264
+ },
1265
+ "type": "CustomTransformerDecoder"
1266
+ },
1267
+ "num_classes": 9,
1268
+ "num_query": 300,
1269
+ "pe_normalization": true,
1270
+ "positional_encoding": {
1271
+ "col_num_embed": 200,
1272
+ "num_feats": 128,
1273
+ "row_num_embed": 200,
1274
+ "type": "LearnedPositionalEncoding"
1275
+ },
1276
+ "query_thresh": 0.0,
1277
+ "query_use_fix_pad": false,
1278
+ "sync_cls_avg_factor": true,
1279
+ "tot_epoch": 6,
1280
+ "transformer": {
1281
+ "decoder": {
1282
+ "num_layers": 6,
1283
+ "return_intermediate": true,
1284
+ "transformerlayers": {
1285
+ "attn_cfgs": [
1286
+ {
1287
+ "dropout": 0.0,
1288
+ "embed_dims": 256,
1289
+ "num_heads": 8,
1290
+ "type": "MultiheadAttention"
1291
+ },
1292
+ {
1293
+ "embed_dims": 256,
1294
+ "num_levels": 1,
1295
+ "type": "CustomMSDeformableAttention"
1296
+ }
1297
+ ],
1298
+ "feedforward_channels": 512,
1299
+ "ffn_dropout": 0.0,
1300
+ "operation_order": [
1301
+ "self_attn",
1302
+ "norm",
1303
+ "cross_attn",
1304
+ "norm",
1305
+ "ffn",
1306
+ "norm"
1307
+ ],
1308
+ "type": "DetrTransformerDecoderLayer"
1309
+ },
1310
+ "type": "DetectionTransformerDecoder"
1311
+ },
1312
+ "embed_dims": 256,
1313
+ "encoder": {
1314
+ "num_layers": 6,
1315
+ "num_points_in_pillar": 4,
1316
+ "pc_range": [
1317
+ -15.0,
1318
+ -30.0,
1319
+ -2.0,
1320
+ 15.0,
1321
+ 30.0,
1322
+ 2.0
1323
+ ],
1324
+ "return_intermediate": false,
1325
+ "transformerlayers": {
1326
+ "attn_cfgs": [
1327
+ {
1328
+ "embed_dims": 256,
1329
+ "num_levels": 1,
1330
+ "type": "TemporalSelfAttention"
1331
+ },
1332
+ {
1333
+ "deformable_attention": {
1334
+ "embed_dims": 256,
1335
+ "num_levels": 4,
1336
+ "num_points": 8,
1337
+ "type": "MSDeformableAttention3D"
1338
+ },
1339
+ "embed_dims": 256,
1340
+ "pc_range": [
1341
+ -15.0,
1342
+ -30.0,
1343
+ -2.0,
1344
+ 15.0,
1345
+ 30.0,
1346
+ 2.0
1347
+ ],
1348
+ "type": "SpatialCrossAttention"
1349
+ }
1350
+ ],
1351
+ "feedforward_channels": 512,
1352
+ "ffn_dropout": 0.0,
1353
+ "operation_order": [
1354
+ "self_attn",
1355
+ "norm",
1356
+ "cross_attn",
1357
+ "norm",
1358
+ "ffn",
1359
+ "norm"
1360
+ ],
1361
+ "type": "BEVFormerLayer"
1362
+ },
1363
+ "type": "BEVFormerEncoder"
1364
+ },
1365
+ "map_decoder": {
1366
+ "num_layers": 6,
1367
+ "return_intermediate": true,
1368
+ "transformerlayers": {
1369
+ "attn_cfgs": [
1370
+ {
1371
+ "dropout": 0.0,
1372
+ "embed_dims": 256,
1373
+ "num_heads": 8,
1374
+ "type": "MultiheadAttention"
1375
+ },
1376
+ {
1377
+ "embed_dims": 256,
1378
+ "num_levels": 1,
1379
+ "type": "CustomMSDeformableAttention"
1380
+ }
1381
+ ],
1382
+ "feedforward_channels": 512,
1383
+ "ffn_dropout": 0.0,
1384
+ "operation_order": [
1385
+ "self_attn",
1386
+ "norm",
1387
+ "cross_attn",
1388
+ "norm",
1389
+ "ffn",
1390
+ "norm"
1391
+ ],
1392
+ "type": "DetrTransformerDecoderLayer"
1393
+ },
1394
+ "type": "MapDetectionTransformerDecoder"
1395
+ },
1396
+ "map_num_pts_per_vec": 20,
1397
+ "map_num_vec": 100,
1398
+ "rotate_prev_bev": true,
1399
+ "type": "VADPerceptionTransformer",
1400
+ "use_can_bus": true,
1401
+ "use_shift": true
1402
+ },
1403
+ "type": "VADHead",
1404
+ "use_pe": true,
1405
+ "use_traj_lr_warmup": false,
1406
+ "valid_fut_ts": 6,
1407
+ "with_box_refine": true
1408
+ },
1409
+ "train_cfg": {
1410
+ "pts": {
1411
+ "assigner": {
1412
+ "cls_cost": {
1413
+ "type": "FocalLossCost",
1414
+ "weight": 2.0
1415
+ },
1416
+ "iou_cost": {
1417
+ "type": "IoUCost",
1418
+ "weight": 0.0
1419
+ },
1420
+ "pc_range": [
1421
+ -15.0,
1422
+ -30.0,
1423
+ -2.0,
1424
+ 15.0,
1425
+ 30.0,
1426
+ 2.0
1427
+ ],
1428
+ "reg_cost": {
1429
+ "type": "BBox3DL1Cost",
1430
+ "weight": 0.25
1431
+ },
1432
+ "type": "HungarianAssigner3D"
1433
+ },
1434
+ "grid_size": [
1435
+ 512,
1436
+ 512,
1437
+ 1
1438
+ ],
1439
+ "map_assigner": {
1440
+ "cls_cost": {
1441
+ "type": "FocalLossCost",
1442
+ "weight": 2.0
1443
+ },
1444
+ "iou_cost": {
1445
+ "iou_mode": "giou",
1446
+ "type": "IoUCost",
1447
+ "weight": 0.0
1448
+ },
1449
+ "pc_range": [
1450
+ -15.0,
1451
+ -30.0,
1452
+ -2.0,
1453
+ 15.0,
1454
+ 30.0,
1455
+ 2.0
1456
+ ],
1457
+ "pts_cost": {
1458
+ "type": "OrderedPtsL1Cost",
1459
+ "weight": 1.0
1460
+ },
1461
+ "reg_cost": {
1462
+ "box_format": "xywh",
1463
+ "type": "BBoxL1Cost",
1464
+ "weight": 0.0
1465
+ },
1466
+ "type": "MapHungarianAssigner3D"
1467
+ },
1468
+ "out_size_factor": 4,
1469
+ "point_cloud_range": [
1470
+ -15.0,
1471
+ -30.0,
1472
+ -2.0,
1473
+ 15.0,
1474
+ 30.0,
1475
+ 2.0
1476
+ ],
1477
+ "voxel_size": [
1478
+ 0.15,
1479
+ 0.15,
1480
+ 4
1481
+ ]
1482
+ }
1483
+ },
1484
+ "type": "ApolloVAD",
1485
+ "use_grid_mask": true,
1486
+ "video_test_mode": true
1487
+ },
1488
+ "num_classes": 9,
1489
+ "optimizer": {
1490
+ "lr": 0.0002,
1491
+ "paramwise_cfg": {
1492
+ "custom_keys": {
1493
+ "img_backbone": {
1494
+ "lr_mult": 0.1
1495
+ }
1496
+ }
1497
+ },
1498
+ "type": "AdamW",
1499
+ "weight_decay": 0.01
1500
+ },
1501
+ "optimizer_config": {
1502
+ "grad_clip": {
1503
+ "max_norm": 35,
1504
+ "norm_type": 2
1505
+ }
1506
+ },
1507
+ "past_frames": 4,
1508
+ "point_cloud_range": [
1509
+ -15.0,
1510
+ -30.0,
1511
+ -2.0,
1512
+ 15.0,
1513
+ 30.0,
1514
+ 2.0
1515
+ ],
1516
+ "queue_length": 4,
1517
+ "resume_from": null,
1518
+ "runner": {
1519
+ "max_epochs": 6,
1520
+ "type": "EpochBasedRunner"
1521
+ },
1522
+ "test_pipeline": [
1523
+ {
1524
+ "to_float32": true,
1525
+ "type": "LoadMultiViewImageFromFiles"
1526
+ },
1527
+ {
1528
+ "flip": false,
1529
+ "img_scale": [
1530
+ 1600,
1531
+ 900
1532
+ ],
1533
+ "pts_scale_ratio": 1,
1534
+ "transforms": [
1535
+ {
1536
+ "class_names": [
1537
+ "car",
1538
+ "van",
1539
+ "truck",
1540
+ "bicycle",
1541
+ "traffic_sign",
1542
+ "traffic_cone",
1543
+ "traffic_light",
1544
+ "pedestrian",
1545
+ "others"
1546
+ ],
1547
+ "type": "VADFormatBundle3D",
1548
+ "with_ego": true,
1549
+ "with_label": false
1550
+ },
1551
+ {
1552
+ "keys": [
1553
+ "ego_his_trajs",
1554
+ "ego_fut_trajs",
1555
+ "ego_fut_masks",
1556
+ "ego_fut_cmd",
1557
+ "ego_lcf_feat",
1558
+ "navi_points",
1559
+ "navi_mask",
1560
+ "cam_front_path",
1561
+ "folder",
1562
+ "frame_idx"
1563
+ ],
1564
+ "type": "CustomCollect3D"
1565
+ }
1566
+ ],
1567
+ "type": "MultiScaleFlipAug3D"
1568
+ }
1569
+ ],
1570
+ "total_epochs": 6,
1571
+ "train_pipeline": [
1572
+ {
1573
+ "to_float32": true,
1574
+ "type": "LoadMultiViewImageFromFiles"
1575
+ },
1576
+ {
1577
+ "class_names": [
1578
+ "car",
1579
+ "van",
1580
+ "truck",
1581
+ "bicycle",
1582
+ "traffic_sign",
1583
+ "traffic_cone",
1584
+ "traffic_light",
1585
+ "pedestrian",
1586
+ "others"
1587
+ ],
1588
+ "type": "VADFormatBundle3D",
1589
+ "with_ego": true
1590
+ },
1591
+ {
1592
+ "keys": [
1593
+ "ego_his_trajs",
1594
+ "ego_fut_trajs",
1595
+ "ego_fut_masks",
1596
+ "ego_fut_cmd",
1597
+ "ego_lcf_feat",
1598
+ "navi_points",
1599
+ "navi_mask",
1600
+ "cam_front_path",
1601
+ "folder",
1602
+ "frame_idx"
1603
+ ],
1604
+ "type": "CustomCollect3D"
1605
+ }
1606
+ ],
1607
+ "voxel_size": [
1608
+ 0.15,
1609
+ 0.15,
1610
+ 4
1611
+ ],
1612
+ "work_dir": null,
1613
+ "workflow": [
1614
+ [
1615
+ "train",
1616
+ 1
1617
+ ]
1618
+ ]
1619
+ },
1620
+ "perception_loss_weight": 1.0,
1621
+ "rms_norm_eps": 1e-06,
1622
+ "rope_scaling": null,
1623
+ "rope_theta": 1000000.0,
1624
+ "sliding_window": 32768,
1625
+ "text_data_path": "Bench2DriveZoo/data/final_QA/TRAIN_1",
1626
+ "tie_word_embeddings": true,
1627
+ "tokenizer_model_max_length": 32768,
1628
+ "tokenizer_padding_side": "right",
1629
+ "torch_dtype": "float32",
1630
+ "transformers_version": "4.39.3",
1631
+ "tune_mm_mlp_adapter": false,
1632
+ "use_cache": false,
1633
+ "use_clip_img_encoder": true,
1634
+ "use_command_encoder": false,
1635
+ "use_map_encoder": false,
1636
+ "use_mm_proj": true,
1637
+ "use_navi_encoder": false,
1638
+ "use_perception_encoder": false,
1639
+ "use_sliding_window": false,
1640
+ "use_text_prompts": true,
1641
+ "vision_tower_pretrained": "",
1642
+ "vocab_size": 151936
1643
+ }
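
Review note on the config.json hunk above: several values in it can be cross-checked against each other. The sketch below is illustrative only and is not part of the commit. It copies `point_cloud_range`, `bev_h`, `bev_w`, `num_classes`, `map_num_classes`, the nine training `class_names` and the six `map_classes` from the diff. It assumes the range is ordered `[x_min, y_min, z_min, x_max, y_max, z_max]` and that, as in BEVFormer-style code, the y extent is divided by `bev_h` and the x extent by `bev_w`. The helper names `bev_cell_size` and `check_counts` are hypothetical.

```python
# Illustrative consistency check; all values are copied from the config.json diff.
# The helper names are hypothetical and are not part of the uploaded model code.

# Assumed order: x_min, y_min, z_min, x_max, y_max, z_max (metres).
POINT_CLOUD_RANGE = [-15.0, -30.0, -2.0, 15.0, 30.0, 2.0]
BEV_H, BEV_W = 200, 200

CLASS_NAMES = [
    "car", "van", "truck", "bicycle", "traffic_sign",
    "traffic_cone", "traffic_light", "pedestrian", "others",
]
MAP_CLASSES = ["Broken", "Solid", "SolidSolid", "Center", "TrafficLight", "StopSign"]
NUM_CLASSES = 9
MAP_NUM_CLASSES = 6


def bev_cell_size(pc_range, bev_h, bev_w):
    """Return (metres per BEV row, metres per BEV column).

    Assumption: rows span the y extent and columns span the x extent.
    """
    x_min, y_min, _z_min, x_max, y_max, _z_max = pc_range
    return (y_max - y_min) / bev_h, (x_max - x_min) / bev_w


def check_counts():
    """Return a list of mismatches between declared and actual class counts."""
    problems = []
    if len(CLASS_NAMES) != NUM_CLASSES:
        problems.append(f"num_classes={NUM_CLASSES}, but {len(CLASS_NAMES)} class_names")
    if len(MAP_CLASSES) != MAP_NUM_CLASSES:
        problems.append(f"map_num_classes={MAP_NUM_CLASSES}, but {len(MAP_CLASSES)} map_classes")
    return problems


if __name__ == "__main__":
    row_m, col_m = bev_cell_size(POINT_CLOUD_RANGE, BEV_H, BEV_W)
    print(f"BEV cell: {row_m} m per row, {col_m} m per column")
    print("count mismatches:", check_counts())
```

Under these assumptions the 200 x 200 grid has cells of 0.3 m by 0.15 m, and the declared class counts match the lists. The 0.15 m column width coincides with the first two entries of `voxel_size`; the 0.3 m row height does not, which may or may not matter depending on how the downstream code uses `voxel_size`.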
generation_config.json ADDED
@@ -0,0 +1,13 @@
1
+ {
2
+ "attn_implementation": "flash_attention_2",
3
+ "bos_token_id": 151643,
4
+ "do_sample": true,
5
+ "eos_token_id": [
6
+ 151645,
7
+ 151643
8
+ ],
9
+ "pad_token_id": 151643,
10
+ "repetition_penalty": 1.1,
11
+ "top_p": 0.8,
12
+ "transformers_version": "4.39.3"
13
+ }
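
Review note on the generation_config.json hunk above: with `do_sample` set to true, `top_p` 0.8 and `repetition_penalty` 1.1, decoding is stochastic. The sketch below shows, in plain Python, what those two settings conventionally mean: logits of already-generated tokens are pushed down, then sampling is restricted to the smallest set of tokens whose probability mass reaches `top_p`. It is a simplified illustration, not the library's implementation, and the function names are hypothetical.

```python
import math
import random

# Values copied from generation_config.json in the diff.
TOP_P = 0.8
REPETITION_PENALTY = 1.1


def apply_repetition_penalty(logits, generated_ids, penalty):
    """Lower the logits of tokens that were already generated.

    Positive logits are divided by the penalty and negative ones are
    multiplied by it, so both move towards "less likely".
    """
    out = list(logits)
    for tok in set(generated_ids):
        out[tok] = out[tok] / penalty if out[tok] > 0 else out[tok] * penalty
    return out


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the most probable tokens until their mass reaches top_p, then renormalise."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    return {i: probs[i] / mass for i in kept}


def sample_next(logits, generated_ids, rng=random):
    """Draw one token id using the repetition penalty and nucleus filtering."""
    penalised = apply_repetition_penalty(logits, generated_ids, REPETITION_PENALTY)
    dist = top_p_filter(softmax(penalised), TOP_P)
    ids = list(dist)
    return rng.choices(ids, weights=[dist[i] for i in ids], k=1)[0]


if __name__ == "__main__":
    toy_logits = [2.0, 1.5, 0.2, -1.0]  # made-up values for a 4-token vocabulary
    print(sample_next(toy_logits, generated_ids=[0], rng=random.Random(0)))
```

Because sampling is enabled, two runs over the same prompt can differ unless a seed is fixed, which is worth keeping in mind when comparing evaluation runs.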
model.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:6994ad2439ed49c1167e47872dc84584fccc192a475053a274d3b61d325f2da3
3
+ size 3455961848
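
Review note on the model.safetensors hunk above: the three added lines are a Git LFS pointer, not the weights themselves. The sketch below parses the pointer text copied from the diff and derives a rough upper bound on the parameter count, assuming every tensor is stored as 4-byte float32 (consistent with `"torch_dtype": "float32"` in config.json) and ignoring the safetensors header. The function names are hypothetical.

```python
# Pointer text copied from the model.safetensors diff.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:6994ad2439ed49c1167e47872dc84584fccc192a475053a274d3b61d325f2da3
size 3455961848
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.splitlines():
        key, _, value = line.partition(" ")
        if key:
            fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def approx_param_count(size_bytes, bytes_per_param=4):
    """Upper bound on parameters, assuming a single dtype and no file header."""
    return size_bytes // bytes_per_param


if __name__ == "__main__":
    info = parse_lfs_pointer(POINTER_TEXT)
    print(f"{info['size'] / 2**30:.2f} GiB, "
          f"at most ~{approx_param_count(info['size']) / 1e6:.0f}M float32 parameters")
```

The file is about 3.22 GiB, which under the float32 assumption corresponds to at most roughly 864 million parameters. The true count is slightly lower because the header occupies part of the file, and it would differ if any tensors use another dtype.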