stevenbucaille committed on
Commit cb86ff6 · 1 Parent(s): 71d0c15

feat: update object detection labels in config.json and corresponding usage example in README.

Files changed (2):
  1. README.md +87 -195
  2. config.json +733 -733
README.md CHANGED
@@ -1,199 +1,91 @@
1
  ---
 
 
 
 
 
 
 
2
  library_name: transformers
3
- tags: []
4
  ---
5
 
6
- # Model Card for Model ID
7
-
8
- <!-- Provide a quick summary of what the model is/does. -->
9
-
10
-
11
-
12
- ## Model Details
13
-
14
- ### Model Description
15
-
16
- <!-- Provide a longer summary of what this model is. -->
17
-
18
- This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
19
-
20
- - **Developed by:** [More Information Needed]
21
- - **Funded by [optional]:** [More Information Needed]
22
- - **Shared by [optional]:** [More Information Needed]
23
- - **Model type:** [More Information Needed]
24
- - **Language(s) (NLP):** [More Information Needed]
25
- - **License:** [More Information Needed]
26
- - **Finetuned from model [optional]:** [More Information Needed]
27
-
28
- ### Model Sources [optional]
29
-
30
- <!-- Provide the basic links for the model. -->
31
-
32
- - **Repository:** [More Information Needed]
33
- - **Paper [optional]:** [More Information Needed]
34
- - **Demo [optional]:** [More Information Needed]
35
-
36
- ## Uses
37
-
38
- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
39
-
40
- ### Direct Use
41
-
42
- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
43
-
44
- [More Information Needed]
45
-
46
- ### Downstream Use [optional]
47
-
48
- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
49
-
50
- [More Information Needed]
51
-
52
- ### Out-of-Scope Use
53
-
54
- <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
55
-
56
- [More Information Needed]
57
-
58
- ## Bias, Risks, and Limitations
59
-
60
- <!-- This section is meant to convey both technical and sociotechnical limitations. -->
61
-
62
- [More Information Needed]
63
-
64
- ### Recommendations
65
-
66
- <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
67
-
68
- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
69
-
70
- ## How to Get Started with the Model
71
-
72
- Use the code below to get started with the model.
73
-
74
- [More Information Needed]
75
-
76
- ## Training Details
77
-
78
- ### Training Data
79
-
80
- <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
81
-
82
- [More Information Needed]
83
-
84
- ### Training Procedure
85
-
86
- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
87
-
88
- #### Preprocessing [optional]
89
-
90
- [More Information Needed]
91
-
92
-
93
- #### Training Hyperparameters
94
-
95
- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
96
-
97
- #### Speeds, Sizes, Times [optional]
98
-
99
- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
100
-
101
- [More Information Needed]
102
-
103
- ## Evaluation
104
-
105
- <!-- This section describes the evaluation protocols and provides the results. -->
106
-
107
- ### Testing Data, Factors & Metrics
108
-
109
- #### Testing Data
110
-
111
- <!-- This should link to a Dataset Card if possible. -->
112
-
113
- [More Information Needed]
114
-
115
- #### Factors
116
-
117
- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
118
-
119
- [More Information Needed]
120
-
121
- #### Metrics
122
-
123
- <!-- These are the evaluation metrics being used, ideally with a description of why. -->
124
-
125
- [More Information Needed]
126
-
127
- ### Results
128
-
129
- [More Information Needed]
130
-
131
- #### Summary
132
-
133
-
134
-
135
- ## Model Examination [optional]
136
-
137
- <!-- Relevant interpretability work for the model goes here -->
138
-
139
- [More Information Needed]
140
-
141
- ## Environmental Impact
142
-
143
- <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
144
-
145
- Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
146
-
147
- - **Hardware Type:** [More Information Needed]
148
- - **Hours used:** [More Information Needed]
149
- - **Cloud Provider:** [More Information Needed]
150
- - **Compute Region:** [More Information Needed]
151
- - **Carbon Emitted:** [More Information Needed]
152
-
153
- ## Technical Specifications [optional]
154
-
155
- ### Model Architecture and Objective
156
-
157
- [More Information Needed]
158
-
159
- ### Compute Infrastructure
160
-
161
- [More Information Needed]
162
-
163
- #### Hardware
164
-
165
- [More Information Needed]
166
-
167
- #### Software
168
-
169
- [More Information Needed]
170
-
171
- ## Citation [optional]
172
-
173
- <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
174
-
175
- **BibTeX:**
176
-
177
- [More Information Needed]
178
-
179
- **APA:**
180
-
181
- [More Information Needed]
182
-
183
- ## Glossary [optional]
184
-
185
- <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
186
-
187
- [More Information Needed]
188
-
189
- ## More Information [optional]
190
-
191
- [More Information Needed]
192
-
193
- ## Model Card Authors [optional]
194
-
195
- [More Information Needed]
196
-
197
- ## Model Card Contact
198
-
199
- [More Information Needed]
 
1
  ---
2
+ license: apache-2.0
3
+ tags:
4
+ - object-detection
5
+ - vision
6
+ datasets:
7
+ - coco
8
+ pipeline_tag: object-detection
9
  library_name: transformers
 
10
  ---
11
 
12
+ # LW-DETR (Light-Weight Detection Transformer)
13
+
14
+ LW-DETR (Light-Weight DEtection TRansformer) is a real-time object detection model designed to outperform both conventional convolutional (YOLO-style) detectors and earlier transformer-based (DETR) methods in the speed-accuracy trade-off. It was introduced in the paper [LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection](https://huggingface.co/papers/2406.03459) by Chen et al. and first released in this repository.
15
+ Disclaimer: This model was contributed to 🤗 transformers by [stevenbucaille](https://huggingface.co/stevenbucaille).
16
+
17
+ ## Model description
18
+
19
+ LW-DETR is an end-to-end object detection model that uses a Vision Transformer (ViT) backbone as its encoder, a simple convolutional projector, and a shallow DETR decoder. The core philosophy is to leverage the power of transformers while implementing several efficiency-focused techniques to achieve real-time performance.
20
+
21
+ Key Architectural Details:
22
+ - ViT Encoder: Uses a plain ViT architecture. To reduce the quadratic complexity of global self-attention, it interleaves window attention and global attention across layers.
23
+ - Window-Major Organization: Employs a window-major feature map organization for attention computation, which avoids the costly memory permutations otherwise needed when switching between global and window attention, lowering inference latency.
24
+ - Feature Aggregation: It aggregates features from multiple levels (intermediate and final layers) of the ViT encoder to create richer input for the decoder.
25
+ - Projector: A C2f block (from YOLOv8) connects the encoder and decoder. For larger versions (large/xlarge), it outputs two-scale features ($1/8$ and $1/32$) to the decoder.
26
+ - Shallow DETR Decoder: It uses a computationally efficient 3-layer transformer decoder (instead of the standard 6 layers), incorporating deformable cross-attention for faster convergence and lower latency.
27
+ - Object Queries: It uses a mixed-query selection scheme to form the object queries from both learnable content queries and generated spatial queries (based on top-K features from the Projector).
28
+
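To make the window-major idea concrete, here is a toy sketch in plain Python (a hypothetical illustration only; the actual model permutes feature tensors, not index lists). It lists the pixel indices of an `h`×`w` feature map window by window, so that each window occupies a contiguous run of memory:

```python
def window_major_order(h, w, window):
    """Return flat pixel indices of an h x w grid grouped window by window.

    In this layout each `window x window` block is contiguous, so window
    attention needs no permutation; only the (rarer) global attention
    layers see a non-row-major order.
    """
    order = []
    for wy in range(0, h, window):          # iterate over window rows
        for wx in range(0, w, window):      # iterate over window columns
            for y in range(wy, wy + window):
                for x in range(wx, wx + window):
                    order.append(y * w + x)
    return order

# 4x4 map with 2x2 windows: the top-left window's pixels come first
print(window_major_order(4, 4, 2))
# → [0, 1, 4, 5, 2, 3, 6, 7, 8, 9, 12, 13, 10, 11, 14, 15]
```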
29
+ Training Details:
30
+ - IoU-aware Classification Loss (IA-BCE loss): Enhances the classification branch by incorporating IoU information into the target score $t=s^{\alpha}u^{1-\alpha}$.
31
+ - Group DETR: Uses a Group DETR strategy (13 parallel weight-sharing decoders) for faster training convergence without affecting inference speed.
32
+ - Pretraining: Uses a two-stage pretraining strategy: first, ViT is pretrained on Objects365 using a Masked Image Modeling (MIM) method (CAEv2), followed by supervised retraining of the encoder and training of the projector and decoder on Objects365. This provides a significant performance boost (average of $\approx 5.5\text{ mAP}$).
33
+
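The IA-BCE target score from the list above can be sketched numerically; the `alpha` default below is an illustrative placeholder, not necessarily the paper's setting:

```python
def ia_bce_target(s, u, alpha=0.25):
    """IoU-aware classification target t = s**alpha * u**(1 - alpha).

    s:     predicted classification score in [0, 1]
    u:     IoU between the predicted box and its matched ground-truth box
    alpha: mixing exponent (illustrative default)
    """
    return (s ** alpha) * (u ** (1 - alpha))

# A confident prediction with poor localization gets a reduced target:
print(ia_bce_target(0.9, 0.3))  # lower than the raw score 0.9
```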
34
+ ### How to use
35
+
36
+ You can use the raw model for object detection. See the [model hub](https://huggingface.co/models?search=stevenbucaille/lw-detr) for all available LW-DETR models.
37
+
38
+ Here is how to use this model:
39
+
40
+ ```python
41
+ from transformers import AutoImageProcessor, LwDetrForObjectDetection
42
+ import torch
43
+ from PIL import Image
44
+ import requests
45
+
46
+ url = "http://images.cocodataset.org/val2017/000000039769.jpg"
47
+ image = Image.open(requests.get(url, stream=True).raw)
48
+
49
+ processor = AutoImageProcessor.from_pretrained("stevenbucaille/lwdetr_tiny_30e_objects365")
50
+ model = LwDetrForObjectDetection.from_pretrained("stevenbucaille/lwdetr_tiny_30e_objects365")
51
+
52
+ inputs = processor(images=image, return_tensors="pt")
53
+ outputs = model(**inputs)
54
+
55
+ # convert outputs (bounding boxes and class logits) to COCO API
56
+ # let's only keep detections with score > 0.7
57
+ target_sizes = torch.tensor([image.size[::-1]])
58
+ results = processor.post_process_object_detection(outputs, target_sizes=target_sizes, threshold=0.7)[0]
59
+
60
+ for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
61
+     box = [round(i, 2) for i in box.tolist()]
62
+     print(
63
+         f"Detected {model.config.id2label[label.item()]} with confidence "
64
+         f"{round(score.item(), 3)} at location {box}"
65
+     )
66
+ ```
67
+ This should output:
68
+ ```
69
+ Detected Jug with confidence 0.868 at location [4.89, 56.55, 319.81, 474.79]
70
+ Detected Refrigerator with confidence 0.753 at location [40.56, 73.11, 176.3, 116.86]
71
+ Detected Jug with confidence 0.718 at location [340.33, 25.1, 640.32, 368.68]
72
+ ```
73
+
74
+ Currently, both the image processor and the model support PyTorch only.
75
+
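For reference, the geometry inside `post_process_object_detection` can be sketched as follows: DETR-style models predict normalized `(cx, cy, w, h)` boxes, which are converted to absolute corner coordinates. This is a simplified illustration, not the library's actual implementation:

```python
def box_cxcywh_to_xyxy(cx, cy, w, h):
    """Convert a normalized center-format box to (x_min, y_min, x_max, y_max)."""
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)

def to_absolute(box, img_w, img_h):
    """Scale a normalized xyxy box to pixel coordinates of an img_w x img_h image."""
    x0, y0, x1, y1 = box
    return (x0 * img_w, y0 * img_h, x1 * img_w, y1 * img_h)

# A box centered in a 640x480 image, covering half of each dimension:
print(to_absolute(box_cxcywh_to_xyxy(0.5, 0.5, 0.5, 0.5), 640, 480))
# → (160.0, 120.0, 480.0, 360.0)
```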
76
+ ## Training data
77
+
78
+ The LW-DETR models are pretrained and fine-tuned on the following datasets:
79
+ - Pretraining: Primarily conducted on [Objects365](https://www.objects365.org/overview.html), a large-scale, high-quality dataset for object detection.
80
+ - Finetuning: Final training is performed on the standard [COCO 2017 object detection dataset](https://cocodataset.org/#home).
81
+
82
+ ### BibTeX entry and citation info
83
+
84
+ ```bibtex
85
+ @article{chen2024lw,
86
+ title={LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection},
87
+ author={Chen, Qiang and Su, Xiangbo and Zhang, Xinyu and Wang, Jian and Chen, Jiahui and Shen, Yunpeng and Han, Chuchu and Chen, Ziliang and Xu, Weixiang and Li, Fanrong and others},
88
+ journal={arXiv preprint arXiv:2406.03459},
89
+ year={2024}
90
+ }
91
+ ```
 
config.json CHANGED
@@ -75,741 +75,741 @@
75
  "group_detr": 13,
76
  "hidden_expansion": 0.5,
77
  "id2label": {
78
- "0": "LABEL_0",
79
- "1": "LABEL_1",
80
- "2": "LABEL_2",
81
- "3": "LABEL_3",
82
- "4": "LABEL_4",
83
- "5": "LABEL_5",
84
- "6": "LABEL_6",
85
- "7": "LABEL_7",
86
- "8": "LABEL_8",
87
- "9": "LABEL_9",
88
- "10": "LABEL_10",
89
- "11": "LABEL_11",
90
- "12": "LABEL_12",
91
- "13": "LABEL_13",
92
- "14": "LABEL_14",
93
- "15": "LABEL_15",
94
- "16": "LABEL_16",
95
- "17": "LABEL_17",
96
- "18": "LABEL_18",
97
- "19": "LABEL_19",
98
- "20": "LABEL_20",
99
- "21": "LABEL_21",
100
- "22": "LABEL_22",
101
- "23": "LABEL_23",
102
- "24": "LABEL_24",
103
- "25": "LABEL_25",
104
- "26": "LABEL_26",
105
- "27": "LABEL_27",
106
- "28": "LABEL_28",
107
- "29": "LABEL_29",
108
- "30": "LABEL_30",
109
- "31": "LABEL_31",
110
- "32": "LABEL_32",
111
- "33": "LABEL_33",
112
- "34": "LABEL_34",
113
- "35": "LABEL_35",
114
- "36": "LABEL_36",
115
- "37": "LABEL_37",
116
- "38": "LABEL_38",
117
- "39": "LABEL_39",
118
- "40": "LABEL_40",
119
- "41": "LABEL_41",
120
- "42": "LABEL_42",
121
- "43": "LABEL_43",
122
- "44": "LABEL_44",
123
- "45": "LABEL_45",
124
- "46": "LABEL_46",
125
- "47": "LABEL_47",
126
- "48": "LABEL_48",
127
- "49": "LABEL_49",
128
- "50": "LABEL_50",
129
- "51": "LABEL_51",
130
- "52": "LABEL_52",
131
- "53": "LABEL_53",
132
- "54": "LABEL_54",
133
- "55": "LABEL_55",
134
- "56": "LABEL_56",
135
- "57": "LABEL_57",
136
- "58": "LABEL_58",
137
- "59": "LABEL_59",
138
- "60": "LABEL_60",
139
- "61": "LABEL_61",
140
- "62": "LABEL_62",
141
- "63": "LABEL_63",
142
- "64": "LABEL_64",
143
- "65": "LABEL_65",
144
- "66": "LABEL_66",
145
- "67": "LABEL_67",
146
- "68": "LABEL_68",
147
- "69": "LABEL_69",
148
- "70": "LABEL_70",
149
- "71": "LABEL_71",
150
- "72": "LABEL_72",
151
- "73": "LABEL_73",
152
- "74": "LABEL_74",
153
- "75": "LABEL_75",
154
- "76": "LABEL_76",
155
- "77": "LABEL_77",
156
- "78": "LABEL_78",
157
- "79": "LABEL_79",
158
- "80": "LABEL_80",
159
- "81": "LABEL_81",
160
- "82": "LABEL_82",
161
- "83": "LABEL_83",
162
- "84": "LABEL_84",
163
- "85": "LABEL_85",
164
- "86": "LABEL_86",
165
- "87": "LABEL_87",
166
- "88": "LABEL_88",
167
- "89": "LABEL_89",
168
- "90": "LABEL_90",
169
- "91": "LABEL_91",
170
- "92": "LABEL_92",
171
- "93": "LABEL_93",
172
- "94": "LABEL_94",
173
- "95": "LABEL_95",
174
- "96": "LABEL_96",
175
- "97": "LABEL_97",
176
- "98": "LABEL_98",
177
- "99": "LABEL_99",
178
- "100": "LABEL_100",
179
- "101": "LABEL_101",
180
- "102": "LABEL_102",
181
- "103": "LABEL_103",
182
- "104": "LABEL_104",
183
- "105": "LABEL_105",
184
- "106": "LABEL_106",
185
- "107": "LABEL_107",
186
- "108": "LABEL_108",
187
- "109": "LABEL_109",
188
- "110": "LABEL_110",
189
- "111": "LABEL_111",
190
- "112": "LABEL_112",
191
- "113": "LABEL_113",
192
- "114": "LABEL_114",
193
- "115": "LABEL_115",
194
- "116": "LABEL_116",
195
- "117": "LABEL_117",
196
- "118": "LABEL_118",
197
- "119": "LABEL_119",
198
- "120": "LABEL_120",
199
- "121": "LABEL_121",
200
- "122": "LABEL_122",
201
- "123": "LABEL_123",
202
- "124": "LABEL_124",
203
- "125": "LABEL_125",
204
- "126": "LABEL_126",
205
- "127": "LABEL_127",
206
- "128": "LABEL_128",
207
- "129": "LABEL_129",
208
- "130": "LABEL_130",
209
- "131": "LABEL_131",
210
- "132": "LABEL_132",
211
- "133": "LABEL_133",
212
- "134": "LABEL_134",
213
- "135": "LABEL_135",
214
- "136": "LABEL_136",
215
- "137": "LABEL_137",
216
- "138": "LABEL_138",
217
- "139": "LABEL_139",
218
- "140": "LABEL_140",
219
- "141": "LABEL_141",
220
- "142": "LABEL_142",
221
- "143": "LABEL_143",
222
- "144": "LABEL_144",
223
- "145": "LABEL_145",
224
- "146": "LABEL_146",
225
- "147": "LABEL_147",
226
- "148": "LABEL_148",
227
- "149": "LABEL_149",
228
- "150": "LABEL_150",
229
- "151": "LABEL_151",
230
- "152": "LABEL_152",
231
- "153": "LABEL_153",
232
- "154": "LABEL_154",
233
- "155": "LABEL_155",
234
- "156": "LABEL_156",
235
- "157": "LABEL_157",
236
- "158": "LABEL_158",
237
- "159": "LABEL_159",
238
- "160": "LABEL_160",
239
- "161": "LABEL_161",
240
- "162": "LABEL_162",
241
- "163": "LABEL_163",
242
- "164": "LABEL_164",
243
- "165": "LABEL_165",
244
- "166": "LABEL_166",
245
- "167": "LABEL_167",
246
- "168": "LABEL_168",
247
- "169": "LABEL_169",
248
- "170": "LABEL_170",
249
- "171": "LABEL_171",
250
- "172": "LABEL_172",
251
- "173": "LABEL_173",
252
- "174": "LABEL_174",
253
- "175": "LABEL_175",
254
- "176": "LABEL_176",
255
- "177": "LABEL_177",
256
- "178": "LABEL_178",
257
- "179": "LABEL_179",
258
- "180": "LABEL_180",
259
- "181": "LABEL_181",
260
- "182": "LABEL_182",
261
- "183": "LABEL_183",
262
- "184": "LABEL_184",
263
- "185": "LABEL_185",
264
- "186": "LABEL_186",
265
- "187": "LABEL_187",
266
- "188": "LABEL_188",
267
- "189": "LABEL_189",
268
- "190": "LABEL_190",
269
- "191": "LABEL_191",
270
- "192": "LABEL_192",
271
- "193": "LABEL_193",
272
- "194": "LABEL_194",
273
- "195": "LABEL_195",
274
- "196": "LABEL_196",
275
- "197": "LABEL_197",
276
- "198": "LABEL_198",
277
- "199": "LABEL_199",
278
- "200": "LABEL_200",
279
- "201": "LABEL_201",
280
- "202": "LABEL_202",
281
- "203": "LABEL_203",
282
- "204": "LABEL_204",
283
- "205": "LABEL_205",
284
- "206": "LABEL_206",
285
- "207": "LABEL_207",
286
- "208": "LABEL_208",
287
- "209": "LABEL_209",
288
- "210": "LABEL_210",
289
- "211": "LABEL_211",
290
- "212": "LABEL_212",
291
- "213": "LABEL_213",
292
- "214": "LABEL_214",
293
- "215": "LABEL_215",
294
- "216": "LABEL_216",
295
- "217": "LABEL_217",
296
- "218": "LABEL_218",
297
- "219": "LABEL_219",
298
- "220": "LABEL_220",
299
- "221": "LABEL_221",
300
- "222": "LABEL_222",
301
- "223": "LABEL_223",
302
- "224": "LABEL_224",
303
- "225": "LABEL_225",
304
- "226": "LABEL_226",
305
- "227": "LABEL_227",
306
- "228": "LABEL_228",
307
- "229": "LABEL_229",
308
- "230": "LABEL_230",
309
- "231": "LABEL_231",
310
- "232": "LABEL_232",
311
- "233": "LABEL_233",
312
- "234": "LABEL_234",
313
- "235": "LABEL_235",
314
- "236": "LABEL_236",
315
- "237": "LABEL_237",
316
- "238": "LABEL_238",
317
- "239": "LABEL_239",
318
- "240": "LABEL_240",
319
- "241": "LABEL_241",
320
- "242": "LABEL_242",
321
- "243": "LABEL_243",
322
- "244": "LABEL_244",
323
- "245": "LABEL_245",
324
- "246": "LABEL_246",
325
- "247": "LABEL_247",
326
- "248": "LABEL_248",
327
- "249": "LABEL_249",
328
- "250": "LABEL_250",
329
- "251": "LABEL_251",
330
- "252": "LABEL_252",
331
- "253": "LABEL_253",
332
- "254": "LABEL_254",
333
- "255": "LABEL_255",
334
- "256": "LABEL_256",
335
- "257": "LABEL_257",
336
- "258": "LABEL_258",
337
- "259": "LABEL_259",
338
- "260": "LABEL_260",
339
- "261": "LABEL_261",
340
- "262": "LABEL_262",
341
- "263": "LABEL_263",
342
- "264": "LABEL_264",
343
- "265": "LABEL_265",
344
- "266": "LABEL_266",
345
- "267": "LABEL_267",
346
- "268": "LABEL_268",
347
- "269": "LABEL_269",
348
- "270": "LABEL_270",
349
- "271": "LABEL_271",
350
- "272": "LABEL_272",
351
- "273": "LABEL_273",
352
- "274": "LABEL_274",
353
- "275": "LABEL_275",
354
- "276": "LABEL_276",
355
- "277": "LABEL_277",
356
- "278": "LABEL_278",
357
- "279": "LABEL_279",
358
- "280": "LABEL_280",
359
- "281": "LABEL_281",
360
- "282": "LABEL_282",
361
- "283": "LABEL_283",
362
- "284": "LABEL_284",
363
- "285": "LABEL_285",
364
- "286": "LABEL_286",
365
- "287": "LABEL_287",
366
- "288": "LABEL_288",
367
- "289": "LABEL_289",
368
- "290": "LABEL_290",
369
- "291": "LABEL_291",
370
- "292": "LABEL_292",
371
- "293": "LABEL_293",
372
- "294": "LABEL_294",
373
- "295": "LABEL_295",
374
- "296": "LABEL_296",
375
- "297": "LABEL_297",
376
- "298": "LABEL_298",
377
- "299": "LABEL_299",
378
- "300": "LABEL_300",
379
- "301": "LABEL_301",
380
- "302": "LABEL_302",
381
- "303": "LABEL_303",
382
- "304": "LABEL_304",
383
- "305": "LABEL_305",
384
- "306": "LABEL_306",
385
- "307": "LABEL_307",
386
- "308": "LABEL_308",
387
- "309": "LABEL_309",
388
- "310": "LABEL_310",
389
- "311": "LABEL_311",
390
- "312": "LABEL_312",
391
- "313": "LABEL_313",
392
- "314": "LABEL_314",
393
- "315": "LABEL_315",
394
- "316": "LABEL_316",
395
- "317": "LABEL_317",
396
- "318": "LABEL_318",
397
- "319": "LABEL_319",
398
- "320": "LABEL_320",
399
- "321": "LABEL_321",
400
- "322": "LABEL_322",
401
- "323": "LABEL_323",
402
- "324": "LABEL_324",
403
- "325": "LABEL_325",
404
- "326": "LABEL_326",
405
- "327": "LABEL_327",
406
- "328": "LABEL_328",
407
- "329": "LABEL_329",
408
- "330": "LABEL_330",
409
- "331": "LABEL_331",
410
- "332": "LABEL_332",
411
- "333": "LABEL_333",
412
- "334": "LABEL_334",
413
- "335": "LABEL_335",
414
- "336": "LABEL_336",
415
- "337": "LABEL_337",
416
- "338": "LABEL_338",
417
- "339": "LABEL_339",
418
- "340": "LABEL_340",
419
- "341": "LABEL_341",
420
- "342": "LABEL_342",
421
- "343": "LABEL_343",
422
- "344": "LABEL_344",
423
- "345": "LABEL_345",
424
- "346": "LABEL_346",
425
- "347": "LABEL_347",
426
- "348": "LABEL_348",
427
- "349": "LABEL_349",
428
- "350": "LABEL_350",
429
- "351": "LABEL_351",
430
- "352": "LABEL_352",
431
- "353": "LABEL_353",
432
- "354": "LABEL_354",
433
- "355": "LABEL_355",
434
- "356": "LABEL_356",
435
- "357": "LABEL_357",
436
- "358": "LABEL_358",
437
- "359": "LABEL_359",
438
- "360": "LABEL_360",
439
- "361": "LABEL_361",
440
- "362": "LABEL_362",
441
- "363": "LABEL_363",
442
- "364": "LABEL_364",
443
- "365": "LABEL_365"
444
  },
445
  "init_std": 0.02,
446
  "label2id": {
447
- "LABEL_0": 0,
448
- "LABEL_1": 1,
449
- "LABEL_10": 10,
450
- "LABEL_100": 100,
451
- "LABEL_101": 101,
452
- "LABEL_102": 102,
453
- "LABEL_103": 103,
454
- "LABEL_104": 104,
455
- "LABEL_105": 105,
456
- "LABEL_106": 106,
457
- "LABEL_107": 107,
458
- "LABEL_108": 108,
459
- "LABEL_109": 109,
460
- "LABEL_11": 11,
461
- "LABEL_110": 110,
462
- "LABEL_111": 111,
463
- "LABEL_112": 112,
464
- "LABEL_113": 113,
465
- "LABEL_114": 114,
466
- "LABEL_115": 115,
467
- "LABEL_116": 116,
468
- "LABEL_117": 117,
469
- "LABEL_118": 118,
470
- "LABEL_119": 119,
471
- "LABEL_12": 12,
472
- "LABEL_120": 120,
473
- "LABEL_121": 121,
474
- "LABEL_122": 122,
475
- "LABEL_123": 123,
476
- "LABEL_124": 124,
477
- "LABEL_125": 125,
478
- "LABEL_126": 126,
479
- "LABEL_127": 127,
480
- "LABEL_128": 128,
481
- "LABEL_129": 129,
482
- "LABEL_13": 13,
483
- "LABEL_130": 130,
484
- "LABEL_131": 131,
485
- "LABEL_132": 132,
486
- "LABEL_133": 133,
487
- "LABEL_134": 134,
488
- "LABEL_135": 135,
489
- "LABEL_136": 136,
490
- "LABEL_137": 137,
491
- "LABEL_138": 138,
492
- "LABEL_139": 139,
493
- "LABEL_14": 14,
494
- "LABEL_140": 140,
495
- "LABEL_141": 141,
496
- "LABEL_142": 142,
497
- "LABEL_143": 143,
498
- "LABEL_144": 144,
499
- "LABEL_145": 145,
500
- "LABEL_146": 146,
501
- "LABEL_147": 147,
502
- "LABEL_148": 148,
503
- "LABEL_149": 149,
504
- "LABEL_15": 15,
505
- "LABEL_150": 150,
506
- "LABEL_151": 151,
507
- "LABEL_152": 152,
508
- "LABEL_153": 153,
509
- "LABEL_154": 154,
510
- "LABEL_155": 155,
511
- "LABEL_156": 156,
512
- "LABEL_157": 157,
513
- "LABEL_158": 158,
514
- "LABEL_159": 159,
515
- "LABEL_16": 16,
516
- "LABEL_160": 160,
517
- "LABEL_161": 161,
518
- "LABEL_162": 162,
519
- "LABEL_163": 163,
520
- "LABEL_164": 164,
521
- "LABEL_165": 165,
522
- "LABEL_166": 166,
523
- "LABEL_167": 167,
524
- "LABEL_168": 168,
525
- "LABEL_169": 169,
526
- "LABEL_17": 17,
527
- "LABEL_170": 170,
528
- "LABEL_171": 171,
529
- "LABEL_172": 172,
530
- "LABEL_173": 173,
531
- "LABEL_174": 174,
532
- "LABEL_175": 175,
533
- "LABEL_176": 176,
534
- "LABEL_177": 177,
535
- "LABEL_178": 178,
536
- "LABEL_179": 179,
537
- "LABEL_18": 18,
538
- "LABEL_180": 180,
539
- "LABEL_181": 181,
540
- "LABEL_182": 182,
541
- "LABEL_183": 183,
542
- "LABEL_184": 184,
543
- "LABEL_185": 185,
544
- "LABEL_186": 186,
545
- "LABEL_187": 187,
546
- "LABEL_188": 188,
547
- "LABEL_189": 189,
548
- "LABEL_19": 19,
549
- "LABEL_190": 190,
550
- "LABEL_191": 191,
551
- "LABEL_192": 192,
552
- "LABEL_193": 193,
553
- "LABEL_194": 194,
554
- "LABEL_195": 195,
555
- "LABEL_196": 196,
556
- "LABEL_197": 197,
557
- "LABEL_198": 198,
558
- "LABEL_199": 199,
559
- "LABEL_2": 2,
560
- "LABEL_20": 20,
561
- "LABEL_200": 200,
562
- "LABEL_201": 201,
563
- "LABEL_202": 202,
564
- "LABEL_203": 203,
565
- "LABEL_204": 204,
566
- "LABEL_205": 205,
567
- "LABEL_206": 206,
568
- "LABEL_207": 207,
569
- "LABEL_208": 208,
570
- "LABEL_209": 209,
571
- "LABEL_21": 21,
572
- "LABEL_210": 210,
573
- "LABEL_211": 211,
574
- "LABEL_212": 212,
575
- "LABEL_213": 213,
576
- "LABEL_214": 214,
577
- "LABEL_215": 215,
578
- "LABEL_216": 216,
579
- "LABEL_217": 217,
580
- "LABEL_218": 218,
581
- "LABEL_219": 219,
582
- "LABEL_22": 22,
583
- "LABEL_220": 220,
584
- "LABEL_221": 221,
585
- "LABEL_222": 222,
586
- "LABEL_223": 223,
587
- "LABEL_224": 224,
588
- "LABEL_225": 225,
589
- "LABEL_226": 226,
590
- "LABEL_227": 227,
591
- "LABEL_228": 228,
592
- "LABEL_229": 229,
593
- "LABEL_23": 23,
594
- "LABEL_230": 230,
595
- "LABEL_231": 231,
596
- "LABEL_232": 232,
597
- "LABEL_233": 233,
598
- "LABEL_234": 234,
599
- "LABEL_235": 235,
600
- "LABEL_236": 236,
601
- "LABEL_237": 237,
602
- "LABEL_238": 238,
603
- "LABEL_239": 239,
604
- "LABEL_24": 24,
605
- "LABEL_240": 240,
606
- "LABEL_241": 241,
607
- "LABEL_242": 242,
608
- "LABEL_243": 243,
609
- "LABEL_244": 244,
610
- "LABEL_245": 245,
611
- "LABEL_246": 246,
612
- "LABEL_247": 247,
613
- "LABEL_248": 248,
614
- "LABEL_249": 249,
615
- "LABEL_25": 25,
616
- "LABEL_250": 250,
617
- "LABEL_251": 251,
618
- "LABEL_252": 252,
619
- "LABEL_253": 253,
620
- "LABEL_254": 254,
621
- "LABEL_255": 255,
622
- "LABEL_256": 256,
623
- "LABEL_257": 257,
624
- "LABEL_258": 258,
625
- "LABEL_259": 259,
626
- "LABEL_26": 26,
627
- "LABEL_260": 260,
628
- "LABEL_261": 261,
629
- "LABEL_262": 262,
630
- "LABEL_263": 263,
631
- "LABEL_264": 264,
632
- "LABEL_265": 265,
633
- "LABEL_266": 266,
634
- "LABEL_267": 267,
635
- "LABEL_268": 268,
636
- "LABEL_269": 269,
637
- "LABEL_27": 27,
638
- "LABEL_270": 270,
639
- "LABEL_271": 271,
640
- "LABEL_272": 272,
641
- "LABEL_273": 273,
642
- "LABEL_274": 274,
643
- "LABEL_275": 275,
644
- "LABEL_276": 276,
645
- "LABEL_277": 277,
646
- "LABEL_278": 278,
647
- "LABEL_279": 279,
648
- "LABEL_28": 28,
649
- "LABEL_280": 280,
650
- "LABEL_281": 281,
651
- "LABEL_282": 282,
652
- "LABEL_283": 283,
653
- "LABEL_284": 284,
654
- "LABEL_285": 285,
655
- "LABEL_286": 286,
656
- "LABEL_287": 287,
657
- "LABEL_288": 288,
658
- "LABEL_289": 289,
659
- "LABEL_29": 29,
660
- "LABEL_290": 290,
661
- "LABEL_291": 291,
662
- "LABEL_292": 292,
663
- "LABEL_293": 293,
664
- "LABEL_294": 294,
665
- "LABEL_295": 295,
666
- "LABEL_296": 296,
667
- "LABEL_297": 297,
668
- "LABEL_298": 298,
669
- "LABEL_299": 299,
670
- "LABEL_3": 3,
671
- "LABEL_30": 30,
672
- "LABEL_300": 300,
673
- "LABEL_301": 301,
674
- "LABEL_302": 302,
675
- "LABEL_303": 303,
676
- "LABEL_304": 304,
677
- "LABEL_305": 305,
678
- "LABEL_306": 306,
679
- "LABEL_307": 307,
680
- "LABEL_308": 308,
681
- "LABEL_309": 309,
682
- "LABEL_31": 31,
683
- "LABEL_310": 310,
684
- "LABEL_311": 311,
685
- "LABEL_312": 312,
686
- "LABEL_313": 313,
687
- "LABEL_314": 314,
688
- "LABEL_315": 315,
689
- "LABEL_316": 316,
690
- "LABEL_317": 317,
691
- "LABEL_318": 318,
692
- "LABEL_319": 319,
693
- "LABEL_32": 32,
694
- "LABEL_320": 320,
695
- "LABEL_321": 321,
696
- "LABEL_322": 322,
697
- "LABEL_323": 323,
698
- "LABEL_324": 324,
699
- "LABEL_325": 325,
700
- "LABEL_326": 326,
701
- "LABEL_327": 327,
702
- "LABEL_328": 328,
703
- "LABEL_329": 329,
704
- "LABEL_33": 33,
705
- "LABEL_330": 330,
706
- "LABEL_331": 331,
707
- "LABEL_332": 332,
708
- "LABEL_333": 333,
709
- "LABEL_334": 334,
710
- "LABEL_335": 335,
711
- "LABEL_336": 336,
712
- "LABEL_337": 337,
713
- "LABEL_338": 338,
714
- "LABEL_339": 339,
715
- "LABEL_34": 34,
716
- "LABEL_340": 340,
717
- "LABEL_341": 341,
718
- "LABEL_342": 342,
719
- "LABEL_343": 343,
720
- "LABEL_344": 344,
721
- "LABEL_345": 345,
722
- "LABEL_346": 346,
723
- "LABEL_347": 347,
724
- "LABEL_348": 348,
725
- "LABEL_349": 349,
726
- "LABEL_35": 35,
727
- "LABEL_350": 350,
728
- "LABEL_351": 351,
729
- "LABEL_352": 352,
730
- "LABEL_353": 353,
731
- "LABEL_354": 354,
732
- "LABEL_355": 355,
733
- "LABEL_356": 356,
734
- "LABEL_357": 357,
735
- "LABEL_358": 358,
736
- "LABEL_359": 359,
737
- "LABEL_36": 36,
738
- "LABEL_360": 360,
739
- "LABEL_361": 361,
740
- "LABEL_362": 362,
741
- "LABEL_363": 363,
742
- "LABEL_364": 364,
743
- "LABEL_365": 365,
744
- "LABEL_37": 37,
745
- "LABEL_38": 38,
746
- "LABEL_39": 39,
747
- "LABEL_4": 4,
748
- "LABEL_40": 40,
749
- "LABEL_41": 41,
750
- "LABEL_42": 42,
751
- "LABEL_43": 43,
752
- "LABEL_44": 44,
753
- "LABEL_45": 45,
754
- "LABEL_46": 46,
755
- "LABEL_47": 47,
756
- "LABEL_48": 48,
757
- "LABEL_49": 49,
758
- "LABEL_5": 5,
759
- "LABEL_50": 50,
760
- "LABEL_51": 51,
761
- "LABEL_52": 52,
762
- "LABEL_53": 53,
763
- "LABEL_54": 54,
764
- "LABEL_55": 55,
765
- "LABEL_56": 56,
766
- "LABEL_57": 57,
767
- "LABEL_58": 58,
768
- "LABEL_59": 59,
769
- "LABEL_6": 6,
770
- "LABEL_60": 60,
771
- "LABEL_61": 61,
772
- "LABEL_62": 62,
773
- "LABEL_63": 63,
774
- "LABEL_64": 64,
775
- "LABEL_65": 65,
776
- "LABEL_66": 66,
777
- "LABEL_67": 67,
778
- "LABEL_68": 68,
779
- "LABEL_69": 69,
780
- "LABEL_7": 7,
781
- "LABEL_70": 70,
782
- "LABEL_71": 71,
783
- "LABEL_72": 72,
784
- "LABEL_73": 73,
785
- "LABEL_74": 74,
786
- "LABEL_75": 75,
787
- "LABEL_76": 76,
788
- "LABEL_77": 77,
789
- "LABEL_78": 78,
790
- "LABEL_79": 79,
791
- "LABEL_8": 8,
792
- "LABEL_80": 80,
793
- "LABEL_81": 81,
794
- "LABEL_82": 82,
795
- "LABEL_83": 83,
796
- "LABEL_84": 84,
797
- "LABEL_85": 85,
798
- "LABEL_86": 86,
799
- "LABEL_87": 87,
800
- "LABEL_88": 88,
801
- "LABEL_89": 89,
802
- "LABEL_9": 9,
803
- "LABEL_90": 90,
804
- "LABEL_91": 91,
805
- "LABEL_92": 92,
806
- "LABEL_93": 93,
807
- "LABEL_94": 94,
808
- "LABEL_95": 95,
809
- "LABEL_96": 96,
810
- "LABEL_97": 97,
811
- "LABEL_98": 98,
812
- "LABEL_99": 99
813
  },
814
  "model_type": "lw_detr",
815
  "num_feature_levels": 1,
@@ -824,4 +824,4 @@
824
  "transformers_version": "5.0.0.dev0",
825
  "use_pretrained_backbone": false,
826
  "use_timm_backbone": false
827
- }
 
75
  "group_detr": 13,
  "hidden_expansion": 0.5,
  "id2label": {
+ "0": "Person",
+ "1": "Sneakers",
+ "10": "Cup",
+ "100": "Hanger",
+ "101": "Blackboard/Whiteboard",
+ "102": "Napkin",
+ "103": "Other Fish",
+ "104": "Orange/Tangerine",
+ "105": "Toiletry",
+ "106": "Keyboard",
+ "107": "Tomato",
+ "108": "Lantern",
+ "109": "Machinery Vehicle",
+ "11": "Street Lights",
+ "110": "Fan",
+ "111": "Green Vegetables",
+ "112": "Banana",
+ "113": "Baseball Glove",
+ "114": "Airplane",
+ "115": "Mouse",
+ "116": "Train",
+ "117": "Pumpkin",
+ "118": "Soccer",
+ "119": "Skiboard",
+ "12": "Cabinet/shelf",
+ "120": "Luggage",
+ "121": "Nightstand",
+ "122": "Tea pot",
+ "123": "Telephone",
+ "124": "Trolley",
+ "125": "Head Phone",
+ "126": "Sports Car",
+ "127": "Stop Sign",
+ "128": "Dessert",
+ "129": "Scooter",
+ "13": "Handbag/Satchel",
+ "130": "Stroller",
+ "131": "Crane",
+ "132": "Remote",
+ "133": "Refrigerator",
+ "134": "Oven",
+ "135": "Lemon",
+ "136": "Duck",
+ "137": "Baseball Bat",
+ "138": "Surveillance Camera",
+ "139": "Cat",
+ "14": "Bracelet",
+ "140": "Jug",
+ "141": "Broccoli",
+ "142": "Piano",
+ "143": "Pizza",
+ "144": "Elephant",
+ "145": "Skateboard",
+ "146": "Surfboard",
+ "147": "Gun",
+ "148": "Skating and Skiing shoes",
+ "149": "Gas stove",
+ "15": "Plate",
+ "150": "Donut",
+ "151": "Bow Tie",
+ "152": "Carrot",
+ "153": "Toilet",
+ "154": "Kite",
+ "155": "Strawberry",
+ "156": "Other Balls",
+ "157": "Shovel",
+ "158": "Pepper",
+ "159": "Computer Box",
+ "16": "Picture/Frame",
+ "160": "Toilet Paper",
+ "161": "Cleaning Products",
+ "162": "Chopsticks",
+ "163": "Microwave",
+ "164": "Pigeon",
+ "165": "Baseball",
+ "166": "Cutting/chopping Board",
+ "167": "Coffee Table",
+ "168": "Side Table",
+ "169": "Scissors",
+ "17": "Helmet",
+ "170": "Marker",
+ "171": "Pie",
+ "172": "Ladder",
+ "173": "Snowboard",
+ "174": "Cookies",
+ "175": "Radiator",
+ "176": "Fire Hydrant",
+ "177": "Basketball",
+ "178": "Zebra",
+ "179": "Grape",
+ "18": "Book",
+ "180": "Giraffe",
+ "181": "Potato",
+ "182": "Sausage",
+ "183": "Tricycle",
+ "184": "Violin",
+ "185": "Egg",
+ "186": "Fire Extinguisher",
+ "187": "Candy",
+ "188": "Fire Truck",
+ "189": "Billiards",
+ "19": "Gloves",
+ "190": "Converter",
+ "191": "Bathtub",
+ "192": "Wheelchair",
+ "193": "Golf Club",
+ "194": "Briefcase",
+ "195": "Cucumber",
+ "196": "Cigar/Cigarette",
+ "197": "Paint Brush",
+ "198": "Pear",
+ "199": "Heavy Truck",
+ "2": "Chair",
+ "20": "Storage box",
+ "200": "Hamburger",
+ "201": "Extractor",
+ "202": "Extension Cord",
+ "203": "Tong",
+ "204": "Tennis Racket",
+ "205": "Folder",
+ "206": "American Football",
+ "207": "earphone",
+ "208": "Mask",
+ "209": "Kettle",
+ "21": "Boat",
+ "210": "Tennis",
+ "211": "Ship",
+ "212": "Swing",
+ "213": "Coffee Machine",
+ "214": "Slide",
+ "215": "Carriage",
+ "216": "Onion",
+ "217": "Green beans",
+ "218": "Projector",
+ "219": "Frisbee",
+ "22": "Leather Shoes",
+ "220": "Washing Machine/Drying Machine",
+ "221": "Chicken",
+ "222": "Printer",
+ "223": "Watermelon",
+ "224": "Saxophone",
+ "225": "Tissue",
+ "226": "Toothbrush",
+ "227": "Ice cream",
+ "228": "Hot-air balloon",
+ "229": "Cello",
+ "23": "Flower",
+ "230": "French Fries",
+ "231": "Scale",
+ "232": "Trophy",
+ "233": "Cabbage",
+ "234": "Hot dog",
+ "235": "Blender",
+ "236": "Peach",
+ "237": "Rice",
+ "238": "Wallet/Purse",
+ "239": "Volleyball",
+ "24": "Bench",
+ "240": "Deer",
+ "241": "Goose",
+ "242": "Tape",
+ "243": "Tablet",
+ "244": "Cosmetics",
+ "245": "Trumpet",
+ "246": "Pineapple",
+ "247": "Golf Ball",
+ "248": "Ambulance",
+ "249": "Parking meter",
+ "25": "Potted Plant",
+ "250": "Mango",
+ "251": "Key",
+ "252": "Hurdle",
+ "253": "Fishing Rod",
+ "254": "Medal",
+ "255": "Flute",
+ "256": "Brush",
+ "257": "Penguin",
+ "258": "Megaphone",
+ "259": "Corn",
+ "26": "Bowl/Basin",
+ "260": "Lettuce",
+ "261": "Garlic",
+ "262": "Swan",
+ "263": "Helicopter",
+ "264": "Green Onion",
+ "265": "Sandwich",
+ "266": "Nuts",
+ "267": "Speed Limit Sign",
+ "268": "Induction Cooker",
+ "269": "Broom",
+ "27": "Flag",
+ "270": "Trombone",
+ "271": "Plum",
+ "272": "Rickshaw",
+ "273": "Goldfish",
+ "274": "Kiwi fruit",
+ "275": "Router/modem",
+ "276": "Poker Card",
+ "277": "Toaster",
+ "278": "Shrimp",
+ "279": "Sushi",
+ "28": "Pillow",
+ "280": "Cheese",
+ "281": "Notepaper",
+ "282": "Cherry",
+ "283": "Pliers",
+ "284": "CD",
+ "285": "Pasta",
+ "286": "Hammer",
+ "287": "Cue",
+ "288": "Avocado",
+ "289": "Hami melon",
+ "29": "Boots",
+ "290": "Flask",
+ "291": "Mushroom",
+ "292": "Screwdriver",
+ "293": "Soap",
+ "294": "Recorder",
+ "295": "Bear",
+ "296": "Eggplant",
+ "297": "Board Eraser",
+ "298": "Coconut",
+ "299": "Tape Measure/Ruler",
+ "3": "Other Shoes",
+ "30": "Vase",
+ "300": "Pig",
+ "301": "Showerhead",
+ "302": "Globe",
+ "303": "Chips",
+ "304": "Steak",
+ "305": "Crosswalk Sign",
+ "306": "Stapler",
+ "307": "Camel",
+ "308": "Formula 1",
+ "309": "Pomegranate",
+ "31": "Microphone",
+ "310": "Dishwasher",
+ "311": "Crab",
+ "312": "Hoverboard",
+ "313": "Meatball",
+ "314": "Rice Cooker",
+ "315": "Tuba",
+ "316": "Calculator",
+ "317": "Papaya",
+ "318": "Antelope",
+ "319": "Parrot",
+ "32": "Necklace",
+ "320": "Seal",
+ "321": "Butterfly",
+ "322": "Dumbbell",
+ "323": "Donkey",
+ "324": "Lion",
+ "325": "Urinal",
+ "326": "Dolphin",
+ "327": "Electric Drill",
+ "328": "Hair Dryer",
+ "329": "Egg tart",
+ "33": "Ring",
+ "330": "Jellyfish",
+ "331": "Treadmill",
+ "332": "Lighter",
+ "333": "Grapefruit",
+ "334": "Game board",
+ "335": "Mop",
+ "336": "Radish",
+ "337": "Baozi",
+ "338": "Target",
+ "339": "French",
+ "34": "SUV",
+ "340": "Spring Rolls",
+ "341": "Monkey",
+ "342": "Rabbit",
+ "343": "Pencil Case",
+ "344": "Yak",
+ "345": "Red Cabbage",
+ "346": "Binoculars",
+ "347": "Asparagus",
+ "348": "Barbell",
+ "349": "Scallop",
+ "35": "Wine Glass",
+ "350": "Noddles",
+ "351": "Comb",
+ "352": "Dumpling",
+ "353": "Oyster",
+ "354": "Table Tennis paddle",
+ "355": "Cosmetics Brush/Eyeliner Pencil",
+ "356": "Chainsaw",
+ "357": "Eraser",
+ "358": "Lobster",
+ "359": "Durian",
+ "36": "Belt",
+ "360": "Okra",
+ "361": "Lipstick",
+ "362": "Cosmetics Mirror",
+ "363": "Curling",
+ "364": "Table Tennis",
+ "365": "N/A",
+ "37": "Monitor/TV",
+ "38": "Backpack",
+ "39": "Umbrella",
+ "4": "Hat",
+ "40": "Traffic Light",
+ "41": "Speaker",
+ "42": "Watch",
+ "43": "Tie",
+ "44": "Trash bin Can",
+ "45": "Slippers",
+ "46": "Bicycle",
+ "47": "Stool",
+ "48": "Barrel/bucket",
+ "49": "Van",
+ "5": "Car",
+ "50": "Couch",
+ "51": "Sandals",
+ "52": "Basket",
+ "53": "Drum",
+ "54": "Pen/Pencil",
+ "55": "Bus",
+ "56": "Wild Bird",
+ "57": "High Heels",
+ "58": "Motorcycle",
+ "59": "Guitar",
+ "6": "Lamp",
+ "60": "Carpet",
+ "61": "Cell Phone",
+ "62": "Bread",
+ "63": "Camera",
+ "64": "Canned",
+ "65": "Truck",
+ "66": "Traffic cone",
+ "67": "Cymbal",
+ "68": "Lifesaver",
+ "69": "Towel",
+ "7": "Glasses",
+ "70": "Stuffed Toy",
+ "71": "Candle",
+ "72": "Sailboat",
+ "73": "Laptop",
+ "74": "Awning",
+ "75": "Bed",
+ "76": "Faucet",
+ "77": "Tent",
+ "78": "Horse",
+ "79": "Mirror",
+ "8": "Bottle",
+ "80": "Power outlet",
+ "81": "Sink",
+ "82": "Apple",
+ "83": "Air Conditioner",
+ "84": "Knife",
+ "85": "Hockey Stick",
+ "86": "Paddle",
+ "87": "Pickup Truck",
+ "88": "Fork",
+ "89": "Traffic Sign",
+ "9": "Desk",
+ "90": "Balloon",
+ "91": "Tripod",
+ "92": "Dog",
+ "93": "Spoon",
+ "94": "Clock",
+ "95": "Pot",
+ "96": "Cow",
+ "97": "Cake",
+ "98": "Dining Table",
+ "99": "Sheep"
  },
  "init_std": 0.02,
  "label2id": {
+ "Air Conditioner": 83,
+ "Airplane": 114,
+ "Ambulance": 248,
+ "American Football": 206,
+ "Antelope": 318,
+ "Apple": 82,
+ "Asparagus": 347,
+ "Avocado": 288,
+ "Awning": 74,
+ "Backpack": 38,
+ "Balloon": 90,
+ "Banana": 112,
+ "Baozi": 337,
+ "Barbell": 348,
+ "Barrel/bucket": 48,
+ "Baseball": 165,
+ "Baseball Bat": 137,
+ "Baseball Glove": 113,
+ "Basket": 52,
+ "Basketball": 177,
+ "Bathtub": 191,
+ "Bear": 295,
+ "Bed": 75,
+ "Belt": 36,
+ "Bench": 24,
+ "Bicycle": 46,
+ "Billiards": 189,
+ "Binoculars": 346,
+ "Blackboard/Whiteboard": 101,
+ "Blender": 235,
+ "Board Eraser": 297,
+ "Boat": 21,
+ "Book": 18,
+ "Boots": 29,
+ "Bottle": 8,
+ "Bow Tie": 151,
+ "Bowl/Basin": 26,
+ "Bracelet": 14,
+ "Bread": 62,
+ "Briefcase": 194,
+ "Broccoli": 141,
+ "Broom": 269,
+ "Brush": 256,
+ "Bus": 55,
+ "Butterfly": 321,
+ "CD": 284,
+ "Cabbage": 233,
+ "Cabinet/shelf": 12,
+ "Cake": 97,
+ "Calculator": 316,
+ "Camel": 307,
+ "Camera": 63,
+ "Candle": 71,
+ "Candy": 187,
+ "Canned": 64,
+ "Car": 5,
+ "Carpet": 60,
+ "Carriage": 215,
+ "Carrot": 152,
+ "Cat": 139,
+ "Cell Phone": 61,
+ "Cello": 229,
+ "Chainsaw": 356,
+ "Chair": 2,
+ "Cheese": 280,
+ "Cherry": 282,
+ "Chicken": 221,
+ "Chips": 303,
+ "Chopsticks": 162,
+ "Cigar/Cigarette": 196,
+ "Cleaning Products": 161,
+ "Clock": 94,
+ "Coconut": 298,
+ "Coffee Machine": 213,
+ "Coffee Table": 167,
+ "Comb": 351,
+ "Computer Box": 159,
+ "Converter": 190,
+ "Cookies": 174,
+ "Corn": 259,
+ "Cosmetics": 244,
+ "Cosmetics Brush/Eyeliner Pencil": 355,
+ "Cosmetics Mirror": 362,
+ "Couch": 50,
+ "Cow": 96,
+ "Crab": 311,
+ "Crane": 131,
+ "Crosswalk Sign": 305,
+ "Cucumber": 195,
+ "Cue": 287,
+ "Cup": 10,
+ "Curling": 363,
+ "Cutting/chopping Board": 166,
+ "Cymbal": 67,
+ "Deer": 240,
+ "Desk": 9,
+ "Dessert": 128,
+ "Dining Table": 98,
+ "Dishwasher": 310,
+ "Dog": 92,
+ "Dolphin": 326,
+ "Donkey": 323,
+ "Donut": 150,
+ "Drum": 53,
+ "Duck": 136,
+ "Dumbbell": 322,
+ "Dumpling": 352,
+ "Durian": 359,
+ "Egg": 185,
+ "Egg tart": 329,
+ "Eggplant": 296,
+ "Electric Drill": 327,
+ "Elephant": 144,
+ "Eraser": 357,
+ "Extension Cord": 202,
+ "Extractor": 201,
+ "Fan": 110,
+ "Faucet": 76,
+ "Fire Extinguisher": 186,
+ "Fire Hydrant": 176,
+ "Fire Truck": 188,
+ "Fishing Rod": 253,
+ "Flag": 27,
+ "Flask": 290,
+ "Flower": 23,
+ "Flute": 255,
+ "Folder": 205,
+ "Fork": 88,
+ "Formula 1": 308,
+ "French": 339,
+ "French Fries": 230,
+ "Frisbee": 219,
+ "Game board": 334,
+ "Garlic": 261,
+ "Gas stove": 149,
+ "Giraffe": 180,
+ "Glasses": 7,
+ "Globe": 302,
+ "Gloves": 19,
+ "Goldfish": 273,
+ "Golf Ball": 247,
+ "Golf Club": 193,
+ "Goose": 241,
+ "Grape": 179,
+ "Grapefruit": 333,
+ "Green Onion": 264,
+ "Green Vegetables": 111,
+ "Green beans": 217,
+ "Guitar": 59,
+ "Gun": 147,
+ "Hair Dryer": 328,
+ "Hamburger": 200,
+ "Hami melon": 289,
+ "Hammer": 286,
+ "Handbag/Satchel": 13,
+ "Hanger": 100,
+ "Hat": 4,
+ "Head Phone": 125,
+ "Heavy Truck": 199,
+ "Helicopter": 263,
+ "Helmet": 17,
+ "High Heels": 57,
+ "Hockey Stick": 85,
+ "Horse": 78,
+ "Hot dog": 234,
+ "Hot-air balloon": 228,
+ "Hoverboard": 312,
+ "Hurdle": 252,
+ "Ice cream": 227,
+ "Induction Cooker": 268,
+ "Jellyfish": 330,
+ "Jug": 140,
+ "Kettle": 209,
+ "Key": 251,
+ "Keyboard": 106,
+ "Kite": 154,
+ "Kiwi fruit": 274,
+ "Knife": 84,
+ "Ladder": 172,
+ "Lamp": 6,
+ "Lantern": 108,
+ "Laptop": 73,
+ "Leather Shoes": 22,
+ "Lemon": 135,
+ "Lettuce": 260,
+ "Lifesaver": 68,
+ "Lighter": 332,
+ "Lion": 324,
+ "Lipstick": 361,
+ "Lobster": 358,
+ "Luggage": 120,
+ "Machinery Vehicle": 109,
+ "Mango": 250,
+ "Marker": 170,
+ "Mask": 208,
+ "Meatball": 313,
+ "Medal": 254,
+ "Megaphone": 258,
+ "Microphone": 31,
+ "Microwave": 163,
+ "Mirror": 79,
+ "Monitor/TV": 37,
+ "Monkey": 341,
+ "Mop": 335,
+ "Motorcycle": 58,
+ "Mouse": 115,
+ "Mushroom": 291,
+ "N/A": 365,
+ "Napkin": 102,
+ "Necklace": 32,
+ "Nightstand": 121,
+ "Noddles": 350,
+ "Notepaper": 281,
+ "Nuts": 266,
+ "Okra": 360,
+ "Onion": 216,
+ "Orange/Tangerine": 104,
+ "Other Balls": 156,
+ "Other Fish": 103,
+ "Other Shoes": 3,
+ "Oven": 134,
+ "Oyster": 353,
+ "Paddle": 86,
+ "Paint Brush": 197,
+ "Papaya": 317,
+ "Parking meter": 249,
+ "Parrot": 319,
+ "Pasta": 285,
+ "Peach": 236,
+ "Pear": 198,
+ "Pen/Pencil": 54,
+ "Pencil Case": 343,
+ "Penguin": 257,
+ "Pepper": 158,
+ "Person": 0,
+ "Piano": 142,
+ "Pickup Truck": 87,
+ "Picture/Frame": 16,
+ "Pie": 171,
+ "Pig": 300,
+ "Pigeon": 164,
+ "Pillow": 28,
+ "Pineapple": 246,
+ "Pizza": 143,
+ "Plate": 15,
+ "Pliers": 283,
+ "Plum": 271,
+ "Poker Card": 276,
+ "Pomegranate": 309,
+ "Pot": 95,
+ "Potato": 181,
+ "Potted Plant": 25,
+ "Power outlet": 80,
+ "Printer": 222,
+ "Projector": 218,
+ "Pumpkin": 117,
+ "Rabbit": 342,
+ "Radiator": 175,
+ "Radish": 336,
+ "Recorder": 294,
+ "Red Cabbage": 345,
+ "Refrigerator": 133,
+ "Remote": 132,
+ "Rice": 237,
+ "Rice Cooker": 314,
+ "Rickshaw": 272,
+ "Ring": 33,
+ "Router/modem": 275,
+ "SUV": 34,
+ "Sailboat": 72,
+ "Sandals": 51,
+ "Sandwich": 265,
+ "Sausage": 182,
+ "Saxophone": 224,
+ "Scale": 231,
+ "Scallop": 349,
+ "Scissors": 169,
+ "Scooter": 129,
+ "Screwdriver": 292,
+ "Seal": 320,
+ "Sheep": 99,
+ "Ship": 211,
+ "Shovel": 157,
+ "Showerhead": 301,
+ "Shrimp": 278,
+ "Side Table": 168,
+ "Sink": 81,
+ "Skateboard": 145,
+ "Skating and Skiing shoes": 148,
+ "Skiboard": 119,
+ "Slide": 214,
+ "Slippers": 45,
+ "Sneakers": 1,
+ "Snowboard": 173,
+ "Soap": 293,
+ "Soccer": 118,
+ "Speaker": 41,
+ "Speed Limit Sign": 267,
+ "Spoon": 93,
+ "Sports Car": 126,
+ "Spring Rolls": 340,
+ "Stapler": 306,
+ "Steak": 304,
+ "Stool": 47,
+ "Stop Sign": 127,
+ "Storage box": 20,
+ "Strawberry": 155,
+ "Street Lights": 11,
+ "Stroller": 130,
+ "Stuffed Toy": 70,
+ "Surfboard": 146,
+ "Surveillance Camera": 138,
+ "Sushi": 279,
+ "Swan": 262,
+ "Swing": 212,
+ "Table Tennis": 364,
+ "Table Tennis paddle": 354,
+ "Tablet": 243,
+ "Tape": 242,
+ "Tape Measure/Ruler": 299,
+ "Target": 338,
+ "Tea pot": 122,
+ "Telephone": 123,
+ "Tennis": 210,
+ "Tennis Racket": 204,
+ "Tent": 77,
+ "Tie": 43,
+ "Tissue": 225,
+ "Toaster": 277,
+ "Toilet": 153,
+ "Toilet Paper": 160,
+ "Toiletry": 105,
+ "Tomato": 107,
+ "Tong": 203,
+ "Toothbrush": 226,
+ "Towel": 69,
+ "Traffic Light": 40,
+ "Traffic Sign": 89,
+ "Traffic cone": 66,
+ "Train": 116,
+ "Trash bin Can": 44,
+ "Treadmill": 331,
+ "Tricycle": 183,
+ "Tripod": 91,
+ "Trolley": 124,
+ "Trombone": 270,
+ "Trophy": 232,
+ "Truck": 65,
+ "Trumpet": 245,
+ "Tuba": 315,
+ "Umbrella": 39,
+ "Urinal": 325,
+ "Van": 49,
+ "Vase": 30,
+ "Violin": 184,
+ "Volleyball": 239,
+ "Wallet/Purse": 238,
+ "Washing Machine/Drying Machine": 220,
+ "Watch": 42,
+ "Watermelon": 223,
+ "Wheelchair": 192,
+ "Wild Bird": 56,
+ "Wine Glass": 35,
+ "Yak": 344,
+ "Zebra": 178,
+ "earphone": 207
  },
  "model_type": "lw_detr",
  "num_feature_levels": 1,

  "transformers_version": "5.0.0.dev0",
  "use_pretrained_backbone": false,
  "use_timm_backbone": false
+ }
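The diff above replaces the auto-generated `LABEL_*` placeholders in `id2label`/`label2id` with the Objects365 class names, so post-processed detections can be reported with human-readable labels. A minimal sketch of how such a mapping is consumed (the `id2label` dict below reproduces only a few of the 366 entries for illustration; note the keys are strings in the raw `config.json` but are typically handled as integers once the config is loaded):

```python
# Map predicted class indices to human-readable Objects365 names.
# Truncated illustrative subset of the id2label table from config.json.
id2label = {0: "Person", 1: "Sneakers", 5: "Car", 92: "Dog", 365: "N/A"}

def label_for(class_id: int) -> str:
    """Return the class name for a predicted class index,
    falling back to a placeholder for unknown ids."""
    return id2label.get(class_id, f"unknown ({class_id})")

print(label_for(92))  # Dog
```

The inverse `label2id` table added in the same diff supports the opposite lookup, e.g. when preparing training targets from annotated class names.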