Yuan-lab committed on
Commit a3f659e · verified · 1 Parent(s): c31aecf

Upload 6 files

Files changed (7):
  1. .gitattributes +1 -0
  2. recipe.yaml +12 -0
  3. special_tokens_map.json +1092 -0
  4. tokenizer.json +3 -0
  5. tokenizer.model +3 -0
  6. tokenizer_config.json +0 -0
  7. utils.py +144 -0
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
recipe.yaml ADDED
@@ -0,0 +1,12 @@
+ default_stage:
+   default_modifiers:
+     GPTQModifier:
+       targets: [Linear]
+       ignore: ['re:.*lm_head$', 're:.*qkv$', 're:.*fc1$', 're:.*fc2$', 're:.*attn.proj$',
+         're:.*up_proj$', 're:.*gate_proj$', 're:.*down_proj$', 're:.*router$', 're:.*self_attn.get_query_key*',
+         're:.*self_attn.o_proj*', 're:.*self_attn.v_proj*']
+       scheme: W4A16
+       sequential_update: true
+       block_size: 128
+       dampening_frac: 0.01
+       offload_hessians: false
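The `ignore` list above uses `re:`-prefixed regex patterns to exempt certain layers from GPTQ quantization. As a rough, stdlib-only illustration of what those patterns match, the sketch below strips the `re:` prefixes and checks hypothetical module names against them; llmcompressor's exact matching internals may differ.

```python
import re

# Patterns copied from the `ignore` list in recipe.yaml, with the `re:`
# prefix (which marks an entry as a regex) removed for plain `re` use.
ignore_patterns = [
    r".*lm_head$", r".*qkv$", r".*fc1$", r".*fc2$", r".*attn.proj$",
    r".*up_proj$", r".*gate_proj$", r".*down_proj$", r".*router$",
    r".*self_attn.get_query_key*",
    r".*self_attn.o_proj*", r".*self_attn.v_proj*",
]

def is_ignored(module_name: str) -> bool:
    """True if a module name matches any ignore pattern (left unquantized)."""
    return any(re.match(p, module_name) for p in ignore_patterns)

# Hypothetical module names, for illustration only:
print(is_ignored("model.layers.0.self_attn.o_proj"))   # True  -> skipped
print(is_ignored("model.layers.0.mlp.shared_expert"))  # False -> quantized
```

Note that because `targets: [Linear]` selects every Linear layer, this ignore list effectively confines W4A16 quantization to the Linear layers not named by any pattern.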
special_tokens_map.json ADDED
@@ -0,0 +1,1092 @@
+ {
+   "additional_special_tokens": [
+     "<s>",
+     "<eod>",
+     "<unk>",
+     "<sep>",
+     "<pad>",
+     "<mask>",
+     "<predict>",
+     "<FIM_SUFFIX>",
+     "<FIM_PREFIX>",
+     "<FIM_MIDDLE>",
+     "<commit_before>",
+     "<commit_msg>",
+     "<commit_after>",
+     "<jupyter_start>",
+     "<jupyter_text>",
+     "<jupyter_code>",
+     "<jupyter_output>",
+     "<empty_output>",
+     "<repo_name>",
+     "<file_sep>",
+     "<BOS>",
+     "<IMAGE>",
+     "</IMAGE>",
+     "<grounding>",
+     "<obj>",
+     "</obj>",
+     "<box>",
+     "</box>",
+     "<point>",
+     "</point>",
+     "<3dbox>",
+     "</3dbox>",
+     "<depth>",
+     "</depth>",
+     "s000",
+     "s001",
+     "s002",
+     "s003",
+     "s004",
+     "s005",
+     "s006",
+     "s007",
+     "s008",
+     "s009",
+     "s010",
+     "s011",
+     "s012",
+     "s013",
+     "s014",
+     "s015",
+     "s016",
+     "s017",
+     "s018",
+     "s019",
+     "s020",
+     "s021",
+     "s022",
+     "s023",
+     "s024",
+     "s025",
+     "s026",
+     "s027",
+     "s028",
+     "s029",
+     "s030",
+     "s031",
+     "s032",
+     "s033",
+     "s034",
+     "s035",
+     "s036",
+     "s037",
+     "s038",
+     "s039",
+     "s040",
+     "s041",
+     "s042",
+     "s043",
+     "s044",
+     "s045",
+     "s046",
+     "s047",
+     "s048",
+     "s049",
+     "s050",
+     "s051",
+     "s052",
+     "s053",
+     "s054",
+     "s055",
+     "s056",
+     "s057",
+     "s058",
+     "s059",
+     "s060",
+     "s061",
+     "s062",
+     "s063",
+     "s064",
+     "s065",
+     "s066",
+     "s067",
+     "s068",
+     "s069",
+     "s070",
+     "s071",
+     "s072",
+     "s073",
+     "s074",
+     "s075",
+     "s076",
+     "s077",
+     "s078",
+     "s079",
+     "s080",
+     "s081",
+     "s082",
+     "s083",
+     "s084",
+     "s085",
+     "s086",
+     "s087",
+     "s088",
+     "s089",
+     "s090",
+     "s091",
+     "s092",
+     "s093",
+     "s094",
+     "s095",
+     "s096",
+     "s097",
+     "s098",
+     "s099",
+     "s100",
+     "s101",
+     "s102",
+     "s103",
+     "s104",
+     "s105",
+     "s106",
+     "s107",
+     "s108",
+     "s109",
+     "s110",
+     "s111",
+     "s112",
+     "s113",
+     "s114",
+     "s115",
+     "s116",
+     "s117",
+     "s118",
+     "s119",
+     "s120",
+     "s121",
+     "s122",
+     "s123",
+     "s124",
+     "s125",
+     "s126",
+     "s127",
+     "s128",
+     "s129",
+     "s130",
+     "s131",
+     "s132",
+     "s133",
+     "s134",
+     "s135",
+     "s136",
+     "s137",
+     "s138",
+     "s139",
+     "s140",
+     "s141",
+     "s142",
+     "s143",
+     "s144",
+     "s145",
+     "s146",
+     "s147",
+     "s148",
+     "s149",
+     "s150",
+     "s151",
+     "s152",
+     "s153",
+     "s154",
+     "s155",
+     "s156",
+     "s157",
+     "s158",
+     "s159",
+     "s160",
+     "s161",
+     "s162",
+     "s163",
+     "s164",
+     "s165",
+     "s166",
+     "s167",
+     "s168",
+     "s169",
+     "s170",
+     "s171",
+     "s172",
+     "s173",
+     "s174",
+     "s175",
+     "s176",
+     "s177",
+     "s178",
+     "s179",
+     "s180",
+     "s181",
+     "s182",
+     "s183",
+     "s184",
+     "s185",
+     "s186",
+     "s187",
+     "s188",
+     "s189",
+     "s190",
+     "s191",
+     "s192",
+     "s193",
+     "s194",
+     "s195",
+     "s196",
+     "s197",
+     "s198",
+     "s199",
+     "s200",
+     "s201",
+     "s202",
+     "s203",
+     "s204",
+     "s205",
+     "s206",
+     "s207",
+     "s208",
+     "s209",
+     "s210",
+     "s211",
+     "s212",
+     "s213",
+     "s214",
+     "s215",
+     "s216",
+     "s217",
+     "s218",
+     "s219",
+     "s220",
+     "s221",
+     "s222",
+     "s223",
+     "s224",
+     "s225",
+     "s226",
+     "s227",
+     "s228",
+     "s229",
+     "s230",
+     "s231",
+     "s232",
+     "s233",
+     "s234",
+     "s235",
+     "s236",
+     "s237",
+     "s238",
+     "s239",
+     "s240",
+     "s241",
+     "s242",
+     "s243",
+     "s244",
+     "s245",
+     "s246",
+     "s247",
+     "s248",
+     "s249",
+     "s250",
+     "s251",
+     "s252",
+     "s253",
+     "s254",
+     "s255",
+     "s256",
+     "s257",
+     "s258",
+     "s259",
+     "s260",
+     "s261",
+     "s262",
+     "s263",
+     "s264",
+     "s265",
+     "s266",
+     "s267",
+     "s268",
+     "s269",
+     "s270",
+     "s271",
+     "s272",
+     "s273",
+     "s274",
+     "s275",
+     "s276",
+     "s277",
+     "s278",
+     "s279",
+     "s280",
+     "s281",
+     "s282",
+     "s283",
+     "s284",
+     "s285",
+     "s286",
+     "s287",
+     "s288",
+     "s289",
+     "s290",
+     "s291",
+     "s292",
+     "s293",
+     "s294",
+     "s295",
+     "s296",
+     "s297",
+     "s298",
+     "s299",
+     "s300",
+     "s301",
+     "s302",
+     "s303",
+     "s304",
+     "s305",
+     "s306",
+     "s307",
+     "s308",
+     "s309",
+     "s310",
+     "s311",
+     "s312",
+     "s313",
+     "s314",
+     "s315",
+     "s316",
+     "s317",
+     "s318",
+     "s319",
+     "s320",
+     "s321",
+     "s322",
+     "s323",
+     "s324",
+     "s325",
+     "s326",
+     "s327",
+     "s328",
+     "s329",
+     "s330",
+     "s331",
+     "s332",
+     "s333",
+     "s334",
+     "s335",
+     "s336",
+     "s337",
+     "s338",
+     "s339",
+     "s340",
+     "s341",
+     "s342",
+     "s343",
+     "s344",
+     "s345",
+     "s346",
+     "s347",
+     "s348",
+     "s349",
+     "s350",
+     "s351",
+     "s352",
+     "s353",
+     "s354",
+     "s355",
+     "s356",
+     "s357",
+     "s358",
+     "s359",
+     "s360",
+     "s361",
+     "s362",
+     "s363",
+     "s364",
+     "s365",
+     "s366",
+     "s367",
+     "s368",
+     "s369",
+     "s370",
+     "s371",
+     "s372",
+     "s373",
+     "s374",
+     "s375",
+     "s376",
+     "s377",
+     "s378",
+     "s379",
+     "s380",
+     "s381",
+     "s382",
+     "s383",
+     "s384",
+     "s385",
+     "s386",
+     "s387",
+     "s388",
+     "s389",
+     "s390",
+     "s391",
+     "s392",
+     "s393",
+     "s394",
+     "s395",
+     "s396",
+     "s397",
+     "s398",
+     "s399",
+     "s400",
+     "s401",
+     "s402",
+     "s403",
+     "s404",
+     "s405",
+     "s406",
+     "s407",
+     "s408",
+     "s409",
+     "s410",
+     "s411",
+     "s412",
+     "s413",
+     "s414",
+     "s415",
+     "s416",
+     "s417",
+     "s418",
+     "s419",
+     "s420",
+     "s421",
+     "s422",
+     "s423",
+     "s424",
+     "s425",
+     "s426",
+     "s427",
+     "s428",
+     "s429",
+     "s430",
+     "s431",
+     "s432",
+     "s433",
+     "s434",
+     "s435",
+     "s436",
+     "s437",
+     "s438",
+     "s439",
+     "s440",
+     "s441",
+     "s442",
+     "s443",
+     "s444",
+     "s445",
+     "s446",
+     "s447",
+     "s448",
+     "s449",
+     "s450",
+     "s451",
+     "s452",
+     "s453",
+     "s454",
+     "s455",
+     "s456",
+     "s457",
+     "s458",
+     "s459",
+     "s460",
+     "s461",
+     "s462",
+     "s463",
+     "s464",
+     "s465",
+     "s466",
+     "s467",
+     "s468",
+     "s469",
+     "s470",
+     "s471",
+     "s472",
+     "s473",
+     "s474",
+     "s475",
+     "s476",
+     "s477",
+     "s478",
+     "s479",
+     "s480",
+     "s481",
+     "s482",
+     "s483",
+     "s484",
+     "s485",
+     "s486",
+     "s487",
+     "s488",
+     "s489",
+     "s490",
+     "s491",
+     "s492",
+     "s493",
+     "s494",
+     "s495",
+     "s496",
+     "s497",
+     "s498",
+     "s499",
+     "s500",
+     "s501",
+     "s502",
+     "s503",
+     "s504",
+     "s505",
+     "s506",
+     "s507",
+     "s508",
+     "s509",
+     "s510",
+     "s511",
+     "s512",
+     "s513",
+     "s514",
+     "s515",
+     "s516",
+     "s517",
+     "s518",
+     "s519",
+     "s520",
+     "s521",
+     "s522",
+     "s523",
+     "s524",
+     "s525",
+     "s526",
+     "s527",
+     "s528",
+     "s529",
+     "s530",
+     "s531",
+     "s532",
+     "s533",
+     "s534",
+     "s535",
+     "s536",
+     "s537",
+     "s538",
+     "s539",
+     "s540",
+     "s541",
+     "s542",
+     "s543",
+     "s544",
+     "s545",
+     "s546",
+     "s547",
+     "s548",
+     "s549",
+     "s550",
+     "s551",
+     "s552",
+     "s553",
+     "s554",
+     "s555",
+     "s556",
+     "s557",
+     "s558",
+     "s559",
+     "s560",
+     "s561",
+     "s562",
+     "s563",
+     "s564",
+     "s565",
+     "s566",
+     "s567",
+     "s568",
+     "s569",
+     "s570",
+     "s571",
+     "s572",
+     "s573",
+     "s574",
+     "s575",
+     "s576",
+     "s577",
+     "s578",
+     "s579",
+     "s580",
+     "s581",
+     "s582",
+     "s583",
+     "s584",
+     "s585",
+     "s586",
+     "s587",
+     "s588",
+     "s589",
+     "s590",
+     "s591",
+     "s592",
+     "s593",
+     "s594",
+     "s595",
+     "s596",
+     "s597",
+     "s598",
+     "s599",
+     "s600",
+     "s601",
+     "s602",
+     "s603",
+     "s604",
+     "s605",
+     "s606",
+     "s607",
+     "s608",
+     "s609",
+     "s610",
+     "s611",
+     "s612",
+     "s613",
+     "s614",
+     "s615",
+     "s616",
+     "s617",
+     "s618",
+     "s619",
+     "s620",
+     "s621",
+     "s622",
+     "s623",
+     "s624",
+     "s625",
+     "s626",
+     "s627",
+     "s628",
+     "s629",
+     "s630",
+     "s631",
+     "s632",
+     "s633",
+     "s634",
+     "s635",
+     "s636",
+     "s637",
+     "s638",
+     "s639",
+     "s640",
+     "s641",
+     "s642",
+     "s643",
+     "s644",
+     "s645",
+     "s646",
+     "s647",
+     "s648",
+     "s649",
+     "s650",
+     "s651",
+     "s652",
+     "s653",
+     "s654",
+     "s655",
+     "s656",
+     "s657",
+     "s658",
+     "s659",
+     "s660",
+     "s661",
+     "s662",
+     "s663",
+     "s664",
+     "s665",
+     "s666",
+     "s667",
+     "s668",
+     "s669",
+     "s670",
+     "s671",
+     "s672",
+     "s673",
+     "s674",
+     "s675",
+     "s676",
+     "s677",
+     "s678",
+     "s679",
+     "s680",
+     "s681",
+     "s682",
+     "s683",
+     "s684",
+     "s685",
+     "s686",
+     "s687",
+     "s688",
+     "s689",
+     "s690",
+     "s691",
+     "s692",
+     "s693",
+     "s694",
+     "s695",
+     "s696",
+     "s697",
+     "s698",
+     "s699",
+     "s700",
+     "s701",
+     "s702",
+     "s703",
+     "s704",
+     "s705",
+     "s706",
+     "s707",
+     "s708",
+     "s709",
+     "s710",
+     "s711",
+     "s712",
+     "s713",
+     "s714",
+     "s715",
+     "s716",
+     "s717",
+     "s718",
+     "s719",
+     "s720",
+     "s721",
+     "s722",
+     "s723",
+     "s724",
+     "s725",
+     "s726",
+     "s727",
+     "s728",
+     "s729",
+     "s730",
+     "s731",
+     "s732",
+     "s733",
+     "s734",
+     "s735",
+     "s736",
+     "s737",
+     "s738",
+     "s739",
+     "s740",
+     "s741",
+     "s742",
+     "s743",
+     "s744",
+     "s745",
+     "s746",
+     "s747",
+     "s748",
+     "s749",
+     "s750",
+     "s751",
+     "s752",
+     "s753",
+     "s754",
+     "s755",
+     "s756",
+     "s757",
+     "s758",
+     "s759",
+     "s760",
+     "s761",
+     "s762",
+     "s763",
+     "s764",
+     "s765",
+     "s766",
+     "s767",
+     "s768",
+     "s769",
+     "s770",
+     "s771",
+     "s772",
+     "s773",
+     "s774",
+     "s775",
+     "s776",
+     "s777",
+     "s778",
+     "s779",
+     "s780",
+     "s781",
+     "s782",
+     "s783",
+     "s784",
+     "s785",
+     "s786",
+     "s787",
+     "s788",
+     "s789",
+     "s790",
+     "s791",
+     "s792",
+     "s793",
+     "s794",
+     "s795",
+     "s796",
+     "s797",
+     "s798",
+     "s799",
+     "s800",
+     "s801",
+     "s802",
+     "s803",
+     "s804",
+     "s805",
+     "s806",
+     "s807",
+     "s808",
+     "s809",
+     "s810",
+     "s811",
+     "s812",
+     "s813",
+     "s814",
+     "s815",
+     "s816",
+     "s817",
+     "s818",
+     "s819",
+     "s820",
+     "s821",
+     "s822",
+     "s823",
+     "s824",
+     "s825",
+     "s826",
+     "s827",
+     "s828",
+     "s829",
+     "s830",
+     "s831",
+     "s832",
+     "s833",
+     "s834",
+     "s835",
+     "s836",
+     "s837",
+     "s838",
+     "s839",
+     "s840",
+     "s841",
+     "s842",
+     "s843",
+     "s844",
+     "s845",
+     "s846",
+     "s847",
+     "s848",
+     "s849",
+     "s850",
+     "s851",
+     "s852",
+     "s853",
+     "s854",
+     "s855",
+     "s856",
+     "s857",
+     "s858",
+     "s859",
+     "s860",
+     "s861",
+     "s862",
+     "s863",
+     "s864",
+     "s865",
+     "s866",
+     "s867",
+     "s868",
+     "s869",
+     "s870",
+     "s871",
+     "s872",
+     "s873",
+     "s874",
+     "s875",
+     "s876",
+     "s877",
+     "s878",
+     "s879",
+     "s880",
+     "s881",
+     "s882",
+     "s883",
+     "s884",
+     "s885",
+     "s886",
+     "s887",
+     "s888",
+     "s889",
+     "s890",
+     "s891",
+     "s892",
+     "s893",
+     "s894",
+     "s895",
+     "s896",
+     "s897",
+     "s898",
+     "s899",
+     "s900",
+     "s901",
+     "s902",
+     "s903",
+     "s904",
+     "s905",
+     "s906",
+     "s907",
+     "s908",
+     "s909",
+     "s910",
+     "s911",
+     "s912",
+     "s913",
+     "s914",
+     "s915",
+     "s916",
+     "s917",
+     "s918",
+     "s919",
+     "s920",
+     "s921",
+     "s922",
+     "s923",
+     "s924",
+     "s925",
+     "s926",
+     "s927",
+     "s928",
+     "s929",
+     "s930",
+     "s931",
+     "s932",
+     "s933",
+     "s934",
+     "s935",
+     "s936",
+     "s937",
+     "s938",
+     "s939",
+     "s940",
+     "s941",
+     "s942",
+     "s943",
+     "s944",
+     "s945",
+     "s946",
+     "s947",
+     "s948",
+     "s949",
+     "s950",
+     "s951",
+     "s952",
+     "s953",
+     "s954",
+     "s955",
+     "s956",
+     "s957",
+     "s958",
+     "s959",
+     "s960",
+     "s961",
+     "s962",
+     "s963",
+     "s964",
+     "s965",
+     "s966",
+     "s967",
+     "s968",
+     "s969",
+     "s970",
+     "s971",
+     "s972",
+     "s973",
+     "s974",
+     "s975",
+     "s976",
+     "s977",
+     "s978",
+     "s979",
+     "s980",
+     "s981",
+     "s982",
+     "s983",
+     "s984",
+     "s985",
+     "s986",
+     "s987",
+     "s988",
+     "s989",
+     "s990",
+     "s991",
+     "s992",
+     "s993",
+     "s994",
+     "s995",
+     "s996",
+     "s997",
+     "s998",
+     "s999",
+     "<eop>",
+     "<eog>",
+     "<|begin_of_sentence|>",
+     "<|end_of_sentence|>",
+     "<|User|>",
+     "<|Assistant|>",
+     "<think>",
+     "</think>",
+     "<search_result>",
+     "</search_result>",
+     "<search_query>",
+     "</search_query>",
+     "<code_query>",
+     "</code_query>",
+     "<code_result>",
+     "</code_result>",
+     "<infer>",
+     "</infer>",
+     "<inferresult>",
+     "</inferresult>",
+     "<tool_calls>",
+     "</tool_calls>",
+     "<tool_response>",
+     "</tool_response>",
+     "<final_answer>",
+     "</final_answer>"
+   ],
+   "bos_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "<eod>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "<eod>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "<unk>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
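For reference, the 1060-entry `additional_special_tokens` list above has a regular shape: 34 control/FIM/vision-grounding tokens, the numbered run `s000`–`s999`, and 26 chat/tool-use tokens. A short sketch reconstructing it programmatically (token strings copied from the file; the grouping names are descriptive, not from the repo):

```python
# Rebuild the additional_special_tokens list to make its structure explicit:
# a fixed head, 1000 numbered "sNNN" tokens, and a fixed tail.
head = [
    "<s>", "<eod>", "<unk>", "<sep>", "<pad>", "<mask>", "<predict>",
    "<FIM_SUFFIX>", "<FIM_PREFIX>", "<FIM_MIDDLE>",
    "<commit_before>", "<commit_msg>", "<commit_after>",
    "<jupyter_start>", "<jupyter_text>", "<jupyter_code>", "<jupyter_output>",
    "<empty_output>", "<repo_name>", "<file_sep>", "<BOS>",
    "<IMAGE>", "</IMAGE>", "<grounding>", "<obj>", "</obj>",
    "<box>", "</box>", "<point>", "</point>",
    "<3dbox>", "</3dbox>", "<depth>", "</depth>",
]
numbered = [f"s{i:03d}" for i in range(1000)]  # s000 .. s999
tail = [
    "<eop>", "<eog>", "<|begin_of_sentence|>", "<|end_of_sentence|>",
    "<|User|>", "<|Assistant|>", "<think>", "</think>",
    "<search_result>", "</search_result>", "<search_query>", "</search_query>",
    "<code_query>", "</code_query>", "<code_result>", "</code_result>",
    "<infer>", "</infer>", "<inferresult>", "</inferresult>",
    "<tool_calls>", "</tool_calls>", "<tool_response>", "</tool_response>",
    "<final_answer>", "</final_answer>",
]
additional_special_tokens = head + numbered + tail
print(len(additional_special_tokens))  # 1060
```

With the surrounding JSON punctuation (braces, array brackets, and the four token-object blocks), this accounts for the 1092 lines declared in the hunk header.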
tokenizer.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6c84861a800c30e71099d63dca0963edbacf554586527ac037155a0560e2fb04
+ size 14976038
tokenizer.model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:36f79e0c70f73cdd2a8dd0fbe7bfe290da158eea746778d289e4ad76c8b383d9
+ size 2155861
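Both tokenizer blobs are stored as Git LFS pointer files with the three-field format shown above (`version`, `oid`, `size`). A small stdlib sketch that parses such a pointer into a dict:

```python
# Parse a Git LFS pointer file: one "key value" pair per line.
def parse_lfs_pointer(text: str) -> dict:
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    fields["size"] = int(fields["size"])  # size is the blob size in bytes
    return fields

# The tokenizer.model pointer from the diff above:
ptr = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:36f79e0c70f73cdd2a8dd0fbe7bfe290da158eea746778d289e4ad76c8b383d9\n"
    "size 2155861\n"
)
print(ptr["size"])  # 2155861
```

The `oid` is the SHA-256 of the actual blob, which Git LFS fetches from the LFS store in place of this pointer at checkout time.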
tokenizer_config.json ADDED
The diff for this file is too large to render.
utils.py ADDED
@@ -0,0 +1,144 @@
+ from typing import (Callable, Dict, Iterable, List, Literal, Mapping, Optional, Protocol, Set, Tuple, Union, overload)
+ import torch
+ from torch.func import functional_call
+
+ @overload
+ def flatten_bn(x: torch.Tensor) -> torch.Tensor:
+     ...
+
+
+ @overload
+ def flatten_bn(x: List[torch.Tensor]) -> List[torch.Tensor]:
+     ...
+
+
+ @overload
+ def flatten_bn(
+     x: Union[List[torch.Tensor], torch.Tensor],
+     *,
+     concat: Literal[True],
+ ) -> torch.Tensor:
+     ...
+
+
+ @overload
+ def flatten_bn(
+     x: Union[List[torch.Tensor], torch.Tensor],
+     *,
+     concat: bool = False,
+ ) -> Union[List[torch.Tensor], torch.Tensor]:
+     ...
+
+
+ def flatten_bn(
+     x: Union[List[torch.Tensor], torch.Tensor],
+     *,
+     concat: bool = False,
+ ) -> Union[List[torch.Tensor], torch.Tensor]:
+     """
+     Flatten the ``B`` and ``N`` dimensions of batched multimodal inputs.
+
+     The input tensor should have shape ``(B, N, ...)``.
+     """
+     if isinstance(x, torch.Tensor):
+         return x.flatten(0, 1)
+
+     if concat:
+         return torch.cat(x)
+
+     return [x_n for x_b in x for x_n in x_b]
+
+ def _flatten_embeddings(embeddings: torch.Tensor) -> torch.Tensor:
+     """
+     Recursively flattens and concatenates NestedTensors on all but the last
+     dimension.
+     """
+
+     if isinstance(embeddings, torch.Tensor):
+         # Flatten all but the last dimension.
+         return embeddings.flatten(0, -2)
+
+     return torch.cat(tuple(_flatten_embeddings(t) for t in embeddings))
+
+ def _embedding_count_expression(embeddings: torch.Tensor) -> str:
+     """
+     Constructs a debugging representation of the number of embeddings in the
+     Tensors.
+     """
+
+     if isinstance(embeddings, torch.Tensor):
+         return " x ".join([str(dim) for dim in embeddings.shape[:-1]])
+
+     return " + ".join(
+         _embedding_count_expression(inner) for inner in embeddings)
+
+ def _merge_multimodal_embeddings(
+     inputs_embeds: torch.Tensor,
+     is_multimodal: torch.Tensor,
+     multimodal_embeddings: torch.Tensor,
+ ) -> torch.Tensor:
+     """
+     Merge ``multimodal_embeddings`` into ``inputs_embeds`` by overwriting the
+     positions in ``inputs_embeds`` corresponding to placeholder tokens in
+     ``input_ids``.
+
+     Note:
+         This updates ``inputs_embeds`` in place.
+     """
+     num_expected_tokens = is_multimodal.sum().item()
+     assert isinstance(num_expected_tokens, int)
+     # [total_patches, text_config.hidden_size]
+     flattened = _flatten_embeddings(multimodal_embeddings)
+     if flattened.shape[0] != num_expected_tokens:
+         expr = _embedding_count_expression(multimodal_embeddings)
+         raise ValueError(
+             f"Attempted to assign {expr} = {flattened.shape[0]} "
+             f"multimodal tokens to {num_expected_tokens} placeholders")
+
+     inputs_embeds[is_multimodal] = flattened
+     return inputs_embeds
+
+ def merge_multimodal_embeddings(
+     input_ids: torch.Tensor,
+     inputs_embeds: torch.Tensor,
+     multimodal_embeddings: torch.Tensor,
+     placeholder_token_id: Union[int, List[int]],
+ ) -> torch.Tensor:
+     """
+     Merge ``multimodal_embeddings`` into ``inputs_embeds`` by overwriting the
+     positions in ``inputs_embeds`` corresponding to placeholder tokens in
+     ``input_ids``.
+
+     ``placeholder_token_id`` can be a list of token ids (e.g., token ids
+     of img_start, img_break, and img_end tokens) when needed: This means
+     the order of these tokens in the ``input_ids`` MUST MATCH the order of
+     their embeddings in ``multimodal_embeddings`` since we need to
+     slice-merge instead of individually scattering.
+
+     For example, if input_ids is "TTTTTSIIIBIIIBIIIETTT", where
+     - T is text token
+     - S is image start token
+     - I is image embedding token
+     - B is image break token
+     - E is image end token.
+
+     Then the image embeddings (that correspond to I's) from vision encoder
+     must be padded with embeddings of S, B, and E in the same order of
+     input_ids for a correct embedding merge.
+
+     Note:
+         This updates ``inputs_embeds`` in place.
+     """
+     if isinstance(placeholder_token_id, list):
+         placeholder_token_id = torch.tensor(placeholder_token_id,
+                                             device=input_ids.device)
+         return _merge_multimodal_embeddings(
+             inputs_embeds,
+             torch.isin(input_ids, placeholder_token_id),
+             multimodal_embeddings,
+         )
+     return _merge_multimodal_embeddings(
+         inputs_embeds,
+         (input_ids == placeholder_token_id),
+         multimodal_embeddings,
+     )
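The merge helpers in utils.py boil down to a masked, in-place overwrite of the placeholder positions. A minimal self-contained sketch of that core step (the placeholder id 9 and the tiny shapes below are made up for illustration):

```python
import torch

# Core of _merge_multimodal_embeddings: positions of input_ids equal to the
# placeholder id are overwritten, in order, with the multimodal embeddings.
PLACEHOLDER = 9
input_ids = torch.tensor([[1, 9, 9, 2]])  # one sequence, two placeholders
inputs_embeds = torch.zeros(1, 4, 3)      # (batch, seq_len, hidden)
mm_embeds = torch.ones(2, 3)              # one embedding per placeholder

mask = input_ids == PLACEHOLDER
# Same sanity check the helper performs before scattering:
assert int(mask.sum()) == mm_embeds.shape[0]
inputs_embeds[mask] = mm_embeds           # in-place, as in the utility

print(inputs_embeds[0, 1].tolist())  # [1.0, 1.0, 1.0] (placeholder slot filled)
print(inputs_embeds[0, 0].tolist())  # [0.0, 0.0, 0.0] (text slot untouched)
```

The list-of-ids branch in `merge_multimodal_embeddings` does the same thing, but builds the mask with `torch.isin` over several placeholder ids, which is why the embeddings must already be ordered to match the token order in `input_ids`.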