<!--
Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
⚠️ Note that this file is in Markdown but contains specific syntax that may not render properly in a regular Markdown viewer.
-->
# How to add a model to 🤗 Transformers?

The 🤗 Transformers library is often able to offer new models thanks to community contributors. This can be a challenging project, however, and requires in-depth knowledge of both the 🤗 Transformers library and the model to implement. At Hugging Face, we are trying to empower more of the community to actively add models, and we have put together this guide to walk you through the process of adding a PyTorch model (make sure you have [PyTorch installed](https://pytorch.org/get-started/locally/)).

Along the way, you will:

- get insights into open-source best practices
- understand the design principles behind one of the most popular deep learning libraries
- learn how to test large models efficiently
- learn how to integrate Python utilities such as `black`, `ruff`, and `make fix-copies` to keep code clean and readable

A member of the Hugging Face team will be available to support you, so you will never be on your own. 🤗 ❤️

To get started, open a [New model addition](https://github.com/huggingface/transformers/issues/new?assignees=&labels=New+model&template=new-model-addition.yml) issue for the model you want to see in 🤗 Transformers. If you are not especially picky about contributing a specific model, you can check the [New model label](https://github.com/huggingface/transformers/labels/New%20model) for unclaimed model requests and work on one of those.

Once you have opened a new model request, the first step is to get familiar with 🤗 Transformers!
## General overview of 🤗 Transformers

First, you should get a general overview of 🤗 Transformers. 🤗 Transformers is a very opinionated library, so you may not agree with some of its philosophies or design choices. From our experience, however, the fundamental design choices and philosophies of the library are crucial to scaling 🤗 Transformers efficiently while keeping maintenance costs at a reasonable level.

A good starting point for understanding the library better is to read the [documentation of our philosophy](philosophy). As a result of our way of working, there are some choices that we try to apply to all models:

- Composition is generally favored over abstraction.
- Duplicating code is not always bad if it strongly improves the readability or accessibility of a model.
- Model files are as self-contained as possible, so that when you read the code of a specific model, you ideally only have to look into the respective `modeling_....py` file.

In our opinion, the library's code is not just a means to provide a product, *e.g.* the ability to use BERT for inference, but also the product itself.
### Overview of models

To successfully add a model, it is important to understand the interaction between your model and its config, [`PreTrainedModel`], and [`PretrainedConfig`]. For illustration purposes, we will call the model to be added to 🤗 Transformers `BrandNewBert`.

Let's take a look:

<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers_overview.png"/>

As you can see, 🤗 Transformers does make use of inheritance, but the level of abstraction is kept to an absolute minimum. There are never more than two levels of abstraction for any model in the library. `BrandNewBertModel` inherits from `BrandNewBertPreTrainedModel`, which in turn inherits from [`PreTrainedModel`], and that's it.

As a general rule, we want a new model to depend only on [`PreTrainedModel`]. The important functionalities that are automatically provided to every new model are [`~PreTrainedModel.from_pretrained`] and [`~PreTrainedModel.save_pretrained`], which are used for serialization and deserialization. All other important functionalities, such as `BrandNewBertModel.forward`, should be completely defined in the new `modeling_brand_new_bert.py` script.

Next, we want a model with a specific head layer, such as `BrandNewBertForMaskedLM`, not to inherit from `BrandNewBertModel`. Instead, it should use `BrandNewBertModel` as a component that is called in its forward pass, which keeps the level of abstraction low.
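
The composition rule described above can be sketched with plain Python classes standing in for `torch.nn.Module` subclasses. All `Toy...` names below are hypothetical and exist only for this illustration; they are not part of 🤗 Transformers. The head model owns a base model and calls it in its forward pass instead of inheriting from it.

```python
class ToyPreTrainedModel:
    """Hypothetical stand-in for PreTrainedModel: it only stores the config."""

    def __init__(self, config: dict):
        self.config = config


class ToyBrandNewBertModel(ToyPreTrainedModel):
    """Base model: maps each input id to one hidden vector."""

    def forward(self, input_ids: list) -> list:
        hidden_size = self.config["hidden_size"]
        return [[float(token_id)] * hidden_size for token_id in input_ids]


class ToyBrandNewBertForMaskedLM(ToyPreTrainedModel):
    """Head model: holds the base model as a component (no inheritance from it)."""

    def __init__(self, config: dict):
        super().__init__(config)
        self.brand_new_bert = ToyBrandNewBertModel(config)

    def forward(self, input_ids: list) -> list:
        hidden_states = self.brand_new_bert.forward(input_ids)
        # Toy "head": reduce every hidden vector to a single score.
        return [sum(vector) for vector in hidden_states]


masked_lm = ToyBrandNewBertForMaskedLM({"hidden_size": 4})
scores = masked_lm.forward([1, 2, 3])
print(scores)
```

Both classes sit exactly one level below the shared parent, which mirrors the "never more than two levels of abstraction" rule.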

A new model always requires a configuration class called `BrandNewBertConfig`. This configuration is always stored as an attribute of [`PreTrainedModel`], and can therefore be accessed through the `config` attribute on every class that inherits from `BrandNewBertPreTrainedModel`:
```python
model = BrandNewBertModel.from_pretrained("brandy/brand_new_bert")
model.config # model has access to its config
```
Similar to the model, the configuration inherits basic serialization and deserialization functionalities from [`PretrainedConfig`]. Note that the configuration and the model are always serialized into two different formats: the model to a *pytorch_model.bin* file and the configuration to a *config.json* file. Calling [`~PreTrainedModel.save_pretrained`] automatically calls [`~PretrainedConfig.save_pretrained`], so that both the model and the configuration are saved.
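
To make the two-file layout concrete, here is a minimal sketch using only the standard library. `ToyConfig`, `ToyModel`, and the `toy_weights.json` file name are assumptions made up for this example; the real library writes the weights to *pytorch_model.bin* and uses its own classes. The point shown is only that saving the model also saves its configuration as a separate *config.json*.

```python
import json
import tempfile
from pathlib import Path


class ToyConfig:
    """Hypothetical configuration class that round-trips through config.json."""

    def __init__(self, hidden_size: int = 8, num_layers: int = 2):
        self.hidden_size = hidden_size
        self.num_layers = num_layers

    def save_pretrained(self, folder: str) -> None:
        Path(folder).mkdir(parents=True, exist_ok=True)
        with open(Path(folder) / "config.json", "w") as config_file:
            json.dump(vars(self), config_file, indent=2)

    @classmethod
    def from_pretrained(cls, folder: str) -> "ToyConfig":
        with open(Path(folder) / "config.json") as config_file:
            return cls(**json.load(config_file))


class ToyModel:
    """Hypothetical model: saving it also saves its config, in a separate file."""

    def __init__(self, config: ToyConfig):
        self.config = config
        self.weights = [0.0] * config.hidden_size

    def save_pretrained(self, folder: str) -> None:
        self.config.save_pretrained(folder)
        with open(Path(folder) / "toy_weights.json", "w") as weights_file:
            json.dump(self.weights, weights_file)


with tempfile.TemporaryDirectory() as folder:
    ToyModel(ToyConfig(hidden_size=16)).save_pretrained(folder)
    saved_files = sorted(path.name for path in Path(folder).iterdir())
    reloaded_config = ToyConfig.from_pretrained(folder)

print(saved_files, reloaded_config.hidden_size)
```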
### Code style

When coding your new model, keep in mind that Transformers is an opinionated library and that we have a few quirks of our own regarding how code should be written :-)

1. The forward pass of your model should be written completely in the modeling file while being fully independent of other models in the library. If you want to reuse a block from another model, copy the code and paste it with a `# Copied from` comment on top (see [here](https://github.com/huggingface/transformers/blob/v4.17.0/src/transformers/models/roberta/modeling_roberta.py#L160) for a good example and [there](pr_checks#check-copies) for more documentation on copying).
2. The code should be fully understandable. This means you should pick descriptive variable names and avoid abbreviations. For example, `activation` is preferred to `act`. One-letter variable names are strongly discouraged unless they are an index in a for loop.
3. More generally, we prefer longer explicit code to short magical code.
4. Avoid subclassing `nn.Sequential` in PyTorch; subclass `nn.Module` and write the forward pass, so that anyone using your code can quickly debug it by adding print statements or breakpoints.
5. Function signatures should be type-annotated. For the rest, good variable names are much more readable and understandable than type annotations.
### Overview of tokenizers

Not quite ready yet :-( This section will be added soon!
## Step-by-step recipe to add a model to 🤗 Transformers

Everyone has different preferences for how to port a model, so it can be very helpful to look at summaries of how other contributors ported models to 🤗 Transformers. Here is a list of community blog posts on porting a model:

1. [Porting GPT2 Model](https://medium.com/huggingface/from-tensorflow-to-pytorch-265f40ef2a28) by [Thomas](https://huggingface.co/thomwolf)
2. [Porting WMT19 MT Model](https://huggingface.co/blog/porting-fsmt) by [Stas](https://huggingface.co/stas)

From experience, the most important things to keep in mind when adding a model are:

- Don't reinvent the wheel! Most of the code you will add for the new 🤗 Transformers model already exists somewhere in 🤗 Transformers. Take some time to find similar, already existing models and tokenizers you can copy from. [grep](https://www.gnu.org/software/grep/) and [rg](https://github.com/BurntSushi/ripgrep) are your friends. Note that your model's tokenizer may be based on one model implementation while its modeling code is based on another one. For example, FSMT's modeling code is based on BART, while FSMT's tokenizer code is based on XLM.
- It is more of an engineering challenge than a scientific challenge. You should spend more time creating an efficient debugging environment than trying to understand all the theoretical aspects of the model in the paper.
- Ask for help when you are stuck! Models are the core component of 🤗 Transformers, so we at Hugging Face are more than happy to help you at every step of adding your model. Don't hesitate to ask if you notice that you are not making progress.

In the following, we try to give you a general recipe that we found most useful when porting a model to 🤗 Transformers.

The following list is a summary of everything that has to be done to add a model, and you can use it as a to-do list:

- ☐ (Optional) Understand the model's theoretical aspects
- ☐ Prepare the 🤗 Transformers dev environment
- ☐ Set up a debugging environment for the original repository
- ☐ Create a script that successfully runs the `forward()` pass using the original repository and checkpoint
- ☐ Successfully add the model skeleton to 🤗 Transformers
- ☐ Successfully convert the original checkpoint to a 🤗 Transformers checkpoint
- ☐ Successfully run the `forward()` pass in 🤗 Transformers so that it gives identical output to the original checkpoint
- ☐ Finish the model tests in 🤗 Transformers
- ☐ Successfully add the tokenizer in 🤗 Transformers
- ☐ Run end-to-end integration tests
- ☐ Finish the docs
- ☐ Upload the model weights to the Hub
- ☐ Submit the pull request
- ☐ (Optional) Add a demo notebook

To begin with, we usually recommend starting by getting a good theoretical understanding of `BrandNewBert`. However, if you prefer to understand the theoretical aspects of the model *on the job*, it is totally fine to dive directly into the `BrandNewBert` code base. This option might suit you better if your engineering skills are stronger than your theoretical skills, if you have trouble understanding the `BrandNewBert` paper, or if you simply enjoy programming much more than reading scientific papers.
### 1. (Optional) Theoretical aspects of BrandNewBert

If a paper describing *BrandNewBert* exists, you should take some time to read it. There might be large sections of the paper that are difficult to understand. If this is the case, don't worry! The goal is not to get a deep theoretical understanding of the paper, but to extract the information needed to effectively re-implement the model in 🤗 Transformers. That said, you don't have to spend too much time on the theoretical aspects; focus on the practical ones instead, namely:

- What type of model is *brand_new_bert*? A BERT-like encoder-only model? A GPT2-like decoder-only model? A BART-like encoder-decoder model? Look at the [model_summary](model_summary) if you are not familiar with the differences between those.
- What are the applications of *brand_new_bert*? Text classification? Text generation? Seq2Seq tasks, *e.g.* summarization?
- What is the novel feature of the model that makes it different from BERT/GPT-2/BART?
- Which of the already existing [🤗 Transformers models](https://huggingface.co/transformers/#contents) is most similar to *brand_new_bert*?
- What type of tokenizer is used? A SentencePiece tokenizer? A WordPiece tokenizer? Is it the same tokenizer as used for BERT or BART?

Once you feel you have a good overview of the model's architecture, you may want to write to the Hugging Face team with any questions you have. These might include questions about the model's architecture, its attention layer, and so on. We will be more than happy to help you.
### 2. Next prepare your environment

1. Fork the [repository](https://github.com/huggingface/transformers) by clicking on the "Fork" button on the repository's page. This creates a copy of the code under your GitHub user account.
2. Clone your `transformers` fork to your local disk, and add the base repository as a remote:
```bash
git clone https://github.com/[your Github handle]/transformers.git
cd transformers
git remote add upstream https://github.com/huggingface/transformers.git
```
3. Set up a development environment, for instance by running the following commands:
```bash
python -m venv .env
source .env/bin/activate
pip install -e ".[dev]"
```
Depending on your OS, and since the number of optional dependencies of Transformers is growing, you might get a failure with this command. If that is the case, make sure you install the deep learning framework you are working with (PyTorch, TensorFlow and/or Flax), then run:
```bash
pip install -e ".[quality]"
```
which should be enough for most use cases. You can then return to the parent directory:
```bash
cd ..
```
4. We recommend adding the PyTorch version of *brand_new_bert* to Transformers. To install PyTorch, follow the instructions on https://pytorch.org/get-started/locally/.

**Note:** You don't need to have CUDA installed. Making the new model work on CPU is sufficient.

5. To port *brand_new_bert*, you will also need access to its original repository:
```bash
git clone https://github.com/org_that_created_brand_new_bert_org/brand_new_bert.git
cd brand_new_bert
pip install -e .
```
Now you have set up a development environment to port *brand_new_bert* to 🤗 Transformers.

### 3.-4. Run a pretrained checkpoint using the original repository

At first, you will work on the original *brand_new_bert* repository. Often, the original implementation is very "researchy", meaning that documentation might be lacking and the code can be difficult to understand. But this should be exactly your motivation to re-implement *brand_new_bert*. At Hugging Face, one of our main goals is to take a working model and rewrite it to make it as **accessible, user-friendly, and beautiful** as possible. This is the number-one motivation to re-implement models in 🤗 Transformers: trying to make complex new NLP technology accessible to **everybody**.

You should start by diving into the original repository.

Successfully running the official pretrained model in the original repository is often **the most difficult** step. From our experience, it is very important to spend some time getting familiar with the original code base. You need to figure out the following:

- Where can you find the pretrained weights?
- How do you load the pretrained weights into the corresponding model?
- How do you run the tokenizer independently from the model?
- Trace one forward pass so that you know which classes and functions are required for a simple forward pass. Usually, you only have to re-implement those functions.
- Be able to locate the important components of the model: Where is the model's class? Are there model sub-classes, *e.g.* EncoderModel, DecoderModel? Where is the self-attention layer? Are there multiple different attention layers, *e.g.* *self-attention*, *cross-attention*?
- How can you debug the model in the original environment of the repository? Do you have to add *print* statements, can you work with an interactive debugger like *ipdb*, or should you use an efficient IDE like PyCharm to debug the model?

It is very important that, before you start the porting process, you can **efficiently** debug code in the original repository! Also, remember that you are working with an open-source library, so do not hesitate to open an issue, or even a pull request, in the original repository. The maintainers of this repository are most likely very happy about someone looking into their code!

At this point, it is really up to you which debugging environment and strategy you prefer for debugging the original model. What matters first is that you can debug the code in the original repository. We advise against setting up a GPU environment at this stage. Work on CPU first, both while exploring the original repository and while writing the 🤗 Transformers implementation. Only at the very end, once the model has been successfully ported to 🤗 Transformers, should you verify that it also works as expected on GPU.

In general, there are two possible debugging environments for running the original model:

- [Jupyter notebooks](https://jupyter.org/) / [google colab](https://colab.research.google.com/notebooks/intro.ipynb)
- Local Python scripts.

Jupyter notebooks have the advantage that they allow cell-by-cell execution, which helps to split logical components from one another and to store intermediate results, giving faster debugging cycles. Notebooks are also often easy to share with other contributors, which can be very helpful if you want to ask the Hugging Face team for help. If you are familiar with Jupyter notebooks, working with them is a sensible choice.

Whichever environment you choose, a script that loads a pretrained checkpoint in the original repository and runs a single forward pass on a dummy vector of input IDs could look like this (in pseudocode):
```python
model = BrandNewBertModel.load_pretrained_checkpoint("/path/to/checkpoint/")
input_ids = [0, 4, 5, 2, 3, 7, 9] # vector of input ids
original_output = model.predict(input_ids)
```
Next, regarding the debugging strategy, there are generally a few options to choose from:

- Decompose the original model into many small testable components and run a forward pass on each of those for verification
- Decompose the original model only into the original *tokenizer* and the original *model*, run a forward pass on those, and use intermediate print statements or breakpoints for verification

Again, it is up to you which strategy to choose. Often, one or the other is advantageous depending on the original code base.

If the original code base allows you to decompose the model into smaller sub-components, *e.g.* if the original code base can easily be run in eager mode, it is usually worth the effort to do so. There are some important advantages to taking the more difficult road in the beginning:

- At a later stage, when comparing the original model to the 🤗 Transformers implementation, you can verify automatically for each component individually that the corresponding component of the 🤗 Transformers implementation matches, instead of relying on visual comparison via print statements
- It lets you decompose the big problem of porting a model into smaller problems of porting individual components, and thus structure your work better
- Separating the model into logical, meaningful components helps you get a better overview of the model's design and thus understand the model better
- At a later stage, those component-by-component tests help you ensure that no regression occurs as you continue changing your code

[Lysandre's](https://gist.github.com/LysandreJik/db4c948f6b4483960de5cbac598ad4ed) integration checks for ELECTRA give a nice example of how this can be done.

However, if the original code base is very complex or only allows intermediate components to be run in a compiled mode, it might be too time-consuming or even impossible to separate the model into smaller testable sub-components. A good example is [T5's MeshTensorFlow](https://github.com/tensorflow/mesh/tree/master/mesh_tensorflow) library, which is very complex and does not offer a simple way to decompose the model into its sub-components. For such libraries, one often relies on verifying print statements.

No matter which strategy you choose, the recommended procedure is usually the same: start debugging the first layers first and the last layers last.

It is recommended that you retrieve the outputs, either by print statements or sub-component functions, of the following layers in the following order:

1. Retrieve the input IDs passed to the model
2. Retrieve the word embeddings
3. Retrieve the input of the first Transformer layer
4. Retrieve the output of the first Transformer layer
5. Retrieve the output of the following n - 1 Transformer layers
6. Retrieve the output of the whole BrandNewBert model

The input IDs should be an array of integers, *e.g.* `input_ids = [0, 4, 4, 3, 2, 4, 1, 7, 19]`.

The outputs of the following layers often consist of multi-dimensional float arrays and can look like this:
```
[[
[-0.1465, -0.6501, 0.1993, ..., 0.1451, 0.3430, 0.6024],
[-0.4417, -0.5920, 0.3450, ..., -0.3062, 0.6182, 0.7132],
[-0.5009, -0.7122, 0.4548, ..., -0.3662, 0.6091, 0.7648],
...,
[-0.5613, -0.6332, 0.4324, ..., -0.3792, 0.7372, 0.9288],
[-0.5416, -0.6345, 0.4180, ..., -0.3564, 0.6992, 0.9191],
[-0.5334, -0.6403, 0.4271, ..., -0.3339, 0.6533, 0.8694]]],
```
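
The retrieval order listed above can be sketched as a small recorder that runs named layers one after another and keeps every intermediate output. This is a toy under stated assumptions: the layers are hypothetical lambdas operating on nested Python lists, whereas a real model would expose tensors through print statements, breakpoints, or sub-component functions.

```python
def run_and_record(named_layers, input_ids):
    """Run the layers in order and keep the output of every step by name."""
    recorded = {"input_ids": list(input_ids)}
    hidden = input_ids
    for name, layer in named_layers:
        hidden = layer(hidden)
        recorded[name] = hidden
    return recorded


# Hypothetical toy layers standing in for embeddings and Transformer layers.
toy_layers = [
    ("word_embeddings", lambda ids: [[0.1 * i, -0.1 * i] for i in ids]),
    ("layer_0", lambda rows: [[2.0 * value for value in row] for row in rows]),
    ("layer_1", lambda rows: [[value + 1.0 for value in row] for row in rows]),
]

recorded = run_and_record(toy_layers, [0, 4, 4, 3])
for name, value in recorded.items():
    print(name, value)
```

Storing the outputs by name in execution order makes it straightforward to compare them later, step by step, against the outputs of the re-implementation.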
We expect every model added to 🤗 Transformers to pass a couple of integration tests, meaning that the original model and the re-implemented version in 🤗 Transformers have to give the exact same output up to a precision of 0.001! Since it is normal for the exact same model written in different libraries to give slightly different outputs depending on the library framework, we accept an error tolerance of 1e-3 (0.001). It is not enough for the model to give nearly the same output; the outputs have to be almost identical. Therefore, you will certainly compare the intermediate outputs of the 🤗 Transformers version with the intermediate outputs of the original implementation of *brand_new_bert* multiple times, in which case an **efficient** debugging environment for the original repository is absolutely important. Here is some advice to make your debugging environment as efficient as possible.

- Find the best way of debugging intermediate results. Is the original repository written in PyTorch? Then you should probably take the time to write a longer script that decomposes the original model into smaller sub-components to retrieve intermediate values. Is the original repository written in Tensorflow 1? Then you might have to rely on TensorFlow print operations like [tf.print](https://www.tensorflow.org/api_docs/python/tf/print) to output intermediate values. Is the original repository written in Jax? Then make sure that the model is **not jitted** when running the forward pass, *e.g.* check out [this link](https://github.com/google/jax/issues/196).
- Use the smallest pretrained checkpoint you can find. The smaller the checkpoint, the faster your debugging cycle becomes. It is not efficient if your pretrained model is so big that a forward pass takes more than 10 seconds. If only very large checkpoints are available, it might make more sense to create a dummy model in the new environment with randomly initialized weights and save those weights for comparison with the 🤗 Transformers version of your model.
- Make sure you are using the easiest way of calling a forward pass in the original repository. Ideally, you want to find the function in the original repository that **only** calls a single forward pass, *i.e.* one that is often called `predict`, `evaluate`, `forward` or `__call__`. You don't want to debug a function that calls `forward` multiple times, *e.g.* to generate text, like `autoregressive_sample` or `generate`.
- Try to separate the tokenization from the model's *forward* pass. If the original repository shows examples where you have to input a string, try to find out where in the forward call the string input is changed to input IDs, and start from this point. This might mean that you have to write a small script yourself or change the original code so that you can directly input the IDs instead of an input string.
- Make sure that the model in your debugging setup is **not** in training mode, which often causes the model to yield random outputs due to multiple dropout layers in the model. Make sure that the forward pass in your debugging environment is **deterministic** so that the dropout layers are not used. Alternatively, use `transformers.set_seed` if the old and new implementations are in the same framework.
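
The 1e-3 tolerance mentioned above can be checked mechanically rather than by eye. The following is a minimal sketch that uses nested Python lists in place of tensors; the helper names `flatten`, `max_abs_difference`, and `outputs_match` are made up for this example.

```python
def flatten(values):
    """Flatten arbitrarily nested lists of numbers into one flat list of floats."""
    if isinstance(values, (int, float)):
        return [float(values)]
    flat = []
    for item in values:
        flat.extend(flatten(item))
    return flat


def max_abs_difference(original_output, ported_output):
    """Largest element-wise absolute difference between two equally sized outputs."""
    original_flat = flatten(original_output)
    ported_flat = flatten(ported_output)
    if len(original_flat) != len(ported_flat):
        raise ValueError("Outputs contain a different number of elements")
    return max(
        (abs(a - b) for a, b in zip(original_flat, ported_flat)),
        default=0.0,
    )


def outputs_match(original_output, ported_output, tolerance=1e-3):
    """True if every element differs by at most `tolerance`."""
    return max_abs_difference(original_output, ported_output) <= tolerance


original = [[-0.1465, -0.6501], [-0.4417, -0.5920]]
close_port = [[-0.1467, -0.6499], [-0.4419, -0.5921]]
wrong_port = [[-0.1465, -0.6501], [-0.4417, -0.5820]]

print(outputs_match(original, close_port))
print(outputs_match(original, wrong_port))
```

With PyTorch tensors, `torch.allclose(original, ported, atol=1e-3)` performs a comparable check (it also applies a relative tolerance).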

The following section gives you more specific details and tips on how you can do this for *brand_new_bert*.

### 5.-14. Port BrandNewBert to 🤗 Transformers

Next, you can finally start adding new code to 🤗 Transformers. Go into the clone of your fork of 🤗 Transformers:
```bash
cd transformers
```
In the special case that you are adding a model whose architecture exactly matches the architecture of an existing model, you only have to add a conversion script as described in [this section](#write-a-conversion-script). In this case, you can reuse the whole model architecture of the already existing model.

Otherwise, let's start generating a new model. We recommend using the following script to add a model starting from an existing model:
```bash
transformers-cli add-new-model-like
```
You will be prompted with a questionnaire to fill in the basic information of your model.

**Open a Pull Request on the main huggingface/transformers repo**

Before starting to adapt the automatically generated code, now is the time to open a "Work in progress (WIP)" pull request, *e.g.* "[WIP] Add *brand_new_bert*", in 🤗 Transformers so that you and the Hugging Face team can work side by side on integrating the model into 🤗 Transformers.

You should do the following:

1. Create a branch with a descriptive name from your main branch
```bash
git checkout -b add_brand_new_bert
```
2. Commit the automatically generated code:
```bash
git add .
git commit
```
3. Fetch and rebase to the current main branch
```bash
git fetch upstream
git rebase upstream/main
```
4. Push the changes to your account using:
```bash
git push -u origin add_brand_new_bert
```
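5. Once you are satisfied, go to the webpage of your fork on GitHub. Click on "Pull request". Make sure to add the GitHub handles of some members of the Hugging Face team as reviewers, so that the Hugging Face team gets notified of future changes.
6. Change the PR into a draft by clicking on "Convert to draft" on the right of the GitHub pull request web page.

In the following, whenever you have made some progress, don't forget to commit your work and push it to your account so that it shows up in the pull request. Additionally, you should make sure to update your work with the current main branch from time to time by doing: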
```bash
git fetch upstream
git merge upstream/main
```
In general, all questions you might have regarding the model or your implementation should be asked in your PR and discussed/solved in the PR. This way, the Hugging Face team is always notified when you commit new code or when you have a question. It is often very helpful to point the Hugging Face team to your added code so that the team can efficiently understand your problem or question.

To do so, go to the "Files changed" tab where you see all of your changes, go to a line you would like to ask a question about, and click on the "+" symbol to add a comment. Whenever a question or problem has been solved, you can click on the "Resolve" button of the created comment.

In the same way, the Hugging Face team will open comments when reviewing your code. We recommend asking most questions on GitHub in your PR. For very general questions that are not very useful to the public, feel free to ping the Hugging Face team on Slack or by email.

**5. Adapt the generated model code for brand_new_bert**

At first, focus only on the model itself and do not care about the tokenizer. All the relevant code should be found in the generated files `src/transformers/models/brand_new_bert/modeling_brand_new_bert.py` and `src/transformers/models/brand_new_bert/configuration_brand_new_bert.py`.

Now you can finally start coding :smile:. The generated code in `src/transformers/models/brand_new_bert/modeling_brand_new_bert.py` will either have the same architecture as BERT if it is an encoder-only model, or the same architecture as BART if it is an encoder-decoder model. At this point, you should remind yourself what you have learned about the theoretical aspects of the model: *How does the model differ from BERT or BART?* Implement those changes, which often means changing the *self-attention* layer, the order of the normalization layer, and so on. Again, it is often useful to look at the similar architecture of already existing models in Transformers to get a better feeling for how your model should be implemented.

**Note** that at this point, you don't have to be sure that your code is fully correct or clean. Rather, it is advised to add a first *unclean*, copy-pasted version of the original code to `src/transformers/models/brand_new_bert/modeling_brand_new_bert.py` until you feel that all the necessary code has been added. From our experience, it is much more efficient to quickly add a first version of the required code and then improve/correct it iteratively with the conversion script described in the next section. The only thing that has to work at this point is that you can instantiate the 🤗 Transformers implementation of *brand_new_bert*, *i.e.* the following command should work:
```python
from transformers import BrandNewBertModel, BrandNewBertConfig
model = BrandNewBertModel(BrandNewBertConfig())
```
The command above creates a model according to the default parameters defined in `BrandNewBertConfig()`, thus making sure that the `init()` methods of all components work.

Note that all random initialization should happen in the `_init_weights` method of your `BrandNewBertPreTrainedModel` class. It should initialize all leaf modules depending on the variables of the config. Here is an example with the BERT `_init_weights` method:
```py
def _init_weights(self, module):
    """Initialize the weights"""
    if isinstance(module, nn.Linear):
        module.weight.data.normal_(mean=0.0, std=self.config.initializer_range)
        if module.bias is not None:
            module.bias.data.zero_()
    elif isinstance(module, nn.Embedding):
        module.weight.data.normal_(mean=0.0, std=self.config.initializer_range)
        if module.padding_idx is not None:
            module.weight.data[module.padding_idx].zero_()
    elif isinstance(module, nn.LayerNorm):
        module.bias.data.zero_()
        module.weight.data.fill_(1.0)
```
You can have some more custom schemes if you need a special initialization for some modules. For instance, in `Wav2Vec2ForPreTraining`, the last two linear layers need the initialization of the regular PyTorch `nn.Linear`, while all the other layers should use an initialization as above. This is coded like this:
```py
def _init_weights(self, module):
    """Initialize the weights"""
    if isinstance(module, Wav2Vec2ForPreTraining):
        module.project_hid.reset_parameters()
        module.project_q.reset_parameters()
        module.project_hid._is_hf_initialized = True
        module.project_q._is_hf_initialized = True
    elif isinstance(module, nn.Linear):
        module.weight.data.normal_(mean=0.0, std=self.config.initializer_range)
        if module.bias is not None:
            module.bias.data.zero_()
```
The `_is_hf_initialized` flag is used internally to make sure that a submodule is only initialized once. By setting it to `True` for `module.project_q` and `module.project_hid`, we make sure that the custom initialization is not overridden later on and that the `_init_weights` function is not applied to them.
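
The effect of such a flag can be illustrated with a toy traversal. This is not the library's actual implementation; `ToyModule` and `apply_default_init` are hypothetical names, and the sketch only shows the idea that a module already marked as initialized is skipped by the default scheme.

```python
class ToyModule:
    """Hypothetical module: a name, some children, and a record of how it was initialized."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.init_scheme = None


def apply_default_init(module, visited_names):
    """Apply the default scheme recursively, skipping modules flagged as initialized."""
    for child in module.children:
        apply_default_init(child, visited_names)
    if getattr(module, "_is_hf_initialized", False):
        return  # a custom scheme already ran; do not override it
    module.init_scheme = "default"
    module._is_hf_initialized = True
    visited_names.append(module.name)


project_q = ToyModule("project_q")
dense = ToyModule("dense")
root = ToyModule("root", [project_q, dense])

# Custom initialization for one submodule, then mark it as done.
project_q.init_scheme = "custom"
project_q._is_hf_initialized = True

visited = []
apply_default_init(root, visited)
print(visited, project_q.init_scheme, dense.init_scheme)
```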

**6. Write a conversion script**

Next, you should write a conversion script that lets you convert the checkpoint you used to debug *brand_new_bert* in the original repository to a checkpoint compatible with your newly created 🤗 Transformers implementation of *brand_new_bert*. It is not advised to write the conversion script from scratch, but rather to look through already existing conversion scripts in 🤗 Transformers for one that has been used to convert a similar model written in the same framework as *brand_new_bert*. Usually, it is enough to copy an already existing conversion script and slightly adapt it for your use case. Don't hesitate to ask the Hugging Face team to point you to a similar already existing conversion script for your model.

- If you are porting a model from TensorFlow to PyTorch, a good starting point might be BERT's conversion script [here](https://github.com/huggingface/transformers/blob/7acfa95afb8194f8f9c1f4d2c6028224dbed35a2/src/transformers/models/bert/modeling_bert.py#L91)
- If you are porting a model from PyTorch to PyTorch, a good starting point might be BART's conversion script [here](https://github.com/huggingface/transformers/blob/main/src/transformers/models/bart/convert_bart_original_pytorch_checkpoint_to_pytorch.py)

In the following, we'll quickly explain how PyTorch models store layer weights and define layer names. In PyTorch, the name of a layer is defined by the name of the class attribute you give the layer. Let's define a dummy model in PyTorch, called `SimpleModel`, as follows:
```python
from torch import nn


class SimpleModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.dense = nn.Linear(10, 10)
        self.intermediate = nn.Linear(10, 10)
        self.layer_norm = nn.LayerNorm(10)
```
Now we can create an instance of this model definition, which fills all the weights `dense`, `intermediate`, and `layer_norm` with random weights. We can print the model to see its architecture:
```python
model = SimpleModel()
print(model)
```
This will print out the following:
```
SimpleModel(
  (dense): Linear(in_features=10, out_features=10, bias=True)
  (intermediate): Linear(in_features=10, out_features=10, bias=True)
  (layer_norm): LayerNorm((10,), eps=1e-05, elementwise_affine=True)
)
```
We can see that the layer names are defined by the names of the class attributes in PyTorch. You can print out the weight values of a specific layer:
```python
print(model.dense.weight.data)
```
to see that the weights were randomly initialized:
```
tensor([[-0.0818, 0.2207, -0.0749, -0.0030, 0.0045, -0.1569, -0.1598, 0.0212,
-0.2077, 0.2157],
[ 0.1044, 0.0201, 0.0990, 0.2482, 0.3116, 0.2509, 0.2866, -0.2190,
0.2166, -0.0212],
[-0.2000, 0.1107, -0.1999, -0.3119, 0.1559, 0.0993, 0.1776, -0.1950,
-0.1023, -0.0447],
[-0.0888, -0.1092, 0.2281, 0.0336, 0.1817, -0.0115, 0.2096, 0.1415,
-0.1876, -0.2467],
[ 0.2208, -0.2352, -0.1426, -0.2636, -0.2889, -0.2061, -0.2849, -0.0465,
0.2577, 0.0402],
[ 0.1502, 0.2465, 0.2566, 0.0693, 0.2352, -0.0530, 0.1859, -0.0604,
0.2132, 0.1680],
[ 0.1733, -0.2407, -0.1721, 0.1484, 0.0358, -0.0633, -0.0721, -0.0090,
0.2707, -0.2509],
[-0.1173, 0.1561, 0.2945, 0.0595, -0.1996, 0.2988, -0.0802, 0.0407,
0.1829, -0.1568],
[-0.1164, -0.2228, -0.0403, 0.0428, 0.1339, 0.0047, 0.1967, 0.2923,
0.0333, -0.0536],
[-0.1492, -0.1616, 0.1057, 0.1950, -0.2807, -0.2710, -0.1586, 0.0739,
0.2220, 0.2358]]).
```
In the conversion script, you should fill those randomly initialized weights with the exact weights of the corresponding layer in the checkpoint, *e.g.*
```python
# retrieve matching layer weights, e.g. by
# recursive algorithm
layer_name = "dense"
pretrained_weight = array_of_dense_layer
model_pointer = getattr(model, "dense")
model_pointer.weight.data = torch.from_numpy(pretrained_weight)
```
While doing so, you must verify that each randomly initialized weight of your PyTorch model and its corresponding pretrained checkpoint weight exactly match in both **shape and name**. To do so, it is **necessary** to add assert statements for the shape and to print out the names of the checkpoint weights. For example, you should add statements like:
```python
# Compare the shape of the randomly initialized weight with the checkpoint array.
assert (
    model_pointer.weight.shape == pretrained_weight.shape
), f"Pointer shape of random weight {model_pointer.weight.shape} and array shape of checkpoint weight {pretrained_weight.shape} mismatched"
```
Besides, you should also print out the names of both weights to make sure they match, *e.g.*
```python
logger.info(f"Initialize PyTorch weight {layer_name} from {pretrained_weight.name}")
```
If either the shape or the name does not match, you probably assigned the wrong checkpoint weight to a randomly initialized layer of the 🤗 Transformers implementation.

An incorrect shape is most likely due to a setting of the config parameters in `BrandNewBertConfig()` that does not exactly match those used for the checkpoint you want to convert. However, it could also be that PyTorch's implementation of a layer requires the weight to be transposed beforehand.

Finally, you should also check that **all** required weights are initialized, and print out all checkpoint weights that were not used for initialization, to make sure the model is converted correctly. It is completely normal for conversion trials to fail with either a wrong shape statement or a wrong name assignment. This is most likely because you used incorrect parameters in `BrandNewBertConfig()`, have a wrong architecture in the 🤗 Transformers implementation, have a bug in the `init()` function of one of the components of the 🤗 Transformers implementation, or need to transpose one of the checkpoint weights.

This step should be iterated with the previous step until all weights of the checkpoint are correctly loaded into the Transformers model.
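
The checks described above (matching names, matching shapes, transposing where needed, and reporting unused or missing weights) can be combined in one loop. The sketch below uses plain dictionaries and nested lists instead of real checkpoints and tensors; every name in it, including the checkpoint keys and the `name_map`, is invented for illustration.

```python
def shape_of(array):
    """Shape of a rectangular nested list, e.g. [[1, 2, 3]] -> (1, 3)."""
    shape = []
    while isinstance(array, list):
        shape.append(len(array))
        array = array[0] if array else None
    return tuple(shape)


def transpose(matrix):
    """Transpose a 2-D nested list."""
    return [list(column) for column in zip(*matrix)]


def convert_checkpoint(checkpoint, model_weights, name_map, transposed=()):
    """Copy checkpoint arrays into model_weights and report leftovers on both sides."""
    used = set()
    for old_name, new_name in name_map.items():
        array = checkpoint[old_name]
        if old_name in transposed:
            array = transpose(array)
        expected, found = shape_of(model_weights[new_name]), shape_of(array)
        assert expected == found, (
            f"Shape mismatch for {new_name}: model {expected} vs checkpoint {found}"
        )
        print(f"Initialize weight {new_name} from {old_name}")
        model_weights[new_name] = array
        used.add(old_name)
    unused = sorted(set(checkpoint) - used)
    missing = sorted(set(model_weights) - set(name_map.values()))
    return unused, missing


# Hypothetical checkpoint that stores the dense kernel as (in_features, out_features).
checkpoint = {
    "encoder/dense/kernel": [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
    "encoder/dense/bias": [0.5, 0.5, 0.5],
    "optimizer/step": [10],
}
# Hypothetical model that expects (out_features, in_features) for the same weight.
model_weights = {
    "dense.weight": [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]],
    "dense.bias": [0.0, 0.0, 0.0],
    "layer_norm.weight": [1.0, 1.0, 1.0],
}
name_map = {
    "encoder/dense/kernel": "dense.weight",
    "encoder/dense/bias": "dense.bias",
}

unused, missing = convert_checkpoint(
    checkpoint, model_weights, name_map, transposed={"encoder/dense/kernel"}
)
print("Unused checkpoint weights:", unused)
print("Model weights never initialized from the checkpoint:", missing)
```

Reporting both lists at the end of every conversion attempt makes it obvious whether a checkpoint entry was silently dropped or a model weight was left at its random value.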

Once the checkpoint is correctly loaded into the 🤗 Transformers implementation, you can save the model under a folder of your choice, `/path/to/converted/checkpoint/folder`, which should then contain both a `pytorch_model.bin` file and a `config.json` file:
```python
model.save_pretrained("/path/to/converted/checkpoint/folder")
```
**7. Implement the forward pass**

Having correctly loaded the pretrained weights into the 🤗 Transformers implementation, you should now make sure that the forward pass is implemented correctly. In [Get familiar with the original repository](#3-4-run-a-pretrained-checkpoint-using-the-original-repository), you have already created a script that runs a forward pass of the model using the original repository. Now you should write an analogous script that uses the 🤗 Transformers implementation instead of the original one. It should look as follows:
```python
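import torch

model = BrandNewBertModel.from_pretrained("/path/to/converted/checkpoint/folder")
# the model expects a batched tensor of token ids, not a plain Python list
input_ids = torch.tensor([[0, 4, 4, 3, 2, 4, 1, 7, 19]])
output = model(input_ids).last_hidden_state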
```
It is very likely that the 🤗 Transformers implementation and the original model implementation do not give exactly the same output on the first run, or that the forward pass throws an error. Don't be disappointed, this is expected!
First, you should make sure that the forward pass does not throw any errors.
It often happens that wrong dimensions are used, leading to a *dimensionality mismatch* error, or that the wrong data type is used, for example `torch.long` instead of `torch.float32`.
Don't hesitate to ask the Hugging Face team for help if you cannot resolve a particular error.
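The final part of making sure the 🤗 Transformers implementation works correctly is to verify that the outputs are equivalent to a precision of `1e-3`.
First, you should ensure that the output shapes are identical, that is, `outputs.shape` should yield the same value for the script using the 🤗 Transformers implementation and the script using the original implementation.
Next, you should make sure that the output values are identical as well.
This is one of the most difficult parts of adding a new model.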
Common reasons why the outputs are not identical are:

- Some layers were not added, for example an *activation* layer is missing or a residual connection was forgotten
- The word embedding matrix was not tied
- The wrong positional embeddings are used because the original implementation uses an offset
- Dropout is applied during the forward pass. To fix this, make sure *model.training is False* and that no dropout layer is accidentally activated during the forward pass, *i.e.* pass *self.training* to [PyTorch's functional dropout](https://pytorch.org/docs/stable/nn.functional.html?highlight=dropout#torch.nn.functional.dropout)
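
The dropout pitfall from the last bullet can be reproduced in a few lines. The toy module below is illustrative only (`ToyBlock` is not part of any library); it passes `self.training` to the functional dropout so that `model.eval()` disables it.

```python
import torch
import torch.nn.functional as F
from torch import nn


class ToyBlock(nn.Module):
    def __init__(self, hidden_size=8, dropout=0.5):
        super().__init__()
        self.dense = nn.Linear(hidden_size, hidden_size)
        self.dropout = dropout

    def forward(self, hidden_states):
        hidden_states = self.dense(hidden_states)
        # Forgetting `training=self.training` would keep dropout active in eval mode,
        # because F.dropout defaults to training=True.
        return F.dropout(hidden_states, p=self.dropout, training=self.training)


torch.manual_seed(0)
block = ToyBlock()
inputs = torch.randn(2, 8)

block.eval()  # sets block.training = False
with torch.no_grad():
    first = block(inputs)
    second = block(inputs)

# In eval mode the forward pass is deterministic, so both runs agree exactly
print(torch.equal(first, second))
```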
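The best way to fix the problem is usually to look at the forward pass of the original implementation and of the 🤗 Transformers implementation side by side and check whether there are any differences.
Ideally, you should debug or print out intermediate outputs of both implementations of the forward pass to find the exact position in the network where the 🤗 Transformers implementation shows a different output than the original implementation.
First, make sure that the hard-coded `input_ids` in both scripts are identical.
Next, verify that the outputs of the first transformation of the `input_ids` (usually the word embeddings) are identical.
Then work your way up to the very last layer of the network.
At some point, you will notice a difference between the two implementations, which should point you to the bug in the 🤗 Transformers implementation.
From our experience, a simple and efficient way is to add many print statements at the same positions in the forward pass of both the original implementation and the 🤗 Transformers implementation, and to successively remove the print statements that show the same values for intermediate representations.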
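
As an alternative to manual print statements, PyTorch forward hooks can record intermediate outputs automatically. The sketch below assumes that both implementations are PyTorch modules with matching submodule names and that every submodule returns a single tensor, which is often not the case for an original repository. The two `nn.Sequential` models and the helper names are stand-ins for illustration.

```python
import copy

import torch
from torch import nn


def record_outputs(model, inputs):
    """Run one forward pass and return {submodule name: output tensor}."""
    recorded = {}
    handles = []

    def make_hook(name):
        def hook(module, args, output):
            recorded[name] = output.detach()

        return hook

    for name, module in model.named_modules():
        if name:  # the root module has an empty name; skip it
            handles.append(module.register_forward_hook(make_hook(name)))
    with torch.no_grad():
        model(inputs)
    for handle in handles:
        handle.remove()
    return recorded


def first_divergence(reference, candidate, inputs, atol=1e-3):
    """Return the name of the first submodule whose output differs, or None."""
    ref_outputs = record_outputs(reference, inputs)
    cand_outputs = record_outputs(candidate, inputs)
    for name, ref_out in ref_outputs.items():
        if not torch.allclose(ref_out, cand_outputs[name], atol=atol):
            return name
    return None


torch.manual_seed(0)
original = nn.Sequential(nn.Linear(4, 4), nn.Tanh(), nn.Linear(4, 2)).eval()
ported = copy.deepcopy(original)
with torch.no_grad():
    ported[2].bias.add_(1.0)  # simulate a conversion bug in the last layer

inputs = torch.randn(3, 4)
print(first_divergence(original, ported, inputs))
```

Because the outputs are recorded in execution order, the first reported name is the earliest point at which the two forward passes disagree.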
When you are confident that both implementations yield the same output, verify the outputs with `torch.allclose(original_output, output, atol=1e-3)`, and you are done with the most difficult part!
Congratulations, the work left to be done should be a cakewalk 😊.
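
A slightly more informative version of this final check also reports the shapes and the largest absolute difference, which helps when the comparison fails. The tensors below are made-up placeholders for the two outputs, and `compare_outputs` is a hypothetical helper.

```python
import torch


def compare_outputs(original_output, output, atol=1e-3):
    """Return (matches, max_abs_diff); shapes must agree before values are compared."""
    if original_output.shape != output.shape:
        raise ValueError(
            f"Shape mismatch: {tuple(original_output.shape)} vs {tuple(output.shape)}"
        )
    max_abs_diff = (original_output - output).abs().max().item()
    return torch.allclose(original_output, output, atol=atol), max_abs_diff


# Placeholder tensors standing in for the outputs of the two implementations
original_output = torch.tensor([[0.1, -0.2], [0.3, 0.4]])
output = original_output + 5e-4  # small numerical noise below the tolerance

matches, max_abs_diff = compare_outputs(original_output, output)
print(f"match={matches}, max abs diff={max_abs_diff:.2e}")
```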
**8. Add all necessary model tests**

At this point, you have successfully added a new model.
However, it is very possible that the model does not yet fully comply with the required design.
To make sure the implementation is fully compatible with 🤗 Transformers, all common tests should pass.
The Cookiecutter should have automatically added a test file for your model, probably under `tests/models/brand_new_bert/test_modeling_brand_new_bert.py`.
Run this test file to verify that all common tests pass:
```bash
pytest tests/models/brand_new_bert/test_modeling_brand_new_bert.py
```
Having fixed all common tests, it is now crucial to ensure that all the nice work you have done is well tested, so that

- a) the community can easily understand your work by looking at the specific tests of *brand_new_bert*
- b) future changes to your model will not break any important feature of the model.

First, integration tests should be added. These integration tests essentially do the same as the debugging scripts you used earlier. A template for these model tests has already been added by the Cookiecutter, called `BrandNewBertModelIntegrationTests`, and only has to be filled out by you. To ensure that these tests pass, run:
```bash
RUN_SLOW=1 pytest -sv tests/models/brand_new_bert/test_modeling_brand_new_bert.py::BrandNewBertModelIntegrationTests
```
<Tip>
If you are using Windows, replace `RUN_SLOW=1` with `SET RUN_SLOW=1`.
</Tip>
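
For orientation, an integration test typically runs a forward pass on hard-coded inputs and compares a slice of the output against hard-coded expected values. The sketch below uses a tiny deterministic stand-in module instead of a real checkpoint so that it runs anywhere. In the actual test you would load the converted *brand_new_bert* checkpoint and paste in values produced by the original implementation. The class names here are illustrative only.

```python
import unittest

import torch
from torch import nn


class TinyStandInModel(nn.Module):
    """Stand-in for a converted checkpoint: each embedding row is filled with its index."""

    def __init__(self, vocab_size=20, hidden_size=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        with torch.no_grad():
            rows = torch.arange(vocab_size, dtype=torch.float32).unsqueeze(1)
            self.embed.weight.copy_(rows.expand(vocab_size, hidden_size))

    def forward(self, input_ids):
        return self.embed(input_ids)


class StandInModelIntegrationTest(unittest.TestCase):
    def test_inference_no_head(self):
        model = TinyStandInModel().eval()
        input_ids = torch.tensor([[0, 4, 4, 3, 2, 4, 1, 7, 19]])
        with torch.no_grad():
            output = model(input_ids)

        self.assertEqual(output.shape, torch.Size((1, 9, 3)))
        # In a real test these values are copied from the original implementation
        expected_slice = torch.tensor([[0.0, 0.0, 0.0], [4.0, 4.0, 4.0], [4.0, 4.0, 4.0]])
        self.assertTrue(torch.allclose(output[0, :3, :3], expected_slice, atol=1e-3))


# Run the test case programmatically so the snippet also works outside pytest
suite = unittest.defaultTestLoader.loadTestsFromTestCase(StandInModelIntegrationTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```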
Second, all features that are special to *brand_new_bert* should be tested additionally in a separate test under `BrandNewBertModelTester`/`BrandNewBertModelTest`. This part is often forgotten, but it is extremely useful in two ways:

- It helps transfer the knowledge you acquired during the model addition to the community by showing how the special features of *brand_new_bert* should work.
- Future contributors can quickly test changes to the model by running those special tests.
**9. Implement the tokenizer**

Next, you should add the tokenizer of *brand_new_bert*. Usually, the tokenizer is equivalent or very similar to an already existing tokenizer of 🤗 Transformers.
To ensure that the tokenizer works correctly, it is recommended to first create a script in the original repository that takes a string as input and returns the `input_ids`.
It could look similar to this (in pseudo-code):
```python
input_str = "This is a long example input string containing special characters .$?-, numbers 2872 234 12 and words."
model = BrandNewBertModel.load_pretrained_checkpoint("/path/to/checkpoint/")
input_ids = model.tokenize(input_str)
```
You might have to take a deeper look into the original repository to find the correct tokenizer function,
or you might even have to change your clone of the original repository so that it only outputs the `input_ids`.
Having written a functional tokenization script that uses the original repository,
you should create an analogous script for 🤗 Transformers.
It should look similar to this:
```python
from transformers import BrandNewBertTokenizer
input_str = "This is a long example input string containing special characters .$?-, numbers 2872 234 12 and words."
tokenizer = BrandNewBertTokenizer.from_pretrained("/path/to/tokenizer/folder/")
input_ids = tokenizer(input_str).input_ids
```
When both `input_ids` yield the same values, a tokenizer test file should also be added as a final step.
Analogous to the modeling test files of *brand_new_bert*, the tokenization test files of *brand_new_bert* should contain a couple of hard-coded integration tests.
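
A hard-coded tokenizer integration test boils down to comparing the `input_ids` of the new tokenizer against ids recorded once from the original repository. The sketch below uses a toy whitespace tokenizer as a stand-in (it is not the *brand_new_bert* tokenizer) to show the shape of such a check; all names are illustrative.

```python
class ToyWhitespaceTokenizer:
    """Stand-in tokenizer: lower-cases, splits on whitespace, maps unknown words to id 0."""

    def __init__(self, vocab):
        self.vocab = vocab

    def __call__(self, text):
        return [self.vocab.get(token, 0) for token in text.lower().split()]


def assert_same_ids(expected_ids, actual_ids):
    """Fail at the first differing position to make debugging easier."""
    for position, (expected, actual) in enumerate(zip(expected_ids, actual_ids)):
        assert expected == actual, f"Mismatch at position {position}: {expected} != {actual}"
    assert len(expected_ids) == len(actual_ids), "Sequences differ in length"


vocab = {"<unk>": 0, "this": 1, "is": 2, "a": 3, "test": 4}
tokenizer = ToyWhitespaceTokenizer(vocab)

input_str = "This is a short test"
# In a real test these ids are recorded once from the original repository
expected_input_ids = [1, 2, 3, 0, 4]

input_ids = tokenizer(input_str)
assert_same_ids(expected_input_ids, input_ids)
```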
**10. Run end-to-end integration tests**

Having added the tokenizer, you should also add a couple of end-to-end integration tests that use both the model and the tokenizer to `tests/models/brand_new_bert/test_modeling_brand_new_bert.py` in 🤗 Transformers.
Such a test should show on a meaningful text-to-text sample that the 🤗 Transformers implementation works as expected.
A meaningful text-to-text sample can be, for example, a source-to-target translation pair, an article-to-summary pair, or a question-to-answer pair.
If none of the ported checkpoints has been fine-tuned on a downstream task, it is enough to rely on the model tests.
As a final step to ensure that the model is fully functional, it is advised that you also run all tests on GPU.
It can happen that you forgot to add some `.to(self.device)` statements to internal tensors of the model, which would show up as an error in such a test.
If you do not have access to a GPU, the Hugging Face team can run those tests for you.
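
A common source of such GPU-only failures is a tensor created inside `forward` without an explicit device. The toy module below (illustrative, not taken from any model) creates its position ids on the same device as the input, so it works on CPU and GPU alike.

```python
import torch
from torch import nn


class ToyPositionEmbeddings(nn.Module):
    def __init__(self, max_positions=16, hidden_size=4):
        super().__init__()
        self.position_embeddings = nn.Embedding(max_positions, hidden_size)

    def forward(self, input_ids):
        seq_length = input_ids.shape[1]
        # Without `device=...` this tensor would be created on the CPU even when the
        # module and its inputs live on the GPU, causing a device mismatch error.
        position_ids = torch.arange(seq_length, device=input_ids.device)
        return self.position_embeddings(position_ids).unsqueeze(0)


# Falls back to the CPU when no GPU is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
module = ToyPositionEmbeddings().to(device)
input_ids = torch.tensor([[0, 4, 4, 3]], device=device)
output = module(input_ids)
print(output.shape, output.device)
```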
**11. Add documentation**

Now all the necessary functionality for *brand_new_bert* has been added, and you are almost done! The only thing left to add is a good docstring and a doc page.
The Cookiecutter should have added a template file called `docs/source/model_doc/brand_new_bert.md` that you should fill out.
Users of your model will usually look at this page first before using your model. Hence, the documentation must be understandable and concise.
It is very useful for the community to add some *Tips* that show how the model should be used. Don't hesitate to ping the Hugging Face team regarding the documentation.

Next, make sure that the docstring added to `src/transformers/models/brand_new_bert/modeling_brand_new_bert.py` is correct and includes all necessary inputs and outputs.
There is a detailed guide on writing documentation and on the docstring format [here](writing-documentation).
Always keep in mind that documentation should be treated at least as carefully as the code, since it is usually the community's first point of contact with the model.
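
As a rough illustration of the docstring layout, the sketch below documents the inputs and the return value of a toy function. The function itself is made up, and the layout shown is an approximation; consult the documentation guide linked above for the authoritative format.

```python
def toy_forward(input_ids, attention_mask=None):
    r"""
    Toy function used only to illustrate the docstring layout.

    Args:
        input_ids (`list[int]`):
            Indices of input sequence tokens in the vocabulary.
        attention_mask (`list[int]`, *optional*):
            Mask with `1` for tokens that are attended to and `0` for padding tokens.
            If omitted, every token is attended to.

    Returns:
        `list[int]`: The input ids at the positions that are attended to.
    """
    if attention_mask is None:
        attention_mask = [1] * len(input_ids)
    return [token for token, keep in zip(input_ids, attention_mask) if keep]


print(toy_forward([0, 4, 4, 3], attention_mask=[1, 1, 0, 0]))
```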
**Code refactor**

Great, now you have added all the necessary code for *brand_new_bert*.
At this point, you should correct any potential code style issues by running:
```bash
make style
```
and verify that your coding style passes the quality check:
```bash
make quality
```
There are a couple of other very strict design tests in 🤗 Transformers that might still be failing.
This is often because of missing information in the docstring or incorrect naming. The Hugging Face team will surely help you if you are stuck here.
Lastly, it is always a good idea to refactor your code after having ensured that it works correctly.
With all tests passing, now is a good time to go over the added code again and do some refactoring.
You have now finished the coding part, congratulations! 🎉 You are awesome! 😎
**12. Upload the models to the model hub**

In this final part, you should convert and upload all checkpoints to the model hub and add a model card for each uploaded model checkpoint.
You can get familiar with the hub functionalities by reading the [Model sharing and uploading Page](model_sharing).
You should work alongside the Hugging Face team here to get the access rights needed to upload the model under the organization of the authors of *brand_new_bert*.
The `push_to_hub` method, present in all models in `transformers`, is a quick and efficient way to push your checkpoint to the hub.
A short snippet is shown below:
```python
brand_new_bert.push_to_hub("brand_new_bert")
# Uncomment the following line to push to an organization.
# brand_new_bert.push_to_hub("<organization>/brand_new_bert")
```
It is worth spending some time to create a fitting model card for each checkpoint. The model card should highlight the specific characteristics of that particular checkpoint, for example: on which dataset was the checkpoint pretrained or fine-tuned, and on which downstream task should the model be used? It should also include some code showing how to use the model correctly.
**13. (Optional) Add notebook**

It is very helpful to add a notebook that showcases in detail how *brand_new_bert* can be used for inference and/or fine-tuned on a downstream task. This is not mandatory for merging your PR, but it is very useful for the community.
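**14. Submit your finished PR**

You are done programming now and can move on to the last step, which is getting your PR merged into main. Usually, the Hugging Face team should already be supporting you at this point, but it is worth taking some time to give your finished PR a good description and to add comments to your code if you want to point out certain design choices to your reviewer.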
### Share your work!!
Now it is time to get some credit from the community for your work! Having completed a model addition is a major contribution to Transformers and the whole NLP community. Your code and the ported pretrained models will certainly be used by hundreds and possibly even thousands of developers and researchers. You should be proud of your work and share your achievement with the community.

**You have made another model that is super easy to access for everyone in the community! 🤯**