<!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.

⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->
# Training on TPU with TensorFlow
<Tip>
If you don't need long explanations and just want TPU code samples to get started with training, [check out our TPU example notebook!](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/tpu_training-tf.ipynb)
</Tip>
### What is a TPU?
A TPU is a **Tensor Processing Unit.** They are hardware designed by Google, used to greatly speed up the tensor computations within neural networks, much like GPUs. They can be used for both network training and inference. They are generally accessed through Google's cloud services, but small TPUs can also be accessed directly for free through Google Colab and Kaggle Kernels.

Because [all TensorFlow models in 🤗 Transformers are Keras models](https://huggingface.co/blog/tensorflow-philosophy), most of the methods in this document are generally applicable to TPU training for any Keras model! However, there are a few points that are specific to the HuggingFace ecosystem (hug-o-system?) of Transformers and Datasets, and we'll make sure to flag them up when we get to them.
### What kinds of TPU are available?
New users are often very confused by the wide variety of TPUs available and the different ways to access them. The first key distinction to understand is the difference between **TPU Nodes** and **TPU VMs**.

When you use a **TPU Node**, you are effectively indirectly accessing a remote TPU. You will need a separate VM, which will initialize your network and data pipeline and then forward them to the remote node. When you use a TPU on Google Colab, you are accessing it in the **TPU Node** style.

Using TPU Nodes can have some quite unexpected behaviour for people who aren't used to them! In particular, because the TPU is located on a physically different system to the machine you're running your Python code on, any data pipeline that loads data stored locally on your machine will completely fail. Instead, data must be stored in Google Cloud Storage, where your data pipeline can still access it even when the pipeline is running on the remote TPU node.
<Tip>
If you can fit all your data in memory as `np.ndarray` or `tf.Tensor`, then you can `fit()` on that data even when using Colab or a TPU Node, without needing to upload it to Google Cloud Storage.
</Tip>
<Tip>
**🤗 Specific Hugging Face Tip 🤗:** The methods `Dataset.to_tf_dataset()` and its higher-level wrapper `model.prepare_tf_dataset()`, which you will see throughout our TF code examples, will both fail on a TPU Node. The reason for this is that even though they create a `tf.data.Dataset`, it is not a "pure" `tf.data` pipeline and uses `tf.numpy_function` or `Dataset.from_generator()` to stream data from the underlying HuggingFace `Dataset`. This HuggingFace `Dataset` is backed by data that is on a local disc and which the remote TPU Node will not be able to read.
</Tip>
The second way to access a TPU is via a **TPU VM.** When using a TPU VM, you connect directly to the machine that the TPU is attached to, much like training on a GPU VM. TPU VMs are generally easier to work with, particularly when it comes to your data pipeline. All of the above warnings do not apply to TPU VMs!

This is an opinionated document, so here's our opinion: **Avoid using TPU Node if possible.** It is more confusing and more difficult to debug than a TPU VM, and it is likely to be unsupported in the future - Google's latest TPU, TPUv4, can only be accessed as a TPU VM, which suggests that TPU Nodes are increasingly going to become a "legacy" access method. However, we understand that the only free TPU access is on Colab and Kaggle Kernels, which uses a TPU Node - so we'll try to explain how to handle it if you have to! Check the [TPU example notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/tpu_training-tf.ipynb) for code samples that explain this in more detail.
### What sizes of TPU are available?
A single TPU (a v2-8/v3-8/v4-8) runs 8 replicas. TPUs exist in **pods** that can run hundreds or thousands of replicas simultaneously. When you use more than a single TPU but less than a whole pod (for example, a v3-32), your TPU fleet is referred to as a **pod slice.**

When you access a free TPU via Colab, you generally get a single v2-8 TPU.
### I keep hearing about this XLA thing. What's XLA, and how does it relate to TPUs?
XLA is an optimizing compiler, used by both TensorFlow and JAX. In JAX it is the only compiler, whereas in TensorFlow it is optional (but mandatory on TPU!). The easiest way to enable it when training a Keras model is to pass the argument `jit_compile=True` to `model.compile()`. If you don't get any errors and performance is good, that's a great sign that you're ready to move to TPU!
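As a minimal sketch of that one-argument change (the toy model and random data below are placeholders, not taken from the notebook):

```python
import numpy as np
import tensorflow as tf

# A toy regression model standing in for a real one
model = tf.keras.Sequential(
    [tf.keras.layers.Dense(8, activation="relu"), tf.keras.layers.Dense(1)]
)

# jit_compile=True asks Keras to compile the train/predict functions with XLA
model.compile(optimizer="adam", loss="mse", jit_compile=True)

x = np.random.rand(32, 4).astype("float32")
y = np.random.rand(32, 1).astype("float32")

# Runs XLA-compiled, even on CPU/GPU
history = model.fit(x, y, batch_size=16, epochs=1, verbose=0)
```

If this trains without errors, your model is at least superficially XLA-compatible.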
Debugging on TPU is generally a bit harder than on CPU/GPU, so we recommend getting your code running on CPU/GPU with XLA first before trying it on TPU. You don't have to train for long, of course - just for a few steps to make sure that your model and data pipeline are working like you expect them to.
<Tip>
XLA compiled code is usually faster - so even if you're not planning to run on TPU, adding `jit_compile=True` can improve your performance. Be sure to note the caveats below about XLA compatibility, though!
</Tip>
<Tip warning={true}>
**Tip born of bitter experience:** Although using `jit_compile=True` is a good way to get a speed boost and to test that your CPU/GPU code is XLA-compatible, it can actually cause a lot of problems if you leave it in when actually training on TPU! XLA compilation happens implicitly on TPU, so remember to remove that line before actually running your code on a TPU!
</Tip>
### How do I make my model XLA compatible?
In many cases, your code is probably XLA-compatible already! However, there are a few things that work in normal TensorFlow that don't work in XLA. We've distilled them into the three core rules below:
<Tip>
**🤗 Specific HuggingFace Tip 🤗:** We've put a lot of effort into rewriting our TensorFlow models and loss functions to be XLA-compatible. Our models and loss functions generally obey rules #1 and #2 by default, so you can skip over them if you're using `transformers` models. Don't forget about these rules when writing your own models and loss functions, though!
</Tip>
#### XLA Rule #1: Your code cannot have "data-dependent conditionals"
What that means is that any `if` statement cannot depend on values inside a `tf.Tensor`. For example, this code block can't be compiled with XLA!
```python
if tf.reduce_sum(tensor) > 10:
    tensor = tensor / 2.0
```
This might seem very restrictive at first, but most neural net code doesn't need to do this. You can often get around this restriction by using `tf.cond` (see the documentation), or by removing the conditional and finding a clever math trick with indicator variables instead, like so:
```python
sum_over_10 = tf.cast(tf.reduce_sum(tensor) > 10, tf.float32)
tensor = tensor / (1.0 + sum_over_10)
```
This code has exactly the same effect as the code above, but by avoiding the conditional, we ensure it will compile with XLA without problems!
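A branch like this can also be expressed with `tf.cond`, which XLA can compile because the branch becomes a graph op rather than a Python `if`. A minimal sketch (the function name here is ours, for illustration):

```python
import tensorflow as tf


@tf.function(jit_compile=True)  # force XLA compilation, even on CPU
def halve_if_large(tensor):
    # Both branches are traced as graph ops; the choice happens at runtime
    return tf.cond(tf.reduce_sum(tensor) > 10, lambda: tensor / 2.0, lambda: tensor)


print(halve_if_large(tf.constant([8.0, 8.0])))  # sum is 16, so halved
print(halve_if_large(tf.constant([1.0, 1.0])))  # sum is 2, so unchanged
```

The indicator-variable trick above is usually faster, but `tf.cond` is handy when the two branches do genuinely different work.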
#### XLA Rule #2: Your code cannot have "data-dependent shapes"
This means that the shape of all of the `tf.Tensor` objects in your code cannot depend on their values. For example, the function `tf.unique` cannot be compiled with XLA, because it violates this rule: it returns a `tensor` containing one instance of each unique value in the input `Tensor`. The shape of this output will obviously differ depending on how repetitive the input `Tensor` was, and so XLA refuses to handle it!
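To see why `tf.unique` breaks the rule, compare its output shapes for two inputs of identical shape (a quick eager-mode illustration):

```python
import tensorflow as tf

# Two inputs with the SAME shape, (4,)...
values_a, _ = tf.unique(tf.constant([1, 2, 2, 3]))
values_b, _ = tf.unique(tf.constant([5, 5, 5, 5]))

# ...but the output shapes depend on the values: (3,) vs (1,).
# XLA needs shapes to be fixed at compile time, so it cannot handle this.
print(values_a.shape, values_b.shape)
```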
In general, most neural network code obeys rule #2 by default. However, there are a few common cases where it becomes a problem. One very common one is **label masking**, where you set your labels to a negative value to indicate that those positions should be ignored when computing the loss. If you look at NumPy or PyTorch loss functions that support label masking, you will often see code like this that uses [boolean indexing](https://numpy.org/doc/stable/user/basics.indexing.html#boolean-array-indexing):
```python
label_mask = labels >= 0
masked_outputs = outputs[label_mask]
masked_labels = labels[label_mask]
loss = compute_loss(masked_outputs, masked_labels)
mean_loss = torch.mean(loss)
```
This code is totally fine in NumPy or PyTorch, but it breaks in XLA! Why? Because the shapes of `masked_outputs` and `masked_labels` depend on how many positions are masked - that makes them **data-dependent shapes.** However, just like for rule #1, we can often rewrite this code to yield exactly the same output without any data-dependent shapes:
```python
label_mask = tf.cast(labels >= 0, tf.float32)
loss = compute_loss(outputs, labels)
loss = loss * label_mask # Set negative label positions to 0
mean_loss = tf.reduce_sum(loss) / tf.reduce_sum(label_mask)
```
Here, we avoid data-dependent shapes by computing the loss at every position, and zeroing out the masked positions in both the numerator and denominator when we calculate the mean. This yields exactly the same result as the first approach while maintaining XLA compatibility. Note that we use the same trick as in rule #1 - converting a `tf.bool` to `tf.float32` and using it as an indicator variable. This is a really useful trick, so remember it if you need to convert your own code to XLA!
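You can sanity-check the equivalence of the two approaches on dummy data. In this sketch, `compute_loss` is a hypothetical per-position squared-error loss, just for illustration:

```python
import tensorflow as tf

outputs = tf.constant([1.0, 2.0, 3.0, 4.0])
labels = tf.constant([1.5, -100.0, 2.5, -100.0])  # negative labels are masked


def compute_loss(outputs, labels):
    # Hypothetical per-position loss for this sketch
    return (outputs - labels) ** 2


# Boolean-masking version (fine in eager mode, breaks under XLA)
mask = labels >= 0
naive = tf.reduce_mean(
    compute_loss(tf.boolean_mask(outputs, mask), tf.boolean_mask(labels, mask))
)

# Indicator-variable version (XLA-friendly)
label_mask = tf.cast(labels >= 0, tf.float32)
loss = compute_loss(outputs, labels) * label_mask
xla_friendly = tf.reduce_sum(loss) / tf.reduce_sum(label_mask)

print(float(naive), float(xla_friendly))  # identical results
```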
#### XLA Rule #3: XLA will need to recompile your model for every different input shape it sees
This is the big one. What it means is that if your input shapes are very variable, XLA will have to recompile your model over and over, which can create huge performance problems. This commonly arises in NLP models, where input texts have variable lengths after tokenization. In other modalities, static shapes are more common and this rule is much less of a problem.
How can you get around rule #3? The key is **padding** - if you pad all your inputs to the same length and then use an `attention_mask`, you can get the same results as you would from variable shapes, without any XLA issues. However, excessive padding can cause severe slowdown too - if you pad all your samples to the maximum length in the whole dataset, you can waste a huge amount of compute and memory!
There isn't a perfect solution to this problem, but there are some tricks you can try. One very useful trick is to **pad batches of samples up to a multiple of a number like 32 or 64 tokens.** This often only increases the number of tokens by a small amount, but because every input shape now has to be a multiple of 32 or 64, the number of unique input shapes is hugely reduced. Fewer unique input shapes means fewer XLA recompilations!
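The effect of rounding lengths up to a multiple is easy to see with a little arithmetic. This is a pure-Python sketch (the helper function is ours; real padding would be done by your tokenizer or data collator):

```python
def pad_to_multiple(length, multiple=64):
    # Round a sequence length up to the next multiple of `multiple`
    return ((length + multiple - 1) // multiple) * multiple


raw_lengths = [17, 63, 64, 100, 127, 130]
padded = [pad_to_multiple(n) for n in raw_lengths]

# Six different raw lengths collapse to just three padded shapes
print(padded)  # [64, 64, 64, 128, 128, 192]
```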
<Tip>
**🤗 Specific HuggingFace Tip 🤗:** Our tokenizers and data collators have methods that can help you here. You can use `padding="max_length"` or `padding="longest"` when calling tokenizers to get them to output padded data, and our tokenizers and data collators also have a `pad_to_multiple_of` argument that you can use to reduce the number of unique input shapes you see!
</Tip>
### How do I actually train my model on TPU?
Once your training is XLA-compatible and (if you're using TPU Node / Colab) your dataset has been prepared appropriately, running on TPU is surprisingly easy! All you really need to change in your code is to add a few lines to initialize your TPU, and to ensure that your model and dataset are created inside a `TPUStrategy` scope. Take a look at [our TPU example notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/tpu_training-tf.ipynb) to see this in action!
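A rough sketch of that initialization is below. The `try`/`except` fallback to the default strategy is our addition so the snippet also runs off-TPU; on a real TPU runtime the `try` branch succeeds:

```python
import tensorflow as tf

try:
    # On Colab/Kaggle, TPUClusterResolver() with no arguments finds the TPU Node
    resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
    tf.config.experimental_connect_to_cluster(resolver)
    tf.tpu.experimental.initialize_tpu_system(resolver)
    strategy = tf.distribute.TPUStrategy(resolver)
except (ValueError, tf.errors.NotFoundError):
    # No TPU available - fall back to the default (CPU/GPU) strategy
    strategy = tf.distribute.get_strategy()

with strategy.scope():
    # Model (and optimizer) creation must happen inside the strategy scope
    model = tf.keras.Sequential([tf.keras.layers.Dense(2)])
    model.compile(optimizer="adam", loss="mse")
```

Dataset creation should also happen inside (or be distributed by) the same scope; the example notebook shows the full workflow.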
### Summary
That was a lot of information, so here's a quick checklist you can follow when you want to train your model on TPU:

- Make sure your code follows the three rules of XLA
- Compile your model with `jit_compile=True` on CPU/GPU and confirm that you can train it with XLA
- Either load your dataset into memory or use a TPU-compatible dataset loading approach (see the [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/tpu_training-tf.ipynb))
- Migrate your code either to Colab (with the accelerator set to "TPU") or a TPU VM on Google Cloud
- Add TPU initializer code (see the [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/tpu_training-tf.ipynb))
- Create your `TPUStrategy` and make sure dataset loading and model creation are inside the `strategy.scope()` (see the [notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/tpu_training-tf.ipynb))
- Don't forget to take `jit_compile=True` out again when you move to TPU!
- 🙏🙏🙏🥺🥺🥺
- Call `model.fit()`
- You did it!