Buckets:
| <meta charset="utf-8" /><meta name="hf:doc:metadata" content="{"title":"Efficient Training on CPU","local":"efficient-training-on-cpu","sections":[{"title":"Mixed precision with IPEX","local":"mixed-precision-with-ipex","sections":[{"title":"IPEX installation:","local":"ipex-installation","sections":[],"depth":3},{"title":"Usage in Trainer","local":"usage-in-trainer","sections":[],"depth":3},{"title":"Practice example","local":"practice-example","sections":[],"depth":3}],"depth":2}],"depth":1}"> | |
| <link href="/docs/transformers/pr_33686/en/_app/immutable/assets/0.e3b0c442.css" rel="modulepreload"> | |
| <link rel="modulepreload" href="/docs/transformers/pr_33686/en/_app/immutable/entry/start.353d2327.js"> | |
| <link rel="modulepreload" href="/docs/transformers/pr_33686/en/_app/immutable/chunks/scheduler.25b97de1.js"> | |
| <link rel="modulepreload" href="/docs/transformers/pr_33686/en/_app/immutable/chunks/singletons.7c4973e1.js"> | |
| <link rel="modulepreload" href="/docs/transformers/pr_33686/en/_app/immutable/chunks/index.e188933d.js"> | |
| <link rel="modulepreload" href="/docs/transformers/pr_33686/en/_app/immutable/chunks/paths.0de49938.js"> | |
| <link rel="modulepreload" href="/docs/transformers/pr_33686/en/_app/immutable/entry/app.045023f1.js"> | |
| <link rel="modulepreload" href="/docs/transformers/pr_33686/en/_app/immutable/chunks/index.d9030fc9.js"> | |
| <link rel="modulepreload" href="/docs/transformers/pr_33686/en/_app/immutable/nodes/0.f5f21eaa.js"> | |
| <link rel="modulepreload" href="/docs/transformers/pr_33686/en/_app/immutable/chunks/each.e59479a4.js"> | |
| <link rel="modulepreload" href="/docs/transformers/pr_33686/en/_app/immutable/nodes/372.b9f024a4.js"> | |
| <link rel="modulepreload" href="/docs/transformers/pr_33686/en/_app/immutable/chunks/CodeBlock.e6cd0d95.js"> | |
| <link rel="modulepreload" href="/docs/transformers/pr_33686/en/_app/immutable/chunks/EditOnGithub.91d95064.js"><!-- HEAD_svelte-u9bgzb_START --><meta name="hf:doc:metadata" content="{"title":"Efficient Training on CPU","local":"efficient-training-on-cpu","sections":[{"title":"Mixed precision with IPEX","local":"mixed-precision-with-ipex","sections":[{"title":"IPEX installation:","local":"ipex-installation","sections":[],"depth":3},{"title":"Usage in Trainer","local":"usage-in-trainer","sections":[],"depth":3},{"title":"Practice example","local":"practice-example","sections":[],"depth":3}],"depth":2}],"depth":1}"><!-- HEAD_svelte-u9bgzb_END --> <p></p> <h1 class="relative group"><a id="efficient-training-on-cpu" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#efficient-training-on-cpu"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Efficient Training on CPU</span></h1> <p data-svelte-h="svelte-16zn87f">This guide focuses on training large models efficiently on CPU.</p> <h2 class="relative group"><a id="mixed-precision-with-ipex" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#mixed-precision-with-ipex"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Mixed precision with IPEX</span></h2> <p data-svelte-h="svelte-1t5d7gz">Mixed precision uses single (fp32) and half-precision (bf16/fp16) data types in a model to accelerate training or inference while still preserving much of the single-precision accuracy. Modern CPUs such as 3rd and 4th Gen Intel® Xeon® Scalable processors natively support bf16, so you should get more performance out of the box by enabling mixed precision training with bf16.</p> <p data-svelte-h="svelte-7ixgfk">To further maximize training performance, you can use Intel® Extension for PyTorch (IPEX), which is a library built on PyTorch and adds additional CPU instruction level architecture (ISA) level support such as Intel® Advanced Vector Extensions 512 Vector Neural Network Instructions (Intel® AVX512-VNNI), and Intel® Advanced Matrix Extensions (Intel® AMX) for an extra performance boost on Intel CPUs. However, CPUs with only AVX2 (e.g., AMD or older Intel CPUs) are not guaranteed to have better performance under IPEX.</p> <p data-svelte-h="svelte-1aa4rsn">Auto Mixed Precision (AMP) for CPU backends has been enabled since PyTorch 1.10. AMP support for bf16 on CPUs and bf16 operator optimization is also supported in IPEX and partially upstreamed to the main PyTorch branch. You can get better performance and user experience with IPEX AMP.</p> <p data-svelte-h="svelte-1orckb5">Check more detailed information for <a href="https://intel.github.io/intel-extension-for-pytorch/cpu/latest/tutorials/features/amp.html" rel="nofollow">Auto Mixed Precision</a>.</p> <h3 class="relative group"><a id="ipex-installation" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#ipex-installation"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>IPEX installation:</span></h3> <p data-svelte-h="svelte-yt7jgu">IPEX release is following PyTorch, to install via pip:</p> <table data-svelte-h="svelte-63v1d5"><thead><tr><th align="center">PyTorch Version</th> <th align="center">IPEX version</th></tr></thead> <tbody><tr><td align="center">2.1.x</td> <td align="center">2.1.100+cpu</td></tr> <tr><td align="center">2.0.x</td> <td align="center">2.0.100+cpu</td></tr> <tr><td align="center">1.13</td> <td align="center">1.13.0+cpu</td></tr> <tr><td align="center">1.12</td> <td align="center">1.12.300+cpu</td></tr></tbody></table> <p data-svelte-h="svelte-op7gjt">Please run <code>pip list | grep torch</code> to get your <code>pytorch_version</code>, so you can get the <code>IPEX version_name</code>.</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->pip install intel_extension_for_pytorch==<version_name> -f https://developer.intel.com/ipex-whl-stable-cpu<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-6sdt3v">You can check the latest versions in <a href="https://developer.intel.com/ipex-whl-stable-cpu" rel="nofollow">ipex-whl-stable-cpu</a> if needed.</p> <p data-svelte-h="svelte-6x6sx">Check more approaches for <a href="https://intel.github.io/intel-extension-for-pytorch/cpu/latest/tutorials/installation.html" rel="nofollow">IPEX installation</a>.</p> <h3 class="relative group"><a id="usage-in-trainer" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#usage-in-trainer"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Usage in Trainer</span></h3> <p data-svelte-h="svelte-550t3u">To enable auto mixed precision with IPEX in Trainer, users should add <code>use_ipex</code>, <code>bf16</code> and <code>no_cuda</code> in training command arguments.</p> <p data-svelte-h="svelte-1t7mun5">Take an example of the use cases on <a href="https://github.com/huggingface/transformers/tree/main/examples/pytorch/question-answering" rel="nofollow">Transformers question-answering</a></p> <ul data-svelte-h="svelte-1p4s7xp"><li>Training with IPEX using BF16 auto mixed precision on CPU:<pre> python run_qa.py \ | |
| --model_name_or_path google-bert/bert-base-uncased \ | |
| --dataset_name squad \ | |
| --do_train \ | |
| --do_eval \ | |
| --per_device_train_batch_size 12 \ | |
| --learning_rate 3e-5 \ | |
| --num_train_epochs 2 \ | |
| --max_seq_length 384 \ | |
| --doc_stride 128 \ | |
| --output_dir /tmp/debug_squad/ \ | |
| <b>--use_ipex</b> \ | |
| <b>--bf16</b> \ | |
| <b>--use_cpu</b></pre></li></ul> <p data-svelte-h="svelte-w7vrf2">If you want to enable <code>use_ipex</code> and <code>bf16</code> in your script, add these parameters to <code>TrainingArguments</code> like this:</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->training_args = TrainingArguments( | |
| output_dir=args.output_path, | |
| <span class="hljs-addition">+ bf16=True,</span> | |
| <span class="hljs-addition">+ use_ipex=True,</span> | |
| <span class="hljs-addition">+ use_cpu=True,</span> | |
| **kwargs | |
| )<!-- HTML_TAG_END --></pre></div> <h3 class="relative group"><a id="practice-example" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#practice-example"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Practice example</span></h3> <p data-svelte-h="svelte-w7zs5j">Blog: <a href="https://huggingface.co/blog/intel-sapphire-rapids" rel="nofollow">Accelerating PyTorch Transformers with Intel Sapphire Rapids</a></p> <a class="!text-gray-400 !no-underline text-sm flex items-center not-prose mt-4" href="https://github.com/huggingface/transformers/blob/main/docs/source/en/perf_train_cpu.md" target="_blank"><span data-svelte-h="svelte-1kd6by1"><</span> <span data-svelte-h="svelte-x0xyl0">></span> <span data-svelte-h="svelte-1dajgef"><span class="underline ml-1.5">Update</span> on GitHub</span></a> <p></p> | |
| <script> | |
| { | |
| __sveltekit_gh72n0 = { | |
| assets: "/docs/transformers/pr_33686/en", | |
| base: "/docs/transformers/pr_33686/en", | |
| env: {} | |
| }; | |
| const element = document.currentScript.parentElement; | |
| const data = [null,null]; | |
| Promise.all([ | |
| import("/docs/transformers/pr_33686/en/_app/immutable/entry/start.353d2327.js"), | |
| import("/docs/transformers/pr_33686/en/_app/immutable/entry/app.045023f1.js") | |
| ]).then(([kit, app]) => { | |
| kit.start(app, element, { | |
| node_ids: [0, 372], | |
| data, | |
| form: null, | |
| error: null | |
| }); | |
| }); | |
| } | |
| </script> | |
Xet Storage Details
- Size:
- 16.2 kB
- Xet hash:
- 9ace72c644571e69ec8eca8b913c5d58013162d9c6546c53396d462985f57e4e
·
Xet efficiently stores files, intelligently splitting them into unique chunks and accelerating uploads and downloads. More info.