| <meta charset="utf-8" /><meta name="hf:doc:metadata" content="{"title":"Low Precision Training Methods","local":"low-precision-training-methods","sections":[{"title":"What training on FP8 means","local":"what-training-on-fp8-means","sections":[],"depth":2},{"title":"Configuring the Accelerator","local":"configuring-the-accelerator","sections":[],"depth":2},{"title":"Configuring MS-AMP","local":"configuring-ms-amp","sections":[],"depth":2},{"title":"Configuring TransformersEngine","local":"configuring-transformersengine","sections":[],"depth":2},{"title":"Example Zoo","local":"example-zoo","sections":[],"depth":2},{"title":"Further Reading","local":"further-reading","sections":[],"depth":2}],"depth":1}"> | |
| <link href="/docs/accelerate/main/en/_app/immutable/assets/0.e3b0c442.css" rel="modulepreload"> | |
| <link rel="modulepreload" href="/docs/accelerate/main/en/_app/immutable/entry/start.2ea03080.js"> | |
| <link rel="modulepreload" href="/docs/accelerate/main/en/_app/immutable/chunks/scheduler.defa9a21.js"> | |
| <link rel="modulepreload" href="/docs/accelerate/main/en/_app/immutable/chunks/singletons.aff0b9fc.js"> | |
| <link rel="modulepreload" href="/docs/accelerate/main/en/_app/immutable/chunks/index.beade68d.js"> | |
| <link rel="modulepreload" href="/docs/accelerate/main/en/_app/immutable/chunks/paths.2c85d1a6.js"> | |
| <link rel="modulepreload" href="/docs/accelerate/main/en/_app/immutable/entry/app.e6812672.js"> | |
| <link rel="modulepreload" href="/docs/accelerate/main/en/_app/immutable/chunks/index.fe795e71.js"> | |
| <link rel="modulepreload" href="/docs/accelerate/main/en/_app/immutable/nodes/0.39c84d5d.js"> | |
| <link rel="modulepreload" href="/docs/accelerate/main/en/_app/immutable/chunks/each.e59479a4.js"> | |
| <link rel="modulepreload" href="/docs/accelerate/main/en/_app/immutable/nodes/45.fc6af5c9.js"> | |
| <link rel="modulepreload" href="/docs/accelerate/main/en/_app/immutable/chunks/CodeBlock.42404125.js"> | |
| <link rel="modulepreload" href="/docs/accelerate/main/en/_app/immutable/chunks/EditOnGithub.0f575778.js"><!-- HEAD_svelte-u9bgzb_START --><meta name="hf:doc:metadata" content="{"title":"Low Precision Training Methods","local":"low-precision-training-methods","sections":[{"title":"What training on FP8 means","local":"what-training-on-fp8-means","sections":[],"depth":2},{"title":"Configuring the Accelerator","local":"configuring-the-accelerator","sections":[],"depth":2},{"title":"Configuring MS-AMP","local":"configuring-ms-amp","sections":[],"depth":2},{"title":"Configuring TransformersEngine","local":"configuring-transformersengine","sections":[],"depth":2},{"title":"Example Zoo","local":"example-zoo","sections":[],"depth":2},{"title":"Further Reading","local":"further-reading","sections":[],"depth":2}],"depth":1}"><!-- HEAD_svelte-u9bgzb_END --> <p></p> <h1 class="relative group"><a id="low-precision-training-methods" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#low-precision-training-methods"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Low Precision Training Methods</span></h1> <p data-svelte-h="svelte-1h35032">🤗 
Accelerate provides integrations for training in lower precision on supported hardware through the <code>TransformersEngine</code> and <code>MS-AMP</code> packages. This guide covers which hardware is supported, how to configure your <a href="/docs/accelerate/main/en/package_reference/accelerator#accelerate.Accelerator">Accelerator</a> to use the low precision methods, and what you can expect when training.</p> <h2 class="relative group"><a id="what-training-on-fp8-means" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#what-training-on-fp8-means"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>What training on FP8 means</span></h2> <p data-svelte-h="svelte-of2aev">For the nitty-gritty of training in FP8 with PyTorch and 🤗 Accelerate, and why it can be difficult, check out the <a href="../concept_guides/low_precision_training">concept guide</a>. In essence, rather than training in BF16, some (or all) aspects of training a model can be performed using 8 bits instead of 16. 
The challenge is doing so without degrading final performance.</p> <p data-svelte-h="svelte-10cwb11">This is only enabled on specific NVIDIA hardware, namely:</p> <ul data-svelte-h="svelte-5d1df8"><li>Consumer graphics cards newer than the 3000 series (such as the 4090)</li> <li>Hopper-based GPU architectures (such as the <code>H100</code> and <code>H200</code>)</li></ul> <p data-svelte-h="svelte-1j45xey">The result is a reduction in memory use (as the memory needed is cut in half for some parts of training), and larger models that can replace certain layers with FP8-enabled ones <em>should</em> see an increase in throughput as well.</p> <h2 class="relative group"><a id="configuring-the-accelerator" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#configuring-the-accelerator"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Configuring the Accelerator</span></h2> <p data-svelte-h="svelte-ewzxpg">Currently two different backends for FP8 are supported (<code>TransformersEngine</code> and <code>MS-AMP</code>), each with different capabilities and configurations.</p> <p 
data-svelte-h="svelte-md67pg">Both backends share the same core API. Pass <code>mixed_precision="fp8"</code> to the <a href="/docs/accelerate/main/en/package_reference/accelerator#accelerate.Accelerator">Accelerator</a>, select it during <code>accelerate config</code> when prompted about mixed precision, or set it in the <code>mixed_precision</code> key of your <code>config.yaml</code> file:</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->from accelerate import Accelerator | |
| <span class="hljs-attribute">accelerator</span> <span class="hljs-operator">=</span> Accelerator(mixed_precision<span class="hljs-operator">=</span><span class="hljs-string">"fp8"</span>)<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-wk8to7">By default, if <code>MS-AMP</code> is available in your environment, 🤗 Accelerate will automatically use it as the backend. To specify the backend yourself (and customize other parts of the FP8 mixed precision setup), you can use <a href="/docs/accelerate/main/en/package_reference/fp8#accelerate.utils.FP8RecipeKwargs">utils.FP8RecipeKwargs</a>, or specify it in your config <code>yaml</code> or during <code>accelerate launch</code>:</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">from</span> accelerate import Accelerator | |
| <span class="hljs-keyword">from</span> accelerate.utils import FP8RecipeKwargs | |
| kwargs = [FP8RecipeKwargs(<span class="hljs-attribute">backend</span>=<span class="hljs-string">"msamp"</span>)] | |
| <span class="hljs-comment"># Or to specify the backend as `TransformersEngine` even if MS-AMP is installed</span> | |
| <span class="hljs-comment"># kwargs = [FP8RecipeKwargs(backend="te")]</span> | |
| accelerator = Accelerator(<span class="hljs-attribute">mixed_precision</span>=<span class="hljs-string">"fp8"</span>, <span class="hljs-attribute">kwarg_handlers</span>=kwargs)<!-- HTML_TAG_END --></pre></div> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-attr">mixed_precision:</span> <span class="hljs-string">fp8</span> | |
| <span class="hljs-attr">fp8_config:</span> | |
|   <span class="hljs-attr">amax_compute_algorithm:</span> <span class="hljs-string">max</span> | |
|   <span class="hljs-attr">amax_history_length:</span> <span class="hljs-number">1024</span> | |
|   <span class="hljs-attr">backend:</span> <span class="hljs-string">TE</span> | |
|   <span class="hljs-attr">fp8_format:</span> <span class="hljs-string">HYBRID</span> | |
|   <span class="hljs-attr">interval:</span> <span class="hljs-number">1</span> | |
|   <span class="hljs-attr">margin:</span> <span class="hljs-number">0</span> | |
|   <span class="hljs-attr">override_linear_precision:</span> <span class="hljs-literal">false</span> | |
|   <span class="hljs-attr">use_autocast_during_eval:</span> <span class="hljs-literal">false</span><!-- HTML_TAG_END --></pre></div> <h2 class="relative group"><a id="configuring-ms-amp" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#configuring-ms-amp"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Configuring MS-AMP</span></h2> <p data-svelte-h="svelte-13jlsqr">Of the two, <code>MS-AMP</code> is traditionally the easier one to configure as there is only a single argument: the optimization level.</p> <p data-svelte-h="svelte-11lh24s">Currently two levels of optimization are supported in the 🤗 Accelerate integration, <code>"O1"</code> and <code>"O2"</code> (using the letter ‘o’, not zero).</p> <ul data-svelte-h="svelte-ha185h"><li><code>"O1"</code> will cast the weight gradients and <code>all_reduce</code> communications to 8-bit, while the rest is done in 16-bit. This reduces overall GPU memory usage and the communication bandwidth required.</li> <li><code>"O2"</code> will also cast first-order optimizer states into 8-bit, while the second-order states are kept in FP16. 
(Currently only the <code>Adam</code> optimizer is supported.) This level aims to minimize final accuracy degradation while saving the most memory.</li></ul> <p data-svelte-h="svelte-wx6vs8">To specify an optimization level, pass it to <code>FP8RecipeKwargs</code> by setting the <code>opt_level</code> argument:</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">from</span> accelerate import Accelerator | |
| <span class="hljs-keyword">from</span> accelerate.utils import FP8RecipeKwargs | |
| kwargs = [FP8RecipeKwargs(<span class="hljs-attribute">backend</span>=<span class="hljs-string">"msamp"</span>, <span class="hljs-attribute">opt_level</span>=<span class="hljs-string">"O2"</span>)] | |
| accelerator = Accelerator(<span class="hljs-attribute">mixed_precision</span>=<span class="hljs-string">"fp8"</span>, <span class="hljs-attribute">kwarg_handlers</span>=kwargs)<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-vo4vij">Or during <code>accelerate launch</code> via <code>--fp8_backend=msamp --fp8_opt_level=O2</code></p> <p data-svelte-h="svelte-1n0fir7">Similarly this can be set in your <code>config.yaml</code>:</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-symbol">mixed_precision:</span> fp8 | |
| <span class="hljs-symbol">fp8_config:</span> | |
| <span class="hljs-symbol"> backend:</span> MSAMP | |
| <span class="hljs-symbol"> opt_level:</span> O2<!-- HTML_TAG_END --></pre></div> <h2 class="relative group"><a id="configuring-transformersengine" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#configuring-transformersengine"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Configuring TransformersEngine</span></h2> <p data-svelte-h="svelte-ngle0n">TransformersEngine has much more available for customizing how and what FP8 calculations are performed. 
A full list of supported arguments and what they mean is available in <a href="https://docs.nvidia.com/deeplearning/transformer-engine/user-guide/api/common.html" rel="nofollow">NVIDIA’s documentation</a>; they are also restated in the <code>FP8RecipeKwargs</code> docstring for your convenience.</p> <p data-svelte-h="svelte-18df25g">🤗 Accelerate tries to set sensible defaults, but exploring and tweaking the various parameters yourself can potentially lead to better performance.</p> <p data-svelte-h="svelte-8khoko">To use it, specify <code>backend="te"</code> and modify any of the arguments you want as part of your kwarg handler:</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">from</span> accelerate import Accelerator | |
| <span class="hljs-keyword">from</span> accelerate.utils import FP8RecipeKwargs | |
| kwargs = [FP8RecipeKwargs(<span class="hljs-attribute">backend</span>=<span class="hljs-string">"te"</span>, <span class="hljs-built_in">..</span>.)] | |
| accelerator = Accelerator(<span class="hljs-attribute">mixed_precision</span>=<span class="hljs-string">"fp8"</span>, <span class="hljs-attribute">kwarg_handlers</span>=kwargs)<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-99qxte">Or during <code>accelerate launch</code> via <code>--fp8_backend=te ...</code>. Use <code>accelerate launch --fp8_backend=te -h</code> to see relevant arguments.</p> <p data-svelte-h="svelte-1n0fir7">Similarly this can be set in your <code>config.yaml</code>:</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-attr">mixed_precision:</span> <span class="hljs-string">fp8</span> | |
| <span class="hljs-attr">fp8_config:</span> | |
|   <span class="hljs-attr">amax_compute_algorithm:</span> <span class="hljs-string">max</span> | |
|   <span class="hljs-attr">amax_history_length:</span> <span class="hljs-number">1024</span> | |
|   <span class="hljs-attr">backend:</span> <span class="hljs-string">TE</span> | |
|   <span class="hljs-attr">fp8_format:</span> <span class="hljs-string">HYBRID</span> | |
|   <span class="hljs-attr">interval:</span> <span class="hljs-number">1</span> | |
|   <span class="hljs-attr">margin:</span> <span class="hljs-number">0</span> | |
|   <span class="hljs-attr">override_linear_precision:</span> <span class="hljs-literal">false</span> | |
|   <span class="hljs-attr">use_autocast_during_eval:</span> <span class="hljs-literal">false</span><!-- HTML_TAG_END --></pre></div> <h2 class="relative group"><a id="example-zoo" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#example-zoo"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Example Zoo</span></h2> <p data-svelte-h="svelte-1ay0trc">We have examples showcasing training with FP8, both with 🤗 Accelerate and with its underlying implementation, available in the Accelerate repo. | |
| Currently we support scripts showcasing:</p> <ul data-svelte-h="svelte-1affbo7"><li>Single GPU</li> <li>Distributed Data Parallelism (Multi-GPU)</li> <li>Fully Sharded Data Parallelism</li> <li>DeepSpeed ZeRO 1 through 3</li></ul> <p data-svelte-h="svelte-sau342">Find out more <a href="https://github.com/huggingface/accelerate/tree/main/benchmarks/fp8" rel="nofollow">here</a>.</p> <h2 class="relative group"><a id="further-reading" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#further-reading"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Further Reading</span></h2> <p data-svelte-h="svelte-t5s4ol">To learn more about training in FP8 please check out the following resources:</p> <ul data-svelte-h="svelte-1ipwzpl"><li><a href="../concept_guides/low_precision_training">Our concept guide</a>, which goes into more detail about both TransformersEngine and MS-AMP</li> <li><a href="https://docs.nvidia.com/deeplearning/transformer-engine/user-guide/api/common.html" rel="nofollow">The <code>transformers-engine</code> documentation</a></li> <li><a href="https://azure.github.io/MS-AMP/docs/" rel="nofollow">The <code>MS-AMP</code> 
documentation</a></li></ul> <p></p> | |
| <script> | |
| { | |
| __sveltekit_1fyccrg = { | |
| assets: "/docs/accelerate/main/en", | |
| base: "/docs/accelerate/main/en", | |
| env: {} | |
| }; | |
| const element = document.currentScript.parentElement; | |
| const data = [null,null]; | |
| Promise.all([ | |
| import("/docs/accelerate/main/en/_app/immutable/entry/start.2ea03080.js"), | |
| import("/docs/accelerate/main/en/_app/immutable/entry/app.e6812672.js") | |
| ]).then(([kit, app]) => { | |
| kit.start(app, element, { | |
| node_ids: [0, 45], | |
| data, | |
| form: null, | |
| error: null | |
| }); | |
| }); | |
| } | |
| </script> | |