| <meta charset="utf-8" /><meta name="hf:doc:metadata" content="{"title":"bitsandbytes","local":"bitsandbytes","sections":[{"title":"TL;DR","local":"tldr","sections":[],"depth":2},{"title":"Features","local":"features","sections":[],"depth":2},{"title":"Requirements & Installation","local":"requirements--installation","sections":[],"depth":2},{"title":"Using bitsandbytes","local":"using-bitsandbytes","sections":[{"title":"Using Int8 Matrix Multiplication","local":"using-int8-matrix-multiplication","sections":[],"depth":3},{"title":"Using the 8-bit Optimizers","local":"using-the-8-bit-optimizers","sections":[],"depth":3},{"title":"Change Bits and other Hyperparameters for Individual Parameters","local":"change-bits-and-other-hyperparameters-for-individual-parameters","sections":[],"depth":3},{"title":"Fairseq Users","local":"fairseq-users","sections":[],"depth":3}],"depth":2},{"title":"Release and Feature History","local":"release-and-feature-history","sections":[],"depth":2},{"title":"Errors","local":"errors","sections":[],"depth":2},{"title":"Compile from source","local":"compile-from-source","sections":[],"depth":2},{"title":"License","local":"license","sections":[],"depth":2},{"title":"How to cite us","local":"how-to-cite-us","sections":[],"depth":2}],"depth":1}"> | |
| <link rel="modulepreload" href="/docs/bitsandbytes/v0.42.0/en/_app/immutable/chunks/Heading.78e3d528.js"><!-- HEAD_svelte-u9bgzb_START --><meta name="hf:doc:metadata" content="{"title":"bitsandbytes","local":"bitsandbytes","sections":[{"title":"TL;DR","local":"tldr","sections":[],"depth":2},{"title":"Features","local":"features","sections":[],"depth":2},{"title":"Requirements & Installation","local":"requirements--installation","sections":[],"depth":2},{"title":"Using bitsandbytes","local":"using-bitsandbytes","sections":[{"title":"Using Int8 Matrix Multiplication","local":"using-int8-matrix-multiplication","sections":[],"depth":3},{"title":"Using the 8-bit Optimizers","local":"using-the-8-bit-optimizers","sections":[],"depth":3},{"title":"Change Bits and other Hyperparameters for Individual Parameters","local":"change-bits-and-other-hyperparameters-for-individual-parameters","sections":[],"depth":3},{"title":"Fairseq Users","local":"fairseq-users","sections":[],"depth":3}],"depth":2},{"title":"Release and Feature History","local":"release-and-feature-history","sections":[],"depth":2},{"title":"Errors","local":"errors","sections":[],"depth":2},{"title":"Compile from source","local":"compile-from-source","sections":[],"depth":2},{"title":"License","local":"license","sections":[],"depth":2},{"title":"How to cite us","local":"how-to-cite-us","sections":[],"depth":2}],"depth":1}"><!-- HEAD_svelte-u9bgzb_END --> <p></p> <h1 class="relative group"><a id="bitsandbytes" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#bitsandbytes"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 
1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>bitsandbytes</span></h1> <p data-svelte-h="svelte-1524s61">The bitsandbytes is a lightweight wrapper around CUDA custom functions, in particular 8-bit optimizers, matrix multiplication (LLM.int8()), and quantization functions.</p> <p data-svelte-h="svelte-1plfg5n">Resources:</p> <ul data-svelte-h="svelte-1c7h3as"><li><p><a href="https://arxiv.org/abs/2110.02861" rel="nofollow">8-bit Optimizer Paper</a> — <a href="https://www.youtube.com/watch?v=IxrlHAJtqKE" rel="nofollow">Video</a> — <a href="https://bitsandbytes.readthedocs.io/en/latest/" rel="nofollow">Docs</a></p></li> <li><p><a href="https://arxiv.org/abs/2208.07339" rel="nofollow">LLM.int8() Paper</a> — <a href="https://huggingface.co/blog/hf-bitsandbytes-integration" rel="nofollow">LLM.int8() Software Blog Post</a> — <a href="https://timdettmers.com/2022/08/17/llm-int8-and-emergent-features/" rel="nofollow">LLM.int8() Emergent Features Blog Post</a></p></li></ul> <h2 class="relative group"><a id="tldr" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#tldr"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 
84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>TL;DR</span></h2> <p data-svelte-h="svelte-l1632j"><strong>Requirements</strong> | |
| Python >=3.8. Linux distribution (e.g. Ubuntu) + CUDA > 10.0.</p> <p data-svelte-h="svelte-1p647sl">(CUDA 10.0 is deprecated; only CUDA >= 11.0 will be supported as of release 0.39.0.)</p> <p data-svelte-h="svelte-np8g0f"><strong>Installation</strong>:</p> <p data-svelte-h="svelte-e58kdn"><code>pip install bitsandbytes</code></p> <p data-svelte-h="svelte-1g755db">In some cases you may need to compile from source. If so, please consider submitting a bug report that includes the output of <code>python -m bitsandbytes</code>. The short instructions below may work out of the box if <code>nvcc</code> is installed. If they do not, see the Compile from source section further below.</p> <p data-svelte-h="svelte-d8urui">Compilation quickstart:</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->git <span 
class="hljs-built_in">clone</span> https://github.com/timdettmers/bitsandbytes.git | |
| <span class="hljs-built_in">cd</span> bitsandbytes | |
| <span class="hljs-comment"># CUDA_VERSIONS in {110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120}</span> | |
| <span class="hljs-comment"># make argument in {cuda110, cuda11x, cuda12x}</span> | |
| <span class="hljs-comment"># if you do not know what CUDA you have, try looking at the output of: python -m bitsandbytes</span> | |
| CUDA_VERSION=117 make cuda11x | |
| python setup.py install<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-14fm4sn"><strong>Using Int8 inference with HuggingFace Transformers</strong></p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">from</span> transformers <span class="hljs-keyword">import</span> AutoModelForCausalLM | |
| <span class="hljs-keyword">import</span> torch | |
| model = AutoModelForCausalLM.from_pretrained( | |
| <span class="hljs-string">'decapoda-research/llama-7b-hf'</span>, | |
| device_map=<span class="hljs-string">'auto'</span>, | |
| load_in_8bit=<span class="hljs-literal">True</span>, | |
| max_memory={i: <span class="hljs-string">f'<span class="hljs-subst">{<span class="hljs-built_in">int</span>(torch.cuda.mem_get_info()[<span class="hljs-number">0</span>]/<span class="hljs-number">1024</span>**<span class="hljs-number">3</span>)-<span class="hljs-number">2</span>}</span>GB'</span> <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> <span class="hljs-built_in">range</span>(torch.cuda.device_count())})<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-l4ydsj">A more detailed example can be found in <a href="examples/int8_inference_huggingface.py">examples/int8_inference_huggingface.py</a>.</p> <p data-svelte-h="svelte-225eje"><strong>Using 8-bit optimizer</strong>:</p> <ol data-svelte-h="svelte-byx1wb"><li>Comment out optimizer: <code>#torch.optim.Adam(....)</code></li> <li>Add the 8-bit optimizer of your choice: <code>bnb.optim.Adam8bit(....)</code> (arguments stay the same)</li> <li>Replace embedding layer if necessary: <code>torch.nn.Embedding(..) -> bnb.nn.Embedding(..)</code></li></ol> <p data-svelte-h="svelte-yfx4r6"><strong>Using 8-bit Inference</strong>:</p> <ol data-svelte-h="svelte-qovb4e"><li>Comment out torch.nn.Linear: <code>#linear = torch.nn.Linear(...)</code></li> <li>Add the bnb 8-bit linear light module: <code>linear = bnb.nn.Linear8bitLt(...)</code> (base arguments stay the same)</li> <li>There are two modes:<ul><li>Mixed 8-bit training with 16-bit main weights. Pass the argument <code>has_fp16_weights=True</code> (default)</li> <li>Int8 inference. Pass the argument <code>has_fp16_weights=False</code></li></ul></li> <li>To use the full LLM.int8() method, use the <code>threshold=k</code> argument. 
We recommend <code>k=6.0</code>.</li></ol> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-comment"># LLM.int8()</span> | |
| linear = bnb.nn.Linear8bitLt(dim1, dim2, bias=<span class="hljs-literal">True</span>, has_fp16_weights=<span class="hljs-literal">False</span>, threshold=<span class="hljs-number">6.0</span>) | |
| <span class="hljs-comment"># inputs need to be fp16</span> | |
| out = linear(x.to(torch.float16))<!-- HTML_TAG_END --></pre></div> <h2 class="relative group"><a id="features" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#features"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Features</span></h2> <ul data-svelte-h="svelte-2fzoyi"><li>8-bit Matrix multiplication with mixed precision decomposition</li> <li>LLM.int8() inference</li> <li>8-bit Optimizers: Adam, AdamW, RMSProp, LARS, LAMB, Lion (saves 75% memory)</li> <li>Stable Embedding Layer: Improved stability through better initialization, and normalization</li> <li>8-bit quantization: Quantile, Linear, and Dynamic quantization</li> <li>Fast quantile estimation: Up to 100x faster than other algorithms</li></ul> <h2 class="relative group"><a id="requirements--installation" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#requirements--installation"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" 
preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Requirements & Installation</span></h2> <p data-svelte-h="svelte-1k7sgnb">Requirements: anaconda, cudatoolkit, pytorch</p> <p data-svelte-h="svelte-i4ijdu">Hardware requirements:</p> <ul data-svelte-h="svelte-mfep4l"><li>LLM.int8(): NVIDIA Turing (RTX 20xx; T4) or Ampere GPU (RTX 30xx; A4-A100); (a GPU from 2018 or newer).</li> <li>8-bit optimizers and quantization: NVIDIA Kepler GPU or newer (>=GTX 78X).</li></ul> <p data-svelte-h="svelte-yo9g3h">Supported CUDA versions: 10.2 - 12.0</p> <p data-svelte-h="svelte-he1w22">The bitsandbytes library is currently only supported on Linux distributions. Windows is not supported at the moment.</p> <p data-svelte-h="svelte-zum5gu">The requirements can best be fulfilled by installing pytorch via anaconda. 
You can install PyTorch by following the <a href="https://pytorch.org/get-started/locally/" rel="nofollow">“Get Started”</a> instructions on the official website.</p> <p data-svelte-h="svelte-16ib0br">To install run:</p> <p data-svelte-h="svelte-e58kdn"><code>pip install bitsandbytes</code></p> <h2 class="relative group"><a id="using-bitsandbytes" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#using-bitsandbytes"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Using bitsandbytes</span></h2> <h3 class="relative group"><a id="using-int8-matrix-multiplication" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#using-int8-matrix-multiplication"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 
84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Using Int8 Matrix Multiplication</span></h3> <p data-svelte-h="svelte-rlddd">For straight Int8 matrix multiplication with mixed precision decomposition you can use <code>bnb.matmul(...)</code>. To enable mixed precision decomposition, use the threshold parameter:</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->bnb.matmul(..., threshold=<span class="hljs-number">6.0</span>)<!-- HTML_TAG_END --></pre></div> <p 
data-svelte-h="svelte-ul3e7v">For instructions on how to use LLM.int8() inference layers in your own code, see the TL;DR above; for extended instructions, see <a href="https://huggingface.co/blog/hf-bitsandbytes-integration" rel="nofollow">this blog post</a>.</p> <h3 class="relative group"><a id="using-the-8-bit-optimizers" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#using-the-8-bit-optimizers"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Using the 8-bit Optimizers</span></h3> <p data-svelte-h="svelte-14y3jdv">With bitsandbytes, 8-bit optimizers can be used by changing a single line of code in your codebase. For NLP models, we also recommend using the StableEmbedding layer (see below), which improves results and helps with stable 8-bit optimization. 
To get started with 8-bit optimizers, it is sufficient to replace your old optimizer with the 8-bit optimizer in the following way:</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-keyword">import</span> bitsandbytes <span class="hljs-keyword">as</span> bnb | |
| <span class="hljs-comment"># adam = torch.optim.Adam(model.parameters(), lr=0.001, betas=(0.9, 0.995)) # comment out old optimizer</span> | |
| adam = bnb.optim.Adam8bit(model.parameters(), lr=<span class="hljs-number">0.001</span>, betas=(<span class="hljs-number">0.9</span>, <span class="hljs-number">0.995</span>)) <span class="hljs-comment"># add bnb optimizer</span> | |
| adam = bnb.optim.Adam(model.parameters(), lr=<span class="hljs-number">0.001</span>, betas=(<span class="hljs-number">0.9</span>, <span class="hljs-number">0.995</span>), optim_bits=<span class="hljs-number">8</span>) <span class="hljs-comment"># equivalent</span> | |
| <span class="hljs-comment"># replace torch.nn.Embedding(...) with bnb.nn.StableEmbedding(...); recommended for NLP models</span><!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1ime86q">Note that by default all parameter tensors with fewer than 4096 elements are kept at 32-bit even if you initialize those parameters with 8-bit optimizers. This is done because such small tensors do not save much memory and often contain highly variable parameters (biases) or parameters that require high precision (batch norm, layer norm). You can change this behavior like so:</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START --><span class="hljs-comment"># parameter tensors with fewer than 16384 values are optimized in 32-bit</span> | |
| <span class="hljs-comment"># it is recommended to use multiples of 4096</span> | |
| adam = bnb.optim.Adam8bit(model.parameters(), min_8bit_size=<span class="hljs-number">16384</span>)<!-- HTML_TAG_END --></pre></div> <h3 class="relative group"><a id="change-bits-and-other-hyperparameters-for-individual-parameters" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#change-bits-and-other-hyperparameters-for-individual-parameters"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Change Bits and other Hyperparameters for Individual Parameters</span></h3> <p data-svelte-h="svelte-1xc54m3">If you want to optimize some unstable parameters with 32-bit Adam and others with 8-bit Adam, you can use the <code>GlobalOptimManager</code>. With this, you can also configure specific hyperparameters for particular layers, such as embedding layers. To do that, you need to do two things: (1) register the parameters while they are still on the CPU, and (2) override the config with the new desired hyperparameters (anytime, anywhere). 
See our <a href="howto_config_override.md">guide</a> for more details.</p> <h3 class="relative group"><a id="fairseq-users" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#fairseq-users"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Fairseq Users</span></h3> <p data-svelte-h="svelte-11h35ws">To use the Stable Embedding Layer, override the respective <code>build_embedding(...)</code> function of your model. Make sure to also use the <code>--no-scale-embedding</code> flag to disable scaling of the word embedding layer (now replaced with layer norm). 
You can use the optimizers by replacing the optimizer in the respective file (<code>adam.py</code> etc.).</p> <h2 class="relative group"><a id="release-and-feature-history" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#release-and-feature-history"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Release and Feature History</span></h2> <p data-svelte-h="svelte-1ezgurm">For upcoming features and changes and full history see <a href="CHANGELOG.md">Patch Notes</a>.</p> <h2 class="relative group"><a id="errors" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#errors"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 
0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Errors</span></h2> <ol data-svelte-h="svelte-lqs4qk"><li><code>RuntimeError: CUDA error: no kernel image is available for execution on the device.</code> <a href="errors_and_solutions.md#No-kernel-image-available">Solution</a></li> <li><code>__fatbinwrap_..</code> <a href="errors_and_solutions.md#fatbinwrap_">Solution</a></li></ol> <h2 class="relative group"><a id="compile-from-source" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#compile-from-source"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>Compile from source</span></h2> <p data-svelte-h="svelte-9ia3cj">To compile from source, you need an installation of CUDA. 
If <code>nvcc</code> is not installed, you can install the CUDA Toolkit, which includes <code>nvcc</code>, with the following commands.</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->wget https://raw.githubusercontent.com/TimDettmers/bitsandbytes/main/install_cuda.sh
<span class="hljs-comment"># Syntax cuda_install CUDA_VERSION INSTALL_PREFIX EXPORT_TO_BASH</span>
<span class="hljs-comment">#   CUDA_VERSION in {110, 111, 112, 113, 114, 115, 116, 117, 118, 120, 121, 122}</span>
<span class="hljs-comment">#   EXPORT_TO_BASH in {0, 1} with 0=False and 1=True</span>
<span class="hljs-comment"># For example, the following installs CUDA 11.7 to ~/local/cuda-11.7 and exports the path to your .bashrc</span>
bash install_cuda.sh 117 ~/local 1<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-1r4mw8p">To use a specific CUDA version for a single compile run only, set the variable <code>CUDA_HOME</code>. For example, the following command compiles <code>libbitsandbytes_cuda117.so</code> using the compiler flags for cuda11x with the CUDA version at <code>~/local/cuda-11.7</code>:</p> <p data-svelte-h="svelte-rmvlv2"><code>CUDA_HOME=~/local/cuda-11.7 CUDA_VERSION=117 make cuda11x</code></p> <p data-svelte-h="svelte-lecznc">For more detailed instructions, see <a href="compile_from_source.md">compile_from_source.md</a>.</p> <h2 class="relative group"><a id="license" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#license"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>License</span></h2> <p data-svelte-h="svelte-luhn2w">The majority of bitsandbytes is licensed under MIT; however, portions of the project are available under separate license terms: PyTorch is licensed under the BSD license.</p> <p data-svelte-h="svelte-18bjayv">We thank Fabio Cannizzo for his work on <a 
href="https://github.com/fabiocannizzo/FastBinarySearch" rel="nofollow">FastBinarySearch</a>, which we use for CPU quantization.</p> <h2 class="relative group"><a id="how-to-cite-us" class="header-link block pr-1.5 text-lg no-hover:hidden with-hover:absolute with-hover:p-1.5 with-hover:opacity-0 with-hover:group-hover:opacity-100 with-hover:right-full" href="#how-to-cite-us"><span><svg class="" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 256"><path d="M167.594 88.393a8.001 8.001 0 0 1 0 11.314l-67.882 67.882a8 8 0 1 1-11.314-11.315l67.882-67.881a8.003 8.003 0 0 1 11.314 0zm-28.287 84.86l-28.284 28.284a40 40 0 0 1-56.567-56.567l28.284-28.284a8 8 0 0 0-11.315-11.315l-28.284 28.284a56 56 0 0 0 79.196 79.197l28.285-28.285a8 8 0 1 0-11.315-11.314zM212.852 43.14a56.002 56.002 0 0 0-79.196 0l-28.284 28.284a8 8 0 1 0 11.314 11.314l28.284-28.284a40 40 0 0 1 56.568 56.567l-28.285 28.285a8 8 0 0 0 11.315 11.314l28.284-28.284a56.065 56.065 0 0 0 0-79.196z" fill="currentColor"></path></svg></span></a> <span>How to cite us</span></h2> <p data-svelte-h="svelte-18urb29">If you found this library and LLM.int8() useful, please consider citing our work:</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" 
transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->@article{dettmers2022llmint8,
  title={<span class="hljs-keyword">LLM.int8(): </span><span class="hljs-number">8</span>-<span class="hljs-keyword">bit </span>Matrix <span class="hljs-keyword">Multiplication </span>for Transformers <span class="hljs-built_in">at</span> <span class="hljs-keyword">Scale},
</span>  author={Dettmers, Tim <span class="hljs-keyword">and </span>Lewis, Mike <span class="hljs-keyword">and </span><span class="hljs-keyword">Belkada, </span>Younes <span class="hljs-keyword">and </span>Zettlemoyer, Luke},
  <span class="hljs-keyword">journal={arXiv </span>preprint arXiv:<span class="hljs-number">2208</span>.<span class="hljs-number">07339</span>},
  year={<span class="hljs-number">2022</span>}
}<!-- HTML_TAG_END --></pre></div> <p data-svelte-h="svelte-kd50n4">For 8-bit optimizers or quantization routines, please consider citing the following work:</p> <div class="code-block relative"><div class="absolute top-2.5 right-4"><button class="inline-flex items-center relative text-sm focus:text-green-500 cursor-pointer focus:outline-none transition duration-200 ease-in-out opacity-0 mx-0.5 text-gray-600 " title="code excerpt" type="button"><svg class="" xmlns="http://www.w3.org/2000/svg" aria-hidden="true" fill="currentColor" focusable="false" role="img" width="1em" height="1em" preserveAspectRatio="xMidYMid meet" viewBox="0 0 32 32"><path d="M28,10V28H10V10H28m0-2H10a2,2,0,0,0-2,2V28a2,2,0,0,0,2,2H28a2,2,0,0,0,2-2V10a2,2,0,0,0-2-2Z" transform="translate(0)"></path><path d="M4,18H2V4A2,2,0,0,1,4,2H18V4H4Z" transform="translate(0)"></path><rect fill="none" width="32" height="32"></rect></svg> <div class="absolute pointer-events-none transition-opacity bg-black text-white py-1 px-2 leading-tight rounded font-normal shadow left-1/2 top-full transform -translate-x-1/2 translate-y-2 opacity-0"><div class="absolute bottom-full left-1/2 transform -translate-x-1/2 w-0 h-0 border-black border-4 border-t-0" style="border-left-color: transparent; border-right-color: transparent; "></div> Copied</div></button></div> <pre class=""><!-- HTML_TAG_START -->@article{dettmers2022optimizers,
  title={<span class="hljs-number">8</span>-<span class="hljs-keyword">bit </span>Optimizers via <span class="hljs-keyword">Block-wise </span>Quantization},
  author={Dettmers, Tim <span class="hljs-keyword">and </span>Lewis, Mike <span class="hljs-keyword">and </span><span class="hljs-keyword">Shleifer, </span>Sam <span class="hljs-keyword">and </span>Zettlemoyer, Luke},
  <span class="hljs-keyword">journal={9th </span>International Conference on Learning Representations, ICLR},
  year={<span class="hljs-number">2022</span>}
}<!-- HTML_TAG_END --></pre></div> <p></p>