# Installation Guide

Welcome to the installation guide for the `bitsandbytes` library! This document provides step-by-step instructions to install `bitsandbytes` across various platforms and hardware configurations.

We provide official support for NVIDIA GPUs, CPUs, Intel XPUs, and Intel Gaudi. We also have experimental support for additional platforms such as AMD ROCm and Apple Silicon.
## Table of Contents

- [System Requirements](#requirements)
- [NVIDIA CUDA](#cuda)
  - [Installation via PyPI](#cuda-pip)
  - [Compile from Source](#cuda-compile)
- [Intel XPU](#xpu)
  - [Installation via PyPI](#xpu-pip)
- [Intel Gaudi](#gaudi)
  - [Installation via PyPI](#gaudi-pip)
- [CPU](#cpu)
  - [Installation via PyPI](#cpu-pip)
  - [Compile from Source](#cpu-compile)
- [AMD ROCm (Preview)](#rocm)
  - [Installation via PyPI](#rocm-pip)
  - [Compile from Source](#rocm-compile)
- [Preview Wheels](#preview-wheels)
## System Requirements[[requirements]]

These are the minimum requirements for `bitsandbytes` across all platforms. Please be aware that some compute platforms may impose stricter requirements.

* Python >= 3.10
* PyTorch >= 2.4
## NVIDIA CUDA[[cuda]]

`bitsandbytes` is currently supported on NVIDIA GPUs with [Compute Capability](https://developer.nvidia.com/cuda-gpus) 6.0+.

The library can be built using CUDA Toolkit versions as old as **11.8**.

| **Feature**                   | **CC Required** | **Example Hardware Requirement**             |
|-------------------------------|-----------------|----------------------------------------------|
| LLM.int8()                    | 7.5+            | Turing (RTX 20 series, T4) or newer GPUs     |
| 8-bit optimizers/quantization | 6.0+            | Pascal (GTX 10X0 series, P100) or newer GPUs |
| NF4/FP4 quantization          | 6.0+            | Pascal (GTX 10X0 series, P100) or newer GPUs |
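As a quick programmatic sanity check, the compute-capability requirements in the table above can be encoded in a few lines of Python. This is an illustrative helper, not part of the `bitsandbytes` API; on a machine with a CUDA build of PyTorch, you can obtain your GPU's capability with `torch.cuda.get_device_capability()`.

```python
# Minimum compute capability per feature, taken from the table above.
MIN_CC = {
    "llm_int8": (7, 5),          # LLM.int8() needs Turing (CC 7.5) or newer
    "optimizers_8bit": (6, 0),   # 8-bit optimizers/quantization need Pascal or newer
    "nf4_fp4": (6, 0),           # NF4/FP4 quantization needs Pascal or newer
}

def supported_features(cc: tuple[int, int]) -> list[str]:
    """Return the bitsandbytes features usable on a GPU with compute capability `cc`."""
    return [name for name, minimum in MIN_CC.items() if cc >= minimum]

print(supported_features((7, 5)))  # Turing: all features
print(supported_features((6, 1)))  # Pascal: everything except LLM.int8()
```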
### Installation via PyPI[[cuda-pip]]

This is the most straightforward and recommended installation option.

The currently distributed `bitsandbytes` packages are built with the following configurations:

| **OS**             | **CUDA Toolkit** | **Host Compiler**    | **Targets**                                      |
|--------------------|------------------|----------------------|--------------------------------------------------|
| **Linux x86-64**   | 11.8 - 12.6      | GCC 11.2             | sm60, sm70, sm75, sm80, sm86, sm89, sm90         |
| **Linux x86-64**   | 12.8 - 12.9      | GCC 11.2             | sm70, sm75, sm80, sm86, sm89, sm90, sm100, sm120 |
| **Linux x86-64**   | 13.0             | GCC 11.2             | sm75, sm80, sm86, sm89, sm90, sm100, sm120       |
| **Linux aarch64**  | 11.8 - 12.6      | GCC 11.2             | sm75, sm80, sm90                                 |
| **Linux aarch64**  | 12.8 - 13.0      | GCC 11.2             | sm75, sm80, sm90, sm100, sm110, sm120, sm121     |
| **Windows x86-64** | 11.8 - 12.6      | MSVC 19.43+ (VS2022) | sm50, sm60, sm75, sm80, sm86, sm89, sm90         |
| **Windows x86-64** | 12.8 - 12.9      | MSVC 19.43+ (VS2022) | sm70, sm75, sm80, sm86, sm89, sm90, sm100, sm120 |
| **Windows x86-64** | 13.0             | MSVC 19.43+ (VS2022) | sm75, sm80, sm86, sm89, sm90, sm100, sm120       |

The Linux build has a minimum glibc version of 2.24.
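To verify the glibc requirement on your system, one option is Python's standard-library `platform.libc_ver()`. This is a sketch with illustrative helper names; it only yields a result on glibc-based Linux.

```python
# Check that the system glibc meets the 2.24 minimum for the Linux wheels.
# platform.libc_ver() returns ("glibc", "<version>") on glibc-based Linux and
# ("", "") elsewhere, so this is a Linux-only sanity check.
import platform

def version_tuple(v: str) -> tuple[int, ...]:
    return tuple(int(x) for x in v.split("."))

def glibc_at_least(required: str = "2.24"):
    libc, version = platform.libc_ver()
    if libc != "glibc":
        return None  # not a glibc system (musl, macOS, Windows, ...)
    return version_tuple(version) >= version_tuple(required)

print(glibc_at_least())
```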
Use `pip` or `uv` to install the latest release:

```bash
pip install bitsandbytes
```

> [!WARNING]
> **NVIDIA Jetson (L4T / JetPack) — source build required.** The `Linux aarch64` wheels above are built on aarch64-sbsa runners (server-class ARM with the standard CUDA Toolkit). They are **not compatible** with the L4T runtime on Jetson devices (Orin Nano / NX / AGX, Xavier, Thor on CUDA 12), even though both are aarch64 and even though the cubins are binary-compatible with the device's compute capability (e.g., `sm_80` cubin runs on `sm_87` hardware via Ampere-family binary compat — see [NVIDIA's docs on binary compatibility](https://developer.nvidia.com/blog/understanding-ptx-the-assembly-language-of-cuda-gpu-computing/#binary_compatibility)). The mismatch is at the CUDA library / ABI layer (JetPack ships its own CUDA Toolkit and system libraries), and surfaces as a runtime symbol-resolution error like `Error named symbol not found in /src/csrc/ops.cu` on the first CUDA op.
>
> **Two working options on Jetson:**
>
> 1. **Source build on-device.** Use the [Compile from Source](#cuda-compile) instructions below, passing your device's compute capability explicitly (sm_87 for Orin family, sm_72 for Xavier). On an Orin Nano Super: `cmake -DCOMPUTE_BACKEND=cuda -DCOMPUTE_CAPABILITY=87 . && make -j4 && pip install .`
> 2. **Third-party prebuilt** from [Jetson AI Lab's package index](https://pypi.jetson-ai-lab.io/) (e.g., `pypi.jetson-ai-lab.io/jp6/cu126/bitsandbytes/`).
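After installing, you can run the library's built-in diagnostic, which prints the detected CUDA setup and runs a quick sanity check (the exact output depends on your environment):

```bash
python -m bitsandbytes
```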
### Compile from Source[[cuda-compile]]

> [!TIP]
> Don't hesitate to compile from source! The process is straightforward and resilient. This might be needed for older CUDA Toolkit versions, older Linux distributions, or other less common configurations.

For Linux and Windows systems, compiling from source allows you to customize the build configuration. See the `CMakeLists.txt` if you want to check the specifics and explore some additional options. Detailed platform-specific instructions follow below.

**Linux**

To compile from source, you need CMake >= **3.22.1** and Python >= **3.10** installed. Make sure you have a C++ compiler toolchain installed (`gcc`, `make`, headers, etc.). GCC 11 or newer is recommended.

For example, to install a compiler and CMake on Ubuntu:

```bash
apt-get install -y build-essential cmake
```

You should also install the CUDA Toolkit by following the [NVIDIA CUDA Installation Guide for Linux](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html). The minimum supported CUDA Toolkit version is **11.8**.

```bash
git clone https://github.com/bitsandbytes-foundation/bitsandbytes.git && cd bitsandbytes/
cmake -DCOMPUTE_BACKEND=cuda -S .
make
pip install -e .  # `-e` for an "editable" install when developing bitsandbytes (otherwise leave it out)
```

> [!TIP]
> If you have multiple versions of the CUDA Toolkit installed, or it is in a non-standard location, please refer to the CMake CUDA documentation for how to configure the CUDA compiler.
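For example, a specific toolkit can be selected by pointing CMake's standard `CMAKE_CUDA_COMPILER` variable at that toolkit's `nvcc`. The path below is only an example; adjust it to your installation:

```bash
# Example: build against the CUDA 12.4 toolkit explicitly
cmake -DCOMPUTE_BACKEND=cuda -DCMAKE_CUDA_COMPILER=/usr/local/cuda-12.4/bin/nvcc -S .
make
```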
**Windows**

Compilation from source on Windows requires Visual Studio with C++ support as well as an installation of the CUDA Toolkit.

To compile from source, you need CMake >= **3.22.1** and Python >= **3.10** installed. You should also install the CUDA Toolkit by following the [CUDA Installation Guide for Windows](https://docs.nvidia.com/cuda/cuda-installation-guide-microsoft-windows/index.html) from NVIDIA. The minimum supported CUDA Toolkit version is **11.8**.

```bash
git clone https://github.com/bitsandbytes-foundation/bitsandbytes.git && cd bitsandbytes/
cmake -DCOMPUTE_BACKEND=cuda -S .
cmake --build . --config Release
pip install -e .  # `-e` for an "editable" install when developing bitsandbytes (otherwise leave it out)
```

Big thanks to [wkpark](https://github.com/wkpark), [Jamezo97](https://github.com/Jamezo97), [rickardp](https://github.com/rickardp), and [akx](https://github.com/akx) for their amazing contributions to make `bitsandbytes` compatible with Windows.
## Intel XPU[[xpu]]

* A compatible PyTorch version with Intel XPU support is required. The current minimum is **PyTorch 2.6.0**. It is recommended to use the latest stable release. See [Getting Started on Intel GPU](https://docs.pytorch.org/docs/stable/notes/get_start_xpu.html) for guidance.
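As a quick sanity check (a sketch, not part of the `bitsandbytes` API), you can verify that your installed PyTorch build exposes the Intel XPU backend. On builds without XPU support, or when PyTorch is not installed, this reports `False`:

```python
# Check whether this PyTorch build exposes the Intel XPU backend
# (requires PyTorch >= 2.6 with XPU support).
try:
    import torch
    xpu_available = hasattr(torch, "xpu") and torch.xpu.is_available()
except ImportError:
    xpu_available = False

print(f"Intel XPU available: {xpu_available}")
```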
### Installation via PyPI[[xpu-pip]]

This is the most straightforward and recommended installation option.

The currently distributed `bitsandbytes` packages are built with the following configurations:

| **OS**             | **oneAPI Toolkit** | **Kernel Implementation** |
|--------------------|--------------------|---------------------------|
| **Linux x86-64**   | 2025.1.3           | SYCL + Triton             |
| **Windows x86-64** | 2025.1.3           | SYCL + Triton             |

The Linux build has a minimum glibc version of 2.34.

Use `pip` or `uv` to install the latest release:

```bash
pip install bitsandbytes
```
## Intel Gaudi[[gaudi]]

* A compatible PyTorch version with Intel Gaudi support is required. The current minimum is **Gaudi v1.21** with **PyTorch 2.6.0**. It is recommended to use the latest stable release. See the Gaudi software [installation guide](https://docs.habana.ai/en/latest/Installation_Guide/index.html) for guidance.

### Installation via PyPI[[gaudi-pip]]

Use `pip` or `uv` to install the latest release:

```bash
pip install bitsandbytes
```
## CPU[[cpu]]

### Installation via PyPI[[cpu-pip]]

This is the most straightforward and recommended installation option.

The currently distributed `bitsandbytes` packages are built with the following configurations:

| **OS**             | **Host Compiler**    | **Hardware Minimum** |
|--------------------|----------------------|----------------------|
| **Linux x86-64**   | GCC 11.4             | AVX2                 |
| **Linux aarch64**  | GCC 11.4             |                      |
| **Windows x86-64** | MSVC 19.43+ (VS2022) | AVX2                 |
| **macOS arm64**    | Apple Clang 17       |                      |

The Linux build has a minimum glibc version of 2.24.
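To confirm the AVX2 hardware minimum for the x86-64 wheels, one option is to inspect the CPU flags. The sketch below reads `/proc/cpuinfo`, so it only works on Linux; other platforms need tools like `sysctl` on macOS or `coreinfo` on Windows (helper name is illustrative):

```python
# Check whether the CPU advertises AVX2, the hardware minimum for the
# x86-64 CPU wheels (Linux-only: parses /proc/cpuinfo).
from pathlib import Path

def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Extract the flag set from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            return set(line.split(":", 1)[1].split())
    return set()

cpuinfo = Path("/proc/cpuinfo")
if cpuinfo.exists():
    print("AVX2 supported:", "avx2" in cpu_flags(cpuinfo.read_text()))
else:
    print("Not Linux: check AVX2 support with your platform's CPU tools.")
```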
Use `pip` or `uv` to install the latest release:

```bash
pip install bitsandbytes
```

### Compile from Source[[cpu-compile]]

To compile from source, simply install the package from source with `pip`. At this time, the package will be built for CPU only.

```bash
git clone https://github.com/bitsandbytes-foundation/bitsandbytes.git && cd bitsandbytes/
pip install -e .
```
## AMD ROCm (Preview)[[rocm]]

* Support for AMD GPUs is currently in a preview state.
* All features are supported on both consumer RDNA devices and data center CDNA products.
* A compatible PyTorch version with AMD ROCm support is required. It is recommended to use the latest stable release. On Linux, see [PyTorch on ROCm](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/3rd-party/pytorch-install.html) for guidance. On Windows, ROCm-enabled PyTorch wheels are available from:
  - [repo.radeon.com/rocm/windows/](https://repo.radeon.com/rocm/windows/) — official AMD releases
  - [repo.amd.com/rocm/whl/](https://repo.amd.com/rocm/whl/) — [TheRock](https://github.com/ROCm/TheRock) release builds
  - [rocm.nightlies.amd.com/v2](https://rocm.nightlies.amd.com/v2) — TheRock nightly builds
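As a quick sanity check (a sketch, not part of the `bitsandbytes` API), you can confirm that your PyTorch install was built against ROCm: on ROCm builds `torch.version.hip` is a version string, while on CUDA or CPU builds it is `None`:

```python
# Check whether the installed PyTorch was built against ROCm/HIP.
# Reports None when PyTorch is missing or is a non-ROCm build.
try:
    import torch
    hip_version = torch.version.hip
except ImportError:
    hip_version = None

print(f"ROCm/HIP version PyTorch was built with: {hip_version}")
```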
### Installation via PyPI[[rocm-pip]]

This is the most straightforward and recommended installation option.

The currently distributed `bitsandbytes` packages are built with the following configurations:

| **OS**             | **ROCm** | **Targets** |
|--------------------|----------|-------------|
| **Linux x86-64**   | 6.2.4    | CDNA: gfx90a, gfx942 / RDNA: gfx1100, gfx1101, gfx1102, gfx1103 |
| **Linux x86-64**   | 6.3.4    | CDNA: gfx90a, gfx942 / RDNA: gfx1100, gfx1101, gfx1102, gfx1103 |
| **Linux x86-64**   | 6.4.4    | CDNA: gfx90a, gfx942 / RDNA: gfx1100, gfx1101, gfx1102, gfx1103, gfx1150, gfx1151, gfx1152, gfx1153, gfx1200, gfx1201 |
| **Linux x86-64**   | 7.0.2    | CDNA: gfx90a, gfx942, gfx950 / RDNA: gfx1100, gfx1101, gfx1102, gfx1103, gfx1150, gfx1151, gfx1152, gfx1153, gfx1200, gfx1201 |
| **Linux x86-64**   | 7.1.1    | CDNA: gfx90a, gfx942, gfx950 / RDNA: gfx1100, gfx1101, gfx1102, gfx1103, gfx1150, gfx1151, gfx1152, gfx1153, gfx1200, gfx1201 |
| **Linux x86-64**   | 7.2.3    | CDNA: gfx90a, gfx942, gfx950 / RDNA: gfx1100, gfx1101, gfx1102, gfx1103, gfx1150, gfx1151, gfx1152, gfx1153, gfx1200, gfx1201 |
| **Windows x86-64** | 7.2.1    | RDNA: gfx1100, gfx1101, gfx1102, gfx1150, gfx1151, gfx1200, gfx1201 |

Use `pip` or `uv` to install the latest release:

```bash
pip install bitsandbytes
```
### Compile from Source[[rocm-compile]]

`bitsandbytes` can be compiled for ROCm versions 6.2 through 7.2.3. See the `CMakeLists.txt` for additional options.

To compile from source, you need CMake >= **3.31.6** and Python >= **3.10** installed. Make sure you have a C++ compiler toolchain installed (`gcc`, `make`, headers, etc.).

You should also have a ROCm installation (system-wide or via Docker). The current minimum supported version is **6.2**.

```bash
# Install build tool dependencies, unless already present
apt-get install -y build-essential cmake
# Clone the bitsandbytes repo
git clone https://github.com/bitsandbytes-foundation/bitsandbytes.git && cd bitsandbytes/
# Compile & install
cmake -DCOMPUTE_BACKEND=hip -S .  # Use -DBNB_ROCM_ARCH="gfx90a;gfx942" to target a specific GPU arch
make
pip install -e .  # `-e` for an "editable" install when developing bitsandbytes (otherwise leave it out)
```
Compilation on Windows requires Visual Studio with C++ support, CMake, Ninja, and Python >= **3.10**.

Instead of a system-wide ROCm installation, you can use the pip-installable ROCm SDK wheels from [repo.radeon.com](https://repo.radeon.com/rocm/windows/):

```bash
# Install ROCm SDK wheels (adjust version as needed)
pip install ninja cmake
pip install \
    https://repo.radeon.com/rocm/windows/rocm-rel-7.2.1/rocm_sdk_core-7.2.1-py3-none-win_amd64.whl \
    https://repo.radeon.com/rocm/windows/rocm-rel-7.2.1/rocm_sdk_devel-7.2.1-py3-none-win_amd64.whl \
    https://repo.radeon.com/rocm/windows/rocm-rel-7.2.1/rocm_sdk_libraries_custom-7.2.1-py3-none-win_amd64.whl \
    https://repo.radeon.com/rocm/windows/rocm-rel-7.2.1/rocm-7.2.1.tar.gz

# Expand the devel tarball
rocm-sdk init

# Set ROCM_PATH and activate the Visual Studio environment, then build
export ROCM_PATH="$(rocm-sdk path --root)"
export PATH="${ROCM_PATH}/bin:${PATH}"
git clone https://github.com/bitsandbytes-foundation/bitsandbytes.git && cd bitsandbytes/
cmake -G Ninja -DCOMPUTE_BACKEND=hip -DBNB_ROCM_ARCH="gfx1100" -DCMAKE_BUILD_TYPE=Release -S .
cmake --build . --config Release
pip install .
```
## Preview Wheels[[preview-wheels]]

If you would like to use new features even before they are officially released and help us test them, feel free to install the wheel directly from our CI (*the wheel links will remain stable!*):

```bash
# Note: if you don't want to reinstall our dependencies, append the `--no-deps` flag!

# Linux x86_64 (most users)
pip install --force-reinstall https://github.com/bitsandbytes-foundation/bitsandbytes/releases/download/continuous-release_main/bitsandbytes-1.33.7.preview-py3-none-manylinux_2_24_x86_64.whl

# Linux ARM/aarch64
pip install --force-reinstall https://github.com/bitsandbytes-foundation/bitsandbytes/releases/download/continuous-release_main/bitsandbytes-1.33.7.preview-py3-none-manylinux_2_24_aarch64.whl

# Windows x86-64
pip install --force-reinstall https://github.com/bitsandbytes-foundation/bitsandbytes/releases/download/continuous-release_main/bitsandbytes-1.33.7.preview-py3-none-win_amd64.whl

# macOS arm64 (Apple Silicon)
pip install --force-reinstall https://github.com/bitsandbytes-foundation/bitsandbytes/releases/download/continuous-release_main/bitsandbytes-1.33.7.preview-py3-none-macosx_14_0_arm64.whl
```