How to use kernels-community/quantization-bitsandbytes with Kernels:

    # !pip install kernels
    from kernels import get_kernel

    kernel = get_kernel("kernels-community/quantization-bitsandbytes")

import torch

from ._ops import ops


def gemm_4bit_forward(
    input: torch.Tensor,
    weight: torch.Tensor,
    absmax: torch.Tensor,
    blocksize: int,
    quant_type: int,
) -> torch.Tensor:
    """Run the 4-bit GEMM op and return the result in the input's dtype.

    The op is always called with a bfloat16 input; any other input dtype
    is cast to bfloat16 for the call and the output is cast back afterwards.
    """
    original_dtype = input.dtype

    # Cast to bfloat16 only when needed, so bfloat16 inputs are not copied.
    if original_dtype != torch.bfloat16:
        input = input.to(torch.bfloat16)

    output = ops.gemm_4bit_forward(input, weight, absmax, blocksize, quant_type)

    # Restore the caller's dtype so the wrapper is transparent to callers.
    if original_dtype != torch.bfloat16:
        output = output.to(original_dtype)
    return output
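
The dtype round trip in the wrapper can be tried without the compiled kernel. The sketch below is a minimal illustration, not the kernel's implementation: `fake_gemm_bf16` and `run_in_bf16` are hypothetical names, and the stand-in op is an ordinary bfloat16 matmul on an unquantized weight instead of the real `ops.gemm_4bit_forward`.

```python
import torch


def fake_gemm_bf16(x: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
    """Hypothetical stand-in for the compiled op: a plain bfloat16 matmul."""
    assert x.dtype == torch.bfloat16, "stand-in expects bfloat16 input"
    return x @ w.to(torch.bfloat16).T


def run_in_bf16(x: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
    """Cast to bfloat16, call the op, then restore the caller's dtype."""
    original_dtype = x.dtype
    if original_dtype != torch.bfloat16:
        x = x.to(torch.bfloat16)
    out = fake_gemm_bf16(x, w)
    if original_dtype != torch.bfloat16:
        out = out.to(original_dtype)
    return out


x = torch.randn(2, 8, dtype=torch.float32)  # float32 activations
w = torch.randn(4, 8, dtype=torch.float32)  # (out_features, in_features)
y = run_in_bf16(x, w)
print(y.dtype, tuple(y.shape))
```

The result comes back in the caller's dtype, but it was computed in bfloat16, so it carries bfloat16 rounding even when the input was float32 or float16.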