# Kernel requirements

Kernels on the Hub must fulfill the requirements outlined on this page. By
ensuring that kernels are compliant, they can be used on a wide range of Linux
systems and Torch builds.

[Join us on Discord](https://discord.gg/H6Tkmd88N3) for questions and discussions
about building kernels!
## Directory layout

A kernel repository on the Hub must contain a `build` directory. This
directory contains build variants of a kernel in the form of directories
following the template
`<framework><version>-cxx<abi>-<compute-framework><version>-<arch>-<os>`.
For example, `build/torch26-cxx98-cu118-x86_64-linux`.

The kernel code lives in the build variant directory, which must contain an
`__init__.py` file. For compatibility with older versions of the
`kernels` package, each variant directory must also contain a single
directory with the same name as the repository (replacing `-` by `_`).
For instance, kernels in the `kernels-community/activation` repository
have a directory like `build/<build variant>/activation`. This directory
must contain an `__init__.py` file that exports the same symbols as the
`__init__.py` in the build variant directory `build/<build variant>`.
[This example](https://huggingface.co/kernels-test/flattened-build/blob/main/build/torch-universal/flattened_build/__init__.py)
shows how this can be done. This compatibility directory is
created automatically by `kernel-builder`.
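As a sketch, a compiled build variant name like the one above can be decomposed with a regular expression. The pattern below is illustrative only; the authoritative set of valid components is defined by `kernel-builder`, and universal variants such as `torch-universal` follow a different scheme:

```python
import re

# Illustrative decomposition of a compiled build variant name such as
# "torch26-cxx98-cu118-x86_64-linux". Not the authoritative grammar.
VARIANT_RE = re.compile(
    r"^(?P<framework>[a-z]+\d+)"   # e.g. torch26
    r"-cxx(?P<abi>\d+)"            # e.g. cxx98 (C++ ABI)
    r"-(?P<compute>[a-z]+\d+)"     # e.g. cu118 (CUDA 11.8)
    r"-(?P<arch>[a-z0-9_]+)"       # e.g. x86_64
    r"-(?P<os>[a-z]+)$"            # e.g. linux
)

m = VARIANT_RE.match("torch26-cxx98-cu118-x86_64-linux")
assert m is not None
print(m.group("compute"))  # → cu118
```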
## Build variants

A kernel can be compliant for a specific compute framework (e.g. CUDA) or
architecture (e.g. x86_64). For compliance with a compute framework and
architecture combination, all the variants from the [build variant list](builder/build-variants)
must be available for that combination.
## Kernel metadata

The build variant directory can optionally contain a `metadata.json` file.
Currently the metadata specifies the kernel's version and Python dependencies,
for example:

```json
{
  "python-depends": ["einops"],
  "version": 1
}
```
### Python dependencies

You can specify Python dependencies that your kernel requires. Dependencies can be either general (required for all backends) or backend-specific (required only for certain compute backends such as CUDA, ROCm, XPU, Metal, or CPU).

#### General dependencies

For dependencies required regardless of the backend, use the `python-depends` field:

```json
{
  "python-depends": ["einops"]
}
```

#### Backend-specific dependencies

For dependencies that are only needed for specific backends, use the `python-depends-backends` field:

```json
{
  "python-depends-backends": {
    "cuda": ["nvidia-cutlass-dsl"],
    "xpu": ["onednn"]
  }
}
```

#### Combined example

You can specify both general and backend-specific dependencies:

```json
{
  "python-depends": ["einops"],
  "python-depends-backends": {
    "cuda": ["nvidia-cutlass-dsl"]
  },
  "version": 1
}
```

#### Allowed dependencies

The following dependencies are currently allowed:

**General dependencies:**

- `einops`

**Backend-specific dependencies:**

- CUDA: `nvidia-cutlass-dsl`
- XPU: `onednn`

Dependencies are validated based on the backend being used. When a kernel is loaded, only the dependencies relevant to the active backend are checked.
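To illustrate this behavior, the sketch below collects the dependency list that would be checked for a given backend. The `required_deps` helper is hypothetical and not part of the `kernels` package; it only mirrors the merge of general and backend-specific dependencies described above:

```python
import json

def required_deps(metadata: dict, backend: str) -> list:
    """Collect general plus backend-specific dependencies from a parsed
    metadata.json. Hypothetical helper; the real validation lives in
    the `kernels` package."""
    deps = list(metadata.get("python-depends", []))
    deps += metadata.get("python-depends-backends", {}).get(backend, [])
    return deps

metadata = json.loads("""
{
  "python-depends": ["einops"],
  "python-depends-backends": {"cuda": ["nvidia-cutlass-dsl"], "xpu": ["onednn"]},
  "version": 1
}
""")

# Only the active backend's extra dependencies are included.
print(required_deps(metadata, "cuda"))  # → ['einops', 'nvidia-cutlass-dsl']
print(required_deps(metadata, "cpu"))   # → ['einops']
```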
## Versioning

Kernels are versioned using a major version. The kernel revisions of a
version are stored in a branch of the form `v<version>`. Each build
variant also records the kernel version in `metadata.json`.

The version **must** be bumped in the following cases:

- The kernel API is changed in an incompatible way.
- The API is extended in a compatible way, but not all build variants
  receive the extension (e.g. because they are for older Torch versions
  that are no longer supported by `kernel-builder`).

In both cases, build variants that are not updated must be removed from
the new version's branch.
## Native Python module

Kernels will typically contain a native Python module with precompiled
compute kernels and bindings. This module must fulfill the requirements
outlined in this section. On all operating systems, a kernel must not
have dynamic library dependencies outside of:

- Torch;
- CUDA/ROCm libraries installed as dependencies of Torch.
## Compatibility with torch.compile

The Kernel Hub also encourages writing kernels in a `torch.compile`-compliant
way. This helps ensure that kernels work with `torch.compile` without
introducing graph breaks or triggering recompilation, both of which can limit
the benefits of compilation.
[Here](https://github.com/huggingface/kernels/blob/f83b4da6b7f6b171b47bb9bf96271ae2273bc9d3/builder/examples/relu-backprop-compile/tests/test_relu.py#L162)
is a simple test example that checks for graph breaks and
recompilation triggers during `torch.compile`.
### Linux

- Use [ABI3/Limited API](https://docs.python.org/3/c-api/stable.html#stable-application-binary-interface)
  for compatibility with Python 3.9 and later.
- Compatible with [`manylinux_2_28`](https://github.com/pypa/manylinux?tab=readme-ov-file#manylinux_2_28-almalinux-8-based).
  This means that the extension **must not** use symbol versions higher than:
  - GLIBC 2.28
  - GLIBCXX 3.4.24
  - CXXABI 1.3.11
  - GCC 7.0.0

These requirements can be checked with the ABI checker (see below).
### macOS

- Use [ABI3/Limited API](https://docs.python.org/3/c-api/stable.html#stable-application-binary-interface)
  for compatibility with Python 3.9 and later.
- macOS deployment target 15.0.
- Metal 3.0 (`-std=metal3.0`).

The ABI3 requirement can be checked with the ABI checker (see below).
### ABI checker

The manylinux_2_28 and Python ABI 3.9 version requirements can be checked with
[`kernel-abi-check`](https://crates.io/crates/kernel-abi-check):

```bash
$ cargo install kernel-abi-check
$ kernel-abi-check result/relu/_relu_e87e0ca_dirty.abi3.so
🐍 Checking for compatibility with manylinux_2_28 and Python ABI version 3.9
✅ No compatibility issues found
```
## Torch extension

Torch native extension functions must be [registered](https://pytorch.org/tutorials/advanced/cpp_custom_ops.html#cpp-custom-ops-tutorial)
in `torch.ops.<namespace>`. Since we allow loading multiple versions of
a module in the same Python process, `<namespace>` must be unique for each
version of a kernel. Failing to do so will create clashes when different
versions of the same kernel are loaded. Two suggested ways of doing this
are:

- Appending a truncated SHA-1 hash of the git commit that the kernel was
  built from to the name of the extension.
- Appending random material to the name of the extension.

**Note:** we recommend against appending a version number or git tag.
Version numbers are typically not bumped on each commit, so users
might use two different commits that happen to have the same version
number. Git tags are not stable, so they do not provide a good way
of guaranteeing uniqueness of the namespace.
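The first suggestion can be sketched as follows. `unique_namespace` is a hypothetical helper, shown only to illustrate deriving a stable namespace suffix from the build commit:

```python
import hashlib

def unique_namespace(base: str, commit: str) -> str:
    # Append a truncated SHA-1 of the git commit the kernel was built
    # from, so each built version registers its ops under a distinct
    # name. Hypothetical helper, not part of kernel-builder.
    return f"{base}_{hashlib.sha1(commit.encode()).hexdigest()[:8]}"

# Two builds from different commits get different op namespaces,
# so both can be loaded in the same Python process.
ns = unique_namespace("activation", "f83b4da6b7f6b171b47bb9bf96271ae2273bc9d3")
```

Because the suffix is derived from the commit hash, rebuilding the same commit reproduces the same namespace, while any new commit yields a new one.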
## Layers

A kernel can provide layers in addition to kernel functions. A layer from
the Hub can replace the `forward` method of an existing layer for a certain
device type. This makes it possible to provide more performant kernels for
existing layers. See the [layers documentation](layers) for more information
on how to use layers.
### Writing layers

To make the extension of layers safe, the layers must fulfill the following
requirements:

- The layers are subclasses of `torch.nn.Module`.
- The layers are pure, meaning that they do not have their own state. This
  means that:
  - The layer must not define its own constructor.
  - The layer must not use class variables.
- No methods other than `forward` may be defined.
- The `forward` method has a signature that is compatible with the
  `forward` method that it is extending.

There are two exceptions to the _no class variables_ rule:

1. The `has_backward` variable can be used to indicate whether the layer
   implements a backward pass (`True` when absent).
2. The `can_torch_compile` variable can be used to indicate whether the layer
   supports `torch.compile` (`False` when absent).
This is an example of a pure layer:

```python
class SiluAndMul(nn.Module):
    # This layer does not implement backward.
    has_backward: bool = False

    def forward(self, x: torch.Tensor):
        d = x.shape[-1] // 2
        output_shape = x.shape[:-1] + (d,)
        out = torch.empty(output_shape, dtype=x.dtype, device=x.device)
        ops.silu_and_mul(out, x)
        return out
```
For some layers, the `forward` method has to use state from the adopting class.
In these cases, we recommend using type annotations to indicate which member
variables are expected. For instance:

```python
class LlamaRMSNorm(nn.Module):
    weight: torch.Tensor
    variance_epsilon: float

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        return rms_norm_fn(
            hidden_states,
            self.weight,
            bias=None,
            residual=None,
            eps=self.variance_epsilon,
            dropout_p=0.0,
            prenorm=False,
            residual_in_fp32=False,
        )
```

This layer expects the adopting layer to have `weight` and `variance_epsilon`
member variables and uses them in the `forward` method.
### Exporting layers

To accommodate portable loading, `layers` must be defined in the main
`__init__.py` file. For example:

```python
from . import layers

__all__ = [
    # ...
    "layers",
    # ...
]
```
## Python requirements

- Python code must be compatible with Python 3.9 and later.
- All imports from the kernel itself must be relative. For instance, if
  `module_b` in the example kernel `example` needs a function from
  `module_a`, import it as:

  ```python
  from .module_a import foo
  ```

  **Never use:**

  ```python
  # DO NOT DO THIS!
  from example.module_a import foo
  ```

  The latter would import from the module `example` that is in Python's
  global module dict. However, since we allow loading multiple versions
  of a module, we give each loaded module a unique name.
- Only modules from the Python standard library, Torch, or the kernel itself
  can be imported.