	# Building and testing kernels with GitHub Actions

	Compiling a kernel is CPU-intensive and testing it requires an accelerator (such as a GPU), two things that GitHub's standard runners do not provide cheaply. Instead of maintaining self-hosted runners, you can offload both steps to [Hugging Face Jobs](https://huggingface.co/docs/huggingface_hub/guides/jobs) directly from a GitHub Actions workflow.

	Two prebuilt actions make this possible:

	- [`huggingface/kernel-builder-job`](https://github.com/huggingface/kernel-builder-job) runs the Nix kernel builder on a CPU flavor and publishes the built kernel to the Hub.
	- [`huggingface/hf-jobs-action`](https://github.com/huggingface/hf-jobs-action) runs an arbitrary script on any flavor (including GPUs), which is convenient for testing the kernel you just built, including across different hardware.

	A typical setup has two workflows: one that builds the kernel on push, and one that tests it on a GPU. They communicate through the Hub: the build uploads artifacts, and the test pulls them back down.

	## Prerequisites

	- An [HF access token](https://huggingface.co/settings/tokens) with the `job.write` permission. If the test job loads gated models, the token also needs read access to them.
	- The token stored as a repository secret named `HF_TOKEN` (Settings → Secrets and variables → Actions).
	- A kernel repository on the Hub to upload to, with kernel-creation access for the owning user or org (see [Building kernels](build#uploading-your-kernel-to-the-hub)).

	> [!NOTE]
	> Jobs run under the `namespace` you specify (your username or an org) and
	> count against that namespace's compute quota.

	## Building on push

	The build action does not check out the repository itself: your script clones the exact commit and invokes the Nix builder. Compilation happens on the selected HF Jobs CPU flavor, and `build-and-upload` pushes the finished variants to the Hub.

	```yaml
	# .github/workflows/build-kernel.yml
	name: Build Kernel

	on:
	  push:
	    branches: [main]
	    paths:
	      - "csrc/**"
	      - "torch-ext/**"
	      - build.toml
	      - flake.nix
	      - flake.lock
	  workflow_dispatch:

	jobs:
	  build:
	    runs-on: ubuntu-latest
	    steps:
	      - uses: actions/checkout@v4
	      - name: Build kernel via HF Jobs
	        uses: huggingface/kernel-builder-job@main
	        with:
	          token: ${{ secrets.HF_TOKEN }}
	          namespace: your-username
	          flavor: cpu-xl
	          timeout: "21600"
	          script: |
	            # The container starts with `set -x`; disable tracing so the
	            # token below is not echoed into the streamed logs.
	            set +x
	            export HF_TOKEN="${{ secrets.HF_TOKEN }}"
	            # Rebuild artifacts, so skip pulling existing LFS blobs.
	            export GIT_LFS_SKIP_SMUDGE=1

	            git clone "${{ github.server_url }}/${{ github.repository }}" kernel
	            cd kernel
	            git checkout "${{ github.sha }}"

	            nix run github:huggingface/kernels#kernel-builder -- build-and-upload \
	              --max-jobs 4 \
	              --cores 8 \
	              --repo-id your-username/your-kernel
	```

	The path filter keeps the build from running on unrelated commits, and `workflow_dispatch` lets you trigger it by hand from the Actions tab. The upload destination is taken from `--repo-id` (or, if omitted, from the `repo-id`/`version` fields in `build.toml`).

	> [!TIP]
	> Builds can take a long time on the first run because every PyTorch and CUDA
	> variant is compiled. Set a generous `timeout` (the example allows six hours)
	> and rely on the [Hugging Face binary cache](build#using-the-hugging-face-binary-cache)
	> to keep subsequent builds fast.

	You can speed up builds by tuning how much work runs in parallel. `--max-jobs`
	sets how many kernel variants are built concurrently, while `--cores` sets how
	many CPU cores each of those jobs may use. Pick values that fit the chosen CPU
	`flavor`: a larger flavor (such as `cpu-xl`) has more cores to spread across
	jobs, so raising `--max-jobs` and `--cores` together shortens the total build
	time. Setting them too high for the flavor only adds scheduling overhead.
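One way to pick the two values is to treat `--max-jobs` multiplied by `--cores` as a budget that should stay within the flavor's vCPU count. The sketch below is a rough heuristic, not part of the kernel builder: the helper name `plan_parallelism` is hypothetical, and the 32-vCPU figure is an assumption for illustration, so check the Jobs documentation for the real size of your flavor.

```python
def plan_parallelism(total_vcpus: int, max_jobs: int) -> tuple[int, int]:
    """Split a vCPU budget into (max_jobs, cores_per_job).

    Keeps max_jobs * cores_per_job <= total_vcpus so that concurrent
    variant builds do not oversubscribe the machine.
    """
    if total_vcpus < 1 or max_jobs < 1:
        raise ValueError("total_vcpus and max_jobs must be positive")
    # Never run more jobs than there are vCPUs to hand out.
    jobs = min(max_jobs, total_vcpus)
    # Integer division leaves any remainder unused rather than oversubscribing.
    cores = total_vcpus // jobs
    return jobs, cores


# Assumed 32-vCPU flavor; substitute the actual vCPU count of your flavor.
jobs, cores = plan_parallelism(total_vcpus=32, max_jobs=4)
print(f"--max-jobs {jobs} --cores {cores}")
```

With the assumed 32 vCPUs and four concurrent jobs, this prints the same `--max-jobs 4 --cores 8` pair used in the workflow above. Treat the result as a starting point and adjust it after watching a real build.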

	### `kernel-builder-job` inputs

	| Input       | Required | Default            | Description                                       |
	| ----------- | -------- | ------------------ | ------------------------------------------------- |
	| `token`     | yes      |                    | HF token with `job.write` permission.             |
	| `namespace` | yes      |                    | HF namespace (username or org) that owns the job. |
	| `script`    | yes      |                    | Shell script to run in the container.             |
	| `flavor`    | no       | `cpu-upgrade`      | Hardware flavor (e.g. `cpu-xl`).                  |
	| `image`     | no       | Nix + cachix image | Container image to run the build in.              |
	| `timeout`   | no       | `1200`             | Maximum seconds to wait for the job.              |

	The action exposes `job_id` and `job_url` outputs that link to the run on huggingface.co.

	## Testing on a GPU

	Once the kernel is on the Hub, the generic jobs action runs a test script on a GPU flavor. The `files` input copies repository files into the container (under `/tmp/files` by default), and a [`uv` script](https://docs.astral.sh/uv/guides/scripts/) with inline dependencies keeps the environment self-contained.

	```yaml
	# .github/workflows/run-tests.yml
	name: Run tests

	on:
	  push:
	    branches: [main]
	    paths:
	      - scripts/test.py
	  workflow_dispatch:

	jobs:
	  run:
	    runs-on: ubuntu-latest
	    steps:
	      - uses: actions/checkout@v4
	      - name: Run test.py on an HF Jobs GPU
	        uses: huggingface/hf-jobs-action@main
	        with:
	          token: ${{ secrets.HF_TOKEN }}
	          namespace: your-username
	          flavor: rtx-pro-6000
	          image: ghcr.io/astral-sh/uv:python3.10-bookworm
	          timeout: "3600"
	          files: scripts/test.py
	          script: |
	            set +x
	            export HF_TOKEN="${{ secrets.HF_TOKEN }}"
	            uv run /tmp/files/test.py
	```

	The test script pulls the kernel straight from the Hub with the [`kernels`](../basic-usage) library, so it always runs against the artifacts the build workflow just published:

	```python
	# scripts/test.py
	# /// script
	# dependencies = ["kernels", "torch"]
	# ///
	from kernels import get_kernel

	kernel = get_kernel("your-username/your-kernel")
	# ... exercise the kernel and assert on the results ...
	```
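The elided "exercise the kernel" step usually means comparing the kernel's output against a trusted reference within a tolerance. The following is a minimal, dependency-free sketch of that pattern. `assert_matches_reference`, `relu_reference`, and `relu_candidate` are hypothetical names, and `relu_candidate` only stands in for whatever function your loaded kernel provides. In a real test you would more likely compare tensors with `torch.testing.assert_close`.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def assert_matches_reference(
    candidate: Callable[[Vector], Vector],
    reference: Callable[[Vector], Vector],
    inputs: Vector,
    rel_tol: float = 1e-5,
    abs_tol: float = 1e-8,
) -> None:
    """Fail if candidate(inputs) differs from reference(inputs) beyond tolerance."""
    got = list(candidate(inputs))
    want = list(reference(inputs))
    assert len(got) == len(want), f"length mismatch: {len(got)} != {len(want)}"
    for i, (g, w) in enumerate(zip(got, want)):
        assert math.isclose(g, w, rel_tol=rel_tol, abs_tol=abs_tol), (
            f"element {i}: got {g}, expected {w}"
        )


def relu_reference(xs: Vector) -> list[float]:
    """Plain-Python reference implementation of ReLU."""
    return [max(x, 0.0) for x in xs]


def relu_candidate(xs: Vector) -> list[float]:
    """Stand-in for the function a loaded kernel would provide (hypothetical)."""
    return [x if x > 0.0 else 0.0 for x in xs]


assert_matches_reference(relu_candidate, relu_reference, [-1.0, 0.0, 2.5])
print("kernel output matches reference")
```

Because a failed `assert` makes the script exit with a non-zero status, a mismatch should surface as a failed job, assuming the action propagates the script's exit code.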

	Run the build workflow before the test workflow so the Hub has a fresh kernel to pull. For tightly coupled steps, you can also have one workflow trigger the other, or combine both jobs in a single workflow with a `needs:` dependency.

	### `hf-jobs-action` inputs

	| Input        | Required | Default      | Description                                             |
	| ------------ | -------- | ------------ | ------------------------------------------------------- |
	| `token`      | yes      |              | HF token with `job.write` permission.                   |
	| `namespace`  | yes      |              | HF namespace (username or org) that owns the job.       |
	| `image`      | yes      |              | Container image to run.                                 |
	| `script`     | yes      |              | Shell script to execute in the container.               |
	| `flavor`     | no       | `cpu-basic`  | Hardware flavor (e.g. `rtx-pro-6000`).                  |
	| `files`      | no       |              | Newline-separated repo files to copy into the job.      |
	| `files_dest` | no       | `/tmp/files` | Directory the files are copied to inside the container. |
	| `env`        | no       | `{}`         | Environment variables as a JSON object.                 |
	| `timeout`    | no       | `1200`       | Maximum seconds to wait for the job.                    |
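Since `env` expects a JSON object, generating the string can be less error-prone than writing it by hand. The sketch below shows one way to do that. The helper `to_env_input` and the variable names `KERNEL_REPO` and `TEST_SEED` are hypothetical, and converting every value to a string is an assumption (environment variables are text), not something the action is documented to require.

```python
import json


def to_env_input(variables: dict[str, object]) -> str:
    """Serialize variables as a compact JSON object with string values."""
    # Environment variables are text, so stringify every value up front.
    as_text = {name: str(value) for name, value in variables.items()}
    # Sorted keys and compact separators give a stable, single-line string.
    return json.dumps(as_text, separators=(",", ":"), sort_keys=True)


print(to_env_input({"KERNEL_REPO": "your-username/your-kernel", "TEST_SEED": 0}))
```

The printed string can be pasted into the `env` input, and the test script can then read the values back with `os.environ`.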

	## Choosing a flavor

	Flavors map to the machine types available on Hugging Face Jobs: CPU flavors such as `cpu-upgrade` and `cpu-xl` for builds, and GPU flavors such as `l4x1`, `a100-large`, `h200`, or `rtx-pro-6000` for tests. Pick the least expensive GPU that fits your model to keep job costs low. The current list and pricing are in the [Hugging Face Jobs documentation](https://huggingface.co/docs/huggingface_hub/guides/jobs).

	> [!NOTE]
	> HF Jobs currently only offers a few CPU architectures, so the kernel is built
	> for whatever architecture the available CPU flavors provide. This is a current
	> limitation to keep in mind if you need to target a specific architecture.

	> [!WARNING]
	> HF Jobs containers start with shell tracing enabled (`set -x`). Always run
	> `set +x` before exporting `HF_TOKEN` so the token does not leak into the
	> streamed build logs.
