---
license: apache-2.0
tags:
- prebuilt-wheels
- cuda
- triton
- sageattention
- pytorch
language:
- en
---

# Prebuilt CUDA Wheels — Triton 3.6.0 & SageAttention 2.2.0

Pre-compiled Python wheels for **Linux x86_64**, built against **CUDA 12.8** with **Python 3.12**.

No compilation needed — just `pip install` the `.whl` file matching your setup.

## Available Wheels

### Triton 3.6.0

| Wheel | Size | PyTorch | GPU |
|---|---|---|---|
| `triton-3.6.0-cp312-cp312-linux_x86_64.whl` | 339 MB | Any | All |

Triton is **PyTorch-version independent** — one wheel works with both PyTorch 2.7 and 2.10.

### SageAttention 2.2.0

| Wheel | Size | PyTorch | GPU Arch |
|---|---|---|---|
| `sageattention-2.2.0+cu128torch2.10.0sm90-…` | 21.1 MB | 2.10.0 | Hopper (sm90) |
| `sageattention-2.2.0+cu128torch2.10.0sm120-…` | 15.6 MB | 2.10.0 | Blackwell (sm120) |
| `sageattention-2.2.0+cu128torch2.7.0sm90-…` | 20.2 MB | 2.7.0 | Hopper (sm90) |
| `sageattention-2.2.0+cu128torch2.7.0sm120-…` | 14.9 MB | 2.7.0 | Blackwell (sm120) |

> **Pick the wheel matching your PyTorch version AND GPU architecture.**

## Quick Install

```bash
# Install Triton
pip install https://huggingface.co/yo9otatara/prebuilt_wheels/resolve/main/triton-3.6.0-cp312-cp312-linux_x86_64.whl

# Install SageAttention — pick ONE matching your setup:

# PyTorch 2.10 + Hopper (H100, H200)
pip install https://huggingface.co/yo9otatara/prebuilt_wheels/resolve/main/sageattention-2.2.0%2Bcu128torch2.10.0sm90-cp312-cp312-linux_x86_64.whl

# PyTorch 2.10 + Blackwell (RTX 50 series, e.g. RTX 5090, RTX 5080)
pip install https://huggingface.co/yo9otatara/prebuilt_wheels/resolve/main/sageattention-2.2.0%2Bcu128torch2.10.0sm120-cp312-cp312-linux_x86_64.whl

# PyTorch 2.7 + Hopper (H100, H200)
pip install https://huggingface.co/yo9otatara/prebuilt_wheels/resolve/main/sageattention-2.2.0%2Bcu128torch2.7.0sm90-cp312-cp312-linux_x86_64.whl

# PyTorch 2.7 + Blackwell (RTX 50 series, e.g. RTX 5090, RTX 5080)
pip install https://huggingface.co/yo9otatatara/prebuilt_wheels/resolve/main/sageattention-2.2.0%2Bcu128torch2.7.0sm120-cp312-cp312-linux_x86_64.whl
```

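
A quick import check confirms both wheels installed cleanly. This is a minimal sketch and assumes the packages expose their usual module names (`triton` and `sageattention`):

```bash
# Minimal post-install sanity check (run inside the environment you installed into).
python -c "import triton; print('triton', triton.__version__)"
python -c "from importlib.metadata import version; import sageattention; print('sageattention', version('sageattention'))"
```
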
## Requirements

- **OS**: Linux x86_64
- **Python**: 3.12
- **CUDA**: 12.8
- **PyTorch**: 2.7.0 or 2.10.0 (match the wheel)

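
A rough way to check an existing environment against this list (a sketch, not an exhaustive check):

```bash
# Compare your environment against the requirements above.
python --version                                                          # expect 3.12.x
python -c "import torch; print(torch.__version__, torch.version.cuda)"    # expect 2.7.0 or 2.10.0, cuda 12.8
nvidia-smi --query-gpu=driver_version --format=csv,noheader               # driver must support CUDA 12.8
```
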
## Which GPU wheel do I need?

| GPU | Architecture | Wheel suffix |
|---|---|---|
| H100, H200 | Hopper | `sm90` |
| RTX 5090, RTX 5080, other RTX 50 series | Blackwell | `sm120` |

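
If you are not sure which row applies, query the GPU's compute capability; a result of `(9, 0)` means the `sm90` wheels, `(12, 0)` means the `sm120` wheels:

```bash
# Compute capability via PyTorch.
python -c "import torch; print(torch.cuda.get_device_capability(0))"

# Or via nvidia-smi (the compute_cap query field needs a reasonably recent driver).
nvidia-smi --query-gpu=name,compute_cap --format=csv,noheader
```
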
## Build Info

- Built from source in a Docker container (`nvidia/cuda:12.8.0-devel-ubuntu22.04`)
- SageAttention source: [SageAttention v2.2.0](https://github.com/thu-ml/SageAttention)
- Triton source: [Triton v3.6.0](https://github.com/triton-lang/triton)
- Split-arch build policy: each SageAttention wheel targets exactly one GPU architecture

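
For reference, rebuilding one split-arch SageAttention wheel could look roughly like the sketch below. This is not the exact build script used here, and the `TORCH_CUDA_ARCH_LIST` override is an assumption about the upstream `setup.py`; the real build may auto-detect the local GPU instead.

```bash
# Hypothetical reproduction sketch, not the actual build procedure for these wheels.
docker run --gpus all -it --rm nvidia/cuda:12.8.0-devel-ubuntu22.04 bash

# Inside the container: set up Python 3.12 plus a matching PyTorch, then build one wheel per arch.
git clone https://github.com/thu-ml/SageAttention && cd SageAttention    # check out the 2.2.0 sources
# "9.0" targets Hopper (sm90), "12.0" targets Blackwell (sm120); assumes setup.py honors this variable.
TORCH_CUDA_ARCH_LIST="9.0" python setup.py bdist_wheel                    # wheel is written to dist/
```
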
## License

- Triton: MIT License
- SageAttention: Apache 2.0 License