yo9otatara committed f0d9c15 (verified · 1 parent: ed8a274): Create README.md

---
license: apache-2.0
tags:
- prebuilt-wheels
- cuda
- triton
- sageattention
- pytorch
language:
- en
---

# Prebuilt CUDA Wheels — Triton 3.6.0 & SageAttention 2.2.0

Pre-compiled Python wheels for **Linux x86_64**, built against **CUDA 12.8** with **Python 3.12**.

No compilation needed — just `pip install` the `.whl` file matching your setup.

## Available Wheels

### Triton 3.6.0

| Wheel | Size | PyTorch | GPU |
|---|---|---|---|
| `triton-3.6.0-cp312-cp312-linux_x86_64.whl` | 339 MB | Any | All |

Triton is **PyTorch-version independent** — one wheel works with both PyTorch 2.7 and 2.10.

### SageAttention 2.2.0

| Wheel | Size | PyTorch | GPU Arch |
|---|---|---|---|
| `sageattention-2.2.0+cu128torch2.10.0sm90-…` | 21.1 MB | 2.10.0 | Hopper (sm90) |
| `sageattention-2.2.0+cu128torch2.10.0sm120-…` | 15.6 MB | 2.10.0 | Blackwell (sm120) |
| `sageattention-2.2.0+cu128torch2.7.0sm90-…` | 20.2 MB | 2.7.0 | Hopper (sm90) |
| `sageattention-2.2.0+cu128torch2.7.0sm120-…` | 14.9 MB | 2.7.0 | Blackwell (sm120) |

> **Pick the wheel matching your PyTorch version AND GPU architecture.**

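The wheel names follow a fixed pattern, so the right one can be derived from your PyTorch version and GPU suffix. A minimal sketch (the helper name is illustrative, not part of this repo):

```python
# Hypothetical helper: compose the expected SageAttention wheel filename
# for the builds in this repo (SageAttention 2.2.0 / CUDA 12.8 / CPython 3.12).
def sageattention_wheel_name(torch_version: str, sm_suffix: str) -> str:
    """torch_version is e.g. '2.10.0'; sm_suffix is 'sm90' or 'sm120'."""
    return (
        f"sageattention-2.2.0+cu128torch{torch_version}{sm_suffix}"
        f"-cp312-cp312-linux_x86_64.whl"
    )

print(sageattention_wheel_name("2.10.0", "sm90"))
# sageattention-2.2.0+cu128torch2.10.0sm90-cp312-cp312-linux_x86_64.whl
```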
## Quick Install

```bash
# Install Triton
pip install https://huggingface.co/yo9otatara/prebuilt_wheels/resolve/main/triton-3.6.0-cp312-cp312-linux_x86_64.whl

# Install SageAttention — pick ONE matching your setup:

# PyTorch 2.10 + Hopper (H100, H200)
pip install https://huggingface.co/yo9otatara/prebuilt_wheels/resolve/main/sageattention-2.2.0%2Bcu128torch2.10.0sm90-cp312-cp312-linux_x86_64.whl

# PyTorch 2.10 + Blackwell sm120 (e.g. RTX 5090, RTX 5080)
pip install https://huggingface.co/yo9otatara/prebuilt_wheels/resolve/main/sageattention-2.2.0%2Bcu128torch2.10.0sm120-cp312-cp312-linux_x86_64.whl

# PyTorch 2.7 + Hopper (H100, H200)
pip install https://huggingface.co/yo9otatara/prebuilt_wheels/resolve/main/sageattention-2.2.0%2Bcu128torch2.7.0sm90-cp312-cp312-linux_x86_64.whl

# PyTorch 2.7 + Blackwell sm120 (e.g. RTX 5090, RTX 5080)
pip install https://huggingface.co/yo9otatara/prebuilt_wheels/resolve/main/sageattention-2.2.0%2Bcu128torch2.7.0sm120-cp312-cp312-linux_x86_64.whl
```
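Note the `%2B` in the SageAttention URLs: it is the URL-encoded `+` from the wheel's local version tag. A small sketch of deriving a download URL from a filename (the `BASE` constant mirrors the links above and is the only assumption):

```python
from urllib.parse import quote

# Resolve endpoint used by the install commands above.
BASE = "https://huggingface.co/yo9otatara/prebuilt_wheels/resolve/main/"

def wheel_url(filename: str) -> str:
    """Percent-encode the local-version '+' (and similar) so the link resolves."""
    return BASE + quote(filename)

print(wheel_url("sageattention-2.2.0+cu128torch2.7.0sm90-cp312-cp312-linux_x86_64.whl"))
```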

## Requirements

- **OS**: Linux x86_64
- **Python**: 3.12
- **CUDA**: 12.8
- **PyTorch**: 2.7.0 or 2.10.0 (match the wheel)

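The requirements above can be checked mechanically before installing. A hedged sketch (the function and its inputs are illustrative, not shipped with these wheels):

```python
def check_env(python_version, platform_tag, torch_version):
    """Return a list of mismatches against this repo's wheel requirements."""
    problems = []
    if tuple(python_version[:2]) != (3, 12):
        problems.append("Python must be 3.12")
    if platform_tag != "linux_x86_64":
        problems.append("wheels are Linux x86_64 only")
    if torch_version not in ("2.7.0", "2.10.0"):
        problems.append("PyTorch must be 2.7.0 or 2.10.0")
    return problems

# e.g. feed it sys.version_info, sysconfig.get_platform(), torch.__version__
print(check_env((3, 12, 1), "linux_x86_64", "2.7.0"))  # []
```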
## Which GPU wheel do I need?

| GPU | Architecture | Wheel suffix |
|---|---|---|
| H100, H200 | Hopper | `sm90` |
| RTX 5090, RTX 5080 | Blackwell | `sm120` |

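In code, the suffix follows from the CUDA compute capability of your device. A minimal sketch of that mapping (only the two architectures built here are covered; anything else gets no wheel):

```python
# Map CUDA compute capability -> wheel suffix for the builds in this repo.
def wheel_suffix(cc_major: int, cc_minor: int):
    """Return 'sm90'/'sm120' for supported capabilities, else None."""
    suffixes = {(9, 0): "sm90", (12, 0): "sm120"}
    return suffixes.get((cc_major, cc_minor))

# With PyTorch installed you could query device 0's capability:
#   import torch
#   major, minor = torch.cuda.get_device_capability(0)
print(wheel_suffix(9, 0))  # sm90
```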
## Build Info

- Built from source in a Docker container (`nvidia/cuda:12.8.0-devel-ubuntu22.04`)
- SageAttention source: [SageAttention v2.2.0](https://github.com/thu-ml/SageAttention)
- Triton source: [Triton v3.6.0](https://github.com/triton-lang/triton)
- Split-arch build policy: each SageAttention wheel targets exactly one GPU architecture

## License

- Triton: MIT License
- SageAttention: Apache 2.0 License