# CRAYON Installation Guide

## Quick Install (CPU Only)

```bash
pip install xerv-crayon
```

## CUDA Installation (NVIDIA GPUs)

### Prerequisites
1. **NVIDIA GPU** with CUDA support (Pascal architecture or newer)
2. **CUDA Toolkit** 12.1+ recommended
3. **PyTorch with CUDA support**

### Step 1: Install CUDA Toolkit
Download and install from: https://developer.nvidia.com/cuda-downloads

**Windows:**
- Install to default location: `C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.x`
- Add to PATH: `C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.x\bin`

**Linux:**
```bash
export CUDA_HOME=/usr/local/cuda
export PATH=$CUDA_HOME/bin:$PATH
```
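
Before moving on, it can help to confirm that the toolkit is visible to the build. The following is a minimal sketch and not part of CRAYON: `find_cuda_toolkit` is a hypothetical helper that looks for `nvcc` on `PATH` and reads `CUDA_HOME`, falling back to `CUDA_PATH` (the variable the Windows installer typically sets).

```python
import os
import shutil


def find_cuda_toolkit(env=None):
    """Report where the CUDA Toolkit appears to be installed.

    Hypothetical helper for illustration; it is not part of CRAYON.
    """
    env = os.environ if env is None else env
    # CUDA_HOME is the usual Linux convention; CUDA_PATH is common on Windows.
    cuda_home = env.get("CUDA_HOME") or env.get("CUDA_PATH")
    # shutil.which returns None when nvcc is not on the given PATH.
    nvcc = shutil.which("nvcc", path=env.get("PATH", ""))
    return {"cuda_home": cuda_home, "nvcc": nvcc}


if __name__ == "__main__":
    info = find_cuda_toolkit()
    print(f"CUDA home: {info['cuda_home'] or 'not set'}")
    print(f"nvcc:      {info['nvcc'] or 'not found on PATH'}")
```

If either value is missing, revisit the PATH and environment settings above before building.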

### Step 2: Install PyTorch CUDA
```bash
# Uninstall CPU-only version first
pip uninstall torch torchvision torchaudio

# Install CUDA version
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
```
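
To check that the CUDA build of PyTorch was actually installed, a short script along these lines may be useful. It is a sketch, not a CRAYON utility: `torch_cuda_status` is a hypothetical name, and it relies only on `torch.version.cuda` and `torch.cuda.is_available()`.

```python
import importlib.util


def torch_cuda_status():
    """Summarize whether the installed PyTorch build can use CUDA.

    Hypothetical helper for illustration; it is not part of CRAYON.
    """
    if importlib.util.find_spec("torch") is None:
        return {"installed": False, "cuda_build": None, "cuda_available": False}

    import torch

    return {
        "installed": True,
        # torch.version.cuda is None for CPU-only wheels.
        "cuda_build": torch.version.cuda,
        # False when no usable GPU or driver is found, even with a CUDA wheel.
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(torch_cuda_status())
```

A `cuda_build` of `None` suggests the CPU-only wheel is still installed; a CUDA build with `cuda_available` set to `False` usually points at the driver or GPU rather than PyTorch.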

### Step 3: Install CRAYON with CUDA
```bash
# Development install (recommended)
git clone https://github.com/Electroiscoding/CRAYON.git
cd CRAYON
pip install -e . --verbose

# Or production install
pip install xerv-crayon --verbose
```

### Step 4: Verify Installation
```python
from crayon.core.vocabulary import CrayonVocab

# Should show green message if CUDA is available
vocab = CrayonVocab(device="auto")
print(f"Active device: {vocab.device}")
```

## ROCm Installation (AMD GPUs)

### Prerequisites
1. **AMD GPU** with ROCm support
2. **ROCm Toolkit** 5.4+ recommended

### Installation
```bash
# Set ROCm environment
export ROCM_HOME=/opt/rocm
export HIP_VISIBLE_DEVICES=0

# Install CRAYON (run from the root of the cloned CRAYON repository)
pip install -e . --verbose
```
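
The ROCm environment can be inspected the same way before building. This is a minimal sketch under assumptions: `check_rocm_env` is a hypothetical helper, and looking for `hipcc` (ROCm's compiler driver) on `PATH` is a guess at what the build needs, not something this guide confirms.

```python
import os
import shutil


def check_rocm_env(env=None):
    """Collect the ROCm-related settings used in this guide.

    Hypothetical helper for illustration; it is not part of CRAYON.
    """
    env = os.environ if env is None else env
    rocm_home = env.get("ROCM_HOME")
    return {
        "rocm_home": rocm_home,
        # True only when ROCM_HOME is set and points at a real directory.
        "rocm_home_exists": bool(rocm_home) and os.path.isdir(rocm_home),
        # Assumption: the build may need hipcc; None means it is not on PATH.
        "hipcc": shutil.which("hipcc", path=env.get("PATH", "")),
        "visible_devices": env.get("HIP_VISIBLE_DEVICES"),
    }


if __name__ == "__main__":
    for key, value in check_rocm_env().items():
        print(f"{key}: {value}")
```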

## Troubleshooting

### CUDA Extension Not Compiled

If you see:
```
WARNING:crayon.vocab:CUDA extension not compiled. Falling back to CPU.
```

Run this diagnostic:
```python
from crayon.core.vocabulary import CrayonVocab
vocab = CrayonVocab(device="cpu")  # Initialize first
print(vocab._get_cuda_import_error())  # Get detailed fix instructions
```

### Common Issues

#### 1. "NVCC not found"
**Solution:** Install the CUDA Toolkit and add its `bin` directory to `PATH` (see Step 1).

#### 2. "PyTorch CUDA not available"
**Solution:** Install CUDA version of PyTorch:
```bash
pip uninstall torch torchvision torchaudio
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
```

#### 3. "CUDA_HOME not set"
**Solution:** Set environment variable:
- **Windows:** `CUDA_PATH=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.x`
- **Linux:** `export CUDA_HOME=/usr/local/cuda`

#### 4. Build fails with "out of memory"
**Solution:** Limit build jobs:
```bash
export MAX_JOBS=1
pip install -e . --verbose
```

### Forced Builds

If the CUDA Toolkit is installed but the machine has no GPU (for example, a build server), force a CUDA build:
```bash
# Windows (Command Prompt)
set CRAYON_FORCE_CUDA=1
pip install -e . --force-reinstall

# Linux/macOS
export CRAYON_FORCE_CUDA=1
pip install -e . --force-reinstall
```

### Generic Wheel Build (for distribution)
```bash
# Requires the PyPA `build` package: pip install build
export CRAYON_GENERIC_BUILD=1
python -m build
```

## Performance Verification

```python
import time
from crayon.core.vocabulary import CrayonVocab

text = "Hello world! " * 1000

# Time the same input on each backend; an unavailable backend reports its error
for device in ["cpu", "cuda"]:
    try:
        vocab = CrayonVocab(device=device)
        vocab.load_profile("lite")

        # perf_counter() is a monotonic, high-resolution clock suited to timing
        start = time.perf_counter()
        tokens = vocab.tokenize(text)
        elapsed = time.perf_counter() - start

        print(f"{device.upper()}: {elapsed:.6f}s for {len(tokens)} tokens")
    except Exception as e:
        print(f"{device.upper()}: {e}")
```
```
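
A single timed call can be skewed by one-time costs such as lazy initialization. One common remedy is to warm up and then take the best of several runs. The sketch below shows that pattern with a hypothetical `best_of` helper and a stand-in workload so it runs without CRAYON installed.

```python
import time


def best_of(fn, repeats=5, warmup=1):
    """Return the fastest wall-clock time in seconds over `repeats` calls.

    Hypothetical helper for illustration; it is not part of CRAYON.
    """
    for _ in range(warmup):
        fn()  # Warm-up calls absorb one-time setup costs.
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings)


if __name__ == "__main__":
    # Stand-in workload; with CRAYON installed, pass
    # `lambda: vocab.tokenize(text)` instead.
    text = "Hello world! " * 1000
    print(f"best of 5: {best_of(lambda: text.split()):.6f}s")
```

Taking the minimum rather than the mean reduces the influence of background noise on short measurements.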

## Getting Help

- **Issues:** https://github.com/Electroiscoding/CRAYON/issues
- **Discussions:** https://github.com/Electroiscoding/CRAYON/discussions
- **Documentation:** https://github.com/Electroiscoding/CRAYON#readme

## Environment Variables

| Variable | Purpose | Example |
|----------|---------|---------|
| `CRAYON_DEVICE` | Force device selection | `cuda`, `cpu`, `rocm` |
| `CRAYON_FORCE_CUDA` | Force CUDA build | `1` |
| `CRAYON_FORCE_ROCM` | Force ROCm build | `1` |
| `CRAYON_FORCE_CPU` | CPU-only build | `1` |
| `CRAYON_GENERIC_BUILD` | Build for all GPU archs | `1` |
| `CRAYON_PROFILE_DIR` | Custom profile directory | `/path/to/profiles` |
| `MAX_JOBS` | Limit build parallelism | `1` |
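
When debugging a build, it may help to see which of these variables are set in the current shell. The sketch below only reads the environment: `report_env` is a hypothetical helper, not a CRAYON API, and the variable names are copied from the table above.

```python
import os

# Names taken from the table above.
CRAYON_ENV_VARS = (
    "CRAYON_DEVICE",
    "CRAYON_FORCE_CUDA",
    "CRAYON_FORCE_ROCM",
    "CRAYON_FORCE_CPU",
    "CRAYON_GENERIC_BUILD",
    "CRAYON_PROFILE_DIR",
    "MAX_JOBS",
)


def report_env(env=None):
    """Map each variable name to its current value, or None if unset.

    Hypothetical helper for illustration; it is not part of CRAYON.
    """
    env = os.environ if env is None else env
    return {name: env.get(name) for name in CRAYON_ENV_VARS}


if __name__ == "__main__":
    for name, value in report_env().items():
        print(f"{name:<22}{value if value is not None else '(unset)'}")
```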