troubleshooting.md (CHANGED, +30 -21)
## GPU Acceleration Issues

### spaces.GPU Decorator Issues

We've observed that the `spaces.GPU` decorator may not work correctly when used with methods inside a class. This can lead to errors like:

```
HTTP Request: POST http://device-api.zero/release?allowToken=... "HTTP/1.1 404 Not Found"
Error in text generation: 'GPU task aborted'
```

### Solution
1. The `spaces.GPU` decorator can be applied either with or without parentheses. Both of these forms should work:

```python
@spaces.GPU
def generate_text(model_path, text):
    # ...
```

```python
@spaces.GPU()
def generate_text(model_path, text):
    # ...
```
If you need to specify a duration for longer GPU operations, use parentheses:

```python
@spaces.GPU(duration=120)  # Set a 120-second duration
def generate_long_text(model_path, text):
    # ...
```
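Both call forms work because `spaces.GPU` acts as both a decorator and a decorator factory. The following is a minimal no-op stand-in, not the real `spaces` implementation (`gpu`, `short_task`, and `long_task` are illustrative names), sketching how a single decorator can support both forms:

```python
import functools

def gpu(fn=None, *, duration=60):
    """No-op stand-in mimicking a decorator that accepts
    both the bare @gpu and the called @gpu(duration=...) forms."""
    def decorate(f):
        @functools.wraps(f)
        def wrapper(*args, **kwargs):
            # A real implementation would reserve a GPU for `duration` seconds here.
            return f(*args, **kwargs)
        return wrapper
    if fn is not None:   # used bare as @gpu: fn is the decorated function itself
        return decorate(fn)
    return decorate      # used as @gpu(...): return the actual decorator

@gpu
def short_task(x):
    return x + 1

@gpu(duration=120)
def long_task(x):
    return x * 2
```

With this shape, `short_task(1)` and `long_task(2)` both run normally; the only difference is whether the decorator saw its arguments up front.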
2. Use standalone functions instead of class methods with `spaces.GPU`:

**Problematic:**
```python
class ModelManager:
    @spaces.GPU
    def generate_text(self, model_path, text):  # Class method doesn't work well
        # ...
```

**Recommended:**
```python
@spaces.GPU
def generate_text_local(model_path, text):  # Standalone function
    # ...
```
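If your app is organized around a class, one pattern is to keep the decorated function at module level and have the class delegate to it. This is a sketch: `ModelManager` and `generate_text_local` are illustrative names, and the `ImportError` fallback exists only so the example runs outside a Space:

```python
try:
    import spaces          # available inside a Hugging Face Space
    gpu = spaces.GPU
except ImportError:        # local fallback: a no-op stand-in decorator
    def gpu(fn):
        return fn

@gpu
def generate_text_local(model_path, text):
    # Module-level function: this is the callable ZeroGPU wraps reliably.
    # (Real generation code would go here; this body is illustrative.)
    return f"{model_path}: {text}"

class ModelManager:
    """Keeps configuration, but delegates GPU work to the module-level function."""
    def __init__(self, model_path):
        self.model_path = model_path

    def generate_text(self, text):
        # Plain method, no decorator: just forwards to the decorated function.
        return generate_text_local(self.model_path, text)
```

The class keeps its convenient interface while the GPU-decorated callable stays a plain module-level function.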
3. Use direct pipeline creation instead of loading the model and tokenizer separately:

**Recommended:**
```python
tokenizer = AutoTokenizer.from_pretrained(model_path)
# ...
)
```
4. Use the synchronous `InferenceClient` instead of `AsyncInferenceClient` for API calls:

**Recommended:**
```python
from huggingface_hub import InferenceClient
client = InferenceClient(model_id)
response = client.text_generation(text)  # Synchronous call
```
5. Implement appropriate error handling to gracefully recover from GPU task aborts:

```python
try: