# PULSE-7B Handler Deployment Guide
## 🚀 Deployment Guide
### Requirements
- Python 3.8+
- CUDA 11.8+ (for GPU use)
- Minimum 16GB RAM (CPU), 8GB VRAM (GPU)
### Installation
1. **Install the dependencies:**
```bash
pip install -r requirements.txt
```
2. **PULSE LLaVA Installation (critical for PULSE-7B):**
```bash
# PULSE-7B requires PULSE's own LLaVA implementation:
pip install git+https://github.com/AIMedLab/PULSE.git#subdirectory=LLaVA
# This automatically installs transformers==4.37.2
```
3. **Flash Attention (optional, for performance):**
```bash
pip install flash-attn --no-build-isolation
```
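After the three install steps, it can help to confirm what actually ended up in the environment. The sketch below uses only the standard library. The distribution names `llava` and `flash-attn` are assumptions about how those packages register themselves, and the expected `transformers` version is the one quoted in the install comment above.

```python
from importlib import metadata
from typing import Optional

# Version pin taken from the install note above; adjust if your setup differs.
EXPECTED_TRANSFORMERS = "4.37.2"


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def report(packages=("transformers", "llava", "flash-attn")) -> dict:
    """Map each distribution name to its installed version (None if absent).

    The names "llava" and "flash-attn" are assumed distribution names.
    """
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    versions = report()
    for name, version in versions.items():
        print(f"{name}: {version or 'not installed'}")
    if versions["transformers"] != EXPECTED_TRANSFORMERS:
        print(f"warning: expected transformers=={EXPECTED_TRANSFORMERS}")
```

A missing `flash-attn` entry is not an error, since that step is optional.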
### HuggingFace Inference Deployment
#### 1. Model Repository Structure
```
your-model-repo/
├── handler.py
├── config.json
├── generation_config.json
├── requirements.txt
├── model.safetensors.index.json
├── tokenizer_config.json
├── special_tokens_map.json
└── tokenizer.model
```
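The `handler.py` in this layout is the entry point the endpoint calls. As a rough sketch of its shape, assuming the `EndpointHandler` class convention used by Hugging Face Inference Endpoints custom handlers: the class below only parses the request format used in this guide's test requests and returns a stub. It is not the actual PULSE-7B handler; `parse_request`, `generate`, and `DEFAULT_PARAMETERS` are hypothetical names.

```python
from typing import Any, Dict, List

# Hypothetical defaults, mirroring the values used in this guide's test requests.
DEFAULT_PARAMETERS = {"temperature": 0.2, "max_new_tokens": 512}


class EndpointHandler:
    """Skeleton of a custom handler; model loading and generation are stubbed."""

    def __init__(self, path: str = "") -> None:
        # `path` is the local directory holding the repository files.
        # A real handler would load the model and tokenizer from it here.
        self.model_path = path

    def parse_request(self, data: Dict[str, Any]) -> Dict[str, Any]:
        """Extract query, image, and generation parameters from the payload."""
        inputs = data.get("inputs", {})
        if not isinstance(inputs, dict):
            raise ValueError("'inputs' must be an object with 'query' and 'image'")
        query = inputs.get("query", "")
        if not query:
            raise ValueError("'inputs.query' is required")
        # Caller-supplied parameters override the defaults.
        parameters = {**DEFAULT_PARAMETERS, **(data.get("parameters") or {})}
        return {"query": query, "image": inputs.get("image"), "parameters": parameters}

    def generate(self, request: Dict[str, Any]) -> str:
        # Placeholder: a real handler would run PULSE-7B here.
        return f"[stub] would answer: {request['query']}"

    def __call__(self, data: Dict[str, Any]) -> List[Dict[str, Any]]:
        request = self.parse_request(data)
        return [{"generated_text": self.generate(request)}]
```

Keeping request parsing separate from generation makes the input handling testable without loading the model.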
#### 2. Creating the Endpoint
```bash
# Deploy with the Hugging Face CLI
huggingface-cli login
# A custom handler.py is served from a model repository, not a Space
huggingface-cli repo create your-pulse-endpoint --type=model
```
#### 3. Test Requests
**Test with an image URL:**
```bash
curl -X POST "YOUR_ENDPOINT_URL" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": {
      "query": "Analyze this ECG image",
      "image": "https://i.imgur.com/7uuejqO.jpeg"
    },
    "parameters": {
      "temperature": 0.2,
      "max_new_tokens": 512
    }
  }'
```
**Test with Base64:**
```bash
curl -X POST "YOUR_ENDPOINT_URL" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": {
      "query": "What do you see in this ECG?",
      "image": "data:image/jpeg;base64,/9j/4AAQ..."
    },
    "parameters": {
      "temperature": 0.2
    }
  }'
```
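Writing the Base64 request body by hand gets awkward for real images. The following sketch builds the same JSON shape from raw image bytes using only the standard library. The helper names `image_to_data_uri` and `build_payload` are illustrative and not part of the handler.

```python
import base64
import json


def image_to_data_uri(image_bytes: bytes, mime_type: str = "image/jpeg") -> str:
    """Encode raw image bytes as a data URI like the one in the Base64 request."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_payload(query: str, image: str, **parameters) -> str:
    """Serialize a request body; `image` is a URL or a data URI."""
    body = {"inputs": {"query": query, "image": image}}
    if parameters:
        body["parameters"] = parameters
    return json.dumps(body)


if __name__ == "__main__":
    # The first three bytes of a JPEG file stand in for a real image here.
    uri = image_to_data_uri(b"\xff\xd8\xff")
    print(build_payload("What do you see in this ECG?", uri, temperature=0.2))
```

The printed string can be saved to a file and sent with `curl -d @payload.json`, or posted with any HTTP client.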
### Performance Optimizations
#### GPU Memory Optimization
- Use `torch_dtype=torch.bfloat16`
- Set `low_cpu_mem_usage=True`
- Use `device_map="auto"` for automatic device placement
#### CPU Optimization
- Use `torch_dtype=torch.float32`
- Set the thread count: `torch.set_num_threads(4)`
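The GPU and CPU settings above can be chosen in one place at load time. The sketch below is one way to do that; `select_load_kwargs` and its `num_threads` key are hypothetical, and dtypes are kept as names so the selection logic runs even where `torch` is not installed.

```python
from typing import Any, Dict


def select_load_kwargs(has_cuda: bool, cpu_threads: int = 4) -> Dict[str, Any]:
    """Pick loading options for GPU or CPU, following the settings above.

    Dtypes are returned as names; resolve them with getattr(torch, name).
    "num_threads" is a helper key for torch.set_num_threads, not a loader option.
    """
    if has_cuda:
        return {
            "torch_dtype": "bfloat16",
            "low_cpu_mem_usage": True,
            "device_map": "auto",
        }
    return {"torch_dtype": "float32", "num_threads": cpu_threads}


if __name__ == "__main__":
    try:
        import torch  # optional here; only needed to apply the settings
    except ImportError:
        torch = None

    has_cuda = bool(torch and torch.cuda.is_available())
    kwargs = select_load_kwargs(has_cuda)
    num_threads = kwargs.pop("num_threads", None)
    if torch is not None:
        kwargs["torch_dtype"] = getattr(torch, kwargs["torch_dtype"])
        if num_threads is not None:
            torch.set_num_threads(num_threads)
    print(kwargs)  # pass these to the model's from_pretrained(...) call
```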
### Monitoring and Debugging
#### Log Levels
```python
import logging
logging.basicConfig(level=logging.INFO)
```
#### Memory Usage
```python
import torch
print(f"GPU Memory: {torch.cuda.memory_allocated()/1024**3:.2f}GB")
```
### Troubleshooting
#### Common Issues:
1. **"llava_llama architecture not recognized" Error**
```bash
# PULSE-7B Solution: Install PULSE's LLaVA implementation
pip install git+https://github.com/AIMedLab/PULSE.git#subdirectory=LLaVA
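# Also install the development version of transformers
# (this replaces the transformers==4.37.2 pin installed by the PULSE LLaVA package)
pip install git+https://github.com/huggingface/transformers.git
# Or add both lines to requirements.txt (file entries, not shell commands):
#   git+https://github.com/huggingface/transformers.git
#   git+https://github.com/AIMedLab/PULSE.git#subdirectory=LLaVA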
```
2. **CUDA Out of Memory**
- Reduce the batch size
- Lower the `max_new_tokens` value
- Use gradient checkpointing (this helps only when fine-tuning; it does not reduce inference memory)
3. **Slow Image Processing**
- Increase the image timeout
- Set an image resize threshold
4. **Model Loading Issues**
- Check your Hugging Face token
- Verify the network connection
- Clear the cache directory
- Check the transformers version
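For the slow image processing item above, a resize threshold amounts to capping the longer side of the image while keeping its aspect ratio. A minimal sketch of that calculation follows; the function name `fit_within` and the 1024-pixel default are assumptions, not values taken from the handler.

```python
from typing import Tuple


def fit_within(width: int, height: int, max_side: int = 1024) -> Tuple[int, int]:
    """Scale (width, height) down so the longer side is at most `max_side`.

    Images already within the limit are returned unchanged.
    """
    if width <= 0 or height <= 0 or max_side <= 0:
        raise ValueError("dimensions must be positive")
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    # Round to the nearest pixel, but never collapse a side to zero.
    return max(1, round(width * scale)), max(1, round(height * scale))


if __name__ == "__main__":
    print(fit_within(4000, 3000))  # large image gets scaled down
    print(fit_within(800, 600))    # small image is left alone
```

With Pillow, the result can be passed straight to `image.resize(fit_within(*image.size))`.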
### Security Best Practices
- Validate image URLs
- Set Base64 size limits
- Apply rate limiting
- Sanitize inputs
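The first two practices above can be sketched with the standard library alone. The limits below (`https` only, 10 MiB) and the function names are hypothetical; this is a starting point under those assumptions, not the handler's own validation.

```python
import base64
import binascii
from urllib.parse import urlparse

# Hypothetical limits; tune them for your deployment.
ALLOWED_SCHEMES = {"https"}
MAX_IMAGE_BYTES = 10 * 1024 * 1024  # 10 MiB after decoding


def validate_image_url(url: str) -> str:
    """Accept only absolute URLs with an allowed scheme and a host."""
    parsed = urlparse(url)
    if parsed.scheme not in ALLOWED_SCHEMES or not parsed.hostname:
        raise ValueError("image URL must be an absolute https URL")
    return url


def decode_base64_image(data: str, max_bytes: int = MAX_IMAGE_BYTES) -> bytes:
    """Decode a raw or data-URI Base64 string, enforcing a size limit."""
    if data.startswith("data:"):
        header, sep, data = data.partition(",")
        if not sep or not header.endswith(";base64"):
            raise ValueError("malformed data URI")
    # Base64 expands data by 4/3, so reject oversized input before decoding.
    if len(data) > (max_bytes * 4) // 3 + 4:
        raise ValueError("image exceeds size limit")
    try:
        raw = base64.b64decode(data, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise ValueError("invalid Base64 data") from exc
    if len(raw) > max_bytes:
        raise ValueError("image exceeds size limit")
    return raw
```

Scheme and size checks do not cover everything: blocking requests to private or internal addresses and rate limiting need to be handled separately.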
### Monitoring Metrics
- Response time
- Memory usage
- Error rates
- Image processing success rate
- Token generation speed
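Several of these metrics can be tracked with a small in-process recorder before reaching for a full monitoring stack. The `RequestMetrics` class below is a hypothetical helper. It covers response time, error rate, and an approximate token speed; image processing success rate and memory usage are left out.

```python
import time
from dataclasses import dataclass, field
from typing import List


@dataclass
class RequestMetrics:
    """In-memory counters for some of the metrics listed above."""

    latencies: List[float] = field(default_factory=list)
    errors: int = 0
    tokens: int = 0
    generation_seconds: float = 0.0

    def record(self, seconds: float, new_tokens: int = 0, ok: bool = True) -> None:
        """Record one request; failed requests count toward the error rate."""
        self.latencies.append(seconds)
        if ok:
            self.tokens += new_tokens
            self.generation_seconds += seconds
        else:
            self.errors += 1

    @property
    def error_rate(self) -> float:
        return self.errors / len(self.latencies) if self.latencies else 0.0

    @property
    def mean_latency(self) -> float:
        return sum(self.latencies) / len(self.latencies) if self.latencies else 0.0

    @property
    def tokens_per_second(self) -> float:
        # Approximation: uses whole-request time, which includes image handling.
        if self.generation_seconds == 0:
            return 0.0
        return self.tokens / self.generation_seconds


if __name__ == "__main__":
    metrics = RequestMetrics()
    start = time.perf_counter()
    time.sleep(0.01)  # stand-in for a model call
    metrics.record(time.perf_counter() - start, new_tokens=32)
    print(f"mean latency: {metrics.mean_latency:.3f}s, "
          f"error rate: {metrics.error_rate:.0%}")
```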