# 🚀 Local GPU Setup Guide (RTX 4050 Edition)
This cheat sheet guides you through running the SEO Analyzer on your **NVIDIA RTX 4050** for lightning-fast inference.
## ✅ Prerequisites
1. **NVIDIA Drivers**: Ensure your GPU drivers are up to date (for example via GeForce Experience). Running `nvidia-smi` in a terminal should list the RTX 4050.
2. **Python 3.10 or 3.11**: Installed and added to PATH.
3. **Git**: To clone the repository.
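
If you prefer to check the three prerequisites in one go, a small script along these lines can help. It is only a sketch: the function name `check_prerequisites` is made up for illustration, and it assumes that finding `nvidia-smi` on PATH means a driver is installed (it does not tell you whether the driver is up to date).

```python
import shutil
import sys


def check_prerequisites() -> dict:
    """Report whether each prerequisite from the list above looks satisfied."""
    return {
        # The guide targets Python 3.10 or 3.11.
        "python_3_10_or_3_11": sys.version_info[:2] in ((3, 10), (3, 11)),
        # Assumption: nvidia-smi on PATH suggests an NVIDIA driver is installed.
        "nvidia_driver": shutil.which("nvidia-smi") is not None,
        "git": shutil.which("git") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'OK' if ok else 'MISSING'}")
```

Save it under any name (for example `check_prereqs.py`, a hypothetical filename) and run it with `python check_prereqs.py`.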
---
## 🛠️ Step 1: Clone & Setup
Open your terminal (PowerShell or CMD) and run:
```powershell
# 1. Clone the repository (if you haven't already)
git clone https://huggingface.co/spaces/ihtesham0345/key_word_Fast_API
cd key_word_Fast_API
# 2. Create a virtual environment (Recommended)
python -m venv venv
.\venv\Scripts\activate
```
---
## ⚡ Step 2: Install GPU-Enabled PyTorch (Crucial!)
On Windows, a plain `pip install torch` typically installs the CPU-only build. Install the CUDA build explicitly:
```powershell
# Uninstall any existing CPU version
pip uninstall torch torchvision torchaudio -y
# Install PyTorch with CUDA 12.1 support
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
```
*Verify installation:*
```powershell
python -c "import torch; ok = torch.cuda.is_available(); print('CUDA Available:', ok); print('Device:', torch.cuda.get_device_name(0) if ok else 'CPU')"
```
*It should say `CUDA Available: True` and `Device: NVIDIA GeForce RTX 4050 Laptop GPU`.*
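
If the one-liner reports `False`, it helps to know whether the CPU-only build of PyTorch is still installed. The sketch below collects the same information plus the CUDA version the build was compiled against. The function name `cuda_report` is illustrative, not part of the app.

```python
def cuda_report() -> dict:
    """Collect the facts the one-line check prints, plus build details."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False, "cuda_available": False, "device": "CPU"}

    available = torch.cuda.is_available()
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # None on CPU-only builds; a version string on CUDA builds.
        "cuda_build": torch.version.cuda,
        "cuda_available": available,
        "device": torch.cuda.get_device_name(0) if available else "CPU",
    }


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

A `cuda_build` of `None` means the CPU-only wheel is installed, in which case repeating the uninstall and install commands above is the likely fix.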
---
## 📦 Step 3: Install Other Dependencies
Now install the rest of the app requirements.
```powershell
pip install -r requirements.txt
```
---
## 🚀 Step 4: Run the Server
Launch the API. It will automatically detect your GPU. By default, uvicorn listens on `http://127.0.0.1:8000`.
```powershell
python -m uvicorn main:app --reload
```
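
The contents of `services/analyzer.py` are not reproduced in this guide, so the following is only a sketch of how automatic GPU detection is commonly written, not the app's actual code. The names `select_device` and `DEVICE` are hypothetical.

```python
def select_device() -> str:
    """Return "cuda:0" when a CUDA GPU is usable, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"
    return "cuda:0" if torch.cuda.is_available() else "cpu"


# Hypothetical module-level constant, resolved once at startup.
DEVICE = select_device()
print(f"Loading model on: {DEVICE}")
```

A string like this can be passed as the `device` argument of a Transformers `pipeline(...)` call, so the same code runs on machines with and without a GPU.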
---
## 🧠 Model Recommendations for RTX 4050 (6GB)
With 6 GB of VRAM, your card comfortably fits small models and can run mid-sized models when they are quantized.
### Option A: Ultra Speed (Current)
* **Model**: `Qwen/Qwen2.5-0.5B-Instruct`
* **Speed**: Fastest of the three options
* **VRAM**: ~1 GB
### Option B: The "Goldilocks" (Recommended)
Upgrade to the 1.5B model for smarter results.
1. Open `services/analyzer.py`
2. Change line 14:
```python
MODEL_ID = "Qwen/Qwen2.5-1.5B-Instruct"
```
3. Save the file. With `--reload` active, the server restarts and auto-downloads the model (about 3 GB).
### Option C: Max Intelligence (Quantized)
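Run the 7B model with 4-bit quantization for the strongest output quality of the three options (arguably smarter than GPT-3.5).
1. Install bitsandbytes, plus `accelerate`, which Transformers needs for quantized loading: `pip install bitsandbytes accelerate`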
2. Update `services/analyzer.py`:
```python
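from transformers import BitsAndBytesConfig, pipeline

MODEL_ID = "Qwen/Qwen2.5-7B-Instruct"

# Update pipeline config: keep your existing arguments in place of `...`
pipe = pipeline(
    ...,
    # Request 4-bit weights through BitsAndBytesConfig (requires bitsandbytes)
    model_kwargs={"quantization_config": BitsAndBytesConfig(load_in_4bit=True)},
)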
```
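
The sizes quoted for the three options follow roughly from parameter count times bytes per weight. The sketch below works through that arithmetic. It uses nominal parameter counts (0.5, 1.5, and 7 billion) and assumes options A and B load in 16-bit precision, which this guide does not state explicitly.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


# (label, nominal parameter count in billions, bits per weight)
OPTIONS = [
    ("A: Qwen2.5-0.5B, 16-bit (assumed)", 0.5, 16),
    ("B: Qwen2.5-1.5B, 16-bit (assumed)", 1.5, 16),
    ("C: Qwen2.5-7B, 4-bit", 7.0, 4),
]

for label, params, bits in OPTIONS:
    print(f"{label}: ~{weight_memory_gb(params, bits):.1f} GB")
```

This gives about 1 GB, 3 GB, and 3.5 GB for the weights. Activations, the KV cache, and the CUDA context need additional memory on top, so option C leaves the least headroom on a 6 GB card.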
---
## ❓ Troubleshooting
- **Out of Memory (OOM)**: If you get a CUDA OOM error, close other apps (Chrome uses GPU!) or switch to a smaller model.
- **Slow Speed**: Ensure your laptop is plugged in and in "Performance Mode".
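
When chasing an OOM error, it can help to see how much VRAM is actually free before the model loads. A minimal sketch using `torch.cuda.mem_get_info()` follows. The helper name `free_vram_gb` is illustrative.

```python
def free_vram_gb():
    """Return (free, total) GPU memory in GB, or None when CUDA is unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    # mem_get_info reports bytes for the current CUDA device.
    free_bytes, total_bytes = torch.cuda.mem_get_info()
    return free_bytes / 1e9, total_bytes / 1e9


vram = free_vram_gb()
if vram is None:
    print("CUDA is not available; nothing to measure.")
else:
    print(f"Free VRAM: {vram[0]:.1f} GB of {vram[1]:.1f} GB")
```

Running it before and after closing other GPU-using apps shows how much memory they were holding.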