Spaces:

ihtesham0345
/

key_word_Fast_API

Sleeping

App Files Files Community

key_word_Fast_API / GPU_SETUP_GUIDE.md

ihtesham0345

Add GPU Setup Cheat Sheet for RTX 4050

30aa031 5 months ago

preview code

Raw

History Blame Contribute Delete

2.79 kB

	# 🚀 Local GPU Setup Guide (RTX 4050 Edition)

	This cheat sheet guides you through running the SEO Analyzer on your NVIDIA RTX 4050 for lightning-fast inference.

	## ✅ Prerequisites
	1. NVIDIA Drivers: Ensure your GeForce Experience drivers are up to date.
	2. Python 3.10 or 3.11: Installed and added to PATH.
	3. Git: To clone the repository.

	---

	## 🛠️ Step 1: Clone & Setup
	Open your terminal (PowerShell or CMD) and run:

	```powershell
	# 1. Clone the repository (if you haven't already)
	git clone https://huggingface.co/spaces/ihtesham0345/key_word_Fast_API
	cd key_word_Fast_API

	# 2. Create a virtual environment (Recommended)
	python -m venv venv
	.\venv\Scripts\activate
	```

	---

	## ⚡ Step 2: Install GPU-Enabled PyTorch (Crucial!)
	By default, `pip install torch` might install the CPU version. We need the CUDA version.

	```powershell
	# Uninstall any existing CPU version
	pip uninstall torch torchvision torchaudio -y

	# Install PyTorch with CUDA 12.1 support
	pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
	```

	Verify installation:
	```powershell
	python -c "import torch; print(f'CUDA Available: {torch.cuda.is_available()}'); print(f'Device: {torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'CPU'}')"
	```
	It should say `CUDA Available: True` and `Device: NVIDIA GeForce RTX 4050 Laptop GPU`.

	---

	## 📦 Step 3: Install Other Dependencies
	Now install the rest of the app requirements.

	```powershell
	pip install -r requirements.txt
	```

	---

	## 🚀 Step 4: Run the Server
	Launch the API. It will automatically detect your GPU.

	```powershell
	python -m uvicorn main:app --reload
	```

	---

	## 🧠 Model Recommendations for RTX 4050 (6GB)
	Your card fits small to medium models perfectly.

	### Option A: Ultra Speed (Current)
	* Model: `Qwen/Qwen2.5-0.5B-Instruct`
	* Speed: Instant
	* VRAM: ~1 GB

	### Option B: The "Goldilocks" (Recommended)
	Upgrade to the 1.5B model for smarter results.
	1. Open `services/analyzer.py`
	2. Change line 14:
	```python
	MODEL_ID = "Qwen/Qwen2.5-1.5B-Instruct"
	```
	3. Save and the server will auto-download it (3GB).

	### Option C: Max Intelligence (Quantized)
	Run the 7B model using 4-bit quantization (Smarter than GPT-3.5).
	1. Install bitsandbytes: `pip install bitsandbytes`
	2. Update `services/analyzer.py`:
	```python
	MODEL_ID = "Qwen/Qwen2.5-7B-Instruct"
	# Update pipeline config
	pipe = pipeline(
	...,
	model_kwargs={"load_in_4bit": True}
	)
	```

	---

	## ❓ Troubleshooting
	- Out of Memory (OOM): If you get a CUDA OOM error, close other apps (Chrome uses GPU!) or switch to a smaller model.
	- Slow Speed: Ensure your laptop is plugged in and in "Performance Mode".