Spaces:
Sleeping
Sleeping
| # π Local GPU Setup Guide (RTX 4050 Edition) | |
| This cheat sheet guides you through running the SEO Analyzer on your **NVIDIA RTX 4050** for lightning-fast inference. | |
| ## β Prerequisites | |
| 1. **NVIDIA Drivers**: Ensure your GeForce Experience drivers are up to date. | |
| 2. **Python 3.10 or 3.11**: Installed and added to PATH. | |
| 3. **Git**: To clone the repository. | |
| --- | |
| ## π οΈ Step 1: Clone & Setup | |
| Open your terminal (PowerShell or CMD) and run: | |
| ```powershell | |
| # 1. Clone the repository (if you haven't already) | |
| git clone https://huggingface.co/spaces/ihtesham0345/key_word_Fast_API | |
| cd key_word_Fast_API | |
| # 2. Create a virtual environment (Recommended) | |
| python -m venv venv | |
| .\venv\Scripts\activate | |
| ``` | |
| --- | |
| ## β‘ Step 2: Install GPU-Enabled PyTorch (Crucial!) | |
| By default, `pip install torch` might install the CPU version. We need the CUDA version. | |
| ```powershell | |
| # Uninstall any existing CPU version | |
| pip uninstall torch torchvision torchaudio -y | |
| # Install PyTorch with CUDA 12.1 support | |
| pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121 | |
| ``` | |
| *Verify installation:* | |
| ```powershell | |
| python -c "import torch; print(f'CUDA Available: {torch.cuda.is_available()}'); print(f'Device: {torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'CPU'}')" | |
| ``` | |
| *It should say `CUDA Available: True` and `Device: NVIDIA GeForce RTX 4050 Laptop GPU`.* | |
| --- | |
| ## π¦ Step 3: Install Other Dependencies | |
| Now install the rest of the app requirements. | |
| ```powershell | |
| pip install -r requirements.txt | |
| ``` | |
| --- | |
| ## π Step 4: Run the Server | |
| Launch the API. It will automatically detect your GPU. | |
| ```powershell | |
| python -m uvicorn main:app --reload | |
| ``` | |
| --- | |
| ## π§ Model Recommendations for RTX 4050 (6GB) | |
| Your card fits small to medium models perfectly. | |
| ### Option A: Ultra Speed (Current) | |
| * **Model**: `Qwen/Qwen2.5-0.5B-Instruct` | |
| * **Speed**: Instant | |
| * **VRAM**: ~1 GB | |
| ### Option B: The "Goldilocks" (Recommended) | |
| Upgrade to the 1.5B model for smarter results. | |
| 1. Open `services/analyzer.py` | |
| 2. Change line 14: | |
| ```python | |
| MODEL_ID = "Qwen/Qwen2.5-1.5B-Instruct" | |
| ``` | |
| 3. Save and the server will auto-download it (3GB). | |
| ### Option C: Max Intelligence (Quantized) | |
| Run the 7B model using 4-bit quantization (Smarter than GPT-3.5). | |
| 1. Install bitsandbytes: `pip install bitsandbytes` | |
| 2. Update `services/analyzer.py`: | |
| ```python | |
| MODEL_ID = "Qwen/Qwen2.5-7B-Instruct" | |
| # Update pipeline config | |
| pipe = pipeline( | |
| ..., | |
| model_kwargs={"load_in_4bit": True} | |
| ) | |
| ``` | |
| --- | |
| ## β Troubleshooting | |
| - **Out of Memory (OOM)**: If you get a CUDA OOM error, close other apps (Chrome uses GPU!) or switch to a smaller model. | |
| - **Slow Speed**: Ensure your laptop is plugged in and in "Performance Mode". | |