# Quick Start: Running Qwen-2.5-32B Locally

This is a quick guide to get you started with free local LLM inference on your A100 GPU.

## Why Local?

- ✅ **$0 cost** - No API fees
- ✅ **Privacy** - Data stays on your machine
- ✅ **Quality** - 32B parameter model with strong performance

## Setup (One-time)

### 1. Pull the Model (~10-30 minutes)

```bash
# Pull Qwen-2.5-32B-Instruct
ollama pull qwen2.5:32b-instruct

# Wait for download to complete (~20GB)
# Model will be cached at: ~/.ollama/models/
```

### 2. Verify Model is Ready

```bash
# List installed models
ollama list

# Should show: qwen2.5:32b-instruct

# Test it
ollama run qwen2.5:32b-instruct "Hello, who are you?"
```

If you see a response, you're ready! ✅

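The same check can be scripted before a long run. The sketch below is a minimal example, assuming the Ollama server is listening on its default local address and exposes its installed models through the `/api/tags` endpoint; the helper names (`installed_models`, `has_model`, `fetch_tags`) are hypothetical and not part of the notebook.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumption: Ollama's default local address


def installed_models(tags_payload):
    """Return the model names from an /api/tags response body."""
    return [m.get("name", "") for m in tags_payload.get("models", [])]


def has_model(tags_payload, wanted):
    """True if `wanted` appears among the installed model names."""
    return wanted in installed_models(tags_payload)


def fetch_tags(base_url=OLLAMA_URL, timeout=5):
    """Ask the running Ollama server which models it has installed."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return json.load(resp)
```

With the server running, `has_model(fetch_tags(), "qwen2.5:32b-instruct")` should return `True` once the pull has finished. A connection error from `fetch_tags()` usually means the server is not running.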
## Running the Notebook

### Open the Notebook

```bash
cd jupyter_notebooks
jupyter notebook Section_2-3-4_Figure_8_deepfake_adapters.ipynb
```

### Run the Cells

1. **Cell 5**: NER & Name Cleaning (processes names)
2. **Cell 7**: Country/Nationality Mapping
3. **Cell 20**: Qwen-2.5-32B Local Annotation - **This is the new one!**

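### Configure Cell 20

Use the first pair of settings for a trial run, then replace them with the second pair for the full dataset:

```python
# First run: test mode on a small sample
TEST_MODE = True
TEST_SIZE = 10

# Once the test output looks right: full dataset
TEST_MODE = False
MAX_ROWS = 20000  # or None for all rows
```
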
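How these three settings interact is not spelled out here, so the following is only a sketch of one plausible reading: test mode takes the first `TEST_SIZE` rows, and a full run takes the first `MAX_ROWS` rows, or everything when `MAX_ROWS` is `None`. The `select_rows` helper is hypothetical; the notebook's actual logic may differ.

```python
def select_rows(rows, test_mode, test_size=10, max_rows=None):
    """Pick which rows to annotate for a test run or a full run.

    Assumed semantics (not confirmed by the notebook):
    - test_mode=True   -> first `test_size` rows
    - max_rows is None -> all rows
    - otherwise        -> first `max_rows` rows
    """
    if test_mode:
        return rows[:test_size]
    if max_rows is None:
        return rows
    return rows[:max_rows]


# Example: a test run over a 25,000-row dataset touches only 10 rows.
all_rows = list(range(25_000))
print(len(select_rows(all_rows, test_mode=True, test_size=10)))      # 10
print(len(select_rows(all_rows, test_mode=False, max_rows=20_000)))  # 20000
```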
### Run Cell 20

Just click "Run" or press Shift+Enter. The cell will:
1. Check if Ollama is installed ✅
2. Check if the model is available ✅
3. Start annotating
4. Save progress every 10 rows
5. Show completion stats

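The "save progress every 10 rows" step is what makes an interrupted overnight run recoverable. A minimal sketch of that pattern follows, using only the standard library. The function names and the `annotate` callable are hypothetical stand-ins for whatever Cell 20 actually does; only the `profession_llm` column name is taken from this guide.

```python
import csv


def _write_csv(records, path):
    """Rewrite the whole checkpoint file with the records gathered so far."""
    if not records:
        return
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(records[0].keys()))
        writer.writeheader()
        writer.writerows(records)


def annotate_with_checkpoints(rows, annotate, out_path, save_every=10):
    """Annotate rows one at a time, saving every `save_every` rows and at the end.

    `rows` is a list of dicts; `annotate` is any callable mapping a row to a label
    (in the notebook this would be the call to the local model).
    """
    done = []
    for i, row in enumerate(rows, start=1):
        done.append({**row, "profession_llm": annotate(row)})
        if i % save_every == 0 or i == len(rows):
            _write_csv(done, out_path)
    return done
```

Rewriting the full file at each checkpoint is simple and adequate at these throughputs. Appending only the new rows would be cheaper for very large runs.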
### Monitor Progress

```
Qwen Local: 100%|██████████| 10/10 [02:30<00:00, 15.0s/it]
✅ Saved after 10 rows (~24.0 samples/hour)

✅ Done! Results: data/CSV/qwen_local_annotated_POI_test.csv
Total time: 2.5 minutes
Average speed: 240.0 samples/hour
```

## Performance

On your A100 80GB:
- **Speed**: ~5-10 tokens/second
- **Throughput**: ~100-200 samples/hour
- **Memory**: ~22-25GB VRAM
- **Cost**: $0

### Time Estimates

| Dataset Size | Time |
|-------------|------|
| 10 samples (test) | ~2-3 minutes |
| 100 samples | ~20-30 minutes |
| 1,000 samples | ~5-10 hours |
| 10,000 samples | ~50-100 hours |

The two smallest rows follow the ~15 s/sample pace of the test run; the larger rows assume the more conservative ~100-200 samples/hour listed above.

**Tip**: Run overnight or over the weekend for large datasets!

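These estimates are plain arithmetic on the throughput figures, so they are easy to redo for your own dataset size. A small sketch with hypothetical helper names:

```python
def samples_per_hour(seconds_per_sample):
    """Convert a per-sample latency into hourly throughput."""
    return 3600.0 / seconds_per_sample


def estimated_hours(n_samples, rate_per_hour):
    """Hours needed to process `n_samples` at a given hourly throughput."""
    return n_samples / rate_per_hour


# The test run's 15 s/sample corresponds to 240 samples/hour.
print(samples_per_hour(15.0))

# Range for 1,000 samples at the 100-200 samples/hour quoted in this guide.
slow, fast = estimated_hours(1_000, 100), estimated_hours(1_000, 200)
print(f"{fast:.0f}-{slow:.0f} hours")
```

Measuring `seconds_per_sample` on your own test run gives a better estimate than the generic range.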
## Troubleshooting

### "Model not found"

The model has not been downloaded yet (or the pull was interrupted). Pull it again:

```bash
ollama pull qwen2.5:32b-instruct
```

### "Ollama not running"

Start the Ollama server in a separate terminal and leave it running:

```bash
ollama serve
```

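### Out of Memory

Your A100 has 80GB of VRAM, so this should not happen with the 32B model (~25GB VRAM).

If it does, you can try the `q4_0` tag. Do not expect a large saving: the ~20GB default download is already a 4-bit-class quantization (FP16 weights for 32B parameters would be about 64GB), and 4-bit weights for a 32B model cannot fit in ~12GB. For a bigger reduction, check the model's tag list in the Ollama library for lower-bit variants.
```bash
ollama pull qwen2.5:32b-instruct-q4_0  # 4-bit weights; similar footprint to the default tag
```
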
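As a rough sanity check on memory figures like these, the size of the weights alone follows from the parameter count and the bits stored per parameter. The sketch below uses a hypothetical `weight_gb` helper and counts weights only. Actual VRAM use is higher because of the KV cache and runtime overhead, and real quantization formats store slightly more than their nominal bit width.

```python
def weight_gb(params_billion, bits_per_param):
    """Approximate size of the model weights in GB (weights only, no KV cache)."""
    return params_billion * bits_per_param / 8


for label, bits in [("FP16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"32B @ {label}: ~{weight_gb(32, bits):.0f} GB")
```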
## Output

Results saved to:
- Test: `data/CSV/qwen_local_annotated_POI_test.csv`
- Full: `data/CSV/qwen_local_annotated_POI.csv`

Same format as API results - easy to compare!

## Custom Model Cache Location

To store models in `data/models/`:

```bash
# OLLAMA_MODELS is read by the Ollama server, so set it before starting `ollama serve`
export OLLAMA_MODELS="/home/lauhp/000_PHD/000_010_PUBLICATION/CODE/pm-paper/data/models"
ollama pull qwen2.5:32b-instruct
```

## Comparing API vs Local

After running both:

```python
import pandas as pd

qwen_api = pd.read_csv('data/CSV/qwen_annotated_POI_test.csv')
qwen_local = pd.read_csv('data/CSV/qwen_local_annotated_POI_test.csv')

# Check agreement (assumes both files hold the same rows in the same order;
# rows where either label is missing count as disagreements)
agreement = (qwen_api['profession_llm'] == qwen_local['profession_llm']).mean()
print(f"Agreement: {agreement*100:.1f}%")
```

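If the two files might differ in length or row order, a positional comparison can mislead or fail outright. A more defensive sketch matches rows on a shared key column before comparing labels. It uses only the standard library. The key column is an assumption (this guide does not name one), so pass whichever column uniquely identifies a row in your CSVs.

```python
import csv


def load_labels(path, key_col, label_col="profession_llm"):
    """Map each row's key to its normalized label (lowercased, stripped)."""
    with open(path, newline="", encoding="utf-8") as f:
        return {
            row[key_col]: (row.get(label_col) or "").strip().lower()
            for row in csv.DictReader(f)
        }


def agreement(labels_a, labels_b):
    """Return (fraction of matching labels, number of keys present in both)."""
    shared = labels_a.keys() & labels_b.keys()
    if not shared:
        return 0.0, 0
    matches = sum(labels_a[k] == labels_b[k] for k in shared)
    return matches / len(shared), len(shared)
```

Reporting the number of shared keys alongside the rate makes it obvious when the two runs covered different subsets of the data.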
## Full Documentation

For more details, see:
- `QWEN_LOCAL_SETUP.md` - Complete setup guide
- `LLM_MODELS_COMPARISON.md` - All 6 LLM options compared

## Summary

- ✅ Ollama already installed
- ✅ A100 80GB GPU - perfect for Qwen-2.5-32B
- ✅ FREE inference - no API costs
- ✅ Privacy - data stays local

**Next step**: Run Cell 20 in the notebook!