# Quick Start: Running Qwen-2.5-32B Locally

This is a quick guide to get you started with FREE local LLM inference using your A100 GPU.

## Why Local?

✅ **$0 cost** - No API fees
✅ **Privacy** - Data stays on your machine
✅ **Quality** - 32B-parameter model with strong performance

## Setup (One-time)

### 1. Pull the Model (~10-30 minutes)

```bash
# Pull Qwen-2.5-32B-Instruct
ollama pull qwen2.5:32b-instruct

# Wait for the download to complete (~20 GB)
# The model will be cached at: ~/.ollama/models/
```

### 2. Verify the Model Is Ready

```bash
# List installed models
ollama list
# Should show: qwen2.5:32b-instruct

# Test it
ollama run qwen2.5:32b-instruct "Hello, who are you?"
```

If you see a response, you're ready! ✅
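The same readiness check can be scripted. The sketch below queries Ollama's local REST API (assuming the default `GET /api/tags` endpoint on port 11434, which lists pulled models); the helper name is illustrative and not part of the notebook:

```python
import json
import urllib.error
import urllib.request

def ollama_has_model(model: str, host: str = "http://localhost:11434") -> bool:
    """Return True if the Ollama server is reachable and `model` is pulled."""
    try:
        with urllib.request.urlopen(f"{host}/api/tags", timeout=5) as resp:
            tags = json.load(resp)
    except (urllib.error.URLError, OSError):
        return False  # server not running or not reachable
    names = [m.get("name", "") for m in tags.get("models", [])]
    return any(name.startswith(model) for name in names)

print(ollama_has_model("qwen2.5:32b-instruct"))
```

This is handy at the top of a long batch job: fail fast with a clear message instead of erroring out mid-run.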
## Running the Notebook

### Open the Notebook

```bash
cd jupyter_notebooks
jupyter notebook Section_2-3-4_Figure_8_deepfake_adapters.ipynb
```

### Run the Cells

1. **Cell 5**: NER & Name Cleaning (processes names)
2. **Cell 7**: Country/Nationality Mapping
3. **Cell 20**: Qwen-2.5-32B Local Annotation - **this is the new one!**

### Configure Cell 20

```python
# Start with test mode
TEST_MODE = True
TEST_SIZE = 10

# Then run the full dataset
TEST_MODE = False
MAX_ROWS = 20000  # or None for all
```
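For reference, the combined effect of these flags can be sketched as a small helper (hypothetical; Cell 20's actual flag handling may differ in detail):

```python
TEST_MODE = True
TEST_SIZE = 10
MAX_ROWS = 20000  # or None for all

def rows_to_process(total_rows: int) -> int:
    """Number of rows the annotation run will cover under the current flags."""
    if TEST_MODE:
        return min(TEST_SIZE, total_rows)  # quick smoke test
    if MAX_ROWS is None:
        return total_rows                  # annotate everything
    return min(MAX_ROWS, total_rows)       # capped full run

print(rows_to_process(50_000))  # 10 while TEST_MODE is on
```

Running the test mode first confirms the whole pipeline works before you commit hours of GPU time.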
### Run Cell 20

Just click "Run" or press Shift+Enter. The cell will:

1. Check that Ollama is installed ✅
2. Check that the model is available ✅
3. Start annotating
4. Save progress every 10 rows
5. Show completion stats
### Monitor Progress

```
Qwen Local: 100%|██████████| 10/10 [02:30<00:00, 15.0s/it]
✅ Saved after 10 rows (~240.0 samples/hour)
✅ Done! Results: data/CSV/qwen_local_annotated_POI_test.csv

Total time: 2.5 minutes
Average speed: 240.0 samples/hour
```

## Performance

On your A100 80GB:

- **Speed**: ~5-10 tokens/second
- **Throughput**: ~100-200 samples/hour
- **Memory**: ~22-25 GB VRAM
- **Cost**: $0

### Time Estimates

| Dataset Size | Time |
|--------------|------|
| 10 samples (test) | ~2-3 minutes |
| 100 samples | ~20-30 minutes |
| 1,000 samples | ~5-10 hours |
| 10,000 samples | ~50-100 hours |
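The time estimates above follow directly from the measured throughput; a quick back-of-the-envelope helper (illustrative, not from the notebook):

```python
def hours_needed(n_samples: int, samples_per_hour: float) -> float:
    """Estimated wall-clock hours at a given sustained throughput."""
    return n_samples / samples_per_hour

# At the two ends of the measured ~100-200 samples/hour range:
print(hours_needed(10_000, 100))  # 100.0 -> the table's upper bound
print(hours_needed(10_000, 200))  # 50.0  -> the table's lower bound
```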
**Tip**: Run overnight or over the weekend for large datasets!

## Troubleshooting

### "Model not found"

```bash
ollama pull qwen2.5:32b-instruct
```

### "Ollama not running"

```bash
ollama serve
```

### Out of Memory

Your A100 has 80 GB of VRAM, so this should NOT happen with the 32B model (~25 GB VRAM).
If it does, try the quantized version:

```bash
ollama pull qwen2.5:32b-instruct-q4_0  # Only ~12 GB VRAM
```

## Output

Results are saved to:

- Test: `data/CSV/qwen_local_annotated_POI_test.csv`
- Full: `data/CSV/qwen_local_annotated_POI.csv`

Same format as the API results - easy to compare!

## Custom Model Cache Location

To store models in `data/models/`:

```bash
export OLLAMA_MODELS="/home/lauhp/000_PHD/000_010_PUBLICATION/CODE/pm-paper/data/models"
ollama pull qwen2.5:32b-instruct
```
## Comparing API vs Local

After running both:

```python
import pandas as pd

qwen_api = pd.read_csv('data/CSV/qwen_annotated_POI_test.csv')
qwen_local = pd.read_csv('data/CSV/qwen_local_annotated_POI_test.csv')

# Check agreement
agreement = (qwen_api['profession_llm'] == qwen_local['profession_llm']).mean()
print(f"Agreement: {agreement*100:.1f}%")
```

## Full Documentation

For more details, see:

- `QWEN_LOCAL_SETUP.md` - Complete setup guide
- `LLM_MODELS_COMPARISON.md` - All 6 LLM options compared

## Summary

✅ Ollama already installed
✅ A100 80GB GPU - perfect for Qwen-2.5-32B
✅ FREE inference - no API costs
✅ Privacy - data stays local

**Next step**: Run Cell 20 in the notebook!