# Quick Start: Running Qwen-2.5-32B Locally
This is a quick guide to get you started with FREE local LLM inference using your A100 GPU.
## Why Local?
βœ… **$0 cost** - No API fees
βœ… **Privacy** - Data stays on your machine
βœ… **Quality** - 32B parameter model with strong performance
## Setup (One-time)
### 1. Pull the Model (~10-30 minutes)
```bash
# Pull Qwen-2.5-32B-Instruct
ollama pull qwen2.5:32b-instruct
# Wait for download to complete (~20GB)
# Model will be cached at: ~/.ollama/models/
```
### 2. Verify Model is Ready
```bash
# List installed models
ollama list
# Should show: qwen2.5:32b-instruct
# Test it
ollama run qwen2.5:32b-instruct "Hello, who are you?"
```
If you see a response, you're ready! βœ…
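The same sanity check can be scripted against Ollama's local REST API (served on port 11434 by default), which is handy inside a notebook. This is a minimal sketch; `build_payload` and `query` are helper names made up for illustration, not part of the project:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL = "qwen2.5:32b-instruct"

def build_payload(model: str, prompt: str) -> dict:
    """Build a non-streaming request body for Ollama's /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": False}

def query(prompt: str, model: str = MODEL) -> str:
    """Send a prompt to the local Ollama server and return the response text."""
    data = json.dumps(build_payload(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Usage (requires `ollama serve` running):
# answer = query("Hello, who are you?")
```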
## Running the Notebook
### Open the Notebook
```bash
cd jupyter_notebooks
jupyter notebook Section_2-3-4_Figure_8_deepfake_adapters.ipynb
```
### Run the Cells
1. **Cell 5**: NER & Name Cleaning (processes names)
2. **Cell 7**: Country/Nationality Mapping
3. **Cell 20**: Qwen-2.5-32B Local Annotation πŸ‘ˆ **This is the new one!**
### Configure Cell 20
```python
# Start with test mode
TEST_MODE = True
TEST_SIZE = 10
# Then run full dataset
TEST_MODE = False
MAX_ROWS = 20000 # or None for all
```
### Run Cell 20
Just click "Run" or press Shift+Enter. The cell will:
1. Check if Ollama is installed βœ…
2. Check if model is available βœ…
3. Start annotating
4. Save progress every 10 rows
5. Show completion stats
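The pre-flight checks in steps 1–2 can be approximated in plain Python. `ollama_installed` and `model_available` are illustrative helpers, not the notebook's actual functions:

```python
import shutil
import subprocess

MODEL = "qwen2.5:32b-instruct"

def ollama_installed() -> bool:
    """True if the `ollama` binary is on PATH."""
    return shutil.which("ollama") is not None

def model_available(model: str = MODEL) -> bool:
    """True if `ollama list` reports the model as already pulled."""
    if not ollama_installed():
        return False
    out = subprocess.run(["ollama", "list"], capture_output=True, text=True)
    return model in out.stdout

# Example pre-flight check before annotating:
# assert ollama_installed(), "Install Ollama first"
# assert model_available(), f"Run: ollama pull {MODEL}"
```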
### Monitor Progress
```
Qwen Local: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 10/10 [02:30<00:00, 15.0s/it]
βœ… Saved after 10 rows (~240.0 samples/hour)
βœ… Done! Results: data/CSV/qwen_local_annotated_POI_test.csv
Total time: 2.5 minutes
Average speed: 240.0 samples/hour
```
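The save-every-10-rows behavior can be sketched as an incremental checkpoint loop. Everything here (the `annotate` stub, the `text` column, the function names) is illustrative, not the notebook's real code:

```python
import pandas as pd

CHECKPOINT_EVERY = 10

def annotate(text: str) -> str:
    """Placeholder for the real Qwen call; the notebook queries the model here."""
    return "unknown"

def run_with_checkpoints(df: pd.DataFrame, out_path: str) -> pd.DataFrame:
    """Annotate df['text'], rewriting out_path every CHECKPOINT_EVERY rows."""
    results = []
    for i, text in enumerate(df["text"], start=1):
        results.append(annotate(text))
        if i % CHECKPOINT_EVERY == 0:
            # Overwrite the checkpoint with the latest partial results
            df.iloc[:i].assign(profession_llm=results).to_csv(out_path, index=False)
    out = df.assign(profession_llm=results)
    out.to_csv(out_path, index=False)  # final save
    return out
```

A crash mid-run then costs at most the last `CHECKPOINT_EVERY` rows of work.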
## Performance
On your A100 80GB:
- **Speed**: ~5-10 tokens/second
- **Throughput**: ~100-200 samples/hour
- **Memory**: ~22-25GB VRAM
- **Cost**: $0
### Time Estimates
| Dataset Size | Time |
|-------------|------|
| 10 samples (test) | ~2-3 minutes |
| 100 samples | ~20-30 minutes |
| 1,000 samples | ~5-10 hours |
| 10,000 samples | ~50-100 hours |
**Tip**: Run overnight or over the weekend for large datasets!
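The table's figures are simple throughput arithmetic; a tiny helper (the function name is mine, not from the notebook) makes the estimate explicit:

```python
def estimate_hours(n_samples: int, samples_per_hour: float) -> float:
    """Estimated wall-clock hours at a given throughput."""
    return n_samples / samples_per_hour

# At the ~100-200 samples/hour observed on the A100:
low = estimate_hours(10_000, 200)   # best case
high = estimate_hours(10_000, 100)  # worst case
print(f"10,000 samples: ~{low:.0f}-{high:.0f} hours")  # ~50-100 hours
```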
## Troubleshooting
### "Model not found"
```bash
ollama pull qwen2.5:32b-instruct
```
### "Ollama not running"
```bash
ollama serve
```
### Out of Memory
Your A100 has 80GB VRAM - this should NOT happen with the 32B model (~25GB VRAM).
If it does, try the quantized version:
```bash
ollama pull qwen2.5:32b-instruct-q4_0 # Only ~12GB VRAM
```
## Output
Results saved to:
- Test: `data/CSV/qwen_local_annotated_POI_test.csv`
- Full: `data/CSV/qwen_local_annotated_POI.csv`
Same format as API results - easy to compare!
## Custom Model Cache Location
To store models in `data/models/`:
```bash
export OLLAMA_MODELS="/home/lauhp/000_PHD/000_010_PUBLICATION/CODE/pm-paper/data/models"
ollama pull qwen2.5:32b-instruct
```
## Comparing API vs Local
After running both:
```python
import pandas as pd
qwen_api = pd.read_csv('data/CSV/qwen_annotated_POI_test.csv')
qwen_local = pd.read_csv('data/CSV/qwen_local_annotated_POI_test.csv')
# Check agreement
agreement = (qwen_api['profession_llm'] == qwen_local['profession_llm']).mean()
print(f"Agreement: {agreement*100:.1f}%")
```
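Beyond the headline agreement rate, a cross-tabulation shows *which* labels the two runs confuse. Here is a sketch with synthetic stand-in data (only the `profession_llm` column name comes from the snippet above):

```python
import pandas as pd

# Synthetic stand-ins for the two annotation runs
api = pd.DataFrame({"profession_llm": ["actor", "politician", "actor", "athlete"]})
local = pd.DataFrame({"profession_llm": ["actor", "politician", "musician", "athlete"]})

agreement = (api["profession_llm"] == local["profession_llm"]).mean()
print(f"Agreement: {agreement*100:.1f}%")  # 75.0%

# Cross-tabulate to see where the runs disagree
confusion = pd.crosstab(api["profession_llm"], local["profession_llm"],
                        rownames=["api"], colnames=["local"])
print(confusion)
```

Off-diagonal cells in `confusion` are the disagreements worth inspecting by hand.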
## Full Documentation
For more details, see:
- `QWEN_LOCAL_SETUP.md` - Complete setup guide
- `LLM_MODELS_COMPARISON.md` - All 6 LLM options compared
## Summary
βœ… Ollama already installed
βœ… A100 80GB GPU - perfect for Qwen-2.5-32B
βœ… FREE inference - no API costs
βœ… Privacy - data stays local
**Next step**: Run Cell 20 in the notebook! πŸš€