---
description: Set up Ollama on the machine for local LLM inference
tags: [ai, ml, ollama, llm, setup, project, gitignored]
---

You are helping the user set up Ollama for local LLM inference.

## Process

1. **Check if Ollama is already installed**
   - Run: `ollama --version`
   - Check if the service is running: `systemctl status ollama` or `sudo systemctl status ollama`

2. **Install Ollama if needed**
   - Download and install: `curl -fsSL https://ollama.com/install.sh | sh`
   - Or install manually from https://ollama.com/download
   - Verify the installation: `ollama --version`

3. **Start the Ollama service**
   - Start the service: `systemctl start ollama` or `sudo systemctl start ollama`
   - Enable on boot: `systemctl enable ollama` or `sudo systemctl enable ollama`
   - Check status: `systemctl status ollama`

4. **Verify GPU support (for AMD on Daniel's system)**
   - Check whether ROCm detects the GPU: `rocm-smi` or `rocminfo`
   - Ollama should auto-detect the AMD GPU
   - Check the Ollama logs for GPU recognition: `journalctl -u ollama -n 50`

5. **Configure Ollama**
   - Default model storage: `~/.ollama/models`
   - Environment variables (if needed):
     - `OLLAMA_HOST` - change the port/binding
     - `OLLAMA_MODELS` - custom model directory
     - `OLLAMA_NUM_PARALLEL` - number of parallel requests
   - Edit the systemd service if needed: `/etc/systemd/system/ollama.service` (a drop-in override sketch is at the end of this file)

6. **Test Ollama**
   - Pull a small test model: `ollama pull tinyllama` (or a larger one: `ollama pull llama2`)
   - Run a test prompt: `ollama run tinyllama "Hello, how are you?"`
   - Verify GPU usage during inference (a combined verification sketch is at the end of this file)

7. **Suggest initial models**
   - Based on Daniel's hardware (AMD GPU), suggest:
     - General: llama3.2, qwen2.5
     - Code: codellama, deepseek-coder
     - Fast: tinyllama, phi
     - Vision: llava, bakllava

## Output

Provide a summary showing:
- Ollama installation status and version
- Service status
- GPU detection status
- Default configuration
- Recommended models to pull
- Next steps for usage
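
## Optional sketches

A minimal end-to-end verification pass over steps 1-6. The test model (`tinyllama`) and Ollama's default local API port (11434) are assumptions; adjust them to match the actual setup.

```bash
# Minimal sketch: confirm install, service, GPU detection, and inference.
ollama --version                                   # step 1: CLI installed?
systemctl is-active ollama                         # step 3: service running?
journalctl -u ollama -n 50 | grep -iE 'gpu|rocm'   # step 4: GPU/ROCm lines in the logs?
ollama pull tinyllama                              # step 6: pull a small test model
ollama run tinyllama "Hello, how are you?"         # step 6: run a test prompt
curl -s http://localhost:11434/api/tags            # list installed models via the local API
```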
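
If the environment variables from step 5 need to persist across restarts, one option is a systemd drop-in override. This is a sketch only; the bind address, model directory, and parallelism values below are placeholders, not recommendations.

```bash
# Sketch of a systemd drop-in for the step 5 environment variables.
# All three values are example placeholders; set only what the setup needs.
sudo mkdir -p /etc/systemd/system/ollama.service.d
sudo tee /etc/systemd/system/ollama.service.d/override.conf > /dev/null <<'EOF'
[Service]
Environment="OLLAMA_HOST=0.0.0.0:11434"
Environment="OLLAMA_MODELS=/data/ollama/models"
Environment="OLLAMA_NUM_PARALLEL=2"
EOF
sudo systemctl daemon-reload
sudo systemctl restart ollama
```

After restarting, re-check GPU recognition with `journalctl -u ollama -n 50`.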