---
description: Set up Ollama on the machine for local LLM inference
tags:
  - ai
  - ml
  - ollama
  - llm
  - setup
  - project
  - gitignored
---
You are helping the user set up Ollama for local LLM inference.
## Process
### 1. Check if Ollama is already installed
- Run: `ollama --version`
- Check if the service is running: `systemctl status ollama` or `sudo systemctl status ollama`
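If it helps to run this check non-interactively, here is a minimal sketch, assuming a systemd-based distro and that the service unit is named `ollama`:

```bash
#!/usr/bin/env bash
# Report whether the ollama binary is on PATH and whether the service is active
if command -v ollama >/dev/null 2>&1; then
    echo "Ollama installed: $(ollama --version)"
else
    echo "Ollama is not installed"
fi

# is-active prints active/inactive/failed without the full status output
state=$(systemctl is-active ollama 2>/dev/null) || true
echo "Service state: ${state:-unknown}"
```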
### 2. Install Ollama if needed
- Download and install: `curl -fsSL https://ollama.com/install.sh | sh`
- Or install manually from https://ollama.com/download
- Verify installation: `ollama --version`
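The same step as an idempotent sketch (it uses only the install command given above and runs it only when the binary is missing):

```bash
#!/usr/bin/env bash
set -euo pipefail

# Install only if ollama is not already on PATH
if ! command -v ollama >/dev/null 2>&1; then
    curl -fsSL https://ollama.com/install.sh | sh
fi

# Confirm the install worked
ollama --version
```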
### 3. Start the Ollama service
- Start the service: `systemctl start ollama` or `sudo systemctl start ollama`
- Enable on boot: `systemctl enable ollama` or `sudo systemctl enable ollama`
- Check status: `systemctl status ollama`
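As a single snippet, assuming systemd and sudo access:

```bash
# Start now, enable at boot, then show a short status summary
sudo systemctl start ollama
sudo systemctl enable ollama
systemctl status ollama --no-pager
```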
### 4. Verify GPU support (for AMD on Daniel's system)
- Check if ROCm is detected: `rocm-smi` or `rocminfo`; Ollama should auto-detect the AMD GPU
- Check the Ollama logs for GPU recognition: `journalctl -u ollama -n 50`
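A sketch of the GPU check; the grep pattern is only a heuristic, since the exact log wording varies between Ollama versions, and reading the service journal may require sudo:

```bash
# ROCm-level view of the GPU (either tool should list the AMD card)
rocm-smi || rocminfo | head -n 40

# Scan recent Ollama service logs for GPU/ROCm-related lines (heuristic pattern)
journalctl -u ollama -n 200 --no-pager | grep -iE 'gpu|rocm|amd' \
    || echo "No GPU-related log lines found"
```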
### 5. Configure Ollama
- Check the default model storage location: `~/.ollama/models`
- Environment variables (if needed):
  - `OLLAMA_HOST` - change port/binding
  - `OLLAMA_MODELS` - custom model directory
  - `OLLAMA_NUM_PARALLEL` - number of parallel requests
- Edit the systemd service if needed: `/etc/systemd/system/ollama.service` (see the drop-in sketch below)
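Rather than editing the packaged unit file directly, a systemd drop-in keeps local changes separate. The values below (bind address, model directory, parallelism) are placeholders to adapt, not recommendations; 11434 is Ollama's default port:

```bash
# Create a drop-in override instead of editing /etc/systemd/system/ollama.service directly
sudo mkdir -p /etc/systemd/system/ollama.service.d
sudo tee /etc/systemd/system/ollama.service.d/override.conf >/dev/null <<'EOF'
[Service]
# Placeholder values -- adjust or remove lines you don't need
Environment="OLLAMA_HOST=0.0.0.0:11434"
Environment="OLLAMA_MODELS=/data/ollama/models"
Environment="OLLAMA_NUM_PARALLEL=2"
EOF

# Pick up the new unit configuration and restart the service
sudo systemctl daemon-reload
sudo systemctl restart ollama
```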
### 6. Test Ollama
- Pull a test model: `ollama pull llama2` (or smaller: `ollama pull tinyllama`)
- Run a test: `ollama run tinyllama "Hello, how are you?"`
- Verify GPU usage during inference (see the test sketch below)
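A minimal end-to-end test, assuming the service listens on the default port 11434; watching `rocm-smi` in a second terminal during the run is the simplest way to confirm GPU usage:

```bash
# Pull a small model and run a one-off prompt
ollama pull tinyllama
ollama run tinyllama "Hello, how are you?"

# Same test through the HTTP API (default port 11434)
curl http://localhost:11434/api/generate -d '{
  "model": "tinyllama",
  "prompt": "Hello, how are you?",
  "stream": false
}'

# In recent Ollama versions, `ollama ps` shows whether a loaded model sits on GPU or CPU
ollama ps
```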
### 7. Suggest initial models
- Based on Daniel's hardware (AMD GPU), suggest:
  - General: llama3.2, qwen2.5
  - Code: codellama, deepseek-coder
  - Fast: tinyllama, phi
  - Vision: llava, bakllava
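If the user wants several of these at once, a simple loop works; the model list here is only an example subset, and names/tags should be confirmed against the Ollama library before pulling:

```bash
# Pull a starter set of models (adjust the list to taste)
for model in llama3.2 qwen2.5 tinyllama; do
    ollama pull "$model"
done

# List what is now available locally
ollama list
```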
## Output
Provide a summary showing:
- Ollama installation status and version
- Service status
- GPU detection status
- Default configuration
- Recommended models to pull
- Next steps for usage