---
title: Ollama
emoji: 🦙
colorFrom: pink
colorTo: blue
sdk: docker
pinned: false
license: mit
short_description: Ollama (llama3.2:3b) on Hugging Face Spaces (Docker)
---

# 🦙 Ollama (llama3.2:3b) on Hugging Face Spaces (Docker)

This Space runs Ollama with the `llama3.2:3b` model, packaged as a Docker-based Hugging Face Space.

The container starts Ollama, pulls the model at runtime, and exposes the Ollama HTTP API so you can interact with the model programmatically.
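
A minimal sketch of the startup script such a container might use (`entrypoint.sh` is an assumed name; the actual script in this repository may differ):

```bash
#!/bin/sh
# Start the Ollama server in the background
ollama serve &

# Wait until the server is ready to accept requests
until ollama list > /dev/null 2>&1; do
  sleep 1
done

# Pull the model at runtime, then keep the server in the foreground
ollama pull llama3.2:3b
wait
```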


## 🚀 Hugging Face Space Configuration

When creating the Space:

- SDK: Docker
- Template: Blank
- Hardware: Free CPU (sufficient for 3B models)
- Visibility: Public or Private (your choice)

No additional configuration files are required beyond this repository.


## 🧱 How It Works

- Ollama runs inside a Docker container (see the sketch below)
- The `llama3.2:3b` model is pulled on startup
- Ollama listens on port `11434`
- Hugging Face automatically maps the port and exposes it at the Space URL
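
Putting these pieces together, the image can be built from the official `ollama/ollama` base. This is an illustrative sketch that assumes the `entrypoint.sh` shown earlier; the repository's actual Dockerfile may differ:

```dockerfile
# Sketch of a possible Dockerfile for this Space
FROM ollama/ollama:latest

# Startup script that launches the server and pulls llama3.2:3b
COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh

EXPOSE 11434
ENTRYPOINT ["/entrypoint.sh"]
```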

## 📡 API Usage

Once the Space is running, you can interact with Ollama via HTTP.

### Check version

```bash
curl https://<your-space>.hf.space/api/version
```

### Generate text

```bash
curl https://<your-space>.hf.space/api/generate \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.2:3b",
    "prompt": "Explain Kubernetes like I am five"
  }'
```
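
By default, `/api/generate` streams the response as a sequence of JSON lines. To receive a single JSON object instead, set the API's `stream` flag to `false`:

```bash
curl https://<your-space>.hf.space/api/generate \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.2:3b",
    "prompt": "Explain Kubernetes like I am five",
    "stream": false
  }'
```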

## 🔄 Automatic Deployment (GitHub → Hugging Face)

You can deploy automatically using GitHub Actions.

### Required Secrets (GitHub Repository)

Go to Settings → Secrets and variables → Actions, then add:

| Secret Name | Description |
| ----------- | ----------- |
| `HF_USERNAME` | Your Hugging Face username |
| `HF_TOKEN` | Hugging Face token with write access |
| `SPACE_NAME` | Name of the Hugging Face Space |

The workflow logs in to `registry.hf.space` and pushes the Docker image, triggering a redeploy of the Space.
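
The workflow's steps are roughly equivalent to the following commands. The image path shown is an assumption; check your Space's settings for the exact registry path:

```bash
# Log in to the Hugging Face Spaces registry with a write token
echo "$HF_TOKEN" | docker login registry.hf.space -u "$HF_USERNAME" --password-stdin

# Build and push the image; pushing triggers a redeploy
# (image path is assumed to be <username>/<space_name>)
docker build -t registry.hf.space/"$HF_USERNAME"/"$SPACE_NAME" .
docker push registry.hf.space/"$HF_USERNAME"/"$SPACE_NAME"
```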


## 🧪 Local Development (Optional)

You can run the same container locally for testing.

### Build the image

```bash
docker build -t ollama-local:latest .
```

If you hit DNS or network issues during the build:

```bash
docker build --network=host -t ollama-local:latest .
```

### Run locally

```bash
docker run -p 11434:11434 ollama-local:latest
```
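
The model is re-downloaded on every container start. To cache it across runs, you can mount a volume (assuming the image stores models in Ollama's default `/root/.ollama` directory):

```bash
docker run -p 11434:11434 -v ollama-models:/root/.ollama ollama-local:latest
```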

Test:

```bash
curl http://localhost:11434/api/version
```

## ☸️ Local Kubernetes Deployment (Optional)

This project can also be deployed to a local Kubernetes cluster (Docker Desktop, Minikube, MicroK8s).
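
A manifest like `local.yml` might pair a single-replica Deployment with a NodePort Service on port 30786. This is an illustrative sketch; the file in this repository may differ:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ollama
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ollama
  template:
    metadata:
      labels:
        app: ollama
    spec:
      containers:
        - name: ollama
          image: ollama-local:latest
          imagePullPolicy: Never   # use the locally built image
          ports:
            - containerPort: 11434
---
apiVersion: v1
kind: Service
metadata:
  name: ollama
spec:
  type: NodePort
  selector:
    app: ollama
  ports:
    - port: 11434
      targetPort: 11434
      nodePort: 30786
```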

### Build the image

```bash
docker build -t ollama-local:latest .
```

(For Minikube, point your shell at the cluster's Docker daemon first with `eval $(minikube docker-env)` so the locally built image is visible to the cluster.)

### Deploy

```bash
kubectl apply -f local.yml
```
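
Before testing, you can confirm the pod is running and the model pull has finished (the `app=ollama` label matches the sketch above; adjust it to the labels actually used in `local.yml`):

```bash
# Check pod status
kubectl get pods -l app=ollama

# Follow logs to watch the model download
kubectl logs -f -l app=ollama
```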

### Access the service

- Docker Desktop: http://localhost:30786
- Minikube: run `echo "http://$(minikube ip):30786"` to print the URL

Verify:

```bash
curl http://localhost:30786/api/version
```

## ⚠️ Notes & Limitations

- Free CPU Spaces are slow for LLM inference; long generation times are expected
- The model is downloaded at container startup, so the Space takes a while to become responsive after each (re)start
- Hugging Face Spaces may restart containers periodically, which repeats the model download
