---
title: HF Builder
sdk: docker
pinned: false
---

HF Builder

Daemonless Docker image builder for HuggingFace Spaces using Kaniko.

Features

  • Build Tracking: Unique build IDs, history, and metrics
  • GitHub Webhooks: Automatic builds on push with signature verification
  • Notifications: Slack, Discord, and custom webhook notifications
  • Observability: OpenTelemetry tracing, Prometheus metrics, structured logging
  • Status Badge: SVG badge for READMEs
  • Request Tracing: Trace IDs through logs and responses
  • HTMX Dashboard: Real-time web interface with live updates
  • Hivemind Integration: Connect to hivemind controller for distributed builds
  • Task Runner: go-task automation for common operations

Quick Start

# Install go-task
brew install go-task

# Run locally
task dev

# Check status
task status

# Trigger a build
task build REPO_URL=https://github.com/owner/repo IMAGE=owner/repo

Configuration

Core Settings

| Secret | Required | Default | Description |
| --- | --- | --- | --- |
| REGISTRY_URL | No | ghcr.io | Container registry URL |
| REGISTRY_USER | Yes | - | Registry username |
| REGISTRY_PASSWORD | Yes | - | Registry password/token |
| GITHUB_TOKEN | For private repos | - | Token for cloning private repositories |
| WEBHOOK_SECRET | Recommended | - | GitHub webhook secret |
| DEFAULT_IMAGE | No | - | Default image name |
| BUILD_TIMEOUT | No | 1800 | Build timeout in seconds |
| ENABLE_CACHE | No | false | Enable Kaniko cache |
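
As an illustration of what WEBHOOK_SECRET is used for: GitHub signs each webhook delivery with HMAC-SHA256 and sends the result in the X-Hub-Signature-256 header as `sha256=<hex digest>`. The sketch below shows one way to check that header. The function name `verify_github_signature` is hypothetical, and this README does not show the builder's actual implementation.

```python
import hashlib
import hmac


def verify_github_signature(secret: str, body: bytes, signature_header: str) -> bool:
    """Return True if the X-Hub-Signature-256 value matches the raw request body.

    Hypothetical helper for illustration; not taken from the builder's source.
    """
    if not signature_header or not signature_header.startswith("sha256="):
        return False
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    expected = ("sha256=" + digest).encode("utf-8")
    # compare_digest runs in constant time, so a mismatch does not leak
    # how many leading characters were correct.
    return hmac.compare_digest(expected, signature_header.encode("utf-8"))
```

The check must run against the raw request bytes, before any JSON parsing, because re-serialising the payload can change the bytes and invalidate the digest.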

Notifications

| Secret | Description |
| --- | --- |
| NOTIFICATION_URL | Generic webhook URL for build results |
| SLACK_WEBHOOK_URL | Slack incoming webhook URL |
| DISCORD_WEBHOOK_URL | Discord webhook URL |
| NOTIFY_ON | When to notify: all, failure, success (default: failure) |
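
The NOTIFY_ON values can be read as a simple filter on the build result. The sketch below is an assumption about how that filter might behave, not the builder's code: the function name `should_notify` is hypothetical, the status strings `success` and `failed` are taken from the payload and metrics examples elsewhere in this README, and treating an unrecognised value like the default is a guess.

```python
def should_notify(notify_on: str, build_status: str) -> bool:
    """Decide whether a finished build should trigger a notification.

    Hypothetical helper: "all" always notifies, "success" only on success,
    and "failure" (the documented default) only on failed builds.
    """
    mode = (notify_on or "failure").strip().lower()
    if mode == "all":
        return True
    if mode == "success":
        return build_status == "success"
    # "failure", and (by assumption) any unrecognised value.
    return build_status == "failed"
```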

Observability

| Secret | Description |
| --- | --- |
| LOG_FORMAT | text or json (default: text) |
| OTEL_EXPORTER_OTLP_ENDPOINT | OpenTelemetry collector endpoint |
| OTEL_SERVICE_NAME | Service name for traces (default: hf-builder) |

Hivemind Integration

| Secret | Description |
| --- | --- |
| HIVEMIND_CONTROLLER_URL | URL of the hivemind controller (enables worker mode) |
| HIVEMIND_POLL_INTERVAL | Seconds between work polls (default: 30) |

When HIVEMIND_CONTROLLER_URL is set, the builder:

  1. Registers itself with the controller as a "build" worker
  2. Polls for build work items
  3. Executes builds and reports completion/failure
  4. Sends heartbeats to indicate status
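
The four steps above can be sketched as a small polling loop. The controller's real API is not documented in this README, so the `Controller` interface, its method names, the `idle`/`building` status strings, and the `run_worker` function are all hypothetical stand-ins. The loop takes the controller and the build function as arguments so it can be exercised without a network.

```python
import time
from typing import Callable, Optional, Protocol


class Controller(Protocol):
    """Hypothetical controller client; method names are assumptions."""

    def register(self, worker_type: str) -> str: ...
    def poll(self, worker_id: str) -> Optional[dict]: ...
    def report(self, worker_id: str, item_id: str, ok: bool, detail: str) -> None: ...
    def heartbeat(self, worker_id: str, status: str) -> None: ...


def run_worker(
    controller: Controller,
    run_build: Callable[[dict], None],
    poll_interval: float = 30.0,
    max_cycles: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Register, then poll for work, build, report, and send heartbeats."""
    worker_id = controller.register("build")            # step 1: register
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        cycles += 1
        item = controller.poll(worker_id)               # step 2: poll for work
        if item is None:
            controller.heartbeat(worker_id, "idle")     # step 4: heartbeat
            sleep(poll_interval)
            continue
        controller.heartbeat(worker_id, "building")
        try:
            run_build(item)                             # step 3: execute
            controller.report(worker_id, item["id"], True, "")
        except Exception as exc:
            # A failed build is reported rather than crashing the worker.
            controller.report(worker_id, item["id"], False, str(exc))
```

Here `poll_interval` plays the role of HIVEMIND_POLL_INTERVAL, and `max_cycles` exists only so the loop can be stopped in tests.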

Web Dashboard

The builder includes a real-time HTMX-powered dashboard at the root URL showing:

  • Stats: Status, completed/failed builds, success rate
  • Current Build: Active build with cancel button
  • Build History: Recent builds with status badges
  • New Build Form: Trigger builds from the UI
  • Live Logs: Real-time log streaming

All panels auto-refresh via HTMX polling.

Status Badge

Add to your README:

![Build Status](https://your-space.hf.space/badge)

The /badge endpoint returns an SVG badge showing one of four states: passing, failing, building, or no builds.

Request Tracing

Every request gets a trace ID:

  • Pass an X-Request-ID or X-Trace-ID header to supply your own ID; if neither is present, one is generated automatically
  • The response includes an X-Trace-ID header
  • Text logs include the trace ID: [HH:MM:SS] [trace123] message
  • JSON logs include a trace_id field
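
A minimal sketch of the trace ID behaviour described above. The function names are hypothetical, and two details are assumptions because the README does not specify them: X-Request-ID is checked before X-Trace-ID, and a generated ID is 12 hex characters.

```python
import uuid
from datetime import datetime
from typing import Mapping, Optional


def resolve_trace_id(headers: Mapping[str, str]) -> str:
    """Use a caller-supplied trace ID if present, otherwise generate one."""
    lowered = {name.lower(): value for name, value in headers.items()}
    for name in ("x-request-id", "x-trace-id"):  # precedence is an assumption
        value = lowered.get(name, "").strip()
        if value:
            return value
    return uuid.uuid4().hex[:12]  # generated ID format is an assumption


def format_log_line(trace_id: str, message: str, now: Optional[datetime] = None) -> str:
    """Render a text log line in the documented [HH:MM:SS] [trace] message shape."""
    stamp = (now or datetime.now()).strftime("%H:%M:%S")
    return f"[{stamp}] [{trace_id}] {message}"
```

Header names are lowercased before lookup because HTTP header names are case-insensitive.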

Notifications

Slack

Set SLACK_WEBHOOK_URL to receive formatted messages:

✅ Build SUCCESS
owner/repo:latest
ID: abc123 | Duration: 180.5s | Branch: main

Discord

Set DISCORD_WEBHOOK_URL to receive embedded messages with build details.

Custom Webhook

Set NOTIFICATION_URL, or pass callback_url in the API request. The target URL receives a JSON payload:

{
  "build": {
    "id": "abc123",
    "status": "success",
    "image": "ghcr.io/owner/repo",
    "tags": ["latest"],
    "duration_seconds": 180.5,
    "trace_id": "xyz789"
  },
  "runner_id": "runner1",
  "registry": "ghcr.io"
}
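
A receiver might decode this payload as follows. This is a sketch that assumes every field shown in the example is always present; the `BuildNotification` class and `parse_notification` function are hypothetical names, not part of the builder.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class BuildNotification:
    """Typed view of the example payload; field set mirrors the sample only."""

    build_id: str
    status: str
    image: str
    tags: List[str]
    duration_seconds: float
    trace_id: str
    runner_id: str
    registry: str

    def image_refs(self) -> List[str]:
        """Return one image:tag reference per tag."""
        return [f"{self.image}:{tag}" for tag in self.tags]


def parse_notification(raw: str) -> BuildNotification:
    """Parse the JSON body of a build notification."""
    data = json.loads(raw)
    build = data["build"]
    return BuildNotification(
        build_id=build["id"],
        status=build["status"],
        image=build["image"],
        tags=list(build["tags"]),
        duration_seconds=float(build["duration_seconds"]),
        trace_id=build["trace_id"],
        runner_id=data["runner_id"],
        registry=data["registry"],
    )
```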

OpenTelemetry

Set OTEL_EXPORTER_OTLP_ENDPOINT to enable distributed tracing.

Traces include:

  • Span per build with build.id, build.image, build.status
  • Errors recorded as span events

Works with:

  • Jaeger
  • Grafana Tempo
  • Honeycomb
  • Any OTLP-compatible backend

Prometheus Metrics

GET /api/metrics returns:

# HELP hf_builder_builds_total Total builds
# TYPE hf_builder_builds_total counter
hf_builder_builds_total{status="success"} 42
hf_builder_builds_total{status="failed"} 3

# HELP hf_builder_build_duration_seconds Avg build duration
# TYPE hf_builder_build_duration_seconds gauge
hf_builder_build_duration_seconds 180.50

# HELP hf_builder_success_rate Success rate
# TYPE hf_builder_success_rate gauge
hf_builder_success_rate 0.9333
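
The sample values are consistent with a success rate computed as successes divided by all finished builds: 42 / (42 + 3) ≈ 0.9333. The sketch below parses the text format shown above and recomputes that figure. It is a deliberately small parser for this output only, not a full Prometheus exposition-format parser, and the formula is inferred from the sample rather than confirmed by the source.

```python
import re
from typing import Dict, Tuple

# Matches lines such as: name{labels} value   or   name value
_SAMPLE = re.compile(
    r"^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)$"
)


def parse_metrics(text: str) -> Dict[Tuple[str, str], float]:
    """Map (metric name, raw label string) to its value, skipping comments."""
    samples: Dict[Tuple[str, str], float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE.match(line)
        if match:
            key = (match.group("name"), match.group("labels") or "")
            samples[key] = float(match.group("value"))
    return samples


def success_rate(samples: Dict[Tuple[str, str], float]) -> float:
    """Recompute success / (success + failed) from the build counters."""
    ok = samples.get(("hf_builder_builds_total", 'status="success"'), 0.0)
    failed = samples.get(("hf_builder_builds_total", 'status="failed"'), 0.0)
    total = ok + failed
    return ok / total if total else 0.0
```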

API Endpoints

| Endpoint | Method | Description |
| --- | --- | --- |
| / | GET | Web dashboard (HTMX) |
| /health | GET | Health check |
| /ready | GET | Readiness check |
| /badge | GET | SVG status badge |
| /api/status | GET | Builder status |
| /api/metrics | GET | Prometheus metrics |
| /api/history | GET | Build history |
| /api/logs | GET | Recent logs |
| /api/build | POST | Trigger build |
| /api/build/{id}/cancel | POST | Cancel build |
| /webhook/github | POST | GitHub webhook |
| /webhook/test | POST | Test endpoint |
| /stats-partial | GET | Stats HTML partial |
| /current-partial | GET | Current build HTML partial |
| /history-partial | GET | History HTML partial |
| /logs-partial | GET | Logs HTML partial |
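
As a rough illustration of calling POST /api/build from a script, the sketch below only constructs the request and leaves sending it to the caller. The request body schema is not documented in this README: the `repo_url` and `image` field names are assumptions modelled on the REPO_URL and IMAGE task variables, while `callback_url` is the field named in the Custom Webhook section. The function name `make_build_request` is hypothetical.

```python
import json
import urllib.request
from typing import Optional


def make_build_request(
    base_url: str,
    repo_url: str,
    image: str,
    callback_url: Optional[str] = None,
    trace_id: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST /api/build request."""
    # Field names "repo_url" and "image" are assumptions, not a documented schema.
    payload = {"repo_url": repo_url, "image": image}
    if callback_url:
        payload["callback_url"] = callback_url
    headers = {"Content-Type": "application/json"}
    if trace_id:
        headers["X-Request-ID"] = trace_id  # propagates the caller's trace ID
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/build",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; check the field names against the builder's source before relying on them.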

Webhook Headers

| Header | Description |
| --- | --- |
| X-Builder-Image | Override image name |
| X-Builder-Tags | Comma-separated tags |
| X-Builder-Token | GitHub token |
| X-Builder-Platform | Target platform |
| X-Builder-Args | Build args: K1=V1,K2=V2 |
| X-Builder-Callback | Notification URL |
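
The list-valued headers could be parsed along these lines. This is a sketch under stated assumptions rather than the builder's parser: the function name `parse_builder_headers` and the keys of the returned dictionary are hypothetical, and malformed pairs in X-Builder-Args are silently skipped.

```python
from typing import Any, Dict, Mapping


def parse_builder_headers(headers: Mapping[str, str]) -> Dict[str, Any]:
    """Extract build overrides from X-Builder-* webhook headers."""
    lowered = {name.lower(): value for name, value in headers.items()}

    # X-Builder-Tags: comma-separated list, blank entries dropped.
    tags = [t.strip() for t in lowered.get("x-builder-tags", "").split(",") if t.strip()]

    # X-Builder-Args: K1=V1,K2=V2; split on the first "=" only so that
    # values may themselves contain "=".
    build_args: Dict[str, str] = {}
    for pair in lowered.get("x-builder-args", "").split(","):
        if "=" not in pair:
            continue
        key, value = pair.split("=", 1)
        if key.strip():
            build_args[key.strip()] = value.strip()

    return {
        "image": lowered.get("x-builder-image") or None,
        "tags": tags,
        "platform": lowered.get("x-builder-platform") or None,
        "build_args": build_args,
        "callback_url": lowered.get("x-builder-callback") or None,
    }
```

One limitation follows directly from the K1=V1,K2=V2 format: a build arg value that contains a comma cannot be expressed in this header.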

Task Runner

The project includes a Taskfile.yml for go-task:

# Development
task dev           # Run locally
task dev:watch     # Run with auto-reload

# Testing
task lint          # Run linting
task lint:fix      # Fix linting issues

# Docker
task docker:build  # Build Docker image
task docker:run    # Run Docker image

# API Operations
task health        # Check health
task status        # Get status
task metrics       # Get metrics
task history       # Get build history
task logs          # Get recent logs

# Build Operations
task build REPO_URL=... IMAGE=...  # Trigger a build
task build:cancel BUILD_ID=...     # Cancel a build

# Hivemind
task hivemind:register  # Register with controller
task hivemind:heartbeat # Send heartbeat

# Deployment
task deploy:hf     # Deploy to HuggingFace

License

MIT