---
title: HF Builder
sdk: docker
pinned: false
---

# HF Builder

Daemonless Docker image builder for HuggingFace Spaces using [Kaniko](https://github.com/GoogleContainerTools/kaniko).

## Features

- **Build Tracking**: Unique build IDs, history, and metrics
- **GitHub Webhooks**: Automatic builds on push with signature verification
- **Notifications**: Slack, Discord, and custom webhook notifications
- **Observability**: OpenTelemetry tracing, Prometheus metrics, structured logging
- **Status Badge**: SVG badge for READMEs
- **Request Tracing**: Trace IDs through logs and responses
- **HTMX Dashboard**: Real-time web interface with live updates
- **Hivemind Integration**: Connect to hivemind controller for distributed builds
- **Task Runner**: go-task automation for common operations

## Quick Start

```bash
# Install go-task (shown here via Homebrew)
brew install go-task

# Run locally
task dev

# Check status
task status

# Trigger a build
task build REPO_URL=https://github.com/owner/repo IMAGE=owner/repo
```

## Configuration

### Core Settings

| Secret | Required | Default | Description |
|--------|----------|---------|-------------|
| `REGISTRY_URL` | No | `ghcr.io` | Container registry URL |
| `REGISTRY_USER` | Yes | - | Registry username |
| `REGISTRY_PASSWORD` | Yes | - | Registry password/token |
| `GITHUB_TOKEN` | Private repos only | - | Token for cloning private repositories |
| `WEBHOOK_SECRET` | Recommended | - | GitHub webhook secret |
| `DEFAULT_IMAGE` | No | - | Default image name |
| `BUILD_TIMEOUT` | No | `1800` | Timeout in seconds |
| `ENABLE_CACHE` | No | `false` | Enable Kaniko cache |
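
`WEBHOOK_SECRET` backs the signature verification listed under Features. GitHub signs each webhook delivery with HMAC-SHA256 and sends the result in the `X-Hub-Signature-256` header as `sha256=<hex digest>`. The following is a minimal sketch of that check, not necessarily how the builder implements it; the function name `verify_github_signature` is hypothetical.

```python
import hashlib
import hmac
from typing import Optional


def verify_github_signature(secret: str, body: bytes, signature_header: Optional[str]) -> bool:
    """Return True if the header matches the HMAC-SHA256 of the raw request body."""
    if not signature_header or not signature_header.startswith("sha256="):
        return False
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    expected = "sha256=" + digest
    # compare_digest runs in constant time, so it does not leak how many
    # leading characters of the signature were correct.
    return hmac.compare_digest(expected.encode("utf-8"), signature_header.encode("utf-8"))
```

The check must run against the raw request bytes, before any JSON parsing or re-serialization, because any change to the body changes the digest.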

### Notifications

| Secret | Description |
|--------|-------------|
| `NOTIFICATION_URL` | Generic webhook URL for build results |
| `SLACK_WEBHOOK_URL` | Slack incoming webhook URL |
| `DISCORD_WEBHOOK_URL` | Discord webhook URL |
| `NOTIFY_ON` | When to notify: `all`, `failure`, `success` (default: `failure`) |
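
As an illustration of how `NOTIFY_ON` could gate notifications, here is a small sketch. The function name `should_notify` is hypothetical, and two behaviours are assumptions rather than documented facts: only the statuses `success` and `failed` trigger notifications, and an unrecognized mode falls back to the default `failure`.

```python
import os


def should_notify(build_status: str, notify_on: str = "") -> bool:
    """Decide whether a finished build should trigger a notification."""
    mode = (notify_on or os.environ.get("NOTIFY_ON", "failure")).strip().lower()
    if mode == "all":
        return build_status in ("success", "failed")
    if mode == "success":
        return build_status == "success"
    # Assumption: "failure" and any unrecognized value behave like the default.
    return build_status == "failed"
```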

### Observability

| Secret | Description |
|--------|-------------|
| `LOG_FORMAT` | `text` or `json` (default: `text`) |
| `OTEL_EXPORTER_OTLP_ENDPOINT` | OpenTelemetry collector endpoint |
| `OTEL_SERVICE_NAME` | Service name for traces (default: `hf-builder`) |

### Hivemind Integration

| Secret | Description |
|--------|-------------|
| `HIVEMIND_CONTROLLER_URL` | URL of the hivemind controller (enables worker mode) |
| `HIVEMIND_POLL_INTERVAL` | Seconds between work polls (default: `30`) |

When `HIVEMIND_CONTROLLER_URL` is set, the builder:
1. Registers itself with the controller as a "build" worker
2. Polls for build work items
3. Executes builds and reports completion/failure
4. Sends heartbeats to indicate status
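
The four steps above can be sketched as a simple polling loop. The controller's actual API is not documented here, so the `Controller` interface and every method name on it are hypothetical; the sketch also sends heartbeats inline, whereas a real worker might send them on a separate timer.

```python
import time
from typing import Callable, Optional, Protocol


class Controller(Protocol):
    """Hypothetical client interface for the hivemind controller."""

    def register(self, worker_type: str) -> str: ...
    def poll(self, worker_id: str) -> Optional[dict]: ...
    def report(self, worker_id: str, item_id: str, ok: bool) -> None: ...
    def heartbeat(self, worker_id: str, status: str) -> None: ...


def run_worker(
    controller: Controller,
    run_build: Callable[[dict], bool],
    poll_interval: float = 30.0,
    max_polls: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    # Step 1: register as a "build" worker.
    worker_id = controller.register("build")
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        # Step 4: heartbeat so the controller knows the worker is alive.
        controller.heartbeat(worker_id, "idle")
        # Step 2: poll for a work item.
        item = controller.poll(worker_id)
        if item is None:
            sleep(poll_interval)
            continue
        controller.heartbeat(worker_id, "building")
        # Step 3: execute the build and report completion or failure.
        try:
            ok = run_build(item)
        except Exception:
            ok = False
        controller.report(worker_id, item["id"], ok)
```

`max_polls` and `sleep` exist only so the loop can be exercised in tests without waiting; a long-running worker would leave both at their defaults.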

## Web Dashboard

The builder includes a real-time HTMX-powered dashboard at the root URL showing:

- **Stats**: Status, completed/failed builds, success rate
- **Current Build**: Active build with cancel button
- **Build History**: Recent builds with status badges
- **New Build Form**: Trigger builds from the UI
- **Live Logs**: Real-time log streaming

All panels auto-refresh via HTMX polling.

## Status Badge

Add to your README:

```markdown
![Build Status](https://your-space.hf.space/badge)
```

The `/badge` endpoint returns an SVG badge showing one of `passing`, `failing`, `building`, or `no builds`.

## Request Tracing

Every request gets a trace ID:
- Pass an `X-Request-ID` or `X-Trace-ID` header to supply your own ID; otherwise one is generated automatically
- Response includes `X-Trace-ID` header
- Logs include trace ID: `[HH:MM:SS] [trace123] message`
- JSON logs include `trace_id` field
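
The rules above could look roughly like this in code. This is a sketch under assumptions: the helper names are hypothetical, `X-Request-ID` is assumed to take precedence when both headers are present, and the 12-character generated ID is an arbitrary choice.

```python
import uuid
from datetime import datetime
from typing import Mapping, Optional


def resolve_trace_id(headers: Mapping[str, str]) -> str:
    """Use a caller-supplied trace ID if present, otherwise generate one."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    # Assumption: X-Request-ID wins when both headers are sent.
    return lowered.get("x-request-id") or lowered.get("x-trace-id") or uuid.uuid4().hex[:12]


def format_log_line(trace_id: str, message: str, now: Optional[datetime] = None) -> str:
    """Render a text-format log line as [HH:MM:SS] [trace_id] message."""
    stamp = (now or datetime.now()).strftime("%H:%M:%S")
    return f"[{stamp}] [{trace_id}] {message}"
```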

## Notifications

### Slack

Set `SLACK_WEBHOOK_URL` to receive formatted messages:

```
✅ Build SUCCESS
owner/repo:latest
ID: abc123 | Duration: 180.5s | Branch: main
```

### Discord

Set `DISCORD_WEBHOOK_URL` to receive embedded messages with build details.

### Custom Webhook

Set `NOTIFICATION_URL`, or pass `callback_url` in the API request. The target URL receives a JSON payload:

```json
{
  "build": {
    "id": "abc123",
    "status": "success",
    "image": "ghcr.io/owner/repo",
    "tags": ["latest"],
    "duration_seconds": 180.5,
    "trace_id": "xyz789"
  },
  "runner_id": "runner1",
  "registry": "ghcr.io"
}
```
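
On the receiving side, a consumer might reduce that payload to a one-line summary. The sketch below relies only on the fields shown in the example payload; the function name `summarize_build` is hypothetical, and real payloads may carry additional fields.

```python
import json


def summarize_build(payload_json: str) -> str:
    """Turn a build notification payload into a short human-readable line."""
    payload = json.loads(payload_json)
    build = payload["build"]
    tags = build.get("tags") or []
    # One image reference per tag; fall back to the bare image name.
    refs = ", ".join(f"{build['image']}:{tag}" for tag in tags) or build["image"]
    return (
        f"{build['status'].upper()} {refs} "
        f"(id={build['id']}, {build['duration_seconds']:.1f}s, runner={payload['runner_id']})"
    )
```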

## OpenTelemetry

Set `OTEL_EXPORTER_OTLP_ENDPOINT` to enable distributed tracing.

Traces include:
- Span per build with `build.id`, `build.image`, `build.status`
- Errors recorded as span events

Works with:
- Jaeger
- Grafana Tempo
- Honeycomb
- Any OTLP-compatible backend

## Prometheus Metrics

`GET /api/metrics` returns:

```
# HELP hf_builder_builds_total Total builds
# TYPE hf_builder_builds_total counter
hf_builder_builds_total{status="success"} 42
hf_builder_builds_total{status="failed"} 3

# HELP hf_builder_build_duration_seconds Avg build duration
# TYPE hf_builder_build_duration_seconds gauge
hf_builder_build_duration_seconds 180.50

# HELP hf_builder_success_rate Success rate
# TYPE hf_builder_success_rate gauge
hf_builder_success_rate 0.9333
```
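
The numbers in the sample are consistent with each other: 42 successes out of 45 builds gives a success rate of 42 / 45 ≈ 0.9333. A minimal sketch of rendering this exposition text follows; the function name `render_metrics` is hypothetical, and reporting a rate of `0.0` before any build has run is an assumption.

```python
def render_metrics(success: int, failed: int, avg_duration_seconds: float) -> str:
    """Render build counters in the Prometheus text exposition format."""
    total = success + failed
    # Assumption: report 0.0 rather than dividing by zero before the first build.
    success_rate = success / total if total else 0.0
    lines = [
        "# HELP hf_builder_builds_total Total builds",
        "# TYPE hf_builder_builds_total counter",
        f'hf_builder_builds_total{{status="success"}} {success}',
        f'hf_builder_builds_total{{status="failed"}} {failed}',
        "",
        "# HELP hf_builder_build_duration_seconds Avg build duration",
        "# TYPE hf_builder_build_duration_seconds gauge",
        f"hf_builder_build_duration_seconds {avg_duration_seconds:.2f}",
        "",
        "# HELP hf_builder_success_rate Success rate",
        "# TYPE hf_builder_success_rate gauge",
        f"hf_builder_success_rate {success_rate:.4f}",
    ]
    return "\n".join(lines) + "\n"
```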

## API Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/` | GET | Web dashboard (HTMX) |
| `/health` | GET | Health check |
| `/ready` | GET | Readiness check |
| `/badge` | GET | SVG status badge |
| `/api/status` | GET | Builder status |
| `/api/metrics` | GET | Prometheus metrics |
| `/api/history` | GET | Build history |
| `/api/logs` | GET | Recent logs |
| `/api/build` | POST | Trigger build |
| `/api/build/{id}/cancel` | POST | Cancel build |
| `/webhook/github` | POST | GitHub webhook |
| `/webhook/test` | POST | Test endpoint |
| `/stats-partial` | GET | Stats HTML partial |
| `/current-partial` | GET | Current build HTML partial |
| `/history-partial` | GET | History HTML partial |
| `/logs-partial` | GET | Logs HTML partial |

## Webhook Headers

| Header | Description |
|--------|-------------|
| `X-Builder-Image` | Override image name |
| `X-Builder-Tags` | Comma-separated tags |
| `X-Builder-Token` | GitHub token |
| `X-Builder-Platform` | Target platform |
| `X-Builder-Args` | Build args: `K1=V1,K2=V2` |
| `X-Builder-Callback` | Notification URL |
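
To illustrate the two list-valued headers, here is a sketch of parsing `X-Builder-Tags` and `X-Builder-Args`. The function names are hypothetical, and rejecting malformed pairs with `ValueError` is an assumption about error handling. Note that with the `K1=V1,K2=V2` format a value cannot itself contain a comma.

```python
from typing import Dict, List


def parse_tags(header_value: str) -> List[str]:
    """Split a comma-separated tag list, dropping blanks and stray whitespace."""
    return [tag.strip() for tag in header_value.split(",") if tag.strip()]


def parse_build_args(header_value: str) -> Dict[str, str]:
    """Parse 'K1=V1,K2=V2' into a dict of build arguments."""
    args: Dict[str, str] = {}
    for pair in header_value.split(","):
        pair = pair.strip()
        if not pair:
            continue
        # Split on the first '=' only, so values may contain '='.
        key, sep, value = pair.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"malformed build arg: {pair!r}")
        args[key.strip()] = value
    return args
```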

## Task Runner

The project includes a `Taskfile.yml` for [go-task](https://taskfile.dev/):

```bash
# Development
task dev           # Run locally
task dev:watch     # Run with auto-reload

# Linting
task lint          # Run linting
task lint:fix      # Fix linting issues

# Docker
task docker:build  # Build Docker image
task docker:run    # Run Docker image

# API Operations
task health        # Check health
task status        # Get status
task metrics       # Get metrics
task history       # Get build history
task logs          # Get recent logs

# Build Operations
task build REPO_URL=... IMAGE=...  # Trigger a build
task build:cancel BUILD_ID=...     # Cancel a build

# Hivemind
task hivemind:register  # Register with controller
task hivemind:heartbeat # Send heartbeat

# Deployment
task deploy:hf     # Deploy to HuggingFace
```

## License

MIT