jonathanagustin committed on
Commit
9e22678
·
verified ·
1 Parent(s): 76a994f

Upload folder using huggingface_hub

Files changed (2)
  1. README.md +73 -20
  2. app.py +234 -2
README.md CHANGED
@@ -12,44 +12,97 @@ Daemonless Docker image builder using Kaniko. Builds and pushes images to GHCR w
12
 
13
  Set these as HuggingFace Secrets:
14
 
15
- - `REGISTRY_USER`: GitHub username
16
- - `REGISTRY_PASSWORD`: GitHub PAT with `packages:write` scope
17
- - `GITHUB_TOKEN`: (optional) GitHub PAT for cloning private repositories
18
- - `UPSTASH_REDIS_REST_URL`: (optional) Redis URL for build queue
19
- - `UPSTASH_REDIS_REST_TOKEN`: (optional) Redis token
 
 
 
 
20
 
21
  ## Usage
22
 
23
  ### Via Web UI
 
24
  Navigate to the Space and use the build form.
25
 
26
  ### Via API
 
27
  ```bash
28
  curl -X POST https://jonathanagustin-builder.hf.space/build \
29
- -H "Authorization: Bearer $HF_TOKEN" \
30
  -H "Content-Type: application/json" \
31
  -d '{
32
- "repo_url": "https://github.com/jonathanagustin/lawforge",
33
- "image_name": "jonathanagustin/lawforge-worker",
34
  "branch": "main",
35
  "tags": ["latest"],
36
- "dockerfile": "Dockerfile",
37
- "context_path": "jobs/worker"
38
  }'
39
  ```
40
 
41
- ### Via Taskfile
42
- ```bash
43
- # Build worker image
44
- task builder:build:worker
45
 
46
- # Build any image
47
- REPO_URL=https://github.com/user/repo IMAGE_NAME=user/image task builder:build
48
  ```
49
 
50
  ## Endpoints
51
 
52
- - `GET /` - Web UI
53
- - `GET /api/status` - Builder status JSON
54
- - `POST /build` - Trigger a build
55
- - `POST /api/queue` - Queue a build (requires Redis)
12
 
13
  Set these as HuggingFace Secrets:
14
 
15
+ | Secret | Required | Description |
16
+ |--------|----------|-------------|
17
+ | `REGISTRY_USER` | Yes | GitHub username |
18
+ | `REGISTRY_PASSWORD` | Yes | GitHub PAT with `packages:write` scope |
19
+ | `GITHUB_TOKEN` | For private repos | GitHub PAT for cloning private repositories |
20
+ | `GITHUB_WEBHOOK_SECRET` | Recommended | Secret for validating GitHub webhook payloads |
21
+ | `DEFAULT_IMAGE_NAME` | For webhooks | Default image name (e.g., `jonathanagustin/lawforge`) |
22
+ | `UPSTASH_REDIS_REST_URL` | Optional | Redis URL for build queue |
23
+ | `UPSTASH_REDIS_REST_TOKEN` | Optional | Redis token |
24
 
25
  ## Usage
26
 
27
+ ### Via GitHub Webhook (Automatic Builds)
28
+
29
+ Set up automatic builds when you push to your repository:
30
+
31
+ 1. Go to your GitHub repo → **Settings** → **Webhooks** → **Add webhook**
32
+ 2. **Payload URL**: `https://jonathanagustin-builder.hf.space/webhook/github`
33
+ 3. **Content type**: `application/json`
34
+ 4. **Secret**: Same value as `GITHUB_WEBHOOK_SECRET` (generate with `openssl rand -hex 32`)
35
+ 5. **Events**: Select "Just the push event"
36
+
37
+ The builder will automatically:
38
+ - Trigger on pushes to the default branch (main/master)
39
+ - Only build when relevant files change (Dockerfile, src/**, pyproject.toml, etc.)
40
+ - Skip builds for non-Docker-related changes (docs, tests, etc.)
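
The "only build when relevant files change" step depends on knowing which paths a push touched. The following is a minimal sketch of that collection step, assuming the standard GitHub push payload shape in which each entry of `commits` carries `added`, `modified`, and `removed` lists. The `changed_files` helper name and the sample payload are hypothetical and only illustrate the idea; they are not copied from the builder.

```python
def changed_files(push_payload: dict) -> list[str]:
    """Collect every path touched by the commits in a push payload."""
    paths: set[str] = set()
    for commit in push_payload.get("commits", []):
        # A path may appear in several commits; the set removes duplicates.
        for key in ("added", "modified", "removed"):
            paths.update(commit.get(key, []))
    return sorted(paths)


# Hypothetical, trimmed-down payload used for illustration only.
sample_payload = {
    "ref": "refs/heads/main",
    "commits": [
        {"added": ["src/app/main.py"], "modified": ["README.md"], "removed": []},
        {"added": [], "modified": ["Dockerfile", "README.md"], "removed": []},
    ],
}

print(changed_files(sample_payload))
```

Sorting the result is a choice made here for stable output; the order of paths does not matter for deciding whether to build.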
41
+
42
+ #### File Patterns That Trigger Builds
43
+
44
+ ```
45
+ Dockerfile, docker/*, src/**/*.py, pyproject.toml, uv.lock, requirements*.txt, .dockerignore
46
+ ```
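
To see how such patterns behave, here is a small sketch that checks a few paths with Python's `fnmatch` module, which is what the `app.py` change in this commit imports. The `PATTERNS` subset and the `matches` helper are illustrative names, not the builder's own. Two properties of `fnmatch` matter here: `*` also matches `/`, and `**` has no special recursive meaning, so `src/**/*.py` needs at least one directory level below `src/`.

```python
import fnmatch

# Illustrative subset of the patterns listed above.
PATTERNS = ["Dockerfile", "docker/*", "src/**/*.py", "requirements*.txt"]


def matches(path: str) -> bool:
    """Return True if the path matches any of the patterns."""
    return any(fnmatch.fnmatch(path, pattern) for pattern in PATTERNS)


for path in (
    "Dockerfile",               # exact name at the repo root
    "jobs/worker/Dockerfile",   # nested Dockerfile: the bare name does not match
    "docker/scripts/entry.sh",  # "*" spans "/" in fnmatch
    "src/pkg/app.py",           # one directory below src/
    "src/app.py",               # directly in src/: "src/**/*.py" needs another "/"
    "requirements-dev.txt",
):
    print(f"{path}: {matches(path)}")
```

Under these rules a Python file directly inside `src/` and a Dockerfile in a subdirectory would not trigger a build on their own, so a repository with that layout may need extra patterns.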
47
+
48
+ #### Optional Webhook Headers
49
+
50
+ For per-repo configuration, add custom headers to your webhook:
51
+
52
+ | Header | Description |
53
+ |--------|-------------|
54
+ | `X-Builder-Token` | GitHub token for cloning private repos |
55
+ | `X-Builder-Image` | Override image name (default: repo name) |
56
+ | `X-Builder-Tags` | Comma-separated tags (default: "latest") |
57
+
58
  ### Via Web UI
59
+
60
  Navigate to the Space and use the build form.
61
 
62
  ### Via API
63
+
64
  ```bash
65
  curl -X POST https://jonathanagustin-builder.hf.space/build \
 
66
  -H "Content-Type: application/json" \
67
  -d '{
68
+ "repo_url": "https://github.com/owner/repo",
69
+ "image_name": "owner/repo",
70
  "branch": "main",
71
  "tags": ["latest"],
72
+ "dockerfile": "Dockerfile"
 
73
  }'
74
  ```
75
 
76
+ ### Via Test Endpoint
 
 
 
77
 
78
+ Trigger a build without webhook signature validation:
79
+
80
+ ```bash
81
+ curl -X POST https://jonathanagustin-builder.hf.space/webhook/test \
82
+ -H "Content-Type: application/json" \
83
+ -d '{
84
+ "repo_url": "https://github.com/owner/repo",
85
+ "image_name": "owner/repo"
86
+ }'
87
  ```
88
 
89
  ## Endpoints
90
 
91
+ | Endpoint | Method | Description |
92
+ |----------|--------|-------------|
93
+ | `/` | GET | Web UI |
94
+ | `/api/status` | GET | Builder status JSON |
95
+ | `/build` | POST | Trigger a build manually |
96
+ | `/webhook/github` | POST | GitHub webhook endpoint |
97
+ | `/webhook/test` | POST | Test build trigger (no signature) |
98
+ | `/api/queue` | POST | Queue a build (requires Redis) |
99
+
100
+ ## How It Works
101
+
102
+ 1. **Webhook received**: GitHub sends push event to `/webhook/github`
103
+ 2. **Signature verified**: HMAC-SHA256 validation using `GITHUB_WEBHOOK_SECRET`
104
+ 3. **Files checked**: Only builds if Docker-relevant files changed
105
+ 4. **Repo cloned**: Uses `GITHUB_TOKEN` (or per-repo `X-Builder-Token`) for private repos
106
+ 5. **Image built**: Kaniko builds without Docker daemon
107
+ 6. **Image pushed**: Pushes to GHCR using `REGISTRY_USER`/`REGISTRY_PASSWORD`
108
+ 7. **Cleanup**: Temporary files removed
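
Step 2 can be sketched in a few lines. GitHub sends the signature in the `X-Hub-Signature-256` header as `sha256=` followed by the hex HMAC-SHA256 of the raw request body, keyed with the webhook secret. The `sign` and `verify` helpers and the sample secret below are hypothetical and only illustrate the check; they are a sketch, not the builder's exact code.

```python
import hashlib
import hmac


def sign(secret: str, body: bytes) -> str:
    """Compute the header value GitHub would send for this body."""
    digest = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    return f"sha256={digest}"


def verify(secret: str, body: bytes, header: str) -> bool:
    """Constant-time comparison of the expected and received signatures."""
    expected = sign(secret, body)
    # Compare as bytes so a non-ASCII header value cannot raise TypeError.
    return hmac.compare_digest(expected.encode(), header.encode())


# Illustration with a made-up secret and body.
secret = "example-secret"
body = b'{"ref": "refs/heads/main"}'
header = sign(secret, body)

print(verify(secret, body, header))         # signature matches the body
print(verify(secret, body + b" ", header))  # any change to the body breaks it
```

The signature must be computed over the raw bytes of the request, before any JSON parsing, because re-serializing the payload can change whitespace or key order and invalidate the digest.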
app.py CHANGED
@@ -2,16 +2,28 @@
2
  """Docker image builder using Kaniko - no Docker daemon required.
3
 
4
  Builds Docker images and pushes to container registries (GHCR, Docker Hub, etc.)
5
- Can be triggered via API or run builds from a queue in Redis.
6
 
7
  Environment variables:
8
  - REGISTRY_USER: Registry username (e.g., GitHub username for GHCR)
9
  - REGISTRY_PASSWORD: Registry password/token (e.g., GitHub PAT with packages:write)
10
  - REGISTRY_URL: Registry URL (default: ghcr.io)
 
 
11
  - UPSTASH_REDIS_REST_URL: Redis URL for build queue
12
  - UPSTASH_REDIS_REST_TOKEN: Redis token
 
 
 
 
 
 
 
13
  """
14
 
 
 
 
15
  import json
16
  import os
17
  import shutil
@@ -23,7 +35,7 @@ import uuid
23
  from datetime import datetime, timezone
24
  from pathlib import Path
25
 
26
- from flask import Flask, jsonify, render_template_string, request
27
  import git
28
 
29
  # =============================================================================
@@ -36,8 +48,25 @@ REGISTRY_URL = os.environ.get("REGISTRY_URL", "ghcr.io")
36
  REGISTRY_USER = os.environ.get("REGISTRY_USER", "")
37
  REGISTRY_PASSWORD = os.environ.get("REGISTRY_PASSWORD", "")
38
  GITHUB_TOKEN = os.environ.get("GITHUB_TOKEN", "") # For cloning private repos
 
39
  AUTO_START = os.environ.get("AUTO_START", "false").lower() == "true"
40
 
41
  # Global state
42
  state = {
43
  "status": "idle",
@@ -869,6 +898,199 @@ def api_queue():
869
  return jsonify({"status": "queued", "build_id": build_id}), 202
870
 
871
 
872
  # =============================================================================
873
  # Startup
874
  # =============================================================================
@@ -886,11 +1108,21 @@ def startup():
886
  else:
887
  log("⚠️ REGISTRY_USER not set - pushes will fail")
888
 
889
  # Start queue worker if Redis is configured
890
  if redis_client:
891
  threading.Thread(target=queue_worker, daemon=True).start()
892
 
893
  log("Ready for builds!")
 
894
 
895
 
896
  threading.Thread(target=startup, daemon=True).start()
 
2
  """Docker image builder using Kaniko - no Docker daemon required.
3
 
4
  Builds Docker images and pushes to container registries (GHCR, Docker Hub, etc.)
5
+ Can be triggered via API, GitHub webhooks, or run builds from a queue in Redis.
6
 
7
  Environment variables:
8
  - REGISTRY_USER: Registry username (e.g., GitHub username for GHCR)
9
  - REGISTRY_PASSWORD: Registry password/token (e.g., GitHub PAT with packages:write)
10
  - REGISTRY_URL: Registry URL (default: ghcr.io)
11
+ - GITHUB_TOKEN: GitHub token for cloning private repos
12
+ - GITHUB_WEBHOOK_SECRET: Secret for validating GitHub webhook payloads
13
  - UPSTASH_REDIS_REST_URL: Redis URL for build queue
14
  - UPSTASH_REDIS_REST_TOKEN: Redis token
15
+
16
+ Webhook setup:
17
+ 1. Go to your GitHub repo → Settings → Webhooks → Add webhook
18
+ 2. Payload URL: https://your-space.hf.space/webhook/github
19
+ 3. Content type: application/json
20
+ 4. Secret: Same value as GITHUB_WEBHOOK_SECRET
21
+ 5. Events: Just the push event
22
  """
23
 
24
+ import fnmatch
25
+ import hashlib
26
+ import hmac
27
  import json
28
  import os
29
  import shutil
 
35
  from datetime import datetime, timezone
36
  from pathlib import Path
37
 
38
+ from flask import Flask, jsonify, render_template_string, request, abort
39
  import git
40
 
41
  # =============================================================================
 
48
  REGISTRY_USER = os.environ.get("REGISTRY_USER", "")
49
  REGISTRY_PASSWORD = os.environ.get("REGISTRY_PASSWORD", "")
50
  GITHUB_TOKEN = os.environ.get("GITHUB_TOKEN", "") # For cloning private repos
51
+ GITHUB_WEBHOOK_SECRET = os.environ.get("GITHUB_WEBHOOK_SECRET", "") # For webhook validation
52
  AUTO_START = os.environ.get("AUTO_START", "false").lower() == "true"
53
 
54
+ # Default image name for webhook-triggered builds (owner/repo format)
55
+ DEFAULT_IMAGE_NAME = os.environ.get("DEFAULT_IMAGE_NAME", "")
56
+
57
+ # File patterns that should trigger a Docker rebuild when changed
58
+ # Uses glob-style patterns
59
+ BUILD_TRIGGER_PATTERNS = [
60
+ "Dockerfile",
61
+ "docker/*",
62
+ "docker/**/*",
63
+ "src/**/*.py",
64
+ "pyproject.toml",
65
+ "uv.lock",
66
+ "requirements*.txt",
67
+ ".dockerignore",
68
+ ]
69
+
70
  # Global state
71
  state = {
72
  "status": "idle",
 
898
  return jsonify({"status": "queued", "build_id": build_id}), 202
899
 
900
 
901
+ # =============================================================================
902
+ # GitHub Webhook
903
+ # =============================================================================
904
+
905
+ def verify_webhook_signature(payload: bytes, signature: str) -> bool:
906
+ """Verify GitHub webhook signature using HMAC-SHA256."""
907
+ if not GITHUB_WEBHOOK_SECRET:
908
+ log("⚠️ GITHUB_WEBHOOK_SECRET not set - skipping signature verification")
909
+ return True # Allow if no secret configured (not recommended for production)
910
+
911
+ if not signature or not signature.startswith("sha256="):
912
+ return False
913
+
914
+ expected = hmac.new(
915
+ GITHUB_WEBHOOK_SECRET.encode(),
916
+ payload,
917
+ hashlib.sha256
918
+ ).hexdigest()
919
+
920
+ return hmac.compare_digest(f"sha256={expected}", signature)
921
+
922
+
923
+ def should_trigger_build(changed_files: list[str]) -> tuple[bool, list[str]]:
924
+ """Check if any changed files match build trigger patterns.
925
+
926
+ Returns:
927
+ Tuple of (should_build, matching_files)
928
+ """
929
+ matching = []
930
+ for filepath in changed_files:
931
+ for pattern in BUILD_TRIGGER_PATTERNS:
932
+ if fnmatch.fnmatch(filepath, pattern):
933
+ matching.append(filepath)
934
+ break
935
+ return len(matching) > 0, matching
936
+
937
+
938
+ def extract_changed_files(payload: dict) -> list[str]:
939
+ """Extract list of changed files from GitHub push payload."""
940
+ files = set()
941
+ for commit in payload.get("commits", []):
942
+ files.update(commit.get("added", []))
943
+ files.update(commit.get("modified", []))
944
+ files.update(commit.get("removed", []))
945
+ return list(files)
946
+
947
+
948
+ @app.route("/webhook/github", methods=["POST"])
949
+ def github_webhook():
950
+ """Handle GitHub webhook events from any repository.
951
+
952
+ Triggers a build when:
953
+ - Event is a push to the default branch (main/master)
954
+ - Changed files match BUILD_TRIGGER_PATTERNS
955
+
956
+ Works with any GitHub repository - extracts repo info from payload.
957
+
958
+ Optional headers for per-repo configuration:
959
+ - X-Builder-Token: GitHub token for cloning private repos (overrides env GITHUB_TOKEN)
960
+ - X-Builder-Image: Override image name (default: uses repo full_name)
961
+ - X-Builder-Tags: Comma-separated tags (default: "latest")
962
+ """
963
+ # Verify signature
964
+ signature = request.headers.get("X-Hub-Signature-256", "")
965
+ if not verify_webhook_signature(request.data, signature):
966
+ log("✗ Webhook signature verification failed")
967
+ abort(401, "Invalid signature")
968
+
969
+ # Only handle push events
970
+ event = request.headers.get("X-GitHub-Event", "")
971
+ if event == "ping":
972
+ log("✓ Webhook ping received")
973
+ return jsonify({"status": "pong"}), 200
974
+
975
+ if event != "push":
976
+ log(f"Ignoring non-push event: {event}")
977
+ return jsonify({"status": "ignored", "reason": f"event type: {event}"}), 200
978
+
979
+ payload = request.json
980
+ if not payload:
981
+ abort(400, "Missing payload")
982
+
983
+ # Extract repo info from webhook payload (works for any repo)
984
+ repo = payload.get("repository", {})
985
+ repo_url = repo.get("clone_url", "")
986
+ repo_full_name = repo.get("full_name", "") # e.g., "owner/repo"
987
+ default_branch = repo.get("default_branch", "main")
988
+ is_private = repo.get("private", False)
989
+
990
+ # Get the ref that was pushed
991
+ ref = payload.get("ref", "")
992
+ pushed_branch = ref.replace("refs/heads/", "") if ref.startswith("refs/heads/") else ""
993
+
994
+ log(f"Webhook: push to {repo_full_name}/{pushed_branch} (private={is_private})")
995
+
996
+ # Only build on pushes to default branch
997
+ if pushed_branch != default_branch:
998
+ log(f"Ignoring push to non-default branch: {pushed_branch} (default: {default_branch})")
999
+ return jsonify({
1000
+ "status": "ignored",
1001
+ "reason": f"branch {pushed_branch} is not default branch {default_branch}"
1002
+ }), 200
1003
+
1004
+ # Check if any relevant files changed
1005
+ changed_files = extract_changed_files(payload)
1006
+ should_build, matching_files = should_trigger_build(changed_files)
1007
+
1008
+ if not should_build:
1009
+ log(f"No build-relevant files changed in {len(changed_files)} files")
1010
+ return jsonify({
1011
+ "status": "ignored",
1012
+ "reason": "no build-relevant files changed",
1013
+ "changed_files": changed_files[:20] # Limit for response size
1014
+ }), 200
1015
+
1016
+ log(f"Build triggered by {len(matching_files)} matching files: {matching_files[:5]}")
1017
+
1018
+ # Per-repo overrides via headers
1019
+ override_token = request.headers.get("X-Builder-Token", "")
1020
+ override_image = request.headers.get("X-Builder-Image", "")
1021
+ override_tags = request.headers.get("X-Builder-Tags", "")
1022
+
1023
+ # Determine image name: header override > env default > repo name
1024
+ image_name = override_image or DEFAULT_IMAGE_NAME or repo_full_name
1025
+
1026
+ # Determine tags
1027
+ if override_tags:
1028
+ tags = [t.strip() for t in override_tags.split(",") if t.strip()]
1029
+ else:
1030
+ tags = ["latest"]
1031
+
1032
+ # Build configuration
1033
+ build_config = {
1034
+ "repo_url": repo_url,
1035
+ "branch": pushed_branch,
1036
+ "image_name": image_name,
1037
+ "tags": tags,
1038
+ "trigger": "webhook",
1039
+ "matching_files": matching_files[:10],
1040
+ }
1041
+
1042
+ # Add per-repo token if provided (for private repos)
1043
+ if override_token:
1044
+ build_config["github_token"] = override_token
1045
+ log(f"Using per-repo token for {repo_full_name}")
1046
+
1047
+ # Check if already building
1048
+ if state["status"] == "building":
1049
+ log("Build already in progress - queueing webhook build")
1050
+ if redis_client:
1051
+ build_id = queue_build(build_config)
1052
+ return jsonify({"status": "queued", "build_id": build_id}), 202
1053
+ else:
1054
+ return jsonify({"status": "busy", "reason": "build in progress and no queue configured"}), 409
1055
+
1056
+ threading.Thread(target=run_build, args=(build_config,), daemon=True).start()
1057
+
1058
+ return jsonify({
1059
+ "status": "started",
1060
+ "repo": repo_full_name,
1061
+ "branch": pushed_branch,
1062
+ "image": f"{REGISTRY_URL}/{image_name}",
1063
+ "tags": tags,
1064
+ "matching_files": matching_files
1065
+ }), 202
1066
+
1067
+
1068
+ @app.route("/webhook/test", methods=["POST"])
1069
+ def test_webhook():
1070
+ """Test endpoint to simulate a webhook (no signature required)."""
1071
+ if state["status"] == "building":
1072
+ return jsonify({"error": "Build already in progress"}), 409
1073
+
1074
+ # Allow testing with minimal payload
1075
+ payload = request.json or {}
1076
+ repo_url = payload.get("repo_url", "https://github.com/jonathanagustin/lawforge")
1077
+ branch = payload.get("branch", "main")
1078
+ image_name = payload.get("image_name", DEFAULT_IMAGE_NAME or "jonathanagustin/lawforge")
1079
+
1080
+ build_config = {
1081
+ "repo_url": repo_url,
1082
+ "branch": branch,
1083
+ "image_name": image_name,
1084
+ "tags": ["latest"],
1085
+ "trigger": "test",
1086
+ }
1087
+
1088
+ log(f"Test webhook triggered: {image_name}")
1089
+ threading.Thread(target=run_build, args=(build_config,), daemon=True).start()
1090
+
1091
+ return jsonify({"status": "started", "config": build_config}), 202
1092
+
1093
+
1094
  # =============================================================================
1095
  # Startup
1096
  # =============================================================================
 
1108
  else:
1109
  log("⚠️ REGISTRY_USER not set - pushes will fail")
1110
 
1111
+ # Webhook configuration status
1112
+ if GITHUB_WEBHOOK_SECRET:
1113
+ log("✓ GitHub webhook secret configured")
1114
+ else:
1115
+ log("⚠️ GITHUB_WEBHOOK_SECRET not set - webhooks will accept unsigned payloads")
1116
+
1117
+ if DEFAULT_IMAGE_NAME:
1118
+ log(f"✓ Default image: {REGISTRY_URL}/{DEFAULT_IMAGE_NAME}")
1119
+
1120
  # Start queue worker if Redis is configured
1121
  if redis_client:
1122
  threading.Thread(target=queue_worker, daemon=True).start()
1123
 
1124
  log("Ready for builds!")
1125
+ log(f"Webhook endpoint: /webhook/github")
1126
 
1127
 
1128
  threading.Thread(target=startup, daemon=True).start()