IdlecloudX committed on
Commit 56cd400 · verified · 1 Parent(s): 6422101

Deploy SHARP ZeroGPU Space

Files changed (3)
  1. README.md +35 -7
  2. app.py +268 -0
  3. requirements.txt +18 -0
README.md CHANGED
@@ -1,13 +1,41 @@
  ---
- title: Ml Sharp Zerogpu
- emoji: 📉
- colorFrom: pink
- colorTo: purple
+ title: Apple SHARP ZeroGPU
+ emoji: 🧊
+ colorFrom: blue
+ colorTo: gray
  sdk: gradio
  sdk_version: 6.19.0
- python_version: '3.12'
+ python_version: 3.12.12
  app_file: app.py
- pinned: false
+ short_description: Single-image SHARP to 3DGS PLY on ZeroGPU.
+ models:
+ - IdlecloudX/ml-sharp-weights
+ tags:
+ - zero-gpu
+ - gradio
+ - 3d
+ - gaussian-splatting
+ - monocular-view-synthesis
+ preload_from_hub:
+ - IdlecloudX/ml-sharp-weights sharp_2572gikvuh.pt
  ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # Apple SHARP ZeroGPU
+
+ This Space wraps [apple/ml-sharp](https://github.com/apple/ml-sharp) for a public research demo on Hugging Face ZeroGPU.
+
+ Upload a single image and the Space returns a downloadable 3D Gaussian Splatting `.ply` file. The output is a 3DGS representation, not a mesh, OBJ, or GLB file.
+
+ ## Usage and License Limits
+
+ Apple SHARP model weights are released under the Apple Machine Learning Research Model License Agreement. They are limited to scientific research and non-commercial use. See the linked upstream repository and the model repository license file before using the weights or outputs.
+
+ This Space does not modify or fine-tune the Apple model. It only loads the published checkpoint and exports SHARP's predicted 3DGS scene.
+
+ ## Implementation Notes
+
+ - Runtime: Gradio Space on ZeroGPU.
+ - GPU function: `@spaces.GPU(duration=60, size="large")`.
+ - Model source: `IdlecloudX/ml-sharp-weights/sharp_2572gikvuh.pt`.
+ - SHARP source commit: `apple/ml-sharp@1eaa046834b81852261262b41b0919f5c1efdd2e`.
+ - Default output: `.ply` only. Video rendering is intentionally not enabled in this first version.
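
Because the README stresses that the output is a 3DGS `.ply` and not a mesh, one quick sanity check on a downloaded file is to read its PLY header. The following is a minimal sketch using only the Python standard library. The helper name `read_ply_header` and the sample property names are illustrative assumptions, not SHARP's actual export schema.

```python
import tempfile
from pathlib import Path


def read_ply_header(path: Path) -> dict[str, list[str]]:
    """Return a mapping of element name -> property names from a PLY header."""
    elements: dict[str, list[str]] = {}
    current = None
    with path.open("rb") as handle:
        if handle.readline().strip() != b"ply":
            raise ValueError("not a PLY file")
        for raw in handle:
            tokens = raw.decode("ascii", errors="replace").split()
            if not tokens:
                continue
            if tokens[0] == "end_header":
                break
            if tokens[0] == "element":
                current = tokens[1]
                elements[current] = []
            elif tokens[0] == "property" and current is not None:
                # The property name is the last token on a property line.
                elements[current].append(tokens[-1])
    return elements


# Hypothetical header for illustration only; a real SHARP export may differ.
SAMPLE = (
    b"ply\nformat binary_little_endian 1.0\n"
    b"element vertex 2\n"
    b"property float x\nproperty float y\nproperty float z\n"
    b"property float opacity\n"
    b"end_header\n"
)

with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "sample.ply"
    sample_path.write_bytes(SAMPLE + b"\x00" * 32)  # dummy binary payload
    header = read_ply_header(sample_path)
    print(header)
```

A mesh-style PLY would typically also declare a `face` element, so its absence from the returned mapping is a reasonable hint that the file holds per-point data only.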
app.py ADDED
@@ -0,0 +1,268 @@
+ from __future__ import annotations
+
+ import logging
+ import os
+ import re
+ import time
+ import traceback
+ from pathlib import Path
+ from uuid import uuid4
+
+ import gradio as gr
+ import numpy as np
+ import spaces
+ import torch
+ import torch.nn.functional as F
+ from huggingface_hub import hf_hub_download
+
+ from sharp.models import PredictorParams, RGBGaussianPredictor, create_predictor
+ from sharp.utils import io
+ from sharp.utils.gaussians import Gaussians3D, save_ply, unproject_gaussians
+
+
+ LOGGER = logging.getLogger(__name__)
+ logging.basicConfig(level=logging.INFO)
+
+ WEIGHTS_REPO_ID = os.getenv("SHARP_WEIGHTS_REPO_ID", "IdlecloudX/ml-sharp-weights")
+ CHECKPOINT_FILENAME = os.getenv("SHARP_CHECKPOINT_FILENAME", "sharp_2572gikvuh.pt")
+ OUTPUT_DIR = Path(os.getenv("SHARP_OUTPUT_DIR", "outputs"))
+ INTERNAL_SHAPE = (1536, 1536)
+
+
+ def get_runtime_device() -> torch.device:
+     """Select the device used for SHARP inference.
+
+     Args:
+         None.
+
+     Returns:
+         torch.device: cuda on ZeroGPU or a real CUDA environment; cpu for local smoke tests without CUDA.
+     """
+     if os.getenv("SPACE_ID") or torch.cuda.is_available():
+         return torch.device("cuda")
+     return torch.device("cpu")
+
+
+ DEVICE = get_runtime_device()
+ OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
+
+
+ def sanitize_stem(stem: str) -> str:
+     """Sanitize the uploaded file name into a prefix that is safe to write to the output directory.
+
+     Args:
+         stem: The original file name without its extension.
+
+     Returns:
+         str: A file name prefix containing only letters, digits, dots, underscores, and hyphens.
+     """
+     normalized = re.sub(r"[^A-Za-z0-9._-]+", "_", stem).strip("._-")
+     return normalized[:64] or "sharp_scene"
+
+
+ def resolve_checkpoint_path() -> Path:
+     """Resolve the SHARP checkpoint path from the Hugging Face Hub cache.
+
+     Args:
+         None.
+
+     Returns:
+         Path: Local path of the downloaded or preloaded checkpoint.
+     """
+     checkpoint_path = hf_hub_download(
+         repo_id=WEIGHTS_REPO_ID,
+         filename=CHECKPOINT_FILENAME,
+         repo_type="model",
+     )
+     return Path(checkpoint_path)
+
+
+ def load_predictor() -> RGBGaussianPredictor:
+     """Load the Apple SHARP weights and initialize the Gaussian predictor.
+
+     Args:
+         None.
+
+     Returns:
+         RGBGaussianPredictor: The prediction model, switched to eval mode and moved to the target device.
+     """
+     checkpoint_path = resolve_checkpoint_path()
+     LOGGER.info("Loading SHARP checkpoint from %s", checkpoint_path)
+
+     # Deserialize the weights on the CPU first so that downloading and
+     # deserialization do not occupy real ZeroGPU memory.
+     state_dict = torch.load(checkpoint_path, map_location="cpu", weights_only=True)
+     predictor = create_predictor(PredictorParams())
+     predictor.load_state_dict(state_dict)
+     predictor.eval()
+
+     # The ZeroGPU docs recommend moving the model to cuda at module load time;
+     # the runtime then takes over the real GPU allocation.
+     predictor.to(DEVICE)
+     return predictor
+
+
+ @torch.no_grad()
+ def predict_image(
+     predictor: RGBGaussianPredictor,
+     image: np.ndarray,
+     f_px: float,
+     device: torch.device,
+ ) -> Gaussians3D:
+     """Convert a single RGB image into a 3D Gaussian representation.
+
+     Args:
+         predictor: The SHARP Gaussian predictor with weights loaded.
+         image: RGB image array of shape HxWx3.
+         f_px: Focal length in pixels, derived from EXIF or a default value.
+         device: Device on which tensor inference runs.
+
+     Returns:
+         Gaussians3D: 3D Gaussian data converted from NDC space back to metric space.
+     """
+     image_pt = torch.from_numpy(image.copy()).float().to(device).permute(2, 0, 1) / 255.0
+     _, height, width = image_pt.shape
+     disparity_factor = torch.tensor([f_px / width], dtype=torch.float32, device=device)
+
+     # The official SHARP implementation always uses 1536x1536 as the
+     # network's internal input resolution.
+     image_resized_pt = F.interpolate(
+         image_pt[None],
+         size=(INTERNAL_SHAPE[1], INTERNAL_SHAPE[0]),
+         mode="bilinear",
+         align_corners=True,
+     )
+
+     # The network output lives in NDC space; it is restored to metric space
+     # below using the camera intrinsics.
+     gaussians_ndc = predictor(image_resized_pt, disparity_factor)
+
+     intrinsics = torch.tensor(
+         [
+             [f_px, 0, width / 2, 0],
+             [0, f_px, height / 2, 0],
+             [0, 0, 1, 0],
+             [0, 0, 0, 1],
+         ],
+         dtype=torch.float32,
+         device=device,
+     )
+     intrinsics_resized = intrinsics.clone()
+     intrinsics_resized[0] *= INTERNAL_SHAPE[0] / width
+     intrinsics_resized[1] *= INTERNAL_SHAPE[1] / height
+
+     # Match the upstream CLI: transform the NDC Gaussians into metric 3D
+     # space before export.
+     return unproject_gaussians(
+         gaussians_ndc,
+         torch.eye(4, device=device),
+         intrinsics_resized,
+         INTERNAL_SHAPE,
+     )
+
+
+ def save_uploaded_image_as_ply(image_path: str, predictor: RGBGaussianPredictor) -> tuple[Path, float]:
+     """Read the uploaded image, run SHARP, and save the result as a 3DGS PLY file.
+
+     Args:
+         image_path: Local temporary file path of the image uploaded through Gradio.
+         predictor: The SHARP Gaussian predictor with weights loaded.
+
+     Returns:
+         tuple[Path, float]: The output PLY path and the processing time in seconds.
+     """
+     start_time = time.perf_counter()
+     input_path = Path(image_path)
+
+     # io.load_rgb handles EXIF orientation, HEIC, and the default focal-length
+     # fallback when EXIF carries no focal length.
+     image, _, f_px = io.load_rgb(input_path)
+     height, width = image.shape[:2]
+
+     gaussians = predict_image(predictor, image, f_px, DEVICE)
+     output_name = f"{sanitize_stem(input_path.stem)}_{uuid4().hex[:10]}.ply"
+     output_path = OUTPUT_DIR / output_name
+
+     # The save format follows Apple SHARP and includes vertex attributes,
+     # intrinsics, image size, and color-space metadata.
+     save_ply(gaussians, f_px, (height, width), output_path)
+     elapsed_seconds = time.perf_counter() - start_time
+     return output_path, elapsed_seconds
+
+
+ MODEL_LOAD_ERROR: str | None = None
+ PREDICTOR: RGBGaussianPredictor | None = None
+
+ try:
+     if os.getenv("SHARP_SKIP_MODEL_LOAD") == "1":
+         LOGGER.warning("Skipping SHARP model load because SHARP_SKIP_MODEL_LOAD=1.")
+     else:
+         PREDICTOR = load_predictor()
+ except Exception:
+     MODEL_LOAD_ERROR = traceback.format_exc(limit=8)
+     LOGGER.exception("Failed to load SHARP model.")
+
+
+ @spaces.GPU(duration=60, size="large")
+ def generate_ply(image_path: str | None) -> tuple[str | None, str]:
+     """Gradio event handler: convert the uploaded image into a downloadable 3DGS PLY file.
+
+     Args:
+         image_path: Local image path passed in by the Gradio Image component.
+
+     Returns:
+         tuple[str | None, str]: The PLY file path and the status text shown to the user.
+     """
+     if image_path is None:
+         return None, "Please upload a JPEG, PNG, or HEIC image first."
+
+     if PREDICTOR is None:
+         detail = MODEL_LOAD_ERROR or "The model has not been loaded, and no detailed exception was captured."
+         return None, f"The SHARP model failed to load, so inference cannot run.\n\n```text\n{detail}\n```"
+
+     try:
+         output_path, elapsed_seconds = save_uploaded_image_as_ply(image_path, PREDICTOR)
+     except Exception:
+         detail = traceback.format_exc(limit=8)
+         LOGGER.exception("Failed to generate PLY.")
+         return None, f"Generation failed. Please make sure the upload is a valid image file.\n\n```text\n{detail}\n```"
+
+     file_size_mb = output_path.stat().st_size / (1024 * 1024)
+     status = (
+         f"Done: `{output_path.name}`\n\n"
+         f"- Time: {elapsed_seconds:.2f} s\n"
+         f"- File size: {file_size_mb:.2f} MB\n"
+         "- Output format: 3D Gaussian Splatting `.ply`, not a mesh or GLB\n"
+         "- Usage limit: Apple SHARP model weights are restricted to scientific research / non-commercial use"
+     )
+     return str(output_path), status
+
+
+ with gr.Blocks(title="Apple SHARP ZeroGPU") as demo:
+     gr.Markdown(
+         """
+         # Apple SHARP ZeroGPU
+
+         Upload one image and generate a downloadable 3D Gaussian Splatting `.ply` file.
+
+         This Space is a research demo for Apple SHARP. The model weights are licensed
+         for scientific research and non-commercial use only. The output is a 3DGS file,
+         not a mesh or GLB model.
+         """
+     )
+     with gr.Row():
+         image_input = gr.Image(
+             label="Input image",
+             sources=["upload"],
+             type="filepath",
+             image_mode="RGB",
+         )
+         with gr.Column():
+             output_file = gr.File(label="Generated 3DGS PLY")
+             status_output = gr.Markdown(label="Status")
+
+     run_button = gr.Button("Generate PLY", variant="primary")
+     run_button.click(
+         fn=generate_ply,
+         inputs=image_input,
+         outputs=[output_file, status_output],
+         concurrency_limit=1,
+         show_progress="full",
+     )
+
+
+ if __name__ == "__main__":
+     demo.queue(default_concurrency_limit=1).launch()
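
The intrinsics handling in `predict_image` can be checked by hand. The sketch below repeats the same arithmetic in plain Python (no torch) for a hypothetical 4000x3000 photo with a focal length of 3000 px. The function name `scale_intrinsics` and the sample numbers are assumptions made for illustration; only the matrix layout and the row-wise scaling mirror the diff above.

```python
import math


def scale_intrinsics(f_px, width, height, target=(1536, 1536)):
    """Build a 4x4 pinhole intrinsics matrix and rescale it to the target size."""
    intrinsics = [
        [f_px, 0.0, width / 2, 0.0],
        [0.0, f_px, height / 2, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]
    scale_x = target[0] / width
    scale_y = target[1] / height
    resized = [row[:] for row in intrinsics]
    # Row 0 holds fx and cx, so it scales with the width ratio.
    resized[0] = [value * scale_x for value in resized[0]]
    # Row 1 holds fy and cy, so it scales with the height ratio.
    resized[1] = [value * scale_y for value in resized[1]]
    return resized


# Hypothetical input: a 4000x3000 image with a 3000 px focal length.
resized = scale_intrinsics(f_px=3000.0, width=4000, height=3000)
fx, cx = resized[0][0], resized[0][2]
fy, cy = resized[1][1], resized[1][2]
print(f"fx={fx:.1f} fy={fy:.1f} cx={cx:.1f} cy={cy:.1f}")

# The principal point should sit at the centre of the resized grid.
assert math.isclose(cx, 1536 / 2) and math.isclose(cy, 1536 / 2)
```

Because the image is stretched to a square, `fx` and `fy` differ after rescaling whenever the input is not square, while the principal point always lands at the centre of the 1536x1536 grid.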
requirements.txt ADDED
@@ -0,0 +1,18 @@
+ gradio==6.19.0
+ spaces==0.50.4
+ huggingface_hub==1.20.1
+ torch==2.8.0
+ torchvision==0.23.0
+ gsplat==1.5.3
+ numpy==2.3.3
+ pillow==11.3.0
+ pillow-heif==1.1.1
+ plyfile==1.1.2
+ scipy==1.16.2
+ timm==1.0.20
+ imageio==2.37.0
+ imageio-ffmpeg==0.6.0
+ matplotlib==3.10.6
+ click==8.3.0
+ ninja==1.13.0
+ git+https://github.com/apple/ml-sharp.git@1eaa046834b81852261262b41b0919f5c1efdd2e
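
Since every dependency is pinned exactly, a local environment can drift from the Space without anyone noticing. The following is a minimal sketch, using only the standard library, of how the `name==version` pins might be compared against what is installed. The helper names `parse_pins` and `find_mismatches` are hypothetical, and the sample text is a shortened copy of the file above.

```python
from importlib import metadata


def parse_pins(text):
    """Return {package: version} for `name==version` lines; other lines are skipped."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue  # e.g. the git+https:// requirement carries no == pin
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins):
    """Map each package whose installed version differs to (wanted, installed)."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # not installed in this environment
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


SAMPLE = """gradio==6.19.0
torch==2.8.0
git+https://github.com/apple/ml-sharp.git@1eaa046834b81852261262b41b0919f5c1efdd2e
"""

pins = parse_pins(SAMPLE)
print(pins)
print(find_mismatches(pins))
```

The VCS requirement is skipped on purpose: it is pinned by commit hash rather than by version, so it would need a separate check.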