Spaces: Running on Zero
linhaotong committed · Commit b396ed8 · 1 Parent(s): 6a8ae2c
update paper link and clean files
Files changed:
- EXAMPLES_DIRECTORY.md +0 -286
- SPACES_GPU_BEST_PRACTICES.md +0 -481
- SPACES_GPU_FIX_GUIDE.md +0 -484
- SPACES_SETUP.md +0 -190
- UPLOAD_EXAMPLES.md +0 -314
- XFORMERS_GUIDE.md +0 -299
- depth_anything_3/app/css_and_html.py +1 -1
- fix_spaces_gpu.patch +0 -142
EXAMPLES_DIRECTORY.md (DELETED)
# 📁 Examples Directory Configuration Guide

## 📍 Examples Directory Location

### Default Location

The examples directory should be placed at:

```
workspace/gradio/examples/
```

### Full Path Explanation

Per the configuration in `app.py`:

```python
workspace_dir = os.environ.get("DA3_WORKSPACE_DIR", "workspace/gradio")
examples_dir = os.path.join(workspace_dir, "examples")
# Result: workspace/gradio/examples/
```

## 📂 Directory Structure

The examples directory should be organized as follows:

```
workspace/gradio/examples/
├── scene1/              # Scene 1
│   ├── 000.png          # Image files
│   ├── 010.png
│   ├── 020.png
│   └── ...
├── scene2/              # Scene 2
│   ├── 000.jpg
│   ├── 010.jpg
│   └── ...
└── scene3/              # Scene 3
    ├── image1.png
    ├── image2.png
    └── ...
```

### Requirements

1. **One folder per scene**: each scene gets its own folder
2. **Folder name**: the folder name is displayed as the scene name
3. **Image files**: `.jpg`, `.jpeg`, `.png`, `.bmp`, `.tiff`, and `.tif` formats are supported
4. **First image**: the first image (sorted by filename) is used as the thumbnail
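The scanning logic implied by these requirements can be sketched as follows. This is a minimal illustration, not the app's actual `get_scene_info()` implementation; the directory layout and extension list match the conventions above:

```python
import os

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".bmp", ".tiff", ".tif")

def scan_examples(examples_dir):
    """Return {scene_name: {"images": [...], "thumbnail": str}} per scene folder."""
    scenes = {}
    if not os.path.isdir(examples_dir):
        return scenes
    for name in sorted(os.listdir(examples_dir)):
        scene_dir = os.path.join(examples_dir, name)
        if not os.path.isdir(scene_dir):
            continue  # one folder per scene; ignore stray files
        images = sorted(
            f for f in os.listdir(scene_dir)
            if f.lower().endswith(IMAGE_EXTENSIONS)
        )
        if images:
            # First image by filename order serves as the thumbnail
            scenes[name] = {"images": images, "thumbnail": images[0]}
    return scenes
```

Run against `workspace/gradio/examples/`, this yields one entry per scene folder, keyed by the folder name that the UI displays.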
## 🔧 Configuration Methods

### Method 1: Use the Default Path (Recommended)

Create the directory directly:

```bash
mkdir -p workspace/gradio/examples
```

Then add scenes:

```bash
# Create a scene folder
mkdir -p workspace/gradio/examples/my_scene

# Copy image files
cp your_images/* workspace/gradio/examples/my_scene/
```

### Method 2: Use an Environment Variable

Customize the location via an environment variable:

```bash
# Set the environment variable
export DA3_WORKSPACE_DIR="/path/to/your/workspace"

# Examples will then live in /path/to/your/workspace/examples
```

Or change the default in `app.py`:

```python
workspace_dir = os.environ.get("DA3_WORKSPACE_DIR", "/custom/path/workspace")
```

### Method 3: On Hugging Face Spaces

On Spaces, examples can be added in the following ways:

1. **Upload via Git**:
   ```bash
   git add workspace/gradio/examples/
   git commit -m "Add example scenes"
   git push
   ```

2. **Upload via the web interface**:
   - Create the `workspace/gradio/examples/` directory in the Spaces file browser
   - Upload scene folders and images

3. **Use persistent storage**:
   - With persistent storage enabled, examples are kept across restarts
   - The path is still `workspace/gradio/examples/`

## 📝 Example Scene Structures

### Example 1: A Single Scene

```
workspace/gradio/examples/
└── indoor_room/
    ├── 000.png
    ├── 010.png
    ├── 020.png
    └── 030.png
```

### Example 2: Multiple Scenes

```
workspace/gradio/examples/
├── outdoor_garden/
│   ├── frame_001.jpg
│   ├── frame_002.jpg
│   └── frame_003.jpg
├── office_space/
│   ├── img_000.png
│   ├── img_010.png
│   └── img_020.png
└── street_scene/
    ├── 000.png
    ├── 010.png
    └── 020.png
```

## 🔍 Verifying the Examples Directory

### Check That the Directory Exists

```bash
# Check the default location
ls -la workspace/gradio/examples/

# Or use Python
python -c "
import os
workspace_dir = os.environ.get('DA3_WORKSPACE_DIR', 'workspace/gradio')
examples_dir = os.path.join(workspace_dir, 'examples')
print(f'Examples directory: {examples_dir}')
print(f'Exists: {os.path.exists(examples_dir)}')
if os.path.exists(examples_dir):
    scenes = [d for d in os.listdir(examples_dir) if os.path.isdir(os.path.join(examples_dir, d))]
    print(f'Found {len(scenes)} scenes: {scenes}')
"
```

### Check Scene Info

On startup, the app scans the examples directory automatically and logs:

```
Found 3 example scenes:
  - scene1 (5 images)
  - scene2 (10 images)
  - scene3 (8 images)
```

## 🚀 Quick Start

### 1. Create the Directory Structure

```bash
# From the project root
mkdir -p workspace/gradio/examples
```

### 2. Add an Example Scene

```bash
# Create a scene folder
mkdir -p workspace/gradio/examples/my_first_scene

# Add image files (copy your images)
cp /path/to/your/images/* workspace/gradio/examples/my_first_scene/
```

### 3. Verify

After launching the app, you should see the example scene grid in the UI.

## 📊 On Hugging Face Spaces

### Upload Methods

1. **Via Git** (recommended):
   ```bash
   # Prepare examples locally
   mkdir -p workspace/gradio/examples
   # ... add scenes ...

   # Commit and push
   git add workspace/gradio/examples/
   git commit -m "Add example scenes"
   git push
   ```

2. **Via the web interface**:
   - In the Spaces file browser
   - Create the `workspace/gradio/examples/` directory
   - Upload scene folders

### Notes

- **File size limits**: make sure image files stay within the Spaces file size limits
- **Persistent storage**: with persistent storage, examples persist across restarts
- **Caching**: results for example scenes are cached under `workspace/gradio/input_images/`

## 🔗 Related Configuration

### Environment Variables

- `DA3_WORKSPACE_DIR`: workspace directory (default: `workspace/gradio`)
- The examples directory is automatically set to `{DA3_WORKSPACE_DIR}/examples`

### Where It Lives in the Code

- `depth_anything_3/app/gradio_app.py`: `cache_examples()` method
- `depth_anything_3/app/modules/utils.py`: `get_scene_info()` function
- `depth_anything_3/app/modules/event_handlers.py`: `load_example_scene()` method

## ❓ FAQ

### Q: What if the examples directory doesn't exist?

A: The app creates `workspace/gradio/` automatically, but not the `examples/` subdirectory. Create it manually:

```bash
mkdir -p workspace/gradio/examples
```

### Q: How do I add a new example scene?

A: Just create a new folder under `workspace/gradio/examples/` and add images:

```bash
mkdir -p workspace/gradio/examples/new_scene
cp images/* workspace/gradio/examples/new_scene/
```

The app detects new scenes automatically on the next startup.

### Q: How is the scene name displayed?

A: The scene name is simply the folder name. For example:
- Folder: `workspace/gradio/examples/indoor_room/`
- Displayed name: `indoor_room`

### Q: How is the thumbnail chosen?

A: The thumbnail is the first image in the folder when sorted by filename.

## 📝 Summary

**Examples directory location:**
- **Default**: `workspace/gradio/examples/`
- **Customizable** via the `DA3_WORKSPACE_DIR` environment variable

**Directory structure:**
```
workspace/gradio/examples/
├── scene1/
│   └── images...
├── scene2/
│   └── images...
└── scene3/
    └── images...
```

**Quick creation:**
```bash
mkdir -p workspace/gradio/examples
# Then add scene folders and images
```
SPACES_GPU_BEST_PRACTICES.md (DELETED)

# 🎯 Spaces GPU Best Practices Guide

## 📚 How spaces.GPU Works

### Architecture Overview

```
┌─────────────────────────────────────────────────────────┐
│ Main Process                                            │
│  - CPU environment                                      │
│  - ❌ Must NOT initialize CUDA                           │
│  - ✅ Can build the Gradio UI                            │
│  - ✅ Can create a ModelInference instance               │
│       (without loading the model)                       │
└─────────────────────────────────────────────────────────┘
                 │
                 │ call a @spaces.GPU-decorated function
                 ▼
┌─────────────────────────────────────────────────────────┐
│ GPU Worker Subprocess                                   │
│  - GPU environment                                      │
│  - ✅ Can initialize CUDA                                │
│  - ✅ Can load the model onto the GPU                    │
│  - ✅ Runs inference                                     │
│  - ✅ Global-variable cache (independent per subprocess) │
└─────────────────────────────────────────────────────────┘
                 │
                 │ return value is pickle-serialized
                 ▼
┌─────────────────────────────────────────────────────────┐
│ Main process receives the return value                  │
│  - ✅ Must be CPU data (numpy, plain Python types)       │
│  - ❌ Must NOT contain CUDA tensors                      │
└─────────────────────────────────────────────────────────┘
```

## ✅ Best Practice: Model Loading Strategy

### ❌ Wrong Approach 1: Loading the Model in the Main Process

```python
# ❌ Wrong: load the model in the main process
class EventHandlers:
    def __init__(self):
        self.model_inference = ModelInference()
        # ❌ Calling this in the main process triggers a CUDA init error
        self.model_inference.initialize_model("cuda")  # 💥
```

**Why is this wrong?**
- The main process must not initialize CUDA
- It fails immediately with: `CUDA must not be initialized in the main process`

### ❌ Wrong Approach 2: Storing the Model in an Instance Variable

```python
# ❌ Wrong: store the model in an instance variable
class ModelInference:
    def __init__(self):
        self.model = None  # ❌ instance variable

    def initialize_model(self, device):
        if self.model is None:
            self.model = load_model()  # ❌ stored on the instance
        return self.model
```

**Why is this wrong?**
- The instance is created in the main process
- Model state can get confused across processes
- On the second call the state is indeterminate

### ✅ Correct Approach: Global-Variable Cache in the Subprocess

```python
# ✅ Correct: cache in the subprocess via a global variable
_MODEL_CACHE = None  # global variable, independent per subprocess

class ModelInference:
    def __init__(self):
        # ✅ store no state at all
        pass

    def initialize_model(self, device: str = "cuda"):
        global _MODEL_CACHE

        if _MODEL_CACHE is None:
            # ✅ load in the subprocess (on first call)
            print("Loading model in GPU subprocess...")
            model_dir = os.environ.get("DA3_MODEL_DIR", "...")
            _MODEL_CACHE = DepthAnything3.from_pretrained(model_dir)
            _MODEL_CACHE = _MODEL_CACHE.to(device)  # ✅ moved in the subprocess
            _MODEL_CACHE.eval()
        else:
            # ✅ reuse the cached model
            print("Using cached model")

        return _MODEL_CACHE  # ✅ return the model, don't store it
```

**Why is this correct?**
- ✅ The model is loaded only in the subprocess (GPU environment)
- ✅ A global variable is safe inside the subprocess (each subprocess is independent)
- ✅ The main process stays clean
- ✅ The cache avoids repeated loading

## 🎯 Full Implementation Example

### File Structure

```
app.py                              # Main entry point; sets up @spaces.GPU
depth_anything_3/app/modules/
├── model_inference.py              # Model inference (uses a global variable)
└── event_handlers.py               # Event handling (main process, no model loading)
```

### 1. app.py - Decorator Setup

```python
import spaces
from depth_anything_3.app.modules.model_inference import ModelInference

# ✅ Decorate the run_inference method
original_run_inference = ModelInference.run_inference

@spaces.GPU(duration=120)
def gpu_run_inference(self, *args, **kwargs):
    """
    Run inference in the GPU subprocess.

    This function executes in an isolated GPU subprocess,
    where initializing CUDA and loading the model are safe.
    """
    return original_run_inference(self, *args, **kwargs)

# Replace the original method
ModelInference.run_inference = gpu_run_inference

# ✅ Main process: only create the app; do not load the model
if __name__ == "__main__":
    app = DepthAnything3App(...)
    app.launch(host="0.0.0.0", port=7860)
```

### 2. model_inference.py - Model Management

```python
import torch
from depth_anything_3.api import DepthAnything3

# ========================================
# ✅ Global-variable cache (subprocess-safe)
# ========================================
_MODEL_CACHE = None

class ModelInference:
    def __init__(self):
        """
        Initialization - store no state.

        Note: the instance is created in the main process,
        but the model is loaded in the subprocess.
        """
        pass  # ✅ no instance variables

    def initialize_model(self, device: str = "cuda"):
        """
        Load the model in the subprocess.

        A global-variable cache is used because:
        1. @spaces.GPU runs in a subprocess
        2. Each subprocess has its own global namespace
        3. Caching is safe and avoids repeated loading
        """
        global _MODEL_CACHE

        if _MODEL_CACHE is None:
            # First call: load the model
            model_dir = os.environ.get("DA3_MODEL_DIR", "...")
            print(f"🔄 Loading model in GPU subprocess from {model_dir}")

            _MODEL_CACHE = DepthAnything3.from_pretrained(model_dir)
            _MODEL_CACHE = _MODEL_CACHE.to(device)  # ✅ moved in the subprocess
            _MODEL_CACHE.eval()

            print(f"✅ Model loaded on {device}")
        else:
            # Subsequent calls: reuse the cache
            print("✅ Using cached model")
            # Make sure it is on the right device (defensive programming)
            _MODEL_CACHE = _MODEL_CACHE.to(device)

        return _MODEL_CACHE

    def run_inference(self, target_dir, ...):
        """
        Run inference - executed in the GPU subprocess.

        This function is decorated with @spaces.GPU, so it runs in a subprocess.
        """
        # ✅ Get the model in the subprocess (local variable)
        device = "cuda" if torch.cuda.is_available() else "cpu"
        model = self.initialize_model(device)  # ✅ returned, not stored

        # ✅ Run inference
        with torch.no_grad():
            prediction = model.inference(...)

        # ✅ Process the results
        # ...

        # ✅ Crucial: move all CUDA tensors to CPU before returning
        prediction = self._move_to_cpu(prediction)

        return prediction, processed_data

    def _move_to_cpu(self, prediction):
        """Move all CUDA tensors to CPU so the result is pickle-safe."""
        # ... see below for the implementation
        return prediction
```
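A `_move_to_cpu` helper of the kind referenced above can be sketched as a recursive walk over the prediction structure. This is an illustrative version, not the repository's actual code; it duck-types on `.cpu()` so it handles torch tensors without importing torch:

```python
def move_to_cpu(obj):
    """Recursively move any .cpu()-capable object (e.g. a torch tensor) to CPU."""
    if hasattr(obj, "cpu") and callable(obj.cpu):
        return obj.cpu()
    if isinstance(obj, dict):
        return {k: move_to_cpu(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return type(obj)(move_to_cpu(v) for v in obj)
    return obj  # plain Python values pass through unchanged
```

Applied to the returned prediction, this leaves plain values alone and converts every nested tensor, which is exactly the property pickle needs when the result crosses back to the main process.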
### 3. event_handlers.py - Main-Process Code

```python
class EventHandlers:
    def __init__(self):
        """
        Main-process initialization - do not load the model.

        Note: creating a ModelInference instance here is safe
        because it does not load the model immediately.
        The model is loaded in the subprocess.
        """
        # ✅ Creating the instance is fine (no model loading)
        self.model_inference = ModelInference()

        # ❌ Do not call initialize_model() here
        # ❌ Do not load the model here

    def gradio_demo(self, ...):
        """
        Gradio callback - invoked in the main process.

        It calls self.model_inference.run_inference, which is
        decorated with @spaces.GPU and therefore runs in a subprocess.
        """
        # ✅ Call the decorated method (automatically runs in a subprocess)
        result = self.model_inference.run_inference(...)
        return result
```

## 🔑 Key Principles

### ✅ DO

1. **Main process: only create instances; do not load the model**
   ```python
   # ✅ main process
   model_inference = ModelInference()  # safe
   # do not call initialize_model()
   ```

2. **Subprocess: cache the model in a global variable**
   ```python
   # ✅ subprocess (inside a @spaces.GPU-decorated function)
   _MODEL_CACHE = None  # global variable
   model = initialize_model()  # loaded in the subprocess
   ```

3. **Before returning: move all tensors to CPU**
   ```python
   # ✅ before returning
   prediction = move_all_tensors_to_cpu(prediction)
   return prediction
   ```

4. **Free GPU memory**
   ```python
   # ✅ after inference
   torch.cuda.empty_cache()
   ```

### ❌ DON'T

1. **Main process: never initialize CUDA**
   ```python
   # ❌ main process
   model.to("cuda")           # 💥 error
   torch.cuda.is_available()  # 💥 may trigger initialization
   ```

2. **Never store the model in an instance variable**
   ```python
   # ❌
   self.model = load_model()  # state gets confused
   ```

3. **Never return CUDA tensors**
   ```python
   # ❌
   return prediction  # fails if it contains CUDA tensors
   ```

4. **Never load the model in __init__**
   ```python
   # ❌
   def __init__(self):
       self.model = load_model()  # runs in the main process, fails
   ```

## 📊 Execution Flow Comparison

### ❌ Wrong Flow

```
Main process starts
  ↓
Create ModelInference() instance
  ↓
__init__ sets self.model = None            # ✅ safe
  ↓
First call to run_inference
  ↓
@spaces.GPU spawns a subprocess
  ↓
Subprocess: self.model = load_model()      # ✅ in the subprocess
  ↓
Return prediction (contains CUDA tensors)  # ❌ wrong
  ↓
pickle tries to rebuild CUDA tensors in the main process  # 💥 error
```

### ✅ Correct Flow

```
Main process starts
  ↓
Create ModelInference() instance (stateless)      # ✅
  ↓
First call to run_inference
  ↓
@spaces.GPU spawns a subprocess
  ↓
Subprocess: _MODEL_CACHE = load_model()           # ✅ global variable
  ↓
Subprocess: model = _MODEL_CACHE                  # ✅ local variable
  ↓
Subprocess: prediction = model.inference(...)
  ↓
Subprocess: prediction = move_to_cpu(prediction)  # ✅
  ↓
Return prediction (all tensors on CPU)            # ✅
  ↓
Main process safely receives CPU data             # ✅
```

## 🧪 Verification Checklist

### Main-Process Checks

```python
# ✅ should pass
def test_main_process():
    # Creating the instance works
    model_inference = ModelInference()

    # There should be no model on the instance
    assert not hasattr(model_inference, 'model') or model_inference.model is None

    # CUDA must not be initialized
    # (this test needs to run in the main process)
```

### Subprocess Checks

```python
# ✅ should pass
@spaces.GPU
def test_gpu_subprocess():
    model_inference = ModelInference()

    # The model can be loaded
    model = model_inference.initialize_model("cuda")
    assert model is not None

    # The model should be on the GPU
    # (check the device of the model parameters)

    # Inference works
    # ...

    # Everything is moved to CPU before returning
    # ...
```

## 🎓 FAQ

### Q1: Why not use an instance variable?

**A:** Because the instance is created in the main process; storing model state on it gets confused across processes.

```python
# ❌ problem
self.model = load_model()    # state can get confused

# ✅ solution
_MODEL_CACHE = load_model()  # independent per subprocess
```

### Q2: Are global variables safe?

**A:** Yes, because:
- Each subprocess has its own global namespace
- The main process never touches a subprocess's globals
- Nothing leaks across processes

### Q3: Will the model be loaded repeatedly?

**A:** No, because:
- The global variable caches it within the subprocess
- Repeated calls within the same subprocess reuse it
- Different subprocesses each keep their own cache (if needed)
### Q4: How do I clean up the model?

**A:** Manual cleanup is usually unnecessary, because:
- Everything is freed automatically when the subprocess exits
- If you do need to, inside the subprocess:
  ```python
  global _MODEL_CACHE
  _MODEL_CACHE = None
  del model
  torch.cuda.empty_cache()
  ```

## 📝 Complete Code Template

```python
# ========================================
# model_inference.py
# ========================================
_MODEL_CACHE = None  # global cache

class ModelInference:
    def __init__(self):
        pass  # stateless

    def initialize_model(self, device="cuda"):
        global _MODEL_CACHE
        if _MODEL_CACHE is None:
            _MODEL_CACHE = load_model().to(device)
        return _MODEL_CACHE

    def run_inference(self, ...):
        model = self.initialize_model("cuda")
        prediction = model.inference(...)
        prediction = self._move_to_cpu(prediction)
        return prediction

# ========================================
# app.py
# ========================================
@spaces.GPU(duration=120)
def gpu_run_inference(self, *args, **kwargs):
    return ModelInference.run_inference(self, *args, **kwargs)

ModelInference.run_inference = gpu_run_inference
```

## 🎯 Summary

**Core principles:**

1. ✅ **Main process = CPU environment**: no model loading, no CUDA initialization
2. ✅ **Subprocess = GPU environment**: load the model, run inference
3. ✅ **Global-variable cache**: independent per subprocess
4. ✅ **Return CPU data**: keep everything pickle-safe

Follow these principles and your Spaces GPU app will run reliably! 🚀
SPACES_GPU_FIX_GUIDE.md (DELETED)

# 🔧 Spaces GPU: Complete Fix Guide

## 🎯 Diagnosis: You Were Exactly Right!

### Root Cause Analysis

```python
# event_handlers.py - in the main process
class EventHandlers:
    def __init__(self):
        self.model_inference = ModelInference()  # ❌ instance created in the main process

# model_inference.py
class ModelInference:
    def __init__(self):
        self.model = None  # ❌ instance variable; sharing state across processes breaks

    def initialize_model(self, device):
        if self.model is None:
            self.model = load_model()           # first call: loaded in the subprocess
        else:
            self.model = self.model.to(device)  # second call: 💥 CUDA op in the main process!
```

### Why Does the Second Call Fail?

1. **First call**:
   - `@spaces.GPU` runs in a subprocess
   - `self.model is None` → the model gets loaded
   - `self.model` is stored on the instance
   - On return, `prediction.gaussians` contains CUDA tensors
   - **pickle tries to rebuild the CUDA tensors in the main process** → 💥

2. **Second call** (even if the first succeeded):
   - A new subprocess, or confused state
   - The state of `self.model` is indeterminate
   - The `.to(device)` call fails → 💥

## ✅ Solution: Two Key Changes

### Change 1: Cache the Model in a Global Variable (Avoid Instance State)

**Why a global variable?**
- `@spaces.GPU` runs each call in an isolated subprocess
- A global variable is safe within a subprocess
- It does not pollute the main process

### Change 2: Move All CUDA Tensors to CPU Before Returning

**Why is this needed?**
- Pickle tries to rebuild CUDA tensors when serializing the return value
- All returned data must therefore live on the CPU
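That requirement can be checked before deploying: whatever the decorated function returns must survive a pickle round-trip. A minimal guard, using a hypothetical helper that is not part of this repository (an unpicklable lambda stands in for a CUDA tensor):

```python
import pickle

def assert_pickle_safe(obj):
    """Raise if obj cannot be round-tripped through pickle."""
    return pickle.loads(pickle.dumps(obj))

# Plain CPU data (nested lists, floats) round-trips fine
result = {"depth": [[0.0, 0.1], [0.2, 0.3]], "conf": 0.9}
assert assert_pickle_safe(result) == result

# Unpicklable payloads fail loudly before leaving the subprocess
failed = False
try:
    assert_pickle_safe({"bad": lambda x: x})
except Exception:
    failed = True
assert failed
```

Calling such a guard as the last step of the `@spaces.GPU` function surfaces serialization problems inside the subprocess, where the traceback is actionable, instead of in the main process.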
## 📝 Complete Fix

### File: `depth_anything_3/app/modules/model_inference.py`

```python
"""
Model inference module for Depth Anything 3 Gradio app.

Modified for HF Spaces GPU compatibility.
"""

import gc
import glob
import os
from typing import Any, Dict, Optional, Tuple
import numpy as np
import torch

from depth_anything_3.api import DepthAnything3
from depth_anything_3.utils.export.glb import export_to_glb
from depth_anything_3.utils.export.gs import export_to_gs_video


# ========================================
# 🔑 Key change 1: cache the model in a global variable
# ========================================
# Global cache for model (used in GPU subprocess)
# This is SAFE because @spaces.GPU runs in isolated subprocess
# Each subprocess gets its own copy of this global variable
_MODEL_CACHE = None


class ModelInference:
    """
    Handles model inference and data processing for Depth Anything 3.

    Modified for HF Spaces GPU compatibility - does NOT store state
    in instance variables to avoid cross-process issues.
    """

    def __init__(self):
        """Initialize the model inference handler.

        Note: Do NOT store model in instance variable to avoid
        state sharing issues with @spaces.GPU decorator.
        """
        # No instance variables! All state in global or local variables
        pass

    def initialize_model(self, device: str = "cuda"):
        """
        Initialize the DepthAnything3 model using global cache.

        This uses a global variable which is safe because:
        1. @spaces.GPU runs in isolated subprocess
        2. Each subprocess has its own global namespace
        3. No state leaks to main process

        Args:
            device: Device to load the model on

        Returns:
            Model instance ready for inference
        """
        global _MODEL_CACHE

        if _MODEL_CACHE is None:
            # First time loading in this subprocess
            model_dir = os.environ.get(
                "DA3_MODEL_DIR", "depth-anything/DA3NESTED-GIANT-LARGE"
            )
            print(f"🔄 Loading model from {model_dir}...")
            _MODEL_CACHE = DepthAnything3.from_pretrained(model_dir)
            _MODEL_CACHE = _MODEL_CACHE.to(device)
            _MODEL_CACHE.eval()
            print("✅ Model loaded and ready on GPU")
        else:
            # Model already cached in this subprocess
            print("✅ Using cached model")
            # Ensure it's on the correct device (defensive programming)
            _MODEL_CACHE = _MODEL_CACHE.to(device)

        return _MODEL_CACHE

    def run_inference(
        self,
        target_dir: str,
        filter_black_bg: bool = False,
        filter_white_bg: bool = False,
        process_res_method: str = "upper_bound_resize",
        show_camera: bool = True,
        selected_first_frame: Optional[str] = None,
        save_percentage: float = 30.0,
        num_max_points: int = 1_000_000,
        infer_gs: bool = False,
        gs_trj_mode: str = "extend",
        gs_video_quality: str = "high",
    ) -> Tuple[Any, Dict[int, Dict[str, Any]]]:
        """
        Run DepthAnything3 model inference on images.

        This method is wrapped with @spaces.GPU in app.py.

        Args:
            target_dir: Directory containing images
            filter_black_bg: Whether to filter black background
            filter_white_bg: Whether to filter white background
            process_res_method: Method for resizing input images
            show_camera: Whether to show camera in 3D view
            selected_first_frame: Selected first frame filename
            save_percentage: Percentage of points to save (0-100)
            num_max_points: Maximum number of points
            infer_gs: Whether to infer 3D Gaussian Splatting
            gs_trj_mode: Trajectory mode for GS
            gs_video_quality: Video quality for GS

        Returns:
            Tuple of (prediction, processed_data)
        """
        print(f"Processing images from {target_dir}")

        # Device check
        device = "cuda" if torch.cuda.is_available() else "cpu"
        device = torch.device(device)
        print(f"Using device: {device}")

        # 🔑 Use the return value, not self.model
        model = self.initialize_model(device)

        # Get image paths
        print("Loading images...")
        image_folder_path = os.path.join(target_dir, "images")
        all_image_paths = sorted(glob.glob(os.path.join(image_folder_path, "*")))

        # Filter for image files
        image_extensions = [".jpg", ".jpeg", ".png", ".bmp", ".tiff", ".tif"]
        all_image_paths = [
            path
            for path in all_image_paths
            if any(path.lower().endswith(ext) for ext in image_extensions)
        ]

        print(f"Found {len(all_image_paths)} images")

        # Apply first frame selection logic
        if selected_first_frame:
            selected_path = None
            for path in all_image_paths:
                if os.path.basename(path) == selected_first_frame:
                    selected_path = path
-
break
|
| 205 |
-
|
| 206 |
-
if selected_path:
|
| 207 |
-
image_paths = [selected_path] + [
|
| 208 |
-
path for path in all_image_paths if path != selected_path
|
| 209 |
-
]
|
| 210 |
-
print(f"User selected first frame: {selected_first_frame}")
|
| 211 |
-
else:
|
| 212 |
-
image_paths = all_image_paths
|
| 213 |
-
print(f"Selected frame not found, using default order")
|
| 214 |
-
else:
|
| 215 |
-
image_paths = all_image_paths
|
| 216 |
-
|
| 217 |
-
if len(image_paths) == 0:
|
| 218 |
-
raise ValueError("No images found. Check your upload.")
|
| 219 |
-
|
| 220 |
-
# Map UI options to actual method names
|
| 221 |
-
method_mapping = {"high_res": "lower_bound_resize", "low_res": "upper_bound_resize"}
|
| 222 |
-
actual_method = method_mapping.get(process_res_method, "upper_bound_crop")
|
| 223 |
-
|
| 224 |
-
# Run model inference
|
| 225 |
-
print(f"Running inference with method: {actual_method}")
|
| 226 |
-
with torch.no_grad():
|
| 227 |
-
# 🔑 使用局部变量 model,不是 self.model
|
| 228 |
-
prediction = model.inference(
|
| 229 |
-
image_paths, export_dir=None, process_res_method=actual_method, infer_gs=infer_gs
|
| 230 |
-
)
|
| 231 |
-
|
| 232 |
-
# Export to GLB
|
| 233 |
-
export_to_glb(
|
| 234 |
-
prediction,
|
| 235 |
-
filter_black_bg=filter_black_bg,
|
| 236 |
-
filter_white_bg=filter_white_bg,
|
| 237 |
-
export_dir=target_dir,
|
| 238 |
-
show_cameras=show_camera,
|
| 239 |
-
conf_thresh_percentile=save_percentage,
|
| 240 |
-
num_max_points=int(num_max_points),
|
| 241 |
-
)
|
| 242 |
-
|
| 243 |
-
# Export to GS video if needed
|
| 244 |
-
if infer_gs:
|
| 245 |
-
mode_mapping = {"extend": "extend", "smooth": "interpolate_smooth"}
|
| 246 |
-
print(f"GS mode: {gs_trj_mode}; Backend mode: {mode_mapping[gs_trj_mode]}")
|
| 247 |
-
export_to_gs_video(
|
| 248 |
-
prediction,
|
| 249 |
-
export_dir=target_dir,
|
| 250 |
-
chunk_size=4,
|
| 251 |
-
trj_mode=mode_mapping.get(gs_trj_mode, "extend"),
|
| 252 |
-
enable_tqdm=True,
|
| 253 |
-
vis_depth="hcat",
|
| 254 |
-
video_quality=gs_video_quality,
|
| 255 |
-
)
|
| 256 |
-
|
| 257 |
-
# Save predictions cache
|
| 258 |
-
self._save_predictions_cache(target_dir, prediction)
|
| 259 |
-
|
| 260 |
-
# Process results
|
| 261 |
-
processed_data = self._process_results(target_dir, prediction, image_paths)
|
| 262 |
-
|
| 263 |
-
# ========================================
|
| 264 |
-
# 🔑 关键修改 2:返回前移动所有 CUDA 张量到 CPU
|
| 265 |
-
# ========================================
|
| 266 |
-
print("Moving all tensors to CPU for safe return...")
|
| 267 |
-
prediction = self._move_prediction_to_cpu(prediction)
|
| 268 |
-
|
| 269 |
-
# Clean up GPU memory
|
| 270 |
-
torch.cuda.empty_cache()
|
| 271 |
-
|
| 272 |
-
return prediction, processed_data
|
| 273 |
-
|
| 274 |
-
def _move_prediction_to_cpu(self, prediction: Any) -> Any:
|
| 275 |
-
"""
|
| 276 |
-
Move all CUDA tensors in prediction to CPU for safe pickling.
|
| 277 |
-
|
| 278 |
-
This is CRITICAL for HF Spaces with @spaces.GPU decorator.
|
| 279 |
-
Without this, pickle will try to reconstruct CUDA tensors in
|
| 280 |
-
the main process, causing CUDA initialization error.
|
| 281 |
-
|
| 282 |
-
Args:
|
| 283 |
-
prediction: Prediction object that may contain CUDA tensors
|
| 284 |
-
|
| 285 |
-
Returns:
|
| 286 |
-
Prediction object with all tensors moved to CPU
|
| 287 |
-
"""
|
| 288 |
-
# Move gaussians tensors to CPU
|
| 289 |
-
if hasattr(prediction, 'gaussians') and prediction.gaussians is not None:
|
| 290 |
-
gaussians = prediction.gaussians
|
| 291 |
-
|
| 292 |
-
# Move each tensor attribute to CPU
|
| 293 |
-
tensor_attrs = ['means', 'scales', 'rotations', 'harmonics', 'opacities']
|
| 294 |
-
for attr in tensor_attrs:
|
| 295 |
-
if hasattr(gaussians, attr):
|
| 296 |
-
tensor = getattr(gaussians, attr)
|
| 297 |
-
if isinstance(tensor, torch.Tensor) and tensor.is_cuda:
|
| 298 |
-
setattr(gaussians, attr, tensor.cpu())
|
| 299 |
-
print(f" ✓ Moved gaussians.{attr} to CPU")
|
| 300 |
-
|
| 301 |
-
# Move any tensors in aux dict to CPU
|
| 302 |
-
if hasattr(prediction, 'aux') and prediction.aux is not None:
|
| 303 |
-
for key, value in list(prediction.aux.items()):
|
| 304 |
-
if isinstance(value, torch.Tensor) and value.is_cuda:
|
| 305 |
-
prediction.aux[key] = value.cpu()
|
| 306 |
-
print(f" ✓ Moved aux['{key}'] to CPU")
|
| 307 |
-
elif isinstance(value, dict):
|
| 308 |
-
# Recursively handle nested dicts
|
| 309 |
-
for k, v in list(value.items()):
|
| 310 |
-
if isinstance(v, torch.Tensor) and v.is_cuda:
|
| 311 |
-
value[k] = v.cpu()
|
| 312 |
-
print(f" ✓ Moved aux['{key}']['{k}'] to CPU")
|
| 313 |
-
|
| 314 |
-
print("✅ All tensors moved to CPU")
|
| 315 |
-
return prediction
|
| 316 |
-
|
| 317 |
-
def _save_predictions_cache(self, target_dir: str, prediction: Any) -> None:
|
| 318 |
-
"""Save predictions data to predictions.npz for caching."""
|
| 319 |
-
try:
|
| 320 |
-
output_file = os.path.join(target_dir, "predictions.npz")
|
| 321 |
-
save_dict = {}
|
| 322 |
-
|
| 323 |
-
if prediction.processed_images is not None:
|
| 324 |
-
save_dict["images"] = prediction.processed_images
|
| 325 |
-
|
| 326 |
-
if prediction.depth is not None:
|
| 327 |
-
save_dict["depths"] = np.round(prediction.depth, 6)
|
| 328 |
-
|
| 329 |
-
if prediction.conf is not None:
|
| 330 |
-
save_dict["conf"] = np.round(prediction.conf, 2)
|
| 331 |
-
|
| 332 |
-
if prediction.extrinsics is not None:
|
| 333 |
-
save_dict["extrinsics"] = prediction.extrinsics
|
| 334 |
-
if prediction.intrinsics is not None:
|
| 335 |
-
save_dict["intrinsics"] = prediction.intrinsics
|
| 336 |
-
|
| 337 |
-
np.savez_compressed(output_file, **save_dict)
|
| 338 |
-
print(f"Saved predictions cache to: {output_file}")
|
| 339 |
-
|
| 340 |
-
except Exception as e:
|
| 341 |
-
print(f"Warning: Failed to save predictions cache: {e}")
|
| 342 |
-
|
| 343 |
-
def _process_results(
|
| 344 |
-
self, target_dir: str, prediction: Any, image_paths: list
|
| 345 |
-
) -> Dict[int, Dict[str, Any]]:
|
| 346 |
-
"""Process model results into structured data."""
|
| 347 |
-
processed_data = {}
|
| 348 |
-
|
| 349 |
-
depth_vis_dir = os.path.join(target_dir, "depth_vis")
|
| 350 |
-
|
| 351 |
-
if os.path.exists(depth_vis_dir):
|
| 352 |
-
depth_files = sorted(glob.glob(os.path.join(depth_vis_dir, "*.jpg")))
|
| 353 |
-
for i, depth_file in enumerate(depth_files):
|
| 354 |
-
processed_image = None
|
| 355 |
-
if prediction.processed_images is not None and i < len(
|
| 356 |
-
prediction.processed_images
|
| 357 |
-
):
|
| 358 |
-
processed_image = prediction.processed_images[i]
|
| 359 |
-
|
| 360 |
-
processed_data[i] = {
|
| 361 |
-
"depth_image": depth_file,
|
| 362 |
-
"image": processed_image,
|
| 363 |
-
"original_image_path": image_paths[i] if i < len(image_paths) else None,
|
| 364 |
-
"depth": prediction.depth[i] if i < len(prediction.depth) else None,
|
| 365 |
-
"intrinsics": (
|
| 366 |
-
prediction.intrinsics[i]
|
| 367 |
-
if prediction.intrinsics is not None and i < len(prediction.intrinsics)
|
| 368 |
-
else None
|
| 369 |
-
),
|
| 370 |
-
"mask": None,
|
| 371 |
-
}
|
| 372 |
-
|
| 373 |
-
return processed_data
|
| 374 |
-
|
| 375 |
-
def cleanup(self) -> None:
|
| 376 |
-
"""Clean up GPU memory."""
|
| 377 |
-
if torch.cuda.is_available():
|
| 378 |
-
torch.cuda.empty_cache()
|
| 379 |
-
gc.collect()
|
| 380 |
-
```
|
| 381 |
-
|
| 382 |
-
## 🔍 关键变化总结
|
| 383 |
-
|
| 384 |
-
### Before (有问题):
|
| 385 |
-
```python
|
| 386 |
-
class ModelInference:
|
| 387 |
-
def __init__(self):
|
| 388 |
-
self.model = None # ❌ 实例变量
|
| 389 |
-
|
| 390 |
-
def initialize_model(self, device):
|
| 391 |
-
if self.model is None:
|
| 392 |
-
self.model = load_model() # ❌ 保存在实例中
|
| 393 |
-
else:
|
| 394 |
-
self.model = self.model.to(device) # ❌ 跨进程操作
|
| 395 |
-
|
| 396 |
-
def run_inference(self):
|
| 397 |
-
self.initialize_model(device) # ❌ 使用实例方法
|
| 398 |
-
prediction = self.model.inference(...) # ❌ 使用实例变量
|
| 399 |
-
return prediction # ❌ 包含 CUDA 张量
|
| 400 |
-
```
|
| 401 |
-
|
| 402 |
-
### After (正确):
|
| 403 |
-
```python
|
| 404 |
-
_MODEL_CACHE = None # ✅ 全局变量(子进程安全)
|
| 405 |
-
|
| 406 |
-
class ModelInference:
|
| 407 |
-
def __init__(self):
|
| 408 |
-
pass # ✅ 无实例变量
|
| 409 |
-
|
| 410 |
-
def initialize_model(self, device):
|
| 411 |
-
global _MODEL_CACHE
|
| 412 |
-
if _MODEL_CACHE is None:
|
| 413 |
-
_MODEL_CACHE = load_model() # ✅ 保存在全局
|
| 414 |
-
return _MODEL_CACHE # ✅ 返回而不是存储
|
| 415 |
-
|
| 416 |
-
def run_inference(self):
|
| 417 |
-
model = self.initialize_model(device) # ✅ 局部变量
|
| 418 |
-
prediction = model.inference(...) # ✅ 使用局部变量
|
| 419 |
-
prediction = self._move_prediction_to_cpu(prediction) # ✅ 移到 CPU
|
| 420 |
-
return prediction # ✅ 安全返回
|
| 421 |
-
```
|
| 422 |
-
|
| 423 |
-
## 🎯 为什么这样修改?
|
| 424 |
-
|
| 425 |
-
### 1. 全局变量 vs 实例变量
|
| 426 |
-
|
| 427 |
-
| 方式 | 问题 | 原因 |
|
| 428 |
-
|------|------|------|
|
| 429 |
-
| `self.model` | ❌ 跨进程状态混乱 | 实例在主进程创建 |
|
| 430 |
-
| `_MODEL_CACHE` | ✅ 子进程内安全 | 每个子进程独立 |
|
| 431 |
-
|
| 432 |
-
### 2. 返回 CPU 张量
|
| 433 |
-
|
| 434 |
-
```python
|
| 435 |
-
# ❌ 直接返回会报错
|
| 436 |
-
return prediction # prediction.gaussians.means is on CUDA
|
| 437 |
-
|
| 438 |
-
# ✅ 移到 CPU 后返回
|
| 439 |
-
prediction = move_to_cpu(prediction)
|
| 440 |
-
return prediction # All tensors are on CPU, pickle safe
|
| 441 |
-
```
|
| 442 |
-
|
| 443 |
-
## 🧪 测试修复
|
| 444 |
-
|
| 445 |
-
```bash
|
| 446 |
-
# 1. 应用修改
|
| 447 |
-
# 复制上面的完整代码到 model_inference.py
|
| 448 |
-
|
| 449 |
-
# 2. 推送到 Spaces
|
| 450 |
-
git add depth_anything_3/app/modules/model_inference.py
|
| 451 |
-
git commit -m "Fix: Spaces GPU CUDA initialization error"
|
| 452 |
-
git push
|
| 453 |
-
|
| 454 |
-
# 3. 测试多次运行
|
| 455 |
-
# 在 Space 中连续运行 2-3 次推理
|
| 456 |
-
# 应该不再出现 CUDA 错误
|
| 457 |
-
```
|
| 458 |
-
|
| 459 |
-
## 📊 修复效果
|
| 460 |
-
|
| 461 |
-
| 问题 | Before | After |
|
| 462 |
-
|------|--------|-------|
|
| 463 |
-
| 第一次推理 | ❌ CUDA 错误 | ✅ 正常 |
|
| 464 |
-
| 第二次推理 | ❌ CUDA 错误 | ✅ 正常 |
|
| 465 |
-
| 连续推理 | ❌ 失败 | ✅ 稳定 |
|
| 466 |
-
| 模型加载 | 每次重新加载 | 缓存复用 |
|
| 467 |
-
|
| 468 |
-
## 💡 最佳实践
|
| 469 |
-
|
| 470 |
-
对于 `@spaces.GPU` 装饰的函数:
|
| 471 |
-
|
| 472 |
-
1. ✅ 使用**全局变量**缓存模型(子进程安全)
|
| 473 |
-
2. ✅ **不要**使用实例变量存储模型
|
| 474 |
-
3. ✅ 返回前**移动所有张量到 CPU**
|
| 475 |
-
4. ✅ 清理 GPU 内存 (`torch.cuda.empty_cache()`)
|
| 476 |
-
5. ❌ **不要**在主进程中初始化 CUDA
|
| 477 |
-
6. ❌ **不要**返回 CUDA 张量
|
| 478 |
-
|
| 479 |
-
## 🔗 相关资源
|
| 480 |
-
|
| 481 |
-
- [HF Spaces Zero GPU 文档](https://huggingface.co/docs/hub/spaces-gpus#zero-gpu)
|
| 482 |
-
- [PyTorch Multiprocessing](https://pytorch.org/docs/stable/notes/multiprocessing.html)
|
| 483 |
-
- [Pickle 协议](https://docs.python.org/3/library/pickle.html)
|
| 484 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
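The per-attribute CPU moves in `_move_prediction_to_cpu` can also be expressed as one generic recursive helper. A minimal, framework-agnostic sketch; `FakeTensor` is only a stand-in so the example runs without torch, and the duck-typing on `is_cuda`/`cpu()` mirrors what the real method checks:

```python
def move_to_cpu(obj):
    """Recursively move anything exposing is_cuda/.cpu() off the GPU.

    Handles dicts, lists, and tuples, mirroring the nested-aux logic
    in _move_prediction_to_cpu above.
    """
    if hasattr(obj, "is_cuda") and hasattr(obj, "cpu"):
        return obj.cpu() if obj.is_cuda else obj
    if isinstance(obj, dict):
        return {k: move_to_cpu(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return type(obj)(move_to_cpu(v) for v in obj)
    return obj


class FakeTensor:
    """Stand-in for torch.Tensor so the sketch runs without torch."""

    def __init__(self, device):
        self.device = device

    @property
    def is_cuda(self):
        return self.device == "cuda"

    def cpu(self):
        return FakeTensor("cpu")


pred = {"depth": FakeTensor("cuda"), "aux": {"conf": FakeTensor("cuda")}}
safe = move_to_cpu(pred)
print(safe["depth"].device, safe["aux"]["conf"].device)  # cpu cpu
```

Note the helper builds new containers instead of mutating in place, which keeps the original GPU-side objects intact for any cleanup code that still needs them.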
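Best practices 3 and 6 above can be enforced in a unit test by round-tripping a result through pickle, which is exactly what the `@spaces.GPU` process boundary does with return values. A small sketch; the payload is illustrative:

```python
import pickle


def assert_picklable(obj):
    """Round-trip obj through pickle, like the Spaces subprocess boundary.

    A CUDA tensor left inside obj would fail here on a machine without
    CUDA, surfacing the bug before deployment.
    """
    return pickle.loads(pickle.dumps(obj))


result = {"depth": [0.1, 0.2], "conf": [0.9, 0.8]}  # CPU-only payload
restored = assert_picklable(result)
print(restored == result)  # True
```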
SPACES_SETUP.md
DELETED
|
@@ -1,190 +0,0 @@
# Hugging Face Spaces Deployment Guide

## 📋 Overview

This project is configured for deployment to Hugging Face Spaces, using the `@spaces.GPU` decorator to allocate GPU resources dynamically.

## 🎯 Key Files

### 1. `app.py` - Main application file

```python
import spaces
from depth_anything_3.app.gradio_app import DepthAnything3App
from depth_anything_3.app.modules.model_inference import ModelInference

# Apply the GPU decorator to the inference function via monkey-patching
original_run_inference = ModelInference.run_inference

@spaces.GPU(duration=120)  # Request a GPU for up to 120 seconds
def gpu_run_inference(self, *args, **kwargs):
    return original_run_inference(self, *args, **kwargs)

ModelInference.run_inference = gpu_run_inference
```

**How it works:**
- The `@spaces.GPU` decorator allocates a GPU dynamically when the function is called
- `duration=120` means a single inference may hold the GPU for at most 120 seconds
- Monkey-patching applies the decorator to the existing inference function without modifying the core code

### 2. `README.md` - Spaces configuration

```yaml
---
title: Depth Anything 3
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
license: cc-by-nc-4.0
---
```

This YAML front matter tells Hugging Face Spaces:
- to use the Gradio SDK
- that the entry point is `app.py`
- which Gradio version to use

### 3. `pyproject.toml` - Dependency configuration

Already updated to include the `spaces` dependency:

```toml
[project.optional-dependencies]
app = ["gradio>=5", "pillow>=9.0", "spaces"]
```

## 🚀 Deployment Steps

### Option 1: Via the Hugging Face web UI

1. Create a new Space on Hugging Face
2. Choose **Gradio** as the SDK
3. Upload your code (including `app.py`, `src/`, `pyproject.toml`, etc.)
4. The Space builds and starts automatically

### Option 2: Via Git

```bash
# Clone your Space repository
git clone https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
cd YOUR_SPACE_NAME

# Add your code
cp -r /path/to/depth-anything-3/* .

# Commit and push
git add .
git commit -m "Initial commit"
git push
```

## 🔧 Configuration Options

### GPU types

Hugging Face Spaces supports several GPU types:

- **Free (T4)**: free, suitable for small models
- **A10G**: paid, more powerful
- **A100**: paid, most powerful

### GPU duration

Adjustable in `app.py`:

```python
@spaces.GPU(duration=120)  # 120 seconds
```

- Too short: complex inference may time out
- Too long: wastes resources
- Recommended: match the actual inference time (start generous, then tighten based on the logs)

### Environment variables

Configurable in the Space settings:

- `DA3_MODEL_DIR`: model directory path
- `DA3_WORKSPACE_DIR`: workspace directory
- `DA3_GALLERY_DIR`: gallery directory

## 📊 Monitoring and Debugging

### Viewing logs

Click the "Logs" tab in the Spaces UI to see:

```
🚀 Launching Depth Anything 3 on Hugging Face Spaces...
📦 Model Directory: depth-anything/DA3NESTED-GIANT-LARGE
📁 Workspace Directory: workspace/gradio
🖼️ Gallery Directory: workspace/gallery
```

### GPU status

Inside a decorated function, you can check the GPU state:

```python
print(torch.cuda.is_available())      # True
print(torch.cuda.device_count())      # 1 (usually)
print(torch.cuda.get_device_name(0))  # 'Tesla T4' or similar
```

## 🎓 Example Code

See `example_spaces_gpu.py` for basic usage of the `@spaces.GPU` decorator.

## ❓ FAQ

### Q: Why monkey-patching?

A: It adds Spaces support without modifying the core code. For a cleaner approach, you can:

1. Add the decorator directly to the `ModelInference.run_inference` method
2. Create a new class that inherits from `ModelInference`

### Q: How do I test locally?

A: When running locally, the `spaces.GPU` decorator is ignored (if the spaces package is not installed), or it simply executes the function without special handling.

```bash
# Local test
python app.py
```

### Q: Can I decorate multiple functions?

A: Yes! You can add `@spaces.GPU` to any function that needs a GPU.

```python
@spaces.GPU(duration=60)
def function1():
    pass

@spaces.GPU(duration=120)
def function2():
    pass
```

### Q: How do I optimize GPU usage?

A: Some suggestions:

1. **Only decorate what needs it**: decorate the actual GPU inference functions, not the whole app
2. **Pick a sensible duration**: set it based on real needs
3. **Free GPU memory**: call `torch.cuda.empty_cache()` when the function finishes
4. **Batch**: process multiple requests together where possible

## 🔗 Related Resources

- [Hugging Face Spaces docs](https://huggingface.co/docs/hub/spaces)
- [Spaces GPU guide](https://huggingface.co/docs/hub/spaces-gpus)
- [Gradio docs](https://gradio.app/docs)

## 📝 License

Apache-2.0
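For local testing without the `spaces` package installed, a common pattern is a no-op fallback decorator so the same file runs in both environments. A hedged sketch; this try/except shape is an assumption for illustration, not the project's actual code:

```python
try:
    import spaces  # available on Hugging Face Spaces
    gpu = spaces.GPU
except ImportError:
    # Local fallback: a decorator factory that leaves the function unchanged
    def gpu(duration=60):
        def wrap(fn):
            return fn
        return wrap


@gpu(duration=120)
def run(x):
    # Placeholder for GPU work; runs unchanged without the spaces package
    return x * 2


print(run(21))  # 42
```

With this shape, `@gpu(duration=...)` behaves identically at call sites whether or not the Spaces runtime is present.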
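The monkey-patching approach described above generalizes beyond `@spaces.GPU`: keep a reference to the original method, then swap in a wrapper. A minimal sketch; the `Worker` class and timing wrapper are illustrative, not the app's code:

```python
import functools
import time


class Worker:
    def run(self, x):
        return x + 1


# Keep a reference to the original method, then swap in a wrapper —
# the same shape as wrapping ModelInference.run_inference in app.py.
original_run = Worker.run


@functools.wraps(original_run)
def timed_run(self, *args, **kwargs):
    start = time.perf_counter()
    result = original_run(self, *args, **kwargs)
    # Record how long the wrapped call took
    timed_run.last_elapsed = time.perf_counter() - start
    return result


Worker.run = timed_run

w = Worker()
print(w.run(41))  # 42
```

`functools.wraps` preserves the original method's name and docstring, which keeps introspection (and Gradio's function signatures) intact after the patch.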
UPLOAD_EXAMPLES.md
DELETED
|
@@ -1,314 +0,0 @@
# 📤 Uploading Examples to Hugging Face Spaces

## 🚨 Problem: binary files are rejected

Hugging Face Spaces rejects large files (>100MB) and binary files pushed as regular Git objects; they must be uploaded via **Git LFS**.

## ✅ Solutions

### Option 1: Git LFS (recommended) ⭐

#### Step 1: Configure Git LFS

A `.gitattributes` file is already in place that routes image files through Git LFS:

```gitattributes
# Images in examples directory
workspace/gradio/examples/**/*.png filter=lfs diff=lfs merge=lfs -text
workspace/gradio/examples/**/*.jpg filter=lfs diff=lfs merge=lfs -text
workspace/gradio/examples/**/*.jpeg filter=lfs diff=lfs merge=lfs -text
workspace/gradio/examples/**/*.bmp filter=lfs diff=lfs merge=lfs -text
workspace/gradio/examples/**/*.tiff filter=lfs diff=lfs merge=lfs -text
workspace/gradio/examples/**/*.tif filter=lfs diff=lfs merge=lfs -text
```

#### Step 2: Install Git LFS (if not already installed)

```bash
# macOS
brew install git-lfs

# Linux
sudo apt-get install git-lfs

# Windows
# Download the installer: https://git-lfs.github.com/
```

#### Step 3: Initialize Git LFS

```bash
cd /Users/bytedance/depth-anything-3

# Initialize Git LFS
git lfs install

# Verify the configuration
git lfs track
```

#### Step 4: Add example scenes

```bash
# Create the examples directory
mkdir -p workspace/gradio/examples/my_scene

# Add image files
cp your_images/* workspace/gradio/examples/my_scene/

# Stage the files for Git LFS
git add workspace/gradio/examples/
git add .gitattributes

# Commit
git commit -m "Add example scenes with Git LFS"

# Push to Spaces
git push origin main
```

#### Step 5: Verify

```bash
# List which files are tracked by LFS
git lfs ls-files

# You should see your image files
```

---

### Option 2: Persistent storage (recommended for large datasets) ⭐

If the example scenes are large, use the persistent storage feature of Hugging Face Spaces.

#### Step 1: Enable persistent storage in the Space settings

1. Open your Space settings
2. Enable "Persistent storage"
3. Choose a storage size (e.g. 50GB)

#### Step 2: Download examples at app startup

Modify `app.py` to download examples from an external source at startup:

```python
import os
import subprocess

def download_examples():
    """Download examples from external source if not exists"""
    examples_dir = "workspace/gradio/examples"
    if not os.path.exists(examples_dir) or not os.listdir(examples_dir):
        print("Downloading example scenes...")
        # Download from a Hugging Face Dataset
        # or from another storage service
        # subprocess.run(["huggingface-cli", "download", "dataset/examples", ...])
        pass

if __name__ == "__main__":
    download_examples()
    # ... launch the app
```

#### Step 3: Upload to a Hugging Face Dataset

```bash
# Install dependencies
pip install huggingface_hub datasets

# Upload to the Dataset
python -c "
from datasets import Dataset
from huggingface_hub import HfApi

# Create the dataset and upload
api = HfApi()
api.upload_folder(
    folder_path='workspace/gradio/examples',
    repo_id='your-username/your-examples-dataset',
    repo_type='dataset'
)
"
```

---

### Option 3: Compress and upload (small files)

If the image files are small (<100MB), you can compress them before uploading:

```bash
# Compress the examples directory
tar -czf examples.tar.gz workspace/gradio/examples/

# Add to Git (as a regular file)
git add examples.tar.gz
git commit -m "Add compressed examples"
git push

# Extract at app startup
# In app.py, add:
import tarfile
if not os.path.exists("workspace/gradio/examples"):
    print("Extracting examples...")
    tarfile.open("examples.tar.gz").extractall()
```

---

### Option 4: Download at runtime (recommended for production) ⭐

Download example scenes from an external source when the app starts:

#### Modify `app.py`

```python
import os
import subprocess
from huggingface_hub import hf_hub_download

def setup_examples():
    """Setup examples directory by downloading if needed"""
    examples_dir = "workspace/gradio/examples"
    os.makedirs(examples_dir, exist_ok=True)

    # If the examples directory is empty, download from an external source
    if not os.listdir(examples_dir):
        print("📥 Downloading example scenes...")

        # Option 1: from a Hugging Face Dataset
        try:
            from datasets import load_dataset
            dataset = load_dataset("your-username/your-examples-dataset")
            # Process and save into examples_dir
        except Exception:
            pass

        # Option 2: download an archive from a URL
        # import urllib.request
        # urllib.request.urlretrieve("https://...", "examples.zip")
        # Extract into examples_dir

        print("✅ Examples downloaded")

if __name__ == "__main__":
    setup_examples()
    # ... launch the app
```

---

## 🎯 Comparing the Options

| Option | Pros | Cons | Best for |
|------|------|------|----------|
| **Git LFS** | ✅ Simple and direct<br>✅ Version controlled | ⚠️ Requires LFS quota<br>⚠️ Large files can be slow | Small to medium examples (<1GB) |
| **Persistent storage** | ✅ No size limit<br>✅ Fast access | ⚠️ Manual upload<br>⚠️ Paid feature | Large example sets (>1GB) |
| **Runtime download** | ✅ Keeps the repo small<br>✅ Flexible | ⚠️ Slow first start<br>⚠️ Needs network access | Production |
| **Compress and upload** | ✅ Simple | ⚠️ Size limit<br>⚠️ Needs extraction | Small files (<100MB) |

---

## 📝 Complete Git LFS Setup

### 1. Make sure Git LFS is installed

```bash
git lfs version
# If not installed, follow the steps above
```

### 2. Initialize Git LFS

```bash
cd /Users/bytedance/depth-anything-3
git lfs install
```

### 3. Check .gitattributes

Make sure `.gitattributes` contains the image file patterns (already added).

### 4. Add example scenes

```bash
# Create a scene
mkdir -p workspace/gradio/examples/scene1
cp your_images/* workspace/gradio/examples/scene1/

# Stage the files
git add workspace/gradio/examples/
git add .gitattributes

# Check which files will go through LFS
git lfs ls-files

# Commit
git commit -m "Add example scenes with Git LFS"

# Push
git push origin main
```

### 5. Verify the upload

Check in the Space that the files uploaded successfully; image files should appear as LFS pointers.

---

## 🔧 Troubleshooting

### Issue 1: Out of Git LFS quota

**Solutions:**
- Use Option 2 (persistent storage) or Option 4 (runtime download)
- Compress the image files
- Upload only the necessary examples

### Issue 2: Push fails

**Check:**
```bash
# Check LFS files
git lfs ls-files

# Check LFS status
git lfs status

# Push again
git push origin main --force
```

### Issue 3: Files are still rejected

**Possible causes:**
- `.gitattributes` is misconfigured
- The files were not added via LFS

**Fix:**
```bash
# Remove from the index and re-add
git rm --cached workspace/gradio/examples/**/*.png
git add workspace/gradio/examples/
git commit -m "Fix: Add images via Git LFS"
git push
```

---

## 💡 Best Practices

1. **Small examples (<100MB)**: Git LFS
2. **Medium examples (100MB-1GB)**: Git LFS or persistent storage
3. **Large examples (>1GB)**: persistent storage or runtime download
4. **Production**: runtime download from an external source

---

## 📚 Related Resources

- [Git LFS docs](https://git-lfs.github.com/)
- [Hugging Face Spaces docs](https://huggingface.co/docs/hub/spaces)
- [Hugging Face Datasets](https://huggingface.co/docs/datasets)
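The extract-at-startup step from the compress-and-upload option is easiest to reason about when it is idempotent: extract once, skip on every later start. A minimal sketch with a self-contained demo; the paths and the `extract_examples` helper are illustrative:

```python
import os
import tarfile
import tempfile


def extract_examples(archive, dest):
    """Extract the examples archive once; skip if dest is already populated."""
    if os.path.isdir(dest) and os.listdir(dest):
        return False  # already extracted
    os.makedirs(dest, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tf:
        tf.extractall(dest)
    return True


# Demo: round-trip a tiny archive through a temp directory
root = tempfile.mkdtemp()
src = os.path.join(root, "examples", "scene1")
os.makedirs(src)
open(os.path.join(src, "000.png"), "wb").close()
archive = os.path.join(root, "examples.tar.gz")
with tarfile.open(archive, "w:gz") as tf:
    tf.add(src, arcname="scene1")

dest = os.path.join(root, "extracted")
print(extract_examples(archive, dest))  # True  (first run extracts)
print(extract_examples(archive, dest))  # False (second run is a no-op)
```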
XFORMERS_GUIDE.md
DELETED
@@ -1,299 +0,0 @@
# xformers Dependency Notes

## 🔍 The Problem

xformers fails to install at build time:

```
RuntimeError: CUTLASS submodule not found. Did you forget to run `git submodule update --init --recursive` ?
```

## ✅ Good News: xformers Is Not Required!

The code already ships with a **fallback mechanism**: when xformers is missing, it automatically switches to a pure-PyTorch implementation:

```python
# src/depth_anything_3/model/dinov2/layers/swiglu_ffn.py
try:
    from xformers.ops import SwiGLU
    XFORMERS_AVAILABLE = True
except ImportError:
    SwiGLU = SwiGLUFFN  # pure-PyTorch implementation
    XFORMERS_AVAILABLE = False
```

**Performance difference:**
- **With xformers**: slightly faster (~5-10%)
- **Without xformers**: slightly slower, but functionally identical
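The fallback pattern can be sketched end to end. This is a simplified illustration, not the repository's actual layer: the real `SwiGLUFFN` in `swiglu_ffn.py` has more configuration, so the hidden sizing and layer layout below are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwiGLUFFN(nn.Module):
    """Pure-PyTorch SwiGLU feed-forward used when xformers is absent (sketch)."""

    def __init__(self, dim, hidden_dim):
        super().__init__()
        self.w12 = nn.Linear(dim, 2 * hidden_dim)  # fused gate + value projection
        self.w3 = nn.Linear(hidden_dim, dim)

    def forward(self, x):
        x1, x2 = self.w12(x).chunk(2, dim=-1)
        return self.w3(F.silu(x1) * x2)  # SwiGLU: silu(gate) * value

try:
    from xformers.ops import SwiGLU  # fused kernel, if installed
    XFORMERS_AVAILABLE = True
except ImportError:
    SwiGLU = SwiGLUFFN  # silent fallback, identical math
    XFORMERS_AVAILABLE = False
```

Either branch yields a `SwiGLU` class with the same interface, which is why the rest of the model never needs to know whether xformers is installed.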
## 🎯 Recommended Configuration

### Current configuration (already set) ✅

**requirements.txt** - xformers is commented out:

```txt
# xformers - install separately if needed
```

This guarantees the build succeeds and the app runs normally.

## 📝 Three Ways to Use It

---

### Option 1: No xformers (current configuration) ⭐ Recommended

**Pros:**
- ✅ Fast builds (5-10 minutes)
- ✅ 100% success rate
- ✅ Full functionality
- ✅ No compatibility issues to deal with

**Cons:**
- ⚠️ Slightly lower performance (5-10%)

**Best for:**
- HF Spaces deployments
- Quick testing
- Anyone who wants to avoid compilation problems

---

### Option 2: Prebuilt xformers

If you want the extra performance, use a prebuilt wheel:

**Step 1: Determine your PyTorch and CUDA versions**

```python
import torch
print(f"PyTorch: {torch.__version__}")
print(f"CUDA: {torch.version.cuda}")
```

**Step 2: Pick the matching xformers version**

See: https://github.com/facebookresearch/xformers#installing-xformers

| PyTorch | CUDA | xformers |
|---------|------|----------|
| 2.1.x | 11.8 | 0.0.23 |
| 2.0.x | 11.8 | 0.0.22 |
| 2.0.x | 11.7 | 0.0.20 |

**Step 3: Update requirements.txt**

```txt
# add after torch and torchvision
torch==2.1.0
torchvision==0.16.0
xformers==0.0.23  # matches PyTorch 2.1 + CUDA 11.8
```

**Or use the official index:**

```txt
torch==2.1.0
torchvision==0.16.0
--extra-index-url https://download.pytorch.org/whl/cu118
xformers==0.0.23
```

---

### Option 3: Build from source (not recommended)

**Only consider this when:**
- You need the newest xformers features
- You have unusual CUDA version requirements
- You can afford 15-30 minutes of build time

**requirements.txt:**

```txt
# requires a CUDA environment and git submodules
xformers @ git+https://github.com/facebookresearch/xformers.git
```

**Additional requirements:**

**packages.txt:**

```txt
build-essential
git
ninja-build
```

**Caveats:**
- ⚠️ The build may fail
- ⚠️ Builds take a long time
- ⚠️ A GPU environment is required

---

## 🔧 Concrete Configuration Examples

### Example 1: HF Spaces (recommended) ✅

**requirements.txt:**

```txt
torch>=2.0.0
torchvision
gradio>=5.0.0
spaces
# xformers not included - uses the PyTorch fallback
```

**Result:**
- Build time: 5-10 minutes
- Success rate: 100%
- Performance: good

### Example 2: With prebuilt xformers

**requirements.txt:**

```txt
torch==2.1.0
torchvision==0.16.0
xformers==0.0.23
gradio>=5.0.0
spaces
```

**Result:**
- Build time: 8-12 minutes
- Success rate: 95% (depends on version matching)
- Performance: best

### Example 3: Local development (most flexible)

```bash
# install the base dependencies first
pip install -r requirements.txt

# optional: install xformers if needed
pip install xformers==0.0.23

# or let pip pick a version matching your PyTorch
pip install xformers
```

---

## 🐛 FAQ

### Q1: How do I know whether xformers is being used?

**Check in code:**

```python
from depth_anything_3.model.dinov2.layers.swiglu_ffn import XFORMERS_AVAILABLE

print(f"xformers available: {XFORMERS_AVAILABLE}")
```

**Or watch the logs:**

```python
import logging
logging.basicConfig(level=logging.INFO)
# if xformers is unavailable there is no error - the fallback is used silently
```

### Q2: What if the xformers version does not match?

**Error message:**

```
RuntimeError: xformers is not compatible with this PyTorch version
```

**Fixes:**
1. Remove xformers (use the fallback)
2. Or match the PyTorch and xformers versions (see the table above)

### Q3: Is the performance difference large?

**Benchmarks (for reference):**
- Single-image inference: almost no difference (< 5%)
- Batched inference: 5-10% difference
- Memory usage: similar

**Bottom line:** negligible for most users.

### Q4: Why not just include xformers?

**Reasons:**
1. **Complex compatibility** - requires exactly matching PyTorch, CUDA, and Python versions
2. **Unstable builds** - compiling from source frequently fails
3. **Not required** - the code has a fallback
4. **Longer builds** - can add 5-15 minutes

---

## 📊 Performance Comparison

### Inference speed (single image, T4 GPU)

| Configuration | Time | Relative speed |
|------|------|---------|
| PyTorch (no xformers) | 1.00s | 100% |
| xformers 0.0.23 | 0.95s | 105% ⚡ |

**Bottom line:** the speedup is too small to justify the extra deployment complexity.

### Build time

| Configuration | First build | Success rate |
|------|---------|--------|
| No xformers | 5-10 min | ✅ 100% |
| Prebuilt xformers | 8-12 min | ✅ 95% |
| Source-built xformers | 20-40 min | ⚠️ 60% |

---

## 🎯 Final Recommendations

### For HF Spaces deployments: ⭐

**Recommended: skip xformers**

Reasons:
1. Stable, reliable builds
2. Negligible performance difference
3. Better user experience (the Space never fails to build)

### For local development:

**Optional: install a prebuilt xformers**

```bash
pip install -r requirements.txt
pip install xformers  # optional
```

### For production:

**If you need peak performance, use a prebuilt xformers**

```txt
torch==2.1.0
xformers==0.0.23
```

---

## 🔗 Related Resources

- [xformers GitHub](https://github.com/facebookresearch/xformers)
- [xformers installation guide](https://github.com/facebookresearch/xformers#installing-xformers)
- [PyTorch version compatibility](https://pytorch.org/get-started/previous-versions/)

---

## ✅ Current Status

Your configuration:
- ✅ **requirements.txt** - xformers commented out (fallback in use)
- ✅ **Code support** - automatic fallback to the PyTorch implementation
- ✅ **Functionality** - everything works
- ✅ **Builds** - 100% success rate

**No further action needed - ready to deploy!** 🚀
depth_anything_3/app/css_and_html.py
CHANGED
@@ -390,7 +390,7 @@ def get_header_html(logo_base64=None):
     <a href="https://depth-anything-3.github.io" target="_blank" class="link-btn">
         <i class="fas fa-globe" style="margin-right: 8px;"></i> Project Page
     </a>
-    <a href="https://arxiv.org/abs/
+    <a href="https://arxiv.org/abs/2511.10647" target="_blank" class="link-btn">
         <i class="fas fa-file-pdf" style="margin-right: 8px;"></i> Paper
     </a>
     <a href="https://github.com/ByteDance-Seed/Depth-Anything-3" target="_blank" class="link-btn">
fix_spaces_gpu.patch
DELETED
@@ -1,142 +0,0 @@
--- a/depth_anything_3/app/modules/model_inference.py
+++ b/depth_anything_3/app/modules/model_inference.py
@@ -31,47 +31,67 @@ from depth_anything_3.utils.export.glb import export_to_glb
 from depth_anything_3.utils.export.gs import export_to_gs_video


+# Global cache for model (used in GPU subprocess)
+# This is safe because @spaces.GPU runs in isolated subprocess
+_MODEL_CACHE = None
+
+
 class ModelInference:
     """
     Handles model inference and data processing for Depth Anything 3.
     """

     def __init__(self):
-        """Initialize the model inference handler."""
-        self.model = None
-
-    def initialize_model(self, device: str = "cuda") -> None:
+        """Initialize the model inference handler.
+
+        Note: Do NOT store model in instance variable to avoid
+        state sharing issues with @spaces.GPU decorator.
+        """
+        pass  # No instance variables
+
+    def initialize_model(self, device: str = "cuda"):
         """
         Initialize the DepthAnything3 model.
+
+        Uses global cache to store model safely in GPU subprocess.
+        This avoids CUDA initialization in main process.

         Args:
             device: Device to load the model on
+
+        Returns:
+            Model instance
         """
-        if self.model is None:
+        global _MODEL_CACHE
+
+        if _MODEL_CACHE is None:
             # Get model directory from environment variable or use default
             model_dir = os.environ.get(
                 "DA3_MODEL_DIR", "/dev/shm/da3_models/DA3HF-VITG-METRIC_VITL"
             )
-            self.model = DepthAnything3.from_pretrained(model_dir)
-            self.model = self.model.to(device)
+            print(f"Loading model from {model_dir}...")
+            _MODEL_CACHE = DepthAnything3.from_pretrained(model_dir)
+            _MODEL_CACHE = _MODEL_CACHE.to(device)
+            _MODEL_CACHE.eval()
+            print("Model loaded and moved to GPU")
         else:
-            self.model = self.model.to(device)
-
-        self.model.eval()
+            print("Using cached model")
+            # Ensure model is on correct device
+            _MODEL_CACHE = _MODEL_CACHE.to(device)
+
+        return _MODEL_CACHE

     def run_inference(
         self,
         ...
         # Initialize model if needed
-        self.initialize_model(device)
+        model = self.initialize_model(device)

         ...

         # Run model inference
         print(f"Running inference with method: {actual_method}")
         with torch.no_grad():
-            prediction = self.model.inference(
+            prediction = model.inference(
                 image_paths, export_dir=None, process_res_method=actual_method, infer_gs=infer_gs
             )

@@ -192,6 +212,10 @@ class ModelInference:
         # Process results
         processed_data = self._process_results(target_dir, prediction, image_paths)

+        # CRITICAL: Move all CUDA tensors to CPU before returning
+        # This prevents CUDA initialization in main process during unpickling
+        prediction = self._move_prediction_to_cpu(prediction)
+
         # Clean up
         torch.cuda.empty_cache()

@@ -282,6 +306,45 @@ class ModelInference:

         return processed_data

+    def _move_prediction_to_cpu(self, prediction: Any) -> Any:
+        """
+        Move all CUDA tensors in prediction to CPU for safe pickling.
+
+        This is REQUIRED for HF Spaces with @spaces.GPU decorator to avoid
+        CUDA initialization in the main process during unpickling.
+
+        Args:
+            prediction: Prediction object that may contain CUDA tensors
+
+        Returns:
+            Prediction object with all tensors moved to CPU
+        """
+        # Move gaussians tensors to CPU
+        if hasattr(prediction, 'gaussians') and prediction.gaussians is not None:
+            gaussians = prediction.gaussians
+
+            # Move each tensor attribute to CPU
+            tensor_attrs = ['means', 'scales', 'rotations', 'harmonics', 'opacities']
+            for attr in tensor_attrs:
+                if hasattr(gaussians, attr):
+                    tensor = getattr(gaussians, attr)
+                    if isinstance(tensor, torch.Tensor) and tensor.is_cuda:
+                        setattr(gaussians, attr, tensor.cpu())
+                        print(f"Moved gaussians.{attr} to CPU")
+
+        # Move any tensors in aux dict to CPU
+        if hasattr(prediction, 'aux') and prediction.aux is not None:
+            for key, value in list(prediction.aux.items()):
+                if isinstance(value, torch.Tensor) and value.is_cuda:
+                    prediction.aux[key] = value.cpu()
+                    print(f"Moved aux['{key}'] to CPU")
+                elif isinstance(value, dict):
+                    # Recursively handle nested dicts
+                    for k, v in list(value.items()):
+                        if isinstance(v, torch.Tensor) and v.is_cuda:
+                            value[k] = v.cpu()
+                            print(f"Moved aux['{key}']['{k}'] to CPU")
+
+        return prediction
+
     def cleanup(self) -> None:
         """Clean up GPU memory."""
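The CPU-move logic in the deleted patch is tailored to the prediction object's known attributes. The same idea can be expressed as a generic helper (a hypothetical sketch, not part of the patch) that walks arbitrary nested containers, which is useful when the payload returned from a `@spaces.GPU` function is less structured:

```python
import torch

def move_to_cpu(obj):
    """Recursively move CUDA tensors to CPU so the object pickles safely.

    Anything returned from a @spaces.GPU function is pickled back to the
    main process; a CUDA tensor in that payload would trigger CUDA
    initialization where it must not happen.
    """
    if isinstance(obj, torch.Tensor):
        return obj.cpu() if obj.is_cuda else obj
    if isinstance(obj, dict):
        return {k: move_to_cpu(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return type(obj)(move_to_cpu(v) for v in obj)
    return obj  # non-tensor leaves pass through unchanged
```

Unlike the patch's attribute-by-attribute version, this one rebuilds dicts, lists, and tuples, so it covers new fields added to the prediction without further changes (it does not, however, descend into arbitrary object attributes).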