depth-anything-3

Running on Zero

linhaotong committed 28 days ago

Commit

b396ed8

1 Parent(s): 6a8ae2c

update paper link and clean files

Files changed (8)

EXAMPLES_DIRECTORY.md +0 -286
SPACES_GPU_BEST_PRACTICES.md +0 -481
SPACES_GPU_FIX_GUIDE.md +0 -484
SPACES_SETUP.md +0 -190
UPLOAD_EXAMPLES.md +0 -314
XFORMERS_GUIDE.md +0 -299
depth_anything_3/app/css_and_html.py +1 -1
fix_spaces_gpu.patch +0 -142

EXAMPLES_DIRECTORY.md DELETED

@@ -1,286 +0,0 @@
-# 📁 Examples Directory Setup Guide
-## 📍 Examples Directory Location
-### Default Location
-Place the examples directory at:
-```
-workspace/gradio/examples/
-```
-### How the Full Path Is Resolved
-The path follows from the configuration in `app.py`:
-```python
-workspace_dir = os.environ.get("DA3_WORKSPACE_DIR", "workspace/gradio")
-examples_dir = os.path.join(workspace_dir, "examples")
-# Result: workspace/gradio/examples/
-```
-## 📂 Directory Structure
-Organize the examples directory as follows:
-```
-workspace/gradio/examples/
-├── scene1/              # Scene 1
-│   ├── 000.png          # Image files
-│   ├── 010.png
-│   ├── 020.png
-│   └── ...
-├── scene2/              # Scene 2
-│   ├── 000.jpg
-│   ├── 010.jpg
-│   └── ...
-└── scene3/              # Scene 3
-    ├── image1.png
-    ├── image2.png
-    └── ...
-```
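-### Requirements
-1. **One folder per scene**: each scene lives in its own folder.
-2. **Folder name**: the folder name is displayed as the scene name.
-3. **Image files**: `.jpg`, `.jpeg`, `.png`, `.bmp`, `.tiff`, and `.tif` are supported.
-4. **First image**: the first image (sorted by filename) is used as the thumbnail.
-## 🔧 Configuration Options
-### Option 1: Use the Default Path (Recommended)
-Create the directory directly: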
-```bash
-mkdir -p workspace/gradio/examples
-```
-Then add a scene:
-```bash
-# Create the scene folder
-mkdir -p workspace/gradio/examples/my_scene
-# Copy the image files
-cp your_images/* workspace/gradio/examples/my_scene/
-```
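-### Option 2: Use an Environment Variable
-Customize the location through an environment variable:
-```bash
-# Set the environment variable
-export DA3_WORKSPACE_DIR="/path/to/your/workspace"
-# Examples are then read from /path/to/your/workspace/examples
-```
-Or change the default in `app.py`: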
-```python
-workspace_dir = os.environ.get("DA3_WORKSPACE_DIR", "/custom/path/workspace")
-```
-### Option 3: On Hugging Face Spaces
-On Spaces, you can add examples in any of the following ways:
-1. **Upload via Git**:
-   ```bash
-   git add workspace/gradio/examples/
-   git commit -m "Add example scenes"
-   git push
-   ```
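-2. **Upload via the web interface**:
-   - Create the `workspace/gradio/examples/` directory in the Space's file browser
-   - Upload the scene folders and images
-3. **Use persistent storage**:
-   - With persistent storage enabled, examples are kept in persistent storage
-   - The path is still `workspace/gradio/examples/`
-## 📝 Example Scene Layouts
-### Example 1: A Single Scene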
-```
-workspace/gradio/examples/
-└── indoor_room/
-    ├── 000.png
-    ├── 010.png
-    ├── 020.png
-    └── 030.png
-```
-### Example 2: Multiple Scenes
-```
-workspace/gradio/examples/
-├── outdoor_garden/
-│   ├── frame_001.jpg
-│   ├── frame_002.jpg
-│   └── frame_003.jpg
-├── office_space/
-│   ├── img_000.png
-│   ├── img_010.png
-│   └── img_020.png
-└── street_scene/
-    ├── 000.png
-    ├── 010.png
-    └── 020.png
-```
-## 🔍 Verifying the Examples Directory
-### Check That the Directory Exists
-```bash
-# Check the default location
-ls -la workspace/gradio/examples/
-# Or use Python
-python -c "
-import os
-workspace_dir = os.environ.get('DA3_WORKSPACE_DIR', 'workspace/gradio')
-examples_dir = os.path.join(workspace_dir, 'examples')
-print(f'Examples directory: {examples_dir}')
-print(f'Exists: {os.path.exists(examples_dir)}')
-if os.path.exists(examples_dir):
-    scenes = [d for d in os.listdir(examples_dir) if os.path.isdir(os.path.join(examples_dir, d))]
-    print(f'Found {len(scenes)} scenes: {scenes}')
-"
-```
-### Check Scene Information
-On startup, the app scans the examples directory automatically and logs what it finds:
-```
-Found 3 example scenes:
-  - scene1 (5 images)
-  - scene2 (10 images)
-  - scene3 (8 images)
-```
-## 🚀 Quick Start
-### 1. Create the Directory Structure
-```bash
-# From the project root
-mkdir -p workspace/gradio/examples
-```
-### 2. Add an Example Scene
-```bash
-# Create the scene folder
-mkdir -p workspace/gradio/examples/my_first_scene
-# Add image files (copy your own images)
-cp /path/to/your/images/* workspace/gradio/examples/my_first_scene/
-```
-### 3. Verify
-After the app starts, the example scene grid should appear in the UI.
-## 📊 On Hugging Face Spaces
-### Upload Methods
-1. **Via Git** (recommended):
-   ```bash
-   # Prepare the examples locally
-   mkdir -p workspace/gradio/examples
-   # ... add scenes ...
-   # Commit and push
-   git add workspace/gradio/examples/
-   git commit -m "Add example scenes"
-   git push
-   ```
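-2. **Via the web interface**:
-   - Open the Space's file browser
-   - Create the `workspace/gradio/examples/` directory
-   - Upload the scene folders
-### Notes
-- **File size limits**: make sure image files do not exceed the Spaces file size limit
-- **Persistent storage**: with persistent storage enabled, examples are saved persistently
-- **Caching**: results for example scenes are cached under `workspace/gradio/input_images/`
-## 🔗 Related Configuration
-### Environment Variables
-- `DA3_WORKSPACE_DIR`: workspace directory (default: `workspace/gradio`)
-- The examples directory is set automatically to `{DA3_WORKSPACE_DIR}/examples`
-### Configuration in Code
-- `depth_anything_3/app/gradio_app.py`: the `cache_examples()` method
-- `depth_anything_3/app/modules/utils.py`: the `get_scene_info()` function
-- `depth_anything_3/app/modules/event_handlers.py`: the `load_example_scene()` method
-## ❓ FAQ
-### Q: What if the examples directory does not exist?
-A: The app creates the `workspace/gradio/` directory automatically, but not the `examples/` subdirectory. Create it manually: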
-```bash
-mkdir -p workspace/gradio/examples
-```
-### Q: How do I add a new example scene?
-A: Create a new folder under `workspace/gradio/examples/` and add images to it:
-```bash
-mkdir -p workspace/gradio/examples/new_scene
-cp images/* workspace/gradio/examples/new_scene/
-```
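-The app detects new scenes automatically the next time it starts.
-### Q: How is the scene name displayed?
-A: The scene name is the folder name. For example:
-- Folder: `workspace/gradio/examples/indoor_room/`
-- Display name: `indoor_room`
-### Q: How is the thumbnail chosen?
-A: The thumbnail is the first image in the folder after sorting by filename.
-## 📝 Summary
-**Examples directory location:**
-- **Default**: `workspace/gradio/examples/`
-- **Customizable via the environment variable** `DA3_WORKSPACE_DIR`
-**Directory structure:**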
-```
-workspace/gradio/examples/
-├── scene1/
-│   └── images...
-├── scene2/
-│   └── images...
-└── scene3/
-    └── images...
-```
-**Quick setup:**
-```bash
-mkdir -p workspace/gradio/examples
-# Then add scene folders and images
-```
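The scene-discovery rules described in the deleted `EXAMPLES_DIRECTORY.md` (one folder per scene, the listed image extensions, first image by sorted filename as the thumbnail) can be sketched in plain Python. This is a minimal illustration under stated assumptions: `scan_example_scenes` and `demo` are hypothetical names, not the repository's `get_scene_info()`, and skipping folders that contain no images is a guess at reasonable behavior rather than something the guide states.

```python
import os
import tempfile

# Extensions listed in the guide.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".bmp", ".tiff", ".tif")


def scan_example_scenes(examples_dir):
    """Return one dict per scene folder: name, image count, thumbnail path.

    Hypothetical helper for illustration only.
    """
    scenes = []
    if not os.path.isdir(examples_dir):
        return scenes
    for name in sorted(os.listdir(examples_dir)):
        scene_path = os.path.join(examples_dir, name)
        if not os.path.isdir(scene_path):
            continue
        images = sorted(
            f for f in os.listdir(scene_path)
            if f.lower().endswith(IMAGE_EXTENSIONS)
        )
        if not images:
            continue  # assumption: folders without images are skipped
        scenes.append({
            "name": name,  # the folder name doubles as the display name
            "num_images": len(images),
            "thumbnail": os.path.join(scene_path, images[0]),
        })
    return scenes


def demo():
    """Build a throwaway examples tree and scan it."""
    with tempfile.TemporaryDirectory() as root:
        examples_dir = os.path.join(root, "examples")
        layout = {
            "indoor_room": ["010.png", "000.png", "notes.txt"],
            "street_scene": ["000.jpg"],
        }
        for scene, files in layout.items():
            os.makedirs(os.path.join(examples_dir, scene))
            for filename in files:
                open(os.path.join(examples_dir, scene, filename), "w").close()
        return [
            (s["name"], s["num_images"], os.path.basename(s["thumbnail"]))
            for s in scan_example_scenes(examples_dir)
        ]
```

Calling `demo()` exercises the three rules at once: the non-image file is ignored, `000.png` wins over `010.png` as the thumbnail, and each folder name is reported as the scene name.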

SPACES_GPU_BEST_PRACTICES.md DELETED

@@ -1,481 +0,0 @@
-# 🎯 Spaces GPU Best Practices Guide
-## 📚 How spaces.GPU Works
-### Architecture Overview
-```
-┌─────────────────────────────────────────────────────────┐
-│ Main Process                                            │
-│ - CPU environment                                       │
-│ - ❌ Must not initialize CUDA                           │
-│ - ✅ Can create the Gradio UI                           │
-│ - ✅ Can create ModelInference (no model loaded)        │
-└─────────────────────────────────────────────────────────┘
-                        │
-                        │ calls a function decorated with @spaces.GPU
-                        │
-                        ▼
-┌─────────────────────────────────────────────────────────┐
-│ Subprocess (GPU Worker Process)                         │
-│ - GPU environment                                       │
-│ - ✅ Can initialize CUDA                                │
-│ - ✅ Can load the model onto the GPU                    │
-│ - ✅ Runs inference                                     │
-│ - ✅ Global-variable cache (one per subprocess)         │
-└─────────────────────────────────────────────────────────┘
-                        │
-                        │ return value is serialized with pickle
-                        │
-                        ▼
-┌─────────────────────────────────────────────────────────┐
-│ Main process receives the return value                  │
-│ - ✅ Must be CPU data (numpy, basic types)              │
-│ - ❌ Must not contain CUDA tensors                      │
-└─────────────────────────────────────────────────────────┘
-```
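-## ✅ Best Practice: Model Loading Strategy
-### ❌ Mistake 1: Loading the Model in the Main Process
-```python
-# ❌ Wrong: loading the model in the main process
-class EventHandlers:
-    def __init__(self):
-        self.model_inference = ModelInference()
-        # ❌ Calling this in the main process triggers a CUDA initialization error
-        self.model_inference.initialize_model("cuda")  # 💥
-```
-**Why is this wrong?**
-- The main process must not initialize CUDA
-- It fails immediately with: `CUDA must not be initialized in the main process`
-### ❌ Mistake 2: Storing the Model in an Instance Variable
-```python
-# ❌ Wrong: storing the model in an instance variable
-class ModelInference:
-    def __init__(self):
-        self.model = None  # ❌ instance variable
-    def initialize_model(self, device):
-        if self.model is None:
-            self.model = load_model()  # ❌ stored on the instance
-        return self.model
-```
-**Why is this wrong?**
-- The instance is created in the main process
-- Model state can become inconsistent across processes
-- The state on the second call is undefined
-### ✅ Correct Approach: Cache in a Subprocess Global
-```python
-# ✅ Correct: cache the model in a global variable inside the subprocess
-_MODEL_CACHE = None  # global variable, independent in each subprocess
-class ModelInference:
-    def __init__(self):
-        # ✅ store no state
-        pass
-    def initialize_model(self, device: str = "cuda"):
-        global _MODEL_CACHE
-        if _MODEL_CACHE is None:
-            # ✅ loaded in the subprocess (on the first call)
-            print("Loading model in GPU subprocess...")
-            model_dir = os.environ.get("DA3_MODEL_DIR", "...")
-            _MODEL_CACHE = DepthAnything3.from_pretrained(model_dir)
-            _MODEL_CACHE = _MODEL_CACHE.to(device)  # ✅ moved inside the subprocess
-            _MODEL_CACHE.eval()
-        else:
-            # ✅ reuse the cached model
-            print("Using cached model")
-        return _MODEL_CACHE  # ✅ return the model, do not store it
-```
-**Why is this correct?**
-- ✅ The model is loaded only in the subprocess (GPU environment)
-- ✅ The global variable is safe inside the subprocess (each subprocess has its own)
-- ✅ The main process is not polluted
-- ✅ The model can be cached and reused (no repeated loading)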
-## 🎯 Complete Implementation Example
-### File Layout
-```
-app.py                          # Main entry point, configures @spaces.GPU
-depth_anything_3/app/modules/
-  ├── model_inference.py        # Model inference (uses a global variable)
-  └── event_handlers.py         # Event handling (main process, loads no model)
-```
-### 1. app.py - Decorator Setup
-```python
-import spaces
-from depth_anything_3.app.gradio_app import DepthAnything3App
-from depth_anything_3.app.modules.model_inference import ModelInference
-# ✅ Wrap the run_inference method
-original_run_inference = ModelInference.run_inference
-@spaces.GPU(duration=120)
-def gpu_run_inference(self, *args, **kwargs):
-    """
-    Run inference in the GPU subprocess.
-    This function executes in a separate GPU subprocess, where it is
-    safe to initialize CUDA and load the model.
-    """
-    return original_run_inference(self, *args, **kwargs)
-# Replace the original method
-ModelInference.run_inference = gpu_run_inference
-# ✅ Main process: only create the app, do not load the model
-if __name__ == "__main__":
-    app = DepthAnything3App(...)
-    app.launch(host="0.0.0.0", port=7860)
-```
-### 2. model_inference.py - Model Management
-```python
-import os
-import torch
-from depth_anything_3.api import DepthAnything3
-# ========================================
-# ✅ Global cache (safe within a subprocess)
-# ========================================
-_MODEL_CACHE = None
-class ModelInference:
-    def __init__(self):
-        """
-        Initialize without storing any state.
-        Note: this instance is created in the main process, but the
-        model is loaded in the subprocess.
-        """
-        pass  # ✅ no instance variables
-    def initialize_model(self, device: str = "cuda"):
-        """
-        Load the model in the subprocess.
-        A global variable is used as the cache because:
-        1. @spaces.GPU runs in a subprocess
-        2. Each subprocess has its own global namespace
-        3. Caching is safe and avoids repeated loading
-        """
-        global _MODEL_CACHE
-        if _MODEL_CACHE is None:
-            # First call: load the model
-            model_dir = os.environ.get("DA3_MODEL_DIR", "...")
-            print(f"🔄 Loading model in GPU subprocess from {model_dir}")
-            _MODEL_CACHE = DepthAnything3.from_pretrained(model_dir)
-            _MODEL_CACHE = _MODEL_CACHE.to(device)  # ✅ moved inside the subprocess
-            _MODEL_CACHE.eval()
-            print(f"✅ Model loaded on {device}")
-        else:
-            # Later calls: reuse the cache
-            print("✅ Using cached model")
-            # Make sure it is on the right device (defensive programming)
-            _MODEL_CACHE = _MODEL_CACHE.to(device)
-        return _MODEL_CACHE
-    def run_inference(self, target_dir, ...):
-        """
-        Run inference in the GPU subprocess.
-        This function is decorated with @spaces.GPU and runs in the subprocess.
-        """
-        # ✅ Get the model inside the subprocess (local variable)
-        device = "cuda" if torch.cuda.is_available() else "cpu"
-        model = self.initialize_model(device)  # ✅ returned, not stored
-        # ✅ Run inference
-        with torch.no_grad():
-            prediction = model.inference(...)
-        # ✅ Process the results
-        # ...
-        # ✅ Key step: move all CUDA tensors to the CPU before returning
-        prediction = self._move_to_cpu(prediction)
-        return prediction, processed_data
-    def _move_to_cpu(self, prediction):
-        """Move all CUDA tensors to the CPU so the result pickles safely."""
-        # ... see the implementation below
-        return prediction
-```
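-### 3. event_handlers.py - Main-Process Code
-```python
-class EventHandlers:
-    def __init__(self):
-        """
-        Main-process initialization; does not load the model.
-        Note: creating a ModelInference instance here is safe because it
-        does not load the model immediately. The model is loaded in the subprocess.
-        """
-        # ✅ Creating the instance is fine (no model is loaded)
-        self.model_inference = ModelInference()
-        # ❌ Do not call initialize_model() here
-        # ❌ Do not load the model here
-    def gradio_demo(self, ...):
-        """
-        Gradio callback, invoked in the main process.
-        This function calls self.model_inference.run_inference, which is
-        decorated with @spaces.GPU and therefore runs in the subprocess.
-        """
-        # ✅ Call the decorated method (runs in the subprocess automatically)
-        result = self.model_inference.run_inference(...)
-        return result
-```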
-## 🔑 Key Principles
-### ✅ DO
-1. **Main process: create the instance only, do not load the model**
-   ```python
-   # ✅ Main process
-   model_inference = ModelInference()  # safe
-   # Do not call initialize_model()
-   ```
-2. **Subprocess: cache the model in a global variable**
-   ```python
-   # ✅ Subprocess (inside a function decorated with @spaces.GPU)
-   _MODEL_CACHE = None  # global variable
-   model = initialize_model()  # loaded in the subprocess
-   ```
-3. **Before returning: move all tensors to the CPU**
-   ```python
-   # ✅ Before returning
-   prediction = move_all_tensors_to_cpu(prediction)
-   return prediction
-   ```
-4. **Free GPU memory**
-   ```python
-   # ✅ After inference
-   torch.cuda.empty_cache()
-   ```
-### ❌ DON'T
-1. **Main process: do not initialize CUDA**
-   ```python
-   # ❌ Main process
-   model.to("cuda")  # 💥 error
-   torch.cuda.is_available()  # 💥 may trigger initialization
-   ```
-2. **Do not store the model in an instance variable**
-   ```python
-   # ❌
-   self.model = load_model()  # inconsistent state
-   ```
-3. **Do not return CUDA tensors**
-   ```python
-   # ❌
-   return prediction  # fails if it contains CUDA tensors
-   ```
-4. **Do not load the model in __init__**
-   ```python
-   # ❌
-   def __init__(self):
-       self.model = load_model()  # runs in the main process and fails
-   ```
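-## 📊 Execution Flow Comparison
-### ❌ Incorrect Flow
-```
-Main process starts
-  ↓
-Create the ModelInference() instance
-  ↓
-self.model = None in __init__  # ✅ safe
-  ↓
-First call to run_inference
-  ↓
-@spaces.GPU creates a subprocess
-  ↓
-Subprocess: self.model = load_model()  # ✅ in the subprocess
-  ↓
-Return prediction (contains CUDA tensors)  # ❌ error
-  ↓
-pickle tries to rebuild the CUDA tensors in the main process  # 💥 fails
-```
-### ✅ Correct Flow
-```
-Main process starts
-  ↓
-Create the ModelInference() instance (stateless)  # ✅
-  ↓
-First call to run_inference
-  ↓
-@spaces.GPU creates a subprocess
-  ↓
-Subprocess: _MODEL_CACHE = load_model()  # ✅ global variable
-  ↓
-Subprocess: model = _MODEL_CACHE  # ✅ local variable
-  ↓
-Subprocess: prediction = model.inference(...)
-  ↓
-Subprocess: prediction = move_to_cpu(prediction)  # ✅
-  ↓
-Return prediction (all tensors on the CPU)  # ✅
-  ↓
-Main process: safely receives the CPU data  # ✅
-```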
-## 🧪 Verification Checklist
-### Main-Process Check
-```python
-# ✅ Should pass
-def test_main_process():
-    # The instance can be created
-    model_inference = ModelInference()
-    # It should hold no model
-    assert not hasattr(model_inference, 'model') or model_inference.model is None
-    # CUDA should not be initialized
-    # (this test must run in the main process)
-```
-### Subprocess Check
-```python
-# ✅ Should pass
-@spaces.GPU
-def test_gpu_subprocess():
-    model_inference = ModelInference()
-    # The model can be loaded
-    model = model_inference.initialize_model("cuda")
-    assert model is not None
-    # The model should be on the GPU
-    # (check the device of the model parameters)
-    # Inference can run
-    # ...
-    # Results should be moved to the CPU before returning
-    # ...
-```
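-## 🎓 FAQ
-### Q1: Why not use an instance variable?
-**A:** The instance is created in the main process, so model state stored on it can become inconsistent across processes.
-```python
-# ❌ Problem
-self.model = load_model()  # state may become inconsistent
-# ✅ Fix
-_MODEL_CACHE = load_model()  # independent in each subprocess
-```
-### Q2: Are global variables safe?
-**A:** Yes, because:
-- Each subprocess has its own global namespace
-- The main process never accesses a subprocess's globals
-- Nothing leaks across processes
-### Q3: Will the model be loaded repeatedly?
-**A:** No, because:
-- The global variable caches the model within a subprocess
-- Repeated calls in the same subprocess reuse it
-- Different subprocesses each keep their own cache (if needed)
-### Q4: How do I release the model?
-**A:** Manual cleanup is usually unnecessary, because:
-- Everything is released automatically when the subprocess exits
-- If needed, you can do this inside the subprocess:
-  ```python
-  global _MODEL_CACHE
-  _MODEL_CACHE = None
-  del model
-  torch.cuda.empty_cache()
-  ```
-## 📝 Complete Code Template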
-```python
-# ========================================
-# model_inference.py
-# ========================================
-_MODEL_CACHE = None  # global cache
-class ModelInference:
-    def __init__(self):
-        pass  # stateless
-    def initialize_model(self, device="cuda"):
-        global _MODEL_CACHE
-        if _MODEL_CACHE is None:
-            _MODEL_CACHE = load_model().to(device)
-        return _MODEL_CACHE
-    def run_inference(self, ...):
-        model = self.initialize_model("cuda")
-        prediction = model.inference(...)
-        prediction = self._move_to_cpu(prediction)
-        return prediction
-# ========================================
-# app.py
-# ========================================
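-# Keep a reference to the original method: once the attribute is replaced
-# below, calling ModelInference.run_inference inside the wrapper would recurse.
-original_run_inference = ModelInference.run_inference
-@spaces.GPU(duration=120)
-def gpu_run_inference(self, *args, **kwargs):
-    return original_run_inference(self, *args, **kwargs)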
-ModelInference.run_inference = gpu_run_inference
-```
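-## 🎯 Summary
-**Core principles:**
-1. ✅ **Main process = CPU environment**: loads no model and never initializes CUDA
-2. ✅ **Subprocess = GPU environment**: loads the model and runs inference
-3. ✅ **Global-variable cache**: independent in each subprocess
-4. ✅ **Return CPU data**: keeps pickling safe
-Follow these principles and your Spaces GPU app will run reliably! 🚀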

SPACES_GPU_FIX_GUIDE.md DELETED

@@ -1,484 +0,0 @@
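-# 🔧 Complete Guide to Fixing the Spaces GPU Issue
-## 🎯 Diagnosis: You Are Absolutely Right!
-### Root Cause Analysis
-```python
-# event_handlers.py - in the main process
-class EventHandlers:
-    def __init__(self):
-        self.model_inference = ModelInference()  # ❌ instance created in the main process
-# model_inference.py
-class ModelInference:
-    def __init__(self):
-        self.model = None  # ❌ instance variable; sharing state across processes is problematic
-    def initialize_model(self, device):
-        if self.model is None:
-            self.model = load_model()  # First call: loaded in the subprocess
-        else:
-            self.model = self.model.to(device)  # Second call: 💥 CUDA operation in the main process!
-```
-### Why Does the Second Call Fail?
-1. **First call**:
-   - `@spaces.GPU` runs in a subprocess
-   - `self.model is None` → the model is loaded
-   - `self.model` is stored on the instance
-   - On return, `prediction.gaussians` contains CUDA tensors
-   - **Pickling leads to an attempt to rebuild the CUDA tensors in the main process** → 💥
-2. **Second call** (even if the first one succeeded):
-   - A new subprocess, or inconsistent state
-   - The state of `self.model` is undefined
-   - The `.to(device)` call is attempted → 💥
-## ✅ Solution: Two Key Changes
-### Change 1: Cache the Model in a Global Variable (Avoid Instance State)
-**Why a global variable?**
-- `@spaces.GPU` runs each call in an isolated subprocess
-- A global variable is safe inside the subprocess
-- The main process is not polluted
-### Change 2: Move All CUDA Tensors to the CPU Before Returning
-**Why is this needed?**
-- When the return value is pickled, the CUDA tensors would have to be rebuilt in the main process
-- Everything returned must therefore live on the CPU
-## 📝 Complete Fixed Code
-### File: `depth_anything_3/app/modules/model_inference.py`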
-```python
-"""
-Model inference module for Depth Anything 3 Gradio app.
-Modified for HF Spaces GPU compatibility.
-"""
-import gc
-import glob
-import os
-from typing import Any, Dict, Optional, Tuple
-import numpy as np
-import torch
-from depth_anything_3.api import DepthAnything3
-from depth_anything_3.utils.export.glb import export_to_glb
-from depth_anything_3.utils.export.gs import export_to_gs_video
-# ========================================
-# 🔑 Key change 1: cache the model in a global variable
-# ========================================
-# Global cache for model (used in GPU subprocess)
-# This is SAFE because @spaces.GPU runs in isolated subprocess
-# Each subprocess gets its own copy of this global variable
-_MODEL_CACHE = None
-class ModelInference:
-    """
-    Handles model inference and data processing for Depth Anything 3.
-    Modified for HF Spaces GPU compatibility - does NOT store state
-    in instance variables to avoid cross-process issues.
-    """
-    def __init__(self):
-        """Initialize the model inference handler.
-        Note: Do NOT store model in instance variable to avoid
-        state sharing issues with @spaces.GPU decorator.
-        """
-        # No instance variables! All state in global or local variables
-        pass
-    def initialize_model(self, device: str = "cuda"):
-        """
-        Initialize the DepthAnything3 model using global cache.
-        This uses a global variable which is safe because:
-        1. @spaces.GPU runs in isolated subprocess
-        2. Each subprocess has its own global namespace
-        3. No state leaks to main process
-        Args:
-            device: Device to load the model on
-        Returns:
-            Model instance ready for inference
-        """
-        global _MODEL_CACHE
-        if _MODEL_CACHE is None:
-            # First time loading in this subprocess
-            model_dir = os.environ.get(
-                "DA3_MODEL_DIR", "depth-anything/DA3NESTED-GIANT-LARGE"
-            )
-            print(f"🔄 Loading model from {model_dir}...")
-            _MODEL_CACHE = DepthAnything3.from_pretrained(model_dir)
-            _MODEL_CACHE = _MODEL_CACHE.to(device)
-            _MODEL_CACHE.eval()
-            print("✅ Model loaded and ready on GPU")
-        else:
-            # Model already cached in this subprocess
-            print("✅ Using cached model")
-            # Ensure it's on the correct device (defensive programming)
-            _MODEL_CACHE = _MODEL_CACHE.to(device)
-        return _MODEL_CACHE
-    def run_inference(
-        self,
-        target_dir: str,
-        filter_black_bg: bool = False,
-        filter_white_bg: bool = False,
-        process_res_method: str = "upper_bound_resize",
-        show_camera: bool = True,
-        selected_first_frame: Optional[str] = None,
-        save_percentage: float = 30.0,
-        num_max_points: int = 1_000_000,
-        infer_gs: bool = False,
-        gs_trj_mode: str = "extend",
-        gs_video_quality: str = "high",
-    ) -> Tuple[Any, Dict[int, Dict[str, Any]]]:
-        """
-        Run DepthAnything3 model inference on images.
-        This method is wrapped with @spaces.GPU in app.py.
-        Args:
-            target_dir: Directory containing images
-            filter_black_bg: Whether to filter black background
-            filter_white_bg: Whether to filter white background
-            process_res_method: Method for resizing input images
-            show_camera: Whether to show camera in 3D view
-            selected_first_frame: Selected first frame filename
-            save_percentage: Percentage of points to save (0-100)
-            num_max_points: Maximum number of points
-            infer_gs: Whether to infer 3D Gaussian Splatting
-            gs_trj_mode: Trajectory mode for GS
-            gs_video_quality: Video quality for GS
-        Returns:
-            Tuple of (prediction, processed_data)
-        """
-        print(f"Processing images from {target_dir}")
-        # Device check
-        device = "cuda" if torch.cuda.is_available() else "cpu"
-        device = torch.device(device)
-        print(f"Using device: {device}")
-        # 🔑 Use the return value rather than self.model
-        model = self.initialize_model(device)
-        # Get image paths
-        print("Loading images...")
-        image_folder_path = os.path.join(target_dir, "images")
-        all_image_paths = sorted(glob.glob(os.path.join(image_folder_path, "*")))
-        # Filter for image files
-        image_extensions = [".jpg", ".jpeg", ".png", ".bmp", ".tiff", ".tif"]
-        all_image_paths = [
-            path
-            for path in all_image_paths
-            if any(path.lower().endswith(ext) for ext in image_extensions)
-        ]
-        print(f"Found {len(all_image_paths)} images")
-        # Apply first frame selection logic
-        if selected_first_frame:
-            selected_path = None
-            for path in all_image_paths:
-                if os.path.basename(path) == selected_first_frame:
-                    selected_path = path
-                    break
-            if selected_path:
-                image_paths = [selected_path] + [
-                    path for path in all_image_paths if path != selected_path
-                ]
-                print(f"User selected first frame: {selected_first_frame}")
-            else:
-                image_paths = all_image_paths
-                print("Selected frame not found, using default order")
-        else:
-            image_paths = all_image_paths
-        if len(image_paths) == 0:
-            raise ValueError("No images found. Check your upload.")
-        # Map UI options to actual method names
-        method_mapping = {"high_res": "lower_bound_resize", "low_res": "upper_bound_resize"}
-        actual_method = method_mapping.get(process_res_method, "upper_bound_crop")
-        # Run model inference
-        print(f"Running inference with method: {actual_method}")
-        with torch.no_grad():
-            # 🔑 Use the local variable model, not self.model
-            prediction = model.inference(
-                image_paths, export_dir=None, process_res_method=actual_method, infer_gs=infer_gs
-            )
-        # Export to GLB
-        export_to_glb(
-            prediction,
-            filter_black_bg=filter_black_bg,
-            filter_white_bg=filter_white_bg,
-            export_dir=target_dir,
-            show_cameras=show_camera,
-            conf_thresh_percentile=save_percentage,
-            num_max_points=int(num_max_points),
-        )
-        # Export to GS video if needed
-        if infer_gs:
-            mode_mapping = {"extend": "extend", "smooth": "interpolate_smooth"}
-            # Resolve the backend mode once so an unknown UI value falls back to "extend"
-            backend_mode = mode_mapping.get(gs_trj_mode, "extend")
-            print(f"GS mode: {gs_trj_mode}; Backend mode: {backend_mode}")
-            export_to_gs_video(
-                prediction,
-                export_dir=target_dir,
-                chunk_size=4,
-                trj_mode=backend_mode,
-                enable_tqdm=True,
-                vis_depth="hcat",
-                video_quality=gs_video_quality,
-            )
-        # Save predictions cache
-        self._save_predictions_cache(target_dir, prediction)
-        # Process results
-        processed_data = self._process_results(target_dir, prediction, image_paths)
-        # ========================================
-        # 🔑 Key change 2: move all CUDA tensors to the CPU before returning
-        # ========================================
-        print("Moving all tensors to CPU for safe return...")
-        prediction = self._move_prediction_to_cpu(prediction)
-        # Clean up GPU memory
-        torch.cuda.empty_cache()
-        return prediction, processed_data
-    def _move_prediction_to_cpu(self, prediction: Any) -> Any:
-        """
-        Move all CUDA tensors in prediction to CPU for safe pickling.
-        This is CRITICAL for HF Spaces with @spaces.GPU decorator.
-        Without this, pickle will try to reconstruct CUDA tensors in
-        the main process, causing CUDA initialization error.
-        Args:
-            prediction: Prediction object that may contain CUDA tensors
-        Returns:
-            Prediction object with all tensors moved to CPU
-        """
-        # Move gaussians tensors to CPU
-        if hasattr(prediction, 'gaussians') and prediction.gaussians is not None:
-            gaussians = prediction.gaussians
-            # Move each tensor attribute to CPU
-            tensor_attrs = ['means', 'scales', 'rotations', 'harmonics', 'opacities']
-            for attr in tensor_attrs:
-                if hasattr(gaussians, attr):
-                    tensor = getattr(gaussians, attr)
-                    if isinstance(tensor, torch.Tensor) and tensor.is_cuda:
-                        setattr(gaussians, attr, tensor.cpu())
-                        print(f"  ✓ Moved gaussians.{attr} to CPU")
-        # Move any tensors in aux dict to CPU
-        if hasattr(prediction, 'aux') and prediction.aux is not None:
-            for key, value in list(prediction.aux.items()):
-                if isinstance(value, torch.Tensor) and value.is_cuda:
-                    prediction.aux[key] = value.cpu()
-                    print(f"  ✓ Moved aux['{key}'] to CPU")
-                elif isinstance(value, dict):
-                    # Handle one level of nested dicts
-                    for k, v in list(value.items()):
-                        if isinstance(v, torch.Tensor) and v.is_cuda:
-                            value[k] = v.cpu()
-                            print(f"  ✓ Moved aux['{key}']['{k}'] to CPU")
-        print("✅ All tensors moved to CPU")
-        return prediction
-    def _save_predictions_cache(self, target_dir: str, prediction: Any) -> None:
-        """Save predictions data to predictions.npz for caching."""
-        try:
-            output_file = os.path.join(target_dir, "predictions.npz")
-            save_dict = {}
-            if prediction.processed_images is not None:
-                save_dict["images"] = prediction.processed_images
-            if prediction.depth is not None:
-                save_dict["depths"] = np.round(prediction.depth, 6)
-            if prediction.conf is not None:
-                save_dict["conf"] = np.round(prediction.conf, 2)
-            if prediction.extrinsics is not None:
-                save_dict["extrinsics"] = prediction.extrinsics
-            if prediction.intrinsics is not None:
-                save_dict["intrinsics"] = prediction.intrinsics
-            np.savez_compressed(output_file, **save_dict)
-            print(f"Saved predictions cache to: {output_file}")
-        except Exception as e:
-            print(f"Warning: Failed to save predictions cache: {e}")
-    def _process_results(
-        self, target_dir: str, prediction: Any, image_paths: list
-    ) -> Dict[int, Dict[str, Any]]:
-        """Process model results into structured data."""
-        processed_data = {}
-        depth_vis_dir = os.path.join(target_dir, "depth_vis")
-        if os.path.exists(depth_vis_dir):
-            depth_files = sorted(glob.glob(os.path.join(depth_vis_dir, "*.jpg")))
-            for i, depth_file in enumerate(depth_files):
-                processed_image = None
-                if prediction.processed_images is not None and i < len(
-                    prediction.processed_images
-                ):
-                    processed_image = prediction.processed_images[i]
-                processed_data[i] = {
-                    "depth_image": depth_file,
-                    "image": processed_image,
-                    "original_image_path": image_paths[i] if i < len(image_paths) else None,
-                    "depth": prediction.depth[i] if i < len(prediction.depth) else None,
-                    "intrinsics": (
-                        prediction.intrinsics[i]
-                        if prediction.intrinsics is not None and i < len(prediction.intrinsics)
-                        else None
-                    ),
-                    "mask": None,
-                }
-        return processed_data
-    def cleanup(self) -> None:
-        """Clean up GPU memory."""
-        if torch.cuda.is_available():
-            torch.cuda.empty_cache()
-        gc.collect()
-```
-## 🔍 Summary of Key Changes
-### Before (broken):
-```python
-class ModelInference:
-    def __init__(self):
-        self.model = None  # ❌ instance variable
-    def initialize_model(self, device):
-        if self.model is None:
-            self.model = load_model()  # ❌ stored on the instance
-        else:
-            self.model = self.model.to(device)  # ❌ cross-process operation
-    def run_inference(self):
-        self.initialize_model(device)  # ❌ uses the instance method
-        prediction = self.model.inference(...)  # ❌ uses the instance variable
-        return prediction  # ❌ contains CUDA tensors
-```
-### After (fixed):
-```python
-_MODEL_CACHE = None  # ✅ global variable (safe within a subprocess)
-class ModelInference:
-    def __init__(self):
-        pass  # ✅ no instance variables
-    def initialize_model(self, device):
-        global _MODEL_CACHE
-        if _MODEL_CACHE is None:
-            _MODEL_CACHE = load_model()  # ✅ stored in the global
-        return _MODEL_CACHE  # ✅ returned rather than stored
-    def run_inference(self):
-        model = self.initialize_model(device)  # ✅ local variable
-        prediction = model.inference(...)  # ✅ uses the local variable
-        prediction = self._move_prediction_to_cpu(prediction)  # ✅ moved to the CPU
-        return prediction  # ✅ safe to return
-```
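-## 🎯 Why These Changes?
-### 1. Global Variable vs. Instance Variable
-| Approach | Result | Reason |
-|------|------|------|
-| `self.model` | ❌ Inconsistent state across processes | The instance is created in the main process |
-| `_MODEL_CACHE` | ✅ Safe within a subprocess | Independent in each subprocess |
-### 2. Returning CPU Tensors
-```python
-# ❌ Returning directly raises an error
-return prediction  # prediction.gaussians.means is on CUDA
-# ✅ Move to the CPU, then return
-prediction = move_to_cpu(prediction)
-return prediction  # All tensors are on CPU, pickle safe
-```
-## 🧪 Testing the Fix
-```bash
-# 1. Apply the changes
-# Copy the complete code above into model_inference.py
-# 2. Push to Spaces
-git add depth_anything_3/app/modules/model_inference.py
-git commit -m "Fix: Spaces GPU CUDA initialization error"
-git push
-# 3. Test repeated runs
-# Run inference 2-3 times in a row in the Space
-# The CUDA error should no longer appear
-```
-## 📊 Effect of the Fix
-| Issue | Before | After |
-|------|--------|-------|
-| First inference | ❌ CUDA error | ✅ Works |
-| Second inference | ❌ CUDA error | ✅ Works |
-| Consecutive inferences | ❌ Fails | ✅ Stable |
-| Model loading | Reloaded every time | Cached and reused |
-## 💡 Best Practices
-For functions decorated with `@spaces.GPU`:
-1. ✅ Cache the model in a **global variable** (safe within a subprocess)
-2. ✅ **Do not** store the model in an instance variable
-3. ✅ **Move all tensors to the CPU** before returning
-4. ✅ Free GPU memory (`torch.cuda.empty_cache()`)
-5. ❌ **Do not** initialize CUDA in the main process
-6. ❌ **Do not** return CUDA tensors
-## 🔗 Related Resources
-- [HF Spaces Zero GPU documentation](https://huggingface.co/docs/hub/spaces-gpus#zero-gpu)
-- [PyTorch Multiprocessing](https://pytorch.org/docs/stable/notes/multiprocessing.html)
-- [Pickle protocol](https://docs.python.org/3/library/pickle.html)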
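The `_move_prediction_to_cpu` method in the deleted `SPACES_GPU_FIX_GUIDE.md` handles a fixed attribute list and one level of dict nesting. A fully recursive variant could look like the sketch below. It is an illustration under stated assumptions, not the repository's code: `to_cpu`, `FakeTensor`, and `FakePrediction` are hypothetical names, and the helper is duck-typed on `is_cuda` and `.cpu()` (attributes that `torch.Tensor` exposes) so that it runs without torch installed. It also assumes the object graph has no reference cycles and that plain tuples are acceptable in place of named tuples.

```python
def to_cpu(obj):
    """Recursively replace CUDA-resident values with their CPU copies.

    Anything exposing a truthy `is_cuda` and a callable `cpu()` is treated
    as a tensor; dicts, lists, tuples, and plain objects are traversed.
    """
    if getattr(obj, "is_cuda", False) and callable(getattr(obj, "cpu", None)):
        return obj.cpu()
    if isinstance(obj, dict):
        return {key: to_cpu(value) for key, value in obj.items()}
    if isinstance(obj, list):
        return [to_cpu(value) for value in obj]
    if isinstance(obj, tuple):
        return tuple(to_cpu(value) for value in obj)
    if hasattr(obj, "__dict__") and not isinstance(obj, type):
        # Plain object: rewrite its attributes in place.
        for name, value in list(vars(obj).items()):
            setattr(obj, name, to_cpu(value))
    return obj


class FakeTensor:
    """Hypothetical stand-in for torch.Tensor, used only to exercise to_cpu."""

    def __init__(self, is_cuda):
        self.is_cuda = is_cuda

    def cpu(self):
        return FakeTensor(is_cuda=False)


class FakePrediction:
    """Hypothetical prediction object with nested containers."""

    def __init__(self):
        self.gaussians = {"means": FakeTensor(True), "levels": [FakeTensor(True)]}
        self.aux = {"nested": {"deep": {"t": FakeTensor(True)}}}
        self.depth = [1.0, 2.0]


result = to_cpu(FakePrediction())
```

After the call, every `FakeTensor` reachable from `result` reports `is_cuda == False`, including the one three dicts deep, while non-tensor data such as `depth` is left unchanged.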

SPACES_SETUP.md DELETED

@@ -1,190 +0,0 @@
-# Hugging Face Spaces 部署指南
-## 📋 概述
-这个项目已经配置好可以部署到 Hugging Face Spaces，使用 `@spaces.GPU` 装饰器来动态分配 GPU 资源。
-## 🎯 关键文件
-### 1. `app.py` - 主应用文件
-```python
-import spaces
-from depth_anything_3.app.gradio_app import DepthAnything3App
-from depth_anything_3.app.modules.model_inference import ModelInference
-# 使用 monkey-patching 将 GPU 装饰器应用到推理函数
-original_run_inference = ModelInference.run_inference
-@spaces.GPU(duration=120)  # 请求 GPU，最多 120 秒
-def gpu_run_inference(self, *args, **kwargs):
-    return original_run_inference(self, *args, **kwargs)
-ModelInference.run_inference = gpu_run_inference
-```
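-**How it works:**
-- The `@spaces.GPU` decorator allocates a GPU dynamically when the function is called
-- `duration=120` means a single inference call may hold the GPU for at most 120 seconds
-- Monkey-patching applies the decorator to the existing inference function without modifying the core code
-### 2. `README.md` - Spaces configuration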
-```yaml
----
-title: Depth Anything 3
-sdk: gradio
-sdk_version: 5.49.1
-app_file: app.py
-pinned: false
-license: cc-by-nc-4.0
----
-```
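-This YAML front matter tells Hugging Face Spaces:
-- to use the Gradio SDK
-- that the entry file is `app.py`
-- which Gradio version to use
-### 3. `pyproject.toml` - dependency configuration
-Already updated to include the `spaces` dependency: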
-```toml
-[project.optional-dependencies]
-app = ["gradio>=5", "pillow>=9.0", "spaces"]
-```
-## 🚀 Deployment steps
-### Option 1: through the Hugging Face web interface
-1. Create a new Space on Hugging Face
-2. Select **Gradio** as the SDK
-3. Upload your code (including `app.py`, `src/`, `pyproject.toml`, and so on)
-4. The Space builds and starts automatically
-### Option 2: through Git
-```bash
-# Clone your Space repository
-git clone https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
-cd YOUR_SPACE_NAME
-# Add your code
-cp -r /path/to/depth-anything-3/* .
-# Commit and push
-git add .
-git commit -m "Initial commit"
-git push
-```
-## 🔧 Configuration options
-### GPU type
-Hugging Face Spaces supports different GPU types:
-- **Free (T4)**: free, suitable for small models
-- **A10G**: paid, more powerful
-- **A100**: paid, most powerful
-### GPU Duration
-You can adjust it in `app.py`:
-```python
-@spaces.GPU(duration=120)  # 120 seconds
-```
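-- Too short: complex inference may time out
-- Too long: wastes resources
-- Recommended: base it on the measured inference time (start with a generous value, then adjust it using the logs)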
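One way to get the measured inference time mentioned above is a small timing decorator whose output lands in the logs. The sketch below is an illustration only: `timed` and `fake_inference` are hypothetical names, and the sleep stands in for real inference.

```python
import functools
import time


def timed(func):
    """Print how long each call to func takes, in seconds."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            print(f"{func.__name__} took {elapsed:.2f}s")
    return wrapper


@timed
def fake_inference(delay=0.05):
    """Hypothetical stand-in for the real inference call."""
    time.sleep(delay)
    return "done"


result = fake_inference()
```

After a few real runs, a `duration` somewhat above the slowest observed time is a reasonable starting point.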
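-### Environment variables
-You can configure environment variables in the Space settings:
-- `DA3_MODEL_DIR`: model directory path
-- `DA3_WORKSPACE_DIR`: workspace directory
-- `DA3_GALLERY_DIR`: gallery directory
-## 📊 Monitoring and debugging
-### Viewing logs
-Click the "Logs" tab in the Spaces interface to see output such as: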
-```
-🚀 Launching Depth Anything 3 on Hugging Face Spaces...
-📦 Model Directory: depth-anything/DA3NESTED-GIANT-LARGE
-📁 Workspace Directory: workspace/gradio
-🖼️  Gallery Directory: workspace/gallery
-```
-### GPU usage
-Inside the decorated function, you can check the GPU state:
-```python
-import torch
-print(torch.cuda.is_available())  # True
-print(torch.cuda.device_count())  # 1 (usually)
-print(torch.cuda.get_device_name(0))  # 'Tesla T4' or another device
-```
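-## 🎓 Example code
-See `example_spaces_gpu.py` for basic usage of the `@spaces.GPU` decorator.
-## ❓ FAQ
-### Q: Why use monkey-patching?
-A: It adds Spaces support without modifying the core code. If you prefer a cleaner approach, you can:
-1. Add the decorator directly to the `ModelInference.run_inference` method
-2. Create a new class that inherits from `ModelInference`
-### Q: How do I test whether it runs locally?
-A: When the `spaces` package is installed, the `@spaces.GPU` decorator has no effect outside a ZeroGPU environment and the function runs directly. If the package is not installed, `import spaces` fails, so install it or guard the import.
-```bash
-# Local test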
-python app.py
-```
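For local runs on a machine where the `spaces` package may be missing, one option is to guard the import and substitute a no-op decorator. This is a sketch rather than part of the project: the fallback `gpu` function and `run_inference` are hypothetical, and only `spaces.GPU` comes from the real package.

```python
# Sketch: fall back to a no-op decorator when `spaces` is not installed.
try:
    import spaces
    gpu = spaces.GPU
except ImportError:
    def gpu(func=None, **_options):
        """No-op replacement supporting both @gpu and @gpu(duration=...)."""
        if func is None:
            # Called with options: return a decorator that changes nothing.
            return lambda wrapped: wrapped
        return func


@gpu(duration=120)
def run_inference(x):
    """Hypothetical stand-in for the real inference function."""
    return x * 2


print(run_inference(21))
```

With this guard, the same `app.py` can start both on Spaces and on a machine without the package.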
-### Q: Can I decorate multiple functions?
-A: Yes. You can add the `@spaces.GPU` decorator to any function that needs a GPU.
-```python
-@spaces.GPU(duration=60)
-def function1():
-    pass
-@spaces.GPU(duration=120)
-def function2():
-    pass
-```
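-### Q: How can I optimize GPU usage?
-A: Some suggestions:
-1. **Decorate only the functions that need it**: do not decorate the whole app, only the inference functions that actually use the GPU
-2. **Set an appropriate duration**: base it on actual needs
-3. **Free GPU memory**: call `torch.cuda.empty_cache()` at the end of the function
-4. **Batch**: where possible, process multiple requests in a batch
-## 🔗 Related resources
-- [Hugging Face Spaces documentation](https://huggingface.co/docs/hub/spaces)
-- [Spaces GPU guide](https://huggingface.co/docs/hub/spaces-gpus)
-- [Gradio documentation](https://gradio.app/docs)
-## 📝 License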
-Apache-2.0

UPLOAD_EXAMPLES.md DELETED

@@ -1,314 +0,0 @@
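-# 📤 Guide to Uploading Examples to Hugging Face Spaces
-## 🚨 Problem: binary files are rejected
-Hugging Face Spaces rejects large files (>100MB) and binary files; they must be uploaded with **Git LFS**.
-## ✅ Solutions
-### Solution 1: use Git LFS (recommended) ⭐
-#### Step 1: configure Git LFS
-I have already created a `.gitattributes` file for you that configures Git LFS for image files: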
-```gitattributes
-# Images in examples directory
-workspace/gradio/examples/**/*.png filter=lfs diff=lfs merge=lfs -text
-workspace/gradio/examples/**/*.jpg filter=lfs diff=lfs merge=lfs -text
-workspace/gradio/examples/**/*.jpeg filter=lfs diff=lfs merge=lfs -text
-workspace/gradio/examples/**/*.bmp filter=lfs diff=lfs merge=lfs -text
-workspace/gradio/examples/**/*.tiff filter=lfs diff=lfs merge=lfs -text
-workspace/gradio/examples/**/*.tif filter=lfs diff=lfs merge=lfs -text
-```
-#### Step 2: install Git LFS (if not already installed)
-```bash
-# macOS
-brew install git-lfs
-# Linux
-sudo apt-get install git-lfs
-# Windows
-# Download and install: https://git-lfs.github.com/
-```
-#### Step 3: initialize Git LFS
-```bash
-cd /Users/bytedance/depth-anything-3
-# Initialize Git LFS
-git lfs install
-# Verify the configuration
-git lfs track
-```
-#### Step 4: add example scenes
-```bash
-# Create the examples directory
-mkdir -p workspace/gradio/examples/my_scene
-# Add image files
-cp your_images/* workspace/gradio/examples/my_scene/
-# Stage the files (tracked by Git LFS)
-git add workspace/gradio/examples/
-git add .gitattributes
-# Commit
-git commit -m "Add example scenes with Git LFS"
-# Push to Spaces
-git push origin main
-```
-#### Step 5: verify
-```bash
-# Check which files use LFS
-git lfs ls-files
-# Your image files should be listed
-```
----
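-### Solution 2: use persistent storage (recommended for large amounts of data) ⭐
-If the example scenes are large, you can use the persistent storage feature of Hugging Face Spaces.
-#### Step 1: enable persistent storage in the Space settings
-1. Open your Space settings
-2. Enable "Persistent storage"
-3. Choose a storage size (for example 50GB)
-#### Step 2: download the examples when the app starts
-Modify `app.py` to download the examples from an external source at startup: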
-```python
-import os
-import subprocess
-def download_examples():
-    """Download examples from external source if not exists"""
-    examples_dir = "workspace/gradio/examples"
-    if not os.path.exists(examples_dir) or not os.listdir(examples_dir):
-        print("Downloading example scenes...")
-        # Download from a Hugging Face Dataset
-        # or from another storage service
-        # subprocess.run(["huggingface-cli", "download", "dataset/examples", ...])
-        pass
-if __name__ == "__main__":
-    download_examples()
-    # ... launch the app
-```
-#### Step 3: upload to a Hugging Face Dataset
-```bash
-# Install dependencies
-pip install huggingface_hub datasets
-# Upload to a Dataset
-python -c "
-from datasets import Dataset
-from huggingface_hub import HfApi
-# Create the dataset and upload
-api = HfApi()
-api.upload_folder(
-    folder_path='workspace/gradio/examples',
-    repo_id='your-username/your-examples-dataset',
-    repo_type='dataset'
-)
-"
-```
----
-### Solution 3: upload a compressed archive (small files)
-If the image files are small (<100MB), you can compress them before uploading:
-```bash
-# Compress the examples directory
-tar -czf examples.tar.gz workspace/gradio/examples/
-# Add to Git (as a regular file)
-git add examples.tar.gz
-git commit -m "Add compressed examples"
-git push
-# Extract when the app starts
-# Add the following Python to app.py:
-#   import os
-#   import tarfile
-#   if not os.path.exists("workspace/gradio/examples"):
-#       print("Extracting examples...")
-#       tarfile.open("examples.tar.gz").extractall()
-```
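The startup extraction step can also be written as a standalone Python function. The sketch below makes a few assumptions: `extract_examples` is a hypothetical helper, it extracts into an explicit destination directory rather than the current directory, and it uses the `data` extraction filter when the running Python provides it. The demonstration builds a throwaway archive in a temporary directory.

```python
import os
import tarfile
import tempfile


def extract_examples(archive_path, dest_dir):
    """Extract archive_path into dest_dir unless dest_dir already has content."""
    if os.path.isdir(dest_dir) and os.listdir(dest_dir):
        return False  # already extracted, nothing to do
    os.makedirs(dest_dir, exist_ok=True)
    with tarfile.open(archive_path) as archive:
        if hasattr(tarfile, "data_filter"):
            # Rejects absolute paths and members that would escape dest_dir.
            archive.extractall(dest_dir, filter="data")
        else:
            archive.extractall(dest_dir)
    return True


# Demonstration with a throwaway archive in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "scene1")
    os.makedirs(src)
    with open(os.path.join(src, "000.txt"), "w") as handle:
        handle.write("placeholder")
    archive_path = os.path.join(tmp, "examples.tar.gz")
    with tarfile.open(archive_path, "w:gz") as archive:
        archive.add(src, arcname="scene1")
    dest = os.path.join(tmp, "examples")
    first = extract_examples(archive_path, dest)
    second = extract_examples(archive_path, dest)
    extracted = os.path.exists(os.path.join(dest, "scene1", "000.txt"))
```

Checking for an empty directory rather than a missing one also covers the case where the directory exists but was never populated.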
----
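-### Solution 4: download at runtime (recommended for production) ⭐
-Download the example scenes from an external source when the app starts:
-#### Modify `app.py`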
-```python
-import os
-import subprocess
-from huggingface_hub import hf_hub_download
-def setup_examples():
-    """Setup examples directory by downloading if needed"""
-    examples_dir = "workspace/gradio/examples"
-    os.makedirs(examples_dir, exist_ok=True)
-    # If the examples directory is empty, download from an external source
-    if not os.listdir(examples_dir):
-        print("📥 Downloading example scenes...")
-        # Option 1: download from a Hugging Face Dataset
-        try:
-            from datasets import load_dataset
-            dataset = load_dataset("your-username/your-examples-dataset")
-            # Process and save into examples_dir
-        except Exception as exc:  # avoid a bare except that hides the failure
-            print(f"⚠️ Could not download examples: {exc}")
-        # Option 2: download an archive from a URL
-        # import urllib.request
-        # urllib.request.urlretrieve("https://...", "examples.zip")
-        # Extract into examples_dir
-        print("✅ Example setup finished")
-if __name__ == "__main__":
-    setup_examples()
-    # ... launch the app
-```
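A variant of the runtime download that is easier to test separates the "is the directory empty?" check from the download itself. In this sketch, `ensure_examples`, `hub_downloader`, and `fake_downloader` are hypothetical names and the repo id is a placeholder. `snapshot_download` is the `huggingface_hub` function for fetching a whole repository, and it is imported lazily so the demonstration runs without the package or network access.

```python
import os
import tempfile


def hub_downloader(target_dir):
    """Download a dataset repo into target_dir (the repo id is a placeholder)."""
    from huggingface_hub import snapshot_download  # imported lazily
    snapshot_download(
        repo_id="your-username/your-examples-dataset",
        repo_type="dataset",
        local_dir=target_dir,
    )


def ensure_examples(examples_dir, downloader=hub_downloader):
    """Call downloader only when examples_dir is missing or empty."""
    os.makedirs(examples_dir, exist_ok=True)
    if os.listdir(examples_dir):
        return "cached"
    try:
        downloader(examples_dir)
    except Exception as exc:  # let the app start even if the download fails
        print(f"Could not download examples: {exc}")
        return "failed"
    return "downloaded"


# Demonstration with a fake downloader so no network access is needed.
def fake_downloader(target_dir):
    with open(os.path.join(target_dir, "000.png"), "wb") as handle:
        handle.write(b"")


with tempfile.TemporaryDirectory() as tmp:
    examples = os.path.join(tmp, "examples")
    first = ensure_examples(examples, fake_downloader)
    second = ensure_examples(examples, fake_downloader)
```

Returning a status string instead of printing a fixed success message makes it clear whether the examples were downloaded, reused, or unavailable.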
----
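-## 🎯 Comparison of the solutions
-| Solution | Pros | Cons | Use case |
-|------|------|------|----------|
-| **Git LFS** | ✅ Simple and direct<br>✅ Version control | ⚠️ Requires LFS quota<br>⚠️ Large files can be slow | Small to medium examples (<1GB) |
-| **Persistent storage** | ✅ No size limit<br>✅ Fast access | ⚠️ Requires manual upload<br>⚠️ Paid | Many examples (>1GB) |
-| **Runtime download** | ✅ Takes no repository space<br>✅ Flexible | ⚠️ Slow first startup<br>⚠️ Requires network access | Production |
-| **Compressed upload** | ✅ Simple | ⚠️ Size limit<br>⚠️ Requires extraction | Small files (<100MB) |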
----
-## 📝 Complete Git LFS setup steps
-### 1. Make sure Git LFS is installed
-```bash
-git lfs version
-# If it is not installed, follow the installation steps above
-```
-### 2. Initialize Git LFS
-```bash
-cd /Users/bytedance/depth-anything-3
-git lfs install
-```
-### 3. Check .gitattributes
-Make sure `.gitattributes` contains the image file configuration (I have already added it).
-### 4. Add example scenes
-```bash
-# Create a scene
-mkdir -p workspace/gradio/examples/scene1
-cp your_images/* workspace/gradio/examples/scene1/
-# Stage the files
-git add workspace/gradio/examples/
-git add .gitattributes
-# Check which files will use LFS
-git lfs ls-files
-# Commit
-git commit -m "Add example scenes with Git LFS"
-# Push
-git push origin main
-```
-### 5. Verify the upload
-In the Space, check that the files uploaded successfully; image files should be shown as LFS pointers.
----
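-## 🔧 Troubleshooting
-### Problem 1: insufficient Git LFS quota
-**Solutions:**
-- Use Solution 2 (persistent storage) or Solution 4 (runtime download)
-- Compress the image files
-- Upload only the examples you need
-### Problem 2: push fails
-**Check:**
-```bash
-# Check LFS files
-git lfs ls-files
-# Check LFS status
-git lfs status
-# Push again (--force overwrites the remote history, so use it with care)
-git push origin main --force
-```
-### Problem 3: files are still rejected
-**Possible causes:**
-- `.gitattributes` is configured incorrectly
-- The files were not added through LFS
-**Fix:**
-```bash
-# Remove from the index and re-add
-git rm --cached workspace/gradio/examples/**/*.png
-git add workspace/gradio/examples/
-git commit -m "Fix: Add images via Git LFS"
-git push
-```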
----
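-## 💡 Best practices
-1. **Small examples (<100MB)**: use Git LFS
-2. **Medium examples (100MB-1GB)**: use Git LFS or persistent storage
-3. **Large examples (>1GB)**: use persistent storage or runtime download
-4. **Production**: use runtime download from an external source
----
-## 📚 Related resources
-- [Git LFS documentation](https://git-lfs.github.com/)
-- [Hugging Face Spaces documentation](https://huggingface.co/docs/hub/spaces)
-- [Hugging Face Datasets](https://huggingface.co/docs/datasets)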

XFORMERS_GUIDE.md DELETED

@@ -1,299 +0,0 @@
-# xformers Dependency Notes
-## 🔍 Problem description
-Installing xformers fails during the build:
-```
-RuntimeError: CUTLASS submodule not found. Did you forget to run `git submodule update --init --recursive` ?
-```
-## ✅ Good news: xformers is not required!
-Your code already has a **fallback mechanism**: without xformers it automatically uses a pure PyTorch implementation:
-```python
-# src/depth_anything_3/model/dinov2/layers/swiglu_ffn.py
-try:
-    from xformers.ops import SwiGLU
-    XFORMERS_AVAILABLE = True
-except ImportError:
-    SwiGLU = SwiGLUFFN  # use the pure PyTorch implementation
-    XFORMERS_AVAILABLE = False
-```
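-**Performance difference:**
-- **With xformers**: slightly faster (~5-10%)
-- **Without xformers**: slightly slower, but functionally identical
-## 🎯 Recommended configuration
-### Current configuration (already set) ✅
-**requirements.txt** - xformers is commented out: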
-```txt
-# xformers - install separately if needed
-```
-This ensures the build succeeds and the app runs normally.
-## 📝 Three ways to use it
----
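-### Option 1: without xformers (current configuration) ⭐ recommended
-**Pros:**
-- ✅ Fast build (5-10 minutes)
-- ✅ 100% success rate
-- ✅ Full functionality
-- ✅ No compatibility issues to handle
-**Cons:**
-- ⚠️ Slightly lower performance (5-10%)
-**Use cases:**
-- HF Spaces deployment
-- Quick testing
-- Avoiding build problems
----
-### Option 2: prebuilt xformers
-If you want better performance, you can use a prebuilt version:
-**Step 1: determine the PyTorch and CUDA versions**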
-```python
-import torch
-print(f"PyTorch: {torch.__version__}")
-print(f"CUDA: {torch.version.cuda}")
-```
-**Step 2: choose the matching xformers version**
-See: https://github.com/facebookresearch/xformers#installing-xformers
-| PyTorch | CUDA | xformers |
-|---------|------|----------|
-| 2.1.x | 11.8 | 0.0.23 |
-| 2.0.x | 11.8 | 0.0.22 |
-| 2.0.x | 11.7 | 0.0.20 |
-**Step 3: edit requirements.txt**
-```txt
-# Add xformers after torch and torchvision
-torch==2.1.0
-torchvision==0.16.0
-xformers==0.0.23  # matches PyTorch 2.1 + CUDA 11.8
-```
-**Or use the official index:**
-```txt
-torch==2.1.0
-torchvision==0.16.0
---extra-index-url https://download.pytorch.org/whl/cu118
-xformers==0.0.23
-```
----
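-### Option 3: build from source (not recommended)
-**Consider this only if:**
-- You need the latest xformers features
-- You have special CUDA version requirements
-- You are willing to spend 15-30 minutes on the build
-**requirements.txt:**
-```txt
-# Requires a CUDA environment and git submodules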
-xformers @ git+https://github.com/facebookresearch/xformers.git
-```
-**Additional requirements:**
-**packages.txt:**
-```txt
-build-essential
-git
-ninja-build
-```
-**Notes:**
-- ⚠️ The build may fail
-- ⚠️ The build takes a long time
-- ⚠️ Requires a GPU environment
----
-## 🔧 Practical configuration examples
-### Example 1: HF Spaces (recommended) ✅
-**requirements.txt:**
-```txt
-torch>=2.0.0
-torchvision
-gradio>=5.0.0
-spaces
-# xformers not included - uses the PyTorch fallback
-```
-**Result:**
-- Build time: 5-10 minutes
-- Success rate: 100%
-- Performance: good
-### Example 2: with prebuilt xformers
-**requirements.txt:**
-```txt
-torch==2.1.0
-torchvision==0.16.0
-xformers==0.0.23
-gradio>=5.0.0
-spaces
-```
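-**Result:**
-- Build time: 8-12 minutes
-- Success rate: 95% (depends on version matching)
-- Performance: best
-### Example 3: local development (most flexible)
-```bash
-# Install the base dependencies first
-pip install -r requirements.txt
-# Optional: install xformers (if needed)
-pip install xformers==0.0.23
-# Or let pip choose the version
-pip install xformers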
-```
----
-## 🐛 FAQ
-### Q1: How do I know whether xformers is being used?
-**Check in code:**
-```python
-from depth_anything_3.model.dinov2.layers.swiglu_ffn import XFORMERS_AVAILABLE
-print(f"xformers available: {XFORMERS_AVAILABLE}")
-```
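If importing the project module is inconvenient, a rough check that does not import anything is also possible with the standard library. This is a sketch with a hypothetical helper name, `has_module`. It only reports whether the package can be found, not whether it imports cleanly against the installed PyTorch and CUDA versions.

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(name) is not None


print(f"xformers installed: {has_module('xformers')}")
print(f"json installed: {has_module('json')}")
```

The `XFORMERS_AVAILABLE` flag remains the more reliable signal because it reflects an actual import attempt.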
-**Or check the logs:**
-```python
-import logging
-logging.basicConfig(level=logging.INFO)
-# If xformers is unavailable there is no error; the fallback is used silently
-```
-### Q2: What if the xformers version does not match?
-**Error message:**
-```
-RuntimeError: xformers is not compatible with this PyTorch version
-```
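-**Fixes:**
-1. Remove xformers (use the fallback)
-2. Or match the PyTorch and xformers versions (see the table above)
-### Q3: Is the performance difference large?
-**Benchmark (for reference):**
-- Single-image inference: almost no difference (< 5%)
-- Batch inference: 5-10% difference
-- Memory usage: similar
-**Conclusion:** For most users the difference is negligible.
-### Q4: Why not include xformers directly?
-**Reasons:**
-1. **Complex compatibility** - requires exactly matching PyTorch, CUDA, and Python versions
-2. **Unstable builds** - building from source often fails
-3. **Not required** - the code has a fallback
-4. **Longer build time** - may add 5-15 minutes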
----
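-## 📊 Performance comparison
-### Inference speed (single image, T4 GPU)
-| Configuration | Time | Relative speed |
-|------|------|---------|
-| PyTorch (without xformers) | 1.00s | 100% |
-| xformers 0.0.23 | 0.95s | 105% ⚡ |
-**Conclusion:** The performance gain is small and not worth the added deployment complexity.
-### Build time
-| Configuration | First build | Success rate |
-|------|---------|--------|
-| Without xformers | 5-10 minutes | ✅ 100% |
-| Prebuilt xformers | 8-12 minutes | ✅ 95% |
-| xformers built from source | 20-40 minutes | ⚠️ 60% |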
----
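-## 🎯 Final recommendations
-### For HF Spaces deployment: ⭐
-**Recommended: do not use xformers**
-Reasons:
-1. The build is stable and reliable
-2. The performance difference is negligible
-3. Better user experience (the app never becomes unavailable because of a failed build)
-### For local development:
-**Optional: install prebuilt xformers**
-```bash
-pip install -r requirements.txt
-pip install xformers  # optional
-```
-### For production:
-**For best performance, use prebuilt xformers**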
-```txt
-torch==2.1.0
-xformers==0.0.23
-```
----
-## 🔗 Related resources
-- [xformers GitHub](https://github.com/facebookresearch/xformers)
-- [xformers installation guide](https://github.com/facebookresearch/xformers#installing-xformers)
-- [PyTorch version compatibility](https://pytorch.org/get-started/previous-versions/)
----
-## ✅ Current status
-Your configuration:
-- ✅ **requirements.txt** - xformers commented out (uses the fallback)
-- ✅ **Code support** - automatic fallback to the PyTorch implementation
-- ✅ **Full functionality** - all features work
-- ✅ **Stable build** - 100% success rate
-**No further action is needed; you can deploy directly!** 🚀

depth_anything_3/app/css_and_html.py CHANGED

@@ -390,7 +390,7 @@ def get_header_html(logo_base64=None):
                 <a href="https://depth-anything-3.github.io" target="_blank" class="link-btn">
                     <i class="fas fa-globe" style="margin-right: 8px;"></i> Project Page
                 </a>
-                <a href="https://arxiv.org/abs/2406.09414" target="_blank" class="link-btn">
                     <i class="fas fa-file-pdf" style="margin-right: 8px;"></i> Paper
                 </a>
                 <a href="https://github.com/ByteDance-Seed/Depth-Anything-3" target="_blank" class="link-btn">

                 <a href="https://depth-anything-3.github.io" target="_blank" class="link-btn">
                     <i class="fas fa-globe" style="margin-right: 8px;"></i> Project Page
                 </a>
+                <a href="https://arxiv.org/abs/2511.10647" target="_blank" class="link-btn">
                     <i class="fas fa-file-pdf" style="margin-right: 8px;"></i> Paper
                 </a>
                 <a href="https://github.com/ByteDance-Seed/Depth-Anything-3" target="_blank" class="link-btn">

fix_spaces_gpu.patch DELETED

@@ -1,142 +0,0 @@
---- a/depth_anything_3/app/modules/model_inference.py
-+++ b/depth_anything_3/app/modules/model_inference.py
-@@ -31,47 +31,67 @@ from depth_anything_3.utils.export.glb import export_to_glb
- from depth_anything_3.utils.export.gs import export_to_gs_video
-+# Global cache for model (used in GPU subprocess)
-+# This is safe because @spaces.GPU runs in isolated subprocess
-+_MODEL_CACHE = None
-+
-+
- class ModelInference:
-     """
-     Handles model inference and data processing for Depth Anything 3.
-     """
-     def __init__(self):
--        """Initialize the model inference handler."""
--        self.model = None
--
--    def initialize_model(self, device: str = "cuda") -> None:
-+        """Initialize the model inference handler.
-+
-+        Note: Do NOT store model in instance variable to avoid
-+        state sharing issues with @spaces.GPU decorator.
-+        """
-+        pass  # No instance variables
-+
-+    def initialize_model(self, device: str = "cuda"):
-         """
-         Initialize the DepthAnything3 model.
-+
-+        Uses global cache to store model safely in GPU subprocess.
-+        This avoids CUDA initialization in main process.
-         Args:
-             device: Device to load the model on
-+
-+        Returns:
-+            Model instance
-         """
--        if self.model is None:
-+        global _MODEL_CACHE
-+
-+        if _MODEL_CACHE is None:
-             # Get model directory from environment variable or use default
-             model_dir = os.environ.get(
-                 "DA3_MODEL_DIR", "/dev/shm/da3_models/DA3HF-VITG-METRIC_VITL"
-             )
--            self.model = DepthAnything3.from_pretrained(model_dir)
--            self.model = self.model.to(device)
-+            print(f"Loading model from {model_dir}...")
-+            _MODEL_CACHE = DepthAnything3.from_pretrained(model_dir)
-+            _MODEL_CACHE = _MODEL_CACHE.to(device)
-+            _MODEL_CACHE.eval()
-+            print("Model loaded and moved to GPU")
-         else:
--            self.model = self.model.to(device)
--
--        self.model.eval()
-+            print("Using cached model")
-+            # Ensure model is on correct device
-+            _MODEL_CACHE = _MODEL_CACHE.to(device)
-+
-+        return _MODEL_CACHE
-     def run_inference(
-         self,
-         ...
-         # Initialize model if needed
--        self.initialize_model(device)
-+        model = self.initialize_model(device)
-         ...
-         # Run model inference
-         print(f"Running inference with method: {actual_method}")
-         with torch.no_grad():
--            prediction = self.model.inference(
-+            prediction = model.inference(
-                 image_paths, export_dir=None, process_res_method=actual_method, infer_gs=infer_gs
-             )
-@@ -192,6 +212,10 @@ class ModelInference:
-         # Process results
-         processed_data = self._process_results(target_dir, prediction, image_paths)
-+        # CRITICAL: Move all CUDA tensors to CPU before returning
-+        # This prevents CUDA initialization in main process during unpickling
-+        prediction = self._move_prediction_to_cpu(prediction)
-+
-         # Clean up
-         torch.cuda.empty_cache()
-@@ -282,6 +306,45 @@ class ModelInference:
-         return processed_data
-+    def _move_prediction_to_cpu(self, prediction: Any) -> Any:
-+        """
-+        Move all CUDA tensors in prediction to CPU for safe pickling.
-+
-+        This is REQUIRED for HF Spaces with @spaces.GPU decorator to avoid
-+        CUDA initialization in the main process during unpickling.
-+
-+        Args:
-+            prediction: Prediction object that may contain CUDA tensors
-+
-+        Returns:
-+            Prediction object with all tensors moved to CPU
-+        """
-+        # Move gaussians tensors to CPU
-+        if hasattr(prediction, 'gaussians') and prediction.gaussians is not None:
-+            gaussians = prediction.gaussians
-+
-+            # Move each tensor attribute to CPU
-+            tensor_attrs = ['means', 'scales', 'rotations', 'harmonics', 'opacities']
-+            for attr in tensor_attrs:
-+                if hasattr(gaussians, attr):
-+                    tensor = getattr(gaussians, attr)
-+                    if isinstance(tensor, torch.Tensor) and tensor.is_cuda:
-+                        setattr(gaussians, attr, tensor.cpu())
-+                        print(f"Moved gaussians.{attr} to CPU")
-+
-+        # Move any tensors in aux dict to CPU
-+        if hasattr(prediction, 'aux') and prediction.aux is not None:
-+            for key, value in list(prediction.aux.items()):
-+                if isinstance(value, torch.Tensor) and value.is_cuda:
-+                    prediction.aux[key] = value.cpu()
-+                    print(f"Moved aux['{key}'] to CPU")
-+                elif isinstance(value, dict):
-+                    # Recursively handle nested dicts
-+                    for k, v in list(value.items()):
-+                        if isinstance(v, torch.Tensor) and v.is_cuda:
-+                            value[k] = v.cpu()
-+                            print(f"Moved aux['{key}']['{k}'] to CPU")
-+
-+        return prediction
-+
-     def cleanup(self) -> None:
-         """Clean up GPU memory."""