Add SDMatte model files and scripts

- README.md +65 -0
- SDMatte_plus_bf16_inference.pth +3 -0
- SDMatte_plus_fp16_inference.pth +3 -0
- convert_precision.py +56 -0
README.md ADDED
# SDMatte_plus-fp16-and-bf16

This repository provides optimized, inference-only versions of the original [SDMatte model by LongfeiHuang](https://huggingface.co/LongfeiHuang/SDMatte).

The models here have been specifically processed to be lightweight and efficient for deployment in applications such as ComfyUI, without compromising the quality of the matting results.

### What is this?

This repository contains inference-only weights for the **SDMatte** model. The original checkpoint file (`.pth`) was a full training checkpoint, which included not only the model weights but also ~6.5 GB of trainer state (such as optimizer state). That state is essential for resuming training but unnecessary for inference (i.e., actually using the model for matting).
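You can verify this structure yourself. A minimal sketch (it assumes, as the conversion script below does, that the weights sit under a top-level `'model'` key):

```python
import torch

# Peek at the top-level keys of the full training checkpoint.
# weights_only=False is needed because the checkpoint stores more
# than bare tensors (trainer/optimizer state).
ckpt = torch.load("SDMatte_plus.pth", map_location="cpu", weights_only=False)
print(list(ckpt.keys()))  # expect something like ['model', 'trainer']
```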
### Optimizations Performed

1. **Removal of trainer state**: The largest optimization was stripping the `trainer` key from the original checkpoint. This removes all data related to the training process, significantly reducing the file size without affecting the model's output.

2. **Conversion to 16-bit precision**: The model weights have been converted from their original 32-bit floating-point precision (FP32) to 16-bit precision, in two popular formats (compared in the sketch after this list):
   * **FP16 (half precision)**: Offers a great balance of speed, reduced memory usage, and high quality. It is supported by most modern NVIDIA GPUs (10-series and newer).
   * **BF16 (bfloat16)**: Offers a dynamic range identical to FP32, making it more resilient to overflow/underflow issues. It performs best on newer NVIDIA GPUs (RTX 30-series and newer).
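The practical difference between the two formats comes down to their numeric limits, which you can inspect directly with PyTorch's `torch.finfo`:

```python
import torch

# BF16 keeps FP32's exponent range (max ~3.4e38) at the cost of
# precision (larger eps); FP16 is more precise but overflows past ~65504.
for dtype in (torch.float32, torch.float16, torch.bfloat16):
    info = torch.finfo(dtype)
    print(f"{str(dtype):>15}  max={info.max:.3e}  eps={info.eps:.3e}")
```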
### DIY Quantization with `convert_precision.py`

This repository also includes the Python script `convert_precision.py`, which was used to create these fp16 and bf16 models. You can use it to convert the original FP32 checkpoint yourself:

1. Place the original FP32 `SDMatte_plus.pth` file in the same folder as the script.
2. Open the `convert_precision.py` file with a text editor.
3. Set the `TARGET_PRECISION` variable at the top to either `'fp16'` or `'bf16'`, and point the `fp32_checkpoint_path` variable at your checkpoint.
4. Run the script from your terminal: `python convert_precision.py`. A quick way to sanity-check the output is shown below.
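To confirm the conversion worked, a minimal sketch that loads the generated file and checks the tensor dtypes (assuming the bf16 output name produced by the script's naming scheme):

```python
import torch

# The converted file is a bare state_dict, so it loads as a plain dict of tensors.
state_dict = torch.load("SDMatte_plus_bf16_inference.pth", map_location="cpu")

dtypes = {t.dtype for t in state_dict.values() if isinstance(t, torch.Tensor)}
n_params = sum(t.numel() for t in state_dict.values() if isinstance(t, torch.Tensor))
print("tensor dtypes:", dtypes)  # floating tensors should be torch.bfloat16
print(f"parameter count: {n_params / 1e9:.2f}B")
```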
### Acknowledgements

Huge thanks to **LongfeiHuang** for creating and open-sourcing the original SDMatte model. This repository is merely an optimized repackaging of their incredible work. Please visit the original repository for more details on the model's architecture and training.

* **Original Model**: [https://huggingface.co/LongfeiHuang/SDMatte](https://huggingface.co/LongfeiHuang/SDMatte)
SDMatte_plus_bf16_inference.pth ADDED

version https://git-lfs.github.com/spec/v1
oid sha256:9f04e5cb6a50af6516cea8908ed2de77d5ae2753b33b6e4ea0cbb61146a27f55
size 2594696326
SDMatte_plus_fp16_inference.pth ADDED

version https://git-lfs.github.com/spec/v1
oid sha256:0f407176bc081bd58636c472fa150891559d9bb24ae00399e9d86254c4624572
size 2594696326
convert_precision.py ADDED

import torch
import os

# --- Configuration ---
# 1. Target precision: 'fp16' or 'bf16'
TARGET_PRECISION = 'bf16'

# 2. Path to the original 32-bit training checkpoint
fp32_checkpoint_path = r"E:\comfyui\ComfyUI-aki-v1.3\models\SDMatte\SDMatte_plus.pth"

# --------------------------------------------------------------------

# Derive the output filename automatically
output_filename = fp32_checkpoint_path.replace('.pth', f'_{TARGET_PRECISION}_inference.pth')

if not os.path.exists(fp32_checkpoint_path):
    print(f"[Error] File not found: {fp32_checkpoint_path}")
else:
    try:
        print(f"--- Processing training checkpoint: {fp32_checkpoint_path} ---")
        full_checkpoint = torch.load(fp32_checkpoint_path, map_location="cpu", weights_only=False)

        # Check for the 'model' key, which holds the weights we need
        if 'model' in full_checkpoint:
            # Explicitly extract the model's state_dict
            state_dict = full_checkpoint['model']
            print("Extracted the weight dictionary from the 'model' key.")
        else:
            # No 'model' key: assume the whole file is already a state_dict
            print("[Warning] No top-level 'model' key found; converting the entire file.")
            state_dict = full_checkpoint

        print(f"Converting weights to {TARGET_PRECISION} ...")

        target_dtype = torch.float16 if TARGET_PRECISION == 'fp16' else torch.bfloat16

        # Cast every floating-point tensor in the state_dict
        for key in state_dict:
            if isinstance(state_dict[key], torch.Tensor) and state_dict[key].is_floating_point():
                state_dict[key] = state_dict[key].to(target_dtype)

        print(f"Saving inference-only model to: {output_filename} ...")
        # Save only the processed state_dict, with no training-related extras
        torch.save(state_dict, output_filename)

        # Report the size reduction
        original_size_gb = os.path.getsize(fp32_checkpoint_path) / (1024**3)
        final_size_gb = os.path.getsize(output_filename) / (1024**3)

        print("\n--- Conversion successful ---")
        print(f"Original training checkpoint size: {original_size_gb:.2f} GB")
        print(f"Inference-only model size ({TARGET_PRECISION.upper()}): {final_size_gb:.2f} GB")
        print("Note: the new file contains only the core model weights needed for inference; training-related optimizer states have been removed.")

    except Exception as e:
        print(f"\n[Error] An error occurred during processing: {e}")
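Loading the converted weights afterwards differs slightly from loading the original checkpoint, since the file is now a bare `state_dict` with no `'model'` wrapper. A minimal sketch, where `build_sdmatte()` stands in for however you actually instantiate the SDMatte network:

```python
import torch

model = build_sdmatte()  # hypothetical constructor; use your actual SDMatte setup

# The converted file is a bare state_dict (no 'model' wrapper, no trainer state).
state_dict = torch.load("SDMatte_plus_bf16_inference.pth", map_location="cpu")
model.load_state_dict(state_dict)

# Match the module dtype to the weights and switch to inference mode.
model.to(dtype=torch.bfloat16).eval()
```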