0xLDF committed on
Commit 3026adb · verified · 1 Parent(s): 36c2bb9

Update README.md

  1. README.md +42 -3
README.md CHANGED
@@ -1,3 +1,42 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ ---
+
+ <h1 align="center">Seg2Any: Open-set Segmentation-Mask-to-Image Generation with Precise Shape and Semantic Control</h1>
+
+ <div align='center'>
+ <a href="https://github.com/0xLDF" target="_blank">Danfeng Li</a><sup>1*</sup>,
+ <a href="https://huizhang0812.github.io/" target="_blank">Hui Zhang</a><sup>1*</sup>,
+ <a href="https://www.linkedin.com/in/sheng-wang-4620863a/" target="_blank">Sheng Wang</a><sup>2</sup>,
+ <a href="https://scholar.google.com/citations?user=qkaJhBMAAAAJ&hl=zh-CN" target="_blank">Jiacheng Li</a><sup>2</sup>,
+ <a href="https://zxwu.azurewebsites.net/" target="_blank">Zuxuan Wu</a><sup>1†</sup>
+ </div>
+
+ <div align='center'>
+ <br><sup>1</sup>Fudan University <sup>2</sup>HiThink Research
+ <br><small><sup>*</sup>Equal Contribution. <sup>†</sup>Corresponding author. </small>
+ </div>
+ <br>
+
+ <div align="center">
+ <!-- <a href='LICENSE'><img src='https://img.shields.io/badge/license-MIT-yellow'></a> -->
+ <a href='https://seg2any.github.io'><img src='https://img.shields.io/badge/Project-Page-Green'></a>
+ <a href='https://arxiv.org/abs/2506.00596'><img src='https://img.shields.io/badge/Paper-Arxiv-red'></a>
+ <a href='https://github.com/0xLDF/Seg2Any'><img src='https://img.shields.io/badge/⭐_GitHub-Code-blue' alt='GitHub'></a>
+ <a href="https://huggingface.co/datasets/0xLDF/SACap-1M"><img src="https://img.shields.io/badge/🤗_HuggingFace-Dataset-ffbd45.svg" alt="HuggingFace"></a>
+ <a href="https://huggingface.co/datasets/0xLDF/SACap-eval"><img src="https://img.shields.io/badge/🤗_HuggingFace-Benchmark-ffbd45.svg" alt="HuggingFace"></a>
+
+ </div>
+ <br>
+
+ <p align="center">
+ <img src="assets/demo.png" width="90%" height="90%">
+ </p>
+
+ ## Overview
+
+ <p align="center">
+ <img src="assets/framework_seg2any.png" width="90%" height="90%">
+ </p>
+
+ (a) An overview of the Seg2Any framework. Built on the **FLUX.1-dev** foundation model, Seg2Any first converts the segmentation masks into an Entity Contour Map, then encodes that map into condition tokens with the frozen VAE. Negligible condition tokens are filtered out for efficiency. The resulting text, image, and condition tokens are concatenated into a single sequence for MM-Attention. LoRA is applied to all branches, so S2I (segmentation-mask-to-image) generation is achieved with minimal extra parameters. (b) The attention masks used in MM-Attention: the Semantic Alignment Attention Mask and the Attribute Isolation Attention Mask.
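
The first two steps in (a), building a contour map from the masks and dropping negligible tokens, can be illustrated with a small NumPy sketch. This is not the Seg2Any implementation: the function names `entity_contour_map` and `nonempty_patches`, the 4-neighbour boundary rule, and the idea that a "negligible" token is a patch containing no contour pixel are all assumptions made for illustration only.

```python
import numpy as np


def entity_contour_map(labels: np.ndarray) -> np.ndarray:
    """Mark pixels that sit on a boundary between two entity ids.

    Assumption (not from the paper): a pixel is on a contour when its
    label differs from a horizontal or vertical neighbour.
    `labels` is a 2-D integer array with one id per entity.
    """
    if labels.ndim != 2:
        raise ValueError("labels must be a 2-D array")
    contour = np.zeros(labels.shape, dtype=bool)
    diff_h = labels[:, 1:] != labels[:, :-1]  # left/right neighbours differ
    diff_v = labels[1:, :] != labels[:-1, :]  # up/down neighbours differ
    # Mark both pixels on either side of each label change.
    contour[:, 1:] |= diff_h
    contour[:, :-1] |= diff_h
    contour[1:, :] |= diff_v
    contour[:-1, :] |= diff_v
    return contour.astype(np.uint8)


def nonempty_patches(contour: np.ndarray, patch: int) -> np.ndarray:
    """Return a boolean grid that is True for patches holding any contour pixel.

    Hypothetical stand-in for token filtering: patches that are False
    here would be the ones treated as negligible and dropped.
    """
    h, w = contour.shape
    if h % patch or w % patch:
        raise ValueError("image size must be divisible by the patch size")
    blocks = contour.reshape(h // patch, patch, w // patch, patch)
    return blocks.any(axis=(1, 3))


# Two entities side by side: id 0 on the left, id 1 in the last two columns.
labels = np.zeros((4, 8), dtype=np.int64)
labels[:, 6:] = 1
contour = entity_contour_map(labels)
keep = nonempty_patches(contour, patch=2)
```

Here `keep` flags only the patch columns that touch the boundary between the two entities, so a downstream encoder would need to process just those positions under the stated assumptions.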