0xLDF/Seg2Any

0xLDF committed on Aug 15, 2025

Commit 3026adb (verified) · 1 Parent(s): 36c2bb9

Update README.md

Files changed (1)

README.md +42 -3

@@ -1,3 +1,42 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+---
+<h1 align="center">Seg2Any: Open-set Segmentation-Mask-to-Image Generation with Precise Shape and Semantic Control</h1>
+<div align='center'>
+    <a href="https://github.com/0xLDF" target="_blank">Danfeng Li</a><sup>1*</sup>,
+    <a href="https://huizhang0812.github.io/" target="_blank">Hui Zhang</a><sup>1*</sup>,
+    <a href="https://www.linkedin.com/in/sheng-wang-4620863a/" target="_blank">Sheng Wang</a><sup>2</sup>,
+    <a href="https://scholar.google.com/citations?user=qkaJhBMAAAAJ&hl=zh-CN" target="_blank">Jiacheng Li</a><sup>2</sup>,
+    <a href="https://zxwu.azurewebsites.net/" target="_blank">Zuxuan Wu</a><sup>1†</sup>
+</div>
+<div align='center'>
+    <br><sup>1</sup>Fudan University <sup>2</sup>HiThink Research
+    <br><small><sup>*</sup>Equal Contribution. <sup>†</sup>Corresponding author. </small>
+</div>
+<br>
+<div align="center">
+  <!-- <a href='LICENSE'><img src='https://img.shields.io/badge/license-MIT-yellow'></a> -->
+  <a href='https://seg2any.github.io'><img src='https://img.shields.io/badge/Project-Page-Green'></a>
+  <a href='https://arxiv.org/abs/2506.00596'><img src='https://img.shields.io/badge/Paper-Arxiv-red'></a>
+<a href='https://github.com/0xLDF/Seg2Any'><img src='https://img.shields.io/badge/⭐_GitHub-Code-blue' alt='GitHub'></a>
+  <a href="https://huggingface.co/datasets/0xLDF/SACap-1M"><img src="https://img.shields.io/badge/🤗_HuggingFace-Dataset-ffbd45.svg" alt="HuggingFace"></a>
+  <a href="https://huggingface.co/datasets/0xLDF/SACap-eval"><img src="https://img.shields.io/badge/🤗_HuggingFace-Benchmark-ffbd45.svg" alt="HuggingFace"></a>
+</div>
+<br>
+<p align="center">
+  <img src="assets/demo.png" width="90%" height="90%">
+</p>
+## Overview
+<p align="center">
+  <img src="assets/framework_seg2any.png" width="90%" height="90%">
+</p>
+(a) An overview of the Seg2Any framework. Built on the **FLUX.1-dev** foundation model, Seg2Any first converts the segmentation masks into an Entity Contour Map, then encodes that map into condition tokens with the frozen VAE. Negligible condition tokens are filtered out for efficiency. The resulting text, image, and condition tokens are concatenated into a single sequence for MM-Attention. LoRA is applied to all branches, so segmentation-mask-to-image (S2I) generation is achieved with minimal extra parameters. (b) The attention masks used in MM-Attention: the Semantic Alignment Attention Mask and the Attribute Isolation Attention Mask.
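
The first step described in (a), turning a segmentation mask into a contour map, can be illustrated with a minimal sketch. This is not the Seg2Any implementation: the README does not say how the Entity Contour Map is computed, so the function name `entity_contour_map`, the 4-neighbour rule, and the toy label grid below are all assumptions made for illustration.

```python
from typing import List


def entity_contour_map(label_map: List[List[int]]) -> List[List[int]]:
    """Mark pixels that touch a different entity label (hypothetical helper).

    Assumption: a pixel belongs to a contour if its right or lower
    neighbour carries a different label; both pixels are then marked.
    """
    height, width = len(label_map), len(label_map[0])
    contour = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Checking only right and down visits every adjacent pair once.
            for dy, dx in ((0, 1), (1, 0)):
                ny, nx = y + dy, x + dx
                if ny < height and nx < width and label_map[y][x] != label_map[ny][nx]:
                    contour[y][x] = 1
                    contour[ny][nx] = 1
    return contour


# Toy segmentation mask with three entities (labels 0, 1 and 2).
labels = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 2, 2],
]
contours = entity_contour_map(labels)
for row in contours:
    print(row)
```

In this toy grid only the two outer pixels of the top row stay unmarked, since every other pixel borders a different entity. A real mask would be far larger and mostly empty after this step, which is consistent with the README's remark that negligible tokens can be filtered out.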