LucasFang committed · verified
Commit 1168402 · 1 parent: 73ad092

Update README.md

Files changed (1): README.md (+2 −19)
README.md CHANGED
@@ -1,14 +1,10 @@
 # GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing
 <div align="center">
-<a href="https://rongyaofang.github.io/"><img src="https://img.shields.io/badge/Project-Homepage-green" alt="Home"></a>
-<a href="https://arxiv.org/abs/xxxx"><img src="https://img.shields.io/badge/ArXiv-xxxx-red"></a>
-<img src="https://visitor-badge.laobi.icu/badge?page_id=rongyaofang/GoT" alt="visitors">
-
-[Rongyao Fang](https://scholar.google.com/citations?user=FtH3CW4AAAAJ&hl=en)<sup>1\*</sup>, [Chengqi Duan](https://scholar.google.com/citations?user=r9qb4ZwAAAAJ&hl=zh-CN)<sup>2\*</sup>, [Kun Wang]()<sup>3</sup>, [Linjiang Huang](https://leonhlj.github.io/)<sup>6</sup>, [Hao Li](https://scholar.google.com/citations?user=qHqQsY4AAAAJ&hl=zh-CN)<sup>1,4</sup>, [Shilin Yan](https://scholar.google.com/citations?user=2VhjOykAAAAJ&hl=zh-CN), [Hao Tian]()<sup>3</sup>, [Xingyu Zeng]()<sup>3</sup>, [Rui Zhao]()<sup>3</sup>, [Jifeng Dai](https://jifengdai.org/)<sup>4,5</sup>, [Xihui Liu](https://xh-liu.github.io/)<sup>2 :envelope:</sup>, [Hongsheng Li](https://www.ee.cuhk.edu.hk/~hsli/)<sup>1 :envelope:</sup>
+[Rongyao Fang](https://scholar.google.com/citations?user=FtH3CW4AAAAJ&hl=en)<sup>1\*</sup>, [Chengqi Duan](https://scholar.google.com/citations?user=r9qb4ZwAAAAJ&hl=zh-CN)<sup>2\*</sup>, [Kun Wang]()<sup>3</sup>, [Linjiang Huang](https://leonhlj.github.io/)<sup>6</sup>, [Hao Li](https://scholar.google.com/citations?user=qHqQsY4AAAAJ&hl=zh-CN)<sup>1,4</sup>, [Shilin Yan](https://scholar.google.com/citations?user=2VhjOykAAAAJ&hl=zh-CN), [Hao Tian]()<sup>3</sup>, [Xingyu Zeng]()<sup>3</sup>, [Rui Zhao]()<sup>3</sup>, [Jifeng Dai](https://jifengdai.org/)<sup>4,5</sup>, [Xihui Liu](https://xh-liu.github.io/)<sup>2</sup>, [Hongsheng Li](https://www.ee.cuhk.edu.hk/~hsli/)<sup>1</sup>
 
 <sup>1</sup>CUHK MMLab, <sup>2</sup>HKU MMLab, <sup>3</sup>SenseTime, <sup>4</sup>Shanghai AI Laboratory, <sup>5</sup>Tsinghua University, <sup>6</sup>Beihang University
 
-*Equal contribution, :envelope:Corresponding authors
+*Equal contribution
 </div>
 
 <div align="center" style="line-height: 1.2;">
@@ -110,19 +106,6 @@ Our approach also demonstrates superior performance on image editing benchmarks:
 
 </div>
 
-### Interactive Generation
-
-One of the unique capabilities of GoT is interactive generation, allowing users to modify the reasoning chain to customize the generated images:
-
-<div align="center">
-  <img src="figures/interactive.png" width="100%" alt="Interactive Generation" />
-</div>
-
-Users can interact with the reasoning chain to:
-1. Replace objects
-2. Adjust object positions
-3. Modify object attributes
-
 ## Usage
 
 ### Dependencies