---
license: apache-2.0
datasets:
- WeiChow/CrispEdit-2M
language:
- en
pipeline_tag: image-to-image
tags:
- image-edit
base_model:
- google/gemma-2-2b-it
- MeissonFlow/Meissonic
---
# EditMGT

<div align="center">
  
[![arXiv](https://img.shields.io/badge/arXiv-2512.11715-b31b1b.svg)](https://arxiv.org/abs/2512.11715)
[![Dataset](https://img.shields.io/badge/🤗%20CrispEdit2M-Dataset-yellow)](https://huggingface.co/datasets/WeiChow/CrispEdit-2M)
[![Checkpoint](https://img.shields.io/badge/🧨%20EditMGT-CKPT-blue)](https://huggingface.co/WeiChow/EditMGT)
[![GitHub](https://img.shields.io/badge/GitHub-Repo-181717?logo=github)](https://github.com/weichow23/EditMGT/tree/main)
[![Page](https://img.shields.io/badge/🏠%20Home-Page-b3.svg)](https://weichow23.github.io/EditMGT/)
[![Python 3.9](https://img.shields.io/badge/Python-3.9-blue.svg?logo=python)](https://www.python.org/downloads/release/python-392/)

</div>

## 🌟 Overview

This is the official repository for **EditMGT: Unleashing the Potential of Masked Generative Transformer in Image Editing** ✨.

EditMGT is a novel framework that leverages Masked Generative Transformers for advanced image editing tasks. Our approach enables precise and controllable image modifications while preserving original content integrity.

<p align="center">
  <img src="asset/editmgt.png" alt="EditMGT Architecture" width="800px">
</p>

## ✨ Features

- 🎨 Strong style transfer capabilities
- 🔍 Attention-based control over editing regions
- ⚡ Fast inference: the model backbone has only 960M parameters
- 📊 Trained on the [CrispEdit-2M](https://huggingface.co/datasets/WeiChow/CrispEdit-2M) dataset

## ⚡ Quick Start

First, clone the repository and navigate to the project root:
```shell
git clone https://github.com/weichow23/editmgt
cd editmgt
```

## 🔧 Environment Setup

```bash
# Create and activate conda environment
conda create --name editmgt python=3.9.2
conda activate editmgt

# Optional: Install system dependencies
sudo apt-get install libgl1-mesa-glx libglib2.0-0 -y

# Install Python dependencies
pip3 install git+https://github.com/openai/CLIP
pip3 install -r requirements.txt
```

⚠️ **Note**: If you encounter any strange environment library errors, please refer to [Issues](https://github.com/viiika/Meissonic/issues/14) to find the correct version that might fix the error.

## 🚀 Inference

Run the following script in the `editmgt` directory:

```python
import os
import sys
sys.path.append("./")
from PIL import Image
from src.editmgt import init_edit_mgt
from src.v2_model import negative_prompt

if __name__ == "__main__":
    pipe = init_edit_mgt(device='cuda:0')
    # Forcing bf16 can improve speed, but it costs some quality:
    # we observed a drop of about 0.8 on GEditBench.
    # pipe = init_edit_mgt(device='cuda:0', enable_bf16=False)

    # pipe.local_guidance = 0.01  # Once set, the local GS auxiliary mode is used.
    # pipe.local_query_text = 'owl'  # Use specific words as attention queries
    # pipe.attention_enable_blocks = list(range(28, 37))  # Attention blocks to use
    input_image = Image.open('assets/case_5.jpg')
    result = pipe(
        prompt=['Make it into Ghibli style'],
        height=1024,
        width=1024,
        num_inference_steps=36, # For some simple tasks, 16 steps are enough!
        guidance_scale=6,
        reference_strength=1.1,
        reference_image=[input_image.resize((1024, 1024))],
        negative_prompt=negative_prompt or None,
    )
    output_dir = "./output"
    os.makedirs(output_dir, exist_ok=True)

    file_path = os.path.join(output_dir, "edited_case_5.png")
    w, h = input_image.size
    result.images[0].resize((w, h)).save(file_path)
```
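
To apply one prompt to several images, the call above can be wrapped in a few helpers. This is a sketch under assumptions, not the repository's API: `edit_kwargs`, `output_path`, and `edit_folder` are hypothetical names, and the `*.jpg` glob is an illustrative choice. The argument values are copied from the script above, and `pipe` is expected to be the object returned by `init_edit_mgt`.

```python
from pathlib import Path


def edit_kwargs(prompt, size=1024, simple_task=False):
    """Collect the call arguments from the script above into one dict."""
    return {
        "prompt": [prompt],
        "height": size,
        "width": size,
        # The script notes that 16 steps are enough for some simple tasks.
        "num_inference_steps": 16 if simple_task else 36,
        "guidance_scale": 6,
        "reference_strength": 1.1,
    }


def output_path(input_path, output_dir="./output"):
    """Mirror the script's naming: photo.jpg -> <output_dir>/edited_photo.png."""
    return Path(output_dir) / f"edited_{Path(input_path).stem}.png"


def edit_folder(pipe, input_dir, prompt, output_dir="./output", **overrides):
    """Edit every .jpg in input_dir and save results at the original size."""
    from PIL import Image  # imported lazily so the helpers above need only the stdlib

    Path(output_dir).mkdir(parents=True, exist_ok=True)
    for path in sorted(Path(input_dir).glob("*.jpg")):
        image = Image.open(path).convert("RGB")
        kwargs = {**edit_kwargs(prompt), **overrides}
        resized = image.resize((kwargs["width"], kwargs["height"]))
        result = pipe(reference_image=[resized], **kwargs)
        # Restore the input resolution, as the single-image script does.
        result.images[0].resize(image.size).save(output_path(path, output_dir))
```

For example, `edit_folder(pipe, "assets", "Make it into Ghibli style", negative_prompt=negative_prompt)` would forward the negative prompt through `overrides`, in the same way the single-image script passes it directly.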

## 📑 Citation

```bibtex
@article{chow2025editmgt,
  title={EditMGT: Unleashing Potentials of Masked Generative Transformers in Image Editing},
  author={Chow, Wei and Li, Linfeng and Kong, Lingdong and Li, Zefeng and Xu, Qi and Song, Hang and Ye, Tian and Wang, Xian and Bai, Jinbin and Xu, Shilin and others},
  journal={arXiv preprint arXiv:2512.11715},
  year={2025}
}
```

## 🙏 Acknowledgements

We extend our sincere gratitude to all contributors and the research community for their valuable feedback and support in the development of this project.