---
license: apache-2.0
datasets:
- WeiChow/CrispEdit-2M
language:
- en
pipeline_tag: image-to-image
tags:
- image-edit
base_model:
- google/gemma-2-2b-it
- MeissonFlow/Meissonic
---
# EditMGT
<div align="center">
[Paper](https://arxiv.org/abs/2512.11715) | [Dataset](https://huggingface.co/datasets/WeiChow/CrispEdit-2M) | [Model](https://huggingface.co/WeiChow/EditMGT) | [Code](https://github.com/weichow23/EditMGT/tree/main) | [Project Page](https://weichow23.github.io/EditMGT/) | [Python 3.9.2](https://www.python.org/downloads/release/python-392/)
</div>
## 📖 Overview
This is the official repository for **EditMGT: Unleashing the Potential of Masked Generative Transformer in Image Editing** ✨.
EditMGT is a novel framework that leverages Masked Generative Transformers for advanced image editing. It enables precise, controllable modifications while preserving the integrity of the unedited content.
<p align="center">
<img src="asset/editmgt.png" alt="EditMGT Architecture" width="800px">
</p>
## ✨ Features
- 🎨 Strong style-transfer capabilities
- 🔍 Attention-based control over editing regions
- ⚡ A compact 960M backbone, enabling fast inference
- 📚 Trained on the [CrispEdit-2M](https://huggingface.co/datasets/WeiChow/CrispEdit-2M) dataset
## ⚡ Quick Start
First, clone the repository and navigate to the project root:
```shell
git clone https://github.com/weichow23/editmgt
cd editmgt
```
## 🔧 Environment Setup
```bash
# Create and activate conda environment
conda create --name editmgt python=3.9.2
conda activate editmgt
# Optional: Install system dependencies
sudo apt-get install libgl1-mesa-glx libglib2.0-0 -y
# Install Python dependencies
pip3 install git+https://github.com/openai/CLIP
pip3 install -r requirements.txt
```
⚠️ **Note**: If you run into unexpected dependency errors, see [Issues](https://github.com/viiika/Meissonic/issues/14) for known-good package versions that may fix them.
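Before moving on to inference, you can sanity-check that the key dependencies resolved. This is an optional helper of our own, not part of the repository; the package names are assumptions based on the install steps above:

```python
import importlib.util

def installed(pkg: str) -> bool:
    """Return True if `pkg` is importable in the current environment."""
    return importlib.util.find_spec(pkg) is not None

# Package names are assumptions based on the install steps above.
for pkg in ("PIL", "clip", "torch"):
    print(f"{pkg}: {'ok' if installed(pkg) else 'MISSING'}")
```

Anything reported as `MISSING` usually means the corresponding `pip install` step failed silently.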
## 🚀 Inference
Run the following script in the `editmgt` directory:
```python
import os
import sys
sys.path.append("./")
from PIL import Image
from src.editmgt import init_edit_mgt
from src.v2_model import negative_prompt
if __name__ == "__main__":
    pipe = init_edit_mgt(device='cuda:0')
    # bf16 improves speed but incurs a quality penalty:
    # we observed a drop of about 0.8 on GEditBench.
    # Disable it with:
    # pipe = init_edit_mgt(device='cuda:0', enable_bf16=False)

    # Optional local attention guidance:
    # pipe.local_guidance = 0.01                          # enable the local GS auxiliary mode
    # pipe.local_query_text = 'owl'                       # use specific words as attention queries
    # pipe.attention_enable_blocks = list(range(28, 37))  # attention layers to use

    input_image = Image.open('assets/case_5.jpg')
    result = pipe(
        prompt=['Make it into Ghibli style'],
        height=1024,
        width=1024,
        num_inference_steps=36,  # for some simple tasks, 16 steps are enough
        guidance_scale=6,
        reference_strength=1.1,
        reference_image=[input_image.resize((1024, 1024))],
        negative_prompt=negative_prompt or None,
    )

    output_dir = "./output"
    os.makedirs(output_dir, exist_ok=True)
    file_path = os.path.join(output_dir, "edited_case_5.png")

    # Resize the edited image back to the input resolution before saving.
    w, h = input_image.size
    result.images[0].resize((w, h)).save(file_path)
```
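The script above stretches non-square inputs with `resize((1024, 1024))`. If you would rather keep the aspect ratio, a small letterbox helper can pad the image to a square instead. This is our own sketch, not part of the EditMGT repository:

```python
from PIL import Image

def letterbox_to_square(img: Image.Image, size: int = 1024,
                        fill=(255, 255, 255)) -> Image.Image:
    """Scale the longer side to `size` and pad the rest, preserving aspect ratio."""
    w, h = img.size
    scale = size / max(w, h)
    resized = img.resize((max(1, round(w * scale)), max(1, round(h * scale))))
    canvas = Image.new("RGB", (size, size), fill)
    # Center the resized image on the square canvas.
    canvas.paste(resized, ((size - resized.width) // 2,
                           (size - resized.height) // 2))
    return canvas
```

If you use this, remember to crop the padding back out (rather than plain `resize`) when restoring the output to the input resolution.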
## 📚 Citation
```bibtex
@article{chow2025editmgt,
title={EditMGT: Unleashing Potentials of Masked Generative Transformers in Image Editing},
author={Chow, Wei and Li, Linfeng and Kong, Lingdong and Li, Zefeng and Xu, Qi and Song, Hang and Ye, Tian and Wang, Xian and Bai, Jinbin and Xu, Shilin and others},
journal={arXiv preprint arXiv:2512.11715},
year={2025}
}
```
## 🙏 Acknowledgements
We extend our sincere gratitude to all contributors and the research community for their valuable feedback and support in the development of this project. |