WeiChow committed on
Commit a6bbf6e · verified · 1 Parent(s): 1361f95

Update README.md

Files changed (1):
  1. README.md +89 -43

README.md CHANGED
@@ -10,61 +10,107 @@ tags:
  base_model:
  - google/gemma-2-2b-it
  ---
- # EditMGT Model for HuggingFace Transformers
-
- This repository contains a HuggingFace Transformers-compatible implementation of the EditMGT model for image editing based on text instructions.
-
- ## Installation
-
  ```bash
- pip install transformers pillow
- # Install other required dependencies from the original EditMGT repository
  ```
-
- ## Usage
-
  ```python
- from eval.utils import init_edit_mgt
- from src.v2_model import negative_prompt
  from PIL import Image
-
- # Initialize the pipeline
- pipe = init_edit_mgt(
-     ckpt_path='./runs/editmgt-ct/checkpoint',
-     enable_fp16=False,
-     device='cuda:0',
-     use_ema=False
- )
-
- # Load your image
- reference_image = Image.open("your_image.jpg").resize((1024, 1024))
-
- # Generate edited image
- result = pipe(
-     prompt=["Make the sky more blue and add clouds"],
-     negative_prompt=[negative_prompt],
-     height=1024,
-     width=1024,
-     num_inference_steps=36,
-     guidance_scale=6,
-     num_images_per_prompt=1,
-     reference_image=[reference_image],
-     reference_strength=1.1,
- )
-
- # Save the result
- result.images[0].save("edited_image.png")
  ```
-
- ## Citation
-
- If you use this model, please cite the original work:
-
  ```
- @article{editmgt2023,
-   title={EditMGT: Text-guided Image Editing with Masked Generative Transformer},
-   author={...},
-   journal={...},
-   year={2023}
- }
- ```
 
  base_model:
  - google/gemma-2-2b-it
  ---
+ # EditMGT
+
+ <div align="center">
+
+ [![arXiv](https://img.shields.io/badge/arXiv-XXXX.XXXXX-b31b1b.svg)]()
+ [![Dataset](https://img.shields.io/badge/🤗%20CrispEdit2M-Dataset-yellow)](https://huggingface.co/datasets/WeiChow/CrispEdit-2M)
+ [![Checkpoint](https://img.shields.io/badge/🧨%20EditMGT-CKPT-blue)](https://huggingface.co/WeiChow/EditMGT)
+ [![GitHub](https://img.shields.io/badge/GitHub-Repo-181717?logo=github)](https://github.com/weichow23/editmgt/tree/main)
+ [![Page](https://img.shields.io/badge/🏠%20Home-Page-b3.svg)](https://weichow23.github.io/editmgt/)
+ [![Python 3.9](https://img.shields.io/badge/Python-3.9-blue.svg?logo=python)](https://www.python.org/downloads/release/python-392/)
+
+ </div>
+
+ ## 🌟 Overview
+
+ This is the official repository for **EditMGT: Unleashing the Potential of Masked Generative Transformer in Image Editing** ✨.
+
+ EditMGT is a novel framework that leverages Masked Generative Transformers for advanced image editing. It enables precise, controllable modifications while preserving the integrity of the original content.
+
+ <p align="center">
+ <img src="asset/editmgt.png" alt="EditMGT Architecture" width="800px">
+ </p>
+
+ ## ✨ Features
+
+ - 🎨 Strong style-transfer capabilities
+ - 🔍 Attention control over editing regions
+ - ⚡ A backbone of only 960M parameters, enabling fast inference
+ - 📊 Trained on the [CrispEdit-2M](https://huggingface.co/datasets/WeiChow/CrispEdit-2M) dataset
+
+ ## ⚡ Quick Start
+
+ First, clone the repository and navigate to the project root:
+ ```shell
+ git clone https://github.com/weichow23/editmgt
+ cd editmgt
+ ```
+
+ ## 🔧 Environment Setup

  ```bash
+ # Create and activate the conda environment
+ conda create --name editmgt python=3.9.2
+ conda activate editmgt
+
+ # Optional: install system dependencies
+ sudo apt-get install libgl1-mesa-glx libglib2.0-0 -y
+
+ # Install Python dependencies
+ pip3 install git+https://github.com/openai/CLIP
+ pip3 install -r requirements.txt
  ```

+ ⚠️ **Note**: If you encounter unexpected dependency errors, please refer to [Issues](https://github.com/viiika/Meissonic/issues/14) to find the library versions that may fix them.
+
+ ## 🚀 Inference
+
+ Run the following script from the `editmgt` directory:

  ```python
+ import os
+ import sys
+ sys.path.append("./")
  from PIL import Image
+ from src.editmgt import init_edit_mgt
+ from src.v2_model import negative_prompt
+
+ if __name__ == "__main__":
+     pipe = init_edit_mgt(device='cuda:0')
+     # Forcing the use of bf16 can improve speed, but it incurs a performance
+     # penalty: we observed a drop of about 0.8 on GEditBench.
+     # pipe = init_edit_mgt(device, enable_bf16=False)
+
+     # pipe.local_guidance = 0.01  # enable the local guidance-scale auxiliary mode
+     # pipe.local_query_text = 'owl'  # use specific words as attention queries
+     # pipe.attention_enable_blocks = [i for i in range(28, 37)]  # attention layers used
+     input_image = Image.open('assets/case_5.jpg')
+     result = pipe(
+         prompt=['Make it into Ghibli style'],
+         height=1024,
+         width=1024,
+         num_inference_steps=36,  # for some simple tasks, 16 steps are enough!
+         guidance_scale=6,
+         reference_strength=1.1,
+         reference_image=[input_image.resize((1024, 1024))],
+         negative_prompt=negative_prompt or None,
+     )
+     output_dir = "./output"
+     os.makedirs(output_dir, exist_ok=True)
+
+     file_path = os.path.join(output_dir, "edited_case_5.png")
+     w, h = input_image.size
+     result.images[0].resize((w, h)).save(file_path)
  ```
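The script above upsamples the reference image to the model's 1024×1024 working resolution and resizes the edited output back to the source dimensions. That round-trip can be sketched with PIL alone, no EditMGT checkpoint required; `run_edit` here is a hypothetical stand-in for the pipeline call:

```python
from PIL import Image

MODEL_SIZE = 1024  # EditMGT's working resolution, per the script above

def edit_at_model_resolution(image: Image.Image, run_edit) -> Image.Image:
    """Upsample to the model resolution, edit, then restore the original size."""
    model_input = image.resize((MODEL_SIZE, MODEL_SIZE))
    edited = run_edit(model_input)  # stand-in for pipe(..., reference_image=[model_input])
    w, h = image.size
    return edited.resize((w, h))

# Identity "edit", just to demonstrate the size round-trip
source = Image.new("RGB", (640, 480), "skyblue")
result = edit_at_model_resolution(source, run_edit=lambda im: im)
print(result.size)  # (640, 480)
```

Note that non-square inputs are stretched to the square working resolution and stretched back afterwards, so extreme aspect ratios may show distortion in intermediate results.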
 
+ ## 📑 Citation
+
+ ```bibtex
+
  ```
+
+ ## 🙏 Acknowledgements
+
+ We extend our sincere gratitude to all contributors and the research community for their valuable feedback and support in the development of this project.