|
|
--- |
|
|
license: cc-by-nc-sa-4.0 |
|
|
pipeline_tag: image-to-image |
|
|
--- |
|
|
|
|
|
# 🪶 MagicQuill V2: Precise and Interactive Image Editing with Layered Visual Cues |
|
|
|
|
|
- **Paper:** [MagicQuillV2: Precise and Interactive Image Editing with Layered Visual Cues](https://huggingface.co/papers/2512.03046) |
|
|
- **Project Page:** https://magicquill.art/v2/ |
|
|
- **Code Repository:** https://github.com/zliucz/MagicQuillV2 |
|
|
- **Hugging Face Spaces Demo:** https://huggingface.co/spaces/AI4Editing/MagicQuillV2 |
|
|
|
|
|
<br> |
|
|
|
|
|
<div align="center"> |
|
|
<video src="https://github.com/user-attachments/assets/58079152-7729-48ed-9bb4-0ddfd1873dd0" width="100%" controls autoplay muted loop></video> |
|
|
</div> |
|
|
|
|
|
<br> |
|
|
|
|
|
**TLDR:** MagicQuill V2 introduces a layered composition paradigm for generative image editing, disentangling creative intent into four controllable visual cues (Content, Spatial, Structural, Color) for precise and intuitive control. |
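Conceptually, the four cue types behave like optional layers stacked into a single edit request. A minimal Python sketch of that idea (class and field names are illustrative, not the project's actual API):

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative only: the field names mirror the four cue types from the
# paper, not MagicQuill V2's real code. Each cue is independent and optional.
@dataclass
class LayeredEdit:
    content: Optional[str] = None      # foreground prop / reference (WHAT to generate)
    spatial: Optional[list] = None     # mask region (WHERE the edit applies)
    structural: Optional[list] = None  # sketched edges constraining shape
    color: Optional[list] = None       # color strokes constraining appearance

    def active_cues(self):
        # Report which layers the user has actually provided.
        return [name for name, value in vars(self).items() if value is not None]

edit = LayeredEdit(content="props/hat.png", spatial=[(10, 10), (120, 80)])
print(edit.active_cues())  # -> ['content', 'spatial']
```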
|
|
|
|
|
## Hardware Requirements |
|
|
|
|
|
Our model is based on Flux Kontext, which is large and computationally intensive. |
|
|
- **VRAM**: Approximately **40GB** of VRAM is required for inference. |
|
|
- **Speed**: It takes about **30 seconds** to generate a single image. |
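If you are unsure whether your GPU meets the 40GB requirement, a small check before launching can save a failed run. The sketch below shells out to `nvidia-smi` (the query flags are standard); the helper names are illustrative, not part of this project:

```python
import shutil
import subprocess

REQUIRED_GB = 40  # the README's stated VRAM requirement


def parse_mib(line: str) -> float:
    # `nvidia-smi --query-gpu=memory.total --format=csv,noheader`
    # prints lines like "49140 MiB"; convert MiB -> GiB.
    return float(line.strip().split()[0]) / 1024


def total_vram_gb():
    """Return total VRAM of GPU 0 in GiB, or None if it cannot be queried."""
    if shutil.which("nvidia-smi") is None:
        return None
    try:
        out = subprocess.check_output(
            ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader"],
            text=True,
        )
    except subprocess.CalledProcessError:
        return None
    return parse_mib(out.splitlines()[0])


if __name__ == "__main__":
    gb = total_vram_gb()
    if gb is None:
        print("Could not query the GPU; is nvidia-smi installed?")
    elif gb < REQUIRED_GB:
        print(f"Only {gb:.1f} GiB VRAM; consider MagicQuill V1 or the Spaces demo.")
    else:
        print(f"{gb:.1f} GiB VRAM available.")
```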
|
|
|
|
|
> **Important**: This is a research project focused on pushing the boundaries of interactive image editing. If you do not have sufficient GPU memory, we recommend checking out our [**MagicQuill V1**](https://github.com/ant-research/MagicQuill) or trying the online demo on [**Hugging Face Spaces**](https://huggingface.co/spaces/AI4Editing/MagicQuillV2). |
|
|
|
|
|
## Setup |
|
|
|
|
|
1. **Clone the repository** |
|
|
```bash |
|
|
git clone https://github.com/magic-quill/MagicQuillV2.git |
|
|
cd MagicQuillV2 |
|
|
``` |
|
|
|
|
|
2. **Create environment** |
|
|
```bash |
|
|
conda create -n MagicQuillV2 python=3.10 -y |
|
|
conda activate MagicQuillV2 |
|
|
``` |
|
|
|
|
|
3. **Install dependencies** |
|
|
```bash |
|
|
pip install -r requirements.txt |
|
|
``` |
|
|
|
|
|
4. **Download models** |
|
|
Download the models from [Hugging Face](https://huggingface.co/LiuZichen/MagicQuillV2-models) and place them in the `models/` directory. |
|
|
|
|
|
```bash |
|
|
huggingface-cli download LiuZichen/MagicQuillV2-models --local-dir models |
|
|
``` |
|
|
|
|
|
5. **Run the demo** |
|
|
```bash |
|
|
python app.py |
|
|
``` |
|
|
|
|
|
## System Overview |
|
|
|
|
|
The MagicQuill V2 interactive system brings the layered composition framework into a single editing interface. |
|
|
|
|
|
<div align="center"> |
|
|
<img src="https://github.com/zliucz/MagicQuillV2/raw/main/assets/V2_UI.png" alt="MagicQuill V2 UI" width="100%"> |
|
|
</div> |
|
|
|
|
|
### Key Upgrades from V1 |
|
|
|
|
|
1. **Toolbar (A)**: Features a new **Local Edit Brush** for defining the target editing area, along with tools for sketching edges and applying color. |
|
|
2. **Visual Cue Manager (B)**: Holds all content layer visual cues (**foreground props**) that users can drag onto the canvas to define what to generate. |
|
|
3. **Image Segmentation Panel (C)**: Accessed via the segment icon, this panel allows precise object extraction using SAM (Segment Anything Model) with positive/negative dots or bounding boxes. |
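For reference, the point-prompt flow behind panel (C) can be sketched with Meta's `segment_anything` package. The helper names below are hypothetical, and a downloaded SAM checkpoint is assumed; only the dot-to-prompt conversion is specific to this sketch:

```python
import numpy as np


def dots_to_prompt(positive, negative):
    """Convert clicked dots into SAM's (point_coords, point_labels) arrays.

    Label 1 marks a positive click (keep this region), label 0 a negative
    click (exclude this region).
    """
    coords = np.array(list(positive) + list(negative), dtype=np.float32)
    labels = np.array([1] * len(positive) + [0] * len(negative), dtype=np.int64)
    return coords, labels


def extract_object(image_rgb, positive, negative, checkpoint="sam_vit_h_4b8939.pth"):
    # Not executed here: requires `pip install segment-anything` and the
    # ViT-H SAM checkpoint file named above.
    from segment_anything import SamPredictor, sam_model_registry

    predictor = SamPredictor(sam_model_registry["vit_h"](checkpoint=checkpoint))
    predictor.set_image(image_rgb)  # HxWx3 uint8 RGB array
    coords, labels = dots_to_prompt(positive, negative)
    masks, scores, _ = predictor.predict(
        point_coords=coords, point_labels=labels, multimask_output=True
    )
    return masks[int(scores.argmax())]  # boolean HxW mask of the best candidate
```

A bounding-box prompt follows the same pattern via the predictor's `box` argument.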
|
|
|
|
|
## Citation |
|
|
|
|
|
If you find MagicQuill V2 useful for your research, please cite our paper: |
|
|
|
|
|
```bibtex |
|
|
@article{liu2025magicquillv2, |
|
|
title={MagicQuill V2: Precise and Interactive Image Editing with Layered Visual Cues}, |
|
|
author={Zichen Liu and Yue Yu and Hao Ouyang and Qiuyu Wang and Shuailei Ma and Ka Leong Cheng and Wen Wang and Qingyan Bai and Yuxuan Zhang and Yanhong Zeng and Yixuan Li and Xing Zhu and Yujun Shen and Qifeng Chen}, |
|
|
journal={arXiv preprint arXiv:2512.03046}, |
|
|
year={2025} |
|
|
} |
|
|
``` |