The Qwen Image Edit 2511 model has just been published, and it competes directly with Nano Banana Pro on image editing tasks. With native image output of up to 2560x2560 pixels in only 12 steps, it is a big step up. With our installers and a specially made FP8 Scaled quantized model, you can run it on GPUs with as little as 6 GB. In this tutorial, I compare Qwen Image Edit 2511 with its predecessor, Qwen Image 2509, on 12 unique and hard prompts and cases. Everything is explained step by step and provided.
The architecture is based on stateful real-time processing with a novel asynchronous memory update. Instead of reprocessing the entire conversation history for each message, it processes only the current query, with all the context held in dedicated memory layers. Memory is updated after the answer is generated, so the update doesn't add to latency: in tests, time to first token was almost the same as the time to generate a single token. It also shows better quality/accuracy in multi-turn dialogue than a stateless decoder-only model of the same size.
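To make the flow concrete, here's a toy Python sketch of the idea, not the actual model code. Everything in it is a hypothetical stand-in: `StatefulChat`, `respond`, `flush`, and `stateless_tokens` are names made up for illustration, the "memory" is a plain list capped at a fixed number of slots instead of real memory layers, and "tokens" are just whitespace-separated words. It only shows the ordering (answer first, memory update scheduled afterwards) and how the per-turn input stays the size of the query instead of growing with the history.

```python
import asyncio


class StatefulChat:
    """Toy stand-in for a stateful model (hypothetical, for illustration only)."""

    def __init__(self, memory_slots=4):
        self.memory_slots = memory_slots  # fixed memory size, independent of dialogue length
        self.memory = []                  # stands in for the dedicated memory layers
        self.tokens_processed = 0         # input tokens read across all turns
        self._pending = None              # background memory-update task, if any

    async def _update_memory(self, query, answer):
        # Placeholder for the real memory-update computation.
        await asyncio.sleep(0)
        self.memory.append((query, answer))
        self.memory = self.memory[-self.memory_slots:]

    async def flush(self):
        # Wait for the last scheduled memory update to finish.
        if self._pending is not None:
            await self._pending
            self._pending = None

    async def respond(self, query):
        # The previous update must be done before memory is read again.
        await self.flush()
        # Only the current query is processed, never the full history.
        self.tokens_processed += len(query.split())
        answer = f"reply to '{query}' (memory slots read: {len(self.memory)})"
        # Schedule the update and return right away, so it adds no latency
        # to this answer.
        self._pending = asyncio.create_task(self._update_memory(query, answer))
        return answer


def stateless_tokens(queries):
    """Input tokens read by a stateless baseline that re-reads every earlier
    query on each turn (answers are ignored to keep the count simple)."""
    history = 0
    total = 0
    for query in queries:
        history += len(query.split())
        total += history
    return total


async def run_dialogue(queries):
    chat = StatefulChat()
    answers = []
    for query in queries:
        answers.append(await chat.respond(query))
    await chat.flush()
    return chat, answers
```

With the three queries "hello there", "how are you", and "tell me more please", the toy stateful chat reads 9 input words in total, while the stateless count is 2 + 5 + 9 = 16, and that gap keeps growing with every extra turn.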
Initial experiments were small-scale (12M to 160M parameter models trained on simple synthetic datasets), but I'm now starting to train a bigger 270M parameter model on real data.