Could we reconsider FLUX.2 VAE?

#200

by ArranEye - opened 10 days ago

The Qwen VAE is much weaker than flux2vae. Even on a plain encode/decode round trip of a single image, Qwen's output shows some loss, while with Flux2 the difference is difficult to discern with the naked eye. I think this speaks volumes. flux2vae could also help with future edit models.
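For anyone who wants to put a number on the round-trip comparison instead of judging by eye, here is a minimal sketch that scores a reconstruction with PSNR. Everything in it is an assumption for illustration: `fake_round_trip` is a hypothetical stand-in for a real `decode(encode(image))` call (it only quantizes pixel values), and the pixel data is synthetic, so the numbers say nothing about either VAE.

```python
import math

def psnr(original, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(original) != len(reconstructed):
        raise ValueError("inputs must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical inputs: no reconstruction error
    return 10.0 * math.log10(max_value ** 2 / mse)

def fake_round_trip(pixels, step):
    """Hypothetical stand-in for a VAE encode/decode pass: snaps values to multiples of `step`."""
    return [min(255, round(p / step) * step) for p in pixels]

# Synthetic 8-bit "image" (assumption: replace with real pixel data flattened to a list).
pixels = [(37 * i) % 256 for i in range(4096)]

coarse = psnr(pixels, fake_round_trip(pixels, 16))  # lossier stand-in
fine = psnr(pixels, fake_round_trip(pixels, 2))     # less lossy stand-in
print(f"coarse: {coarse:.2f} dB, fine: {fine:.2f} dB")
```

To compare real VAEs, swap `fake_round_trip` for each model's own encode/decode pass on the same images and compare the scores; higher PSNR means the reconstruction is closer to the input.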

Kimmypox

9 days ago

Using flux2vae means having to retrain the entire model. But if you have a dataset of at least 200,000 images with captions, you could train it yourself, though it’s a waste of money unless you happen to have a high-end GPU on hand.

ArranEye

9 days ago

If circlestone-labs wants to make Anima better, the Qwen VAE is definitely the first thing to discard. I just wanted to bring this up again.

Kimmypox

9 days ago

You know, the more people are repeatedly told to do something, the less willing they become to do it. Also, have you read the Flux.2 license terms yet? In this world, you can’t simply take components from different models and combine them without checking their licenses first. Once you understand that, you’ll see why so many newly released models are choosing to use Qwen VAE or Wan VAE instead.

ArranEye

9 days ago

It is Apache 2.0, isn't it? That shouldn't block anything. Many newly released models use the Qwen VAE only because flux2vae wasn't available when they started the project (Krea2 turbo, for example, but their full version also uses flux2vae). SeFi-image even uses a finetuned flux2vae to fit their training. Flux2vae was published independently.

Kimmypox

9 days ago

I suggest you re-read the Flux.2 license. Also, SeFi-image is just a research model intended for specific testing purposes.

ArranEye

9 days ago

YOU SHOULD READ IT AGAIN. The Flux.2 VAE isn't under the Flux.2 license.

Kimmypox

9 days ago

Nah haha, congratulations — you’ve spent time and effort analyzing something that really doesn’t matter. ;P
But I’ll say this one more time: don’t keep repeating it. If circlestone-labs wants to use Flux2VAE, they’ll handle it themselves.

ArranEye

9 days ago

I didn't read it again, because they already put the statement "flux2vae is independent of flux2" at the very beginning.

Is it something you're proud of to hinder technological progress? I don't understand what you're rambling about here. Either provide a feasible training method or stop talking.

kongbai-84

9 days ago

I don't think swapping out the VAE alone is a good idea right now. The improvement might not be that noticeable, and it could disrupt Anima’s existing ecosystem. Plus, Qwen's current VAE hasn't really hit a bottleneck yet—at least, neither my community members nor I feel it’s underperforming. If we absolutely have to upgrade a component, I’d personally prefer a more powerful text encoder; the current one is decent, but there's still room for improvement.

Ultimately, instead of rushing to replace the VAE or other parts, I believe a better approach would be to gather more training experience, curate a larger and higher-quality dataset, and then train a brand-new model from scratch for an all-around upgrade. That seems far more effective than just swapping a single component for a piecemeal boost.
