SigLIP2 or SigLIP1
License
BAGEL is licensed under the Apache 2.0 license. It is finetuned from Qwen2.5-7B-Instruct and siglip-so400m-14-980-flash-attn2-navit model, and uses the FLUX.1-schnell VAE model, all under Apache 2.0.
siglip-so400m-14-980-flash-attn2-navit by HuggingFaceM4 is SigLIP1, but your paper says:
"We adopt SigLIP2-so400m/14 [74] with a fixed 384-resolution as the initialization of the ViT encoder."
Thanks for pointing out this issue! We use siglip-so400m-14-384-flash-attn2. The license information has been updated.
I'm still confused.
'siglip-so400m-14-384-flash-attn2' seems to be SigLIP1. But SigLIP2-so400m/14 was mentioned in your paper.
Let me clarify this. We use SigLIP2-so400m/14 with a 384x384 input resolution, then interpolate the position embeddings to 980x980.
This is the same step that siglip-so400m-14-384-flash-attn2 applied to siglip-so400m-14-384.
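For anyone following along, here is a minimal sketch of what "interpolating the position embeddings" can look like. It is not the BAGEL code. It assumes a patch size of 14 (from the "/14" in the model name), so a 384x384 input gives a 27x27 grid of positions and a 980x980 input gives a 70x70 grid. It also assumes bilinear interpolation with an align-corners convention; the thread does not say which mode was actually used. The function name `interpolate_pos_embed` and the toy embeddings are hypothetical and only for illustration.

```python
def interpolate_pos_embed(grid, new_h, new_w):
    """Bilinearly resize a 2D grid of position embeddings.

    grid: list of rows; each row is a list of embedding vectors (lists of floats).
    Returns a new_h x new_w grid with the same embedding dimension.
    Assumption: align-corners mapping, so the corner embeddings are preserved.
    """
    old_h, old_w = len(grid), len(grid[0])
    dim = len(grid[0][0])
    out = []
    for i in range(new_h):
        # Map the output row index to a fractional source row.
        y = i * (old_h - 1) / (new_h - 1) if new_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, old_h - 1)
        wy = y - y0
        row = []
        for j in range(new_w):
            # Map the output column index to a fractional source column.
            x = j * (old_w - 1) / (new_w - 1) if new_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, old_w - 1)
            wx = x - x0
            # Blend the four neighbouring embeddings, one dimension at a time.
            vec = [
                (1 - wy) * ((1 - wx) * grid[y0][x0][d] + wx * grid[y0][x1][d])
                + wy * ((1 - wx) * grid[y1][x0][d] + wx * grid[y1][x1][d])
                for d in range(dim)
            ]
            row.append(vec)
        out.append(row)
    return out


PATCH = 14                 # assumed patch size, from "so400m/14"
old_side = 384 // PATCH    # 27 positions per side at 384x384
new_side = 980 // PATCH    # 70 positions per side at 980x980

# Toy 2-dimensional "embeddings" that simply encode (row, col).
old_grid = [[[float(r), float(c)] for c in range(old_side)] for r in range(old_side)]
new_grid = interpolate_pos_embed(old_grid, new_side, new_side)
```

With real weights one would typically reshape the learned embedding table to a 2D grid and call something like `torch.nn.functional.interpolate` instead of looping in Python; the loop above is only meant to show the idea. The ViT weights themselves are unchanged by this step, only the position table grows from 27x27 to 70x70 entries.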