---
language:
- en
license: apache-2.0
library_name: diffusers
pipeline_tag: image-text-to-text
---
This is the BLIP3o-4B checkpoint trained on the **open source** data described in the paper [BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset](https://huggingface.co/papers/2505.09568).

| Model               | Pretrain Data                                         | GenEval | DPG-Bench | WISE |
|---------------------|-------------------------------------------------------|---------|-----------|------|
| 4B (open source)    | 30 million open-source data                           | 0.81    | 79.36     | 0.50 |
| 8B (open source)    | 30 million open-source data                           | 0.83    | 80.73     | 0.52 |
| 8B (paper reported) | 30 million open-source + 30 million proprietary data  | 0.84    | 81.60     | 0.62 |

See https://github.com/JiuhaiChen/BLIP3o for the code.
### Download

```
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="BLIP3o/BLIP3o-Model-4B",
    repo_type="model"
)
```
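`snapshot_download` returns the local directory that holds the downloaded files, which is presumably the path the demo expects as its model argument. The following is a minimal sketch, assuming you want the checkpoint in a directory of your own choosing rather than the default Hugging Face cache. The helper names `download_kwargs` and `download_checkpoint` are illustrative and not part of the BLIP3o repo.

```python
from pathlib import Path

REPO_ID = "BLIP3o/BLIP3o-Model-4B"  # repo id taken from the snippet above


def download_kwargs(target_dir):
    """Build the arguments for snapshot_download without touching the network."""
    return {
        "repo_id": REPO_ID,
        "repo_type": "model",
        # local_dir places the files in target_dir instead of the default cache
        "local_dir": str(Path(target_dir).expanduser()),
    }


def download_checkpoint(target_dir):
    """Download the checkpoint and return the local path of the snapshot."""
    # Imported lazily so download_kwargs works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(**download_kwargs(target_dir))
```

Calling `download_checkpoint("./BLIP3o-Model-4B")` (a hypothetical target directory) returns the path as a string, which can be kept and passed to the demo later.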
Clone the repo (if you haven’t already) and install the environment following the instructions in the repository:

```
git clone https://github.com/JiuhaiChen/BLIP3o.git
```
From the root of the cloned repository, change to the demo folder:

```
cd gradio
```
Launch with your model path:

```
python app.py /path/to/your/model
```