# ERNIE-Image

[ERNIE-Image] is a powerful and highly efficient image generation model with 8B parameters. Currently, there are only two models:

|Model|Hugging Face|
|---|---|
|ERNIE-Image|https://huggingface.co/baidu/ERNIE-Image|
|ERNIE-Image-Turbo|https://huggingface.co/baidu/ERNIE-Image-Turbo|

## ERNIE-Image

ERNIE-Image pairs a relatively compact architecture with solid instruction-following capability. Built on an 8B DiT backbone, it delivers performance that is comparable in some scenarios to larger (20B+) models while using fewer parameters. It performs relatively consistently in instruction understanding and execution, text generation (e.g., English, Chinese, and Japanese), and overall stability.

## ERNIE-Image-Turbo

ERNIE-Image-Turbo is a distilled variant of ERNIE-Image that requires only 8 NFEs (number of function evaluations). It is a more efficient alternative whose performance is relatively comparable to the full model in certain cases.

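To put the 8-NFE figure in context, the sketch below estimates how many denoiser forward passes each variant needs for one image, using the example settings from this page (50 steps at `guidance_scale=4.0` for ERNIE-Image, 8 steps at `guidance_scale=1.0` for ERNIE-Image-Turbo). The helper `estimate_nfes` is hypothetical, and it assumes that a guidance scale above 1 triggers classifier-free guidance with two forward passes per step, as in many diffusion pipelines; the actual ERNIE-Image implementation may differ.

```python
def estimate_nfes(num_inference_steps: int, guidance_scale: float) -> int:
    """Rough count of denoiser forward passes for one image.

    Assumption: guidance_scale > 1 enables classifier-free guidance, which
    costs two passes per step (one conditional, one unconditional).
    """
    passes_per_step = 2 if guidance_scale > 1.0 else 1
    return num_inference_steps * passes_per_step


# Settings taken from the usage examples on this page.
base_nfes = estimate_nfes(num_inference_steps=50, guidance_scale=4.0)
turbo_nfes = estimate_nfes(num_inference_steps=8, guidance_scale=1.0)

print(f"ERNIE-Image: ~{base_nfes} NFEs, ERNIE-Image-Turbo: ~{turbo_nfes} NFEs")
print(f"Estimated ratio: {base_nfes / turbo_nfes:.1f}x")
```

Under these assumptions the estimate only covers the denoising loop; text encoding, prompt enhancement, and VAE decoding are not counted.
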
## ErnieImagePipeline

Use [ErnieImagePipeline] to generate images from text prompts. The pipeline enables the Prompt Enhancer (PE) by default. PE rewrites the user's raw prompt to improve output quality, though it may reduce instruction-following accuracy.

We provide a pretrained 3B-parameter PE model; however, using larger language models (e.g., Gemini or ChatGPT) for prompt enhancement may yield better results. The system prompt template is available at: https://huggingface.co/baidu/ERNIE-Image/blob/main/pe/chat_template.jinja.

If you prefer not to use PE, set `use_pe=False`.

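If you enhance prompts with an external language model, one option is to do the rewriting yourself and then disable the built-in PE. The sketch below is only an illustration: `enhance_prompt` and `toy_enhancer` are hypothetical names, and `toy_enhancer` stands in for a real call to whichever LLM you use. How you feed the system prompt template to that LLM is left out.

```python
from typing import Callable


def enhance_prompt(raw_prompt: str, enhancer: Callable[[str], str]) -> str:
    """Run a user-supplied enhancer and fall back to the raw prompt if it returns nothing."""
    enhanced = enhancer(raw_prompt).strip()
    return enhanced if enhanced else raw_prompt


def toy_enhancer(prompt: str) -> str:
    # Placeholder for an external LLM call (hypothetical); it just appends detail.
    return f"{prompt}, detailed, natural lighting"


final_prompt = enhance_prompt("a black-and-white dog", toy_enhancer)
print(final_prompt)

# With a loaded pipeline, pass the result and turn off the built-in PE:
# images = pipe(prompt=final_prompt, use_pe=False).images
```

Disabling PE here avoids enhancing the prompt twice.
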
Generate an image with ERNIE-Image:

```python
import torch
from diffusers import ErnieImagePipeline

pipe = ErnieImagePipeline.from_pretrained("baidu/ERNIE-Image", torch_dtype=torch.bfloat16)
pipe.to("cuda")
# If you are running low on GPU VRAM, enable offloading instead of calling pipe.to("cuda"):
# pipe.enable_model_cpu_offload()

# Prompt in Chinese: "A black-and-white Chinese rural dog"
prompt = "一只黑白相间的中华田园犬"
images = pipe(
    prompt=prompt,
    height=1024,
    width=1024,
    num_inference_steps=50,
    guidance_scale=4.0,
    generator=torch.Generator("cuda").manual_seed(42),  # fixed seed for reproducibility
    use_pe=True,  # set to False to skip the Prompt Enhancer
).images
images[0].save("ernie-image-output.png")
```

Generate an image with ERNIE-Image-Turbo, which uses 8 inference steps and a guidance scale of 1.0:

```python
import torch
from diffusers import ErnieImagePipeline

pipe = ErnieImagePipeline.from_pretrained("baidu/ERNIE-Image-Turbo", torch_dtype=torch.bfloat16)
pipe.to("cuda")
# If you are running low on GPU VRAM, enable offloading instead of calling pipe.to("cuda"):
# pipe.enable_model_cpu_offload()

# Prompt in Chinese: "A black-and-white Chinese rural dog"
prompt = "一只黑白相间的中华田园犬"
images = pipe(
    prompt=prompt,
    height=1024,
    width=1024,
    num_inference_steps=8,
    guidance_scale=1.0,
    generator=torch.Generator("cuda").manual_seed(42),  # fixed seed for reproducibility
    use_pe=True,  # set to False to skip the Prompt Enhancer
).images
images[0].save("ernie-image-turbo-output.png")
```
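
The two examples differ only in the checkpoint name, `num_inference_steps`, and `guidance_scale`. If you switch between the checkpoints often, a small lookup can keep those settings together. This is a minimal sketch; `SamplingDefaults`, `DEFAULTS`, and `sampling_kwargs` are hypothetical helpers and not part of diffusers, and the values are simply the ones used in the examples above.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SamplingDefaults:
    num_inference_steps: int
    guidance_scale: float


# Values copied from the two usage examples on this page.
DEFAULTS = {
    "baidu/ERNIE-Image": SamplingDefaults(num_inference_steps=50, guidance_scale=4.0),
    "baidu/ERNIE-Image-Turbo": SamplingDefaults(num_inference_steps=8, guidance_scale=1.0),
}


def sampling_kwargs(model_id: str) -> dict:
    """Return keyword arguments for the pipeline call for a known checkpoint."""
    if model_id not in DEFAULTS:
        raise ValueError(f"No example settings recorded for {model_id!r}")
    return asdict(DEFAULTS[model_id])


print(sampling_kwargs("baidu/ERNIE-Image-Turbo"))

# With a loaded pipeline:
# images = pipe(prompt=prompt, height=1024, width=1024, **sampling_kwargs(model_id)).images
```
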