SeFi-Image Non-Commercial License Agreement

By clicking "Agree and access repository", you acknowledge that you have read and agree to the Creative Commons Attribution-NonCommercial 4.0 International license (CC BY-NC 4.0). You agree to use SeFi-Image checkpoints for non-commercial purposes only and to comply with all applicable laws and responsible AI use requirements.

SeFi-Image

SeFi-Image is a text-to-image foundation model family built with Semantic-First Diffusion. It separates generation into semantic and texture latent streams, denoising semantic structure slightly ahead of texture details. This design gives the texture stream a cleaner structural anchor and improves the reconstruction-generation trade-off in latent diffusion.
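
To make the "semantic slightly ahead of texture" idea concrete, here is a toy noise schedule. It is only an illustrative sketch, not the schedule used by SeFi-Image: the linear ramp, the function name `semantic_first_schedule`, and the `lead` parameter are all assumptions introduced here. Both streams start at pure noise (1.0) and end clean (0.0), but the semantic stream begins denoising earlier, so at every step it is at least as clean as the texture stream.

```python
def clamp01(x: float) -> float:
    """Clamp a value to the closed interval [0, 1]."""
    return min(1.0, max(0.0, x))


def semantic_first_schedule(num_steps: int, lead: float = 0.1):
    """Return a list of (semantic_noise, texture_noise) pairs, one per step.

    Assumption: a shared linear clock runs from 0 to 1 + lead. The semantic
    stream denoises over [0, 1] and the texture stream over [lead, 1 + lead],
    so texture lags semantic by `lead` (a hypothetical parameter).
    """
    if num_steps < 1:
        raise ValueError("num_steps must be at least 1")
    if lead < 0:
        raise ValueError("lead must be non-negative")

    total = 1.0 + lead
    schedule = []
    for i in range(num_steps + 1):
        clock = total * i / num_steps
        semantic = clamp01(1.0 - clock)         # starts denoising immediately
        texture = clamp01(1.0 + lead - clock)   # starts denoising `lead` later
        schedule.append((semantic, texture))
    return schedule


if __name__ == "__main__":
    for semantic, texture in semantic_first_schedule(num_steps=10, lead=0.2):
        print(f"semantic noise = {semantic:.2f}   texture noise = {texture:.2f}")
```

In this sketch the texture stream always sees a semantic latent that is no noisier than itself, which is the "cleaner structural anchor" described above. The technical report defines the actual schedule.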

[Figure: SeFi-Image generated examples]

Highlights

Semantic-first generation
Semantic latents denoise ahead of texture latents, providing a cleaner structural anchor for image synthesis.

Faster training
The 5B model reaches strong benchmark performance with about 125K A800 GPU hours.

Better generation-reconstruction trade-off
A high-fidelity texture latent preserves reconstruction detail, while a compact semantic latent simplifies generation.

Performance

The figure below summarizes SeFi-Image-5B across representative benchmarks; the numbers are taken from the main evaluation tables in the SeFi-Image technical report.

[Figure: SeFi-Image-5B performance overview]

Model Zoo

Family  Model                Checkpoint                      Steps  Guidance scale
Base    SeFi-Image-1B-Base   SeFi-Image/SeFi-Image-1B-Base   50     4.0
Base    SeFi-Image-2B-Base   SeFi-Image/SeFi-Image-2B-Base   50     4.0
Base    SeFi-Image-5B-Base   SeFi-Image/SeFi-Image-5B-Base   50     4.0
RL      SeFi-Image-5B-RL     SeFi-Image/SeFi-Image-5B-RL     50     4.0
Turbo   SeFi-Image-1B-turbo  SeFi-Image/SeFi-Image-1B-turbo  4      1.0
Turbo   SeFi-Image-2B-turbo  SeFi-Image/SeFi-Image-2B-turbo  4      1.0
Turbo   SeFi-Image-5B-turbo  SeFi-Image/SeFi-Image-5B-turbo  4      1.0
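
The per-family settings in the table can be kept in a small lookup so that scripts do not hard-code them. The sketch below is a convenience helper, not part of the SeFi codebase: `RECOMMENDED_SETTINGS`, `checkpoint_family`, and `recommended_settings` are hypothetical names, and the family is inferred from the repo id suffix on the assumption that all checkpoints follow the naming shown in the table. The keyword names `num_inference_steps` and `guidance_scale` are the ones used in the Python API example in Quick Start.

```python
# Values copied from the Model Zoo table; keys are the table's "Family" column.
RECOMMENDED_SETTINGS = {
    "Base": {"num_inference_steps": 50, "guidance_scale": 4.0},
    "RL": {"num_inference_steps": 50, "guidance_scale": 4.0},
    "Turbo": {"num_inference_steps": 4, "guidance_scale": 1.0},
}

# Assumption: the last dash-separated token of the repo name marks the family.
_SUFFIX_TO_FAMILY = {"base": "Base", "rl": "RL", "turbo": "Turbo"}


def checkpoint_family(repo_id: str) -> str:
    """Infer the family ("Base", "RL", or "Turbo") from a checkpoint repo id."""
    name = repo_id.rsplit("/", 1)[-1]
    suffix = name.rsplit("-", 1)[-1].lower()
    if suffix not in _SUFFIX_TO_FAMILY:
        raise ValueError(f"Cannot infer checkpoint family from {repo_id!r}")
    return _SUFFIX_TO_FAMILY[suffix]


def recommended_settings(repo_id: str) -> dict:
    """Return a copy of the table's steps and guidance scale for a checkpoint."""
    return dict(RECOMMENDED_SETTINGS[checkpoint_family(repo_id)])


if __name__ == "__main__":
    for repo in ("SeFi-Image/SeFi-Image-5B-Base", "SeFi-Image/SeFi-Image-5B-turbo"):
        print(repo, recommended_settings(repo))
```

The returned dictionary could then be unpacked into a pipeline call, for example `pipe(prompt, **recommended_settings(repo_id))`, assuming the pipeline accepts those keywords for every checkpoint family.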

Quick Start

Install the SeFi inference code and dependencies from the SeFi-Image inference repository, then pass a Hugging Face checkpoint repo id:

python inference.py \
  --checkpoint SeFi-Image/SeFi-Image-5B-Base \
  --prompt "A red apple on a wooden table." \
  --output-dir outputs/inference/sefi_5b_base

Turbo checkpoints use the same command pattern, with the step count set to 4:

python inference.py \
  --checkpoint SeFi-Image/SeFi-Image-5B-turbo \
  --prompt "A blue ceramic mug on a white desk." \
  --steps 4 \
  --output-dir outputs/inference/sefi_5b_turbo
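
When running many prompts, it can be convenient to build these commands programmatically. The helper below is a minimal sketch: `build_inference_command` is a hypothetical name, and it uses only the flags shown in the two commands above (`--checkpoint`, `--prompt`, `--steps`, `--output-dir`). It returns an argument list rather than a shell string, so prompts containing quotes or spaces need no manual escaping.

```python
import shlex


def build_inference_command(checkpoint, prompt, output_dir, steps=None):
    """Build the argument list for one inference.py call.

    `steps` is optional; when it is None the --steps flag is omitted and the
    script's own default applies.
    """
    cmd = ["python", "inference.py", "--checkpoint", checkpoint, "--prompt", prompt]
    if steps is not None:
        cmd += ["--steps", str(steps)]
    cmd += ["--output-dir", output_dir]
    return cmd


if __name__ == "__main__":
    prompts = [
        "A red apple on a wooden table.",
        "A blue ceramic mug on a white desk.",
    ]
    for index, prompt in enumerate(prompts):
        cmd = build_inference_command(
            checkpoint="SeFi-Image/SeFi-Image-5B-turbo",
            prompt=prompt,
            output_dir=f"outputs/inference/sefi_5b_turbo/{index:03d}",
            steps=4,
        )
        # Print a copy-pasteable shell command. To execute it instead, pass
        # `cmd` to subprocess.run(cmd, check=True) from the inference repository.
        print(shlex.join(cmd))
```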

Python API:

from sefi import SEFIInferencePipeline

pipe = SEFIInferencePipeline.from_pretrained(
    "SeFi-Image/SeFi-Image-5B-Base",
)
images = pipe(
    "A red apple on a wooden table.",
    seed=42,
)
images[0].save("example.png")

Turbo checkpoints use the same API, with 4 inference steps and a guidance scale of 1.0:

from sefi import SEFIInferencePipeline

pipe = SEFIInferencePipeline.from_pretrained(
    "SeFi-Image/SeFi-Image-5B-turbo",
)
images = pipe(
    "A blue ceramic mug on a white desk.",
    num_inference_steps=4,
    guidance_scale=1.0,
    seed=123,
)
images[0].save("turbo_example.png")

Intended Use

SeFi-Image is intended for research and creative text-to-image generation, including prompt following, bilingual text rendering, style exploration, and model development. The Base checkpoints are suitable starting points for fine-tuning and analysis. Turbo checkpoints are intended for fast generation. The RL checkpoint is intended for stronger alignment-oriented generation.

Citation

If you find SeFi-Image useful, please cite the paper:

@misc{sefiteam2026sefiimagetexttoimagefoundationmodel,
      title={SeFi-Image: A Text-to-Image Foundation Model with Semantic-First Diffusion}, 
      author={SeFi-Team},
      year={2026},
      eprint={2606.22568},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2606.22568}, 
}