metadata

license: mit
pipeline_tag: text-to-image
tags:
  - text-to-image
  - image-generation
  - autoregressive
  - reinforcement-learning
  - alignment
  - llamagen
  - janus

VA-π aligned checkpoints (VA-Pi)

This repo hosts post-trained checkpoints for the paper “VA-π: Variational Policy Alignment for Pixel-Aware Autoregressive Generation”.

Code: https://github.com/Lil-Shake/VA-Pi
Project page: https://lil-shake.github.io/va-pi.github.io/
arXiv: https://arxiv.org/abs/2512.19680

The weights are provided as PyTorch .pth files. These files are pickle-based, and unpickling can execute arbitrary code, so only load weights you trust.
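As a cautious way to act on that advice, the sketch below inspects a checkpoint without unpickling it and then loads it with PyTorch's `weights_only=True` option. The helper names `list_pickle_members` and `load_tensors_only` are hypothetical, and the sketch assumes the file was written in PyTorch's zip-based format; whether these particular checkpoints load under `weights_only=True` has not been verified here.

```python
import zipfile


def list_pickle_members(path):
    """List pickle members of a zip-format checkpoint without unpickling anything.

    Returns an empty list if the file is not a zip archive (for example a
    legacy-format checkpoint), in which case this check tells you nothing.
    """
    if not zipfile.is_zipfile(path):
        return []
    with zipfile.ZipFile(path) as zf:
        return [name for name in zf.namelist() if name.endswith(".pkl")]


def load_tensors_only(path):
    """Load a checkpoint while refusing arbitrary pickled Python objects.

    Assumption: the checkpoint holds only tensors and plain containers.
    If it stores other objects, torch.load raises instead of executing them.
    """
    import torch  # imported lazily so the inspection helper works without torch

    return torch.load(path, map_location="cpu", weights_only=True)
```

If `load_tensors_only` raises an unpickling error, the file contains more than plain tensors, and falling back to a full `torch.load` is only reasonable for a source you trust.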

Files

LlamaGen C2I (ImageNet class-to-image)

c2i/

c2i-vapi-xl-384.pth
c2i-vapi-xxl-384.pth
c2i-ste-xxl-384.pth (STE-finetuned checkpoint)
c2i-pt-xxl-384-decoder.pth (post-trained tokenizer decoder checkpoint)

T2I (two tracks)

t2i/

t2i-vapi-xl-256.pth (LlamaGen T2I aligned checkpoint)
t2i-vapi-janus-256.pth (Janus-Pro-1B aligned checkpoint)

Quickstart

1) Download a weight file from Hugging Face

from huggingface_hub import hf_hub_download

ckpt = hf_hub_download(
    repo_id="LilShake66/VA-Pi",
    filename="c2i/c2i-vapi-xxl-384.pth",  # or another file above
)
print("downloaded:", ckpt)
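If you switch between checkpoints often, a small lookup table saves retyping paths. The following is a minimal sketch: the short names on the left (`c2i-xxl`, `t2i-janus`, and so on) are made up for illustration, while the file paths follow the Files section, assuming the T2I files sit under `t2i/` the same way the C2I files sit under `c2i/`.

```python
REPO_ID = "LilShake66/VA-Pi"

# Hypothetical short names mapped to the files listed in the Files section.
CHECKPOINTS = {
    "c2i-xl": "c2i/c2i-vapi-xl-384.pth",
    "c2i-xxl": "c2i/c2i-vapi-xxl-384.pth",
    "c2i-ste-xxl": "c2i/c2i-ste-xxl-384.pth",
    "c2i-tokenizer-decoder": "c2i/c2i-pt-xxl-384-decoder.pth",
    "t2i-xl": "t2i/t2i-vapi-xl-256.pth",
    "t2i-janus": "t2i/t2i-vapi-janus-256.pth",
}


def resolve(name):
    """Translate a short name into the file path inside the Hub repo."""
    try:
        return CHECKPOINTS[name]
    except KeyError:
        known = ", ".join(sorted(CHECKPOINTS))
        raise ValueError(f"unknown checkpoint {name!r}; expected one of: {known}") from None


def download(name):
    """Download one checkpoint and return its local cache path."""
    from huggingface_hub import hf_hub_download  # needs network access

    return hf_hub_download(repo_id=REPO_ID, filename=resolve(name))
```

For example, `download("t2i-janus")` would fetch the Janus-Pro checkpoint into the local Hugging Face cache.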

2) LlamaGen C2I sampling (recommended entry point)

Use the official script from the VA-Pi codebase:

git clone https://github.com/Lil-Shake/VA-Pi
cd VA-Pi/LlamaGen

# Install deps (note: folder name is "LlamaGen", not "llamaGen")
pip install -r requirements.txt

# You also need the VQ checkpoint from LlamaGen (see VA-Pi README)
bash scripts/autoregressive/sample_c2i.sh \
  /path/to/vq_ds16_c2i.pt \
  /path/to/c2i-vapi-xxl-384.pth \
  /path/to/output_samples

Notes:

The script defaults to FROM_FSDP=1. If your checkpoint is not FSDP-style, set the environment variable FROM_FSDP=0 before running the script.
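If you are unsure which value applies, the sketch below guesses it from the checkpoint's top-level keys and builds an environment for the script. It rests on an assumption carried over from upstream LlamaGen sampling code, not confirmed for this repo: an FSDP-style file is a bare state dict, while other styles nest the weights under a key such as `model`. The helper names `guess_from_fsdp` and `sampling_env` are hypothetical.

```python
import os

# Assumed wrapper keys for non-FSDP checkpoints; adjust if the script differs.
WRAPPER_KEYS = ("model", "module", "state_dict")


def guess_from_fsdp(checkpoint):
    """Return 1 if the loaded checkpoint looks like a bare state dict, else 0."""
    for key in WRAPPER_KEYS:
        if key in checkpoint and isinstance(checkpoint[key], dict):
            return 0  # weights appear to be nested under a wrapper key
    return 1


def sampling_env(from_fsdp):
    """Copy the current environment and set FROM_FSDP for the sampling script."""
    env = dict(os.environ)
    env["FROM_FSDP"] = str(int(from_fsdp))
    return env
```

The resulting environment can be passed to the script, for example with `subprocess.run(["bash", "scripts/autoregressive/sample_c2i.sh", vq_path, ckpt_path, out_dir], env=sampling_env(0), check=True)`.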

3) LlamaGen T2I (GenEval) sampling

cd VA-Pi/LlamaGen

# Requires: the VQ checkpoint, cached T5 features, and the GenEval prompts JSONL file
bash scripts/autoregressive/sample_t2i_geneval.sh \
  /path/to/vq_ds16_t2i.pt \
  /path/to/t2i-vapi-xl-256.pth \
  /path/to/t5_cache_dir \
  /path/to/geneval_prompts.jsonl \
  /path/to/output_geneval_samples
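Since the script takes five paths, a quick pre-flight check can catch a wrong argument before sampling starts. This is a minimal sketch with hypothetical helper names (`missing_paths`, `count_jsonl_records`); it only verifies that the inputs exist and that every non-empty line of the prompts file parses as JSON, and it assumes nothing about the fields inside each record.

```python
import json
from pathlib import Path


def missing_paths(*paths):
    """Return the subset of the given paths that do not exist."""
    return [str(p) for p in paths if not Path(p).exists()]


def count_jsonl_records(path):
    """Count JSON records in a JSONL file, raising on the first malformed line."""
    count = 0
    with open(path, encoding="utf-8") as fh:
        for lineno, line in enumerate(fh, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            try:
                json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(f"{path}:{lineno}: invalid JSON ({exc.msg})") from exc
            count += 1
    return count
```

Running `missing_paths(vq_ckpt, gpt_ckpt, t5_cache_dir, prompts_file)` first and `count_jsonl_records(prompts_file)` second gives a clear error message instead of a failure partway through generation.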

4) Janus-Pro GenEval inference

The Janus evaluation script supports either:

a full Hugging Face model repo (processor and config), or
a checkpoint folder containing consolidated.pth.

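Because this Hub repo provides a single .pth file, the simplest approach is to copy it into a folder under the name consolidated.pth and point --model-path at that folder: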

mkdir -p /tmp/janus-vapi
cp /path/to/t2i-vapi-janus-256.pth /tmp/janus-vapi/consolidated.pth

git clone https://github.com/Lil-Shake/VA-Pi
cd VA-Pi/Janus

pip install -r requirements.txt

# Return to the repo root, since the script path below is relative to it
cd ..

bash Janus/run_geneval_infer.sh \
  --prompts-dir /path/to/evaluation_metadata_geneval.jsonl \
  --base-model-path deepseek-ai/Janus-Pro-1B \
  --model-path /tmp/janus-vapi \
  --reason-prompt /path/to/reasoning_prompt.txt \
  --save-root /path/to/output_geneval_samples
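The folder staging step (the `mkdir` and `cp` commands above) can also be done from Python, which is convenient right after downloading with `hf_hub_download`. A minimal sketch; the function name `stage_janus_checkpoint` is hypothetical, and only the target file name `consolidated.pth` comes from the instructions above.

```python
import shutil
from pathlib import Path


def stage_janus_checkpoint(pth_file, target_dir):
    """Copy a downloaded .pth file to <target_dir>/consolidated.pth.

    Returns the destination path. The folder is created if it is missing,
    and an existing consolidated.pth is overwritten.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    dest = target / "consolidated.pth"
    shutil.copyfile(pth_file, dest)
    return dest
```

After calling `stage_janus_checkpoint(ckpt, "/tmp/janus-vapi")`, pass `/tmp/janus-vapi` as `--model-path` exactly as in the command above.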

Citation

@misc{vapi2025,
  title={VA-$\pi$: Variational Policy Alignment for Pixel-Aware Autoregressive Generation},
  author={Xinyao Liao and Qiyuan He and Kai Xu and Xiaoye Qu and Yicong Li and Wei Wei and Angela Yao},
  year={2025},
  eprint={2512.19680},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2512.19680}
}