| --- |
| license: mit |
| pipeline_tag: text-to-image |
| tags: |
| - text-to-image |
| - image-generation |
| - autoregressive |
| - reinforcement-learning |
| - alignment |
| - llamagen |
| - janus |
| --- |
| |
| # VA-π aligned checkpoints (VA-Pi) |
|
|
| This repo hosts **post-trained checkpoints** for the paper **“VA-π: Variational Policy Alignment for Pixel-Aware Autoregressive Generation”**. |
|
|
| - Code: https://github.com/Lil-Shake/VA-Pi |
| - Project page: https://lil-shake.github.io/va-pi.github.io/ |
| - arXiv: https://arxiv.org/abs/2512.19680 |
|
|
| > These weights are provided as **PyTorch `.pth`** files. Such files are pickle-based and can execute code when loaded, so only load weights you trust. |
|
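One way to act on that warning is to check a file's digest before loading it and to restrict unpickling to tensors. The following is a minimal sketch, not part of the VA-Pi codebase: `sha256_of` and `load_checkpoint` are illustrative names, `expected_hex` is a hypothetical digest you would obtain from a source you trust, and it is an untested assumption that these checkpoints load under `weights_only=True`.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def load_checkpoint(path, expected_hex=None):
    """Optionally verify the digest, then load the file on CPU.

    expected_hex is a hypothetical, user-supplied digest (assumption).
    """
    if expected_hex is not None and sha256_of(path) != expected_hex.lower():
        raise ValueError(f"SHA-256 mismatch for {path}")
    # Imported lazily so the hashing helper works without PyTorch installed.
    import torch

    # weights_only=True limits unpickling to tensors and basic containers.
    return torch.load(path, map_location="cpu", weights_only=True)
```

If `weights_only=True` rejects a checkpoint, that is a signal to inspect the file rather than to fall back silently to unrestricted loading.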
|
| --- |
|
|
| ## Files |
|
|
| ### LlamaGen C2I (ImageNet class-to-image) |
| `c2i/` |
| - `c2i-vapi-xl-384.pth` |
| - `c2i-vapi-xxl-384.pth` |
| - `c2i-ste-xxl-384.pth` (STE fine-tuned checkpoint) |
| - `c2i-pt-xxl-384-decoder.pth` (post-trained tokenizer decoder checkpoint) |
|
|
| ### T2I (text-to-image, two tracks) |
| `t2i/` |
| - `t2i-vapi-xl-256.pth` (**LlamaGen** T2I aligned checkpoint) |
| - `t2i-vapi-janus-256.pth` (**Janus-Pro-1B** aligned checkpoint) |
|
|
| --- |
|
|
| ## Quickstart |
|
|
| ### 1) Download a weight file from Hugging Face |
|
|
| ~~~python |
| from huggingface_hub import hf_hub_download |
| |
| ckpt = hf_hub_download( |
| repo_id="LilShake66/VA-Pi", |
| filename="c2i/c2i-vapi-xxl-384.pth", # or another file above |
| ) |
| print("downloaded:", ckpt) |
| ~~~ |
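
To fetch several checkpoints at once, the same call can be looped over the files listed in the "Files" section. This is a minimal sketch: `CHECKPOINT_FILES`, `select_files`, and `download_checkpoints` are illustrative names, not part of the VA-Pi codebase, and the download step needs network access.

```python
# Repo-relative paths, copied from the "Files" section of this card.
REPO_ID = "LilShake66/VA-Pi"
CHECKPOINT_FILES = [
    "c2i/c2i-vapi-xl-384.pth",
    "c2i/c2i-vapi-xxl-384.pth",
    "c2i/c2i-ste-xxl-384.pth",
    "c2i/c2i-pt-xxl-384-decoder.pth",
    "t2i/t2i-vapi-xl-256.pth",
    "t2i/t2i-vapi-janus-256.pth",
]


def select_files(prefix):
    """Return the checkpoint paths under a folder prefix such as 'c2i' or 't2i'."""
    folder = prefix.rstrip("/") + "/"
    return [f for f in CHECKPOINT_FILES if f.startswith(folder)]


def download_checkpoints(prefix):
    """Download every checkpoint under `prefix`; returns {repo path: local path}."""
    # Imported lazily so the listing helpers work without huggingface_hub.
    from huggingface_hub import hf_hub_download

    return {
        f: hf_hub_download(repo_id=REPO_ID, filename=f)
        for f in select_files(prefix)
    }
```

For example, `download_checkpoints("t2i")` would fetch both T2I files and return their local cache paths.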
|
|
| --- |
|
|
| ### 2) LlamaGen C2I sampling (recommended entry point) |
|
|
| Use the official script from the VA-Pi codebase: |
|
|
| ~~~bash |
| git clone https://github.com/Lil-Shake/VA-Pi |
| cd VA-Pi/LlamaGen |
| |
| # Install deps (note: folder name is "LlamaGen", not "llamaGen") |
| pip install -r requirements.txt |
| |
| # You also need the VQ checkpoint from LlamaGen (see VA-Pi README) |
| bash scripts/autoregressive/sample_c2i.sh \ |
| /path/to/vq_ds16_c2i.pt \ |
| /path/to/c2i-vapi-xxl-384.pth \ |
| /path/to/output_samples |
| ~~~ |
|
|
| Note: |
| - The script defaults to `FROM_FSDP=1`. If your checkpoint is not FSDP-style, set `FROM_FSDP=0` in the environment before running it. |
|
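If you drive the sampling script from Python, the `FROM_FSDP` override can be passed through the subprocess environment. This is a sketch under the assumption that the script reads `FROM_FSDP` from the environment, as the note above describes; `build_c2i_command` and `run_c2i` are hypothetical helpers, and all paths are placeholders.

```python
import os
import subprocess


def build_c2i_command(vq_ckpt, gpt_ckpt, out_dir, from_fsdp=True):
    """Return (argv, env) for scripts/autoregressive/sample_c2i.sh."""
    argv = ["bash", "scripts/autoregressive/sample_c2i.sh", vq_ckpt, gpt_ckpt, out_dir]
    env = dict(os.environ)  # copy, so the parent environment is left untouched
    env["FROM_FSDP"] = "1" if from_fsdp else "0"
    return argv, env


def run_c2i(vq_ckpt, gpt_ckpt, out_dir, from_fsdp=True, cwd="VA-Pi/LlamaGen"):
    """Run the sampling script from `cwd` so its relative path resolves."""
    argv, env = build_c2i_command(vq_ckpt, gpt_ckpt, out_dir, from_fsdp)
    subprocess.run(argv, env=env, cwd=cwd, check=True)
```

Calling `run_c2i(..., from_fsdp=False)` is then the Python equivalent of prefixing the shell command with `FROM_FSDP=0`.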
|
| --- |
|
|
| ### 3) LlamaGen T2I (GenEval) sampling |
|
|
| ~~~bash |
| cd VA-Pi/LlamaGen |
| |
| # You need: VQ checkpoint + cached T5 features + GenEval prompts JSONL |
| bash scripts/autoregressive/sample_t2i_geneval.sh \ |
| /path/to/vq_ds16_t2i.pt \ |
| /path/to/t2i-vapi-xl-256.pth \ |
| /path/to/t5_cache_dir \ |
| /path/to/geneval_prompts.jsonl \ |
| /path/to/output_geneval_samples |
| ~~~ |
|
|
| --- |
|
|
| ### 4) Janus-Pro GenEval inference |
|
|
| The Janus evaluation script supports either: |
| - a **full HF model repo** (processor+config), or |
| - a **checkpoint folder** containing `consolidated.pth`. |
|
|
| Since this Hub repo provides a single `.pth` file, the simplest way is to copy it into a folder under the name `consolidated.pth` and pass that folder as `--model-path`: |
|
|
| ~~~bash |
| mkdir -p /tmp/janus-vapi |
| cp /path/to/t2i-vapi-janus-256.pth /tmp/janus-vapi/consolidated.pth |
| |
| # Clone the repo (skip if you already cloned it in step 2) |
| git clone https://github.com/Lil-Shake/VA-Pi |
| cd VA-Pi/Janus |
| |
| pip install -r requirements.txt |
| |
| # Back to repo root for the provided script path |
| cd .. |
| |
| bash Janus/run_geneval_infer.sh \ |
| --prompts-dir /path/to/evaluation_metadata_geneval.jsonl \ |
| --base-model-path deepseek-ai/Janus-Pro-1B \ |
| --model-path /tmp/janus-vapi \ |
| --reason-prompt /path/to/reasoning_prompt.txt \ |
| --save-root /path/to/output_geneval_samples |
| ~~~ |
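
The copy step can also be done from Python right after downloading. This is a minimal sketch: `stage_janus_checkpoint` is an illustrative helper, and its only assumption about the evaluation script is the one stated above, namely that it looks for `consolidated.pth` inside the checkpoint folder.

```python
import shutil
from pathlib import Path


def stage_janus_checkpoint(src_pth, dst_dir):
    """Copy a single .pth file into dst_dir as consolidated.pth; return dst_dir."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(src_pth, dst / "consolidated.pth")
    return dst
```

For example, `stage_janus_checkpoint(ckpt, "/tmp/janus-vapi")`, where `ckpt` is the local path returned by `hf_hub_download` for `t2i/t2i-vapi-janus-256.pth`; the returned folder is what `--model-path` expects.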
|
|
| --- |
|
|
| ## Citation |
|
|
| ~~~bibtex |
| @misc{vapi2025, |
| title={VA-$\pi$: Variational Policy Alignment for Pixel-Aware Autoregressive Generation}, |
| author={Xinyao Liao and Qiyuan He and Kai Xu and Xiaoye Qu and Yicong Li and Wei Wei and Angela Yao}, |
| year={2025}, |
| eprint={2512.19680}, |
| archivePrefix={arXiv}, |
| primaryClass={cs.CV}, |
| url={https://arxiv.org/abs/2512.19680} |
| } |
| ~~~ |