---
license: mit
pipeline_tag: text-to-image
tags:
- text-to-image
- image-generation
- autoregressive
- reinforcement-learning
- alignment
- llamagen
- janus
---
# VA-π aligned checkpoints (VA-Pi)
This repo hosts **post-trained checkpoints** for the paper **“VA-π: Variational Policy Alignment for Pixel-Aware Autoregressive Generation”**.
- Paper / code: https://github.com/Lil-Shake/VA-Pi
- Project page: https://lil-shake.github.io/va-pi.github.io/
- arXiv: https://arxiv.org/abs/2512.19680
> These weights are provided as **PyTorch `.pth`** files. The format is pickle-based, so loading a file can execute code. Only load weights you trust.
---
## Files
### LlamaGen C2I (ImageNet class-to-image)
`c2i/`
- `c2i-vapi-xl-384.pth`
- `c2i-vapi-xxl-384.pth`
- `c2i-ste-xxl-384.pth` (STE fine-tuned checkpoint)
- `c2i-pt-xxl-384-decoder.pth` (post-trained tokenizer checkpoint)
### T2I (two tracks)
`t2i/`
- `t2i-vapi-xl-256.pth` (**LlamaGen** T2I aligned checkpoint)
- `t2i-vapi-janus-256.pth` (**Janus-Pro-1B** aligned checkpoint)
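
The listing above can be kept as a small lookup table so that scripts request the right in-repo path. This is a minimal sketch: the names `CHECKPOINTS` and `repo_filename` are hypothetical helpers, not part of the VA-Pi codebase, and the folder layout is taken from the list above.

```python
# Checkpoint file names as listed above, grouped by their folder in the Hub repo.
CHECKPOINTS = {
    "c2i": [
        "c2i-vapi-xl-384.pth",
        "c2i-vapi-xxl-384.pth",
        "c2i-ste-xxl-384.pth",
        "c2i-pt-xxl-384-decoder.pth",
    ],
    "t2i": [
        "t2i-vapi-xl-256.pth",
        "t2i-vapi-janus-256.pth",
    ],
}


def repo_filename(name: str) -> str:
    """Return the in-repo path (folder/file) for a listed checkpoint name."""
    for folder, files in CHECKPOINTS.items():
        if name in files:
            return f"{folder}/{name}"
    raise KeyError(f"unknown checkpoint: {name}")
```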
---
## Quickstart
### 1) Download a weight file from Hugging Face
~~~python
from huggingface_hub import hf_hub_download
ckpt = hf_hub_download(
    repo_id="LilShake66/VA-Pi",
    filename="c2i/c2i-vapi-xxl-384.pth",  # or another file above
)
print("downloaded:", ckpt)
~~~
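Because `.pth` files are pickle-based, it can help to record a digest of the downloaded file and to load it with PyTorch's restricted unpickler. The sketch below is illustrative only: `sha256_of_file` and `load_checkpoint_cpu` are hypothetical helper names, no reference digest is published here, and it is an assumption that these checkpoints load under `weights_only=True` (if they do not, `torch.load` raises an error instead of running arbitrary code). The layout of the loaded object is not documented here, so inspect its top-level keys before use.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def load_checkpoint_cpu(path):
    """Load a checkpoint on CPU with the restricted (weights-only) unpickler."""
    import torch  # imported lazily so the hashing helper works without PyTorch

    # weights_only=True limits unpickling to tensors and basic containers.
    return torch.load(path, map_location="cpu", weights_only=True)
```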
---
### 2) LlamaGen C2I sampling (recommended entry point)
Use the official script from the VA-Pi codebase:
~~~bash
git clone https://github.com/Lil-Shake/VA-Pi
cd VA-Pi/LlamaGen
# Install deps (note: folder name is "LlamaGen", not "llamaGen")
pip install -r requirements.txt
# You also need the VQ checkpoint from LlamaGen (see VA-Pi README)
bash scripts/autoregressive/sample_c2i.sh \
  /path/to/vq_ds16_c2i.pt \
  /path/to/c2i-vapi-xxl-384.pth \
  /path/to/output_samples
~~~
Notes:
- The script defaults to `FROM_FSDP=1`. If your checkpoint is not FSDP-style, set `FROM_FSDP=0` in the environment before running it.
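
The same invocation can be driven from Python, which makes the `FROM_FSDP` setting explicit. This is a sketch under stated assumptions: the script path and its three positional arguments are copied from the command above, while `build_c2i_command`, `build_env`, and `run_c2i_sampling` are hypothetical wrapper names, not part of the VA-Pi codebase.

```python
import os
import subprocess


def build_c2i_command(vq_ckpt, gpt_ckpt, out_dir):
    """Assemble the sample_c2i.sh call with its three positional arguments."""
    return [
        "bash",
        "scripts/autoregressive/sample_c2i.sh",
        str(vq_ckpt),
        str(gpt_ckpt),
        str(out_dir),
    ]


def build_env(from_fsdp, base_env=None):
    """Copy the environment and set FROM_FSDP to "1" or "0"."""
    env = dict(os.environ if base_env is None else base_env)
    env["FROM_FSDP"] = "1" if from_fsdp else "0"
    return env


def run_c2i_sampling(llamagen_dir, vq_ckpt, gpt_ckpt, out_dir, from_fsdp=True):
    """Run the sampling script from inside the VA-Pi/LlamaGen directory."""
    subprocess.run(
        build_c2i_command(vq_ckpt, gpt_ckpt, out_dir),
        cwd=llamagen_dir,
        env=build_env(from_fsdp),
        check=True,  # raise CalledProcessError if the script exits non-zero
    )
```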
---
### 3) LlamaGen T2I (GenEval) sampling
~~~bash
cd VA-Pi/LlamaGen
# You need: the VQ checkpoint, cached T5 features, and the GenEval prompts JSONL
bash scripts/autoregressive/sample_t2i_geneval.sh \
  /path/to/vq_ds16_t2i.pt \
  /path/to/t2i-vapi-xl-256.pth \
  /path/to/t5_cache_dir \
  /path/to/geneval_prompts.jsonl \
  /path/to/output_geneval_samples
~~~
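Since this run depends on several inputs, a quick pre-flight check can catch a missing path or a malformed prompts file before any GPU time is spent. The sketch below only assumes that the prompts file holds one JSON value per line; it makes no assumption about the fields inside each record. `missing_inputs` and `count_jsonl_records` are hypothetical helper names.

```python
import json
from pathlib import Path


def missing_inputs(files, dirs):
    """Return the paths that are not present as a file or directory."""
    missing = [str(p) for p in files if not Path(p).is_file()]
    missing += [str(p) for p in dirs if not Path(p).is_dir()]
    return missing


def count_jsonl_records(path):
    """Count non-empty lines in a JSONL file, checking each one parses as JSON."""
    count = 0
    with Path(path).open("r", encoding="utf-8") as fh:
        for line_no, line in enumerate(fh, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            try:
                json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(f"{path}: line {line_no} is not valid JSON") from exc
            count += 1
    return count
```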
---
### 4) Janus-Pro GenEval inference
The Janus evaluation script supports either:
- a **full HF model repo** (processor+config), or
- a **checkpoint folder** containing `consolidated.pth`.
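Since this Hub repo provides a single `.pth` file, the simplest route is to copy it into a folder under the name `consolidated.pth` and pass that folder as `--model-path`: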
~~~bash
mkdir -p /tmp/janus-vapi
cp /path/to/t2i-vapi-janus-256.pth /tmp/janus-vapi/consolidated.pth
git clone https://github.com/Lil-Shake/VA-Pi
cd VA-Pi/Janus
pip install -r requirements.txt
# Back to repo root for the provided script path
cd ..
bash Janus/run_geneval_infer.sh \
  --prompts-dir /path/to/evaluation_metadata_geneval.jsonl \
  --base-model-path deepseek-ai/Janus-Pro-1B \
  --model-path /tmp/janus-vapi \
  --reason-prompt /path/to/reasoning_prompt.txt \
  --save-root /path/to/output_geneval_samples
~~~
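The `mkdir` and `cp` steps above can also be done from Python, for example right after downloading the file with `hf_hub_download`. This is a minimal sketch: `prepare_janus_folder` is a hypothetical helper name, and the only convention it relies on is the `consolidated.pth` file name described above.

```python
import shutil
from pathlib import Path


def prepare_janus_folder(pth_file, folder):
    """Copy a single .pth file into `folder` as consolidated.pth."""
    pth_file = Path(pth_file)
    if not pth_file.is_file():
        raise FileNotFoundError(pth_file)
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)  # same effect as `mkdir -p`
    target = folder / "consolidated.pth"
    shutil.copyfile(pth_file, target)
    return target  # pass `folder` (not `target`) as --model-path
```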
---
## Citation
~~~bibtex
@misc{vapi2025,
title={VA-$\pi$: Variational Policy Alignment for Pixel-Aware Autoregressive Generation},
author={Xinyao Liao and Qiyuan He and Kai Xu and Xiaoye Qu and Yicong Li and Wei Wei and Angela Yao},
year={2025},
eprint={2512.19680},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2512.19680}
}
~~~