+---
+license: mit
+pipeline_tag: text-to-image
+tags:
+- text-to-image
+- image-generation
+- autoregressive
+- reinforcement-learning
+- alignment
+- llamagen
+- janus
+---
+# VA-π aligned checkpoints (VA-Pi)
+This repo hosts **post-trained checkpoints** for the paper **“VA-π: Variational Policy Alignment for Pixel-Aware Autoregressive Generation”**.
+- Code: https://github.com/Lil-Shake/VA-Pi
+- Project page: https://lil-shake.github.io/va-pi.github.io/
+- arXiv: https://arxiv.org/abs/2512.19680
+> These weights are provided as **PyTorch `.pth`** files. Such files are pickle-based, and unpickling can execute arbitrary code, so only load weights you trust.
+---
+## Files
+### LlamaGen C2I (ImageNet class-to-image)
+`c2i/`
+- `c2i-vapi-xl-384.pth`
+- `c2i-vapi-xxl-384.pth`
+- `c2i-ste-xxl-384.pth` (STE-finetuned checkpoint)
+- `c2i-pt-xxl-384-decoder.pth` (post-trained tokenizer checkpoint)
+### T2I (two tracks)
+`t2i/`
+- `t2i-vapi-xl-256.pth` (**LlamaGen** T2I aligned checkpoint)
+- `t2i-vapi-janus-256.pth` (**Janus-Pro-1B** aligned checkpoint)
+---
+## Quickstart
+### 1) Download a weight file from Hugging Face
+~~~python
+from huggingface_hub import hf_hub_download
+ckpt = hf_hub_download(
+    repo_id="LilShake66/VA-Pi",
+    filename="c2i/c2i-vapi-xxl-384.pth",  # or another file above
+)
+print("downloaded:", ckpt)
+~~~
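
One way to act on the "only load weights you trust" advice is to check that the downloaded file matches a SHA-256 digest you obtained from a source you trust (the Hub file page typically shows one for LFS-tracked files). The sketch below uses only the standard library. The helper names `sha256_of` and `matches_expected` are illustrative and not part of the VA-Pi codebase, and a matching digest only confirms the file is the one that was published, not that its contents are safe.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as f:
        # Read fixed-size chunks so multi-gigabyte checkpoints fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare a file's digest with an expected hex string (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

With the `ckpt` path from the snippet above, `matches_expected(ckpt, expected)` returns `True` only when `expected` is the digest of the file on disk.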
+---
+### 2) LlamaGen C2I sampling (recommended entry point)
+Use the official script from the VA-Pi codebase:
+~~~bash
+git clone https://github.com/Lil-Shake/VA-Pi
+cd VA-Pi/LlamaGen
+# Install deps (note: folder name is "LlamaGen", not "llamaGen")
+pip install -r requirements.txt
+# You also need the VQ checkpoint from LlamaGen (see VA-Pi README)
+bash scripts/autoregressive/sample_c2i.sh \
+  /path/to/vq_ds16_c2i.pt \
+  /path/to/c2i-vapi-xxl-384.pth \
+  /path/to/output_samples
+~~~
+Note:
+- The script defaults to `FROM_FSDP=1`. If your checkpoint is not FSDP-style, set `FROM_FSDP=0` in the environment before running it.
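
The note above does not define "FSDP-style". As a rough sketch, assume it means the file holds a flat mapping from parameter names to tensors, while other checkpoints nest the weights under a wrapper key such as `model`, `module`, or `state_dict`. That interpretation, the key list, and the helper name `suggest_from_fsdp` are assumptions for illustration; check the sampling script in the VA-Pi repo for the authoritative behavior.

```python
from collections.abc import Mapping

# Wrapper keys this sketch treats as "not FSDP-style". This list is an
# assumption for illustration, not taken from the VA-Pi scripts.
WRAPPER_KEYS = ("model", "module", "state_dict")


def suggest_from_fsdp(checkpoint):
    """Return 1 if the object looks like a flat state dict, else 0."""
    if not isinstance(checkpoint, Mapping):
        raise TypeError("expected a dict-like checkpoint object")
    for key in WRAPPER_KEYS:
        # A nested mapping under a wrapper key suggests wrapped weights.
        if isinstance(checkpoint.get(key), Mapping):
            return 0
    return 1
```

To try it, load a checkpoint you trust with `torch.load(path, map_location="cpu")` and pass the resulting object to `suggest_from_fsdp`.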
+---
+### 3) LlamaGen T2I (GenEval) sampling
+~~~bash
+cd VA-Pi/LlamaGen
+# You need: VQ checkpoint + cached T5 features + GenEval prompts JSONL
+bash scripts/autoregressive/sample_t2i_geneval.sh \
+  /path/to/vq_ds16_t2i.pt \
+  /path/to/t2i-vapi-xl-256.pth \
+  /path/to/t5_cache_dir \
+  /path/to/geneval_prompts.jsonl \
+  /path/to/output_geneval_samples
+~~~
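
Before launching a long sampling run, it can help to confirm that the prompts file parses as JSONL. The sketch below assumes each line is a JSON object with a `prompt` field. That field name and the helper `load_prompts` are assumptions for illustration, so adjust them to match the file you actually use.

```python
import json


def load_prompts(jsonl_path, field="prompt"):
    """Read one JSON object per line and return the values of `field`."""
    prompts = []
    with open(jsonl_path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)  # raises ValueError on malformed JSON
            if field not in record:
                raise KeyError(f"line {line_no}: missing {field!r}")
            prompts.append(record[field])
    return prompts
```

Calling `len(load_prompts("/path/to/geneval_prompts.jsonl"))` then gives the number of prompts the run will cover, and a malformed line fails early instead of partway through sampling.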
+---
+### 4) Janus-Pro GenEval inference
+The Janus evaluation script supports either:
+- a **full HF model repo** (processor+config), or
+- a **checkpoint folder** containing `consolidated.pth`.
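+Since this Hub repo provides a single `.pth` file, the simplest approach is to copy it into a folder under the name `consolidated.pth` and pass that folder as `--model-path`: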
+~~~bash
+mkdir -p /tmp/janus-vapi
+cp /path/to/t2i-vapi-janus-256.pth /tmp/janus-vapi/consolidated.pth
+git clone https://github.com/Lil-Shake/VA-Pi
+cd VA-Pi/Janus
+pip install -r requirements.txt
+# Back to repo root for the provided script path
+cd ..
+bash Janus/run_geneval_infer.sh \
+  --prompts-dir /path/to/evaluation_metadata_geneval.jsonl \
+  --base-model-path deepseek-ai/Janus-Pro-1B \
+  --model-path /tmp/janus-vapi \
+  --reason-prompt /path/to/reasoning_prompt.txt \
+  --save-root /path/to/output_geneval_samples
+~~~
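
The `mkdir` and `cp` steps above can also be done from Python, which is convenient when the checkpoint was fetched with `hf_hub_download`. This is a minimal sketch using only the standard library; the helper name `stage_checkpoint` is illustrative and not part of the VA-Pi codebase.

```python
import shutil
from pathlib import Path


def stage_checkpoint(src, dst_dir, name="consolidated.pth"):
    """Copy `src` into `dst_dir` under `name` and return the folder path."""
    src = Path(src)
    if not src.is_file():
        raise FileNotFoundError(src)
    dst_dir = Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    # copyfile follows symlinks, so a path that links into a download
    # cache is copied as a regular file.
    shutil.copyfile(src, dst_dir / name)
    return dst_dir
```

For example, `stage_checkpoint(hf_hub_download(repo_id="LilShake66/VA-Pi", filename="t2i/t2i-vapi-janus-256.pth"), "/tmp/janus-vapi")` produces the folder passed as `--model-path` in the command above.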
+---
+## Citation
+~~~bibtex
+@misc{vapi2025,
+  title={VA-$\pi$: Variational Policy Alignment for Pixel-Aware Autoregressive Generation},
+  author={Xinyao Liao and Qiyuan He and Kai Xu and Xiaoye Qu and Yicong Li and Wei Wei and Angela Yao},
+  year={2025},
+  eprint={2512.19680},
+  archivePrefix={arXiv},
+  primaryClass={cs.CV},
+  url={https://arxiv.org/abs/2512.19680}
+}
+~~~