	---
	license: mit
	pipeline_tag: text-to-image
	tags:
	- text-to-image
	- image-generation
	- autoregressive
	- reinforcement-learning
	- alignment
	- llamagen
	- janus
	---

	# VA-π aligned checkpoints (VA-Pi)

	This repo hosts post-trained checkpoints for the paper “VA-π: Variational Policy Alignment for Pixel-Aware Autoregressive Generation”.

	- Code: https://github.com/Lil-Shake/VA-Pi
	- Project page: https://lil-shake.github.io/va-pi.github.io/
	- arXiv: https://arxiv.org/abs/2512.19680

	> These weights are provided as PyTorch `.pth` files. The format is pickle-based, and unpickling untrusted data can execute arbitrary code, so only load weights you trust.
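
One way to act on this warning is to look inside a checkpoint before unpickling it. The sketch below assumes the file uses PyTorch's zip-based serialization (the default since PyTorch 1.6), where the pickled object graph is stored in a `data.pkl` member. It lists the globals that the pickle would import, using only the standard library and without executing anything. `pickle_globals` and `checkpoint_globals` are hypothetical helper names, the `STACK_GLOBAL` handling is a heuristic, and a clean listing is not proof that a file is safe.

```python
import pickletools
import zipfile


def pickle_globals(pickle_bytes):
    """List the 'module.name' globals a pickle stream references, without unpickling."""
    found = set()
    strings = []  # string constants seen so far (used by STACK_GLOBAL)
    for opcode, arg, _pos in pickletools.genops(pickle_bytes):
        if opcode.name == "GLOBAL":
            # Protocols 0-3: pickletools reports the argument as "module name".
            module, name = arg.split(" ", 1)
            found.add(module + "." + name)
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            # Protocol 4+: heuristic, assumes module and name are the last two strings.
            found.add(strings[-2] + "." + strings[-1])
        elif isinstance(arg, str):
            strings.append(arg)
    return found


def checkpoint_globals(path):
    """Scan every '*.pkl' member of a zip-format checkpoint and return the globals found."""
    if not zipfile.is_zipfile(path):
        raise ValueError(f"{path} is not a zip-format checkpoint")
    found = set()
    with zipfile.ZipFile(path) as archive:
        for member in archive.namelist():
            if member.endswith(".pkl"):
                found |= pickle_globals(archive.read(member))
    return found
```

For a plain state-dict checkpoint one would typically expect entries such as `collections.OrderedDict` and `torch` tensor-rebuild helpers. Names like `os.system` or `builtins.eval` are a reason not to load the file.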

	---

	## Files

	### LlamaGen C2I (ImageNet class-to-image)
	`c2i/`
	- `c2i-vapi-xl-384.pth`
	- `c2i-vapi-xxl-384.pth`
	- `c2i-ste-xxl-384.pth` (STE-finetuned checkpoint)
	- `c2i-pt-xxl-384-decoder.pth` (post-trained tokenizer decoder checkpoint)

	### T2I (text-to-image, two tracks)
	`t2i/`
	- `t2i-vapi-xl-256.pth` (LlamaGen T2I aligned checkpoint)
	- `t2i-vapi-janus-256.pth` (Janus-Pro-1B aligned checkpoint)
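
The file layout above can be captured in a small lookup table so that scripts refer to checkpoints by short name. This is a minimal sketch: the paths are copied from the list above, while `CHECKPOINTS`, `fetch_checkpoint`, and the injectable `download` argument are hypothetical conveniences rather than part of the VA-Pi codebase.

```python
REPO_ID = "LilShake66/VA-Pi"

# Short name -> repo-relative path, copied from the file list above.
CHECKPOINTS = {
    "c2i-vapi-xl-384": "c2i/c2i-vapi-xl-384.pth",
    "c2i-vapi-xxl-384": "c2i/c2i-vapi-xxl-384.pth",
    "c2i-ste-xxl-384": "c2i/c2i-ste-xxl-384.pth",
    "c2i-pt-xxl-384-decoder": "c2i/c2i-pt-xxl-384-decoder.pth",
    "t2i-vapi-xl-256": "t2i/t2i-vapi-xl-256.pth",
    "t2i-vapi-janus-256": "t2i/t2i-vapi-janus-256.pth",
}


def fetch_checkpoint(name, download=None):
    """Download one checkpoint by short name and return its local path."""
    if name not in CHECKPOINTS:
        raise KeyError(f"unknown checkpoint {name!r}; choose from {sorted(CHECKPOINTS)}")
    if download is None:
        # Imported lazily so the table can be used without huggingface_hub installed.
        from huggingface_hub import hf_hub_download
        download = hf_hub_download
    return download(repo_id=REPO_ID, filename=CHECKPOINTS[name])
```

Calling `fetch_checkpoint("t2i-vapi-janus-256")` would download that file into the local Hugging Face cache and return its path.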

	---

	## Quickstart

	### 1) Download a weight file from Hugging Face

	~~~python
	from huggingface_hub import hf_hub_download

	ckpt = hf_hub_download(
	    repo_id="LilShake66/VA-Pi",
	    filename="c2i/c2i-vapi-xxl-384.pth",  # or another file above
	)
	print("downloaded:", ckpt)
	~~~

	---

	### 2) LlamaGen C2I sampling (recommended entry point)

	Use the official script from the VA-Pi codebase:

	~~~bash
	git clone https://github.com/Lil-Shake/VA-Pi
	cd VA-Pi/LlamaGen

	# Install deps (note: folder name is "LlamaGen", not "llamaGen")
	pip install -r requirements.txt

	# You also need the VQ checkpoint from LlamaGen (see VA-Pi README)
	bash scripts/autoregressive/sample_c2i.sh \
	  /path/to/vq_ds16_c2i.pt \
	  /path/to/c2i-vapi-xxl-384.pth \
	  /path/to/output_samples
	~~~

	Notes:
	- The script defaults to `FROM_FSDP=1`. If your checkpoint was not saved in FSDP style, set the environment variable `FROM_FSDP=0` when invoking the script.
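
When the sampling script is driven from Python, the `FROM_FSDP` switch can be passed through the child process environment. The sketch below assumes the positional argument order shown in the command above (VQ checkpoint, AR checkpoint, output directory); `sample_c2i_command` and `run_sample_c2i` are hypothetical wrapper names, not part of the VA-Pi codebase.

```python
import os
import subprocess


def sample_c2i_command(vq_ckpt, ar_ckpt, out_dir, from_fsdp=True):
    """Build the argv list and environment for scripts/autoregressive/sample_c2i.sh."""
    cmd = [
        "bash",
        "scripts/autoregressive/sample_c2i.sh",
        str(vq_ckpt),
        str(ar_ckpt),
        str(out_dir),
    ]
    # Copy the current environment and override only FROM_FSDP.
    env = dict(os.environ, FROM_FSDP="1" if from_fsdp else "0")
    return cmd, env


def run_sample_c2i(llamagen_dir, vq_ckpt, ar_ckpt, out_dir, from_fsdp=True):
    """Run the script from inside the LlamaGen folder; raises if it exits non-zero."""
    cmd, env = sample_c2i_command(vq_ckpt, ar_ckpt, out_dir, from_fsdp)
    subprocess.run(cmd, cwd=llamagen_dir, env=env, check=True)
```

Splitting command construction from execution keeps the first function easy to inspect or log before anything is launched.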

	---

	### 3) LlamaGen T2I (GenEval) sampling

	~~~bash
	cd VA-Pi/LlamaGen

	# Required inputs: VQ checkpoint, cached T5 features, and the GenEval prompts JSONL file
	bash scripts/autoregressive/sample_t2i_geneval.sh \
	  /path/to/vq_ds16_t2i.pt \
	  /path/to/t2i-vapi-xl-256.pth \
	  /path/to/t5_cache_dir \
	  /path/to/geneval_prompts.jsonl \
	  /path/to/output_geneval_samples
	~~~
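
Because this step depends on several external inputs, a quick pre-flight check can save a failed run. The following is a minimal sketch, assuming only what the command above implies: two checkpoint files, a T5 cache directory, and a prompts file with one JSON value per line. `check_t2i_inputs` is a hypothetical helper and does not validate checkpoint contents or the prompt schema.

```python
import json
from pathlib import Path


def check_t2i_inputs(vq_ckpt, ar_ckpt, t5_cache_dir, prompts_jsonl):
    """Return a list of problems found; an empty list means the inputs look usable."""
    problems = []
    files = (
        ("VQ checkpoint", vq_ckpt),
        ("AR checkpoint", ar_ckpt),
        ("prompts JSONL", prompts_jsonl),
    )
    for label, path in files:
        if not Path(path).is_file():
            problems.append(f"{label} not found: {path}")
    if not Path(t5_cache_dir).is_dir():
        problems.append(f"T5 cache directory not found: {t5_cache_dir}")
    if Path(prompts_jsonl).is_file():
        with open(prompts_jsonl, encoding="utf-8") as handle:
            for line_no, line in enumerate(handle, start=1):
                if not line.strip():
                    continue  # tolerate blank lines
                try:
                    json.loads(line)
                except json.JSONDecodeError:
                    problems.append(f"line {line_no} of {prompts_jsonl} is not valid JSON")
    return problems
```

Printing the returned list before calling the sampling script gives one place to catch path typos.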

	---

	### 4) Janus-Pro GenEval inference

	The Janus evaluation script supports either:
	- a full HF model repo (processor+config), or
	- a checkpoint folder containing `consolidated.pth`.

	Since this Hub repo provides a single `.pth` file, the simplest way is to copy it into a folder under the name `consolidated.pth` and pass that folder as `--model-path`:

	~~~bash
	mkdir -p /tmp/janus-vapi
	cp /path/to/t2i-vapi-janus-256.pth /tmp/janus-vapi/consolidated.pth

	git clone https://github.com/Lil-Shake/VA-Pi
	cd VA-Pi/Janus

	pip install -r requirements.txt

	# Back to repo root for the provided script path
	cd ..

	bash Janus/run_geneval_infer.sh \
	  --prompts-dir /path/to/evaluation_metadata_geneval.jsonl \
	  --base-model-path deepseek-ai/Janus-Pro-1B \
	  --model-path /tmp/janus-vapi \
	  --reason-prompt /path/to/reasoning_prompt.txt \
	  --save-root /path/to/output_geneval_samples
	~~~
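
The staging step at the top of the command block (`mkdir -p` followed by `cp`) can also be done from Python, for example right after downloading the file. This is a minimal sketch; `stage_janus_checkpoint` is a hypothetical helper name, and the only convention it relies on is the `consolidated.pth` file name described above.

```python
import shutil
from pathlib import Path


def stage_janus_checkpoint(pth_file, target_dir):
    """Copy a single .pth file into target_dir as consolidated.pth and return the folder."""
    source = Path(pth_file)
    if not source.is_file():
        raise FileNotFoundError(source)
    folder = Path(target_dir)
    folder.mkdir(parents=True, exist_ok=True)  # same effect as `mkdir -p`
    shutil.copyfile(source, folder / "consolidated.pth")
    return folder
```

The returned folder is what would be passed as `--model-path`.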

	---

	## Citation

	~~~bibtex
	@misc{vapi2025,
	  title={VA-$\pi$: Variational Policy Alignment for Pixel-Aware Autoregressive Generation},
	  author={Xinyao Liao and Qiyuan He and Kai Xu and Xiaoye Qu and Yicong Li and Wei Wei and Angela Yao},
	  year={2025},
	  eprint={2512.19680},
	  archivePrefix={arXiv},
	  primaryClass={cs.CV},
	  url={https://arxiv.org/abs/2512.19680}
	}
	~~~