Improve model card and add metadata
This PR improves the model card by adding relevant metadata, including the `pipeline_tag` for better discoverability. It also expands the README to include links to the original paper, GitHub repository, and project page, provides a model zoo table for the different checkpoints available, and includes a sample usage command derived from the official repository.
README.md
CHANGED
@@ -1,9 +1,58 @@
---
license: apache-2.0
pipeline_tag: unconditional-image-generation
tags:
- image-generation
- autoregressive
---

# Autoregressive Image Generation with Masked Bit Modeling (BAR)

This repository contains the checkpoints for **BAR** (*masked **B**it **A**uto**R**egressive modeling*), a scalable framework for discrete visual generation that challenges the dominance of continuous pipelines.

[**Project Page**](https://bar-gen.github.io/) | [**Paper (arXiv)**](https://huggingface.co/papers/2602.09024) | [**GitHub**](https://github.com/amazon-far/BAR)

## Introduction

Contrary to the belief that discrete tokenizers are intrinsically inferior, BAR demonstrates that the performance gap arises primarily from the total number of bits allocated in the latent space. By equipping an autoregressive transformer with a **Masked Bit Modeling (MBM)** head, BAR predicts discrete tokens by progressively generating their constituent bits.

- **Scalability**: Supports arbitrary codebook sizes with reduced memory complexity.
- **Performance**: Achieves a state-of-the-art gFID of **0.99** on ImageNet-256.
- **Efficiency**: Offers faster convergence and significantly lower sampling costs compared to diffusion models.
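
To make the idea of predicting a token bit by bit more concrete, here is a minimal Python sketch. It is not the BAR implementation. It assumes 16-bit tokens (suggested only by the `bar_fsq_16bits_ft` tokenizer file name), reveals the bits in four steps whose sizes are chosen purely for illustration, and uses a hypothetical `predict_bit` callback as a stand-in for the MBM head. The two output counts at the end are likewise an illustrative comparison, not measured figures.

```python
import random

NUM_BITS = 16  # assumption: 16-bit tokens, i.e. a codebook of 2**16 entries


def token_to_bits(token: int, num_bits: int = NUM_BITS) -> list[int]:
    """Split a token index into its bits, most significant bit first."""
    return [(token >> shift) & 1 for shift in range(num_bits - 1, -1, -1)]


def bits_to_token(bits: list[int]) -> int:
    """Reassemble a token index from its bits (inverse of token_to_bits)."""
    token = 0
    for bit in bits:
        token = (token << 1) | bit
    return token


def generate_token(predict_bit, schedule, rng):
    """Fill in a fully masked bit vector over len(schedule) steps.

    predict_bit(position, known_bits) is a hypothetical stand-in for the
    MBM head; known_bits holds None at positions that are still masked.
    """
    num_bits = sum(schedule)
    bits = [None] * num_bits
    masked = list(range(num_bits))
    rng.shuffle(masked)  # illustrative choice: reveal bits in random order
    for step_size in schedule:
        chosen, masked = masked[:step_size], masked[step_size:]
        context = list(bits)  # bits revealed in one step share this snapshot
        for position in chosen:
            bits[position] = predict_bit(position, context)
    return bits_to_token(bits)


# Illustrative output widths for two ways of parameterising the head.
softmax_outputs = 2 ** NUM_BITS  # one logit per codebook entry
bitwise_outputs = NUM_BITS       # one logit per bit

# Toy "head" that always recovers the bits of a fixed target token.
target = 0xBEEF
target_bits = token_to_bits(target)
sampled = generate_token(
    lambda position, context: target_bits[position],
    schedule=[2, 2, 5, 7],  # illustrative step sizes that sum to NUM_BITS
    rng=random.Random(0),
)
```

In this toy setup the per-bit view needs far fewer outputs than a softmax over every codebook entry, which is one way to read the scalability point above; how the actual MBM head is built is described in the paper and repository.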

## Model Zoo

| Model | Config | Size | gFID | IS |
| ------------- | ------------- | ------------- | ------------- | ------------- |
| BAR-B/4 | [bar_b_patch4.yaml](https://github.com/amazon-far/BAR/blob/main/configs/generator/bar_b_patch4.yaml) | 416M | 2.34 | 274.7 |
| BAR-B/2 | [bar_b_patch2.yaml](https://github.com/amazon-far/BAR/blob/main/configs/generator/bar_b_patch2.yaml) | 415M | 1.35 | 293.4 |
| BAR-B | [bar_b.yaml](https://github.com/amazon-far/BAR/blob/main/configs/generator/bar_b.yaml) | 415M | 1.13 | 289.0 |
| BAR-L | [bar_l.yaml](https://github.com/amazon-far/BAR/blob/main/configs/generator/bar_l.yaml) | 1.1B | 0.99 | 296.9 |
| BAR-L-res512 | [bar_l_res512.yaml](https://github.com/amazon-far/BAR/blob/main/configs/generator/bar_l_res512.yaml) | 1.1B | 1.09 | 311.1 |

## Sample Usage

To generate samples using the weights in this repository, please follow the setup instructions in the [official GitHub repository](https://github.com/amazon-far/BAR). Below is an example command for generating samples with **BAR-L**:

```bash
torchrun --nnodes=1 --nproc_per_node=1 --rdzv-endpoint=localhost:9999 \
    sample_imagenet.py \
    config=configs/generator/bar_l.yaml \
    experiment.output_dir="bar_l" \
    experiment.generator_checkpoint=assets/generator/bar_l.bin \
    experiment.tokenizer_checkpoint=assets/tokenizer/bar_fsq_16bits_ft.bin \
    model.generator.guidance_scale=5.3 \
    model.generator.mbm_head.randomize_temperature=3.0 \
    'model.generator.mbm_head.tokens_allocation=[2,2,5,7]'
```
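
The command above passes its settings as dotted `key=value` pairs. As a rough sketch of how such overrides are commonly merged into a nested configuration, and not the repository's actual parsing code, the hypothetical helper `apply_overrides` below builds a plain dictionary from them. The helper names and the value-conversion rule are assumptions made for illustration.

```python
import ast


def parse_value(text):
    """Best-effort conversion of an override value; falls back to the raw string."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return text


def apply_overrides(config, overrides):
    """Merge dotted key=value strings into a nested dict (hypothetical helper)."""
    for item in overrides:
        dotted_key, _, raw_value = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = config
        for name in parents:
            # Create intermediate levels on demand.
            node = node.setdefault(name, {})
        node[leaf] = parse_value(raw_value)
    return config


# The same overrides as in the command, as the script would receive them
# after the shell has removed the quotes.
overrides = [
    "experiment.output_dir=bar_l",
    "experiment.generator_checkpoint=assets/generator/bar_l.bin",
    "model.generator.guidance_scale=5.3",
    "model.generator.mbm_head.randomize_temperature=3.0",
    "model.generator.mbm_head.tokens_allocation=[2,2,5,7]",
]
config = apply_overrides({}, overrides)
```

Quoting the last argument in the shell command keeps the square brackets from being interpreted by the shell, so the script receives the list literal unchanged.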

## Citation

```bibtex
@article{yu2026autoregressive,
  title = {Autoregressive Image Generation with Masked Bit Modeling},
  author = {Yu, Qihang and Liu, Qihao and He, Ju and Zhang, Xinyang and Liu, Yang and Chen, Liang-Chieh and Chen, Xi},
  journal = {arXiv preprint},
  year = {2026}
}
```