Improve model card and add metadata
#1 by nielsr (HF Staff) - opened
README.md CHANGED
@@ -1,9 +1,58 @@
- Paper:
---
license: apache-2.0
pipeline_tag: unconditional-image-generation
tags:
- image-generation
- autoregressive
---

# Autoregressive Image Generation with Masked Bit Modeling (BAR)

This repository contains the checkpoints for **BAR** (*masked **B**it **A**uto**R**egressive modeling*), a scalable framework for discrete visual generation that challenges the dominance of continuous pipelines.

[**Project Page**](https://bar-gen.github.io/) | [**Paper (arXiv)**](https://huggingface.co/papers/2602.09024) | [**GitHub**](https://github.com/amazon-far/BAR)

## Introduction

Contrary to the belief that discrete tokenizers are intrinsically inferior to continuous ones, BAR demonstrates that the performance gap arises primarily from the total number of bits allocated in the latent space. By equipping an autoregressive transformer with a **Masked Bit Modeling (MBM)** head, BAR predicts each discrete token by progressively generating its constituent bits.

- **Scalability**: Supports arbitrary codebook sizes with reduced memory complexity.
- **Performance**: Achieves a state-of-the-art gFID of **0.99** on ImageNet-256 (BAR-L).
- **Efficiency**: Offers faster convergence and significantly lower sampling costs than diffusion models.

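The bit-level view described above can be illustrated with a toy sketch. It is not the BAR implementation: the 16-bit token width is inferred from the tokenizer checkpoint name `bar_fsq_16bits_ft.bin`, the most-significant-bit-first ordering is an assumption, and treating `tokens_allocation=[2,2,5,7]` from the sample command as "bits revealed per step" is also an assumption. The helper names (`token_to_bits`, `bits_to_token`, `generate_progressively`) are hypothetical, and the learned MBM head is replaced by a stand-in that simply reads the target bits.

```python
import random


def token_to_bits(token: int, num_bits: int) -> list[int]:
    """Binary digits of `token`, most significant bit first (assumed ordering)."""
    if not 0 <= token < 2 ** num_bits:
        raise ValueError(f"token {token} does not fit in {num_bits} bits")
    return [(token >> shift) & 1 for shift in range(num_bits - 1, -1, -1)]


def bits_to_token(bits: list[int]) -> int:
    """Inverse of token_to_bits: fold the bits back into one integer index."""
    token = 0
    for bit in bits:
        token = (token << 1) | bit
    return token


def generate_progressively(predict_bit, num_bits, schedule, seed=0):
    """Fill a fully masked bit vector in len(schedule) steps.

    `schedule[i]` is the number of bits revealed at step i. The reveal order
    is a random permutation here, which is a placeholder choice.
    """
    if sum(schedule) != num_bits:
        raise ValueError("schedule must reveal every bit exactly once")
    rng = random.Random(seed)
    masked = list(range(num_bits))
    rng.shuffle(masked)
    bits = [None] * num_bits  # None marks a bit that is still masked
    for count in schedule:
        chosen, masked = masked[:count], masked[count:]
        for pos in chosen:
            # A real head would condition on the bits revealed so far.
            bits[pos] = predict_bit(pos, bits)
    return bits


NUM_BITS = 16            # assumed from the tokenizer checkpoint name
SCHEDULE = [2, 2, 5, 7]  # assumed meaning: bits revealed per step

target = 40000
target_bits = token_to_bits(target, NUM_BITS)

# Stand-in "predictor" that returns the true bit for each position.
generated = generate_progressively(
    lambda pos, bits: target_bits[pos], NUM_BITS, SCHEDULE
)
print(bits_to_token(generated) == target)

# A softmax over the whole codebook needs 2**16 = 65536 logits per token,
# whereas a per-bit head would need only 16 binary predictions.
print(2 ** NUM_BITS, NUM_BITS)
```

The point of the sketch is the size argument behind the scalability claim: the number of quantities to predict grows with the number of bits, not with the codebook size, so doubling the bit budget does not square the output layer.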
## Model Zoo

| Model | Config | Size | gFID | IS |
| ------------- | ------------- | ------------- | ------------- | ------------- |
| BAR-B/4 | [bar_b_patch4.yaml](https://github.com/amazon-far/BAR/blob/main/configs/generator/bar_b_patch4.yaml) | 416M | 2.34 | 274.7 |
| BAR-B/2 | [bar_b_patch2.yaml](https://github.com/amazon-far/BAR/blob/main/configs/generator/bar_b_patch2.yaml) | 415M | 1.35 | 293.4 |
| BAR-B | [bar_b.yaml](https://github.com/amazon-far/BAR/blob/main/configs/generator/bar_b.yaml) | 415M | 1.13 | 289.0 |
| BAR-L | [bar_l.yaml](https://github.com/amazon-far/BAR/blob/main/configs/generator/bar_l.yaml) | 1.1B | 0.99 | 296.9 |
| BAR-L-res512 | [bar_l_res512.yaml](https://github.com/amazon-far/BAR/blob/main/configs/generator/bar_l_res512.yaml) | 1.1B | 1.09 | 311.1 |

## Sample Usage

To generate samples using the weights in this repository, please follow the setup instructions in the [official GitHub repository](https://github.com/amazon-far/BAR). Below is an example command for generating samples with **BAR-L**:

```bash
torchrun --nnodes=1 --nproc_per_node=1 --rdzv-endpoint=localhost:9999 \
    sample_imagenet.py \
    config=configs/generator/bar_l.yaml \
    experiment.output_dir="bar_l" \
    experiment.generator_checkpoint=assets/generator/bar_l.bin \
    experiment.tokenizer_checkpoint=assets/tokenizer/bar_fsq_16bits_ft.bin \
    model.generator.guidance_scale=5.3 \
    model.generator.mbm_head.randomize_temperature=3.0 \
    'model.generator.mbm_head.tokens_allocation=[2,2,5,7]'
```

## Citation

```bibtex
@article{yu2026autoregressive,
  title   = {Autoregressive Image Generation with Masked Bit Modeling},
  author  = {Yu, Qihang and Liu, Qihao and He, Ju and Zhang, Xinyang and Liu, Yang and Chen, Liang-Chieh and Chen, Xi},
  journal = {arXiv preprint},
  year    = {2026}
}
```