Papers
arxiv:2604.09168

ELT: Elastic Looped Transformers for Visual Generation

Published on Apr 10
· Submitted by
taesiri
on Apr 13
Authors:
,
,
,

Abstract

Elastic Looped Transformers utilize recurrent transformer architecture with weight-sharing and intra-loop self-distillation to achieve parameter-efficient visual generation with adjustable computational cost and generation quality.

AI-generated summary

We introduce Elastic Looped Transformers (ELT), a highly parameter-efficient class of visual generative models based on a recurrent transformer architecture. While conventional generative models rely on deep stacks of unique transformer layers, our approach employs iterative, weight-shared transformer blocks to drastically reduce parameter counts while maintaining high synthesis quality. To effectively train these models for image and video generation, we propose the idea of Intra-Loop Self Distillation (ILSD), where student configurations (intermediate loops) are distilled from the teacher configuration (maximum training loops) to ensure consistency across the model's depth in a single training step. Our framework yields a family of elastic models from a single training run, enabling Any-Time inference capability with dynamic trade-offs between computational cost and generation quality, with the same parameter count. ELT significantly shifts the efficiency frontier for visual synthesis. With 4times reduction in parameter count under iso-inference-compute settings, ELT achieves a competitive FID of 2.0 on class-conditional ImageNet 256 times 256 and FVD of 72.8 on class-conditional UCF-101.

Community

Sign up or log in to comment

Get this paper in your agent:

hf papers read 2604.09168
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2604.09168 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2604.09168 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2604.09168 in a Space README.md to link it from this page.

Collections including this paper 2