Papers
arxiv:2605.11125

Language Modeling with Hyperspherical Flows

Published on May 11
Authors:
,

Abstract

Latent flow language models operating on hyperspheres improve upon discrete diffusion models by enabling parallel generation while addressing issues with one-hot vector representations and semantic noise interpretation.

AI-generated summary

Discrete Diffusion Language Models progressed rapidly as an alternative to autoregressive (AR) models, motivated by their parallel generation abilities. However, for tractability, discrete diffusion models sample from a factorized distribution, which is less expressive than AR. Recent Flow Language Models (FLMs) apply continuous flows to language, transporting noise to data with a deterministic ODE that avoids factorized sampling. FLMs operate on one-hot vectors whose dimension scales with the vocabulary size, making FLMs costly to train. Moreover, since all distinct one-hot embeddings are equidistant in ell_2, adding Gaussian noise does not have a clear semantic interpretation (unlike images, where Gaussian noise progressively degrades structure). We introduce S-FLM, a latent FLM in the hypersphere. S-FLM generates sequences by rotating vectors in S^{d-1} along a velocity field learned with cross-entropy, avoiding the overhead of materializing one-hot vectors. Previous FLMs match AR in Generative Perplexity (Gen.\ PPL), but samples with high likelihood are not necessarily correct in verifiable domains such as math and code. S-FLM substantially improves continuous flow language models on large-vocabulary reasoning and closes the gap to masked diffusion under standard-temperature sampling (T=1), while a gap remains under optimized low-temperature (T=0.1) decoding.

Community

Sign up or log in to comment

Models citing this paper 1

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2605.11125 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2605.11125 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.