Language Modeling with Hyperspherical Flows
Paper: arXiv:2605.11125
By Justin Deschenaux and Caglar Gulcehre.
This repo hosts the pretrained checkpoints for Language Modeling with Hyperspherical Flows (𝕊-FLM). For the abstract, training/sampling code, and reproduction scripts, see the companion code repo: jdeschena/s-flm.
The checkpoints cover 𝕊-FLM and the baselines we compare against (AR, MDLM, Duo, FLM, CANDI), trained on TinyGSM (250k steps, SmolLM-135M tokenizer) and OpenWebText (1M steps, GPT-2 tokenizer):
tinygsm/{ar,mdlm,duo}.ckpt
tinygsm/candi/{lr3e-4,lr1e-3}.ckpt
tinygsm/flm/{default,caps}.ckpt
tinygsm/sfm/{sphere_dit_truncated_fixed_no_renorm,sphere_dit_truncated_adaptive_no_renorm,sphere_arch_truncated_adaptive_no_renorm}.ckpt
owt/{ar,mdlm,duo,flm,sfm}.ckpt
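The brace patterns above are shorthand for individual files in the repo. A minimal sketch that expands them into the concrete checkpoint paths (filenames taken verbatim from the listing above; the `expand` helper is ours, not part of the repo):

```python
# Expand the {a,b,c}-style patterns from the checkpoint listing
# into full paths, one per file.

def expand(prefix, names, suffix=".ckpt"):
    """Join each name with the shared prefix and .ckpt suffix."""
    return [f"{prefix}{n}{suffix}" for n in names]

checkpoints = (
    expand("tinygsm/", ["ar", "mdlm", "duo"])
    + expand("tinygsm/candi/", ["lr3e-4", "lr1e-3"])
    + expand("tinygsm/flm/", ["default", "caps"])
    + expand("tinygsm/sfm/", [
        "sphere_dit_truncated_fixed_no_renorm",
        "sphere_dit_truncated_adaptive_no_renorm",
        "sphere_arch_truncated_adaptive_no_renorm",
    ])
    + expand("owt/", ["ar", "mdlm", "duo", "flm", "sfm"])
)

for path in checkpoints:
    print(path)
```

Any of the printed paths can be passed as the filename argument to `huggingface-cli download`, as in the example below.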
huggingface-cli download jdeschena/s-flm tinygsm/duo.ckpt --local-dir ./checkpoints
Loading and sampling are handled by the code repo — see jdeschena/s-flm for the scripts.
@misc{deschenaux2026languagemodelinghypersphericalflows,
  title={Language Modeling with Hyperspherical Flows},
  author={Justin Deschenaux and Caglar Gulcehre},
  year={2026},
  eprint={2605.11125},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2605.11125},
}