Vectorizing the Trie: Efficient Constrained Decoding for LLM-based Generative Retrieval on Accelerators Paper • 2602.22647 • Published Feb 26