Vectorizing the Trie: Efficient Constrained Decoding for LLM-based Generative Retrieval on Accelerators • Paper 2602.22647 • Published 5 days ago