\documentclass{article}
\usepackage{amsmath}

\title{A Quick Sample}
\author{Test}

\begin{document}
\maketitle

\section{Introduction}
This paper investigates the role of attention mechanisms in transformer architectures.
Recent work~\cite{vaswani2017} has shown that self-attention outperforms recurrence on many tasks.
See also~\cite{nonexistent2024} for context.
We propose a novel method, which we call FastAttn, that reduces the quadratic cost of attention to $O(n \log n)$.
Our approach is evaluated on three benchmarks and shows consistent improvement over the baselines.

\bibliographystyle{plain}
\bibliography{refs}

\end{document}