\documentclass{article}
\usepackage{amsmath}

\title{A Quick Sample}
\author{Test}
\begin{document}
\maketitle

\section{Introduction}
This paper investigates the role of attention mechanisms in transformer
architectures. Recent work~\cite{vaswani2017} has shown that self-attention
outperforms recurrence on many tasks. See also~\cite{nonexistent2024} for
context.

We propose a novel method, which we call FastAttn, that reduces the quadratic
cost of attention to $O(n \log n)$. The method is evaluated on three
benchmarks and shows consistent improvement over the baselines.
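
For reference, the quadratic cost can be seen in the standard scaled
dot-product attention of~\cite{vaswani2017}. The symbols $Q$, $K$, $V$, $n$,
and $d_k$ below follow that formulation and are assumed here for illustration
only; they do not describe the internals of FastAttn:
\begin{equation*}
  \operatorname{Attention}(Q, K, V)
    = \operatorname{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V ,
\end{equation*}
where $Q$ and $K$ are $n \times d_k$ matrices of queries and keys for a
sequence of length $n$. Forming the $n \times n$ score matrix $Q K^{\top}$
takes $O(n^2 d_k)$ time and $O(n^2)$ memory, which is the term any
sub-quadratic method must avoid computing in full.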

\bibliographystyle{plain}
\bibliography{refs}

\end{document}