joaogante committed on
Commit 7bbd1e8 · 1 Parent(s): 936e74c

add readme

Files changed (1): README.md +28 -0
---
library_name: transformers
tags:
- custom_generate
---

## Description
Implementation of the cache introduced in the [Attention Sinks paper](https://arxiv.org/abs/2309.17453). It allows the
model to generate beyond the length of its context window without losing fluency in the conversation. As it discards
past tokens, the model will lose the ability to generate tokens that depend on the context that was discarded.

This implementation should match the `SinkCache` class present in `transformers<4.53.0`.

![Sink Cache diagram from the original paper](https://arxiv.org/html/2309.17453v4/x1.png)
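The retention rule described above — keep the first few "sink" tokens plus a sliding window of the most recent tokens, and evict everything in between — can be sketched in plain Python. This is an illustrative sketch of the eviction policy only, not the repository's actual cache code, and the sink/window sizes used are arbitrary:

```python
def sink_cache_keep_indices(seq_len: int, num_sink_tokens: int, window_length: int) -> list[int]:
    """Return the token positions retained by an attention-sink cache.

    Keeps the first `num_sink_tokens` positions (the attention sinks) plus the
    most recent `window_length - num_sink_tokens` positions. Positions in
    between are evicted, which is why the model can no longer generate tokens
    that depend on the discarded context.
    """
    if seq_len <= window_length:
        # Nothing to evict yet: the whole sequence fits in the cache.
        return list(range(seq_len))
    sinks = list(range(num_sink_tokens))
    recent = list(range(seq_len - (window_length - num_sink_tokens), seq_len))
    return sinks + recent
```

For example, with 2 sink tokens and a total window of 6, a 10-token sequence keeps positions `[0, 1, 6, 7, 8, 9]`: the cache size stays bounded no matter how long generation runs.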
## Base model


## Model compatibility


## Additional Arguments


## Output Type changes


## Example usage
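Since this cache should match the legacy `SinkCache` class, a minimal sketch of equivalent usage on `transformers<4.53.0` looks like the following. The model name, window length, and sink-token count are illustrative assumptions, not values fixed by this repository:

```python
def generate_with_sink_cache(prompt: str,
                             model_name: str = "Qwen/Qwen2.5-0.5B-Instruct",
                             window_length: int = 256,
                             num_sink_tokens: int = 4) -> str:
    """Generate text with the legacy SinkCache (transformers<4.53.0).

    All default argument values here are illustrative choices for the sketch.
    """
    # Imported lazily so the sketch can be loaded without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer, SinkCache

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # SinkCache keeps `num_sink_tokens` initial tokens plus a sliding window,
    # letting generation run past the model's context length.
    past_key_values = SinkCache(window_length=window_length,
                                num_sink_tokens=num_sink_tokens)
    output = model.generate(**inputs, past_key_values=past_key_values,
                            max_new_tokens=64)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```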