transformers-community/sink_cache

joaogante committed on May 22, 2025

Commit 7bbd1e8 · 1 Parent(s): 936e74c

add readme

Files changed (1)

README.md (added) +28 -0

	@@ -0,0 +1,28 @@

+---
+library_name: transformers
+tags:
+  - custom_generate
+---
+## Description
+Implementation of the cache introduced in the [Attention Sinks paper](https://arxiv.org/abs/2309.17453). It allows the
+model to generate beyond the length of its context window, without losing fluency in the conversation. As it discards
+past tokens, the model will lose the ability to generate tokens that depend on the context that was discarded.
+This implementation should match the `SinkCache` class present in `transformers<4.53.0`.
+![Sink Cache diagram from the original paper](https://arxiv.org/html/2309.17453v4/x1.png)
+## Base model:
+## Model compatibility
+## Additional Arguments
+## Output Type changes
+## Example usage
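
The README's Description says the cache discards past tokens so that generation can continue beyond the context window. The sketch below illustrates that eviction idea only, following the Attention Sinks paper: keep the first few "sink" positions plus the most recent positions, and drop everything in between. It is not the code in this repository. The function name `sink_cache_keep` is hypothetical, the parameter names `window_length` and `num_sink_tokens` are chosen for illustration, and the sketch works on token positions rather than key/value tensors, ignoring details such as positional-embedding handling.

```python
from typing import List


def sink_cache_keep(
    positions: List[int], window_length: int, num_sink_tokens: int
) -> List[int]:
    """Return the token positions a sink-style cache would retain.

    Hypothetical helper for illustration; not part of transformers.
    """
    if not 0 <= num_sink_tokens < window_length:
        raise ValueError("need 0 <= num_sink_tokens < window_length")

    # Nothing is evicted until the cache exceeds its window.
    if len(positions) <= window_length:
        return list(positions)

    # Keep the initial sink tokens plus the most recent tokens,
    # so the retained total equals window_length.
    num_recent = window_length - num_sink_tokens
    return list(positions[:num_sink_tokens]) + list(positions[-num_recent:])


if __name__ == "__main__":
    # 10 tokens seen, window of 6, 2 sink tokens: positions 2..5 are dropped.
    print(sink_cache_keep(list(range(10)), window_length=6, num_sink_tokens=2))
```

In this toy run, positions 2 through 5 are evicted, which is why the model can no longer produce tokens that depend on that part of the context.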