Update README.md
README.md
CHANGED
@@ -19,6 +19,9 @@ The core idea behind TRIM-KV is to learn the intrinsic importance of each key–
 The retention score is query-agnostic and captures the long-term utility of tokens. This is different from attention scores, which are query-dependent: they capture the short-term utility for predicting the next token and are recomputed at every step, making them local, myopic, and highly dependent on the transient decoding state.
 
 
+<a href="https://arxiv.org/pdf/2512.03324"><img src="https://img.shields.io/badge/arxiv-2512.03324-red?style=for-the-badge"></a>
+
+
 ### Why TRIM-KV?
 
 It's fast