Update README.md

README.md CHANGED

@@ -50,7 +50,7 @@ It avoids the quadratic cost of full self-attention by summarizing per-speaker m
 - 🧠 **Speaker-Aware Memory**: Structured per-speaker representation of dialogue context.
 - ⚡ **Linear Attention**: Efficient and scalable to long dialogues.
 - 🧩 **Pretrained Transformer Compatible**: Can plug into frozen or fine-tuned BERT models.
-- 🪶 **Lightweight**:
+- 🪶 **Lightweight**: ~4M parameters less than 2-layer with strong MLM performance improvements.

 ---
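The hunk context above says the model avoids the quadratic cost of full self-attention by summarizing per-speaker memory, and the bullets advertise linear attention. A minimal NumPy sketch of those two ideas follows; the function names and the `elu(x) + 1` feature map are illustrative assumptions, not taken from this repository's code:

```python
import numpy as np

def elu_feature(x):
    # phi(x) = elu(x) + 1: a positive feature map commonly used to
    # kernelize attention so it can be computed in linear time.
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(q, k, v):
    """Kernelized attention in O(N * d^2) instead of O(N^2 * d).

    All keys/values are summarized once into a (d, d_v) matrix, so cost
    grows linearly with dialogue length N rather than quadratically.
    """
    qf, kf = elu_feature(q), elu_feature(k)
    kv = kf.T @ v                    # (d, d_v): one summary of all keys/values
    z = kf.sum(axis=0)               # (d,): normalization term
    return (qf @ kv) / (qf @ z)[:, None]

def speaker_memory(hidden, speaker_ids):
    """Per-speaker memory: mean-pool token states by speaker id."""
    return {s: hidden[speaker_ids == s].mean(axis=0)
            for s in np.unique(speaker_ids)}

# Usage: 6 token states of width 4, from two speakers.
rng = np.random.default_rng(0)
h = rng.normal(size=(6, 4))
out = linear_attention(h, h, h)                       # (6, 4)
mem = speaker_memory(h, np.array([0, 0, 1, 1, 1, 0])) # one vector per speaker
```

Hooking such a module onto frozen BERT hidden states (as the "Pretrained Transformer Compatible" bullet suggests) only adds the projection and memory parameters, which is consistent with the diff's small-parameter-count claim.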