AI & ML interests
I like to fine-tune the small models of the Doge series.
Trainable Dynamic Mask Sparse Attention: Bridging Efficiency and Effectiveness in Long-Context Language Models
upvoted a paper 6 months ago
upvoted a paper about 1 year ago