shijiecao

AI & ML interests

None yet

Recent Activity

upvoted a paper about 15 hours ago

Universal YOCO for Efficient Depth Scaling

upvoted a paper about 2 months ago

HySparse: A Hybrid Sparse Attention Architecture with Oracle Token Selection and KV Cache Sharing

upvoted a paper 3 months ago

MiMo-V2-Flash Technical Report

Organizations

None yet

upvoted a paper about 15 hours ago

Universal YOCO for Efficient Depth Scaling

Paper • 2604.01220 • Published 1 day ago • 11

upvoted a paper about 2 months ago

HySparse: A Hybrid Sparse Attention Architecture with Oracle Token Selection and KV Cache Sharing

Paper • 2602.03560 • Published Feb 3 • 49

upvoted a paper 3 months ago

MiMo-V2-Flash Technical Report

Paper • 2601.02780 • Published Jan 6 • 37

upvoted a collection 4 months ago

MiMo-V2-Flash

Collection

MiMo-V2-Flash Series • 2 items • Updated Dec 17, 2025 • 23

liked 2 models 4 months ago

XiaomiMiMo/MiMo-V2-Flash-Base

Text Generation • 310B • Updated Dec 17, 2025 • 240 • 48

XiaomiMiMo/MiMo-V2-Flash

Text Generation • 310B • Updated Feb 27 • 49.1k • 693

upvoted a paper 8 months ago

Less Is More: Training-Free Sparse Attention with Global Locality for Efficient Reasoning

Paper • 2508.07101 • Published Aug 9, 2025 • 14

authored 2 papers 10 months ago

BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation

Paper • 2402.10631 • Published Feb 16, 2024 • 2

SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs

Paper • 2410.13276 • Published Oct 17, 2024 • 29

upvoted 2 papers 10 months ago

Rectified Sparse Attention

Paper • 2506.04108 • Published Jun 4, 2025 • 11

SeerAttention-R: Sparse Attention Adaptation for Long Reasoning

Paper • 2506.08889 • Published Jun 10, 2025 • 23

liked a model about 1 year ago

SeerAttention/SeerAttention-Llama-3.1-8B-AttnGates

Text Generation • Updated Mar 3, 2025 • 4.59k • 4
