HySparse: A Hybrid Sparse Attention Architecture with Oracle Token Selection and KV Cache Sharing Paper • 2602.03560 • Published 3 days ago • 37
EmbRACE-3K: Embodied Reasoning and Action in Complex Environments Paper • 2507.10548 • Published Jul 14, 2025 • 37
SeerAttention-R: Sparse Attention Adaptation for Long Reasoning Paper • 2506.08889 • Published Jun 10, 2025 • 23
SeerAttention/SeerAttention-Decode-Qwen3-4B-AttnGates Text Generation • Updated Jun 9, 2025 • 205 • 2
SeerAttention/SeerAttention-Decode-R1-Distill-Qwen-14B-AttnGates Text Generation • Updated Jun 9, 2025 • 2
SeerAttention/SeerAttention-DeepSeek-R1-Distill-Qwen-32B-AttnGates Text Generation • Updated Mar 3, 2025 • 2 • 1