Echo2334
smy111
AI & ML interests
None yet
Recent Activity
upvoted a paper about 1 month ago
Rethinking Cross-Layer Information Routing in Diffusion Transformers
upvoted a paper about 1 month ago
Full Attention Strikes Back: Transferring Full Attention into Sparse within Hundred Training Steps
new activity 6 months ago
RTP-LLM/Qwen3-Coder-30B-A3B-Instruct-RTPurbo: DuoAttention (ICLR 2025)