rahim-xelpmoc's collection: papers to read
YOLO-Master: MOE-Accelerated with Specialized Transformers for Enhanced Real-time Detection • arXiv 2512.23273 • 15 upvotes
A 58-Addition, Rank-23 Scheme for General 3x3 Matrix Multiplication • arXiv 2512.21980 • 3 upvotes
Step-DeepResearch Technical Report • arXiv 2512.20491 • 87 upvotes
SAM Audio: Segment Anything in Audio • arXiv 2512.18099 • 24 upvotes
LoPA: Scaling dLLM Inference via Lookahead Parallel Decoding • arXiv 2512.16229 • 16 upvotes
CASA: Cross-Attention via Self-Attention for Efficient Vision-Language Fusion • arXiv 2512.19535 • 12 upvotes
Physics of Language Models: Part 4.1, Architecture Design and the Magic of Canon Layers • arXiv 2512.17351 • 28 upvotes
Fast and Accurate Causal Parallel Decoding using Jacobi Forcing • arXiv 2512.14681 • 42 upvotes
Janus: Disaggregating Attention and Experts for Scalable MoE Inference • arXiv 2512.13525 • 6 upvotes
QwenLong-L1.5: Post-Training Recipe for Long-Context Reasoning and Memory Management • arXiv 2512.12967 • 111 upvotes
Error-Free Linear Attention is a Free Lunch: Exact Solution from Continuous-Time Dynamics • arXiv 2512.12602 • 44 upvotes
Sliding Window Attention Adaptation • arXiv 2512.10411 • 21 upvotes
mHC: Manifold-Constrained Hyper-Connections • arXiv 2512.24880 • 322 upvotes
FlashSampling: Fast and Memory-Efficient Exact Sampling • arXiv 2603.15854 • 9 upvotes
Note: done
arXiv 2603.15031 • 180 upvotes
Mixture-of-Depths Attention • arXiv 2603.15619 • 80 upvotes
Flash-KMeans: Fast and Memory-Efficient Exact K-Means • arXiv 2603.09229 • 82 upvotes
Note: done
TriAttention: Efficient Long Reasoning with Trigonometric KV Compression • arXiv 2604.04921 • 107 upvotes
Flux Attention: Context-Aware Hybrid Attention for Efficient LLMs Inference • arXiv 2604.07394 • 16 upvotes