Kristinn Vikar
KristinnVikarJ
AI & ML interests
None yet
Organizations
None yet
to-read
- RedPajama: an Open Dataset for Training Large Language Models
  Paper • 2411.12372 • Published • 58
- SageAttention2 Technical Report: Accurate 4 Bit Attention for Plug-and-play Inference Acceleration
  Paper • 2411.10958 • Published • 57
- Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens
  Paper • 2508.01191 • Published • 240
- MSign: An Optimizer Preventing Training Instability in Large Language Models via Stable Rank Restoration
  Paper • 2602.01734 • Published • 32
models 0
None public yet
datasets 0
None public yet