Brian Lin (lzhbrian)

AI & ML interests
None yet

Recent Activity
- updated a collection about 11 hours ago: NN Arch Components
- updated a collection 5 days ago: Loop
- updated a collection 5 days ago: NN Arch Components

Organizations
None yet
NN Arch
- Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space
  Paper • 2512.24617 • Published • 55
- Recursive Language Models
  Paper • 2512.24601 • Published • 53
- Nested Learning: The Illusion of Deep Learning Architectures
  Paper • 2512.24695 • Published • 34
- DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models
  Paper • 2512.02556 • Published • 249

NN Arch Components
- Deep Delta Learning
  Paper • 2601.00417 • Published • 29
- mHC: Manifold-Constrained Hyper-Connections
  Paper • 2512.24880 • Published • 240
- VersatileFFN: Achieving Parameter Efficiency in LLMs via Adaptive Wide-and-Deep Reuse
  Paper • 2512.14531 • Published • 12
- Stronger Normalization-Free Transformers
  Paper • 2512.10938 • Published • 19

Loop

Linear Attention
- Higher-order Linear Attention
  Paper • 2510.27258 • Published • 14
- RWKV-7 "Goose" with Expressive Dynamic State Evolution
  Paper • 2503.14456 • Published • 153
- xLSTM 7B: A Recurrent LLM for Fast and Efficient Inference
  Paper • 2503.13427 • Published • 3
- MoM: Linear Sequence Modeling with Mixture-of-Memories
  Paper • 2502.13685 • Published • 36

TTT