Peter

Tempo14

AI & ML interests

None yet

Recent Activity

updated a collection 1 day ago

robotic

updated a collection 1 day ago

updated a collection 3 days ago

Organizations

upvoted 2 articles 19 days ago

MiniMax Goes Sparse: Decoding M3's Attention from a Single Diagram

Article • AtlasCloud-AI • 29 days ago • 10

LeRobot Humanoid: An Open, Low-Cost, 3D-Printed Humanoid for Robot Learning

Article • VirgileBatto • May 21 • 62

upvoted a paper about 2 months ago

From Context to Skills: Can Language Models Learn from Context Skillfully?

Paper • 2604.27660 • Published May 3 • 171

upvoted an article about 2 months ago

DeepSeek-V4: a million-token context that agents can actually use

Article • burtenshaw • Apr 24 • 50

upvoted a paper about 2 months ago

How Much Is One Recurrence Worth? Iso-Depth Scaling Laws for Looped Language Models

Paper • 2604.21106 • Published Apr 27 • 10

upvoted a paper 2 months ago

Sessa: Selective State Space Attention

Paper • 2604.18580 • Published Apr 21 • 14

upvoted 11 papers 3 months ago

SWE-Skills-Bench: Do Agent Skills Actually Help in Real-World Software Engineering?

Paper • 2603.15401 • Published Mar 16 • 20

GradMem: Learning to Write Context into Memory with Test-Time Gradient Descent

Paper • 2603.13875 • Published Mar 14 • 36

Online Experiential Learning for Language Models

Paper • 2603.16856 • Published Mar 17 • 60

V-JEPA 2.1: Unlocking Dense Features in Video Self-Supervised Learning

Paper • 2603.14482 • Published Mar 15 • 36

Memento-Skills: Let Agents Design Agents

Paper • 2603.18743 • Published Mar 19 • 58

Effective Distillation to Hybrid xLSTM Architectures

Paper • 2603.15590 • Published Mar 16 • 34

Mixture-of-Depths Attention

Paper • 2603.15619 • Published Mar 16 • 81

Neural Thickets: Diverse Task Experts Are Dense Around Pretrained Weights

Paper • 2603.12228 • Published Mar 12 • 12

upvoted 2 articles 4 months ago

Did GPT 5.2 make a breakthrough discovery in theoretical physics?

Article • dlouapre • Feb 19 • 62

How to Use Multiple GPUs in Hugging Face Transformers: Device Map vs Tensor Parallelism

Article • ariG23498 • Feb 12 • 20

upvoted a paper 6 months ago

Nested Learning: The Illusion of Deep Learning Architectures

Paper • 2512.24695 • Published Dec 31, 2025 • 46
