Mao Song

MaoSong2022

1 16 6

https://maosong2022.github.io/

AI & ML interests

None yet

Recent Activity

upvoted an article about 2 months ago

Tool Use, Unified

upvoted an article about 2 months ago

Open-R1: a fully open reproduction of DeepSeek-R1

liked a Space 3 months ago

HuggingFaceTB/smol-training-playbook

Organizations

upvoted 2 articles about 2 months ago

Article

Tool Use, Unified

Rocketknight1 • Aug 12, 2024 • 120

Article

Open-R1: a fully open reproduction of DeepSeek-R1

eliebak, lvwerra, lewtun • Jan 28, 2025 • 890

upvoted an article 4 months ago

Article

Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries

aminediroHF, qgallouedec, kashif, lewtun, edbeeching, albertvillanova, nouamanetazi, lvwerra, sergiopaniego • Mar 10 • 165

upvoted a collection 5 months ago

Finetuned Eagle Models

Collection

[ICLR 2026] Official Implementation of paper 'Investigating Redundancy in Multimodal Large Language Models with Multiple Vision Encoders' • 3 items • Updated Feb 13 • 1

upvoted 3 articles 5 months ago

Article

Aligning to What? Rethinking Agent Generalization in MiniMax M2

MiniMax-AI • Oct 30, 2025 • 43

Article

Why Did MiniMax M2 End Up as a Full Attention Model?

MiniMax-AI • Oct 30, 2025 • 80

Article

SeeMoE: Implementing a MoE Vision Language Model from Scratch

AviSoori1x • Jun 23, 2024 • 40

upvoted a paper 6 months ago

NextFlow: Unified Sequential Modeling Activates Multimodal Understanding and Generation

Paper • 2601.02204 • Published Jan 5 • 64

upvoted a collection 7 months ago

Olmo 3

Collection

Artifacts for the Olmo 3 release. • 7 items • Updated Mar 2 • 171

upvoted an article 12 months ago

Article

SmolLM3: smol, multilingual, long-context reasoner

eliebak, cmpatino, anton-l, edbeeching, m-ric, nouamanetazi, akseljoonas, guipenedo, hynky, clefourrier, SaylorTwift, kashif, qgallouedec, hlarcher, glutamatt, Xenova, reach-vb, ngxson, craffel, lewtun, loubnabnl, lvwerra, thomwolf • Jul 8, 2025 • 780

upvoted 2 articles about 1 year ago

Article

You could have designed state of the art positional encoding

FL33TW00D-HF • Nov 25, 2024 • 488

Article

nanoVLM: The simplest repository to train your VLM in pure PyTorch

ariG23498, lusxvr, andito, sergiopaniego, merve, pcuenq, reach-vb • May 21, 2025 • 261

upvoted 4 papers over 1 year ago

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published Feb 19, 2025 • 219

Chimera: Improving Generalist Model with Domain-Specific Experts

Paper • 2412.05983 • Published Dec 8, 2024 • 9

Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling

Paper • 2412.05271 • Published Dec 6, 2024 • 162

TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation

Paper • 2412.03069 • Published Dec 4, 2024 • 34