
Lucas Chen

leocnj

AI & ML interests

NLP, speech, multimodal, deep learning

Recent Activity

upvoted an article about 1 month ago

SyGra: The One-Stop Framework for Building Data for LLMs and SLMs

upvoted an article about 1 month ago

Mixture of Experts (MoEs) in Transformers

upvoted an article about 1 month ago

TRL v1.0: Post-Training Library Built to Move with the Field

Organizations

None yet

upvoted 4 articles about 1 month ago

Article

SyGra: The One-Stop Framework for Building Data for LLMs and SLMs

ServiceNow-AI • Sep 22, 2025 • 14

Article

Mixture of Experts (MoEs) in Transformers

ariG23498, pcuenq, merve, IlyasMoutawwakil, ArthurZ, sergiopaniego, Molbap • Feb 26 • 159

Article

TRL v1.0: Post-Training Library Built to Move with the Field

qgallouedec, stevhliu, pcuenq, sergiopaniego • Mar 31 • 51

Article

Multimodal Embedding & Reranker Models with Sentence Transformers

tomaarsen • Apr 9 • 59

upvoted an article 2 months ago

Article

Efficient LLM Pretraining: Packed Sequences and Masked Attention

sirluk • Oct 7, 2024 • 71

liked a model 3 months ago

google/functiongemma-270m-it

Text Generation • Updated Jan 14 • 93.6k • 985

upvoted an article 6 months ago

Article

You could have designed state of the art positional encoding

FL33TW00D-HF • Nov 25, 2024 • 478

liked 2 Spaces 7 months ago

The Smol Training Playbook

📚

3.17k

The secrets to building world-class LLMs

The Ultra-Scale Playbook

🌌

3.84k

The ultimate guide to training LLM on large GPU Clusters

liked a model 8 months ago

deepseek-ai/DeepSeek-V3-0324

Text Generation • 685B • Updated Mar 27, 2025 • 549k • 3.11k

liked 5 datasets 9 months ago

liked a Space 10 months ago

FLUX.1 Krea Dev

📚

371

Generate images from text prompts

liked 4 datasets 10 months ago

interstellarninja/hermes_reasoning_tool_use

Viewer • Updated Dec 26, 2025 • 51k • 2.31k • 167

nvidia/Llama-Nemotron-Post-Training-Dataset

Viewer • Updated May 8, 2025 • 3.91M • 4.68k • 662

MathAndMagic/function-calling

Viewer • Updated Feb 2, 2024 • 86.9k • 300 • 9

BluebrainAI/Function_calling_SFT

Viewer • Updated Mar 24, 2025 • 348k • 223 • 3
