jingjia (Jingjia Peng)

upvoted an article 4 months ago

Chat Templates: An End to the Silent Performance Killer

Article • Rocketknight1 • Oct 3, 2023 • 32

upvoted a paper 4 months ago

Mobile-Agent-v3.5: Multi-platform Fundamental GUI Agents

Paper • 2602.16855 • Published Feb 15 • 51

upvoted an article 5 months ago

SeeMoE: Implementing a MoE Vision Language Model from Scratch

Article • AviSoori1x • Jun 23, 2024 • 40

upvoted a collection 5 months ago

NEPA

Collection • 5 items • Updated Dec 19, 2025 • 12

upvoted an article 9 months ago

From GRPO to DAPO and GSPO: What, Why, and How

Article • NormalUhr • Aug 9, 2025 • 128

upvoted an article 10 months ago

The Large Language Model Course

Article • mlabonne • Jan 16, 2025 • 230

upvoted an article about 1 year ago

Illustrating Reinforcement Learning from Human Feedback (RLHF)

Article • natolambert, LouisCastricato, lvwerra, Dahoas • Dec 9, 2022 • 418

upvoted a paper about 1 year ago

EXP-Bench: Can AI Conduct AI Research Experiments?

Paper • 2505.24785 • Published May 30, 2025 • 24

upvoted a paper over 1 year ago

Curie: Toward Rigorous and Automated Scientific Experimentation with AI Agents

Paper • 2502.16069 • Published Feb 22, 2025 • 20

