zhang

kekueknu2

AI & ML interests

None yet

upvoted a paper 5 months ago

daVinci-Dev: Agent-native Mid-training for Software Engineering

Paper • 2601.18418 • Published Jan 26 • 126

upvoted an article over 1 year ago

Article

From Zero to Reasoning Hero: How DeepSeek-R1 Leverages Reinforcement Learning to Master Complex Reasoning

NormalUhr • Feb 4, 2025 • 17

upvoted an article almost 2 years ago

Article

Illustrating Reinforcement Learning from Human Feedback (RLHF)

natolambert, LouisCastricato, lvwerra, Dahoas • Dec 9, 2022 • 417

upvoted a collection about 2 years ago

LLM papers

It is a collection of papers that are useful in studying LLM. • 14 items • Updated Apr 3, 2024 • 16

upvoted a paper about 2 years ago

Chain-of-Thought Reasoning Without Prompting

Paper • 2402.10200 • Published Feb 15, 2024 • 111

upvoted 2 collections about 2 years ago

Foundation AI Papers

Curated List of Must-Reads on LLM reasoning at Temus AI team • 135 items • Updated Jun 15, 2024 • 36

Reading Papers

231 items • Updated Jul 28, 2025 • 13