Leon

Leon-Leee · yucc-leon

AI & ML interests

LLMs, code generation, chatbot, workflows

Recent Activity

liked a dataset 2 days ago

osunlp/QUEST-Mid-Training-Data

liked a model about 2 months ago

BAAI/OpenSeek-Mid-v1

updated a dataset 2 months ago

Leon-Leee/zh-wiki-disambig

upvoted a collection 3 months ago

CodeScout

RL-trained code search agents (1.7B, 4B, 14B) that outperform 2–18× larger models using only a Unix terminal. 📄 arxiv.org/abs/2603.17829 • 12 items • Updated Mar 19 • 8

upvoted a collection 6 months ago

Nemotron-Cascade

Scaling Cascaded Reinforcement Learning for General-Purpose Reasoning Models • 14 items • Updated 15 days ago • 55

upvoted a collection 9 months ago

DeepSeek-V3.2

4 items • Updated Dec 1, 2025 • 544

upvoted a paper 11 months ago

Step-3 is Large yet Affordable: Model-system Co-design for Cost-effective Decoding

Paper • 2507.19427 • Published Jul 25, 2025 • 22

upvoted a paper 12 months ago

OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling

Paper • 2506.20512 • Published Jun 25, 2025 • 49

upvoted 2 papers about 1 year ago

MegaMath: Pushing the Limits of Open Math Corpora

Paper • 2504.02807 • Published Apr 3, 2025 • 35

SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks

Paper • 2506.10954 • Published Jun 12, 2025 • 54

upvoted a collection about 1 year ago

Qwen3

84 items • Updated Dec 31, 2025 • 1.82k

upvoted a paper about 1 year ago

Learn to Reason Efficiently with Adaptive Length-based Reward Shaping

Paper • 2505.15612 • Published May 21, 2025 • 34

upvoted 2 articles about 1 year ago

wHy DoNt YoU jUsT uSe ThE lLaMa ToKeNiZeR??

Article • catherinearnett • Sep 27, 2024 • 55

I trained a Language Model to schedule events with GRPO!

Article • anakin87 • Apr 29, 2025 • 95

upvoted a paper over 1 year ago

Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning?

Paper • 2502.19361 • Published Feb 26, 2025 • 28

upvoted an article over 1 year ago

Revisiting TemplateGSM: Advancing Mathematical Reasoning in Language Models with Template-based Data Generation

Article • yifAI • Nov 14, 2024 • 3

upvoted 3 collections over 1 year ago

🧠 Reasoning datasets

Datasets with reasoning traces for math and code released by the community • 24 items • Updated May 19, 2025 • 190

Tulu 3 Datasets

All datasets released with Tulu 3 -- state of the art open post-training recipes. • 32 items • Updated Mar 2 • 100

ProX Refining Models

Adapted small language models used to generate data refining programs • 5 items • Updated Oct 10, 2024 • 5

upvoted a collection almost 2 years ago

Magpie-Qwen2 Datasets

Dataset built with Qwen2 72B and Qwen2 7B. • 6 items • Updated Jan 13, 2025 • 10

upvoted 3 papers about 2 years ago

The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale

Paper • 2406.17557 • Published Jun 25, 2024 • 105

DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

Paper • 2406.11931 • Published Jun 17, 2024 • 71

Instruction Pre-Training: Language Models are Supervised Multitask Learners

Paper • 2406.14491 • Published Jun 20, 2024 • 96