
sungyub kim

sungyub

AI & ML interests

None yet

Recent Activity

updated a dataset about 5 hours ago

sungyub/deepscaler-preview-verl

new activity 3 months ago

sungyub/ifbench-verl:Hello, I find datatrove libarary doesn't have datatrove.utils.reward_score.ifeval

liked a Space 4 months ago

HuggingFaceFW/finephrase


Organizations

None yet

updated a dataset about 5 hours ago

sungyub/deepscaler-preview-verl

Viewer • Updated about 5 hours ago • 38k • 646

New activity in sungyub/ifbench-verl 3 months ago

Hello, I find datatrove libarary doesn't have datatrove.utils.reward_score.ifeval

#1 opened 3 months ago by Wuyangqian

liked a Space 4 months ago

The Synthetic Data Playbook: Generating Trillions of the Finest Tokens

📝

262

Visualize synthetic‑data experiments as an interactive bookshelf

upvoted 2 articles 5 months ago

Article

We Got Claude to Build CUDA Kernels and teach open models!

burtenshaw, evalstate, merve, pcuenq • Jan 28 • 158

Article

Unlocking Agentic RL Training for GPT-OSS: A Practical Retrospective

Jan 27 • 80

updated a collection 6 months ago

VERL QA Datasets

Collection

High-quality QA generation datasets in VERL format: document QA, table reasoning, and multi-hop reasoning tasks. • 6 items • Updated Mar 2

updated a dataset 6 months ago

sungyub/qa-verl-unified

Viewer • Updated Jan 8 • 86.4k • 69

published a dataset 6 months ago

sungyub/qa-verl-unified

Viewer • Updated Jan 8 • 86.4k • 69

updated 2 datasets 6 months ago

sungyub/docqa-rl-verl

Viewer • Updated Jan 8 • 3.6k • 27

sungyub/code-verl-unified

Viewer • Updated Jan 8 • 959k • 203 • 1

liked 2 Spaces 7 months ago

FineWeb: decanting the web for the finest text data at scale

🍷

1.38k

Explore and download the FineWeb web‑scale text dataset

Evaluation Guidebook

📝

330

Explore LLM benchmark scores over time

updated a dataset 8 months ago

sungyub/codev-r1-verl

Viewer • Updated Nov 11, 2025 • 3.13k • 33

upvoted an article 8 months ago

Article

Let's talk about LLM evaluation

clefourrier • May 23, 2024 • 212

liked 2 Spaces 8 months ago

The Ultra-Scale Playbook

🌌

3.9k

The ultimate guide to training LLM on large GPU Clusters

The Smol Training Playbook

📚

3.22k

The secrets to building world-class LLMs

updated 4 datasets 8 months ago
