Shota Kaji

ShotaKaji

AI & ML interests

None yet

Recent Activity

upvoted an article 1 day ago

Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries

liked a dataset 5 months ago

team-victory/qa_10k

liked a dataset 6 months ago

zwhe99/DeepMath-103K

Organizations

upvoted an article 1 day ago

Article

Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries

aminediroHF, qgallouedec, kashif, lewtun, edbeeching, albertvillanova, nouamanetazi, lvwerra, sergiopaniego • Mar 10 • 152

upvoted an article 6 months ago

Article

Nemotron-Personas-Japan: Synthesized Data for Sovereign AI

nvidia • Sep 23, 2025 • 27

upvoted an article 9 months ago

Article

No GPU left behind: Unlocking Efficiency with Co-located vLLM in TRL

toslali-ibm, mirinflim, qgallouedec, esnible, rganti, mudhakar • Jun 3, 2025 • 101

upvoted a collection 11 months ago

Reward Bench 2

Collection

Datasets, spaces, and models for Reward Bench 2 benchmark and paper! • 11 items • Updated Dec 23, 2025 • 16

upvoted a collection 12 months ago

Tulu 3 Datasets

Collection

All datasets released with Tulu 3 -- state of the art open post-training recipes. • 32 items • Updated Mar 2 • 97

upvoted an article about 1 year ago

Article

The N Implementation Details of RLHF with PPO

vwxyzjn, tianlinliu0121, lvwerra • Oct 24, 2023 • 72

upvoted a collection about 1 year ago

OLMo 2

Collection

Artifacts for the OLMo 2 release. • 35 items • Updated Mar 3 • 155

upvoted an article about 1 year ago

Article

Reasoning Datasets Competition

bespokelabs • Apr 9, 2025 • 38

upvoted an article over 1 year ago

Article

Open-R1: Update #1

open-r1 • Feb 2, 2025 • 305

upvoted a paper over 1 year ago

Minimum Entropy Coupling with Bottleneck

Paper • 2410.21666 • Published Oct 29, 2024 • 5
