Koi

KOIIIII

AI & ML interests

None yet

Recent Activity

upvoted a paper about 1 month ago

MinT: Managed Infrastructure for Training and Serving Millions of LLMs

upvoted a paper about 1 month ago

AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation

liked a dataset about 2 months ago

datatune/LogiHard-2K

Organizations

None yet

upvoted 2 papers about 1 month ago

MinT: Managed Infrastructure for Training and Serving Millions of LLMs

Paper • 2605.13779 • Published May 13 • 223

AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation

Paper • 2605.13724 • Published May 13 • 105

upvoted a paper 2 months ago

Uni-ViGU: Towards Unified Video Generation and Understanding via A Diffusion-Based Video Generator

Paper • 2604.08121 • Published Apr 9 • 44

upvoted a paper 3 months ago

CORAL: Towards Autonomous Multi-Agent Evolution for Open-Ended Discovery

Paper • 2604.01658 • Published Apr 2 • 55

upvoted a paper 7 months ago

PIPPA: A Partially Synthetic Conversational Dataset

Paper • 2308.05884 • Published Aug 11, 2023 • 34

upvoted a collection 7 months ago

InternVL3.5

Collection

This collection includes all released checkpoints of InternVL3.5, covering different training stages (e.g., Pretraining, SFT, MPO, Cascade RL). • 45 items • Updated Mar 2 • 110

upvoted an article 12 months ago

Mixture of Experts Explained

Article • osanseviero, lewtun, philschmid, smangrul, ybelkada, pcuenq • Dec 11, 2023 • 1.15k

upvoted an article about 1 year ago

The N Implementation Details of RLHF with PPO

Article • vwxyzjn, tianlinliu0121, lvwerra • Oct 24, 2023 • 72

upvoted 3 papers about 1 year ago

Nash Learning from Human Feedback

Paper • 2312.00886 • Published Dec 1, 2023 • 18

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Paper • 2402.03300 • Published Feb 5, 2024 • 148

Understanding R1-Zero-Like Training: A Critical Perspective

Paper • 2503.20783 • Published Mar 26, 2025 • 60
