
Geyang

geyang627

AI & ML interests

None yet

Recent Activity

updated a dataset about 1 month ago

geyang627/care_pro

Viewer • Updated May 14 • 775 • 30

published a dataset about 1 month ago

geyang627/care_pro

Viewer • Updated May 14 • 775 • 30

upvoted a paper 3 months ago

Safe and Scalable Web Agent Learning via Recreated Websites

Paper • 2603.10505 • Published Mar 11 • 27

upvoted 2 articles 4 months ago

Article

Deriving the PPO Loss from First Principles

garg-aayush • Dec 25, 2025 • 45

Article

A Guide to Reinforcement Learning Post-Training for LLMs: PPO, DPO, GRPO, and Beyond

karina-zadorozhny • Jan 19 • 30

New activity in QCRI/MultiNativQA 7 months ago

All is_reliable is True

#2 opened 7 months ago by geyang627

upvoted a collection 10 months ago

Qwen3

Collection

84 items • Updated Dec 31, 2025 • 1.82k

updated 3 models 12 months ago

updated a collection 12 months ago

CARE

Collection

14 items • Updated Jun 30, 2025 • 2

updated 5 models 12 months ago

geyang627/care-arabic-mistral-7b

7B • Updated Jun 30, 2025 • 1 • 1

geyang627/care-japanese-qwen2.5-7b

8B • Updated Jun 28, 2025 • 5

geyang627/care-japanese-mistral-7b

7B • Updated Jun 28, 2025 • 4 • 1

geyang627/care-japanese-llama3.1-8b

8B • Updated Jun 28, 2025 • 6 • 1

geyang627/care-japanese-gemma2-9b

9B • Updated Jun 28, 2025 • 2 • 1

published a model 12 months ago

geyang627/care-japanese-qwen2.5-7b

8B • Updated Jun 28, 2025 • 5
