Mark

Makrrr

AI & ML interests

NLP, RLHF, IR

Recent Activity

upvoted a paper 24 days ago

Harness-1: Reinforcement Learning for Search Agents with State-Externalizing Harnesses

upvoted a paper about 2 months ago

SkillOS: Learning Skill Curation for Self-Evolving Agents

updated a model 2 months ago

CL-From-Nothing/Qwen3-4B-SSD-RLVE-Eval20-N20-global-step-500

New activity in Makrrr/Qwen3-1.7B-GSM8K-GRPO-verl 8 months ago

Can we have the training setting?

#1 opened 9 months ago by

New activity in nanotron/ultrascale-playbook about 1 year ago

How to understand the graph "Tensor parallelism with column linear + row Linear"

#109 opened about 1 year ago by