TianMuxin

realtmxi

AI & ML interests

None yet

Recent Activity

liked a dataset 22 days ago

ravishekhar/oasissimpdataset

Updated Mar 15 • 58 • 1

liked a dataset 5 months ago

CharlieDreemur/OpenManus-RL

Viewer • Updated Mar 15, 2025 • 48.9k • 106 • 86

upvoted a paper 8 months ago

Where LLM Agents Fail and How They can Learn From Failures

Paper • 2509.25370 • Published Sep 29, 2025 • 12

upvoted a paper 9 months ago

Sotopia-RL: Reward Design for Social Intelligence

Paper • 2508.03905 • Published Aug 5, 2025 • 23

upvoted an article 11 months ago

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

Article • NormalUhr • Feb 7, 2025 • 292

updated a dataset 12 months ago

realtmxi/CC100-sinhala

Viewer • Updated May 15, 2025 • 12.6M • 21 • 1

published a dataset 12 months ago

realtmxi/CC100-sinhala

Viewer • Updated May 15, 2025 • 12.6M • 21 • 1

updated a dataset 12 months ago

realtmxi/MADLAD_CultureX_cleaned

Updated May 15, 2025 • 8

published a dataset 12 months ago

realtmxi/MADLAD_CultureX_cleaned

Updated May 15, 2025 • 8

updated 2 models about 1 year ago

realtmxi/Qwen3-8b-sft-trajL

8B • Updated May 11, 2025 • 4

realtmxi/qwen3-4b-sft-trajL

Updated May 11, 2025

published 2 models about 1 year ago