tian

Xiaotiank

AI & ML interests

None yet

Recent Activity

upvoted a paper 3 days ago

NatureBench: Can Coding Agents Match the Published SOTA of Nature-Family Papers?

Paper • 2606.24530 • Published 5 days ago • 60

upvoted a paper 4 days ago

EnterpriseClawBench: Benchmarking Agents from Real Workplace Sessions

Paper • 2606.23654 • Published 6 days ago • 80

updated a model 8 months ago

TsinghuaC3I/Qwen2.5-7B-VL-ReAd-R

8B • Updated Oct 21, 2025 • 43 • 2

upvoted 2 papers 9 months ago

FlowRL: Matching Reward Distributions for LLM Reasoning

Paper • 2509.15207 • Published Sep 18, 2025 • 119

MARS2 2025 Challenge on Multimodal Reasoning: Datasets, Methods, Results, Discussion, and Outlook

Paper • 2509.14142 • Published Sep 17, 2025 • 10

updated a dataset 9 months ago

TsinghuaC3I/AdsQA

Viewer • Updated Sep 16, 2025 • 800 • 60 • 3

published a dataset 9 months ago

TsinghuaC3I/AdsQA

Viewer • Updated Sep 16, 2025 • 800 • 60 • 3

upvoted 3 papers 10 months ago

SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning

Paper • 2509.09674 • Published Sep 11, 2025 • 81

A Survey of Reinforcement Learning for Large Reasoning Models

Paper • 2509.08827 • Published Sep 10, 2025 • 193

HiPhO: How Far Are (M)LLMs from Humans in the Latest High School Physics Olympiad Benchmark?

Paper • 2509.07894 • Published Sep 9, 2025 • 32

published a model 10 months ago

TsinghuaC3I/Qwen2.5-7B-VL-ReAd-R

8B • Updated Oct 21, 2025 • 43 • 2

upvoted 2 papers 10 months ago

Towards a Unified View of Large Language Model Post-Training

Paper • 2509.04419 • Published Sep 4, 2025 • 77

SSRL: Self-Search Reinforcement Learning

Paper • 2508.10874 • Published Aug 14, 2025 • 97

liked 2 Spaces about 1 year ago

The Ultimate Guide to LLM Training | The Ultra-Scale Playbook

🔥

269

Learn about every aspect of LLM training

FineWeb: Distilling the Web at Scale for High-Quality Text Data

🍷

upvoted a paper about 1 year ago

TTRL: Test-Time Reinforcement Learning

Paper • 2504.16084 • Published Apr 22, 2025 • 123
