Yang Penghui

ygyjrc

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper 11 days ago

Qwen-AgentWorld: Language World Models for General Agents

Paper • 2606.24597 • Published 13 days ago • 144

upvoted a paper 18 days ago

Beyond the Current Observation: Evaluating Multimodal Large Language Models in Controllable Non-Markov Games

Paper • 2606.19338 • Published 19 days ago • 49

upvoted 2 papers about 1 month ago

OVO-S-Bench: A Hierarchical Benchmark for Streaming Spatial Intelligence in Multimodal LLMs

Paper • 2606.03890 • Published Jun 2 • 31

COLLEAGUE.SKILL: Automated AI Skill Generation via Expert Knowledge Distillation

Paper • 2605.31264 • Published May 29 • 123

liked a dataset about 1 month ago

internlm/CapRL-Video-QA-20K

Viewer • Updated 25 days ago • 20k • 147 • 6

upvoted a paper about 1 month ago

SetCon: Towards Open-Ended Referring Segmentation via Set-Level Concept Prediction

Paper • 2605.20110 • Published May 19 • 4

liked a dataset about 1 month ago

internlm/CapRL-Video-178K

Viewer • Updated 25 days ago • 170k • 131 • 8

upvoted a paper about 1 month ago

ETCHR: Editing To Clarify and Harness Reasoning

Paper • 2605.23897 • Published May 22 • 13

liked a model about 1 month ago

internlm/CapRL-Video-4B

5B • Updated 13 days ago • 203 • 10

New activity in NemoStation/Marlin-2B about 2 months ago

Question about the evaluation metrics for captioning benchmarks

#3 opened about 2 months ago by ygyjrc

upvoted a paper about 2 months ago

WildClawBench: A Benchmark for Real-World, Long-Horizon Agent Evaluation

Paper • 2605.10912 • Published May 11 • 46

upvoted a paper 3 months ago

Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale

Paper • 2603.25040 • Published Mar 26 • 134

liked a dataset 3 months ago

internlm/WildClawBench

Benchmark • Updated May 15 • 28.2k • 62
