panyaning

AI & ML interests

None yet

Recent Activity

upvoted a paper 22 days ago

Where Do Deep-Research Agents Go Wrong? Span-Level Error Localization in Agent Trajectories

Paper • 2606.02060 • Published 25 days ago • 55

upvoted a paper about 1 month ago

OmniNFT: Modality-wise Omni Diffusion Reinforcement for Joint Audio-Video Generation

Paper • 2605.12480 • Published May 12 • 4

upvoted a paper 2 months ago

DR^{3}-Eval: Towards Realistic and Reproducible Deep Research Evaluation

Paper • 2604.14683 • Published Apr 16 • 36

upvoted a collection 4 months ago

Baichuan-M3

Collection

Modeling Clinical Inquiry for Reliable Medical Decision-Making • 6 items • Updated Mar 2 • 17

liked a model 5 months ago

massaki75/progemu

8B • Updated Nov 7, 2025 • 11 • 1

updated a dataset 6 months ago

NJU-LINK/MT-Video-Bench

Updated Jan 7 • 82 • 4

upvoted a paper 6 months ago

T2AV-Compass: Towards Unified Evaluation for Text-to-Audio-Video Generation

Paper • 2512.21094 • Published Dec 24, 2025 • 25

upvoted a collection 6 months ago

MaskFocus

Collection

MaskFocus • 2 items • Updated Dec 21, 2025 • 2

upvoted an article 7 months ago

Prefill and Decode for Concurrent Requests - Optimizing LLM Performance

Article • tngtech • Apr 16, 2025 • 81

upvoted a paper 7 months ago

ViDiC: Video Difference Captioning

Paper • 2512.03405 • Published Dec 3, 2025 • 29

upvoted 4 papers 8 months ago

IF-VidCap: Can Video Caption Models Follow Instructions?

Paper • 2510.18726 • Published Oct 21, 2025 • 27

Chem-R: Learning to Reason as a Chemist

Paper • 2510.16880 • Published Oct 19, 2025 • 53

MT-Video-Bench: A Holistic Video Understanding Benchmark for Evaluating Multimodal LLMs in Multi-Turn Dialogues

Paper • 2510.17722 • Published Oct 20, 2025 • 20

UniGenBench++: A Unified Semantic Evaluation Benchmark for Text-to-Image Generation

Paper • 2510.18701 • Published Oct 21, 2025 • 68

liked 2 datasets 8 months ago

adlbh/PediatricsMQA

Viewer • Updated Aug 22, 2025 • 5.48k • 23 • 1

NJU-LINK/MT-Video-Bench

Updated Jan 7 • 82 • 4

published a dataset 8 months ago

NJU-LINK/MT-Video-Bench

Updated Jan 7 • 82 • 4

authored a paper 8 months ago

OmniVideoBench: Towards Audio-Visual Understanding Evaluation for Omni MLLMs

Paper • 2510.10689 • Published Oct 12, 2025 • 46

upvoted 2 papers 8 months ago

RLFR: Extending Reinforcement Learning for LLMs with Flow Environment

Paper • 2510.10201 • Published Oct 11, 2025 • 36

OmniVideoBench: Towards Audio-Visual Understanding Evaluation for Omni MLLMs

Paper • 2510.10689 • Published Oct 12, 2025 • 46
