Zheng Liu

starriver030515

AI & ML interests

None yet

Recent Activity

upvoted a paper about 1 month ago

Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments

upvoted a paper about 1 month ago

CUA-Gym: Scaling Verifiable Training Environments and Tasks for Computer-Use Agents

new activity about 2 months ago

webagentlab/WebChain: Image path issue

Organizations

upvoted 2 papers about 1 month ago

Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments

Paper • 2605.30280 • Published May 28 • 146

CUA-Gym: Scaling Verifiable Training Environments and Tasks for Computer-Use Agents

Paper • 2605.25624 • Published May 25 • 34

upvoted 2 papers 2 months ago

Near-Future Policy Optimization

Paper • 2604.20733 • Published Apr 22 • 77

OneVL: One-Step Latent Reasoning and Planning with Vision-Language Explanation

Paper • 2604.18486 • Published Apr 20 • 96

upvoted 4 papers 3 months ago

Tracing the Roots: A Multi-Agent Framework for Uncovering Data Lineage in Post-Training LLMs

Paper • 2604.10480 • Published Apr 12 • 20

OpenWorldLib: A Unified Codebase and Definition of Advanced World Models

Paper • 2604.04707 • Published Apr 6 • 203

MinerU2.5-Pro: Pushing the Limits of Data-Centric Document Parsing at Scale

Paper • 2604.04771 • Published Apr 6 • 124

DataFlex: A Unified Framework for Data-Centric Dynamic Training of Large Language Models

Paper • 2603.26164 • Published Mar 27 • 365

upvoted a paper 4 months ago

Unlocking Data Value in Finance: A Study on Distillation and Difficulty-Aware Training

Paper • 2603.07223 • Published Mar 7 • 13

upvoted a collection 5 months ago

MMFineReason

Collection

High-quality STEM reasoning dataset for Multimodal LLM post-training. • 8 items • Updated May 7 • 24

upvoted 2 papers 5 months ago

MMFineReason: Closing the Multimodal Reasoning Gap via Open Data-Centric Methods

Paper • 2601.21821 • Published Jan 29 • 62

Scientific Image Synthesis: Benchmarking, Methodologies, and Downstream Utility

Paper • 2601.17027 • Published Jan 17 • 42

upvoted a collection 5 months ago

ODA-Mixture

Collection

High-quality mixture datasets for post-training covering multiple domains. • 7 items • Updated Mar 31 • 5

upvoted 2 papers 5 months ago

ChartVerse: Scaling Chart Reasoning via Reliable Programmatic Synthesis from Scratch

Paper • 2601.13606 • Published Jan 20 • 12

Closing the Data Loop: Using OpenDataArena to Engineer Superior Training Datasets

Paper • 2601.09733 • Published Dec 30, 2025 • 9

upvoted a collection 5 months ago

ChartVerse

Collection

8 items • Updated Mar 2 • 9

upvoted a paper 6 months ago

DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI

Paper • 2512.16676 • Published Dec 18, 2025 • 225

upvoted a paper 7 months ago

OpenDataArena: A Fair and Open Arena for Benchmarking Post-Training Dataset Value

Paper • 2512.14051 • Published Dec 16, 2025 • 47

upvoted 2 papers 9 months ago

Scaling Code-Assisted Chain-of-Thoughts and Instructions for Model Reasoning

Paper • 2510.04081 • Published Oct 5, 2025 • 23

MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing

Paper • 2509.22186 • Published Sep 26, 2025 • 174
