OSWorld xjbz

community

Recent Activity

tianbaoxiexxx authored a paper 6 days ago

OS-MAP: How Far Can Computer-Using Agents Go in Breadth and Depth?

tianbaoxiexxx authored a paper 6 days ago

Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-tuning of LLM Agents

tianbaoxiexxx authored a paper 6 days ago

OSWorld-MCP: Benchmarking MCP Tool Invocation In Computer-Use Agents

authored 5 papers 6 days ago

OS-MAP: How Far Can Computer-Using Agents Go in Breadth and Depth?

Paper • 2507.19132 • Published Jul 25, 2025

Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-tuning of LLM Agents

Paper • 2510.24702 • Published Oct 28, 2025 • 31

OSWorld-MCP: Benchmarking MCP Tool Invocation In Computer-Use Agents

Paper • 2510.24563 • Published Oct 28, 2025 • 23

Qwen3-VL Technical Report

Paper • 2511.21631 • Published Nov 26, 2025 • 163

RLAnything: Forge Environment, Policy, and Reward Model in Completely Dynamic RL System

Paper • 2602.02488 • Published Feb 2 • 36

authored a paper 6 days ago

CUA-Gym: Scaling Verifiable Training Environments and Tasks for Computer-Use Agents

Paper • 2605.25624 • Published 8 days ago • 30

authored 2 papers 14 days ago

MLS-Bench: A Holistic and Rigorous Assessment of AI Systems on Building Better AI

Paper • 2605.08678 • Published 24 days ago • 9

Post-Trained MoE Can Skip Half Experts via Self-Distillation

Paper • 2605.18643 • Published 15 days ago • 30

authored a paper 29 days ago

CocoaBench: Evaluating Unified Digital Agents in the Wild

Paper • 2604.11201 • Published Apr 13 • 37

authored a paper about 1 month ago

Kimi K2.5: Visual Agentic Intelligence

Paper • 2602.02276 • Published Feb 2 • 273

authored 2 papers 3 months ago

VideoAgentTrek: Computer Use Pretraining from Unlabeled Videos

Paper • 2510.19488 • Published Oct 22, 2025 • 22

How Far Can Unsupervised RLVR Scale LLM Training?

Paper • 2603.08660 • Published Mar 9 • 59

updated a dataset 6 months ago

OSWorld-xjbz/new_files

Updated Dec 15, 2025 • 102

published a dataset 6 months ago

OSWorld-xjbz/new_files

Updated Dec 15, 2025 • 102

authored a paper 10 months ago

AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant

Paper • 2410.18603 • Published Oct 24, 2024 • 32

authored a paper 10 months ago

Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis

Paper • 2505.13227 • Published May 19, 2025 • 46

authored a paper 10 months ago

ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows

Paper • 2505.19897 • Published May 26, 2025 • 104