Liam Murphy

philicarior5

AI & ML interests

None yet

Recent Activity

liked a model 30 days ago

babemario/Legal_clause-RAG

Organizations

None yet

upvoted a paper about 8 hours ago

SkillCoach: Self-Evolving Rubrics for Evaluating and Enhancing Agentic Skill-Use

Paper • 2607.01874 • Published 2 days ago • 12

upvoted a paper 13 days ago

A Benchmark and Framework for Evaluating Next Action Predictions in Spreadsheets

Paper • 2606.13802 • Published 23 days ago • 1

upvoted 2 papers about 1 month ago

Gamma-World: Generative Multi-Agent World Modeling Beyond Two Players

Paper • 2605.28816 • Published May 27 • 431

Video2GUI: Synthesizing Large-Scale Interaction Trajectories for Generalized GUI Agent Pretraining

Paper • 2605.14747 • Published May 14 • 147

upvoted 3 papers about 2 months ago

SkillsVote: Lifecycle Governance of Agent Skills from Collection, Recommendation to Evolution

Paper • 2605.18401 • Published May 18 • 130

PASA: A Principled Embedding-Space Watermarking Approach for LLM-Generated Text under Semantic-Invariant Attacks

Paper • 2605.10977 • Published May 9 • 10

WildClawBench: A Benchmark for Real-World, Long-Horizon Agent Evaluation

Paper • 2605.10912 • Published May 11 • 46

upvoted 5 papers 3 months ago

RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time

Paper • 2604.11626 • Published Apr 13 • 103

WildDet3D: Scaling Promptable 3D Detection in the Wild

Paper • 2604.08626 • Published Apr 9 • 248

OptiMer: Optimal Distribution Vector Merging Is Better than Data Mixing for Continual Pre-Training

Paper • 2603.28858 • Published Mar 30 • 9

FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization

Paper • 2603.19835 • Published Mar 20 • 353

Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Models

Paper • 2603.25716 • Published Mar 26 • 157

upvoted 2 papers 4 months ago

A Very Big Video Reasoning Suite

Paper • 2602.20159 • Published Feb 23 • 526

VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training

Paper • 2602.10693 • Published Feb 11 • 221

upvoted 2 papers 5 months ago

SQuTR: A Robustness Benchmark for Spoken Query to Text Retrieval under Acoustic Noise

Paper • 2602.12783 • Published Feb 13 • 246

NarraScore: Bridging Visual Narrative and Musical Dynamics via Hierarchical Affective Control

Paper • 2602.09070 • Published Feb 9 • 46