ALIENS

ALIENS232

AI & ML interests

None yet

upvoted a paper 3 months ago

Can Vision-Language Models Solve the Shell Game?

Paper • 2603.08436 • Published Mar 9 • 39

upvoted 2 papers 5 months ago

ScreenExplorer: Training a Vision-Language Model for Diverse Exploration in Open GUI World

Paper • 2505.19095 • Published May 25, 2025 • 2

Agentic Reasoning for Large Language Models

Paper • 2601.12538 • Published Jan 18 • 207

upvoted a paper 7 months ago

ARE: Scaling Up Agent Environments and Evaluations

Paper • 2509.17158 • Published Sep 21, 2025 • 36

upvoted a paper 11 months ago

Can Large Multimodal Models Actively Recognize Faulty Inputs? A Systematic Evaluation Framework of Their Input Scrutiny Ability

Paper • 2508.04017 • Published Aug 6, 2025 • 11

upvoted a paper about 1 year ago

Don't Take the Premise for Granted: Evaluating the Premise Critique Ability of Large Language Models

Paper • 2505.23715 • Published May 29, 2025 • 2

upvoted 2 papers over 1 year ago

StructFlowBench: A Structured Flow Benchmark for Multi-turn Instruction Following

Paper • 2502.14494 • Published Feb 20, 2025 • 15

Large Language Model Evaluation via Matrix Nuclear-Norm

Paper • 2410.10672 • Published Oct 14, 2024 • 19