kevlin tim

idforecasting

19 4

AI & ML interests

None yet

Recent Activity

liked a model 9 months ago

ibm-granite/granite-docling-258M

Organizations

None yet

upvoted a paper about 1 month ago

π-Bench: Evaluating Proactive Personal Assistant Agents in Long-Horizon Workflows

Paper • 2605.14678 • Published May 19 • 108

upvoted a paper about 2 months ago

Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling

Paper • 2605.13301 • Published May 13 • 165

upvoted 4 papers 9 months ago

BroRL: Scaling Reinforcement Learning via Broadened Exploration

Paper • 2510.01180 • Published Oct 1, 2025 • 21

WebGen-Agent: Enhancing Interactive Website Generation with Multi-Level Feedback and Step-Level Reinforcement Learning

Paper • 2509.22644 • Published Sep 26, 2025 • 21

SimpleFold: Folding Proteins is Simpler than You Think

Paper • 2509.18480 • Published Sep 23, 2025 • 12

ExGRPO: Learning to Reason from Experience

Paper • 2510.02245 • Published Oct 2, 2025 • 83

upvoted 4 papers 10 months ago

NoHumansRequired: Autonomous High-Quality Image Editing Triplet Mining

Paper • 2507.14119 • Published Jul 18, 2025 • 60

R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning

Paper • 2508.21113 • Published Aug 28, 2025 • 111

EchoX: Towards Mitigating Acoustic-Semantic Gap via Echo Training for Speech-to-Speech LLMs

Paper • 2509.09174 • Published Sep 11, 2025 • 62

Reasoning over Boundaries: Enhancing Specification Alignment via Test-time Deliberation

Paper • 2509.14760 • Published Sep 18, 2025 • 53

upvoted 9 papers about 1 year ago

The Aloe Family Recipe for Open and Specialized Healthcare LLMs

Paper • 2505.04388 • Published May 7, 2025 • 26

OpenVision: A Fully-Open, Cost-Effective Family of Advanced Vision Encoders for Multimodal Learning

Paper • 2505.04601 • Published May 7, 2025 • 29

Through the Looking Glass: Common Sense Consistency Evaluation of Weird Images

Paper • 2505.07704 • Published May 12, 2025 • 29

Llama-Nemotron: Efficient Reasoning Models

Paper • 2505.00949 • Published May 2, 2025 • 45

MathCoder-VL: Bridging Vision and Code for Enhanced Multimodal Mathematical Reasoning

Paper • 2505.10557 • Published May 15, 2025 • 51

Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis

Paper • 2505.13227 • Published May 19, 2025 • 46

Bielik 11B v2 Technical Report

Paper • 2505.02410 • Published May 5, 2025 • 55

Scaling Reasoning, Losing Control: Evaluating Instruction Following in Large Reasoning Models

Paper • 2505.14810 • Published May 20, 2025 • 63

Learning to Reason under Off-Policy Guidance

Paper • 2504.14945 • Published Apr 21, 2025 • 88
