K

kkkk328

AI & ML interests

None yet

Organizations

None yet

upvoted a paper 1 day ago

MonitorBench: A Comprehensive Benchmark for Chain-of-Thought Monitorability in Large Language Models

Paper • 2603.28590 • Published 3 days ago • 17

liked a dataset 7 months ago

ricdomolm/MATH-500

Viewer • Updated Feb 6, 2025 • 12.5k • 197 • 4

upvoted a paper 12 months ago

A Unified Agentic Framework for Evaluating Conditional Image Generation

Paper • 2504.07046 • Published Apr 9, 2025 • 30

liked a dataset over 1 year ago

mandarjoshi/trivia_qa

Viewer • Updated Jan 5, 2024 • 848k • 61.9k • 184