Sergio Hernandez

sergio-hernandez

https://sergiohg.com/

AI & ML interests

Data-centric post-training: environments & signals.

Recent Activity

authored a paper about 6 hours ago

QVal: Cheaply Evaluating Dense Supervision Signals for Long-Horizon LLM Agents

Paper • 2606.32034 • Published 2 days ago • 8

upvoted a paper about 7 hours ago

QVal: Cheaply Evaluating Dense Supervision Signals for Long-Horizon LLM Agents

Paper • 2606.32034 • Published 2 days ago • 8

liked a dataset about 7 hours ago

bethgelab/qval

Viewer • Updated about 7 hours ago • 533 • 2

updated a dataset about 7 hours ago

bethgelab/qval

Viewer • Updated about 7 hours ago • 533 • 2

published a dataset about 7 hours ago

bethgelab/qval

Viewer • Updated about 7 hours ago • 533 • 2

upvoted 2 papers 11 months ago

Seeing, Listening, Remembering, and Reasoning: A Multimodal Agent with Long-Term Memory

Paper • 2508.09736 • Published Aug 13, 2025 • 58

AWorld: Dynamic Multi-Agent System with Stable Maneuvering for Robust GAIA Problem Solving

Paper • 2508.09889 • Published Aug 13, 2025 • 32

published 2 models 12 months ago

sergio-hernandez/learning-to-ask_logprob-sampling_round-penalty_step-12

2B • Updated Jul 11, 2025 • 2

sergio-hernandez/learning-to-ask_logprob-sampling_round-penalty_step-60

2B • Updated Jul 11, 2025 • 2

updated 2 models 12 months ago

sergio-hernandez/learning-to-ask_logprob-sampling_round-penalty_step-60

2B • Updated Jul 11, 2025 • 2

sergio-hernandez/learning-to-ask_logprob-sampling_round-penalty_step-12

2B • Updated Jul 11, 2025 • 2
