PVPO: Pre-Estimated Value-Based Policy Optimization for Agentic Reasoning Paper • 2508.21104 • Published Aug 28, 2025 • 37
KTO: Model Alignment as Prospect Theoretic Optimization Paper • 2402.01306 • Published Feb 2, 2024 • 21
google-bert/bert-large-uncased-whole-word-masking-finetuned-squad Question Answering • 0.3B • Updated Feb 19, 2024 • 331k • 182