Research to build upon - a kudosscience Collection

updated May 27

Reasoning Shift: How Context Silently Shortens LLM Reasoning

Paper • 2604.01161 • Published Apr 1 • 32
Steerable Visual Representations

Paper • 2604.02327 • Published Apr 2 • 56
The Depth Ceiling: On the Limits of Large Language Models in Discovering Latent Planning

Paper • 2604.06427 • Published Apr 7 • 11
BERT-as-a-Judge: A Robust Alternative to Lexical Methods for Efficient Reference-Based LLM Evaluation

Paper • 2604.09497 • Published Apr 10 • 29
Masked by Consensus: Disentangling Privileged Knowledge in LLM Correctness

Paper • 2604.12373 • Published Apr 14 • 9
GlotOCR Bench: OCR Models Still Struggle Beyond a Handful of Unicode Scripts

Paper • 2604.12978 • Published Apr 14 • 5
Target Policy Optimization

Paper • 2604.06159 • Published Apr 7 • 23
IatroBench: Pre-Registered Evidence of Iatrogenic Harm from AI Safety Measures

Paper • 2604.07709 • Published Apr 14 • 1
Chat2Workflow: A Benchmark for Generating Executable Visual Workflows with Natural Language

Paper • 2604.19667 • Published Apr 21 • 23
AJ-Bench: Benchmarking Agent-as-a-Judge for Environment-Aware Evaluation

Paper • 2604.18240 • Published Apr 20 • 17
Code-Switching Information Retrieval: Benchmarks, Analysis, and the Limits of Current Retrievers

Paper • 2604.17632 • Published Apr 19 • 12
SkillFlow: Benchmarking Lifelong Skill Discovery and Evolution for Autonomous Agents

Paper • 2604.17308 • Published Apr 19 • 23
Code as Agent Harness

Paper • 2605.18747 • Published May 18 • 224
AI for Auto-Research: Roadmap & User Guide

Paper • 2605.18661 • Published May 18 • 69
Perception or Prejudice: Can MLLMs Go Beyond First Impressions of Personality?

Paper • 2605.22109 • Published May 21 • 171
Foundation Protocol: A Coordination Layer for Agentic Society

Paper • 2605.23218 • Published May 22 • 82