BERT-as-a-Judge: A Robust Alternative to Lexical Methods for Efficient Reference-Based LLM Evaluation
Paper • 2604.09497 • Published • 19
NLP, Information Retrieval, Computer Vision, Uncertainty Estimation, Trustworthy AI, Bias Estimation, Unbalanced ML, Choice Modeling, Time Series
Learned Hallucination Detection in Black-Box LLMs using Token-level Entropy Production Rate