Judging LLM-as-a-judge with MT-Bench and Chatbot Arena Paper • 2306.05685 • Published Jun 9, 2023 • 43
Automatic Hallucination detection with SelfCheckGPT NLI Article • dhuynh95 • Nov 27, 2023 • 7
The Hallucinations Leaderboard, an Open Effort to Measure Hallucinations in Large Language Models Article • pminervini, pingnieuk, clefourrier, rohitsaxena, aryopg, zodiache • Jan 29, 2024 • 38
RAG vs Fine-tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture Paper • 2401.08406 • Published Jan 16, 2024 • 38