Are LLMs just memorizing benchmarks?
We developed LastingBench to stop the "cheating" and ensure fair AI evaluation.
Our LastingBench has been accepted to #EMNLP2025 Findings!
Paper: https://arxiv.org/abs/2506.21614
Code: https://github.com/Seriousss/LastingBench
#NLP #EMNLP2025