Update README.md
README.md
CHANGED
@@ -1,3 +1,11 @@
+---
+title: "LastingBench: Defend Benchmarks Against Knowledge Leakage"
+tags:
+- paper
+- benchmark
+license: cc-by-4.0
+---
+
 # LastingBench: Defend Benchmarks Against Knowledge Leakage.
 
 Welcome to the repository for the research paper: "LastingBench: Defend Benchmarks Against Knowledge Leakage." This project addresses the growing concern about large language models (LLMs) "cheating" on standard Question Answering (QA) benchmarks by memorizing task-specific data, which undermines the validity of benchmark evaluations as they no longer reflect genuine model capabilities but instead the effects of data leakage.