Update README.md
README.md
CHANGED
@@ -8,6 +8,10 @@ license: cc-by-4.0
 
 # LastingBench: Defend Benchmarks Against Knowledge Leakage.
 
+<iframe src="https://huggingface.co/kixx/LastingBench/resolve/main/paper.pdf"
+width="100%"
+height="700"></iframe>
+
 Welcome to the repository for the research paper: "LastingBench: Defend Benchmarks Against Knowledge Leakage." This project addresses the growing concern about large language models (LLMs) "cheating" on standard Question Answering (QA) benchmarks by memorizing task-specific data, which undermines the validity of benchmark evaluations as they no longer reflect genuine model capabilities but instead the effects of data leakage.
 
 ## Project Overview