Update README.md
README.md
CHANGED
|
@@ -6,11 +6,20 @@ tags:
|
|
| 6 |
license: cc-by-4.0
|
| 7 |
---
|
| 8 |
|
| 9 |
-
#
|
| 10 |
|
| 11 |
-
|
| 12 |
-
|
| 13 |
-
height="700"></iframe>
|
| 14 |
|
| 15 |
Welcome to the repository for the research paper: "LastingBench: Defend Benchmarks Against Knowledge Leakage." This project addresses the growing concern about large language models (LLMs) "cheating" on standard Question Answering (QA) benchmarks by memorizing task-specific data, which undermines the validity of benchmark evaluations as they no longer reflect genuine model capabilities but instead the effects of data leakage.
|
| 16 |
|
|
|
|
| 6 |
license: cc-by-4.0
|
| 7 |
---
|
| 8 |
|
| 9 |
+
# 📄 Paper
|
| 10 |
+
|
| 11 |
+
<iframe
|
| 12 |
+
src="https://huggingface.co/kixx/LastingBench/resolve/main/paper.pdf#toolbar=0"
|
| 13 |
+
width="100%"
|
| 14 |
+
height="900"
|
| 15 |
+
style="border:none;">
|
| 16 |
+
</iframe>
|
| 17 |
+
|
| 18 |
+
<!-- Compatibility fallback: -->
|
| 19 |
+
<p><a href="https://huggingface.co/kixx/LastingBench/resolve/main/paper.pdf">📥 Download the PDF</a></p>
|
| 20 |
|
| 21 |
+
|
| 22 |
+
# LastingBench: Defend Benchmarks Against Knowledge Leakage.
|
| 23 |
|
| 24 |
Welcome to the repository for the research paper: "LastingBench: Defend Benchmarks Against Knowledge Leakage." This project addresses the growing concern about large language models (LLMs) "cheating" on standard Question Answering (QA) benchmarks by memorizing task-specific data, which undermines the validity of benchmark evaluations as they no longer reflect genuine model capabilities but instead the effects of data leakage.
|
| 25 |
|