Update README.md
README.md
CHANGED
|
@@ -6,5 +6,21 @@ colorTo: green
|
|
| 6 |
sdk: static
|
| 7 |
pinned: false
|
| 8 |
---
| 9 |
|
| 10 |
-
Edit this `README.md` markdown file to author your organization card.
|
|
|
|
| 6 |
sdk: static
|
| 7 |
pinned: false
|
| 8 |
---
|
| 9 |
+
# ContinuousBench
|
| 10 |
+
|
| 11 |
+
ContinuousBench measures progress in differentially private (DP) synthetic data generation.
|
| 12 |
+
|
| 13 |
+
ContinuousBench has two tracks:
|
| 14 |
+
* [Geminon](https://huggingface.co/datasets/ContinuousBench/Geminon): Fictional, Gemini-generated corpus
|
| 15 |
+
* [News](https://huggingface.co/datasets/ContinuousBench/News): Scraped news articles from September 2025
|
| 16 |
+
|
| 17 |
+
Both datasets:
|
| 18 |
+
* are designed to contain entirely new information, so existing models cannot answer questions about it
|
| 19 |
+
* are paired with QA questions that can be answered only after training on the corpus
|
| 20 |
+
|
| 21 |
+
|
| 22 |
+
Generate a DP synthetic version of News or Geminon, then evaluate it with https://github.com/plau666/ContinuousBenchEval
|
| 23 |
+
Our evaluation trains a model on your DP synthetic data, then asks the paired QA questions to check whether that data taught the model the knowledge present in the original corpus.
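
As a rough illustration of the scoring step only, the sketch below computes accuracy over question-answer pairs. It is not the ContinuousBenchEval implementation: the names `normalize` and `qa_accuracy`, the `answer_fn` callable standing in for a trained model, and the use of normalized exact match are all assumptions made for this example.

```python
from typing import Callable, Iterable, Tuple


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so formatting differences are not counted as errors."""
    return " ".join(text.lower().split())


def qa_accuracy(
    answer_fn: Callable[[str], str],
    qa_pairs: Iterable[Tuple[str, str]],
) -> float:
    """Return the fraction of questions answered correctly.

    answer_fn is a hypothetical stand-in for a model trained on the DP
    synthetic data. Exact match after normalization is an assumed metric.
    """
    pairs = list(qa_pairs)
    if not pairs:
        raise ValueError("qa_pairs is empty")
    correct = sum(normalize(answer_fn(q)) == normalize(a) for q, a in pairs)
    return correct / len(pairs)


# Toy usage: a lookup table plays the role of the trained model.
toy_answers = {"Who founded the town?": "Mara Voss"}
score = qa_accuracy(
    lambda q: toy_answers.get(q, ""),
    [("Who founded the town?", "mara  voss"), ("When was it founded?", "1871")],
)
```

Here `score` is the share of questions the stand-in model answers correctly. A model trained on better DP synthetic data should score higher on the paired QA.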
|
| 24 |
+
|
| 25 |
+
|
| 26 |
|
|
|