| title: README | |
| emoji: 🔥 | |
| colorFrom: yellow | |
| colorTo: green | |
| sdk: static | |
| pinned: false | |
| # ContinuousBench | |
| [Blog post](https://peihanliu.com/posts/continuousbench.html) | [arXiv](#) | |
| **ContinuousBench** measures progress in differentially private (DP) synthetic data generation. | |
| ContinuousBench has two tracks: | |
| * [Geminon](https://huggingface.co/datasets/ContinuousBench/Geminon): Fictional, Gemini-generated corpus | |
| * [News](https://huggingface.co/datasets/ContinuousBench/News): Scraped news articles from September 2025 | |
| Both datasets: | |
| * are designed to contain completely new information, so models cannot answer questions about it without training on the corpus | |
| * are paired with question-answer (QA) pairs that can only be answered after training on the corpus | |
| Generate a DP synthetic version of News or Geminon, then test it with the evaluation code at https://github.com/plau666/ContinuousBenchEval. | |
| Our evaluation trains a model on your DP synthetic data, then asks the paired QA questions to check whether the synthetic data taught the model the knowledge present in the original corpus. | |
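
To make the final QA step of the evaluation concrete, here is a minimal sketch of how answers from the trained model might be scored against the paired reference answers. It assumes normalized exact-match accuracy as the metric, which is an assumption made for illustration: the ContinuousBenchEval repository may score answers differently. The function names `normalize` and `exact_match_accuracy` are hypothetical and are not taken from that repository.

```python
import re
import string


def normalize(text: str) -> str:
    """Lowercase, drop ASCII punctuation, and collapse whitespace."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", text).strip()


def exact_match_accuracy(predictions, references):
    """Fraction of predictions equal to their reference after normalization.

    Hypothetical metric: the real benchmark may use a different scorer.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return hits / len(references)
```

Under this sketch, a model trained on useful DP synthetic data would score well above a model that never saw the corpus, since the QA is designed to be unanswerable without it.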