ContinuousBench

Blog post | Arxiv

ContinuousBench measures progress in differentially private (DP) synthetic data generation.

ContinuousBench has two tracks:

  • Geminon: Fictional, Gemini-generated corpus
  • News: Scraped news articles from September 2025

Both datasets:

  • are designed to contain completely new information, so models cannot answer questions about it without seeing the corpus
  • are paired with question-answer (QA) sets that can only be answered after training on the corpus

Generate a DP synthetic version of News or Geminon, then test it: https://github.com/plau666/ContinuousBenchEval.

Our evaluation trains a model on your DP synthetic data, then asks it the paired QA questions to check whether that data taught the model the knowledge present in the original corpus.
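
The scoring step of that evaluation can be pictured with a small sketch. This is not the ContinuousBenchEval code or its API: `normalize`, `qa_accuracy`, and `make_lookup_model` are hypothetical names, exact-match scoring is an assumed metric, and a lookup table stands in for a model trained on the synthetic corpus. The questions and answers are placeholders, not benchmark data.

```python
from typing import Callable, Dict, Iterable, Tuple


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so formatting differences are ignored."""
    return " ".join(text.lower().split())


def qa_accuracy(
    answer_fn: Callable[[str], str],
    qa_pairs: Iterable[Tuple[str, str]],
) -> float:
    """Fraction of questions whose answer exactly matches the reference.

    Exact match is an assumption made for this sketch; the real
    evaluation may score answers differently.
    """
    pairs = list(qa_pairs)
    if not pairs:
        raise ValueError("qa_pairs must not be empty")
    correct = sum(
        normalize(answer_fn(question)) == normalize(reference)
        for question, reference in pairs
    )
    return correct / len(pairs)


def make_lookup_model(knowledge: Dict[str, str]) -> Callable[[str], str]:
    """Toy stand-in for a model trained on the DP synthetic corpus.

    It answers only the questions it has stored, which mimics a model
    that picked up some, but not all, of the corpus knowledge.
    """
    table = {normalize(q): a for q, a in knowledge.items()}

    def answer(question: str) -> str:
        return table.get(normalize(question), "")

    return answer


# Placeholder QA pairs; the real ones ship with the benchmark tracks.
paired_qa = [("placeholder question 1", "answer one"),
             ("placeholder question 2", "answer two")]

# Pretend the synthetic data preserved only the first fact.
model = make_lookup_model({"placeholder question 1": "Answer  One"})
score = qa_accuracy(model, paired_qa)
print(f"QA accuracy: {score:.2f}")
```

A higher score means the synthetic data carried more of the original corpus's knowledge through to the trained model, which is the quantity the benchmark is after.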
