A newer version of the Gradio SDK is available: 6.17.3
metadata
title: synthkit
emoji: 🧪
colorFrom: indigo
colorTo: blue
sdk: gradio
sdk_version: 6.16.0
app_file: app.py
pinned: false
license: mit
synthkit — synthetic data, graded
Generate synthetic LLM data and grade it on validity, uniqueness, diversity, and contamination, with an A+→F headline. This Space is a live, offline demo (template generation + lexical grading).
The full command-line tool adds LLM-backed instruction→output generation, fine-tuning output formats (alpaca / sharegpt / openai), and an embedding-based semantic-dedup axis that catches paraphrase duplicates lexical methods miss.
👉 Source & docs: https://github.com/LaelaZorana/synthkit