	---
	title: TDB Intake
	emoji: 🔬
	colorFrom: blue
	colorTo: green
	sdk: streamlit
	sdk_version: "1.39.0"
	app_file: app.py
	pinned: false
	---

	# Trial Design Benchmark — Intake

	A Streamlit intake form for trial statisticians. Submissions are saved to a Hugging Face Dataset repo. An Admin page (in the sidebar) lets reviewers triage submissions (`pending` / `reviewed` / `needs_fix`).

	## What it does

	- Form (`app.py`): statisticians enter `trial_id`, `username`, and a list of questions. Each question has:
	  - `design_element` (dropdown; when "Others" is picked, a free-text input appears)
	  - `question_type` (dropdown: `extraction_only` / `derivation_required`)
	  - `question` (free text)
	  - Rubrics auto-generated by question type:
	    - `extraction_only` → 1 rubric: `output.json`
	    - `derivation_required` → 4 rubrics: `output.json` × {Inputs used, Calculated value, Method} + `output.R` × {Reproducibility}
	  - Each rubric collects `points`, `tolerance`, `criterion`.
	- Versions: every Submit saves a new version. To edit an earlier one, re-enter the same `trial_id` + `username`, click Find versions, pick one, and click Load selected version to pull it back into the form; Submit then saves a new version.
	- Admin page (`pages/1_Admin.py`): password-gated review console.
	  - Shows only the latest version of each trial (one row per `trial_id` + `username`), with the questionnaire rendered read-only in the same layout as the form.
	  - Reviewers can add per-question reviews and an overall review. Review history covers all versions: each review is tagged with its version, and per-question reviews are tied to their question.
	  - The trial's current status reflects the most recent overall review of the latest version.
	  - Each review is its own file under `reviews/<trial>__<user>/<version>/`.
	  - Submitters can still see and load all their own versions on the form.
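
The rubric rule in the Form bullet can be sketched as a small lookup. This is only an illustration, not the code in `lib/schema.py`: the names `RUBRIC_TEMPLATES` and `default_rubrics` are hypothetical, and the blank `dimension` for `extraction_only` is an assumption, since the README does not name one. The dictionary keys follow the rubric fields shown in the submission file format.

```python
# Hypothetical template table: (artifact, dimension) pairs per question type.
RUBRIC_TEMPLATES = {
    # Dimension left blank for extraction_only: an assumption, not from the app.
    "extraction_only": [("output.json", "")],
    "derivation_required": [
        ("output.json", "Inputs used"),
        ("output.json", "Calculated value"),
        ("output.json", "Method"),
        ("output.R", "Reproducibility"),
    ],
}


def default_rubrics(question_type):
    """Return blank rubric dicts for the given question type."""
    if question_type not in RUBRIC_TEMPLATES:
        raise ValueError(f"unknown question_type: {question_type!r}")
    return [
        {"artifact": artifact, "dimension": dimension,
         "points": "", "criterion": "", "tolerance": ""}
        for artifact, dimension in RUBRIC_TEMPLATES[question_type]
    ]
```

Keeping the templates in one table means a new question type only needs a new entry, and the form and the admin page can render from the same source.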

	## Run locally

	```bash
	python -m venv .venv && source .venv/bin/activate
	pip install -r requirements.txt
	streamlit run app.py
	```

	Without HF env vars set, submissions land in `./data/submissions/<...>.json` on disk — fine for dev.
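
A minimal sketch of how such a fallback might be wired, assuming the environment variables listed under "Add Space secrets". The helper name `save_submission`, the timestamp format, and the local folder layout are assumptions for illustration; the actual logic lives in `lib/storage.py` and may differ.

```python
import json
import os
from datetime import datetime, timezone
from pathlib import Path


def save_submission(payload, trial_id, username, local_root="data/submissions"):
    """Save one version and return where it was written (hypothetical helper)."""
    # Assumed stamp format; the app's real format is not documented here.
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H-%M-%S-%fZ")
    rel = f"{trial_id}__{username}/{stamp}.json"
    body = json.dumps(payload, indent=2).encode("utf-8")

    token = os.environ.get("HF_TOKEN")
    repo = os.environ.get("HF_DATASET_REPO")
    if token and repo:
        # Imported lazily so the local fallback does not need huggingface_hub.
        from huggingface_hub import HfApi

        path_in_repo = f"submissions/{rel}"
        HfApi(token=token).upload_file(
            path_or_fileobj=body,
            path_in_repo=path_in_repo,
            repo_id=repo,
            repo_type="dataset",
            revision=os.environ.get("HF_DATASET_BRANCH", "main"),
        )
        return path_in_repo

    # No HF configuration: write a plain file on disk instead.
    path = Path(local_root) / rel
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(body)
    return str(path)
```

Because every call generates a fresh stamp, neither branch overwrites an earlier version.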

	## Deploy on Hugging Face Spaces

	### 1. Create a private HF Dataset repo

	- Sign in at <https://huggingface.co>
	- Click your avatar → New Dataset
	- Owner: your username (e.g. `ttt-77`)
	- Name: e.g. `tdb-intake-submissions`
	- Visibility: Private
	- Create. Leave it empty.

	### 2. Generate an HF access token

	- <https://huggingface.co/settings/tokens> → New token
	- Token type: Write
	- Save the `hf_...` string.

	### 3. Create the Space

	- Click your avatar → New Space
	- Name: e.g. `tdb-intake`
	- SDK: Streamlit
	- Visibility: your choice (public works: the form is intended for public submission; only the data needs to be private)
	- Create — HF gives you a git repo URL.

	### 4. Push this code to the Space

	```bash
	git remote add hf https://huggingface.co/spaces/<your-username>/tdb-intake
	git push hf main
	```

	Or, in the HF Space's Settings → Repository, link this GitHub repo and HF will auto-sync on push.

	### 5. Add Space secrets

	In the Space → Settings → Variables and secrets → add as secrets:

	| Name | Value |
	| --- | --- |
	| `HF_TOKEN` | the token from step 2 |
	| `HF_DATASET_REPO` | `<your-username>/tdb-intake-submissions` |
	| `HF_DATASET_BRANCH` | `main` (optional, defaults to `main`) |
	| `ADMIN_PASSWORD` | a password to share with reviewers |

	The Space will restart automatically and pick up the new secrets.
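
Space secrets are exposed to the running app as environment variables. As a hedged sketch of how the admin gate could read `ADMIN_PASSWORD`, the helper below compares a typed password against it. The function name `check_admin_password` is hypothetical and this is not necessarily how `lib/storage.py` implements its check.

```python
import hmac
import os


def check_admin_password(candidate, env=os.environ):
    """Return True only if ADMIN_PASSWORD is set and equals `candidate`."""
    expected = env.get("ADMIN_PASSWORD", "")
    if not expected:
        return False  # no secret configured: keep the admin page locked
    # compare_digest runs in constant time, so timing does not reveal
    # how many leading characters matched.
    return hmac.compare_digest(candidate.encode("utf-8"), expected.encode("utf-8"))
```

Refusing access when the secret is missing avoids an accidental open admin page on a Space where step 5 was skipped.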

	### 6. Test

	- Open the Space URL → fill the form → Submit. A file lands in `submissions/<trial_id>__<username>/<stamp>.json` in the dataset repo. Submitting again saves another version in the same folder.
	- Open the Admin page (left sidebar) → enter password → see the submission with status `pending` → add a review (your name + status + comment). It appears in the review timeline and a new file lands under `reviews/<submission>/`. Add more reviews to build up the history.

	## Dataset layout

	Every submit saves a new version under a per-pair folder — nothing is
	overwritten, so the full version history is kept and any version can be loaded
	back. Each review is a separate file keyed to a specific version, so a
	version can be reviewed many times by different people and concurrent reviews
	never conflict.

	```text
	submissions/<trial>__<user>/<stamp>.json # one file per version
	reviews/<trial>__<user>/<stamp>/<revstamp>__<rev>.json # one file per review of that version
	```

	To load/edit a previous version: on the form, enter the same `trial_id` +
	`username`, click **Find versions**, pick a version, click **Load selected
	version**, edit, then **Submit** (which saves a new version).
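
Given this layout, the "latest version per pair" view used by the Admin page can be derived from file paths alone. The sketch below is an illustration under one assumption: stamps are timestamps in a single fixed format, so string order matches time order. The function name `latest_versions` is hypothetical.

```python
from collections import defaultdict


def latest_versions(paths):
    """Map '<trial>__<user>' to its newest '<stamp>' from submission paths."""
    by_pair = defaultdict(list)
    for p in paths:
        parts = p.strip("/").split("/")
        # Expected shape: submissions/<trial>__<user>/<stamp>.json
        if len(parts) != 3 or parts[0] != "submissions" or not parts[2].endswith(".json"):
            continue
        by_pair[parts[1]].append(parts[2][: -len(".json")])
    # Assumes same-format stamps, so the lexicographic max is the newest.
    return {pair: max(stamps) for pair, stamps in by_pair.items()}
```

The input could come from a repo file listing (for example `HfApi.list_repo_files(..., repo_type="dataset")`) or from a local directory walk; paths outside `submissions/` are ignored.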

	### Submission file (`submissions/<trial>__<user>/<stamp>.json`)

	```json
	{
	  "submissionId": "submissions/NCT0001__jdoe/2026-06-04T...Z.json",
	  "version": "2026-06-04T...Z",
	  "submittedAt": "2026-06-04T...",
	  "trial_id": "NCT0001",
	  "username": "jdoe",
	  "comparison": {
	    "trial_id": "NCT0001",
	    "username": "jdoe",
	    "prompts": [
	      {
	        "id": "P-001",
	        "design_element": "Sample size and power",
	        "design_element_other": "",
	        "question": "Total target PFS events",
	        "question_type": "derivation_required",
	        "rubrics": [
	          {"artifact": "output.json", "dimension": "Inputs used", "points": "5", "criterion": "...", "tolerance": "..."},
	          {"artifact": "output.json", "dimension": "Calculated value", "points": "5", "criterion": "...", "tolerance": "±5%"},
	          {"artifact": "output.json", "dimension": "Method", "points": "5", "criterion": "...", "tolerance": "..."},
	          {"artifact": "output.R", "dimension": "Reproducibility", "points": "5", "criterion": "...", "tolerance": "..."}
	        ]
	      }
	    ]
	  }
	}
	```

	### Review file (`reviews/<trial>__<user>/<stamp>/*.json`)

	```json
	{
	  "submissionId": "submissions/NCT0001__jdoe/2026-06-04T...Z.json",
	  "at": "2026-06-04T16:00:00+00:00",
	  "reviewer": "Dr. Lee",
	  "status": "needs_fix",
	  "note": "still missing the power assumption",
	  "question_id": "P-002"
	}
	```

	`question_id` ties the review to a specific question; an empty `question_id`
	means an overall (whole-version) review. The trial's current status is the
	most recent overall review on the latest version (or `pending` if none).
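
The status rule can be written in a few lines. This is a sketch of the rule as stated here, not the app's own code: the function name `current_status` is hypothetical, and the input is assumed to be the list of review dicts for the latest version only, each with an ISO 8601 `at` value as in the example file.

```python
from datetime import datetime


def current_status(latest_version_reviews):
    """Return the trial status implied by the reviews of its latest version."""
    # Overall reviews are those with an empty or missing question_id.
    overall = [r for r in latest_version_reviews if not r.get("question_id")]
    if not overall:
        return "pending"
    # Parse timestamps so differing UTC offsets still compare correctly.
    newest = max(overall, key=lambda r: datetime.fromisoformat(r["at"]))
    return newest["status"]
```

Per-question reviews never change the status under this rule, which is why a version with only question-level comments still reads as `pending`.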

	### Load everything in Python

	```python
	import glob
	import json
	import os

	from huggingface_hub import snapshot_download

	local = snapshot_download("ttt-77/tdb-intake-submissions", repo_type="dataset")


	def load_json(path):
	    """Read one JSON file, closing the handle afterwards."""
	    with open(path, encoding="utf-8") as fh:
	        return json.load(fh)


	# every version: submissions/<trial>__<user>/<stamp>.json
	submissions = [load_json(f) for f in glob.glob(os.path.join(local, "submissions", "*", "*.json"))]

	# reviews: reviews/<trial>__<user>/<stamp>/<revstamp>__<rev>.json
	# key = "<trial>__<user>/<stamp>" (matches a submission's submissionId minus prefix/suffix)
	reviews = {}
	reviews_root = os.path.join(local, "reviews")
	for f in glob.glob(os.path.join(reviews_root, "*", "*", "*.json")):
	    # first two path components below reviews/ are the pair and the version
	    pair, ver = os.path.relpath(f, reviews_root).split(os.sep)[:2]
	    reviews.setdefault(f"{pair}/{ver}", []).append(load_json(f))
	for key in reviews:
	    reviews[key].sort(key=lambda r: r["at"])  # oldest first
	```
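
Once the submissions are loaded, a flat one-row-per-rubric table is often easier to analyse or export. The sketch below uses only the standard library and assumes each submission dict has the shape shown in the submission file example; the helper names `rubric_rows` and `to_csv` are hypothetical.

```python
import csv
import io

FIELDS = ["trial_id", "username", "version", "question_id", "question_type",
          "artifact", "dimension", "points", "tolerance", "criterion"]


def rubric_rows(submission):
    """Yield one flat dict per rubric in a submission."""
    for prompt in submission.get("comparison", {}).get("prompts", []):
        for rubric in prompt.get("rubrics", []):
            row = {
                "trial_id": submission.get("trial_id", ""),
                "username": submission.get("username", ""),
                "version": submission.get("version", ""),
                "question_id": prompt.get("id", ""),
                "question_type": prompt.get("question_type", ""),
            }
            # Copy the rubric fields, defaulting to "" when one is missing.
            for key in ("artifact", "dimension", "points", "tolerance", "criterion"):
                row[key] = rubric.get(key, "")
            yield row


def to_csv(submissions):
    """Return CSV text with a header and one line per rubric."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    for submission in submissions:
        writer.writerows(rubric_rows(submission))
    return buf.getvalue()
```

Note that `points` is stored as a string in the example file, so convert it (for instance with `int(...)`) before summing.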

	## Project structure

	```text
	.
	├── app.py              # main intake form (entry point for HF Space)
	├── pages/
	│   └── 1_Admin.py      # admin review page (shown in sidebar)
	├── lib/
	│   ├── __init__.py
	│   ├── schema.py       # constants, defaults, validators
	│   └── storage.py      # HF Dataset I/O + local fs fallback + admin password check
	├── requirements.txt
	└── README.md
	```

	## Privacy notes

	- The dataset repo should be private.
	- `HF_TOKEN` and `ADMIN_PASSWORD` live only in Space secrets — never commit them.
	- Rotate the token periodically.

	## Extending with Python ML libs

	NLP or model-based checks can be added in a few lines under `lib/`. Examples:

	- `spaCy` for entity extraction on submitted SAP excerpts
	- `sentence-transformers` for semantic dedup of similar questions
	- `huggingface_hub.InferenceClient` for LLM-as-judge on the criterion text
	- `pandas` directly in the admin page for batch stats / CSV export