metadata
title: Packing Benchmark
sdk: gradio
app_file: app.py
pinned: false
license: mit

Packing Benchmark

A Hugging Face Space for verified equal-copy geometric packing records.

Submissions use canonical coordinate JSON, not SVG drawings. The Space verifies the geometry, rejects layouts in which items overlap or protrude from the container, and renders accepted coordinate layouts directly in the browser as inline SVG.

Submission Format

{
  "schema_version": "packing-benchmark/v1",
  "case": "triintri@1",
  "item": {"type": "regular_polygon", "sides": 3, "side_length": 1},
  "container": {"type": "regular_polygon", "sides": 3, "side_length": 1},
  "placements": [
    {"x": 0.0, "y": 0.0, "rotation_radians": 0.0}
  ]
}

Supported shapes are regular_polygon, circle, and rectangle. When redundant dimensions are provided, such as both side_length and circumradius, they must agree. The displayed metric is computed from the same geometry that is checked for containment and overlap.
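The agreement rule for redundant dimensions can be illustrated for regular polygons, where the circumradius follows from the side length as R = s / (2 sin(π / n)). The following is a minimal sketch, not the verifier's own code: the function names and the tolerance are assumptions made for illustration, and only the field names side_length, circumradius, sides, and type come from this README.

```python
import math


def circumradius_from_side(sides: int, side_length: float) -> float:
    """Circumradius of a regular polygon: R = s / (2 * sin(pi / n))."""
    if sides < 3:
        raise ValueError("a regular polygon needs at least 3 sides")
    return side_length / (2.0 * math.sin(math.pi / sides))


def dimensions_agree(shape: dict, rel_tol: float = 1e-9) -> bool:
    """Return True when side_length and circumradius, if both given, match.

    rel_tol is an illustrative choice, not the verifier's documented tolerance.
    """
    if shape.get("type") != "regular_polygon":
        return True  # this sketch only covers regular polygons
    if "side_length" not in shape or "circumradius" not in shape:
        return True  # nothing redundant to compare
    expected = circumradius_from_side(shape["sides"], shape["side_length"])
    return math.isclose(shape["circumradius"], expected, rel_tol=rel_tol)
```

For a unit equilateral triangle the matching circumradius is 1/√3, so a shape that declares side_length 1 together with circumradius 0.6 would be treated as inconsistent by this sketch.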

Local Verifier

The verifier code is public and can be run locally before submission:

https://github.com/Nathan-Roll1/packing-verifier

python -m pip install git+https://github.com/Nathan-Roll1/packing-verifier.git
packing-verifier verify solution.json
packing-verifier verify solution.json --json

The local command checks geometry and reports the same metric used by the Space. The live Space additionally applies the record gate: a submission for an existing case must improve the current top metric by at least 0.0001, while cases with no current record are accepted up to n = 100.
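The record gate is not part of the local command, but its logic can be approximated before submitting. This is a rough sketch under stated assumptions: the function name is hypothetical, n is taken to be the number of copies in the case, and the direction of improvement is exposed as a parameter because this README does not say whether a larger or smaller metric is better.

```python
from typing import Optional

MIN_IMPROVEMENT = 0.0001     # margin stated in this README
MAX_N_WITHOUT_RECORD = 100   # upper bound on n for cases with no record


def passes_record_gate(
    n: int,
    candidate_metric: float,
    current_best: Optional[float],
    higher_is_better: bool = True,  # assumption: direction is not documented here
) -> bool:
    """Approximate the Space's record gate for an already verified layout."""
    if current_best is None:
        # No existing record: only the size limit applies.
        return 1 <= n <= MAX_N_WITHOUT_RECORD
    gain = candidate_metric - current_best
    if not higher_is_better:
        gain = -gain
    return gain >= MIN_IMPROVEMENT
```

Improvements very close to the 0.0001 margin are sensitive to floating-point rounding, so the live Space remains the authority on borderline submissions.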

Persistence

The canonical public database is the Hugging Face Dataset mirror, currently NathanRoll/packing-benchmark-data. The Space hydrates data/ from that Dataset before loading records. Local files in this repository are only a working cache unless they have been regenerated from the Dataset and uploaded back.

The app writes verified records to data/records.jsonl, canonical submissions to data/submissions/, and deterministic SVG files to data/svg/ for archival use, then mirrors the full data/ directory to the Dataset. The public browser draws records directly from the canonical coordinates rather than from the archived SVG files. If a Dataset repo is configured, hydrate/sync failures are fatal by default so the UI cannot silently trust stale bundled data. For local-only development, set PACKING_ALLOW_BUNDLED_DATA=1 or PACKING_REQUIRE_DATASET=0.
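The fatal-by-default rule for hydrate/sync failures can be read as a small decision function. The sketch below is an interpretation of the paragraph above, not the app's actual code: the function name is hypothetical, and it recognizes only the exact values quoted in this README ("1" and "0"), whereas the app may accept other spellings.

```python
from typing import Mapping


def hydrate_failure_is_fatal(env: Mapping[str, str]) -> bool:
    """Decide whether a failed Dataset hydrate/sync should stop the app."""
    if not env.get("PACKING_DATASET_REPO"):
        return False  # no Dataset configured, so there is nothing to require
    if env.get("PACKING_ALLOW_BUNDLED_DATA") == "1":
        return False  # local-only development: bundled data is allowed
    if env.get("PACKING_REQUIRE_DATASET") == "0":
        return False  # the Dataset requirement is explicitly switched off
    return True       # default: refuse to serve possibly stale bundled data
```

Passing the environment as a mapping, for example os.environ, keeps the rule easy to check without touching the real process environment.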

Current benchmark rows point to canonical coordinate JSON under data/solutions/. data/evaluation_results.jsonl is a full rerun of the verifier over every current record, so the displayed records can be audited from JSON alone.

To repair or audit the database from canonical coordinate JSON:

python scripts/reverify_hf_dataset.py --hydrate
python scripts/reverify_hf_dataset.py --upload

Run together, these commands replace the local cache with the Dataset snapshot, rerun the verifier over every stored JSON, rewrite derived metrics and evaluation_results.jsonl, regenerate SVG archives, prune stale generated files, and upload the canonical data back to the Dataset.

On Hugging Face, the Dataset mirror is the persistence layer for public submissions. Persistent Space storage is useful as a runtime cache but is not the source of truth.

For an automatic Dataset mirror, add these Space variables/secrets:

  • PACKING_DATASET_REPO: dataset repo id, for example username/packing-benchmark-data
  • HF_TOKEN: write token stored as a Space secret
  • PACKING_DATASET_PRIVATE: optional; defaults to 1/true so newly created Dataset mirrors are private unless explicitly configured otherwise

When configured, every accepted submission mirrors data/ into that Dataset repo after verification and archival SVG generation.
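The private-by-default behavior of PACKING_DATASET_PRIVATE can be sketched as follows. This is an illustration only: the helper name, the returned dictionary keys, and the set of spellings treated as false are assumptions, and only the three variable names come from the list above.

```python
import os
from typing import Mapping, Optional

_FALSE_VALUES = {"0", "false", "no", "off"}  # accepted spellings are an assumption


def dataset_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect the Dataset mirror settings from environment-style variables."""
    env = os.environ if env is None else env
    raw_private = env.get("PACKING_DATASET_PRIVATE", "").strip().lower()
    return {
        "repo_id": env.get("PACKING_DATASET_REPO") or None,
        "has_token": bool(env.get("HF_TOKEN")),
        # Unset or unrecognized values fall back to private.
        "private": raw_private not in _FALSE_VALUES,
    }
```

Treating anything other than an explicit false value as private means a typo in the variable cannot accidentally publish a newly created mirror.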