TemporalBench_Leaderboard / README.md

Ray0202

update

3718ffe 14 days ago

metadata

title: TemporalBench Leaderboard
emoji: 🥇
colorFrom: green
colorTo: indigo
sdk: gradio
app_file: app.py
pinned: true
license: apache-2.0
short_description: Read-only TemporalBench leaderboard for offline results.
sdk_version: 5.49.1
tags:
  - leaderboard

TemporalBench Leaderboard

This Space is a read-only visualization and validation layer for offline TemporalBench results.
It does not execute agents, call LLM APIs, or accept API keys.

Configuration

Set the local results file path via TEMPORALBENCH_RESULTS_PATH.
If it is not set, the default is data/results.json.
Submissions are stored in data/submissions/ for manual review (override with TEMPORALBENCH_SUBMISSIONS_PATH).
Update descriptive text in src/about.py.
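
As a rough illustration of how these settings could be resolved, here is a minimal sketch. It assumes that TEMPORALBENCH_RESULTS_PATH and TEMPORALBENCH_SUBMISSIONS_PATH are environment variables; the function name resolve_paths and the constant names are hypothetical and are not taken from app.py.

```python
import os
from pathlib import Path

# Defaults as stated in this README.
DEFAULT_RESULTS_PATH = "data/results.json"
DEFAULT_SUBMISSIONS_PATH = "data/submissions/"


def resolve_paths(env=None):
    """Return (results_path, submissions_path), falling back to the defaults.

    Hypothetical helper: `env` is any mapping of variable names to strings
    and defaults to the process environment.
    """
    env = os.environ if env is None else env
    # `or` also covers a variable that is set but empty.
    results = Path(env.get("TEMPORALBENCH_RESULTS_PATH") or DEFAULT_RESULTS_PATH)
    submissions = Path(
        env.get("TEMPORALBENCH_SUBMISSIONS_PATH") or DEFAULT_SUBMISSIONS_PATH
    )
    return results, submissions


if __name__ == "__main__":
    results_path, submissions_path = resolve_paths()
    print(f"results: {results_path}")
    print(f"submissions: {submissions_path}")
```

Passing the mapping as an argument keeps the helper easy to test without touching the real environment.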

Results File Format

Results must be a JSON list or CSV table, where each record is one agent configuration.
Fields per record (the forecasting and dataset-level metrics are optional; see the notes that follow the example):

{
  "model_name": "string",
  "agent_name": "string",
  "agent_type": "string",
  "base_model": "string",
  "T1_acc": 0.0,
  "T2_acc": 0.0,
  "T3_acc": 0.0,
  "T4_acc": 0.0,
  "T2_sMAPE": 0.0,
  "T2_MAE": 0.0,
  "T4_sMAPE": 0.0,
  "T4_MAE": 0.0,
  "FreshRetailNet_T2_sMAPE": 0.0,
  "FreshRetailNet_T2_MAE": 0.0,
  "MIMIC_T2_OW_sMAPE": 0.0,
  "MIMIC_T2_OW_RMSSE": 0.0
}
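
To make the two accepted formats concrete, the following sketch reads either a JSON list or a CSV table into a list of dictionaries. It is an illustration only, not the code in src/leaderboard/load_results.py: the function name load_records is hypothetical, the format is chosen by file extension as an assumption, and every CSV column outside the four identity fields is assumed to hold a number.

```python
import csv
import json
from pathlib import Path

# Identity fields from the example record; everything else is treated as a metric.
IDENTITY_FIELDS = ("model_name", "agent_name", "agent_type", "base_model")


def load_records(path):
    """Load one record per agent configuration from a .json or .csv file."""
    path = Path(path)
    suffix = path.suffix.lower()

    if suffix == ".json":
        with path.open(encoding="utf-8") as fh:
            data = json.load(fh)
        if not isinstance(data, list):
            raise ValueError("JSON results must be a list of records")
        return data

    if suffix == ".csv":
        with path.open(newline="", encoding="utf-8") as fh:
            rows = list(csv.DictReader(fh))
        # CSV cells arrive as strings, so metric columns are converted to float.
        # A blank or non-numeric metric cell raises ValueError here.
        return [
            {k: (v if k in IDENTITY_FIELDS else float(v)) for k, v in row.items()}
            for row in rows
        ]

    raise ValueError(f"Unsupported results format: {suffix!r}")
```

A CSV file would then carry the same field names as its header row, with one line per agent configuration.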

Notes:

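The forecasting metrics T2_sMAPE, T2_MAE, T4_sMAPE, and T4_MAE are optional.
Dataset-level columns (for example FreshRetailNet_T2_sMAPE) are optional and are displayed if present.
For MIMIC forecasting, only the OW_sMAPE and OW_RMSSE columns are expected (MIMIC_T2_OW_sMAPE and MIMIC_T2_OW_RMSSE in the example above).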
Any additional numeric columns are treated as optional domain metrics and will be shown.
Records must have a consistent schema and numeric metric values.
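
The rules in these notes can be sketched as a small validator. This is a hedged illustration rather than the checks in src/leaderboard/load_results.py: the name validate_records is hypothetical, and treating the four identity fields plus T1_acc through T4_acc as the mandatory set is an assumption inferred from the notes above.

```python
from numbers import Real

IDENTITY_FIELDS = ("model_name", "agent_name", "agent_type", "base_model")
# Assumed mandatory metrics; forecasting and dataset-level columns stay optional.
REQUIRED_METRICS = ("T1_acc", "T2_acc", "T3_acc", "T4_acc")


def validate_records(records):
    """Return a list of human-readable problems; an empty list means valid."""
    if not records:
        return ["results are empty"]

    errors = []
    expected_keys = set(records[0])
    for i, rec in enumerate(records):
        missing = [f for f in IDENTITY_FIELDS + REQUIRED_METRICS if f not in rec]
        if missing:
            errors.append(f"record {i}: missing fields {missing}")
        # Consistent schema: every record has the same keys as the first one.
        if set(rec) != expected_keys:
            errors.append(f"record {i}: schema differs from record 0")
        for key, value in rec.items():
            if key in IDENTITY_FIELDS:
                if not isinstance(value, str):
                    errors.append(f"record {i}: {key} must be a string")
            # bool is a subclass of int, so it is rejected explicitly.
            elif isinstance(value, bool) or not isinstance(value, Real):
                errors.append(f"record {i}: {key} is not numeric")
    return errors
```

Collecting all problems instead of raising on the first one suits a manual review workflow, since a submitter can fix every issue in a single pass.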

Project Structure

app.py: Gradio UI + leaderboard rendering
src/leaderboard/load_results.py: Load + validate results
src/leaderboard/schema.py: Identity/metric field definitions
src/about.py: Text and descriptions
src/display/css_html_js.py: Custom styling