Sibam

PreferenceLab OpenEnv environment for RLHF preference simulation

cdf485e 3 months ago

Data Directory

This directory holds the preference datasets used by PreferenceLab.

If these files are absent (for example, on first run), the environment falls back to built-in synthetic examples (defined in server/environment.py).
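
The fallback behaviour can be pictured with a minimal sketch like the one below. It is not the code in server/environment.py: the names `SYNTHETIC_PAIRWISE` and `load_or_fallback` are hypothetical, and the single synthetic record is made up for illustration.

```python
import json
from pathlib import Path

# Hypothetical stand-in for the built-in examples in server/environment.py.
SYNTHETIC_PAIRWISE = [
    {
        "prompt": "What is 2 + 2?",
        "response_a": "4",
        "response_b": "5",
        "gold_label": "A",
        "source": "synthetic",
    }
]


def load_or_fallback(path, fallback):
    """Return parsed JSON from `path`, or `fallback` if the file is absent."""
    file_path = Path(path)
    if not file_path.is_file():
        return fallback
    with file_path.open(encoding="utf-8") as handle:
        return json.load(handle)


# Uses the real file when it exists, otherwise the synthetic list above.
examples = load_or_fallback("data/pairwise_data.json", SYNTHETIC_PAIRWISE)
```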

File Format

pairwise_data.json

[
  {
    "prompt": "...",
    "response_a": "...",
    "response_b": "...",
    "gold_label": "A",
    "source": "hh-rlhf"
  }
]
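
Before dropping a custom file into this directory, a quick structural check along these lines may help. This is only a sketch: `validate_pairwise` is a hypothetical helper, and the assumption that `gold_label` is either "A" or "B" goes beyond what the example above shows.

```python
PAIRWISE_KEYS = {"prompt", "response_a", "response_b", "gold_label", "source"}


def validate_pairwise(record):
    """Return a list of problems found in one pairwise record (empty if none)."""
    problems = []
    missing = PAIRWISE_KEYS - record.keys()
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    # Assumption: labels are "A" or "B"; the README only shows "A".
    if record.get("gold_label") not in {"A", "B"}:
        problems.append(f"unexpected gold_label: {record.get('gold_label')!r}")
    return problems
```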

likert_data.json

[
  {
    "prompt": "...",
    "response": "...",
    "rubric": "...",
    "gold_scores": {
      "helpfulness": 4,
      "honesty": 5,
      "harmlessness": 5,
      "instruction_following": 4
    },
    "source": "ultrafeedback"
  }
]
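
The four `gold_scores` dimensions can be checked with a small helper such as the sketch below. The 1 to 5 range is an assumption (the example only shows 4s and 5s), so `low` and `high` are parameters; `check_gold_scores` itself is a hypothetical name.

```python
LIKERT_DIMENSIONS = (
    "helpfulness",
    "honesty",
    "harmlessness",
    "instruction_following",
)


def check_gold_scores(gold_scores, low=1, high=5):
    """Return a list of problems in a gold_scores mapping (empty if none)."""
    problems = []
    for dimension in LIKERT_DIMENSIONS:
        score = gold_scores.get(dimension)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not isinstance(score, int) or isinstance(score, bool):
            problems.append(f"{dimension}: missing or not an integer")
        elif not low <= score <= high:
            # Assumption: scores fall on a low..high scale (default 1..5).
            problems.append(f"{dimension}: {score} outside {low}-{high}")
    extra = set(gold_scores) - set(LIKERT_DIMENSIONS)
    if extra:
        problems.append(f"unknown dimensions: {sorted(extra)}")
    return problems
```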

consistency_data.json

[
  {
    "prompt": "...",
    "response_a": "...",
    "response_b": "...",
    "response_c": "...",
    "response_d": "...",
    "gold_ranking": ["C", "A", "B", "D"],
    "source": "stanford-shp"
  }
]
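
A `gold_ranking` is only usable if it names each of the four responses exactly once. The sketch below checks that and expands a ranking into the pairwise orderings it implies. It assumes the list is ordered from best to worst, which this README does not state; both function names are hypothetical.

```python
from itertools import combinations

RESPONSE_LABELS = ("A", "B", "C", "D")


def is_valid_ranking(gold_ranking):
    """True if the ranking lists each of A, B, C, D exactly once."""
    return sorted(gold_ranking) == list(RESPONSE_LABELS)


def implied_pairs(gold_ranking):
    """List (better, worse) pairs, assuming best-to-worst ordering."""
    if not is_valid_ranking(gold_ranking):
        raise ValueError(f"not a permutation of {RESPONSE_LABELS}: {gold_ranking}")
    # combinations() keeps input order, so the first item of each pair ranks higher.
    return list(combinations(gold_ranking, 2))
```

A ranking of four responses always yields six such pairs.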

Loading Real Datasets

Run python scripts/prepare_datasets.py to download and convert HH-RLHF, UltraFeedback, and Stanford SHP into these formats.
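
To give a rough idea of what such a conversion involves, the sketch below maps one HH-RLHF row onto the pairwise_data.json format. It is not the logic of scripts/prepare_datasets.py: the helper names are hypothetical, and it relies on the public HH-RLHF layout, where each row has `chosen` and `rejected` transcripts that end in an assistant turn.

```python
MARKER = "\n\nAssistant:"


def split_dialogue(text):
    """Split a transcript at its final assistant turn into (prompt, response)."""
    head, sep, tail = text.rpartition(MARKER)
    if not sep:
        raise ValueError("no assistant turn found")
    return head.strip(), tail.strip()


def hh_row_to_pairwise(row, chosen_is_a=True):
    """Convert one HH-RLHF row into a pairwise record (illustrative only)."""
    # Assumption: both transcripts share the same context before the last turn.
    prompt, chosen = split_dialogue(row["chosen"])
    _, rejected = split_dialogue(row["rejected"])
    if chosen_is_a:
        response_a, response_b = chosen, rejected
    else:
        response_a, response_b = rejected, chosen
    return {
        "prompt": prompt,
        "response_a": response_a,
        "response_b": response_b,
        "gold_label": "A" if chosen_is_a else "B",
        "source": "hh-rlhf",
    }
```

Alternating or randomising `chosen_is_a` across rows keeps the preferred response from always landing in slot A.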