title: intellite-100m
emoji: π¬
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.34.2
app_file: app.py
pinned: false
intellite-100M: RLHF data collector
Serves the SFT-tuned intellite 100M model in a chat UI. Every assistant reply
gets 👍 / 👎 buttons; each rating appends one JSONL record to a local folder
that a CommitScheduler pushes to a dataset repo on the Hub every 5 minutes.
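The collection loop described above can be sketched roughly as follows. This is a minimal sketch, not the code in app.py: the helper names (build_record, append_feedback, make_scheduler), the local folder name, and the choice of local time for ts are assumptions for illustration. Only CommitScheduler and its lock attribute come from huggingface_hub.

```python
import json
import threading
from datetime import datetime
from pathlib import Path
from uuid import uuid4


def build_record(system, prompt_messages, response, liked):
    """Assemble one feedback record with the fields shown under Data format."""
    return {
        # Assumption: naive local time, second resolution.
        "ts": datetime.now().isoformat(timespec="seconds"),
        "system": system,
        "prompt_messages": prompt_messages,
        "response": response,
        "liked": bool(liked),
    }


def append_feedback(path, record, lock):
    """Append one record as a single JSONL line, holding `lock` while writing."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    line = json.dumps(record, ensure_ascii=False)
    with lock:  # keeps a background commit from reading a half-written line
        with path.open("a", encoding="utf-8") as f:
            f.write(line + "\n")


def make_scheduler(folder, repo_id, token=None):
    """Build the background uploader (needs huggingface_hub and network access)."""
    from huggingface_hub import CommitScheduler

    return CommitScheduler(
        repo_id=repo_id,
        repo_type="dataset",
        folder_path=folder,
        path_in_repo="data",
        every=5,  # minutes between pushes
        token=token,
    )


# Hypothetical wiring inside the app (not executed here):
# scheduler = make_scheduler("feedback", "ProCreations/Intellite-storage")
# target = Path("feedback") / f"data_{uuid4()}.jsonl"
# append_feedback(target, build_record(...), scheduler.lock)
```

Writing under the scheduler's lock is the pattern the huggingface_hub documentation recommends for append-only files, since the scheduler uploads whatever is in the folder at commit time.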
Setup
- Upload the SFT checkpoint to the Space root as best.pt (or set INTELLITE_CKPT=/path/to/file.pt in Settings → Variables).
- Create the dataset repo ProCreations/Intellite-storage (the scheduler will auto-create it on first push too).
- Set HF_TOKEN in Settings → Secrets to a token with write scope on the dataset repo. Without it, the Space runs but feedback only persists in-memory until the container restarts.
- (Optional) Override FEEDBACK_REPO in Settings → Variables if you want to use a different dataset repo.
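As an illustration of how these settings fit together, the sketch below resolves the three variables with the defaults named in the steps above. The function name load_settings and the returned keys are hypothetical; the actual app.py may read the environment differently.

```python
import os

DEFAULT_CKPT = "best.pt"
DEFAULT_FEEDBACK_REPO = "ProCreations/Intellite-storage"


def load_settings(env=None):
    """Resolve checkpoint path, feedback repo, and token from the environment."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN") or None  # treat an empty string as unset
    return {
        "ckpt_path": env.get("INTELLITE_CKPT") or DEFAULT_CKPT,
        "feedback_repo": env.get("FEEDBACK_REPO") or DEFAULT_FEEDBACK_REPO,
        "hf_token": token,
        # Without a token, feedback cannot be pushed to the Hub.
        "push_enabled": token is not None,
    }
```

Passing a plain dict instead of os.environ makes the resolution easy to check locally before changing the Space's Variables or Secrets.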
Data format
Each record is a single line of JSONL in data/data_<uuid>.jsonl on the
dataset repo (one file per Space replica/restart):
{"ts":"2026-04-20T15:23:45","system":"You are a helpful, honest, and concise assistant.","prompt_messages":[{"role":"user","content":"..."},{"role":"assistant","content":"..."},{"role":"user","content":"..."}],"response":"...","liked":true}
Each record is exactly (prompt, response, reward ∈ {0,1}), the shape any
preference/RL trainer expects. For DPO, group records by identical
prompt_messages and pair a liked=true response (chosen) with a
liked=false one (rejected). For REINFORCE/PPO, feed liked as a reward.
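The grouping described above might look like the following sketch. The function names build_dpo_pairs and to_reward are illustrative, and pairing every liked response with every disliked one for the same prompt is one possible choice; a trainer may prefer to sample a single pair per prompt.

```python
import json
from collections import defaultdict
from itertools import product


def build_dpo_pairs(records):
    """Pair liked (chosen) and disliked (rejected) responses that share a prompt."""
    groups = defaultdict(lambda: {"chosen": [], "rejected": []})
    for rec in records:
        # Serialize the message list so identical prompts map to the same key.
        key = json.dumps(rec["prompt_messages"], sort_keys=True, ensure_ascii=False)
        bucket = "chosen" if rec["liked"] else "rejected"
        groups[key][bucket].append(rec["response"])

    pairs = []
    for key, group in groups.items():
        for chosen, rejected in product(group["chosen"], group["rejected"]):
            pairs.append(
                {
                    "prompt_messages": json.loads(key),
                    "chosen": chosen,
                    "rejected": rejected,
                }
            )
    return pairs


def to_reward(record):
    """Scalar reward for REINFORCE/PPO-style training: 1.0 if liked, else 0.0."""
    return 1.0 if record["liked"] else 0.0
```

Prompts that only ever received one kind of rating produce no pairs, so the DPO set is usually much smaller than the raw record count.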
Downloading the data
hf download ProCreations/Intellite-storage --repo-type=dataset --local-dir ./rlhf-data
# or in Python:
# from huggingface_hub import snapshot_download
# snapshot_download("ProCreations/Intellite-storage", repo_type="dataset")
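Once the snapshot is on disk, the per-replica files can be merged into one list of records. The sketch below assumes the data/data_<uuid>.jsonl layout described under Data format; load_records is a hypothetical helper, and skipping undecodable lines is a precaution against a partially written record, not documented behavior.

```python
import json
from pathlib import Path


def load_records(root):
    """Read every data/data_*.jsonl file under `root` into a list of dicts."""
    records = []
    for path in sorted(Path(root).glob("data/data_*.jsonl")):
        with path.open(encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue
                try:
                    records.append(json.loads(line))
                except json.JSONDecodeError:
                    continue  # skip a truncated or corrupt line
    return records


# Example: records = load_records("./rlhf-data")
```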
Notes on the free CPU tier
Generation on CPU is slow (~5–10 tok/s for 100M in fp32). If you move to the
paid GPU tier, the app auto-detects cuda and uses bf16 autocast, roughly
10× faster.
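The device switch can be pictured with the sketch below. It is an assumption about how such detection is typically written with PyTorch, not the code in app.py; select_runtime and autocast_context are hypothetical names, and torch is imported lazily so the CPU path has no extra dependency.

```python
import contextlib


def select_runtime(cuda_available):
    """Pick (device, dtype name): bf16 on CUDA, fp32 on CPU."""
    return ("cuda", "bfloat16") if cuda_available else ("cpu", "float32")


def autocast_context(device, dtype_name):
    """Return a context manager for generation: bf16 autocast on CUDA, no-op on CPU."""
    if device != "cuda":
        return contextlib.nullcontext()
    import torch  # only needed on the GPU path

    return torch.autocast(device_type="cuda", dtype=getattr(torch, dtype_name))


# Hypothetical usage inside the app:
# import torch
# device, dtype_name = select_runtime(torch.cuda.is_available())
# with autocast_context(device, dtype_name):
#     ...  # run generation here
```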