---
title: intellite-100m
emoji: 💬
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.34.2
app_file: app.py
pinned: false
---

# intellite-100M: RLHF data collector

Serves the SFT-tuned intellite 100M model in a chat UI. Every assistant reply gets 👍 / 👎 buttons; each rating appends one JSONL record to a local folder that a `CommitScheduler` pushes to a dataset repo on the Hub every 5 minutes.

## Setup

1. **Upload the SFT checkpoint** to the Space root as `best.pt` (or set `INTELLITE_CKPT=/path/to/file.pt` in Settings → Variables).
2. **Create the dataset repo** `ProCreations/Intellite-storage` (the scheduler will auto-create it on first push too).
3. **Set `HF_TOKEN`** in Settings → Secrets: a token with **write** scope on the dataset repo. Without it, the Space runs but feedback only persists in-memory until the container restarts.
4. (Optional) Override `FEEDBACK_REPO` in Settings → Variables if you want to use a different dataset repo.

## Data format

Each record is a single line of JSONL in `data/data_.jsonl` on the dataset repo (one file per Space replica/restart):

```json
{"ts":"2026-04-20T15:23:45","system":"You are a helpful, honest, and concise assistant.","prompt_messages":[{"role":"user","content":"..."},{"role":"assistant","content":"..."},{"role":"user","content":"..."}],"response":"...","liked":true}
```

Each record is exactly `(prompt, response, reward∈{0,1})`, the shape any preference/RL trainer expects. For DPO, group records by identical `prompt_messages` and pair a `liked=true` response (chosen) with a `liked=false` one (rejected). For REINFORCE/PPO, feed `liked` as a reward.
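
The DPO grouping and the reward mapping described above can be sketched in a few lines of Python. This is a minimal illustration, not code shipped with the Space: the helper names `build_dpo_pairs` and `to_rewards` are hypothetical, the sample records are made up, and in practice `records` would come from calling `json.loads` on each line of the downloaded JSONL files.

```python
import json
from collections import defaultdict
from itertools import product


def build_dpo_pairs(records):
    """Pair every liked response with every disliked one for the same prompt."""
    groups = defaultdict(lambda: {"chosen": [], "rejected": []})
    for rec in records:
        # Serialize the message list so identical prompts share one dict key.
        key = json.dumps(rec["prompt_messages"], sort_keys=True)
        bucket = "chosen" if rec["liked"] else "rejected"
        groups[key][bucket].append(rec["response"])

    pairs = []
    for key, group in groups.items():
        prompt = json.loads(key)
        # Prompts that lack either a liked or a disliked reply yield no pairs.
        for chosen, rejected in product(group["chosen"], group["rejected"]):
            pairs.append({"prompt": prompt, "chosen": chosen, "rejected": rejected})
    return pairs


def to_rewards(records):
    """Map each record to (prompt_messages, response, reward) with reward 0.0 or 1.0."""
    return [
        (rec["prompt_messages"], rec["response"], 1.0 if rec["liked"] else 0.0)
        for rec in records
    ]


# Made-up sample records using the fields from the data format above.
records = [
    {"prompt_messages": [{"role": "user", "content": "Hi"}], "response": "Hello!", "liked": True},
    {"prompt_messages": [{"role": "user", "content": "Hi"}], "response": "asdf", "liked": False},
    {"prompt_messages": [{"role": "user", "content": "Bye"}], "response": "See you.", "liked": True},
]
pairs = build_dpo_pairs(records)
rewards = to_rewards(records)
```

Pairing every liked reply with every disliked one is only one possible choice; when a prompt has many ratings, sampling a single pair per prompt may keep the dataset better balanced.
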
## Downloading the data

```bash
hf download ProCreations/Intellite-storage --repo-type=dataset --local-dir ./rlhf-data
# or in Python:
# from huggingface_hub import snapshot_download
# snapshot_download("ProCreations/Intellite-storage", repo_type="dataset")
```

## Notes on the free CPU tier

Generation on CPU is slow (~5–10 tok/s for 100M in fp32). If you move to the paid GPU tier, the app auto-detects `cuda` and uses bf16 autocast, which is roughly 10× faster.
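
The device auto-detection mentioned above might look roughly like the following. This is a sketch under assumptions, not the code in `app.py`: the helper name `pick_device_and_autocast` is hypothetical, and it assumes PyTorch is installed.

```python
import contextlib

import torch


def pick_device_and_autocast():
    """Return (device, context manager) for generation.

    Hypothetical helper: uses bf16 autocast when CUDA is available,
    otherwise a no-op context so the model runs in plain fp32 on CPU.
    """
    if torch.cuda.is_available():
        return "cuda", torch.autocast(device_type="cuda", dtype=torch.bfloat16)
    return "cpu", contextlib.nullcontext()


device, autocast_ctx = pick_device_and_autocast()
with autocast_ctx:
    # Model forward passes would go here; a tiny matmul stands in for them.
    x = torch.ones(2, 2, device=device)
    y = x @ x
```

Returning a no-op context on CPU lets the generation loop use the same `with` block on both tiers.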