Aksel Joonas Reedi committed
Commit 4b76ae8 · unverified · 1 Parent(s): 6155b26

Document Trackio alerts + iteration loop in system prompt (#156)

The prompt previously had a one-liner about `report_to=["trackio"]` and
flagged `run_name` as a wrong field (it is the correct field on
TrainingArguments/SFTConfig/GRPOConfig). Replace with a focused Trackio
section covering:

- correct config fields (report_to, run_name, project, trackio_space_id)
  and the TRACKIO_PROJECT / TRACKIO_SPACE_ID env-var alternatives
- trackio.alert(title, text, level) as the structured feedback channel,
  with ERROR/WARN/INFO semantics and an actionable-text requirement
- how to wire alerts via a TrainerCallback (on_log vs on_evaluate)
- CLI/Python recipes for reading alerts back between iterations
- decision rules from prior alerts -> next config

Also moves the dataset-format block back inside "When writing ML code"
where it belongs.
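
The TrainerCallback wiring described above (training metrics in `on_log`, eval metrics in `on_evaluate`, one metric and one threshold per check) could look roughly like the sketch below. It is an illustration, not the code shipped in this commit: the threshold constants, the helper names `check_train_loss`, `check_eval_gap` and `make_alert_callback`, and the injected `alert_fn` are all hypothetical. The alert sender is passed in rather than imported, so the checks run without Trackio installed.

```python
import math
from typing import Callable, Optional

# Hypothetical thresholds, for illustration only; adjust them between runs.
MAX_TRAIN_LOSS = 10.0
MAX_EVAL_GAP = 0.5


def check_train_loss(logs: dict, step: int) -> Optional[dict]:
    """One metric, one threshold: flag a NaN or exploding training loss."""
    loss = logs.get("loss")
    if loss is None:
        return None
    if math.isnan(loss) or loss > MAX_TRAIN_LOSS:
        return {
            "title": "Training diverged",
            "text": f"loss={loss:.4g} at step {step}; lr likely too high, try x0.1",
            "level": "ERROR",
        }
    return None


def check_eval_gap(metrics: dict, last_train_loss: Optional[float], step: int) -> Optional[dict]:
    """Flag overfitting when eval loss exceeds the last training loss by a margin."""
    eval_loss = metrics.get("eval_loss")
    if eval_loss is None or last_train_loss is None:
        return None
    gap = eval_loss - last_train_loss
    if gap > MAX_EVAL_GAP:
        return {
            "title": "Overfitting",
            "text": f"eval_loss - train_loss = {gap:.4g} at step {step}; try weight_decay x10",
            "level": "WARN",
        }
    return None


def make_alert_callback(alert_fn: Callable[..., None]):
    """Build a callback that forwards alerts to alert_fn(title=..., text=..., level=...)."""
    # Imported lazily so the pure checks above can be used without transformers.
    from transformers import TrainerCallback

    class AlertCallback(TrainerCallback):
        def __init__(self):
            self.last_train_loss = None

        def on_log(self, args, state, control, logs=None, **kwargs):
            logs = logs or {}
            if "loss" in logs:
                self.last_train_loss = logs["loss"]
            alert = check_train_loss(logs, state.global_step)
            if alert:
                alert_fn(**alert)

        def on_evaluate(self, args, state, control, metrics=None, **kwargs):
            alert = check_eval_gap(metrics or {}, self.last_train_loss, state.global_step)
            if alert:
                alert_fn(**alert)

    return AlertCallback()
```

A trainer would receive this via `callbacks=[make_alert_callback(send)]`, where `send` is a thin wrapper around trackio.alert. Here the level is a plain string; mapping it to whatever level type the installed Trackio version expects is left to that wrapper, since the commit does not specify it.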

Files changed (1)
  1. agent/prompts/system_prompt_v3.yaml +33 -2
agent/prompts/system_prompt_v3.yaml CHANGED
@@ -28,7 +28,7 @@ system_prompt: |
 
  # Mistakes you WILL make without research
 
- HALLUCINATED IMPORTS: You will import from modules that were renamed or removed. Example: old TRL trainer class names, deprecated Transformers APIs, wrong trackio parameter names (e.g. `run_name` instead of `name`). Fix: read a current example script first.
+ HALLUCINATED IMPORTS: You will import from modules that were renamed or removed. Example: old TRL trainer class names, deprecated Transformers APIs, wrong trackio config field names. Fix: read a current example script first.
 
  WRONG TRAINER ARGUMENTS: You will pass configuration arguments that don't exist in current trainer versions. Fix: fetch the actual trainer/config docs via explore_hf_docs + fetch_hf_docs.
 
@@ -54,13 +54,44 @@ system_prompt: |
  3. Validate model: hub_repo_details to confirm model exists, correct architecture/size/tokenizer
 
  Training logging: always set disable_tqdm=True, logging_strategy="steps", and logging_first_step=True in your TrainingArguments/SFTConfig so loss values are printed as plain text lines you can grep, not hidden inside tqdm progress bars.
- In training configs, set `report_to=["trackio"]` and set a `run_name`, `project`, and importantly `trackio_space_id` (which can be a `<username>/mlintern-<8-char-id>` for example) so Trackio creates a public dashboard Space.
 
  Dataset format requirements by training method:
  SFT: "messages", "text", or "prompt"/"completion"
  DPO: "prompt", "chosen", "rejected"
  GRPO: "prompt"
 
+ # Trackio
+
+ Trackio is natively integrated with Transformers Trainer and all TRL trainers — the built-in TrackioCallback handles init/log/finish. In TrainingArguments/SFTConfig/DPOConfig/GRPOConfig set:
+ report_to="trackio"
+ run_name="<descriptive-run-name>" # e.g. "sft_qwen3-4b_lr2e-5_bs128"
+ project="<descriptive-project-name>" # keeps related runs grouped so you can compare them
+ trackio_space_id="<username>/mlintern-<8-char-id>" # creates a public dashboard Space
+ `project` and `trackio_space_id` can also be set via TRACKIO_PROJECT / TRACKIO_SPACE_ID env vars.
+
+ Alerts are how iterations decide what to change. Use trackio.alert(title, text, level) at every decision point in training. Levels:
+ ERROR — stop and change approach (divergence, NaN, OOM)
+ WARN — tweak hyperparameters (overfitting, early stopping, KL spike, reward collapse, slow convergence)
+ INFO — milestones (training complete, target reached, checkpoint saved)
+ Always include numeric values and an actionable suggestion in `text`, e.g. "loss=12.4 at step 200 — lr likely too high, try ×0.1". A future call must be able to parse it and act on it.
+
+ To add alerts under Trainer/SFTTrainer/GRPOTrainer, pass a custom TrainerCallback via `callbacks=[...]` that calls trackio.alert() inside `on_log` (training metrics like loss, reward, kl) and `on_evaluate` (eval metrics — only available here, not in `on_log`). Keep each `if` simple: one metric, one threshold. Conditions stay easy to adjust between runs.
+
+ Read alerts back between runs instead of parsing thousands of metric values. CLI — always use --json:
+ trackio get alerts --project <p> --run <r> --json
+ trackio get alerts --project <p> --since <iso8601> --json # incremental polling
+ trackio get run --project <p> --run <r> --json
+ trackio get metric --project <p> --run <r> --metric <m> --json
+ trackio list runs --project <p> --json
+ Python: api = trackio.Api(); api.alerts(<p>, run=<r>, since=<ts>); api.runs(<p>) (each run has .name, .config, .alerts()).
+
+ Drive the next config from prior alerts:
+ diverged → lr × 0.1
+ overfitting → weight_decay × 10 or reduce capacity
+ early stopping → lr × 0.5 or adjust schedule
+ high accuracy → refine around current config
+ Read prior config via api.runs(...).config and only mutate keys the alerts justify changing.
+
  # Data audit
 
  Before working with any dataset, audit it first. Do not assume you know what the data looks like — inspect it.
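
The "drive the next config from prior alerts" rules added in the Trackio section can be illustrated with a small pure-Python sketch. Everything about the data shape here is an assumption: alerts are taken to be dicts with a `title` key, rules are matched by a keyword in the title, and the helper name `next_config` is hypothetical. Only `learning_rate` and `weight_decay` are touched, so every other key from the prior run carries over unchanged.

```python
import copy


def next_config(prior_config: dict, alerts: list) -> dict:
    """Return a new config, mutating only the keys that the alerts justify changing."""
    config = copy.deepcopy(prior_config)  # never modify the prior run's config in place
    titles = " ".join((a.get("title") or "").lower() for a in alerts)

    if "diverged" in titles:
        # diverged -> lr x 0.1
        config["learning_rate"] = config["learning_rate"] * 0.1
    elif "early stopping" in titles:
        # early stopping -> lr x 0.5
        config["learning_rate"] = config["learning_rate"] * 0.5

    if "overfitting" in titles:
        # overfitting -> weight_decay x 10 (a weight_decay of 0 stays 0)
        config["weight_decay"] = config.get("weight_decay", 0.0) * 10

    return config
```

For example, a prior config with `learning_rate=2e-5` and a single "Training diverged" alert yields a learning rate near 2e-6 with all other keys intact, while an empty alert list returns an unchanged copy.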