hunarbatra committed · Commit ec805d4 · verified · 1 Parent(s): 6a2ec11

Update README.md

Files changed (1): README.md (+0 -1)
@@ -104,7 +104,6 @@ print(output)
  - **Batch size**: 16 prompts × 8 rollouts = 128 generations/step
  - **Optimizer**: AdamW, lr=1e-6, KL coefficient=1e-2 (low_var_kl)
  - **LoRA**: rank=64 on the language tower
- - **Total cost**: ~$27 on Tinker
 
  The model was trained with several rollout-side fixes that lift the Qwen3-VL-Instruct base's format-pass rate from ~78% to ~96% during training:
  - Forced `<observe>\n` assistant prefix (matches the four-tag schema the model is trained to produce)
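
For readers of the README context in this hunk, the batch arithmetic and the forced-prefix idea can be sketched as below. This is a minimal illustration, not the author's training code: the function names and the example prompt string are hypothetical, and only the `<observe>\n` prefix and the 16 × 8 figures come from the README text.

```python
# Values quoted in the README context lines of this hunk.
ASSISTANT_PREFIX = "<observe>\n"
PROMPTS_PER_STEP = 16
ROLLOUTS_PER_PROMPT = 8


def generations_per_step(prompts: int = PROMPTS_PER_STEP,
                         rollouts: int = ROLLOUTS_PER_PROMPT) -> int:
    """Each prompt is sampled `rollouts` times, so one step yields prompts * rollouts generations."""
    return prompts * rollouts


def with_forced_prefix(rendered_prompt: str) -> str:
    """Hypothetical helper: append the prefix so sampling continues from inside the first tag."""
    return rendered_prompt + ASSISTANT_PREFIX


def full_assistant_turn(continuation: str) -> str:
    """Hypothetical helper: re-attach the prefix so the stored turn starts with the tag."""
    return ASSISTANT_PREFIX + continuation


if __name__ == "__main__":
    print(generations_per_step())
    print(repr(with_forced_prefix("ASSISTANT: ")))
```

Because the prefix is supplied rather than sampled, every rollout begins with the first tag by construction; the remaining tags still depend on the model.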