kl1/DFM-1.3B

@@ -12,14 +12,14 @@ datasets:
 ## Summary
 `DFM` is a continued-pretraining checkpoint based on Apple's fs-dfm weights. It is trained with Flow Matching code and released for research/non-commercial use only.
 Base checkpoint (external, not on HF):
 ```
 https://ml-site.cdn-apple.com/models/fs-dfm/checkpoint.pth
 ```
 ## Training
-- Continued pretraining from Apple's fs-dfm checkpoint
 - Dataset: SlimPajama-627B
 - Steps: 250,000
 - Global batch size: 256

 ## Summary
 `DFM` is a continued-pretraining checkpoint based on Apple's fs-dfm weights. It is trained with Flow Matching code and released for research/non-commercial use only.
+This checkpoint started from a model trained with uniform noise and was then trained further as a masked-diffusion variant.
 Base checkpoint (external, not on HF):
 ```
 https://ml-site.cdn-apple.com/models/fs-dfm/checkpoint.pth
 ```
 ## Training
+- Continued pretraining from Apple's fs-dfm checkpoint. Initialization: uniform-noise checkpoint, then continued training as masked diffusion
 - Dataset: SlimPajama-627B
 - Steps: 250,000
 - Global batch size: 256
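
As a rough sanity check on the training figures listed above, the step count and global batch size give the number of sequences processed. This is only a sketch: it assumes one global batch per step, and the card does not state the sequence length, so `seq_len` is a hypothetical parameter supplied by the caller.

```python
# Values copied from the Training list in the model card.
STEPS = 250_000
GLOBAL_BATCH_SIZE = 256


def sequences_seen(steps: int = STEPS, global_batch_size: int = GLOBAL_BATCH_SIZE) -> int:
    """Sequences processed, assuming one global batch per training step."""
    return steps * global_batch_size


def tokens_seen(seq_len: int) -> int:
    """Tokens processed for a given sequence length.

    seq_len is an assumption: the model card does not state it.
    """
    return sequences_seen() * seq_len


if __name__ == "__main__":
    print(f"sequences: {sequences_seen():,}")
```

Under these assumptions the run covers 64,000,000 sequences. Converting that to tokens requires the actual sequence length used in training.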