Update README.md
README.md CHANGED
@@ -18,7 +18,7 @@ Fine tuned on finetome, pinkchat-sft, pinkchat-dpo, the model is able to generat
 ## Additional fine-tuning is needed.
 The model does not perform well, yet it does work. It has been fine-tuned on 2 billion tokens of mostly synthetic data and some human-made data in the SFT process.
 
-Phase 0: In mergekit, we remove 16 layers (out of 28, leaving 12) using passthrough.
+Phase 0: In mergekit, we remove 16 layers (out of 28, leaving 12: Pinkstackorg/Qwen2.5-3Bprunebase-1M) using passthrough.
 
 Phase 1a: Fine-tuning the model on a limited amount of data with LoRA rank 16 (21% trained). This phase only gets the model started on generating text that makes some sense, mainly to heal the model and nothing else; the output is still very low quality.
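
For Phase 0, mergekit is driven by a YAML config. The sketch below is a minimal, hypothetical reconstruction: the README does not name the source checkpoint or say which 16 of the 28 layers were dropped, so the model id and layer ranges here are assumptions (first 6 and last 6 layers kept, middle 16 removed).

```python
# Sketch of Phase 0: prune 16 of 28 layers with a mergekit
# "passthrough" config, keeping 12. The source checkpoint and the
# exact layer selection are NOT stated in the README; the model id
# and layer ranges below are assumptions for illustration.
import pathlib
import subprocess

CONFIG = """\
slices:
  - sources:
      - model: Qwen/Qwen2.5-3B-Instruct  # assumed source checkpoint
        layer_range: [0, 6]              # keep layers 0-5
  - sources:
      - model: Qwen/Qwen2.5-3B-Instruct
        layer_range: [22, 28]            # keep layers 22-27
merge_method: passthrough                # copy kept weights unchanged
dtype: bfloat16
"""

pathlib.Path("prune.yml").write_text(CONFIG)
# mergekit's standard CLI entry point: mergekit-yaml <config> <output_dir>
subprocess.run(["mergekit-yaml", "prune.yml", "./Qwen2.5-3Bprunebase-1M"],
               check=True)
```

Passthrough copies the selected layer weights straight through with no interpolation, so the pruned model is initially incoherent; the healing is left to the fine-tuning phases that follow.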
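Phase 1a corresponds to a standard rank-16 LoRA pass. A minimal sketch, assuming the Hugging Face peft/transformers stack and mlabonne/FineTome-100k as the data slice (the README lists finetome among the SFT sources); the target modules, alpha, learning rate, and dataset handling are illustrative assumptions, not the authors' recipe.

```python
# Sketch of Phase 1a: a rank-16 LoRA pass over a small amount of
# data to start healing the pruned model. Dataset, target modules,
# and hyperparameters are assumptions; the README only states
# LoRA 16 and "21% trained".
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_id = "Pinkstackorg/Qwen2.5-3Bprunebase-1M"  # pruned base from Phase 0
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.pad_token or tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_id)

lora = LoraConfig(
    r=16,                          # LoRA rank 16, as stated in the README
    lora_alpha=32,                 # assumed
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # report the trainable fraction

# FineTome is ShareGPT-style; roughly flatten each conversation's
# turns into plain text for this sketch.
data = load_dataset("mlabonne/FineTome-100k", split="train[:1%]")
data = data.map(
    lambda ex: tokenizer(
        "\n".join(turn["value"] for turn in ex["conversations"]),
        truncation=True, max_length=1024),
    remove_columns=data.column_names,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="phase1a", num_train_epochs=1,
                           per_device_train_batch_size=1, learning_rate=2e-4),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```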