Training in progress, step 100

Files changed (5)

README.md CHANGED

@@ -1,18 +1,17 @@
 ---
 base_model: Qwen/Qwen2.5-7B-Instruct
-datasets: DeepMath-103k
 library_name: transformers
 model_name: QWEN7_THIP
 tags:
 - generated_from_trainer
-- trl
 - grpo
 licence: license
 ---
 # Model Card for QWEN7_THIP
-This model is a fine-tuned version of [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) on the [DeepMath-103k](https://huggingface.co/datasets/DeepMath-103k) dataset.
 It has been trained using [TRL](https://github.com/huggingface/trl).
 ## Quick start
@@ -28,7 +27,7 @@ print(output["generated_text"])
 ## Training procedure
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/pthpark1/THIP_COMPARE_QWEN7/runs/606inito)
 This model was trained with GRPO, a method introduced in [DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models](https://huggingface.co/papers/2402.03300).

 ---
 base_model: Qwen/Qwen2.5-7B-Instruct
 library_name: transformers
 model_name: QWEN7_THIP
 tags:
 - generated_from_trainer
 - grpo
+- trl
 licence: license
 ---
 # Model Card for QWEN7_THIP
+This model is a fine-tuned version of [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct).
 It has been trained using [TRL](https://github.com/huggingface/trl).
 ## Quick start
 ## Training procedure
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/pthpark1/THIP_COMPARE_QWEN7/runs/ayndenuq)
 This model was trained with GRPO, a method introduced in [DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models](https://huggingface.co/papers/2402.03300).

model-00001-of-00004.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:abb043cdd898bf230ad4900341c0ced1c8dbcfc0417fea686ece626dc89f5bda
 size 4877660776

 version https://git-lfs.github.com/spec/v1
+oid sha256:9680dafec67b89aaa46f7a2901a9e62cd995742e855ba1c3574f9816aa1ad8e8
 size 4877660776

model-00002-of-00004.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d6a0a251b82f46269774f454230cb3a648a554356a978674c93fc384fddd3b26
 size 4932751008

 version https://git-lfs.github.com/spec/v1
+oid sha256:9845c0d6b43877c33c70bbcb9dd092d83eab709ad8366cb341f2b5d88e0950d7
 size 4932751008

model-00003-of-00004.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:2e949a9e49dc00a6cd635d562b96159a73a02fb584a5e63e2e6ef28761d4fbe3
 size 4330865200

 version https://git-lfs.github.com/spec/v1
+oid sha256:0576e362c907628f077efed4ba9f0d063cc6dd9876d6031b273234ebdfe1fa4e
 size 4330865200

model-00004-of-00004.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:957269831bdd6e2896a7c0a379f902da467865fc735d13c682e77a952036c881
 size 1089994880

 version https://git-lfs.github.com/spec/v1
+oid sha256:8ee9a53587de644c4bb737559fc9a74aa4e6c922bf623451f678901f89e7568f
 size 1089994880
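
Note on the shard diffs above: each `.safetensors` entry is a Git LFS pointer (a `version` line, an `oid sha256:<hex>` line, and a `size` line in bytes), not the weights themselves. In this commit only the `oid` values change while every `size` stays the same, which is what one would expect when a training step updates weight values without changing tensor shapes. The following is a minimal sketch, not part of this repository, of how a downloaded shard could be checked against its pointer. The helper names `parse_lfs_pointer` and `verify_against_pointer` are hypothetical, and the demo uses a small in-memory payload rather than a real shard.

```python
import hashlib
import io


def parse_lfs_pointer(text):
    """Split the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields


def verify_against_pointer(pointer_text, stream, chunk_size=1 << 20):
    """Return True if the stream's SHA-256 and byte count match the pointer."""
    fields = parse_lfs_pointer(pointer_text)
    algo, _, expected_hash = fields["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo!r}")

    digest = hashlib.sha256()
    total_bytes = 0
    # Read in chunks so multi-gigabyte shards never need to fit in memory.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
        total_bytes += len(chunk)

    return digest.hexdigest() == expected_hash and total_bytes == int(fields["size"])


if __name__ == "__main__":
    # Stand-in payload for illustration; a real check would open the shard file.
    payload = b"example shard bytes"
    pointer = (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{hashlib.sha256(payload).hexdigest()}\n"
        f"size {len(payload)}\n"
    )
    print(verify_against_pointer(pointer, io.BytesIO(payload)))
```

For a real shard, the same function could be called with the pointer text from the diff and `open("model-00001-of-00004.safetensors", "rb")` as the stream, assuming the file has been fully downloaded.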