Model save

Files changed (10)

README.md CHANGED

@@ -1,11 +1,9 @@
 ---
 base_model: Qwen/Qwen2.5-Math-7B
-datasets: open-r1/OpenR1-Math-220k
 library_name: transformers
 model_name: Qwen-2.5-7B-Simple-RL-OpenR1-Data
 tags:
 - generated_from_trainer
-- open-r1
 - trl
 - grpo
 licence: license
@@ -13,7 +11,7 @@ licence: license
 # Model Card for Qwen-2.5-7B-Simple-RL-OpenR1-Data
-This model is a fine-tuned version of [Qwen/Qwen2.5-Math-7B](https://huggingface.co/Qwen/Qwen2.5-Math-7B) on the [open-r1/OpenR1-Math-220k](https://huggingface.co/datasets/open-r1/OpenR1-Math-220k) dataset.
 It has been trained using [TRL](https://github.com/huggingface/trl).
 ## Quick start
@@ -29,7 +27,7 @@ print(output["generated_text"])
 ## Training procedure
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/xu-chenhui-university-at-buffalo/huggingface/runs/pwba5bp1)
 This model was trained with GRPO, a method introduced in [DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models](https://huggingface.co/papers/2402.03300).

 ---
 base_model: Qwen/Qwen2.5-Math-7B
 library_name: transformers
 model_name: Qwen-2.5-7B-Simple-RL-OpenR1-Data
 tags:
 - generated_from_trainer
 - trl
 - grpo
 licence: license
 # Model Card for Qwen-2.5-7B-Simple-RL-OpenR1-Data
+This model is a fine-tuned version of [Qwen/Qwen2.5-Math-7B](https://huggingface.co/Qwen/Qwen2.5-Math-7B).
 It has been trained using [TRL](https://github.com/huggingface/trl).
 ## Quick start
 ## Training procedure
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/xu-chenhui-university-at-buffalo/huggingface/runs/gv19br8l)
 This model was trained with GRPO, a method introduced in [DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models](https://huggingface.co/papers/2402.03300).

all_results.json CHANGED

@@ -1,8 +1,8 @@
 {
     "total_flos": 0.0,
-    "train_loss": 0.01471865121748956,
-    "train_runtime": 485753.1625,
     "train_samples": 93733,
-    "train_samples_per_second": 0.193,
-    "train_steps_per_second": 0.006
 }

 {
     "total_flos": 0.0,
+    "train_loss": 0.22819482568542057,
+    "train_runtime": 410492.8024,
     "train_samples": 93733,
+    "train_samples_per_second": 0.228,
+    "train_steps_per_second": 0.007
 }
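
As a quick sanity check on the metrics above, the following sketch recomputes throughput from the two sides of the diff. It assumes that `train_samples_per_second` is simply `train_samples / train_runtime` rounded to three decimals. The reported values are consistent with that reading, but the diff itself does not state the formula. The helper names are hypothetical.

```python
# Values copied from the old (-) and new (+) sides of all_results.json.
OLD_RUN = {"train_runtime": 485753.1625, "train_samples": 93733}
NEW_RUN = {"train_runtime": 410492.8024, "train_samples": 93733}


def samples_per_second(run: dict) -> float:
    """Assumed definition: samples divided by wall-clock seconds, rounded to 3 places."""
    return round(run["train_samples"] / run["train_runtime"], 3)


def runtime_hours(run: dict) -> float:
    """Convert the reported runtime from seconds to hours."""
    return run["train_runtime"] / 3600.0


if __name__ == "__main__":
    for name, run in (("old", OLD_RUN), ("new", NEW_RUN)):
        print(f"{name}: {runtime_hours(run):.1f} h, "
              f"{samples_per_second(run)} samples/s")
```

Under that assumption, the recomputed rates agree with the `0.193` and `0.228` values recorded in the file.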

config.json CHANGED

@@ -23,7 +23,7 @@
   "tie_word_embeddings": false,
   "torch_dtype": "bfloat16",
   "transformers_version": "4.49.0",
-  "use_cache": true,
   "use_mrope": false,
   "use_sliding_window": false,
   "vocab_size": 152064

   "tie_word_embeddings": false,
   "torch_dtype": "bfloat16",
   "transformers_version": "4.49.0",
+  "use_cache": false,
   "use_mrope": false,
   "use_sliding_window": false,
   "vocab_size": 152064

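The saved config now ships with `use_cache` set to `false`. The diff does not say why. A common reason is that the cache was disabled during training, but that is an assumption here. A reader who wants the key/value cache back for inference could flip the flag in a local copy of the config. The sketch below does that with the standard library only; `enable_kv_cache` and `SAVED_FRAGMENT` are hypothetical names, and the fragment contains just the keys visible in this hunk.

```python
import json

# Only the keys visible in the hunk above; the real config.json has more.
SAVED_FRAGMENT = """{
  "use_cache": false,
  "use_mrope": false,
  "use_sliding_window": false,
  "vocab_size": 152064
}"""


def enable_kv_cache(config_text: str) -> str:
    """Return the config JSON with use_cache switched on, other keys untouched."""
    config = json.loads(config_text)
    config["use_cache"] = True
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    print(enable_kv_cache(SAVED_FRAGMENT))
```
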
model-00001-of-00004.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:22ba60c8335aa256e395cd11ac854d771bc9d5dc5293ed1becefa3e41b3e317a
 size 4877660776

 version https://git-lfs.github.com/spec/v1
+oid sha256:e11bf01fed228d7a3fa59f2a0ee51511cafdd87bd411855b88b19d94b1577640
 size 4877660776

model-00002-of-00004.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:b099ee524a8a491936822feab0ca0875f83e135b5224410954d236e0782d3ba8
 size 4932751008

 version https://git-lfs.github.com/spec/v1
+oid sha256:372ce04fa1c87d1cea954b8f4d4a3269272ca9a7b524dad2f48fae0c04f93a73
 size 4932751008

model-00003-of-00004.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:eba14ad181a3bf5fc37a2ac90d1564d7d122ec53daa88bf86db73e295bc66ff3
 size 4330865200

 version https://git-lfs.github.com/spec/v1
+oid sha256:c1c15f6584c53531d940b627a2e987c99936e15a09bc27bc54fd27b87c249d33
 size 4330865200

model-00004-of-00004.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:bcf41fb37d7a778f78aaf69e5140046d26477fdefe73391786e510609ab8de7e
 size 1089994880

 version https://git-lfs.github.com/spec/v1
+oid sha256:6f457efb31a242e68d6b7b04caef47e19a501f189a02b0c65babc9881298383e
 size 1089994880

train_results.json CHANGED

@@ -1,8 +1,8 @@
 {
     "total_flos": 0.0,
-    "train_loss": 0.01471865121748956,
-    "train_runtime": 485753.1625,
     "train_samples": 93733,
-    "train_samples_per_second": 0.193,
-    "train_steps_per_second": 0.006
 }

 {
     "total_flos": 0.0,
+    "train_loss": 0.22819482568542057,
+    "train_runtime": 410492.8024,
     "train_samples": 93733,
+    "train_samples_per_second": 0.228,
+    "train_steps_per_second": 0.007
 }

trainer_state.json CHANGED

The diff for this file is too large to render. See raw diff

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f717cc9f7e41458609bc8857ccd78ecba01a34d1bd8c12bea439bbf245970886
 size 8056

 version https://git-lfs.github.com/spec/v1
+oid sha256:23d3446e7eb88dd8ed6488d1ec794fe2372c1e540b1d400b7d85b10c48142f9d
 size 8056
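
The binary files in this commit appear only as Git LFS pointers (a `version` line, an `oid sha256:...` line, and a `size` line). The sketch below shows one way a downloaded file could be checked against such a pointer: compare the byte size first, then the streamed digest. The function names are hypothetical, and the pointer text is the new side of `training_args.bin` copied from above.

```python
import hashlib
from pathlib import Path

# New-side pointer for training_args.bin, copied from the diff above.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:23d3446e7eb88dd8ed6488d1ec794fe2372c1e540b1d400b7d85b10c48142f9d
size 8056
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a pointer file into its fields."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def matches_pointer(path: Path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Check a local file's size and hash against a parsed pointer."""
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Stream in chunks so multi-gigabyte shards are never fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]
```

For example, `matches_pointer(Path("training_args.bin"), parse_lfs_pointer(POINTER_TEXT))` would return `True` only if the local file has both the recorded size and the recorded digest.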