Training in progress, epoch 1

Files changed (7)

README.md CHANGED

@@ -1,20 +1,17 @@
 ---
-base_model: Qwen/Qwen2.5-Coder-7B-Instruct
-datasets: flattened_successful_trajectories/v0
+base_model: maxkordn/Qwen2.5-Coder-7B-Instruct-Solver-RFT
 library_name: transformers
 model_name: Qwen2.5-Coder-7B-Instruct-Solver-RFT
 tags:
 - generated_from_trainer
-- kordn
 - trl
 - sft
-- cogzero
 licence: license
 ---
 # Model Card for Qwen2.5-Coder-7B-Instruct-Solver-RFT
-This model is a fine-tuned version of [Qwen/Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct) on the [flattened_successful_trajectories/v0](https://huggingface.co/datasets/flattened_successful_trajectories/v0) dataset.
+This model is a fine-tuned version of [maxkordn/Qwen2.5-Coder-7B-Instruct-Solver-RFT](https://huggingface.co/maxkordn/Qwen2.5-Coder-7B-Instruct-Solver-RFT).
 It has been trained using [TRL](https://github.com/huggingface/trl).
 ## Quick start
@@ -30,7 +27,7 @@ print(output["generated_text"])
 ## Training procedure
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/maxkordn-epfl/huggingface/runs/hcuz99tp)
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/maxkordn-epfl/huggingface/runs/qxeslnw9)
 This model was trained with SFT.

config.json CHANGED

@@ -52,7 +52,7 @@
   "tie_word_embeddings": false,
   "torch_dtype": "bfloat16",
   "transformers_version": "4.55.4",
-  "use_cache": true,
+  "use_cache": false,
   "use_sliding_window": false,
   "vocab_size": 152064
 }
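
The only config.json change in this commit flips `use_cache` from `true` to `false`. The commit does not say why; a likely cause (an assumption) is that the trainer disabled the key/value cache during fine-tuning and saved the config that way. For inference the cache can be turned back on. The following is a minimal sketch that patches the raw JSON; the helper name `with_cache_enabled` is hypothetical, and the fragment only mirrors the keys visible in this hunk, not the full file.

```python
import json


def with_cache_enabled(config_text: str) -> str:
    """Return the config JSON with use_cache set to true; other keys are untouched."""
    config = json.loads(config_text)
    config["use_cache"] = True
    return json.dumps(config, indent=2)


# Fragment copied from the new side of the hunk (not the complete config.json).
fragment = """{
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.55.4",
  "use_cache": false,
  "use_sliding_window": false,
  "vocab_size": 152064
}"""

patched = json.loads(with_cache_enabled(fragment))
```

If the file on disk should stay as committed, passing `use_cache=True` to `generate` in transformers is an alternative that overrides the setting for a single call.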

model-00001-of-00004.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:badc9defe1bafcd2745d904af68150eb442295e3948a92975dfdf01b9046479a
+oid sha256:a0b2149b0377e5558ac876d3a94602a03e1d15e77178433377b6d3519cdd8c07
 size 4877660776

model-00002-of-00004.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:0acd8a178d7bf9b4a2abbb657c853e81f3de46c0a2ba51e21cd4ab03a4e357fa
+oid sha256:de4e8b9a1a97d2dd081d19de3bd7c68169fddfbeb2d64fd2ecf27fbc3e5bc733
 size 4932751008

model-00003-of-00004.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:ada56970ab49e361c3d18854ccc45e860285f806813be889bcd06f2f8d99f7fa
+oid sha256:04b76d5584da4750734cd42fc637c4828b8abb4c520e872e0208e17c21f9f3da
 size 4330865200

model-00004-of-00004.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:06d52844c988c60b6936439113cd9986af012ee01e281511f8bd4364438e9f27
+oid sha256:b1acdebf03b123138ed2b8ec4e93c31569f788db5b533f8aa8b806be219d06ca
 size 1089994880
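
Each safetensors diff above is a Git LFS pointer, not the weights themselves: the `oid` is the SHA-256 of the shard's contents and `size` is its length in bytes. All four shards keep their size and change only their hash, which is what a weight update with an unchanged tensor layout looks like. A minimal sketch for checking a downloaded shard against its pointer follows; the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local shard path is left to the caller.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer into version, hash algorithm, digest and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(pointer: dict, path: str, chunk_size: int = 1 << 20) -> bool:
    """Hash the file in chunks and compare both size and digest to the pointer."""
    hasher = hashlib.new(pointer["algo"])
    total = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
            total += len(chunk)
    return total == pointer["size"] and hasher.hexdigest() == pointer["digest"]


# New-side pointer for model-00004-of-00004.safetensors, copied from the diff.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:b1acdebf03b123138ed2b8ec4e93c31569f788db5b533f8aa8b806be219d06ca\n"
    "size 1089994880\n"
)
```

Calling `matches_pointer(pointer, path_to_shard)` on a local copy of the shard then reports whether the download corresponds to this commit rather than the previous one.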

training_args.bin CHANGED

Binary files a/training_args.bin and b/training_args.bin differ