maxkordn committed
Commit ed61be4 · verified · 1 Parent(s): c15d218

Training in progress, epoch 1

README.md CHANGED
@@ -1,20 +1,17 @@
  ---
- base_model: Qwen/Qwen2.5-Coder-7B-Instruct
- datasets: flattened_successful_trajectories/v0
+ base_model: maxkordn/Qwen2.5-Coder-7B-Instruct-Solver-RFT
  library_name: transformers
  model_name: Qwen2.5-Coder-7B-Instruct-Solver-RFT
  tags:
  - generated_from_trainer
- - kordn
  - trl
  - sft
- - cogzero
  licence: license
  ---
 
  # Model Card for Qwen2.5-Coder-7B-Instruct-Solver-RFT
 
- This model is a fine-tuned version of [Qwen/Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct) on the [flattened_successful_trajectories/v0](https://huggingface.co/datasets/flattened_successful_trajectories/v0) dataset.
+ This model is a fine-tuned version of [maxkordn/Qwen2.5-Coder-7B-Instruct-Solver-RFT](https://huggingface.co/maxkordn/Qwen2.5-Coder-7B-Instruct-Solver-RFT).
  It has been trained using [TRL](https://github.com/huggingface/trl).
 
  ## Quick start
@@ -30,7 +27,7 @@ print(output["generated_text"])
 
  ## Training procedure
 
- [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/maxkordn-epfl/huggingface/runs/hcuz99tp)
+ [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/maxkordn-epfl/huggingface/runs/qxeslnw9)
 
 
  This model was trained with SFT.
config.json CHANGED
@@ -52,7 +52,7 @@
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.55.4",
- "use_cache": true,
+ "use_cache": false,
  "use_sliding_window": false,
  "vocab_size": 152064
  }
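
Note on the config.json change: this checkpoint is saved with "use_cache": false. The commit does not say why; a common cause (an assumption here, not something the diff confirms) is that the key-value cache is turned off during training, for example alongside gradient checkpointing. For inference the cache is usually wanted back on. A minimal sketch of flipping the flag in a config's JSON text, where `enable_kv_cache` and `sample` are hypothetical names used only for illustration:

```python
import json


def enable_kv_cache(config_text: str) -> str:
    """Return the config JSON with "use_cache" set to true.

    Hypothetical helper for illustration; all other keys are left untouched.
    """
    config = json.loads(config_text)
    config["use_cache"] = True
    return json.dumps(config, indent=2)


# Trimmed stand-in for the committed config.json (only keys shown in the diff).
sample = """{
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.55.4",
  "use_cache": false,
  "use_sliding_window": false,
  "vocab_size": 152064
}"""

patched = json.loads(enable_kv_cache(sample))
print(patched["use_cache"])
```

The same effect can typically be had without editing the file by overriding the setting at load or generation time; the sketch edits the JSON only because that is what the diff touches.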
model-00001-of-00004.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:badc9defe1bafcd2745d904af68150eb442295e3948a92975dfdf01b9046479a
+ oid sha256:a0b2149b0377e5558ac876d3a94602a03e1d15e77178433377b6d3519cdd8c07
  size 4877660776
model-00002-of-00004.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:0acd8a178d7bf9b4a2abbb657c853e81f3de46c0a2ba51e21cd4ab03a4e357fa
+ oid sha256:de4e8b9a1a97d2dd081d19de3bd7c68169fddfbeb2d64fd2ecf27fbc3e5bc733
  size 4932751008
model-00003-of-00004.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:ada56970ab49e361c3d18854ccc45e860285f806813be889bcd06f2f8d99f7fa
+ oid sha256:04b76d5584da4750734cd42fc637c4828b8abb4c520e872e0208e17c21f9f3da
  size 4330865200
model-00004-of-00004.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:06d52844c988c60b6936439113cd9986af012ee01e281511f8bd4364438e9f27
+ oid sha256:b1acdebf03b123138ed2b8ec4e93c31569f788db5b533f8aa8b806be219d06ca
  size 1089994880
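
Note on the safetensors diffs: each shard is tracked as a Git LFS pointer, so only the `oid sha256:` line changes while `size` stays the same, which is consistent with the weights being overwritten in place by new values of the same shape. A minimal sketch of checking a downloaded shard against its pointer, assuming the three-line pointer layout shown above; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, not part of any library:

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into its version, hash algorithm, digest and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Hash the file in chunks and compare both size and digest to the pointer."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


# New pointer for model-00004-of-00004.safetensors, copied from the diff above.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:b1acdebf03b123138ed2b8ec4e93c31569f788db5b533f8aa8b806be219d06ca\n"
    "size 1089994880\n"
)
print(pointer["algo"], pointer["size"])
```

Calling `matches_pointer("model-00004-of-00004.safetensors", pointer)` on a local copy would then report whether the file is the post-commit version of the shard.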
training_args.bin CHANGED
Binary files a/training_args.bin and b/training_args.bin differ