wheattoast11 committed on
Commit a2b3537 · verified · 1 Parent(s): eb4697e

Training in progress, step 50

README.md CHANGED
@@ -4,11 +4,10 @@ library_name: transformers
  model_name: OmniCoder-9B-Zero-Phase2
  tags:
  - generated_from_trainer
- - trl
- - grpo
- - trackio:https://wheattoast11-trackio.hf.space?project=zero-rl&runs=phase2-vlm-grpo-carl&sidebar=collapsed
  - hf_jobs
- - trackio
+ - trackio:https://wheattoast11-trackio.hf.space?project=zero-rl&runs=phase2-vlm-grpo-carl-v2&sidebar=collapsed
+ - grpo
+ - trl
  licence: license
  ---
 
@@ -31,7 +30,7 @@ print(output["generated_text"])
  ## Training procedure
 
 
- [<img src="https://raw.githubusercontent.com/gradio-app/trackio/refs/heads/main/trackio/assets/badge.png" alt="Visualize in Trackio" title="Visualize in Trackio" width="150" height="24"/>](https://wheattoast11-trackio.hf.space?project=zero-rl&runs=phase2-vlm-grpo-carl&sidebar=collapsed)
+ [<img src="https://raw.githubusercontent.com/gradio-app/trackio/refs/heads/main/trackio/assets/badge.png" alt="Visualize in Trackio" title="Visualize in Trackio" width="150" height="24"/>](https://wheattoast11-trackio.hf.space?project=zero-rl&runs=phase2-vlm-grpo-carl-v2&sidebar=collapsed)
 
 
  This model was trained with GRPO, a method introduced in [DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models](https://huggingface.co/papers/2402.03300).
@@ -39,8 +38,8 @@ This model was trained with GRPO, a method introduced in [DeepSeekMath: Pushing
  ### Framework versions
 
  - TRL: 1.0.0
- - Transformers: 5.4.0
- - Pytorch: 2.11.0
+ - Transformers: 5.5.0
+ - Pytorch: 2.6.0
  - Datasets: 4.8.4
  - Tokenizers: 0.22.2
 
adapter_config.json CHANGED
@@ -29,13 +29,13 @@
  "rank_pattern": {},
  "revision": null,
  "target_modules": [
- "k_proj",
- "q_proj",
- "up_proj",
  "v_proj",
- "o_proj",
  "down_proj",
- "gate_proj"
+ "q_proj",
+ "k_proj",
+ "gate_proj",
+ "up_proj",
+ "o_proj"
  ],
  "target_parameters": null,
  "task_type": "CAUSAL_LM",
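The `target_modules` hunk above touches five lines, but it appears to be a reordering only. A minimal sketch to confirm that, using the two module lists copied from the hunk (the helper name `diff_targets` is hypothetical, not part of any library). One plausible cause is that the list is serialized from an unordered set; that is an assumption, not something the diff states.

```python
# Module lists copied from the old and new sides of the adapter_config.json hunk.
old_targets = ["k_proj", "q_proj", "up_proj", "v_proj", "o_proj", "down_proj", "gate_proj"]
new_targets = ["v_proj", "down_proj", "q_proj", "k_proj", "gate_proj", "up_proj", "o_proj"]


def diff_targets(old, new):
    """Return (added, removed) module names, ignoring list order."""
    old_set, new_set = set(old), set(new)
    return sorted(new_set - old_set), sorted(old_set - new_set)


added, removed = diff_targets(old_targets, new_targets)
print("added:", added, "removed:", removed)  # added: [] removed: []
```

Both lists come back empty, so the set of adapted modules is the same before and after this commit.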
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:7646df83e1efe2e482c6981fac6087cdd3c9cb9fc1c824f3ccb76f9d947cbae8
- size 232818320
+ oid sha256:33f5eff5f769d20354bddfd57a6fc3de820b7503147aecdc2b8d01256ce98697
+ size 465602056
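The binary files in this commit are stored as Git LFS pointers, so the diff shows only an `oid` and a `size` per side. A small sketch of reading those fields and comparing the two `adapter_model.safetensors` pointers from the hunk above (`parse_lfs_pointer` is a hypothetical helper written for this note, not a Git LFS API):

```python
def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size in bytes of the real file
    }


# Pointer contents copied from the old and new sides of the hunk.
old = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:7646df83e1efe2e482c6981fac6087cdd3c9cb9fc1c824f3ccb76f9d947cbae8\n"
    "size 232818320\n"
)
new = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:33f5eff5f769d20354bddfd57a6fc3de820b7503147aecdc2b8d01256ce98697\n"
    "size 465602056\n"
)

growth = new["size"] - old["size"]
print(f"adapter grew by {growth} bytes ({new['size'] / old['size']:.2f}x)")
```

The adapter file roughly doubles in size between the two pointers. The diff itself does not say why; the pointer only records the hash and byte count.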
tokenizer.json CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:16e71421c5d4e5b01c8be19a0ed8c9b91f9fd257ed0b17b39182561f48376ca9
- size 19989441
+ oid sha256:87a7830d63fcf43bf241c3c5242e96e62dd3fdc29224ca26fed8ea333db72de4
+ size 19989343
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:c1812a19d9d40df9d7dbf1be808a6660c4110138b77f3296aaadddc82f33eb5b
- size 7249
+ oid sha256:751d4f9e6ed7152c38cfc2dd82e1b7212694c0a84f4fa167e942a23f6a708ffb
+ size 6904