pintu008 committed on
Commit 096f789 · verified · 1 Parent(s): c28bb42

Training in progress, step 10

README.md CHANGED
@@ -1,18 +1,17 @@
  ---
  base_model: Qwen/Qwen2.5-VL-3B-Instruct
- datasets: lmms-lab/multimodal-open-r1-8k-verified
  library_name: transformers
  model_name: Qwen2.5-VL-3B-Instruct-Thinking
  tags:
  - generated_from_trainer
- - grpo
  - trl
+ - grpo
  licence: license
  ---

  # Model Card for Qwen2.5-VL-3B-Instruct-Thinking

- This model is a fine-tuned version of [Qwen/Qwen2.5-VL-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct) on the [lmms-lab/multimodal-open-r1-8k-verified](https://huggingface.co/datasets/lmms-lab/multimodal-open-r1-8k-verified) dataset.
+ This model is a fine-tuned version of [Qwen/Qwen2.5-VL-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct).
  It has been trained using [TRL](https://github.com/huggingface/trl).

  ## Quick start
@@ -37,7 +36,7 @@ This model was trained with GRPO, a method introduced in [DeepSeekMath: Pushing

  - TRL: 0.22.0.dev0
  - Transformers: 4.55.0
- - Pytorch: 2.8.0+cu126
+ - Pytorch: 2.6.0+cu124
  - Datasets: 4.0.0
  - Tokenizers: 0.21.4

adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:15be507a01940bd549ca58762394c90ef0c64a8f56cea5c290f19bd9dc6ed61b
+ oid sha256:a0cec4cb2b8bbb020c1e706b65b5eecf0a83a80193c8a07d5882dfa4ba26aa80
  size 7393888
runs/Aug08_05-50-33_e9baed9eaec7/events.out.tfevents.1754632252.e9baed9eaec7.184.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4132bab9572b1f6ebdb8a25ca29eaec9e9b78fdb6a0f4c9056946af06c79f12b
+ size 10601
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:73a8c18c0fe004efd26f09e3e0731b3358c4bc6700d7661f43d85ee581a21741
- size 6993
+ oid sha256:131042c4a2334e315d9e0e0b90a758a53888c968f07cf068c453a0cdb6c68a5c
+ size 6584
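
The binary files in this commit are stored as Git LFS pointers: three `key value` lines giving the spec version, a content hash, and the size in bytes. As a minimal sketch of how such pointers could be compared, the Python below parses the old and new `training_args.bin` pointers shown above. The names `LfsPointer` and `parse_lfs_pointer` are hypothetical helpers written for this illustration and are not part of Git LFS or any Hugging Face library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LfsPointer:
    """Fields of a Git LFS pointer file (hypothetical helper type)."""
    version: str
    oid_algo: str
    oid: str
    size: int


def parse_lfs_pointer(text: str) -> LfsPointer:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is a key, one space, then the value.
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid value has the form '<algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return LfsPointer(fields["version"], algo, digest, int(fields["size"]))


# Pointer contents copied from the training_args.bin diff in this commit.
old = parse_lfs_pointer("""\
version https://git-lfs.github.com/spec/v1
oid sha256:73a8c18c0fe004efd26f09e3e0731b3358c4bc6700d7661f43d85ee581a21741
size 6993
""")
new = parse_lfs_pointer("""\
version https://git-lfs.github.com/spec/v1
oid sha256:131042c4a2334e315d9e0e0b90a758a53888c968f07cf068c453a0cdb6c68a5c
size 6584
""")

print("content changed:", old.oid != new.oid)
print("size delta (bytes):", new.size - old.size)
```

A changed `oid` with an unchanged `size`, as in `adapter_model.safetensors`, indicates that the file contents differ while the byte length stays the same.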