Update README.md
README.md
CHANGED
@@ -27,6 +27,7 @@ cutoff_len: 8192
 per_device_train_batch_size: 1
 gradient_accumulation_steps: 16
 learning_rate: 1.0e-5
+
 num_train_epochs: 1.0
 lr_scheduler_type: cosine
 warmup_ratio: 0.05
@@ -107,7 +108,7 @@ print(output_text)
 We are working on the release of a smaller, more efficient 3B model, which is designed to provide a balance between performance and resource efficiency. This model aims to deliver strong multimodal reasoning capabilities while being more accessible and optimized for environments with limited computational resources, offering a more compact alternative to the current 7B model.
 
 ## R1-Onevision Authors
-- Yi Yang*, Xiaoxuan He*, Hongkun Pan*, Xiyan Jiang, Yan Deng, Xingtao Yang, Haoyu Lu, Minfeng Zhu†, Bo Zhang
+- Yi Yang*, Xiaoxuan He*, Hongkun Pan*, Xiyan Jiang, Yan Deng, Xingtao Yang, Haoyu Lu, Minfeng Zhu†, Bo Zhang†
 - *Equal contribution. †Corresponding authors.
 
 ## Model Contact