---
library_name: transformers
license: apache-2.0
base_model:
- Qwen/Qwen2.5-0.5B-Instruct
tags:
- llama-factory
- full
- generated_from_trainer
model-index:
- name: QwenThinker0.5B
datasets:
- open-thoughts/open-thoughts-114k
---
# QwenThinker0.5B
This model is a fine-tuned version of [Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct) on the
[OpenThoughts-114k](https://huggingface.co/datasets/open-thoughts/OpenThoughts-114k) dataset.
The dataset was created by distilling DeepSeek-R1 using the [data pipeline available on GitHub](https://github.com/open-thoughts/open-thoughts);
more details can be found on the [OpenThoughts-114k dataset card](https://huggingface.co/datasets/open-thoughts/open-thoughts-114k).

The model was trained with [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory).
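Since this is a standard `transformers` causal LM, it can be loaded with the usual chat-template API. A minimal inference sketch follows; the repo id `"QwenThinker0.5B"` is a placeholder taken from the model name above, so substitute the actual Hub path (e.g. `"<username>/QwenThinker0.5B"`).

```python
# Minimal inference sketch for a Qwen2.5-based chat model via transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id; replace with the actual Hub path of this model.
model_id = "QwenThinker0.5B"

messages = [
    {"role": "user", "content": "What is 17 * 24? Think step by step."},
]


def generate(model_id: str, messages: list) -> str:
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    # Build the prompt using the model's chat template.
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    )
    outputs = model.generate(inputs, max_new_tokens=512)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(
        outputs[0][inputs.shape[-1]:], skip_special_tokens=True
    )


if __name__ == "__main__":
    print(generate(model_id, messages))
```

Reasoning-distilled models like this one tend to produce long chains of thought, so a generous `max_new_tokens` budget is advisable.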
### Training hyperparameters
- global_batch_size: 288
- learning_rate: 1e-05
- num_epochs: 1.0
