stephenchungmh committed on
Commit a23f2b7 · verified · 1 Parent(s): c305b7d

Update README.md

Files changed (1)
  1. README.md +9 -3
README.md CHANGED
@@ -1,3 +1,9 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ base_model:
+ - deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
+ tags:
+ - RL
+ - Math
+ ---
+ This is the trained Thinker-R1.5B model from the paper [**Thinker: Learning to Think Fast and Slow**](https://arxiv.org/abs/2505.21097). Please refer to the [GitHub repo](https://github.com/stephen-chung-mh/thinker-task) for details.