Salma204 committed on
Commit 0100b42 · verified · 1 Parent(s): aed4a8c

GRPO v1 (thinking ON) - val 51.5 pct, replaces SFT v1_off

Files changed (2)
  1. chat_template.jinja +1 -1
  2. model.safetensors +1 -1
chat_template.jinja CHANGED
@@ -1,4 +1,4 @@
- {%- set enable_thinking = false %}
+ {%- set enable_thinking = true %}
  {%- if tools %}
  {{- '<|im_start|>system\n' }}
  {%- if messages[0].role == 'system' %}
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:4c3acee8d898fbb647ec2693dd78cc3504834f4a2ea57c2131792206c083588b
+ oid sha256:30724f44c5f51cf2e51e565e763d7d6f40dd7d481d1fec9fd24c85ff369d189b
  size 3441185608
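
The model.safetensors entry above is a Git LFS pointer, not the weights themselves: the `oid` changed while `size` stayed at 3441185608 bytes, which is what you would expect when weights are retrained without changing the architecture. A minimal sketch of how a downloaded file could be checked against such a pointer is shown below. The helper names `parse_lfs_pointer` and `stream_matches` are hypothetical and not part of any Git LFS or Hugging Face tooling; only the pointer text is taken from this commit.

```python
import hashlib
import io


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid field has the form '<algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def stream_matches(stream, pointer, chunk_size=1 << 20):
    """Return True if the bytes read from `stream` match the pointer's size and hash."""
    if pointer["algo"] != "sha256":
        raise ValueError("only sha256 oids are handled in this sketch")
    hasher = hashlib.sha256()
    total = 0
    # Read in chunks so a multi-gigabyte file is never held in memory at once.
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        hasher.update(chunk)
        total += len(chunk)
    return total == pointer["size"] and hasher.hexdigest() == pointer["digest"]


# Pointer contents for the new model.safetensors, copied from the diff above.
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:30724f44c5f51cf2e51e565e763d7d6f40dd7d481d1fec9fd24c85ff369d189b
size 3441185608
"""

pointer = parse_lfs_pointer(NEW_POINTER)
print(pointer["size"])  # 3441185608

# Tiny in-memory demonstration; the digest is computed here, not taken from the commit.
payload = b"demo"
demo_pointer = {
    "version": pointer["version"],
    "algo": "sha256",
    "digest": hashlib.sha256(payload).hexdigest(),
    "size": len(payload),
}
print(stream_matches(io.BytesIO(payload), demo_pointer))  # True
```

To check a real download, one would open the local file in binary mode and pass the handle, for example `with open("model.safetensors", "rb") as f: stream_matches(f, pointer)`, assuming the file sits in the current directory.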