ngxson
/

MiniThinky-1.7B-SmolLM2

Text Generation

text-generation-inference

Model card Files Files and versions

MiniThinky-1.7B-SmolLM2 / README.md

ngxson's picture

ngxson HF Staff

Update README.md

7ce6efa verified about 1 year ago

|

history blame contribute delete

961 Bytes

	---
	library_name: transformers
	tags:
	- trl
	- sft
	base_model:
	- HuggingFaceTB/SmolLM2-1.7B-Instruct
	datasets:
	- ngxson/MiniThinky-dataset
	---

	# MiniThinky 1.7B (based on SmolLM2)

	> [!IMPORTANT]
	> This checkpoint still have a high loss value, so the model will hallucinate the response quite a lot.

	My first trial to fine tune a small model to add reasoning capability.

	Chat template is the same with llama 3, but the response will be as follow:

	```
	<\|thinking\|>{thinking_process}
	<\|answer\|>
	{real_answer}
	```

	## IMPORTANT: System message

	The model is very sensitive to system message. Make sure you're using this system message (system role) at the beginning of the conversation:

	`You are MiniThinky, a helpful AI assistant. You always think before giving the answer. Use <\|thinking\|> before thinking and <\|answer\|> before giving the answer.`

	---

	TODO: include more info here + maybe do some benchmarks? (Plz add a discussion if you're interested)