---
license: gemma
base_model: google/gemma-2-2b
tags:
- thinking
- reasoning
- chain-of-thought
- cot
- gemma
- fine-tuned
- lora
datasets:
- opus-research/opus-thinking-10k
language:
- en
pipeline_tag: text-generation
---
# 🧠 Gemma 2 2B Thinking
**Gemma 2 2B fine-tuned on the Opus Thinking 10k dataset** to exhibit explicit chain-of-thought reasoning.
This model writes out its reasoning between `Thinking...` and `...done Thinking!` markers before giving the final answer.
## Model Details
| Attribute | Value |
|-----------|-------|
| **Base Model** | [google/gemma-2-2b](https://huggingface.co/google/gemma-2-2b) |
| **Training Dataset** | [opus-research/opus-thinking-10k](https://huggingface.co/datasets/opus-research/opus-thinking-10k) |
| **Training Method** | LoRA (r=16, alpha=32) |
| **Precision** | bfloat16 |
| **Training Steps** | 566 (~1 epoch on 9k examples) |
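The card reports only the rank and alpha. For reference, here is a minimal sketch of what the corresponding `peft` configuration might have looked like; the `target_modules`, dropout, and bias settings are assumptions, not published values:

```python
from peft import LoraConfig

# Hypothetical reconstruction from the reported hyperparameters (r=16, alpha=32).
# target_modules, dropout, and bias are assumptions.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,  # assumed; not stated in the card
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],  # assumed
)
```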
## Output Format
The model follows this format:
```
Thinking...
[Internal reasoning process - analyzing the question, considering options, working through logic]
...done Thinking!
[Final helpful response]
```
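If you want the final answer without the reasoning trace, the two sentinel strings can be separated with a small helper. This is an illustrative snippet, not something shipped with the model:

```python
import re

def split_thinking(text: str) -> tuple[str, str]:
    """Split model output into (thinking, answer).

    Returns an empty thinking string if the model skipped the
    Thinking... block entirely.
    """
    match = re.search(r"Thinking\.\.\.\s*(.*?)\s*\.\.\.done Thinking!\s*",
                      text, re.DOTALL)
    if match:
        return match.group(1), text[match.end():]
    return "", text
```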
## Usage
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model in bfloat16 (the precision used for training)
base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Load the LoRA adapters on top of the base model
model = PeftModel.from_pretrained(base_model, "opus-research/gemma-2-2b-thinking")
tokenizer = AutoTokenizer.from_pretrained("opus-research/gemma-2-2b-thinking")

# Generate using the Gemma chat turn markers
prompt = "<start_of_turn>user\nWhat is 15 times 7?<end_of_turn>\n<start_of_turn>model\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
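For deployment, you may prefer to fold the adapters into the base weights so the model loads without `peft`. PEFT's `merge_and_unload` does exactly this; the save path below is illustrative:

```python
# Fold the LoRA weights into the base model for adapter-free inference
merged = model.merge_and_unload()
merged.save_pretrained("gemma-2-2b-thinking-merged")      # illustrative path
tokenizer.save_pretrained("gemma-2-2b-thinking-merged")
```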
## Example Outputs
Here are real outputs from the fine-tuned model:
### Programming Advice
**User:** Should I learn Python or JavaScript first?
**Model:**
```
Thinking...
Python is more general-purpose, JavaScript is web-focused. Python is easier to learn, JavaScript is more widely used.
...done Thinking!
Python is often seen as a more beginner-friendly language due to its simpler syntax and readability. However, JavaScript is widely used in web development and is essential for creating interactive web applications. The best choice depends on your career goals and interests. If you're interested in web development, start with JavaScript. If you want a more general-purpose language, start with Python.
```
### Humor
**User:** Tell me a joke.
**Model:**
```
Thinking...
Joke search.
...done Thinking!
Okay, here's one: Why don't scientists trust atoms? Because they make up everything!
```
### Notes
- The model reliably learned the `Thinking...` / `...done Thinking!` format.
- As a 2B-parameter model, its reasoning quality varies; larger models would likely benefit more from this training.
- It performs best on advice, explanations, and creative tasks.
## Training
Trained using the [TRL](https://github.com/huggingface/trl) library with LoRA adapters on 2x RTX 4090 GPUs.
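The exact training script is not published. A hedged sketch of a TRL SFT setup consistent with the card follows; only the LoRA settings, bfloat16 precision, and ~1 epoch are stated, so the batch size, learning rate, and dataset column name are assumptions:

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Sketch only: most hyperparameters below are assumptions.
dataset = load_dataset("opus-research/opus-thinking-10k", split="train")

config = SFTConfig(
    output_dir="gemma-2-2b-thinking",
    num_train_epochs=1,
    per_device_train_batch_size=4,   # assumed
    learning_rate=2e-4,              # assumed
    bf16=True,
    dataset_text_field="text",       # assumed column name
)

trainer = SFTTrainer(
    model=base_model,        # base model as loaded in the Usage section
    args=config,
    train_dataset=dataset,
    peft_config=lora_config, # LoRA sketch from the Model Details section
)
trainer.train()
```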
### Loss Curve
![Training Loss](training_loss_chart.png)
| Metric | Value |
|--------|-------|
| **Start Loss** | 3.56 |
| **Final Loss** | 0.69 |
| **Reduction** | 80.7% |
| **Training Time** | 1h 42m |
| **Hardware** | 2x NVIDIA RTX 4090 |
## License
This model inherits the [Gemma license](https://ai.google.dev/gemma/terms) from the base model.
## Citation
```bibtex
@misc{opus-gemma-thinking,
author = {Opus Research},
title = {Gemma 2 2B Thinking - Chain-of-Thought Fine-tune},
year = {2024},
publisher = {Hugging Face},
url = {https://huggingface.co/opus-research/gemma-2-2b-thinking}
}
```