# 🧠 Gemma 2 2B Thinking
Gemma 2 2B fine-tuned on the Opus Thinking 10k dataset to exhibit explicit chain-of-thought reasoning. The model outputs its reasoning process in a `Thinking...` block before providing the final answer.
## Model Details
| Attribute | Value |
|---|---|
| Base Model | google/gemma-2-2b |
| Training Dataset | opus-research/opus-thinking-10k |
| Training Method | LoRA (r=16, alpha=32) |
| Precision | bfloat16 |
| Training Steps | 566 (~1 epoch on 9k examples) |
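For a rough sense of scale, a rank-r adapter pair adds r·(d_in + d_out) trainable parameters per adapted weight matrix. The sketch below works this out for a square projection at r=16, assuming a hidden size of 2304 (an assumption; the card does not state the model dimensions):

```python
def lora_trainable_params(r: int, d_in: int, d_out: int) -> int:
    """Parameters added by one LoRA adapter pair (A: d_in x r, B: r x d_out)."""
    return r * (d_in + d_out)

# Example: one square 2304x2304 projection (hidden size assumed, r=16 from the table)
per_matrix = lora_trainable_params(16, 2304, 2304)
print(per_matrix)  # 73728 extra parameters per adapted matrix
```

The total adapter size is this figure multiplied by the number of matrices actually targeted, which the card does not list.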
## Output Format

The model follows this format:

```
Thinking...
[Internal reasoning process - analyzing the question, considering options, working through logic]
...done Thinking!

[Final helpful response]
```
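Downstream code often wants only the final answer. A minimal helper for splitting a response on these markers might look like this (the marker strings come from the format above; the function name is illustrative):

```python
import re

def split_thinking(text: str) -> tuple[str, str]:
    """Split a model response into (reasoning, final_answer).

    Assumes the 'Thinking... / ...done Thinking!' markers described above;
    returns ('', text) if no thinking block is found.
    """
    m = re.search(r"Thinking\.\.\.\s*(.*?)\s*\.\.\.done Thinking!\s*(.*)",
                  text, re.DOTALL)
    if not m:
        return "", text.strip()
    return m.group(1), m.group(2)

reasoning, answer = split_thinking(
    "Thinking...\n15 * 7 = 105\n...done Thinking!\nThe answer is 105."
)
print(answer)  # The answer is 105.
```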
## Usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model
base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Load the LoRA adapters
model = PeftModel.from_pretrained(base_model, "opus-research/gemma-2-2b-thinking")
tokenizer = AutoTokenizer.from_pretrained("opus-research/gemma-2-2b-thinking")

# Generate
prompt = "<start_of_turn>user\nWhat is 15 times 7?<end_of_turn>\n<start_of_turn>model\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Example Outputs

Here are real outputs from the fine-tuned model:
### Programming Advice

**User:** Should I learn Python or JavaScript first?

**Model:**

```
Thinking...
Python is more general-purpose, JavaScript is web-focused. Python is easier to learn, JavaScript is more widely used.
...done Thinking!

Python is often seen as a more beginner-friendly language due to its simpler syntax and readability. However, JavaScript is widely used in web development and is essential for creating interactive web applications. The best choice depends on your career goals and interests. If you're interested in web development, start with JavaScript. If you want a more general-purpose language, start with Python.
```
### Humor

**User:** Tell me a joke.

**Model:**

```
Thinking...
Joke search.
...done Thinking!

Okay, here's one: Why don't scientists trust atoms? Because they make up everything!
```
## Notes

- The model successfully learned the `Thinking...` / `...done Thinking!` format
- As a 2B-parameter model, reasoning quality varies; larger models would likely benefit more from this training
- Best results are on advice, explanations, and creative tasks
## Training

Trained using the TRL library with LoRA adapters on 2x RTX 4090 GPUs.
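The card does not publish the full training configuration. A minimal sketch of the adapter setup implied by the details table (r=16, alpha=32) might look like the following; `target_modules` is an assumption, since the card does not say which projections were adapted:

```python
from peft import LoraConfig

# Sketch of the adapter configuration implied by the card (r=16, alpha=32).
# target_modules is an assumption: the card does not list the adapted projections.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```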
### Training Metrics

| Metric | Value |
|---|---|
| Start Loss | 3.56 |
| Final Loss | 0.69 |
| Reduction | 80.6% |
| Training Time | 1h 42m |
| Hardware | 2x NVIDIA RTX 4090 |
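A quick arithmetic check of the reduction figure from the start and final loss:

```python
start_loss, final_loss = 3.56, 0.69
reduction = (1 - final_loss / start_loss) * 100
print(f"{reduction:.1f}%")  # 80.6%
```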
## License
This model inherits the Gemma license from the base model.
## Citation

```bibtex
@misc{opus-gemma-thinking,
  author = {Opus Research},
  title = {Gemma 2 2B Thinking - Chain-of-Thought Fine-tune},
  year = {2024},
  publisher = {Hugging Face},
  url = {https://huggingface.co/opus-research/gemma-2-2b-thinking}
}
```