Instructions for using langdai/gemma-2-2b-it-tool-think with libraries and local apps.

Libraries

How to use langdai/gemma-2-2b-it-tool-think with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="langdai/gemma-2-2b-it-tool-think")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("langdai/gemma-2-2b-it-tool-think")
model = AutoModelForCausalLM.from_pretrained("langdai/gemma-2-2b-it-tool-think")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use langdai/gemma-2-2b-it-tool-think with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "langdai/gemma-2-2b-it-tool-think"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "langdai/gemma-2-2b-it-tool-think",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
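
The same request can be sent from Python. The following is a minimal sketch using only the standard library, assuming the server started above is listening on `localhost:8000`; the helper names `build_chat_request` and `extract_reply` are illustrative and not part of vLLM.

```python
import json
import urllib.request


def build_chat_request(prompt, base_url="http://localhost:8000",
                       model="langdai/gemma-2-2b-it-tool-think"):
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(response_body):
    """Return the assistant text from a parsed chat completions response."""
    return response_body["choices"][0]["message"]["content"]


request = build_chat_request("What is the capital of France?")
print(request.full_url)

# With the server running, send the request like this:
# with urllib.request.urlopen(request) as response:
#     print(extract_reply(json.load(response)))
```

Passing `base_url="http://localhost:30000"` should target an SGLang server started as described in the SGLang section, since both expose the same OpenAI-compatible route.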

Use Docker

docker model run hf.co/langdai/gemma-2-2b-it-tool-think

SGLang

How to use langdai/gemma-2-2b-it-tool-think with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "langdai/gemma-2-2b-it-tool-think" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "langdai/gemma-2-2b-it-tool-think",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "langdai/gemma-2-2b-it-tool-think" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "langdai/gemma-2-2b-it-tool-think",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use langdai/gemma-2-2b-it-tool-think with Docker Model Runner:
```
docker model run hf.co/langdai/gemma-2-2b-it-tool-think
```

Model Card for langdai/gemma-2-2b-it-tool-think

This is a standalone model: the PEFT fine-tuned adapter has been merged into the base model weights, so no separate adapter needs to be loaded.

Model Details

Model Description

This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.

Developed by: Liching
Funded by: hobby project
Model type: text-generation
Language(s) (NLP): English
License: MIT
Finetuned from model: gemma-2b-it

Uses

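The base gemma-2b-it model does not support tool calling and does not reason before it responds the way the recently developed DeepSeek-R1 does. This model is fine-tuned to address both limitations: it writes its reasoning between <think> tags and then emits tool calls between <tool_call> tags.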
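
The model only produces the tool call; the calling application still has to run it. The following is a minimal sketch of that step, assuming a call shaped like the FunctionCall schema used in this card's prompt (a `name` plus an `arguments` object). The `TOOLS` registry, the `run_tool_call` helper, and the fixed exchange rate are hypothetical and only illustrate the flow.

```python
def convert_currency(amount, from_currency, to_currency):
    """Toy implementation of the convert_currency tool."""
    # Hypothetical fixed rate table, for illustration only.
    rates = {("INR", "EUR"): 0.011}
    return round(amount * rates[(from_currency, to_currency)], 2)


# Hypothetical registry mapping tool names to Python callables.
TOOLS = {"convert_currency": convert_currency}


def run_tool_call(call):
    """Look up the requested tool and call it with the given arguments."""
    name = call.get("name")
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name!r}")
    return TOOLS[name](**call.get("arguments", {}))


call = {
    "name": "convert_currency",
    "arguments": {"amount": 500, "from_currency": "INR", "to_currency": "EUR"},
}
result = run_tool_call(call)
print(result)
```

Rejecting unknown tool names before calling anything keeps a malformed or hallucinated call from reaching application code.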

Bias, Risks, and Limitations

The model was fine-tuned for only 1 epoch, so it is prone to bias and errors.

How to Get Started with the Model

Use the code below to get started with the model.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = "langdai/gemma-2-2b-it-tool-think"

# Load the merged model onto the first GPU.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="cuda:0")
tokenizer = AutoTokenizer.from_pretrained(model_id)
# model.to(torch.bfloat16)  # optional: cast to bfloat16 to reduce memory use
model.eval()
generator = pipeline("text-generation", model=model, tokenizer=tokenizer)

prompt="""<bos><start_of_turn>human
You are a function calling AI model. You are provided with function signatures within <tools></tools> XML tags.You may call one or more functions to assist with the user query. Don't make assumptions about what values to plug into functions.Here are the available tools:<tools> [{'type': 'function', 'function': {'name': 'convert_currency', 'description': 'Convert from one currency to another', 'parameters': {'type': 'object', 'properties': {'amount': {'type': 'number', 'description': 'The amount to convert'}, 'from_currency': {'type': 'string', 'description': 'The currency to convert from'}, 'to_currency': {'type': 'string', 'description': 'The currency to convert to'}}, 'required': ['amount', 'from_currency', 'to_currency']}}}, {'type': 'function', 'function': {'name': 'calculate_distance', 'description': 'Calculate the distance between two locations', 'parameters': {'type': 'object', 'properties': {'start_location': {'type': 'string', 'description': 'The starting location'}, 'end_location': {'type': 'string', 'description': 'The ending location'}}, 'required': ['start_location', 'end_location']}}}] </tools>Use the following pydantic model json schema for each tool call you will make: {'title': 'FunctionCall', 'type': 'object', 'properties': {'arguments': {'title': 'Arguments', 'type': 'object'}, 'name': {'title': 'Name', 'type': 'string'}}, 'required': ['arguments', 'name']}For each function call return a json object with function name and arguments within <tool_call></tool_call> XML tags as follows:
<tool_call>
{tool_call}
</tool_call>Also, before making a call to a function take the time to plan the function to take. Make that thinking process between <think>{your thoughts}</think>

Hi, I need to convert 500 INR to Euros. Can you help me with that?<end_of_turn><eos>
<start_of_turn>model
<think>"""

output = generator([{"role": "user", "content": prompt}], max_new_tokens=512, return_full_text=False)[0]

print(output)
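
The prompt above asks for reasoning inside <think> tags and each function call inside <tool_call> tags. The following is a minimal sketch of splitting generated text into those two parts, assuming the model follows that format. The `parse_response` helper and the sample string are hypothetical, not actual model output. Because the prompt already ends with `<think>`, the generated text is assumed to start inside the thinking block.

```python
import ast
import json
import re

TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def parse_response(generated_text):
    """Split generated text into (thoughts, tool_calls)."""
    thoughts, sep, rest = generated_text.partition("</think>")
    if not sep:
        # No closing tag: treat everything as the answer, with no thoughts.
        thoughts, rest = "", generated_text
    # Tolerate a repeated opening tag at the start of the output.
    thoughts = thoughts.replace("<think>", "").strip()

    tool_calls = []
    for payload in TOOL_CALL_RE.findall(rest):
        try:
            call = json.loads(payload)
        except json.JSONDecodeError:
            # Fall back to Python-literal syntax (single-quoted keys).
            try:
                call = ast.literal_eval(payload)
            except (ValueError, SyntaxError):
                continue  # skip payloads that cannot be parsed
        if isinstance(call, dict) and "name" in call:
            tool_calls.append(call)
    return thoughts, tool_calls


# Hypothetical output, written by hand to match the prompt's format.
sample = (
    "The user wants to convert 500 INR to EUR, so convert_currency fits."
    "</think>\n"
    "<tool_call>\n"
    "{'name': 'convert_currency', 'arguments': "
    "{'amount': 500, 'from_currency': 'INR', 'to_currency': 'EUR'}}\n"
    "</tool_call>"
)
thoughts, tool_calls = parse_response(sample)
print(thoughts)
print(tool_calls)
```

The fallback to `ast.literal_eval` is there because the tool definitions in the prompt use single-quoted Python-style dictionaries, so the model may answer in the same style rather than strict JSON.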

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

Hardware Type: T4 24GPU
Hours used: 4
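
As a rough illustration of how such an estimate is formed, emissions are approximately the energy drawn during training multiplied by the carbon intensity of the electricity grid. The sketch below assumes a single T4 running at its rated 70 W for the 4 hours listed above. The grid intensity of 0.4 kg CO2eq/kWh is a placeholder and not a measured value for this training run, so the printed figure is not a reported result.

```python
def estimate_emissions_kg(power_watts, hours, kg_co2_per_kwh, gpu_count=1):
    """Estimate emissions as energy used (kWh) times grid carbon intensity."""
    energy_kwh = power_watts / 1000 * hours * gpu_count
    return energy_kwh * kg_co2_per_kwh


# Assumptions: one T4 at its rated 70 W, 4 hours of training,
# and a placeholder grid intensity of 0.4 kg CO2eq per kWh.
estimate = estimate_emissions_kg(power_watts=70, hours=4, kg_co2_per_kwh=0.4)
print(f"{estimate:.3f} kg CO2eq")
```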


Safetensors

Model size: 3B params
Tensor type: BF16

Model tree for langdai/gemma-2-2b-it-tool-think

Base model: google/gemma-2b-it
Finetuned: this model

Paper for langdai/gemma-2-2b-it-tool-think

Lacoste et al. (2019). Quantifying the Carbon Emissions of Machine Learning. arXiv:1910.09700, published Oct 21, 2019.