Instructions to use jtatman/functioncall-llama2-chat-q3-gguf with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use jtatman/functioncall-llama2-chat-q3-gguf with llama-cpp-python:
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="jtatman/functioncall-llama2-chat-q3-gguf",
    filename="Llama-2-7b-chat-hf-function-calling-v3-Q2_K.gguf",
)
output = llm(
    "Once upon a time,",
    max_tokens=512,
    echo=True
)
print(output)
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use jtatman/functioncall-llama2-chat-q3-gguf with llama.cpp:
Install from brew
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf jtatman/functioncall-llama2-chat-q3-gguf:Q2_K
# Run inference directly in the terminal:
llama-cli -hf jtatman/functioncall-llama2-chat-q3-gguf:Q2_K
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf jtatman/functioncall-llama2-chat-q3-gguf:Q2_K
# Run inference directly in the terminal:
llama-cli -hf jtatman/functioncall-llama2-chat-q3-gguf:Q2_K
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf jtatman/functioncall-llama2-chat-q3-gguf:Q2_K
# Run inference directly in the terminal:
./llama-cli -hf jtatman/functioncall-llama2-chat-q3-gguf:Q2_K
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf jtatman/functioncall-llama2-chat-q3-gguf:Q2_K
# Run inference directly in the terminal:
./build/bin/llama-cli -hf jtatman/functioncall-llama2-chat-q3-gguf:Q2_K
Use Docker
docker model run hf.co/jtatman/functioncall-llama2-chat-q3-gguf:Q2_K
- LM Studio
- Jan
- Ollama
How to use jtatman/functioncall-llama2-chat-q3-gguf with Ollama:
ollama run hf.co/jtatman/functioncall-llama2-chat-q3-gguf:Q2_K
- Unsloth Studio
How to use jtatman/functioncall-llama2-chat-q3-gguf with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for jtatman/functioncall-llama2-chat-q3-gguf to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for jtatman/functioncall-llama2-chat-q3-gguf to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for jtatman/functioncall-llama2-chat-q3-gguf to start chatting
- Docker Model Runner
How to use jtatman/functioncall-llama2-chat-q3-gguf with Docker Model Runner:
docker model run hf.co/jtatman/functioncall-llama2-chat-q3-gguf:Q2_K
- Lemonade
How to use jtatman/functioncall-llama2-chat-q3-gguf with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/
lemonade pull jtatman/functioncall-llama2-chat-q3-gguf:Q2_K
Run and chat with the model
lemonade run user.functioncall-llama2-chat-q3-gguf-Q2_K
List all available models
lemonade list
A GGUF version of the v1 Llama 2 function-calling model is in:
- fLlama-2-7b-chat.q3_K_M.gguf
GGUF versions of v3 are in:
- Llama-2-7b-chat-hf-function-calling-v3-Q4_0.gguf
- Llama-2-7b-chat-hf-function-calling-v3-Q_4_K_M.gguf
- Llama-2-7b-chat-hf-function-calling-v3-Q2_K.gguf
Set up the prompt like so:
[INST] You have access to the following functions. Use them if required:
[
{
"type": "function",
"function": {
"name": "get_big_stocks",
"description": "Get the names of the largest N stocks by market cap",
"parameters": {
"type": "object",
"properties": {
"number": {
"type": "integer",
"description": "The number of largest stocks to get the names of, e.g. 25"
},
"region": {
"type": "string",
"description": "The region to consider, can be \"US\" or \"World\"."
}
},
"required": [
"number"
]
}
}
},
{
"type": "function",
"function": {
"name": "get_stock_price",
"description": "Get the stock price of an array of stocks",
"parameters": {
"type": "object",
"properties": {
"names": {
"type": "array",
"items": {
"type": "string"
},
"description": "An array of stocks"
}
},
"required": [
"names"
]
}
}
}
]
[INST] Get the names of the five largest stocks in the US by market cap [/INST]
{
"name": "get_big_stocks",
"arguments": {
"number": 5,
"region": "US"
}
}</s>
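The layout above can be assembled and parsed programmatically. The following is a minimal sketch, not the upstream implementation: `build_prompt` and `parse_function_call` are hypothetical helper names, and the exact whitespace between sections is an assumption copied from the example rather than a documented requirement.

```python
import json

# Markers copied from the example prompt above.
B_INST, E_INST = "[INST] ", " [/INST]"
FUNC_PREAMBLE = "You have access to the following functions. Use them if required:"


def build_prompt(functions, user_message):
    """Lay out preamble, function list, and user turn as in the example above."""
    function_block = json.dumps(functions, indent=4)
    return (
        f"{B_INST}{FUNC_PREAMBLE}\n"
        f"{function_block}\n"
        f"{B_INST}{user_message}{E_INST}\n"
    )


def parse_function_call(completion):
    """Return (name, arguments) if the completion is a JSON function call, else None."""
    text = completion.strip()
    if text.endswith("</s>"):
        # Drop the end-of-sequence marker if the runtime echoes it.
        text = text[: -len("</s>")].strip()
    try:
        call = json.loads(text)
    except json.JSONDecodeError:
        return None  # Plain-text answer, not a function call.
    if isinstance(call, dict) and "name" in call:
        return call["name"], call.get("arguments", {})
    return None
```

The string returned by `build_prompt` can be passed as the prompt to whichever runtime loads the GGUF file, and the completion handed to `parse_function_call` to decide whether a function should be invoked.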
Or, with a system prompt, like this:
<s>[INST] <<SYS>>
You are a helpful research assistant. The following functions are available for you to fetch further data to answer user questions, if relevant:
{
"function": "search_bing",
"description": "Search the web for content on Bing. This allows users to search online/the internet/the web for content.",
"arguments": [
{
"name": "query",
"type": "string",
"description": "The search query string"
}
]
}
{
"function": "search_arxiv",
"description": "Search for research papers on ArXiv. Make use of AND, OR and NOT operators as appropriate to join terms within the query.",
"arguments": [
{
"name": "query",
"type": "string",
"description": "The search query string"
}
]
}
To call a function, respond - immediately and only - with a JSON object of the following format:
{
"function": "function_name",
"arguments": {
"argument1": "argument_value",
"argument2": "argument_value"
}
}
<</SYS>>[/INST]
[INST] Find papers on high pressure batch reverse osmosis [/INST]
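The system-prompt variant can be built the same way, and the model's reply dispatched to a local handler. This is a sketch under stated assumptions: `build_system_prompt`, `dispatch`, and the `registry` mapping are hypothetical names introduced for illustration, and the newline placement mirrors the example above rather than any documented specification. Some runtimes add the `<s>` token themselves, in which case the literal prefix may need to be dropped.

```python
import json

SYSTEM_INTRO = (
    "You are a helpful research assistant. The following functions are "
    "available for you to fetch further data to answer user questions, if relevant:"
)
CALL_INSTRUCTION = (
    "To call a function, respond - immediately and only - with a JSON object "
    "of the following format:"
)
CALL_FORMAT = {
    "function": "function_name",
    "arguments": {"argument1": "argument_value", "argument2": "argument_value"},
}


def build_system_prompt(functions, user_message):
    """Assemble the <<SYS>> block followed by the user turn, as in the example."""
    parts = [SYSTEM_INTRO]
    parts += [json.dumps(f, indent=4) for f in functions]
    parts.append(CALL_INSTRUCTION)
    parts.append(json.dumps(CALL_FORMAT, indent=4))
    system = "\n".join(parts)
    return f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>[/INST]\n[INST] {user_message} [/INST]"


def dispatch(completion, registry):
    """Parse a {"function": ..., "arguments": {...}} reply and call the handler."""
    call = json.loads(completion.strip())
    handler = registry[call["function"]]  # KeyError if the model names an unknown function
    return handler(**call["arguments"])
```

A caller would register real implementations, for example `registry = {"search_arxiv": my_arxiv_search}`, where `my_arxiv_search` is whatever function the application provides.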
The model gives good results through the standard llama.cpp chat web interface, and it can also be served behind an OpenAI-compatible proxy.
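For the OpenAI-compatible route, a request to a locally running `llama-server` might look like the sketch below. It assumes the server is listening on `http://localhost:8080` and exposes `/v1/chat/completions`; `build_chat_request` and `send` are hypothetical helpers, and whether the function list should go in the user message or a system message for this model is untested here.

```python
import json
import urllib.request


def build_chat_request(user_message, base_url="http://localhost:8080", max_tokens=256):
    """Build a POST request for an OpenAI-style chat completions endpoint."""
    payload = {
        "messages": [{"role": "user", "content": user_message}],
        "max_tokens": max_tokens,
        "temperature": 0.0,  # Low temperature tends to help with strict JSON output.
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and return the assistant's reply text (needs a running server)."""
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With a server running, `send(build_chat_request("Find papers on high pressure batch reverse osmosis"))` would return the model's reply as a string.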
Original credit goes to:
[Trelis/Llama-2-7b-chat-hf-function-calling-v3](https://huggingface.co/Trelis/Llama-2-7b-chat-hf-function-calling-v3)