How to use from
llama.cpp
Install from brew
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf goodasdgood/dracarys2-72b-instruct
# Run inference directly in the terminal:
llama-cli -hf goodasdgood/dracarys2-72b-instruct
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf goodasdgood/dracarys2-72b-instruct
# Run inference directly in the terminal:
llama-cli -hf goodasdgood/dracarys2-72b-instruct
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf goodasdgood/dracarys2-72b-instruct
# Run inference directly in the terminal:
./llama-cli -hf goodasdgood/dracarys2-72b-instruct
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf goodasdgood/dracarys2-72b-instruct
# Run inference directly in the terminal:
./build/bin/llama-cli -hf goodasdgood/dracarys2-72b-instruct
Use Docker
docker model run hf.co/goodasdgood/dracarys2-72b-instruct
Quick Links

YAML Metadata Warning:empty or missing yaml metadata in repo card

Check out the documentation for more information.

!git clone https://github.com/ggerganov/llama.cpp

%cd llama.cpp

!make

!./llama-cli -h

!./llama-gguf-split --split-max-size 5G /content/dracarys2-72b-instruct.Q2_K.gguf dracarys2-72b-instruct.gguf

from huggingface_hub import upload_file

رفع جزء النموذج الأول

upload_file( path_or_fileobj="/content/llama.cpp/dracarys2-72b-instruct.gguf-00001-of-00006.gguf", # استبدل بمسار الجزء الأول path_in_repo="dracarys2-72b-instruct.gguf-00001-of-00006.gguf", # اسم الملف في المستودع repo_id=repo_name, # اسم المستودع )

رفع جزء النموذج الثاني

upload_file( path_or_fileobj="/content/llama.cpp/dracarys2-72b-instruct.gguf-00002-of-00006.gguf", # استبدل بمسار الجزء الثاني path_in_repo="dracarys2-72b-instruct.gguf-00002-of-00006.gguf", # اسم الملف في المستودع repo_id=repo_name, # اسم المستودع )

وهكذا لبقية الأجزاء

!./llama-cli -m "/content/dracarys2-72b-instruct.Q2_K.gguf" -p "who is ai?" -n 50 -e -t 4 --no-warmup

!./bin/llama-cli -m "/content/dracarys2-72b-instruct.Q2_K.gguf" -p "Hi you how are you" -n 50 -e -ngl 33 -t 4

run it !./llama-cli -m "/content/llama.cpp/dracarys2-72b-instruct.gguf-00001-of-00006.gguf" -p "who is ai?" -n 50 -e -t 4 --no-warmup

Downloads last month
6
GGUF
Model size
73B params
Architecture
qwen2
Hardware compatibility
Log In to add your hardware

We're not able to determine the quantization variants.

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support