Instructions to use tensorblock/phi-2-GGUF with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use tensorblock/phi-2-GGUF with llama-cpp-python:

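# !pip install llama-cpp-python huggingface-hub

from llama_cpp import Llama

# Download the Q2_K file from the Hub (cached after the first run) and load it.
llm = Llama.from_pretrained(
    repo_id="tensorblock/phi-2-GGUF",
    filename="phi-2-Q2_K.gguf",
)

# Generate up to 512 new tokens; echo=True includes the prompt in the returned text.
output = llm(
    "Once upon a time,",
    max_tokens=512,
    echo=True,
)
print(output)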
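
The call above returns an OpenAI-style completion dictionary rather than a plain string, so `print(output)` shows metadata alongside the generated text. The following is a minimal sketch of pulling out just the text, assuming the usual `choices[0]["text"]` layout. `completion_text` is a hypothetical helper name, and `sample` is an illustrative stand-in, not real model output.

```python
def completion_text(output: dict) -> str:
    """Return the text of the first choice in a completion-style dict."""
    choices = output.get("choices") or []
    if not choices:
        raise ValueError("completion has no choices")
    return choices[0]["text"]


# Illustrative stand-in for the dict returned by llm(...); not real output.
sample = {
    "object": "text_completion",
    "choices": [
        {"index": 0, "text": "Once upon a time, ...", "finish_reason": "length"}
    ],
}
print(completion_text(sample))
```

With a loaded model, the same helper would be called as `completion_text(output)`.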

Notebooks
Google Colab
Kaggle
Local Apps

llama.cpp

How to use tensorblock/phi-2-GGUF with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf tensorblock/phi-2-GGUF:Q2_K
# Run inference directly in the terminal:
llama-cli -hf tensorblock/phi-2-GGUF:Q2_K

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf tensorblock/phi-2-GGUF:Q2_K
# Run inference directly in the terminal:
llama-cli -hf tensorblock/phi-2-GGUF:Q2_K

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf tensorblock/phi-2-GGUF:Q2_K
# Run inference directly in the terminal:
./llama-cli -hf tensorblock/phi-2-GGUF:Q2_K

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf tensorblock/phi-2-GGUF:Q2_K
# Run inference directly in the terminal:
./build/bin/llama-cli -hf tensorblock/phi-2-GGUF:Q2_K
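
Once `llama-server` is running, it can be queried over HTTP. The sketch below uses only the Python standard library and assumes the server listens on `http://localhost:8080` and exposes an OpenAI-style `/v1/completions` route; adjust `BASE_URL` if the server was started with a different host or port. `build_completion_request` and `complete` are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # assumption: llama-server's default address


def build_completion_request(prompt: str, max_tokens: int = 128) -> urllib.request.Request:
    """Build a POST request for the server's OpenAI-style completions route."""
    payload = {"prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        f"{BASE_URL}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt: str, max_tokens: int = 128) -> str:
    """Send the prompt to a running server and return the generated text."""
    request = build_completion_request(prompt, max_tokens)
    with urllib.request.urlopen(request, timeout=60) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]


if __name__ == "__main__":
    try:
        print(complete("Once upon a time,"))
    except OSError as exc:  # URLError subclasses OSError (e.g. no server listening)
        print(f"could not reach {BASE_URL}: {exc}")
```

Keeping request construction separate from the network call makes the payload easy to inspect without a server running.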

Use Docker

docker model run hf.co/tensorblock/phi-2-GGUF:Q2_K

LM Studio
Jan
Ollama
How to use tensorblock/phi-2-GGUF with Ollama:
```
ollama run hf.co/tensorblock/phi-2-GGUF:Q2_K
```

Unsloth Studio

How to use tensorblock/phi-2-GGUF with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for tensorblock/phi-2-GGUF to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for tensorblock/phi-2-GGUF to start chatting

Use Hugging Face Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for tensorblock/phi-2-GGUF to start chatting

Docker Model Runner
How to use tensorblock/phi-2-GGUF with Docker Model Runner:
```
docker model run hf.co/tensorblock/phi-2-GGUF:Q2_K
```

Lemonade

How to use tensorblock/phi-2-GGUF with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull tensorblock/phi-2-GGUF:Q2_K

Run and chat with the model

lemonade run user.phi-2-GGUF-Q2_K

List all available models

lemonade list

morriszms committed on Nov 18, 2024

Commit 7e7a283 (verified) · 1 parent: 5fa40b9

Upload folder using huggingface_hub

Files changed (13)

README.md +5 -11
phi-2-Q2_K.gguf +2 -2
phi-2-Q3_K_L.gguf +2 -2
phi-2-Q3_K_M.gguf +2 -2
phi-2-Q3_K_S.gguf +2 -2
phi-2-Q4_0.gguf +2 -2
phi-2-Q4_K_M.gguf +2 -2
phi-2-Q4_K_S.gguf +2 -2
phi-2-Q5_0.gguf +2 -2
phi-2-Q5_K_M.gguf +2 -2
phi-2-Q5_K_S.gguf +2 -2
phi-2-Q6_K.gguf +2 -2
phi-2-Q8_0.gguf +2 -2

README.md CHANGED

@@ -1,15 +1,11 @@
 ---
 license: mit
-license_link: https://huggingface.co/microsoft/phi-2/resolve/main/LICENSE
-language:
-- en
-pipeline_tag: text-generation
+license_name: microsoft-research-license
+license_link: LICENSE
 tags:
-- nlp
-- code
 - TensorBlock
 - GGUF
-base_model: microsoft/phi-2
+base_model: susnato/phi-2
 ---
 <div style="width: auto; margin-left: auto; margin-right: auto">
@@ -23,13 +19,12 @@ base_model: microsoft/phi-2
     </div>
 </div>
-## microsoft/phi-2 - GGUF
-This repo contains GGUF format model files for [microsoft/phi-2](https://huggingface.co/microsoft/phi-2).
+## susnato/phi-2 - GGUF
+This repo contains GGUF format model files for [susnato/phi-2](https://huggingface.co/susnato/phi-2).
 The files were quantized using machines provided by [TensorBlock](https://tensorblock.co/), and they are compatible with llama.cpp as of [commit b4011](https://github.com/ggerganov/llama.cpp/commit/a6744e43e80f4be6398fc7733a01642c846dce1d).
 <div style="text-align: left; margin: 20px 0;">
     <a href="https://tensorblock.co/waitlist/client" style="display: inline-block; padding: 10px 20px; background-color: #007bff; color: white; text-decoration: none; border-radius: 5px; font-weight: bold;">
         Run them on the TensorBlock client using your local machine ↗
@@ -38,7 +33,6 @@ The files were quantized using machines provided by [TensorBlock](https://tensor
 ## Prompt template
 ```
 ```

phi-2-Q2_K.gguf CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:e08d782b8cba58be242d78cca8a6c69effc10a5da7b3d858dbc15e32b77c0b88
-size 1109720128
+oid sha256:7cdd6b914c0e5452036743d9da147b14e579ffeb092a9c4c6da3288f988c9328
+size 1109719968

phi-2-Q3_K_L.gguf CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:c85410d6327c53ce696befe7804e2474cea01f57cf0a13a5263640d26de4508a
-size 1575230528
+oid sha256:340f906aa21921be6dbd33d471702055461f915668901a169b243a5b19235b12
+size 1575230368

phi-2-Q3_K_M.gguf CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:4e6d79af8b4d7eddbe2d55cdda3f60639aea5bc80cd59f5df114dd58a8285ffa
-size 1426136128
+oid sha256:72ec700e0a8217b12c5c9be65be900fd507d8c1c8f5653a6cbb813a6e3ed5873
+size 1426135968

phi-2-Q3_K_S.gguf CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:43e97b3b844bcc65eb49ebecf640317eb626f4ae905ac9b86d56d42f15863ab5
-size 1250827328
+oid sha256:6b0dee54a1e218d78808ab3ceb4e02b6e808d8672f39ee7caf43975a5cc89017
+size 1250827168

phi-2-Q4_0.gguf CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:7654105c20b53439b190d6fa1d511ceefb0594384cc25e0057e451e89ede9f07
-size 1602468928
+oid sha256:2c6a89ff9d61b3ae4afa4045de7dba608e67565c003d3bfbb164050d077e1768
+size 1602468768

phi-2-Q4_K_M.gguf CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:8c689e657c3f0351ff73da7bb053b9ab004e3100676d61d24de7e740c5078a7a
-size 1737636928
+oid sha256:ba75e23a235892d3a6c122108d5f1a6108f90d36d507f798f330b47b71778a86
+size 1737636768

phi-2-Q4_K_S.gguf CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:22497949c7e04bc3055d7d48d224424358d71d007a2860004e00e4a77049a582
-size 1618852928
+oid sha256:a4d88305a75369c1a4331c9fc92aec40394713ae51abdc74c85ebc73ad4f01bf
+size 1618852768

phi-2-Q5_0.gguf CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:cafb945cfc51577be3d6d9d3c86e08a5ccaaa4b6e7b7b77abbef3e6c6a6396c6
-size 1933425728
+oid sha256:20b5bc573ea3cc1f15951a84b0293b2205720263294e5594394a2a9ecd4e4180
+size 1933425568

phi-2-Q5_K_M.gguf CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:3959e50cbda13da92855c6f8f81b65f03754f2de24a19281231b03cd04cd55c8
-size 2003057728
+oid sha256:f25c41f7ddc0db90690b8281526b16b3c7a33d8c77533692158fb33efe14053e
+size 2003057568

phi-2-Q5_K_S.gguf CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:93397c5f1a14bd4b11a3afdccdc7539d6947d45ce997cc222ac132ee09cd7498
-size 1933425728
+oid sha256:1ecdd72a4eddbf4f29deb3df073cc7bcafc09c4db4a688965a030efdcff4a685
+size 1933425568

phi-2-Q6_K.gguf CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:1649b9bd7d5965c15e858039756b7398dffa3f52297a1ab4c8f4d997924fa166
-size 2285067328
+oid sha256:23ee71e08f9540b2da1563be9f705aaa025c5b5e00e88be676f0f9734d6cde44
+size 2285067168

phi-2-Q8_0.gguf CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:bc29d0e0c95c28a3baaeb25459fb845ae6ab265e3c2ea78329434aba4250d125
-size 2958040128
+oid sha256:128570e8b4215f8297439c43c3bb5f65176805560b08901802a73cbfc32ac1d1
+size 2958039968
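
Each `.gguf` entry in this commit is a Git LFS pointer that records the new file's SHA-256 digest and byte size, so a downloaded file can be checked against it. Below is a minimal sketch of that check using the new `phi-2-Q2_K.gguf` pointer from this commit. `parse_lfs_pointer` and `verify_file` are hypothetical helper names, and the local path is an assumed download location.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> tuple[str, int]:
    """Return (sha256_hex, size_in_bytes) from Git LFS pointer text."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, _, digest = fields["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo}")
    return digest, int(fields["size"])


def verify_file(path: Path, digest: str, size: int, chunk_size: int = 1 << 20) -> bool:
    """Check a local file against the expected size and SHA-256 digest."""
    if path.stat().st_size != size:
        return False  # cheap check first; skips hashing on a size mismatch
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        # Hash in chunks so multi-gigabyte files are never fully loaded into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == digest


# New pointer for phi-2-Q2_K.gguf, copied from the diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:7cdd6b914c0e5452036743d9da147b14e579ffeb092a9c4c6da3288f988c9328
size 1109719968
"""

if __name__ == "__main__":
    digest, size = parse_lfs_pointer(POINTER)
    local = Path("phi-2-Q2_K.gguf")  # assumption: where the file was downloaded
    if local.exists():
        print("match" if verify_file(local, digest, size) else "MISMATCH")
```

The size comparison runs before hashing, so a truncated download is rejected without reading the whole file.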