Instructions to use tensorblock/MicroLlama-GGUF with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use tensorblock/MicroLlama-GGUF with Transformers:
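# Load model directly
from transformers import AutoModel

# GGUF checkpoints are selected with gguf_file; Q2_K is the quantization kept in this repo
model = AutoModel.from_pretrained(
    "tensorblock/MicroLlama-GGUF",
    gguf_file="MicroLlama-Q2_K.gguf",
    dtype="auto",
)
- llama-cpp-python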
How to use tensorblock/MicroLlama-GGUF with llama-cpp-python:
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="tensorblock/MicroLlama-GGUF",
    filename="MicroLlama-Q2_K.gguf",
)
output = llm(
    "Once upon a time,",
    max_tokens=512,
    echo=True,
)
print(output)
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use tensorblock/MicroLlama-GGUF with llama.cpp:
Install from brew
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf tensorblock/MicroLlama-GGUF:Q2_K
# Run inference directly in the terminal:
llama-cli -hf tensorblock/MicroLlama-GGUF:Q2_K
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf tensorblock/MicroLlama-GGUF:Q2_K
# Run inference directly in the terminal:
llama-cli -hf tensorblock/MicroLlama-GGUF:Q2_K
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf tensorblock/MicroLlama-GGUF:Q2_K
# Run inference directly in the terminal:
./llama-cli -hf tensorblock/MicroLlama-GGUF:Q2_K
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf tensorblock/MicroLlama-GGUF:Q2_K
# Run inference directly in the terminal:
./build/bin/llama-cli -hf tensorblock/MicroLlama-GGUF:Q2_K
Use Docker
docker model run hf.co/tensorblock/MicroLlama-GGUF:Q2_K
- LM Studio
- Jan
- Ollama
How to use tensorblock/MicroLlama-GGUF with Ollama:
ollama run hf.co/tensorblock/MicroLlama-GGUF:Q2_K
- Unsloth Studio
How to use tensorblock/MicroLlama-GGUF with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for tensorblock/MicroLlama-GGUF to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for tensorblock/MicroLlama-GGUF to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for tensorblock/MicroLlama-GGUF to start chatting
- Docker Model Runner
How to use tensorblock/MicroLlama-GGUF with Docker Model Runner:
docker model run hf.co/tensorblock/MicroLlama-GGUF:Q2_K
- Lemonade
How to use tensorblock/MicroLlama-GGUF with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/
lemonade pull tensorblock/MicroLlama-GGUF:Q2_K
Run and chat with the model
lemonade run user.MicroLlama-GGUF-Q2_K
List all available models
lemonade list
Remove .gguf files (keep Q2_K.gguf)
- MicroLlama-Q3_K_L.gguf +0 -3
- MicroLlama-Q3_K_M.gguf +0 -3
- MicroLlama-Q3_K_S.gguf +0 -3
- MicroLlama-Q4_0.gguf +0 -3
- MicroLlama-Q4_K_M.gguf +0 -3
- MicroLlama-Q4_K_S.gguf +0 -3
- MicroLlama-Q5_0.gguf +0 -3
- MicroLlama-Q5_K_M.gguf +0 -3
- MicroLlama-Q5_K_S.gguf +0 -3
- MicroLlama-Q6_K.gguf +0 -3
- MicroLlama-Q8_0.gguf +0 -3
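To get a rough sense of how much storage this commit frees, the `size` values recorded in the deleted Git LFS pointer files can be summed. The following is a minimal sketch: the byte counts are copied from the pointer files in this diff, while the names `removed_sizes` and `total_bytes` are illustrative only.

```python
# Byte sizes copied from the `size` line of each deleted Git LFS pointer.
removed_sizes = {
    "MicroLlama-Q3_K_L.gguf": 166418144,
    "MicroLlama-Q3_K_M.gguf": 155866848,
    "MicroLlama-Q3_K_S.gguf": 144520928,
    "MicroLlama-Q4_0.gguf": 180625120,
    "MicroLlama-Q4_K_M.gguf": 189951712,
    "MicroLlama-Q4_K_S.gguf": 181477088,
    "MicroLlama-Q5_0.gguf": 214605536,
    "MicroLlama-Q5_K_M.gguf": 219410144,
    "MicroLlama-Q5_K_S.gguf": 214605536,
    "MicroLlama-Q6_K.gguf": 250709728,
    "MicroLlama-Q8_0.gguf": 324482784,
}

# Total payload referenced by the removed pointers, in bytes and decimal GB.
total_bytes = sum(removed_sizes.values())
print(f"{len(removed_sizes)} files, {total_bytes} bytes ({total_bytes / 1e9:.2f} GB)")
```

Each line of the diff itself only removes a three-line pointer file; the figure above refers to the LFS objects those pointers reference.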
MicroLlama-Q3_K_L.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:518737c710080f16210eddf41fe591c33e9e33f117491c3e64376b2adcaef4e2
-size 166418144
MicroLlama-Q3_K_M.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:3726631e6008dd8c06af7eb35e5274cb4c18f8ff9bee2960b5f63c6203fbd8a3
-size 155866848
MicroLlama-Q3_K_S.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:316e879e372e1448911f8bacc18239840cbd02a2c18960493c5d02f4b44785ac
-size 144520928
MicroLlama-Q4_0.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:70d348c6df9ff960d56f727605b14e31229b26b037c92b7171be4ac96228b866
-size 180625120
MicroLlama-Q4_K_M.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:216db27b323b4c62641b6b608cefd0829c6b085ff178d0dbfaac0bb59a797860
-size 189951712
MicroLlama-Q4_K_S.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:ae503fee34f8929bbdd99e60bf8952eded49853ef6a0bd81d0beba2e8efa2a29
-size 181477088
MicroLlama-Q5_0.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:2e55ca444870e093c4a5c5bccf12739af2039492ba773f4fdae76722f31d0c75
-size 214605536
MicroLlama-Q5_K_M.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:bab25f6b159f8f540627ee38382e880ee57399a9031d88126d1be99a317a68ff
-size 219410144
MicroLlama-Q5_K_S.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:80b757288c27cb1433c532ef15ccd29dda978199ed32b2a335a238eade6f6a42
-size 214605536
MicroLlama-Q6_K.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:4500e1525a33f8c315e6b55fabb45aad0c3abecfd83bb147b3ca1b6b495e4cbd
-size 250709728
MicroLlama-Q8_0.gguf
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:ad33f419fa4fc3eef76bd43216fb50dbdd46dc606d3e432033040d7c34264972
-size 324482784
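Each deleted file in this diff is a Git LFS pointer: three `key value` lines giving the spec version, the object's SHA-256 digest, and its size in bytes. Anyone holding a local copy of one of the removed quantizations could check it against the pointer along the following lines. This is a sketch, not part of the repository; the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "oid": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict) -> bool:
    """Return True if the file at `path` has the pointer's size and digest."""
    file_path = Path(path)
    # Cheap check first: a size mismatch avoids hashing a large file.
    if file_path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["hash_algo"])
    with file_path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["oid"]


# Pointer contents copied from the MicroLlama-Q8_0.gguf entry in this diff.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:ad33f419fa4fc3eef76bd43216fb50dbdd46dc606d3e432033040d7c34264972\n"
    "size 324482784\n"
)
print(pointer["hash_algo"], pointer["size"])
```

Calling `matches_pointer("MicroLlama-Q8_0.gguf", pointer)` on a local copy (the path here is an assumption) would then report whether that copy is the same object the pointer referenced.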