Instructions to use bartowski/LLaMA-Mesh-GGUF with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use bartowski/LLaMA-Mesh-GGUF with llama-cpp-python:
```python
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="bartowski/LLaMA-Mesh-GGUF",
    filename="LLaMA-Mesh-IQ2_M.gguf",
)

response = llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "Create a 3D model of a table."}
    ]
)
print(response["choices"][0]["message"]["content"])
```
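LLaMA-Mesh returns the mesh as OBJ-format text in the chat response, so a natural next step is to write it to a `.obj` file. A minimal sketch, assuming the `response` object from the snippet above; the `lamp.obj` filename is just an illustration, and the completion may include surrounding prose you would need to trim before loading it in a mesh viewer:

```python
# Save the generated OBJ text to disk so it can be opened in e.g. Blender.
# "lamp.obj" is an arbitrary illustrative filename.
obj_text = response["choices"][0]["message"]["content"]
with open("lamp.obj", "w") as f:
    f.write(obj_text)
```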
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use bartowski/LLaMA-Mesh-GGUF with llama.cpp:
Install from brew
```sh
brew install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf bartowski/LLaMA-Mesh-GGUF:Q4_K_M

# Run inference directly in the terminal:
llama-cli -hf bartowski/LLaMA-Mesh-GGUF:Q4_K_M
```
Install from WinGet (Windows)
```powershell
winget install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf bartowski/LLaMA-Mesh-GGUF:Q4_K_M

# Run inference directly in the terminal:
llama-cli -hf bartowski/LLaMA-Mesh-GGUF:Q4_K_M
```
Use pre-built binary
```sh
# Download a pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases

# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf bartowski/LLaMA-Mesh-GGUF:Q4_K_M

# Run inference directly in the terminal:
./llama-cli -hf bartowski/LLaMA-Mesh-GGUF:Q4_K_M
```
Build from source code
```sh
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli

# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf bartowski/LLaMA-Mesh-GGUF:Q4_K_M

# Run inference directly in the terminal:
./build/bin/llama-cli -hf bartowski/LLaMA-Mesh-GGUF:Q4_K_M
```
Use Docker
```sh
docker model run hf.co/bartowski/LLaMA-Mesh-GGUF:Q4_K_M
```
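Whichever install route you choose, once llama-server is running it speaks the OpenAI chat-completions protocol. A minimal sketch of a request from Python, assuming the server is on its default port 8080; the prompt is just an illustration:

```python
import json
import urllib.request

# POST a chat request to llama-server's OpenAI-compatible endpoint.
req = urllib.request.Request(
    "http://localhost:8080/v1/chat/completions",
    data=json.dumps({
        "messages": [
            {"role": "user", "content": "Create a 3D model of a table."}
        ]
    }).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    body = json.load(resp)

# The generated text follows the OpenAI response schema.
print(body["choices"][0]["message"]["content"])
```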
- LM Studio
- Jan
- Ollama
How to use bartowski/LLaMA-Mesh-GGUF with Ollama:
```sh
ollama run hf.co/bartowski/LLaMA-Mesh-GGUF:Q4_K_M
```
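Besides the interactive terminal, Ollama also serves a local REST API. A minimal sketch, assuming Ollama is running on its default port 11434 and the model has been pulled as above; the prompt is just an illustration:

```python
import json
import urllib.request

# Non-streaming chat request against Ollama's local /api/chat endpoint.
req = urllib.request.Request(
    "http://localhost:11434/api/chat",
    data=json.dumps({
        "model": "hf.co/bartowski/LLaMA-Mesh-GGUF:Q4_K_M",
        "messages": [
            {"role": "user", "content": "Create a 3D model of a table."}
        ],
        "stream": False,
    }).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp)["message"]["content"])
```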
- Unsloth Studio
How to use bartowski/LLaMA-Mesh-GGUF with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
```sh
curl -fsSL https://unsloth.ai/install.sh | sh

# Run Unsloth Studio:
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for bartowski/LLaMA-Mesh-GGUF to start chatting
```
Install Unsloth Studio (Windows)
```powershell
irm https://unsloth.ai/install.ps1 | iex

# Run Unsloth Studio:
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for bartowski/LLaMA-Mesh-GGUF to start chatting
```
Using HuggingFace Spaces for Unsloth
```sh
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for bartowski/LLaMA-Mesh-GGUF to start chatting
```
- Docker Model Runner
How to use bartowski/LLaMA-Mesh-GGUF with Docker Model Runner:
```sh
docker model run hf.co/bartowski/LLaMA-Mesh-GGUF:Q4_K_M
```
- Lemonade
How to use bartowski/LLaMA-Mesh-GGUF with Lemonade:
Pull the model
```sh
# Download Lemonade from https://lemonade-server.ai/
lemonade pull bartowski/LLaMA-Mesh-GGUF:Q4_K_M
```
Run and chat with the model
```sh
lemonade run user.LLaMA-Mesh-GGUF-Q4_K_M
```
List all available models
```sh
lemonade list
```
Bad quantization?
I tried several models in a row, and none of them (LLaMA-Mesh-f16.gguf, LLaMA-Mesh-Q6_K_L.gguf, LLaMA-Mesh-Q8_0.gguf) returned an appropriate result:
prompt: "Create a 3D obj file using the following description: a lamp"
```python
import os

from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model = Llama(
    model_path=hf_hub_download(
        repo_id=os.environ.get("REPO_ID", "bartowski/LLaMA-Mesh-GGUF"),
        filename=os.environ.get("MODEL_FILE", "LLaMA-Mesh-f16.gguf"),
    ),
    n_gpu_layers=-1,  # offload all layers to the GPU
)

message = "Create a 3D obj file using the following description: a lamp"
# message = "Create a 3D model of a table."

response = model.create_chat_completion(
    messages=[{"role": "user", "content": message}],
    temperature=0.9,
    max_tokens=4096,
    top_p=0.96,
    stream=True,
)

# Accumulate the streamed chunks, then print the full completion.
temp = ""
for streamed in response:
    delta = streamed["choices"][0].get("delta", {})
    text_chunk = delta.get("content", "")
    temp += text_chunk
print(temp)
```
Odd, there shouldn't be anything wrong with the quantization itself, but I also haven't tried to use it. Is this an expected use case that should work? Can you try the original safetensors?
I tried the original on the demo page; it's not ideal sometimes, but it works.
My images above show the results on Windows 10 with llama-cli:
```sh
llama-cli -m LLaMA-Mesh-Q6_K_L.gguf -p "Create low poly 3D model of a coffee cup"
# or
llama-cli -m LLaMA-Mesh-Q6_K_L.gguf -p "Create a 3D obj file using the following description: a lamp"
```
P.S.
I also ran the llama-cpp-python code (see above) on Ubuntu, but the model produces a cut-off 3D model and finishes as if it were complete:
I can't get Q8 to generate anything other than garbage either; something is wrong. I can generate 50 models, and every now and again one will turn out like you would expect; the rest are just a mush of vertices.