Conversion process

by AlfredWALLACE - opened Dec 8, 2023

Discussion

AlfredWALLACE

Dec 8, 2023

•

edited Dec 8, 2023

Thanks for the quantized model which allows us to test this great AI.
Would you share your conversion method? I was not able to do it myself with the llama.cpp scripts, and I would like to quantize more versions.

eramax

Owner Dec 12, 2023

Sure @AlfredWALLACE
You have to download and compile llama.cpp from GitHub:

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make LLAMA_CUBLAS=1

Then you need to create a Python environment and install the llama.cpp requirements:

pip install -r requirements.txt

Then run the convert script to produce the f16 GGUF file:

python ~/dev/llama.cpp/convert.py ./Magicoder-S-CL-7B --outtype f16

Then run the quantize binary, which is built when you compile llama.cpp:

quantize ./Magicoder-S-CL-7B/ggml-model-f16.gguf q5_k_m

Good Luck.
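
The steps above can be chained in a small script. This is only a sketch under assumptions taken from the commands in the post: that `convert.py` and the `quantize` binary both sit in the root of the llama.cpp checkout (as they do after a plain `make`), and that the convert step writes `ggml-model-f16.gguf` into the model directory. The function names `build_commands` and `run_pipeline` are hypothetical helpers, not part of llama.cpp.

```python
import shlex
import subprocess
import sys
from pathlib import Path


def build_commands(llama_cpp_dir, model_dir, quant_type="q5_k_m"):
    """Build the convert and quantize command lines as argument lists.

    Assumes convert.py and the quantize binary live in the llama.cpp
    checkout root, and that convert.py writes ggml-model-f16.gguf into
    the model directory (both taken from the commands in the post).
    """
    llama_cpp_dir = Path(llama_cpp_dir).expanduser()
    model_dir = Path(model_dir).expanduser()
    f16_path = model_dir / "ggml-model-f16.gguf"

    # Step 1: convert the original model directory to an f16 GGUF file.
    convert_cmd = [
        sys.executable,
        str(llama_cpp_dir / "convert.py"),
        str(model_dir),
        "--outtype",
        "f16",
    ]
    # Step 2: quantize the f16 file to the requested type.
    quantize_cmd = [str(llama_cpp_dir / "quantize"), str(f16_path), quant_type]
    return convert_cmd, quantize_cmd


def run_pipeline(llama_cpp_dir, model_dir, quant_type="q5_k_m"):
    """Run both steps in order, stopping at the first failure."""
    for cmd in build_commands(llama_cpp_dir, model_dir, quant_type):
        print("+", shlex.join(cmd))
        subprocess.run(cmd, check=True)
```

Calling `run_pipeline("~/dev/llama.cpp", "./Magicoder-S-CL-7B")` from inside the Python environment where the requirements were installed would run the same two commands as in the post. Passing a different `quant_type` is how further quantized versions could be produced from the same f16 file.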

AlfredWALLACE

Dec 16, 2023

Thanks! Before my post I had used the same commands on an S-DS model, but I had no luck loading the quantized result.
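
When a quantized file refuses to load, one cheap first check is whether it is a well-formed GGUF file at all. The sketch below reads only the first eight bytes: the `GGUF` magic and the little-endian 32-bit format version. The helper name `read_gguf_version` is hypothetical, and the check only rules out a truncated or non-GGUF file; it says nothing about tokenizer or vocabulary problems, which can also stop a model from loading.

```python
import struct

GGUF_MAGIC = b"GGUF"


def read_gguf_version(path):
    """Return the GGUF format version, or raise ValueError if the
    file does not start with a valid GGUF header."""
    with open(path, "rb") as f:
        header = f.read(8)
    if len(header) < 8:
        raise ValueError("file is too short to contain a GGUF header")
    if header[:4] != GGUF_MAGIC:
        raise ValueError(f"not a GGUF file: magic bytes are {header[:4]!r}")
    # The version is a little-endian unsigned 32-bit integer.
    (version,) = struct.unpack("<I", header[4:8])
    return version
```

For example, `read_gguf_version("./Magicoder-S-CL-7B/ggml-model-f16.gguf")` could be run on the converted file before quantizing it.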

akhil3417

Feb 3, 2024

•

edited Feb 3, 2024

> Thanks! Before my post I had used the same commands on an S-DS model, but I had no luck loading the quantized result.

Try this fork, it will work for sure:
https://github.com/akhil3417/llama.cpp

eramax

Owner Feb 3, 2024

> > Thanks! Before my post I had used the same commands on an S-DS model, but I had no luck loading the quantized result.
>
> Try this fork, it will work for sure:
> https://github.com/akhil3417/llama.cpp

Could you please explain what the changes or features in your fork are?

akhil3417

Feb 4, 2024

The fork has '417884e regex_gpt2_preprocess pr' merged.

AlfredWALLACE

Feb 18, 2024

Thanks! I'll try! In the meantime, the GPTQ version works really well and also loads on low VRAM.
