Install from winget:

winget install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf EssentialAI/rnj-1-instruct-GGUF:Q4_K_M

# Run inference directly in the terminal:
llama-cli -hf EssentialAI/rnj-1-instruct-GGUF:Q4_K_M

Use a pre-built binary:

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases

# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf EssentialAI/rnj-1-instruct-GGUF:Q4_K_M

# Run inference directly in the terminal:
./llama-cli -hf EssentialAI/rnj-1-instruct-GGUF:Q4_K_M

Build from source:

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli

# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf EssentialAI/rnj-1-instruct-GGUF:Q4_K_M

# Run inference directly in the terminal:
./build/bin/llama-cli -hf EssentialAI/rnj-1-instruct-GGUF:Q4_K_M

Run with Docker:

docker model run hf.co/EssentialAI/rnj-1-instruct-GGUF:Q4_K_M

This is a GGUF-formatted checkpoint of rnj-1-instruct suitable for use in llama.cpp, Ollama, or others. It has been quantized with the Q4_K_M scheme, which results in model weights of size 4.8GB.
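To fetch the quantized weights directly instead of letting the runtime download them on first use, the Hugging Face CLI works as well. A minimal sketch, assuming the Q4_K_M file in this repo follows the usual *Q4_K_M*.gguf naming pattern:

# Download only the Q4_K_M weights (assumes the usual *Q4_K_M*.gguf file naming):
huggingface-cli download EssentialAI/rnj-1-instruct-GGUF \
  --include "*Q4_K_M*" \
  --local-dir rnj-1-instruct-GGUF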
For llama.cpp, install version 7328 or later (e.g., on macOS: brew install llama.cpp) and run either of these commands:
llama-cli -hf EssentialAI/rnj-1-instruct-GGUF
llama-server -hf EssentialAI/rnj-1-instruct-GGUF -c 0 # and open browser to localhost:8080
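Once the server is up, it exposes an OpenAI-compatible HTTP API on port 8080 (the -c 0 flag tells llama-server to use the context length stored in the model metadata). As a minimal sketch, a chat completion can be requested with curl; the prompt below is only an illustration:

# Send a chat request to the local OpenAI-compatible endpoint:
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Hello, who are you?"}
    ]
  }'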
For Ollama, install version 0.13.3 or later (releases are listed at https://github.com/ollama/ollama/releases) and run:
ollama run rnj-1
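Ollama also serves a local REST API on port 11434, so the same model can be called programmatically once it is running. A minimal sketch using curl; the prompt is only an illustration:

# Query the running Ollama model over its local REST API:
curl http://localhost:11434/api/generate -d '{
  "model": "rnj-1",
  "prompt": "Explain GGUF quantization in one sentence.",
  "stream": false
}'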