Install via winget (Windows):

```sh
winget install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf GoofyLM/N1-Quant

# Run inference directly in the terminal:
llama-cli -hf GoofyLM/N1-Quant
```

Or download a pre-built binary from https://github.com/ggerganov/llama.cpp/releases and run it directly:

```sh
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf GoofyLM/N1-Quant

# Run inference directly in the terminal:
./llama-cli -hf GoofyLM/N1-Quant
```

Or build from source:

```sh
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli

# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf GoofyLM/N1-Quant

# Run inference directly in the terminal:
./build/bin/llama-cli -hf GoofyLM/N1-Quant
```

Or run with Docker Model Runner:

```sh
docker model run hf.co/GoofyLM/N1-Quant
```

Banner by Croissant
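Once `llama-server` is running, any OpenAI-compatible client can talk to it. A minimal sketch using only the Python standard library, assuming the server's default address of `http://localhost:8080` (the port, system prompt, and sampling parameters here are assumptions, not part of the model card):

```python
import json
import urllib.request

# llama-server listens on port 8080 by default (assumed here).
SERVER_URL = "http://localhost:8080/v1/chat/completions"

def build_payload(prompt: str) -> dict:
    """Build an OpenAI-style chat-completions request body."""
    return {
        "messages": [
            {"role": "system",
             "content": "You are a helpful AI assistant named N1, trained by GoofyLM"},
            {"role": "user", "content": prompt},
        ],
        "max_tokens": 256,
        "temperature": 0.7,
    }

def chat(prompt: str) -> str:
    """POST the request to the local server and return the assistant's reply."""
    req = urllib.request.Request(
        SERVER_URL,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

payload = build_payload("What is 2 + 2? Think step by step.")
```

Calling `chat(...)` requires the server to be running; `build_payload` on its own shows the request shape the endpoint expects.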
N1 is a small, experimental Chain-of-Thought (COT) model based on the LLaMA architecture, developed by GoofyLM.
The chat template (ChatML format):

```jinja
{% for message in messages %}{% if loop.first and messages[0]['role'] != 'system' %}{{ '<|im_start|>system
You are a helpful AI assistant named N1, trained by GoofyLM<|im_end|>
' }}{% endif %}{{'<|im_start|>' + message['role'] + '
' + message['content'] + '<|im_end|>' + '
'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant
' }}{% endif %}
```
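In plain terms, the template above is standard ChatML: a default N1 system prompt is injected when the conversation does not start with one, every message is wrapped in `<|im_start|>`/`<|im_end|>` markers, and an assistant header is appended when generation is requested. A minimal Python re-implementation of that logic, for illustration only (the tokenizer applies the Jinja template itself):

```python
DEFAULT_SYSTEM = "You are a helpful AI assistant named N1, trained by GoofyLM"

def render_chatml(messages, add_generation_prompt=True):
    """Mirror the Jinja chat template: inject the default system prompt
    if the first message is not a system message, wrap each turn in
    ChatML markers, then optionally open the assistant turn."""
    out = ""
    if messages and messages[0]["role"] != "system":
        out += f"<|im_start|>system\n{DEFAULT_SYSTEM}<|im_end|>\n"
    for m in messages:
        out += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    if add_generation_prompt:
        out += "<|im_start|>assistant\n"
    return out

prompt = render_chatml([{"role": "user", "content": "Hello!"}])
```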
This model is designed for text-generation tasks, with a focus on reasoning through problems step by step using its Chain-of-Thought.
The model can be loaded with `llama-cpp-python`:

```python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="GoofyLM/N1",
    filename="N1_Q8_0.gguf",
)
```
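Once loaded, chat-style generation goes through the library's chat API. A sketch under the assumption that the GGUF file above has been downloaded; `create_chat_completion` is llama-cpp-python's chat-completion method, and the prompt and sampling parameters are illustrative:

```python
# Chat-style request for N1; the messages list follows the OpenAI schema
# that llama-cpp-python's create_chat_completion() accepts.
messages = [
    {"role": "user", "content": "What is 15% of 80? Think step by step."},
]

def run_chat(llm, messages, max_tokens=256):
    """Call the loaded model; `llm` is the Llama instance from above."""
    result = llm.create_chat_completion(
        messages=messages,
        max_tokens=max_tokens,
        temperature=0.7,
    )
    # The reply (including the model's step-by-step reasoning) is the
    # first choice's message content, as in the OpenAI response format.
    return result["choices"][0]["message"]["content"]
```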
Or run with Ollama:

```sh
ollama run hf.co/GoofyLM/N1:Q4_K_M
```
Base model: GoofyLM/N1
Install via brew (macOS/Linux):

```sh
brew install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf GoofyLM/N1-Quant

# Run inference directly in the terminal:
llama-cli -hf GoofyLM/N1-Quant
```