Gear
A collection: the Gear family of LLMs (2 items).
Gear-1-160M is a small Transformer LLM (about 160 million parameters) distributed in GGUF format, designed to run quickly on local machines with limited memory (CPU or GPU). It handles simple chat and basic tasks, but it can be slow and make mistakes: this is my first attempt at building a neural network, and I plan to improve it. A 300M-parameter version is coming soon.

Install with winget (Windows):

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf HeavensHackDev/Gear-1-160m
# Run inference directly in the terminal:
llama-cli -hf HeavensHackDev/Gear-1-160m

Use a pre-built binary:

# Download a pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf HeavensHackDev/Gear-1-160m
# Run inference directly in the terminal:
./llama-cli -hf HeavensHackDev/Gear-1-160m

Build from source:

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf HeavensHackDev/Gear-1-160m
# Run inference directly in the terminal:
./build/bin/llama-cli -hf HeavensHackDev/Gear-1-160m

Run with Docker:

docker model run hf.co/HeavensHackDev/Gear-1-160m
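Once llama-server is running, any OpenAI-style client can talk to it. The sketch below builds a chat-completions request payload; the endpoint path and the default port 8080 are assumptions based on llama-server's usual defaults (override the port with --port when starting the server), and the model name field is informational since the server serves the one model it was started with.

```python
import json

# Chat-completions request for the local llama-server
# (assumed default endpoint: http://localhost:8080/v1/chat/completions).
payload = {
    "model": "Gear-1-160m",  # informational; the server runs a single model
    "messages": [
        {"role": "user", "content": "Say hello in one short sentence."}
    ],
    "max_tokens": 64,  # keep replies short for a 160M-parameter model
}

# Serialize the payload to send as the JSON request body.
body = json.dumps(payload)

# To send it once the server from the steps above is running, e.g.:
# curl http://localhost:8080/v1/chat/completions \
#      -H "Content-Type: application/json" -d @- <<< "$body"
```

Because the API shape matches OpenAI's chat-completions format, existing client libraries can usually be pointed at the local server by changing only the base URL.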
Install with Homebrew (macOS/Linux):

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf HeavensHackDev/Gear-1-160m
# Run inference directly in the terminal:
llama-cli -hf HeavensHackDev/Gear-1-160m