How to use from
llama.cpp
Install (macOS, Linux)
curl -LsSf https://llama.app/install.sh | sh
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf RMDWLLC/Jah-1.0:Q3_K_M
# Run inference directly in the terminal:
llama cli -hf RMDWLLC/Jah-1.0:Q3_K_M
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf RMDWLLC/Jah-1.0:Q3_K_M
# Run inference directly in the terminal:
llama cli -hf RMDWLLC/Jah-1.0:Q3_K_M
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf RMDWLLC/Jah-1.0:Q3_K_M
# Run inference directly in the terminal:
./llama-cli -hf RMDWLLC/Jah-1.0:Q3_K_M
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf RMDWLLC/Jah-1.0:Q3_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf RMDWLLC/Jah-1.0:Q3_K_M
Use Docker
docker model run hf.co/RMDWLLC/Jah-1.0:Q3_K_M
Quick Links

Jah 1.0

Jah is the private AI that powers Echols β€” RMDW's private alternative to ChatGPT and Claude. It runs entirely on hardware RMDW owns and controls. Nothing you type leaves to a third-party cloud, nothing is stored externally, and nothing is ever used to train another company's model. What you bring to Jah stays yours.

This is not a chatbot demo. Jah is the brain of a full private-AI product that people pay for and use every day.

What Jah does

  • Private chat with live, interactive artifacts (apps, charts, dashboards, diagrams), persistent memory, web search with citations, and in-browser code execution.
  • Builds real apps. Describe what you want and Jah writes a complete project, pushes it to your own GitHub, and deploys it to a live URL you can share β€” from a single sentence. You own the code.
  • Image and video generation.
  • Agentic. Acts across your connected accounts (Gmail, Calendar, Telegram) on your behalf.

All of it private, for $25/mo β€” a fraction of what the cloud labs charge for less. Try it at echols.ai.

Why it exists

The best assistants from the big labs run $100–$200 a month and still send everything you type to their servers. Jah is the opposite bet: a genuinely capable assistant that runs on infrastructure you can see and trust, at a price built for everyone. Privacy here isn't a setting you toggle β€” it's the architecture.

Run it yourself

The weights are open, because owning your AI end-to-end should be possible. Serve Jah on your own GPU with llama.cpp:

# point --model at the first GGUF shard you downloaded
llama-server --model jah-00001-of-NNNNN.gguf \
  -ngl 999 -c 32768 -fa on --jinja \
  --host 0.0.0.0 --port 8080

Tip: for one-shot generation, disable reasoning so the full token budget goes to the answer β€” pass "chat_template_kwargs": {"enable_thinking": false} in your request. A ready-to-run Ollama Modelfile is included in this repo.

The bigger idea

Most AI today is rented from a handful of companies. Jah is owned β€” a private model running a real product on hardware RMDW controls, with open weights so anyone can do the same. That is the future RMDW is building: AI you own, not AI you rent. β†’ rmdw.ai


Jah 1.0 β€” RMDW LLC. Private AI, run on your terms.

Downloads last month
11
GGUF
Model size
381B params
Architecture
glm-dsa
Hardware compatibility
Log In to add your hardware

3-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support