# Loki-Omega-70B-6QMerged-GGUF

This repository contains a single merged GGUF file for running ReadyArt/L3.3-The-Omega-Directive-70B-Unslop-v2.0 with llama.cpp (Q6_K quantization).

## File

- **Loki-Omega-70B-6QMerged.gguf** (57 GB): full merged model, ready to use

## Quick Start (llama.cpp server, OpenAI-compatible)

```bash
pip install "llama-cpp-python[server]"
python -m llama_cpp.server \
  --model /path/to/Loki-Omega-70B-6QMerged.gguf \
  --host 0.0.0.0 --port 8000 \
  --n_ctx 32000
```
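Once the server is up, any OpenAI-compatible client can talk to it. A minimal sketch of building a chat-completions request (host, port, and endpoint path follow from the launch command above; the prompt and sampling settings are illustrative):

```python
import json

# OpenAI-compatible endpoint exposed by llama_cpp.server;
# host and port match the launch command above.
API_URL = "http://localhost:8000/v1/chat/completions"

def build_chat_request(prompt: str, max_tokens: int = 256) -> dict:
    """Build an OpenAI-style chat-completions payload for the local server."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": 0.8,  # illustrative sampling setting
    }

payload = build_chat_request("Write a haiku about winter.")
body = json.dumps(payload)
```

Send `body` to `API_URL` with any HTTP client (e.g. `curl` or `requests.post`), or point the official `openai` Python client at `base_url="http://localhost:8000/v1"`.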

## Quantization

Q6_K, chosen to maintain quality and nuance at a manageable file size.

Note: this is the merged version. For split files, see the original repository.
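As a rough sanity check on the 57 GB file size: Q6_K stores roughly 6.56 bits per weight (an approximate effective rate; real files also carry metadata and some mixed-precision tensors), so a 71B-parameter model lands near that figure:

```python
# Back-of-envelope size estimate for a Q6_K quantization.
PARAMS = 71e9            # parameter count from the model card
BITS_PER_WEIGHT = 6.56   # approximate effective rate for Q6_K

size_bytes = PARAMS * BITS_PER_WEIGHT / 8
size_gb = size_bytes / 1e9       # decimal gigabytes
print(f"~{size_gb:.0f} GB")      # same ballpark as the 57 GB file
```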
## Model details

- Format: GGUF
- Model size: 71B params
- Architecture: llama
Model tree: Babsie/OmegaDirective-70B-6Q-Merged