How to use from
llama.cpp
Install (macOS, Linux)
curl -LsSf https://llama.app/install.sh | sh
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf ventilabs/miseai
# Run inference directly in the terminal:
llama cli -hf ventilabs/miseai
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf ventilabs/miseai
# Run inference directly in the terminal:
llama cli -hf ventilabs/miseai
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf ventilabs/miseai
# Run inference directly in the terminal:
./llama-cli -hf ventilabs/miseai
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf ventilabs/miseai
# Run inference directly in the terminal:
./build/bin/llama-cli -hf ventilabs/miseai
Use Docker
docker model run hf.co/ventilabs/miseai
Quick Links

Venti MiseAI 1.1

Intelligence that lives on your machine.

MiseAI is a powerful, private AI assistant built by Venti Labs. It runs 100% locally on your hardware β€” no cloud, no API keys, no data leaving your device.

Highlights

  • 🧠 7B Parameters β€” Fine-tuned from Qwen 2.5 Coder 7B
  • πŸ”’ Fully Private β€” Runs offline, no internet required after download
  • πŸ’» Expert Coder β€” Production-ready code generation and refactoring
  • ⚑ 8GB VRAM β€” Optimized to run on consumer GPUs
  • πŸ“¦ GGUF Format β€” Ready for Ollama, llama.cpp, LM Studio

Quick Start (Ollama)

ollama run ventilabs/miseai

Or install the Venti CLI:

irm venti-labs.xyz/install | iex
venti launch mise

Model Details

Property Value
Base Model Qwen 2.5 Coder 7B
Fine-tuning LoRA (QLoRA)
Quantization Q8_0
File Size ~8.1 GB
Context Window 16,384 tokens
Max Output 8,192 tokens

Use Cases

  • Code Generation β€” Write production-ready code in any language
  • Code Refactoring β€” Optimize and restructure existing codebases
  • Problem Solving β€” Step-by-step reasoning through complex challenges
  • Technical Writing β€” Documentation, README files, and technical articles

Links


Built with ❀️ by Venti Labs © 2026

Downloads last month
115
GGUF
Model size
8B params
Architecture
qwen2
Hardware compatibility
Log In to add your hardware

We're not able to determine the quantization variants.

Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support

Model tree for ventilabs/miseai

Base model

Qwen/Qwen2.5-7B
Quantized
(43)
this model