# duckdb-nsql-7b-mlx-8bit

This repository contains an MLX-optimized, 8-bit quantized variant of [motherduckdb/DuckDB-NSQL-7B-v0.1](https://huggingface.co/motherduckdb/DuckDB-NSQL-7B-v0.1), intended for fast, memory-efficient inference on Apple Silicon (M1/M2/M3/M4).
## Model description

DuckDB-NSQL-7B is a 7B-parameter language model fine-tuned to translate natural-language questions into DuckDB SQL. The 8-bit MLX conversion roughly halves weight memory versus FP16 while typically preserving near-FP16 quality for most NL→SQL workloads.
## Conversion details

- Base model: motherduckdb/DuckDB-NSQL-7B-v0.1 (fine-tuned from Llama 2 7B)
- Format: MLX
- Precision: 8-bit quantized
- Typical memory footprint: ~7–8 GB (varies by MLX quantization settings and runtime)
- Recommended for: production and development on 16–32 GB Macs (best quality/speed balance)
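As a sanity check on the quoted footprint, the weight memory of an 8-bit 7B model can be estimated with back-of-the-envelope arithmetic. The ~5% overhead figure for group-wise quantization scales is an assumption; runtime activations and the KV cache add more on top:

```python
# Back-of-the-envelope memory estimate for 8-bit quantized 7B weights.
# Assumptions: ~1 byte per weight, plus ~5% overhead for group-wise
# quantization scales/biases (activations and KV cache are extra).
params = 7_000_000_000
bytes_per_param = 1
overhead = 0.05
gib = params * bytes_per_param * (1 + overhead) / 1024**3
print(f"~{gib:.1f} GiB of weight memory")
```

This lands just under the quoted 7–8 GB range; the remainder is runtime overhead (activations, KV cache, framework buffers).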
## Installation

```bash
pip install mlx-lm
```
## Usage

### Python

```python
from mlx_lm import load, generate

model, tokenizer = load("Nuxera/duckdb-nsql-7b-mlx-8bit")

schema = """
CREATE TABLE hospitals (
    hospital_id BIGINT,
    hospital_name VARCHAR,
    region VARCHAR,
    bed_capacity INTEGER
);

CREATE TABLE encounters (
    encounter_id BIGINT,
    hospital_id BIGINT,
    encounter_datetime TIMESTAMP,
    encounter_type VARCHAR
);
"""

question = "For each hospital region, how many encounters happened this month?"

prompt = f"""You are an assistant that writes valid DuckDB SQL queries.

### Schema:
{schema}

### Question:
{question}

### Response (DuckDB SQL only):"""

# temp=0.0 makes decoding greedy/deterministic, which suits SQL generation.
out = generate(model, tokenizer, prompt=prompt, max_tokens=256, temp=0.0)
print(out)
```
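Models sometimes wrap their answer in markdown fences or continue past the first statement. A small stdlib-only post-processing helper (a hypothetical convenience, not part of mlx-lm) can normalize the raw output before handing it to DuckDB:

```python
import re

def extract_sql(generated: str) -> str:
    """Normalize raw model output into a single SQL statement.

    Hypothetical helper: strips optional markdown code fences and keeps
    only the text up to and including the first semicolon.
    """
    text = generated.strip()
    # Remove fenced-code wrapping if the model emitted it.
    fence = re.match(r"```(?:sql)?\s*(.*?)\s*```", text, re.DOTALL)
    if fence:
        text = fence.group(1)
    # Keep only the first statement.
    end = text.find(";")
    return text[: end + 1] if end != -1 else text

print(extract_sql("```sql\nSELECT region, COUNT(*) FROM encounters;\n```"))
```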
## Run as a local server

```bash
mlx_lm.server --model Nuxera/duckdb-nsql-7b-mlx-8bit --port 8080
```

```bash
curl -X POST http://localhost:8080/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "### Schema:\nCREATE TABLE patients(...);\n\n### Question:\nCount patients by region\n\n### Response (DuckDB SQL only):",
    "max_tokens": 200,
    "temperature": 0
  }'
```
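The same endpoint can be called from Python with only the standard library. `build_payload` mirrors the curl body above; `complete` assumes the OpenAI-style response shape (`choices[0].text`) for the `/v1/completions` route:

```python
import json
import urllib.request

def build_payload(prompt: str, max_tokens: int = 200) -> bytes:
    """JSON request body matching the curl example above."""
    return json.dumps({
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0,
    }).encode("utf-8")

def complete(prompt: str, host: str = "http://localhost:8080") -> str:
    """POST to a running mlx_lm.server instance.

    Assumes an OpenAI-compatible completions response; adjust the
    field access if your server version differs.
    """
    req = urllib.request.Request(
        f"{host}/v1/completions",
        data=build_payload(prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["text"]
```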
## Prompt format

This model works best when you provide:

- A clear schema (tables and columns)
- A single question
- An explicit instruction to output SQL only

Example:

```text
You are an assistant that writes valid DuckDB SQL queries.

### Schema:
CREATE TABLE ...

### Question:
...

### Response (DuckDB SQL only):
```
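The template above can be packaged as a small helper so the schema and question are always assembled consistently (a convenience sketch, not part of the model or mlx-lm):

```python
def build_prompt(schema: str, question: str) -> str:
    """Assemble the three-part prompt format described above."""
    return (
        "You are an assistant that writes valid DuckDB SQL queries.\n"
        "### Schema:\n"
        f"{schema.strip()}\n"
        "### Question:\n"
        f"{question.strip()}\n"
        "### Response (DuckDB SQL only):"
    )

p = build_prompt("CREATE TABLE t (id BIGINT);", "How many rows are in t?")
```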
## Known limitations

- Optimized for DuckDB SQL; other dialects may require edits.
- Very complex joins or deeply nested queries may occasionally need post-processing.
- Like most text-to-SQL models, ambiguous questions can yield multiple "valid" SQL interpretations.
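One cheap guard against these failure modes is to verify, before executing, that the generated SQL only references tables that actually appear in the prompt schema. A minimal regex-based sketch follows (hypothetical helper; a production implementation would use a real SQL parser):

```python
import re

def referenced_unknown_tables(sql: str, known_tables: set) -> set:
    """Return table names after FROM/JOIN that are not in the schema.

    Deliberately simple, regex-based check for catching hallucinated
    table names before sending generated SQL to DuckDB.
    """
    refs = {
        m.group(1).lower()
        for m in re.finditer(r"\b(?:FROM|JOIN)\s+([A-Za-z_]\w*)", sql, re.IGNORECASE)
    }
    return refs - {t.lower() for t in known_tables}

bad = referenced_unknown_tables(
    "SELECT * FROM encounters e JOIN visits v ON e.hospital_id = v.hospital_id",
    {"hospitals", "encounters"},
)
```

An empty result means every referenced table was found in the schema; anything else is worth flagging for review.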
## License

This model inherits the Llama 2 Community License from the base model.
## Citation

```bibtex
@misc{nuxera_duckdb_nsql_mlx_8bit,
  title={DuckDB-NSQL-7B MLX 8-bit Quantized Conversion},
  author={Nuxera AI},
  year={2025},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/Nuxera/duckdb-nsql-7b-mlx-8bit}}
}
```

Base model:

```bibtex
@misc{duckdb_nsql,
  title={DuckDB-NSQL-7B: Natural Language to SQL for DuckDB},
  author={MotherDuck},
  year={2024},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/motherduckdb/DuckDB-NSQL-7B-v0.1}}
}
```
## Acknowledgments

- Original model by MotherDuck
- MLX framework by Apple ML Research
- The mlx-lm library
- Nuxera AI