QMD Query Expansion 1.7B (GGUF)

Fine-tuned Qwen3-1.7B for search query expansion in QMD (Query Markup Documents).

Model Details

  • Base model: Qwen/Qwen3-1.7B
  • Fine-tuning: SFT on query expansion pairs using MLX on Apple Silicon
  • Format: GGUF (Q4_K_M quantization)
  • Size: ~1GB
  • Use case: Expanding short search queries into richer search terms for hybrid retrieval

Usage

This model is used by QMD's hybrid search pipeline. Given a short query, it generates expanded search terms: lexical keywords, semantic variations, and a hypothetical document passage in the style of HyDE (Hypothetical Document Embeddings).

Input: "backup strategy"
Output:
/lex: backup restore restic incremental snapshot retention
/sem: data protection disaster recovery redundancy
/hyde: A comprehensive backup strategy includes regular incremental snapshots...
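The three prefixed lines can be split into fields with a few lines of Python. This is a minimal sketch assuming the one-line-per-tag "/lex:", "/sem:", "/hyde:" format shown above; the function name and dict shape are illustrative, not part of QMD's API:

```python
def parse_expansion(text: str) -> dict:
    """Split model output into tagged fields (e.g. lex, sem, hyde).

    Assumes one "/tag: value" line per field, as in the example above.
    """
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("/") and ":" in line:
            tag, _, value = line.partition(":")
            fields[tag.lstrip("/")] = value.strip()
    return fields

expanded = parse_expansion(
    "/lex: backup restore restic incremental snapshot retention\n"
    "/sem: data protection disaster recovery redundancy\n"
    "/hyde: A comprehensive backup strategy includes regular incremental snapshots..."
)
# expanded["lex"] -> "backup restore restic incremental snapshot retention"
```

A downstream retriever would typically route the "lex" terms to a keyword index (e.g. BM25) and embed the "sem" and "hyde" text for vector search.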

Training

  • Fine-tuned with MLX on Apple Silicon (M4)
  • SFT dataset: curated query-expansion pairs
  • Export path: MLX checkpoint → dequantized to FP16 → converted with llama.cpp → quantized to GGUF Q4_K_M

Files

  • qmd-query-expansion-1.7B-q4_k_m.gguf (Q4_K_M, ~1GB): recommended for production

Configuration

Set QMD_GENERATE_MODEL environment variable:

export QMD_GENERATE_MODEL="hf:maft-foundation/qmd-query-expansion-1.7B-gguf/qmd-query-expansion-1.7B-q4_k_m.gguf"
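For reference, the hf:<org>/<repo>/<file> form above can be split into a Hugging Face repo id and a filename. This is a sketch of the convention implied by the example export, not QMD's actual resolver:

```python
def split_hf_spec(spec: str) -> tuple[str, str]:
    """Split an "hf:<org>/<repo>/<file>" spec into (repo_id, filename).

    Assumes the convention shown in the export above; QMD's own parsing
    logic may differ.
    """
    if not spec.startswith("hf:"):
        raise ValueError(f"not an hf: spec: {spec}")
    org, repo, filename = spec[len("hf:"):].split("/", 2)
    return f"{org}/{repo}", filename

repo_id, filename = split_hf_spec(
    "hf:maft-foundation/qmd-query-expansion-1.7B-gguf/"
    "qmd-query-expansion-1.7B-q4_k_m.gguf"
)
# repo_id  -> "maft-foundation/qmd-query-expansion-1.7B-gguf"
# filename -> "qmd-query-expansion-1.7B-q4_k_m.gguf"
```

The resulting pair is the shape expected by Hugging Face Hub download tooling (a repo id plus a file path within the repo).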