---
title: QMD Web Demo
emoji: 🔍
colorFrom: blue
colorTo: green
sdk: static
pinned: false
license: mit
---

# QMD Web Demo

In-browser hybrid search pipeline using WebGPU + Transformers.js v4.

Demonstrates the full QMD search pipeline running entirely in your browser:
1. **Query Expansion** — Qwen3 1.7B generates HyDE, semantic, and keyword variants
2. **Parallel Search** — BM25 keyword search + vector similarity search
3. **Reciprocal Rank Fusion** — Merges results from multiple search backends
4. **LLM Reranking** — Qwen3 Reranker 0.6B scores document relevance
5. **Score Blending** — Position-aware combination of RRF and reranker scores
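
Steps 3 and 5 above can be sketched in a few lines of TypeScript. This is a minimal illustration, not the demo's actual code: the function names, the RRF constant `k = 60`, and the rank-tier weights in `blendScores` are all assumptions chosen for the example.

```typescript
// Reciprocal Rank Fusion: each list contributes 1 / (k + rank) per document,
// with rank starting at 1. k = 60 is a commonly used default (assumed here).
function reciprocalRankFusion(lists: string[][], k = 60): Map<string, number> {
  const scores = new Map<string, number>();
  for (const list of lists) {
    list.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return scores;
}

// Return document ids ordered by descending score.
function sortByScore(scores: Map<string, number>): string[] {
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// Hypothetical position-aware blend: documents near the top of the fused
// ranking trust their fused position more, lower ones lean on the reranker.
// The tier boundaries and weights are illustrative assumptions.
function blendScores(
  fusedOrder: string[],
  rerankScores: Map<string, number>, // assumed to be in [0, 1]
): Map<string, number> {
  const blended = new Map<string, number>();
  fusedOrder.forEach((id, index) => {
    const rank = index + 1;
    const rrfWeight = rank <= 3 ? 0.75 : rank <= 10 ? 0.6 : 0.4;
    const positionScore = 1 / rank;
    const rerankScore = rerankScores.get(id) ?? 0;
    blended.set(id, rrfWeight * positionScore + (1 - rrfWeight) * rerankScore);
  });
  return blended;
}

// Usage: fuse a keyword ranking with a vector ranking, then blend.
const bm25Hits = ["a", "b", "c"];
const vectorHits = ["b", "c", "d"];
const fusedOrder = sortByScore(reciprocalRankFusion([bm25Hits, vectorHits]));
const finalScores = blendScores(
  fusedOrder,
  new Map<string, number>([["a", 0.9], ["b", 0.2], ["c", 0.5], ["d", 0.1]]),
);
console.log(fusedOrder, sortByScore(finalScores));
```

Documents that appear in both result lists (`b` and `c` here) accumulate score from each list, which is why RRF tends to favor results that several backends agree on.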

## Requirements

- Chrome 113+ or Edge 113+ (WebGPU required)
- ~2.5 GB model download on first visit (cached for subsequent visits)
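
To check the WebGPU requirement before starting the large model download, a page could probe `navigator.gpu` first. The sketch below is one possible approach, not the demo's own code: `hasWebGPU`, `GpuLike`, and `NavigatorLike` are hypothetical names, and the small structural interfaces exist only so the example compiles without the `@webgpu/types` package.

```typescript
// Minimal structural types (hypothetical) standing in for the browser's
// WebGPU typings, so this sketch has no external dependencies.
interface GpuLike {
  requestAdapter(): Promise<object | null>;
}
interface NavigatorLike {
  gpu?: GpuLike;
}

// Resolves to true only if the WebGPU entry point exists AND an adapter
// can actually be obtained; some browsers expose navigator.gpu but
// return null when no suitable GPU is available.
async function hasWebGPU(nav: NavigatorLike): Promise<boolean> {
  if (!nav.gpu) return false;
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter !== null && adapter !== undefined;
  } catch {
    return false;
  }
}

// In a browser this would be called as:
//   const ok = await hasWebGPU(navigator as NavigatorLike);
//   if (!ok) { /* show an "unsupported browser" message instead of loading models */ }
```

Taking the navigator object as a parameter keeps the check easy to exercise outside a browser.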

## Models

- [embeddinggemma-300M](https://huggingface.co/onnx-community/embeddinggemma-300m-ONNX) — Embeddings
- [Qwen3-Reranker-0.6B](https://huggingface.co/onnx-community/Qwen3-Reranker-0.6B-ONNX) — Reranking
- [qmd-query-expansion-1.7B](https://huggingface.co/shreyask/qmd-query-expansion-1.7B-ONNX) — Query expansion

Based on [QMD](https://github.com/tobi/qmd) by Tobi Lütke.