Spaces:
Sleeping
Sleeping
A newer version of the Gradio SDK is available: 6.19.0
metadata
title: SQL Helper RAG
emoji: 🧮
colorFrom: indigo
colorTo: blue
sdk: gradio
sdk_version: 5.9.1
app_file: app.py
pinned: false
license: mit
short_description: RAG Q&A over SQL knowledge base (Jina + Qdrant + Groq)
SQL Helper — RAG Q&A
A retrieval-augmented Q&A demo that answers SQL questions grounded in a curated SQL knowledge base.
Architecture
User question
│
â–¼
Jina embeddings (jina-embeddings-v3, retrieval.query)
│
â–¼
Qdrant Cloud — top-5 similar chunks from collection "sql_kb"
│
â–¼
Groq + openai/gpt-oss-20b — answer grounded in retrieved context
│
â–¼
Answer + cited sources
Stack
| Layer | Tool | Why |
|---|---|---|
| Embeddings | Jina jina-embeddings-v3 (1024-dim, multilingual) |
Free tier, supports Ukrainian, asymmetric query/passage encoding |
| Vector DB | Qdrant Cloud | Free 1 GB cluster, managed, low-latency |
| LLM | Groq openai/gpt-oss-20b |
Free tier, very fast inference |
| UI | Gradio | Standard for HF Spaces, quick prototyping |
Knowledge base
7 markdown documents covering:
- SELECT basics (WHERE, ORDER BY, NULL handling, DISTINCT, aliases)
- JOINs (INNER, LEFT, FULL, CROSS, self-join, common mistakes)
- Aggregations and GROUP BY (HAVING vs WHERE, NULL behavior, ROLLUP)
- Window functions (ranking, frames, LAG/LEAD, FIRST_VALUE)
- Subqueries and CTEs (EXISTS, recursive CTE, scalar/derived/correlated)
- Indexes and performance (composite indexes, EXPLAIN, common slow patterns)
- Common gotchas (NULL behavior, integer division, JOIN-explosion, deep pagination)
Design choices worth noting
- Asymmetric encoding — documents use Jina
retrieval.passage, questions useretrieval.query. More accurate than encoding both as one type. - Honest refusal on out-of-scope — system prompt explicitly says "if not in context, say so". Tested with non-SQL questions (e.g. MongoDB) — model correctly refuses.
- Multilingual — Jina v3 handles Ukrainian/Russian/etc; ask in any language, get answer in same language.
- Source citations — every answer shows which chunks were retrieved and their similarity scores.
- Low temperature (0.2) — factual Q&A, not creative writing.
Possible extensions
- Hybrid search (semantic + BM25) for better code/identifier matching
- Jina reranker v2 on top-20 → top-5 for higher precision
- LLM-as-judge eval set for measurable quality
Author
Built as part of a portfolio for AI/LLM Engineer roles.
- Companion models on HF:
llama-3.2-3b-text2sql-lora,llama-3.2-3b-ukrainian-alpaca-lora