Spaces:

notingemiu
/

rag-sql-helper

Sleeping

App Files Files Community

rag-sql-helper / README.md

notingemiu

Initial commit: RAG SQL helper Gradio Space

53ff64f verified 14 days ago

preview code

Raw

History Blame Contribute Delete

2.74 kB

A newer version of the Gradio SDK is available: 6.19.0

Upgrade

metadata

title: SQL Helper RAG
emoji: 🧮
colorFrom: indigo
colorTo: blue
sdk: gradio
sdk_version: 5.9.1
app_file: app.py
pinned: false
license: mit
short_description: RAG Q&A over SQL knowledge base (Jina + Qdrant + Groq)

SQL Helper — RAG Q&A

A retrieval-augmented Q&A demo that answers SQL questions grounded in a curated SQL knowledge base.

Architecture

User question
     │
     ▼
Jina embeddings (jina-embeddings-v3, retrieval.query)
     │
     ▼
Qdrant Cloud — top-5 similar chunks from collection "sql_kb"
     │
     ▼
Groq + openai/gpt-oss-20b — answer grounded in retrieved context
     │
     ▼
Answer + cited sources

Stack

Layer	Tool	Why
Embeddings	Jina `jina-embeddings-v3` (1024-dim, multilingual)	Free tier, supports Ukrainian, asymmetric query/passage encoding
Vector DB	Qdrant Cloud	Free 1 GB cluster, managed, low-latency
LLM	Groq `openai/gpt-oss-20b`	Free tier, very fast inference
UI	Gradio	Standard for HF Spaces, quick prototyping

Knowledge base

7 markdown documents covering:

SELECT basics (WHERE, ORDER BY, NULL handling, DISTINCT, aliases)
JOINs (INNER, LEFT, FULL, CROSS, self-join, common mistakes)
Aggregations and GROUP BY (HAVING vs WHERE, NULL behavior, ROLLUP)
Window functions (ranking, frames, LAG/LEAD, FIRST_VALUE)
Subqueries and CTEs (EXISTS, recursive CTE, scalar/derived/correlated)
Indexes and performance (composite indexes, EXPLAIN, common slow patterns)
Common gotchas (NULL behavior, integer division, JOIN-explosion, deep pagination)

Design choices worth noting

Asymmetric encoding — documents use Jina retrieval.passage, questions use retrieval.query. More accurate than encoding both as one type.
Honest refusal on out-of-scope — system prompt explicitly says "if not in context, say so". Tested with non-SQL questions (e.g. MongoDB) — model correctly refuses.
Multilingual — Jina v3 handles Ukrainian/Russian/etc; ask in any language, get answer in same language.
Source citations — every answer shows which chunks were retrieved and their similarity scores.
Low temperature (0.2) — factual Q&A, not creative writing.

Possible extensions

Hybrid search (semantic + BM25) for better code/identifier matching
Jina reranker v2 on top-20 → top-5 for higher precision
LLM-as-judge eval set for measurable quality

Author

Built as part of a portfolio for AI/LLM Engineer roles.

Companion models on HF: llama-3.2-3b-text2sql-lora, llama-3.2-3b-ukrainian-alpaca-lora