rag-sql-helper / README.md
notingemiu's picture
Initial commit: RAG SQL helper Gradio Space
53ff64f verified
|
Raw
History Blame Contribute Delete
2.74 kB

A newer version of the Gradio SDK is available: 6.19.0

Upgrade
metadata
title: SQL Helper RAG
emoji: 🧮
colorFrom: indigo
colorTo: blue
sdk: gradio
sdk_version: 5.9.1
app_file: app.py
pinned: false
license: mit
short_description: RAG Q&A over SQL knowledge base (Jina + Qdrant + Groq)

SQL Helper — RAG Q&A

A retrieval-augmented Q&A demo that answers SQL questions grounded in a curated SQL knowledge base.

Architecture

User question
     │
     â–¼
Jina embeddings (jina-embeddings-v3, retrieval.query)
     │
     â–¼
Qdrant Cloud — top-5 similar chunks from collection "sql_kb"
     │
     â–¼
Groq + openai/gpt-oss-20b — answer grounded in retrieved context
     │
     â–¼
Answer + cited sources

Stack

Layer Tool Why
Embeddings Jina jina-embeddings-v3 (1024-dim, multilingual) Free tier, supports Ukrainian, asymmetric query/passage encoding
Vector DB Qdrant Cloud Free 1 GB cluster, managed, low-latency
LLM Groq openai/gpt-oss-20b Free tier, very fast inference
UI Gradio Standard for HF Spaces, quick prototyping

Knowledge base

7 markdown documents covering:

  • SELECT basics (WHERE, ORDER BY, NULL handling, DISTINCT, aliases)
  • JOINs (INNER, LEFT, FULL, CROSS, self-join, common mistakes)
  • Aggregations and GROUP BY (HAVING vs WHERE, NULL behavior, ROLLUP)
  • Window functions (ranking, frames, LAG/LEAD, FIRST_VALUE)
  • Subqueries and CTEs (EXISTS, recursive CTE, scalar/derived/correlated)
  • Indexes and performance (composite indexes, EXPLAIN, common slow patterns)
  • Common gotchas (NULL behavior, integer division, JOIN-explosion, deep pagination)

Design choices worth noting

  • Asymmetric encoding — documents use Jina retrieval.passage, questions use retrieval.query. More accurate than encoding both as one type.
  • Honest refusal on out-of-scope — system prompt explicitly says "if not in context, say so". Tested with non-SQL questions (e.g. MongoDB) — model correctly refuses.
  • Multilingual — Jina v3 handles Ukrainian/Russian/etc; ask in any language, get answer in same language.
  • Source citations — every answer shows which chunks were retrieved and their similarity scores.
  • Low temperature (0.2) — factual Q&A, not creative writing.

Possible extensions

  • Hybrid search (semantic + BM25) for better code/identifier matching
  • Jina reranker v2 on top-20 → top-5 for higher precision
  • LLM-as-judge eval set for measurable quality

Author

Built as part of a portfolio for AI/LLM Engineer roles.