---
license: mit
tags:
- text-to-sql
- education
- socratic-learning
- instruction-tuning
- sql
- STEM
- pedagogy
datasets:
- SQL-Instruct
---

# SQL Socratic Models

## Model Description

SQL Socratic Models are a collection of fine-tuned large language models designed for **Socratic SQL instruction in higher education**. Unlike standard Text-to-SQL systems, these models are trained to **guide learners through reasoning steps without producing final SQL solutions**, supporting conceptual understanding and active learning in STEM contexts.

Supported architectures:

- Phi-3
- Qwen2.5
- Gemma2

---

## Intended Use

These models are designed for:

- Teaching SQL concepts in higher education
- Supporting STEM learners through guided reasoning
- Providing step-by-step Socratic hints for SQL problems
- Assisting with debugging and conceptual clarification

### Important Constraint

The models are intentionally trained to:

- ✅ Provide reasoning steps and conceptual hints
- ❌ Avoid generating complete SQL solutions

This ensures alignment with pedagogical goals such as scaffolding and learner engagement.

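This constraint can also be checked at inference time. Below is a minimal post-generation guard, a hypothetical sketch that is not part of the released models, which flags a response that looks like a complete SQL statement rather than a conceptual hint:

```python
import re

# Hypothetical guard (not part of the released models): flag a response that
# appears to contain a complete SQL statement instead of a conceptual hint.
SQL_STATEMENT = re.compile(
    r"\bSELECT\s+.+?\s+FROM\s+\w+"
    r"|\bINSERT\s+INTO\s+\w+"
    r"|\bUPDATE\s+\w+\s+SET\b"
    r"|\bDELETE\s+FROM\s+\w+",
    re.IGNORECASE | re.DOTALL,
)

def leaks_full_solution(response: str) -> bool:
    """Return True if the model response looks like a full SQL answer."""
    return bool(SQL_STATEMENT.search(response))

hint = "Think about which column the two tables share before picking a JOIN key."
answer = "SELECT name FROM users JOIN orders ON users.id = orders.user_id;"
```

A regex guard like this only catches surface-level leakage; it complements, rather than replaces, the training-time constraint.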
---

## Training Data: SQL-Instruct Corpus

We construct **SQL-Instruct**, a domain-specific Socratic instruction corpus, by mining high-quality interactions from Stack Overflow. This platform captures real-world misconceptions, debugging challenges, and conceptual gaps encountered by learners and practitioners.

### Data Collection

To ensure high-quality instructional signals, we filter SQL-tagged questions based on community impact. The resulting dataset has:

- **1.27 billion total views**
- **128,535 average views per question**

For each selected entry, we extract:

- Problem descriptions
- User-submitted SQL attempts
- Executable SQL from accepted solutions

This yields **9,916 unique questions**.

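An impact-based filter of this kind can be pictured as a simple threshold pass over question metadata. The field names and view-count threshold below are assumptions for the sketch, not the exact criteria used to build SQL-Instruct:

```python
# Illustrative filtering pass over SQL-tagged question metadata.
# Field names and the view-count threshold are assumptions for this sketch,
# not the exact criteria used to build SQL-Instruct.
questions = [
    {"id": 1, "views": 450_000, "score": 35, "has_accepted_answer": True},
    {"id": 2, "views": 1_200, "score": 2, "has_accepted_answer": True},
    {"id": 3, "views": 900_000, "score": 58, "has_accepted_answer": False},
]

def keep(q: dict, min_views: int = 100_000) -> bool:
    # Keep high-impact questions that also have an accepted (executable) answer.
    return q["views"] >= min_views and q["has_accepted_answer"]

selected = [q for q in questions if keep(q)]
```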
---

### Socratic Augmentation

Each example is transformed into a Socratic instructional format using GPT-4o, which generates:

- Guided reasoning steps
- Conceptual hints
- Question decomposition

This ensures the dataset emphasizes **instructional scaffolding rather than answer generation**.

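The augmentation step can be pictured as a chat request that asks GPT-4o for guidance instead of answers. The wording below is a hypothetical sketch; the actual prompt used to build SQL-Instruct is not published in this card:

```python
def build_socratic_prompt(problem: str, attempt: str) -> list[dict]:
    """Assemble a chat request asking for Socratic guidance, never a full query.

    The system instruction here is illustrative, not the actual prompt used
    to build SQL-Instruct.
    """
    system = (
        "You are a Socratic SQL tutor. Decompose the problem, give conceptual "
        "hints, and guide reasoning step by step. Never write a complete SQL "
        "query."
    )
    user = f"Problem:\n{problem}\n\nLearner attempt:\n{attempt}"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]

messages = build_socratic_prompt(
    "Find customers with more than five orders.",
    "SELECT * FROM customers WHERE orders > 5",
)
```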
---

### Dataset Composition

- **Intermediate questions:** 8,604
- **Advanced questions:** 629
- **Debugging tasks:** 531

The dataset emphasizes challenging reasoning scenarios, particularly:

- JOIN operations
- Aggregations and grouping
- Query optimization

We further ensure reliability by selecting entries with a **median Stack Overflow score of 27**.

---

## Training Procedure

### Phase 2: Fine-Tuning

We apply **full fine-tuning (FFT)** to small, open-source LLMs under pedagogical constraints designed to:

- Encourage conceptual scaffolding
- Promote step-by-step reasoning
- Discourage direct SQL answer generation

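Under these constraints, each record can be serialized into a prompt/target pair whose target is the Socratic guidance, with the accepted SQL deliberately excluded. A hypothetical sketch of that formatting step (the field names are assumptions):

```python
def to_training_example(record: dict) -> dict:
    """Turn one SQL-Instruct record into a prompt/target pair for fine-tuning.

    Hypothetical field names: the target is the Socratic guidance, and the
    accepted SQL is deliberately excluded from the target text.
    """
    prompt = (
        "Guide the learner with Socratic questions; do not write the query.\n\n"
        f"Problem: {record['problem']}\n"
        f"Attempt: {record['attempt']}"
    )
    return {"prompt": prompt, "target": record["socratic_guidance"]}

example = to_training_example({
    "problem": "Count orders per customer.",
    "attempt": "SELECT COUNT(*) FROM orders",
    "accepted_sql": "SELECT customer_id, COUNT(*) FROM orders GROUP BY customer_id;",
    "socratic_guidance": "What should each output row represent? Which clause groups rows?",
})
```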
---

## Evaluation

### Phase 3 Metrics

Models are evaluated using:

- **BERTScore** → semantic alignment with expected reasoning
- **ROUGE-L** → detection of answer leakage (i.e., unintended full SQL generation)

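ROUGE-L scores the longest common subsequence (LCS) between a model response and the reference SQL, so high overlap suggests the full query leaked into the response. A minimal LCS-based sketch, not the evaluation harness used here:

```python
def lcs_length(a: list[str], b: list[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[-1][-1]

def rouge_l_recall(response: str, reference_sql: str) -> float:
    """Fraction of reference tokens recovered, in order, by the response."""
    ref = reference_sql.split()
    return lcs_length(response.split(), ref) / len(ref)

leaky = "Try this: SELECT name FROM users WHERE age > 21"
socratic = "Which column holds the age, and how would you restrict rows by it?"
reference = "SELECT name FROM users WHERE age > 21"
```

A leaky response recovers the reference query verbatim (recall 1.0), while a genuinely Socratic hint shares few tokens with it.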
---

## Key Contributions

- Socratic SQL instruction tuning for higher education
- SQL-Instruct dataset derived from real-world misconceptions
- Multi-model fine-tuning across Phi-3, Qwen2.5, and Gemma2
- Evaluation framework balancing reasoning quality and answer leakage
- Ablation study identifying factors enabling:
  - Misconception-based feedback
  - Iterative guidance
  - Instructor-like reasoning behavior

---

## Limitations

- Models may still occasionally generate partial SQL fragments
- Evaluation focuses on semantic similarity rather than full pedagogical outcomes
- The dataset is derived from Stack Overflow and may reflect community biases

---

## Ethical Considerations

These models are designed to support learning, not replace it. By avoiding full solution generation, they aim to:

- Encourage critical thinking
- Reduce over-reliance on AI-generated answers
- Support equitable access to SQL learning resources

---

## Usage

A minimal loading sketch. The per-model checkpoints are assumed to live in subfolders of the repository (e.g. `phi3`), so `subfolder` is passed to `from_pretrained` rather than appending the path to the repo ID:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# `subfolder` selects the per-model checkpoint directory (assumed repo layout)
model = AutoModelForCausalLM.from_pretrained(
    "sriram882004/SQL-Socratic-Models", subfolder="phi3"
)
tokenizer = AutoTokenizer.from_pretrained(
    "sriram882004/SQL-Socratic-Models", subfolder="phi3"
)
```