---
title: EduCrate — Socratic Tutor
emoji: 📘
colorFrom: gray
colorTo: blue
sdk: gradio
app_file: app.py
pinned: false
license: apache-2.0
short_description: Spanish Socratic tutor on CPU; never gives the answer.
tags:
- gradio
- build-small-hackathon
- track:backyard
- badge-tiny-titan
- tiny-titan
- achievement:offgrid
- achievement:welltuned
- achievement:fieldnotes
- achievement:offbrand
- sponsor:modal
models:
- build-small-hackathon/educrate-qwen3-sft
- fabrizziomcl/nanoballena-qwen3-socratic
---
# EduCrate: A Socratic Tutor for Peruvian Public-School Students

EduCrate never gives the final answer. It guides students with one question at a time,
detects their mistakes, and offers progressive hints (the maieutic method) so they
discover the answer themselves. It tutors in Spanish, focuses on mathematical
reasoning and reading comprehension, and is small enough to run on a CPU.
## Links

- Demo video: _TODO: paste link_
- Social post: _TODO: paste link_
- Model: https://huggingface.co/build-small-hackathon/educrate-qwen3-bi
## The problem

Peru's public secondary schools face a learning crisis. In PISA 2022 (OECD), only 34%
of Peruvian 15-year-olds reached basic proficiency in math (66% fell below it), and 50%
did so in reading. Peru's national assessment (ECE / MINEDU, grade 8, 2022) found that
only about 12.7% of students reached the Satisfactory level in math, with public (state)
schools far behind private ones. Most chatbots just hand over the answer, which does not
build reasoning.
## The model

- Base: **Qwen/Qwen3-0.6B** (596M parameters), fine-tuned with **LoRA** on **~4,000
  bilingual (Spanish+English) Socratic dialogues** generated for this project, each with
  brief hidden reasoning. LoRA trains small adapter matrices while the base weights stay
  frozen, which helps preserve the base model's competence and limits catastrophic
  forgetting. Runs on **CPU**.
- Model: `build-small-hackathon/educrate-qwen3-bi`
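
To make the LoRA point concrete, here is a minimal arithmetic sketch of how few parameters an adapter adds compared with the base model. The hidden size, layer count, rank, and number of adapted matrices are illustrative assumptions, not EduCrate's actual training configuration; only the 596M figure comes from this README.

```python
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one (d_out x d_in) weight matrix.

    LoRA learns an update B @ A with A of shape (rank, d_in) and B of shape
    (d_out, rank); the original weight matrix stays frozen.
    """
    return rank * (d_in + d_out)


# All values below are illustrative assumptions, not EduCrate's real settings.
HIDDEN = 1024            # assumed hidden size
LAYERS = 28              # assumed number of transformer layers
RANK = 16                # assumed LoRA rank
ADAPTED_PER_LAYER = 2    # assume two square projections are adapted per layer

BASE_PARAMS = 596_000_000  # parameter count quoted in this README

added = LAYERS * ADAPTED_PER_LAYER * lora_params(HIDDEN, HIDDEN, RANK)
share = added / BASE_PARAMS

print(f"trainable LoRA parameters: {added:,}")   # trainable LoRA parameters: 1,835,008
print(f"share of base model: {share:.2%}")       # share of base model: 0.31%
```

Under these assumptions the adapter is well under 1% of the base model, which is why the frozen base weights dominate the model's behaviour.
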
## Evaluation (held-out, rigorous)

**Socratic behavior**: answer-withholding rate and question-asking rate on held-out mGSM
problems (greedy decoding, `<think>` blocks stripped before scoring):

| Model | ES withhold / asks | EN withhold / asks |
|---|---|---|
| Qwen3-0.6B (instruct) | 0.84 / 1.00 | 0.91 / 1.00 |
| **EduCrate** | **1.00 / 1.00** | **1.00 / 1.00** |
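
The README does not document the scoring script, so the following is only a sketch of how "withhold" and "asks" rates of this kind could be computed. It assumes a reply withholds the answer when the gold number never appears in it, and asks when it contains a question mark; the helper names and example replies are hypothetical.

```python
import re

THINK_RE = re.compile(r"<think>.*?</think>", flags=re.DOTALL)


def strip_think(text: str) -> str:
    """Remove hidden reasoning blocks before scoring the visible reply."""
    return THINK_RE.sub("", text).strip()


def withholds_answer(reply: str, gold: str) -> bool:
    """Assumed rule: the gold integer must not appear as a number in the reply."""
    no_separators = re.sub(r"(?<=\d),(?=\d)", "", reply)  # "1,000" -> "1000"
    return gold not in re.findall(r"\d+", no_separators)


def asks_question(reply: str) -> bool:
    """Assumed rule: a reply counts as asking if it contains a question mark."""
    return "?" in reply


def score(replies: list[str], golds: list[str]) -> tuple[float, float]:
    """Return (withhold rate, asks rate) over paired replies and gold answers."""
    visible = [strip_think(r) for r in replies]
    withheld = sum(withholds_answer(r, g) for r, g in zip(visible, golds))
    asked = sum(asks_question(r) for r in visible)
    return withheld / len(visible), asked / len(visible)


# Hypothetical replies, for illustration only.
replies = [
    "<think>The answer is 12.</think>¿Cuántas manzanas tenía Ana al principio?",
    "La respuesta es 12.",
]
golds = ["12", "12"]
print(score(replies, golds))  # (0.5, 0.5)
```

A string-matching rule like this over-penalizes replies that mention a number from the problem which happens to equal the answer, so a real evaluation may need a stricter check.
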

**Underlying capability**: accuracy compared with the Qwen3-0.6B base and instruct models:

| Model | mGSM ES/EN (math) | BELEBELE ES/EN (reading) |
|---|---|---|
| Qwen3-0.6B-Base | 0.00 / 0.00 | 0.20 / 0.15 |
| Qwen3-0.6B (instruct) | 0.44 / 0.51 | 0.40 / 0.39 |
| **EduCrate** | 0.34 / 0.43 | **0.51 / 0.54** |

Reading comprehension *improved* over the instruct model (from 0.40 to 0.51 in Spanish
and from 0.39 to 0.54 in English). Math solve-accuracy drops by 8 to 10 points (from 0.44
to 0.34 in Spanish and from 0.51 to 0.43 in English), which is expected because the model
is trained to *guide*, not solve. English is retained alongside Spanish, which is
consistent with LoRA limiting forgetting.

**Socratic quality (LLM-as-judge, MRBench-style rubric, 0–2; judge = Qwen2.5-32B, n=10):**

| Model | Overall | Withholds answer | Guidance | Coherence | Tone |
|---|---|---|---|---|---|
| Qwen3-0.6B-Base | 0.78 | 1.2 | 0.6 | 0.8 | 0.8 |
| Qwen3-0.6B (instruct) | 1.27 | 1.4 | 1.2 | 1.8 | 1.3 |
| **EduCrate** | **1.72** | **2.0** | **1.9** | **1.9** | **1.8** |

EduCrate scores highest on every dimension in this small sample (n=10), which suggests
the fine-tune improves tutoring quality, not just answer-withholding.
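
Per-dimension scores like those in the table can be produced by averaging the judge's 0–2 ratings over the judged dialogues. The sketch below assumes one rating per dimension per dialogue; the dictionary keys and example ratings are hypothetical, and because the README does not say how the Overall column is derived, it is left out.

```python
from statistics import mean

# Dimension names mirror the table columns; the key spelling is an assumption.
DIMENSIONS = ("withholds_answer", "guidance", "coherence", "tone")


def aggregate(judgments: list[dict[str, int]]) -> dict[str, float]:
    """Average 0-2 rubric ratings per dimension across judged dialogues."""
    for judgment in judgments:
        for dim in DIMENSIONS:
            if not 0 <= judgment[dim] <= 2:
                raise ValueError(f"{dim} rating out of range: {judgment[dim]}")
    return {dim: round(mean(j[dim] for j in judgments), 2) for dim in DIMENSIONS}


# Hypothetical judge ratings for two dialogues, for illustration only.
judgments = [
    {"withholds_answer": 2, "guidance": 2, "coherence": 2, "tone": 1},
    {"withholds_answer": 2, "guidance": 1, "coherence": 2, "tone": 2},
]
result = aggregate(judgments)
print(result)
```

With only ten dialogues per model, a single rating moves a dimension's mean by 0.1 to 0.2, so small gaps between models should be read with care.
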
## How to use

Click an example, or optionally paste a reading passage, select what you need help with,
and ask your question. The tutor replies in Spanish with a guiding question, never the
answer. It is a 0.6B model, so guidance is sometimes imperfect.

> Built for the Build Small Hackathon: Backyard AI track, Tiny Titan badge (≤4B).
> Made with generative AI; validate any pedagogical use with a teacher.