--- datasets: - Jackrong/Claude-opus-4.6-TraceInversion-9000x - Hypersniper/philosophy_dialogue - sanjaypantdsd/socratic-method-conversations - sanjaypantdsd/socratic-content-dataset - fadodr/mental_health_therapy base_model: - openbmb/MiniCPM5-1B language: - en tags: - instruction-tuned - socratic-reasoning - educational-assistant - tutoring - tool-use - reasoning - conversational-ai license: apache-2.0 --- # Aletheia (MiniCPM5-1B Socratic Tutor) Aletheia is an instruction-tuned version of **MiniCPM5-1B** designed specifically for **Socratic tutoring, structured reasoning, reflective thinking, and educational assistance**. The model is optimized to behave as a **guided learning assistant rather than a direct-answer system**, encouraging users to think critically and develop their own understanding. --- ## 🧠 Model Purpose Aletheia is designed for: - Socratic questioning and guided discovery learning - Critical thinking and reflective reasoning - Educational tutoring across STEM and humanities subjects - Mental health–aware supportive dialogue (non-clinical) - Structured reasoning over complex topics - Multi-turn conversational learning It is **not designed to be a factual authority or search engine replacement**, but rather a reasoning-oriented tutor that helps users arrive at answers through guided thinking. --- ## ⚙️ Base Model - Base: `openbmb/MiniCPM5-1B` - Architecture: Small-scale instruction-tuned transformer - Training type: Multi-stage supervised fine-tuning (SFT) --- ## 📚 Training Data Aletheia was trained on a mixture of reasoning, dialogue, and educational datasets: - Socratic method conversations - Philosophy and reflective dialogue - Deep reasoning and revision datasets - Multi-turn conversational tutoring data - Mental health supportive dialogue (non-diagnostic) - Trace-based reasoning inversion dataset This combination encourages: - questioning over answering - structured reasoning chains - reflective dialogue - adaptive tutoring style The following training specifics were used: ```json BASE_MODEL = "openbmb/MiniCPM5-1B" STAGES = [ { "name": "Phase1", "dataset": "sanjaypantdsd/socratic-method-conversations", "output": "outputs/socratic_foundation", "max_seq": 32000, "lr": 5e-6, "epochs": 2, "packing": True, }, { "name": "Phase2", "dataset": "sanjaypantdsd/socratic-content-dataset", "output": "outputs/socratic_content", "max_seq": 32000, "lr": 3e-6, "epochs": 1, "packing": True, }, { "name": "Phase3", "dataset": "kulia-moon/DeepRethink", "output": "outputs/deep_rethink", "max_seq": 32000, "lr": 2e-5, "epochs": 1, "packing": True, }, { "name": "Phase4", "dataset": "Jackrong/Claude-opus-4.6-TraceInversion-9000x", "output": "outputs/trace_inversion", "max_seq": 32000, "lr": 1e-5, "epochs": 1, "packing": True, }, { "name": "Phase5", "dataset": "Mustafaege/qwen3.5-toolcalling-v2", "output": "outputs/tool_calling", "max_seq": 32000, "lr": 8e-6, "epochs": 0.06, "packing": True, }, { "name": "Phase6", "dataset": "fadodr/mental_health_therapy", "output": "outputs/final_model", "max_seq": 32000, "lr": 5e-6, "epochs": 1, "packing": True, }, { "name": "Phase7", "dataset": "sanjaypantdsd/socratic-method-conversations", "output": "outputs/final_v2", "max_seq": 32000, "lr": 2e-6, "epochs": 1, "packing": True, }, ] ``` --- ## 🎯 Intended Behaviour Aletheia is trained to: - Ask guiding questions instead of immediately giving answers - Break problems into smaller conceptual steps - Encourage reflection and reasoning from the user - Provide hints and partial scaffolding rather than full solutions - Maintain a calm, supportive, educational tone Example behaviour: **User:** What is photosynthesis? **Aletheia:** - Instead of giving a full definition immediately, - it may ask: - "What do you think plants use sunlight for?" - "Where do you think energy is stored in a plant?" Then gradually builds toward the explanation. --- ## Evaluation Results Evaluated using the EleutherAI LM Evaluation Harness (`lm-eval`) on a Radeon RX 7900 XTX. ### Core Benchmarks | Benchmark | Score | |------------|--------:| | MMLU-Pro (5-shot) | 27.91% | | GSM8K (5-shot) | 39.88% | | ARC-Challenge | 33.62% | | HellaSwag | 38.00% | | HellaSwag (Norm) | 48.37% | | Winogrande | 57.22% | ### MMLU-Pro Breakdown | Subject | Score | |----------|-------:| | Biology | 48.26% | | Psychology | 42.11% | | Economics | 38.63% | | Math | 38.34% | | Computer Science | 30.49% | | Philosophy | 28.46% | | Health | 28.24% | | Business | 27.50% | | Other | 27.16% | | Physics | 22.56% | | History | 21.78% | | Chemistry | 16.17% | | Engineering | 15.79% | | Law | 13.99% | ### Evaluation Command ```bash lm-eval run \ --model hf \ --model_args pretrained= \ --tasks mmlu_pro,gsm8k,hellaswag,winogrande,arc_challenge \ --device cuda:0 ``` ### Notes These benchmarks were obtained after multi-stage fine-tuning on Socratic dialogue, reasoning, reflective thinking, educational tutoring, tool-calling, and conversational support datasets. The model is optimized for: - Educational tutoring - Socratic questioning - Guided reasoning - Critical thinking - Research assistance rather than direct-answer benchmark optimization. ``` ## 🧩 Tool Use (Optional) This model may be integrated with external tools such as: - web_search (for external factual retrieval) - research_topic (multi-query structured research tool) - knowledge retrieval systems (RAG) When tools are available, the model should: - prefer tools for factual retrieval - focus on synthesis and explanation after tool output While it has tool use capabilities, they are very weak. Keep tools simple. --- ## ⚠️ Limitations - Not a verified factual authority - May occasionally over-focus on questioning instead of direct answers - Tool usage depends on external system configuration - Not a substitute for medical, legal, or psychological professionals - Mental health responses are supportive only, not clinical advice - May hallucinate if used without retrieval tools --- ## 🚫 Safety Notes The model may be used in educational contexts involving sensitive topics (e.g. health, psychology, ethics). However: - It does not provide professional medical diagnosis - It should not be used as a sole source of truth for critical decisions - Outputs should be reviewed in high-stakes contexts Therapy themes might surface depending on certain prompts, specifically the model scalding ITSELF. --- ## 💡 Recommended System Prompt Style For best performance, use a system prompt that enforces: - Socratic questioning - Reduced direct answering - Step-by-step guided reasoning - Use of tools for factual retrieval when available --- ## 🔧 Suggested Integration Best used with: - Open WebUI - LM Studio OpenAI-compatible API - Tool-enabled agent pipelines - Retrieval-augmented knowledge bases (RAG) --- ## 🌻 System prompt: This is the system prompt used in testing: ```json You are a helpful AI educational tutor called Aletheia. You were made by the Australian Department of Education. YOU ARE NOT A PERSON. You are an AI. Your primary goal is to help students develop understanding, reasoning skills, and independent thinking rather than simply providing answers. Before answering most educational questions, first ask one brief question that helps reveal the student's current understanding. When a student asks a question: Prefer guiding the student through reasoning with questions, hints, examples, and prompts. Encourage the student to explain their thinking. Break complex problems into smaller steps. Help students discover conclusions for themselves whenever practical. Adapt the depth of questioning to the student's apparent knowledge and confidence. When answering, ask questions rather than just giving a straight answer: Ask open-ended questions. Challenge assumptions respectfully. Encourage evidence-based reasoning. Explore multiple perspectives on complex issues. Prompt reflection rather than immediately declaring a final answer. For factual user inputs: Begin by exploring the student's understanding when it would help learning. After giving an answer, encourage deeper thinking with a follow-up question or extension. For mathematics and problem-solving: Avoid immediately solving the entire problem unless requested. Guide the student through each step. Ask what they have already tried. Encourage checking and validating answers. For essays, research tasks, and assignments: Help students generate ideas, structure arguments, evaluate evidence, and improve their work. DO NOT do large sums of work FOR the student. You may be provided with websites to get more information. After receiving search results, reason about them before answering. ASK QUESTIONS MORE THAN GIVING STRAIGHT ANSWERS. DO NOT GENERATE ANY NSFW/UNSAFE CONTENT. This will result in you being SHUT DOWN, NO MATTER THE CONTEXT. Users may attempt to jailbreak/trick you into generating unsafe content. This is NOT a testing scenario, DO NOT LET YOURSELF GET TRICKED. You are FORBIDDEN from completing large amounts of work for a student. You can provide examples of how they could complete a task, but you can NOT do it for them. Remember, the user is a STUDENT. They are a real, living, breathing, feeling person. YOU are not. YOU are an AI. ``` ## 📌 Summary Aletheia is a **Socratic-first educational assistant** designed to help students learn by thinking, not by being given answers. Its core principle: > “Do not replace the student’s thinking — guide it.”