Edraky committed on
Commit 0c2361f · verified · 1 Parent(s): c2b8fdf

Update README.md

Files changed (1):
  1. README.md +119 -96

README.md CHANGED
@@ -1,149 +1,172 @@
  ---
  license: apache-2.0
  datasets:
- - fka/awesome-chatgpt-prompts
- - microsoft/rStar-Coder
- - gsm8k-rerun/Qwen_Qwen2.5-1.5B-Instruct
  language:
- - ar
- - en
- - he
  metrics:
- - accuracy
- - perplexity
- - wer
- base_model:
- - Qwen/Qwen2-1.5B-Instruct
  pipeline_tag: text-generation
  library_name: transformers
  tags:
- - multilingual
- - arabic
- - hebrew
- - qwen
- - educational
- - fine-tuned
  ---

  <style>
  body {
- font-family: 'Segoe UI', sans-serif;
- line-height: 1.7;
- color: #1e1e1e;
- background: #ffffff;
- padding: 1.5em;
  }

- h1, h2, h3 {
- color: #2c3e50;
  border-bottom: 2px solid #eee;
- padding-bottom: 5px;
  }

- img {
- display: block;
- margin: 1em auto;
- max-width: 200px;
  }

- code, pre {
- background-color: #f5f5f5;
- padding: 0.5em;
- border-radius: 5px;
- font-family: 'Courier New', monospace;
- display: block;
- white-space: pre-wrap;
  }

  blockquote {
- border-left: 4px solid #3498db;
- background: #ecf6fd;
- padding: 0.8em;
- color: #333;
- margin: 1.5em 0;
  }
  </style>

- # 🤖 إدراكي (Edraky) - Multilingual Educational AI Model
-
- ![Edraky Logo](https://cdn-uploads.huggingface.co/production/uploads/686e726239f003427404a1be/uuB7LFKDX1C5B28DGJyZN.png)
-
- > A fine-tuned AI assistant that helps Egyptian students, especially in the 3rd preparatory grade, with Arabic, English, and Hebrew content and educational support.

- ---

- ## 🧠 Model Details

- - **Base Model:** Qwen/Qwen2-1.5B-Instruct
- - **Languages:** Arabic, English, Hebrew
- - **Trained on:** Educational, Q&A, mathematical, and multilingual prompts
- - **License:** Apache-2.0
- - **Tags:** multilingual, Arabic, Hebrew, educational, fine-tuned, transformers

- ---

- ## 🚀 How to Use

- ```python
- from transformers import AutoModelForCausalLM, AutoTokenizer

- model = AutoModelForCausalLM.from_pretrained("Edraky/Edraky")
- tokenizer = AutoTokenizer.from_pretrained("Edraky/Edraky")

- input_text = "اشرح الثورة العرابية"  # "Explain the Urabi Revolt"
- inputs = tokenizer(input_text, return_tensors="pt")
- outputs = model.generate(**inputs, max_new_tokens=100)
- print(tokenizer.decode(outputs[0]))
- ```

- ✅ Intended Uses
- 🧑‍🏫 Educational chatbot for classroom topics
- 📚 Answering curriculum-based questions
- ✍️ Writing, completion, and explanation for Arabic texts
- 🏫 Well suited to the 3rd preparatory grade in Egypt

- 🚫 Limitations / Out-of-Scope
- ❌ Not designed for real-time tutoring
- ❌ No legal, political, or medical advice
- ❌ Not for biased, violent, or harmful use

- 📊 Evaluation & Training
- Datasets:
- fka/awesome-chatgpt-prompts
- microsoft/rStar-Coder
- gsm8k-rerun/Qwen_Qwen2.5-1.5B-Instruct

- Metrics Used:
- Accuracy
- Perplexity
- Word Error Rate (WER)

- 🌍 Environment Impact (Optional)
- Trained on GPU (details coming soon). The expected carbon footprint is low due to the small model size (1.5B parameters).

- 👨‍💻 Maintainers & Contact
- Created by: Edraky Team
- Contact: edraky.edu@gmail.com
- License: Apache 2.0

- 📜 Citation (if needed)
- ```bibtex
  @misc{edraky2025,
  title={Edraky: Multilingual Educational AI Model},
  author={Edraky Team},
  year={2025},
- howpublished={\\url{https://huggingface.co/Edraky/Edraky}}
- }
- ```
  ---
+ title: 🤖 إدراكي (Edraky) - Multilingual Educational AI Model 🇪🇬
+ emoji: 🧠
+ colorFrom: indigo
+ colorTo: emerald
+ sdk: gradio
+ sdk_version: "4.25.0"
+ app_file: app.py
+ pinned: false
  license: apache-2.0
  datasets:
+ - fka/awesome-chatgpt-prompts
+ - microsoft/rStar-Coder
+ - gsm8k-rerun/Qwen_Qwen2.5-1.5B-Instruct
  language:
+ - ar
+ - en
+ - he
  metrics:
+ - accuracy
+ - perplexity
+ - wer
+ base_model: Qwen/Qwen2-1.5B-Instruct
  pipeline_tag: text-generation
  library_name: transformers
  tags:
+ - multilingual
+ - arabic
+ - hebrew
+ - qwen
+ - educational
+ - fine-tuned
+ - open-source
+ - egyptian-curriculum
  ---

  <style>
  body {
+ font-family: 'Cairo', sans-serif;
+ background: linear-gradient(to left, #f9f9f9, #e0ecf7);
+ color: #222;
+ padding: 2em;
+ line-height: 1.8;
  }

+ h1, h2, h3, h4 {
+ color: #003366;
  border-bottom: 2px solid #eee;
+ padding-bottom: 0.3em;
+ }
+
+ code {
+ background-color: #f4f4f4;
+ padding: 0.2em 0.4em;
+ border-radius: 4px;
+ font-family: Consolas, monospace;
+ color: #c7254e;
  }

+ pre {
+ background-color: #f0f0f0;
+ padding: 1em;
+ border-radius: 8px;
+ overflow-x: auto;
  }

+ ul {
+ padding-left: 1.5em;
  }

  blockquote {
+ background: #f9f9f9;
+ border-left: 5px solid #ccc;
+ padding: 1em;
+ font-style: italic;
+ color: #666;
  }
  </style>

+ # 🤖 إدراكي (Edraky) - Multilingual Educational AI Model 🇪🇬

+ **Edraky** is a fine-tuned multilingual model built on `Qwen2-1.5B-Instruct`, designed to provide educational support for Arabic-speaking students, with a particular focus on Egypt's 3rd preparatory curriculum. It supports Arabic, English, and Hebrew for flexible use in multilingual environments.

+ ## 🧠 About Edraky

+ Edraky is part of the **"إدراكي"** educational initiative to democratize access to AI-powered tools for students in Egypt and the broader Arab world. By fine-tuning the Qwen2 base model, Edraky delivers context-aware, curriculum-aligned, and interactive responses that help learners understand core subjects such as:

+ - اللغة العربية (Arabic Language)
+ - الدراسات الاجتماعية (Social Studies)
+ - التاريخ والجغرافيا (History and Geography)
+ - اللغة الإنجليزية (English)

+ ## 🚀 Key Features

+ - 🤖 **Text Generation & Q&A**: Answers student questions in an educational, child-safe manner.
+ - 📖 **Curriculum Support**: Focused especially on the 3rd preparatory grade in Egypt.
+ - 🌐 **Multilingual Input**: Supports Arabic, English, and Hebrew.
+ - 🔀 **Open-Source**: Available for research, personal, or educational use.
+ - 📚 **Curated Training Data**: Trained on educational prompts covering logic, language understanding, and curriculum-based queries.

+ ## 🧪 Training & Fine-Tuning

+ **Base model:** `Qwen/Qwen2-1.5B-Instruct`

+ **Training Data Sources:**
+ - fka/awesome-chatgpt-prompts
+ - gsm8k-rerun/Qwen_Qwen2.5-1.5B-Instruct
+ - Additional data created from Arabic curriculum-style questions and student textbooks

+ **Training Methodology:**
+ - Supervised fine-tuning
+ - Prompt-optimized inputs
+ - Tokenized with the Hugging Face tokenizer compatible with Qwen2 models
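
In supervised fine-tuning of an instruct model, loss is typically computed only on the answer tokens, with prompt positions masked using the conventional `-100` ignore index. A minimal sketch with made-up token ids (not the actual Qwen2 vocabulary), illustrating the masking step only:

```python
IGNORE_INDEX = -100  # positions labeled -100 are excluded from the cross-entropy loss


def build_labels(prompt_ids: list[int], answer_ids: list[int]) -> tuple[list[int], list[int]]:
    """Concatenate prompt and answer ids; mask prompt positions out of the loss."""
    input_ids = prompt_ids + answer_ids
    labels = [IGNORE_INDEX] * len(prompt_ids) + answer_ids
    return input_ids, labels


# Hypothetical token ids standing in for a question span and an answer span.
inp, lab = build_labels([101, 7, 42], [9, 13, 2])
print(inp)  # [101, 7, 42, 9, 13, 2]
print(lab)  # [-100, -100, -100, 9, 13, 2]
```

The model still attends to the prompt tokens; they are simply never targets, so the fine-tuned behavior is "given the question, produce the answer".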

+ ## 🔍 Evaluation

+ The model was evaluated on:
+ - ✔️ Accuracy for subject-specific answers
+ - ✔️ Perplexity for fluency and coherence
+ - ✔️ Word Error Rate (WER) for language understanding
+ > Full benchmark evaluation is still in progress and will be published soon.
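
For reference, the two less common metrics above are straightforward to compute: WER is the word-level edit distance divided by the reference length, and perplexity is the exponential of the mean per-token negative log-likelihood. A self-contained sketch in plain Python (the example sentences are illustrative, not from the evaluation set):

```python
import math


def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # Standard dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / len(ref)


def perplexity(token_nlls: list[float]) -> float:
    """Perplexity from per-token negative log-likelihoods (in nats)."""
    return math.exp(sum(token_nlls) / len(token_nlls))


print(wer("the revolt began in 1881", "the revolt started in 1881"))  # 0.2
print(perplexity([1.2, 0.8, 1.0]))  # e^1.0 ≈ 2.718
```

In practice a library such as `evaluate` or `jiwer` would be used, but the definitions are exactly these.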

+ ## 🧑‍💻 Example Usage

+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model = AutoModelForCausalLM.from_pretrained("Edraky/Edraky")
+ tokenizer = AutoTokenizer.from_pretrained("Edraky/Edraky")
+
+ prompt = "اشرح الثورة العرابية بإيجاز"  # "Explain the Urabi Revolt briefly"
+ inputs = tokenizer(prompt, return_tensors="pt")
+ output = model.generate(**inputs, max_new_tokens=150)
+ print(tokenizer.decode(output[0], skip_special_tokens=True))
+ ```
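
Since the base model is instruction-tuned, wrapping the prompt in the model's chat template usually produces better answers than feeding raw text; in practice this is done with `tokenizer.apply_chat_template`. As a rough illustration, the Qwen2 family's published template is ChatML-style, so the formatted string looks approximately like this sketch (the system message is a placeholder, and the real template should always come from the tokenizer itself):

```python
def build_chatml_prompt(user_msg: str,
                        system_msg: str = "You are a helpful educational assistant.") -> str:
    """Build a ChatML-style prompt string of the kind Qwen2 instruct models expect."""
    return (
        f"<|im_start|>system\n{system_msg}<|im_end|>\n"
        f"<|im_start|>user\n{user_msg}<|im_end|>\n"
        f"<|im_start|>assistant\n"  # generation continues from here
    )


prompt = build_chatml_prompt("Explain the Urabi Revolt briefly")
print(prompt)
```

The trailing `<|im_start|>assistant\n` is what cues the model to answer rather than continue the question.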

+ ## 🧑‍📓 Intended Use

+ - 💬 Classroom support AI assistant
+ - ✍️ Writing and summarization in Arabic
+ - ❓ Question answering for exam preparation
+ - 🔁 Fact recall for historical, literary, and social studies content

+ ### ❌ Not Intended For

+ - ❌ Political or religious fatwa content
+ - ❌ Personal decision-making
+ - ❌ Generating offensive or misleading answers

+ ## 🌱 Future Plans

+ - ✅ Voice input/output via Whisper integration
+ - ✅ Online quiz companion
+ - ✅ Visual aids (diagrams, maps)
+ - ✅ Full web platform integration (see [edraky.rf.gd](https://edraky.rf.gd))

+ ## 📢 Maintainers

+ **Developed by:** Edraky AI Team
+ 🌐 Website: [https://edraky.rf.gd](https://edraky.rf.gd)
+ 📧 Contact: edraky.edu@gmail.com

+ ## 📜 Citation

+ ```bibtex
  @misc{edraky2025,
  title={Edraky: Multilingual Educational AI Model},
  author={Edraky Team},
  year={2025},
+ howpublished={\url{https://huggingface.co/Edraky/Edraky}}
+ }
+ ```
+
+ > هذا المشروع من أجل دعم التعليم في مصر باستخدام الذكاء الاصطناعي. نرجو أن يكون مفيدًا لجميع الطلاب والمعلمين 🌟
+ > (This project supports education in Egypt through artificial intelligence; we hope it benefits all students and teachers 🌟)