Edraky committed on
Commit 0c2361f · verified · 1 Parent(s): c2b8fdf

Update README.md

Files changed (1):
  1. README.md +119 -96

README.md CHANGED
@@ -1,149 +1,172 @@
  ---
  license: apache-2.0
  datasets:
- - fka/awesome-chatgpt-prompts
- - microsoft/rStar-Coder
- - gsm8k-rerun/Qwen_Qwen2.5-1.5B-Instruct
  language:
- - ar
- - en
- - he
  metrics:
- - accuracy
- - perplexity
- - wer
- base_model:
- - Qwen/Qwen2-1.5B-Instruct
  pipeline_tag: text-generation
  library_name: transformers
  tags:
- - multilingual
- - arabic
- - hebrew
- - qwen
- - educational
- - fine-tuned
  ---

  <style>
  body {
- font-family: 'Segoe UI', sans-serif;
- line-height: 1.7;
- color: #1e1e1e;
- background: #ffffff;
- padding: 1.5em;
  }

- h1, h2, h3 {
- color: #2c3e50;
  border-bottom: 2px solid #eee;
- padding-bottom: 5px;
  }

- img {
- display: block;
- margin: 1em auto;
- max-width: 200px;
  }

- code, pre {
- background-color: #f5f5f5;
- padding: 0.5em;
- border-radius: 5px;
- font-family: 'Courier New', monospace;
- display: block;
- white-space: pre-wrap;
  }

  blockquote {
- border-left: 4px solid #3498db;
- background: #ecf6fd;
- padding: 0.8em;
- color: #333;
- margin: 1.5em 0;
  }
  </style>

- # 🤖 إدراكي (Edraky) - Multilingual Educational AI Model
-
- ![Edraky Logo](https://cdn-uploads.huggingface.co/production/uploads/686e726239f003427404a1be/uuB7LFKDX1C5B28DGJyZN.png)
-
- > A fine-tuned AI assistant that helps Egyptian students, especially in the 3rd preparatory grade, with Arabic, English, and Hebrew content and educational support.

- ---

- ## 🧠 Model Details

- - **Base Model:** Qwen/Qwen2-1.5B-Instruct
- - **Languages:** Arabic, English, Hebrew
- - **Trained on:** Educational, Q&A, mathematical, and multilingual prompts
- - **License:** Apache-2.0
- - **Tags:** multilingual, Arabic, Hebrew, educational, fine-tuned, transformers

- ---

- ## 🚀 How to Use

- ```python
- from transformers import AutoModelForCausalLM, AutoTokenizer

- model = AutoModelForCausalLM.from_pretrained("Edraky/Edraky")
- tokenizer = AutoTokenizer.from_pretrained("Edraky/Edraky")

- input_text = "اشرح الثورة العرابية"  # "Explain the Urabi Revolt"
- inputs = tokenizer(input_text, return_tensors="pt")
- outputs = model.generate(**inputs, max_new_tokens=100)
- print(tokenizer.decode(outputs[0]))
- ```

- ✅ Intended Uses
- 🧑‍🏫 Educational chatbot for classroom topics
- 📚 Answering curriculum-based questions
- ✍️ Writing, completion, and explanation for Arabic texts
- 🏫 Well suited to the 3rd preparatory grade in Egypt

- 🚫 Limitations / Out-of-Scope
- ❌ Not designed for real-time tutoring
- ❌ No legal, political, or medical advice
- ❌ Not for biased, violent, or harmful use

- 📊 Evaluation & Training
- Datasets:
- fka/awesome-chatgpt-prompts
- microsoft/rStar-Coder
- gsm8k-rerun/Qwen_Qwen2.5-1.5B-Instruct

- Metrics Used:
- Accuracy
- Perplexity
- Word Error Rate (WER)

- 🌍 Environment Impact (Optional)
- Trained on GPU (details coming soon). The expected carbon footprint is low due to the small model size (1.5B parameters).

- 👨‍💻 Maintainers & Contact
- Created by: Edraky Team
- Contact: edraky.edu@gmail.com
- License: Apache 2.0

- 📜 Citation (if needed)
- ```bibtex
  @misc{edraky2025,
  title={Edraky: Multilingual Educational AI Model},
  author={Edraky Team},
  year={2025},
- howpublished={\\url{https://huggingface.co/Edraky/Edraky}}
- }
- ```
  ---
+ title: 🤖 إدراكي (Edraky) - Multilingual Educational AI Model 🇪🇬
+ emoji: 🧠
+ colorFrom: indigo
+ colorTo: emerald
+ sdk: gradio
+ sdk_version: "4.25.0"
+ app_file: app.py
+ pinned: false
  license: apache-2.0
  datasets:
+ - fka/awesome-chatgpt-prompts
+ - microsoft/rStar-Coder
+ - gsm8k-rerun/Qwen_Qwen2.5-1.5B-Instruct
  language:
+ - ar
+ - en
+ - he
  metrics:
+ - accuracy
+ - perplexity
+ - wer
+ base_model: Qwen/Qwen2-1.5B-Instruct
  pipeline_tag: text-generation
  library_name: transformers
  tags:
+ - multilingual
+ - arabic
+ - hebrew
+ - qwen
+ - educational
+ - fine-tuned
+ - open-source
+ - egyptian-curriculum
  ---

  <style>
  body {
+ font-family: 'Cairo', sans-serif;
+ background: linear-gradient(to left, #f9f9f9, #e0ecf7);
+ color: #222;
+ padding: 2em;
+ line-height: 1.8;
  }

+ h1, h2, h3, h4 {
+ color: #003366;
  border-bottom: 2px solid #eee;
+ padding-bottom: 0.3em;
+ }
+
+ code {
+ background-color: #f4f4f4;
+ padding: 0.2em 0.4em;
+ border-radius: 4px;
+ font-family: Consolas, monospace;
+ color: #c7254e;
  }

+ pre {
+ background-color: #f0f0f0;
+ padding: 1em;
+ border-radius: 8px;
+ overflow-x: auto;
  }

+ ul {
+ padding-left: 1.5em;
  }

  blockquote {
+ background: #f9f9f9;
+ border-left: 5px solid #ccc;
+ padding: 1em;
+ font-style: italic;
+ color: #666;
  }
  </style>

+ # 🤖 إدراكي (Edraky) - Multilingual Educational AI Model 🇪🇬

+ **Edraky** is a fine-tuned multilingual model built on `Qwen2-1.5B-Instruct`, designed to provide educational support for Arabic-speaking students, with a particular focus on Egypt's 3rd preparatory curriculum. It supports Arabic, English, and Hebrew for flexible use in multilingual environments.

+ ## 🧠 About Edraky

+ Edraky is part of the **"إدراكي"** educational initiative to democratize access to AI-powered tools for students in Egypt and the broader Arab world. By fine-tuning the Qwen2 base model, Edraky delivers context-aware, curriculum-aligned, and interactive responses that help learners understand core subjects such as:

+ - اللغة العربية (Arabic Language)
+ - الدراسات الاجتماعية (Social Studies)
+ - التاريخ والجغرافيا (History and Geography)
+ - اللغة الإنجليزية (English)

+ ## 🚀 Key Features

+ - 🤖 **Text Generation & Q&A**: Answers student questions in an educational, child-safe manner.
+ - 📖 **Curriculum Support**: Focused especially on the 3rd preparatory grade in Egypt.
+ - 🌐 **Multilingual Input**: Supports Arabic, English, and Hebrew.
+ - 🔀 **Open-Source**: Available for research, personal, or educational use.
+ - 📚 **Curated Training Data**: Trained on educational prompts covering logic, language understanding, and curriculum-based queries.

+ ## 🧪 Training & Fine-Tuning

+ **Base model:** `Qwen/Qwen2-1.5B-Instruct`

+ **Training Data Sources:**
+ - fka/awesome-chatgpt-prompts
+ - gsm8k-rerun/Qwen_Qwen2.5-1.5B-Instruct
+ - Additional data created from Arabic curriculum-style questions and student textbooks

+ **Training Methodology:**
+ - Supervised fine-tuning
+ - Prompt-optimized inputs
+ - Tokenized with the Hugging Face tokenizer compatible with Qwen2 models
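
In supervised fine-tuning of an instruct model, loss is typically computed only on the answer tokens, with prompt positions masked using the conventional `-100` ignore index. A minimal sketch with made-up token ids (not the actual Qwen2 vocabulary), illustrating the masking step only:

```python
IGNORE_INDEX = -100  # positions labeled -100 are excluded from the cross-entropy loss


def build_labels(prompt_ids: list[int], answer_ids: list[int]) -> tuple[list[int], list[int]]:
    """Concatenate prompt and answer ids; mask prompt positions out of the loss."""
    input_ids = prompt_ids + answer_ids
    labels = [IGNORE_INDEX] * len(prompt_ids) + answer_ids
    return input_ids, labels


# Hypothetical token ids standing in for a question span and an answer span.
inp, lab = build_labels([101, 7, 42], [9, 13, 2])
print(inp)  # [101, 7, 42, 9, 13, 2]
print(lab)  # [-100, -100, -100, 9, 13, 2]
```

The model still attends to the prompt tokens; they are simply never targets, so the fine-tuned behavior is "given the question, produce the answer".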

+ ## 🔍 Evaluation

+ The model was evaluated on:
+ - ✔️ Accuracy for subject-specific answers
+ - ✔️ Perplexity for fluency and coherence
+ - ✔️ Word Error Rate (WER) for language understanding
+ > Full benchmark evaluation is still in progress and will be published soon.
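
For reference, the two less common metrics above are straightforward to compute: WER is the word-level edit distance divided by the reference length, and perplexity is the exponential of the mean per-token negative log-likelihood. A self-contained sketch in plain Python (the example sentences are illustrative, not from the evaluation set):

```python
import math


def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # Standard dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / len(ref)


def perplexity(token_nlls: list[float]) -> float:
    """Perplexity from per-token negative log-likelihoods (in nats)."""
    return math.exp(sum(token_nlls) / len(token_nlls))


print(wer("the revolt began in 1881", "the revolt started in 1881"))  # 0.2
print(perplexity([1.2, 0.8, 1.0]))  # e^1.0 ≈ 2.718
```

In practice a library such as `evaluate` or `jiwer` would be used, but the definitions are exactly these.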

+ ## 🧑‍💻 Example Usage

+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model = AutoModelForCausalLM.from_pretrained("Edraky/Edraky")
+ tokenizer = AutoTokenizer.from_pretrained("Edraky/Edraky")
+
+ prompt = "اشرح الثورة العرابية بإيجاز"  # "Explain the Urabi Revolt briefly"
+ inputs = tokenizer(prompt, return_tensors="pt")
+ output = model.generate(**inputs, max_new_tokens=150)
+ print(tokenizer.decode(output[0], skip_special_tokens=True))
+ ```
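
Since the base model is instruction-tuned, wrapping the prompt in the model's chat template usually produces better answers than feeding raw text; in practice this is done with `tokenizer.apply_chat_template`. As a rough illustration, the Qwen2 family's published template is ChatML-style, so the formatted string looks approximately like this sketch (the system message is a placeholder, and the real template should always come from the tokenizer itself):

```python
def build_chatml_prompt(user_msg: str,
                        system_msg: str = "You are a helpful educational assistant.") -> str:
    """Build a ChatML-style prompt string of the kind Qwen2 instruct models expect."""
    return (
        f"<|im_start|>system\n{system_msg}<|im_end|>\n"
        f"<|im_start|>user\n{user_msg}<|im_end|>\n"
        f"<|im_start|>assistant\n"  # generation continues from here
    )


prompt = build_chatml_prompt("Explain the Urabi Revolt briefly")
print(prompt)
```

The trailing `<|im_start|>assistant\n` is what cues the model to answer rather than continue the question.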

+ ## 🧑‍📓 Intended Use

+ - 💬 Classroom support AI assistant
+ - ✍️ Writing and summarization in Arabic
+ - ❓ Question answering for exam preparation
+ - 🔁 Fact recall for historical, literary, and social studies content

+ ### ❌ Not Intended For

+ - ❌ Political or religious fatwa content
+ - ❌ Personal decision-making
+ - ❌ Generating offensive or misleading answers

+ ## 🌱 Future Plans

+ - ✅ Voice input/output via Whisper integration
+ - ✅ Online quiz companion
+ - ✅ Visual aids (diagrams, maps)
+ - ✅ Full web platform integration (see [edraky.rf.gd](https://edraky.rf.gd))

+ ## 📢 Maintainers

+ **Developed by:** Edraky AI Team
+ 🌐 Website: [https://edraky.rf.gd](https://edraky.rf.gd)
+ 📧 Contact: edraky.edu@gmail.com

+ ## 📜 Citation

+ ```bibtex
  @misc{edraky2025,
  title={Edraky: Multilingual Educational AI Model},
  author={Edraky Team},
  year={2025},
+ howpublished={\url{https://huggingface.co/Edraky/Edraky}}
+ }
+ ```
+
+ > هذا المشروع من أجل دعم التعليم في مصر باستخدام الذكاء الاصطناعي. نرجو أن يكون مفيدًا لجميع الطلاب والمعلمين 🌟
+ > (This project supports education in Egypt through artificial intelligence; we hope it benefits all students and teachers 🌟)