Question Answering · Transformers · Safetensors · Arabic

MohammedNasser committed (verified) in commit 346528f, parent: 47d5fe7

Update README.md

Files changed (1): README.md (+157 -186)

README.md CHANGED
---
library_name: transformers
license: apache-2.0
datasets:
- MohammedNasser/ARabic_Reasoning_QA
language:
- ar
metrics:
- accuracy
base_model: silma-ai/SILMA-9B-Instruct-v1.0
pipeline_tag: question-answering
---
# SILMA-9B-Instruct Fine-Tuned for Arabic QA

![Model Banner](https://your-banner-image-url.jpg)

[![Generic badge](https://img.shields.io/badge/🤗-Hugging%20Face-blue.svg)](https://huggingface.co/MohammedNasser/silma_9b_instruct_ft)
[![License: Apache 2.0](https://img.shields.io/badge/License-Apache%202.0-yellow.svg)](https://www.apache.org/licenses/LICENSE-2.0)
[![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/release/python-390/)

This model is a fine-tuned version of [silma-ai/SILMA-9B-Instruct-v1.0](https://huggingface.co/silma-ai/SILMA-9B-Instruct-v1.0), optimized for Arabic question-answering tasks. It specializes in providing numerical answers to a wide range of questions in Arabic.
24
  ## Model Details
25
 
26
+ - **Task**: Arabic Question Answering (Numerical Responses)
27
+ - **Fine-tuning Technique**: LoRA (Low-Rank Adaptation)
28
+ - **Training Data**: [Custom Arabic Reasoning QA dataset](https://huggingface.co/MohammedNasser/ARabic_Reasoning_QA)
29
+ - **Quantization**: 4-bit quantization using bitsandbytes
30
+
31
+ ## Features
32
+
33
+ - Optimized for Arabic language understanding and generation
34
+ - Specialized in providing numerical answers to questions
35
+ - Efficient inference with 4-bit quantization
36
+ - Fine-tuned using LoRA for parameter-efficient training
37
+
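The 4-bit quantization noted above is what makes a 9B-parameter model practical on modest hardware: weights stored at 4 bits need a quarter of the memory of fp16. A back-of-envelope sketch (weights only; activations, the KV cache, and quantization metadata add further overhead):

```python
def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate weight-only memory footprint in gigabytes (10^9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9

n_params = 9e9  # ~9B parameters (approximation)

fp16_gb = weight_memory_gb(n_params, 16)  # 18.0 GB
int4_gb = weight_memory_gb(n_params, 4)   # 4.5 GB

print(f"fp16 weights: {fp16_gb:.1f} GB, 4-bit weights: {int4_gb:.1f} GB")
```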
## Training Progress

| Step | Training Loss | Validation Loss |
|------|---------------|-----------------|
| 10   | 2.207200      | 1.487218        |
| 20   | 1.021200      | 0.331835        |
| 30   | 0.205000      | 0.138645        |
| 40   | 0.112900      | 0.111273        |
| 50   | 0.113500      | 0.105615        |
| 60   | 0.100400      | 0.104426        |
| 70   | 0.104500      | 0.101215        |
| 80   | 0.099500      | 0.098990        |
| 90   | 0.098300      | 0.098076        |
| 100  | 0.093900      | 0.099338        |
| 110  | 0.100200      | 0.095021        |
| 120  | 0.094700      | 0.093543        |
| 130  | 0.089800      | 0.090921        |
| 140  | 0.087000      | 0.088527        |
| 150  | 0.085700      | 0.085943        |
| 160  | 0.095600      | 0.082870        |
| 170  | 0.079100      | 0.080233        |
| 180  | 0.079200      | 0.077066        |
| 190  | 0.083400      | 0.074134        |
| 200  | 0.087500      | 0.074748        |
| 210  | 0.076200      | 0.069750        |
| 220  | 0.069400      | 0.067776        |
| 230  | 0.075400      | 0.065210        |
| 240  | 0.069600      | 0.063573        |
| 250  | 0.068100      | 0.061195        |
| 260  | 0.067900      | 0.059264        |
| 270  | 0.061500      | 0.057951        |
| 280  | 0.059900      | 0.056350        |
| 290  | 0.056100      | 0.054536        |
| 300  | 0.052200      | 0.052772        |
| 310  | 0.053400      | 0.052114        |
| 320  | 0.049700      | 0.050167        |
| 330  | 0.045900      | 0.048726        |
| 340  | 0.061800      | 0.047798        |
| 350  | 0.047500      | 0.045877        |

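Reading the table end to end, validation loss drops from 1.487218 at step 10 to 0.045877 at step 350. A quick check of the overall reduction:

```python
# First and last logged validation losses from the table above
first_val, last_val = 1.487218, 0.045877

reduction = 1 - last_val / first_val
print(f"Validation loss reduced by {reduction:.1%}")  # ~96.9%
```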
## Usage

Here's a quick example of how to use the model:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
import torch

model_name = "MohammedNasser/silma_9b_instruct_ft"

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Create a text-generation pipeline for QA
qa_pipeline = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=50,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
    return_full_text=False,
)

# Example usage
question = "كم عدد الكواكب في المجموعة الشمسية؟"  # "How many planets are in the solar system?"
prompt = f"Question: {question}\nAnswer:"
response = qa_pipeline(prompt)[0]["generated_text"]

print(f"Question: {question}")
print(f"Answer: {response}")
```

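Since the model is tuned toward numerical answers, downstream code often needs the number itself rather than the full generated sentence. A small post-processing helper can pull it out; this is a sketch, not part of the model or the transformers API, and it handles both ASCII and Arabic-Indic digits:

```python
import re
from typing import Optional

# Translation table mapping Arabic-Indic digits (٠-٩) to ASCII digits,
# so extracted numbers print consistently as 0-9.
_ARABIC_DIGITS = str.maketrans("٠١٢٣٤٥٦٧٨٩", "0123456789")

def extract_number(text: str) -> Optional[int]:
    """Return the first integer found in a generated answer, or None."""
    normalized = text.translate(_ARABIC_DIGITS)
    match = re.search(r"\d+", normalized)
    return int(match.group()) if match else None

print(extract_number("هناك 8 كواكب في المجموعة الشمسية."))  # 8
print(extract_number("الإجابة هي ٨"))  # 8
```

Applied to `response` from the pipeline above, this returns the numeral whenever the model answers in the expected form.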

## Performance

Our model demonstrates strong performance on Arabic QA tasks, particularly for questions requiring numerical answers. Here are some key metrics:

- **Eval Loss**: 0.045
- **Accuracy**: 92% on our test set
- **F1 Score**: 0.89
- **Average Response Time**: 150 ms

(Note: these are placeholder metrics; replace them with actual measured results.)

## Limitations

- The model is optimized for numerical answers and may not perform as well on open-ended questions.
- Performance may vary for dialects or regional variations of Arabic that are not well represented in the training data.
- The model may occasionally generate incorrect numerical answers for very complex or ambiguous questions.

## Fine-tuning Details

The model was fine-tuned using the following configuration:

- **LoRA config**:
  - Alpha: 16
  - Dropout: 0.1
  - R: 4
- **Training hyperparameters**:
  - Batch size: 4
  - Learning rate: 2e-4
  - Epochs: 3
- **Hardware**: 4 x NVIDIA A100 GPUs

+
145
+ ## Ethical Considerations
146
+
147
+ While this model has been trained on a diverse dataset, it may still exhibit biases present in the training data. Users should be aware of potential biases and use the model responsibly.
148
+
149
+ ## Citation
150
+
151
+ If you use this model in your research, please cite:
152
+
153
+ ```bibtex
154
+ @misc{silma9b_arabic_qa,
155
+ author = {Mohammed Nasser},
156
+ title = {SILMA-9B-Instruct Fine-Tuned for Arabic QA},
157
+ year = {2024},
158
+ publisher = {GitHub},
159
+ journal = {GitHub repository},
160
+ howpublished = {\url{https://huggingface.co/MohammedNasser/silma_9b_instruct_ft}}
161
+ }
162
+ ```

## Contact

For any questions or feedback, please open an issue on the [GitHub repository](https://github.com/your-github-username/silma-9b-arabic-qa) or contact us at your.email@example.com.

---
Made with ❤️ by [Your Name/Organization]