Improve language tag

#2
by lbourdois - opened
Files changed (1)
  1. README.md +326 -314
README.md CHANGED
@@ -1,314 +1,326 @@
- ---
- license: mit
- language:
- - ar
- base_model:
- - Qwen/Qwen2.5-1.5B-Instruct
- pipeline_tag: text2text-generation
- library_name: transformers
- tags:
- - Text-To-SQL
- - Arabic
- - Spider
- - SQL
- ---
+ ---
+ license: mit
+ language:
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
+ base_model:
+ - Qwen/Qwen2.5-1.5B-Instruct
+ pipeline_tag: text2text-generation
+ library_name: transformers
+ tags:
+ - Text-To-SQL
+ - Arabic
+ - Spider
+ - SQL
+ ---

# Model Card for Arabic Text-To-SQL (OsamaMo)

## Model Details

### Model Description

This model is fine-tuned for the Text-To-SQL task on the Spider dataset with questions translated into Arabic. It is based on **Qwen/Qwen2.5-1.5B-Instruct** and was trained with LoRA on Kaggle for 15 hours on a **Tesla P100 16GB GPU**.

- **Developed by:** Osama Mohamed ([OsamaMo](https://huggingface.co/OsamaMo))
- **Funded by:** Self-funded
- **Shared by:** Osama Mohamed
- **Model type:** Text-to-SQL fine-tuned model
- **Language(s):** Arabic (ar)
- **License:** MIT
- **Finetuned from:** Qwen/Qwen2.5-1.5B-Instruct

### Model Sources

- **Repository:** [Hugging Face Model Hub](https://huggingface.co/OsamaMo/Arabic_Text-To-SQL)
- **Dataset:** Spider (translated to Arabic)
- **Training Script:** [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory)

## Uses

### Direct Use

This model is intended for converting **Arabic natural language questions** into SQL queries. It can be used for database querying in Arabic-speaking applications.

### Downstream Use

The model can be fine-tuned further for specific databases or for Arabic dialect adaptations.

### Out-of-Scope Use

- The model is **not** intended for direct execution of SQL queries.
- Not recommended for non-database-related NLP tasks.

## Bias, Risks, and Limitations

- The model may generate incorrect or non-optimized SQL queries.
- Bias may exist due to the dataset translation and the model's pretraining data.

### Recommendations

- Validate generated SQL queries before execution.
- Ensure compatibility with specific database schemas.

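One low-cost way to follow the first recommendation is to dry-run a generated query against an empty copy of the schema before it ever touches real data. A minimal sketch using Python's built-in `sqlite3` (the helper name and the tiny schema below are illustrative, not part of this model or the Spider set):

```python
import sqlite3

def is_valid_sql(query: str, schema_ddl: str) -> bool:
    """Check that a generated query parses against the schema
    without executing it on real data."""
    conn = sqlite3.connect(":memory:")  # empty throwaway database
    try:
        conn.executescript(schema_ddl)  # recreate the schema only, no rows
        # EXPLAIN forces full parsing and planning but does not run the query
        conn.execute("EXPLAIN " + query)
        return True
    except sqlite3.Error:
        return False
    finally:
        conn.close()

schema = "CREATE TABLE drugs (barcode TEXT PRIMARY KEY, name TEXT);"
print(is_valid_sql("SELECT barcode FROM drugs WHERE name LIKE 's%'", schema))  # True
print(is_valid_sql("SELECT barcode FROM medicines", schema))  # False: unknown table
```

Queries referencing misspelled tables or columns fail at the `EXPLAIN` step, so malformed model output is caught before execution.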
## How to Get Started with the Model

### Load Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
import re

device = "cuda" if torch.cuda.is_available() else "cpu"
base_model_id = "Qwen/Qwen2.5-1.5B-Instruct"
finetuned_model_id = "OsamaMo/Arabic_Text-To-SQL_using_Qwen2.5-1.5B"

# Load the base model and attach the fine-tuned LoRA adapter
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    device_map="auto",
    torch_dtype=torch.bfloat16
)
model.load_adapter(finetuned_model_id)

tokenizer = AutoTokenizer.from_pretrained(base_model_id)

def generate_resp(messages):
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True
    )
    model_inputs = tokenizer([text], return_tensors="pt").to(device)
    # Greedy decoding; temperature is not used when do_sample=False
    generated_ids = model.generate(
        **model_inputs,
        max_new_tokens=1024,
        do_sample=False,
    )
    # Keep only the newly generated tokens, dropping the prompt
    generated_ids = [
        output_ids[len(input_ids):]
        for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
    ]
    response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
    return response
```

### Example Usage

```python
# System message for SQL generation (left verbatim, typos included,
# to match the prompt format used at fine-tuning time)
system_message = (
    "You are a highly advanced Arabic text-to-SQL converter. Your mission is to Understand first the db schema and reltions between it and then accurately transform Arabic "
    "natural language queries into SQL queries with precision and clarity.\n"
)

def get_sql_query(db_schema, arabic_query):
    # Construct the instruction message including the DB schema and the Arabic query
    instruction_message = "\n".join([
        "## DB-Schema:",
        db_schema,
        "",
        "## User-Prompt:",
        arabic_query,
        "# Output SQL:",
        "```SQL"
    ])

    messages = [
        {"role": "system", "content": system_message},
        {"role": "user", "content": instruction_message}
    ]

    response = generate_resp(messages)

    # Extract the SQL query from the ```sql markdown block in the response
    match = re.search(r"```sql\s*(.*?)\s*```", response, re.DOTALL | re.IGNORECASE)
    if match:
        return match.group(1).strip()
    return response.strip()

# Example usage:
example_db_schema = r'''{
'Pharmcy':
CREATE TABLE `purchase` (
  `BARCODE` varchar(20) NOT NULL,
  `NAME` varchar(50) NOT NULL,
  `TYPE` varchar(20) NOT NULL,
  `COMPANY_NAME` varchar(20) NOT NULL,
  `QUANTITY` int NOT NULL,
  `PRICE` double NOT NULL,
  `AMOUNT` double NOT NULL,
  PRIMARY KEY (`BARCODE`),
  KEY `fkr3` (`COMPANY_NAME`),
  CONSTRAINT `fkr3` FOREIGN KEY (`COMPANY_NAME`) REFERENCES `company` (`NAME`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=latin1

CREATE TABLE `sales` (
  `BARCODE` varchar(20) NOT NULL,
  `NAME` varchar(50) NOT NULL,
  `TYPE` varchar(10) NOT NULL,
  `DOSE` varchar(10) NOT NULL,
  `QUANTITY` int NOT NULL,
  `PRICE` double NOT NULL,
  `AMOUNT` double NOT NULL,
  `DATE` varchar(15) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1

CREATE TABLE `users` (
  `ID` int NOT NULL,
  `NAME` varchar(50) NOT NULL,
  `DOB` varchar(20) NOT NULL,
  `ADDRESS` varchar(100) NOT NULL,
  `PHONE` varchar(20) NOT NULL,
  `SALARY` double NOT NULL,
  `PASSWORD` varchar(20) NOT NULL,
  PRIMARY KEY (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

CREATE TABLE `history_sales` (
  `USER_NAME` varchar(20) NOT NULL,
  `BARCODE` varchar(20) NOT NULL,
  `NAME` varchar(50) NOT NULL,
  `TYPE` varchar(10) NOT NULL,
  `DOSE` varchar(10) NOT NULL,
  `QUANTITY` int NOT NULL,
  `PRICE` double NOT NULL,
  `AMOUNT` double NOT NULL,
  `DATE` varchar(15) NOT NULL,
  `TIME` varchar(20) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1

CREATE TABLE `expiry` (
  `PRODUCT_NAME` varchar(50) NOT NULL,
  `PRODUCT_CODE` varchar(20) NOT NULL,
  `DATE_OF_EXPIRY` varchar(10) NOT NULL,
  `QUANTITY_REMAIN` int NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1

CREATE TABLE `drugs` (
  `NAME` varchar(50) NOT NULL,
  `TYPE` varchar(20) NOT NULL,
  `BARCODE` varchar(20) NOT NULL,
  `DOSE` varchar(10) NOT NULL,
  `CODE` varchar(10) NOT NULL,
  `COST_PRICE` double NOT NULL,
  `SELLING_PRICE` double NOT NULL,
  `EXPIRY` varchar(20) NOT NULL,
  `COMPANY_NAME` varchar(50) NOT NULL,
  `PRODUCTION_DATE` date NOT NULL,
  `EXPIRATION_DATE` date NOT NULL,
  `PLACE` varchar(20) NOT NULL,
  `QUANTITY` int NOT NULL,
  PRIMARY KEY (`BARCODE`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

CREATE TABLE `company` (
  `NAME` varchar(50) NOT NULL,
  `ADDRESS` varchar(50) NOT NULL,
  `PHONE` varchar(20) NOT NULL,
  PRIMARY KEY (`NAME`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

Answer the following questions about this schema:
}'''

# "I want the barcode of the drug whose name starts with the letter 's'"
example_arabic_query = "اريد الباركود الخاص بدواء يبداء اسمه بحرف 's'"

sql_result = get_sql_query(example_db_schema, example_arabic_query)
print("استعلام SQL الناتج:")  # "Resulting SQL query:"
print(sql_result)
```

## Training Details

### Training Data

- Dataset: **Spider (translated into Arabic)**
- Preprocessing: questions translated into Arabic while keeping the SQL queries unchanged.
- Training format:
  - System instruction guiding Arabic-to-SQL conversion.
  - Database schema provided for context.
  - Arabic user queries mapped to the correct SQL output.
  - Output is strictly formatted SQL enclosed in markdown code blocks.
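Concretely, each supervised example can be thought of as a chat triple assembled from these pieces. The sketch below is a schematic reconstruction following the prompt layout shown in the usage example; the placeholder system text and field names are illustrative, not the dataset's actual keys:

```python
def build_training_example(schema: str, arabic_question: str, gold_sql: str) -> dict:
    # One supervised chat example in the format described above
    return {
        "messages": [
            # Placeholder system instruction (the real one guides Arabic-to-SQL conversion)
            {"role": "system", "content": "Convert Arabic questions into SQL."},
            {"role": "user", "content": (
                f"## DB-Schema:\n{schema}\n\n"
                f"## User-Prompt:\n{arabic_question}\n"
                "# Output SQL:\n```SQL"
            )},
            # Target: the unchanged English SQL, closing the markdown code block
            {"role": "assistant", "content": f"{gold_sql}\n```"},
        ]
    }

# "How many rows?" in Arabic
ex = build_training_example("CREATE TABLE t (id INT);", "كم عدد الصفوف؟", "SELECT COUNT(*) FROM t")
print(ex["messages"][2]["content"])
```

Because the user turn already opens the ` ```SQL ` block, the model learns to emit only the query plus the closing fence, which is what the regex in the usage example extracts.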

### Training Procedure

#### Training Hyperparameters

- **Batch size:** 1 (per device)
- **Gradient accumulation:** 4 steps
- **Learning rate:** 1.0e-4
- **Epochs:** 3
- **Scheduler:** Cosine
- **Warmup ratio:** 0.1
- **Precision:** bf16

#### Speeds, Sizes, Times

- **Training time:** 15 hours on an **NVIDIA Tesla P100 16GB**
- **Checkpointing:** every 500 steps
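For reference, the hyperparameters above map onto HF-style training arguments roughly as follows. This is a hedged sketch of the configuration, not the author's actual LLaMA-Factory file; the key names follow the Hugging Face `TrainingArguments` convention that LLaMA-Factory reuses:

```python
# Hypothetical config mirroring the listed hyperparameters
training_config = {
    "model_name_or_path": "Qwen/Qwen2.5-1.5B-Instruct",
    "finetuning_type": "lora",
    "per_device_train_batch_size": 1,
    "gradient_accumulation_steps": 4,
    "learning_rate": 1.0e-4,
    "num_train_epochs": 3,
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.1,
    "bf16": True,
    "save_steps": 500,  # checkpoint every 500 steps
}

# Effective batch size seen by the optimizer:
effective_batch = (training_config["per_device_train_batch_size"]
                   * training_config["gradient_accumulation_steps"])
print(effective_batch)  # 4
```

With per-device batch size 1 and 4 accumulation steps, each optimizer update sees an effective batch of 4 examples, which is what makes training feasible on a single P100.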

## Evaluation

### Testing Data

- Validation dataset: Spider validation set (translated into Arabic)

### Metrics

- Exact Match (EM) for SQL correctness
- Execution Accuracy (EX) on databases
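A crude string-level approximation of the Exact Match metric compares whitespace- and case-normalized queries; note the official Spider EM is component-based and more forgiving than this sketch:

```python
import re

def normalize_sql(sql: str) -> str:
    # Drop a trailing semicolon, collapse whitespace, lowercase everything
    return re.sub(r"\s+", " ", sql.strip().rstrip(";")).lower()

def exact_match(predicted: str, gold: str) -> bool:
    return normalize_sql(predicted) == normalize_sql(gold)

print(exact_match("SELECT name\nFROM drugs;", "select name from drugs"))  # True
print(exact_match("SELECT name FROM drugs", "SELECT * FROM drugs"))       # False
```

Execution Accuracy, by contrast, runs both queries against the database and compares result sets, so it also credits syntactically different but semantically equivalent SQL.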

### Results

- The model achieves competitive SQL generation accuracy on Arabic queries; quantitative EM/EX figures are not yet reported.
- Further testing is required to establish robustness.

## Environmental Impact

- **Hardware Type:** NVIDIA Tesla P100 16GB
- **Hours used:** 15
- **Cloud Provider:** Kaggle
- **Carbon Emitted:** Estimated using the [ML Impact Calculator](https://mlco2.github.io/impact#compute)

## Technical Specifications

### Model Architecture and Objective

- Transformer-based **Qwen2.5-1.5B** architecture.
- Fine-tuned for the Text-to-SQL task using LoRA.

### Compute Infrastructure

- **Hardware:** Kaggle P100 GPU (16GB VRAM)
- **Software:** Python, Transformers, LLaMA-Factory, Hugging Face Hub

## Citation

If you use this model, please cite:

```bibtex
@misc{OsamaMo_ArabicSQL,
  author       = {Osama Mohamed},
  title        = {Arabic Text-To-SQL Model},
  year         = {2024},
  howpublished = {\url{https://huggingface.co/OsamaMo/Arabic_Text-To-SQL}}
}
```

## Model Card Contact

For questions, contact **Osama Mohamed** via Hugging Face ([OsamaMo](https://huggingface.co/OsamaMo)).