Commit 1477c5e (verified) by mou3az · Parent: 5e6e61e · Update README.md

### Model Card ###

# Information:

Base Model: facebook/bart-base

Fine-tuned: using PEFT-LoRA

Datasets: squad_v2, drop, mou3az/IT_QA-QG

Task: Generating questions from context and answers

Language: English

# Performance Metrics on Evaluation Set:

Training Loss: 1.1958

Evaluation Loss: 1.109059

### Loading the model ###

```python
from peft import PeftModel, PeftConfig
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

HUGGING_FACE_USER_NAME = "mou3az"
model_name = "ITandGeneral_Question-Generation"
peft_model_id = f"{HUGGING_FACE_USER_NAME}/{model_name}"

# Load the adapter config, then the base model it was trained on
config = PeftConfig.from_pretrained(peft_model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(
    config.base_model_name_or_path, return_dict=True, load_in_8bit=False, device_map="auto"
)
QG_tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

# Attach the fine-tuned LoRA weights to the base model
QG_model = PeftModel.from_pretrained(model, peft_model_id)
```

### At inference time ###

```python
def get_question(context, answer):
    # Use whichever device the model was placed on
    device = next(QG_model.parameters()).device
    input_text = f"Given the context '{context}' and the answer '{answer}', what question can be asked?"
    encoding = QG_tokenizer.encode_plus(input_text, padding=True, return_tensors="pt").to(device)

    output_tokens = QG_model.generate(
        **encoding,
        early_stopping=True,
        num_beams=5,
        num_return_sequences=1,
        no_repeat_ngram_size=2,
        max_length=100,
    )
    out = QG_tokenizer.decode(output_tokens[0], skip_special_tokens=True).replace("question:", "").strip()

    return out
```
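The prompt format that `get_question` feeds to the model can be checked on its own, without loading any weights; a minimal sketch, where the context/answer pair is a made-up example:

```python
# Reproduce the prompt template used in get_question above.
# The context and answer here are invented purely for illustration.
context = "Paris is the capital of France."
answer = "Paris"

input_text = f"Given the context '{context}' and the answer '{answer}', what question can be asked?"
print(input_text)
# Given the context 'Paris is the capital of France.' and the answer 'Paris', what question can be asked?
```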

### Training parameters and hyperparameters ###

The following were used during training:

# For LoRA:

r=18

alpha=8

# For training arguments:

gradient_accumulation_steps=24

per_device_train_batch_size=8

per_device_eval_batch_size=8

max_steps=1000

warmup_steps=50

weight_decay=0.05

learning_rate=3e-3

lr_scheduler_type="linear"
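With per_device_train_batch_size=8 and gradient_accumulation_steps=24, each optimizer step effectively sees 192 examples on a single device; a quick check of that arithmetic:

```python
# Effective batch size per optimizer update (single device):
# per-device batch size multiplied by gradient accumulation steps.
per_device_train_batch_size = 8
gradient_accumulation_steps = 24

effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps
print(effective_batch_size)  # 192
```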
 
 
 
 
 
 
 
 
 
 
 
 
 
 
### Training Results ###

| Epoch | Training Loss | Validation Loss |
|-------|---------------|-----------------|
| 0.0   | 4.6426        | 4.704238        |
| 3.0   | 1.5094        | 1.202135        |
| 6.0   | 1.2677        | 1.146177        |
| 9.0   | 1.2613        | 1.112074        |
| 12.0  | 1.1958        | 1.109059        |
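From the table, validation loss drops from 4.704238 at epoch 0.0 to 1.109059 at epoch 12.0, a relative reduction of roughly 76%; computed directly:

```python
# Relative reduction in validation loss between the first and last epochs.
initial_val_loss = 4.704238   # epoch 0.0
final_val_loss = 1.109059     # epoch 12.0

reduction = (initial_val_loss - final_val_loss) / initial_val_loss
print(f"{reduction:.1%}")  # 76.4%
```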
 