hamsinimk committed on
Commit 571739c · verified · 1 Parent(s): 7551fbe

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -26,7 +26,7 @@ After looking more into the training data generation, I noticed that it is tough
 
 ## 3. Methodology
 
- Based on previous experimentation with LoRA, I think the method does decently well to increase the model's ability to perform medical reasoning through the finetuning process based on the accuracy increasing for a medical training task. With LoRA, the model is able to actually change how it thinks and reasons, rather than just reiterating/finetuning the context that the training task is medical at hand (which is what prompt tuning does). Since it focuses on updating a subset of weight matrices using low rank adaptation, the model is able to better understand reasoning patterns to properly answer more complex questions as the method modifies the attention heads. To prevent catastrophic forgetting and possibly increase overall accuracy, the number of epochs for training was increased to 3. Based on these factors, I would say for a complex medical dataset / reasoning task, I would choose LoRA as the appropriate finetuning method. After trying 3 different hyperparameter combinations (low capacity, medium capacity, and high capacity LoRA), the medium and high capacity LoRA performed very similarly in terms of validation and training loss. It didn’t make sense to add more parameters with high capacity LoRA, so I went forward with medium capacity LoRA as it provided essentially the same performance. Medium capacity LoRA had r at 32, alpha at 64, and dropout at 15%. During training, the number of epochs was set to 3, learning rate was 0.00001, and the number of evaluation steps was 200. Furthermore, the auto_find_batch_size parameter was not used and instead, the per_device_train_batch_size and per_device_eval_batch_size was set to 1. As a note, the number of evaluation steps at 200 means that validation and training loss will be calculated every 200 steps up to 800 (training data size) per epoch (2400 steps total across all 3 epochs).
+ Based on previous experimentation with LoRA, I think the method does decently well at increasing the model's ability to perform medical reasoning through the finetuning process, as shown by the accuracy gains on a medical training task. With LoRA, the model is able to actually change how it thinks and reasons, rather than just reiterating the context of the training task (which is what prompt tuning does). Since LoRA updates a subset of weight matrices using low-rank adaptation, the model can better learn the reasoning patterns needed to answer more complex questions, as the method modifies the attention heads. To prevent catastrophic forgetting and possibly increase overall accuracy, the number of training epochs was increased to 3. Based on these factors, for a complex medical dataset / reasoning task I would choose LoRA as the appropriate finetuning method. After trying 3 different hyperparameter combinations (low-, medium-, and high-capacity LoRA), the medium- and high-capacity LoRA performed very similarly in terms of validation and training loss. It didn’t make sense to add more parameters with high-capacity LoRA, so I went forward with medium-capacity LoRA, as it provided essentially the same performance. Medium-capacity LoRA had r at 32, alpha at 64, and dropout at 15%. During training, the number of epochs was set to 3, the learning rate was 0.00001, and the number of evaluation steps was 200. Furthermore, the auto_find_batch_size parameter was not used; instead, per_device_train_batch_size and per_device_eval_batch_size were both set to 1. As a note, setting the evaluation steps to 200 means that validation and training loss are calculated every 200 steps up to 800 (the training data size) per epoch, i.e. 2400 steps total across all 3 epochs.
 
 
 ## 4. Evaluation
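
The medium-capacity LoRA setup described in the diff can be sketched in plain Python. This is a minimal sketch, not the repo's actual training script: the dict keys mirror the parameter names that Hugging Face `peft`'s `LoraConfig` and `transformers`' `TrainingArguments` accept, and the final lines just sanity-check the evaluation-step arithmetic stated in the paragraph (800 examples at batch size 1, evaluated every 200 steps, over 3 epochs).

```python
# Medium-capacity LoRA hyperparameters from the README text.
# Keys match peft.LoraConfig parameter names (an assumption about how
# they would be passed; the actual training code is not shown here).
lora_config = {
    "r": 32,             # low-rank dimension
    "lora_alpha": 64,    # scaling factor (alpha)
    "lora_dropout": 0.15,
}

# Keys match transformers.TrainingArguments parameter names.
training_args = {
    "num_train_epochs": 3,
    "learning_rate": 0.00001,
    "eval_steps": 200,
    "per_device_train_batch_size": 1,
    "per_device_eval_batch_size": 1,
}

# Sanity-check the evaluation schedule described in the text:
# 800 training examples at batch size 1 -> 800 steps per epoch,
# 2400 steps across 3 epochs, evaluated every 200 steps.
train_size = 800
steps_per_epoch = train_size // training_args["per_device_train_batch_size"]
total_steps = steps_per_epoch * training_args["num_train_epochs"]
evals_total = total_steps // training_args["eval_steps"]

print(steps_per_epoch, total_steps, evals_total)  # 800 2400 12
```

With batch size 1, one optimizer step corresponds to one training example, which is why the step counts line up exactly with the dataset size.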