Commit 386220e · 1 Parent(s): a9d086f
Update: Link for data prepare, fine tuning and inference
README.md CHANGED
@@ -11,6 +11,9 @@ tags:
 This chatbot model was built via Parameter-Efficient Fine-Tuning of [GPT-J-6B](https://huggingface.co/EleutherAI/gpt-j-6b) on all 16.3k rows of Medical Data. Finetuning was executed on a single A100 (40 GB) for roughly 1 day 7 hours.
 
 * Model license: GPT-J Community License Agreement
+* Data Prepare: [data_prepapre code](https://github.com/ghimiresunil/LLM-PowerHouse-A-Curated-Guide-for-Large-Language-Models-with-Custom-Training-and-Inferencing/blob/main/src/data_generate_prepare/data_prepare.py)
+* Finetuning: [finetune code](https://github.com/ghimiresunil/LLM-PowerHouse-A-Curated-Guide-for-Large-Language-Models-with-Custom-Training-and-Inferencing/blob/main/src/train_inference_int_peft/trainer_int_peft_lora.py)
+* Inference: [inference code](https://github.com/ghimiresunil/LLM-PowerHouse-A-Curated-Guide-for-Large-Language-Models-with-Custom-Training-and-Inferencing/blob/main/src/train_inference_int_peft/inference_int_peft_lora.py)
 
 ### Example prompts and responses
 
@@ -82,7 +85,16 @@ This model was trained on a single A100 (40 GB) for about 1 Day 7 hours.
 
 Run: July 23, 2023
 * args: {'lr': 0.001, 'num_epochs': 10, 'seed': 42}
-*
+* log_of_epoch_01:{'eval_loss': 0.9936667084693909, 'eval_runtime': 450.8767, 'eval_samples_per_second': 7.246, 'eval_steps_per_second': 0.455, 'epoch': 1.0}
+* log_of_epoch_02:{'eval_loss': 0.9738781452178955, 'eval_runtime': 447.3755, 'eval_samples_per_second': 7.303, 'eval_steps_per_second': 0.458, 'epoch': 2.0}
+* log_of_epoch_03:{'eval_loss': 0.9600604176521301, 'eval_runtime': 441.2023, 'eval_samples_per_second': 7.405, 'eval_steps_per_second': 0.465, 'epoch': 3.0}
+* log_of_epoch_04:{'eval_loss': 0.9634631872177124, 'eval_runtime': 441.53, 'eval_samples_per_second': 7.399, 'eval_steps_per_second': 0.464, 'epoch': 4.0}
+* log_of_epoch_05:{'eval_loss': 0.961345374584198, 'eval_runtime': 441.3189, 'eval_samples_per_second': 7.403, 'eval_steps_per_second': 0.465, 'epoch': 5.0}
+* log_of_epoch_06:{'eval_loss': 0.9655225872993469, 'eval_runtime': 441.9449, 'eval_samples_per_second': 7.392, 'eval_steps_per_second': 0.464, 'epoch': 6.0}
+* log_of_epoch_07:{'eval_loss': 0.9740663766860962, 'eval_runtime': 441.7603, 'eval_samples_per_second': 7.395, 'eval_steps_per_second': 0.464, 'epoch': 7.0}
+* log_of_epoch_08:{'eval_loss': 0.9907786846160889, 'eval_runtime': 441.6064, 'eval_samples_per_second': 7.398, 'eval_steps_per_second': 0.464, 'epoch': 8.0}
+* log_of_epoch_09:{'eval_loss': 1.0046937465667725, 'eval_runtime': 441.9242, 'eval_samples_per_second': 7.393, 'eval_steps_per_second': 0.464, 'epoch': 9.0}
+* log_of_epoch_10:{'train_runtime': 118063.0495, 'train_samples_per_second': 1.107, 'train_steps_per_second': 0.069, 'train_loss': 0.7715376593637642, 'epoch': 10.0}
 
 ## PreTraining Data
 For more details on the pretraining process, see [GPT-J-6B](https://huggingface.co/EleutherAI/gpt-j-6b).
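Review note: the per-epoch logs added in this commit can be checked with a few lines of Python. The sketch below is only an illustration, not part of the linked training code. It copies the `eval_loss` values for epochs 1 to 9 and the `train_runtime` from the `log_of_epoch_10` entry (which carries the training summary rather than an evaluation result), then reports the epoch with the lowest evaluation loss and the runtime in days, hours and minutes. The helper names `best_epoch` and `format_runtime` are hypothetical.

```python
# Evaluation losses copied from the log_of_epoch_01 .. log_of_epoch_09 lines.
eval_loss_by_epoch = {
    1: 0.9936667084693909,
    2: 0.9738781452178955,
    3: 0.9600604176521301,
    4: 0.9634631872177124,
    5: 0.961345374584198,
    6: 0.9655225872993469,
    7: 0.9740663766860962,
    8: 0.9907786846160889,
    9: 1.0046937465667725,
}

# 'train_runtime' (in seconds) copied from the log_of_epoch_10 line.
train_runtime_seconds = 118063.0495


def best_epoch(losses):
    """Return the epoch number whose evaluation loss is lowest."""
    return min(losses, key=losses.get)


def format_runtime(seconds):
    """Convert seconds to a 'D d H h M min' string, dropping leftover seconds."""
    total_minutes = int(seconds // 60)
    days, remainder = divmod(total_minutes, 24 * 60)
    hours, minutes = divmod(remainder, 60)
    return f"{days} d {hours} h {minutes} min"


if __name__ == "__main__":
    epoch = best_epoch(eval_loss_by_epoch)
    print(f"lowest eval_loss at epoch {epoch}: {eval_loss_by_epoch[epoch]:.4f}")
    print(f"train_runtime: {format_runtime(train_runtime_seconds)}")
```

On these numbers the evaluation loss bottoms out at epoch 3 and climbs from epoch 6 onward while the final training loss is lower (0.77), so a checkpoint from around epoch 3 may generalize better than the final one. The logged runtime works out to about 32.8 hours, slightly longer than the "roughly 1 day 7 hours" quoted in the README.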