| Steps taken: | |
| 1 - I extracted one disease from the PDF and realized that all the diseases follow the same template format. I passed that template to GPT-4 and then added just one disease. (This needs to be done for all the diseases or other information; we still have to figure out a way to represent it.) | |
| 2 - After adding one disease, I asked Code Interpreter to create a JSON for me and then to flatten that JSON so pandas can read it. | |
| 3 - I asked GPT-4 to create a synthetic dataset for me based on the format I gave it. I landed on this article: https://www.datacamp.com/tutorial/fine-tuning-gpt-3-using-the-open-ai-api-and-python | |
| The article made things easy for me: using one-shot learning, GPT-4 was able to create a dataset for me from simple instructions. | |
| 4 - Now I just need to set up the API key and test against the API. | |
| Another useful link: https://medium.com/@_sumitsaha_/fine-tuning-openais-gpt3-model-fd9cb06517f6#:~:text=Fine%2Dtuning%20allows%20the%20model,efficient%2C%20and%20easier%20to%20train. | |
| Note: This file contains experimentation, mostly by Atwine. | |
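
Steps 2 and 3 above can be sketched roughly as follows. This is only an illustration, not the exact code Code Interpreter produced: the record fields (`name`, `symptoms`, `management`), the helper names `flatten` and `to_training_lines`, and the prompt wording are all hypothetical. The `prompt`/`completion` keys assume the JSONL layout used for legacy GPT-3 fine-tuning, which is what the linked tutorial covers.

```python
import json


def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts/lists into one level, joining keys with `sep`."""
    flat = {}
    # Lists are walked by index so each element gets its own column.
    items = record.items() if isinstance(record, dict) else enumerate(record)
    for key, value in items:
        new_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
        if isinstance(value, (dict, list)) and value:
            flat.update(flatten(value, new_key, sep))
        else:
            # Scalars (and empty containers) are stored as-is.
            flat[new_key] = value
    return flat


def to_training_lines(flat_record, name_key="name"):
    """Turn one flattened record into JSONL lines (one example per field)."""
    name = flat_record[name_key]
    lines = []
    for key, value in flat_record.items():
        if key == name_key:
            continue
        # Hypothetical prompt wording; adjust to the real template.
        example = {"prompt": f"{name} | {key}", "completion": str(value)}
        lines.append(json.dumps(example))
    return lines


# Hypothetical disease record standing in for the template from the PDF.
disease = {
    "name": "Example disease",
    "symptoms": ["fever", "headache"],
    "management": {
        "first_line": "Example drug",
        "referral": "Refer if no improvement",
    },
}

flat = flatten(disease)
training_lines = to_training_lines(flat)
```

Because `flat` is a single-level dict, it can be loaded into pandas with `pandas.DataFrame([flat])`. `pandas.json_normalize` does a similar flattening for nested dicts, though it leaves lists as single cells rather than expanding them by index.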