After finetuning, the LLM scored 21.578 on the SQuADv2 benchmark, 0.597 on the HumanEval benchmark, and a 5.040 BLEU score on the E2E NLG Challenge benchmark, with a mean BERTScore precision of 0.813, mean recall of 0.848, and mean F1 of 0.830 on a train/test split. The BERTScore results in particular indicate that my model's generated responses align strongly with the expected responses.
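
As a minimal sketch of how BERTScore means like those above can be computed (assuming the `bert-score` PyPI package; the example strings below are placeholders, not actual model outputs):

```python
# Sketch: mean BERTScore precision/recall/F1 for generated vs. expected
# responses. Assumes the `bert-score` package (pip install bert-score);
# the two lists below are placeholders for the test-split outputs.
from bert_score import score

generated = ["I built a churn model in scikit-learn ..."]
expected = ["I developed a predictive model using Python ..."]

# score() returns per-example precision, recall, and F1 tensors.
P, R, F1 = score(generated, expected, lang="en")
print(f"mean precision: {P.mean().item():.3f}")
print(f"mean recall:    {R.mean().item():.3f}")
print(f"mean F1:        {F1.mean().item():.3f}")
```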

## Data
I found a training dataset of job postings on Kaggle (Arshkon, 2023), published under the project 'LinkedIn Job Postings 2023 Data Analysis'. The dataset contains roughly 15,000 LinkedIn job postings, each including the company, the job title, and a description that lists the necessary skills. Because it spans a wide variety of jobs and descriptions, it allowed my LLM to be trained on the broad range of job descriptions that users may input.

The other two datasets had to be synthetically generated, since no suitable existing datasets were available. For both generated datasets I used the Llama-3.2-1B-Instruct model, because it can efficiently produce accurate natural-language answers as well as technical answers, which was necessary for the interview questions. To create the user profile dataset, I applied few-shot prompting to the job postings dataset: for each posting, I had the model create a 'great', a 'mediocre', and a 'bad' user profile. An example of the few-shot prompt was:

```
Job Title: Data Scientist
Job Description: Analyze data and build predictive models.
Applicant Profile: Experienced in Python, R, and ML models.
Interview Question: Tell me about a machine learning project you are proud of.
Optimal Answer: I developed a predictive model using Python and scikit-learn to forecast customer churn, achieving 85% accuracy by carefully preprocessing the data and tuning hyperparameters.
```
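
A minimal sketch of this generation loop, assuming the Hugging Face `transformers` library (the prompt wording and the `generate_profiles` helper are illustrative, not the exact prompts used to build the dataset):

```python
# Sketch of the few-shot synthetic-profile generation described above.
# Assumes the Hugging Face `transformers` library; the prompt wording
# is illustrative, not the exact prompt used for the dataset.
from transformers import pipeline

generator = pipeline("text-generation", model="meta-llama/Llama-3.2-1B-Instruct")

FEW_SHOT_EXAMPLE = """Job Title: Data Scientist
Job Description: Analyze data and build predictive models.
Applicant Profile: Experienced in Python, R, and ML models.
"""

def generate_profiles(job_title: str, job_description: str) -> dict:
    """Create a 'great', 'mediocre', and 'bad' applicant profile for one posting."""
    profiles = {}
    for quality in ("great", "mediocre", "bad"):
        prompt = (
            FEW_SHOT_EXAMPLE
            + f"\nJob Title: {job_title}"
            + f"\nJob Description: {job_description}"
            + f"\nWrite a {quality} applicant profile for this job."
            + "\nApplicant Profile:"
        )
        out = generator(prompt, max_new_tokens=128, do_sample=True)
        # The pipeline returns the prompt plus the completion; keep the completion.
        profiles[quality] = out[0]["generated_text"][len(prompt):].strip()
    return profiles
```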

Finetuning Tasks: As you did for the second project check-in, clearly define the training data you used, making sure to note any modifications you made to existing datasets in terms of combining or reformatting them. Make sure to also describe how you established a training, validation, and testing split of your data (e.g., report the proportion and random seed you used and/or whether your dataset had a built-in testing split).
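
For reference, a reproducible three-way split can be sketched with the Hugging Face `datasets` library (the 80/10/10 proportions, the seed, and the `job_postings.csv` filename below are placeholders, not the values actually used):

```python
# Sketch: reproducible train/validation/test split with Hugging Face
# `datasets`. The 80/10/10 proportions, seed=42, and the CSV filename
# are placeholder values, not the ones used for this project.
from datasets import load_dataset

dataset = load_dataset("csv", data_files="job_postings.csv")["train"]

# First carve off 20% for evaluation, then halve it into val/test.
split = dataset.train_test_split(test_size=0.2, seed=42)
holdout = split["test"].train_test_split(test_size=0.5, seed=42)

train_ds = split["train"]   # 80%
val_ds = holdout["train"]   # 10%
test_ds = holdout["test"]   # 10%
```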
## Methodology

Finetuning Tasks: As you did in the fourth project check-in, describe which method you chose for training and why you chose that method. Make note of any hyperparameter values you used so that others can reproduce your results.
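
The method and hyperparameters are not filled in yet. Purely as an illustration of the kind of reproducible configuration this section calls for, a LoRA finetune with the `peft` library might be declared like this (every value below is a placeholder, not a setting actually used for this model):

```python
# Illustration only: a LoRA configuration via the `peft` library.
# The method choice and every hyperparameter value here are placeholders.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B-Instruct")
lora_config = LoraConfig(
    r=8,                     # rank of the LoRA update matrices
    lora_alpha=16,           # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```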
## Evaluation

## Usage and Intended Use
## Expected Output Format

This section should briefly describe the expected output format for your model and include a general code chunk showing an example model response.
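
Until this is filled in, here is a hypothetical example of a response in the format established by the few-shot examples above (illustrative text, not actual model output):

```
Interview Question: How would you handle missing values in a large dataset?
Optimal Answer: I would first profile the missingness to see whether it is random, then choose between imputation and dropping rows based on how much signal each column carries.
```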
## Limitations

This section should summarize the main limitations of your model. Limitations could be based on benchmark task performance, any observations you noticed when examining model responses, or any shortcomings your model has relative to your ideal use case.