After finetuning, the LLM scored 21.578 on the SQuADv2 benchmark, 0.597 on the HumanEval benchmark, and a 5.040 BLEU score on the E2E NLG Challenge benchmark, with a mean BERTScore precision of 0.813, mean recall of 0.848, and mean F1 of 0.830 on a train/test split. The BERTScore results in particular indicate that my model's generated responses align strongly with the expected responses.
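
As a minimal sketch of how BERTScore means like those above can be computed (assuming the `bert-score` PyPI package; the example strings below are placeholders, not actual model outputs):

```python
# Sketch: mean BERTScore precision/recall/F1 for generated vs. expected
# responses. Assumes the `bert-score` package (pip install bert-score);
# the two lists below are placeholders for the test-split outputs.
from bert_score import score

generated = ["I built a churn model in scikit-learn ..."]
expected = ["I developed a predictive model using Python ..."]

# score() returns per-example precision, recall, and F1 tensors.
P, R, F1 = score(generated, expected, lang="en")
print(f"mean precision: {P.mean().item():.3f}")
print(f"mean recall:    {R.mean().item():.3f}")
print(f"mean F1:        {F1.mean().item():.3f}")
```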

## Data
I found a training dataset of job postings on Kaggle (Arshkon, 2023), published under the project 'LinkedIn Job Postings 2023 Data Analysis'. The dataset contains roughly 15,000 LinkedIn job postings, each including the company, the job title, and a description that lists the necessary skills. Because it spans a wide variety of jobs and descriptions, it allowed my LLM to be trained on the broad range of job descriptions that users may input.

The other two datasets had to be synthetically generated, since no suitable existing datasets were available. For both generated datasets I used the Llama-3.2-1B-Instruct model, because it can efficiently produce accurate natural-language answers as well as technical answers, which was necessary for the interview questions. To create the user profile dataset, I applied few-shot prompting to the job postings dataset: for each posting, I had the model create a 'great', a 'mediocre', and a 'bad' user profile. An example of the few-shot prompt was:

```
Job Title: Data Scientist
Job Description: Analyze data and build predictive models.
Applicant Profile: Experienced in Python, R, and ML models.
Interview Question: Tell me about a machine learning project you are proud of.
Optimal Answer: I developed a predictive model using Python and scikit-learn to forecast customer churn, achieving 85% accuracy by carefully preprocessing the data and tuning hyperparameters.
```
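
A minimal sketch of this generation loop, assuming the Hugging Face `transformers` library (the prompt wording and the `generate_profiles` helper are illustrative, not the exact prompts used to build the dataset):

```python
# Sketch of the few-shot synthetic-profile generation described above.
# Assumes the Hugging Face `transformers` library; the prompt wording
# is illustrative, not the exact prompt used for the dataset.
from transformers import pipeline

generator = pipeline("text-generation", model="meta-llama/Llama-3.2-1B-Instruct")

FEW_SHOT_EXAMPLE = """Job Title: Data Scientist
Job Description: Analyze data and build predictive models.
Applicant Profile: Experienced in Python, R, and ML models.
"""

def generate_profiles(job_title: str, job_description: str) -> dict:
    """Create a 'great', 'mediocre', and 'bad' applicant profile for one posting."""
    profiles = {}
    for quality in ("great", "mediocre", "bad"):
        prompt = (
            FEW_SHOT_EXAMPLE
            + f"\nJob Title: {job_title}"
            + f"\nJob Description: {job_description}"
            + f"\nWrite a {quality} applicant profile for this job."
            + "\nApplicant Profile:"
        )
        out = generator(prompt, max_new_tokens=128, do_sample=True)
        # The pipeline returns the prompt plus the completion; keep the completion.
        profiles[quality] = out[0]["generated_text"][len(prompt):].strip()
    return profiles
```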

Finetuning Tasks: As you did for the second project check-in, clearly define the training data you used, making sure to note any modifications you made to existing datasets in terms of combining or reformatting them. Make sure to also describe how you established a training, validation, and testing split of your data (e.g., report the proportion and random seed you used and/or whether your dataset had a built-in testing split).
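
For reference, a reproducible three-way split can be sketched with the Hugging Face `datasets` library (the 80/10/10 proportions, the seed, and the `job_postings.csv` filename below are placeholders, not the values actually used):

```python
# Sketch: reproducible train/validation/test split with Hugging Face
# `datasets`. The 80/10/10 proportions, seed=42, and the CSV filename
# are placeholder values, not the ones used for this project.
from datasets import load_dataset

dataset = load_dataset("csv", data_files="job_postings.csv")["train"]

# First carve off 20% for evaluation, then halve it into val/test.
split = dataset.train_test_split(test_size=0.2, seed=42)
holdout = split["test"].train_test_split(test_size=0.5, seed=42)

train_ds = split["train"]   # 80%
val_ds = holdout["train"]   # 10%
test_ds = holdout["test"]   # 10%
```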
## Methodology

Finetuning Tasks: As you did in the fourth project check-in, describe which method you chose for training and why you chose that method. Make note of any hyperparameter values you used so that others can reproduce your results.
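
The method and hyperparameters are not filled in yet. Purely as an illustration of the kind of reproducible configuration this section calls for, a LoRA finetune with the `peft` library might be declared like this (every value below is a placeholder, not a setting actually used for this model):

```python
# Illustration only: a LoRA configuration via the `peft` library.
# The method choice and every hyperparameter value here are placeholders.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B-Instruct")
lora_config = LoraConfig(
    r=8,                     # rank of the LoRA update matrices
    lora_alpha=16,           # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```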
## Evaluation

## Usage and Intended Use
## Expected Output Format

This section should briefly describe the expected output format for your model and include a general code chunk showing an example model response.
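
Until this is filled in, here is a hypothetical example of a response in the format established by the few-shot examples above (illustrative text, not actual model output):

```
Interview Question: How would you handle missing values in a large dataset?
Optimal Answer: I would first profile the missingness to see whether it is random, then choose between imputation and dropping rows based on how much signal each column carries.
```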
## Limitations

This section should summarize the main limitations of your model. Limitations could be based on benchmark task performance, any observations you noticed when examining model responses, or any shortcomings your model has relative to your ideal use case.