Update README.md
@@ -23,9 +23,9 @@ To evaluate both the general reasoning abilities and the domain-specific perform
 | Llama-3-8B (base) | 51.4% | 59.9% | 73.1% |
 | Hobby_Recommendation model | 53.7% | 59.9% | 73.2% |
 | Falcon-7B-Instruct | 40.2% | 57.7% | 67.6% |
-| Mistral-7B-Instruct |
+| Mistral-7B-Instruct | 49.8% | 56.3% | 69.6% |
 
-
+The fine-tuned Hobby_Recommendation model showed improvements over the base Llama-3-8B model on some benchmarks and maintained similar performance on HellaSwag. Both Falcon-7B-Instruct and Mistral-7B-Instruct scored below the base model on all three benchmarks. While general reasoning ability remained stable, the fine-tuned model performed especially well on hobby-related prompts, suggesting that training on personalized synthetic data helped it handle specific recommendation tasks.
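The per-benchmark deltas the new paragraph describes can be sanity-checked with a small illustrative Python snippet. The scores are transcribed from the table above; the three benchmark column names are not visible in this hunk (the summary only identifies the second as HellaSwag), so the lists are kept positional:

```python
# Scores transcribed from the README's benchmark table.
# Benchmark column names aren't shown in this diff hunk, so the
# three values per model are kept positional (column 2 is
# HellaSwag per the summary paragraph).
scores = {
    "Llama-3-8B (base)":          [51.4, 59.9, 73.1],
    "Hobby_Recommendation model": [53.7, 59.9, 73.2],
    "Falcon-7B-Instruct":         [40.2, 57.7, 67.6],
    "Mistral-7B-Instruct":        [49.8, 56.3, 69.6],
}

base = scores["Llama-3-8B (base)"]
for name, vals in scores.items():
    # Difference vs. the base model, rounded to one decimal place.
    deltas = [round(v - b, 1) for v, b in zip(vals, base)]
    print(f"{name}: {deltas}")
```

Printing the deltas confirms the paragraph's claims: the fine-tuned model gains +2.3 and +0.1 points on two benchmarks and is flat (0.0) on HellaSwag, while Falcon-7B-Instruct and Mistral-7B-Instruct are negative on every column.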
## Usage and Intended Uses