pfost-bit committed
Commit 2c46493 · verified · 1 Parent(s): 4d14c28

Update README.md

Files changed (1)
  1. README.md +3 -0
README.md CHANGED
@@ -82,9 +82,12 @@ ROUGE (Recall-Oriented Understudy for Gisting Evaluation) This is used to see if
 
  BLEU (Bilingual Evaluation Understudy) measures how many of the words and short phrases (n-grams) in the generated text also appear in the human-written reference text. This should show whether the model is picking up on the "surfer lingo"; a higher BLEU score is better.
 
+ I chose two models of a similar size from large AI research labs as benchmarks.
+
  | | QWEN-4B-Instruct-2507 | Llama-3.2-3B-Instruct | google/gemma-2-2b-it | SurfMine |
  |-------|:---------------------:|:---------------------:|:--------------------:|:------------:|
  | BERT | **.8215** | .8141 | .8201 | **.8717** |
  | ROUGE | **.1097** | .1053 | .1075 | **.2074** |
  | BLEU | .0051 | .0032 | **.0059** | **.0702** |
 
+ SurfMine scores higher on all three metrics than both the base model and the chosen benchmark models.
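
To make the BLEU description above concrete, here is a minimal sketch of a sentence-level BLEU score against a single reference, written in plain Python. It is an illustration only and is not the code used to produce the table: the function names (`ngrams`, `modified_precision`, `simple_bleu`) and the example sentences are hypothetical, and it assumes whitespace tokenization, one reference per candidate, and no smoothing.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision: each candidate n-gram is credited at most
    as many times as it occurs in the reference."""
    cand_counts = ngrams(candidate, n)
    ref_counts = ngrams(reference, n)
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return clipped / total


def simple_bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU against one reference, with no smoothing."""
    precisions = [modified_precision(candidate, reference, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        # The geometric mean is zero if any n-gram order has no match.
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / max_n
    # The brevity penalty lowers the score of candidates shorter than the reference.
    if len(candidate) >= len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(candidate))
    return bp * math.exp(log_mean)


# Hypothetical example sentences, not taken from the SurfMine evaluation data.
reference = "the swell is pumping and the waves are glassy".split()
candidate = "the swell is pumping and the sets are glassy".split()
print(simple_bleu(candidate, reference, max_n=2))
```

Because the unsmoothed geometric mean drops to zero whenever any n-gram order has no match, BLEU on free-form generations tends to be small in absolute terms, so scores are easiest to read relative to each other.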