pfost-bit committed (verified) · Commit 7e30b6b · 1 Parent(s): e9d7ab4

Update README.md

Files changed (1): README.md (+18 -1)
README.md CHANGED
@@ -64,4 +64,21 @@ learning_rate = 0.0003995209593890016
  lora_alpha = 128
  lora_dropout = .1
  lora_r = 64
- ```
+ ```
+
+ I experimented with full fine-tuning; however, the model lost much of its functionality and became a repeater. For this reason, I leveraged PEFT methods and settled on LoRA, as it was fairly simple to implement.
+
+ ### Evaluation
+
+ For this model's evaluation I used three metrics that are common in natural language tasks:
+
+ * BERT
+ * ROUGE
+ * BLEU
+
+ The primary evaluation is the BERT score, a way to calculate the similarity between two text inputs. BERT score aims to assess semantic similarity: it measures how close the actual forecast and the generated forecast are in meaning. A higher BERT score is better.
+
+ ROUGE (Recall-Oriented Understudy for Gisting Evaluation) is used to see whether the general gist of the generated forecast matches the actual human forecast. A higher ROUGE score is better.
+
+ BLEU (Bilingual Evaluation Understudy) measures how many words from the generated text appear in the human reference text. This should show whether the model is picking up on the "surfer lingo". A higher BLEU score is better.
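The LoRA hyperparameters in the diff (`lora_r = 64`, `lora_alpha = 128`) scale a trainable low-rank update on top of a frozen weight. A minimal pure-Python sketch of that update, using toy matrix sizes that are assumptions for illustration (not the model's real dimensions), might look like:

```python
# LoRA sketch: the frozen weight W is adapted by a low-rank term
# (alpha / r) * (B @ A), so only the small matrices A and B are trained.
# The README's real values are r=64, lora_alpha=128; toy sizes used here.

def matmul(X, Y):
    # Plain list-of-lists matrix multiply.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_weight(W, A, B, r, alpha):
    # Effective weight: W + (alpha / r) * (B @ A)
    scale = alpha / r
    BA = matmul(B, A)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, BA)]

# Toy example: rank r=1 update on a 2x2 identity weight.
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]   # d_out x r
A = [[0.5, 0.5]]     # r x d_in
print(lora_weight(W, A, B, r=1, alpha=2))  # → [[2.0, 1.0], [2.0, 3.0]]
```

Because only `A` and `B` receive gradients, the trainable parameter count stays small, which matches the README's observation that LoRA avoided the degradation seen with full fine-tuning.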
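As a rough sketch of how the overlap metrics in the diff behave, here is a simplified unigram-only version of ROUGE-1 recall and clipped BLEU-1 precision (real ROUGE and BLEU also use longer n-grams, and BLEU adds a brevity penalty; BERT score is omitted since it requires a pretrained model). The sample forecasts are invented for illustration:

```python
from collections import Counter

def rouge1_recall(reference: str, candidate: str) -> float:
    # ROUGE-1 recall: fraction of reference unigrams recovered by the candidate.
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    overlap = sum(min(count, ref[word]) for word, count in cand.items())
    return overlap / max(sum(ref.values()), 1)

def bleu1_precision(reference: str, candidate: str) -> float:
    # Clipped BLEU-1 precision: fraction of candidate unigrams found in the
    # reference, each counted at most as often as it occurs there.
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    overlap = sum(min(count, ref[word]) for word, count in cand.items())
    return overlap / max(sum(cand.values()), 1)

# Invented example forecasts (not from the model or dataset).
actual = "clean waist high waves with light offshore wind"
generated = "waist high waves and light offshore wind early"
print(rouge1_recall(actual, generated))   # → 0.75
print(bleu1_precision(actual, generated)) # → 0.75
```

Higher values mean more word overlap with the human forecast, which is why both scores are reported as higher-is-better in the evaluation above.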